Twitter is, by nature, real-time: trending topics shift by the minute. Analysis of Twitter data works best when it is just as quick. If you want a sense of who’s tweeting, what languages are participating in a particular topic of conversation, and where (IF their geo-coordinate data is not set to private; it is private when I’m visualizing librarian conference data, because they are privacy aware ;)), here are my go-to quick and dirty exploratory tools. They are super easy to use and give me an initial sense of the scope and “feel” of the data.
- What’s your topic? I’m taking data from the outcry in reaction to the photograph of Syrian boy Aylan Kurdi, who drowned on his way from Turkey to Greece, just one of the souls who have died while fleeing in the recent migrant crisis. This Twitter data was gathered today at 9AM (Sept. 3, 2015), and showed users tweeting in 23 unique languages with the hashtag #KiyiyaVuranInsanlik.
- Follow your curiosity to shape the questions to ask your data. Before analysis, I always manually go to the head of the tweet stream of the hashtag I’m harvesting, to get a sense of its popularity, the type of media being posted, the users, etc.
- Get TAGS. It’s a Google spreadsheet that uses the Twitter REST API (credit: Martin Hawksey!). https://tags.hawksey.info/get-tags/ (use the “New Tags” option).
- Once your Google Spreadsheet is open, click File -> Create new copy, then rename it. Now, in this new sheet, go to the TAGS tab at the top right of your menu bar. Enter the term you wish to scrape from Twitter in the box (it’s all pretty self-explanatory). Click “Run now!” (You’ll need to be logged into Twitter and set up authorization.)
- Authorization: Once you have granted authorization to run through your Twitter account, click TAGS -> “Run now!” again. A message will appear indicating that the API is scraping (“working”). Wait patiently 🙂 Don’t go to the archive sheet while it’s working; just chill.
- How many unique tweets? When it’s done, check the sheet to see the period of time covered by the scraped data (7 days back) and the number of unique tweets. Then, check out the archive!
- Add to Google Fusion Tables: I used to download the archive, save it as a CSV, then upload it to Google Fusion Tables to play with. However, if you simply take the web address of the archive spreadsheet, you can pull it straight into Fusion Tables, super easy peasy.
- Word Tree tweet content exploration: Also, use Jason Davies’ Word Tree to do some SUPER simple exploratory NLP. Then, of course, load your text into Python to clean it up and run NLTK for bigrams, trigrams, frequency analysis, etc. to dig deeper. https://www.jasondavies.com/wordtree/?source=&prefix=Aylan%20K
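For the "clean it up and count n-grams" step, here is a minimal sketch of what that could look like. It uses only the Python standard library as a stand-in for NLTK, so it is not the NLTK workflow itself. The sample tweets, the cleaning rules, and the function names `clean` and `ngrams` are all hypothetical; in practice the text would come from the tweet text column of your archive, and you would likely want smarter tokenizing and stop-word handling.

```python
import re
from collections import Counter

# Hypothetical sample tweets; in practice, read these from the text
# column of your TAGS archive.
tweets = [
    "Aylan Kurdi photo shocks the world #KiyiyaVuranInsanlik",
    "RT @someone: Aylan Kurdi photo shocks Europe http://example.com/x",
    "The world must act now #KiyiyaVuranInsanlik",
]

def clean(text):
    """Lowercase a tweet, strip URLs, @mentions, and a leading RT, then tokenize."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+:?", " ", text)        # drop @mentions
    text = re.sub(r"^\s*rt\b", " ", text)      # drop the retweet marker
    return re.findall(r"[#\w']+", text)        # keep words and hashtags

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return list(zip(*(tokens[i:] for i in range(n))))

unigram_counts = Counter()
bigram_counts = Counter()
trigram_counts = Counter()

# Count n-grams per tweet so they never span two different tweets.
for tweet in tweets:
    tokens = clean(tweet)
    unigram_counts.update(tokens)
    bigram_counts.update(ngrams(tokens, 2))
    trigram_counts.update(ngrams(tokens, 3))

print(unigram_counts.most_common(5))
print(bigram_counts.most_common(5))
print(trigram_counts.most_common(5))
```

Counting per tweet rather than over one long joined string keeps the last word of one tweet from pairing with the first word of the next, which would otherwise add junk bigrams.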
Explore! Here are the exploratory raw data and charts (via Google Fusion Tables) visualizing the languages of Twitter users who tweeted with the hashtag #KiyiyaVuranInsanlik. Play around: http://bit.ly/1EDuq2L
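If you would rather tally languages outside of Fusion Tables, a rough sketch follows. It assumes you have exported the archive sheet as a CSV, and that it has a user column and a language column; the names `from_user` and `user_lang` are assumptions here, so check your own sheet’s header row. The sample data and the function name `languages_by_user` are made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for a CSV export of the archive sheet.
# Column names are assumptions; check your sheet's header row.
sample_csv = """from_user,user_lang,text
alice,tr,first tweet
bob,en,heartbreaking photo
alice,tr,second tweet
carol,en,we must act
dave,ar,another tweet
"""

def languages_by_user(csv_file, user_col="from_user", lang_col="user_lang"):
    """Count distinct users per language code in a tweet archive."""
    seen = set()        # (user, language) pairs already counted
    counts = Counter()  # language code -> number of distinct users
    for row in csv.DictReader(csv_file):
        user = row[user_col].strip()
        lang = row[lang_col].strip().lower()
        if not lang or (user, lang) in seen:
            continue    # skip blank languages and repeat tweeters
        seen.add((user, lang))
        counts[lang] += 1
    return counts

counts = languages_by_user(io.StringIO(sample_csv))
print(len(counts), "unique languages")
for lang, n in counts.most_common():
    print(lang, n)
```

To run it on a real export, swap the `io.StringIO(sample_csv)` argument for an open file handle, for example `open("archive.csv", newline="", encoding="utf-8")`, where `archive.csv` is whatever you named your download.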