Personally, I don't get Twitter. I have an account (mvgilliland) for anyone interested in not hearing any tweets from me. I follow a few people and have a few followers (including some that aren't porn bots), but what is the point? Does anyone really care that I'm out hanging floss on the line to dry, or that I'm stuck in the waiting room of my urologist with a prostate the size of a grapefruit?
The fact is, if someone is that interested in what I'm doing right now, it makes me kind of nervous. Do I really want people "following" me? Aren't anti-stalking laws enacted for good reason?
Call me old school, a Luddite, a 21st-century puritan, even a techno-prude. Or perhaps I'm just blind to the great new opportunities for data (not just dating) that social networking provides. I have actually been swayed a little bit in this direction by my colleague in Australia, Evan Stubbs. Leveraging some code from SAS software developer Zach Marshall, Evan put together a neat little demo for customers, illustrating how SAS Text Miner can be used to identify patterns that could be used in forecasting. From Evan:
Guest Blogger: Evan Stubbs, Solution Manager for SAS Analytics
Rightly or wrongly, we seem to love telling the world what we're doing, often even if no one's listening! Morgan Stanley recently published a report by a 15-year-old intern that, for many, seemed to state the obvious: "On the other hand, teenagers do not use twitter … they realise that no one is viewing their profile, so their 'tweets' are pointless".
Ignoring the bizarre implication that "only oldies use Twitter", to me this misses the point; it's not about who's listening right now, it's about who might be listening. One of the points of talking publicly about a particular topic is the hope that other people who are also interested in that topic might just join in. For me, it's about the chance of finding like-minded people with similar interests (whether they agree with me or not!). It's about connecting with new people, people I may never have met otherwise. It doesn't matter whether it's about my passion for analytics, my fascination with my latest gadget, or my displeasure with my latest billing experience; with the growth of the Internet, there are bound to be people out there thinking and debating about similar things.
And that's the clincher: the scale of these social networks shouldn't be underestimated. I recently ran a back-of-the-napkin poll to see how big some of the sites I knew about were, and the results stunned me: of the approximately 20 sites my colleagues and I belong to, only two had membership levels below 22 million. That may seem like an arbitrary number, but it has quite a bit of significance for me: it's the population of Australia, the place I live. Sites like Facebook and MySpace have over ten times the population of Australia; these aren't just social networks, they're almost countries in their own right!
With that level of membership, it's not surprising that there's a wealth of information available within them, information that changes as rapidly as the discussion does. Google Trends and Twitter Trending Topics are great to help see what people are talking about overall, but they're not personal; they don't always relate to what I'm interested in. And trying to trim down the torrent of information is almost an exercise in frustration. Applications like TweetDeck help with targeted searching and monitoring, but they don't solve the real problem around pattern extraction and trend analysis.
So, based on the excellent work done by Zach Marshall, one of the geniuses behind our Web Services development, I thought it'd be rather fun to use SAS to create a personalized Twitter search process that takes into account geographic information and language-based searches, and then to use Enterprise Guide's Stored Process capabilities to package it up into an installable process usable by anyone. For me, the exciting thing was how much of SAS's functionality I was easily able to use in what amounts to such a small effort:
• SAS 9.2's Web Services capabilities, to connect to Twitter and create the query
• SAS's Regular Expressions parsing, to cleanse the XML documents and structure them correctly
• SAS's XML parsing and data handling capabilities, to extract and structure data
• SAS's Stored Process capabilities, to turn it into a reusable process that'll deliver the results to anything (a SAS dataset, Excel, Internet Explorer …)
• SAS's Text Mining capabilities, to extract trends and patterns of particular discussion
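The cleanse-then-parse steps in the list above can be illustrated outside SAS. What follows is a minimal Python sketch, not Evan's code and not SAS: the sample feed, the Atom layout, and the names `RAW_FEED`, `cleanse`, and `extract_posts` are all assumptions made for illustration. It shows a regular-expression pass that repairs two common problems in scraped XML (stray control characters and unescaped ampersands) before a standard XML parser structures the posts.

```python
import re
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample standing in for a search response; not real Twitter output.
# The second entry is deliberately malformed: a bare "&" and a control character.
RAW_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Stuck in the lounge again &amp; loving it</title><updated>2009-08-01T02:15:00Z</updated></entry>
  <entry><title>Points & perks: new loyalty program\x0b looks good</title><updated>2009-08-02T07:40:00Z</updated></entry>
</feed>"""

# Control characters that are not allowed in XML 1.0 documents.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")
# An ampersand that does not start a named or numeric entity reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def cleanse(xml_text):
    """Remove illegal control characters and escape bare ampersands."""
    text = CONTROL_CHARS.sub("", xml_text)
    return BARE_AMP.sub("&amp;", text)


def extract_posts(xml_text):
    """Parse the cleansed feed into a list of plain dictionaries."""
    root = ET.fromstring(cleanse(xml_text))
    posts = []
    for entry in root.iter(ATOM + "entry"):
        posts.append({
            "text": entry.findtext(ATOM + "title", default=""),
            "updated": entry.findtext(ATOM + "updated", default=""),
        })
    return posts


posts = extract_posts(RAW_FEED)
for post in posts:
    print(post["updated"], post["text"])
```

Without the `cleanse` step the parser would reject the second entry outright, which is presumably why a cleansing pass comes before XML parsing in the list above.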
I spend a lot of time on planes, so one of the first things I searched for was what people were saying about some of our major airlines over the last seven days, within approximately 100 kilometers of Sydney (where I live). The breakdown was fascinating; the discussion was focused around:
• 4% of all discussions: TV-related discussion, namely the Australian anti-censorship video being screened on the airline and various television awards programs
• 41% of all discussions: Discussion about the airline's lounge, posts of people in transit and waiting for their flights / going home
• 22% of all discussions: Frequent flier points, the airline's club, a new joint loyalty reward program
• 23% of all discussions: Work-related discussion and industry issues (e.g. A380, working at the airline)
• 8% of all discussions: Cargo price fixing
The level of interest around their newly launched joint loyalty program must be pretty gratifying for them; it's pretty clearly a hot topic on Twitter!
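For readers curious how a search scoped to "the last seven days, roughly 100 kilometers around Sydney" might be applied to posts once they are collected, here is a small Python sketch. It is an illustration under stated assumptions rather than how the SAS process works: the post records, the field names `where` and `posted`, the reference date, and the helpers `haversine_km` and `in_scope` are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from math import asin, cos, radians, sin, sqrt

SYDNEY = (-33.8688, 151.2093)  # latitude, longitude in degrees
EARTH_RADIUS_KM = 6371.0       # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def in_scope(post, now, centre=SYDNEY, radius_km=100, days=7):
    """True if the post is recent enough and close enough to the centre."""
    recent = now - post["posted"] <= timedelta(days=days)
    nearby = haversine_km(centre, post["where"]) <= radius_km
    return recent and nearby


# Arbitrary reference date and made-up posts, for illustration only.
now = datetime(2009, 8, 7, tzinfo=timezone.utc)
posts = [
    {"text": "Waiting in the lounge", "where": (-33.8150, 151.0011),   # Parramatta
     "posted": now - timedelta(days=2)},
    {"text": "A380 spotted", "where": (-37.8136, 144.9631),            # Melbourne
     "posted": now - timedelta(days=1)},
    {"text": "Old news", "where": SYDNEY,
     "posted": now - timedelta(days=30)},
]

selected = [p["text"] for p in posts if in_scope(p, now)]
print(selected)
```

Only the Parramatta post passes both tests: the Melbourne post is recent but far outside the radius, and the Sydney post is local but too old.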
For me, it's a brilliant way to extend my network, monitor the pulse of discussion, and spend more time thinking and debating and less time clicking. For organisations who care about their customers, it might be a way to create a personal, two-way dialogue with all of their customers. Or, it might be a way to help them solve their customers' issues as they experience and Tweet about them. Or, it might simply be a way of keeping track of what's hot at the moment, quickly, easily, and dynamically.
In any case, I find it tremendously empowering. It's not just that I've got another way of taming the deluge of information I'm increasingly hit with every day; it's also that I know that if I say something, the odds of someone hearing it who cares about similar things to me increase every day. And SAS is right there, helping make it easier.
A great thing about working at SAS is that I'm surrounded by smart and creative colleagues like Evan and Zach (and over 10,000 others from across the globe). If you aren't familiar with SAS, here is a recent write-up in Investor's Business Daily. May I never have to leave SAS, or ever again have to work at a public company.
Can the results from text mining tweets be of use in forecasting? Like the use of Google Trends data in forecasting (discussed in this blog on July 10), this is an area of active research. While it is exciting to have all these new data sources, it is still to be determined whether they can actually improve the accuracy of our forecasts. Are you doing research in this area? If so, I invite you to share your results in a guest blogger posting on The BFD.
For more information on using SAS to analyze Twitter data, and for a sample of Evan's code, you can contact him directly at email@example.com.