Saturday, April 09, 2011

Text Mining Tweets About Factory Farms

On another blog last year I noted how those in the agriculture industry were benefiting from the use of social media (as in the Yellow Tail and Pilot Travel incidents). While social media has allowed farmers to organize and communicate about their industry, it also provides a rich data source for measuring sentiment or perceptions about their industry. Companies are finding that by mining text from web pages, comments, blogs, and social media, they can measure consumer perceptions almost as well as, or better than, they can through explicit surveys. These powerful analytics could be very beneficial to those in the ag industry or agvocacy groups.

After a week at SAS Global Forum, I've been pretty excited about some of the text mining presentations that I got to see. After getting home I found a tweet from @imusicmash sharing a post from the Heuristic Andrew blog that included text mining code in R. (Although SAS has some pretty powerful text mining tools, I don’t have access to them for personal blogging purposes.) Anyway, I thought I’d take a stab at mining tweets related to ‘factory farms’ using open source R.

I extracted about 2000 tweets containing the term ‘factory farms’ and produced the following cluster analysis on the text:

This seems to give an idea about the content of conversations regarding ‘factory farms.’ Some of these appear to center around GMO foods and Monsanto. This already informs me of misperceptions about ‘factory farms’ and biotechnology. Should people tend to associate these terms when 98% of farms are family farms and most of them raise biotech corn and soybeans?

It seems there are separate clusters of conversations, some related to Monsanto and GMOs, others related to food and livestock production in general.

It also appears that the topic of ‘factory farms’ is often discussed by the #agchat group, and alongside other food- and animal-related issues.
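For readers who want to try something similar, here is a minimal sketch of clustering terms by how they co-occur across tweets. It is not the R code used for this post. It is written in plain Python, the sample tweets are invented for illustration, and the stopword list, the `min_count` cutoff, and the choice of average linkage are all assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical sample tweets, invented for illustration (not the real data).
TWEETS = [
    "monsanto gmo corn and factory farms",
    "gmo seeds monsanto factory farms",
    "factory farms hens cruelty",
    "hens cruelty suffering factory farms",
]

# Assumed stopwords; the search phrase itself appears in every tweet, so drop it.
STOPWORDS = {"and", "the", "of", "factory", "farms"}


def tokenize(text):
    """Lowercase a tweet and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def term_vectors(tweets, min_count=2):
    """Map each term seen at least min_count times to its per-tweet counts."""
    docs = [Counter(tokenize(t)) for t in tweets]
    totals = Counter()
    for doc in docs:
        totals.update(doc)
    return {term: [doc[term] for doc in docs]
            for term, n in totals.items() if n >= min_count}


def distance(u, v):
    """Euclidean distance between two count vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cluster_terms(vectors, k):
    """Agglomerative clustering (average linkage) until k clusters remain."""
    clusters = [[term] for term in sorted(vectors)]

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(distance(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > k:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


if __name__ == "__main__":
    for group in cluster_terms(term_vectors(TWEETS), k=2):
        print(group)
```

On real data you would likely want a longer stopword list and a higher frequency cutoff, since rare terms make the clusters noisy.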

I also ran some correlations, or ‘word associations.’ Terms that tend to be used in association with ‘factory farms’ include hens, debeaked, suffering, cruelty, secretive, and excess. All of these terms tend to be related to livestock production, and seem to have negative sentiment. Words correlated with family farms are more neutral: hauled, Missouri, beans, peas, operated, battling. Terms associated with ‘gmo’ include ban, irreversible, and killing. Interestingly, the term ‘sustainable’ brought up neutral terms. It doesn’t appear, at least from this sample, that sustainable agriculture is associated with biotechnology, at least in the context of tweets related to ‘factory farms.’ Again, to me this speaks more about misperceptions related to modern sustainable agriculture.
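As a rough illustration of what a ‘word association’ can mean, the sketch below scores each word by the Pearson correlation between its presence in a tweet and the presence of a target term. This is an assumption about the general technique, not the exact computation behind the results above. The tweets are made up, and the function names and the `min_corr` threshold are hypothetical.

```python
import math
import re

# Hypothetical tweets, invented for illustration only.
TWEETS = [
    "gmo ban now",
    "ban gmo crops",
    "family farms hauled beans",
    "sustainable family farms",
]


def words(text):
    """Lowercase a tweet and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def presence(term, tweets):
    """For each tweet, 1 if the term occurs in it, else 0."""
    return [1 if term in words(t) else 0 for t in tweets]


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def associations(target, tweets, min_corr=0.5):
    """Words whose presence correlates with the target at or above min_corr."""
    vocab = {w for t in tweets for w in words(t)}
    target_vec = presence(target, tweets)
    scores = {w: pearson(target_vec, presence(w, tweets))
              for w in vocab if w != target}
    # Strongest associations first, ties broken alphabetically.
    return sorted(((w, round(r, 2)) for w, r in scores.items() if r >= min_corr),
                  key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    print(associations("gmo", TWEETS))
```

A high correlation only says two words tend to show up in the same tweets. Judging whether the associated words are negative or neutral still takes a human read, or a separate sentiment step.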

Of course, this is just a first stab at this. I’m no expert in text analytics, and I had to rely on my subjective interpretation to some extent. And, obviously, I have not discovered anything that most people in the ag industry don’t already know. However, more sophisticated analysis is possible and could be more revealing than the example I just gave. I truly believe that text analytics can be a powerful tool for the ag industry and agvocacy in the future.
