Wednesday, April 28, 2010

Sentiment Analysis: Can you get it right by just automating it?

Sentiment analysis, or opinion mining, is a relatively new area of research based on text extraction, natural language processing, semantic technology, statistical methods and various hybrid approaches, and it has become a topic of great interest for many companies. The purpose is to extract opinions and sentiments from text and determine whether a product or a service is viewed positively or negatively. This area has gained a lot of prominence in the last three to four years, thanks to social media such as Twitter, Facebook, blogs, newsfeeds and customer review sites, which have provided enough data to do sentiment analysis. Has it really arrived for mainstream adoption? Can you really trust automated sentiment scoring and get actionable insights from it? How many sectors and companies are really using it? And how? Is it mainly about the accuracy of the technology? What level of human analysis is required along with the automation? Is there money for vendors in this space? There are just too many open questions if you ask most people outside the vendor and research community.

Now we even have a symposium dedicated to it, which was recently held in New York City with good representation from vendors and researchers in this space. Many companies are in this space, including Lexalytics, Brandwatch, Attensity, Scoutlabs, Istrategy labs, Sentiment 360, Saplo, Serendio and many others from the text analytics and natural language processing world, all positioning themselves to take advantage of this promising market. Despite various claims, it is not easy to determine whose algorithm is better or how good their machine learning and language processing modules are. Right now there are hardly any standards for comparing them, and I doubt that we will see any in the near future. To date, I have seen sentiment analysis applied mostly in the context of reputation management or understanding the voice of the customer, for example:
  • Which conversations are relevant and active
  • Which products are most talked about
  • Sentiments behind each product
  • Intentions to buy product
  • Top issues, cries for help
  • Product suggestions
  • Early warning of issues

Even the most optimistic researchers and vendors agree that it is not an exact science, and according to them even seventy-five percent accuracy is hard to get - basically, the law of diminishing returns sets in if you want to stretch beyond it. Linguistic nuance, the ambiguity of language, culture and the difficulty of getting the right context are often cited as the reasons for the lack of accuracy. Overall, it is still worth doing sentiment analysis when you have to deal with piles of content in this web 2.0 and beyond world. Other than accuracy, there are many more challenges, like:

  • How do you segment the audience? I have a hard time understanding the many sentiment analysis graphs that tell me nothing about the profile of the people whose sentiment is being tracked.
  • You do get a broad idea of the polarity of a sentiment, but it is very difficult to understand the degree of emotion.
  • The industry is realizing that sentiment analysis has to be used in conjunction with other research techniques, though it is still not clear or defined how that will be done in a repeatable manner.
  • How do you track and interpret sentiments about a product on a global basis? It is not just about supporting international languages. There are various instances where a product has done well in the US but has failed in a different country, or vice versa.
  • Do you apply the same criteria of sentiment analysis whether the subject is a product, a person, a service, or a social or international issue?
  • Technical integration with CRM or business intelligence will not be an issue. But what should be the criteria for assigning a weight to a sentiment?
  • Sentiments can vary considerably if you measure them over different durations. In this article from the BBC, it is much easier to measure the positive and negative sentiments associated with Gordon Brown during the two hours of the UK election debate. It seems there were negative comments when he expressed his views about immigration. It just proves that an opinion about a famous personality also has a time dimension. A sentiment about a product, though, can't vary within hours, least of all when it comes from the same person.
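
To make the time-dimension point in the last item a bit more concrete, here is a rough sketch of how sentiment could be summarized per time window instead of as one overall number. It is only an illustration under my own assumptions: the function name net_sentiment_by_window, the 30-minute window and the sample mentions are all made up, and it assumes each mention has already been labeled positive, negative or neutral by some tool or by a person.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def net_sentiment_by_window(mentions, window_minutes=30):
    """Return (window_start, net_sentiment) pairs in time order.

    mentions: list of (timestamp, label) pairs, where label is
    "positive", "negative" or "neutral" (labels assumed to be given).
    Net sentiment = (positives - negatives) / mentions in the window.
    """
    if not mentions:
        return []
    start = min(ts for ts, _ in mentions)
    window = timedelta(minutes=window_minutes)

    # Group labels by how many whole windows have passed since the first mention.
    buckets = defaultdict(list)
    for ts, label in mentions:
        buckets[(ts - start) // window].append(label)

    result = []
    for index in sorted(buckets):
        labels = buckets[index]
        positives = labels.count("positive")
        negatives = labels.count("negative")
        result.append((start + index * window, (positives - negatives) / len(labels)))
    return result


# Made-up mentions, for illustration only (not real debate data).
sample_mentions = [
    (datetime(2010, 1, 1, 20, 5), "positive"),
    (datetime(2010, 1, 1, 20, 20), "positive"),
    (datetime(2010, 1, 1, 20, 50), "negative"),
    (datetime(2010, 1, 1, 21, 0), "negative"),
    (datetime(2010, 1, 1, 21, 10), "positive"),
]

for window_start, score in net_sentiment_by_window(sample_mentions):
    print(window_start, score)
```

Averaged over the whole period these mentions look mildly positive, but the per-window view shows a clearly negative stretch in the middle, which is the kind of swing a single overall score would hide.
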
Mid-March brought a turn in public sentiment towards Obama from "majority approving" to "majority disapproving" in Gallup: http://www.gallup.com/poll/113980/Gallup-Daily-Obama-Job-Approval.aspx. Gallup tracks daily the percentage of Americans who approve or disapprove of the job Barack Obama is doing as president. Results are based on telephone interviews with approximately 1,500 national adults. If you compare these results with TipTop, a sentiment engine based on Twitter (http://feeltiptop.com/obama.php), the results are not way off. That may be surprising, but it tells me that sentiment analysis of public opinion about a burning social issue or a famous personality is relatively easier. It is also easier to get sentiment about a popular consumer product like the iPad or iPhone, probably because you get lots of data and strong opinions. On the other hand, if you read this article, it tells you that the results produced by the machine in a certain scenario turned out to be crap once humans analyzed them. So there are definitely sceptics out there who will question the value of this technology.
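
One simple way to check machine output against human analysis is to have people label a sample of the same items and count how often the two agree. The sketch below is only an illustration under my own assumptions: the function names agreement_rate and disagreements are hypothetical, the labels are made up, and no vendor mentioned here is known to evaluate its product this way.

```python
from collections import Counter


def agreement_rate(machine_labels, human_labels):
    """Fraction of items where the machine label matches the human label."""
    if len(machine_labels) != len(human_labels):
        raise ValueError("both label lists must have the same length")
    if not human_labels:
        raise ValueError("need at least one labeled item")
    matches = sum(m == h for m, h in zip(machine_labels, human_labels))
    return matches / len(human_labels)


def disagreements(machine_labels, human_labels):
    """Count (human, machine) label pairs for the items where the two differ."""
    return Counter(
        (h, m) for m, h in zip(machine_labels, human_labels) if m != h
    )


# Made-up labels for five items, for illustration only.
human = ["positive", "negative", "negative", "neutral", "negative"]
machine = ["positive", "negative", "positive", "neutral", "neutral"]

print(agreement_rate(machine, human))
print(disagreements(machine, human).most_common())
```

The agreement rate gives the headline number, while the disagreement counts show where the machine goes wrong, for example negative comments that it reads as positive, which is where sarcasm and context problems tend to show up.
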

Overall, this discipline shouldn't be oversold, and it needs to go beyond technologists. It is still at an early stage, but it is going to stay because the business case is solid. It needs to be positioned as an aid to human analysis rather than a standalone discipline. Someone rightly said about sentiment analysis - "I think of these tools like a metal detector, sure it beeps when it finds something, but it’s still up to you to expend the energy to dig in up to your elbows and find the nugget." For it to become mainstream, marketers, product managers and analysts need to work in tandem with technologists and develop methods to measure its effectiveness in the long run. For mainstream adoption, more customized use cases have to be thought of, and this technology has to be applied in a very pragmatic and realistic way to succeed.

In my opinion, voice of the customer and reputation management are just too crowded, and new areas like financial services seem promising, though there are already products like Reuters Newscope playing in the financial services market. Pharma is another area, but I have heard that there are still open regulatory issues that need to be sorted out. I often think that at some point in the future, sentiment analysis could also be integrated with early warning systems or some flavor of predictive analytics.

The topic of sentiment analysis always reminds me of a saying from Bertrand Russell, one of the founders of analytic philosophy: "The fact that an opinion has been widely held is no evidence whatever that it is not utterly absurd."

4 comments:

  1. Priyank-
    Interesting post... I wanted to clarify the Sentiment360 service offering as it is pertinent to your argument. Sentiment360 was created expressly to add the human analysis component to automated search tools. As a "tool agnostic" service we've found that human involvement will allow for accuracy of reporting that is at least 50% better than machine alone (often the numbers are much higher than that).
    Our methodology study for CBS has demonstrated that machines cannot discern nuanced conversations, use of symbolism, video and images. Human analysis is currently the only viable solution. Our use of market research professionals based in Manila allows us to do so in a cost effective manner. The bottom line is that any company that is making decisions based on machine-only sentiment may be making judgment calls based on misinformation.
    Scott

  2. Another marvelous post, Priyank. Thanks for mentioning some of TipTop's results. I, for one, have complete confidence that my algorithms do solve satisfactorily the sentiment detection problem in 99% of cases of real-world applications. Anyone who thinks that TipTop will not produce the results they want should let me know what problem exactly they want to solve and why. I will be happy to review and get back to them. Thanks.

  3. Hi Priyank,

    I enjoyed your article and agree the field is close; the same could be said for social media expertise, search engine optimisation etc - while there may be a need for these things, there is also a gold-rush effect. I've been looking at the nascent w3c specifications for emotionML, which addresses a need to capture emotion data and cross-reference it back to audio and video recordings etc... it doesn't provide a consistent vocabulary, which I think would be perhaps the most useful facet of the exercise, but I do think the approach of decoupling overall emotional response from sentiment and relevance makes sense, then using the more general data to determine relevance.

    Anyways, thanks for writing this - any post that quotes Bertrand Russell scores immediate points in my book!

  4. OK - but does it work? On StackOverflow I read about 61% success rates. Congrats to ScoutLabs for being acquired, but I wouldn't pay for that.
