In many industries, automation has for many years replaced old-fashioned human labour and brought efficiencies and cost savings. But while computers can be faster and more accurate than humans at many things, will they ever do what human beings do best? Will computers ever think? In 1950, mathematician Alan Turing proposed that if, during a text-based conversation, a machine is indistinguishable from a human, then it could be said to be thinking. He devised his famous Turing Test and predicted that by 2000 computer technology would have advanced so much that 30% of human judges would be fooled into thinking that a machine was human.
So far the Turing Test has never been passed. Last weekend, controversial scientist Kevin Warwick conducted the latest set of tests as part of the 18th Loebner Prize for artificial intelligence. Unfortunately even the winning entry, a robot called Elbot, managed to convince only 25% of the judges and so officially failed the test. “We really, really have come very close,” said Warwick, although it should be noted that the judging panel, made up of computer experts and journalists, numbered just 12!
When it comes to media analysis, computers have helped enormously, but full automation has proven difficult given the complex task of “understanding” language and therefore attributing the correct sentiment, messages and topics. The result has been a reliance on human analysts to read every article, which means that the cost of an analysis program scales in proportion to the volume of coverage. When you are interested in measuring your own organisation in the relatively small universe of traditional media this is fine, but what if you want to benchmark yourselves against competitors? What if you want to extend it to social media? How do you measure the whole of the internet? Since the cost/benefit equation becomes an issue, maybe it is time to look at automation again.
Indeed, many of the social-media specialist companies have gone down the automation route. In Nathan Gilliatt’s guide to Social Media Analysis 2008, at least 23 out of the 64 companies featured use some form of automated analysis.
So just how good have computers got? There have certainly been significant developments over the past few years, particularly in the areas of natural language processing and machine learning. Topic identification and entity extraction have been widely researched, while measuring sentiment has also received a fair amount of attention. In a series of experiments between 2002 and 2004, Bo Pang and Lillian Lee from Cornell University used a variety of machine learning techniques on positive and negative film reviews. They managed to produce accuracy rates of between 75% and 86%.
This can of course be applied to measuring sentiment for organisations in media coverage. A number of generic tools exist for doing exactly that, with similar results. For example, Corpora Software (who were bought by Infonic and are now part of Lexalytics) have claimed an accuracy of 75%.
One problem with sentiment is that what is favourable in one domain is not necessarily favourable in another. For example, while the word ‘cancer’ would usually be an indicator of negative sentiment, for a cancer charity this would probably not be the case. Domain-specific machine learning can help with this issue. We recently conducted an experiment on coverage of a leading computer anti-virus company. Using a generic sentiment tool we managed to get an accuracy of 79%. However, many words that would usually be considered negative (‘virus’, ‘attack’, ‘malware’) are often found in positive articles. By ‘training’ the sentiment model on more relevant coverage we managed to increase the accuracy to 92%!
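The effect of domain-specific training can be illustrated with a toy example. The sketch below is not the tool or the data used in our experiment. It is a minimal Naive Bayes classifier written for illustration, and the class name, the headlines and the labels are all invented. It simply shows how the same words can point in opposite directions depending on the coverage the model was trained on.

```python
import math
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


class NaiveBayesSentiment:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, labelled_docs):
        for text, label in labelled_docs:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocab.add(word)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented "generic" coverage: security words only appear in bad news.
generic_docs = [
    ("virus attack causes damage", "negative"),
    ("malware attack hits users", "negative"),
    ("great product wins award", "positive"),
    ("strong results praised", "positive"),
]

# Invented anti-virus coverage: the same words often appear in good news.
domain_docs = [
    ("product blocks virus attack", "positive"),
    ("software stops malware attack", "positive"),
    ("update removes virus and malware", "positive"),
    ("product fails to detect virus", "negative"),
    ("scanner misses malware attack", "negative"),
]

generic_model = NaiveBayesSentiment()
generic_model.train(generic_docs)

domain_model = NaiveBayesSentiment()
domain_model.train(domain_docs)

headline = "new release blocks malware attack"
generic_label = generic_model.classify(headline)
domain_label = domain_model.classify(headline)
print("generic model:", generic_label)
print("domain model: ", domain_label)
```

On this made-up data the generic model marks the headline as negative, while the model trained on anti-virus coverage marks it as positive. A real system would need far more training data and a proper held-out test set before any accuracy figure could be quoted.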
So we know how good computers are, but how good are humans? Corpora reckoned that there was an 82% chance of two or more human analysts agreeing with each other, while in a piece of research from the Natural Language Processing team at Microsoft, human analysts agreed on average just 74% of the time. To quote Microsoft: “this suggests that the task of sentiment classification is difficult even for people”. From our own experience, this is pretty low. A tightly defined brief and a good system that reduces subjectivity, backed up with extensive quality control, will result in much better accuracy, and certainly better than we can get with a computer.
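For anyone wanting to check agreement between their own analysts, the calculation is simple. The sketch below is an illustration with invented labels, not a reconstruction of how Corpora or Microsoft arrived at their figures. It computes the plain percentage agreement between two analysts and, as an extra that the figures above do not mention, Cohen's kappa, which discounts the agreement you would expect by chance alone.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Share of articles on which both analysts gave the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Agreement between two analysts, corrected for chance agreement."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b)
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(
        freq_a[label] * freq_b[label] for label in set(labels_a) | set(labels_b)
    ) / (n * n)
    if expected == 1:
        return 1.0  # both analysts used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented sentiment labels from two hypothetical analysts on ten articles.
analyst_1 = ["pos", "pos", "neg", "neu", "neg", "pos", "neu", "neg", "pos", "neg"]
analyst_2 = ["pos", "neg", "neg", "neu", "neg", "pos", "pos", "neg", "pos", "neu"]

agreement = percent_agreement(analyst_1, analyst_2)
kappa = cohens_kappa(analyst_1, analyst_2)
print(f"agreement: {agreement:.0%}, kappa: {kappa:.2f}")
```

In this example the two analysts agree on 7 of the 10 articles, but once chance agreement is removed the kappa score is noticeably lower, which is one reason a raw agreement percentage can flatter both humans and machines.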
US measurement expert Katie Paine has said: “We’ve gotten very good at teaching computers to understand words, the problem is that they don’t understand the nuances of conversations. Computers still can’t tell the difference between sarcasm and irony. And throw in slang and you have an even bigger problem. Facebook, HP and Microsoft did extensive research before selecting measurement tools and all three insisted on human analysts. So I ask you if some of the leading players in technology don’t trust computers, why should you?”
What about my own thoughts? Are computers as good as humans for analysis? No. Would I trust a computer to measure my most important media? No. However, for getting a litmus test on the ‘long tail’, be it social media or competitor benchmarking, there is a definite use for automated analysis. Finally, I feel that this is not a black and white issue: it is not a straight choice of computers vs humans. Why not get the best of both worlds with a hybrid model? That means an emphasis on humans for the important stuff, and more automation combined with a certain amount of human checking for the rest.
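As a rough illustration of what such a hybrid model might look like in practice, the sketch below routes each article either to a human analyst or to the automated tool. Everything in it is an assumption made for the example: the `Article` fields, the idea that the tool reports a confidence score, and the threshold and spot-check rate are all hypothetical, not a description of any particular product.

```python
import random
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    is_priority: bool       # hypothetical flag, e.g. key outlet or own-brand coverage
    auto_sentiment: str     # label from the automated tool
    auto_confidence: float  # assumed 0..1 confidence reported by the tool


def route(article, confidence_floor=0.8, spot_check_rate=0.1, rng=random.random):
    """Decide who scores an article. All thresholds are illustrative assumptions."""
    if article.is_priority:
        return "human"             # the important stuff always gets a human
    if article.auto_confidence < confidence_floor:
        return "human"             # the tool is unsure, so a human decides
    if rng() < spot_check_rate:
        return "human_spot_check"  # random sample for quality control
    return "automated"


key_story = Article("Front-page profile of the CEO", True, "positive", 0.95)
long_tail = Article("Forum post mentioning a competitor", False, "neutral", 0.90)
unclear = Article("Sarcastic blog comment", False, "positive", 0.55)

for item in (key_story, long_tail, unclear):
    print(item.title, "->", route(item))
```

The spot-check sample is what keeps the automated part honest: if the human checkers disagree with the tool too often, the thresholds can be tightened so that more coverage goes back to the analysts.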
As to the future, as technology advances we may well see the removal of humans from the analysis ‘engine’. However, since even the mighty Alan Turing got his prediction wrong, you might not want to bet on this.