Semantic Search: Finding Stuff and Creating More Businesses in this Flat World!

"If you want more good jobs then spawn more Steve Jobs," says Thomas Friedman, the author of "The World Is Flat," in this new article in the NY Times. Not everyone is a genius like Steve Jobs, can put in 10,000 hours, or is part of the 1955 club, which were the key characteristics of these successful entrepreneurs as pointed out by Malcolm Gladwell in his interesting book "Outliers." Not every company becomes Apple, either. What the US really needs, more than ever, is more small businesses, which are one of the driving forces behind its economy. Here are some facts about small businesses in the US:
  • Employ just over half of the country’s private sector workforce
  • Hire 40 percent of high tech workers, such as scientists, engineers and computer workers
  • Include 52 percent home-based businesses and two percent franchises
  • Represent 97.3 percent of all the exporters of goods
  • Represent 99.7 percent of all employer firms
  • Generate a majority of the innovations that come from United States companies

We need more entrepreneurs than ever who can identify opportunities worldwide and develop them into profitable ventures. Where do you start? How do you get the information? How do you know about the gaps in the markets for particular products and services? What should be the focus area? Has it been done before? Who can partner with you? There are many more questions you can think of. I understand that you don't start every business just by searching on the web, as there are other important things like personal contacts, capital, your network, your own experience, trade associations, etc. But search on the web is increasingly becoming the major starting point for many of these activities. It is more relevant than ever to do the research, because everything you want to do can be outsourced, can be imported, may already exist somewhere, or may face demand that is about to go away while you are blissfully unaware. Not that it is easy to figure this out, but at least we should have more resources than just the big three search engines - Google, Yahoo and Bing.

Try to find or research that information on these engines and you will see how hard it is. All three have done a good job over the last many years, but they can't continue to be all things to all people in all contexts. There has also been too much emphasis on ranking. I personally like Google, but it definitely falls short as far as exploration and interactivity are concerned. Bing, which calls itself a decision engine, has shown some very good improvements in the last year. Also, somehow the big three have ended up promoting a marketing view of search on the web, and they thrive on the tension created between themselves and SEO consultants/advertisers. It seems that the media and analyst community are also too concerned with glorifying every percentage gain by Bing over Yahoo, as you can see in this news and many others. Maybe "it's not just about search, it's about business" is what Michael Corleone (from the movie The Godfather) would have said if he worked for Google in this era.

In general, I have seen a very narrow view of search on the web from a school of thought which believes that whatever can be done in web search will be confined to these three - they think that new entrants will make some noise initially and then go away quietly. I completely disagree. In my opinion, it is no different from the view in the eighteenth century, when there was a school of thought which believed that whatever human beings could think of or invent had already been done and there was no scope for anything new. That sounds ridiculous if you evaluate the progress mankind has made since then!

A lot has been written about the benefits of semantic search and how it is better than keyword-based search. I have also written in one of my previous articles about the "semantics" in semantic search. Recently, Seth Grimes also compiled a very good article about the types of semantic search. So I am not going to talk about what semantic search is, but rather about the opportunities for the new breed of semantic search engines.

Occasionally, you do see articles like "semantic search engines which will change the world," which list the new breed of semantic search engines - some call them Google killers. I always wonder why these semantic search engines have not been able to make a measurable impact yet. Some of these engines have very good technology, though the list leaves out many others which are out there. The internal details of most of these engines are still proprietary, and they combine natural-language processing with various flavours of semantics. One more semantic engine which I like is TipTop, which mines the Twitter database and also does sentiment analysis. But there are more than six Twitter search-based products in the market, as you can see in this review - the product from TipTop is not even mentioned there, while the other products may not be doing semantic search. Now even Bing has a product which searches Twitter. So what can be the next step for these new semantic search engines?

In my opinion, the big issue is the lack of focus at some of these "semantic search" startups and their obsession with boiling the ocean. Many of them waste a lot of time comparing themselves with Google. Some of them are also trying to do very similar things. The market will continue to get crowded with semantic search engines in the next few years, and there is a risk that many companies with excellent technologies will get lost in the crowd. Very soon, every search engine will start calling itself a semantic search engine, the same way every SaaS offering is a Cloud offering nowadays. The other issue is the lack of understanding among businesses and users about what "semanticity" in search engines really means. Ideally, semantic search engines should combine some aspects of natural language, context (a focus on disambiguating queries), ontologies and reasoning. The hard part is always understanding how much work is required to customize the technology to incorporate all these aspects so that it is relevant for a particular domain.
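To make the "contextual" and "ontology" aspects a little more concrete, here is a minimal sketch in Python of how an ambiguous query term might be disambiguated using the other words in the query. Everything in it is an assumption made for illustration: the tiny `ONTOLOGY` dictionary, the `disambiguate` function and the overlap-counting rule are hypothetical, and real engines keep their methods proprietary and are certainly far more sophisticated.

```python
# Toy ontology: each sense of an ambiguous term lists related concepts.
# All names and data here are hypothetical, for illustration only.
ONTOLOGY = {
    "jaguar": {
        "animal": {"cat", "wildlife", "habitat", "predator"},
        "car": {"dealer", "engine", "price", "sedan"},
    },
    "apple": {
        "fruit": {"orchard", "pie", "juice", "harvest"},
        "company": {"iphone", "mac", "stock", "jobs"},
    },
}


def disambiguate(query):
    """Return {term: sense} for each ambiguous term found in the query.

    A sense is picked by counting how many of its related concepts
    appear among the other query words. If no sense has any overlap,
    or several senses tie, the term maps to None (still ambiguous).
    """
    words = query.lower().split()
    result = {}
    for term in words:
        senses = ONTOLOGY.get(term)
        if senses is None:
            continue  # not an ambiguous term we know about
        context = set(words) - {term}
        scores = {sense: len(related & context)
                  for sense, related in senses.items()}
        best = max(scores.values())
        winners = [sense for sense, score in scores.items() if score == best]
        result[term] = winners[0] if best > 0 and len(winners) == 1 else None
    return result


if __name__ == "__main__":
    print(disambiguate("jaguar dealer price"))
    print(disambiguate("apple pie"))
    print(disambiguate("jaguar"))
```

Even this toy version hints at the customization effort involved: the ontology has to be built and maintained for each domain, which is exactly where a vertical focus pays off.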

There is a great opportunity for this new breed of semantic search engines to rethink their strategy. They, too, should not try to be all things to all people. They really need to carve out a space for themselves in specific segments. If they go after enterprises, they will face stiff competition from the big three in the enterprise - SharePoint/FAST, Autonomy and Endeca - which have hundreds of customers and have evolved over the years. Among them, Autonomy has done a great job in the e-discovery space by following a vertical strategy. Even companies like MarkLogic, with its powerful XML server, can solve many search-related problems for unstructured content. In my opinion, semantic search startups can always continue to tweak their algorithms and enhance semantics, but simultaneously the focus should be on verticalization, branding, strategy, positioning, etc. An application-centric or vertical strategy will be better for them than a platform-centric strategy. They can also think about merging what is on the web with enterprise data/content to give extended BI insights, which is still a new area for developing powerful applications, though I can count a few small companies in this area and even big ones like Business Objects after its acquisition of Inxight. Still, there is ample opportunity to think creatively and develop useful analytics-based applications for the enterprise.

Recently, the Financial Times launched Newssift (still in Beta), a business news semantic search engine which indexes thousands of news sources worldwide. They have used Endeca technology for faceted search, and sentiment analysis is provided by Lexalytics. It can be a useful tool, but more can be done in this area. If any of these new semantic search engines can correlate historical data with data acquired from multiple sources to identify patterns and indicate important events, then it can be a killer application on Wall Street. Unstructured data is already being leveraged in electronic trading strategies, but the adoption is not so fast. Generating alpha from a stream of unstructured data is not an easy task, but it is a great opportunity.
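As a rough illustration of comparing incoming data against history to indicate important events, here is a minimal Python sketch. It assumes, purely hypothetically, that each news item has already been reduced to a numeric sentiment score for a company, and it flags incoming scores that sit far from the historical average using a simple z-score. The function name `flag_events`, the threshold and the sample numbers are all my own assumptions, not how Newssift, Endeca, Lexalytics or any trading desk actually works.

```python
from statistics import mean, stdev


def flag_events(history, incoming, threshold=2.0):
    """Flag incoming sentiment scores that deviate sharply from history.

    history  : past sentiment scores (needs at least two values)
    incoming : newly acquired scores to check against that history
    Returns a list of (index, score, z) for scores whose absolute
    z-score is at or above the threshold.
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation
    if sigma == 0:
        return []  # flat history: no scale to measure deviation against
    flagged = []
    for i, score in enumerate(incoming):
        z = (score - mu) / sigma
        if abs(z) >= threshold:
            flagged.append((i, score, round(z, 2)))
    return flagged


if __name__ == "__main__":
    # Hypothetical daily sentiment scores for one company.
    history = [0.10, 0.20, 0.15, 0.05, 0.10, 0.20]
    incoming = [0.12, -0.90, 0.18]
    for index, score, z in flag_events(history, incoming):
        print(f"possible event at item {index}: score={score}, z={z}")
```

A single-series z-score is about the simplest pattern detector there is. The hard and valuable part is everything upstream of it: extracting reliable signals from unstructured text in the first place.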

Another set of innovative companies I want to mention in this context is Bintro, TrialX and Echo Nest. Bintro matches you to what you are looking for, such as employment, partnerships, investment and joint ventures. TrialX is a free service that matches participants to relevant clinical trials based on their personal health information. TrialX uses a comprehensive database of 25,000+ clinical trials approved by the Food and Drug Administration (FDA) in the United States. Echo Nest helps you find your audience with targeted music promotion. It claims that it can understand every music writer on the web (bloggers, review sites, etc.) and helps you find the writer most likely to review your music. Again, a very useful way to leverage semantic technology to solve problems in a particular domain!

In the end, I believe that there is scope for hundreds of similar applications of semantic search engines, and they can happily coexist with Google, Yahoo and Bing. We will also see very interesting changes once the data web evolves and semantic markup becomes more prevalent.

