The Open Calais initiative, started by Thomson Reuters, is one of the most interesting developments in favour of the Semantic Web vision. Reuters acquired the technology when it bought ClearForest, one of the leading vendors in the text analytics space. It has been said that the service could quickly become the largest repository of metadata (in the form of named entities and facts) on the Web if it stored the resulting metadata from each request. Open Calais is a "metadata extraction service": a Web service that automatically annotates content and extracts information such as facts and named entities (people, places, organizations, and much more) from unstructured text. Calais applies linguistic parsing (including entity extraction) in a service-enabled way to produce RDF triples and Semantic Web data models.
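To make the idea of turning extracted entities into RDF triples concrete, here is a minimal sketch in Python. It does not call the Calais service and does not reproduce its actual output vocabulary. The `Entity` class, the `to_ntriples` function, and the `example.org` namespaces are all hypothetical names chosen for illustration. It only assumes that an extraction step has already returned a list of named entities with a type, and shows how each one could be written as two N-Triples statements.

```python
from dataclasses import dataclass

# Hypothetical namespaces for illustration only; these are NOT the
# identifiers that the Calais service itself emits.
BASE = "http://example.org/entity/"
TYPE_NS = "http://example.org/type/"

# Standard RDF and RDFS predicates.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


@dataclass(frozen=True)
class Entity:
    """A named entity as an extraction step might return it (assumed shape)."""
    name: str
    kind: str  # e.g. "Person", "Company", "City"


def slug(text):
    """Replace every non-alphanumeric character so the text is safe in an IRI."""
    return "".join(ch if ch.isalnum() else "_" for ch in text)


def escape_literal(text):
    """Escape backslashes, double quotes, and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(entities):
    """Return two N-Triples lines per entity: its type and its label."""
    lines = []
    for entity in entities:
        subject = f"<{BASE}{slug(entity.name)}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{TYPE_NS}{slug(entity.kind)}> .")
        lines.append(f'{subject} <{RDFS_LABEL}> "{escape_literal(entity.name)}" .')
    return lines


if __name__ == "__main__":
    found = [Entity("Thomson Reuters", "Company"), Entity("CNET", "Company")]
    for line in to_ntriples(found):
        print(line)
```

Each entity becomes a subject with one type statement and one label statement. A real service would use its own stable identifiers and a much richer vocabulary, but the subject, predicate, object shape is the same.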
Open Calais opens the door to lowering the barrier enough for everyday users to publish semantic content. It finally addresses what critics call the greatest obstacle to the Semantic Web: it takes the metadata burden off the end user by providing an automatic meta-tagging tool. The Open Calais initiative will also be one of the biggest enablers of the Linked Data initiative.
CNET recently joined the OpenCalais initiative as one of the first commercial media companies to publish core data assets for public, programmatic use on the open Semantic Web. CNET will leverage OpenCalais' connection to the rapidly expanding 'Linked Data cloud' to make its original content available for public use, including tech product reviews of laptops, TVs, smart phones, and digital cameras; news articles and blog posts from its CNET News editorial staff; and parts of its core technology product catalog.