Semantic Technology, Financial Reporting and the Toxic Assets!

Financial markets, traditionally the earliest adopters of new technology relative to other industries, have been laggards as far as Semantic Technology is concerned. It seems that the turmoil in the capital markets over the last two years has dampened enthusiasm for new technologies in the capital markets and banking industry. All of that is about to change! The two obvious reasons are that we are coming out of recession and that new regulations require financial reporting in XBRL. I believe the third reason is the inherent limitation of XBRL as far as semantics is concerned!

As we know, XBRL solves two significant problems for companies that prepare financial statements, as well as for analysts, investors, regulators, financial publishers and data aggregators:
  • The first problem is that preparing a financial statement for printing, for a Web site, and for filing today means that a company typically has to enter the same information three times.
  • The second problem is that today (if the report is not in XBRL), extracting specific detailed information from a financial statement is hard. For example, we still can't ask questions like "Give me the depreciation expense from a 2003 financial report."
The basic idea behind XBRL is to provide a grammar and syntax for financial reporting so that the reported data can be extracted, analyzed and queried. Although XBRL has been around for 10 years now, adoption and acceptance only began to accelerate significantly in 2007, with the support of the SEC. Since 2009, filing in XBRL has been mandatory for the five hundred largest US corporations, and other companies will follow in phases from 2010 onwards. The market is already flooded with XBRL products, services and tools. Most of them help with one or more of the following: creation, viewing, analysis, taxonomy creation, custom document creation and various other automation features. XBRL can be stored in an RDBMS as well as in XML databases like MarkLogic.
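To make the "depreciation expense from a 2003 report" question concrete, here is a minimal sketch of how tagged data can be queried once it is in XBRL. The instance document is heavily simplified, and the `ex` taxonomy namespace, the `DepreciationExpense` concept name, the entity `ACME` and all the numbers are made up for illustration. Only the `xbrli` and `iso4217` namespace URIs are the real ones. A production system would use a proper XBRL processor rather than plain XML parsing.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical XBRL-style instance (not a valid filing).
INSTANCE = """<xbrli:xbrl xmlns:xbrli="http://www.xbrl.org/2003/instance"
            xmlns:iso4217="http://www.xbrl.org/2003/iso4217"
            xmlns:ex="http://example.com/taxonomy">
  <xbrli:context id="FY2003">
    <xbrli:entity>
      <xbrli:identifier scheme="http://example.com/ids">ACME</xbrli:identifier>
    </xbrli:entity>
    <xbrli:period>
      <xbrli:startDate>2003-01-01</xbrli:startDate>
      <xbrli:endDate>2003-12-31</xbrli:endDate>
    </xbrli:period>
  </xbrli:context>
  <xbrli:context id="FY2002">
    <xbrli:entity>
      <xbrli:identifier scheme="http://example.com/ids">ACME</xbrli:identifier>
    </xbrli:entity>
    <xbrli:period>
      <xbrli:startDate>2002-01-01</xbrli:startDate>
      <xbrli:endDate>2002-12-31</xbrli:endDate>
    </xbrli:period>
  </xbrli:context>
  <xbrli:unit id="USD"><xbrli:measure>iso4217:USD</xbrli:measure></xbrli:unit>
  <ex:DepreciationExpense contextRef="FY2003" unitRef="USD" decimals="0">125000</ex:DepreciationExpense>
  <ex:DepreciationExpense contextRef="FY2002" unitRef="USD" decimals="0">118000</ex:DepreciationExpense>
</xbrli:xbrl>
"""

# Prefix-to-namespace map used by the ElementTree path expressions below.
NS = {
    "xbrli": "http://www.xbrl.org/2003/instance",
    "ex": "http://example.com/taxonomy",  # hypothetical taxonomy
}


def find_facts(root, concept, year):
    """Return the values of `concept` whose reporting period ends in `year`."""
    # Map each context id to the year in which its period ends.
    period_year = {}
    for ctx in root.findall("xbrli:context", NS):
        end = ctx.find("xbrli:period/xbrli:endDate", NS)
        if end is not None:
            period_year[ctx.get("id")] = int(end.text[:4])

    # Keep only the facts whose contextRef points at a matching period.
    return [
        int(fact.text)
        for fact in root.findall(concept, NS)
        if period_year.get(fact.get("contextRef")) == year
    ]


root = ET.fromstring(INSTANCE)
depreciation_2003 = find_facts(root, "ex:DepreciationExpense", 2003)
print(depreciation_2003)
```

The point of the sketch is that the question is answered by looking up a tag and its context, not by scraping a printed statement.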

So what is the problem? Why do we need Semantic Technology in this context? While XBRL allows for more accurate consumption and interpretation of financial information, there is still a need to connect to the authentic source of the document and to recombine the XBRL content with other data sources. The fundamental issue is that an XBRL document, when it has to work with other data sources, carries nothing about the semantics of the data. There is no meaning associated with the nesting of tags. The limitation becomes more obvious when you have to use, analyze or query XBRL reports alongside other data sources that are not XBRL compliant.

If you read this article in the Wall Street Journal on toxic assets, it will make you think more clearly about the importance of "semantics" in reporting in the world of derivatives. The key points are:

  1. Ever since humans started trading, lending and investing beyond the confines of the family and the tribe, we have depended on legally authenticated written statements to get the facts about things of value.
  2. Derivatives are the root of the credit crunch. Why? Unlike all other property paper, derivatives are not required by law to be recorded, continually tracked and tied to the assets they represent. Nobody knows precisely how many there are, where they are, and who is finally accountable for them.
  3. Every financial deal must be firmly tethered to the real performance of the asset from which it originated.
  4. All documents and the assets and transactions they represent or are derived from must be recorded in publicly accessible registries.
  5. Governments can encourage assets to be leveraged, transformed, combined, recombined and repackaged into any number of tranches, provided the process intends to improve the value of the original asset.
  6. Financial institutions will have to serve society and fully report what they own and what they owe -- just like the rest of us -- so that we get the facts necessary to find our way out of the current maze.
  7. Governments can no longer tolerate the use of opaque and confusing language in drafting financial instruments. Clarity and precision are indispensable for the creation of credit and capital through paper.
XBRL by itself can't fulfill all of these requirements, because we need to correlate, link and resolve various reports back to the source data, which is a very important thing in the world of derivatives reporting. You need Semantic Technology for that! We need to represent XBRL in RDF or OWL. I also recommend that my readers look at "rebuilding public trust", a nice article on the same topic. Its author also talks about services that can allow financial data in XBRL to be combined with data from other industry and government sectors, basically transforming the way we explore information.

There are various techniques for converting an XBRL document to RDF. I will not go into those details in this post. One example is a tool that automates XBRL tagging of Excel data in RDF format, with one-click "Save As XBRL" functionality.
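Without going into any particular conversion technique, the general shape of the idea can be sketched in a few lines: each XBRL fact becomes a resource, and its concept, entity, period, value and unit become statements about that resource. The sketch below is only an illustration, not one of the published mappings. The `http://example.com/xbrl/` namespace, the property names and the sample fact are all assumptions. The output is serialized as N-Triples using plain string formatting.

```python
# Hypothetical namespace for the generated resources and properties.
EX = "http://example.com/xbrl/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def fact_to_ntriples(fact_id, fact):
    """Turn one already-extracted XBRL fact (a dict) into N-Triples lines."""
    subject = f"<{EX}fact/{fact_id}>"
    return [
        # What is being reported (the taxonomy concept).
        f'{subject} <{EX}concept> <{EX}concept/{fact["concept"]}> .',
        # Who is reporting it.
        f'{subject} <{EX}entity> <{EX}entity/{fact["entity"]}> .',
        # The end of the reporting period, as a typed date literal.
        f'{subject} <{EX}periodEnd> "{fact["period_end"]}"^^<{XSD}date> .',
        # The reported amount, as a typed decimal literal.
        f'{subject} <{EX}value> "{fact["value"]}"^^<{XSD}decimal> .',
        # The unit of measure, kept as a plain literal for simplicity.
        f'{subject} <{EX}unit> "{fact["unit"]}" .',
    ]


# Made-up sample fact, as it might come out of an XBRL instance document.
sample_fact = {
    "concept": "DepreciationExpense",
    "entity": "ACME",
    "period_end": "2003-12-31",
    "value": "125000",
    "unit": "USD",
}

triples = fact_to_ntriples("f1", sample_fact)
for line in triples:
    print(line)
```

Because the entity and the concept are URIs rather than strings buried in a document, other datasets can refer to exactly the same things.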

I believe that the long-term (probably very long-term) vision for XBRL reports should be to publish them as RDF triples and make them part of the Linked Data cloud. This will help in achieving linkage, transparency and verification as far as financial reporting is concerned. I would like sceptics to know that by April 2009, more than 600 XBRL reports, approximately 1.3 million RDF triples, were already part of the Linked Data cloud. At the same time, you need a lot more governance, regulation and process behind this effort to get real value. There also have to be incentives for financial organizations to do this.
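A small sketch may help show why shared identifiers matter here. Below, triples from a filing and triples from an unrelated, non-XBRL source are simply merged into one set, and a question that spans both sources is answered by following the shared entity URI. Everything in it is hypothetical: the URIs, the `ex:` property names, the registry data and the numbers. A real deployment would use an RDF store and SPARQL rather than Python tuples.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


ACME = "http://example.com/entity/ACME"            # hypothetical shared URI
BEVERAGES = "http://example.com/industry/Beverages"

# Triples derived from an XBRL filing (made-up data).
filing_triples = {
    ("http://example.com/fact/f1", "ex:entity", ACME),
    ("http://example.com/fact/f1", "ex:concept", "ex:DepreciationExpense"),
    ("http://example.com/fact/f1", "ex:value", "125000"),
}

# Triples from a different, non-XBRL source, such as a company registry.
registry_triples = {
    (ACME, "ex:registeredIn", "http://example.com/place/Delaware"),
    (ACME, "ex:industry", BEVERAGES),
}

# Combining the sources is a plain set union: no schema mapping is needed
# because both sides already use the same URI for the company.
graph = filing_triples | registry_triples


def facts_for_industry(graph, industry):
    """Find (entity, concept, value) for every entity in the given industry."""
    results = []
    for entity, _, _ in match(graph, p="ex:industry", o=industry):
        for fact, _, _ in match(graph, p="ex:entity", o=entity):
            for _, _, concept in match(graph, s=fact, p="ex:concept"):
                for _, _, value in match(graph, s=fact, p="ex:value"):
                    results.append((entity, concept, value))
    return results


print(facts_for_industry(graph, BEVERAGES))
```

Neither source alone can answer "what did companies in this industry report?", while the merged graph can.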

  1. Great observations. As you suggest, it is already possible to construct RDF, and therefore to make financial statements part of the Linked Open Data cloud.

    It will take a while longer to work out what many of the non-standard tags mean. It's probably easy to find the "Gross Revenue" from standard tags (I haven't tried, just a guess), but as John Turner suggested at the Data Governance Council last week, a more specialized tag from an industry- or company-specific taxonomy, such as "Equity from Bottling Operations" (or whatever the example was), is going to be much more difficult.

    We may need a combination of:
    1) crowdsourcing (that is, perhaps a number of industry analysts will weigh in on what an industry- or company-specific taxonomy term "means"), or
    2) convincing people who are creating custom taxonomies to do so from a standard set of terms and axioms (this may take a while as they are not incented to do so)

    I think you're right that XBRL will explode now. It seems to have hit the trifecta:
    1) it is becoming mandatory,
    2) it is publicly available and,
    3) there is high value in processing this information