Microformats and Tags
Posted on: April 27, 2008
I talked about microformats in a post last year on the Web 2.0 Expo. It appears that the technology is now going mainstream. I attended a workshop on Web 2.0 best practices at the Web 2.0 Expo this week, in which the speaker, Niall Kennedy, expounded on the advantages of using microformats. He said he has seen significant growth in traffic to his site since he started using them, because search engine results now show direct links to pages on his site.
Yahoo is adding microformats to many of its properties. The Yahoo event site already has them. This is exciting since microformats are a bridge to the semantic web, which we’ve been talking about for several years now. However, the talk has never seemed to materialize into anything concrete. Meanwhile, the Web 2.0 world has decided to do things its own way.
A classic example is tagging. While the semantic folks talk about taxonomies and ontologies, the web guys invented folksonomies (aka tagging). Tagging has allowed users and sites to group stuff together, attaching semantic meaning to their data. Tag clouds have worked fairly well, and sites like Flickr are extending the concept by automatically creating even more tags! The problem with tags, of course, is that a word can have several meanings, and it’s not easy to figure out which interpretation is intended. This is a problem that RDF solves nicely, but more on that later.
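To make the ambiguity concrete, here is a minimal Python sketch of a tag index. The item names and tags are made up for illustration; the point is only that a bare word like "java" groups unrelated things together.

```python
from collections import defaultdict

# Hypothetical tagged items (name, tags); invented for illustration only.
ITEMS = [
    ("photo-of-espresso.jpg", {"java", "coffee"}),
    ("javaone-keynote-notes", {"java", "conference"}),
    ("trip-to-jakarta.jpg", {"java", "indonesia"}),
]

def build_tag_index(items):
    """Group item names under each tag, the way a folksonomy does."""
    index = defaultdict(list)
    for name, tags in items:
        for tag in tags:
            index[tag].append(name)
    return dict(index)

index = build_tag_index(ITEMS)

# One tag, three unrelated meanings: a drink, a language, an island.
print(index["java"])
```

Nothing in the tag itself tells a program which of the three meanings applies; a human has to guess from the other tags.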
Microformats are better than tags in the sense that they have a more rigid format and so provide better, though still imperfect, semantics. Let’s look at an example:
<div class="vevent">
  <span class="summary">JavaOne Conference</span>:
  <span class="description">The premier java conference</span>
  <p><a class="url" href="http://java.sun.com/javaone/sf">http://java.sun.com/javaone/sf</a></p>
  <p><abbr class="dtstart" title="2008-05-06">May 6</abbr>-<abbr class="dtend" title="2008-05-09">9</abbr>,
  at the <span class="location">Moscone Center, San Francisco, CA</span></p>
</div>
which will display as:
JavaOne Conference: The premier java conference
http://java.sun.com/javaone/sf
May 6-9, at the Moscone Center, San Francisco, CA
The advantage of such a format is that it clearly specifies the various properties associated with the event: summary, description, URL, start and end dates, location, etc. However, it can still be ambiguous, since it uses literals for many properties, e.g. the location. If someone specified the location simply as “San Francisco”, it could mean any of 27 different San Franciscos.
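Because the property names are fixed class values, a program can pull them out mechanically. The following is a rough sketch using only Python's standard `html.parser`, not a full hCalendar parser: the `VEventParser` class and `HCAL_PROPS` set are names I made up, and it assumes a single `vevent` block with the handful of properties used in the example.

```python
from html.parser import HTMLParser

# The hCalendar properties used in the example; the real format defines more.
HCAL_PROPS = {"summary", "description", "url", "dtstart", "dtend", "location"}

MARKUP = """
<div class="vevent">
  <span class="summary">JavaOne Conference</span>:
  <span class="description">The premier java conference</span>
  <p><a class="url" href="http://java.sun.com/javaone/sf">http://java.sun.com/javaone/sf</a></p>
  <p><abbr class="dtstart" title="2008-05-06">May 6</abbr>-<abbr class="dtend" title="2008-05-09">9</abbr>,
  at the <span class="location">Moscone Center, San Francisco, CA</span></p>
</div>
"""

class VEventParser(HTMLParser):
    """Collect property values from one vevent block (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.event = {}
        self._open = []  # stack of (tag, property or None) for open elements

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        prop = next((c for c in classes if c in HCAL_PROPS), None)
        if prop and tag == "abbr" and attrs.get("title"):
            # abbr pattern: the machine-readable value is in the title attribute
            self.event[prop] = attrs["title"]
            prop = None
        elif prop == "url" and attrs.get("href"):
            self.event[prop] = attrs["href"]
            prop = None
        elif prop:
            self.event[prop] = ""  # value will come from the element's text
        self._open.append((tag, prop))

    def handle_endtag(self, tag):
        open_tags = [t for t, _ in self._open]
        if tag in open_tags:
            # Close the element and anything left unclosed inside it.
            last = len(open_tags) - 1 - open_tags[::-1].index(tag)
            del self._open[last:]

    def handle_data(self, data):
        for _, prop in self._open:
            if prop:
                self.event[prop] += data

parser = VEventParser()
parser.feed(MARKUP)
event = {name: value.strip() for name, value in parser.event.items()}
print(event)
```

Note that the location still comes back as a plain string, which is exactly the ambiguity described above.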
If we take this formalization a step further, we reach the world of RDF. Here every entry is specified as a triple of the form <subject> <predicate> <object>, using URIs to represent the objects in an unambiguous manner. Without going into the syntactic details, we could specify a location to be defined in the standard format of: number, street, city, state, country, zip. This gives the object an identity: a property that uniquely identifies it.
I’ll talk more about RDF and the semantic web in another post.
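As a rough illustration of the triple idea, here is a sketch in plain Python tuples with no RDF library. All the URIs live under `http://example.org/` and are hypothetical; they do not come from any real vocabulary. The `objects` helper is likewise just a name chosen for this example.

```python
# Hypothetical namespace; every URI below is invented for illustration.
EX = "http://example.org/"

# Each entry is a (subject, predicate, object) triple.
TRIPLES = {
    (EX + "event/javaone-2008", EX + "prop/location", EX + "place/moscone-center"),
    (EX + "place/moscone-center", EX + "prop/city", EX + "place/san-francisco-ca-us"),
    (EX + "place/san-francisco-ca-us", EX + "prop/state", "CA"),
    (EX + "place/san-francisco-ca-us", EX + "prop/country", "US"),
    # A different San Francisco gets a different URI, so the two never collide.
    (EX + "place/san-francisco-ar", EX + "prop/country", "AR"),
}

def objects(subject, predicate, triples=TRIPLES):
    """Return all objects for a given subject and predicate, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)

venue = objects(EX + "event/javaone-2008", EX + "prop/location")[0]
city = objects(venue, EX + "prop/city")[0]
print(city)
print(objects(city, EX + "prop/country"))
```

Following the URIs from the event to its venue to its city always lands on one specific San Francisco, which a bare string cannot guarantee.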