A key aspect of Web 2.0 is tagging, the ability of people to assign keywords to web resources, be they del.icio.us favorites, Flickr photographs, blog entries, or YouTube videos. Tagging facilitates personal indexing: I can retrieve items by a word that I myself associate with them. Tagging also facilitates group or social indexing, letting me retrieve items based on what others believe a word is related to.

One limitation of tags, at least in del.icio.us, is that they are single words only. If I want to tag something “semantic web” then I must either assign two separate tags or assign a compound tag like “semantic-web,” “SemanticWeb,” or “semantic_web”. When I later use the tag, I must remember which compound form I chose. Otherwise, I am likely to retrieve the union of items tagged “semantic” with items tagged “web”.
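
To make the pitfall concrete, here is a minimal Python sketch of a single-word tag index. The bookmark names and the `build_index` and `search` helpers are made up for illustration and are not how any particular service implements search; the sketch only assumes that a query is split on whitespace and that each word is looked up as a separate tag.

```python
from collections import defaultdict

def build_index(bookmarks):
    """Map each tag to the set of item names that carry it."""
    index = defaultdict(set)
    for item, tags in bookmarks.items():
        for tag in tags:
            index[tag].add(item)
    return index

def search(index, query):
    """Treat each whitespace-separated word as its own tag; return the union."""
    results = set()
    for word in query.split():
        results |= index.get(word, set())
    return results

# Hypothetical bookmarks, each with single-word tags.
bookmarks = {
    "rdf-primer": ["semantic-web"],
    "linguistics-notes": ["semantic"],
    "css-tricks": ["web"],
}
index = build_index(bookmarks)

loose = search(index, "semantic web")   # two tags, results are unioned
exact = search(index, "semantic-web")   # the one compound tag
```

With this data, the two-word query returns the linguistics and CSS items and misses the one item that was actually about the semantic web, which only the exact compound tag retrieves.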

del.icio.us provides recommended tags based on how others have tagged a URL. This is helpful both as a voting mechanism (I agree that the tag applies) and as a source of insight into how large numbers of others perceive the URL.
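
One simple way to picture recommendation as voting is to count how many people applied each tag to the same URL. The sketch below is only that picture; I do not know how del.icio.us actually computes its recommendations, and the `recommend` function and sample data are hypothetical.

```python
from collections import Counter

def recommend(taggings, limit=3):
    """Return the most frequently applied tags for one URL.

    taggings: one list of tags per user who bookmarked the URL.
    """
    counts = Counter()
    for user_tags in taggings:
        # Deduplicate so each user casts at most one vote per tag.
        counts.update(set(user_tags))
    return [tag for tag, _ in counts.most_common(limit)]

# Hypothetical taggings of a single URL by four users.
taggings = [
    ["rdf", "semantic-web"],
    ["semantic-web", "w3c"],
    ["semantic-web", "rdf"],
    ["reading"],
]
top_two = recommend(taggings, limit=2)
```

Accepting a recommended tag adds one more vote to its count, which is why popular tags tend to stay popular.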

By using both my own tags and the recommended ones, I have ended up with a large number of tags in del.icio.us. It then becomes tempting to “consolidate” some of my tags which seem to differ only superficially (e.g., in capitalization or “www” vs. “web”). It also becomes tempting to eliminate some tags that I chose because they were recommended, but which still are not meaningful or helpful to me.
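
Consolidation of this kind can be sketched as mapping every tag to a canonical form and grouping the tags that collide. The normalization rules and the `SYNONYMS` table below are my own assumptions for illustration; which differences count as "superficial" is exactly the judgment call described above.

```python
from collections import defaultdict

# Hypothetical synonym table: normalized variant -> preferred tag.
SYNONYMS = {"www": "web"}

def canonical(tag):
    """Lowercase the tag, drop hyphens and underscores, then apply synonyms."""
    key = tag.lower().replace("-", "").replace("_", "")
    return SYNONYMS.get(key, key)

def consolidate(tags):
    """Group the original tags under their canonical form."""
    groups = defaultdict(list)
    for tag in tags:
        groups[canonical(tag)].append(tag)
    return dict(groups)

merged = consolidate(
    ["Web", "web", "www", "SemanticWeb", "semantic-web", "semantic_web"]
)
```

Here six tags collapse into two groups, "web" and "semanticweb". Note that the same rule would also merge tags that only look alike, so the groups are best reviewed by hand before anything is renamed.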

It also becomes tempting to organize tags into hierarchies, such as “domestic” and “international” as sub-tags under “travel”.
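
A hierarchy like this can be sketched as a table that maps each sub-tag to its parent, so that retrieving by “travel” also finds items tagged only “domestic” or “international”. The `PARENT` table, the helper functions, and the bookmark names are hypothetical, and the sketch assumes each tag has at most one parent and that the table contains no cycles.

```python
# Hypothetical hierarchy: sub-tag -> parent tag.
PARENT = {"domestic": "travel", "international": "travel"}

def ancestors(tag, parent=PARENT):
    """Walk up the hierarchy and list every ancestor of a tag."""
    chain = []
    while tag in parent:
        tag = parent[tag]
        chain.append(tag)
    return chain

def expand(tags, parent=PARENT):
    """Return the tags plus all of their ancestors."""
    expanded = set(tags)
    for tag in tags:
        expanded.update(ancestors(tag, parent))
    return expanded

def items_under(tag, bookmarks, parent=PARENT):
    """Find items carrying the tag directly or through a sub-tag."""
    return {
        item for item, tags in bookmarks.items()
        if tag in expand(tags, parent)
    }

# Hypothetical bookmarks.
bookmarks = {
    "amtrak-schedules": ["domestic"],
    "visa-rules": ["international"],
    "bread-recipe": ["cooking"],
}
travel_items = items_under("travel", bookmarks)
```

The single-parent assumption is where the trouble starts: a tag like “international” could just as reasonably sit under “news” or “business” as under “travel”.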

These are issues of knowledge representation. Experts in artificial intelligence have grappled with them for thirty years or more. It seems an unfortunate failure of that field that a solution is not well known. However, it is interesting to see these issues play out and be addressed on a massive and public scale.

Tags: Technical

Updated at: 23 April 2008 12:04 AM