Saturday, November 11, 2006

Suggestions on choosing tags for classification

Although there are no standard guidelines on good tag selection practices, those in the folksonomy community have offered many ideas. Some “best practices” include:

  1. using plurals rather than singulars,
  2. using lower case,
  3. grouping words using an underscore,
  4. following tag conventions started by others, and
  5. adding synonyms.
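
As an illustration of the first three practices, here is a minimal Python sketch. The function names normalize_tag and naive_plural are made up for this post, and the plural rule is a deliberately rough assumption that ignores irregular nouns; it is not taken from the article.

```python
import re


def normalize_tag(raw: str) -> str:
    """Lower-case a tag and join its words with single underscores."""
    # Split on whitespace and existing underscores, dropping empty pieces.
    words = re.findall(r"[^\s_]+", raw.strip().lower())
    return "_".join(words)


def naive_plural(tag: str) -> str:
    """Very rough English pluralisation (assumption: no irregular nouns)."""
    if tag.endswith("s"):
        return tag  # treated as already plural
    if tag.endswith("y") and len(tag) > 1 and tag[-2] not in "aeiou":
        return tag[:-1] + "ies"  # folksonomy -> folksonomies
    return tag + "s"


if __name__ == "__main__":
    for raw in ["  Web  Design ", "This_Is Tag", "Folksonomy"]:
        print(repr(raw), "->", naive_plural(normalize_tag(raw)))
```

Following conventions started by others and adding synonyms depend on what the community already uses, so they are left out of this sketch.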

Many folksonomies allow users to modify their tags, and there is considerable scope for users to tidy up the entries that they have already created. Currently, tags are generally defined as single words or compound words, which means that information can be lost during the tagging process. Single-word tags lose the information that would generally be encoded in the word order of a phrase.

The commonness of compound tags, including tags that concatenate more than two words, may suggest that users miss the richness of sentence structure. A “non-breakable space” could be introduced to address this, although many compound tags are already produced using separator characters, as in this_is_tag.

Several del.icio.us taggers have established a private pseudo-hierarchy of terms by adopting tag conventions that resemble directory structures, such as Programming/C++ and Programming/JAVA.
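
To show how such directory-style tags could be read back as a hierarchy, here is a small Python sketch. It assumes "/" is the only separator; build_tag_tree is a hypothetical helper written for this post, not something del.icio.us provides.

```python
def build_tag_tree(tags):
    """Nest slash-separated tags into a dictionary of dictionaries."""
    tree = {}
    for tag in tags:
        node = tree
        for part in tag.split("/"):
            if part:  # skip empty pieces from leading or doubled slashes
                node = node.setdefault(part, {})
    return tree


if __name__ == "__main__":
    print(build_tag_tree(["Programming/C++", "Programming/JAVA", "Music"]))
```

Tags without a slash simply become top-level entries, so flat and hierarchical tags can live side by side.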

Smart systems

Alongside educating users, there is much that system creators can do to improve the end-data their systems are helping to create. There are two main ways in which improvements can be made. Firstly, much can be done at the point at which new resources are contributed to the system. Error-checking could potentially catch a number of tag errors, although rather fewer misspellings occur than might be expected. Furthermore, some sites already make tag suggestions when users submit resources. Scrumptious, a recent Firefox extension, offers popular tags for every URL. Systems could easily suggest synonyms, expansions of acronyms, and the like when users type in their tags.

Secondly, improvements can be made in the way systems search for resources already in the system. Synonym suggestions could also be made here, suggesting, for example, “ladybug” instead of “ladybird”.
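
A rough Python sketch of how a system might offer such suggestions, either when a tag is typed or when a search is made, is given below. The SYNONYMS table, the suggest_tags function and the 0.8 similarity cutoff are all assumptions for illustration; a real system would draw its vocabulary from its own tag data. Spelling suggestions use get_close_matches from Python's standard difflib module.

```python
from difflib import get_close_matches

# Hypothetical synonym and acronym table, made up for this example.
SYNONYMS = {
    "ladybird": ["ladybug"],
    "ladybug": ["ladybird"],
    "nyc": ["new_york_city"],
}


def suggest_tags(typed, known_tags, synonyms=SYNONYMS):
    """Suggest close spellings of unknown tags, then listed synonyms."""
    tag = typed.strip().lower()
    suggestions = []
    if tag not in known_tags:
        # Unknown tag: look for similarly spelt tags already in use.
        suggestions.extend(
            get_close_matches(tag, sorted(known_tags), n=3, cutoff=0.8)
        )
    for alternative in synonyms.get(tag, []):
        if alternative not in suggestions:
            suggestions.append(alternative)
    return suggestions


if __name__ == "__main__":
    known = {"folksonomy", "flickr", "ladybird", "ladybug"}
    print(suggest_tags("folksonmy", known))
    print(suggest_tags("ladybird", known))
```

The suggestions are only offered, never applied automatically, so a tagger who prefers their own wording keeps it.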

Clay Shirky notes:

“Tagging gets better with scale. With a multiplicity of points of view the question isn’t ‘Is everyone tagging any given link “correctly”?’, but rather ‘Is anyone tagging it the way I do?’ As long as at least one other person tags something the way you would, you’ll find it – using a thesaurus to force everyone’s tags into tighter synchrony would actually worsen the noise you’ll get with your signal. If there is no shelf, then even imagining that there is one right way to organize things is an error.”

Conclusions

The investigations described in the article are brief, simple and relatively unscientific, as are the numbers provided within. That the results from both del.icio.us and flickr tended to be rather similar implies that they can be trusted only as much as a short, seat-of-the-pants investigation can be. Only those with direct access to the del.icio.us and flickr databases can be aware of the exact state of affairs and how it has changed across the months. For research purposes, the interesting features of the tags are not in the precise percentages of usage, but in the choice of tag, the choice of structure, and the choice of language. Somewhere around a third of tags were indeed “malformed”, in that they were beyond the grasp of a multilingual spell-checker for one reason or another. Many of these were not misspelt but mis-constructed, some of the latter in a correctable manner.

Still, possibly the real problem with folksonomies is not their chaotic tags but that they are trying to serve two masters at once: the personal collection and the collective collection. So is it possible to have the best of both worlds? At the moment, many investigations of tag data are in progress, including how tags can be used for searching. As a consequence, development in this field tends to confine itself to methods for improving the quality of the user-contributed tags for this purpose. In practice, this involves promoting commonly-chosen tags above single-use or infrequently used tags by various means. It is possible that the data collected through folksonomy tagging is more complete than we had imagined. Some single-use tags are explicitly designed as such, such as the latitude/longitude markers used by geotagging. Some may be perceived as valuable or helpful to the reader. Some may be infinitely helpful for search purposes, if only the information provided therein is accessed in an appropriate manner. Is it therefore preferable, rather than attempting to stamp out single-use or sloppy tags, to suggest that each item be tagged with a mixture of approaches, including several search-friendly keywords?
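
One simple way to promote commonly-chosen tags without throwing away the single-use ones is to count how often each tag occurs and keep two lists. The Python sketch below assumes each bookmark is just a list of tags; rank_tags and the min_count threshold are invented for this example and do not describe how del.icio.us or flickr actually rank tags.

```python
from collections import Counter


def rank_tags(taggings, min_count=2):
    """Split tags into frequently used ones and rare ones."""
    counts = Counter(tag for tags in taggings for tag in tags)
    # Promoted tags, most frequent first.
    promoted = [tag for tag, n in counts.most_common() if n >= min_count]
    # Rare tags are kept, not discarded: they may still help a search.
    rare = sorted(tag for tag, n in counts.items() if n < min_count)
    return promoted, rare


if __name__ == "__main__":
    bookmarks = [
        ["python", "web"],
        ["python", "geo:lat=51.38"],
        ["web", "python"],
    ]
    print(rank_tags(bookmarks))
```

Keeping the rare list alongside the promoted one reflects the point above that single-use tags, such as geotagging markers, can still carry useful information.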



Source: http://webdoc.sub.gwdg.de/edoc/aw/d-lib/dlib/january06/guy/01guy.html

Folksonomies
Tidying up Tags?

Marieke Guy
UKOLN


Emma Tonkin
UKOLN

from D-Lib Magazine
