Some ideas about tags, categories, and metadata for online commonplace books and search

Earlier this morning I was reading “The Difference Between Good and Bad Tags”, and the discussion of topics versus objects got me thinking about semantics on my website in general.

People often ask why WordPress has both Category and Tag functionality, and to some extent it would seem to be for just this purpose: differentiating between topics and objects. At least that’s how I have used it, and how I have perceived others using it. (Incidentally, from a functionality perspective, categories in the WordPress taxonomy also have a hierarchy while tags do not.) I find that I don’t always do a great job of differentiating between them, nor do I do so cleanly every time. The problem typically becomes apparent when I go searching for something and have a difficult time finding it. Usually I get back too many results instead of the smaller subset I wanted. In some sense I also look at categories as things which might be more interesting for others to subscribe to or follow via RSS from my site, though I have RSS feeds for tags and for post types/kinds as well.

I also make a subtle distinction between singular and plural tags, which I think I generally use to separate the idea of “mine” from “others”. Thus the (singular) tag “commonplace book” should refer to my particular commonplace book, while the (plural) tag “commonplace books” refers either to the generic idea or to the specific commonplace books of others. Sadly I don’t think I apply this “rule” consistently either, but I hope to do so in the future.

I’ve also been playing around with some more technical tags like math.NT (standing for number theory), following the lead of arXiv.org. While I would generally have used the tag “number theory”, I’ve been toying with the idea of using the math.XX format for the more technical, research-related material on my site and the more human-readable “number theory” for the more generic, popular-press material. I still have some more playing around to do with the idea to see what shakes out. I’ve noticed in passing that Terence Tao uses these same designations on his site, but he does so at the category level rather than the tag level.
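To make the math.XX idea a bit more concrete, here is a minimal sketch in Python of how the two tag styles might be kept in sync. The codes and their subject names follow arXiv’s math listing, but the lookup table is only a small hand-picked subset, and the function name and the “technical” flag are hypothetical; nothing like this is actually running on my site.

```python
# A small, hand-picked subset of arXiv math subject codes (extend as needed).
ARXIV_MATH_LABELS = {
    "math.NT": "number theory",
    "math.AG": "algebraic geometry",
    "math.CO": "combinatorics",
    "math.PR": "probability",
}


def tag_for(code, technical):
    """Pick the tag for a post (hypothetical helper).

    Technical, research-related posts keep the arXiv-style code;
    everything else gets the human-readable label.
    """
    if code not in ARXIV_MATH_LABELS:
        raise KeyError(f"unknown subject code: {code}")
    return code if technical else ARXIV_MATH_LABELS[code]


print(tag_for("math.NT", technical=True))
print(tag_for("math.NT", technical=False))
```

Keeping the mapping in one place would at least mean the two vocabularies can’t drift apart, whichever one ends up on a given post.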

Now that I’m several years into such a system, I should probably spend some time going back and broadening out the topic categories. (I arbitrarily attempt to keep the list small, in part for public display/vanity reasons, but it’s relatively easy to limit what shows to the public in my category list view.) Then I ought to do a bit of cleanup within the tags themselves, which have gotten unwieldy and often contain spelling mistakes that can cause searches to fail. Some of my auto-tagging processes, which import tags from the original sources’ pages, could be cleaned up too, though those tags are generally stored in a different location on my website, so it’s not as big a deal to me.
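For the spelling cleanup, something like the following Python sketch might be a reasonable first pass. It assumes I can export my tags with their usage counts into a dictionary (the sample data below is made up), and it uses the standard library’s difflib to pair each tag with a more heavily used tag that it closely resembles. The function name and the 0.85 cutoff are my own guesses, not anything tuned.

```python
from difflib import get_close_matches


def find_probable_misspellings(tag_counts, cutoff=0.85):
    """Map each tag to a more frequently used tag it closely resembles.

    tag_counts: dict of tag -> number of posts using it (assumed export format).
    Only tags with a close, more popular look-alike appear in the result.
    """
    suggestions = {}
    for tag, count in tag_counts.items():
        # Only tags used strictly more often can be the "correct" spelling.
        candidates = [t for t, c in tag_counts.items() if c > count]
        match = get_close_matches(tag, candidates, n=1, cutoff=cutoff)
        if match:
            suggestions[tag] = match[0]
    return suggestions


# Made-up sample data for illustration only.
sample = {
    "commonplace books": 42,
    "commonplace boks": 1,
    "number theory": 17,
    "numbr theory": 2,
    "IndieWeb": 30,
}

for wrong, right in find_probable_misspellings(sample).items():
    print(f"{wrong!r} looks like {right!r}")
```

The output would need reviewing by hand rather than merging automatically: a pair like “commonplace book” and “commonplace books” is similar enough to be flagged, even though I keep the singular and plural forms apart on purpose.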

Naturally I also find myself thinking about the ontogeny/phylogeny problems of how I do these things versus how others at large do them, so feel free to chime in with your ideas, especially if you take tags/categories for your commonplace book/website seriously. I’d ultimately like to circle back around on this with regard to the more generic tagging done from a web-standards perspective within the IndieWeb and Microformats communities. I notice almost immediately that the “tag” and “category” pages on the IndieWeb wiki redirect to the same page, yet there are various microformats, including u-tag-of and u-category, which are related but have slightly different meanings at first blush. (There is in fact an example on the IndieWeb “tag” page which includes both of these classes, neither of which seems to be counter-documented at the Microformats site.) I should also dig around to see what Kevin Marks or the crew at Technorati must surely have written a decade or more ago on the topic.


cc: Greg McVerry, Aaron Davis, Ian O’Byrne, Kathleen Fitzpatrick, Jeremy Cherfas