October, 2018 | Chris Aldrich

Some ideas about tags, categories, and metadata for online commonplace books and search

Earlier this morning I was reading The Difference Between Good and Bad Tags and the discussion of topics versus objects got me thinking about semantics on my website in general.

People often ask why WordPress has both Category and Tag functionality, and to some extent it would seem to be for just this purpose–differentiating between topics and objects–or at least that’s how I have used it and perceived others doing so. (Incidentally, from a functionality perspective, categories in the WordPress taxonomy also have a hierarchy while tags do not.) I find that I don’t always do a great job of differentiating between them, nor do I do so cleanly every time. Typically it’s most apparent when I go searching for something and have a difficult time finding it. Usually the problem is getting back too many results instead of a smaller desired subset. In some sense I also look at categories as things which might be more interesting for others to subscribe to or follow via RSS from my site, though I also have RSS feeds for tags and for post types/kinds.

I also find that I have a subtle differentiation using singular versus plural tags which I think I’m generally using to differentiate between the idea of “mine” versus “others”. Thus the (singular) tag for “commonplace book” should be a reference to my particular commonplace book versus the (plural) tag “commonplace books” which I use to reference either the generic idea or the specific commonplace books of others. Sadly I don’t think I apply this “rule” consistently either, but hope to do so in the future.

I’ve also been playing around with some more technical tags like math.NT (standing for number theory), following the lead of arXiv.org. While I would generally have used a tag “number theory”, I’ve been toying with the idea of using the math.XX format for the more technical research-related material on my site and the more human-readable “number theory” for the more generic popular-press material. I still have some more playing around to do with the idea to see what shakes out. I’ve noticed in passing that Terence Tao uses these same designations on his site, but he does them at the category level rather than the tag level.

Now that I’m several years into such a system, I should probably spend some time going back and broadening out the topic categories. (I arbitrarily attempt to keep the list small–in part for public display/vanity reasons–but it’s relatively easy to limit what shows to the public in my category list view.) Then I ought to do a bit of cleanup within the tags themselves, which have gotten unwieldy and often have spelling mistakes which can cause searches to fail. I also find that some of my auto-tagging processes, which import tags from the original sources’ pages, could be cleaned up as well, though those are generally stored in a different location on my website, so it’s not as big a deal to me.
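
As a rough illustration of what that tag cleanup might look like, here is a minimal Python sketch that flags pairs of tags whose spellings are suspiciously close. It assumes the tags have already been exported from the site as a plain list of strings; the function name, the sample tags, and the 0.85 similarity threshold are all hypothetical choices of mine rather than anything WordPress provides.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicate_tags(tags, threshold=0.85):
    """Return (tag_a, tag_b, similarity) for tags that look like near-duplicates.

    The threshold is an arbitrary starting point (an assumption); tune it
    against your own tag list.
    """
    # Lowercase and de-duplicate so "Number Theory" and "number theory" collapse.
    normalized = sorted({t.strip().lower() for t in tags if t.strip()})
    pairs = []
    for a, b in combinations(normalized, 2):
        ratio = SequenceMatcher(None, a, b).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs


# Hypothetical sample of exported tags, including one misspelling.
sample_tags = [
    "commonplace book",
    "commonplace books",
    "comonplace book",
    "number theory",
    "math.NT",
]

for tag_a, tag_b, score in near_duplicate_tags(sample_tags):
    print(f"{tag_a!r} ~ {tag_b!r} ({score})")
```

A check like this will also flag intentional singular/plural pairs such as “commonplace book” and “commonplace books”, so the output is best treated as a list to review by hand rather than something to merge automatically. Comparing every pair is also slow for very large tag lists.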

Naturally I find myself also thinking about the ontogeny/phylogeny problems of how I do these things versus how others at large do them, so feel free to chime in with your ideas, especially if you take tags/categories for your commonplace book/website seriously. I’d like to ultimately circle back around on this with regard to the more generic tagging done from a web-standards perspective within the IndieWeb and Microformats communities. I notice almost immediately that the “tag” and “category” pages on the IndieWeb wiki redirect to the same page, yet there are various microformats including u-tag-of and u-category which are related but have slightly different meanings at first blush. (There is in fact an example on the IndieWeb “tag” page which includes both of these classes neither of which seems to be counter-documented at the Microformats site.) I should also dig around to see what Kevin Marks or the crew at Technorati wrote on the topic, as they must surely have done so a decade or more ago.

cc: Greg McVerry, Aaron Davis, Ian O’Byrne, Kathleen Fitzpatrick, Jeremy Cherfas

🎧 ‘The Daily’: How Trump Withstands So Many Controversies | New York Times

Listened to ‘The Daily’: How Trump Withstands So Many Controversies from New York Times

As President Trump faces a hailstorm of criticism over his meeting with Russia’s president, his supporters are doubling down. It’s a pattern we’ve seen before.

We really need some people to stand up to all the nonsense.

Extending a User Interface Idea for Social Reading Online

This morning I was reading an article online and I bookmarked it as “read” using the Reading.am browser extension which I use as part of my workflow of capturing all the things I’ve been reading on the internet. (You can find a feed of these posts here if you’d like to cyber-stalk most of my reading–I don’t post 100% of it publicly.)

I mention it because I was specifically intrigued by a small piece of excellent user interface and social graph data that Reading.am unearths for me. I’m including a quick screen capture to better illustrate the point. While the UI allows me to click yes/no (i.e. did I like it or not) or even share it to other networks, the thing I found most interesting was that it lists the other people using the service who have read the article as well. In this case it told me that my friend Jeremy Cherfas had read the article.¹

Reading.am user interface indicating who else on the service has read an article.

In addition to having the immediate feedback that he’d read it, which is useful and thrilling in itself, it gives me the chance to search to see if he’s written any thoughts about it himself, and it also gives me the chance to tag him in a post about my own thoughts to start a direct conversation around a topic which I now know we’re both interested in at least reading about.²

The tougher follow up is: how could we create a decentralized method of doing this sort of workflow in a more IndieWeb way? It would be nice if my read posts on my site (and those of others) could be overlain on websites via a bookmarklet or other means as a social layer to create engaged discussion. Better would have been the ability to quickly surface his commentary, if any, on the piece as well–functionality which I think Reading.am also does, though I rarely ever see it. In some sense I would have come across Jeremy’s read post in his feed later this weekend, but it doesn’t provide the immediacy that this method did. I’ll also admit that I prefer having found out about his reading it only after I’d read it myself, but having his and others’ recommendations on a piece (by their explicit read posts) is a useful and worthwhile piece of data, particularly for pieces I might have otherwise passed over.

In some sense, some of this functionality isn’t too different from that provided by Hypothes.is, though there it is hidden away within another browser extension layer. It requires not only direct examination but also scanning for those whose identities I might recognize, because Hypothes.is doesn’t have a specific following/follower social model to make my friends and colleagues a part of my social graph in that instance. The nice part of Hypothes.is’ browser extension is that it does add a small visual indicator to show that others have in fact read/annotated a particular site using the service.

A UI example of Hypothes.is functionality within the Chrome browser. The yellow highlighted browser extension bug indicates that others have annotated a document. Clicking the image will take one to the annotations in situ.

I’ve also previously documented on the IndieWeb wiki how WordPress.com (and WordPress.org with JetPack functionality) facepiles likes on content (typically underneath the content itself). This method doesn’t take things as far as the Reading.am case because it only shows a small fraction of the data, is much less useful, and is far less likely to unearth those in your social graph to make it useful to you, the reader.

WordPress.com facepiles likes on content which could surface some of this social reading data.

I seem to recall that Facebook has some similar functionality that is dependent upon how (and if) the publisher embeds Facebook into their site. I don’t think I’ve seen this sort of interface built into another service this way and certainly not front and center the way that Reading.am does it.

The closest thing I can think of to this type of functionality in the analog world was in my childhood, when the library card slips in books carried the names of prior patrons, which you would see when you signed your own name to check out a book. This also had the large world problem that WordPress likes have, in that one typically wouldn’t have known many of the names of the prior patrons. I suspect that the Robert Bork privacy incident along with the evolution of library databases and bar codes have caused this older system to disappear.

This general idea might make an interesting topic to explore at an upcoming IndieWebCamp if not before. The question is: how to add in the social graph aspect of reading to uncover this data? I’m also curious how it might or might not be worked into a feed reader or into Microsub-related technologies. Microsub clients or related browser extensions might make a great place to add this functionality as they would have the data about whom you’re already following (aka your social graph data) as well as access to their read/like/favorite posts. I know that some users have reported consuming feeds of friends’ reads, likes, favorites, and bookmarks as potential recommendations of things they might be interested in reading, so perhaps this would be an additional extension of that?
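
To make the idea a bit more concrete, here is a minimal Python sketch of the lookup such a client or extension might do: given the page I’m currently on, check which of the people I follow have a read post pointing at the same URL. Everything here is an assumption for illustration. The data structure, the function names, and the example URLs are hypothetical, and nothing below is part of the Microsub specification or of Reading.am; actually fetching and parsing people’s read posts is left out entirely.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Reduce a URL to a comparable form: lowercase scheme and host,
    no fragment, no trailing slash on the path."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def who_else_read(url, reads_by_person):
    """Return the followed people whose read posts include this URL.

    reads_by_person maps a person's site (hypothetical identifier) to the
    URLs they have marked as read, e.g. as gathered from their read feed.
    """
    target = normalize(url)
    return sorted(
        person
        for person, urls in reads_by_person.items()
        if target in {normalize(u) for u in urls}
    )


# Hypothetical social graph data: people I follow and what they have read.
reads_by_person = {
    "https://alice.example": ["https://Example.com/article/#comments"],
    "https://bob.example": ["https://example.com/another-post"],
}

print(who_else_read("https://example.com/article", reads_by_person))
```

The URL normalization here is deliberately crude; real pages often have several addresses (tracking parameters, canonical links, mobile versions), so matching them up reliably would take more work than this.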

[1] I’ve certainly seen this functionality before, but most often the other readers are people I don’t know or know that well because the service isn’t huge and I’m not using it to follow a large number of other people.
[2] I knew he was generally interested already as I happen to be following this particular site at his prior recommendation, but the idea still illustrates the broader point.

👓 FBI has not contacted dozens of potential sources in Kavanaugh investigation | NBC News

Read FBI has not contacted dozens of potential sources in Kavanaugh investigation (NBC News)

With the investigation winding down, multiple individuals who have tried to contact the bureau have not heard back.

👓 Why History Matters | Audrey Watters

Read Why History Matters by Audrey Watters (Hack Education)

This talk was given today to Eddie Maloney’s class at Georgetown University (specifically, its Learning and Design program) on “Technology & Innovation By Design”

👓 The Cruelty Is the Point | The Atlantic

Read The Cruelty Is the Point (The Atlantic)

Trump and his supporters find community by rejoicing in the suffering of those they hate and fear.

A searing piece of writing here. A must-read.

This makes a compelling argument about why some humans are so painfully cruel.

👓 Trump Engaged in Suspect Tax Schemes as He Reaped Riches From His Father | New York Times

Read Trump Engaged in Suspect Tax Schemes as He Reaped Riches From His Father by David Barstow (nytimes.com)

The president has long sold himself as a self-made billionaire, but a Times investigation found that he received at least $413 million in today’s dollars from his father’s real estate empire, much of it through tax dodges in the 1990s.

I had suspected something like this for a long time, and my suspicions were heightened during the election upon reports of Trump cheating sub-contractors and not paying them, and again earlier this year when Jonathan Greenberg revised some of his 1980s reportage for Forbes, but this is simply incredible!

While there are a lot of things one can take away from this stunning, thorough, and long read, the thing that strikes me is what Trump did to attempt to cheat his own father, who had repeatedly been digging him out of trouble, when he was against the wall. He tried to defraud and steal from his greatest benefactor. How can anyone trust him to fight for America or real Americans when his entire substance as well as facade is a complete sham?

Combined with the millions he’s been losing on real estate and other deals over the past decade, one is forced (again) to wonder who exactly is funding him now.

👓 How Times Journalists Uncovered the Original Source of the President’s Wealth | New York Times

Read How Times Journalists Uncovered the Original Source of the President’s Wealth (New York Times)

Three reporters spent over a year digging through more than 100,000 pages of documents and chasing down key sources familiar with President Trump’s father and his empire.

👓 11 Takeaways From The Times’s Investigation Into Trump’s Wealth | The New York Times

Read 11 Takeaways From The Times’s Investigation Into Trump’s Wealth (nytimes.com)

Based on a trove of confidential financial records, the Times report offers the first comprehensive look at the inherited fortune and tax dodges that guaranteed Donald Trump a gilded life.

A quick précis of the whole 13,000+ word story for those without the time.