Ah for a decent world wide blog posts search engine. ❧
Ha! Look who’s talking. 🙂
Brent Simmons has been reminiscing about blog search engines and writing down some ideas for how one could be made today. Something he wrote sparked a memory.
Instead of having it crawl blogs, I’d have it download and index RSS feeds. This should be cheaper than crawling pages, and it ensures that it skips indexing page junk (navigation and so on).
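To make the feed-indexing idea concrete, here is a minimal sketch using only Python's standard library. The sample feed, the `index_feed` helper, and the choice of a simple inverted index (word to set of item links) are all assumptions made for illustration; nothing here comes from Brent's post, and a real service would download feeds over HTTP and handle Atom as well as RSS.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample feed; a real indexer would fetch this over HTTP.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>Tagging and search</title>
      <link>https://example.com/tagging</link>
      <description>Notes on blog search engines and tags.</description>
    </item>
    <item>
      <title>Feeds everywhere</title>
      <link>https://example.com/feeds</link>
      <description>Why RSS feeds are cheap to index.</description>
    </item>
  </channel>
</rss>"""


def index_feed(feed_xml, index=None):
    """Add every <item> of an RSS 2.0 document to an inverted index.

    The index maps a lowercase word to the set of item links containing it.
    Only the item title and description are read, so navigation and other
    page furniture never reach the index.
    """
    if index is None:
        index = defaultdict(set)
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        link = item.findtext("link", default="").strip()
        text = " ".join(
            [item.findtext("title", default=""),
             item.findtext("description", default="")]
        )
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(link)
    return index


index = index_feed(SAMPLE_FEED)
print(sorted(index["search"]))
```

Because the indexer only ever sees what the author chose to put in the feed, there is no boilerplate to strip before tokenizing.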
I can dream about how I’d build one of these. (I’m not going to! This is way outside my expertise, and I have other things to do.)
One thing I wish we had that we used to have: blog-only search engines. You could go and search for a hash tag. Or for links to your blog or elsewhere. Or for keywords. Etc. It should have an API that returns RSS, so RSS reader users could set up persistent, updated searches. There used to be a bunch of these, and now there are none that I know of.
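The "API that returns RSS" part could look something like the sketch below, which filters a list of posts and serializes the matches as an RSS 2.0 document. The `POSTS` data, the `search_as_rss` function, and the `search.example` URL are hypothetical; a real service would query its own index rather than scan a list.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote_plus

# Hypothetical indexed posts; a real service would read these from its index.
POSTS = [
    {"title": "Tags at Technorati", "link": "https://example.com/1",
     "text": "How tag pages came about."},
    {"title": "Feed readers", "link": "https://example.com/2",
     "text": "Persistent searches in an RSS reader."},
]


def search_as_rss(query, posts):
    """Return an RSS 2.0 document (as a string) listing posts matching query.

    Matching is a plain case-insensitive substring test on title plus text.
    """
    needle = query.lower()
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Search results for " + query
    # Assumed URL scheme for the search endpoint (illustration only).
    ET.SubElement(channel, "link").text = (
        "https://search.example/?q=" + quote_plus(query)
    )
    ET.SubElement(channel, "description").text = "Posts matching " + query
    for post in posts:
        if needle in (post["title"] + " " + post["text"]).lower():
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = post["title"]
            ET.SubElement(item, "link").text = post["link"]
            ET.SubElement(item, "description").text = post["text"]
    return ET.tostring(rss, encoding="unicode")


print(search_as_rss("rss", POSTS))
```

A feed reader subscribed to such a URL would re-fetch it on its normal schedule, which is all a persistent, updated search needs.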
It's pronounced poe-WAH-zek.
Derek Powazek has worked the web since 1995 at pioneering sites like HotWired, Blogger, and Technorati. He is the author of “Design for Community: The Art of Connecting Real People in Virtual Places” (New Riders, 2001). He is the cofounder of JPG, the photography magazine that’s made by its community. He has been Chief of Design for HP’s MagCloud, advisor to a handful of startup companies, and creator of Fray, the magazine of true stories and original art.
It's been six months since we added Tags to Technorati (where I'm Senior Designer), and as it turns out, it was a pretty big deal. So before we get too far away from it, here's the story of how it came about. From my perspective, anyway.
The page was set up to show any post that contained a link to it – in other words, if you linked to that page, then your post appeared on that page. ❧
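The mechanism described there (a post appears on a page if it links to that page) can be sketched in a few lines. The `POSTS` data, the `tags.example` URL, and the `posts_linking_to` helper are assumptions for illustration, not a description of how Technorati actually implemented it.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> element in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def posts_linking_to(target_url, posts):
    """Return the URLs of posts whose HTML contains a link to target_url.

    Trailing slashes are ignored when comparing; no other URL
    normalization is attempted in this sketch.
    """
    target = target_url.rstrip("/")
    matches = []
    for post_url, html in posts.items():
        collector = LinkCollector()
        collector.feed(html)
        if any(href.rstrip("/") == target for href in collector.hrefs):
            matches.append(post_url)
    return matches


# Hypothetical crawled posts, keyed by permalink.
POSTS = {
    "https://alice.example/post-1":
        '<p>Loving <a href="https://tags.example/tag/indieweb" rel="tag">indieweb</a>.</p>',
    "https://bob.example/hello":
        '<p>Nothing tagged, just <a href="https://example.org/">a link</a>.</p>',
    "https://carol.example/notes":
        '<p>More on <a href="https://tags.example/tag/indieweb/">the IndieWeb</a>.</p>',
}

print(posts_linking_to("https://tags.example/tag/indieweb", POSTS))
```

The difference from Webmention is the direction of discovery: here the receiving side has to crawl or index posts to find the links, whereas with Webmention the linking site notifies the target.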
Just a rehash of Refbacks? or an early implementation of Webmention?!
October 04, 2018 at 09:19AM
Round and round we go, where we’ll stop, nobody knows! The crazy game of tags gets crazier. What are Technorati tags really? And should we use them now that categories are being indexed in the same way? Jeff Jarvis has started another good conversation about tagging over at Buzzmachine. (He started another good conversation about tagging recently.) He recently implemented his interpretation of “tags”, and that got him thinking about their value and purpose.
People often ask why WordPress has both a Category and a Tag functionality, and to some extent it would seem to be just for this thing–differentiating between topics and objects–or at least it’s how I have used it and perceived others doing so as well. (Incidentally, from a functionality perspective, categories in the WordPress taxonomy also have a hierarchy while tags do not.) I find that I don’t always do a great job of differentiating between them, nor do I do so cleanly every time. Typically it’s more apparent when I go searching for something and have a difficult time finding it as a result. Usually the problem is getting back too many results instead of a smaller desired subset. In some sense I also look at categories as things which might be more interesting for others to subscribe to or follow via RSS from my site, though I also have RSS feeds for tags and for post types/kinds.
I also find that I have a subtle differentiation using singular versus plural tags which I think I’m generally using to differentiate between the idea of “mine” versus “others”. Thus the (singular) tag for “commonplace book” should be a reference to my particular commonplace book versus the (plural) tag “commonplace books” which I use to reference either the generic idea or the specific commonplace books of others. Sadly I don’t think I apply this “rule” consistently either, but hope to do so in the future.
I’ve also been playing around with some more technical tags like math.NT (standing for number theory), following the lead of arXiv.org. While I would generally have used a tag “number theory”, I’ve been toying around with the idea of using the math.XX format for more technical related research on my site and the more human readable “number theory” for the more generic popular press related material. I still have some more playing around with the idea to see what shakes out. I’ve noticed in passing that Terence Tao uses these same designations on his site, but he does them at the category level rather than the tag level.
Now that I’m several years into such a system, I should probably spend some time going back and broadening out the topic categories. (I arbitrarily attempt to keep the list small–in part for public display/vanity reasons, but it’s relatively easy to limit what shows to the public in my category list view.) Then I ought to do a bit of cleanup within the tags themselves, which have gotten unwieldy and often have spelling mistakes that can cause searches to fail. I also find that some of my auto-tagging processes, which import tags from the original sources’ pages, could be cleaned up as well, though those are generally stored in a different location on my website, so it’s not as big a deal to me.
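For the misspelled-tag cleanup, one low-effort approach is to flag rarely used tags that closely resemble a more frequently used one. The sketch below uses `difflib.get_close_matches` from Python's standard library; the tag counts, the `find_likely_typos` helper, and the 0.85 cutoff are assumptions for illustration. Note that a similarity test like this would also pair a deliberate singular/plural distinction such as “commonplace book” and “commonplace books”, so the output is a list of suggestions to review by hand, not merges to apply blindly.

```python
from difflib import get_close_matches

# Hypothetical tag usage counts exported from a site.
TAG_COUNTS = {
    "number theory": 40,
    "numbr theory": 1,
    "microformats": 25,
    "microfromats": 2,
    "indieweb": 60,
}


def find_likely_typos(tag_counts, cutoff=0.85):
    """Map each tag to a more frequently used tag that it closely resembles.

    A tag is only compared against tags with a strictly higher usage count,
    on the assumption that the misspelling is the rarer of the two.
    """
    suggestions = {}
    for tag, count in tag_counts.items():
        more_common = [t for t, c in tag_counts.items() if c > count]
        close = get_close_matches(tag, more_common, n=1, cutoff=cutoff)
        if close:
            suggestions[tag] = close[0]
    return suggestions


for typo, target in find_likely_typos(TAG_COUNTS).items():
    print(f"review: merge {typo!r} into {target!r}?")
```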
Naturally I find myself also thinking about the ontogeny/phylogeny problems of how I do these things versus how others at large do them as well, so feel free to chime in with your ideas, especially if you take tags/categories for your commonplace book/website seriously. I’d like to ultimately circle back around on this with regard to the more generic tagging done from a web-standards perspective within the IndieWeb and Microformats communities. I notice almost immediately that the “tag” and “category” pages on the IndieWeb wiki redirect to the same page, yet there are various microformats including p-category and u-category which are related but have slightly different meanings on first blush. (There is in fact an example on the IndieWeb “tag” page which includes both of these classes, neither of which seems to be counter-documented at the Microformats site.) I should also dig around to see what Kevin Marks or the crew at Technorati must surely have written a decade or more ago on the topic.
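To see how the two class names differ in practice, here is a rough sketch that pulls both out of a snippet of HTML: the related `p-category` property is read as the element's text, while `u-category` is read as a URL from `href`. This is a deliberate simplification of the microformats2 parsing rules (no nested microformats, no relative URL resolution, no implied properties), and the sample markup and `CategoryExtractor` class are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "wbr"}


class CategoryExtractor(HTMLParser):
    """Collect (property, value) pairs for p-category and u-category."""

    def __init__(self):
        super().__init__()
        self.categories = []
        self._depth = 0   # > 0 while inside a p-category element
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._depth:
            if tag not in VOID_TAGS:
                self._depth += 1
        elif "p-category" in classes and tag not in VOID_TAGS:
            self._depth = 1
            self._text = []
        if "u-category" in classes and attrs.get("href"):
            self.categories.append(("u-category", attrs["href"]))

    def handle_endtag(self, tag):
        if self._depth and tag not in VOID_TAGS:
            self._depth -= 1
            if self._depth == 0:
                value = "".join(self._text).strip()
                self.categories.append(("p-category", value))

    def handle_data(self, data):
        if self._depth:
            self._text.append(data)


# Hypothetical post markup using both class names.
SAMPLE = """
<article class="h-entry">
  <a class="p-category" href="/tag/indieweb">IndieWeb</a>
  <a class="u-category h-card" href="https://example.com/kevin">Kevin</a>
  <span class="p-category">microformats</span>
</article>
"""

extractor = CategoryExtractor()
extractor.feed(SAMPLE)
print(extractor.categories)
```

The plain-text value behaves like a topic label, while the URL value points at a specific thing (often a person), which lines up loosely with the topics-versus-objects distinction above.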
If I recall, programming wasn’t necessarily your strong suit, but like many in the IndieWeb will say: “Manual until it hurts!” By doing things manually, you’ll more easily figure out what might work and what might not, and then when you’ve found the thing that does, you spend some time programming it to automate the whole thing and make it easier. It’s quite similar to designing a college campus: let the students walk around naturally for a bit, then pave the natural walkways that they’ve created. This means you won’t have both the nicely gridded and unused sidewalks in addition to the ugly grass-less beaten paths. It’s also the broader generalization of paving the cow paths.
In addition to my Following page I’ve also been doing some experimenting with following posts using the Post Kinds Plugin. It is definitely a lot more manual than I’d like it to be. It does help to have made a bookmarklet to more quickly create follow posts, but until I’ve got it to a place that I really want it, it’s not (yet) worth automating taking the data from those follow posts to dump them into my Follow page for output there as well. Of course the fact that my follow posts have h-entry and h-feed markup means that someone might also decide to build a parser that will extract my posts into a feed which could then be plugged into something else like a microsub-based reader, so that I could make a follow post on my own site and the source would be automatically added to the subscription list in my reader.
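A parser along those lines might start out as small as the sketch below. It assumes follow posts mark the followed URL with a `u-follow-of` class inside each h-entry, which is an experimental IndieWeb property and may not match the markup the Post Kinds Plugin actually emits; the sample markup and the `subscriptions_from` helper are likewise hypothetical. Handing the resulting list to a Microsub server is left out entirely.

```python
from html.parser import HTMLParser


class FollowParser(HTMLParser):
    """Collect the href of every element carrying the u-follow-of class."""

    def __init__(self):
        super().__init__()
        self.followed = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        href = attrs.get("href")
        # Skip duplicates so following the same site twice yields one entry.
        if "u-follow-of" in classes and href and href not in self.followed:
            self.followed.append(href)


def subscriptions_from(html):
    """Return followed URLs in document order, without duplicates."""
    parser = FollowParser()
    parser.feed(html)
    return parser.followed


# Hypothetical h-feed of follow posts (markup is an assumption).
SAMPLE = """
<div class="h-feed">
  <article class="h-entry">
    Followed <a class="u-follow-of" href="https://kickscondor.example/">Kicks Condor</a>
  </article>
  <article class="h-entry">
    Followed <a class="u-follow-of" href="https://david.example/">David</a>
  </article>
  <article class="h-entry">
    Followed <a class="u-follow-of" href="https://kickscondor.example/">Kicks Condor</a> again
  </article>
</div>
"""

print(subscriptions_from(SAMPLE))
```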
In addition to Kicks Condor, I’m seeing others start to kick the tires of these things as well. David Shanske recently wrote Brainstorming on Implementing Vouch, Following, and Blogrolls, but I think he’s got a lot more going on in his thinking than he’s indicated in his post, which barely scratches the surface.
I also still often think back to a post from Dave Winer in 2016: Are you ready to share your OPML? This too has some experimental discovery features that only scratch the surface of the adjacent possible.
And of course just yesterday, Kevin Marks (previously of Technorati) reminded us about rel="directory" which could have some interesting implications for discovery and following. Think for a bit of how one might build a decentralized Technorati or something along the lines of Ryan Barrett’s indie map.
As things continue to grow, I’m seeing some of our decisions and experiments begin to affect others, as these are all functionality and discovery mechanisms that we’ll all need in the very near future. I hope you’ll continue to experiment and make cow paths that can eventually be paved.