microformats | Chris Aldrich

In August 2012, I wrote a quick script to stream front-page Hackernews stories to an IRC channel on Freenode (##hackernews in case you're interested) so that I could quickly glance at popular stories there instead of needing to load Hackernews. Since IRC is my feed reader, I've always tried to pipe as much there as possible.

Very smart and reminiscent of some of the stuff Drew McLellan and Jeremy Keith were doing almost a decade before. There is a lot more power in microformats that most web developers give them credit for. Aaron has a great example and use-case.

My favorite part here:

So in 2.5 years of parsing the HTML, I never had any problems. In 2 days of parsing the JSON API, I hit a glitch where all the stories were empty.
Since more people and programs see the HTML than use the API, the HTML ends up being more reliable.

Manual Backfeed in the Blogosphere

Forcing webmentions for conversations on websites that don’t support Webmention

Within the IndieWeb community there is a process called backfeed which is the process of syndicating interactions on your syndicated (POSSE) copies back (AKA reverse syndicating) to your original posts. As it’s commonly practiced, often with the ever helpful Brid.gy service, it is almost exclusively done with social media silos like Twitter, Instagram, Flickr, Github, and Mastodon. This is what allows replies to my content that I’ve syndicated to Twitter, for example, to come back and live here on my website.

Why not practice this with other personal websites? This may become increasingly important in an ever growing and revitalizing blogosphere as people increasingly eschew corporate social sites and their dark patterns of tracking, manipulative algorithmic feeds, and surveillance capitalism. It’s also useful for sites whose owners may not have the inclination, time, effort, energy or expertise to support the requisite technology.

I’ve done the following general reply pattern using what one might call manual backfeed quite a few times now (and I’m sure a few others likely have too), but I don’t think I’ve seen it documented anywhere as a common IndieWeb practice. As a point of fact, my method outlined below is really only half-manual because I’m cleverly leveraging incoming webmentions to reduce some of the work.

Manually syndicating my replies

Sometimes when using my own website to reply to another that doesn’t support the W3C’s Webmention spec, I’ll manually syndicate (a fancy way of saying cut-and-paste) my response to the website I’m responding to. In these cases I’ll either put the URL of my response into the body of my reply, or in sites like WordPress that ask for my website URL, I’ll use that field instead. Either way, my response appears on their site with my reply URL in it (sometimes I may have to wait for my comment to be moderated if the receiving site does that).

Here’s the important part: Because my URL appears on the receiving site (sometimes wrapped as a link on either my name or the date/time stamp depending on the site’s user interface choices), I can now use it to force future replies on that site back to my original via webmention! My site will look for a URL pointing back to it to verify an incoming webmention on my site.

Replies from a site that doesn’t support sending Webmentions

Once my comment appears on the receiving site, and anyone responds to it, I can take the URL (with fragment) for those responses, and manually input them into my original post’s URL reply box. This will allow me to manually force a webmention to my post that will show up at minimum as a vanilla mention on my website.

(Note, if your site doesn’t have a native box like this for forcing manual webmentions, you might try external tools like Aaron Parecki’s Telegraph or Kevin Mark’s Mention.Tech, which are almost as easy. For those who are more technical, cURL is an option as well.)

Depending on the microformats mark up of the external site, the mention may or may not have an appropriate portion for the response and/or an avatar/name. I can then massage those on my own site (one of the many benefits of ownership!) so that the appropriate data shows, and I can change the response type from a “mention” to a “reply” (or other sub-types as appropriate). Et voilà, with minimal effort, I’ve got a native looking reply back on my site from a site that does not support Webmention! This is one of the beautiful things of even the smallest building-blocks within the independent web or as a refrain some may wish to sing–“small pieces, loosely joined”!

This method works incredibly well with WordPress websites in particular. In almost all cases the comments on them will have permalink URLs (with fragments) to target the individual pieces, often they’ve got reasonable microformats for specifying the correct h-card details, and, best of all, they have functionality that will send me an email notification when others reply to my portion of the conversation, so I’m actually reminded to force the webmentions manually.

An Illustrative Example

As an example, I posted on my website that I’d read an article on Matt Maldre’s site along with a short comment. Since Matt (currently) doesn’t support either incoming or outgoing webmentions, I manually cut-and-pasted my reply to the comment section on his post. I did the same thing again later with an additional comment on my site to his (after all, why start a new separate conversation thread when I can send webmentions from my comments section and keep the context?).

Matt later approved my comments and posted his replies on his own website. Because his site is built on WordPresss, I got email notifications about his replies, and I was able to use the following URLs with the appropriate fragments of his comments in my manual webmention box:

https://www.spudart.org/blog/xeroxing-your-face/#comment-43843
https://www.spudart.org/blog/xeroxing-your-face/#comment-43844

After a quick “massage” to change them from “mentions” into “replies” and add his gravatar, they now live on my site where I expect them and in just the way I’d expect them to look if he had Webmention support on his website.

I’ll mention that, all of this could be done in a very manual cut-and-paste manner–even for two sites, neither of which have webmention support. But having support for incoming webmentions on one’s site cuts back significantly on that manual pain.

For those who’d like to give it a spin, I’ll also mention that I’ve similarly used the incredibly old refbacks concept in the past as a means of notification from other websites (this can take a while) and then forced manual webmentions to get better data out of them than the refback method allows.

Read The discovery metadata field by Matt Maldre (Matt Maldre)

The internet would be a really interesting place if every article that was shared automatically had a “via link.” Ok, so the internet is already interesting. But what makes the internet such a great place is its connectivity. Everything is linked together. We can easily share a link to an article. So many links all … The discovery metadata field Read More »

I’ve been fascinated with this idea of vias, hat tips, and linking credit (a la the defunct Curator’s Code) just like Jeremy Cherfas. I have a custom field in my site for collecting these details sometimes, but I should get around to automating it and showing it on my pages rather than doing it manually.

Links like these seem like throwaways, but they can have a huge amount of value in aggregate. As an example, if I provided the source of how I found this article, then it’s likely that my friend Matt would then be able to see a potential treasure trove of information about the exact same topic which he’s sure to have a lot of interest in as well.

One of the things I love about webmentions is that these sorts of links to give credit could be used to create bi-directional links between sites as well. I’m half-tempted to start using custom experimental microformats classes on these links so that when the idea takes off that people could potentially display them in their comments sections as such instead of just vanilla “mentions”. This could be useful for sites that serve as inspiration in much the same way that journalistic outlets might display reads (versus bookmarks, likes, or reposts) or podcasts could display listens. Just imagine the power that displaying webmentions on wikis could have for their editors to later update pages or readers might have to delve into further resources that mention and link to those pages, especially when the content on those linked pages extends the ideas?

Tim Berners-Lee’s original proposal for hypertext was rejected because it didn’t bake bi-directional links into the web (c.f. Webstock ‘18: Jeremy Keith – Taking Back The Web at 13:39 into the video). Webmentions seems to be a simple way of ensconcing them after-the-fact, but in a way that makes them more resilient as well as update-able and even delete-able by either side.

Of course now I come to wonder just how it was that Jeremy Cherfas finds such a deep link on Matt’s site from over a year ago? 😉

↬ Jeremy Cherfas‘ update on the IndieWeb wiki ᔥ the IndieWeb-meta chat (2020-01-29)

I changed the theme of my website the other day, so I spent a few minutes this morning updating the microformats in it to be more IndieWeb friendly. Here’s hoping I haven’t broken anything too badly.

I like the idea of the microformats web extension. I suspect it could do with an update to microformats v2 though. The idea of being able to parse a page and add contact information or events directly to my address books or calendars is pretty awesome.

I could also appreciate it parsing a page and allowing me to use an h-card to quickly create a follow post and automatically add a page’s feed to my feed reader.

Domains, power, the commons, credit, SEO, and some code implications

How to provide better credit on the web using the standard rel=“canonical” by looking at an example from the Open Learner Patchbook

A couple of weeks back, I noticed and began following Cassie Nooyen when I became aware of her at the Domains 2019 conference which I followed fairly closely online.

She was a presenter and wrote a couple of nice follow up pieces about her experiences on her website. I bookmarked one of them to read later, and then two days later I came across this tweet by Terry Green, who had also apparently noticed her post:

I really hope this new post of the Open Learner Patchbook comes across the feed of lots of learners who haven’t experienced a Domain of One’s Own program before.

Patch Twenty Five – My Domain, My Place To Grow

by Cassie Nooyen @CassieNooyen https://t.co/0hjEtyJ2XU @Autumm

— Terry Greene (@greeneterry) June 21, 2019

But I was surprised to see the link in the tweet points to a different post in the Open Learner Patchbook, which is an interesting site in and of itself.

This means that there are now at least two full copies of Cassie’s post online:

While I didn’t see a Creative Commons notice on Cassie’s original or any mention of permissions or even a link to the source of the original on the copy on the Open Patchbook, I don’t doubt that Terry asked Cassie for permission to post a copy of her work on his site. I’ll also suspect that it may have been the case that Cassie might not have wanted any attention drawn to herself or her post on her site and may have eschewed a link to it. I will note that the Open Patchbook did have a link to her Twitter presence as a means of credit. (I’ll still maintain that people should be preferring links to their own domain over Twitter for credits like these–take back your power!)

Even with these crediting caveats aside, there’s a subtle technical piece hiding here relating to search engines and search engine optimization that many in the Domain of One’s Own space may not realize exists, or if they do they may not be sure how to fix. This technical subtlety is that search engines attempt to assign proper credit too. As a result there’s a very high chance that Open Patchbook could rank higher in search for Cassie’s own post than Cassie’s original. As researchers and educators we’d obviously vastly prefer the original to get the credit. So what’s going on here?

Search engines use a web standard known as rel=“canonical”, a microformat which is most often found in the HTML <header> of a web page. If we view the current source of the copy on the Open Learner Patchbook, we’ll see the following:

<link rel="canonical" href="http://openlearnerpatchbook.org/technology/patch-twenty-five-my-domain-my-place-to-grow/" />

According to the Microformats wiki:

By adding rel=“canonical” to a hyperlink, a page indicates that the destination of that hyperlink should be considered the preferred or definitive version of the current page. This helps search engines avoid duplicate content, and is useful for deciding how to link to a page when citing it.

In the case of our example of Cassie’s post, search engines will treat the two pages as completely separate, but will suspect that one is a duplicate of the other. This could have dramatic consequences for one or the other sites in which search engines will choose one to prefer over the other, and, in some cases, search engines may penalize one site for having duplicate content and not stating that fact (in their metadata). Typically this would have more drastic and averse consequences for Cassie’s original in comparison with an institutional site.

How do we fix the injustice of this metadata?

There are a variety of ways, but I’ll focus on several in the WordPress space.

WordPress core has built-in functionality that should set the permalink for a particular page as the canonical one. This is why the Open Patchbook page displays the incorrect canonical link. Since most people are likely to already have an SEO related plugin installed on their site and almost all of them have this capability, this is likely the quickest and easiest method for being able to change canonical links for pages and posts. Two popular choices for this are Yoast and All in One SEO which have simple settings for inputting and saving alternate canonical URLs. Yoast documents the steps pretty well, so I’ll provide an example using All in One SEO:

If not done already, click the checkbox for canonical URLs in the “General Settings” section for the plugin generally found at /wp-admin/admin.php?page=all-in-one-seo-pack%2Faioseop_class.php.
For the post (or page) in question, within the All in One SEO metabox in the admin interface (pictured) put the full URL of the original posts’ location.
(Re-)publish the post.

If you’re using another SEO plugin, it likely handles canonical URLs similarly, so check their documentation.

For aggregation websites, like the Open Learner Patchbook, there’s also another solid option for not only setting the canonical URL, but for more quickly copying the original post as well. In these cases I love PressForward, a WordPress plugin from the Roy Rosenzweig Center for History and New Media which was designed with the education space in mind. The plugin allows one to quickly gather, organize, and republish content from other places on the web. It does so in a smart and ethical way and provides ample opportunity for providing appropriate citations as well as, for our purposes, setting the original URL as the canonical one. Because PressForward is such a powerful and diverse tool (as well as a built-in feed reader for your WordPress website), I’ll refer users to their excellent documentations.

Another useful reason I’ll mention for using rel-canonical mark up is that I’ve seen cases in which using it will allow other web standards-based tools like Hypothes.is to match pages for highlights and annotations. I suspect that if the Open Patchwork page did have the canonical link specified that any annotations made on it with Hypothes.is should mirror properly on the original as well (and vice-versa).

I also suspect that there are some valuable uses of this sort of small metadata-based mark up within the Open Educational Resources (OER) space.

In short, when copying and reposting content from an original source online, it’s both courteous and useful to mark the copy as such by putting a tag onto the URL of the original to provide it with the full credit as the canonical source.

An annotation example for Hypothes.is using <blockquote> markup to maintain annotations on quoted passages

A test of some highlighting functionality with respect to rel-canonical mark up. I’m going to blockquote a passage of an original elsewhere on the web with a Hypothes.is annotation/highlight on it to see if the annotation will properly transclude it.

I’m using the following general markup to make this happen:

<blockquote><link rel="canonical" href="https://www.example.com/annotated_URL">
Text of the thing which was previously annotated.
</blockquote>

Let’s give it a whirl:

This summer marks the one-year anniversary of acquiring my domain through St. Norbert’s “Domain of One’s Own” program Knight Domains. I have learned a few important lessons over the past year about what having your own domain can mean.

SECURITY

The first issue that I never really thought about was the security and privacy on my domain. A few months after having my domain, I realized that if you searched my name, my domain was one of the first things that popped up. I was excited about this, but I soon realized that this meant everything I blogged about was very much in the open. This meant all of my pictures and also every person I have mentioned. I made the decision to only use first names when talking about others and the things we have done together. This way, I can protect their privacy in such an open space. With social media you have some control over who can see your post based on who “friends” or “follows you”; on a domain, this is not as much of a luxury. Originally, I thought my domain would be something I only shared with close friends and family, like a social media page, but understanding how many people have the opportunity to see it really shocked me and pushed me to think about the bigger picture of security and safety for me and those around me.

—Cassie Nooyens in What Having a Domain for a year has Taught Me

Unfortunately, however, I’m noticing that if I quote multiple sources this way (at least in my Chrome browser), only the last quoted block of text transcludes the Hypothes.is annotations. Based on prior experiments using rel-canonical mark up I’ve noticed this behavior, but I suspect it’s simply the fact that the rel-canonical appears on the page and matches one original. It would be awesome if such a rel-canonical link which was nested into any number of blockquote tags would cause the annotations from the originals

Perhaps Jon Udell and friends could shed some light on this and or make some tweaks so that blockquoting multiple sources within the same page could also allow the annotations on those quoted passages to be transcluded onto them?

Separately, I’m a tad worried that any annotations now made on my original could also be mistakenly pushed back to the quoted pages because of the matching rel-canonical without anything taking into account the nested portions of the page or the blockquoted pieces. I’ll make a test on a word or phrase like “security and privacy” to see if this is the case. We’ll all notice that of course this test fails by seeing the highlight on Cassie’s original. Oh well…

So the question becomes, is there a way within the annotation spec to allow us to write simple HTML documents that blockquote portions of other texts in such a way that we can bring over the annotations of those other texts (or allow annotating them on our original page and have them pushed back to the original) within the blockquoted portions, yet still not interfere with annotating our own original document? Ideally what other HTML tags could/should this work on? Further could this be common? Generally useful? Or simply just a unique edge case with wishful thinking made from this pet example? Perhaps there’s a better way to implement it than my just having thrown in the random link on a whim? Am I misguidedly attempting to do something that already exists?