When it comes to their stuff, people often have a hard time letting go. When the object of their obsession is rooms full of old clothes or newspapers, it can be unhealthy, even dangerous. But what about a stash that fits on 10 5-inch hard drives?
This is an important topic and something which should be tended to on an ongoing basis.
Ben Welsh of the LA Times data desk has built Savemy.News, which leverages Twitter in combination with archive.is, webcitation.org, and archive.org to allow journalists to quickly create multiple archives of their work by simply inputting the URLs of their related pages. It also has a useful download function.
Those with heavier digital journalism backgrounds and portfolios may find some useful information and research coming out of Reynolds Journalism Institute’s Dodging the Memory Hole series of conferences. I can direct those interested to a variety of archivists, librarians, researchers, and technologists should they need heavier lifting than simpler solutions like archive.org, et al., can provide.
Additional ideas for archiving and saving online work can be found on the IndieWeb wiki page archival copy. There are some additional useful ideas and articles on the IndieWeb for Journalism page as well. Anyone with additional ideas or input should feel free to add to any of these pages for others’ benefit. If you’re unfamiliar with wiki notation or editing, feel free to reply to this post; I’m happy to make additions on your behalf or help you log in and navigate the system directly.
If you don’t have a website where you keep your personal archive and/or portfolio online already, now might be a good time to put one together. The IndieWeb page mentioned above has some useful ideas, real world examples, and even links to tutorials.
As an added bonus for those who clicked through, if you’re temporarily unemployed and don’t have your own website/portfolio already, I’m happy to help build an IndieWeb-friendly website (gratis) to make it easier to store and display your past and future articles.
I’ve recently outlined how ideas like a Domain of One’s Own and IndieWeb philosophies could be used to allow researchers and academics to practice academic samizdat on the open web to own and maintain their own open academic research and writing. A part of this process is the need for useful and worthwhile backup and archiving ability, since one thing we have come to know in the history of the web is that link rot is not our friend.
Toward that end, for those in the space I’ll point out some useful resources including the IndieWeb wiki pages for archival copies. Please contribute to it if you can. Another brilliant resource is the annual Dodging the Memory Hole conference which is run by the Reynolds Journalism Institute.
While Dodging the Memory Hole is geared toward saving online news in particular, many of the conversations are nearly identical to those in the broader archival space and also involve larger institutional resources and constituencies like the Internet Archive, the Library of Congress, and university libraries as well. The conference is typically in the fall of each year and is usually announced in August/September sometime, so keep an eye out for its announcement. In the meantime, they’ve recorded past sessions and have archived copies of much of their prior work in addition to creating a network of academics, technologists, and journalists around these ideas and related work. I’ve got a Twitter list of prior DtMH participants and stakeholders for those interested.
I’ll also note briefly that as I self-publish on my own self-hosted domain, I use a simple plugin so that both my content and the content to which I link are being sent to the Internet Archive to create copies there. In addition to semi-regular backups I make locally, this hopefully helps to mitigate potential future loss and link rot.
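For anyone without such a plugin, the same idea can be roughly sketched by hand. This is only an illustration, not how my plugin works: it assumes the Wayback Machine’s public “Save Page Now” address form (https://web.archive.org/save/ followed by the page URL), and the function names and User-Agent string are made up for the example. The service may rate-limit or refuse requests, so treat a failure as normal.

```python
from urllib.request import Request, urlopen

# Assumed address form for the Wayback Machine's "Save Page Now" feature.
SAVE_PREFIX = "https://web.archive.org/save/"


def save_url(page_url: str) -> str:
    """Build the Save Page Now address for a single page."""
    return SAVE_PREFIX + page_url


def links_to_archive(post_url, outbound_links):
    """Return the post plus every page it links to, de-duplicated, in order."""
    seen, ordered = set(), []
    for url in [post_url, *outbound_links]:
        if url not in seen:
            seen.add(url)
            ordered.append(url)
    return ordered


def archive(page_url: str, timeout: float = 30.0) -> int:
    """Ask the Wayback Machine to capture page_url and return the HTTP status.

    This performs a real network request, so it is not called automatically.
    """
    request = Request(save_url(page_url), headers={"User-Agent": "archive-sketch/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling archive() on each entry of links_to_archive(post, links) would request a capture of both the post and everything it points to, which is the part that helps with link rot.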
As a side note, major bonus points to Robin DeRosa (@actualham) for the use of the IndieWeb hashtag in her post!!
Dave Winer has a great post today on the closing of blogs.harvard.edu. These are sites run by Berkman, some dating back to 2003, which are being shut down. My galaxy brain goes towards the idea of …
An interesting take on self-hosting and DoOO ideas with regard to archiving and maintaining web presences. I’ll try to write a bit more on this myself shortly as it’s an important area that needs to be expanded for all on the open web.
I got an email in the middle of the night asking if I had seen an announcement from Berkman Center at Harvard that they will stop hosting blogs.harvard.edu. It's not clear what will happen to the archives. Let's have a discussion about this. That was the first academic blog hosting system anywhere. It was where we planned and reported on our Berkman Thursday meetups, and BloggerCon. It's where the first podcasts were hosted. When we tried to figure out what makes a weblog a weblog, that's where the result was posted. There's a lot of history there. I can understand turning off the creation of new posts, making the old blogs read-only, but as a university it seems to me that Harvard should have a strong interest in maintaining the archive, in case anyone in the future wants to study the role we played in starting up these (as it turns out) important human activities.
This is some earthshaking news. Large research institutions like this should be maintaining archives of these types of things in a de facto manner. Will have to think about some implications for others in the DoOO and IndieWeb spaces.
The researcher’s post can webmention an aggregating website similar to the way they would pre-print their research on a server like arXiv.org. The aggregating website can then parse the original and display the title, author(s), publication date, revision date(s), abstract, and even the full paper itself. This aggregator can act as a subscription hub (with WebSub technology) which other researchers can use to find, discover, and read the original research.
Readers of the original research can then write about, highlight, annotate, and even reply to it on their own websites to effectuate peer-review which then gets sent to the original by way of Webmention technology as well. The work of the peer-reviewers stands in the public as potential work which could be used for possible evaluation for promotion and tenure.
Readers of original research can post metadata relating to it on their own website including bookmarks, reads, likes, replies, annotations, etc. and send webmentions not only to the original but to the aggregation sites, which could aggregate these responses. The responses could also be given point values based on interaction/engagement levels (i.e. bookmarking something as “want to read” is 1 point, whereas indicating one has read something is 2 points, replying to something is 4 points, and other publications which officially cite it provide 5 points). Such a scoring system could be used to provide a better citation measure of the overall value of a research article in a networked world. In general, Webmention could be used to provide a two-way auditable trail for citations, and the citation trail can be used in combination with something like the Vouch protocol to prevent gaming the system with spam.
Government institutions (like Library of Congress), universities, academic institutions, libraries, and non-profits (like the Internet Archive) can also create and maintain an archival copy of digital and/or printed copies of research for future generations. This would be necessary to guard against the death of researchers and their sites disappearing from the internet so as to provide better longevity.
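The point scheme suggested above is easy to sketch. The point values come from the text; everything else here (the kind labels, the idea of counting each source once per kind, the function name) is an assumption made for illustration, not a description of any existing aggregator.

```python
# Point values from the scheme described above; the labels are hypothetical.
POINTS = {"bookmark": 1, "read": 2, "reply": 4, "citation": 5}


def score(responses):
    """Sum point values over (source_url, kind) webmentions for one article.

    Each source counts once per kind, so re-sending the same webmention
    does not inflate the total. Unknown kinds score zero.
    """
    unique = set(responses)
    return sum(POINTS.get(kind, 0) for _source, kind in unique)


responses = [
    ("https://a.example/notes/1", "bookmark"),
    ("https://b.example/reads/2", "read"),
    ("https://c.example/replies/3", "reply"),
    ("https://d.example/papers/4", "citation"),
    ("https://a.example/notes/1", "bookmark"),  # duplicate, ignored
]
print(score(responses))  # prints 12
```

Because each response is keyed by the URL that sent it, the total stays auditable: anyone can follow the sources back to the posts that earned the points.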
How many more times do people have to get stiffed by a free web service that just bites the dust and leaves you bubkas?
A monster post, some ranting on companies like Storify who offer free services that leverage our effort to get worth enough to get sold – when they do they just yank our content, an approach for local archiving your storify dying content, a new home spun tool for extracting all embeddable content links and how to use it to create your own archives in WordPress.
Storify Is Nuking, for no credible reason, All Your Content
Okay there are two kinds of people or organizations that create things for the web. One is looking to make money or fame and cares not what happens once they get either (or none and go back to flipping burgers). The other has an understanding and care for the history and future of the web, and makes every effort to make archived content live on, to not leave trails of dead links.
I like Alan Levine’s take on type one and type two silo services. Adobe/Storify definitely seems to be doing things the wrong way for shutting down a service. He does a great job of laying out some thought on how to create collection posts, particularly on WordPress, though I suspect the user interface could easily be recreated on other platforms.
I would add some caution to some of his methods as he suggests using WordPress’s embed capabilities by using raw URLs to services like Twitter. While this can be a reasonable short term solution and the output looks nice, if the original tweet or content at that URL is deleted (or Twitter shuts down and 86s it the same way Storify has just done), then you’re out of luck again!
Better than relying on the auto-embed handled by WordPress, actually copy the entire embed from Twitter to capture the text and content from the original.
There’s a big difference in the following two pieces of data:
https://twitter.com/judell/status/940973536675471360
<blockquote class="twitter-tweet" data-lang="en">
<p dir="ltr" lang="en">I hope <a href="https://twitter.com/Storify?ref_src=twsrc%5Etfw">@storify</a> will follow the example set by <a href="https://twitter.com/dougkaye?ref_src=twsrc%5Etfw">@dougkaye</a> when he shut down ITConversations: <a href="https://t.co/oBTWmR5M3A">https://t.co/oBTWmR5M3A</a>. My shows there are now preserved (<a href="https://t.co/IuIUMvMXi3">https://t.co/IuIUMvMXi3</a>) in a way that none of my magazine writing was.</p>
&mdash; Jon Udell (@judell) <a href="https://twitter.com/judell/status/940973536675471360?ref_src=twsrc%5Etfw">December 13, 2017</a>
</blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
While WordPress ostensibly displays them the same, one will work as long as Twitter lives, and the other lives as long as your own site lives and actually maintains the original content.
Now there are certainly bigger issues for saving video content this way from places like YouTube given copyright issues as well as bandwidth and other technical concerns. In these cases, perhaps embedding the URLs only within WordPress is the way to go. But keep in mind what it is you’re actually copying/archiving when you use the method he discusses.
Side note: I prefer the closer Yiddish spelling of bupkis. It is however a great term for what you often end up receiving from social silos that provide you with services that you can usually pretty easily maintain yourself.
Introduction to what one would consider basic web communication
A few days ago I had written a post on my website and a colleague had written a reply on his own website. Because we were both using the W3C Webmention specification on our websites, my site received the notification of his response and displayed it in the comments section of my website. (This in and of itself is really magic enough–cross website @mentions!)
To reply back to him I previously would have written a separate second post on my site in turn to reply to his, thereby fragmenting the conversation across multiple posts and making it harder to follow the conversation. (This is somewhat similar to what Medium.com does with their commenting system as each reply/comment is its own standalone page.)
Instead, I’ve now been able to configure my website to allow me to write a reply directly to a response within my comments section admin UI (or even in the comments section of the original page itself), publish it, and have the comment be sent to his reply and display it there. Two copies for the price of one!
This means that now, WordPress-based websites (at least self-hosted versions running the WordPress.org code) can easily and simply allow multiple parties to write posts on their own sites and participate in multi-sided conversations back and forth while all parties maintain copies of all sides of the conversation on their own websites in a way that maintains all of the context. As a result, if one site should be shut down or disappear, the remaining websites will still have a fully archived copy of the entire conversation thread. (Let’s hear it for the resilience of the web!)
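Under the hood, the notification that passes between the two sites is a small HTTP request. As a minimal sketch of what the Webmention spec asks of a sender (not the code any WordPress plugin actually runs), the sending site POSTs two form-encoded values, source and target, to the receiving site’s endpoint. The URLs and function names below are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_webmention(endpoint: str, source: str, target: str) -> Request:
    """Prepare the form-encoded POST that notifies `target` about `source`."""
    body = urlencode({"source": source, "target": target}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def send_webmention(endpoint: str, source: str, target: str, timeout: float = 10.0) -> int:
    """Send the notification and return the HTTP status.

    A 2xx status means the receiver accepted the request; it may still
    verify and moderate the mention before displaying it.
    """
    with urlopen(build_webmention(endpoint, source, target), timeout=timeout) as response:
        return response.status


# Hypothetical example: my reply on site A mentions a post on site B.
request = build_webmention(
    "https://b.example/webmention",
    source="https://a.example/my-reply",
    target="https://b.example/their-post",
)
```

The receiver then fetches the source page itself to confirm that it really links to the target, which is why each side ends up holding its own copy of the conversation.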
What is happening?
This functionality is seemingly so simple that one is left wondering:
“Why wasn’t this baked into WordPress (and the rest of the web) from the start?”
“Why wasn’t this built after the rise of Twitter, Facebook, or other websites which do this as a basic function?”
“How can I get it tout suite?!” (aka gimme, gimme, gimme, and right now!!!)
While it seems simple, the technical hurdles aren’t necessarily so, because there had previously never been a universal protocol for the web to allow it. (The Webmention spec now makes it possible.) Sites like Facebook, Twitter, and others enable it because they’ve got a highly closed and highly customized environment that makes it a simpler problem to solve. In fact, even old-school web-based bulletin boards allowed this!
But even within social media one will immediately notice that you can’t use your Facebook account to reply to a Twitter account. And why not?! (While the web would be far better if one website or page could talk to another, these sites don’t for the simple economic reason that they want you using only their site and not others, and not enabling this functionality keeps you locked into what they’re selling.)
I’ll detail the basic set up below, but thought that it would be highly illustrative to have a diagram of what’s physically happening in case the description above seems a bit confusing to picture properly. I’ll depict two websites, each in their own column and color-coded so that content from site A is one color while content from site B is another color.
It really seems nearly incomprehensible to me how this hasn’t been built into the core functionality of the web from the beginning of at least the blogosphere. Yet here we are, and somehow I’m demonstrating how to do this from one WordPress site to another via the open web in 2017. To me this is the entire difference between a true Internet and just using someone else’s intranet.
While this general functionality is doable on any website, I’ll stick to enabling it specifically on WordPress, a content management system that is powering roughly 30% of all websites on the internet. You’ll naturally need your own self-hosted WordPress-based website with a few custom plugins and a modern semantic-based theme. (Those interested in setting it up on other platforms are more than welcome to explore the resources of the IndieWeb wiki and their chat which has a wealth of resources.)
As a minimum set you’ll want to have the following list of plugins enabled and configured:
Other instructions and help for setting these up and configuring them can be found on the IndieWeb wiki, though not all of the steps there are necessarily required for this functionality.
Ideally this all should function regardless of the theme you have chosen, but WordPress only provides the most basic support for microformats version 1 and doesn’t support the more modern version 2 out of the box. As a result, the display of comments from site to site may be a bit wonky depending on how supportive your particular theme is of the microformats standards. As you can see I’m using a relatively standard version of the TwentySixteen theme without a lot of customization and getting some reasonable results. If you have a choice, I’d recommend one of the following specific themes which have solid semantic markup:
The final plugin that enables sending comments from one comment section to another is the WordPress Webmention for Comments plugin. As it is still somewhat experimental and is not available in the WordPress repository, you’ll need to download it from GitHub and activate it. That’s it! There aren’t any settings or anything else to configure.
With the plugin installed, you should now be able to send comments and replies to replies directly within your comments admin UI (or directly within your comments section on individual pages, though this not only requires additional clicks to get there, but you also lose the benefit of the admin editor).
There is one current caveat however. For the plugin to actually send the webmention properly, it will need to have a URL in your reply that includes the microformats u-in-reply-to class. Currently you’ll need to do this manually until the plugin can properly parse and target the fragmentions for the comments properly. I hope the functionality can be added to the plugin to make the experience seamless in the future.
So what does this u-in-reply-to part actually look like? Here’s an example of the one I used to send my reply:
The class tells the receiving site that the webmention is a reply and to display it as such and the URL is necessary for your webmention plugin to know where to send the notification. You’d simply need to change the URL and the word (or words) that appear between the anchor tags.
If you want to have a hidden link and still send a webmention you could potentially add your link to a zero width space as well. This would look like the following:
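Both forms of the link described above, the visible one and the hidden one, can be sketched with a few lines of Python that build the markup and then confirm a parser can find the reply target in it. The URLs and link text are placeholders rather than the ones I actually used, and the helper names are invented for the example.

```python
from html import escape
from html.parser import HTMLParser

ZERO_WIDTH_SPACE = "\u200b"  # invisible link text for the "hidden" variant


def reply_link(target: str, text: str = ZERO_WIDTH_SPACE) -> str:
    """Build an anchor carrying the microformats u-in-reply-to class."""
    return f'<a class="u-in-reply-to" href="{escape(target, quote=True)}">{escape(text)}</a>'


class ReplyTargets(HTMLParser):
    """Collect the href of every anchor marked u-in-reply-to."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "a" and "u-in-reply-to" in classes and attributes.get("href"):
            self.targets.append(attributes["href"])


def reply_targets(html: str) -> list:
    """Return the URLs a piece of reply markup says it is responding to."""
    parser = ReplyTargets()
    parser.feed(html)
    return parser.targets


visible = reply_link("https://b.example/post#comment-1", "your post")
hidden = reply_link("https://b.example/post#comment-1")
```

Here visible produces an ordinary link reading “your post”, while hidden produces the same anchor wrapped around a zero-width space, so nothing shows on the page but the class and URL are still there for the receiving site to read.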
Based on my experiments, using a <link> via HTML will work, but it will send it as a plain webmention to the site and it won’t show up natively as a reply.
Sadly, a plain text reply doesn’t work (yet), but hopefully some simple changes could be made to force it to using the common fragmentions pattern that WordPress uses for replies.
Interestingly, this capability has been around for a while; it just hasn’t been well documented or described. I hope now that those with WordPress sites that already support Webmentions will have a better idea what this plugin is doing and how it works.
Eventually one might expect that all the bugs in the system get worked out and the sub-plugin for sending comment Webmentions will be rolled up into the main Webmentions plugin, which incidentally handles fragmentions already.
In addition to the notes above, I will say that this is still technically experimental code not running on many websites, so its functionality may not be exact or perfect in actual use, though in experimenting with it I have found it to be very stable. I would recommend checking that the replies actually post to the receiving site, which incidentally must be able to accept webmentions. If the receiving website doesn’t have webmention support, one will need to manually cut and paste the content there (and likely opt in to receive notification of replies via email, so you can stay apprised of future replies).
You can check the receiving site’s webmention support in most browsers by right clicking and viewing the page’s source. Within the source one should see code in the <head> section of the page which indicates there is a webmention endpoint. Here is an example of the code typically injected into WordPress websites that you’d be looking for:
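The same check can be done programmatically. In the sketch below, the sample markup is an illustration of the kind of link element to look for: the host is a placeholder and the endpoint path is an assumption about what the WordPress Webmention plugin emits. The sketch only inspects the page’s HTML; the Webmention spec also lets a site advertise its endpoint in an HTTP Link header, which is skipped here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Illustrative markup only; the host and endpoint path are assumptions.
SAMPLE = (
    "<html><head>"
    '<link rel="webmention" href="https://b.example/wp-json/webmention/1.0/endpoint" />'
    "</head><body></body></html>"
)


class EndpointFinder(HTMLParser):
    """Remember the first <link> or <a> element whose rel includes webmention."""

    def __init__(self):
        super().__init__()
        self.endpoint = None

    def handle_starttag(self, tag, attrs):
        if self.endpoint is not None or tag not in ("link", "a"):
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "webmention" in rels and attributes.get("href") is not None:
            self.endpoint = attributes["href"]


def find_endpoint(page_url: str, html: str):
    """Return the absolute webmention endpoint for a page, or None if absent."""
    finder = EndpointFinder()
    finder.feed(html)
    if finder.endpoint is None:
        return None
    # Endpoints may be relative, so resolve them against the page's own URL.
    return urljoin(page_url, finder.endpoint)
```

If find_endpoint returns None for a page you have fetched, the site most likely cannot receive your reply and you are back to cutting and pasting.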
Also keep in mind that some users moderate their comments, so that even though your mention was sent, they may need to approve it prior to it displaying on the page.
If you do notice problems or issues or have quirks, please file the issue with as full a description of what you did and what resulted as you can so that it can be troubleshot and made to work not only for you, but hopefully work better for everyone else.
Give it a try
So you’ve implemented everything above? Go ahead and write a reply on your own WordPress website and send me a webmention! I’ll do my best to reply directly to you so you can send another reply to make sure you’ve got things working properly.
Once you’re set, go forward and continue helping to make the web a better place.
I wanted to take a moment to give special thanks to Aaron Parecki, Matthias Pfefferle, and David Shanske who have done most of the Herculean work to get this and related functionality working. And thanks also to all who make up the IndieWeb community that are pushing the boundaries of what the web is and what it can accomplish. And finally, thanks to Khürt Williams who became the unwitting guinea pig for my first attempt at this. Thank you all!
I really like Barnes’ intent to share. I just wonder if there is a means of owning these notes. Ideally, taking a POSSE approach, she might live blog and post this to Twitter. I vaguely remember Chris Aldrich sharing something about this recently, but the reference escapes me. This is also limited with her blog being located at WP.com. I therefore wondered about the option of pasting the content of the tweets into a blog as an archive.
Aaron, the process I use for taking longer streams of Tweets to own them (via PESOS) has Kevin Marks‘ excellent tool Noter Live at its core. Noter Live allows you to log in via Twitter and tweet(storm) from it directly. As its original intent was for live-tweeting at conferences and events, it has some useful built in tools for storing the names of multiple speakers (in advance, or even quickly on the fly) as well as auto-hashtagging your conversation. (I love it so much I took the time to write and contribute a user-manual.)
The best part is that it not only organically threads your tweets together into one continuing conversation, but it also gives you a modified output including the appropriate HTML and microformats classes so that you can cut and paste the entire thread and simply dump it into your favorite CMS and publish it as a standard blog post. (It also strips out the hashtags and repeated speaker references in a nice way.) With a small modification, you can get your site to add hovercards to your post as well. I’ll note in passing that it has recently been updated to support the longer 280-character limit.
Another shorter tweetstorm which also has u-syndication links for all of the individual tweets can be found at Indieweb and Education Tweetstorm. This one has the benefit of pulling in all the resultant conversations around my tweetstorm with backfeed from Brid.gy, though they’re not necessarily threaded properly in the comments the way I would ultimately like. As you mention in the last paragraph that having the links to the syndicated copies would be useful, I’ll note that I’ve already submitted it as an issue to Noter Live’s GitHub repo. In some sense, the entire Twitter thread is connected, so having the original tweet URL gives you most of the context, though it isn’t enough for all of the back feed by common methods (Webmentions+Brid.gy) presently.
I’ll also note that I’ve recently heard from a reputable source about a WordPress specific tool called Publishiza that may be useful in this way, but I’ve not had the chance to play with it yet myself.
Clearly, you can embed Tweets, often by adding the URL. However, there are more and more people deleting their Tweets and if you embed something that is deleted, this content is then lost. (Not sure where this leaves Storify etc.)
It’s interesting that you ask where this leaves Storify, because literally as I was reading your piece, I got a pop-up notification announcing that Storify was going to be shut down altogether!! (It sounds to me like you may have been unaware when you wrote your note. So Storify and those using it are in more dire circumstances than you had imagined.)
Storify announces it will disappear from the web on May 16, 2018. Once a core part of social-focused journalism projects like @acarvin‘s work, it’s larger archives and URLs will be gone. https://t.co/9KhEYCbX2e
It’s yet another reason in a very long list why one needs to have and own their own digital presence.
As for people deleting their tweets, I’ll note that by doing a full embed (instead of just using a URL) from Twitter to WordPress (or using Noter Live), the original text is preserved so that even if the original is deleted, a full archival copy of the original still exists.
Somewhat related in flavor to the mechanism you’re discussing, I also often use Hypothesis to comment on, highlight, and annotate web pages for academic/research uses. To save these annotations, I’ll add hashtags to the annotations within Hypothesis and then use Kris Shaffer’s excellent Hypothesis Aggregator plugin to parse the data and pull in the specific parts I want. Though here again, either Hypothesis as a service or the plugin itself may ultimately fail, so I will copy/paste the raw HTML from its output to post onto my site for future safekeeping. In some sense I’m using the plugin as a simple tool to make the transcription and data transport much easier/quicker.
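For those not on WordPress, the same “pull my annotations by tag and keep the HTML” idea can be roughly sketched without the plugin. This is a substitute technique, not what the Hypothesis Aggregator plugin does internally. It assumes the public Hypothesis search API at api.hypothes.is and the general shape of the annotation records it returns (uri, text, and a target list holding a TextQuoteSelector); the account name, tag, and function names are placeholders.

```python
from html import escape
from urllib.parse import urlencode

# Assumed public search endpoint of the Hypothesis API.
API_SEARCH = "https://api.hypothes.is/api/search"


def search_url(user: str, tag: str, limit: int = 50) -> str:
    """Build a search address for one account's annotations with a given tag."""
    return API_SEARCH + "?" + urlencode({"user": user, "tag": tag, "limit": limit})


def quoted_text(annotation: dict) -> str:
    """Return the highlighted passage, if the annotation records one."""
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            if selector.get("type") == "TextQuoteSelector":
                return selector.get("exact", "")
    return ""


def to_html(annotation: dict) -> str:
    """Render one annotation as a blockquote plus note, ready to paste into a post."""
    passage = escape(quoted_text(annotation))
    note = escape(annotation.get("text", ""))
    uri = escape(annotation.get("uri", ""), quote=True)
    return f'<blockquote cite="{uri}">{passage}</blockquote>\n<p>{note}</p>'


# Hand-written sample record in the assumed shape, for illustration only.
sample = {
    "uri": "https://example.com/article",
    "text": "Worth archiving.",
    "target": [{"selector": [{"type": "TextQuoteSelector", "exact": "link rot is not our friend"}]}],
}
```

Fetching search_url(...) and running to_html over each returned record would give plain HTML that lives on your own site, independent of whether the service or any plugin survives.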
I hope these tips make it easier for you and others to better collect your content and display it for later consumption and archival use.
Who owns the publication you’re reading right now?
It’s a question you should ask no matter what you’re reading. In Latin there’s a phrase cui bono, which roughly translates as “who is benefiting?” It’s a good idea to know who is profiting in any situation. Why? So you can make educated decisions.
If things are as potentially nefarious as they sound here, I’m archiving a copy of this article to the Internet Archive now, just in case the new owners notice and it disappears.
Wait! Aren’t you researching Twitter?
I am indeed and the preceding discussion has largely centred on pingbacks, a feature of blogs, rather than microblogs. I have two points to make here: firstly that microblogs and Twitter may have features which function in a similar way to pingbacks. The retweet for example provides a similar link to a text or resource that someone else has produced. I’ll admit that it has less permanence than a pingback, patiently ensconced at the foot of a blog and ready to whisk the reader off to the linked blog, but then the structure and function of Twitter is one of flow and change when compared with a blog; it’s a different beast. The second is that my point of entry to the blogs and their interconnected web of enabling pingbacks was a tweet. Two actually. Andrea’s tweet took me to another tweet which referenced Aditi’s blog post; had I not been on Twitter and had Andrea and I not made a connection through that platform, the likelihood of me ever being aware of Aditi’s post and the learning opportunities that it and its wider assemblage brings together would be minimal.
I’m finding your short study and thoughts on pingbacks while I was thinking about Webmentions (and a particular issue that Aaron Davis was having with them) after having spent a chunk of the day remotely following the Dodging the Memory Hole 2017 conference at the Internet Archive in San Francisco.
It’s made me realize that one of the bigger advantages Webmentions have over their predecessors, pingbacks and trackbacks, is that at least a snapshot of the content is captured on the receiving site. As you’ve noted, while the receiving site has the scant data from the pingback, there’s not much to look at in general and even less when the sending site has disappeared from the web. In the case of Webmentions, even if the sending site has disappeared from the web, the receiving site can still potentially display more of that missing content if it wishes. Within the WordPress ecosystem simple mentions only show the indication that the article was mentioned, but hiding within the actual database on the back end is a copy of the post itself. With a few quick changes to make the “mention” into a “reply” the content of the original post can be quickly uncovered/recovered. (I do wonder a bit if you cross-referenced the Internet Archive or other sources in your search to attempt to recover those lost links.)
I will admit that I recall the Webmention spec allowing a site to modify and/or update its replies/webmentions, but in practice I’m not sure how many sites actually implement this functionality, so from an archival standpoint it’s probably pretty solid/stable at the moment.
Separately, I also find myself looking at your small example and how you’ve expanded it out a level or two within your network to see how it spread. This reminds me of Ryan Barrett’s work from earlier this year on the IndieWeb network in creating the Indie Map tool which he used to show the interconnections between over three thousand people (or their websites) using links like Webmentions. Depending on your broader study, it might make an interesting example to look at and/or perhaps some code to extend?
With particular regard to your paragraph under “Wait! Aren’t you researching Twitter?” I thought I’d point you to a hybrid approach of melding some of Twitter and older/traditional blogs together. I personally post everything to my own website first and syndicate it to Twitter and then backfeed all of the replies, comments, and reactions via Brid.gy using webmentions. While there aren’t a lot of users on the internet doing something like this at the moment, it may provide a very different microcosm for you to take a look at. I’ve even patched together a means to allow people to @mention me on Twitter that sends the data to my personal website as a means of communication.
After a bit of poking around, I was also glad to find a fellow netizen who is also consciously using their website as a commonplace book of sorts.