When you visit a web archive to go back in time and look at a web page, you naturally expect it to display the content exactly as it appeared on the live web at that particular datetime. That expectation assumes, of course, that all of the resources on the page were captured at or near the datetime displayed in the banner for the root HTML page. However, we noticed that this is not always the case, and problems with archiving Twitter's new UI can result in replaying Twitter profile pages that never existed on the live web. In our previous blog post, we talked about how difficult it is to archive Twitter's new UI; in this blog post, we uncover how the new Twitter UI mementos in the Internet Archive are vulnerable to temporal violations.
If a Pulitzer-nominated 34-part series of investigative journalism can vanish from the web, anything can.
[The Web] is a constantly changing patchwork of perpetual nowness.
Highlighted on January 07, 2020 at 11:58AM
I’ve only come across one or two which archive.org didn’t crawl or didn’t have. For many of the broken links, I’m able to link directly to archive copies made on the same day I originally made the links, and my archive snapshots were the only ones ever made.
It was the biggest disaster in the history of the music business — and almost nobody knew. This is the story of the 2008 Universal fire.
I’m surprised that the author doesn’t whip out any references to the burning of the Library at Alexandria, which may have been roughly on par in terms of cultural loss to society. It’s painfully sad that UMG covered up the devastating loss.
The artwork for the piece is really brilliant. Some great art direction here.
A tool to view web pages using old browsers on legacy platforms
This talk will present innovative uses of Docker containers, emulators, and web archives to allow anyone to experience old web sites using old web browsers, as demonstrated by the Webrecorder and oldweb.today projects. Combining containerization with emulation can provide new techniques for preserving both scholarly and artistic interactive works, and enable obsolete technologies like Flash and Java applets to be accessible today and in the future. The talk will briefly cover the technology and how it can be deployed both locally and in the cloud. The latest research in this area, such as automated preservation of educational publishing platforms like Scalar, will also be presented. The presentation will include live demos, and attendees will also be invited to try the latest version of oldweb.today and interact with old browsers directly in their browser. The Q&A will help foster a discussion on the potential opportunities and challenges of containerization technology in ‘future-proofing’ interactive web content and software.
I’ve got a new piece over at The Atlantic on Barack Obama’s prospective presidential library, which will be digital rather than physical. This has caused some consternation. We need to realize, however, that the Obama library is already largely digital: The vast majority of the record his presid...
The means and methods of digital preservation also become an interesting test case for this particular presidency because so much of it was born digitally. I’m curious what the overlaps are for those working in the archival research space? In fact, I know that groups like the Reynolds Journalism Institute have been hosting conferences like Dodging the Memory Hole which are working at preserving born digital news and I suspect there’s a huge overlap with what digital libraries like this one are doing. I have to think Dan would make an interesting keynote speaker if there were another Dodging the Memory Hole conference in the near future.
Given my technological background, I’m less reticent than some detractors of digital libraries, but this article reminds me of some of the structural differences in this particular library from an executive and curatorial perspective. Some of these were well laid out in an episode of On the Media which I listened to recently. I’d be curious to hear what Dan thinks of this aspect of the curatorial design, particularly given the differences a primarily digital archive might have. For example, who builds the search interface? Who builds the API for such an archive and how might it be designed to potentially limit access of some portions of the data? Design choices may potentially make it easier for researchers, but given the current and some past administrations, what could happen if curators were less than ideal? What happens with changes in technology? What about digital rot or even link rot? Who chooses formats? Will they be standardized somehow? What prevents pieces from being digitally tampered with? When those who win get to write the history, what prevents those in the future from digitally rewriting the narrative? There’s lots to consider here.
What do you do with 11,000 blogs on a platform that is over a decade old? That is the question that the Division of Teaching and Learning Technologies (DTLT) and the UMW Libraries are trying to answer. This essay outlines the challenges of maintaining a large WordPress multisite installation and offers potential solutions for preserving institutional digital history. Using a combination of data mining, personal outreach, and available web archiving tools, we show the importance of a systematic, collaborative approach to the challenges we didn’t expect to face in 2007 when UMW Blogs launched. Complicating matters is the increased awareness of digital privacy and the importance of maintaining ownership and control over one’s data online; the collaborative nature of a multisite and the life cycle of a student or even faculty member within an institution blur the lines of who owns or controls the data found on one of these sites. The answers may seem obvious, but as each test case emerges, the situation becomes more and more complex. As an increasing number of institutions deal with legacy digital platforms housing intellectual property and scholarship, we believe that this essay outlines one potential path forward for long-term sustainability and preservation.
When it comes to their stuff, people often have a hard time letting go. When the objects of their obsession are rooms full of old clothes or newspapers, it can be unhealthy—even dangerous. But what about a stash that fits on ten 5-inch hard drives?
Ben Welsh of the LA Times data desk has built Savemy.News, which leverages Twitter in combination with archive.is, webcitation.org, and archive.org to allow journalists to quickly create multiple archives of their work simply by inputting the URLs of their pages. It also has a useful download function.
Richard MacManus, founder of RWW, wrote a worthwhile article on how and why he archived a lot of his past work.
Those with heavier digital journalism backgrounds and portfolios may find some useful information and research coming out of the Reynolds Journalism Institute’s Dodging the Memory Hole series of conferences. I can direct those interested to a variety of archivists, librarians, researchers, and technologists should they need heavier lifting than simpler solutions like archive.org, et al. can provide.
Additional ideas for archiving and saving online work can be found on the IndieWeb wiki page archival copy. There are some additional useful ideas and articles on the IndieWeb for Journalism page as well. I’d welcome anyone with additional ideas or input to feel free to add to any of these pages for others’ benefit as well. If you’re unfamiliar with wiki notation or editing, feel free to reply to this post; I’m happy to make additions on your behalf or help you log in and navigate the system directly.
If you don’t have a website where you keep your personal archive and/or portfolio online already, now might be a good time to put one together. The IndieWeb page mentioned above has some useful ideas, real world examples, and even links to tutorials.
As an added bonus for those who clicked through, if you’re temporarily unemployed and don’t have your own website/portfolio already, I’m happy to help build an IndieWeb-friendly website (gratis) to make it easier to store and display your past and future articles.
I’ve recently outlined how ideas like a Domain of One’s Own and IndieWeb philosophies could be used to allow researchers and academics to practice academic samizdat on the open web to own and maintain their own open academic research and writing. A part of this process is the need to have useful and worthwhile back up and archiving ability as one thing we have come to know in the history of the web is that link rot is not our friend.
Toward that end, for those in the space I’ll point out some useful resources including the IndieWeb wiki pages for archival copies. Please contribute to it if you can. Another brilliant resource is the annual Dodging the Memory Hole conference which is run by the Reynolds Journalism Institute.
While Dodging the Memory Hole is geared toward saving online news in particular, many of the conversations are nearly identical to those in the broader archival space and also involve larger institutional resources and constituencies like the Internet Archive, the Library of Congress, and university libraries as well. The conference is typically in the fall of each year and is usually announced sometime in August/September, so keep an eye out for its announcement. In the meantime, they’ve recorded past sessions and have archive copies of much of their prior work, in addition to creating a network of academics, technologists, and journalists around these ideas and related work. I’ve got a Twitter list of prior DtMH participants and stakeholders for those interested.
I’ll also note briefly, that as I self-publish on my own self-hosted domain, I use a simple plugin so that both my content and the content to which I link are being sent to the Internet Archive to create copies there. In addition to semi-regular back ups I make locally, this hopefully helps to mitigate potential future loss and link rot.
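For those curious about the mechanics, the Wayback Machine exposes a public "Save Page Now" endpoint (`https://web.archive.org/save/<url>`) that tools like the plugin mentioned above can use. Here's a minimal sketch; the function names are my own illustrations, and a real plugin may use a different or authenticated mechanism:

```python
# Sketch: submitting a URL to the Internet Archive's Wayback Machine
# via its public "Save Page Now" endpoint.
from urllib.parse import quote
from urllib.request import urlopen

WAYBACK_SAVE = "https://web.archive.org/save/"

def wayback_save_url(target: str) -> str:
    """Build the Save Page Now URL for a target page."""
    # Percent-encode the target, but keep the scheme and path separators.
    return WAYBACK_SAVE + quote(target, safe=":/")

def archive_page(target: str) -> int:
    """Request a capture of the page; returns the HTTP status code."""
    with urlopen(wayback_save_url(target)) as resp:  # network call
        return resp.status
```

For example, `wayback_save_url("https://example.com/post")` yields `https://web.archive.org/save/https://example.com/post`; fetching that URL asks the Internet Archive to capture the page.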
As a side note, major bonus points to Robin DeRosa (@actualham) for the use of the IndieWeb hashtag in her post!!
Dave Winer has a great post today on the closing of blogs.harvard.edu. These are sites run by Berkman, some dating back to 2003, which are being shut down. My galaxy brain goes towards the idea of …
I got an email in the middle of the night asking if I had seen an announcement from Berkman Center at Harvard that they will stop hosting blogs.harvard.edu. It's not clear what will happen to the archives. Let's have a discussion about this. That was the first academic blog hosting system anywhere. It was where we planned and reported on our Berkman Thursday meetups, and BloggerCon. It's where the first podcasts were hosted. When we tried to figure out what makes a weblog a weblog, that's where the result was posted. There's a lot of history there. I can understand turning off the creation of new posts, making the old blogs read-only, but as a university it seems to me that Harvard should have a strong interest in maintaining the archive, in case anyone in the future wants to study the role we played in starting up these (as it turns out) important human activities.
Running time: 0h 12m 59s | Download (13.9 MB) | Subscribe by RSS | Huffduff
A researcher posts their research work to their own website (as bookmarks, reads, likes, favorites, annotations, etc.); they can post their data for others to review; and they can post their ultimate publication to their own website.
The researcher’s post can webmention an aggregating website similar to the way they would pre-print their research on a server like arXiv.org. The aggregating website can then parse the original and display the title, author(s), publication date, revision date(s), abstract, and even the full paper itself. This aggregator can act as a subscription hub (with WebSub technology) which other researchers can use to find, discover, and read the original research.
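Under the W3C Webmention spec, sending such a notification has two steps: discover the receiver's webmention endpoint (advertised in an HTTP `Link` header or a `<link>`/`<a>` element with `rel="webmention"`), then POST form-encoded `source` and `target` URLs to it. A minimal sketch of the HTML-discovery half, using only the standard library (a full sender would also check the `Link` header):

```python
# Sketch of Webmention endpoint discovery per the W3C Webmention spec:
# find the first <link> or <a> element with rel="webmention" in the
# target page's HTML and resolve its href against the page URL.
from html.parser import HTMLParser
from urllib.parse import urljoin

class EndpointFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.endpoint = None

    def handle_starttag(self, tag, attrs):
        if self.endpoint is not None or tag not in ("link", "a"):
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").split()
        if "webmention" in rels and "href" in a:
            self.endpoint = a["href"]

def discover_endpoint(html: str, page_url: str):
    """Return the absolute webmention endpoint URL, or None if absent."""
    finder = EndpointFinder()
    finder.feed(html)
    if finder.endpoint is None:
        return None
    return urljoin(page_url, finder.endpoint)
```

With the endpoint in hand, the sender would POST `source=<researcher's post>&target=<aggregator page>` to it; the aggregator then fetches the source and parses out the title, authors, dates, and abstract.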
Readers of the original research can then write about, highlight, annotate, and even reply to it on their own websites to effectuate peer review, which then gets sent to the original by way of Webmention technology as well. The peer-reviewers’ work stands in public as work which could be used in evaluations for promotion and tenure.
Readers of original research can post metadata relating to it on their own websites, including bookmarks, reads, likes, replies, annotations, etc., and send webmentions not only to the original but also to the aggregation sites, which could collect these responses and assign them point values based on interaction/engagement levels (e.g., bookmarking something as “want to read” is 1 point, whereas indicating one has read something is 2 points, replying to it is 4 points, and an official citation in another publication provides 5 points). Such a scoring system could be used to provide a better citation measure of the overall value of a research article in a networked world. In general, Webmention could provide a two-way auditable trail for citations, and the citation trail could be used in combination with something like the Vouch protocol to prevent gaming the system with spam.
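The scoring idea above can be sketched as a tiny weighting function. The point values are the ones proposed in the post; the function and data shapes are hypothetical illustrations, not any existing standard:

```python
# A minimal sketch of the interaction-scoring idea. Weights follow the
# post's proposal; unknown interaction types contribute nothing.
INTERACTION_POINTS = {
    "bookmark": 1,   # marked "want to read"
    "read": 2,       # indicated they read it
    "reply": 4,      # wrote a reply
    "citation": 5,   # officially cited in another publication
}

def article_score(interactions):
    """Sum the weighted value of a list of interaction-type strings."""
    return sum(INTERACTION_POINTS.get(kind, 0) for kind in interactions)
```

So an article with one bookmark, one read, one reply, and one citation would score 1 + 2 + 4 + 5 = 12 points, giving aggregators a rough engagement-weighted measure to rank by.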
Government institutions (like the Library of Congress), universities, academic institutions, libraries, and non-profits (like the Internet Archive) can also create and maintain archival copies of digital and/or printed research for future generations. This would guard against the deaths of researchers and the disappearance of their sites from the internet, providing better longevity.
Resources mentioned in the microcast
IndieWeb for Education
IndieWeb for Journalism
arXiv.org (an example pre-print server)
A Domain of One’s Own
Article on A List Apart: Webmentions: Enabling Better Communication on the Internet