This really gives a new meaning to the “paper of record.”
I saw the internet create and destroy a bizarro version of myself.
I’ve been reading some pieces from my archive on context collapse and on people losing jobs and opportunities after online bullies dig up their old social media posts, which has become a bigger issue of late. Many people have wanted to leave social media platforms because of their toxic cultures, and this seems to be a subset of that: people are going back and deleting old social posts for fear of the implications in the present.
Quinn Norton has some relatively sage advice about the internet in this piece. Of course it’s no coincidence that The New York Times editorial board wanted to hire her.
Highlights, Quotes, Annotations, & Marginalia
History doesn’t ask you if you want to be born in a time of upheaval, it just tells you when you are. ❧
August 03, 2018 at 08:00AM
I have a teenage daughter, and I have told her all her life that all the grown-ups are making it up as they go along. I have also waggled my eyebrows suggestively while saying it, to make it clear to her that I mean me, too. ❧
August 03, 2018 at 08:00AM
This taught me that not everyone worthy of love is worthy of emulation. It also taught me that being given terrible ideas is not a destiny, and that intervention can change lives. ❧
August 03, 2018 at 08:02AM
Not everyone believes loving engagement is the best way to fight evil beliefs, but it has a good track record. Not everyone is in a position to engage safely with racists, sexists, anti-Semites, and homophobes, but for those who are, it’s a powerful tool. Engagement is not the one true answer to the societal problems destabilizing America today, but there is no one true answer. The way forward is as multifarious and diverse as America is, and a method of nonviolent confrontation and accountability, arising from my pacifism, is what I can bring to helping my society. ❧
August 03, 2018 at 08:03AM
I am not immune from these mistakes, for mistaking a limited snapshot of something for what it is in its entirety. I have been on the other side. ❧
August 03, 2018 at 08:04AM
I had been a victim of something the sociologists Alice Marwick and danah boyd call context collapse, where people create online culture meant for one in-group, but exposed to any number of out-groups without its original context by social-media platforms, where it can be recontextualized easily and accidentally. ❧
August 03, 2018 at 08:05AM
I had even written about context collapse myself, but that hadn’t saved me from falling into it, and then hurting other people I didn’t mean to hurt. ❧
August 03, 2018 at 08:06AM
It helped me learn a lesson: Be damn sure when you make angry statements. ❧
August 03, 2018 at 08:07AM
Don’t internet angry. If you’re angry, internet later. ❧
August 03, 2018 at 08:07AM
Context collapse is our constant companion online. ❧
August 03, 2018 at 08:07AM
I used to think that showing someone how wrong they were on the internet could fix the world. I said a lot of stupid things when I believed that. ❧
August 03, 2018 at 08:08AM
I am not, and will never be, a simple writer. I have sought to convict, accuse, comfort, and plead with my readers. I’m leaving the majority of my flaws online: Go for it, you can find them if you want. It’s a choice I made long ago. ❧
August 03, 2018 at 08:09AM
If you look long enough you can find my early terrible writing. You can find blog posts in which I am an idiot. I’ve had a lot of uninformed and passionate opinions on geopolitical issues from Ireland to Israel. You can find tweets I thought were witty, but think are stupid now. You can find opinions I still hold that you disagree with. I’m going to leave most of that stuff up. In doing so, I’m telling you that you have to look for context if you are seeking to understand me. You don’t have to try, I’m not particularly important, but I am complicated. When I die, I’m going to instruct my executors to burn nothing. Leave the crap there, because it’s part of my journey, and that journey has a value. People who came from where I did, and who were given the thoughts I was given, should know that the future can be different from the past. ❧
August 03, 2018 at 08:13AM
A new audio series following Rukmini Callimachi as she reports on the Islamic State and the fall of Mosul. This series includes disturbing language and scenes of graphic violence.
I’ve sampled several episodes via The Daily, so I’m officially subscribing so I can get the rest of the episodes.
There are some interesting thoughts here about archiving news pages online. It also subtly highlights the importance of having one’s own domain, so that pages can be redirected from their originals to archived versions, possibly ones built on different technology. This article is sure to be of interest to folks in the Journalism Digital News Archive/Dodging the Memory Hole camp (#DtMH2017).
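The redirect idea can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions: the `ARCHIVE_REDIRECTS` table, the `resolve_redirect` function, and the example.com/example.org URLs are all hypothetical and are not taken from the article.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table: retired paths on my own domain mapped to
# wherever the archived copy now lives. Every entry here is made up.
ARCHIVE_REDIRECTS = {
    "/2004/05/old-story.html": "https://archive.example.org/2004/05/old-story",
}


def resolve_redirect(url, redirects=ARCHIVE_REDIRECTS):
    """Return (status, location) for a requested URL.

    A known retired path yields a 301 pointing at the archived copy;
    anything else is served as-is with a 200.
    """
    # Normalize a trailing slash so both spellings of a path match.
    path = urlsplit(url).path.rstrip("/") or "/"
    target = redirects.get(path)
    if target is None:
        return 200, url
    return 301, target
```

Because the table lives on a domain I control, the archived copy can move again later and only the table needs to change; inbound links keep working.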
Here’s what you need to know to start your day.
If you missed the notes from Day 1, see this post.
It may take me a week or so to finish putting some general thoughts and additional resources together based on the two day conference so that I might give a more thorough accounting of my opinions as well as next steps. Until then, I hope that the details and mini-archive of content below may help others who attended, or provide a resource for those who couldn’t make the conference.
Overall, it was an incredibly well programmed and run conference, so kudos to all those involved who kept things moving along. I’m now certainly much more aware of the gaping memory hole the internet is facing despite the heroic efforts of a small handful of people and institutions attempting to improve the situation. I’ll try to go into more detail later about a handful of specific topics and next steps, as well as a listing of resources I came across which may prove to be useful tools for those in both the archiving/preserving and IndieWeb communities.
Archive of materials for Day 2
Below are the recorded audio files embedded in .m4a format (using a Livescribe Pulse Pen) for several sessions held throughout the day. To my knowledge, none of the breakout sessions were recorded except for the one which appears below.
Summarizing archival collections using storytelling techniques
Presentation: Summarizing archival collections using storytelling techniques by Michael Nelson, Ph.D., Old Dominion University
Saving the first draft of history
Special guest speaker: Saving the first draft of history: The unlikely rescue of the AP’s Vietnam War files by Peter Arnett, winner of the Pulitzer Prize for journalism
Kiss your app goodbye: the fragility of data journalism
Panel: Kiss your app goodbye: the fragility of data journalism
Featuring Meredith Broussard, New York University; Regina Lee Roberts, Stanford University; Ben Welsh, The Los Angeles Times; moderator Martin Klein, Ph.D., Los Alamos National Laboratory
The future of the past: modernizing The New York Times archive
Panel: The future of the past: modernizing The New York Times archive
Featuring The New York Times Technology Team: Evan Sandhaus, Jane Cotler and Sophia Van Valkenburg; moderated by Edward McCain, RJI and MU Libraries
Lightning Rounds: Six Presenters
Lightning rounds (in two parts)
Six + one presenters: Jefferson Bailey, Terry Britt, Katherine Boss (and team), Cynthia Joyce, Mark Graham, Jennifer Younger and Kalev Leetaru
1: Jefferson Bailey, Internet Archive, “Supporting Data-Driven Research using News-Related Web Archives”
2: Terry Britt, University of Missouri, “News archives as cornerstones of collective memory”
3: Katherine Boss, Meredith Broussard and Eva Revear, New York University, “Challenges facing preservation of born-digital news applications”
4: Cynthia Joyce, University of Mississippi, “Keyword ‘Katrina’: Re-collecting the unsearchable past”
5: Mark Graham, Internet Archive/The Wayback Machine, “Archiving news at the Internet Archive”
6: Jennifer Younger, Catholic Research Resources Alliance, “Digital Preservation, Aggregated, Collaborative, Catholic”
7: Kalev Leetaru, senior fellow, The George Washington University and founder of the GDELT Project, “A Look Inside The World’s Largest Initiative To Understand And Archive The World’s News”
Technology and Community
Presentation: Technology and community: Why we need partners, collaborators, and friends by Kate Zwaard, Library of Congress
Breakout: Working with CMS
Working with CMS, led by Eric Weig, University of Kentucky
Alignment and reciprocity
Alignment & reciprocity by Katherine Skinner, Ph.D., executive director, the Educopia Institute
Closing remarks by Edward McCain, RJI and MU Libraries and Todd Grappone, associate university librarian, UCLA
Live Tweet Archive
Reminder: In many cases my tweets don’t reflect direct quotes of the attributed speaker, but are often slightly modified for clarity and length for posting to Twitter. I have made a reasonable attempt in all cases to capture the overall sentiment of individual statements while using as many original words of the participant as possible. Typically, for speed, there wasn’t much editing of these notes. Below I’ve changed the attribution of one or two tweets to reflect the proper person(s). For convenience, I’ve also added a few hyperlinks to useful resources after the fact that I didn’t have time to include in the original tweets. I’ve attached .m4a audio files of most of the audio for the day (apologies for shaky quality as it’s unedited) which can be used for more direct attribution if desired. The Reynolds Journalism Institute videotaped the entire day and livestreamed it. Presumably they will release the video on their website for a more immersive experience.
Condoms were required issue in Vietnam–we used them to waterproof film containers in the field.
Do not stay close to the head of a column, medics, or radiomen. #warreportingadvice
I told the AP I would undertake the task of destroying all the reporters’ files from the war.
Instead the AP files moved around with me.
Eventually the 10 trunks of material went back to the AP when they hired a brilliant archivist.
“The negatives can outweigh the positives when you’re in trouble.”
Our first panel: Kiss your app goodbye: the fragility of data journalism
I teach data journalism at NYU
A news app is not what you’d install on your phone
Dollars for Docs is a good example of a news app
A news app is something that allows the user to put themself into the story.
Often there are three CMSs: web, print, and video.
News apps don’t live in any of the CMSs. They’re bespoke and live on a separate data server.
This has implications for crawlers which can’t handle them well.
Then how do we save news apps? We’re looking at examples and then generalizing.
Everyblock.com was a good example based on chicagocrime and later bought by NBC and shut down.
What?! The internet isn’t forever? Databases need to be saved differently than web pages.
Reprozip was developed by NYU Center for Data and we’re using it to save the code, data, and environment.
We make apps that serve our audience.
We also make internal tools that empower the newsroom.
We also use our nerdy skills to do cool things.
Most of us aren’t good programmers, we “cheat” by using frameworks.
Frameworks do a lot of basic things for you, so you don’t have to know how to do it yourself.
Archiving tools often aren’t built into these frameworks.
Instagram, Pinterest, Mozilla, and the LA Times use Django as our framework.
Memento for WordPress is a great way to archive pages.
We must do more. We need archiving baked into the systems from the start.
Slides at http://bit.ly/frameworkfix
Got data? I’m a librarian at Stanford University.
I’ll mention Christine Borgman’s book Big Data, Little Data, No Data.
Journalists are great data liberators: FOIA requests, cleaning data, visualizing, getting stories out of data.
But what happens to the data once the story is published?
BLDR: Big Local Digital Repository, an open repository for sharing open data.
For metadata: www.ddialliance.org, RDF, International Image Interoperability Framework (IIIF), and MODS
We’ll open up for questions.
What’s more important: obeying copyright laws or preserving the content?
The new creative commons licenses are very helpful, but we have to be attentive to many issues.
Perhaps archiving it and embargoing for later?
Saving the published work is more important to me, and the rest of the byproduct is gravy.
I work for the New York Times, you may have heard of it…
Talking about modernizing the born-digital legacy content.
Our problem was how to make an article from 2004 look like it had been published today.
There were hundreds of thousands of articles missing.
There was no one definitive list of missing articles.
Outlining the workflow for reconciling the archive XML and the definitive list of URLs for conversion.
It’s important to use more than one source for building an archive.
I’m going to talk about all of “the little things” that came up along the way.
Article Matching: Fusion – How to combine print XML with web HTML that was scraped.
Primarily, we looked at common phrases between the corpora of the two different data sets.
We prioritized the print data over the digital data.
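As a rough illustration of the phrase-matching idea in the notes above, here is a minimal sketch that pairs a print article with its most similar scraped web article by counting shared three-word phrases. It is my own assumption about how such matching might work, not the Times team’s actual method; the function names, the n-gram size, and the threshold are all hypothetical.

```python
import re


def shingles(text, n=3):
    """Return the set of n-word phrases (lowercased) found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if either is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(print_text, web_articles, n=3, threshold=0.3):
    """Return the key of the web article sharing the most phrases
    with print_text, or None if nothing reaches the threshold."""
    target = shingles(print_text, n)
    best_key, best_score = None, 0.0
    for key, text in web_articles.items():
        score = jaccard(target, shingles(text, n))
        if score > best_score:
            best_key, best_score = key, score
    return best_key if best_score >= threshold else None
```

Once a pair is matched, preferring the print text over the scraped text (as the panel described) is then just a question of which side of the pair gets kept.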
We maintain a system called switchboard that redirects from old URLs to the new ones to prevent link rot.
The case of the missing sections: some sections of the content were blank and not transcribed.
We made the decision of taking out data we had in lieu of making a better user experience for missing sections.
In the future, we’d also like to put photos back into the articles.
Can you discuss the decision to go with a more modern interface rather than a traditional archive of how it looked?
Some of the decision was to get the data into an accessible format for modern users.
We do need to continue work on preserving the original experience.
Is there a way to distinguish between the print version and the online versions in the archive?
Could a researcher do work on the entire corpus? Is it available for subscription?
We do have a sub-section of data available, but don’t have it prior to 1960.
Have you documented the process you’ve used on this preservation project?
We did save all of the code for the project within GitHub.
We do have meeting notes which provide some documentation, though they’re not thorough.
Oh dear. Of roughly 1,155 tweets I counted about #DtMH2016 in the last week, roughly 25% came from me. #noisy
Opensource tool I had mentioned to several: @wallabagapp A self-hostable application for saving web pages https://www.wallabag.org
Should I be adding major media outlets to my Facebook feed as family members? Changes by Facebook, which are highlighted in The New York Times, may mean this is coming: The Atlantic can be my twin brother, and Foreign Affairs could be my other sister.
“News content posted by publishers will show up less prominently, resulting in less traffic to companies that have come to rely on Facebook audiences.” — Facebook to Change News Feed to Focus on Friends and Family in New York Times
After reading this article, I can only think that Facebook wrongly thinks that my family is so interesting (and believe me, I don’t think I’m any better; most of my posts–much like my face–are ones which only a mother could “like”/“love,” and my feed will bear that out! BTW I love you mom.) The majority of posts I see there are rehashes of so-called “news” sites I really don’t care about or invitations to participate in games like Candy Crush Saga.
While I love keeping up with friends and family on Facebook, I’ve had to very heavily modify how I organize my Facebook feed to get what I want out of it because the algorithms don’t always do a very good job. Sadly, I’m probably in the top 0.0001% of people who take advantage of any of these features.
It really kills me that although publishers see quite a lot of traffic from social media silos (and particularly Facebook), they’re still losing some sight of the power of owning your own website and posting there directly. Apparently the past history littered with examples like Zynga and social reader tools hasn’t taught them the lesson to continue to iterate on their own platforms. One day the rug will be completely pulled out from underneath them and real trouble will result. They’ll wish they’d put all their work and effort into improving their own product rather than allowing Facebook, Twitter, et al. to siphon off a lot of their resources. If there’s one lesson that we’ve learned from media over the years, it’s that owning your own means of distribution is a major key to success. Sharecropping one’s content out to social platforms is probably not a good idea while under pressure to change for the future.
Psst… With all this in mind, if you’re a family member or close friend who wants to
- have your own website;
- own your own personal data (which you can automatically syndicate to most of the common social media sites); and
- be in better control of your online identity,
I’ll offer to build you a simple one and host it at cost.