If there’s one thing that’s abundantly true recently, it is that journalism can use all the help it can get. Won’t you join me in helping to effect some change?
Over 75 alumni, students, and faculty from Johns Hopkins got together on January 12, 2017 for a networking mixer and panel discussion relating to the entertainment and media business. Johns Hopkins alum and emeritus board of trustees member Don Kurz (A&S ’77) graciously hosted the event at his company Omelete in Culver City.
Just prior to the event, the current students, primarily seniors and juniors who were visiting Los Angeles as part of an Intersession course within the Film and Media Studies Program, met with alum and cinematography legend Caleb Deschanel (A&S ’66) to hear about his industry experiences and ask questions. Following this, there was an hour-long drinks/hors d’oeuvres mixer for both students and alumni.
Host Don Kurz then thanked everyone for attending and introduced Film and Media Studies Program Director Linda DeLibero. She gave a quick overview of the program and its growth over the past few years and introduced the group of students who had traveled out from Baltimore for the class.
Following this Don moderated a panel discussion and Q&A featuring alumni Paul Boardman (A&S ’89), Jason Altman (A&S ’99), Devon Chivvis (A&S ’96), and Chris Aldrich (Engr. ’96). Panelists discussed how some of their Hopkins experiences helped to shape their subsequent careers in the entertainment and media sectors.
After the discussion, Linda presented Don with a gift of a framed artistic print of the newly renovated Center Theatre, part of Station North and a hub for the Film and Media Studies Program. She thanked him for all of his work and effort on behalf of the program, saying she knew he’s “always got our back.”
Emily Hogan gave a brief overview of her work at the JHU Career Center and encouraged alumni who had job openings or internship opportunities within their companies or knew of other opportunities for students/alumni to contact her with details and help in filling them.
Following some additional networking, a portion of the crowd retired to nearby pub/restaurant Public School 310 to continue the discussion.
I’ve been invited to participate in a panel discussion as part of an Intersession course by the Johns Hopkins Film and Media Studies Program. I hope fellow alumni in the entertainment and media sectors will come out and join us in Culver City on Thursday.
Join the Hopkins in Hollywood Affinity Group (AEME LA) as they welcome Linda DeLibero, Director of the JHU Film and Media Studies Program, and current students of the program for a dynamic evening of networking which features an alumni panel of industry experts.
Open to alumni, students, and friends of Hopkins, this event is sponsored by Donald Kurz (A&S ’77), Johns Hopkins University Emeritus Trustee and School of Arts and Sciences Advisory Board Member, and the Hopkins in Hollywood (AEME LA) Affinity Group.
Event Date: Thursday, January 12, 2017
Start Time: 6:30pm
End Time: 8:30pm
Donald Kurz, A&S ’77
Jason Altman, A&S ’99
Jason Altman is an Executive Producer at Activision working on the Skylanders franchise and new development projects. Prior to Activision, he spent the past 5 years at Ubisoft Paris in different leadership roles, most recently as the Executive Producer of Just Dance, the #1 music video game franchise. He is a veteran game producer who loves the industry, and is a proud graduate of the media studies program at Johns Hopkins.
Paul Harris Boardman, A&S ’89
Paul Boardman wrote The Exorcism of Emily Rose (2005) and Devil’s Knot (2014), both of which he also produced, and Deliver Us From Evil (2014), which he also executive produced. In 2008, Paul produced The Day the Earth Stood Still for Fox, and he did production rewrites on Poltergeist, Scream 4, The Messengers, and Dracula 2000, as well as writing and directing the second unit for Hellraiser: Inferno (2000) and writing Urban Legends: Final Cut (2000). Paul has written screenplays for various studios and production companies, including Trimark, TriStar, Phoenix Pictures, Miramax/Dimension, Disney, Bruckheimer Films, IEG, APG, Sony, Lakeshore, Screen Gems, Universal and MGM.
Devon Chivvis, A&S ’96
Devon Chivvis is a showrunner/director/producer of narrative and non-fiction television and film. Inspired by a life-long passion for visual storytelling combined with a love of adventure and the exploration of other cultures, Devon has made travel a priority through her work in film and television. Devon holds a B.A. from Johns Hopkins University in International Relations and French, with a minor in Italian.
Chris Aldrich, Engr ’96
Chris started his career at Hopkins while running several movie groups on campus and was responsible for over $200,000 of renovations in Shriver Hall including installing a new screen, sound system, and 35mm projection while also running the 29th Annual Milton S. Eisenhower Symposium “Framing Society: A Century of Cinema” on the 100th anniversary of the moving picture.
Following Hopkins he joined Creative Artists Agency where he worked in Motion Picture Talent and also did work in music-crossover. He later joined Davis Entertainment with a deal at 20th Century Fox where he worked on the productions of Heartbreakers, Dr. Dolittle 2, Behind Enemy Lines as well as acquisition and development of Alien v. Predator, Paycheck, Flight of the Phoenix, Garfield, The Man from U.N.C.L.E., I, Robot and countless others.
Missing the faster pace of representation, he later joined Writers & Artists Agency for several years working in their talent, literary, and book departments. Since that time he’s had his own management company focusing on actors, writers, authors, and directors. Last year he started Boffo Socko Books, an independent publishing company and recently put out the book Amerikan Krazy.
Part of the course:
Students will have the opportunity to spend one week in Los Angeles with Film and Media Studies Director Linda DeLibero. Students will meet and network with JHU alums in the entertainment industry, as well as heads of studios and talent agencies, screenwriters, directors, producers, and various other individuals in film and television. Associated fee with this intersession course is $1400 (financial support is available for those who qualify). Permission of Linda DeLibero is required. Film and Media Studies seniors and juniors will be given preference for the eight available slots, followed by senior minors. Students are expected to arrive in Los Angeles on January 8. The actual course runs January 9-13 with lodging check-in on January 8 and check-out on January 14.
Course Number: AS.061.377.60
Days: Monday 1/9/2017 – Friday 1/13/2017
Times: M – TBA | Tu- TBA | W- TBA | Th- TBA | F- TBA
Instructor: Linda DeLibero
My friend (and brilliant writer) Kevin Smokler will be appearing tonight and tomorrow here in Los Angeles as part of the tour for his new book Brat Pack America: A Love Letter to 80s Teen Movies published by Rare Bird Books, based right here, downtown, around the corner from the Last Book Store.
Event #1 Echo Park, Wednesday November 16th, 8pm
Special guest Daniel Waters, screenwriter of Heathers
Kevin will be appearing at Stories Books & Cafe with special guest Daniel Waters. 80s trivia, games and fruit roll ups included.
(1716 W. Sunset just east of Glendale Blvd in Echo Park)
Event #2 Culver City, Thursday, November 17th, 7:30-9:30 pm
Kevin will be part of an event called “Romantic Comedy” at The Ripped Bodice Bookstore at 3806 Main Street (Corner of Main and Venice Blvd) in downtown Culver City. “Romantic Comedy” is great comedians riffing on romance novels. He’s on the bill with Laurie Kilmartin, who is a writer for Conan O’Brien.
Here’s how they lay it out…
Funny people. Sexy books. Free wine.
This month we’re grateful for 80s movies, instagram superstars, and the return of one of our all-time faves, Laurie Kilmartin!
All at the best venue imaginable: the Ripped Bodice, America’s first romance-only bookstore, in Culver City.
Festivities begin when seating opens at 7:30. Come early, drink up, and get 10% off everything in the store. Comedy starts at 8pm, but come early to reserve your spot!!!
This month’s guests:
+ Kevin Smokler, author of Brat Pack America
Hosted by Erin Judge and Jenny Chalikian.
The Facebook event is here: https://www.facebook.com/events/1852329114986550/
I hope to see you at one (or both) of these.
Little Free Library #8424
About two years ago, I registered Little Free Library #8424, and a year and three months ago it opened with just a few books to serve the Adams Hill neighborhood in Glendale, CA. In the intervening time, we’ve had almost 500 donated books go through our humble metal doors. In addition to our local library, some of our donated books also go to help seed several dozen similar libraries in surrounding communities, many of which are considered book deserts, meaning that there are few outlets (public libraries, school libraries, bookstores, etc.) for books or reading available to people in those communities. As a result, and unsurprisingly, the literacy rates in these neighborhoods are not as high as they should be.
A Surprise Invitation
Several weeks ago I was pleasantly surprised to receive an invitation from Little Free Library stewards and founders of The Literacy Club, Doug and Jean Chadwick, who said they would be hosting a steward meet-up for people running Little Free Libraries in the Los Angeles area.
Little Free Library & The Literacy Club Presents: An evening with Todd Bol
Come meet Todd and your fellow stewards for an evening of fun! You’ll get to talk Little Libraries and books, enjoy snacks, beer, wine and soft drinks, and swap stories with everyone in attendance.
Part of the motivation for the event was because Todd Bol, co-founder and executive director of the Little Free Library movement was coming to Los Angeles on Thursday, November 3rd.
It seemed like a great excuse to meet some of my fellow library stewards in the area and swap stories, and exchange advice.
Little Free Library #50,000
At the time I didn’t know that Todd was coming out to the West coast from Wisconsin in part to celebrate the unveiling of Little Free Library charter number 50,000 in Santa Ana, California, the day after he met with us. To help put the growth of the movement into perspective, remember that I registered library #8424 about two years ago.
The Literacy Club
As I was to discover when I arrived, Todd came not only to meet several library stewards in the Los Angeles area but to help honor all our efforts, in particular those of the Literacy Club, which has helped to set up and run over 50 Little Free Libraries in the Los Angeles area, including in hospitals, various neighborhoods, and every police station in the city (except two, which are on their to-do list). They’ve also built and host libraries in Ohio and Wisconsin.
I was very impressed with their efforts and even a tad jealous that I hadn’t thought to set up dozens of libraries like this, though trust me, the amount of work involved is no small potatoes–it’s obviously a full time hobby and then some.
As a small comparison, I opened up Little Free Library charter #8424 a year and three months ago, and we’ve had almost 500 books move through our library; the Literacy Club is moving thousands of books a month!
Paul Krekorian, Councilmember of the Second District of the City of Los Angeles, had sent a Certificate of Appreciation to present to The Literacy Club for all of their fantastic work in the city. Our little soiree included a lovely presentation by Field Deputy Sahag Yedalian (who was representing Krekorian’s office) to the Chadwicks for their work on The Literacy Club’s behalf.
Shockingly to me, after a whirl-wind presentation, I too had such a lovely certificate in my hands!
After catching my breath, I was a bit sad that the certificate wasn’t made out to the Little Free Library #8424, which is really the true recipient of the honor. While I did do a good bit of work to put the library together and erect it in front of my house, it really is the neighborhood and community that do all of the work in supporting and using our Adams Hill treasure. So I’ll take a moment to say thank you to all my neighbors and friends in and beyond Adams Hill in Glendale for supporting our neighborhood Little Free Library.
Many other LFL stewards in attendance were also presented with certificates of appreciation for their help in seeding book deserts in the surrounding Los Angeles areas.
During the evening it was great hearing some stories and ideas from many in the room. In particular it was nice to hear the story of Little Free Library #1 that Todd built and thereby started the growing movement of book exchanges.
It was also interesting to hear his philosophy of treating the Little Free Library organization as a “reverse franchise” setup. Most franchise operations perfect the concept of their business before spinning it out into thousands of locations. He prefers to have a few interesting ideas to put out into the community, which is likely to be wildly more creative and perfect those ideas or come up with incarnations and offshoots that the small staff at headquarters couldn’t have possibly created. Then, once perfected, headquarters can help disseminate the ideas to everyone and everywhere else. I thought this was great advice for non-profit organizations like this.
At the party, I also got to meet the President of the Burbank noon Kiwanis, Charles Chavoor, who was present to show support for The Literacy Club and their efforts. The Kiwanis there are funding a large Little Free Library to be dedicated shortly.
We also got to hear advance news about a major pending announcement for which we were all embargoed until November 14th, so you’ll have to wait until then for more details.
Todd also shared some of his work in growing the Little Free Library movement in Indonesia as well as several partnerships including the U.S. Army which is stewarding a large number of libraries.
Doug Chadwick shared a somewhat heartbreaking story based on his volunteer experience. He said that an unintended consequence and benefit of putting Little Free Libraries into police stations around the city is that police stations are often the site of court mandated child exchanges between divorced parents who don’t always get along or respect each other. At least while waiting during drop offs and pick ups, the children who are caught in the middle are able to sit down and not only read a book or two while they wait, but they can take them home with them as well.
Doug also shared a previous story of receiving the Little Free Library’s “Master Builder Award” and Todd indicated how rare these original Amish planes were to be able to establish such an award.
The Book Room
When I came to the party, I thought it would be a nice gesture to bring a book or two from my own library for the hosts or to swap with some of the other stewards. I noticed that a few other attendees did the same. Our gracious hosts also had the same idea, but, like the Literacy Club with its grand mission, they managed to pull their version off in even grander style.
As I was leaving, I was invited into The Book Room. Now, I’ll preface this with the fact that I’ve been in the offices and stock rooms of more than a dozen nice-sized specialty book shops. The book room in the Chadwicks’ home handily put most of them to shame. I was immediately surrounded by shelves with hundreds of stacks of books, each with a dozen or more copies of the same book, all waiting to be pulled off to create restocking boxes for any of the various Little Free Libraries around town that The Literacy Club stewards.
While I often try to have lightly worn or like new books in my library, every book in this room was brand new and sure to make a proud treasure for the thousands of children who were soon to receive them. It’s exactly the kind of room every library steward dreams of having in their own house.
I was thrilled to be sent home with not just one box full of books, but three boxes. Thus Little Free Library #8424 will soon have some new children’s selections, and, much like an early Santa Claus, I’ll be dropping off many books at some of the surrounding LFLs in the Eagle Rock, Glendale, South Pasadena, and Pasadena areas to spread the wealth and cheer and help continue seeding libraries nearby.
In the meanwhile, I’m dreaming about how I might be able to add on an additional room to the house for books…
Thanks again to The Literacy Club and to Doug and Jean Chadwick, who have impossibly edged me out as the #2 most enthusiastic Little Free Library steward after Todd Bol. And thanks again for hosting such a lovely little party to bring us all closer together. I’m glad to know I’m not alone in my love for what we’re all doing. I’ll be in touch shortly about volunteering some of my time to The Literacy Club’s efforts.
Thanks also to The Little Free Library organization which provided guests with lots of great items like The Little Free Library book, buttons, book marks and more.
And finally, thanks yet again to all my friends, family, and neighbors who help to support Little Free Library #8424.
Would you like to help?
You can help in a variety of ways from donating your lightly used books, volunteering your time, starting your own library, or even making a financial contribution. We welcome your help and know that it will help make our communities better one book at a time. After seeing some of the excellent work that The Literacy Club is doing, you could also help support their GoFundMe campaign.
I voted in the November 8th, 2016 Election! 🇺🇸
After having spent the weekend at IndieWebCamp Los Angeles, it somehow seems appropriate to have a “Voted post type” for the election today†. To do it I’m proposing the following microformats, an example of which can be found in the mark up of the post above. This post type is somewhat similar to both a note/status update and an RSVP post type with a soupçon of checkin.
- Basic markup
<span class="p-voted">I voted</span>
in the <a href="http://example.com/election" class="u-voted-in">November 8th, 2016 Election</a>
Possible Voted values: I voted, I didn’t vote, I was disenfranchised, I was intimidated, I was apathetic, I pathetically didn’t bother to register
- Send a Webmention to the election post of your municipality’s Registrar/Clerk/Records office as you would for a reply to any post.
You should include author information in your Voted post so the registrar knows who voted (and then send another Webmention so the voting page gets the update).
Here’s another example with explicit author name and icon, in case your site or blog does not already provide that on the page.
<a class="p-author h-card" href="http://mysite.example.org">
<img alt="" src="http://mysite.example.org/icon.jpg"/></a>
<span class="p-voted">I voted</span>
to <a href="http://example.com/election" class="u-voted-in">IndieWeb Election</a>
You can also use the data element to express the meaning behind the literal p-voted value while providing your own visible human readable language:
<data class="p-voted" value="I voted">I voted for the first female president today!</data>
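For anyone who wants to play with the idea, here is a minimal sketch in Python of how a receiving site might read the proposed properties out of a Voted post, and what the form-encoded body of the Webmention would look like. It uses only the standard library. The `VotedParser` class, the `webmention_body` helper, and the `http://mysite.example.org/voted` URL are hypothetical names made up for illustration; `p-voted` and `u-voted-in` are just the proposal above, not an established microformats vocabulary. A real implementation would use a proper microformats2 parser and would first discover the target's Webmention endpoint, which this sketch skips.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class VotedParser(HTMLParser):
    """Collect the proposed p-voted and u-voted-in properties (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.voted = None      # value of the p-voted property
        self.voted_in = None   # URL from the u-voted-in property
        self._capture = False  # True while reading the text of a p-voted element
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "p-voted" in classes:
            if tag == "data" and attrs.get("value"):
                # <data value="..."> carries the machine-readable value
                self.voted = attrs["value"]
            else:
                self._capture = True
                self._text = []
        if "u-voted-in" in classes and attrs.get("href"):
            self.voted_in = attrs["href"]

    def handle_data(self, data):
        if self._capture:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first closing tag ends the captured text
        if self._capture:
            self.voted = "".join(self._text).strip()
            self._capture = False


def webmention_body(source_url, target_url):
    """Build the form-encoded body (source and target) of a Webmention POST."""
    return urlencode({"source": source_url, "target": target_url})


post_html = (
    '<span class="p-voted">I voted</span> in the '
    '<a href="http://example.com/election" class="u-voted-in">'
    "November 8th, 2016 Election</a>"
)
parser = VotedParser()
parser.feed(post_html)

# The source is the (assumed) permalink of the Voted post; the target is the election page
body = webmention_body("http://mysite.example.org/voted", parser.voted_in)
print(parser.voted, "|", parser.voted_in)
print(body)
```

The body would then be sent as an HTTP POST to whatever Webmention endpoint the election page advertises.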
Finally, feel free to POSSE to multiple social media networks to encourage your friends and family to vote today.
† I’m being a bit facetious and doing this in fun. But it does invite some interesting speculation…
The Santa Fe Institute, in New Mexico, is a place for studying complex systems. I’ve never been there! Next week I’ll go there to give a colloquium on network theory, and also to participate in this workshop.
I just found out about this from John Carlos Baez and wish I could go! How have I not managed to have heard about it?
November 16, 2016 – November 18, 2016
Noyce Conference Room
This workshop will address a fundamental question in theoretical biology: Does the relationship between statistical physics and the need of biological systems to process information underpin some of their deepest features? It recognizes that a core feature of biological systems is that they acquire, store and process information (i.e., perform computation). However to manipulate information in this way they require a steady flux of free energy from their environments. These two, inter-related attributes of biological systems are often taken for granted; they are not part of standard analyses of either the homeostasis or the evolution of biological systems. In this workshop we aim to fill in this major gap in our understanding of biological systems, by gaining deeper insight in the relation between the need for biological systems to process information and the free energy they need to pay for that processing.
The goal of this workshop is to address these issues by focusing on a set of three specific questions:
- How has the fraction of free energy flux on earth that is used by biological computation changed with time?
- What is the free energy cost of biological computation / function?
- What is the free energy cost of the evolution of biological computation / function?
In all of these cases we are interested in the fundamental limits that the laws of physics impose on various aspects of living systems as expressed by these three questions.
Purpose: Research Collaboration
SFI Host: David Krakauer, Michael Lachmann, Manfred Laubichler, Peter Stadler, and David Wolpert
Details for the conference can be found at Dodging the Memory Hole 2016.
My previous posts and notes about the conference:
- Notes from Day 1 of Dodging the Memory Hole: Saving Online News | Thursday, October 13, 2016
- Notes from Day 2 of Dodging the Memory Hole: Saving Online News | Friday, October 14, 2016
- Twitter List for #DtMH2016 Participants | Dodging the Memory Hole 2016: Saving Online News
If you missed the notes from Day 1, see this post.
It may take me a week or so to finish putting some general thoughts and additional resources together based on the two day conference so that I might give a more thorough accounting of my opinions as well as next steps. Until then, I hope that the details and mini-archive of content below may help others who attended, or provide a resource for those who couldn’t make the conference.
Overall, it was an incredibly well programmed and run conference, so kudos to all those involved who kept things moving along. I’m now certainly much more aware of the gaping memory hole the internet is facing despite the heroic efforts of a small handful of people and institutions attempting to improve the situation. I’ll try to go into more detail later about a handful of specific topics and next steps as well as a listing of resources I came across which may prove to be useful tools for both those in the archiving/preserving and IndieWeb communities.
Archive of materials for Day 2
Below are the recorded audio files embedded in .m4a format (using a Livescribe Pulse Pen) for several sessions held throughout the day. To my knowledge, none of the breakout sessions were recorded except for the one which appears below.
Summarizing archival collections using storytelling techniques
Presentation: Summarizing archival collections using storytelling techniques by Michael Nelson, Ph.D., Old Dominion University
Saving the first draft of history
Special guest speaker: Saving the first draft of history: The unlikely rescue of the AP’s Vietnam War files by Peter Arnett, winner of the Pulitzer Prize for journalism
Kiss your app goodbye: the fragility of data journalism
Panel: Kiss your app goodbye: the fragility of data journalism
Featuring Meredith Broussard, New York University; Regina Lee Roberts, Stanford University; Ben Welsh, The Los Angeles Times; moderator Martin Klein, Ph.D., Los Alamos National Laboratory
The future of the past: modernizing The New York Times archive
Panel: The future of the past: modernizing The New York Times archive
Featuring The New York Times Technology Team: Evan Sandhaus, Jane Cotler and Sophia Van Valkenburg; moderated by Edward McCain, RJI and MU Libraries
Lightning Rounds: Six Presenters
Lightning rounds (in two parts)
Six + one presenters: Jefferson Bailey, Terry Britt, Katherine Boss (and team), Cynthia Joyce, Mark Graham, Jennifer Younger and Kalev Leetaru
1: Jefferson Bailey, Internet Archive, “Supporting Data-Driven Research using News-Related Web Archives” 2: Terry Britt, University of Missouri, “News archives as cornerstones of collective memory” 3: Katherine Boss, Meredith Broussard and Eva Revear, New York University: “Challenges facing preservation of born-digital news applications” 4: Cynthia Joyce, University of Mississippi, “Keyword ‘Katrina’: Re-collecting the unsearchable past” 5: Mark Graham, Internet Archive/The Wayback Machine, “Archiving news at the Internet Archive” 6: Jennifer Younger, Catholic Research Resources Alliance: “Digital Preservation, Aggregated, Collaborative, Catholic” 7. Kalev Leetaru, senior fellow, The George Washington University and founder of the GDELT Project: A Look Inside The World’s Largest Initiative To Understand And Archive The World’s News
Technology and Community
Presentation: Technology and community: Why we need partners, collaborators, and friends by Kate Zwaard, Library of Congress
Breakout: Working with CMS
Working with CMS, led by Eric Weig, University of Kentucky
Alignment and reciprocity
Alignment & reciprocity by Katherine Skinner, Ph.D., executive director, the Educopia Institute
Closing remarks by Edward McCain, RJI and MU Libraries and Todd Grappone, associate university librarian, UCLA
Live Tweet Archive
Reminder: In many cases my tweets don’t reflect direct quotes of the attributed speaker, but are often slightly modified for clarity and length for posting to Twitter. I have made a reasonable attempt in all cases to capture the overall sentiment of individual statements while using as many original words of the participant as possible. Typically, for speed, there wasn’t much editing of these notes. Below I’ve changed the attribution of one or two tweets to reflect the proper person(s). For convenience, I’ve also added a few hyperlinks to useful resources after the fact that I didn’t have time to include in the original tweets. I’ve attached .m4a audio files of most of the audio for the day (apologies for shaky quality as it’s unedited) which can be used for more direct attribution if desired. The Reynolds Journalism Institute videotaped the entire day and livestreamed it. Presumably they will release the video on their website for a more immersive experience.
Condoms were required issue in Vietnam–we used them to waterproof film containers in the field.
Do not stay close to the head of a column, medics, or radiomen. #warreportingadvice
I told the AP I would undertake the task of destroying all the reporters’ files from the war.
Instead the AP files moved around with me.
Eventually the 10 trunks of material went back to the AP when they hired a brilliant archivist.
“The negatives can outweigh the positives when you’re in trouble.”
Our first panel: Kiss your app goodbye: the fragility of data journalism
I teach data journalism at NYU
A news app is not what you’d install on your phone
Dollars for Docs is a good example of a news app
A news app is something that allows the user to put themself into the story.
Often there are three CMSs: web, print, and video.
News apps don’t live in any of the CMSs. They’re bespoke and live on a separate data server.
This has implications for crawlers which can’t handle them well.
Then how do we save news apps? We’re looking at examples and then generalizing.
Everyblock.com was a good example based on chicagocrime and later bought by NBC and shut down.
What?! The internet isn’t forever? Databases need to be saved differently than web pages.
Reprozip was developed by NYU Center for Data and we’re using it to save the code, data, and environment.
We make apps that serve our audience.
We also make internal tools that empower the newsroom.
We also use our nerdy skills to do cool things.
Most of us aren’t good programmers, we “cheat” by using frameworks.
Frameworks do a lot of basic things for you, so you don’t have to know how to do it yourself.
Archiving tools often aren’t built into these frameworks.
Instagram, Pinterest, Mozilla, and the LA Times use Django as our framework.
Memento for WordPress is a great way to archive pages.
We must do more. We need archiving baked into the systems from the start.
Slides at http://bit.ly/frameworkfix
Got data? I’m a librarian at Stanford University.
I’ll mention Christine Borgman’s book Big Data, Little Data, No data.
Journalists are great data liberators: FOIA requests, cleaning data, visualizing, getting stories out of data.
But what happens to the data once the story is published?
BLDR: Big Local Digital Repository, an open repository for sharing open data.
For metadata: www.ddialliance.org, RDF, International Image Interoperability Framework (iiif) and MODS
We’ll open up for questions.
What’s more important: obey copyright laws or preserving the content?
The new creative commons licenses are very helpful, but we have to be attentive to many issues.
Perhaps archiving it and embargoing for later?
Saving the published work is more important to me, and the rest of the byproduct is gravy.
I work for the New York Times, you may have heard of it…
Talking about modernizing the born-digital legacy content.
Our problem was how to make an article from 2004 look like it had been published today.
There were hundreds of thousands of articles missing.
There was no one definitive list of missing articles.
Outlining the workflow for reconciling the archive XML and the definitive list of URLs for conversion.
It’s important to use more than one source for building an archive.
I’m going to talk about all of “the little things” that came up along the way.
Article Matching: Fusion – How to convert print XML with web HTML that was scraped.
Primarily, we looked at common phrases between the corpus of the two different data sets.
We prioritized the print data over the digital data.
We maintain a system called switchboard that redirects from old URLs to the new ones to prevent link rot.
The case of the missing sections: some sections of the content were blank and not transcribed.
We made the decision to take out data we had in order to make a better user experience for missing sections.
In the future, we’d also like to put photos back into the articles.
Can you discuss the decision to go with a more modern interface rather than a traditional archive of how it looked?
Some of the decision was to get the data into an accessible format for modern users.
We do need to continue work on preserving the original experience.
Is there a way to distinguish between the print version and the online versions in the archive?
Could a researcher do work on the entire corpora? Is it available for subscription?
We do have a sub-section of data available, but don’t have it prior to 1960.
Have you documented the process you’ve used on this preservation project?
We did save all of the code for the project within GitHub.
We do have meeting notes which provide some documentation, though they’re not thorough.
Oh dear. Of roughly 1,155 tweets I counted about #DtMH2016 in the last week, roughly 25% came from me. #noisy
Opensource tool I had mentioned to several: @wallabagapp A self-hostable application for saving web pages https://www.wallabag.org
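As an aside on the article-matching notes from the New York Times panel above (looking at common phrases between the print and web data sets), here is a rough sketch in Python of what that kind of matching could look like. To be clear, this is my own illustration and not the Times’ actual code: the function names, the five-word phrase length, the 0.5 threshold, and the example URLs are all assumptions. It scores two texts by the share of five-word phrases they have in common and picks the best-scoring web article for a given print article.

```python
import re


def shingles(text, n=5):
    """Return the set of n-word phrases in a text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(a, b, n=5):
    """Jaccard similarity of the two texts' phrase sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def best_match(print_text, web_articles, threshold=0.5):
    """Return the URL of the web article sharing the most phrases, or None."""
    best_url, best_score = None, 0.0
    for url, text in web_articles.items():
        score = overlap(print_text, text)
        if score > best_score:
            best_url, best_score = url, score
    return best_url if best_score >= threshold else None


# Made-up example data: one print article and two scraped web pages
print_article = "The city council voted on Tuesday to approve the new budget"
web_articles = {
    "http://example.com/a": "The city council voted on Tuesday to approve the new budget.",
    "http://example.com/b": "Local team wins the championship game in overtime thriller",
}
match = best_match(print_article, web_articles)
print(match)
```

A production system would need to be far more forgiving of transcription errors and missing sections than this, which is presumably where all of “the little things” come in.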
Today I spent the majority of the day attending the first day of a two-day conference at UCLA’s Charles Young Research Library entitled “Dodging the Memory Hole: Saving Online News.” While I knew mostly what I was getting into, it hadn’t really occurred to me how much of what is on the web is not backed up or archived in any meaningful way. As a part of human nature, people neglect to back up any of their data, but huge swaths of really important data with newsworthy and historic value are being heavily neglected. Fortunately it’s an interesting enough problem to draw the 100 or so scholars, researchers, technologists, and journalists who showed up for the start of an interesting group being conglomerated through the Reynolds Journalism Institute and several sponsors of the event.
What particularly strikes me is how many of the philosophies of the IndieWeb movement and tools developed by it are applicable to some of the problems that online news faces. I suspect that if more journalists were practicing members of the IndieWeb and used their sites not only for collecting and storing the underlying data upon which they base their stories, but to publish them as well, then some of the (future) archival process may be easier to accomplish. I’ve got so many disparate thoughts running around my mind after the first day that it’ll take a bit of time to process before I write out some more detailed thoughts.
Twitter List for the Conference
As a reminder to those attending, I’ve accumulated a list of everyone who’s tweeted with the hashtag #DtMH2016, so that attendees can more easily follow each other as well as communicate online following our few days together in Los Angeles. Twitter also allows subscribing to entire lists too if that’s something in which people have interest.
Archiving the day
It seems only fitting that an attendee of a conference about saving and archiving digital news would make a reasonable attempt to archive some of his experience, right?! Toward that end, below is an archive of my tweetstorm during the day marked up with microformats and including hovercards for the speakers with appropriate available metadata. For those interested, I used a fantastic web app called Noter Live to capture, tweet, and more easily archive the stream.
Note that in many cases my tweets don’t reflect direct quotes of the attributed speaker, but are often slightly modified for clarity and length for posting to Twitter. I have made a reasonable attempt in all cases to capture the overall sentiment of individual statements while using as many original words of the participant as possible. Typically, for speed, there wasn’t much editing of these notes. I’m also attaching .m4a audio files of most of the audio for the day (apologies for shaky quality as it’s unedited) which can be used for more direct attribution if desired. The Reynolds Journalism Institute videotaped the entire day and livestreamed it. Presumably they will release the video on their website for a more immersive experience.
If you prefer to read the stream of notes in the original Twitter format, so that you can like/retweet/comment on individual pieces, this link should give you the entire stream. Naturally, comments are also welcome below.
Below are the audio files for several sessions held throughout the day.
Greetings and Keynote
Greetings: Edward McCain, digital curator of journalism, Donald W. Reynolds Journalism Institute (RJI) and University of Missouri Libraries and Ginny Steel, university librarian, UCLA
Keynote: Digital salvage operations — what’s worth saving? given by Hjalmar Gislason, vice president of data, Qlik
Why save online news? and NewsScape
Panel: “Why save online news?” featuring Chris Freeland, Washington University; Matt Weber, Ph.D., Rutgers, The State University of New Jersey; Laura Wrubel, The George Washington University; moderator Ana Krahmer, Ph.D., University of North Texas
Presentation: “NewsScape: preserving TV news” given by Tim Groeling, Ph.D., UCLA Communication Studies Department
Born-digital news preservation in perspective
Speaker: Clifford Lynch, Ph.D., executive director, Coalition for Networked Information on “Born-digital news preservation in perspective”
Live Tweet Archive
Getting Noter Live fired up for Dodging the Memory Hole 2016: Saving Online News https://www.rjionline.org/dtmh2016
I’m glad I’m not at NBC trying to figure out the details for releasing THE APPRENTICE tapes.
Let’s thank @UCLA and the library for hosting us all.
While you’re here, don’t forget to vote/provide feedback throughout the day for IMLS
Someone once pulled up behind me and said “Hi Tiiiigeeerrr!” #Mizzou
A server at the Missourian crashed as the system was obsolete and running on baling wire. We lost 15 years of archives
The dean & head of Libraries created a position to save born digital news.
We’d like to help define stake-holder roles in relation to the problem.
Newspaper is really an outmoded term now.
I’d like to celebrate that we have 14 student scholars here today.
We’d like to have you identify specific projects that we can take to funding sources to begin work after the conference
We’ll be going to our first speaker who will be introduced by Martin Klein from Los Alamos.
Hjalmar Gislason is a self-described digital nerd. He’s the Vice President of Data.
I wonder how one becomes the President of Data?
My Icelandic name may be the most complicated part of my talk this morning.
Speaking on “Digital Salvage Operations: What’s Worth Saving?”
My father in law accidentally threw away my wife’s favorite stuffed animal. #DeafTeddy
Some people just throw everything away because it’s not being used. Others keep everything and don’t throw it away.
The fundamental question: Do you want to save everything or do you want to get rid of everything?
I joined @qlik two years ago and moved to Boston.
Before that I was with spurl.net which was about saving copies of webpages users had previously visited.
I had also previously invested in kjarninn which is translated as core.
We used to have little data, now we’re with gigantic data and moving to gargantuan data soon.
One of my goals today is to broaden our perspective about what data needs saving.
There’s the Web, the “Deep” Web, then there’s “Other” data which is at the bottom of the pyramid.
I got to see into the process of #panamapapers but I’d like to discuss the consequences from April 3rd.
The number of meetings was almost more than could have been covered in real time in Iceland.
The #panamapapers were a soap opera, much like US politics.
Looking back at the process is highly interesting, but it’s difficult to look at all the data as they unfolded
How can we capture all the media minute by minute as a story unfolds?
You can’t trust that you can go back to a story at a certain time and know that it hasn’t been changed. #1984 #Orwell
There was a relatively pro-HRC piece earlier this year @NYTimes that was changed.
Newsdiffs tracks changes in news over time. The HRC article had changed a lot.
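To make the idea of tracking changes between saved versions of a story more concrete, here is a minimal sketch using Python's standard `difflib` module. It is not NewsDiffs' own code: the function name `diff_snapshots` and both sample snapshots are made up purely for illustration.

```python
import difflib


def diff_snapshots(old_text, new_text, old_label="earlier capture", new_label="later capture"):
    """Return a unified diff (as a list of lines) between two saved versions of an article."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


# Hypothetical snapshots of the same story, captured at two different times.
snapshot_morning = "Candidate wins key endorsement.\nSupporters call it a turning point."
snapshot_evening = "Candidate wins key endorsement.\nCritics question the timing."

changes = diff_snapshots(snapshot_morning, snapshot_evening)
for line in changes:
    print(line)
```

Lines prefixed with `-` appear only in the earlier capture and lines prefixed with `+` only in the later one. Note that this only works if someone saved the earlier capture in the first place, which is the archiving problem in a nutshell.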
Let’s say you referenced @CNN 10 years ago, likely now, the CMS and the story have both changed.
8 years ago, I asked, wouldn’t we like to have the social media from Iceland’s only Nobel Laureate as a teenager?
What is private/public, ethical/unethical when dealing with data?
Much data is hidden behind passwords or on systems which are not easily accessed from a database perspective.
Most of the content published on Facebook isn’t public. It’s hard to archive in addition to being big.
We as archivists have no claim on the hidden data within Facebook.
The #indieweb could help archivists in the future in accessing more personal data.
Then there’s “other” data: 500 hours of video is uploaded to YouTube per minute.
No organization can go around watching all of this video data. Which parts are newsworthy?
Content could surface much later or could surface through later research.
Hornbjargsviti lighthouse recorded the weather every three hours for years creating lots of data.
And that was just one of hundreds of sites that recorded this type of data in Iceland.
Lots of this data is lost. Much that has been found was by coincidence. It was never thought to archive it.
This type of weather data could be very valuable to researchers later on.
There was also a large archive of Icelandic data that was found.
Showing a timelapse of Icelandic earthquakes https://vimeo.com/24442762
You can watch the magma working its way through the ground before it makes its way up through the land.
National Geographic featured this video in a documentary.
Sometimes context is important when it comes to data. What is archived today may be more important later.
As the economic crisis unfolded in Greece, it turned out the data that was used to allow them into the EU was wrong.
The data was published at the time of the crisis, but there was no record of what the data looked like 5 years earlier.
The only way to recreate the data was to take prior printed sources. This is usually only done in extraordinary circumstances.
We captured 150k+ data sets with more than 8 billion “facts” which was just a tiny fraction of what exists.
How can we delve deeper into large data sets, all with different configurations and proprietary systems?
“There’s a story in every piece of data.”
Once a year energy consumption seems to dip because February has fewer days than other months. Plotting it matters.
Year over year comparisons can be difficult because of things like 3 day weekends which shift over time.
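The February dip is easy to demonstrate with a little arithmetic. The sketch below uses invented monthly totals (a constant 100 units per day, in arbitrary units) and a hypothetical helper called `per_day_average`; none of the figures come from the talk.

```python
import calendar


def per_day_average(monthly_total, year, month):
    """Divide a monthly total by the number of days in that month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return monthly_total / days_in_month


# Made-up monthly totals for 2015 at a constant 100 units per day.
monthly_totals = {1: 3100, 2: 2800, 3: 3100}

for month, total in monthly_totals.items():
    # The raw total dips in February, but the per-day figure stays flat.
    print(calendar.month_name[month], total, per_day_average(total, 2015, month))
```

Plotted as raw monthly totals, February looks like a drop in consumption; divided by the number of days in each month, the apparent dip disappears. The same kind of normalization (per working day, for instance) is one way to approach the shifting three-day-weekend problem.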
Here’s a graph of the population of Iceland. We’ve had our fair share of diseases and volcanic eruptions.
To compare, here’s a graph of the population of sheep. They outnumber us by an order(s) of magnitude.
In the 1780’s there was an event that killed off lots of sheep, so people had the upper hand.
Do we learn more from reading today’s “newspaper” or one from 30, 50, or 100 years ago?
There was a letter to the editor about an eruption and people had to move into the city.
letter: “We can’t have all these people come here, we need to build for our own people first.”
This isn’t too different from our problems today with respect to Syria. In that case, the people actually lived closer.
In the born-digital age, what will the experience look like trying to capture today 40 years hence?
Will it even be possible?
Machine data connections will outnumber “people” data connections by a factor of 10 or more very quickly.
With data, we need to analyze, store, and discard data. How do we decide in a split-second what to keep & discard?
We’re back to the father-in-law and mother-in-law question: What to get rid of and what to save?
Computing is continually beating human tasks: chess, Go, driving a car. They build on lots more experience based on data
Whoever has the most data on driving cars and landscape will be the ultimate winner in that particular space.
Data is valuable, sometimes we just don’t know which yet.
Hoarding is not a strategy.
You can only guess at what will be important.
“Commercial use in Doubt” The third sub-headline in a newspaper about an early test of television.
There’s more to it than just the web.
“Hoarding isn’t a strategy” really resonates with librarians. What could that relationship look like?
One should bring in data science, industry may be ahead of libraries.
Cross-disciplinary approaches may be best. How can you get a data scientist to look at your problem? Get their attention?
There’s 60K+ books about the Viet Nam War. How do we learn to integrate what we learn after an event (like that)?
Perspective always comes with time, as additional information arrives.
Scientific papers are archived in a good way, but the underlying data is a problem.
In the future you may have the ability to add supplementary data to supplement what appears in a book (in a better way)
Archives can give the ability to have much greater depth on many topics.
Are there any centers of excellence on the topics we’re discussing today? This conference may be IT.
We need more people that come from the technical side of things to be watching this online news problem.
Hacks/Hackers is a meetup group that takes place all over the world.
It brings the journalists and computer scientists together regularly for beers. It’s some of the outreach we need.
If you’re not interested in money, this is a good area to explore. 10 minute break.
Don’t forget to leave your thoughts on the questions at the back of the room.
We’re going to get started with our first panel. Why is it important to save online news?
I’m Matt Weber from Rutgers University and in communications.
I’ll talk about web archives and news media and how they interact.
I worked at Tribune Corp. for several years and covered politics in DC.
I wanted to study the way in which the news media is changing.
We’re increasingly seeing digital only media with no offline surrogate.
It’s becoming increasingly difficult to do anything but look at it now as it exists.
There was no large scale online repository of online news to do research.
#OccupyWallStreet is one of the first examples of stories that exist online in occurrence and reportage.
There’s a growing need to archive content around local news particularly politics and democracy.
When there is a rich and vibrant local news environment, people are more likely to become engaged.
Local news is one of the least thought about from an archive perspective.
I’m at GWU Libraries in the scholarly technology group.
I’m involved in social feed manager which allows archivists to put together archives from social services.
Kimberly Gross, a faculty member, studies tweets of news outlets and journalists.
We created a prototype tool to allow them to collect data from social media.
In 2011, journalists were primarily using their Twitter presences to direct people to articles rather than for conversation
We collect data of political candidates.
I’m an associate librarian and representing “Documenting the Now” with WashU, UCRiverside, & UofMd
Documenting the Now revolves around Twitter documentation.
It started with the Ferguson story and documenting media, videos during the protests in the community.
What can we as memory institutions do to capture the data?
We gathered 14 million tweets relating to Ferguson within two weeks.
We tried to build a platform that others could use in the future for similar data capture relating to social.
Ethics is important in archiving this type of news data.
Digitally preserving pdfs from news organizations and hyper-local news in Texas.
We’re approaching 5 million pages of archived local news.
What is news that needs to be archived, and why?
First, what is news? The definition is unique to each individual.
We need to capture as much of the social news and social representation of news which is fragmented.
It’s an important part of society today.
We no longer produce hard copies like we did a decade ago. We need to capture the online portion.
We’d like to get the perspective of journalists, and don’t have one on the panel today.
We looked at how midterm election candidates used Twitter. Is that news itself? What tools do we use to archive it?
What does it mean to archive news by private citizens?
Twitter was THE place to find information in St. Louis during the Ferguson protests.
Local news outlets weren’t as good as Twitter during the protests.
I could hear the protest from 5 blocks away and only found news about it on Twitter.
The story was being covered very differently on Twitter than the local (mainstream) news.
Alternate voices in the mix were very interesting and important.
Twitter was in the moment and wasn’t being edited and causing a delay.
What can we learn from this massive number of Ferguson tweets?
It gives us information about organizing, and what language was being used.
I think about the archival portion of this question. By whom does it need to be archived?
What do we archive next?
How are we representing the current population now?
Who is going to take on the burden of archiving? Should it be corporate? Cultural memory institution?
Someone needs to curate it, who does that?
Our next question: What do you view as primary barriers to news archiving?
How do we organize and staff? There’s no shortage of work.
Tools and software can help the process, but libraries are usually staffed very thinly.
No single institution can do this type of work alone. Collaboration is important.
Two barriers we deal with: terms of service are an issue with archiving. We don’t own it, but can use it.
Libraries want to own the data in perpetuity. We don’t own our data.
There’s a disconnect in some of the business models for commercialization and archiving.
Issues with accessing data.
People were worried about becoming targets or losing jobs because of participation.
What is role of ethics of archiving this type of data? Allowing opting out?
What about redacting portions? anonymizing the contributions?
Publishers have a responsibility for archiving their product. Permission from publishers can be difficult.
We have a lot of underserved communities. What do we do with comments on stories?
Corporations may not continue to exist in the future and data will be lost.
There’s a balance to be struck between the business side and the public good.
It’s hard to convince for profit about the value of archiving for the social good.
Next Q: What opportunities have revealed themselves in preserving news?
Finding commonalities and differences in projects is important.
What does it mean to us to archive different media types? (think diversity)
What’s happening in my community? in the nation? across the world?
The long-history in our archives will help us learn about each other.
We can only do so much with the resources we have.
We’ve worked on a cyber cemetery product in the past.
Someone else can use the tools we create within their initiatives.
repeating ?: What are issues in archiving longerform video data with regard to stories on Periscope?
How do you channel the energy around news archiving?
Research in the area is all so new.
Does anyone have any experience with legal wrangling with social services?
The ACLU is waging a lawsuit against Twitter about archived tweets.
Outreach to community papers is very rhizomic.
How do you take local examples and make them a national model?
We’re teenagers now in the evolution of what we’re doing.
Peter Arnett just said “This is all more interesting than I thought it would be.”
Next Presentation: NewsScape: preserving TV news
I’ll be talking about the NewsScape project of Francis Steen, Director, Communication Studies Archive
I’m leading the archiving of the analog portion of the collection.
The oldest of our collection dates from the 1950’s. We’ve hosted them on YouTube which has created some traction.
Commenters have been an issue with posting to YouTube as well as copyright.
NewsScape is the largest collection of TV news and public affairs programs (local & national)
Prior to 2006, we don’t know what we’ve got.
Paul said “I’ll record everything I can and someone in the future can deal with it.”
We have 50K hours of Betamax.
VHS are actually most threatened, despite being newest tapes.
Our budget was seriously strapped.
Maintaining closed captioning is important to our archiving efforts.
We’ve done 36k hours of encoding this year.
We use a layer of dead VCR’s over our good VCR’s to prevent RF interference and audio buzzing. 🙂
Post-2006 We’re now doing straight to digital
Preservation is the first step, but we need to be more than the world’s best DVR.
Searching the news is important too.
Showing a data visualization of news analysis with regard to the Healthcare Reform movement.
We’re doing facial analysis as well.
We have interactive tools at viz2016.com.
We’ve tracked how often candidates have smiled in election 2016. Hillary > Trump
We want to share details within our collection, but don’t have tools yet.
Having a good VCR repairman has helped us a lot.
Breaking for lunch…
Talk “Born-digital news preservation in perspective”
There’s a shared consensus that preserving scholarly publications is important.
While delivery models have shifted, there must be some fall back to allow content to survive publisher failure.
Preservation was a joint investment between memory institutions and publishers.
Keepers register their coverage of journals for redundancy.
In studying coverage, we’ve discovered Elsevier is REALLY well covered, but they’re not what we’re worried about.
It’s the small journals as edge cases that really need more coverage.
Smaller journals don’t have resources to get into the keeper services and it’s more expensive.
Many Open Access Journals are passion projects and heavily underfunded and they are poorly covered.
Being mindful of these business dynamics is key when thinking about archiving news.
There are a handful of large news outlets that are “too big to fail.”
There are huge numbers of small outlets like subject verticals, foreign diasporas, etc. that need to be watched
Different strategies should be used for different outlets.
The material on lots of links (as sources) disappears after a short period of time.
While Archive.org is a great resource, it can’t do everything.
Preserving underlying evidence is really important.
How we deal with massive databases and queries against them is a difficult problem.
I’m not aware of studies of link rot in relation to online news.
Who steps up to preserve major data dumps like Snowden, PanamaPapers, or email breaches?
Social media is a collection of observations and small facts without necessarily being journalism.
Journalism is a deliberate act and is meant to be public while social media is not.
We need to come up with a consensus about what parts of social media should be preserved as news.
News does often delve into social media as part of its evidence base now.
Responsible journalism should include archival storage, but it doesn’t yet.
Under current law, we can’t protect a lot of this material without the permission of the creator(s).
The Library of Congress can demand deposit, but doesn’t.
With funding issues, I’m not wild about the Library of Congress being the only entity [for storage.]
In the UK, there are multiple repositories.
testing to see if I’m still live
What happens if you livetweet too much in one day.