Twitter List for #DtMH2016 Participants | Dodging the Memory Hole 2016: Saving Online News

Live Tweeting and Twitter Lists

While attending the upcoming conference Dodging the Memory Hole 2016: Saving Online News later this week, I’ll attempt to live-tweet as much as possible. (If you’re following me on Twitter on Thursday and Friday and find me too noisy, try using QuietTime.xyz to mute me temporarily.) I’ll be using Kevin Marks’ excellent Noter Live web app both to send out the tweets and to store and archive them here on this site afterward (kind of like my own version of Storify).

While ramping up to live-tweet the event, it helps significantly to have a pre-existing list of attendees (and remote participants) talking about #DtMH2016 on Twitter, so I started creating a Twitter list by hand. I quickly realized it would be nice to have a little bot to catch others as the week progresses. Ever lazy, I turned to IFTTT.com to see if something already existed, and sure enough there’s a Twitter search trigger that will automatically add people who mention a particular hashtag to a Twitter list.

Here’s the resultant list, which should grow as the event unfolds throughout the week:
🔖 People on Twitter talking about #DtMH2016

Feel free to follow or subscribe to the list as necessary. Hopefully this will make attending the conference more fruitful for those participating live as well as remotely.

Not on the list? Just tweet a (non-private) message with the conference hashtag #DtMH2016 and you should be added to the list shortly.

Lazy like me? Click the bird to tweet: “I’m attending #DtMH2016 @rji | Dodging the Memory Hole 2016: Saving Online News http://ctt.ec/5RKt2+”

IFTTT Recipe for Creating Twitter Lists of Conference Attendees

For those interested in creating their own Twitter lists for future conferences (and honestly, the hosts of all conferences should do this when they set up their conference hashtag and announce the conference), below is a link to the IFTTT.com recipe I created, which can be modified for use by others.

IFTTT Recipe: Create a Twitter list of attendees from a search of people using the conference hashtag
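
If you’d rather not depend on IFTTT, the same trick can be rolled by hand against the Twitter API. Here’s a minimal sketch using the tweepy library (assuming tweepy 4.x); all credentials and the list ID are placeholders you’d supply from your own Twitter app settings:

```python
# Sketch: add anyone tweeting the conference hashtag to a Twitter list.
# Assumes tweepy 4.x (pip install tweepy); all credentials are placeholders.
import tweepy

auth = tweepy.OAuth1UserHandler(
    "CONSUMER_KEY", "CONSUMER_SECRET",
    "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET",
)
api = tweepy.API(auth)

LIST_ID = 123456789  # placeholder: the numeric ID of the attendee list

# Remember current members so we don't re-add anyone.
members = {u.id for u in tweepy.Cursor(api.get_list_members, list_id=LIST_ID).items()}

for tweet in tweepy.Cursor(api.search_tweets, q="#DtMH2016").items(200):
    if tweet.user.id not in members:
        api.add_list_member(list_id=LIST_ID, user_id=tweet.user.id)
        members.add(tweet.user.id)
```

Run on a schedule (cron, say), this approximates what the IFTTT recipe does continuously.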

It would also be nice if, as people registered for conferences, they were asked for their Twitter handles and websites so that the information could be used to create such online lists, helping build longer-lasting relationships both during the event and afterwards. (Naturally, providing these details should be optional so that people who wish to maintain their privacy can do so.)

Selfie with author Henry James Korn who reveals details about his next novel

A great lunch with author @henryjameskorn. Heard about his next novel which I hope to read the first draft of this weekend

Instagram filter used: Lark

Photo taken at: Porta Via

I had lunch today with author Henry James Korn who revealed big chunks of the plot of his upcoming novel Zionista to me. I should be getting a copy of the first draft to read over the weekend, and I can’t wait. It sounds like it continues the genius of his political satire in Amerikan Krazy.

Homebrew Website Club Meetup Pasadena/Los Angeles Notes from 8-24-16

Last night, shy a few regulars at the tail end of a slow August and almost on the eve of IndieWebCamp NY2, Angelo Gladding and I continued our biweekly Homebrew Website Club meetings.

We met at Charlie’s Coffee House, 266 Monterey Road, South Pasadena, CA, where we stayed until closing at 8:00 pm. Deciding that we hadn’t had enough (South Pasadena rolls up its sidewalks early), we moved the party over to the local Starbucks, 454 Fair Oaks Ave, South Pasadena, CA, where we stayed until they closed at 11:00 pm.

Quiet Writing Hour

Angelo manned the fort alone with aplomb while building intently. If I’m not mistaken, he did use my h-card to track down my phone number to see what was holding me up, so as they say in IRC: h-card++!

Introductions and Demonstrations

Participants included:

Needing no introductions this week, Angelo launched us off with a relatively thorough demo of his Canopy platform, which he’s built from the ground up in Python! Starting from an empty folder on a host with a domain name, he downloaded and installed his code directly from GitHub and spun up a completely new version of his site in under 2 minutes. Within another 20 minutes of simple additional downloads and configuration of a few files, he also had locations, events, people, and about modules up and running. Despite his website’s currently spare appearance, there’s really a lot of untapped power in what he’s built so far. It’s all available on GitHub for those interested in playing around; I’m sure he’d appreciate pull requests.

Along the way, I briefly demoed some of the functionality of Kevin Marks’ deceptively powerful Noter Live web app, not only for live tweeting but also for owning those tweets on one’s own site in a simple way after the fact (while automatically including proper markup and microformats)! I also ran through some of the overall functionality of my Known install, with a large number of additional plugins, to compare and contrast UX/UI with respect to Canopy.

We also discussed a bit of Angelo’s recent IndieWeb Graph network-crawling project, and I took the opportunity to fix a bit of the representative h-card on my site. (Angelo, does a new crawl appear properly on lahacker.net now?)
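
For the curious, pulling the properties out of a page’s representative h-card takes only a few lines with the mf2py microformats parser. This is just a rough sketch of the idea, not Angelo’s actual crawler:

```python
# Sketch: parse a page's microformats and print the first h-card found.
# Assumes the mf2py library (pip install mf2py); not Angelo's actual code.
import mf2py

parsed = mf2py.parse(url="https://boffosocko.com/")
hcards = [item for item in parsed["items"] if "h-card" in item["type"]]
if hcards:
    props = hcards[0]["properties"]  # property values arrive as lists
    print(props.get("name"), props.get("url"), props.get("tel"))
```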

Before leaving Charlie’s we did manage to remember to take a group photo this time around. Not having spent enough time chatting over the past few weeks, we decamped to a local Starbucks and continued our conversation along with some additional brief demos and discussion of other itches for future building.

We also spent a few minutes discussing the upcoming IndieWebCamp LA logistics for November as well as outreach to the broader Los Angeles area dev communities. If you’re interested in attending, please RSVP. If you’d like to volunteer or help sponsor the camp, please don’t hesitate to contact either of us. I’m personally hoping to attend DrupalCamp LA this weekend while wearing a stylish IndieWebCamp t-shirt that’s already on its way to me.

IndieWebCamp T-shirt

Next Meeting

In keeping with the schedule of the broader Homebrew movement, we’re already committed to our next meeting on September 7. It’s tentatively at the same location unless a more suitable one comes along prior to then. Details will be posted to the wiki in the next few days.

Thanks for coming everyone! We’ll see you next time.

Live Tweets Archive


Though not as comprehensive as the notes Kevin Marks manages to put together, we made good use of Noter Live for a few supplementary thoughts:

Chris Aldrich:

On my way to Homebrew Website Club Los Angeles in moments. http://stream.boffosocko.com/2016/homebrew-website-club-la-2016-08-24 #

Angelo Gladding:

I’ve torn some things down, but slowly rebuilding. I’m just minutes away from rel-me to be able to log into wiki #

Chris Aldrich:

Explaining briefly how @kevinmarks’ noterlive.com works for live tweeting events… #

Angelo Gladding:

My github was receiving some autodumps from a short-lived indieweb experiment. #

is describing his canopy system used to build his site #

Canopy builds in a minute and 52 secs… inside are folders roots and trunk w/ internals #

Describing how he builds in locations to Canopy #

Apparently @t has a broken certificate for https, so my parser gracefully falls back to http instead. #


A New Reading Post-type for Bookmarking and Reading Workflow

This morning while breezing through my Woodwind feed reader, I ran across a post by Rick Mendes whose hashtags put me down a temporary rabbit hole of thought about reading-related post types on the internet.

I’m obviously a huge fan of reading and have accounts on GoodReads, Amazon, Pocket, Instapaper, Readability, and literally dozens of other services that support or assist the reading endeavor. (My affliction got so bad I started my own publishing company last year.)

READ LATER is an indication on (or relating to) a website that one wants to save the URL to come back and read the content at a future time.

I started a page on the IndieWeb wiki to define read later, where I began writing some philosophical thoughts. I decided it would be better to post them on my own site instead and simply link back to them. As a member of the IndieWeb community, my general goal over time is to preferentially quit using these web silos (many of which are listed on the referenced page) and instead post my reading-related work and progress here on my own site. Naturally, the question becomes: how does one do this in a simple and usable manner, with pretty and reasonable UX/UI, for both myself and others?

Current Use

Currently I primarily use a Pocket bookmarklet to save things (mostly newspaper articles, magazine pieces, and blog posts) for reading later, and/or the like/favorite functionality in Twitter in combination with an IFTTT recipe that saves the URL in the tweet to my Pocket account. I then regularly visit Pocket to speed-read through articles. While Pocket allows downloading (some) of one’s data in this regard, I’m exploring options to bring ownership of this workflow into my own site.
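
As a first step toward owning that data, Pocket’s retrieve API makes it straightforward to pull saved items out programmatically. A rough sketch (the consumer key and access token are placeholders obtained through Pocket’s developer sign-up and OAuth flow):

```python
# Sketch: export saved items from Pocket via its /v3/get endpoint.
# Assumes the requests library; both credentials are placeholders.
import requests

resp = requests.post(
    "https://getpocket.com/v3/get",
    json={
        "consumer_key": "POCKET_CONSUMER_KEY",
        "access_token": "POCKET_ACCESS_TOKEN",
        "state": "all",          # both unread and archived items
        "detailType": "simple",
    },
)
for item in resp.json().get("list", {}).values():
    print(item.get("resolved_title"), item.get("resolved_url"))
```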

For more academically leaning content (read: journal articles), I tend to rely on an alternate Mendeley-based workflow, which also starts with an easy-to-use bookmarklet.

I’ve also experimented with bookmarking a journal article and using hypothes.is to import my highlights from that article, though that workflow has a way to go to meet my personal needs in a robust way while still allowing me to own all of my own data. The benefit is that fixing it can help more than just myself while still fitting into a larger personal workflow.

Brainstorming

A Broader Reading (Parent) Post-type

Philosophically, a read later post-type could be considered similar to a (possibly) unshared or private bookmark, with possible additional metadata like progress, date read, notes, and annotations added after the fact, which then technically makes it a read post type.

A potential workflow viewed over time might be: read later >> bookmark >> notes/annotations/marginalia >> read >> review. This kind of continuum of workflow might be able to support a slightly more complex overall UI for a more simplified reading post-type in which these others are all sub-types. One could then make a single UI for a reading post type with fields and details for all of the sub-cases. Being updatable, the single post could carry all the details of one’s progress.

The IndieWeb encourages simplicity (DRY) and having the fewest post-types possible, which I generally agree with, but perhaps there’s a better way of thinking about these several types. Concatenating them into one reading type with various data fields (and the ability to make each public or private) could allow all of the subcategories to be included or not on one larger and more comprehensive post-type.

Examples
  1. Not including one subsection (or making it private) would simply prevent it from showing; thus one could have a traditional bookmark post by leaving off the read later, read, and review sub-types and/or data.
  2. As another example, I could include the data for read later, bookmark, and read, but leave off data about what I highlighted and/or sub-sections of notes I prefer to remain private.
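
To make the concatenated model concrete, here’s one possible way to sketch such a unified reading post in code. All of the field names are hypothetical, chosen only to illustrate how the sub-types collapse into optional data:

```python
# Sketch: a single "reading" post type whose optional fields cover the
# read-later/bookmark/read/review sub-types. Field names are hypothetical.
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class ReadingPost:
    url: str                                        # the thing being read
    title: str
    bookmarked_on: Optional[date] = None            # bookmark sub-type
    read_later: bool = False                        # read-later sub-type
    read_on: Optional[date] = None                  # read sub-type
    progress: int = 0                               # percent complete
    notes: list[str] = field(default_factory=list)  # annotations/marginalia
    review: Optional[str] = None                    # review sub-type
    private: set[str] = field(default_factory=set)  # field names to hide

    def public_view(self) -> dict:
        """Render only the fields not marked private."""
        return {k: v for k, v in self.__dict__.items()
                if k != "private" and k not in self.private}
```

Leaving read_on and review unset yields a plain bookmark (example 1 above); adding private={"notes"} matches example 2.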

A Primary Post with Webmention Updates

Alternately, one could create a primary post (potentially a bookmark) for the thing one is reading and then use additional posts, each sending a webmention to the original, thereby adding details to the original post about ongoing progress. In some sense, this isn’t too far from the functionality provided by GoodReads, with individual updates on progress with brief notes and a page that lists the overall view of progress. Each individual post could be made public or private to allow different viewerships, though private webmentions may be a hairier issue. I know some are also experimenting with pushing updates to posts via Micropub and other methods, which could be appealing as well.
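
Mechanically, each of those follow-up posts would simply send an ordinary webmention back to the primary post. The protocol is small enough to sketch in a few lines; this is a rough illustration assuming the requests and BeautifulSoup libraries, not a production client:

```python
# Sketch: send a webmention from a progress-update post back to the
# primary reading post. Rough illustration of the protocol (no error
# handling, redirect checks, etc.).
import requests
from bs4 import BeautifulSoup
from typing import Optional

def discover_endpoint(target: str) -> Optional[str]:
    resp = requests.get(target)
    # Check the HTTP Link header first...
    for link in requests.utils.parse_header_links(resp.headers.get("Link", "")):
        if "webmention" in link.get("rel", "").split():
            return requests.compat.urljoin(target, link["url"])
    # ...then fall back to <link>/<a rel="webmention"> in the HTML.
    el = BeautifulSoup(resp.text, "html.parser").find(["link", "a"], rel="webmention")
    return requests.compat.urljoin(target, el["href"]) if el else None

def send_webmention(source: str, target: str) -> None:
    endpoint = discover_endpoint(target)
    if endpoint:
        requests.post(endpoint, data={"source": source, "target": target})
```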

This may be cumbersome over time, but could potentially be made to look something like the GoodReads UI below, which seems very intuitive. (Note that it’s missing any review text as I’m currently writing it, and it’s not public yet.)

Overview of reading progress

Other Thoughts

Ideally, I’d like to better distinguish between something that has been bookmarked and something actually read (or unread), with dates for both the bookmarking and the reading, as well as the ability to add notes and highlights relating to the article. Something akin to Devon Zuegel’s “Notes” tab (built on a custom script for Evernote and Tumblr) seems promising as a cross between a simple reading list (or linkblog) and a commonplace book for academic work, but it doesn’t necessarily leave room for longer book reviews.

I’ll also need to consider the publishing workflow, in some sense as it relates to the reverse chronological posting of updates on typical blogs. Perhaps a hybrid approach of the two methods mentioned would work best?

An interface that bolts together the GoodReads UI (pictured above) and Amazon’s notes/highlights would be excellent. I recently noticed (and updated an old post) that they’re already beta testing such a beast.

Kindle Notes and Highlights are now showing up as a beta feature in GoodReads

Comments

I’ll keep thinking about the architecture for what I’d ultimately like to have, but I’m always open to hearing what other (heavy) readers have to say about the subject and the usability of such a UI.

Please feel free to comment below, or write something on your own site (which includes the URL of this post) and submit your URL in the field provided below to create a webmention in which your post will appear as a comment.


Homebrew Website Club Meetup Pasadena/Los Angeles 8/10/16

Last night we continued the blossoming group of indiewebbers meeting up on the east side of the Los Angeles area, leading up to IndieWebCamp Los Angeles in November.

We met at Charlie’s Coffee House, 266 Monterey Road, South Pasadena, CA.

Quiet Writing Hour

The quiet writing hour started off quiet with Angelo holding down the fort while others were stuck in interminable traffic, but if the IRC channel is any indication, he got some productive work done.

Introductions and Quick Demonstrations

Participants included:

Following introductions, I did a demo of the browser-based push notifications I enabled on this site about a week ago and discussed some pathways to help others explore options for doing so on theirs. Coincidentally, WordPress.com just unveiled similar functionality yesterday that is more site-owner oriented than user oriented, so I’ll be looking into it shortly.

Angelo showed off some impressive Python code which he’s preparing to open source, but just before the meeting he had managed to completely bork his site, so everyone got a stunning example of a “502 Bad Gateway” notice.

At the break, we were so engaged that we completely forgot to either take a break or do the usual group photo. My one-minute sketch gives a reasonable facsimile of what a photo would have looked like.

Peer-to-Peer Building and Help

With a new group, we spent some time discussing general IndieWeb principles, outlining ideas, and looking at example projects.

Since Michael was very new to the group, we helped him install the WordPress IndieWeb plugin and configure a few of the sub-plugins to get him started. We discussed some basic next steps and pointers to the WordPress documentation to provide him some direction for building until we meet again.

We spent a few minutes discussing the upcoming IndieWebCamp logistics as well as outreach to the broader Los Angeles area community.

Next Meeting

For a new group, there’s enough enthusiasm to do at least two meetings a month, in keeping with the broader Homebrew movement, so we’re already committed to our next meeting on August 24. It’s tentatively at the same location unless a more suitable one comes along prior to then.

Thanks for coming everyone! We’ll see you next time.

How publications are committing harakari! 

Liked How publications are committing harakari! by Om Malik (Om Malik)
I have become increasingly frustrated by the fact that many of the publications I used to like are turning into churnicle factories, creating platforms for anybody and everybody to post whatever dr…

Web-based Push Notifications with Pushpad

Push Notifications

A push notification (AKA client notification) is a notification that shows up on one or more of your client devices without you having to explicitly request it — it’s “pushed” to you, instead of you having to poll for it. –Source: IndieWeb.org

Pushpad

Today I came across a beta web service called Pushpad that provides easy-to-install push notifications. As a result, people who spend a lot of time in front of their screens can now subscribe to updates on this site via web browser push notifications. Subscribers will get a small toaster-like pop-up notification in real time on their screen indicating that new content has been published.

My first push notification


Set up

The service was quick and simple to set up, with lots of documentation. While geared at large corporations looking for a simple turnkey implementation of push notifications on most major web browsers, it’s also easily usable by smaller sites. Even better, it’s free for fewer than 10,000 notifications a month, which covers most small sites.

They provide an “Express” version that requires no serious technical skills and sets up in just a few minutes, and a separate “Pro” version that provides a lot of additional customization (including a white-labeled version) for those with the development skills to implement it.

For those on WordPress, they also have an easy-to-use plugin.

Pushpad supports the Push API for Chrome and Firefox and APNs for Safari.

Automation

Pushpad also supports integration with Zapier (currently in beta), which means that any of the hundreds of applications that are integrated with Zapier can be used to create push notifications on the desktop. Hopefully they include IFTTT.com soon too. I’m already using Pushbullet with IFTTT for integration between my Android phone and my desktop, but additional integrations for personalized notifications could be cool.

Roll Your Own

But maybe you’re hardcore? If you prefer not relying on outside services, you can always build your own push notifications! In particular, IndieWeb.org provides some thoughts and tips on how to implement them yourself based on open web standards.
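
To give a taste of what rolling your own involves, below is a minimal server-side sketch using the pywebpush library. The subscription object is whatever the browser’s Push API handed your JavaScript when the user subscribed, and the VAPID key and contact address are placeholders you’d generate yourself:

```python
# Sketch: push one notification to one browser subscription via pywebpush
# (pip install pywebpush). Subscription values and keys are placeholders.
import json
from pywebpush import webpush, WebPushException

subscription = {  # captured from the browser's Push API after subscribing
    "endpoint": "https://fcm.googleapis.com/fcm/send/PLACEHOLDER",
    "keys": {"p256dh": "PLACEHOLDER_P256DH", "auth": "PLACEHOLDER_AUTH"},
}

try:
    webpush(
        subscription_info=subscription,
        data=json.dumps({"title": "New post!", "url": "https://example.com/new-post"}),
        vapid_private_key="path/to/private_key.pem",     # placeholder
        vapid_claims={"sub": "mailto:you@example.com"},  # placeholder
    )
except WebPushException as ex:
    print("Push failed:", ex)
```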

Push Notifications for BoffoSocko.com

Now that we’ve been talking about them, would you like to try receiving them in the future? You can subscribe to push notifications for my blog by simply clicking on the icon below and then authenticating your subscription:

Not into push notifications? Maybe this isn’t your favorite way to find out about my content? If not, I offer a number of other ways to subscribe and consume my content.

Ten Simple Rules for Taking Advantage of Git and GitHub

Bookmarked Ten Simple Rules for Taking Advantage of Git and GitHub (journals.plos.org)
Bioinformatics is a broad discipline in which one common denominator is the need to produce and/or use software that can be applied to biological data in different contexts. To enable and ensure the replicability and traceability of scientific claims, it is essential that the scientific publication, the corresponding datasets, and the data analysis are made publicly available [1,2]. All software used for the analysis should be either carefully documented (e.g., for commercial software) or, better yet, openly shared and directly accessible to others [3,4]. The rise of openly available software and source code alongside concomitant collaborative development is facilitated by the existence of several code repository services such as SourceForge, Bitbucket, GitLab, and GitHub, among others. These resources are also essential for collaborative software projects because they enable the organization and sharing of programming tasks between different remote contributors. Here, we introduce the main features of GitHub, a popular web-based platform that offers a free and integrated environment for hosting the source code, documentation, and project-related web content for open-source projects. GitHub also offers paid plans for private repositories (see Box 1) for individuals and businesses as well as free plans including private repositories for research and educational use.

Homebrew Website Club Meetup Pasadena/Los Angeles 7/27/16

Tonight was the beginning of a new group of indiewebbers meeting up on the East side of the Los Angeles Area, in what we hope to be an ongoing in-person effort, particularly as we get nearer to IndieWeb Camp Los Angeles in November.

We met at Starbucks, 575 South Lake Avenue, Pasadena, CA.

Quiet Writing Hour

The quiet writing hour started off pretty well with three people, which quickly grew to six at the official start of the meeting, including what may be the youngest participants ever (at 6 months and 5½ years old).

Introductions and Quick Demonstrations

Participants included:

Following introductions, I did a quick demo of the simple workflow I’ve been slowly perfecting for liking/retweeting posts from Twitter via mobile so that they post to my own site while simultaneously POSSEing to Twitter. Angelo showed a bit of his code and set-up for his custom-built site, based on a Python framework and inspired by Aaron Swartz’s early efforts. (He also has an interesting script that scrapes others’ sites for microformats data with an mf2 parser, which I’d personally like to see more of and hope he’ll open source. It found a few issues with some redundant/malformed rel=”me” links in the header of my own site that I’ll need to sort out shortly.)
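
As an aside, checking your own rel=”me” links the way a parser sees them takes only a couple of lines with the mf2py library. A rough sketch, and again not Angelo’s actual script:

```python
# Sketch: list the rel="me" links a microformats parser sees on a page,
# handy for spotting redundant or malformed entries. Assumes mf2py.
import mf2py

parsed = mf2py.parse(url="https://boffosocko.com/")
for url in parsed["rels"].get("me", []):
    print(url)
```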

Bryan showed some recent work he’s done on his photography blog, which he’s slowly but surely been managing to cobble together from a self-hosted version of WordPress with help from friends and the local WordPress Meetup. (Big kudos to him for his sheer tenacity in building his site up!) Jervey described some of what he’d like to build for a WordPress-based site he’s putting together for a literary journal, while his daughter slept peacefully until someone mentioned a silo named Facebook. Five-year-old Evie showed off some coding work she’d done during the quiet writing hour on the Scratch platform on iOS, which she hopes to post to her own blog shortly so she can share it with her grandparents.

At the break, we managed to squeeze everyone in for a group selfie.

Peer-to-Peer Building and Help

Since many in the group were building with WordPress, we did a demo build on Evie’s (private) site by installing the IndieWeb Plugin and activating and configuring a few of the basic sub-plugins. We then built a small social-links menu to demonstrate the ease of adding rel-me to an Instagram link as an example. We also showed a quick example of IndieAuth, followed by a quick build for doing PESOS from Instagram with proper microformats2 markup (see the sketch below). Bryan had a few questions about his site from the first half of the meeting, so we wrapped up by working through a portion of those so he can proceed with some additional work before our next meeting.
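
For anyone curious what that PESOS step looks like under the hood, here’s a rough sketch of pushing a copy of an Instagram post to one’s own site through a Micropub endpoint (the WordPress Micropub plugin provides one). The endpoint URL, token, and URLs are placeholders:

```python
# Sketch: PESOS an Instagram-style photo post to your own site via a
# Micropub endpoint. Endpoint URL, token, and URLs are placeholders.
import requests

MICROPUB_ENDPOINT = "https://example.com/micropub"  # placeholder
TOKEN = "YOUR_MICROPUB_TOKEN"                       # placeholder

requests.post(
    MICROPUB_ENDPOINT,
    headers={"Authorization": f"Bearer {TOKEN}"},
    data={
        "h": "entry",                                   # create an h-entry
        "content": "A great lunch with friends.",
        "photo": "https://example.com/media/photo.jpg", # copy of the photo
        "syndication": "https://www.instagram.com/p/PLACEHOLDER/",  # the silo original
    },
)
```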

Summary & Next Meeting

In all, not a bad showing for what I expected to be a group five people smaller than what we ultimately got! I can’t wait until the next meetup on either 8/10 or 8/24 (at the very worst) pending some scheduling. I hope to meet every two weeks, but we’ll definitely commit to at least once a month going forward.

New York Times will you be my brother on Facebook?

Should I be adding major media outlets to my Facebook feed as family members? Changes by Facebook, which are highlighted in this New York Times article, may mean this is coming: The Atlantic can be my twin brother, and Foreign Affairs could be my other sister.

“News content posted by publishers will show up less prominently, resulting in less traffic to companies that have come to rely on Facebook audiences.” — Facebook to Change News Feed to Focus on Friends and Family in New York Times 

After reading this article, I can only think that Facebook wrongly believes my family is so interesting. (And believe me, I don’t think I’m any better; most of my posts, much like my face, are ones which only a mother could “like”/“love”, and my feed will bear that out! BTW, I love you, Mom.) The majority of posts I see there are rehashes of so-called “news” sites I really don’t care about, or invitations to participate in games like Candy Crush Saga.

While I love keeping up with friends and family on Facebook, I’ve had to very heavily modify how I organize my Facebook feed to get what I want out of it because the algorithms don’t always do a very good job. Sadly, I’m probably in the top 0.0001% of people who take advantage of any of these features.

It really kills me that although publishers see quite a lot of traffic from social media silos (and particularly Facebook), they’re still losing sight of the power of owning your own website and posting there directly. Apparently a past littered with examples like Zynga and social reader tools hasn’t taught them the lesson to continue iterating on their own platforms. One day the rug will be completely pulled out from underneath them and real trouble will result. They’ll wish they’d put all their work and effort into improving their own product rather than allowing Facebook, Twitter, et al. to siphon off their resources. If there’s one lesson we’ve learned from media over the years, it’s that owning your own means of distribution is a major key to success. Sharecropping one’s content out to social platforms is probably not a good idea while under pressure to change for the future.


Psst… With all this in mind, if you’re a family member or close friend who wants to

  • have your own website;
  • own your own personal data (which you can automatically syndicate to most of the common social media sites); and
  • be in better control of your online identity,

I’ll offer to build you a simple one and host it at cost.


Penguin Revives Decades-Old Software for 30th Anniversary Edition of “The Blind Watchmaker” | The Digital Reader

Liked Penguin Revives Decades-Old Software for 30th Anniversary Edition of "The Blind Watchmaker" by Nate Hoffelder (The Digital Reader)
Even in 2016, publishers and authors are still struggling when it comes to re-releasing decades-old books, but Penguin had a unique problem when it set out to publish a 30th anniversary edition of Richard Dawkins’ The Blind Watchmaker.

The Bookseller reports that Penguin decided to revive four programs Dawkins wrote in 1986. Written in Pascal for the Mac, The Watchmaker Suite was an experiment in algorithmic evolution. Users could run the programs and create a biomorph, and then watch it evolve across the generations.

And now you can do the same in your web browser.

A website, MountImprobable.com, was built by the publisher’s in-house Creative Technology team—comprising community manager Claudia Toia, creative developer Mathieu Triay and cover designer Matthew Young—who resuscitated and redeployed code Dawkins wrote in the 1980s and ’90s to enable users to create unique, “evolutionary” imprints. The images will be used as cover imagery on Dawkins’ trio to grant users an entirely individual, personalised print copy.

Hypothes.is and the IndieWeb

Last night I saw two great little articles about Hypothes.is, a web-based annotation engine, written by a proponent of the IndieWeb:

Hypothes.is as a public research notebook

Hypothes.is Aggregator ― a WordPress plugin

As a researcher, I fully appreciate the pro-commonplace book conceptualization of the first post, and the second takes things amazingly further with a plugin that allows one to easily display one’s hypothes.is annotations on one’s own WordPress-based site in a dead-simple fashion.

This functionality is a great first step, though honestly, in keeping with IndieWeb principles of owning one’s own data, I think it would be easier/better if Hypothes.is both accepted and sent webmentions. This would potentially allow me to physically own the data on my own site while still participating in the larger annotation community, as well as give me notifications when someone comments on or augments one of my annotations, or even annotates one of my own pages (bits of which I’ve written about before).
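
In the meantime, Hypothes.is does expose a public API, so one can at least pull copies of one’s annotations back home. A minimal sketch:

```python
# Sketch: fetch recent Hypothes.is annotations from the public search API
# so copies can be archived on one's own site. Assumes requests.
import requests

resp = requests.get(
    "https://api.hypothes.is/api/search",
    params={"user": "acct:chrisaldrich@hypothes.is", "limit": 20},
)
for row in resp.json()["rows"]:
    # Each row carries the annotated page's URI and the annotation body.
    print(row["uri"], row.get("text", ""))
```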

Either way, kudos to Kris Shaffer for moving the ball forward!

Examples

My Hypothes.is Notebook

The plugin mentioned in the second article allows me to keep a running online “notebook” of all of my Hypothes.is annotations on my own site.

My IndieWeb annotations

I can also easily embed my recent annotations about the IndieWeb below:

[ hypothesis user = 'chrisaldrich' tags = 'indieweb']

Webmention + Books = BookMention

Part of my plan to (remotely) devote the weekend to the IndieWeb Summit in Portland was hijacked by the passing of Muhammad Ali. Wait… What?! How does that happen?

A year ago, I started a publishing company, and we came out with our first book, Amerikan Krazy, in late February. The author has a small back catalogue that’s out of print, so in conjunction with his book launch we’ve been slowly releasing ebook versions of his old titles. Coincidentally, one of them was a fantastic little book about Ali entitled Muhammad Ali Retrospective, so I dropped everything I was doing to get it finished up and out as a quick way of honoring his passing.

But while I was working on some of the minutiae, I’ve been thinking in the back of my mind about the ideas of marginalia, commonplace books, and Amazon’s siloed community of highlights and notes. Is there a decentralized web-based way of creating a construct similar to webmention that will allow all readers worldwide to highlight, mark up and comment across electronic versions of texts so that they can share them in an open manner while still owning all of their own data? And possibly a way to aggregate them at the top for big data studies in the vein of corpus linguistics?

I think there is…

It’ll take some effort, but effort that could have a worthwhile impact.

I have a few potential architectures in mind, but I also want to keep online versions of books in the loop, as well as efforts like Hypothes.is or even the academic portions of Genius.com, which do web-based annotation.
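
To give a flavor of one of those architectures: a “BookMention” might look like an ordinary webmention whose source post carries a locator into the text, say an EPUB CFI plus the quoted passage for resilience. Everything below is purely hypothetical, just sketching the shape of the idea:

```python
# Hypothetical sketch only: what a webmention-like notification for a book
# annotation might carry. None of these field names are a real standard.
book_mention = {
    "source": "https://example.com/my-annotations/123",  # annotation on my own site
    "target": "https://publisher.example/books/9780000000000",
    "fragment": {
        "selector": "epubcfi(/6/4[chap01]!/4/2/1:0)",    # e.g., an EPUB CFI locator
        "exact": "the quoted passage being annotated",   # for resilience to edits
    },
}
```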

If anyone in the IndieWeb, books, or online marginalia worlds has thought about this as well, I’d love to chat.

Some Thoughts on Academic Publishing and “Who’s downloading pirated papers? Everyone” from Science | AAAS

Bookmarked Who's downloading pirated papers? Everyone by John Bohannon (Science | AAAS)
An exclusive look at data from the controversial web site Sci-Hub reveals that the whole world, both poor and rich, is reading pirated research papers.

Sci-Hub has been in the news quite a bit over the past half year, and the bookmarked article here gives some interesting statistics. I’ll preface the following editorial critique by saying that I love John Bohannon’s work; I’m glad he’s spent the time to do the research he has. Most of the rest of the critique is aimed at the publishing industry itself.

From a journalistic standpoint, I find it disingenuous that the article didn’t actually hyperlink to Sci-Hub. Neither did it link out (or provide a full quote) to Alicia Wise’s Twitter post(s), nor link to her rebuttal list of 20 ways to access Elsevier’s content freely or inexpensively. Of course both of these are editorial decisions, and perhaps the rebuttal was so flimsy as to be unworthy of a link from such an esteemed publication anyway.

Sadly, Elsevier’s list of 20 ways of free/inexpensive access doesn’t really provide any simple coverage for graduate students or researchers in poorer countries, who are the likeliest groups using Sci-Hub, unless they fraudulently claim membership in a class to which they don’t belong; and is that morally any better than the original theft? The list is almost assuredly never used by patients, who seem to be covered under one of the options, since the option to do so is painfully undiscoverable past the typical $30/paper paywalls. This patchwork hodgepodge of free access is difficult enough to discern on its own, and one must keep in mind that this is just one of dozens of publishers a researcher must navigate to find the one thing they’re looking for right now (not to mention the thousands of times they need to do this throughout a year, much less a career).

Consider this experiment, which could be a good follow-up to the article: is it easier to find and download a paper by title/author/DOI via Sci-Hub (a minute) versus through any of the other publishers’ platforms with a university subscription (several minutes) or without a subscription (an hour or more, up to days)? Just consider the time it would take to dig up every one of the 30 references in an average journal article: maybe half an hour via Sci-Hub versus the days and/or weeks it would take to jump through the multiple hoops to first discover, read about, gain access to, and then download them from over 14 providers (and this presumes the others provide some type of “access” like Elsevier does).

Those who lived through the Napster revolution in music will realize that the dead simplicity of that system is primarily what helped kill the old music business compared to the ecosystem that exists now, with easy access through multiple streaming sites (Spotify, Pandora, etc.) or inexpensive paid options like iTunes. If the publishing business doesn’t want to get completely killed, they’re going to need to create the iTunes of academia. I suspect they’ll have internal bean-counters watching the percentage of the total (now apparently 5%) and will probably only do something once it passes a much larger threshold, though I imagine they’re really hoping the number stays stable, which would signal that they needn’t be concerned. They’re far more likely to continue to maintain their status quo practices.

Some of this ease-of-access argument is truly borne out by the statistics on open access papers downloaded via Sci-Hub: it’s simply easier to both find and download them that way compared to traditional methods; there’s one simple pathway for both discovery and download. Surely the publishers, without colluding, could come up with a standardized method or protocol for finding and accessing their material cheaply and easily?

“Hart-Davidson obtained more than 100 years of biology papers the hard way—legally with the help of the publishers. ‘It took an entire year just to get permission,’ says Thomas Padilla, the MSU librarian who did the negotiating.” John Bohannon in Who’s downloading pirated papers? Everyone

Personally, I use relatively advanced tools like LibX, which happens to be offered by my institution and which I feel isn’t very well known, and it still takes me longer to find and download a paper than it would via Sci-Hub. God forbid some enterprising hacker were to create a LibX community version for Sci-Hub. Come to think of it, why haven’t any of the dozens of publishers built and supported simple tools like LibX to make their content easy to access? If we consider the analogy of academic publishing to the introduction of machine guns in World War I, why should modern researchers still be using single-load rifles against an enemy that has access to nuclear weaponry?

My last thought here comes on the heels of the two tweets from Alicia Wise mentioned, but not shown in the article:

She mentions that the New York Times charges more than Elsevier does for a full subscription. This is tremendously disingenuous, as Elsevier is but one of dozens of publishers to which one would have to subscribe to have access to the full panoply of material researchers are typically looking for. Further, neither Elsevier nor their competitors are making their material as easy to find and access as the New York Times does. Nor do they discount access in an attempt to find the subscription price their users find financially acceptable. Case in point: while I often read the New York Times, I rarely go over their monthly limit of articles enough to need any type of paid subscription. Solely because they made me an interesting offer to subscribe for 8 weeks for 99 cents, I took them up on it and renewed that deal for another 8 weeks. Not finding it worth the full $35/month price point, I attempted to cancel. I had to cancel the subscription via phone, but why? The NYT customer rep made me no fewer than 5 different offers at ever-decreasing price points (including the 99 cents for 8 weeks I had been getting!) to try to keep my subscription. Neither Elsevier nor any of their competitors has ever tried (much less so hard) to earn my business. (I’ll further posit that it’s because it’s easier to fleece at the institutional level with bulk negotiation, a model not too dissimilar to the textbook business pressuring professors on textbook adoption rather than trying to sell directly to the end consumer, the student, which I’ve written about before.)

(Trigger alert: apophasis to come.) And none of this is to mention the quality control that is (or isn’t) put into the journals or papers themselves. Fortunately one needn’t even look further than Bohannon’s other writings, like Who’s Afraid of Peer Review?; then there are the hordes of articles on poor research design, misuse of statistical analysis, and the inability to repeat experiments. Not to give them any ideas, but lately it seems like Elsevier buying the Enquirer and charging $30 per article might not be a bad business decision. Maybe they just don’t want to play second banana to TMZ?

Interestingly, there’s a survey at the end of the article which indicates some additional sources of academic copyright infringement. I do have to wonder how the data from the survey will be used. There’s always the possibility that logged-in users indicating they’re circumventing copyright could be opening themselves up to litigation.

I also found the concept of using the massive data store as a means of applied corpus linguistics for science an entertaining proposition. This type of research could mean great things for science communication in general. I have heard of people attempting to do such meta-analysis to guide the purchase of potential intellectual property for patent trolling as well.

Finally, for those who haven’t done it (ever or recently), I’ll recommend that it’s certainly well worth their time and energy to attend one or more of the many 30-60 minute sessions most academic libraries offer at the beginning of their academic terms to train library users on research tools and methods. You’ll save yourself a huge amount of time.