Part of my plans to (remotely) devote the weekend to the IndieWeb Summit in Portland were hijacked by the passing of Muhammad Ali. Wait… What?! How does that happen?
A year ago, I started a publishing company and we came out with our first book, Amerikan Krazy, in late February. The author has a small back catalogue that’s out of print, so in conjunction with his book launch, we’ve been slowly releasing ebook versions of his old titles. Coincidentally, one of them was a fantastic little book about Ali entitled Muhammad Ali Retrospective, so I dropped everything I was doing to get it finished up and out as a quick way of honoring his passing.
But while I was working on some of the minutiae, I was thinking in the back of my mind about the ideas of marginalia, commonplace books, and Amazon’s siloed community of highlights and notes. Is there a decentralized web-based way of creating a construct similar to webmention that will allow all readers worldwide to highlight, mark up and comment across electronic versions of texts so that they can share them in an open manner while still owning all of their own data? And possibly a way to aggregate them at the top for big data studies in the vein of corpus linguistics?
I think there is…
It’ll take some effort, but it’s effort that could have a worthwhile impact.
I have a few potential architectures in mind, but also want to keep online versions of books in the loop as well as potentially efforts like hypothes.is or even the academic portions of Genius.com which do web-based annotation.
If anyone in the IndieWeb, books, or online marginalia worlds has thought about this as well, I’d love to chat.
An exclusive look at data from the controversial web site Sci-Hub reveals that the whole world, both poor and rich, is reading pirated research papers.
Sci Hub has been in the news quite a bit over the past half a year and the bookmarked article here gives some interesting statistics. I’ll preface some of the following editorial critique with the fact that I love John Bohannon’s work; I’m glad he’s spent the time to do the research he has. Most of the rest of the critique is aimed at the publishing industry itself.
From a journalistic standpoint, I find it disingenuous that the article didn’t actually hyperlink to Sci Hub. Nor did it link out to (or provide a full quote of) Alicia Wise’s Twitter post(s), or link to her rebuttal list of 20 ways to access their content freely or inexpensively. Of course both of these are editorial decisions, and perhaps the rebuttal was so flimsy as to be unworthy of a link from such an esteemed publication anyway.
Sadly, Elsevier’s list of 20 ways of free/inexpensive access doesn’t really provide any simple coverage for graduate students or researchers in poorer countries, who are the likeliest group of people using Sci Hub, unless they’re going to fraudulently claim they’re part of a class which they’re not. And is that morally any better than the original theft method? The list is almost assuredly never used by patients, who seem to be covered under one of the options, as the option to do so is painfully undiscoverable past the typical $30/paper paywalls. Not only is their patchwork hodgepodge of free access difficult to discern, but one must keep in mind that this is just one of dozens of publishers a researcher must navigate to find the one thing they’re looking for right now (not to mention the thousands of times they need to do this throughout a year, much less a career).
Consider this experiment, which could be a good follow up to the article: is it easier to find and download a paper by title/author/DOI via Sci Hub (a minute) versus through any of the other publishers’ platforms with a university subscription (several minutes) or without a subscription (an hour or more to days)? Just consider the time it would take to dig up every one of 30 references in an average journal article: maybe just a half an hour via Sci Hub versus the days and/or weeks it would take to jump through the multiple hoops to first discover, read about, and then gain access and then download them from the over 14 providers (and this presumes the others provide some type of “access” like Elsevier).
Those who lived through the Napster revolution in music will realize that the dead simplicity of their system is primarily what helped kill the music business compared to the ecosystem that exists now with easy access through the multiple streaming sites (Spotify, Pandora, etc.) or inexpensive paid options like iTunes. If the publishing business doesn’t want to get completely killed, they’re going to need to create the iTunes of academia. I suspect they’ll have internal bean-counters watching the percentage of the total (now apparently 5%) and will probably only do something before it passes a much larger threshold, though I imagine that they’re really hoping that the number stays stable which signals that they’re not really concerned. They’re far more likely to continue to maintain their status quo practices.
Some of this ease-of-access argument is truly borne out by the statistics of open access papers which are downloaded by Sci Hub–it’s simply easier to both find and download them that way compared to traditional methods; there’s one simple pathway for both discovery and download. Surely the publishers, without colluding, could come up with a standardized method or protocol for finding and accessing their material cheaply and easily?
“Hart-Davidson obtained more than 100 years of biology papers the hard way—legally with the help of the publishers. ‘It took an entire year just to get permission,’ says Thomas Padilla, the MSU librarian who did the negotiating.” John Bohannon in Who’s downloading pirated papers? Everyone
Personally, I use relatively advanced tools like LibX, which happens to be offered by my institution and which I feel isn’t very well known, and it still takes me longer to find and download a paper than it would via Sci Hub. God forbid if some enterprising hacker were to create a LibX community version for Sci Hub. Come to think of it, why haven’t any of the dozens of publishers built and supported simple tools like LibX which make their content easy to access? If we consider the analogy of academic papers to the introduction of machine guns in World War I, why should modern researchers still be using single-load rifles against an enemy that has access to nuclear weaponry?
My last thought here comes on the heels of the two tweets from Alicia Wise mentioned, but not shown in the article:
She mentions that the New York Times charges more than Elsevier does for a full subscription. This is tremendously disingenuous as Elsevier is but one of dozens of publishers to which one would have to subscribe to have access to the full panoply of material researchers are typically looking for. Further, neither Elsevier nor their competitors are making their material as easy to find and access as the New York Times does. Neither do they discount access to the point that they attempt to find the subscription point that their users find financially acceptable. Case in point: while I often read the New York Times, I rarely go over their monthly limit of articles to need any type of paid subscription. Solely because they made me an interesting offer to subscribe for 8 weeks for 99 cents, I took them up on it and renewed that deal for another subsequent 8 weeks. Not finding it worth the full $35/month price point I attempted to cancel. I had to cancel the subscription via phone, but why? The NYT customer rep made me no less than 5 different offers at ever decreasing price points–including the 99 cents for 8 weeks which I had been getting!!–to try to keep my subscription. Neither Elsevier nor any of their competitors has ever tried (much less so hard) to earn my business. (I’ll further posit that it’s because it’s easier to fleece at the institutional level with bulk negotiation, a model not too dissimilar to the textbook business pressuring professors on textbook adoption rather than trying to sell directly to the end consumer–the student, which I’ve written about before.)
(Trigger alert: Apophasis to come) And none of this is to mention the quality control that is (or isn’t) put into the journals or papers themselves. Fortunately one needn’t even go further than Bohannon’s other writings like Who’s Afraid of Peer Review? Then there are the hordes of articles on poor research design and misuse of statistical analysis and inability to repeat experiments. Not to give them any ideas, but lately it seems like Elsevier buying the Enquirer and charging $30 per article might not be a bad business decision. Maybe they just don’t want to play second-banana to TMZ?
Interestingly there’s a survey at the end of the article which indicates some additional sources of academic copyright infringement. I do have to wonder how the data for the survey will be used? There’s always the possibility that logged in users will be indicating they’re circumventing copyright and opening themselves up to litigation.
I also found the concept of using the massive data store as a means of applied corpus linguistics for science an entertaining proposition. This type of research could mean great things for science communication in general. I have heard of people attempting to do such meta-analysis to guide the purchase of potential intellectual property for patent trolling as well.
Finally, for those who haven’t done it (ever or recently), I’ll recommend that it’s certainly well worth their time and energy to attend one or more of the many 30-60 minute sessions most academic libraries offer at the beginning of their academic terms to train library users on research tools and methods. You’ll save yourself a huge amount of time.
Two years ago today, I officially began to (try to) own all of my own web data and host it on my own server.
It began when I moved from WordPress.com to my own domain at BoffoSocko.com. At the time, I wasn’t aware of the IndieWeb movement, but shortly thereafter I ran across IndieWebCamp.org and began using their principles and philosophy, which seemed to me to be how the Web and the Internet should have worked from the start.
Though I still use corporate-owned social media sites (primarily for increased distribution), I no longer rely on them for being the sole source of my internet presence or identity.
Now, through the boffosocko.com domain and a variety of tools, I post all of my content here on my own site first and then syndicate it out to Facebook, Twitter, Google+, LinkedIn, Tumblr, and any other useful sites. [Sadly, because of API restrictions I do still natively post to Instagram, but using OwnYourGram, I’m able to programmatically post the same photo on my site simultaneously.] This means that if any of these silos were to disappear, I would still own all of my own content (including comments I make on other sites, which sometimes could be blogposts/articles in and of themselves, or worse, through administrative interfaces could actually not be approved/published, and therefore completely lost as if I hadn’t written them to begin with.)
Also slowly, but surely, I’ve been able to have all of the resulting interactions that take place on my content on many of these silos (Facebook, Twitter, Google+) appear back on my site in the comments section on the original post. This way, if you’re commenting and interacting on this post on Facebook (for example) and you comment there, the comment is ported over to the comment section on my own site where it exists for everyone to see and interact with.
If you think the mission and philosophy of the Indie Web are interesting and would like some help setting something like this up for yourself, I’m happy to help! Just post a comment below or reply to this post (depending on what platform you’re reading this.)
I also want to say a BIG THANK YOU to all those in the indieweb community who’ve helped me come much farther and faster than I would have done by myself!
I’m copying some useful introductory material from IndieWebCamp.org below for those interested:
What is the IndieWeb?
The IndieWeb is a people-focused alternative to the ‘corporate web’.
Selfdogfood instead of email. Show before tell. Prioritize by scratching your own itches, creating, iterating on your own site.
Design first, protocols & formats second. Focus on good UX & selfdogfood prototypes to create minimum necessary formats & protocols.
Perhaps most importantly, we are people-focused instead of project-focused, and have regular meetups where everyone is welcome:
Homebrew Website Club
Homebrew Website Club is a (bi)weekly meetup of creatives passionate about designing, improving, building, and actively using their own websites, sharing their successes and challenges with a like-minded and supportive community. We have adopted a similar structure as the classic Homebrew Computer Club meetings. 
We typically meet every other Wednesday* right after work, 18:30-19:30, across cities and online. Some locations also have a 17:30-18:30 Quiet Writing Hour beforehand. Edinburgh is meeting every week, and some cities meet on Tuesdays!
His response was probably innocuous enough, but I thought the article should be put to task a bit more.
“35 million academics, independent scholars and graduate students as users, who collectively have uploaded some eight million texts”
35 million users is an okay number, but their engagement must be spectacularly bad if only 8 million texts are available. How many researchers do you know who’ve published only a quarter of an article anywhere, much less gotten tenure?
“the platform essentially bans access for academics who, for whatever reason, don’t have an Academia.edu account. It also shuts out non-academics.”
They must have changed this, as pretty much anyone with an email address (including non-academics) can create a free account and use the system. I’m fairly certain that the platform was always open to the public from the start, but the article doesn’t seem to question the statement at all. If we want to argue about shutting out non-academics or even academics in poorer countries, let’s instead take a look at “big publishing” and their $30+/paper paywalls and publishing models, shall we?
“I don’t trust academia.edu”
Given his following discussion, I can only imagine what he thinks of big publishers in academia and that debate.
“McGill’s Dr. Sterne calls it ‘the gamification of research,’”
Most research is too expensive to really gamify in such a simple manner. Many researchers are publishing to either get or keep their jobs and don’t have much time, information, or knowledge to try to game their reach in these ways. And if anything, the institutionalization of “publish or perish” has already accomplished far more “gamification”; Academia.edu is just helping to increase the reach of the publication. Given that research shows that most published research isn’t even read, much less cited, how bad can Academia.edu really be? [Cross reference: Reframing What Academic Freedom Means in the Digital Age]
If we look at Twitter and the blogging world as an analogy with Academia.edu and researchers, Twitter had a huge ramp up starting in 2008 and helped bloggers obtain eyeballs/readers, but where is it now? Twitter, even with a reasonable business plan, is stagnant, with growing grumblings that it may be failing. I suspect that without significant changes Academia.edu (which has a much smaller niche audience than Twitter) will also eventually fall by the wayside.
The article rails against not knowing what the business model is or what’s happening with the data. I suspect that the platform itself doesn’t have a very solid business plan and they don’t know what to do with the data themselves except tout the numbers. I’d suspect they’re trying to build “critical mass” so that they can cash out by selling to one of the big publishers like Elsevier, who might actually be able to use such data. But this presupposes that they’re generating enough data; my guess is that they’re not. And on that subject, from a journalistic viewpoint, where’s the comparison to the rest of the competition including ResearchGate.net or Mendeley.com, which in fact was purchased by Elsevier? As it stands, this simply looks like a “hit piece” on Academia.edu, and sadly not a very well researched or reasoned one.
In sum, the article sounds to me like a bunch of Luddites running around yelling “fire”, particularly when I’d imagine that most of those referred to in the piece feed into the more corporate side of publishing in major journals rather than publishing their work themselves on their own websites. I’d further suspect they’re probably not even practicing academic samizdat. It feels to me like the author and some of those quoted aren’t participating actively enough in the social media space to be able to comment on it intelligently. If the paper wants to pick at the academy in this manner, why don’t they write an exposé on the fact that most academics still have websites that look like they’re from 1995 (if, in fact, they have anything beyond their University’s mandated business card placeholder) when there is a wealth of free and simple tools they could use? Let’s at least build a cart before we start whipping the horse.
For academics who really want to spend some time and thought on a potential solution to all of this, I’ll suggest that they start out by owning their own domain and own their own data and work. The #IndieWeb movement certainly has an interesting philosophy that’s a great start in fixing the problem; it can be found at http://www.indiewebcamp.com.
There are potential solutions to the recent News Genius-gate incident, and simple notifications can go a long way toward helping prevent online bullying behavior.
There has been a recent brouhaha on the Internet (see related stories below) because of bad actors using News Genius (and potentially other web-based annotation tools like Hypothes.is) to comment on websites without their owner’s knowledge, consent, or permission. It’s essentially the internet version of talking behind someone’s back, but doing it while standing on their head and shouting with your fingers in their ears. Because of platform and network effects, such rude and potentially inappropriate commentary can have much greater reach than even the initial website could give it. Naturally in polite society, such bullying behavior should be curtailed.
This type of behavior is also not too different from more subtle concepts like subtweets or the broader issues platforms like Twitter are facing in which they don’t have proper tools to prevent abuse and bullying online.
A creator receives no notification if someone has annotated their content.–Ella Dawson
I think that a major part of improving the issue of abuse and providing consent is building in notifications so that website owners will at least be aware that their site is being marked up, highlighted, annotated, and commented on in other locations or by other platforms. Then the site owner at least has the knowledge of what’s happening and can then be potentially provided with information and tools to allow/disallow such interactions, particularly if they can block individual bad actors, but still support positive additions, thought, and communication. Ideally this blocking wouldn’t occur site-wide, which many may be tempted to do now as a knee-jerk reaction to recent events, but would be fine grained enough to filter out the worst offenders.
Toward the end of notifications to site owners, it would be great if any annotating activity would trigger trackbacks, pingbacks, or the relatively newer and better webmention protocol of the W3C which comes out of the IndieWeb movement. Then site owners would at least have notifications about what is happening on their site that might otherwise be invisible to them. (And for the record, how awesome would it be if social media silos like Facebook, Twitter, Instagram, Google+, Medium, Tumblr, et al would support webmentions too!?!)
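To make the notification idea a little more concrete, here is a minimal sketch (in Python, standard library only) of what an annotation platform could do when someone marks up a page: look for the site's advertised webmention endpoint, then build the form-encoded `source`/`target` body it would POST there. The function names `discover_endpoint` and `build_notification` and the example URLs are my own illustrative assumptions, the `Link` header parsing is deliberately simplified, and nothing here actually sends a request.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class _RelFinder(HTMLParser):
    """Remembers the href of the first <link> or <a> whose rel includes 'webmention'."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if self.href is not None or tag not in ("link", "a"):
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "webmention" in rels and attr.get("href") is not None:
            self.href = attr["href"]


def discover_endpoint(page_url, link_header, html):
    """Return the page's webmention endpoint as an absolute URL, or None.

    Checks the HTTP Link header first, then the HTML itself.
    Simplified: assumes no commas inside the header's URLs or parameters.
    """
    for match in re.finditer(r"<([^>]*)>([^,]*)", link_header or ""):
        rel = re.search(r'rel\s*=\s*"?([^";]+)"?', match.group(2), re.I)
        if rel and "webmention" in rel.group(1).lower().split():
            return urljoin(page_url, match.group(1))
    finder = _RelFinder()
    finder.feed(html or "")
    if finder.href is not None:
        return urljoin(page_url, finder.href)
    return None


def build_notification(source, target):
    """Form-encoded body: `source` is the annotation's URL, `target` the annotated page."""
    return urlencode({"source": source, "target": target})
```

An annotation service would POST that body (as `application/x-www-form-urlencoded`) to the discovered endpoint; if no endpoint is found, it could fall back to a pingback or simply skip the notification.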
Perhaps there’s a way to further implement filters or tools (a la Akismet on platforms like WordPress) that allow site users to mark materials as spam, abusive, or “other” so that they are then potentially moved from “public” facing to “private” so that the original highlighter can still see their notes, but that the platform isn’t allowing the person’s own website to act as a platform to give safe harbor (or reach) to bad actors.
Further some site owners might appreciate gradable filters (G, PG, PG-13, R, X) so that either they or their users (or even parents of younger children) can filter what they’re willing to show on their site (or that their users can choose to see).
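As a rough illustration of how the two ideas above (flagging and graded filters) might fit together, here is a small Python sketch. The `Annotation` fields, the `visible_annotations` function, and the rule that authors always see their own notes are all assumptions of mine for the sake of the example, not how any existing annotation platform works.

```python
from dataclasses import dataclass

# Ratings ordered from most to least family-friendly.
RATINGS = ["G", "PG", "PG-13", "R", "X"]


@dataclass
class Annotation:
    author: str
    text: str
    rating: str = "G"
    flagged: bool = False  # site owner marked it as spam, abusive, or "other"


def visible_annotations(annotations, max_rating, viewer=None):
    """Return the annotations a given viewer should see on the site.

    Flagged notes become effectively private: only their author sees them.
    Everything else is shown if its rating is at or below max_rating.
    """
    limit = RATINGS.index(max_rating)
    shown = []
    for note in annotations:
        if viewer is not None and note.author == viewer:
            shown.append(note)  # authors keep access to their own notes
        elif not note.flagged and RATINGS.index(note.rating) <= limit:
            shown.append(note)
    return shown
```

With a site set to "PG", an R-rated comment and a flagged comment both disappear for ordinary visitors, while the person who wrote the flagged comment still sees it when logged in.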
Consider also annotations on narrative forms that might be posted as spoilers–how can these be guarded against? What happens when even a well-meaning actor posts an annotation on page two which foreshadows that the butler did it, thereby ruining the surprise on the last page? Certainly there’s some value in having such a comment from an academic/literary perspective, but it doesn’t mean that future readers will necessarily appreciate the spoiler. (Some CSS and a spoiler tag might easily and unobtrusively remedy the situation here?)
Certainly options can be built into the annotating platform itself as well as allowing server-side options for personal websites attempting to deal with flagrant violators and truly hard-to-eradicate cases.
Note: You’re welcome to highlight and annotate this post using Hypothes.is (see upper right corner of page) or on News Genius.
Do you have a solution for helping to harden the Internet against bullies? Share it in the comments below.
You can now highlight and annotate most of the pages here on Boffo Socko as well as other web pages.
I’d played around with many of them in the past, but a recent conversation with Matt Gross about News Genius and their issues in the last week reminded me about internet annotation platforms. Since some of what I write here is academic in nature, I thought I would add native Hypothes.is Annotation support to the site.
If you haven’t heard about it before, you might find the ability to highlight and annotate web pages very useful. Hypothesis allows for public or private highlights and notes and it can be a very useful extension of one’s commonplace book.
At the moment, I’m not sure where it all fits into the IndieWeb infrastructure I’m building here, but, at least for the moment, I’d hope that those making public annotations and notes will also enter their commentary into the comments either here on the blog or by way of syndicated versions on Facebook or Twitter so that they’re archived here for posterity. (Keep in mind site-deaths are prevalent and even Hypothes.is acknowledges in a video on their homepage that there have been many incarnations of web annotations that have come and gone in the life of the internet.) Perhaps one day there will be a federated and cross-linked version of highlights and annotations in the IndieWeb universe with webmentions included?!
Educators and researchers interested in using web annotation are encouraged to visit the wealth of information provided by providers like Hypothes.is and Genius.com. In particular, the Hypothes.is blog has some great material and examples over the past year, and they have a special section for educators as well.
As it’s similar in functionality to highlighting on the web, I’ll remind users that we also still support Kevin Marks’s fragmentions as well.
If anyone is aware of people or groups working on the potential integration of the IndieWeb movement (webmentions) and web annotation/highlighting, please include them in the comments below–I’d really appreciate it.
I have a feeling that a few more sales this week would not only put us solidly in the top 100 in the first category, but could earn the book a space among some of the greats in the genre along with Kurt Vonnegut, Carl Hiaasen, Ray Bradbury, Bret Easton Ellis, Vladimir Nabokov, Don DeLillo, Thomas Pynchon, and Umberto Eco!
If you haven’t purchased a copy yet, but want to help support our efforts to get the book out there, now is the time to take the plunge.
Amerikan Krazy novelist Henry James Korn is slated to appear at the curated exhibit “Amerikan Krazy: Life Out of Balance” featuring the work of over twenty notable Southland artists. More details at Boffo Socko Books.