marginalia | Chris Aldrich

Notes, Highlights, and Marginalia: From E-books to Online

For several years now, I’ve been meaning to do something more interesting with the notes, highlights, and marginalia from the various books I read. In particular, I’ve been meaning to do it for the non-fiction I read for research, and even more so for e-books, which tend to have slightly more extractable notes given their electronic nature. This fits into the way I use this site as a commonplace book as well as the IndieWeb philosophy of owning all of one’s own data.[1]

Over the past month or so, I’ve been experimenting with some fiction to see what works and what doesn’t in terms of a workflow for status updates around reading books, writing book reviews, and then extracting and depositing notes, highlights, and marginalia online. I’ve now got a relatively quick and painless workflow for exporting the book-related data from my Amazon Kindle and importing it into the site with some modest markup and CSS for display. I’m sure the workflow will continue to evolve (and become further automated) over the coming months, but I’m reasonably happy with where things stand.

The fact that the Amazon Kindle allows for relatively easy highlighting and annotation in e-books is excellent, but having the ability to sync to a laptop and do a one-click export of all of that data is incredibly helpful. Adding some simple CSS to the pre-formatted output gives me a reasonable base upon which to build for future writing/thinking about the material. In experimenting, I’m also coming to realize that simply owning the data isn’t enough; I’m now driven to make that data more directly useful to me and potentially to others.
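To make the “markup plus CSS” step a little more concrete, here is a minimal Python sketch of how exported highlights might be wrapped in markup that a stylesheet can target. It is not my actual workflow or the Kindle export format: the Highlight structure, the render_highlights function, and the class names (book-notes, highlight, marginalia) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Highlight:
    """One exported highlight with an optional note (hypothetical structure)."""
    location: str   # a location or page reference from the export
    text: str       # the highlighted passage
    note: str = ""  # my own marginal note, if any


def render_highlights(title: str, highlights: list[Highlight]) -> str:
    """Wrap highlights in simple markup that a stylesheet can target."""
    parts = [f'<section class="book-notes"><h2>{escape(title)}</h2>']
    for h in highlights:
        # Each highlight becomes a blockquote; the class name is an assumption.
        parts.append(
            f'<blockquote class="highlight" data-location="{escape(h.location)}">'
        )
        parts.append(f"<p>{escape(h.text)}</p>")
        if h.note:
            # Notes get their own class so CSS can set them apart.
            parts.append(f'<p class="marginalia">{escape(h.note)}</p>')
        parts.append("</blockquote>")
    parts.append("</section>")
    return "\n".join(parts)


if __name__ == "__main__":
    sample = [Highlight("Location 1", "A placeholder passage.", "A placeholder note.")]
    print(render_highlights("Placeholder Title", sample))
```

Keeping the highlight and the note in separately classed elements means the display can change later (say, hiding notes or restyling quotations) without touching the exported data.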

As part of my experimenting, I’ve just uploaded some notes, highlights, and annotations for David Christian’s excellent text Maps of Time: An Introduction to Big History[2] which I read back in 2011/12. While I’ve read several of the references which I marked up in that text, I’ll have to continue evolving a workflow for doing all the related follow up (and further thinking and writing) on the reading I’ve done in the past.

I’m still reminded of Rick Kurtzman’s sage advice to me when I was a young pisher at CAA in 1999: “If you read a script and don’t tell anyone about it, you shouldn’t have wasted the time having read it in the first place.” His point was that if you don’t try to pass along the knowledge you found by reading, you may as well give up. Even if the thing was terrible, at least say that as a minimum. In a digitally connected era, we no longer need to rely on nearly illegible scrawl in the margins to pollinate the world at a snail’s pace.[4] Take those notes, marginalia, highlights, and metadata and release them into the world. The fact that this dovetails perfectly with Cesar Hidalgo’s thesis in Why Information Grows: The Evolution of Order, from Atoms to Economies[3] furthers my belief in having a better process for what I’m attempting here.

Hopefully in the coming months, I’ll be able to add similar data to several other books I’ve read and reviewed here on the site.

If anyone has any thoughts, tips, or tricks for creating/automating this type of workflow/presentation, I’d love to hear them in the comments!

Footnotes

[1] “Own your data,” IndieWeb. [Online]. Available: http://indieweb.org/own_your_data. [Accessed: 24-Oct-2016]

[2] D. Christian and W. H. McNeill, Maps of Time: An Introduction to Big History, 2nd ed. University of California Press, 2011.

[3] C. Hidalgo, Why Information Grows: The Evolution of Order, from Atoms to Economies, 1st ed. Basic Books, 2015.

[4] O. Gingerich, The Book Nobody Read: Chasing the Revolutions of Nicolaus Copernicus. Bloomsbury Publishing USA, 2004.

A New Reading Post-type for Bookmarking and Reading Workflow

This morning while breezing through my Woodwind feed reader, I ran across a post by Rick Mendes with the hashtags #readlater and #readinglist which put me down a temporary rabbit hole of thought about reading-related post types on the internet.

I’m obviously a huge fan of reading and have accounts on GoodReads, Amazon, Pocket, Instapaper, Readability, and literally dozens of other services that support or assist the reading endeavor. (My affliction got so bad I started my own publishing company last year.)

READ LATER is an indication on (or relating to) a website that one wants to save the URL to come back and read the content at a future time.

I started a page on the IndieWeb wiki to define read later, where I began writing some philosophical thoughts. I decided it would be better to post them on my own site instead and simply link back to them. As a member of the IndieWeb, my general goal over time is to preferentially quit using these web silos (many of which are listed on the referenced page) and, instead, post my reading-related work and progress here on my own site. Naturally, the question becomes: how does one do this in a simple and usable manner with pretty and reasonable UX/UI for both myself and others?

Current Use

Currently I primarily use a Pocket bookmarklet to save things (mostly newspaper articles, magazine pieces, blog posts) for reading later and/or the like/favorite functionality in Twitter in combination with an IFTTT recipe to save the URL in the tweet to my Pocket account. I then regularly visit Pocket to speed read through articles. While Pocket allows downloading (some) of one’s data in this regard, I’m exploring options to bring ownership of this workflow into my own site.

For more academic-leaning content (read: journal articles), I tend to rely on an alternate Mendeley-based workflow which also starts with an easy-to-use bookmarklet.

I’ve also experimented with bookmarking a journal article and using hypothes.is to import my highlights from that article, though that workflow has a way to go to meet my personal needs in a robust way while still allowing me to own all of my own data. The benefit is that fixing it can help more than just myself while still fitting into a larger personal workflow.

Brainstorming

A Broader Reading (Parent) Post-type

Philosophically, a read later post-type could be considered similar to a (possibly) unshared or private bookmark with potential additional metadata such as progress, date read, notes, and annotations to be added after the fact, which then technically makes it a read post-type.

A potential workflow viewed over time might be: read later >> bookmark >> notes/annotations/marginalia >> read >> review. This kind of continuum of workflow might be able to support a slightly more complex overall UI for a more simplified reading post-type in which these others are all sub-types. One could then make a single UI for a reading post type with fields and details for all of the sub-cases. Being updatable, the single post could carry all the details of one’s progress.

IndieWeb encourages simplicity (DRY) and having the fewest post-types possible, which I generally agree with, but perhaps there’s a better way of thinking of these several types. Concatenating them into one reading type with various data fields (each of which could be public or private) could allow all of the subcategories to be included or not on one larger and more comprehensive post-type.

Examples

Not including one subsection (or making it private) would simply prevent it from showing; thus one could have a traditional bookmark post by leaving off the read later, read, and review sub-types and/or data.
As another example, I could include the data for read later, bookmark, and read, but leave off data about what I highlighted and/or sub-sections of notes I prefer to remain private.
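As a rough illustration of the single reading post-type idea, here is a minimal Python sketch of one updatable post that carries each sub-type as an optional section with its own public/private flag. This is only a thought experiment, not an existing IndieWeb vocabulary or plugin: the names ReadingPost, Section, STAGES, update, and public_view are all hypothetical.

```python
from dataclasses import dataclass, field

# The continuum described above, in order (stage names are my own shorthand).
STAGES = ["read later", "bookmark", "notes", "read", "review"]


@dataclass
class Section:
    """One sub-type's data plus a public/private flag."""
    content: str
    public: bool = True


@dataclass
class ReadingPost:
    """A single updatable post that can carry every sub-type (hypothetical model)."""
    url: str
    sections: dict[str, Section] = field(default_factory=dict)

    def update(self, stage: str, content: str, public: bool = True) -> None:
        """Add or overwrite the data for one stage of the reading workflow."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.sections[stage] = Section(content, public)

    def public_view(self) -> dict[str, str]:
        """Return only sections that exist and are public, in workflow order."""
        return {
            stage: self.sections[stage].content
            for stage in STAGES
            if stage in self.sections and self.sections[stage].public
        }


if __name__ == "__main__":
    post = ReadingPost("https://example.com/article")  # placeholder URL
    post.update("bookmark", "Saved for later reference")
    post.update("notes", "Highlights I prefer to keep to myself", public=False)
    post.update("read", "Finished")
    print(post.public_view())
```

A post with only the bookmark section filled in behaves like a traditional bookmark, while the private notes in the usage example simply never appear in the public view, which mirrors the two examples above.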

A Primary Post with Webmention Updates

Alternately, one could create a primary post (potentially a bookmark) for the thing one is reading, and then publish additional posts, each sending a webmention to the original, thereby adding details to the original post about the ongoing progress. In some sense, this isn’t too far from the functionality provided by GoodReads with individual updates on progress with brief notes and their page that lists the overall view of progress. Each individual post could be made public/private to allow different viewerships, though private webmentions may be a hairier issue. I know some are also experimenting with pushing updates to posts via micropub and other methods, which could be appealing as well.
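For anyone unfamiliar with the mechanics, a webmention notification is just a form-encoded POST to the target’s webmention endpoint carrying two parameters, source and target. The Python sketch below builds (but deliberately does not send) such a request for a progress update pointing at a primary post. The build_webmention helper and all of the example.com URLs are hypothetical, and the sketch skips endpoint discovery, which a real sender would need to do first.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_webmention(endpoint: str, source: str, target: str) -> Request:
    """Build (but do not send) a Webmention notification request."""
    # The notification body is form-encoded with exactly these two parameters.
    body = urlencode({"source": source, "target": target}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_webmention(
        "https://example.com/webmention",       # hypothetical endpoint
        "https://example.com/updates/page-50",  # hypothetical progress update
        "https://example.com/reading/a-book",   # hypothetical primary post
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Actually sending it would be a call to urllib.request.urlopen(req).
```

Each progress update would send one of these to the primary post, and the primary post’s site decides how (or whether) to display it.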

This may be cumbersome over time, but could potentially be made to look something like the GoodReads UI below, which seems very intuitive. (Note that it’s missing any review text as I’m currently writing it, and it’s not public yet.)

Other Thoughts

Ideally, I’d like to better distinguish between something that has been bookmarked and something that has been read or is still unread, with dates for both the bookmarking and the reading, as well as the option of adding notes and highlights relating to the article. Something potentially akin to Devon Zuegel’s “Notes” tab (built on a custom script for Evernote and Tumblr) seems somewhat promising as a cross between a simple reading list (or linkblog) and a commonplace book for academic work, but doesn’t necessarily leave room for longer book reviews.

I’ll also need to consider the publishing workflow, in some sense as it relates to the reverse chronological posting of updates on typical blogs. Perhaps a hybrid approach of the two methods mentioned would work best?

Potentially having an interface that bolts together the interface of GoodReads (pictured above) and Amazon’s notes/highlights would be excellent. I recently noticed (and updated an old post to reflect) that they’re already beta testing such a beast.

Kindle Notes and Highlights are now showing up as a beta feature in GoodReads

Comments

I’ll keep thinking about the architecture for what I’d ultimately like to have, but I’m always open to hearing what other (heavy) readers have to say about the subject and the usability of such a UI.

Please feel free to comment below, or write something on your own site (which includes the URL of this post) and submit your URL in the field provided below to create a webmention in which your post will appear as a comment.

Hypothes.is and the IndieWeb

Last night I saw two great little articles about Hypothes.is, a web-based annotation engine, written by a proponent of the IndieWeb:

Hypothes.is as a public research notebook

Hypothes.is Aggregator ― a WordPress plugin

As a researcher, I fully appreciate the pro-commonplace book conceptualization of the first post, and the second takes things amazingly further with a plugin that allows one to easily display one’s hypothes.is annotations on one’s own WordPress-based site in a dead-simple fashion.

This functionality is a great first step, though honestly, in keeping with IndieWeb principles of owning one’s own data, I think it would be easier/better if Hypothes.is both accepted and sent webmentions. This would potentially allow me to physically own the data on my own site while still participating in the larger annotation community, and it would give me notifications when someone comments on or augments one of my annotations or even annotates one of my own pages (bits of which I’ve written about before).

Either way, kudos to Kris Shaffer for moving the ball forward!

Examples

My Hypothes.is Notebook

The plugin mentioned in the second article allows me to keep a running online “notebook” of all of my Hypothes.is annotations on my own site.

My IndieWeb annotations

I can also easily embed my recent annotations about the IndieWeb below:

[ hypothesis user = 'chrisaldrich' tags = 'indieweb']

Webmention + Books = BookMention

Part of my plans to (remotely) devote the weekend to the IndieWeb Summit in Portland were hijacked by the passing of Muhammad Ali. Wait… What?! How does that happen?

A year ago, I started a publishing company, and we came out with our first book, Amerikan Krazy, in late February. The author has a small back catalogue that’s out of print, so in conjunction with his book launch, we’ve been slowly releasing e-book versions of his old titles. Coincidentally, one of them was a fantastic little book about Ali entitled Muhammad Ali Retrospective, so I dropped everything I was doing to get it finished up and out as a quick way of honoring his passing.

But while I was working on some of the minutiae, I’ve been thinking in the back of my mind about the ideas of marginalia, commonplace books, and Amazon’s siloed community of highlights and notes. Is there a decentralized web-based way of creating a construct similar to webmention that will allow all readers worldwide to highlight, mark up and comment across electronic versions of texts so that they can share them in an open manner while still owning all of their own data? And possibly a way to aggregate them at the top for big data studies in the vein of corpus linguistics?

I think there is…

It’ll take some effort, but effort that could have a worthwhile impact.

I have a few potential architectures in mind, but also want to keep online versions of books in the loop as well as potentially efforts like hypothes.is or even the academic portions of Genius.com which do web-based annotation.

If anyone in the IndieWeb, books, or online marginalia worlds has thought about this as well, I’d love to chat.

How can we be sure old books were ever read? – University of Glasgow Library

Bookmarked How can we be sure old books were ever read? by Robert MacLean (University of Glasgow Library)

Owning a book isn’t the same as reading it; we need only look at our own bloated bookshelves for confirmation.

This is a great little overview for people reading the books of others. There are also lots of great links to other resources.

Inquire in the Margine

If the Page satisfie not, inquire in the Margine:

John Selden (1584-1654), English jurist and scholar
in Illustrations (1612), a commentary on Poly-Olbion, a poem by Michael Drayton
in the margin next to ‘A table to the chiefest passages, in the Illustrations, which, worthiest of observation, are not directed unto by the course of the volume.’

Git and Version Control for Novelists, Screenwriters, Academics, and the General Public

Marginalia and Revision Control

At the end of April, I read an article entitled “In the Margins” in the Johns Hopkins University Arts & Sciences magazine. I was particularly struck by the comments of eminent scholar Jacques Neefs on page thirteen (or paragraph 20) about computers making marginalia a thing of the past:

Neefs believes contemporary literature is losing a valuable component in an age when technology often precludes and trumps the need to save manuscripts or rough drafts. But it is not something that keeps him up at night. ‘The modern technique of computers and everything makes [marginalia] a thing of the past,’ he says. ‘There’s a new way of creation. Some would say it’s tragic, but something new has been invented. I don’t consider it tragic. There are still great writers who write and continue to have a way to keep the process.’

[Photo looking over the shoulder of Jacques Neefs onto the paper he's been studying on the table in front of him.] Jacques Neefs (Image courtesy of Johns Hopkins University)

I actually think that he may be completely wrong and that current technology actually allows us to keep far more marginalia! (Has anyone heard of digital exhaust?) The bigger issue may be that many writers just don’t know how to keep a better running log of their work to maintain all the relevant marginalia they’re actually producing. (Of course there’s also the subsequent broader librarian’s “digital dilemma” of maintaining formats for the future. As an example, think about how easy or hard it might be for you to read that ubiquitous 3.5 inch floppy disk you used in 1995.)

As a technologist who has spent many years in the entertainment industry, I feel compelled to point everyone towards the concept of revision control (or version control) within the realm of computer science. Though it’s primarily used for tracking changes in computer programs and is often a tool used by large teams of programmers, it can very easily be used for tracking changes in almost any type of writing: novels, short stories, screenplays, legal contracts, or textual documentation of nearly any sort.

Example Use Cases for Revision Control

Publishing

As a direct example, I’m using what is known as a Git repository to track every change I make in a textbook I’m currently writing. I can literally go back and view every change I’ve made since beginning the project, so though I’m directly revising one (or more) text files, all of my “marginalia” and revisions are saved and available. Currently I’m only doing it for my own reference and for additional backup, not supposing that anyone other than myself or possibly an editor may ever want to peruse it. If I were working in conjunction with others, there are ways for me to track the changes, edits, or notes that others (perhaps an editor or collaborator) might make.

In addition to the general back-up of the project (in case of catastrophic computer failure), I also have the ability to go back and find that paragraph (or those multiple pages) I deleted last week in haste and now desperately want back, instead of having to recreate them de novo.
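To show the kind of information a revision control system keeps between two saved drafts, here is a minimal Python sketch that recovers lines present in an older draft but missing from a newer one. To be clear, this uses Python’s standard difflib module as a stand-in and is not Git itself (with Git one would look at the output of commands such as git log and git diff); the removed_lines helper and the sample drafts are hypothetical.

```python
import difflib


def removed_lines(old_draft: str, new_draft: str) -> list[str]:
    """Return lines present in the old draft but missing or changed in the new one."""
    diff = difflib.ndiff(old_draft.splitlines(), new_draft.splitlines())
    # ndiff prefixes lines found only in the first sequence with "- ".
    return [line[2:] for line in diff if line.startswith("- ")]


if __name__ == "__main__":
    last_week = "Paragraph one.\nParagraph two, deleted in haste.\nParagraph three."
    today = "Paragraph one.\nParagraph three."
    for line in removed_lines(last_week, today):
        print("recovered:", line)
```

A real repository stores this kind of difference for every commit, along with who made it and when, so nothing has to be compared by hand.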

Because it’s all digital, future scholars also won’t have problems parsing my handwriting issues as has occasionally come up in differentiating Mary Shelley’s writing from that of her husband in digital projects like the Shelley Godwin Archive. The fact that all changes are tracked and placed in a tree-like structure will indicate who wrote what and when and will indicate which changes were ultimately accepted and merged into the final version.

Screenplays in Hollywood

One particular use case I can easily see for such technology is tracking changes in screenplays over time. I’m honestly shocked that production companies, or even more likely studios, don’t all use such technology to follow changes in drafts over time. In the end, doing such tracking will certainly make Writers Guild of America (WGA) arbitrations much easier, as literally every contribution to a script can be tracked to give screenwriters appropriate credit. The end result, with the easy ability to time-machine one’s way back into older drafts, is truly lovely, and the outputs give so much more information about changes in the script compared to the traditional and all-too-simple (*) which screenwriters use to indicate that something/anything changed on a specific line, or the different colored pages which are used on scripts during production.

I can also picture future screenwriters using services like GitHub as platforms for storing and distributing their screenplays to potential agents, managers, and producers.

Redlining Legal Documents

Having seen thousands of legal agreements go back and forth over the years, I think revision control is a natural tool for tracking the redlining of legal documents as they change over time before they are finally (or even never) executed. I have to imagine that being able to abstract out the appropriate metadata in the long run may actually help attorneys, agents, etc. to become better negotiators, but something like this is a project for another day.

Academia

In addition to direct research for projects being undertaken by academics like Neefs, academics should look into using revision control in their own daily work and writings. While writing a book, paper, journal article, essay, monograph, etc. (or, for graduate students, a thesis), one could use one’s own Git repository to save and back up all of one’s work, primarily for oneself, but also for future scholars who would not otherwise have access to the “marginalia” one creates while manufacturing written thoughts in digital form.

I can easily picture Git as a very simple “next step” in furthering the concept of the digital humanities as well as in helping to bridge the gap between C.P. Snow’s “two cultures.” (I’d also suggest that revision control is a relatively simple step one could take before learning a particular programming language, which I think should be a mandatory tool in everyone’s daily toolbox regardless of their field(s) of interest.)

Start Using Revision Control

“But how do I get started?” you ask.

Know going in that it may take parts of a day to get things set up and running, but once you’ve started with the basics, things are actually pretty easy and you can continue to learn the more advanced subtleties as you progress. Once things are working smoothly, the additional overhead you’ll be expending won’t be too much more than the old method of hitting Ctrl-S to save one of your old Word documents in the time before auto-save became ubiquitous.

First, one should start by choosing one of the myriad revision control systems that exist. For the sake of brevity in this short introductory post, I’ll simply suggest that users take a very close look at Git because of its ubiquity and popularity in the computer science world and the tremendously large amount of free information and support available for it from a variety of sites on the internet. Git has versions for all major operating systems (Windows, MacOS, and Linux). It has also had a relatively long and robust life within the computer science community, meaning that it’s very stable and has many more resources for the uninitiated to draw upon.

Once one has Git installed on their computer and has begun using it, I’d then recommend linking one’s local copy of the repository to a cloud storage solution like either GitHub or BitBucket. While GitHub is certainly one of the most popular Git-related services out there (because it acts, in part, as the hub for a large portion of the open internet and thus promotes sharing), I often recommend using BitBucket as it allows free unlimited private but still share-able repositories while GitHub, at the time of writing, requires a small subscription fee for keeping one’s work private. Having a repository in the cloud will help tremendously in that your work will be available and downloadable from almost anywhere and because it also serves as a de-facto back-up solution for your work.
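For those who want to see roughly what “getting started” and “linking to the cloud” look like, the Python sketch below simply lists and prints the handful of Git commands involved without running anything. The commands themselves (git init, git add, git commit, git remote add, git push) are standard Git; the setup_commands helper, the commit message, and the repository URL are hypothetical placeholders. Note that git commit also assumes one’s name and email have already been configured in Git.

```python
import shlex


def setup_commands(remote_url: str, message: str = "First draft") -> list[list[str]]:
    """Return the Git commands for a first commit and a link to a hosted copy."""
    return [
        ["git", "init"],                                 # turn the folder into a repository
        ["git", "add", "."],                             # stage every file in the folder
        ["git", "commit", "-m", message],                # record the first snapshot
        ["git", "remote", "add", "origin", remote_url],  # name the hosted copy "origin"
        ["git", "push", "-u", "origin", "HEAD"],         # upload the current branch
    ]


if __name__ == "__main__":
    # Placeholder URL; a hosting service supplies the real one for a repository.
    for cmd in setup_commands("https://example.com/me/my-book.git"):
        print(shlex.join(cmd))
```

After that first round, day-to-day use is mostly the add, commit, and push steps repeated whenever a writing session ends.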

I’ve recently been playing around with version control to help streamline the writing/editing process for a book I’ve been writing. Though Git and its variants probably seem more daunting than they should to the everyday user, they really represent a very powerful tool. I’ve spent less than two days learning the basics of both Git and hosted repositories (GitHub and Bitbucket), and it has been more than well worth the minor effort.

There is a huge wealth of information on revision control in general and on installing and using Git available on the internet, including full textbooks. For complete beginners, I’d recommend starting with The Chronicle’s “A Gentle Introduction to Version Control.” Keep in mind that though some of these resources look highly technical, it’s because many are trying to enumerate every function one could potentially desire, when even just the basic core functionality is more than enough to begin with. (I could analogize it to learning to drive a car versus actually reading the full manual so that you know how to take the engine apart and put it back together from scratch. To start with revision control, you only need to learn to “drive.”) Professors might also avail themselves of their local institutional libraries, which may host small sessions on learning such tools, or of the help of their colleagues or students in the computer science department. For others, I’d recommend taking a look at Git’s primary website. BitBucket has an excellent step-by-step tutorial (and troubleshooting) for setting up the requisite software and using it.

What do you use for revision control?

I’ll welcome any thoughts, experiences, or additional resources one might want to share with others in the comments.