A note taking problem and a proposed solution

tl;dr

It’s too painful to quickly get frequent notes into note taking and related platforms. Hypothes.is has an open API and a great UI that can be leveraged to simplify note taking processes.

Note taking tools

I’ve been keeping notes in systems like OneNote and Evernote for ages, but for my memory-related research and work in combination with my commonplace book for the last year, I’ve been alternately using TiddlyWiki (with TiddlyBlink) and WordPress (it’s way more than a blog.)

I’ve also dabbled significantly enough with related systems like Roam Research, Obsidian, Org mode/Org Roam, MediaWiki, DocuWiki, and many others to know what I’m looking for.

Many of these, particularly those that can be used alternately as commonplace books and zettelkasten appeal to me greatly when they include the idea of backlinks. (I’ve been using Webmention to leverage that functionality in WordPress settings, and MediaWiki gives it grudgingly with the “what links to this page” basic functionality that can be leveraged into better transclusion if necessary.)

The major problem with most note taking tools

The final remaining problem I’ve found with almost all of these platforms is being able to quickly and easily get data into them so that I can work with or manipulate it. For me the worst part of note taking is the actual taking of notes. Once I’ve got them, I can do some generally useful things with them—it’s literally the physical method of getting data from a web page, book, or other platform into the actual digital notebook that is the most painful, mindless, and useless thing for me.

Evernote and OneNote

Older note taking services like Evernote and OneNote come with browser bookmarklets or mobile share functionality that make taking notes and extracting data from web sources simple and straightforward. Then once the data is in your notebook you can actually do some work with it. Sadly neither of these services has the backlinking functionality that I find has become de rigueur for my note taking or knowledge wrangling needs.

WordPress

My WordPress solutions are pretty well set since that workflow is entirely web-based and because WordPress has both bookmarklet and Micropub support. There I’m primarily using a variety of feeds and services to format data into a usable form that I can use to ping my Micropub endpoint. The Micropub plugin handles the post and most of the meta data I care about.

It would be great if other web services had support for Micropub this way too, as I could see some massive benefits to MediaWiki, Roam Research, and TiddlyWiki if they had this sort of support. The idea of Micropub has such great potential for great user interfaces. I could also see many of these services modifying projects like Omnibear to extend themselves to create highlighting (quoting) and annotating functionality with a browser extension.

With this said, I’m finding that the user interface piece that I’m missing for almost all of these note taking tools is raw data collection.

I’m not the sort of person whose learning style (or memory) is benefited by writing or typing out notes into my notebooks. I’d far rather just have it magically happen. Even copying and pasting data from a web browser into my digital notebook is a painful and annoying process, especially when you’re reading and collecting/curating as many notes as I tend to. I’d rather be able to highlight, type some thoughts and have it appear in my notebook. This would prevent the flow of my reading, thinking, and short annotations from being subverted by the note collection process.

Different modalities for content consumption and note taking 

Based on my general experience there are only a handful of different spaces where I’m typically making notes.

Reading online

A large portion of my reading these days is done in online settings. From newspapers, magazines, journal articles and more, I’m usually reading them online and taking notes from them there.

.pdf texts

Some texts I want to read (often books and journal articles) only live in .pdf form. While reading them in an app-specific setting has previously been my preference, I’ve taken to reading them from within browsers. I’ll explain why in just a moment, but it has to do with a tool that treats this method the same as the general online modality. I’ll note that most of the .pdf  specific apps have dreadful data export—if any.

Reading e-books (Kindle, e-readers, etc.)

If it’s not online or in .pdf format, I’m usually reading books within a Kindle or other e-reading device. These are usually fairly easy to add highlights, annotations, and notes to. While there are some paid apps that can extract these notes, I don’t find it too difficult to find the raw file and cut and paste the data into my notebook of choice. Once there, going through my notes, reformatting them (if necessary), tagging them and expanding on them is not only relatively straightforward, but it also serves as a simple method for doing a first pass of spaced repetition and review for better long term recall.

Lectures

Naturally taking notes from live lectures, audiobooks, and other spoken events occurs, but more often in these cases, I’m typically able to type them directly into my notebook of preference or I’m using something like my digital Livescribe pen for notes which get converted by OCR and are easy enough to convert in bulk into a digital notebook. I won’t belabor this part further, though if others have quick methods, I’d love to hear them.

Physical books

While I love a physical book 10x more than the next 100 people, I’ve been trying to stay away from them because I find that though they’re easy to highlight, underline, and annotate the margins, it takes too much time and effort (generally useless for memory purposes for me) to transfer these notes into a digital notebook setting. And after all, it’s the time saving piece I’m after here, so my preference is to read in some digital format if at all possible.

A potential solution for most of these modalities

For several years now, I’ve been enamored of the online Hypothes.is annotation tool. It’s open source, allows me reasonable access to my data from the (free) hosted version, and has a simple, beautiful, and fast process for bookmarking, highlighting, and annotating online texts on desktop and mobile. It works exceptionally well for both web pages and when reading .pdf texts within a browser window.

I’ve used it daily to make several thousand annotations on 800+ online web pages and documents. I’m not sure how I managed without it before. It’s the note taking tool I wished I’d always had. It’s a fun and welcome part of my daily life. It does exactly what I want it to and generally stays out of the way otherwise. I love it and recommend it unreservedly. It’s helped me to think more deeply and interact more directly with countless texts.

When reading on desktop or mobile platforms, it’s very simple to tap a browser extension and have all their functionality immediately available. I can quickly highlight a section of a text and their UI pops open to allow me to annotate, tag it, and publish. I feel like it’s even faster than posting something to Twitter. It is fantastically elegant.

The one problem I have with it is that while it’s great for collecting and aggregating my note data into my Hypothes.is account, there’s not much I can do with it once it’s there. It’s missing the notebook functionality some of these other services provide. I wish I could plug all my annotation and highlight content into spaced repetition systems or move it around and modify it within a notebook where it might be more interactive and cross linked for the long term. Sadly I don’t think that any of this sort of functionality is on Hypothes.is’ roadmap any time soon.

There is some great news however! Hypothes.is is open source and has a reasonable API. This portends some exciting things! This means that any of these wiki, zettelkasten, note taking, or spaced repetition services could leverage the UI for collecting data and pipe it into their interfaces for direct use.

As an example, what if I could quickly tell Obsidian to import all my pre-existing and future Hypothes.is data directly into my Obsidian vault for manipulating as notes? (And wouldn’t you know, the small atomic notes I get by highlighting and annotating are just the sort that one would like in a zettelkasten!) What if I could pick and choose specific course-related data from my reading and note taking in Hypothes.is (perhaps by tag or group) for import into Anki to quickly create some flash cards for spaced repetition review? For me, this combination would be my dream application!

These small pieces, loosely joined can provide some awesome opportunities for knowledge workers, students, researchers, and others. The education focused direction that Hypothes.is, many of these note taking platforms, and spaced repetition systems are all facing positions them to make a super-product that we all want and need.

An experiment

So today, as a somewhat limited experiment, I played around with my Hypothes.is atom feed (https://hypothes.is/stream.atom?user=chrisaldrich, because you know you want to subscribe to this) and piped it into IFTTT. Each post creates a new document in a OneDrive file which I can convert to a markdown .md file that can be picked up by my Obsidian client. While I can’t easily get the tags the way I’d like (because they’re not included in the feed) and the formatting is incredibly close, but not quite there, the result is actually quite nice.

Since I can “drop” all my new notes into a particular folder, I can easily process them all at a later date/time if necessary. In fact, I find that the fact that I might want to revisit all my notes to do quick tweaks or adding links or additional thoughts provides the added benefit of a first round of spaced repetition for the notes I took.

Some notes may end up being deleted or reshuffled, but one thing is clear: I’ve never been able to so simply highlight, annotate, and take notes on documents online and get them into my notebook so quickly. And when I want to do something with them, there they are, already sitting in my notebook for manipulation, cross-linking, spaced repetition, and review.

So if the developers of any of these platforms are paying attention, I (and I’m sure others) really can’t wait for plugin integrations using the full power of the Hypothes.is API that allow us to all leverage Hypothes.is’ user interface to make our workflows seamlessly simple.

Git and Version Control for Novelists, Screenwriters, Academics, and the General Public

Marginalia and Revision Control

At the end of April, I read an article entitled “In the Margins” in the Johns Hopkins University Arts & Sciences magazine.  I was particularly struck by the comments of eminent scholar Jacques Neefs on page thirteen (or paragraph 20) about computers making marginalia a thing of the past:

Neefs believes contemporary literature is losing a valuable component in an age when technology often precludes and trumps the need to save manuscripts or rough drafts. But it is not something that keeps him up at night. ‘The modern technique of computers and everything makes [marginalia] a thing of the past,’ he says. ‘There’s a new way of creation. Some would say it’s tragic, but something new has been invented. I don’t consider it tragic. There are still great writers who write and continue to have a way to keep the process.’

Photo looking over the shoulder of Jacques Neefs onto the paper he's been studing on the table in front of him.
Jacques Neefs (Image courtesy of Johns Hopkins University)

I actually think that he may be completely wrong and that current technology actually allows us to keep far more marginalia! (Has anyone heard of digital exhaust?) The bigger issue may be that many writers just don’t know how to keep a better running log of their work to maintain all the relevant marginalia they’re actually producing. (Of course there’s also the subsequent broader librarian’s “digital dilemma” of maintaining formats for the future. As an example, thing about how easy or hard it might be for you to read that ubiquitous 3.5 inch floppy disk you used in 1995.)

A a technologist who has spent many years in the entertainment industry, I feel compelled to point everyone towards the concept of revision control (or version control) within the realm of computer science.  Though it’s primarily used in tracking changes in computer programs and is often a tool used by large teams of programmers, it can very easily be used for tracking changes in almost any type of writing from novels, short stories, screenplays, legal contracts, or any type of textual documentation of nearly any sort.

Example Use Cases for Revision Control

Publishing

As a direct example, I’m using what is known as a Git repository to track every change I make in a textbook I’m currently writing.  I can literally go back and view every change I’ve made since beginning the project, so though I’m directly revising one (or more) text files, all of my “marginalia” and revisions are saved and available.  Currently I’m only doing it for my own reference and for additional backup not supposing that anyone other than myself or an editor possibly may want to ever peruse it.  If I was working in conjunction with otheres, there are ways for me to track the changes, edits, or notes that others (perhaps an editor or collaborator) might make.

In addition to the general back-up of the project (in case of catastrophic computer failure), I also have the ability to go back and find that paragraph (or multiple pages) I deleted last week in haste, but realize that I desperately want them back now instead of having to recreate them de n0vo.

Because it’s all digital, future scholars also won’t have problems parsing my handwriting issues as has occasionally come up in differentiating Mary Shelley’s writing from that of her husband in digital projects like the Shelley Godwin Archive. The fact that all changes are tracked and placed in a tree-like structure will indicate who wrote what and when and will indicate which changes were ultimately accepted and merged into the final version.

Screenplays in Hollywood

One particular use case I can easily see for such technology is tracking changes in screenplays over time.  I’m honestly shocked that every production company or even more likely studios don’t use such technology to follow changes in drafts over time. In the end, doing such tracking will certainly make Writers Guild of America (WGA) arbitrations much easier as literally every contribution to a script can be tracked to give screenwriters appropriate credit. The end results with the easy ability to time-machine one’s way back into older drafts is truly lovely, and the outputs give so much more information about changes in the script compared to the traditional and all-too-simple (*) which screenwriters use to indicate that something/anything changed on a specific line or the different colored pages which are used on scripts during production.

I can also picture future screenwriters using services like GitHub as platforms for storing and distributing their screenplays to potential agents, managers, and producers.

Redlining Legal Documents

Having seen thousands of legal agreements go back and forth over the years, revision control is a natural tool for tracking the redlining and changes of legal documents as they change over time before they are finally (or even never) executed. I have to imagine that being able to abstract out the appropriate metadata in the long run may actually help attorneys, agents, etc. to become better negotiators, but something like this is a project for another day.

Academia

In addition to direct research for projects being undertaken by academics like Neefs, academics should look into using revision control in their own daily work and writings.  While writing a book, paper, journal article, essay, monograph, etc. (or graduate students writing theses) one could use their own Git repository to not only save but to back up all of their own work not only for themselves primarily, but also future scholars who come later who would not otherwise have access to the “marginalia” one creates while manufacturing their written thoughts in digital form.

I can easily picture Git as a very simple “next step” in furthering the concept of the digital humanities as well as in helping to bridge the gap between C.P. Snow’s “two cultures.” (I’d also suggest that revision control is a relatively simple step one could take before learning a particular programming language, which I think should be a mandatory tool in everyone’s daily toolbox regardless of their field(s) of interest.)

Git Logo

Start Using Revision Control

“But how do I get started?” you ask.

Know going in that it may take parts of a day to get things set up and running, but once you’ve started with the basics, things are actually pretty easy and you can continue to learn the more advanced subtleties as you progress.  Once things are working smoothly, the additional overhead you’ll be expending won’t be too much more than the old method of hitting Alt-S to save one of your old Word documents in the time before auto-save became ubiquitous.

First one should start by choosing one of the myriad revision control systems that exist.  For the sake of brevity in this short introductory post, I’ll simply suggest that users take a very close look at Git because of its ubiquity and popularity in the computer science world and the fact that it includes a tremendously large amount of free information and support from a variety of sites on the internet. Git also has the benefit of having versions for all major operating systems (Windows, MacOS, and Linux). Git also has the benefit of a relatively long and robust life within the computer science community meaning that it’s very stable and has many more resources for the uninitiated to draw upon.

Once one has Git installed on their computer and has begun using it, I’d then recommending linking one’s local copy of the repository to a cloud storage solution like either GitHub or BitBucket.  While GitHub is certainly one of the most popular Git-related services out there (because it acts, in part, as the hub for a large portion of the open internet and thus promotes sharing), I often recommend using BitBucket as it allows free unlimited private but still share-able repositories while GitHub requires a small subscription fee for keeping one’s work private. Having a repository in the cloud will help tremendously in that your work will be available and downloadable from almost anywhere and because it also serves as a de-facto back-up solution for your work.

I’ve recently been playing around with version control to help streamline the writing/editing process for a book I’ve been writing. Though Git and it’s variants probably seem more daunting than they should to the everyday user, they really represent a very powerful tool. I’ve spent less than two days learning the basics of both Git and hosted repositories (GitHub and Bitbucket), and it has been more than well worth the minor effort.

There is a huge wealth of information on revision control in general and on installing and using Git available on the internet, including full textbooks. For the complete beginners, I’d recommend starting with The Chronicle’s “A Gentle Introduction to Version Control.” Keep in mind that though some of these resources look highly technical, it’s because many are trying to enumerate every function one could potentially desire, when even just the basic core functionality is more than enough to begin with. (I could analogize it to learning to drive a car versus actually reading the full manual so that you know how to take the engine apart and put it back together from scratch. To start with revision control, you only need to learn to “drive.”) Professors might also avail themselves of the use of their local institutional libraries which may host small sessions on learning such tools, or they might avail themselves of the help of their colleagues or students in the computer science department. For others, I’d recommend taking a look at Git’s primary website. BitBucket has an excellent step-by-step tutorial (and troubleshooting) for setting up the requisite software and using it.

What do you use for revision control?

I’ll welcome any thoughts, experiences, or additional resources one might want to share with others in the comments.