In the book The Professor and the Madman: A Tale of Murder, Insanity, and the Making of the Oxford English Dictionary, Simon Winchester describes the pigeonhole and slip system that Professor James Murray used to create the Oxford English Dictionary. The editors essentially put out a call to readers to note down interesting everyday words they found in their reading, along with example sentences and references. They then collected these words alphabetically into pigeonholes, and from there were able to collectively compile their magisterial dictionary. Fans of the various methods of knowledge collection and management represented by the index card-based commonplace book or the zettelkasten will appreciate this scheme as a method of collectively finding and collating knowledge. It’s akin to Paul Otlet and Henri La Fontaine’s work on creating the Mundaneum, but focused on the niche area of lexicography and historical linguistics.
The Professor and the Madman is broadly the fascinating story of Dr. W. C. Minor, an insane asylum patient who saw the call to collect words and sentences and began a written correspondence with James Murray, sending in over ten thousand slips with words from his personal reading.
Wordnik and Hypothes.is
A similar word collecting scheme is happening on the internet now, though perhaps with a bit more focus on interesting neologisms (and hopefully without me being cast as an insane asylum patient). The lovely folks at the online dictionary Wordnik have been using the digital annotation tool Hypothes.is to collect examples of words as they happen in the wild. One can create a free account on the Hypothes.is service and quickly and easily begin collecting words for the effort by highlighting example sentences and tagging them with “wordnik” and “hw-[InsertFoundWordHere]”.
So for example, this morning I was reading about the clever new animations in the language app Duolingo and came across a curious new word (at least to me): viseme.
To create accurate animations, we generate the speech, run it through our in-house speech recognition and pronunciation models, and get the timing for each word and phoneme (speech sound). Each sound is mapped onto a visual representation, or viseme, in a set we designed based on linguistic features.
So I clicked on my handy browser extension for Hypothes.is, highlighted the sentence with a bit of context, and tagged it with “wordnik” and “hw-viseme”. The “hw-” prefix ostensibly means “headword”, which is how lexicographers refer to the words you see defined in dictionaries.
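For the programmatically inclined, the tagging convention is simple enough to sketch in a few lines of Python. The function name wordnik_tags is my own invention for illustration, and I'm assuming the found word goes after the prefix exactly as spelled, with only stray whitespace trimmed.

```python
def wordnik_tags(word: str) -> list[str]:
    """Build the two tags used when annotating a found word.

    Hypothetical helper: the tag scheme ("wordnik" plus "hw-<word>")
    comes from the post; the function itself is only an illustration.
    """
    headword = word.strip()
    if not headword:
        raise ValueError("expected a non-empty word")
    return ["wordnik", f"hw-{headword}"]


print(wordnik_tags("viseme"))
```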
Then the fine folks at Wordnik are able to access the public annotations matching the tag “wordnik” and use the Hypothes.is API to pull in the collections of new words for inclusion in their ever-growing corpus.
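I don't know what Wordnik's own tooling looks like, but the retrieval step can be roughed out as follows. This is a minimal sketch, assuming the public Hypothes.is search endpoint at https://api.hypothes.is/api/search with its tag and limit query parameters, and a JSON response whose "rows" list holds annotations that each carry a "tags" list. The function names and the sample row are mine, written by hand for illustration rather than copied from real API output.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public search endpoint of the Hypothes.is API.
SEARCH_ENDPOINT = "https://api.hypothes.is/api/search"


def build_search_url(tag: str = "wordnik", limit: int = 20) -> str:
    """Return a search URL for public annotations carrying the given tag."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'tag': tag, 'limit': limit})}"


def fetch_rows(url: str) -> list[dict]:
    """Download one page of annotations (requires network access)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response).get("rows", [])


def headwords(rows: list[dict]) -> list[str]:
    """Collect the words behind every "hw-" tag in a list of annotations."""
    found = []
    for row in rows:
        for tag in row.get("tags", []):
            # Skip a bare "hw-" tag that has no word after the prefix.
            if tag.startswith("hw-") and len(tag) > 3:
                found.append(tag[3:])
    return found


# Hand-written sample shaped like a search result row, not real API output.
sample_rows = [
    {"uri": "https://example.com/post", "tags": ["wordnik", "hw-viseme"]},
]

print(build_search_url())
print(headwords(sample_rows))
```

To try it against live data, pass the result of fetch_rows(build_search_url()) to headwords instead of the sample rows.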
Since I’ve collected interesting new words and neologisms for ages anyway, this has been a quick and easy method of helping out other like-minded word collectors along the way. In addition to the ability to help out others, a side benefit of the process is that the collected words are all publicly available for reading and using in daily life! You can not only find the public page for Wordnik words on Hypothes.is, but you can also subscribe to it via RSS to see all the clever and interesting neologisms appearing in the English language as collected in real time! So if you’re the sort who enjoys touting new words at cocktail parties, a rabid cruciverbalist who refuses to be stumped by this week’s puzzle, or a budding lexicographer yourself, you’ve now got a fantastic new resource! I’ve found it to be far more entertaining and intriguing than any ten other word-of-the-day efforts I’ve seen in published or internet form.
If you like, there’s also a special Hypothes.is group you can apply to join to more easily aid in the effort. Want to know more about Wordnik and their mission? Check out their informative Kickstarter page.