Hypothes.is as a Digital Zettelkasten for Neologism and Word Collection for Wordnik

A little while ago, one of the followers of my Hypothes.is account, where I actively mark up my reading with highlights, annotations, and notes, asked me why I was tagging seemingly random sentences with “wordnik” and other odd tags that started with “hw-”. Today I thought I’d write out the explanation of the habits around one of my side hobbies: word collecting.

Some background

In the book The Professor and the Madman: A Tale of Murder, Insanity, and the Making of the Oxford English Dictionary, Simon Winchester describes the pigeonhole and slip system that professor James Murray used to create the Oxford English Dictionary. The editors essentially put out a call to readers to note down interesting everyday words they found in their reading along with example sentences and references. They then collected these words alphabetically into pigeonholes and from there were able to collectively compile their magisterial dictionary. Those who are fans of the various methods of knowledge collection and management represented by the index card-based commonplace book or the zettelkasten will appreciate this scheme as a method of collectively finding and collating knowledge. It’s akin to Paul Otlet and Henri La Fontaine’s work on creating the Mundaneum, but focused on the niche area of lexicography and historical linguistics.

Book cover of The Professor and the Madman featuring a sepia toned image of a seated professor in a full beard and a mustache holding a book.

The Professor and the Madman is broadly the fascinating story of Dr. W. C. Minor, an insane asylum patient, who saw the call to collect words and sentences and began a written correspondence with James Murray by sending in over ten thousand slips with words from his personal reading.

Wordnik and Hypothes.is

A similar word collecting scheme is happening on the internet now, though perhaps with a bit more focus on interesting neologisms (and hopefully without me being cast as an insane asylum patient). The lovely folks at the online dictionary Wordnik have been using the digital annotation tool Hypothes.is to collect examples of words as they happen in the wild. One can create a free account on the Hypothes.is service and quickly and easily begin collecting words for the effort by highlighting example sentences and tagging with “wordnik” and “hw-[InsertFoundWordHere]”.

So for example, this morning I was reading about the clever new animations in the language app Duolingo and came across a curious new word (at least to me): viseme.

To create accurate animations, we generate the speech, run it through our in-house speech recognition and pronunciation models, and get the timing for each word and phoneme (speech sound). Each sound is mapped onto a visual representation, or viseme, in a set we designed based on linguistic features.

So I clicked on my handy browser extension for Hypothes.is, highlighted the sentence with a bit of context, and tagged it with “wordnik” and “hw-viseme”. The “hw-” prefix ostensibly means “head word” which is how lexicographers refer to the words you see defined in dictionaries.

Then the fine folks at Wordnik are able to access the public annotations matching the tag “wordnik” and use the Hypothes.is API to pull in the collections of new words for inclusion in their ever-growing corpus.
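For the curious, the tagging convention described above can be sketched in a few lines of Python. This is only an illustration and not Wordnik’s actual pipeline: it assumes the public Hypothes.is search endpoint at https://api.hypothes.is/api/search with `tag` and `limit` query parameters, and a JSON response whose `rows` carry `tags` and `uri` fields. The function names and the sample rows are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

# Assumption: the public Hypothes.is search endpoint; not confirmed by this post.
SEARCH_ENDPOINT = "https://api.hypothes.is/api/search"


def build_search_url(tag="wordnik", limit=50):
    """Build a search URL for public annotations carrying the given tag."""
    return SEARCH_ENDPOINT + "?" + urlencode({"tag": tag, "limit": limit})


def headwords(tags, prefix="hw-"):
    """Return the words named by hw- tags, only if "wordnik" is also present."""
    cleaned = [t.strip() for t in tags]
    if "wordnik" not in [t.lower() for t in cleaned]:
        return []
    return [t[len(prefix):] for t in cleaned
            if t.lower().startswith(prefix) and len(t) > len(prefix)]


def collect(rows):
    """Map each headword to the URIs of the pages where it was highlighted."""
    found = {}
    for row in rows:
        for word in headwords(row.get("tags", [])):
            found.setdefault(word, []).append(row.get("uri", ""))
    return found


# Hypothetical rows shaped like the "rows" list of a search response.
sample_rows = [
    {"uri": "https://example.com/duolingo-animations", "tags": ["wordnik", "hw-viseme"]},
    {"uri": "https://example.com/other", "tags": ["reading-notes"]},
]

print(build_search_url())
print(collect(sample_rows))
```

Requiring both tags mirrors the convention: “wordnik” marks the annotation as a contribution, while the “hw-” tag names the head word itself.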

Since I’ve collected interesting new words and neologisms for ages anyway, this has been a quick and easy method of helping out other like minded word collectors along the way. In addition to the ability to help out others, a side benefit of the process is that the collected words are all publicly available for reading and using in daily life! You can not only find the public page for Wordnik words on Hypothes.is, but you can subscribe to it via RSS to see all the clever and interesting neologisms appearing in the English language as collected in real time! So if you’re the sort who enjoys touting new words at cocktail parties, a rabid cruciverbalist who refuses to be stumped by this week’s puzzle, or a budding lexicographer yourself, you’ve now got a fantastic new resource! I’ve found it to be far more entertaining and intriguing than any ten other word-of-the-day efforts I’ve seen in published or internet form.

If you like, there’s also a special Hypothes.is group you can apply to join to more easily aid in the effort. Want to know more about Wordnik and their mission? Check out their informative Kickstarter page.

 

Having studied Welsh for a while, I found this video about how it informed J.R.R. Tolkien’s creation of Elvish languages for his fiction fascinating.

The fact that he uses the word Nazgûl [~35:51] from the Irish (nasc) and Scots Gaelic (nasg) words meaning “ring” to take a linguistic dig at Irish is notable. He was probably motivated by his political views of the time rather than celebrating (as one should) the value and diversity of all languages.

Tolkien once termed Welsh ‘the elder language of the men of Britain’; this talk explores how the sounds and grammar of Welsh captured Tolkien’s imagination and are reflected in Sindarin, one of the two major Elvish languages which he created.

Via https://podcasts.ox.ac.uk/medieval-welsh. For those interested in Tolkien, they’ve got a huge list of other scholarly content on his work: https://podcasts.ox.ac.uk/keywords/tolkien.

Am I wrong in thinking that the reason they’re calling it Web3 instead of Web 3.0 for parallelism with Web 2.0 is that hashtagging it on Twitter just doesn’t work with the period in there? (i.e. #⁠Web3.0 doesn’t link properly on Twitter the way it does on my website.) And if I’m right, is this a problem that we can expect the blockchain to fix? #⁠HistoricalLinguistics
Dot Porter did a more thorough tour of MS Codex 1248 today compared to our prior glimpse.

Today I learned that the phrase “run the gamut” comes from Γ ut or gamma ut, which is the lowest note of the hexachord system on the Guidonian hand and is also used to describe all the possible notes.


And for some somewhat related musical fun via John Carlos Baez:

Guillaume Dufay (1397 – 1474) is the most famous of the first generation of the Franco-Flemish school. (This first generation is also called the Burgundian School.) He is often considered a transitional figure from the medieval to the Renaissance. His isorhythmic motets illustrate that—their tonality is dissonant and dramatic compared to typical Renaissance polyphony.

Communication with rocks

I’ve now heard three references to rocks talking in Braiding Sweetgrass: Indigenous Wisdom, Scientific Knowledge and the Teachings of Plants by Robin Wall Kimmerer. Along with other indigenous attestations, the idea has gone beyond coincidence for me.

It is far from the only source to exhibit this “oddity”. References exist in the Bible from the time of King David as well as in Neolithic archaeology.

I’m increasingly confident of a hidden meaning here of which Western culture is unaware (it having been long forgotten) and which Indigenous peoples have likely forgotten as well (read: had ripped and stolen from their identities during colonialization).

References to this lost knowledge in oral and written sources still remain as evidence of my theory: “communication” or “conversations” with rocks was literally a “bedrock” cultural knowledge underpinning many human cultures and ways of life for millennia.

I’ll define this “communication” more fully shortly as I continue to collect examples in the literature as well as examples in archaeological contexts.

I’d welcome references from others should they come across them in any context.

Some notes about the semantic change of “interlink” and “backlink”

I’m reasonably certain that Tantek has raised the question or issue about the definition of “interlink” or “backlink” before, but it’s come up again today with some discussion and notes which I wanted to capture permanently here for myself with few modifications:

doubleloop[m] 12:30 PM
I have some notes I’ve taken on interlinking wikis here – https://commonplace.doubleloop.net/interlinking-wikis

tantek 12:39 PM
doubleloop[m], what’s the difference between “just” a link and an “interlink” from a user perspective?
genuine question (feel free to also answer if you have an idea @chrisaldrich) because Wikipedia seems to consider “interlink” as a common noun to be a synonym for “hyperlink” https://en.wikipedia.org/wiki/Interlink

Chris Aldrich 12:45 PM
I think that the definition for interlinking is expanding based on actual use cases. Historically Tim Berners-Lee tried to create hyperlinks as bi-directional and then scrapped the idea as not easily implementable. As a result we’ve all come to expect that links are uni-directional.

In the digital gardens, wiki spaces and now, even with Webmention, there’s an expectation (I would suggest) by a growing number of people that some links in practice will be bi-directional.

If Neil puts a link to something within his own wiki/digital garden, he’s expecting that to be picked up in a space like the Agora and it will interlink his content with that of others.

Many who are practicing POSSE/PESOS are programmatically (or manually) placing backlinks between their content and the copies that live on silos, creating a round trip set of links that typically hasn’t been seen on the web historically.

Because we’ve mostly grown up with a grammar of single directional links and no expectation of visible reverse links (except perhaps in the spammy framing of SEO linkfarms), the word “interlink” has taken on the connotation seen in Wikipedia. I think that definition is starting to change.

Among a class of users in the note taking/personal knowledge management space (Roam Research, Obsidian, Logseq, TiddlyWiki, et al) most users are expecting tools to automatically interlink (in my definition with the sense of an expected bi-directional link) pages. Further, they’re expecting that if you change the word(s) that appear within a [[wikilink]] that it will globally change all instances of that word/phrase that are so linked within one’s system.

In many of those systems you can also do a manual /redirect the way we do on the IndieWeb wiki, but they expect the system to actively rename their bi-directional links without any additional manual work.

tantek 1:08 PM
ok, the bidirectionality as expectation is interesting

Chris Aldrich 1:08 PM
By analogy, many in the general public have a general sense of what /syndication is within social media, but you (Tantek) and others in the IndieWeb space have created words/phrases/acronyms that specify a “target” and “source” to indicate in which direction the syndication is being done and between sites of differing ownership (POSSE, PESOS, PASTA, PESETAS, POOSNOW,… not to mention a linear philosophical value proposition of which are more valuable to the end user). There is a group of people who are re-claiming a definition of the words “interlink” and perhaps “backlink” to a more logical position based on new capabilities in technology. Perhaps it may be better if they created neologisms for these, but linguistically that isn’t the path being taken as there are words that would seem to have an expandable meaning for what they want. I’d classify it as a semantic change/shift/drift in the words’ meanings: https://en.wikipedia.org/wiki/Semantic_change

I suspect that if Roam Research, or any of the other apps that have this bi-directionality built in, were to remove it as a feature, they’d lose all of their userbase.

tantek 1:11 PM
yes, such a semantic shift in the meaning of “interlink” seems reasonable, and a useful distinction from the now ubiquitously expected unidirectionality of “hyperlink”

Chris Aldrich 1:12 PM
I’m expecting that sometime within the next year or so that major corporate apps like Evernote and OneNote will make this bi-directional linking a default as well.

tantek 1:12 PM
in sci-fi metaphor terms, one-way vs two-way wormholes (per other uses of “hyper”)

Chris Aldrich 1:14 PM
I can only imagine what a dramatically different version of the web we’d be living in if the idea of Webmention had existed in the early 90s. Particularly as there’s the ability to notify the other end in changes/updates/deletions of a page. Would the word “linkrot” exist in that world?

Joe Crawford 1:22 PM
Or in a world with Xanaduian transclusions, for that matter.
Alas

Chris Aldrich 1:25 PM
Related to this and going into the world of the history of information is the suggestion by Markus Krajewski in “Paper Machines: About Cards & Catalogs, 1548-1929” that early card catalog and index card systems are really an early paper/manual form of a Turing Machine: https://mitpress.mit.edu/books/paper-machines.

One might imagine the extended analogy libraries:books:index cards :: Internet:websites:links with different modes and speeds of transmission.
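
As a rough illustration of the expectation described in the discussion above (automatic bi-directional links plus global renaming of [[wikilinks]]), here is a minimal sketch in Python. It is not how Roam Research, Obsidian, or any other named tool actually works; the `[[Target|label]]` alias syntax, the function names, and the sample pages are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Matches [[Target]] and [[Target|label]]; group 1 is the target, group 2 the optional label.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(\|[^\[\]]*)?\]\]")


def outgoing_links(text):
    """List the page titles that a piece of text links to."""
    return [m.group(1).strip() for m in WIKILINK.finditer(text)]


def backlinks(pages):
    """Invert the one-way links: map each target title to the pages linking to it."""
    index = defaultdict(set)
    for title, text in pages.items():
        for target in outgoing_links(text):
            index[target].add(title)
    return {target: sorted(sources) for target, sources in index.items()}


def rename_page(pages, old, new):
    """Rename a page and rewrite every link that pointed at the old title."""
    def swap(match):
        if match.group(1).strip() != old:
            return match.group(0)
        return "[[" + new + (match.group(2) or "") + "]]"

    return {(new if title == old else title): WIKILINK.sub(swap, text)
            for title, text in pages.items()}


# Hypothetical three-page wiki.
pages = {
    "Zettelkasten": "See [[Commonplace book]] and [[Index card|cards]].",
    "Commonplace book": "Compare [[Zettelkasten]].",
    "Index card": "No links here.",
}

print(backlinks(pages))
print(rename_page(pages, "Index card", "Slip")["Zettelkasten"])
```

The backlink index is the “reverse link” that one-way hyperlinks never exposed, and the rename step is the piece users now expect to happen without any manual redirect.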

Read - Reading: Braiding Sweetgrass: Indigenous Wisdom, Scientific Knowledge, and the Teachings of Plants by Robin Wall Kimmerer (Milkweed Editions)
As a botanist, Robin Wall Kimmerer has been trained to ask questions of nature with the tools of science. As a member of the Citizen Potawatomi Nation, she embraces the notion that plants and animals are our oldest teachers. In Braiding Sweetgrass, Kimmerer brings these lenses of knowledge together to show that the awakening of a wider ecological consciousness requires the acknowledgment and celebration of our reciprocal relationship with the rest of the living world. For only when we can hear the languages of other beings are we capable of understanding the generosity of the earth, and learning to give our own gifts in return.
  • 15%

Learning the grammar of animacy.
What a sea change of perspective!! English speakers have trouble with other humans’ pronouns, wait until they need to pronoun animals and bodies of water.

Mnemonic techniques and language acquisition

Over the years in academic settings I’ve picked up pieces of Spanish, French, Latin and a few odds and ends of other languages.

Six years ago we put our daughter into a dual immersion Japanese program (in the United States) and it has changed some of my view of how we teach and learn languages, a process which is also affected by my slowly picking up conversational Welsh using the method at https://www.saysomethingin.com/ over the past year and change, a hobby which I wish I had more targeted time for.

Children learn language through a process of contextual use and osmosis which is much more difficult for adults. I’ve found that the slowly guided method used by SSiW is fairly close to this process, but is much more targeted. They’ll say a few words in the target language and give their English equivalents, then they’ll provide phrases and eventually sentences in English and give you a few seconds to form them into the target language with the expectation that you try to say at least something, or pause the program to do your best. It’s okay if you mess up, even repeatedly. They’ll say the correct phrase/sentence two times, after which you’ll repeat it again, thus giving you three tries at it. They’ll also repeat bits from one lesson to the next, so you’ll eventually get it. The key is not to worry too much about perfection.

Things slowly build using this method, but in even about ten thirty-minute lessons, you’ll have a pretty strong grasp of fluent conversational Welsh equivalent to a year or two of college level coursework. Your work on this is best supplemented by interacting with native speakers and/or watching television or reading in the target language as much as you’re able to.

For those who haven’t experienced it before I’d recommend trying out the method at https://www.saysomethingin.com/welsh/course1/intro to hear it firsthand.

The experience will give your brain a heavy workout and you’ll feel mentally tired after thirty minutes of work, but it does seem to be incredibly effective. A side benefit is that over time you’ll also build up a “gut feeling” about what to say and how without realizing it. This is something that’s incredibly hard to get in most university-based or book-based language courses.

This method will give you quicker grammar acquisition and you’ll speak more like a native, but your vocabulary acquisition will tend to be slower and you don’t get any writing or spelling practice. This can be offset with targeted memory techniques and spaced repetition/flashcards or apps like Duolingo that may help supplement one’s work.

I like some of the suggestions made in Lynne Kelly’s post about Chinese as I’ve been pecking away at bits of Japanese over time myself. There’s definitely an interesting structure to what’s going on, especially with respect to the kana, and there are many similarities between what is happening in Japanese and the Chinese that she’s studying. I’m also approaching it from a more traditional university/book-based perspective, but if folks have seen or heard of an SSiW repetition method, I’d love to hear about it.

Hopefully helpful by comparison, I’ll mention a few resources for Japanese that I’ve found while setting out on a path similar to the one Lynne seems to be following.

Japanese has two different but related syllabaries (the kana: hiragana and katakana), and using an app like Duolingo with regular practice over less than a week will give one enough experience that trying to use traditional memory techniques may end up wasting more time than it saves, especially if one expects to be practicing regularly in both the near and the long term. If you’re learning without the expectation of actively speaking, writing, or practicing the language from time to time, then wholesale mnemotechniques may be the easier path, but who really wants to learn a language like this?

The tougher portion of Japanese may come in memorizing the thousands of kanji, which can have subtly different meanings. It helps to know that there is a limited set of specific radicals with a reasonably delineable structure of increasing complexity of strokes and stroke order.

The best visualization I’ve found for this fact is the Complete Listing of the 214 Radicals and Major Variations from An Introduction to Japanese Kanji Calligraphy by Kunii Takezaki (Tuttle, 2005) which I copy below:

A chart of Japanese radicals in columns by number, character, and radical name & variations with a legend for reading the chart
Complete Listing of the 214 Radicals and Major Variations from An Introduction to Japanese Kanji Calligraphy by Kunii Takezaki (Tuttle, 2005)

(Feel free to right click and view the image in another tab or download it and view it full size to see more detail.)

I’ve not seen such a chart in any of the dozens of other books I’ve come across. The numbered structure of increasing complexity of strokes here would certainly suggest an easier to build memory palace or songline.

I love this particular text as it provides an excellent overview of what is structurally happening in Japanese with lots of tidbits that are otherwise much harder won in reading other books.

There are many kanji books with various forms of what I would call very low level mnemonic aids. I’ve not found one written or structured by what I would consider a professional mnemonist. One of the best structured ones I’ve seen is A Guide to Remembering Japanese Characters by Kenneth G. Henshall (Tuttle, 1988). It’s got some great introductory material and then a numbered list of kanji which would suggest the creation of a quite long memory palace/journey/songline.

Each numbered kanji entry has most of the relevant data and readings, and also provides some description of how the kanji relates or links to other words of similar shapes/meanings, along with a mnemonic hint to make placing it in one’s palace a bit easier. Below is an example of the sixth which will give an idea as to the overall structure.

Box number 6 with a Japanese kanji, its two readings, number of strokes and a written description of the word and how it relates to other words as well as a suggested mnemonic story that relates to some of the other words.

I haven’t gotten very far into it yet, but I’ve found an online app called WaniKani for Japanese that has some mnemonic suggestions and built-in spaced repetition that looks incredibly promising for taking small radicals and building them up into more easily remembered complex kanji.

I suspect that there are likely sources for Chinese similar to these couple of books and apps that may help provide a logical overall structuring which will make it easier to apply or adapt one’s favorite mnemotechniques to make the bulk vocabulary memorization easier.

The last thing I’ll mention that I’ve found good for practicing writing by hand as well as spaced repetition is a kanji notebook frequently used by native Japanese speaking children as they’re learning the levels of kanji in each grade. It’s non-obvious to the English speaker, and it took me a bit to puzzle out and track down a commercially printed one, even with a child in a classroom that was using a handmade version. The notebook (left to right and top to bottom) has sections for writing a big example of the learned kanji; spaces for the “Kun” and “On” readings; spaces for the number of strokes and the radical pieces; a section for writing out the stroke order as it builds up gradually; practice boxes for repeated practice of writing the whole kanji; examples of how to use the kanji in context; and finally space for the student to compose their own practice sentences using the new kanji.

A section of a Kanji notebook (in Japanese) frequently used by native Japanese speaking children as they’re learning the levels of kanji in each grade.

Regular use and practice with these can be quite helpful for moving toward mastery.

I also can’t emphasize enough that regularly and actively watching, listening, reading, and speaking in the target language with materials that one finds interesting is incredibly valuable. As an example, one of the first things I did for Welsh was to find streaming television and radio that I wanted to watch/listen to on a regular basis, which has been helpful. Regular motivation and encouragement is key.

I won’t go into them in depth and will leave them to speak for themselves, but two of the more intriguing videos I’ve watched on language acquisition which resonate with some of my experiences are:

Read Introducing The Endonym Project (chrisfinke.com)
An endonym is a name that people give to the area where they live. For example, you might live in a city that is officially named "Brooklyn Heights," but you and all of your neighbors call it "The Heights." This is an endonym. I've always wondered about how well-defined the geographic boundaries are for endonyms that aren't tied to specific locations.  For example, how far east do you have to go from Minnesota before the

A cool bit on geography and names. Can’t wait to play with it.

Liked a tweet (Twitter)

This is incredibly true. One needs to throw caution to the wind and focus on making as many mistakes as possible.