Global Language Networks

Recent research on global language networks has interesting relations to big history, complexity economics, and current politics.

Yesterday I ran across this nice little video explaining some recent research on global language networks. It’s not only interesting in its own right, but is a fantastic example of science communication as well.

I’m interested in some of the information-theoretic aspects of this, as well as its relation to corpus linguistics. I’m also curious whether one could build worthwhile datasets like this for the ancient world (cross reference some of the sources I touch on in relation to the Dickinson College Commentaries within Latin Pedagogy and the Digital Humanities) to see what influences different language cultures have had on each other. Perhaps the historical record could help to validate some of the predictions made about the future?

The paper “Global distribution and drivers of language extinction risk” indicates that of all the variables tested, economic growth was most strongly linked to language loss.

This research also has an interesting relation to the concept of “Collective Learning” within a Big History framework via David Christian, Fred Spier, et al. I’m curious to revisit my hypothesis, informed in part by the work of Jared Diamond: collective learning has potentially been growing at the expense of a shrinking body of diverse language.

Some of the discussion in the video reminds me of the work Stuart Kauffman lays out in At Home in the Universe: The Search for the Laws of Self-Organization and Complexity (Oxford, 1995), particularly chapter 3, in which Kauffman discusses the networks of life. The analogy between those networks and the networks of language here suggests to me that some of Cesar Hidalgo’s recent work in Why Information Grows: The Evolution of Order, From Atoms to Economies (Basic Books, 2015) is even more interesting in helping to show the true value of links between people and firms (information sources which he measures as personbytes and firmbytes) within economies.

Finally, I can’t help but think about how this research may help to temper some of the xenophobic discussion in American political life, particularly fears about Mexican immigration and about the position of China in the world economy.

Those intrigued by the video may find the website set up by the researchers very interesting. It contains links to the full paper as well as visualizations and links to the data used.

Abstract

Languages vary enormously in global importance because of historical, demographic, political, and technological forces. However, beyond simple measures of population and economic power, there has been no rigorous quantitative way to define the global influence of languages. Here we use the structure of the networks connecting multilingual speakers and translated texts, as expressed in book translations, multiple language editions of Wikipedia, and Twitter, to provide a concept of language importance that goes beyond simple economic or demographic measures. We find that the structure of these three global language networks (GLNs) is centered on English as a global hub and around a handful of intermediate hub languages, which include Spanish, German, French, Russian, Portuguese, and Chinese. We validate the measure of a language’s centrality in the three GLNs by showing that it exhibits a strong correlation with two independent measures of the number of famous people born in the countries associated with that language. These results suggest that the position of a language in the GLN contributes to the visibility of its speakers and the global popularity of the cultural content they produce.

Citation: Ronen S, Gonçalves B, Hu KZ, Vespignani A, Pinker S, Hidalgo CA. Links that speak: the global language network and its association with global fame. Proceedings of the National Academy of Sciences (PNAS) (2014). doi: 10.1073/pnas.1410931111
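For anyone who wants to play with the idea of a language’s “centrality” in a network, here is a minimal sketch. It assumes eigenvector centrality as the measure and computes it by simple power iteration. The languages are real, but every edge weight is invented for illustration and is not taken from the researchers’ data; the function names are my own as well.

```python
# Toy illustration only: every edge weight below is invented for this sketch
# and is NOT taken from the paper's book, Wikipedia, or Twitter data.
EDGES = {
    ("English", "Spanish"): 9.0,
    ("English", "French"): 8.0,
    ("English", "German"): 8.0,
    ("English", "Russian"): 6.0,
    ("English", "Dutch"): 5.0,
    ("English", "Arabic"): 3.0,
    ("Spanish", "Portuguese"): 4.0,
    ("French", "Arabic"): 2.0,
    ("German", "Dutch"): 3.0,
}


def eigenvector_centrality(edges, iterations=200, tol=1e-10):
    """Power iteration on an undirected, weighted graph given as {(a, b): weight}."""
    nodes = sorted({name for pair in edges for name in pair})
    neighbors = {name: {} for name in nodes}
    for (a, b), weight in edges.items():
        neighbors[a][b] = weight
        neighbors[b][a] = weight

    score = {name: 1.0 for name in nodes}
    for _ in range(iterations):
        # Adding score[n] itself (iterating on A + I) keeps the iteration from
        # oscillating on bipartite graphs; the eigenvectors are unchanged.
        new = {
            n: score[n] + sum(w * score[m] for m, w in neighbors[n].items())
            for n in nodes
        }
        largest = max(new.values())
        new = {n: value / largest for n, value in new.items()}  # top score = 1.0
        converged = max(abs(new[n] - score[n]) for n in nodes) < tol
        score = new
        if converged:
            break
    return score


def rank_languages(edges):
    """Return (language, centrality) pairs, most central first."""
    scores = eigenvector_centrality(edges)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_languages(EDGES)` puts English first on this toy graph simply because I wired it to everything else. The interesting part with real data is which languages land in the intermediate tier.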

Related posts:

“A language like Dutch — spoken by 27 million people — can be a disproportionately large conduit, compared with a language like Arabic, which has a whopping 530 million native and second-language speakers,” Science reports. “This is because the Dutch are very multilingual and very online.”


Collective learning has potentially been growing at the expense of a shrinking body of diverse language

Big History may indicate why we're losing diversity in the number of languages on Earth.

Yesterday, I saw an interesting linguistic exercise:

Short activity to show how flexible our language is and how difficult collective learning would have been for our non sapiens ancestors.

Step 1: As a class, choose 200 random words. (I had 15 kids choose 14 words each)

Step 2: Answer the following questions using only the words listed:

  1. How should we try to kill that mammoth?
  2. Explain why you should marry me.
  3. Give directions for a simple task.
  4. Come up with a plan to improve our cave.
  5. Describe a physical landscape.
  6. Come up with your own question!
Chris Scaturo
on February 3 at 8:44am in Yammer Group on Big History: Unit 6 – Early Humans Group

I have to imagine that once the conceptualization of language and some basic grammar existed, word generation was a much more common thing than it is now. It’s only been since the time of Noah Webster that humans have been actively standardizing things like spelling. If we can use Papua New Guinea as a model of pre-agrarian society and consider that almost 12% of extant languages on the Earth are spoken in an area about the size of Texas (and with about 1/5th the population of Texas too), then modern societies are actually severely limiting language (creation, growth, diversity, creativity, etc.) [cross reference: A World of Languages – and How Many Speak Them (Infographic)]

Consider that languages are currently going extinct at a rate of about one every 14 weeks, which puts us on a course to lose about half of the 7,100 languages on the planet right now before the end of the century. Collective learning has potentially been growing at the expense of a shrinking body of diverse language! In the paper “Global distribution and drivers of language extinction risk” the authors indicate that of all the variables tested, economic growth was most strongly linked to language loss.

To help put this exercise into perspective, we can look at the corpus of extant written Latin (a technically dead language):

“It is a truly impressive fact that, simply by knowing that if one can memorize and master about 250 words in Latin, it will allow them to read and understand 50% of most written Latin. Further, knowledge of 1,500 Latin words will put one at the 80% level of vocabulary mastery for most texts. Mastering even a very small list of vocabulary allows one to read a large variety of texts very comfortably.”

BoffoSocko.com
with data from Dickinson College Commentaries

These numbers become even smaller when considering ancient Greek texts.
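For the curious, coverage figures like these can be approximated for any text with a few lines of code. This is only a rough sketch under my own assumptions: it counts surface forms rather than dictionary headwords, so a heavily inflected language like Latin would need lemmatization first, which it skips. The function names and the sample phrase are made up for illustration.

```python
import re
from collections import Counter

# Matches runs of letters (including accented ones), skipping digits and underscores.
WORD = re.compile(r"[^\W\d_]+")


def coverage_of_top_words(text, top_n):
    """Fraction of all word tokens in `text` covered by its `top_n` most frequent words."""
    tokens = WORD.findall(text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(tokens)


def words_needed_for(text, target):
    """Smallest number of distinct words whose tokens cover at least `target` of `text`."""
    tokens = WORD.findall(text.lower())
    counts = Counter(tokens)
    covered = 0
    for rank, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered / len(tokens) >= target:
            return rank
    return len(counts)


# An invented phrase, not a quotation from any real Latin text.
SAMPLE = "et in terra pax et in caelo gloria et lux in tenebris"
```

In the invented sample, the two words “et” and “in” alone account for half of the twelve tokens, which is the same effect at a miniature scale: a handful of very common words does most of the work.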

Another interesting measurement is the vocabulary of a modern 2-year-old, who typically knows 50-75 words, while a 4-year-old knows 250-500 words, which is about the level of the exercise.

As a contrast, consider the message in this TED Youth Talk from last year by Erin McKean, which students should be able to relate to:

And of course, there’s the dog Chaser, who 60 Minutes recently reported has a vocabulary of over 1,000 words. (Are we now destroying variants of “dog language” for English too?!)

Hopefully the evolutionary cost of losing so many languages will be more than balanced out by the power of collective learning in the long run.
