👓 Humane Ingenuity 9: GPT-2 and You | Dan Cohen | Buttondown

Read Humane Ingenuity 9: GPT-2 and You by Dan Cohen (buttondown.email)
This newsletter has not been written by a GPT-2 text generator, but you can now find a lot of artificially created text that has been.

For those not familiar with GPT-2, it is, according to its creators OpenAI (a socially conscious artificial intelligence lab overseen by a nonprofit entity), “a large-scale unsupervised language model which generates coherent paragraphs of text.” Think of it as a computer that has consumed so much text that it’s very good at figuring out which words are likely to follow other words, and when strung together, these words create fairly coherent sentences and paragraphs that are plausible continuations of any initial (or “seed”) text.

This isn’t a very difficult problem, and its underpinnings are well laid out by John R. Pierce in *[An Introduction to Information Theory: Symbols, Signals and Noise](https://amzn.to/32JWDSn)*. In it he has a lot of interesting tidbits about language and structure from an engineering perspective, including the reason why crossword puzzles work.
November 13, 2019 at 08:33AM
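
The "which words are likely to follow other words" idea in the excerpt above can be illustrated with a toy bigram model. This is only a rough sketch: GPT-2 itself is a large neural network, not a lookup table like this one, and the corpus, function names, and parameters below are made up for illustration.

```python
import random
from collections import defaultdict


def build_bigram_model(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, seed_word, length=12, rng=None):
    """Continue a seed word by repeatedly sampling an observed follower."""
    rng = rng or random.Random()
    out = [seed_word]
    for _ in range(length - 1):
        options = followers.get(out[-1])
        if not options:  # dead end: this word never had a successor
            break
        out.append(rng.choice(options))
    return out


# Hypothetical toy corpus; a real model would train on vastly more text.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = build_bigram_model(corpus)
sample = generate(model, "the", length=8, rng=random.Random(0))
print(" ".join(sample))
```

Because repeated followers appear more than once in each list, sampling uniformly from the list already favors the more frequent continuations.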

The most interesting examples have been the weird ones (cf. HI7), where the language model has been trained on narrower, more colorful sets of texts, and then sparked with creative prompts. Archaeologist Shawn Graham, who is working on a book I’d like to preorder right now, An Enchantment of Digital Archaeology: Raising the Dead with Agent Based Models, Archaeogaming, and Artificial Intelligence, fed GPT-2 the works of the English Egyptologist Flinders Petrie (1853-1942) and then resurrected him at the command line for a conversation about his work. Robin Sloan had similar good fun this summer with a focus on fantasy quests, and helpfully documented how he did it.

Circle back around and read this when it comes out.

Similarly, these other references should be an interesting read as well.
November 13, 2019 at 08:36AM

From this perspective, GPT-2 says less about artificial intelligence and more about how human intelligence is constantly looking for, and accepting of, stereotypical narrative genres, and how our mind always wants to make sense of any text it encounters, no matter how odd. Reflecting on that process can be the source of helpful self-awareness—about our past and present views and inclinations—and also, some significant enjoyment as our minds spin stories well beyond the thrown-together words on a page or screen.

And it’s not just happening with text; it also happens with speech, as I’ve written before: Complexity isn’t a Vice: 10 Word Answers and Doubletalk in Election 2016. In fact, in that case, looking at transcripts actually helps to reveal that the emperor had no clothes, because there’s so much missing from the speech that the text doesn’t have enough space to fill in the gaps the way the live speech did.
November 13, 2019 at 08:43AM

🔖 GLTR (glitter) v0.5

Bookmarked GLTR from MIT-IBM Watson AI Lab and HarvardNLP (gltr.io)
This demo enables forensic inspection of the visual footprint of a language model on input text to detect whether a text could be real or fake.

🔖 The En-Gedi Scroll (2016) | Internet Archive

Bookmarked The En-Gedi Scroll (2016) (Internet Archive)

The data and virtual unwrapping results on the En-Gedi scroll.

See the following papers for more information:
Seales, William Brent, et al. "From damage to discovery via virtual unwrapping: Reading the scroll from En-Gedi." Science Advances 2.9 (2016): e1601247. (Web Article)
Segal, Michael, et al. "An Early Leviticus Scroll From En-Gedi: Preliminary Publication." Textus 26 (2016): 1-30. (PDF)

🔖 Digital Restoration Initiative

Bookmarked Digital Restoration Initiative (Digital Restoration Initiative)
The written word has been used throughout history to chronicle and contemplate the human experience, but many valuable texts are “lost” to us due to damage. The words of these documents and the knowledge they seek to impart are locked behind the destruction and decay wrought by time and injury, while the physical manuscripts themselves form an “invisible library” of sorts — closeted away on dark shelves, well-protected but prevented from proffering knowledge and encouraging inquiry. For more than 20 years, Dr. Seales has been working to create and use hi-tech, non-invasive tools to rescue these lost texts from the brink of oblivion and restore them to humanity. We call this innovative process “virtual unwrapping.”

h/t Dan Cohen newsletter #1

📺 EDUCE: Imaging the Herculaneum Scrolls | YouTube

Watched Imaging the Herculaneum Scrolls from YouTube
The eruption of Mt. Vesuvius covered the city of Herculaneum in twenty meters of lava, simultaneously destroying the Herculaneum scrolls through carbonization and preserving the scrolls by protecting them from the elements. Unwrapping the scrolls would damage them, but researchers are anxious to read the texts. Researchers from the University of Kentucky collaborated with the Institut de France and SkyScan to digitally unwrap and preserve the scrolls. To learn more about the EDUCE project, go to http://cs.uky.edu/dri.

They haven’t finished the last mile, but having high resolution scans of the objects is great. I’m not sure why they’re handling these items manually when they could very likely be secured in better external casings and still imaged the same way.

👓 What can Schrödinger’s cat say about 3D printers on Mars? | Aeon | Aeon Essays

Read What can Schrödinger’s cat say about 3D printers on Mars? by Michael Lachmann and Sara Walker (Aeon | Aeon Essays)
A cat is alive, a sofa is not: that much we know. But a sofa is also part of life. Information theory tells us why

A nice little essay in my area, but I’m not sure there’s anything new in it for me. It is nice that they’re trying to break some of the problem down into smaller components before building it back up into something else. Reframing things can always be helpful. Here, in particular, they’re reframing the definitions of life and alive.

🔖 Origins Of Life | Complexity Explorer

Bookmarked Origins Of Life (complexityexplorer.org)

About the Course:

This course aims to push the field of Origins of Life research forward by bringing new and synthetic thinking to the question of how life emerged from an abiotic world.

This course begins by examining the chemical, geological, physical, and biological principles that give us insight into origins of life research. We look at the chemical and geological environment of early Earth from the perspective of likely environments for life to originate.

Taking a look at modern life we ask what it can tell us about the origin of life by winding the clock backwards. We explore what elements of modern life are absolutely essential for life, and ask what is arbitrary? We ponder how life arose from the huge chemical space and what this early 'living chemistry' may have looked like.

We examine phenomena that may seem particularly lifelike, but are in fact likely to arise given physical dynamics alone. We analyze what physical concepts and laws bound the possibilities for life and its formation.

Insights gained from modern evolutionary theory will be applied to proto-life. Once life emerges, we consider how living systems impact the geosphere and evolve complexity. 

The study of Origins of Life is highly interdisciplinary - touching on concepts and principles from earth science, biology, chemistry, and physics.  With this we hope that the course can bring students interested in a broad range of fields to explore how life originated. 

The course will make use of basic algebra, chemistry, and biology but potentially difficult topics will be reviewed, and help is available in the course discussion forum and instructor email. There will be pointers to additional resources for those who want to dig deeper.

This course is Complexity Explorer's first Frontiers Course.  A Frontiers Course gives students a tour of an active interdisciplinary research area. The goals of a Frontiers Course are to share the excitement and uncertainty of a scientific area, inspire curiosity, and possibly draw new people into the research community who can help this research area take shape!

I’m totally in for this!

Hat tip for the reminder to:

Replied to a tweet by John Stewart (Twitter)

I bookmarked a great post by Jim Luke (@econproph) a few weeks ago on scale and scope. I suspect that tech’s effect on education is heavily (if not permanently) scale-limited, but scope may be a better avenue going forward.

I also suspect that Cesar Hidalgo’s text Why Information Grows: The Evolution of Order, from Atoms to Economies may provide a strong clue with some details. To some extent I think we’ve generally reached the Shannon limit for how much information we can pour into a single brain. We now need to rely on distributed and parallel networking among people to proceed forward.

📑 Solomon Golomb (1932–2016) | Stephen Wolfram Blog

Annotated Solomon Golomb (1932–2016) by Stephen Wolfram (blog.stephenwolfram.com)

As it happens, he’d already done some work on coding theory—in the area of biology. The digital nature of DNA had been discovered by Jim Watson and Francis Crick in 1953, but it wasn’t yet clear just how sequences of the four possible base pairs encoded the 20 amino acids. In 1956, Max Delbrück—Jim Watson’s former postdoc advisor at Caltech—asked around at JPL if anyone could figure it out. Sol and two colleagues analyzed an idea of Francis Crick’s and came up with “comma-free codes” in which overlapping triples of base pairs could encode amino acids. The analysis showed that exactly 20 amino acids could be encoded this way. It seemed like an amazing explanation of what was seen—but unfortunately it isn’t how biology actually works (biology uses a more straightforward encoding, where some of the 64 possible triples just don’t represent anything).  

I recall talking to Sol about this very thing when I sat in on a course he taught at USC on combinatorics. He gave me his paper on it and a few related issues, as I was very interested at the time in the applications of information theory to biology.

I’m glad I managed to sit in on the class and still have the audio recordings and notes. While I can’t say that Newton taught me calculus, I can say I learned combinatorics from Golomb.
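
The "exactly 20" figure in the quoted passage can be motivated by a short counting argument. A comma-free code can hold at most one word from each class of cyclic rotations, and no word like AAA that is a rotation of itself. With four bases and triples, that leaves (4^3 - 4)/3 = 20 classes. The sketch below counts those classes and includes a simple comma-freeness check. The function names are invented for illustration, and the code only demonstrates the upper bound; it does not construct a 20-word code.

```python
from itertools import product

BASES = "ACGT"


def rotations(word):
    """Return the set of all cyclic rotations of a word."""
    return {word[i:] + word[:i] for i in range(len(word))}


def cyclic_classes(alphabet=BASES, k=3):
    """Group length-k words into rotation classes, dropping periodic words."""
    seen = set()
    classes = []
    for letters in product(alphabet, repeat=k):
        word = "".join(letters)
        if word in seen:
            continue
        rots = rotations(word)
        seen |= rots
        if len(rots) == k:  # periodic words such as "AAA" have fewer rotations
            classes.append(sorted(rots))
    return classes


def is_comma_free(code):
    """True if no codeword appears straddling two adjacent codewords.

    Assumes every word in the code has the same length.
    """
    code = set(code)
    for x in code:
        for y in code:
            joined = x + y
            k = len(x)
            for shift in range(1, k):
                if joined[shift:shift + k] in code:
                    return False
    return True


classes = cyclic_classes()
print(len(classes))                      # number of usable rotation classes
print(is_comma_free({"ACG"}))            # a single non-periodic word is fine
print(is_comma_free({"ACG", "CGA"}))     # two rotations of one word clash
```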

👓 Solomon Golomb (1932–2016) | Stephen Wolfram

Read Solomon Golomb (1932–2016) by Stephen Wolfram (blog.stephenwolfram.com)

The Most-Used Mathematical Algorithm Idea in History

An octillion. A billion billion billion. That’s a fairly conservative estimate of the number of times a cellphone or other device somewhere in the world has generated a bit using a maximum-length linear-feedback shift register sequence. It’s probably the single most-used mathematical algorithm idea in history. And the main originator of this idea was Solomon Golomb, who died on May 1—and whom I knew for 35 years.

Solomon Golomb’s classic book Shift Register Sequences, published in 1967—based on his work in the 1950s—went out of print long ago. But its content lives on in pretty much every modern communications system. Read the specifications for 3G, LTE, Wi-Fi, Bluetooth, or for that matter GPS, and you’ll find mentions of polynomials that determine the shift register sequences these systems use to encode the data they send. Solomon Golomb is the person who figured out how to construct all these polynomials.

A fantastic and pretty comprehensive obit for Sol. He did leave out more of Sol’s youth, as well as his cross-town chess rivalry with Basil Gordon when they both lived in Baltimore, well before they lived across town from each other again in Los Angeles.

Many of the fantastical-seeming stories here, as well as Sol’s personality, read very true to me with respect to the man I knew for almost two decades.
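
For readers who haven’t met them, the maximum-length shift register sequences mentioned in the excerpt can be sketched in a few lines. This is a toy 4-bit Fibonacci-style register, not anything taken from a real specification. The function name and the tap choices are illustrative assumptions; under one common convention the shifts (0, 1) correspond to the polynomial x^4 + x^3 + 1.

```python
def lfsr_cycle(shifts, nbits, seed=1):
    """Return the register states visited before the seed state recurs.

    shifts: bit positions (0 = lowest bit) XORed together to form the
            feedback bit that is shifted in at the top of the register.
    """
    mask = (1 << nbits) - 1
    start = seed & mask
    if start == 0:
        raise ValueError("the all-zero state never changes; use a nonzero seed")
    states = []
    state = start
    for _ in range(1 << nbits):  # hard cap so a bad tap choice cannot loop forever
        states.append(state)
        feedback = 0
        for s in shifts:
            feedback ^= (state >> s) & 1
        state = (state >> 1) | (feedback << (nbits - 1))
        if state == start:
            break
    return states


# Taps that give a maximum-length cycle: every nonzero 4-bit state appears once.
max_len = lfsr_cycle((0, 1), nbits=4)
bits = [s & 1 for s in max_len]  # output bit = lowest bit of each state
print(len(max_len), bits)

# A different tap choice falls into a shorter cycle.
short = lfsr_cycle((0, 2), nbits=4)
print(len(short))
```

The choice of taps is what the "polynomials" in the excerpt pin down: some choices step through every nonzero state before repeating, while others get stuck in a much shorter loop.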

📑 Solomon Golomb (1932–2016) | Stephen Wolfram Blog

Annotated Solomon Golomb (1932–2016) by Stephen Wolfram (blog.stephenwolfram.com)
in June 1955 he wrote his final report, “Sequences with Randomness Properties”—which would basically become the foundational document of the theory of shift register sequences.  

❤️ lpachter tweeted I once asked Robert McEliece whether he would mentor me.

Liked a tweet by Lior Pachter (Twitter)

👓 Robert J. McEliece, 1942–2019 | Caltech

Read Robert J. McEliece, 1942–2019 (caltech.edu)
Alumnus and engineering faculty member Robert J. McEliece has passed away.

May is apparently the month that many of the greats in information theory pass away. I was reminded of Sol Golomb’s passing in May 2016 the other day.

I didn’t know him well, but I met Dr. McEliece a handful of times, and at least a few of the books in my personal information theory library are hand-me-down copies from his personal library. He’ll definitely be missed.

Three open books piled on top of each other, with McEliece's signature and dates in the top right-hand corner of the first page and Caltech bookstore price stamps in them as well.