🔖 Decoding Anagrammed Texts Written in an Unknown Language and Script

Bookmarked Decoding Anagrammed Texts Written in an Unknown Language and Script by Bradley Hauer, Grzegorz Kondrak (Transactions of the Association for Computational Linguistics)
Algorithmic decipherment is a prime example of a truly unsupervised problem. The first step in the decipherment process is the identification of the encrypted language. We propose three methods for determining the source language of a document enciphered with a monoalphabetic substitution cipher. The best method achieves 97% accuracy on 380 languages. We then present an approach to decoding anagrammed substitution ciphers, in which the letters within words have been arbitrarily transposed. It obtains the average decryption word accuracy of 93% on a set of 50 ciphertexts in 5 languages. Finally, we report the results on the Voynich manuscript, an unsolved fifteenth century cipher, which suggest Hebrew as the language of the document.
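To make the setting in the abstract concrete, here is a minimal sketch (my own illustration, not the authors' method) of what an anagrammed monoalphabetic substitution cipher does to a text, and of two statistics that survive it: the sorted letter-count profile and the word lengths. The function names and the sample sentence are hypothetical.

```python
import random
import string
from collections import Counter

def substitute(text, key):
    """Apply a monoalphabetic substitution: each letter maps to one fixed letter."""
    return "".join(key.get(ch, ch) for ch in text)

def anagram_words(text, rng):
    """Shuffle the letters inside each word, keeping word boundaries."""
    shuffled = []
    for word in text.split():
        letters = list(word)
        rng.shuffle(letters)
        shuffled.append("".join(letters))
    return " ".join(shuffled)

def frequency_profile(text):
    """Sorted letter counts, ignoring which letter is which."""
    counts = Counter(ch for ch in text if ch.isalpha())
    return sorted(counts.values(), reverse=True)

# Build a random substitution key (a permutation of the alphabet).
rng = random.Random(0)
alphabet = string.ascii_lowercase
shuffled_alphabet = list(alphabet)
rng.shuffle(shuffled_alphabet)
key = dict(zip(alphabet, shuffled_alphabet))

plaintext = "the quick brown fox jumps over the lazy dog"
ciphertext = anagram_words(substitute(plaintext, key), rng)

print(ciphertext)
# The letter identities and their order are gone, but the count profile is intact.
print(frequency_profile(plaintext) == frequency_profile(ciphertext))
```

Invariants of this kind are presumably what any language-identification step has to work from, since the cipher hides which letter is which and the anagramming hides their order within each word.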

Aside: It’s been ages since I’ve seen someone with Refbacks listed on their site!

👓 Sexual harassment allegations roil Princeton University | WHYY

Read Sexual harassment allegations roil Princeton University by Avi Wolfman-Arent (WHYY)
Another high-profile instance of sexual harassment has rocked a major institution — this time Princeton University in New Jersey. And students say administrators didn’t act transparently or strongly enough when disciplining the alleged perpetrator, a decorated professor.

Once you start reaching Sergio Verdu’s age, and particularly with his achievements, your value to the University becomes more geared toward service. How much service can a professor do with an albatross like this hanging around their neck?

It would be nice if universities were required to register offenders like this so that applicants to programs would be aware of them prior to applying, a sort of Megan’s Law for the professoriate. Naturally they don’t do this because it goes against their interests, but by the same token this is how a lot of issues run out of control within their sports programs as well. If someone did create such a website, I imagine the chilling effects on colleges and universities would be such that they might change their tunes about how these cases are handled. Recent cases like Michigan State’s athletics problem, the issues with USC’s medical school dean, and Christian Ott at Caltech come immediately to mind, but I’m sure there must be hundreds if not thousands of others.

Maybe we need a mashup site that’s a cross between RateMyProfessors.com and California’s Megan’s Law site, but which specifically targets universities?

Fortunately even given Sergio’s accomplishments and profile, it will probably take forever for web searches for his name to not surface the story within the top couple of links, but this is sad consolation, particularly in a field like Information Theory which is heavily underrepresented already.

👓 Read Professor Verdu’s emails to student where he invites her over to watch explicit film before sexually harassing her | The Tab

Read Read Professor Verdu’s emails to student where he invites her over to watch explicit film before sexually harassing her (Princeton University)
‘P.S. Please call me Sergio ☺️’

I was just wondering why Sergio Verdu was so quiet on Twitter. Then I wondered why his Twitter account had disappeared.

Now I know the sad and painfully disappointing answer.

👓 Mysterious 15th century manuscript finally decoded 600 years later | The Independent

Read Code in the 'world's most mysterious book' deciphered by AI (The Independent)
Artificial intelligence has allowed scientists to make significant progress in cracking a mysterious ancient text, the meaning of which has eluded scholars for centuries.

Interesting news if it’s really true! Though I do feel a bit sad as there are some methods I had wanted to try on this longstanding puzzle, but never had the time to play with.

🔖 [1801.06022] Reconstruction Codes for DNA Sequences with Uniform Tandem-Duplication Errors | arXiv

Bookmarked Reconstruction Codes for DNA Sequences with Uniform Tandem-Duplication Errors by Yonatan Yehezkeally and Moshe Schwartz (arxiv.org)
DNA as a data storage medium has several advantages, including far greater data density compared to electronic media. We propose that schemes for data storage in the DNA of living organisms may benefit from studying the reconstruction problem, which is applicable whenever multiple reads of noisy data are available. This strategy is uniquely suited to the medium, which inherently replicates stored data in multiple distinct ways, caused by mutations. We consider noise introduced solely by uniform tandem-duplication, and utilize the relation to constant-weight integer codes in the Manhattan metric. By bounding the intersection of the cross-polytope with hyperplanes, we prove the existence of reconstruction codes with greater capacity than known error-correcting codes, which we can determine analytically for any set of parameters.
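As a rough illustration of the noise model named in the abstract, the sketch below treats a tandem duplication as copying a length-k substring and inserting the copy immediately after the original, with "uniform" read as meaning every duplication has the same length k. That reading, the function names, and the sample sequence are my assumptions, not taken from the paper.

```python
def tandem_duplicate(seq, start, k):
    """Return seq with the k-length substring at `start` repeated right after itself."""
    if k <= 0 or start < 0 or start + k > len(seq):
        raise ValueError("duplication window must lie inside the sequence")
    block = seq[start:start + k]
    return seq[:start + k] + block + seq[start + k:]

def noisy_reads(seq, k):
    """All distinct sequences reachable from seq by one tandem duplication of length k."""
    return {tandem_duplicate(seq, i, k) for i in range(len(seq) - k + 1)}

original = "ACGTAC"
print(tandem_duplicate(original, 1, 2))   # "CG" is duplicated in place: ACGCGTAC
print(sorted(noisy_reads(original, 2)))
```

The reconstruction problem then asks how to recover the original from several such noisy reads rather than from a single one.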
Syndicated copies to:
Reposted “My ten hour white noise video now has five copyright claims! :)” by Sebastian Tomczak (Twitter)

Information Theory and signal processing FTW!

(Aside: This is a great example of how people really don’t understand our copyright system or science in general.)

🔖 Quantum Information: What Is It All About? by Robert B. Griffiths | Entropy

Bookmarked Quantum Information: What Is It All About? by Robert B. Griffiths (MDPI (Entropy))
This paper answers Bell’s question: What does quantum information refer to? It is about quantum properties represented by subspaces of the quantum Hilbert space, or their projectors, to which standard (Kolmogorov) probabilities can be assigned by using a projective decomposition of the identity (PDI or framework) as a quantum sample space. The single framework rule of consistent histories prevents paradoxes or contradictions. When only one framework is employed, classical (Shannon) information theory can be imported unchanged into the quantum domain. A particular case is the macroscopic world of classical physics whose quantum description needs only a single quasiclassical framework. Nontrivial issues unique to quantum information, those with no classical analog, arise when aspects of two or more incompatible frameworks are compared.

Entropy 2017, 19(12), 645; doi:10.3390/e19120645

This article belongs to the Special Issue Quantum Information and Foundations

View Full-Text | Download PDF [211 KB, uploaded 29 November 2017]

Energy and Matter at the Origin of Life by Nick Lane | Santa Fe Institute

Bookmarked Energy and Matter at the Origin of Life by Nick Lane (Santa Fe Institute Community Event (YouTube))
All living things are made of cells, and all cells are powered by electrochemical charges across thin lipid membranes — the ‘proton motive force.’ We know how these electrical charges are generated by protein machines at virtually atomic resolution, but we know very little about how membrane bioenergetics first arose. By tracking back cellular evolution to the last universal common ancestor and beyond, scientist Nick Lane argues that geologically sustained electrochemical charges across semiconducting barriers were central to both energy flow and the formation of new organic matter — growth — at the very origin of life. Dr. Lane is a professor of evolutionary biochemistry in the Department of Genetics, Evolution and Environment at University College London. His research focuses on how energy flow constrains evolution from the origin of life to the traits of complex multicellular organisms. He is a co-director of the new Centre for Life’s Origins and Evolution (CLOE) at UCL, and author of four celebrated books on life’s origins and evolution. His work has been recognized by the Biochemical Society Award in 2015 and the Royal Society Michael Faraday Prize in 2016.

h/t Santa Fe Institute

👓 Ergodic | John D. Cook

Read Ergodic by John D. Cook (John D. Cook Consulting)
Roughly speaking, an ergodic system is one that mixes well. You get the same result whether you average its values over time or over space. This morning I ran across the etymology of the word ergodic.

I’d read this before, but had a nice reminder about it this morning.
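A quick numerical sketch of the "average over time equals average over space" idea, using a standard textbook ergodic system (rotation of the unit interval by an irrational amount) rather than anything from Cook's post. The observable and the parameter choices are arbitrary illustrations.

```python
import math

def time_average(f, x0, alpha, steps):
    """Average f along the orbit x -> (x + alpha) mod 1 starting at x0."""
    x, total = x0, 0.0
    for _ in range(steps):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / steps

def space_average(f, samples):
    """Midpoint-rule average of f over the unit interval."""
    return sum(f((i + 0.5) / samples) for i in range(samples)) / samples

alpha = math.sqrt(2) - 1      # irrational rotation amount
f = lambda x: x * x           # observable; its exact mean on [0, 1) is 1/3

print(time_average(f, 0.1, alpha, 100_000))
print(space_average(f, 100_000))
```

Both numbers should land close to 1/3. With a rational alpha the orbit only visits finitely many points, and the time average generally depends on the starting point.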

🔖 Upcoming Special Issue “Information Theory in Neuroscience” | Entropy (MDPI)

Bookmarked Special Issue "Information Theory in Neuroscience" (Entropy | MDPI)
As the ultimate information processing device, the brain naturally lends itself to be studied with information theory. Application of information theory to neuroscience has spurred the development of principled theories of brain function, has led to advances in the study of consciousness, and to the development of analytical techniques to crack the neural code, that is to unveil the language used by neurons to encode and process information. In particular, advances in experimental techniques enabling precise recording and manipulation of neural activity on a large scale now enable for the first time the precise formulation and the quantitative test of hypotheses about how the brain encodes and transmits across areas the information used for specific functions. This Special Issue emphasizes contributions on novel approaches in neuroscience using information theory, and on the development of new information theoretic results inspired by problems in neuroscience. Research work at the interface of neuroscience, Information Theory and other disciplines is also welcome. A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Information Theory". Deadline for manuscript submissions: 1 December 2017

👓 How the NSA identified Satoshi Nakamoto | Alexander Muse

Read How the NSA identified Satoshi Nakamoto by Alexander Muse (Alexander Muse | Medium)
The ‘creator’ of Bitcoin, Satoshi Nakamoto, is the world’s most elusive billionaire. Very few people outside of the Department of Homeland Security know Satoshi’s real name. In fact, DHS will not publicly confirm that even THEY know the billionaire’s identity. Satoshi has taken great care to keep his identity secret employing the latest encryption and obfuscation methods in his communications. Despite these efforts (according to my source at the DHS) Satoshi Nakamoto gave investigators the only tool they needed to find him — his own words. Using stylometry one is able to compare texts to determine authorship of a particular work. Throughout the years Satoshi wrote thousands of posts and emails, most of which are publicly available. According to my source, the NSA was able to use the ‘writer invariant’ method of stylometry to compare Satoshi’s ‘known’ writings with trillions of writing samples from people across the globe. By taking Satoshi’s texts and finding the 50 most common words, the NSA was able to break down his text into 5,000 word chunks and analyse each to find the frequency of those 50 words. This would result in a unique 50-number identifier for each chunk. The NSA then placed each of these numbers into a 50-dimensional space and flattened them into a plane using principal components analysis. The result is a ‘fingerprint’ for anything written by Satoshi that could easily be compared to any other writing.

The article itself is dubious and unsourced and borders a bit on conspiracy theory, but the underlying concept of stylometry and its implications for privacy will be interesting to many. Naturally, not much of it is new.
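For the curious, the "writer invariant" recipe the article describes (most common words, fixed-size chunks, one frequency vector per chunk) is easy to sketch. The version below is a toy illustration under my own assumptions: the tokenizer, function names, and sample text are hypothetical, the vocabulary and chunk sizes are shrunk from 50 words and 5,000-word chunks so it runs on a few sentences, and the principal components step is left out.

```python
from collections import Counter

def tokenize(text):
    """Lowercase the text and split on whitespace, trimming surrounding punctuation."""
    words = (w.strip(".,;:!?\"'()[]") for w in text.lower().split())
    return [w for w in words if w]

def top_words(tokens, k):
    """The k most common words in the reference corpus."""
    return [word for word, _ in Counter(tokens).most_common(k)]

def chunk_vectors(tokens, vocabulary, chunk_size):
    """One relative-frequency vector per full chunk of chunk_size tokens."""
    vectors = []
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        counts = Counter(tokens[start:start + chunk_size])
        vectors.append([counts[w] / chunk_size for w in vocabulary])
    return vectors

# Toy corpus: a 13-word sentence repeated four times.
sample = "the cat sat on the mat and the dog sat on the log " * 4
tokens = tokenize(sample)
vocab = top_words(tokens, 3)                      # the article uses 50
vectors = chunk_vectors(tokens, vocab, chunk_size=13)   # the article uses 5,000

print(vocab)
print(vectors[0])
```

Each vector is one point in a space with as many dimensions as there are vocabulary words. Comparing authors then comes down to comparing clouds of such points, which is where a projection like PCA would come in.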

🔖 Spontaneous fine-tuning to environment in many-species chemical reaction networks | PNAS

Bookmarked Spontaneous fine-tuning to environment in many-species chemical reaction networks (Proceedings of the National Academy of Sciences)
Significance A qualitatively more diverse range of possible behaviors emerge in many-particle systems once external drives are allowed to push the system far from equilibrium; nonetheless, general thermodynamic principles governing nonequilibrium pattern formation and self-assembly have remained elusive, despite intense interest from researchers across disciplines. Here, we use the example of a randomly wired driven chemical reaction network to identify a key thermodynamic feature of a complex, driven system that characterizes the “specialness” of its dynamical attractor behavior. We show that the network’s fixed points are biased toward the extremization of external forcing, causing them to become kinetically stabilized in rare corners of chemical space that are either atypically weakly or strongly coupled to external environmental drives. Abstract A chemical mixture that continually absorbs work from its environment may exhibit steady-state chemical concentrations that deviate from their equilibrium values. Such behavior is particularly interesting in a scenario where the environmental work sources are relatively difficult to access, so that only the proper orchestration of many distinct catalytic actors can power the dissipative flux required to maintain a stable, far-from-equilibrium steady state. In this article, we study the dynamics of an in silico chemical network with random connectivity in an environment that makes strong thermodynamic forcing available only to rare combinations of chemical concentrations. We find that the long-time dynamics of such systems are biased toward states that exhibit a fine-tuned extremization of environmental forcing.

Suggested by First Support for a Physics Theory of Life in Quanta Magazine.

👓 First Support for a Physics Theory of Life | Quanta Magazine

Read First Support for a Physics Theory of Life by Natalie Wolchover (Quanta Magazine)
Take chemistry, add energy, get life. The first tests of Jeremy England’s provocative origin-of-life hypothesis are in, and they appear to show how order can arise from nothing.

Interesting article with some great references I’ll need to delve into and read.


The situation changed in the late 1990s, when the physicists Gavin Crooks and Chris Jarzynski derived “fluctuation theorems” that can be used to quantify how much more often certain physical processes happen than reverse processes. These theorems allow researchers to study how systems evolve — even far from equilibrium.

I want to take a look at these papers as well as several of those that the article is directly about.
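For reference, the two results usually meant by "fluctuation theorems" in this context can be stated compactly. The notation is the standard one and is not taken from the article: W is the work done on the system, ΔF is the free energy difference between the end states, β = 1/(k_B T), and P_F and P_R are the work distributions for the forward and reversed protocols.

```latex
% Crooks fluctuation theorem: forward vs. reverse work distributions
\frac{P_F(W)}{P_R(-W)} = e^{\beta (W - \Delta F)}

% Jarzynski equality: follows by integrating the Crooks relation over W
\left\langle e^{-\beta W} \right\rangle = e^{-\beta \Delta F}

% Jensen's inequality then recovers the second-law statement
\langle W \rangle \ge \Delta F
```

The first line is the "how much more often" statement from the quote: a forward process that dissipates work W − ΔF is exponentially more likely than its reverse.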


Any claims that it has to do with biology or the origins of life, he added, are “pure and shameless speculations.”

Some truly harsh words from his former supervisor? Wow!


maybe there’s more that you can get for free

Most of what’s here in this article (and likely in the underlying papers) sounds to me to have been heavily influenced by the writings of W. Loewenstein and S. Kauffman. They’ve laid out some models/ideas that need more rigorous testing and work, and this seems like a reasonable start to the process. The “get for free” phrase itself is very S. Kauffman in my mind. I’m curious how many times it appears in his work?

📅 Entropy 2018: From Physics to Information Sciences and Geometry

RSVPed Might be attending Entropy 2018: From Physics to Information Sciences and Geometry
14-16 May 2018; Auditorium Enric Casassas, Faculty of Chemistry, University of Barcelona, Barcelona, Spain

One of the most frequently used scientific words is the word “Entropy”. The reason is that it is related to two main scientific domains: physics and information theory. Its origin goes back to the start of physics (thermodynamics), but since Shannon, it has become related to information theory. This conference is an opportunity to bring researchers of these two communities together and create a synergy. The main topics and sessions of the conference cover:

  • Physics: classical Thermodynamics and Quantum
  • Statistical physics and Bayesian computation
  • Geometrical science of information, topology and metrics
  • Maximum entropy principle and inference
  • Kullback and Bayes or information theory and Bayesian inference
  • Entropy in action (applications)

Inter-disciplinary contributions from both theoretical and applied perspectives are very welcome, including papers addressing conceptual and methodological developments, as well as new applications of entropy and information theory.

All accepted papers will be published in the proceedings of the conference. A selection of invited and contributed talks presented during the conference will be invited to submit an extended version of their paper for a special issue of the open access Journal Entropy. 

Entropy 2018 Conference

An Information Theory Playlist on Spotify

In honor of tomorrow’s release of Jimmy Soni and Rob Goodman’s book A Mind at Play: How Claude Shannon Invented the Information Age, I’ve created an Information Theory playlist on Spotify.

Songs about communication, telephones, conversation, satellites, love, auto-tune and even one about a typewriter! They all relate at least tangentially to the topic at hand. To up the ante, everyone should realize that digital music would be impossible without Shannon’s seminal work.

Let me know in the comments or by replying to one of the syndicated copies listed below if there are any great tunes that the list is missing.

Enjoy the list and the book!

Syndicated copies to: