Pantheon (September 27, 2016)
Previously, I had made a large and somewhat random list of books which lie in the intersection of the application of information theory, physics, and engineering practice to the area of biology. Below I’ll begin to do a somewhat better job of providing a finer gradation of technical level for both the hobbyist and the aspiring student who wishes to bring themselves to a higher level of understanding of these areas. In future posts, I’ll try to begin classifying other texts into graduated strata as well. The final list will be maintained here: Books at the Intersection of Information Theory and Biology.
Introductory / General Readership / Popular Science Books
These books are written on a generally non-technical level and give a broad overview of their topics with occasional forays into interesting or intriguing subtopics. They include little, if any, mathematical equations or conceptualization. Typically, any high school student should be able to read, follow, and understand the broad concepts behind these books. Though often non-technical, these texts can give some useful insight into the topics at hand, even for the most advanced researchers.
Possibly one of the best places to start, this text gives a great overview of most of the major areas of study related to these fields.
One of the best books on the concept of entropy out there. It can be read even by middle school students with no exposure to algebra and does a fantastic job of laying out the conceptualization of how entropy underlies large areas of the broader subject. Even those with Ph.D.’s in statistical thermodynamics can gain something useful from this lovely volume.
A relatively recent popular science volume covering various conceptualizations of what information is and how it’s been dealt with in science and engineering. Though it has its flaws, it’s certainly a good introduction for the beginner, particularly with regard to history.
One of the most influential pieces of writing known to man, this classical text is the basis from which major strides in biology have been made. A must read for everyone on the planet.
The four books above have a significant amount of overlap. Though one could read all of them, I recommend that those pressed for time choose Ben-Naim first. As I write this I’ll note that Ben-Naim’s book is scheduled for release on May 30, 2015, but he’s been kind enough to allow me to read an advance copy while it was in process; it gets my highest recommendation in its class. Loewenstein covers a bit more than Avery who also has a more basic presentation. Most who continue with the subject will later come across Yockey’s Information Theory and Molecular Biology which is similar to his text here but written at a slightly higher level of sophistication. Those who finish at this level of sophistication might want to try Yockey third instead.
In the coming weeks/months, I’ll try to continue putting recommended books on the rest of the spectrum, the balance of which follows in outline form below. As always, I welcome suggestions and recommendations based on others’ experiences as well. If you’d like to suggest additional resources in any of the sections below, please do so via our suggestion box. For those interested in additional resources, please take a look at the ITBio Resources page which includes information about related research groups; references and journal articles; academic, research institutes, societies, groups, and organizations; and conferences, workshops, and symposia.
Lower Level Undergraduate
These books are written at a level that can be grasped and understood by most at a freshman or sophomore university level. Coursework in math, science, and engineering will usually presume knowledge of calculus, basic probability theory, introductory physics, chemistry, and basic biology.
Upper Level Undergraduate
These books are written at a level that can be grasped and understood by those at a junior or senior university level. Coursework in math, science, and engineering may presume knowledge of probability theory, differential equations, linear algebra, complex analysis, abstract algebra, signal processing, organic chemistry, molecular biology, evolutionary theory, thermodynamics, advanced physics, and basic information theory.
These books are written at a level that can be grasped and understood by most working at a master’s level at most universities. Coursework presumes all the previously mentioned classes, though it may require a higher level of sub-specialization in one or more areas of mathematics, physics, biology, or engineering practice. Because of the depth and breadth of disciplines covered here, many may feel the need to delve into areas outside of their particular specialization.
Overall, James Gleick’s book The Information: A History, a Theory, a Flood is an excellent read. Given that it’s an area in which I’m intimately interested, I’m not too surprised that most of it is “review”, but I’d highly recommend it to the general public to know more about some of the excellent history, philosophy, and theory which Gleick so nicely summarizes throughout the book.
There are one or two references in the back which I’ll have to chase down and read, and one or two which, after many years, seem like they may be worth revisiting after having completed this.
Even for the specialist, Gleick manages to tie together some disparate thoughts to create an excellent whole which makes it a very worthwhile read. I found that towards the last several chapters Gleick’s style becomes much more flowery and less concrete, but most of that is a result of covering the “humanities” perspective of information, as opposed to the earlier parts of the text, which were more specific to history and the scientific theories he covered.
Books have always been digital, not analog. Even when made of paper & ink, they are sequences of discrete symbols. That is all.
— James Gleick (@JamesGleick) June 7, 2011
John Battelle recently posted a review of James Gleick’s last book The Information: A History, A Theory, A Flood. It reminds me that I find it almost laughable when the vast majority of the technology press and the digiterati bloviate about their beats when at its roots, they know almost nothing about how technology truly works or the mathematical or theoretical underpinnings of what is happening — and even worse that they don’t seem to really care.
I’ve seen hundreds of reviews and thousands of mentions of Steven Levy’s book In the Plex: How Google Thinks, Works, and Shapes Our Lives in the past few months (in fact, Battelle reviewed it just before Gleick’s book), but I’ve seen few, if any, of Gleick’s book, which I honestly think is a much more worthwhile read about what is going on in the world and has farther reaching implications about where we are headed.
I’ll give a BIG tip of my hat to John for his efforts to have read Gleick and post his commentary and to continue to push the boundary further as he invites Gleick to speak at Web 2.0 Summit in the fall. I hope his efforts will bring the topic to the much larger tech community. I further hope he and others might take the time to read Claude Shannon’s original paper [.pdf download], and if he’s further interested in the concept of thermodynamic entropy, I can recommend Andre Thess’s text The Entropy Principle: Thermodynamics for the Unsatisfied, which I’ve recently discovered and think does a good (and logically consistent) job of defining the concept at a level accessible to the average public.
“The Information,” by James Gleick, is to the nature, history and significance of data what the beach is to sand.
This book is assuredly going to have to skip up to the top of my current reading list.
“The Information” is so ambitious, illuminating and sexily theoretical that it will amount to aspirational reading for many of those who have the mettle to tackle it. Don’t make the mistake of reading it quickly. Imagine luxuriating on a Wi-Fi-equipped desert island with Mr. Gleick’s book, a search engine and no distractions. “The Information” is to the nature, history and significance of data what the beach is to sand.
In this relaxed setting, take the time to differentiate among the Brownian (motion), Bodleian (library) and Boolean (logic) while following Mr. Gleick’s version of what Einstein called “spukhafte Fernwirkung,” or “spooky action at a distance.” Einstein wasn’t precise about what this meant, and Mr. Gleick isn’t always precise either. His ambitions for this book are diffuse and far flung, to the point where providing a thumbnail description of “The Information” is impossible.
So this book’s prologue is its most slippery section. It does not exactly outline a unifying thesis. Instead it hints at the amalgam of logic, philosophy, linguistics, research, appraisal and anecdotal wisdom that will follow. If Mr. Gleick has one overriding goal it is to provide an animated history of scientific progress, specifically the progress of the technology that allows information to be recorded, transmitted and analyzed. This study’s range extends from communication by drumbeat to cognitive assault by e-mail.
As an illustration of Mr. Gleick’s versatility, consider what he has to say about the telegraph. He describes the mechanical key that made telegraphic transmission possible; the compression of language that this new medium encouraged; that it literally was a medium, a midway point between fully verbal messages and coded ones; the damaging effect its forced brevity had on civility; the confusion it created as to what a message actually was (could a mother send her son a dish of sauerkraut?) and the new conceptual thinking that it helped implement. The weather, which had been understood on a place-by-place basis, was suddenly much more than a collection of local events.
Beyond all this Mr. Gleick’s telegraph chapter, titled “A Nervous System for the Earth,” finds time to consider the kind of binary code that began to make sense in the telegraph era. It examines the way letters came to be treated like numbers, the way systems of ciphers emerged. It cites the various uses to which ciphers might be put by businessmen, governments or fiction writers (Lewis Carroll, Jules Verne and Edgar Allan Poe). Most of all it shows how this phase of communication anticipated the immense complexities of our own information age.
Although “The Information” unfolds in a roughly chronological way, Mr. Gleick is no slave to linearity. He freely embarks on colorful digressions. Some are included just for the sake of introducing the great eccentrics whose seemingly marginal inventions would prove to be prophetic. Like Richard Holmes’s “Age of Wonder” this book invests scientists with big, eccentric personalities. Augusta Ada Lovelace, the daughter of Lord Byron, may have been spectacularly arrogant about what she called “my immense reasoning faculties,” claiming that her brain was “something more than merely mortal.” But her contribution to the writing of algorithms can, in the right geeky circles, be mentioned in the same breath as her father’s contribution to poetry.
The segments of “The Information” vary in levels of difficulty. Grappling with entropy, randomness and quantum teleportation is the price of enjoying Mr. Gleick’s simple, entertaining riffs on the Oxford English Dictionary’s methodology, which has yielded 30-odd spellings of “mackerel” and an enchantingly tongue-tied definition of “bada-bing” and on the cyber-battles waged via Wikipedia. (As he notes, there are people who have bothered to fight over Wikipedia’s use of the word “cute” to accompany a picture of a young polar bear.) That Amazon boasts of being able to download a book called “Data Smog” in less than a minute does not escape his keen sense of the absurd.
As it traces our route to information overload, “The Information” pays tribute to the places that made it possible. He cites and honors the great cogitation hives of yore. In addition to the Institute for Advanced Study in Princeton, N.J., the Mount Rushmore of theoretical science, he acknowledges the achievements of corporate facilities like Bell Labs and I.B.M.’s Watson Research Center in the halcyon days when many innovations had not found practical applications and progress was its own reward.
“The Information” also lauds the heroics of mathematicians, physicists and computer pioneers like Claude Shannon, who is revered in the computer-science realm for his information theory but not yet treated as a subject for full-length, mainstream biography. Mr. Shannon’s interest in circuitry using “if … then” choices conducting arithmetic in a binary system had novelty when he began formulating his thoughts in 1937. “Here in a master’s thesis by a research assistant,” Mr. Gleick writes, “was the essence of the computer revolution yet to come.”
Among its many other virtues “The Information” has the rare capacity to work as a time machine. It goes back much further than Shannon’s breakthroughs. And with each step backward Mr. Gleick must erase what his readers already know. He casts new light on the verbal flourishes of the Greek poetry that preceded the written word: these turns of phrase could be as useful for their mnemonic power as for their art. He explains why the Greeks arranged things in terms of events, not categories; how one Babylonian text that ends with “this is the procedure” is essentially an algorithm; and why the telephone and the skyscraper go hand in hand. Once the telephone eliminated the need for hand-delivered messages, the sky was the limit.
In the opinion of “The Information” the world of information still has room for expansion. We may be drowning in spam, but the sky’s still the limit today.