Bookmarked An information-based sequence distance and its application to whole mitochondrial genome phylogeny by M. Li, J. H. Badger, X. Chen, S. Kwong, P. Kearney, H. Zhang (Bioinformatics. 2001 Feb;17(2):149-54.)

MOTIVATION: Traditional sequence distances require an alignment and therefore are not directly applicable to the problem of whole genome phylogeny where events such as rearrangements make full length alignments impossible. We present a sequence distance that works on unaligned sequences using the information theoretical concept of Kolmogorov complexity and a program to estimate this distance.

RESULTS: We establish the mathematical foundations of our distance and illustrate its use by constructing a phylogeny of the Eutherian orders using complete unaligned mitochondrial genomes. This phylogeny is consistent with the commonly accepted one for the Eutherians. A second, larger mammalian dataset is also analyzed, yielding a phylogeny generally consistent with the commonly accepted one for the mammals.

AVAILABILITY: The program to estimate our sequence distance is available at http://www.cs.cityu.edu.hk/~cssamk/gencomp/GenCompress1.htm. The distance matrices used to generate our phylogenies are available at http://www.math.uwaterloo.ca/~mli/distance.html.

PMID: 11238070
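
To make the idea of an alignment-free, compression-based distance concrete, here is a minimal sketch. It rests on assumptions that are not from the paper: zlib stands in for the authors' own compression program, the function names `compressed_size` and `compression_distance` are hypothetical, and the formula is the normalized compression distance, a closely related later formulation rather than necessarily the exact distance defined in the paper. The toy sequences are random, not real genomes.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of zlib-compressed data: a crude, computable stand-in
    for Kolmogorov complexity (assumption: zlib, not the paper's tool)."""
    return len(zlib.compress(data, 9))


def compression_distance(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two unaligned sequences.

    Small when compressing x and y together costs little more than
    compressing the larger of the two alone.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy data (illustrative only): two independent random DNA-like strings.
rng = random.Random(0)
seq_a = bytes(rng.choice(b"ACGT") for _ in range(2000))
seq_b = bytes(rng.choice(b"ACGT") for _ in range(2000))

# A "rearranged relative" of seq_a: the two halves are swapped, which
# would defeat a full-length alignment but not the compressor.
seq_a_rearranged = seq_a[1000:] + seq_a[:1000]

d_related = compression_distance(seq_a, seq_a_rearranged)
d_unrelated = compression_distance(seq_a, seq_b)
print(f"related:   {d_related:.3f}")
print(f"unrelated: {d_unrelated:.3f}")
```

The rearranged copy should come out much closer to `seq_a` than the unrelated string does, since the compressor can reuse the shared blocks regardless of their order. A general-purpose compressor like zlib has a small window, so real genomes would need a compressor designed for long DNA sequences.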

Bookmarked Measuring the similarity of protein structures by means of the universal similarity metric by N. Krasnogor, D. A. Pelta (Bioinformatics. 2004 May 1;20(7):1015-21. Epub 2004 Jan 29.)

MOTIVATION: As an increasing number of protein structures become available, the need for algorithms that can quantify the similarity between protein structures increases as well. Thus, the comparison of proteins' structures, and their clustering according to a given similarity measure, is at the core of today's biomedical research. In this paper, we show how an algorithmic information theory inspired Universal Similarity Metric (USM) can be used to calculate similarities between protein pairs. The method, besides being theoretically supported, is surprisingly simple to implement and computationally efficient.

RESULTS: Structural similarity between proteins in four different datasets was measured using the USM. The sample employed represented alpha, beta, alpha-beta, TIM-barrel, globin and serpin protein types. The use of the proposed metric allows for a correct measurement of similarity and classification of the proteins in the four datasets.

AVAILABILITY: All the scripts and programs used for the preparation of this paper are available at http://www.cs.nott.ac.uk/~nxk/USM/protocol.html. In that web-page the reader will find a brief description on how to use the various scripts and programs.

PMID: 14751983 DOI: 10.1093/bioinformatics/bth031

Bookmarked Information theory in living systems, methods, applications, and challenges by R. A. Gatenby, B. R. Frieden (Bull Math Biol. 2007 Feb;69(2):635-57. Epub 2006 Nov 3.)

Living systems are distinguished in nature by their ability to maintain stable, ordered states far from equilibrium. This is despite constant buffeting by thermodynamic forces that, if unopposed, will inevitably increase disorder. Cells maintain a steep transmembrane entropy gradient by continuous application of information that permits cellular components to carry out highly specific tasks that import energy and export entropy. Thus, the study of information storage, flow and utilization is critical for understanding first principles that govern the dynamics of life. Initial biological applications of information theory (IT) used Shannon's methods to measure the information content in strings of monomers such as genes, RNA, and proteins. Recent work has used bioinformatic and dynamical systems to provide remarkable insights into the topology and dynamics of intracellular information networks. Novel applications of Fisher-, Shannon-, and Kullback-Leibler informations are promoting increased understanding of the mechanisms by which genetic information is converted to work and order. Insights into evolution may be gained by analysis of the fitness contributions from specific segments of genetic information as well as the optimization process in which the fitness is constrained by the substrate cost for its storage and utilization. Recent IT applications have recognized the possible role of nontraditional information storage structures including lipids and ion gradients as well as information transmission by molecular flux across cell membranes. Many fascinating challenges remain, including defining the intercellular information dynamics of multicellular organisms and the role of disordered information storage and flow in disease.

PMID: 17083004 DOI: 10.1007/s11538-006-9141-5
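
The abstract's mention of using Shannon's methods to measure information content in strings of monomers can be illustrated with a small sketch. This is only the simplest case, the per-symbol entropy of the empirical symbol frequencies, and is an assumption about what such a measurement might look like rather than a method taken from the paper. The function name `shannon_entropy` and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str) -> float:
    """Shannon entropy, in bits per symbol, of the empirical
    symbol distribution of a string of monomers."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Toy sequences (illustrative only, not real genes).
uniform_dna = "ACGT" * 25      # all four bases equally frequent
skewed_dna = "A" * 97 + "CGT"  # dominated by a single base

print(shannon_entropy(uniform_dna))  # 2.0 bits per base, the maximum for 4 symbols
print(shannon_entropy(skewed_dna))   # far lower: the sequence is highly predictable
```

This frequency-only measure ignores the order of the monomers, so two sequences with the same base composition get the same value however differently they are arranged.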

Poor State of Automated Machine-Based Language Translation

You know that automated machine language translation is not in good shape when the editor-in-chief of the IEEE’s Signal Processing Magazine says:

As an anecdote, during the early stage in creating the Chinese translation of the [Signal Processing] magazine, we experimented with automated machine translation first, only to quickly switch to professional human translation. This makes us appreciate why “universal translation” is the “needs and wants” of the future rather than of the present; see [3] for a long list of future needs and wants to be enabled by signal processing technology.

Global classical solutions of the Boltzmann equation with long-range interactions

Bookmarked Global classical solutions of the Boltzmann equation with long-range interactions (pnas.org)
Finally, after 140 years, Robert Strain and Philip Gressman at the University of Pennsylvania have proved the global existence of classical solutions to Boltzmann’s equation, which predicts the motion of gas molecules.

Abstract

This is a brief announcement of our recent proof of global existence and rapid decay to equilibrium of classical solutions to the Boltzmann equation without any angular cutoff, that is, for long-range interactions. We consider perturbations of the Maxwellian equilibrium states and include the physical cross-sections arising from an inverse-power intermolecular potential r^{-(p-1)} with p > 2, and more generally. We present here a mathematical framework for unique global in time solutions for all of these potentials. We consider it remarkable that this equation, derived by Boltzmann (1) in 1872 and Maxwell (2) in 1867, grants a basic example where a range of geometric fractional derivatives occur in a physical model of the natural world. Our methods provide a new understanding of the effects due to grazing collisions.


Bookmarked The structure of degradable quantum channels by Toby S. Cubitt, Mary Beth Ruskai, Graeme Smith (Journal of Mathematical Physics 49, 102104 (2008))
Degradable quantum channels are among the only channels whose quantum and private classical capacities are known. As such, determining the structure of these channels is a pressing open question in quantum information theory. We give a comprehensive review of what is currently known about the structure of degradable quantum channels, including a number of new results as well as alternate proofs of some known results. In the case of qubits, we provide a complete characterization of all degradable channels with two dimensional output, give a new proof that a qubit channel with two Kraus operators is either degradable or anti-degradable, and present a complete description of anti-degradable unital qubit channels with a new proof. For higher output dimensions we explore the relationship between the output and environment dimensions (d_B and d_E, respectively) of degradable channels. For several broad classes of channels we show that they can be modeled with an environment that is “small” in the sense of d_E ≤ d_B. Such channels include all those with qubit or qutrit output, those that map some pure state to an output with full rank, and all those which can be represented using simultaneously diagonal Kraus operators, even in a non-orthogonal basis. Perhaps surprisingly, we also present examples of degradable channels with “large” environments, in the sense that the minimal dimension d_E > d_B. Indeed, one can have d_E > (1/4) d_B^2. These examples can also be used to give a negative answer to the question of whether additivity of the coherent information is helpful for establishing additivity for the Holevo capacity of a pair of channels. In the case of channels with diagonal Kraus operators, we describe the subclasses that are complements of entanglement breaking channels. We also obtain a number of results for channels in the convex hull of conjugations with generalized Pauli matrices. However, a number of open questions remain about these channels and the more general case of random unitary channels.
Alternate version on arXiv: https://arxiv.org/abs/0802.1360 

Acquired The Mathematical Theory of Communication by Claude E. Shannon and Warren Weaver

Acquired The Mathematical Theory of Communication (The University of Illinois Press)
Scientific knowledge grows at a phenomenal pace--but few books have had as lasting an impact or played as important a role in our modern world as The Mathematical Theory of Communication, published originally as a paper on communication theory in the Bell System Technical Journal more than fifty years ago. Republished in book form shortly thereafter, it has since gone through four hardcover and sixteen paperback printings. It is a revolutionary work, astounding in its foresight and contemporaneity. The University of Illinois Press is pleased and honored to issue this commemorative reprinting of a classic.