Disconnected, Fragmented, or United? A Trans-disciplinary Review of Network Science

Bookmarked Disconnected, Fragmented, or United? A Trans-disciplinary Review of Network Science by César A. Hidalgo (Applied Network Science | SpringerLink)
Applied Network Science

Abstract

During decades the study of networks has been divided between the efforts of social scientists and natural scientists, two groups of scholars who often do not see eye to eye. In this review I present an effort to mutually translate the work conducted by scholars from both of these academic fronts hoping to continue to unify what has become a diverging body of literature. I argue that social and natural scientists fail to see eye to eye because they have diverging academic goals. Social scientists focus on explaining how context specific social and economic mechanisms drive the structure of networks and on how networks shape social and economic outcomes. By contrast, natural scientists focus primarily on modeling network characteristics that are independent of context, since their focus is to identify universal characteristics of systems instead of context specific mechanisms. In the following pages I discuss the differences between both of these literatures by summarizing the parallel theories advanced to explain link formation and the applications used by scholars in each field to justify their approach to network science. I conclude by providing an outlook on how these literatures can be further unified.

Weekly Recap: Interesting Articles 7/24-7/31 2016

Went on vacation or fell asleep at the internet wheel this week? Here’s some of the interesting stuff you missed.

Science & Math

Publishing

Indieweb, Internet, Identity, Blogging, Social Media

General

Ten Simple Rules for Taking Advantage of Git and GitHub

Bookmarked Ten Simple Rules for Taking Advantage of Git and GitHub (journals.plos.org)
Bioinformatics is a broad discipline in which one common denominator is the need to produce and/or use software that can be applied to biological data in different contexts. To enable and ensure the replicability and traceability of scientific claims, it is essential that the scientific publication, the corresponding datasets, and the data analysis are made publicly available [1,2]. All software used for the analysis should be either carefully documented (e.g., for commercial software) or, better yet, openly shared and directly accessible to others [3,4]. The rise of openly available software and source code alongside concomitant collaborative development is facilitated by the existence of several code repository services such as SourceForge, Bitbucket, GitLab, and GitHub, among others. These resources are also essential for collaborative software projects because they enable the organization and sharing of programming tasks between different remote contributors. Here, we introduce the main features of GitHub, a popular web-based platform that offers a free and integrated environment for hosting the source code, documentation, and project-related web content for open-source projects. GitHub also offers paid plans for private repositories (see Box 1) for individuals and businesses as well as free plans including private repositories for research and educational use.

Lessons Learned from IndiewebCamp and WordCamp – David Shanske

Liked Lessons Learned from IndiewebCamp and WordCamp by David Shanske (David Shanske)
For a little over two years, I have been involved in Indiewebcamp. This past weekend, for the first time in five years, I was able to attend WordCamp. WordCamp NYC was a massive undertaking, to which I must give credit to the organizers. WordCamp was moved to coincide with OpenCamps week at the United Nations, …

The emotional arcs of stories are dominated by six basic shapes

Bookmarked The emotional arcs of stories are dominated by six basic shapes (arxiv.org)
Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture's evolution through its texts using a "big data" lens. Our ability to communicate relies in part upon a shared emotional experience, with stories often following distinct emotional trajectories, forming patterns that are meaningful to us. Here, by classifying the emotional arcs for a filtered subset of 1,737 stories from Project Gutenberg's fiction collection, we find a set of six core trajectories which form the building blocks of complex narratives. We strengthen our findings by separately applying optimization, linear decomposition, supervised learning, and unsupervised learning. For each of these six core emotional arcs, we examine the closest characteristic stories in publication today and find that particular emotional arcs enjoy greater success, as measured by downloads.

🔖 Paper: Paging Through History by Mark Kurlansky

Bookmarked Paper: Paging Through History by Mark Kurlansky (Amazon.com)
Paper is one of the simplest and most essential pieces of human technology. For the past two millennia, the ability to produce it in ever more efficient ways has supported the proliferation of literacy, media, religion, education, commerce, and art; it has formed the foundation of civilizations, promoting revolutions and restoring stability. One has only to look at history’s greatest press run, which produced 6.5 billion copies of Máo zhuxí yulu, Quotations from Chairman Mao Tse-tung (Zedong)―which doesn’t include editions in 37 foreign languages and in braille―to appreciate the range and influence of a single publication, in paper. Or take the fact that one of history’s most revered artists, Leonardo da Vinci, left behind only 15 paintings but 4,000 works on paper. And though the colonies were at the time calling for a boycott of all British goods, the one exception they made speaks to the essentiality of the material; they penned the Declaration of Independence on British paper. Now, amid discussion of “going paperless”―and as speculation about the effects of a digitally dependent society grows rampant―we’ve come to a world-historic juncture. Thousands of years ago, Socrates and Plato warned that written language would be the end of “true knowledge,” replacing the need to exercise memory and think through complex questions. Similar arguments were made about the switch from handwritten to printed books, and today about the role of computer technology. By tracing paper’s evolution from antiquity to the present, with an emphasis on the contributions made in Asia and the Middle East, Mark Kurlansky challenges common assumptions about technology’s influence, affirming that paper is here to stay. Paper will be the commodity history that guides us forward in the twenty-first century and illuminates our times.
🔖 Marked as “want to read” Paper: Paging Through History by Mark Kurlansky (W. W. Norton & Company; 1st edition, May 10, 2016; ISBN: 9780393239614)

Hypothes.is and the IndieWeb

Last night I saw two great little articles about Hypothes.is, a web-based annotation engine, written by a proponent of the IndieWeb:

Hypothes.is as a public research notebook

Hypothes.is Aggregator ― a WordPress plugin

As a researcher, I fully appreciate the pro-commonplace book conceptualization of the first post, and the second takes things amazingly further with a plugin that allows one to easily display one’s hypothes.is annotations on one’s own WordPress-based site in a dead-simple fashion.

This functionality is a great first step, though honestly, in keeping with IndieWeb principles of owning one’s own data, I think it would be easier/better if Hypothes.is both accepted and sent webmentions. This would potentially allow me to own the data on my own site while still participating in the larger annotation community, and it would give me notifications when someone comments on or augments one of my annotations, or even annotates one of my own pages (bits of which I’ve written about before).
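For readers unfamiliar with webmentions, here is a minimal sketch of what "sending" one involves: a form-encoded POST carrying a `source` URL (the page that links) and a `target` URL (the page being linked to). The function name and all URLs below are hypothetical, the sketch assumes the receiver's endpoint has already been discovered, and it only builds the request rather than sending it. Hypothes.is did not support this at the time of writing.

```python
from urllib.parse import urlencode


def build_webmention_request(source: str, target: str, endpoint: str) -> dict:
    """Describe the HTTP request that tells `target` that `source` links to it."""
    for name, url in (("source", source), ("target", target), ("endpoint", endpoint)):
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"{name} must be an http(s) URL: {url!r}")
    return {
        "method": "POST",
        "url": endpoint,
        "headers": {"Content-Type": "application/x-www-form-urlencoded"},
        # A webmention body carries exactly two fields: source and target.
        "body": urlencode({"source": source, "target": target}),
    }


# Hypothetical URLs, for illustration only.
request = build_webmention_request(
    source="https://example.com/notes/my-annotation",
    target="https://example.org/articles/annotated-page",
    endpoint="https://example.org/webmention",
)
print(request["body"])
```

In practice the endpoint would be discovered from the target page (a `Link` header or a `rel="webmention"` link element) before a request like this is sent.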

Either way, kudos to Kris Shaffer for moving the ball forward!

Examples

My Hypothes.is Notebook

The plugin mentioned in the second article allows me to keep a running online “notebook” of all of my Hypothes.is annotations on my own site.

My IndieWeb annotations

I can also easily embed my recent annotations about the IndieWeb below:

[ hypothesis user = 'chrisaldrich' tags = 'indieweb']
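A rough idea of the kind of query that could sit behind a shortcode like the one above: the sketch below builds a search URL for the public Hypothes.is API, filtered by user and tag. This is not the plugin's actual code; the helper name is hypothetical, and the `acct:username@hypothes.is` user format and the `limit` parameter reflect my understanding of the API rather than anything confirmed by the articles.

```python
from urllib.parse import urlencode

# Public search endpoint of the Hypothes.is API (assumed here).
API_SEARCH = "https://api.hypothes.is/api/search"


def annotation_search_url(username: str, tag: str, limit: int = 20) -> str:
    """Build a search URL for one user's public annotations carrying a tag."""
    query = {
        "user": f"acct:{username}@hypothes.is",  # assumed account format
        "tag": tag,
        "limit": limit,
    }
    return f"{API_SEARCH}?{urlencode(query)}"


url = annotation_search_url("chrisaldrich", "indieweb")
print(url)
```

Fetching that URL should return JSON describing the matching annotations, which a site could then render however it likes.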

How Can We Apply Physics to Biology?

Bookmarked How Can We Apply Physics to Biology? by Philip Ball (nautil.us)
We don’t yet know quite what a physics of biology will consist of. But we won’t understand life without it.
This is an awesome little article with some interesting thought and philosophy on the current state of physics within biology and other related areas of study. It’s also got some snippets of history which aren’t frequently discussed in longer form texts.

Some Thoughts on Academic Publishing and “Who’s downloading pirated papers? Everyone” from Science | AAAS

Bookmarked Who's downloading pirated papers? Everyone by John Bohannon (Science | AAAS)
An exclusive look at data from the controversial web site Sci-Hub reveals that the whole world, both poor and rich, is reading pirated research papers.

Sci Hub has been in the news quite a bit over the past half a year and the bookmarked article here gives some interesting statistics. I’ll preface some of the following editorial critique with the fact that I love John Bohannon’s work; I’m glad he’s spent the time to do the research he has. Most of the rest of the critique is aimed at the publishing industry itself.

From a journalistic standpoint, I find it disingenuous that the article didn’t actually hyperlink to Sci Hub. Nor did it link to (or fully quote) Alicia Wise’s Twitter post(s) or her rebuttal list of 20 ways to access their content freely or inexpensively. Of course both of these are editorial decisions, and perhaps the rebuttal was so flimsy as to be unworthy of a link from such an esteemed publication anyway.

Sadly, Elsevier’s list of 20 ways of free/inexpensive access doesn’t really provide any simple coverage for graduate students or researchers in poorer countries, who are the likeliest people to be using Sci Hub, unless they’re going to fraudulently claim they’re part of a class to which they don’t belong. And is that morally any better than the original theft? The list is almost assuredly never used by patients, who seem to be covered under one of the options, as that option is painfully undiscoverable behind the typical $30/paper paywalls. This patchwork hodgepodge of free access is not only difficult to discern, but one must also keep in mind that Elsevier is just one of dozens of publishers a researcher must navigate to find the one thing they’re looking for right now (not to mention the thousands of times they need to do this throughout a year, much less a career).

Consider this experiment, which could be a good follow-up to the article: is it easier to find and download a paper by title/author/DOI via Sci Hub (a minute) than through any of the other publishers’ platforms with a university subscription (several minutes) or without a subscription (an hour or more to days)? Just consider the time it would take to dig up every one of 30 references in an average journal article: maybe just half an hour via Sci Hub versus the days and/or weeks it would take to jump through the multiple hoops to first discover, read about, gain access to, and then download them from the more than 14 providers (and this presumes the others provide some type of “access” like Elsevier).

Those who lived through the Napster revolution in music will realize that the dead simplicity of Napster’s system is primarily what helped kill the music business, compared to the ecosystem that exists now with easy access through multiple streaming sites (Spotify, Pandora, etc.) or inexpensive paid options like iTunes. If the publishing business doesn’t want to get completely killed, they’re going to need to create the iTunes of academia. I suspect they’ll have internal bean-counters watching the percentage of the total (now apparently 5%) and will probably only do something before it passes a much larger threshold, though I imagine that they’re really hoping that the number stays stable, which signals that they’re not really concerned. They’re far more likely to continue to maintain their status quo practices.

Some of this ease-of-access argument is truly borne out by the statistics of open access papers which are downloaded by Sci Hub–it’s simply easier to both find and download them that way compared to traditional methods; there’s one simple pathway for both discovery and download. Surely the publishers, without colluding, could come up with a standardized method or protocol for finding and accessing their material cheaply and easily?

“Hart-Davidson obtained more than 100 years of biology papers the hard way—legally with the help of the publishers. ‘It took an entire year just to get permission,’ says Thomas Padilla, the MSU librarian who did the negotiating.” John Bohannon in Who’s downloading pirated papers? Everyone

Personally, I use relatively advanced tools like LibX, which happens to be offered by my institution and which I feel isn’t very well known, and it still takes me longer to find and download a paper than it would via Sci Hub. God forbid if some enterprising hacker were to create a LibX community version for Sci Hub. Come to think of it, why haven’t any of the dozens of publishers built and supported simple tools like LibX which make their content easy to access? If we consider the analogy of academic papers to the introduction of machine guns in World War I, why should modern researchers still be using single-load rifles against an enemy that has access to nuclear weaponry?

My last thought here comes on the heels of the two tweets from Alicia Wise mentioned, but not shown in the article:

She mentions that the New York Times charges more than Elsevier does for a full subscription. This is tremendously disingenuous as Elsevier is but one of dozens of publishers to which one would have to subscribe to have access to the full panoply of material researchers are typically looking for. Further, neither Elsevier nor its competitors make their material as easy to find and access as the New York Times does. Nor do they discount access in an attempt to find the price point that their users find financially acceptable. Case in point: while I often read the New York Times, I rarely go over their monthly article limit and so rarely need any type of paid subscription. Solely because they made me an interesting offer to subscribe for 8 weeks for 99 cents, I took them up on it and renewed that deal for another 8 weeks. Not finding it worth the full $35/month price point, I attempted to cancel. I had to cancel the subscription via phone, but why? The NYT customer rep made me no fewer than 5 different offers at ever decreasing price points–including the 99 cents for 8 weeks which I had been getting!!–to try to keep my subscription. Neither Elsevier nor any of its competitors has ever tried (much less so hard) to earn my business. (I’ll further posit that it’s because it’s easier to fleece at the institutional level with bulk negotiation, a model not too dissimilar to the textbook business pressuring professors on textbook adoption rather than trying to sell directly to the end consumer–the student, which I’ve written about before.)

(Trigger alert: Apophasis to come) And none of this is to mention the quality control that is (or isn’t) put into the journals or papers themselves. Fortunately one needn’t even go further than Bohannon’s other writings like Who’s Afraid of Peer Review? Then there are the hordes of articles on poor research design and misuse of statistical analysis and inability to repeat experiments. Not to give them any ideas, but lately it seems like Elsevier buying the Enquirer and charging $30 per article might not be a bad business decision. Maybe they just don’t want to play second-banana to TMZ?

Interestingly, there’s a survey at the end of the article which indicates some additional sources of academic copyright infringement. I do have to wonder how the data for the survey will be used. There’s always the possibility that logged-in users will be indicating they’re circumventing copyright and opening themselves up to litigation.

I also found the concept of using the massive data store as a means of applied corpus linguistics for science an entertaining proposition. This type of research could mean great things for science communication in general. I have heard of people attempting to do such meta-analysis to guide the purchase of potential intellectual property for patent trolling as well.

Finally, for those who haven’t done it (ever or recently), I’ll recommend that it’s certainly well worth their time and energy to attend one or more of the many 30-60 minute sessions most academic libraries offer at the beginning of their academic terms to train library users on research tools and methods. You’ll save yourself a huge amount of time.

Physicists Hunt For The Big Bang’s Triangles | Quanta Magazine

Bookmarked Physicists Hunt for the Big Bang’s Triangles (Quanta Magazine)

“The notion that counting more shapes in the sky will reveal more details of the Big Bang is implied in a central principle of quantum physics known as “unitarity.” Unitarity dictates that the probabilities of all possible quantum states of the universe must add up to one, now and forever; thus, information, which is stored in quantum states, can never be lost — only scrambled. This means that all information about the birth of the cosmos remains encoded in its present state, and the more precisely cosmologists know the latter, the more they can learn about the former.”

How can we be sure old books were ever read? – University of Glasgow Library

Bookmarked How can we be sure old books were ever read? by Robert MacLean (University of Glasgow Library)
Owning a book isn’t the same as reading it; we need only look at our own bloated bookshelves for confirmation.
This is a great little overview for people reading the books of others. There are also lots of great links to other resources.

What is Information? by Christoph Adami

Bookmarked What is Information? [1601.06176] by Christoph Adami (arxiv.org)

Information is a precise concept that can be defined mathematically, but its relationship to what we call "knowledge" is not always made clear. Furthermore, the concepts "entropy" and "information", while deeply related, are distinct and must be used with care, something that is not always achieved in the literature. In this elementary introduction, the concepts of entropy and information are laid out one by one, explained intuitively, but defined rigorously. I argue that a proper understanding of information in terms of prediction is key to a number of disciplines beyond engineering, such as physics and biology.

Comments: 19 pages, 2 figures. To appear in Philosophical Transaction of the Royal Society A
Subjects: Adaptation and Self-Organizing Systems (nlin.AO); Information Theory (cs.IT); Biological Physics (physics.bio-ph); Quantitative Methods (q-bio.QM)
Cite as: arXiv:1601.06176 [nlin.AO] (or arXiv:1601.06176v1 [nlin.AO] for this version)

From: Christoph Adami
[v1] Fri, 22 Jan 2016 21:35:44 GMT (151kb,D) [.pdf]

A proper understanding of information in terms of prediction is key to a number of disciplines beyond engineering, such as physics and biology.

Donald Forsdyke Indicates the Concept of Information in Biology Predates Claude Shannon

When it was published, I read Kevin Hartnett’s article and interview with Christoph Adami, The Information Theory of Life, in Quanta Magazine. I recently revisited it, read through the commentary, and stumbled upon an interesting quote relating to the history of information in biology:

Polymath Adami has ‘looked at so many fields of science’ and has correctly indicated the underlying importance of information theory, to which he has made important contributions. However, perhaps because the interview was concerned with the origin of life and was edited and condensed, many readers may get the impression that IT is only a few decades old. However, information ideas in biology can be traced back to at least 19th century sources. In the 1870s Ewald Hering in Prague and Samuel Butler in London laid the foundations. Butler’s work was later taken up by Richard Semon in Munich, whose writings inspired the young Erwin Schrodinger in the early decades of the 20th century. The emergence of his text – “What is Life” – from Dublin in the 1940s, inspired those who gave us DNA structure and the associated information concepts in “the classic period” of molecular biology. For more please see: Forsdyke, D. R. (2015) History of Psychiatry 26 (3), 270-287.

Donald Forsdyke, bioinformatician and theoretical biologist
in response to The Information Theory of Life in Quanta Magazine on

These two historical references predate Claude Shannon’s mathematical formalization of information in A Mathematical Theory of Communication (The Bell System Technical Journal, 1948) and even Erwin Schrödinger’s lecture (1943) and subsequent book What is Life (1944).

For those interested in reading more on this historical tidbit, I’ve dug up a copy of the primary Forsdyke reference which first appeared on arXiv (prior to its ultimate publication in History of Psychiatry [.pdf]):

🔖 [1406.1391] ‘A Vehicle of Symbols and Nothing More.’ George Romanes, Theory of Mind, Information, and Samuel Butler by Donald R. Forsdyke  [1]
Submitted on 4 Jun 2014 (v1), last revised 13 Nov 2014 (this version, v2)

Abstract: Today’s ‘theory of mind’ (ToM) concept is rooted in the distinction of nineteenth century philosopher William Clifford between ‘objects’ that can be directly perceived, and ‘ejects,’ such as the mind of another person, which are inferred from one’s subjective knowledge of one’s own mind. A founder, with Charles Darwin, of the discipline of comparative psychology, George Romanes considered the minds of animals as ejects, an idea that could be generalized to ‘society as eject’ and, ultimately, ‘the world as an eject’ – mind in the universe. Yet, Romanes and Clifford only vaguely connected mind with the abstraction we call ‘information,’ which needs ‘a vehicle of symbols’ – a material transporting medium. However, Samuel Butler was able to address, in informational terms depleted of theological trappings, both organic evolution and mind in the universe. This view harmonizes with insights arising from modern DNA research, the relative immortality of ‘selfish’ genes, and some startling recent developments in brain research.

Comments: Accepted for publication in History of Psychiatry. 31 pages including 3 footnotes. Based on a lecture given at Santa Clara University, February 28th 2014, at a Bannan Institute Symposium on ‘Science and Seeking: Rethinking the God Question in the Lab, Cosmos, and Classroom.’

The original arXiv article also referenced two lectures which are appended below:

http://www.youtube.com/watch?v=a3yNbTUCPd4

[Original Draft of this was written on December 14, 2015.]

References

[1]
D. R. Forsdyke, “‘A vehicle of symbols and nothing more’. George Romanes, theory of mind, information, and Samuel Butler,” History of Psychiatry, vol. 26, no. 3, pp. 270-287, Aug. 2015 [Online]. Available: http://journals.sagepub.com/doi/abs/10.1177/0957154X14562755

Winter Q-BIO Quantitative Biology Meeting February 15-18, 2016

Bookmarked Winter Q-BIO Quantitative Biology Meeting February 15-18, 2016 (w-qbio.org)
The Winter Q-BIO Quantitative Biology Meeting is coming up at the Sheraton Waikiki on Oahu, HI, USA.

A predictive understanding of living systems is a prerequisite for designed manipulation in bioengineering and informed intervention in medicine. Such an understanding requires quantitative measurements, mathematical analysis, and theoretical abstraction. The advent of powerful measurement technologies and computing capacity has positioned biology to drive the next scientific revolution. A defining goal of Quantitative Biology (qBIO) is the development of general principles that arise from networks of interacting elements that initially defy conceptual reasoning. The use of model organisms for the discovery of general principles has a rich tradition in biology, and at a fundamental level the philosophy of qBIO resonates with most molecular and cell biologists. New challenges arise from the complexity inherent in networks, which require mathematical modeling and computational simulation to develop conceptual “guideposts” that can be used to generate testable hypotheses, guide analyses, and organize “big data.”

The Winter q-bio meeting welcomes scientists and engineers who are interested in all areas of q-bio. For 2016, the meeting will be hosted at the Sheraton Waikiki, which is located in Honolulu, on the island of Oahu. The resort is known for its breathtaking oceanfront views, a first-of-its-kind recently opened “Superpool” and many award-winning dining venues. Registration and accommodation information can be found via the links at the top of the page.

Source: Winter Q-BIO Quantitative Biology Meeting

Obituary: Wes Craven

Bookmarked Wes Craven Dead: Movies 'Scream', 'Nightmare on Elm Street' Horrified Viewers (The Hollywood Reporter)
Wes Craven, the famed maestro of horror known for the Nightmare on Elm Street and Scream franchises, died Sunday after a battle with brain cancer. He was 76.
Saddened to hear that filmmaker and fellow Johns Hopkins University alum Wes Craven passed away this afternoon. He was certainly a scholar and a gentleman and will be missed terribly.

Obituary: Wes Craven, Horror Maestro, Dies at 76 – Hollywood Reporter 
