Over the years, Lennon and McCartney have revealed who really wrote what, but some songs are still up for debate. The two even debate between themselves — their memories seem to differ when it comes to who wrote the music for 1965's "In My Life."
Mathematics professor Jason Brown spent 10 years working with statistics to solve the magical mystery. Brown's findings were presented on Aug. 1 at the Joint Statistical Meeting in a presentation called "Assessing Authorship of Beatles Songs from Musical Content: Bayesian Classification Modeling from Bags-Of-Words Representations."
Why you should use percentages, not words, to express probabilities.
Highlights, Quotes, & Marginalia
This result is consistent with analysis by the data science team at Quora, a site where users ask and answer questions. That team found that women use uncertain words and phrases more often than men do, even when they are just as confident. ❧
The competition, whose finals play out tonight, is as famed for its politics as its cheesy
I often read the Economist’s Espresso daily roundup, but don’t explicitly post that I do. I’m making an exception in this case because I find the voting partnerships mentioned here quite interesting. It might be worth delving into some of the underlying voting statistics for potential application to other real-life examples. I’m also enamored of the nice visualization they provide. I wonder what the overlap of this data with other related world politics looks like.
Three new books on the challenge of drawing confident conclusions from an uncertain world.
Not sure how I missed this when it came out two weeks ago, but glad it popped up in my reader today.
This has some nice overview material for the general public on probability theory and science, but given the state of research, I’d even recommend this and some of the references to working scientists.
I remember bookmarking one of the texts back in November. This is a good reminder to circle back and read it.
What happens when several thousand distinguished physicists, researchers, and students descend on the nation’s gambling capital for a conference? The answer is "a bad week for the casino," but you'd never guess why. The year was 1986, and the American Physical Society’s annual April meeting was slated to be held in San Diego. But when scheduling conflicts caused the hotel arrangements to fall through just a few months before, the conference's organizers were left scrambling to find an alternative destination that could accommodate the crowd, and ended up settling on Las Vegas's MGM Grand.
Totally physics clickbait. The headline should have read: “Vegas won’t cater to physics conferences anymore because they’re too smart to gamble.”
Brian Wansink won fame, funding, and influence for his science-backed advice on healthy eating. Now, emails show how the Cornell professor and his colleagues have hacked and massaged low-quality data into headline-friendly studies to “go virally big time.”
This article is painful to read and has some serious implications for both science in general and the issue of repeatability. I suspect that this is an easily caught, flagrant case and that it probably only scratches the surface. The increased competition in research and the academy is sure to create more cases of this in the future.
We really need people to begin publishing their negative results and to do a better job of understanding and practicing statistics. Science is already not “believed” by far too many in the United States; we really don’t need bad actors like this eroding the solid foundations we’ve otherwise built.
Observational data about human behavior is often heterogeneous, i.e., generated by subgroups within the population under study that vary in size and behavior. Heterogeneity predisposes analysis to Simpson's paradox, whereby the trends observed in data that has been aggregated over the entire population may be substantially different from those of the underlying subgroups. I illustrate Simpson's paradox with several examples coming from studies of online behavior and show that aggregate response leads to wrong conclusions about the underlying individual behavior. I then present a simple method to test whether Simpson's paradox is affecting results of analysis. The presence of Simpson's paradox in social data suggests that important behavioral differences exist within the population, and failure to take these differences into account can distort the studies' findings.
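To make the reversal described in this abstract concrete, here is a minimal sketch in Python. It is not the test proposed in the paper (the abstract gives no details of that method); it simply compares the direction of a trend in the aggregated data with its direction inside each subgroup. The admissions counts, the department labels, and the function names `rates` and `simpson_reversal` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical (made-up) counts: (department, group, applicants, admitted).
RECORDS = [
    ("A", "men", 100, 80),
    ("A", "women", 20, 18),
    ("B", "men", 20, 4),
    ("B", "women", 100, 30),
]

def rates(records, by_subgroup):
    """Admission rate per group, either within each department or aggregated."""
    totals = defaultdict(lambda: [0, 0])  # key -> [applicants, admitted]
    for dept, group, applied, admitted in records:
        key = (dept, group) if by_subgroup else group
        totals[key][0] += applied
        totals[key][1] += admitted
    return {key: adm / app for key, (app, adm) in totals.items()}

def simpson_reversal(records, g1="men", g2="women"):
    """True if every department's trend points opposite to the aggregate trend."""
    overall = rates(records, by_subgroup=False)
    within = rates(records, by_subgroup=True)
    depts = sorted({r[0] for r in records})
    overall_favors_g1 = overall[g1] > overall[g2]
    return all(
        (within[(d, g1)] > within[(d, g2)]) != overall_favors_g1 for d in depts
    )

print(rates(RECORDS, by_subgroup=False))
print(rates(RECORDS, by_subgroup=True))
print("reversal:", simpson_reversal(RECORDS))
```

In these invented numbers, women are admitted at a higher rate within each department, yet at a lower rate overall, because most women applied to the department with the lower admission rate.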
In the sixteenth and seventeenth centuries, gamblers and mathematicians transformed the idea of chance from a mystery into the discipline of probability, setting the stage for a series of breakthroughs that enabled or transformed innumerable fields, from gambling, mathematics, statistics, economics, and finance to physics and computer science. This book tells the story of ten great ideas about chance and the thinkers who developed them, tracing the philosophical implications of these ideas as well as their mathematical impact. Persi Diaconis and Brian Skyrms begin with Gerolamo Cardano, a sixteenth-century physician, mathematician, and professional gambler who helped develop the idea that chance actually can be measured. They describe how later thinkers showed how the judgment of chance also can be measured, how frequency is related to chance, and how chance, judgment, and frequency could be unified. Diaconis and Skyrms explain how Thomas Bayes laid the foundation of modern statistics, and they explore David Hume’s problem of induction, Andrey Kolmogorov’s general mathematical framework for probability, the application of computability to chance, and why chance is essential to modern physics. A final idea―that we are psychologically predisposed to error when judging chance―is taken up through the work of Daniel Kahneman and Amos Tversky. Complete with a brief probability refresher, Ten Great Ideas about Chance is certain to be a hit with anyone who wants to understand the secrets of probability and how they were discovered.
Simpson's Paradox Part 2. This video is about how to tell whether or not university admissions are biased using statistics: aka, it's about Simpson's Paradox again!
Original Berkeley Grad Admissions Paper
Interactive Simpson’s Paradox Explainer
No Lawsuit, But Yes, Berkeley Study on Gender Bias
Statistics on college majors by gender:
Earnings by college major
Wall Street Journal Article on Simpson’s Paradox
We discuss properties of the "beamsplitter addition" operation, which provides a non-standard scaled convolution of random variables supported on the non-negative integers. We give a simple expression for the action of beamsplitter addition using generating functions. We use this to give a self-contained and purely classical proof of a heat equation and de Bruijn identity, satisfied when one of the variables is geometric.
The U.S.C./Los Angeles Times poll has consistently been an outlier, showing Donald Trump in the lead or near the lead.
Alone, he has been enough to put Mr. Trump in double digits of support among black voters. He can improve Mr. Trump’s margin by 1 point in the survey, even though he is one of around 3,000 panelists.
He is also the reason Mrs. Clinton took the lead in the U.S.C./LAT poll for the first time in a month on Wednesday. The poll includes only the last seven days of respondents, and he hasn’t taken the poll since Oct. 4. Mrs. Clinton surged once he was out of the sample for the first time in several weeks.
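The arithmetic behind a single panelist moving a poll by a point can be sketched as follows. This is only an illustration under stated assumptions: the excerpt above does not give his survey weight, so the value of 30 times an average respondent used here is an assumed figure, and the evenly split panel of about 3,000 is invented to keep the numbers simple. The function name `weighted_margin` is hypothetical as well.

```python
def weighted_margin(weights_a, weights_b):
    """Weighted margin of candidate A over candidate B, in percentage points."""
    total = sum(weights_a) + sum(weights_b)
    return 100.0 * (sum(weights_a) - sum(weights_b)) / total

# ASSUMPTION: the heavily weighted panelist counts 30x an average respondent.
ASSUMED_HEAVY_WEIGHT = 30.0

# Invented panel: everyone else has weight 1 and is split almost evenly.
others_a = [1.0] * 1499
others_b = [1.0] * 1500

with_him = weighted_margin(others_a + [ASSUMED_HEAVY_WEIGHT], others_b)
without_him = weighted_margin(others_a, others_b)
shift = with_him - without_him

print(f"margin with him:    {with_him:+.2f} points")
print(f"margin without him: {without_him:+.2f} points")
print(f"shift from one panelist: {shift:.2f} points")
```

Under these assumptions the one respondent accounts for roughly one point of margin, which is also enough to flip the leader in an otherwise tied sample once he drops out of the seven-day window.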
This is the signal for the second.
How can you not follow this twitter account?!
Now I’m waiting for a Shannon bot and a Wiener bot. Maybe a John McCarthy bot would be apropos too?!
In catching up on blogs/reading from the holidays, I’ve noticed that physicist Sean Carroll has a forthcoming book entitled The Big Picture: On the Origins of Life, Meaning, and the Universe Itself (Dutton, May 10, 2016) that will be of interest to many of our readers. One can already pre-order the book via Amazon.
Prior to the holidays Sean wrote a blogpost that contains a full overview table of contents, which will give everyone a stronger idea of its contents. For convenience I’ll excerpt it below.
I’ll post a review as soon as a copy arrives, but it looks like a strong new entry in the category of popular science books on information theory, biology and complexity as well as potentially the areas of evolution, the origin of life, and physics in general.
As a side bonus, for those reading this today (1/15/16), I’ll note that Carroll’s 12-part lecture series from The Great Courses, The Higgs Boson and Beyond (The Learning Company, February 2015), is 80% off.
THE BIG PICTURE: ON THE ORIGINS OF LIFE, MEANING, AND THE UNIVERSE ITSELF
* Part One: Cosmos
- 1. The Fundamental Nature of Reality
- 2. Poetic Naturalism
- 3. The World Moves By Itself
- 4. What Determines What Will Happen Next?
- 5. Reasons Why
- 6. Our Universe
- 7. Time’s Arrow
- 8. Memories and Causes
* Part Two: Understanding
- 9. Learning About the World
- 10. Updating Our Knowledge
- 11. Is It Okay to Doubt Everything?
- 12. Reality Emerges
- 13. What Exists, and What Is Illusion?
- 14. Planets of Belief
- 15. Accepting Uncertainty
- 16. What Can We Know About the Universe Without Looking at It?
- 17. Who Am I?
- 18. Abducting God
* Part Three: Essence
- 19. How Much We Know
- 20. The Quantum Realm
- 21. Interpreting Quantum Mechanics
- 22. The Core Theory
- 23. The Stuff of Which We Are Made
- 24. The Effective Theory of the Everyday World
- 25. Why Does the Universe Exist?
- 26. Body and Soul
- 27. Death Is the End
* Part Four: Complexity
- 28. The Universe in a Cup of Coffee
- 29. Light and Life
- 30. Funneling Energy
- 31. Spontaneous Organization
- 32. The Origin and Purpose of Life
- 33. Evolution’s Bootstraps
- 34. Searching Through the Landscape
- 35. Emergent Purpose
- 36. Are We the Point?
* Part Five: Thinking
- 37. Crawling Into Consciousness
- 38. The Babbling Brain
- 39. What Thinks?
- 40. The Hard Problem
- 41. Zombies and Stories
- 42. Are Photons Conscious?
- 43. What Acts on What?
- 44. Freedom to Choose
* Part Six: Caring
- 45. Three Billion Heartbeats
- 46. What Is and What Ought to Be
- 47. Rules and Consequences
- 48. Constructing Goodness
- 49. Listening to the World
- 50. Existential Therapy
- Appendix: The Equation Underlying You and Me
- Further Reading
While browsing through some textbooks and researchers today, I came across a fantastic-looking title: Probability Models for DNA Sequence Evolution by Rick Durrett (Springer, 2008). While searching his website at Duke, I noticed that he’s made a .pdf copy of a LaTeX version of the 2nd edition available for download. I hope others find it as interesting and useful as I do.
I’ll also give him a shout-out for being a mathematician with a fledgling blog: Rick’s Ramblings.