This book provides an introduction to statistical learning methods. It is aimed at upper-level undergraduate students, master’s students, and Ph.D. students in the non-mathematical sciences. The book also contains a number of R labs with detailed explanations of how to implement the various methods in real-life settings, and should be a valuable resource for a practicing data scientist.
For a more advanced treatment of these topics: The Elements of Statistical Learning.
Samvera is a versatile and feature rich repository solution that is being used by institutions worldwide to provide access to their digital content.
The best I could hope for back in 2008, and part of why I created the @JohnsHopkins Twitter handle, was that researchers would discover Twitter and be doing the types of things that some of the Johns Hopkins professors outlined in this recent article are now finally doing. It seems sad that it has taken over a decade and this article is really only highlighting the bleeding edge of the broader academic scene now. While what they’re doing is a great start, I think they really aren’t going far enough. They aren’t doing their audiences as much service as they could because there’s only so much that Twitter allows in terms of depth of ideas and expressiveness. It would be far better if they were doing this sort of work from their own websites and more directly interacting with their colleagues on the open web. The only value that Twitter is giving them is a veneer of reach to a broader audience, but they’re also opening themselves up to bigger attacks as is described in the article.
In addition to Kimberly’s example, another related area of potential innovation would be moving the journal clubs run by many research groups and labs online and opening them up. Want to open up science? Then let’s really do it! By bookmarking a variety of articles on their own websites, various members could be aggregated to contribute to a larger group, which could then use their own websites with protocols like Webmention or even simple tools like Hypothes.is to guide and participate in larger online conversations to move science communication along at an even faster pace. Greg McVerry and I have experimented in taking some of these tools into the classroom in the past.
If you think about it, arXiv and other preprint servers are really just journal clubs writ large. The problem is that they’re only communicating in one direction by aggregating the initial content, but they’re dramatically failing their audiences in that they aren’t facilitating or aggregating any open discussion around that content. As a result, the largest portion of their true value is still locked away in the individual brains of their readers rather than as commentary or even sentence level highlights and annotations on particular pieces out in the open. Often is the time that I’ll tweet about an interesting article only to receive a (lucky) reply that the results have been debunked, yet that information is almost never disclosed in or around the journal article (especially online) where it certainly belongs. Academic publishers are not only gouging us financially by siloing their content, they’re failing us far worse than most realize.
Another idea: Can’t get a journal of negative results to publish your latest research failure? Why not post a note or article on your own website to help out future researchers? (or even demonstrate to your students that not everything always works out?)
Naturally, having aggregation services like indieweb.xyz, building planets, using OPML subscriptions, or the coming wave of feed readers could make a lot of these things easier, but we’re already right on the cusp for people who are willing to take a shot at doing this type of research online on their own websites and out in the open.
Want to try out some of the above? I’m happy to help (gratis) researchers who’d like to experiment in the area to get themselves set up. Just send me a note or give me a call.
As a leader in the global movement toward open access to publicly funded research, the University of California is taking a firm stand by deciding not to renew its subscriptions with Elsevier. Despite months of contract negotiations, Elsevier was unwilling to meet UC’s key goal: securing universal open access to UC research while containing the rapidly escalating costs associated with for-profit journals.
I’ve been doing this for several years now and it gives me a lot more control over how much metadata I can add, change, or modify as I see fit. Let me know if I can help you do something similar.
Dear Friends, My book, Twitter and Tear Gas: The Power and Fragility of Networked Protest, is officially out today, as of May 16th! It is published by Yale University Press, and it weaves stories w…
Some news: there will be a free creative commons copy of my book. It will be available as a free PDF download in addition to being sold as a bound book. This is with the hopes that anyone who wants to read it can do so without worrying about the cost. However, this also means that I need to ask that a few people who can afford to do so to please consider purchasing a copy. This is not just so that Yale University Press can do this for more authors, but also because if it is not sold (at least a little bit!) in the initial few weeks, bookstores will not stock it and online algorithms will show it to fewer people. No sales will mean less visibility, and less incentive for publishers to allow other authors creative commons copies. ❧
I negotiated the creative commons copy with my (wonderful!) publisher Yale University Press because I really wanted to do what I could to share my insights as broadly as I could about social movements and the networked public sphere. If I make a penny more from this book because it sells well by some miracle, I will donate every extra penny to groups supporting refugees, and if I ever meet you in person and you purchased a copy of the book in support, please let me know and I’ll buy the coffee or beer. 😀 This isn’t at all about money for me.
An excellent example of academic samizdat
November 28, 2018 at 11:24AM
I don’t recall, though: is either of them open source, or do we need to rebuild by hand?
Six years ago I received an email from a colleague in the mathematics department at UC Berkeley asking me whether he should participate in a study that involved “collecting DNA from the brigh…
This post, Gowers’, and Tao’s are all excellent reasons for a more IndieWeb philosophical approach in academic blogging (and other scientific communication). Many of the respondents/commenters have little, if any, indication of their identities or backgrounds, which makes it eminently harder to judge or trust their bona fides within the discussion. Some even chose to remain anonymous and throw bombs. If each of the respondents were commenting (preferably using their real names) on their own websites and using the Webmention protocol, I suspect the discussion would have been richer and more worthwhile by an order of magnitude. Rivin at least had a linked Twitter account with an avatar, though I find it less than useful that his Twitter account is protected, a fact that makes me wonder if he’s only done so recently as a result of fallout from this incident. I do note that it at least appears his Twitter account links to his university website and vice versa, so there’s a high likelihood that they’re at least the same person.
I’ll also note that a commenter noted that they felt that their reply had been moderated out of existence, something which Lior Pachter certainly has the ability and right to do on his own website, but which could have been mitigated had the commenter posted their reply on their own website and syndicated it to Pachter’s.
Hiding in the comments, which are generally civil and even-tempered, there’s an interesting discussion about academic publishing that could have been its own standalone post. Beyond the science involved (or not) in this entire saga, a lot of the background for the real story is one of process, so this comment was one of my favorite parts.
A senior official at Memorial Sloan Kettering Cancer Center has received millions of dollars in payments from companies that are involved in medical research.
I’m kind of shocked that major publishers like Elsevier continually say they add so much value to the publishing chain, yet somehow, with all the major profits they (and others) are making, they don’t do these sorts of checks as a matter of course.
Running time: 0h 12m 59s
A researcher posts research work to their own website (as bookmarks, reads, likes, favorites, annotations, etc.). They can also post their data there for others to review, along with their ultimate publication.
The researcher’s post can webmention an aggregating website, similar to the way they would pre-print their research on a server like arXiv.org. The aggregating website can then parse the original and display the title, author(s), publication date, revision date(s), abstract, and even the full paper itself. This aggregator can act as a subscription hub (with WebSub technology) which other researchers can use to find, discover, and read the original research.
Readers of the original research can then write about, highlight, annotate, and even reply to it on their own websites to effectuate peer-review which then gets sent to the original by way of Webmention technology as well. The work of the peer-reviewers stands in the public as potential work which could be used for possible evaluation for promotion and tenure.
Readers of original research can post metadata relating to it on their own websites, including bookmarks, reads, likes, replies, annotations, etc., and send webmentions not only to the original but also to the aggregation sites, which could aggregate these responses. The responses could also be given point values based on interaction/engagement levels (e.g., bookmarking something as “want to read” is 1 point, whereas indicating one has read something is 2 points, replying to something is 4 points, and other publications that officially cite it provide 5 points). Such a scoring system could be used to provide a better citation measure of the overall value of a research article in a networked world. In general, Webmention could be used to provide a two-way auditable trail for citations, and the citation trail can be used in combination with something like the Vouch protocol to prevent gaming the system with spam.
Government institutions (like Library of Congress), universities, academic institutions, libraries, and non-profits (like the Internet Archive) can also create and maintain an archival copy of digital and/or printed copies of research for future generations. This would be necessary to guard against the death of researchers and their sites disappearing from the internet so as to provide better longevity.
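To make the response and scoring steps above a little more concrete, here is a minimal sketch in Python of what an aggregating site might do. The point values (1, 2, 4, 5) come from the description above. Everything else is an assumption made for illustration: the `Response` type, the kind labels such as `"want-to-read"`, the example URLs, and the rule that each source post counts once per kind. The `webmention_body` helper only builds the form-encoded `source`/`target` pair that a Webmention sender would POST to the receiver's endpoint; endpoint discovery, source verification, and Vouch checks are left out.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

# Point values from the text; the kind labels themselves are hypothetical.
POINTS = {
    "want-to-read": 1,  # bookmarked as "want to read"
    "read": 2,          # reader indicates they have read it
    "reply": 4,         # reader replied on their own site
    "citation": 5,      # another publication officially cites it
}


@dataclass(frozen=True)
class Response:
    source: str  # URL of the reader's post on their own website
    target: str  # URL of the research article being responded to
    kind: str    # one of the POINTS keys (assumed to be parsed from the source)


def webmention_body(response: Response) -> str:
    """Form-encoded body a sender would POST to the target's Webmention endpoint."""
    return urlencode({"source": response.source, "target": response.target})


def score(responses: list[Response], target: str) -> int:
    """Sum the points for one article, counting each (source, kind) pair once."""
    seen: set[tuple[str, str]] = set()
    total = 0
    for r in responses:
        key = (r.source, r.kind)
        # Skip responses to other articles, unknown kinds, and repeated sends.
        if r.target != target or r.kind not in POINTS or key in seen:
            continue
        seen.add(key)
        total += POINTS[r.kind]
    return total


# Hypothetical example data.
PAPER = "https://researcher.example/paper"
received = [
    Response("https://alice.example/bookmarks/1", PAPER, "want-to-read"),
    Response("https://alice.example/reads/2", PAPER, "read"),
    Response("https://bob.example/replies/7", PAPER, "reply"),
    Response("https://journal.example/articles/9", PAPER, "citation"),
    Response("https://alice.example/bookmarks/1", PAPER, "want-to-read"),  # resent
    Response("https://bob.example/replies/8", "https://other.example/paper", "reply"),
]

print(score(received, PAPER))  # 12
```

Counting each source once per kind is only one possible guard against inflated totals. A real aggregator would also need to fetch each source and confirm that it actually links to the target before awarding any points.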
Resources mentioned in the microcast
IndieWeb for Education
IndieWeb for Journalism
arXiv.org (an example pre-print server)
A Domain of One’s Own
Article on A List Apart: Webmentions: Enabling Better Communication on the Internet
Syndicating to Discovery sites
The world of scholarly communication is broken. Giant, corporate publishers with racketeering business practices and profit margins that exceed Apple’s treat life-saving research as a private commodity to be sold at exorbitant profits. Only around 25 per cent of the global corpus of research knowledge is ‘open access’, or accessible to the public for free and without subscription, which is a real impediment to resolving major problems, such as the United Nations’ Sustainable Development Goals.
So yes, more of the how to fix it piece please.
If you’ve already blogged about these in the past, then even links to those could be helpful to others using similar publishing practices in the future. Thoughts on brainstorming, best practices, pros/cons, could be highly useful as the landscape changes.