Welcome to the second edition of the Smarter Living newsletter.
What is a rel=”me”?
Rel=”me” is a microformat tag put on hyperlinks that indicates that the page linked to is another representation of the person who controls the site/page you’re currently looking at. Thus on my home page the Facebook bug has a link to my Facebook account, which is another representation of me on the web, so it has a rel=”me” tag on it.
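For anyone who wants to count the rel=”me” links on a page of their own, here is a minimal sketch using only Python’s standard library. The class and function names (RelMeCollector, find_rel_me) and the sample URLs are made up for illustration; this is not how any particular crawler or IndieWeb tool does it.

```python
from html.parser import HTMLParser


class RelMeCollector(HTMLParser):
    """Collect href values of <a> and <link> tags whose rel list includes "me"."""

    def __init__(self):
        super().__init__()
        self.rel_me_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list, e.g. rel="me noopener"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "me" in rel_tokens and href:
            self.rel_me_urls.append(href)


def find_rel_me(html):
    """Return every rel="me" URL found in an HTML string, in document order."""
    collector = RelMeCollector()
    collector.feed(html)
    return collector.rel_me_urls


# Hypothetical markup; the example.* hosts are placeholders, not real profiles.
sample = """
<a href="https://facebook.example/someone" rel="me">Facebook</a>
<a href="https://twitter.example/someone" rel="me noopener">Twitter</a>
<a href="https://example.org/unrelated">Not me</a>
"""

print(find_rel_me(sample))
```

Running the same function over a saved copy of a real page and taking len() of the result gives the kind of count discussed here.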
His data is a bit old as I now maintain a page entitled Social Media Accounts and Links with some (but far from all) of my disparate and diverse social media accounts. That page currently has 190 rel=”me”s on it! While there was one other example that had rel-mes pointing to every other internal page on the site (at 221, if I recall), I’m proud to say, without gaming the system in such a quirky way, that each and every one of the rel=”me” URLs is indeed a full legitimate use of the tag.
I’m proud to be at the far end of the Zipf tail for this. And even more proud to be tagged as such during the week in which Microformats celebrates its 12th birthday. But for those doing research or who need edge cases of rel-me use, I’m also happy to serve as a unique test case. (If I’m not mistaken, I think my Google+ page broke one of Ryan’s web crawlers/tools in the past for a similar use-case a year or two ago).
The Moral of the Story
The takeaway from this seemingly crazy and obviously laughable example is just how fragmented one’s online identity can become by using social silos. Even more interesting for some is the number of sites on that page which either no longer have links or which are crossed out, indicating that they no longer resolve. This means those sites and thousands more are now gone from the internet, and along with them all of the data that they contained, not only for me but for thousands or even millions of other users.
This is one of the primary reasons that I’m a member of the Indieweb, have my own domain, and try to own all of my own data.
While it seemed embarrassing for a moment (yes, I could hear the laughter even in the live stream folks!), I’m glad Ryan drew attention to my rel-me edge case in part because it highlights some of the best reasons for being in the Indieweb.
(And by the way Ryan, thanks for a great presentation! I hope everyone watches the full video and checks out the new site/tool!)
Ask questions in a live Nextcloud Q&A Hangout with Frank Karlitschek and Jos Poortvliet, moderated by Bryan Lunduke at 18:00 Berlin/Amsterdam/Paris time, 10:00 AM Pacific time on June 2nd, 2016.
It has taken all of us to build the web we have, and now it is up to all of us to build the web we want – for everyone
tantek [10:07 AM]
I made a minor cassis.js auto_link bug fix that is unlikely to affect folks (involves a parameter to explicitly turn off embeds)
(revealed by my own posting UI, so selfdogfooding FTW)
tantek [10:10 AM]
/me realizes his upcoming events on his home page are out of date, again. manual hurts.
Tantek’s thoughts and the reference to selfdogfooding, while I’m thinking about food, make me think that there’s an analogy between people who choose to eat at restaurants versus those who cook at home and the way people produce websites/content on the internet.
The IndieWeb is made of people who are “cooking” their websites at home. In some sense I hope we’re happier, healthier, and better/smarter communicators as a result, but it also makes me think about people who can’t afford to eat or afford internet access.
Are silos the equivalent of fast food? Are too many people consuming content that isn’t good for them and becoming intellectually obese? Would there be more thought and intention if there were more home chefs making and consuming content in smaller batches? Would it be more nutritious and mentally valuable?
I think there’s some value hiding in extending this comparison.
Most users on the web spend most of their time in apps. The most popular of those apps, like Facebook, Twitter, Gmail, Tumblr and others, are primarily focused on a single, simple stream that offers a river of news which users can easily scroll through, skim over, and click on to read in more depth.
Most media companies on the web spend all of their effort putting content into content management systems which publish pages. These pages work essentially the same way that pages have worked since the beginning of the web, with a single article or post living at a particular address, and then tons of navigation and cruft (and, usually, advertisements) surrounding that article.
Users have decided they want streams, but most media companies are insisting on publishing more and more pages. And the systems which publish the web are designed to keep making pages, not to make customized streams.
It's time to stop publishing web pages.
The Web grows every day. Tools, approaches, and styles change constantly, and keeping up is a challenge. We've compiled the best insights from subject matter experts for you in one place, so you can dive deep into the latest of what's happening in web development.
Fake news is the easiest of the problems to fix.
…a new set of ways to report and share news could arise: a social network where the sources of articles were highlighted rather than the users sharing them. A platform that makes it easier to read a full story than to share one unread. A news feed that provides alternative sources and analysis beneath every shared article.
This sounds like the kind of platform I’d like to have. It’s reminiscent of some of the discussion at the beginning of This Week in Google: episode 379 Ixnay on the Eet-tway.
I suspect that some of the recent coverage of “fake news” and how it’s being shared on social media has prompted me to begin using Reading.am, a bookmarking-esque service that commands users to:
Share what you’re reading. Not what you like. Not what you find interesting. Just what you’re reading.
Naturally, in IndieWeb fashion, I’m also posting these read articles to my site. While bookmarks are things that I would implicitly like to read in the near future (rather than “Christmas ornaments” I want to impress people with on my “social media Christmas tree”), there’s a big difference between them and things that I’ve actually read through and thought about.
I always feel like many of my family, friends, and the general public click “like” or “share” on articles in social media without actually having read them from top to bottom. Research would generally suggest that I’m not wrong. Some argue that the research needs to be more subtle too. I generally refuse to participate in this type of behavior if I can avoid it.
Some portion of what I physically read isn’t shared, but at least those things marked as “read” here on my site are things that I’ve actually gone through the trouble to read from start to finish. When I can, I try to post a few highlights I found interesting along with any notes/marginalia (lately I’m loving the service Hypothes.is for doing this) on the piece to give some indication of its interest. I’ll also often try to post some of my thoughts on it, as I’m doing here.
Gauging Intent of Social Signals
I feel compelled to mention here that on some platforms like Twitter, I don’t generally use the “like” functionality to indicate that I’ve actually liked a tweet itself or any content that’s linked to in it. In fact, I’ve often not read anything related to the tweet but the simple headline presented in the tweet itself.
The majority of the time I’m liking/favoriting something on Twitter, it’s because I’m using an IFTTT.com applet which takes the tweets I “like” and saves them to my Pocket account, where I come back to them later to read. It’s not the case that I actually read everything in my Pocket queue, but those that I do read will generally appear on my site.
There are, however, some extreme cases in which pieces of content are a bit beyond the pale for indicating a like on, and in those cases I won’t do so, but will manually add them to my reading queue. For some this may create some grey area about my intent when viewing things like my Twitter likes. Generally I’d recommend people view that feed as a generic linkblog of sorts. On Twitter, I far preferred the nebulous star indicator over the current heart for indicating how I used and continue to use that bit of functionality.
I’ll also mention that I sometimes use the like/favorite functionality on some platforms to indicate to respondents that I’ve seen their post/reply. This type of usage could also be viewed as a digital “Thank You”, “hello”, or even “read receipt” of sorts since I know that the “like” intent is pushed into their notifications feed. I suspect that most recipients receive these intents as I intend them though the Twitter platform isn’t designed for this specifically.
I wish that there was a better way for platforms and their readers to know exactly what users’ intent was rather than trying to intuit it. It would be great if Twitter allowed users multiple options under each tweet to better indicate whether their intent was to bookmark, like, or favorite it, or to indicate that they actually read/watched the content on the other end of the link in the tweet.
In true IndieWeb fashion, because I can put these posts on my own site, I can directly control not only what I post, but I can be far more clear about why I’m posting it and give a better idea about what it means to me. I can also provide footnotes to allow readers to better see my underlying sources and judge for themselves their authenticity and actual gravitas. As a result, hopefully you’ll find no fake news here.
Of course the ensuing question is: “How does one scale this type of behaviour up?”
The biggest change in the intervening time is the spread of the internet, which supplies a broad variety of related websites with not only interesting resources for things like basic reading and writing, but even audio sources, apparently including the nightly news in Latin. There are a variety of blogs on Latin as well as online courseware, podcasts, pronunciation recordings, and even free textbooks. I’ve written briefly about the RapGenius platform before, but I feel compelled to mention it as a potentially powerful resource as well (Julius Caesar, Seneca, Ovid, Cicero, et al.). There is a paucity of these sources in a general sense in comparison with other modern languages, but given the size of the niche, there is quite a lot out there, and certainly a mountain in comparison to what existed only twenty years ago.
There has also been a spread of pedagogic aids like flashcard software including Anki and Mnemosyne with desktop, web-based, and even mobile-based versions making learning available in almost any situation. The psychology and learning research behind these types of technologies has really come a long way toward assisting students to best make use of their time in learning and retaining what they’ve learned in long term memory. Simple mobile applications like Duolingo exist for a variety of languages – though one doesn’t currently exist for classical Latin (yet).
The other great change is the advancement of the digital humanities, which allows for a lot of interesting applications of knowledge acquisition. One in particular that I ran across this week was the Dickinson College Commentaries (DCC). Specifically, a handful of scholars have compiled and documented a list of the most common core vocabulary words in Latin (and in Greek) based on their frequency of appearance in extant works. This very specific data is of interest to me in relation to my work in information theory, but it also becomes a tremendously handy tool when attempting to learn and master a language. It is a truly impressive fact that memorizing and mastering about 250 words in Latin allows one to read and understand 50% of most written Latin. Further, knowledge of 1,500 Latin words will put one at the 80% level of vocabulary mastery for most texts. Mastering even a very small list of vocabulary allows one to read a large variety of texts very comfortably. I can only think about the old concept of a concordance (which was generally limited to heavily studied texts like the Bible or possibly Shakespeare) which has now been put on some serious steroids for entire cultures. Another half step and one arrives at the Google Ngram Viewer.
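The coverage figures above come from counting how often each word appears and adding up the shares of the most frequent ones. A rough sketch of that arithmetic follows; the function names and the twelve-word toy “corpus” are my own illustration and have nothing to do with how the DCC list was actually built.

```python
from collections import Counter


def coverage_by_rank(tokens):
    """Cumulative fraction of running text covered by the top-k words, for k = 1..V."""
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    fractions = []
    for _, count in counts.most_common():
        covered += count
        fractions.append(covered / total)
    return fractions


def words_needed(tokens, target):
    """Smallest number of distinct words whose occurrences cover at least `target` of the text."""
    for rank, fraction in enumerate(coverage_by_rank(tokens), start=1):
        if fraction >= target:
            return rank
    return None


# Toy sample only; a real estimate needs a large corpus and lemmatized forms.
toy_text = "et in et non et est in ad et non est rex".split()

print(words_needed(toy_text, 0.5))
print(words_needed(toy_text, 0.8))
```

With a real corpus the same two functions would show how quickly the curve flattens: the first few hundred words buy most of the coverage, and each additional word buys less.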
The best part is that one can, with very little technical knowledge, easily download the DCC Core Latin Vocabulary (itself a huge research undertaking) and upload and share it through the Anki platform, for example, to benefit a fairly large community of other scholars, learners, and teachers. With a variety of easy-to-use tools, shortly it may be even that much easier to learn a language like Latin – potentially to the point that it is no longer a dead language. For those interested, you can find my version of the shared DCC Core Latin Vocabulary for Anki online; the DCC’s Chris Francese has posted details and a version for Mnemosyne already.
[Editor’s note: Anki’s web service occasionally clears decks of cards from their servers, so if you find that the Anki link to the DCC Core Latin is not working, please leave a comment below, and we’ll re-upload the deck for shared use.]
What tools and tricks do you use for language study and pedagogy?
Computer pioneer who helped create the first spreadsheet, Bob Frankston, is this week's guest.
On a recent episode of Leo Laporte and Tom Merritt’s show Triangulation, they interviewed Bob Frankston of VisiCalc fame. They had a great discussion of the current state of broadband in the U.S. and how it might be much better. They get just a bit technical in places, but it’s a fantastic and very accessible discussion of the topic of communications that every American should be aware of.
Profound as it may be, the Internet revolution still pales in comparison to that earlier revolution that first brought screens into millions of homes: the TV revolution. Americans still spend more of their non-sleep, non-work time watching TV than on any other activity. And now the immovable object (the couch potato) and the irresistible force (the business-model destroying Internet) are colliding.
For decades, the limitations of technology only allowed viewers to watch TV programs as they were broadcast. Although limiting, this way of watching TV has the benefit of simplicity: the viewer only has to turn on the set and select a channel. They then get to see what was deemed broadcast-worthy at that particular time. This is the exact opposite of the Web, where users type a search query or click a link and get their content whenever they want. Unsurprisingly, TV over the Internet, a combination that adds Web-like instant gratification to the TV experience, has seen an enormous growth in popularity since broadband became fast enough to deliver decent quality video. So is the Internet going to wreck TV, or is TV going to wreck the Internet? Arguments can certainly be made either way.
The process of distributing TV over a data network such as the Internet, a process often called IPTV, is a little more complex than just sending files back and forth. Unless, that is, a TV broadcast is recorded and turned into a file. The latter, file-based model is one that Apple has embraced with its iTunes Store, where shows are simply downloaded like any other file. This has the advantage that shows can be watched later, even when there is no longer a network connection available, but the download model doesn’t exactly lend itself to live broadcasts—or instant gratification, for that matter.
Most of the new IPTV services, like Netflix and Hulu, and all types of live broadcasts use a streaming model. Here, the program is sent out in real time. The computer (or, usually by way of a set-top box, the TV) decodes the incoming stream of audio and video and then displays it pretty much immediately. This has the advantage that the video starts within seconds. However, it also means that the network must be fast enough to carry the audio/video at the bitrate that it was encoded with. The bitrate can vary a lot depending on the type of program (talking heads compress a lot better than car crashes), but for standard definition (SD) video, think two megabits per second (Mbps).
To get a sense of just how significant this 2Mbps number is, it’s worth placing it in the context of the history of the Internet, as it has moved from transmitting text to images to audio and video. A page of text that takes a minute to read is a few kilobytes in size. Images are tens to a few hundred kilobytes. High quality audio starts at about 128 kilobits per second (kbps), or about a megabyte per minute. SD TV can be shoehorned into some two megabits per second (Mbps), or about 15 megabytes per minute. HDTV starts around 5Mbps, or about 40 megabytes per minute. So someone watching HDTV over the Internet uses about the same bandwidth as half a million early-1990s text-only Web surfers. Even today, watching video uses at least ten times as much bandwidth as non-video use of the network.
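The per-minute figures above are straightforward unit conversions (bits to bytes, seconds to minutes). A small sketch, with a helper name of my own choosing and decimal megabytes assumed:

```python
def megabytes_per_minute(bits_per_second):
    """Convert a stream bitrate to megabytes per minute (1 MB = 1,000,000 bytes)."""
    bytes_per_second = bits_per_second / 8
    return bytes_per_second * 60 / 1_000_000


# Bitrates quoted in the text.
for label, bitrate in [
    ("128 kbps audio", 128_000),
    ("2 Mbps SD video", 2_000_000),
    ("5 Mbps HD video", 5_000_000),
]:
    print(f"{label}: {megabytes_per_minute(bitrate):.2f} MB per minute")
```

This gives roughly 0.96, 15 and 37.5 megabytes per minute, which the text rounds to "about a megabyte", 15 and 40.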
In addition to raw capacity, streaming video also places other demands on the network. Most applications communicate through TCP, a layer in the network stack that takes care of retransmitting lost data and delivering data to the receiving application in the right order. This is despite the fact that the IP packets that do TCP’s bidding may arrive out of order. And when the network gets congested, TCP’s congestion control algorithms slow down the transmission rate at the sender, so the network remains usable.
However, for real-time audio and video, TCP isn’t such a good match. If a fraction of a second of audio or part of a video frame gets lost, it’s much better to just skip over the lost data and continue with what follows, rather than wait for a retransmission to arrive. So streaming audio and video tended to run on top of UDP rather than TCP. UDP is the thinnest possible layer on top of IP and doesn’t care about lost packets and such. But UDP also means that TCP’s congestion control is out the door, so a video stream may continue at full speed even though the network is overloaded and many packets—also from other users—get lost. However, more advanced streaming solutions are able to switch to lower quality video when network conditions worsen. And Apple has developed a way to stream video using standard HTTP on top of TCP, by splitting the stream into small files that are downloaded individually. Should a file fail to download because of network problems, it can be skipped, continuing playback with the next file.
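The "small files, skip the ones that fail" idea can be sketched without any real network code. Everything below is a simulation under stated assumptions: play_segments, fake_fetch and the seg*.ts file names are invented for illustration, and a real player would fetch over HTTP, buffer ahead and switch quality levels.

```python
def play_segments(segment_names, fetch):
    """Play segments in order; a segment that fails to download is skipped."""
    played, skipped = [], []
    for name in segment_names:
        try:
            data = fetch(name)
        except OSError:
            # Don't wait for a retry: drop this chunk and continue with the next one.
            skipped.append(name)
            continue
        played.append((name, len(data)))
    return played, skipped


def fake_fetch(name):
    """Stand-in for an HTTP GET; the third segment simulates a network failure."""
    if name == "seg3.ts":
        raise OSError("simulated timeout")
    return b"\x00" * 1024


segments = [f"seg{i}.ts" for i in range(1, 6)]
played, skipped = play_segments(segments, fake_fetch)
print("played:", [name for name, _ in played])
print("skipped:", skipped)
```

The viewer loses a few seconds of video where the skipped segment would have been, but playback never stalls waiting for a retransmission.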
Where are the servers? Follow the money
Like any Internet application, streaming of TV content can happen from across town or across the world. However, as the number of users increases, the costs of sending such large amounts of data over large distances become significant. For this reason, content delivery networks (CDNs), of which Akamai is probably the most well-known, try to place servers as close to the end-users as possible, either close to important interconnect locations where lots of Internet traffic comes together, or actually inside the networks of large ISPs.
Interestingly, it appears that CDNs are actually paying large ISPs for this privilege. This makes the IPTV business a lot like the cable TV business. On the Internet, the assumption is that both ends (the consumer and the provider of over-the-Internet services) pay their own ISPs for the traffic costs, and the ISPs just transport the bits and aren’t involved otherwise. In the cable TV world, this is very different. An ISP provides access to the entire Internet; a cable TV provider doesn’t provide access to all possible TV channels. Often, the cable companies pay for access to content.
So far, we’ve only been looking at IPTV over the public Internet. However, many ISPs around the world already provide cable-like service on top of ADSL or Fiber-To-The-Home (FTTH). With such complete solutions, the ISPs can control the whole service, from streaming servers to the set-top box that decodes the IPTV data and delivers it to a TV. This “walled garden” type of IPTV typically provides a better and more TV-like experience—changing channels is faster, image quality is better, and the service is more reliable.
Such an IPTV Internet access service is a lot like what cable networks provide, but there is a crucial difference: with cable, the bandwidth of the analog cable signal is split into channels, which can be used for analog or digital TV broadcasts or for data. TV and data don’t get in each other’s way. With IPTV on the other hand, TV and Internet data are communicating vessels: what is used by one is unavailable to the other. And to ensure a good experience, IPTV packets are given higher priority than other packets. When bandwidth is plentiful, this isn’t an issue, but when a network fills up to the point that Internet packets regularly have to take a backseat to IPTV packets, this could easily become a network neutrality headache.
Multicast to the rescue
Speaking of networks that fill up: for services like Netflix or Hulu, where everyone is watching their own movie or their own show, streaming makes a lot of sense. Not so much with live broadcasts. If 30 million people were to tune into Dancing with the Stars using streaming, that means 30 million copies of each IPTV packet must flow down the tubes. That’s not very efficient, especially given that routers and switches have the capability to take one packet and deliver a copy to anyone who’s interested. This ability to make multiple copies of a packet is called multicast, and it occupies territory between broadcasts, which go to everyone, and regular communications (called unicast), which go to only one recipient. Multicast packets are addressed to a special group address. Only systems listening for the right group address get a copy of the packet.
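To put numbers on the difference, here is a deliberately simplified model: one source feeding a handful of regional routers, which in turn feed the viewers. The function name, the region names and the viewer split are all invented; only the 30 million total comes from the example above.

```python
def copies_leaving_source(viewers_per_region, multicast):
    """Count how many copies of one packet the source has to send.

    Unicast: one copy per viewer.
    Multicast: one copy per regional link that has at least one listener;
    the routers downstream make the remaining copies.
    """
    if multicast:
        return sum(1 for viewers in viewers_per_region.values() if viewers > 0)
    return sum(viewers_per_region.values())


# Hypothetical split of a 30-million-viewer audience.
audience = {"east": 12_000_000, "central": 8_000_000, "west": 10_000_000, "islands": 0}

print("unicast copies:", copies_leaving_source(audience, multicast=False))
print("multicast copies:", copies_leaving_source(audience, multicast=True))
```

In this toy topology the source sends 30,000,000 copies per packet with unicast and 3 with multicast; the region with no listeners gets nothing at all.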
Multicast is already used in some private IPTV networks, but it has never gained traction on the public Internet. Partially, this is a chicken/egg situation, where there is no demand because there is no supply and vice versa. But multicast is also hard to make work as the network gets larger and the number of multicast groups increases. However, multicast is very well suited to broadcast type network infrastructures, such as cable networks and satellite transmission. Launching multiple satellites that just send thousands of copies of the same packets to thousands of individual users would be a waste of perfectly good rockets.
Peer-to-peer and downloading
Multicast works well for a relatively limited number of streams that are each watched by a reasonably sized group of people—but having very many multicast groups takes up too much memory in routers and switches. For less popular content, there’s another delivery method that requires no or few streaming servers: peer-to-peer streaming. This was the technology used by the Joost service in 2007 and 2008. With peer-to-peer streaming, all the systems interested in a given stream get blocks of audio/video data from upstream peers, and then send those on to downstream peers. This approach has two downsides: the bandwidth of the stream has to be limited to fit within the upload capacity of most peers, and changing channels is a very slow process because a whole new set of peers must be contacted.
For less time-critical content, downloading can work very well. Especially in a form like podcasts, where an RSS feed allows a computer to download new episodes of shows without user intervention. It’s possible to imagine a system where regular network TV shows are made available for download one or two days before they air—but in encrypted form. Then, “airing” the show would just entail distributing the decryption keys to viewers. This could leverage unused network capacity at night. Downloads might also happen using IP packets with a lower priority, so they don’t get in the way of interactive network use.
IP addresses and home networks
A possible issue with IPTV could be the extra IP addresses required. There are basically two approaches to handling this issue: the one where the user is in full control, and the one where an IPTV service provider (usually the ISP) has some control. In the former case, streaming and downloading happens through the user’s home network and no extra addresses are required. However, wireless home networks may not be able to provide bandwidth with enough consistency to make streaming work well, so pulling Ethernet cabling may be required.
When the IPTV provider provides a set-top box, it’s often necessary to address packets toward that set-top box, so the box must be addressable in some way. This can eat up a lot of addresses, which is a problem in these IPv4-starved times. For really large ISPs, the private address ranges in IPv4 may not even be sufficient to provide a unique address to every customer. Issues in this area are why Comcast has been working on adopting IPv6 in the non-public part of its network for many years. When an IPTV provider provides a home gateway, this gateway is often outfitted with special quality-of-service mechanisms that make (wireless) streaming work better than run-of-the-mill home gateways that treat all packets the same.
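The claim about private address space is easy to check with Python’s ipaddress module. The three blocks below are the standard RFC 1918 private ranges; the set-top box count is a made-up figure for illustration, not a number from any ISP.

```python
import ipaddress

# The three private IPv4 blocks defined in RFC 1918.
PRIVATE_RANGES = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]

sizes = {cidr: ipaddress.ip_network(cidr).num_addresses for cidr in PRIVATE_RANGES}
total_private = sum(sizes.values())

for cidr, size in sizes.items():
    print(f"{cidr}: {size:,} addresses")
print(f"total: {total_private:,} addresses")

# Hypothetical deployment size, chosen only to show the shortfall.
set_top_boxes = 20_000_000
print("fits in private space:", set_top_boxes <= total_private)
```

All three blocks together hold just under 18 million addresses, so a provider with more devices than that cannot give each one a unique private IPv4 address.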
Predicting the future
Converging to a single IP network that can carry the Web, other data services, telephony, and TV seems like a no-brainer. The phone companies have been working on this for years because that will allow them to buy cheap off-the-shelf routers and switches, rather than the specialty equipment they use now. So it seems highly likely that in the future, we’ll be watching our TV shows over the Internet—or at least over an IP network of some sort. The extra bandwidth required is going to be significant, but so far, the Internet has been able to meet all challenges thrown at it in this area. Looking at the technologies, it would make sense to combine nightly pushed downloads for popular non-live content, multicast for popular live content, and regular streaming or peer-to-peer streaming for back catalog shows and obscure live content.
However, the channel flipping model of TV consumption has proven to be quite popular over the past half century, and many consumers may want to stick with it—for at least part of their TV viewing time. If nothing else, this provides an easy way to discover new shows. The networks are also unlikely to move away from this model voluntarily, because there is no way they’ll be able to sell 16 minutes of commercials per hour using most of the other delivery methods. However, we may see some innovations. For instance, if you stumble upon a show in progress, wouldn’t it be nice to be able to go back to the beginning? In the end, TV isn’t going anywhere, and neither is the Internet, so they’ll have to find a way to live together.
Correction: The original article incorrectly stated that cable providers get paid by TV networks. For broadcast networks, cable operators are required by the law’s “must carry” provisions to carry all of the TV stations broadcast in a market. Ars regrets the error.
What I’ve found most interesting in many of these debates, including this one, is that though there is occasional discussion of building out additional infrastructure to provide additional capacity, there is generally never discussion of utilizing information theory to improve bandwidth either mathematically or from an engineering perspective. Claude Shannon is rolling in his grave.
Apparently, despite last year’s great “digital switch” in television frequencies from analog to provide additional television capacity and the subsequent auction of the 700MHz spectrum, everyone forgets that engineering additional capacity is often cheaper and easier than just physically building more. Shannon’s original limit is far from a reality, so we know there’s much room for improvement here, particularly since most of the progress toward his limit in the past two decades has come from the research in and growth of the mobile communications industry.
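For reference, the limit in question is the Shannon-Hartley capacity C = B log2(1 + S/N), where B is the channel bandwidth in hertz and S/N is the signal-to-noise ratio. A minimal sketch follows; the 6 MHz bandwidth and 20 dB signal-to-noise ratio are illustrative inputs I picked, not measurements of any real channel.

```python
import math


def db_to_linear(decibels):
    """Convert a power ratio in decibels to a plain ratio."""
    return 10 ** (decibels / 10)


def shannon_capacity(bandwidth_hz, snr_linear):
    """Shannon-Hartley upper bound on error-free bit rate, in bits per second."""
    return bandwidth_hz * math.log2(1 + snr_linear)


# Illustrative inputs only: a 6 MHz channel at 20 dB signal-to-noise ratio.
capacity = shannon_capacity(6_000_000, db_to_linear(20))
print(f"{capacity / 1_000_000:.1f} Mbps")
```

Any real modulation and coding scheme lands somewhere below this bound, and the gap between the two is the room for improvement I’m talking about.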
Perhaps our leaders could borrow a page from JFK in launching the space race in the 1960s, but instead of focusing on space, they might look at science and mathematics in making our communications infrastructure more robust and guaranteeing free and open internet access to all Americans.
But tonight Twitter began to change the landscape of how Hollywood, and in particular the representation segment, does its day-to-day business.
It began with the news that Alyssa Milano’s ABC series ROMANTICALLY CHALLENGED, which premiered on April 19th earlier this year, had been cancelled. Michael Ausiello of the Ausiello Files for Entertainment Weekly broke the story online at 7:44 pm (Pacific) and tweeted out the news. Alyssa Milano saw the news on Twitter about an hour later, and at 8:45 pm, she tweeted out her disappointment to the world.
Just found out #romanticallychallenged was cancelled through Twitter! How crazy is that shit?
— Alyssa Milano (@Alyssa_Milano)
Her agent/manager is going to have a fire to put out tomorrow, if it doesn’t burn itself into oblivion tonight! If anything, her agent typically could have or should have been among the first to know, generally being informed by the studio executive in charge of the project or potentially by the producer of the show, who would also have been in that first round to know about the cancellation. And following the news from the network, Alyssa should have been notified immediately.
Typically this type of news is treated like pure commodity within the representation world. A competing agent, particularly one who wanted a client like Alyssa to move to their agency, would dig up the early news, call her at home, break the bad news early, and fault the current representative for dropping the ball and not doing their job. Further, the agent would likely put together a group of several new scripts (which the servicing agent either wouldn’t have access to or wouldn’t have sent her) and have them sent over to her for her immediate consideration. Suddenly there’s an unhappy client who is seriously considering taking their business across the street.
The major difference here is that it isn’t a competing agent breaking the bad news, but the broader internet! Despite the brevity of the less than 140 characters Ms. Milano had, it’s quite obvious that she’s both shocked and a bit upset at the news. We cannot imagine that she’s happy with the source of the news; it’s very likely that her representation got an upset call this evening which they’re currently scurrying to verify and then put out the subsequent fire.
Beyond this frayed relationship, there is also the subsequent strain on the relationships between representation and the overseeing studio executive(s), studio/network chief, and potentially further between the Agency and the Network over what is certain to be one of the more expensive television talent deals in the business right now.
We’re sure there will be a few more agents, managers, and attorneys who sign up for Twitter accounts tomorrow and begin monitoring their clients’ brands more closely on the real-time web.
[As a small caveat to all of this, keep in mind that the show was picked up in early August last year and only aired four episodes premiering in April of this year, so from a technical point of view, the show’s cancellation isn’t a major surprise simply given the timing of the pick-up and the premiere, the promotional push behind the show, or the show’s ratings. Nevertheless, this is sure to have an effect on the flow of business.]