- DuckDuckGo recently announced an email privacy service that blocks user behavior tracking by advertisers and publishers. They join many others, including Brave, Firefox and Cloudflare, who are also tackling web privacy. We can now add Apple to the mix as well, since they recently planted their flag firmly on app privacy.
- This is becoming more than common. It's a trend. We are routinely hearing from big industry players who are aiming to squash user behavior tracking in as many digital spaces as possible. Yes, that plays well with their public image. But, it's more than that. It’s what we, as their customers, want.
- As private customers, we find this desirable. However, if you’re in an industry that relies upon side-channel user behavior data, it’s time to start reshaping your industry around an agreed-upon, in-channel protocol instead. Otherwise, the trend lines are not in your favor.
- Podcasting Stats Are Next
- Podcast attribution tracking has enjoyed a seemingly unassailable side-channel for many years: the IP address. But, starting with the next version of iOS, Apple has squarely taken aim at obfuscating IP addresses with their new Private Relay feature.
- Initially, Apple's Private Relay will target only unencrypted HTTP app traffic, DNS requests and Safari traffic. But, given Apple's track record, and the movements we see in the industry as a whole, podcast statistics and attribution companies need to see this for what it is: a call to action.
- Private Relay will not be the only service of its kind adopted by mobile phone platforms going forward. And, it certainly won’t stay restricted to this current use case. Over time, IP address cloaking and identity obfuscation are going to spread to the entire mobile operating system and to all mobile apps. Users want privacy on their mobile devices. And, device manufacturers are in the business of providing features that users want. It's inevitable.
- IP addresses have, over time, become a treasure trove of PII (personally identifiable information). It’s truly startling how much you can know about a person simply from their IP address. The fact that podcast stats rely so heavily on a data point that is now in the crosshairs of powerful, privacy-minded platform builders should wake up the entire industry.
- Going Beyond the IP Address
- We’ve identified two ways to cope with the loss of IP addresses and other private side-channel data:
- 1. Work with podcast app developers to create a URL parameter standard for podcast episode enclosures that anonymously identifies listening in an accurate and truly private way. This can be built quickly.
- 2. Take advantage of streaming micropayments to anonymously identify listening behaviors based on the Lightning TLV record specification that was developed as part of Podcasting 2.0. This is already being used today.
- These two options are not mutually exclusive. You don’t have to pick one or the other. One exists already, and the other can be built quickly. Both are valuable and neither one needs a nonprofit org, a steering committee or a $50,000 membership and annual recertification. These are straightforward problems that can be solved by simply working together in an open way - and, that's really what I'm proposing here.
- The "_ulid" URL Parameter
- URL query parameters are universal, simple and easy to check for adherence to a standard. Dan Benjamin came up with the idea of an anonymized, universal private listen identifier. And, over the past year, we've worked together to shape it into something that's very easy to implement.
- It looks like this:
- GET https://example.com/podcast/episode1.mp3?_ulid=6975bcb2-32b5-4d16-b002-15a68ada2234
- Whenever a listener on a podcast app taps the play button on an episode, or that episode autodownloads, a unique sha256 ULID value is created and sent along with the enclosure request. After the first download, or the first 90 seconds' worth of range requests, the ULID value is discarded and not sent again for at least 10 days. Once 10 days have elapsed, it makes sense to treat the next request as a "new" listen. So, a new ULID value is created, attached and then discarded in the same way.
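- To make the timing rules above concrete, here is a minimal sketch of how an app might handle them. Everything in it is an assumption for illustration, not part of any published standard: the `UlidTracker` class, its method names, and the choice to hash random bytes with SHA-256. The example value shown earlier is UUID-shaped, so the exact format is still open. The app is expected to call `discard()` itself once the first download finishes or 90 seconds of range requests have been served.

```python
import hashlib
import secrets
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

QUIET_PERIOD = 10 * 24 * 60 * 60  # the 10-day window described above, in seconds


def add_query_param(url, key, value):
    """Append key=value to a URL, keeping any query parameters already there."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


class UlidTracker:
    """Hypothetical per-app helper that decides when a _ulid is sent."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._active = {}        # enclosure URL -> _ulid currently being sent
        self._discarded_at = {}  # enclosure URL -> time the last _ulid was discarded

    def request_url(self, enclosure_url):
        """Return the URL to fetch, with a _ulid attached only when one is due."""
        discarded = self._discarded_at.get(enclosure_url)
        if discarded is not None and self._clock() - discarded < QUIET_PERIOD:
            return enclosure_url  # already counted recently: send no identifier
        ulid = self._active.get(enclosure_url)
        if ulid is None:
            # Assumption: SHA-256 over random bytes, so the value carries no
            # information about the user or the device.
            ulid = hashlib.sha256(secrets.token_bytes(32)).hexdigest()
            self._active[enclosure_url] = ulid
        return add_query_param(enclosure_url, "_ulid", ulid)

    def discard(self, enclosure_url):
        """Call after the first download or the first 90 seconds of range requests."""
        if self._active.pop(enclosure_url, None) is not None:
            self._discarded_at[enclosure_url] = self._clock()
```

- The tracker stores nothing but enclosure URLs and timestamps on the device, which is the whole point: the identifier never outlives the listen it describes.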
- When analyzing logs, start by isolating the downloads from User-Agents that are known to support the ULID standard. Any of those downloads that do not contain a "_ulid" parameter can be discarded as duplicates. All other downloads, from non-ULID-compliant User-Agents, would continue to be treated as they are now.
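- On the hosting side, that log rule could look something like the sketch below. The `LogEntry` shape, the `ULID_AWARE_AGENTS` allowlist and the app name in it are all made up for illustration. Counting repeated requests that carry the same "_ulid" as a single listen is also my assumption, based on the value being reused across range requests.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs, urlsplit

# Hypothetical allowlist: User-Agent substrings of apps known to send _ulid.
ULID_AWARE_AGENTS = ("ExamplePodcastApp",)


@dataclass(frozen=True)
class LogEntry:
    user_agent: str
    url: str


def supports_ulid(user_agent):
    """True if the User-Agent belongs to an app on the allowlist."""
    return any(name in user_agent for name in ULID_AWARE_AGENTS)


def count_listens(entries):
    """Return (listens from ULID-aware apps, entries left for legacy handling)."""
    seen_ulids = set()
    legacy_entries = []
    for entry in entries:
        if not supports_ulid(entry.user_agent):
            legacy_entries.append(entry)  # treated the way downloads are today
            continue
        values = parse_qs(urlsplit(entry.url).query).get("_ulid")
        if not values:
            continue  # compliant app, no _ulid: a duplicate, so drop it
        seen_ulids.add(values[0])  # repeated range requests collapse to one listen
    return len(seen_ulids), legacy_entries
```

- Notice that no IP address appears anywhere in this code.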
- In this way, we can transition from IP address dependent attribution to something far more private. Whether the listener is at the office, at home, or out in the world, they can be properly attributed without risk of tracking or other shady business. And, because it doesn’t rely on an IP address, there is no need for complex IP-range source filtering. The analytics are easier on the hosting side, and the app remains in control of its listeners' privacy.
- We have already been using it on the Podcastindex.org website player for a few months. It uses browser local storage to "pin" the ULID value to an episode enclosure and handle the discard timing. This is just a proof of concept. You can see it being used here:
- This solution ensures listeners get the privacy they want and podcasters get the reliable, truthful listener data they need. This is not download counting. This is actual listener attribution. It's something that only the siloed streaming apps claim they are able to do. But, we’re doing it in the open RSS ecosystem as well.
- But what about fraud? Could someone game the numbers by generating fake downloads with unique "_ulid" values? Sure, but that can already be done. Fraud detection is always part of the game in the world of digital advertising and attribution. The aim isn't to design a system that is un-gameable. That isn't possible. The goal is to have a system that's simple and transparent enough to make fraud detection a fairly straightforward process.
- The other modern source of accurate listener behavior comes from the TLV records included within Podcasting 2.0 micropayments. Every minute that someone listens to a podcast episode, a rich set of anonymized data is sent back to the podcaster (and/or hosting platform) along with the payment. This data includes the payment amount, the timestamp within the episode that the payment was made, the name of the app, the podcast title, the episode title and even the playback speed.
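- As a rough illustration of what a host could do with one of these records, here is a sketch that decodes a JSON payload into a typed object. The key names used here ("podcast", "episode", "app_name", "ts", "value_msat", "speed") are assumptions chosen to match the fields listed above, so check them against the actual TLV record specification before relying on them. `ListenPayment` and `parse_payment` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ListenPayment:
    podcast: str             # podcast title
    episode: str             # episode title
    app_name: str            # app that sent the payment
    position_seconds: int    # timestamp within the episode
    amount_msat: int         # payment amount in millisatoshis
    speed: Optional[float]   # playback speed, if the app reported one


def parse_payment(raw: bytes) -> ListenPayment:
    """Decode one JSON-encoded record (key names are assumed, see above)."""
    data = json.loads(raw)
    speed = data.get("speed")
    return ListenPayment(
        podcast=data["podcast"],
        episode=data["episode"],
        app_name=data["app_name"],
        position_seconds=int(data["ts"]),
        amount_msat=int(data["value_msat"]),
        speed=float(speed) if speed is not None else None,
    )
```

- Nothing in the record says who the listener is. It only says what was played, where in the episode, and how much was paid.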
- There’s also a new "sender_id" field being built that holds an anonymized user identification hash (similar to the _ulid above) that is only identifiable within the app itself, yet still allows the podcaster to see that different people are listening. This information provides for beautiful charts like this one, where a podcaster can see exactly the most popular parts of their podcast episode:
- Clearly something happened at the 76-minute mark that resonated with listeners and caused them to send boost payments at that moment. This is immensely valuable information to a podcast creator. And just like the URL parameters mentioned earlier, it completely protects the listeners' privacy.
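- Finding a moment like that takes very little code. The sketch below assumes each payment has already been reduced to a pair of (seconds into the episode, amount in millisatoshis). The function names are hypothetical and the sample numbers are invented purely to show the shape of the calculation.

```python
from collections import defaultdict


def msat_by_minute(payments):
    """Total the payment amounts for each minute of the episode."""
    totals = defaultdict(int)
    for position_seconds, amount_msat in payments:
        totals[position_seconds // 60] += amount_msat
    return dict(totals)


def peak_minute(totals):
    """Return the minute of the episode that received the most value."""
    return max(totals, key=totals.get)


# Invented sample data: (seconds into the episode, millisatoshis paid).
sample = [(30, 1000), (4560, 5000), (4575, 50000), (4620, 1000)]
totals = msat_by_minute(sample)
print(peak_minute(totals))  # prints 76
```

- A chart is just these per-minute totals drawn as bars.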
- And, when it comes to fraud, it's hardly a problem here. This is fraud you'd have to pay to commit.
- But These Changes Would Require Apps To Do Something
- If you just uttered that phrase in your mind, you are not understanding the nature and scope of the problem. Go back to the top and read the first section again. Changing the way podcast listening attribution is done isn’t optional anymore. It’s mandatory. Tech industry signals are already being sent loud and clear. Relying on IP addresses (and thus on download attribution) is no longer a viable strategy for the future. It will continue to work for some period of time but, if you don’t plan for what comes next, the future of the podcast advertising business is at great risk.
- Now is the time to engage app developers and begin creating a standard (or set of standards) that gives everyone the key data they need and want. People like Bryan Barletta have been saying this for a long time. When it comes to privacy, he’s been encouraging hosts and publishers to form tighter relationships with app developers in order to push privacy all the way up the attribution chain. Nobody has seriously listened yet because they didn’t have to. Now they have to.
- Podcast app developers want to safeguard their customers’ privacy, because that’s what their customers want. And, since podcast apps are where the rubber meets the road in this industry, if attribution companies and podcasting hosts engage app developers now and create a standard that everyone can live with, there won’t be a future reckoning.
- For too long, there has been a sort of uneasy tension between hosting companies, attribution companies and podcasting apps. But, it doesn't have to be like that. It's only that way because of the helter-skelter way that the podcast industry came to maturity. Some app developers are also podcasters. They understand the importance of proper attribution. They just want to do it the right way for their customers. They have no other choice. And, if done right, everyone benefits.
- Well, maybe not everyone…
- The unfortunate truth is that some podcasters will not want to know how many actual listens they get. They'd rather stick with download numbers, and for obvious reasons. The inherent inaccuracy of podcast download statistics is only a secret to new podcasters. Everybody else in the industry already knows it. It's the reason the recent bug in the Apple Podcasts app could cut download numbers by 27% while hardly any listeners noticed. The only people who noticed were hosts and publishers. Those downloads were probably never real listens, but they got counted as such for all these years.
- Switching to true listen stats from download stats is bound to hurt some podcasters' numbers and some advertisers' pride. But, now is actually the perfect time to make this jump. A new generation of podcasting apps is being born, existing podcast apps are blossoming with new features and the stability of the giant, traditional podcasting silos is showing cracks.
- During this time of renewal and change, one truly bright spot has been a resurgence of open cooperation between so many different parts of the podcast industry. There has never been a better time to get out ahead of an impending problem and, for once, be proactive.
- Ok, I'm Convinced. Now What?
- The OPAWG (Open Podcast Analytics Working Group) is the natural place for these discussions to take place. They've been focused on industry-wide analytics for quite a while. I've created this issue within their "measurement" repo to propose a start to this work. If you are willing to get involved, this would be a good place to start.
- For Podcasting 2.0 value-enabled micropayments, you can get involved in the "podcast" namespace GitHub repo. We are gearing up for phase 4 of the namespace. It's a great time to get involved. For an overview of what Podcasting 2.0 is all about, this short document is a good place to start.
- And, finally, we constantly discuss the technology behind everything related to podcasting on the Podcastindex.social Mastodon instance. Please join us. No podcasting tech discussion is off limits there. Your voice would be welcome.