Audio vs Video: 2025's fiercest fight is its most pointless one
Should you pivot to video? And does it matter?
If you enjoy this post, or any of my posts, please consider taking out a premium subscription to Future Proof, so that I can keep writing and also feeding my dog the expensive kibble to which he has become accustomed.
INTRO.
It’s 2025, and one question is dominating all of podcastland: do I need to pivot to video?
The tides will turn, the moon will wane, the sun will set, and video will return. It is an inevitable part of our precious life cycle. Video has been the preoccupation of the media since it started to go digital, back in the 1990s. It is, in some ways, the apotheosis of the act of journalism, combining almost the whole range of tools available. You cannot produce video journalism without writing, you cannot produce it without audio. It is, in a sense, the largest of the Matryoshka dolls, containing everything else within it.
It is also expensive, time consuming and subject to a fickle advertising and distribution market. Indeed, this is what killed the first “pivot to video”, back in the 2010s. Digital publishers invested heavily in creating this content – because it required heavy investment – but they found it hard to extract revenue or even guarantee audiences. When the cost of creating a 5-minute video is ten times that of publishing an 800-word opinion piece, you have to have some confidence in it reaching excess eyeballs. Instead, this video content was all too regularly getting swamped by a sea of crap. The time wasn’t right.
Almost all of the issues that dogged the first ‘pivot to video’ have been solved. Broadcast technology is cheaper than ever. Editing video is easy to outsource and many of the time-consuming functions (such as subtitling, my personal bugbear) have been automated. YouTube has grown to be a powerful and ubiquitous platform, not just for native Gen Z but millennials and older too. For many publishers, creating a video show is now only twice (rather than 10x) as expensive and energy-sapping as making the same show in audio.
But what hasn’t been squared is the discoverability question. Indeed, I think this issue has actually been problematised by the emergence of a generation of non-professional video creators who are outputting incredibly high-spec content. The apparently unstoppable rise of AI, too, is bridging the gap between the quality of capital-intensive content being created by major publishers, and videos being cooked up in the basement of Joe Nobody’s mom’s house. Video has been normalised for consumption, but it has also been normalised for creation. If it weren’t for this, the answer to that question – “do I need to pivot to video?” – would be a simple, and resounding, “yes”. But, as it is, things are a bit more complicated.
I. THE BUSINESS PREFERENCE FOR VIDEO
The first thing to understand – and something that I think is much under-discussed in this subject area – is that there is a structural business preference for video over audio.
There are two extremely simple reasons for this. The first is that, if you’re an advertiser, there are few products that aren’t enhanced by visuals. “You’ll love the feeling you get from driving our latest hybrid,” the voice on the podcast advert purrs. “You’ll love the feeling you get from driving our latest hybrid,” the voice on the video advert purrs, whilst we see this ‘latest hybrid’ calmly taking a bend on some improbably quiet road in the California hills. Which do you think is more likely to inspire someone to start thinking about purchasing that car?
Here in the UK (and I assume the US is similar), digital advertising spend is dominated by products that are better suited to a video advert than an audio one. Around 16.4% of digital advertising is Fast-Moving Consumer Goods (FMCG), things like fizzy drinks, food or cleaning products. That makes them the biggest single industry for advertising spend, and all of that is better in a video ad. Travel and leisure is the next highest at about 13% – and clearly the same is true there. Telecoms is next (9.6%), which is the first industry where it’s a bit of a coin toss (though clearly the hardware part of that industry will prefer video, even if the networks are more ambivalent). Retail (8.9%) and Entertainment (7%) are next, and then, finally, we arrive at Financial Services (6.8%), which is the industry least disadvantaged by an audio-only ad.
Fine, advertisers may prefer video – may even pay a small premium for it – but that wouldn’t deter them from spending heavily on audio, given the reach of the markets. But there’s a second problem: the call to action. Advertising is basically a game of converting somebody who doesn’t know that they want your product into becoming an active consumer. Audio is a useful tool for brand awareness (it’s why I know of the existence of BetterHelp or Beer52 or Nutmeg from J.P.Morgan), and that’s a big part of advertising. Consumers are far less likely to select a product they’ve never heard of. But the other key thing is to create a pathway for that person – the consumer of your advert – to become a consumer of your product.
This has been the circle that digital media has squared for the advertising industry. Frustrated children of the 1990s, like myself, beat our tubby fists bloody against the window of the television, desperate for that Tamagotchi trapped within; children of the 2020s touch a single button on the iPad and they’ve accidentally bought an £800 weekend away at Peppa Pig World. The pathway from product discovery to purchase has never been more deliciously/terrifyingly (delete as applicable) convenient.
Meanwhile, over in podcastland, host-read adverts are still imploring users to go to www.terriblylongandinconvenientURL.com/buythisproductthatyoujustheardaboutandcantquiterememberthenameof and then use promo code PODCAST12#2025REMIX! for 3% off. It’s such a bad system that I avoid encouraging advertisers to give me a promo code. They want it as a metric of how many conversions a podcast is giving them, but really podcast advertising is only good for reinforcement. BetterHelp, BetterHelp, BetterHelp; Beer52, Beer52, Beer52; Nutmeg from J.P.Morgan, Nutmeg from J.P.Morgan, Nutmeg from J.P.Morgan.
All of this is to say that there is a force at play here that is often elided from conversations about video vs audio: the business interest. Both podcasting and video (which I will generally refer to as YouTubing, for simplicity’s sake) are still heavily reliant on advertising revenue, even while they try to roll out more subscriptions. And for as long as there has been a media, it has been shaped by revenue extraction. The entire architecture of a newspaper was built around its adverts. They were then kicked in the nuts by the emergence of digital advertising on websites (which ought to have been so much better), which flooded the world with penny advertising for crap products. As the digital advertising market has declined, the spend on advertising on YouTube has increased. Now you interrupt some bloke in the Netherlands (who Google believes has a household income of around €150,000 per annum) watching videos about tyre pressure, to show him a mandatory advert for your ‘latest hybrid’, complete with a single-click link that will take him through to financing options.
And so the business overlords who shape our media like a child manipulating Play-Doh sausages have made their verdict clear: pivot to video.
II. CREATOR END CHANGES
I alluded to it in the intro, but how we make content has changed significantly in the past few years. Indeed, I think the pace of change at present is as fast as it’s been in the digital era.
The Black Swan event that has reconfigured production standards was, undoubtedly, the covid-19 pandemic. Remote recording went from being a marginal pursuit used predominantly for foreign bulletins, to the dominant method of recording. But it wasn’t just the exogenous shock of the pandemic. This once in a lifetime jolt to recording standards (for TV, radio, podcasts, and YouTube) happened at a time when similar trends were already taking off. YouTube, for example, had been dominated, in its first decade, by to-camera, first-person videos. These were very intimate, very rudimentary, but formed the aesthetic grounding of the medium. Audiences acclimated to this lo-fi ‘look’ which enabled creators to get away with a very minimal production. No camera crew, limited editing, just personality driven broadcasting.
And, at the same time, the rise of streaming as a content medium was doubling down on this ‘look’. Streaming is a very communal experience, and streamers often stream ‘with’ other streamers, not to mention directly conversing with their ‘chat’. But the ‘with’, there, means, by default, remotely. Discord, the audio communications platform that underpins a lot of streaming content, was a game changer for streamers, just as platforms like Zoom and Riverside and Cleanfeed would shake things up, a few years later, for podcasters.
These were the foundations of YouTube, and they recalibrated audience expectations. At the same time, by 2020, the amount of money flooding into YouTube meant that we were witnessing the rise of near-TV quality productions. This is almost the inverse trend of the one described above. Think about channels like MrBeast or Dude Perfect or the Sidemen: these are high production quality channels, far more inspired by the breakneck pace of 90s MTV than the limited early YouTube ‘look’. Their ability to create that sort of content at a fraction of the price that television channels would have to pay is a key part of everything that’s followed. It’s a change that was precipitated by the rise of prosumer technologies, in terms of cameras, microphones and post-production software, as well as a ready surplus of volunteer labour. Because many of these channels didn’t set out as profit making endeavours (MrBeast is the most obvious/egregious example here), they exploited cheap and free labour, and reinvested what little revenue they had into the production.
The effect has been to drink TV’s milkshake, for a whole generation of viewers. An Ofcom survey last year found that within Gen Z (defined, in the survey, as 16-24 year olds), the average amount of live TV consumed per day was 20 minutes (compared to 281 minutes per day for over-75s) whereas the amount of video viewed on ‘video sharing sites’ (largely YouTube) was 93 minutes. The survey found that it wasn’t until the 35-44 demographic that the ratio of live TV to YouTube tipped in favour of traditional media (and, even there, it was 59:46, a relatively slim margin). In a very real sense, YouTube is not actually competing with podcasts. It’s competing with TV.
The other big creator end change has been related to the thorny issue of discoverability. This is something that vexes content creators of all stripes (I know, we all hate the term ‘content creator’, but it effectively unites everyone from Paul Krugman on Substack to PewDiePie on YouTube). How do you get people to subscribe to your newsletter? How do you get people to listen to your podcast? How do you get people to watch your video?
The answer, for a long time, has been social media. And in the past few years, the algorithms at work in the big networks – Facebook and Instagram, Twitter (as was), TikTok – have massively prioritised the creation of video content. And so, people building a marketing plan for any digital media content have been faced with a tricky question. How do you create video content to market a project if it’s not a video project in the first place? And if you start creating video content (for marketing purposes) why don’t you just conceive of it as a video-first project? What is there to lose?
It comes back to an old adage that I wheel out over and over again (an old adage that I think I actually invented). Turning a video into a podcast is as easy as closing your eyes; turning a podcast into a video requires several hours of outsourced labour. The reality is that we are living in a very uninventive time for marketing. Despite its myriad flaws (exposed, so dramatically, in Elon Musk’s takeover of Twitter), social media is still our collective best guess for a viral marketing pathway. And so we are all forced to use it. Twitter had a brief experiment with uploading native audio clips, but X, as it now is, only allows video or pictures to be uploaded. That’s why you’ve seen the increasing dominance of video podcasts that straddle all these different lines (something I refer to as ‘content soup’). Podcasters book a recording studio that has video facilities (as they pretty much all do now) and just publish the video of them notionally recording the audio. It’s almost meta, but it’s become the new dominant ‘look’, the heir to the dead-eyed make-up tutorials and ‘get ready with me’s.
And so, again, it makes sense to conceive of things with video in mind, because the tools are becoming more accessible, and AI is rapidly streamlining the last remaining tricky bits. I remember spending an horrific amount of time manually subtitling videos. That’s now a job for AI. Render times, key-framing, montaging – all of these are being taken on, and over, by AI. And the easier that video becomes, the more we’ll see a wash of it.
III. ARMS RACE
Which means that we are inevitably heading for another video arms race. I think there will be a few conscientious objectors, recusants who are burnt from the last ‘pivot to video’, but most legacy publishers recognise that they need to compete in this market.
Already we have seen non-trivial expansion into the frontiers of YouTube. The BBC’s main channel has 14.7m subscribers (bear in mind that 23.9m households pay for a TV licence) and other prestige brands are in a similar place. The TV channels have a clear advantage in creating video content – Fox News = 13.2m; CNN = 17.7m; NBC News = 11.1m – but the legacy media corporations – New York Times = 4.68m; Washington Post = 2.74m; The Guardian = 2.34m – are growing into this space. This is going to be a proper bunfight.
But it’s a funny one, because these broadcasters, and their shadowy owners, are not just going head-to-head with one another, but a whole landscape of independent content creators. Like the Cold War, if every other rebel group also had ICBMs and nukes. The more we talk about video being the next big publishing frontier, the more inevitable it is that investment will pour in, but it remains to be seen how effectively a brand carries over from one medium to the other. After all, the Times is probably the most influential print news brand on the planet – but 4.68m subscribers puts it considerably behind MrBeast’s third channel (on 7m) let alone his first (361m) or second (46m), not to mention his gaming channel (46m), his reacts channel (35m), or his philanthropy channel (27m). The NYT is not a big fish in this space; they barely even qualify as a small fry. And there comes a point where pitching for a growth that’s not coming is quietly humbling, especially when you consider yourself the paper of record. Do you really want to be judged against MrBeast’s 3rd channel, even if only in a two-bit Substack post?
The bigger (or, at least, more consequential) arms race will actually be between the distribution platforms. Spotify have made a big play for video this year, which can be read in many ways but I think is more representative of a fear that YouTube will become so monopolistic within digital media that they can eventually overtake Spotify’s market lead on music (YouTube Music has been no great success, but that doesn’t mean the product couldn’t evolve). Either way, it’s clear that both Spotify and YouTube believe that video is the primary route forward. Spotify clearly wants to increase its revenue as it moves into the profit phase of its existence. In their 2023 financial reports, it was revealed that 87% of revenue came from premium services, compared to 13% from ad sales. In almost 20 years of existence, Spotify’s freemium offering has been squeezed almost into non-existence.
The balance between subscriptions and ad-sales is a tricky one. It is something that a platform like Netflix – arguably the most advanced player in this game – is still grappling with. Amazon Prime, for example, has introduced ads by default into its standard tier subscription, whereas Netflix has created a cheaper tier of subscription that includes adverts. This speaks to a slight anxiety that we have passed ‘peak subscription’ (Netflix’s subscriber figures declined sharply post-pandemic, necessitating a brutal, but effective, gouge of multi-household subscriptions) and that advertising represents a more stable revenue pathway. Really, if you want to be profitable you need to keep raising and raising your prices. Get people hooked and then incrementally increase the pricing until you have a profit margin that your shareholders are happy with. But this is a tricky model when platforms are as competitive as they currently are. Don’t like the latest round of Netflix price hikes? Give Amazon Prime a go. And for music it’s an even thornier question, because the libraries tend to be near identical (whereas Netflix/Amazon/Disney/Apple etc are still heavily reliant on their original, and exclusively licensed, programming as a point of differentiation).
This all goes back to part one. Advertisers want video, so Spotify wants video. Spotify wants video because YouTube currently has video (has it sewn up and tied in a neat little bow), and YouTube probably wants audio, which Spotify currently has.
The arms race means that the powerful tier above the content creators – the distributors – are going to prioritise video. Which means that they are suggesting the answer to the original question – “do I need to pivot to video?” – is a yes. But is it quite as simple as that?
IV. THE CONTRACTION OF AUDIO-ONLY
Obviously not. Because whilst Spotify’s interest in video is much reported and transparently true, it is also not a universal truth. Spotify has always had a very fickle relationship with creators, spending huge money on acquisitions like The Joe Rogan Experience and Call Her Daddy, whilst significantly underinvesting in mid-tier partnerships, and paying out at the same feeble rate as they do for the music industry. Anyone who believes that if they turn their limited-audience audio podcast into a video show it will suddenly be rocket-fuelled by the Spotify curators and algorithmic Gods is, I’m afraid, living in a fantasy.
It would be no bad thing, in my opinion, for the audio-only industry to contract slightly. At the moment, podcasting is in a mess, where it is subject to loose or misleading definitions. I read this piece in Semafor, which is titled ‘Podcasting sees explosive growth on YouTube’ and yet refers, repeatedly, to ‘viewership’ (rather than ‘listenership’ or even ‘audience’) and only directly cites MeidasTouch, a media network which is doing amazingly well on YouTube, but is also transparently not a simple podcast (it looks much more like a classic news channel, complete with clips and inserts).
The word ‘podcast’ has ceased to mean ‘podcast’, and because of this confusion, many content creators have felt like they must adapt to this new definition. After all, if ‘podcast’ now means a video show distributed via a range of different video and audio mechanisms, are you really making a ‘podcast’ if it’s just an MP3 on an RSS feed? The past is a foreign country; we did things differently there.
Call it whatever you want, there is still space for the audio-only podcast. It is still cheaper, still quicker to make. There is still a large demographic of consumers (not to mention an influential, and wealthy, one) who are not yet comfortable with digital video. And if being an audio-only publisher becomes more rarified, it also becomes a clearer point of differentiation. It’s possible that the real fight isn’t between video and audio at all – it’s between video and video. Could you avoid the fray entirely by just going back to what you did best? There are plenty of examples of magazines that thrived in the early internet era by avoiding the rush to digitise and, therefore, the excesses of the dot-com bubble. Is the audio-only podcast the hedge bet against Pivot to Video 2.0 ending up in the same place as P-to-V v1?
I have said in the past that creators should be guided by the question of what suits the material they’re putting out. To some extent, I stand by that. The reality is that the question of video vs audio is immaterial except for the top 1% of creators. For the remaining 99%, we are squabbling over quite limited real estate. Rough data for 2025 pitched the average number of views on a YouTube video at around 5.5k (though that may be a selective sample; another piece of research, in 2024, estimated that, for accounts with under 2,000 subscribers an average video received 15.6 views). It doesn’t really matter, because, wherever they are exactly, these are levels of viewership where you will i) struggle to generate revenue, and ii) have limited audience penetration. 5,000 listeners to a podcast (using the patented, and very secret, Hilton Method of Audience Conversion) would be worth 30,000 views on a YouTube video. But neither is really what people are talking about when they discuss audio vs video. There, they’re talking about an almost-mythical elite.
Can you be an elite – by which I mean having content reach an audience of more than a million people – digital media brand, in 2025, without video? Probably not. There will be exceptions, but not enough to problematise that answer. But are you – you! – really going to be one of these elite digital media brands anyway? And if you’re not, does the pivot to video really suit your interests?
It is harder to tell a story on YouTube. It is harder to look professional on YouTube. It is harder to be covered in the media with a YouTube series. It is harder to book interviews for YouTube. It is, if not ‘harder’ then no easier to extract revenue from YouTube for the majority of content creators (estimates suggest a figure around 5m monthly viewers before channels can wash their face via YouTube's native ad sales). All other things being equal, would you pitch your project as a video or audio endeavour?
Because if your reason for converting your audio-only show to a video one – or envisaging your next project as a video-first effort – is that you’ve heard reports that this is the year of ‘explosive’ growth for video, and the future of podcasts is video, then I don’t think that’s a good reason. “Do I need to pivot to video?” isn’t really the question, for most content creators. That’s the question they ask people like me. The question that I ask them, in return, is the more important one. Do you want to pivot to video?
V. A LOOK TO THE FUTURE
Let’s face it, content has headed in one direction over the past hundred years (more, probably, since the invention of Herr Gutenberg’s press, or earlier still, when the Lindisfarne monks were illuminating manuscripts) and that has been becoming more dynamic. Video is the logical progression, in terms of where our technology currently is, for journalism.
In the future, we will doubtless see video as an antique. Already, Big Tech companies like Apple and Meta are experimenting with Augmented Reality (AR) and Virtual Reality (VR), both technologies designed to make quaint old things, like video, obsolete. But even though video has been around, in one form or another, for yonks, the time feels quite right for it. Problems have been solved. It is now a realistic way of creating that dynamic content.
But it is not a panacea, nor is it an inevitability. On the latter point, it is sensible to be wary of revisiting previous bubbles – the history of bubbles suggests they are likely to pop again. On the former point, and perhaps the word ‘panacea’ is suggestive, this rush to video reminds me a little of Big Pharma. It is clear that pharmaceutical development serves a purpose, but what we see, as consumers of a medicine, is the final product of a very complex pathway that begins, predominantly, with a financial decision. Can we afford to develop this drug? Is there a market for it? How do we steal a march on our competitors? Down-the-line, what we, as consumers, witness is just a medicine for an ailment, but how, when and if we receive that medicine is, in fact, decided by men in suits in boardrooms across America.
If video entirely destroys audio, then it will be the result of a series of interlinking financial decisions. Joe Nobody decides that he can’t afford a new car in the current economic climate, and men like Joe contribute to a 10% year-on-year decrease in car sales for a major manufacturer, let’s call them Alset. The CEO of Alset lays off some staff to appease shareholders, and shrinks budgets including that of the firm’s marketing team. Alset’s Head of Marketing decides that they will only put advertising spend into video content with clickthrough potential, to increase their conversions so that they can demonstrate to the CEO the value of their proposition (thus guaranteeing their jobs, and their potential to be future consumers of the Alset Model X). At the major audio-visual distributors – YouTube, Spotify, Apple, Amazon, Netflix – this preference is duly noted, and more money is shovelled into dynamic video advertising. Because they themselves are subject to the same financial pressures (and the rapacious shareholder desire for job cuts) they decide to sunset their proprietary audio-ad insertion programme and lay off that team. News spreads, like wildfire, that the only way to profitably make digital content is to go ‘video first’, and so, from that first decision of Joe Nobody not to buy the car, we end up here, with video having defeated audio in a fight that nobody asked it to take on.
This is the sticky truth: for some time, content hasn’t really been the problem for the media. The issue has been revenue extraction. And that’s what the podcast industry has cottoned onto. Video has become a business imperative at a time when it has also become a practical reality. This is a confluence that makes it uniquely attractive. On a creative level, it doesn’t solve any real issues. It will still be hard to make content, still hard to market it, still hard to reach an audience. In fact, it may well get a lot harder, as creators and publishers coalesce around a single medium. Pivot to Video 2.0 is not about creating a plural and diverse media landscape. It’s about trying – possibly vainly – to establish a sustainable one.
For creators, the choice between video and audio is a false one. The real decisions will be made far earlier. You can fight the tide or swim with it, but the reality is that we’re all headed one way. Downstream.
I think one of the big questions is 'do you want to make a show on a platform that only a third of the population consume?' [here in the UK anyway]
It doesn't mean don't make audio - it's a great platform - but choosing to make it only for audio platforms seems limiting.