Audiobook pricing

Vertigo (Mad Mountain Man, Supporter; joined Jun 29, 2010; 8,763 messages; Scottish Highlands)

I am a little curious as to how audiobooks get the pricing they do (I've not looked that thoroughly but they seem to be over twice the price of the paperback edition). So I was wondering if any of the folk around here who know a bit about the publishing business could enlighten me.

After the similar discussions about ebooks, I got to thinking that I've never heard similar discussions about audiobooks and yet it seems very similar to me, except that audiobooks are even more expensive.

I guess a lot depends on who you get to read it and how much they are going to cost, especially if you want a celebrity. Personally I prefer audiobooks read by someone I've not heard of before; I find it distracting to hear them read by a famous voice. There are exceptions, though: Richard Burton's reading of Under Milk Wood is a masterpiece (technically not a book, I know).

However, it seems to me that the reading is a one-off cost and, if the reader is not a celebrity, not even a very high one. Assuming we are only talking about unabridged versions, the editing has already been done. And the media is certainly no more expensive, and almost certainly considerably cheaper, to produce than a paperback.

Of course it must share the original production costs, which would justify it being the same price as a paperback, in the same way that an ebook's price is justified. But it isn't; the audiobook is more than double that price.

Just curious is all!
 
I'm not a professional in the field of audiobooks, but I know a little something about sound recording. "Editing" is more than abridging a book, which would be done to the script, not a full recording. (This assumes only one version of the completed audiobook.) The talent might flub lines and have to redo them. Master mix-down might involve compression and other cleaning, just like music. Audiobooks on CD are generally segmented into three-minute chunks, a step which might be automated. Audiobooks in file-based format might have chaptering and other embellishments. And at a guess, I would think the market for audiobooks is much smaller than that for print/text books, which would increase the price.

Then again, a distributor like Librivox works with volunteers and provides audiobooks of public domain works for free. I've listened to several books by Mark Nelson; the reading and the engineering of the book are up there with the best commercial books. In counterpoint, I've run into several commercial audiobooks that were really terrible—such as the reader failing to keep a constant volume, a problem which could be fixed by the recording engineers.

So how is the price fixed? As high as the traffic will bear.
 
Ah, well, audio recording is my thing, even if I haven't recorded one since working for a pre-cassette "speaking books for the blind" project, on LPs that ran at sixteen and two thirds RPM. I might be doing another this autumn, if the economics work out.

Firstly, for reasons of audio fidelity, the recording has to be done in a recording studio or similarly soundproofed, acoustically treated environment (not the editing, the longest part of the job, but once you've started in optimum conditions it's difficult to go back to something simpler later). No "the same headphones I use for Skype and in my own living room" stuff. A motorcycle behind a phrase might be perfectly tolerable for a radio play; when it repeats every time the recording is listened to, it becomes intensely irritating, so that bit of the recording must be redone. The studio will have the requisite programs that allow for rapid and efficient editing, loudspeakers on which you can truly judge the quality of the final product, and personnel who know how to get the most out of the above, and all of this must be paid for.

If the book lasts a couple of hours, you can expect the recording to last at least a day, and the rough cut and relisten (with the occasional rerecord of a portion) half as long again. That's assuming a professional speaker, not the author himself or someone he thinks has a "nice voice and delivery" but no mic training or experience in going back to cover occasional errors. Professional speakers need paying; less, perchance, for this than for a voiceover for a Rolex film, but that's still a lot of hours. If the author can be there, to check the fidelity of the performance relative to his original concept, the final result will be closer to his ideal, but the process will take at the very least twice as long. A recording engineer with the text, following what is going on, will save hours in the final edit, but costs money (despite what the doctors say, I like to eat).

The "prooflistening", the maintaining of speech rhythms while editing the audio, and the level and tone equalisation take much longer than their print equivalents. Speakers can't go on talking hour after hour until the job is done; they need breaks and, if they spend these listening to what has already been recorded, tend to grow stale (and find dozens of paragraphs they "could have done better", and want to try again). Time evaporates, and all of this comes after all the corrections already done for the words themselves, and their punctuation, chapterisation, POV changes, et al.

Then there's the legal side, and lawyers earn considerably more than sound engineers. The standard contract for books has been refined over generations, that for music over nearly as long (even if technology has been continuously mutating it). All that is required is putting figures in the blank spaces. But for the recorded word? Does the speaker get royalties, or just a lump sum for the job? In the case of a foreign version, does the original interpretation get credited the way the first artist to record a song does, even though he might not be composer or lyricist? Lots of legalistic cash registers going "ding".

Of course, if you sell a million copies (or even a hundred thousand) all of these details – two hundred dollars here, a thousand there – become irrelevant. But most audiobooks won't, and either cash will be siphoned off the successful ones to pay the costs of the others (something that best-selling authors could justifiably take exception to) or audio books will stay at the expensive end of the market.
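To make the arithmetic concrete, here is a minimal sketch of how a fixed production bill spreads over the number of copies sold. The 5,000 dollar total is a made-up figure purely for illustration, not a real studio quote, and the function name is my own.

```python
def fixed_cost_per_copy(fixed_costs: float, copies_sold: int) -> float:
    """Share of the one-off production costs carried by each copy sold."""
    if copies_sold <= 0:
        raise ValueError("copies_sold must be positive")
    return fixed_costs / copies_sold


# Hypothetical total for studio time, reader, editing and legal fees.
FIXED_COSTS = 5_000.0

for copies in (500, 5_000, 100_000, 1_000_000):
    share = fixed_cost_per_copy(FIXED_COSTS, copies)
    print(f"{copies:>9} copies: {share:.3f} dollars of fixed cost per copy")
```

With these invented numbers the fixed costs add ten dollars to each copy at 500 sales, and only half a cent at a million, which is the whole point of the paragraph above.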

Distribution of audio titles requires considerably more data travelling than text files, or even music. Your average novel fits into a couple of Megs of text (assuming you're not using a profligate program like Word) and, with cover, map and a couple of illustrations probably fits comfortably in ten. Audio, for several hours of material, needs hundreds of Megs, even if quality has taken a poor second place to convenience.
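To put rough numbers on "hundreds of Megs", here is a small sketch of the size of a constant-bitrate audio file. The bitrates and the ten-hour running time are my own illustrative assumptions, not figures from any particular distributor, and the calculation ignores container overhead and variable-bitrate encoding.

```python
def audio_size_mb(hours: float, kbps: int) -> float:
    """Approximate size in megabytes (10**6 bytes) of constant-bitrate audio."""
    seconds = hours * 3600
    bits = seconds * kbps * 1000   # kbps = thousands of bits per second
    return bits / 8 / 1_000_000    # bits -> bytes -> megabytes


# Assumed running time and bitrates, chosen only as examples.
for kbps in (32, 64, 128):
    print(f"10 hours at {kbps} kbps: about {audio_size_mb(10, kbps):.0f} MB")
```

Under those assumptions a ten-hour reading at 64 kbps comes to roughly 288 MB, against a couple of Megs for the text.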

For many applications a physical medium is essential; although some cars are equipped with mp3 CD players, few have download capacity from the Web. Standards aren't – although mp3 has a nice solid foot in the door (or, for other sound engineers, in the DAW) it is a long way from being universal, and technically better, newer compression algorithms abound, clamouring for entry. If you're going to listen to your books on your computer, this is not a problem, but you aren't, are you? You're going to take them jogging, or driving, to the gym, the spa, the massage parlour, on transatlantic jets…

The market is not yet big enough that iTunes and Amazon (and probably Google, just to keep things stirred up) are assassinating each other's agents for greater market share, unfortunately for the consumer but not the producer; we've seen what competition does for prices. But it's growing, along with the selection of titles available, and it will never be a "self publishing" market under the control of authors – too many different people involved. Perhaps that's just as well, as it allows for a modicum of quality control, sadly lacking in some other branches. It opens possibilities of special effects, music and sound effects for children's books, embedded graphics and metadata for iPods in the train, subtitles and footnotes for scholars, and abuse of every one of these for those who only want their bedside stories read to them.

Perhaps I should be convincing my sister to do audio versions of her poetry books, with live versions from readings and festivals to studio recordings…
 
Thanks for that both of you, absolutely fascinating! I guess my idea of recording was pretty naive and if I'd really stopped to think about it, it would be obvious it was going to be a lot of work. And you're probably right that even that likely wouldn't matter too much if there were more sold but I would imagine there are an awful lot less sold than any other medium. So I guess the price will always be a lot higher.

Re your sister Chris; I should imagine audiobooks would be the ideal medium for poetry, just like Under Milk Wood.
 
Hopefully we'll get to the point, soon, where a synthetic voice will be a passable, if not good, substitute for a human voice. That way, devices like the Kindle will be able to turn any book into an audiobook. This will be especially beneficial for the blind and the partially sighted, and it will lower the cost of audiobooks a great deal.
 
Hopefully we'll get to the point, soon, where a synthetic voice will be a passable

More than passable. Mac OS X includes some very good voices, and I've also bought software from Cepstral and AssistiveWare. These voices are good, though still recognizably synthetic. I've heard some other synthetic voices that are even smoother.

The most difficult thing to overcome is the lack of inflection, such as the pitch change commonly used for questions. Thus, the most advanced synthetic voice I've heard is Yamaha's Vocaloid, marketed as a "singer in a box." The software features automated pitch shifts so that lyrics follow melody, but the artist can also manually tweak the voice the same way an animator might fine tune character movement in 3D animation. Examples include Hatsune Miku (a name meaning "First Voice from the Future") and the soundtrack to the movie Paprika.

All of the above are "sampled" synthetic voices. I have no idea if anyone has generated a completely artificial voice yet.
 
The most difficult thing to overcome is the lack of inflection, such as the pitch change commonly used for questions. Thus, the most advanced synthetic voice I've heard is Yamaha's Vocaloid, marketed as a "singer in a box." The software features automated pitch shifts so that lyrics follow melody, but the artist can also manually tweak the voice the same way an animator might fine tune character movement in 3D animation. Examples include Hatsune Miku (a name meaning "First Voice from the Future") and the soundtrack to the movie Paprika.

Yeah, this stuff is awesome. Messed around with stuff like it a bit. Totally reminds me of the AI Pop Star in Macross Plus - Sharon Apple. I'm sure the stuff in Paprika was inspired by that.
 
Reading a book out loud is a performance; we are far from software that can transmit emotion by inflection or rhythm change. If someone were to generate the requisite flexibility into this situation it would be so time consuming that it would be cheaper to employ the actor. (Which doesn't mean it couldn't be done, or shouldn't, or won't. It would be an exercise in something quite different from merely producing a listenable version of the book, like Rondò Veneziano using a synthesizer to simulate acoustic performances accurately enough that you only knew it was a Fairlight because it was an open secret). It might even sell. Or, if you were transcribing the Asimov robot books, or cyberpunk, an absolute lack of emotion might turn out to be a positive rather than annoying detail.

I know at least two distributors of audio books give free copies to visually impaired readers, which, in turn, means they have to charge their healthy customers more, so as to stay in business.

Sorry about the first line of my previous post which started as "even if I haven't recorded one since working for a pre-cassette "speaking books for the blind" project," and got a bit mangled somewhere.
 
Richard Burton's reading of Under Milk Wood is a masterpiece (technically not a book I know).

I tried to buy that years ago from Woolworths; the last tape copy left was meant to be Richard Burton reading it. However, due to a mix-up somewhere, the tape I opened when I got home was not Under Milk Wood read by Richard Burton but Under Milk Wood read by Dylan Thomas, so I wasn't too disappointed. I lost that tape ages ago; I must look out for a CD or something.
 
Agree with you there, PM; he really did have the most astonishing voice. There have been a few others I would compare, and there are even some around today: Peter O'Toole, Richard Harris, Anthony Hopkins, John Hurt and even one American ;) Morgan Freeman. I suspect most of them had theatrical stage training.

Interesting discussion about the present state of synthetic voice. I wonder just how much of our information is picked up from things like inflection and timing. This of course is always one of the problems with the written word, and why it is so easy to get misunderstandings on forums such as this.
 
Not to interrupt the erudite discussion that is taking place thus far, but (16 rpm notwithstanding) the price of books on CD has proven to be much more economical than the old books on cassette. The price per volume seems much closer to reality.
 
the price of books on CD has proven to be much more economical than the old books on cassette.

Optical media are much easier to reproduce than tapes (audio or video cassettes). File-based distribution is easier still because it bypasses the physical media—although there are other kinds of overhead to the price, as discussed in another thread.

For those who are not ready to give up physical media, yet want access to rare titles that are not likely to be in stock, some merchants offer a DVD-R or BD-R option—order it, and they'll burn a copy for you.

Vertigo wrote: I wonder just how much of our information is picked up from things like inflection and timing.

Take note of the way people gesture even when talking on the phone. Communication is multi-channeled and far richer than mere words. Picking up the subtle cues for veracity is another "more than words" aspect of our communication. James P. Hogan highlighted this in the "Giants" novels, and Jean Auel described the Clan as being incapable of telling lies due to their language.

Paranoid Marvin wrote: I think I could happily listen to a shopping list being read by Richard Burton.

There are many great voices out there. I bought a collection of poems titled Silver Lining because of the collection of voices:


Hope Is The Thing With Feathers—Emily Dickinson—Julie Harris
Sonnet CXVI—Wm. Shakespeare—Michael York
Playback—David Harsent—John Hurt
Snake—D. H. Lawrence—Jeremy Irons
Let Me Live Out My Years—John G. Neihardt—Kirk Douglas
Will I Think Of You—Leonard Nimoy—Leonard Nimoy
Helen—Christopher Marlowe—Simon Ward
Indian Serenade—Percy Shelley—David Warner
Whales Weep Not!—D. H. Lawrence—William Shatner
pity this busy monster, manunkind,—ee cummings—Rod Steiger
Sonnet — To Sleep—John Keats—Patrick Stewart
The Donkey—G. K. Chesterton—Christopher Lee
Annabel Lee—Edgar Allan Poe—Michael Caine
Corinthians 13—New Testament—Gary Oldman
If He Could Talk—Patrick Cargill—Patrick Cargill
Emilio—Charles Frank Laughton—Martin Sheen
Last Verse—Noel Coward—Douglas Fairbanks, Jr.
Gus: The Theater Cat—T. S. Eliot—Richard Kiley
Little Boy Blue—Eugene Field—Tim Curry
Ozymandias—Percy Shelley—John Standing
A Death In The Family—Alasdair MacLean—Ian Holm
The Chances—Wilfred Owen—John Castle
The Stolen Child—W. B. Yeats—James Earl Jones

One of my favorite Simpsons Halloween segments is "The Raven," featuring James Earl Jones.
 
I wonder just how much of our information is picked up from things like inflection and timing.

I suppose it depends on how much of the work you expect the reader to do. A good reading adds the excitement, the discomfort, the joy; but even a mediocre reading, relying on the listener to do all the work, is better than nothing.

I had a friend in Las Vegas, totally blind, with a braille computer screen (thousands of little blunt needles that push forward or retract; how's that for a touch screen?) who was delighted with her voice reading software; and this was twenty years ago, when things were much more primitive than now. It sounded like a robot from a low budget film, but just being able to check eMails, or log onto a forum, without needing someone else to lend a hand was a huge luxury. (That's how I learnt about the free volumes, by the way.) When I'm talking about audio books, that's not the market I was considering; I'm talking about something at least as good as readings across the radio.

There will be a thousand opinions on how much dialogue should be reported, as against spoken in the voice of the character; whether special effects – either electronic, vocal or mic technique – enhance or distract from the experience; whether mono (as the reader is essentially a point source) is acceptable as a way to reduce data quantities…

And every single one of them is right.
 
I agree that blind people are likely to just be happy to be able to experience the book, regardless of poor "performance" by text-to-speech software - but presumably we're reaching the point where, with so many ebooks on the market and text-to-speech built into computers, a specialist market in "books for the blind" is a thing of the past?

However I disagree that it would be impossible to create a synthesized voice with apparent emotion. Of course it would probably require a human operator to "score" the text first, but that's going to be faster than having a fallible human read the entire thing out loud.

Macross Plus SPOILER: Like Sharon Apple...

It's likely that my own book will come out in audio format, as my publishers recently signed a deal with Brilliance Audio. I can see I'm going to have to write up a detailed pronunciation guide for the poor sod who has to read it!
 
That'll surely be fun for you Anne, I trust you at least know how everything (in particular names) should be pronounced.
 
Of course it would probably require a human operator to "score" the text first, but that's going to be faster than having a fallible human read the entire thing out loud.

Less wearying for the speaker, yes. Faster? With today's technology, no way. If you'd done editing for thirty-second TV commercials you'd know that, even with the correct emphasis to begin with, when you stick together segments of different takes you can spend three or four hours getting rhythms and pitches lined up and sounding natural (they can afford to pay for it). Each phrase would need a tempo, tempo change, pitch and articulation variant envelope, to prevent the reading from sounding flat and boring, and punctuation pauses adjustable by a factor of ten. Yes, you could build a rough structure with rising inflections before question marks, short sentences running slightly faster than longer ones, and a random fifteen cents pitch variation to prevent the monotone; but, unless you had an AI that actually understood the text, it's going to take a human being weeks of listening to the same passage over and over, tweaking details, to prevent it from being deadly boring or recognisably synthetic. An author (presumably dissatisfied with the results of a previous recording) could spend as much time getting it to sound as perfect as possible as in getting the words right in the first place.
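For what it's worth, the "rough structure" is the easy part, and it can be sketched in a few lines. This is only an illustration of the three rules mentioned above, not how any real speech synthesiser works: the six-word threshold, the 5% speed-up and all the names are hypothetical, and the output is just a list of markings that a synthesiser would still have to act on.

```python
import random
import re
from dataclasses import dataclass


@dataclass
class Prosody:
    text: str
    tempo: float        # 1.0 = baseline speaking rate (assumed scale)
    rising_end: bool    # True if the sentence should end on a rising pitch
    pitch_ratio: float  # frequency multiplier from the random detune


def cents_to_ratio(cents: float) -> float:
    """Convert a pitch offset in cents to a frequency ratio (1200 cents = one octave)."""
    return 2 ** (cents / 1200)


def rough_score(text: str, seed: int = 0) -> list[Prosody]:
    """Apply the three crude rules to each sentence of the text."""
    rng = random.Random(seed)
    # Naive sentence split: a run of text ending in '.', '!' or '?'.
    sentences = [s.strip() for s in re.findall(r"[^.!?]+[.!?]", text)]
    scored = []
    for sentence in sentences:
        word_count = len(sentence.split())
        # Hypothetical rule: sentences of six words or fewer run 5% faster.
        tempo = 1.05 if word_count <= 6 else 1.0
        # Random detune of up to fifteen cents either way.
        detune = cents_to_ratio(rng.uniform(-15, 15))
        scored.append(Prosody(sentence, tempo, sentence.endswith("?"), detune))
    return scored


for p in rough_score("Is it done? No. This sentence is rather longer than the other two were."):
    print(p)
```

Everything that makes a reading sound alive (where the emphasis falls, how a pause lands) is exactly what these rules cannot see, which is where the weeks of tweaking come in.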
 
That'll surely be fun for you Anne, I trust you at least know how everything (in particular names) should be pronounced.

Of course! I've been conlanging since my teens, but with this one I deliberately went for something that wasn't too "out there" in terms of phonology. If I can get the narrator to pronounce the 'j' in skraylings' names like a 'y' (similar to most Germanic languages) and the 'q' as in Arabic, rather than the way they're pronounced in English, that will be half the battle!

The nasal vowels in the other language might be a bit trickier to explain...
 
Ouch, I can see a fair chance of you not being too happy with the final result, Anne! I guess it depends on how much they actually bother to consult you and how much they just go ahead and do their own thing!
 
Yeah, well, we'll see. Angry Robot are pretty good when it comes to working with their writers - I kicked up a fuss about the cover copy being wrong, and they let me write my own :D

They also asked for a description of my hero when they were commissioning cover art - maybe they'll think to ask for a pronunciation guide if it goes to audiobook!
 
