Metadata: Getting it right, even when it’s wrong.
Well, today is my last official day working on BBC HD audio. Somehow I don’t think this project will leave me alone just yet, but after a week’s leave, my main focus will be elsewhere. So I thought I’d take the opportunity to talk about something which has consumed a fair bit of my time, but which I haven’t blogged much about: metadata. For the uninitiated, metadata is “data about data”. A photo’s metadata, for example, might tell you what camera it was taken with, where it was taken, what exposure was used and so on. In the case of BBC HD’s audio, metadata is carried by the Dolby E and Dolby Digital streams we use, and has two main functions: it describes the audio being carried, and it controls the decoders in your homes. One parameter, often called dialnorm (for Dialogue Normalisation), tells your decoder how loud the programme is, so that it can attempt to smooth out differences between programmes and channels to give you a more consistent loudness. Another set of parameters controls what happens when your decoder downmixes the audio, meaning when it produces a stereo mix for your stereo speakers from the surround sound we may be sending. It’s important stuff, so we have to make sure that metadata survives our distribution chain, and sometimes we even have to add metadata to a programme automatically, which can be tricky. Here’s some of the work we’ve done…
We’ve worked on making sure that metadata survives our distribution chain, but that’s all a bit dull, so I won’t bore you. If you want an entertaining metadata story, read Andy’s blog on our trial of Reverse Karaoke. What I will share is what happens when we have to create metadata in the delivery chain. Normally, for a surround sound programme, the sound engineer will create the metadata that goes with it, ensuring that the metadata matches the programme content well and that the effects of the metadata in your receiver aid the artistic intent of the mix rather than disrupting it. But stereo programmes are delivered to us without metadata, so we have to make it up. What values should we choose? And if something goes wrong somewhere, one of the first things to die could be the metadata, so if that happens we need to create new metadata. Again, what values to choose? The two metadata sets I’ve described here are referred to internally as “stereo metadata” and “reversion metadata”.
Stereo should be fairly easy. BBC HD uses Dolby Digital to send its audio, but SD channels use MPEG2 audio, and they have no metadata. So programmes mixed for stereo-only are mixed to a standard level - hopefully they all sound much the same volume. Therefore we don’t need to set a different dialnorm for each programme; all we need to do is choose one value for all stereo. We currently use -23dB, based on empirical testing and for consistency with other broadcasters who use the same value. Then there’s DRC, or Dynamic Range Control. This one’s a bit more tricky to explain, but basically it allows your receiver to reduce the dynamic range of the audio, which is the difference in volume between the quietest and loudest parts. So DRC makes the quiet bits a little louder, and the loud bits a little quieter. The idea is that we can broadcast programmes with a nice big dynamic range so that those with a high-end audio system can get cinematic effects, while allowing the decoder to reduce the range for those of you listening on the small speakers in your telly, for example, which won’t be able to produce such a range of volumes. So far so good, but stereo programmes are generally mixed for compatibility with stereo-only channels (i.e. all BBC channels except BBC HD), so they have a small dynamic range in the first place - they’re designed to work on all TVs and audio systems without dynamic range control. So recently, we switched from using a small amount of DRC in our stereo metadata to using none at all. This should ensure that stereo programmes sound the same on any channel, and we’re watching the results carefully. OK, so what about reversion?
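To make the DRC idea a little more concrete, here is a toy sketch in Python. It is not how Dolby’s DRC profiles actually work; the pivot level, the ratio and the function name are all made up for illustration. It simply pulls every level towards a reference point, so quiet passages come up and loud passages come down, and a ratio of 1 stands in for “no DRC”, which is the setting described above for stereo.

```python
# Toy dynamic range control: NOT Dolby's real DRC profiles.
# The pivot level and ratio below are hypothetical illustration values.

def apply_drc(level_db: float, reference_db: float = -27.0, ratio: float = 2.0) -> float:
    """Pull a level (in dB) towards the reference; ratio 1.0 means no DRC."""
    if ratio < 1.0:
        raise ValueError("ratio must be 1.0 or greater")
    return reference_db + (level_db - reference_db) / ratio


quiet_db, loud_db = -47.0, -7.0  # hypothetical quietest and loudest passages

for ratio in (1.0, 2.0):
    new_quiet = apply_drc(quiet_db, ratio=ratio)
    new_loud = apply_drc(loud_db, ratio=ratio)
    # Dynamic range is the gap between the loudest and quietest parts.
    print(f"ratio {ratio}: range {new_loud - new_quiet} dB")
```

With the ratio at 1 the programme passes through untouched, which is the behaviour we want for stereo material that already has a small dynamic range.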
Well, this is trickier. Remember that this is what happens if things go badly wrong - not something we want to happen, but something we must prepare for. We have to come up with a set of metadata which works for all programmes as best we can, causing the least degradation to the biggest range of programmes, so that if reversion happens, whatever programme we’re broadcasting will sound OK. So question one is this: do we tell your decoder that we’re sending 5.1 or stereo audio? The answer has to be 5.1 - if the metadata says 5.1 and a stereo programme is sent, your decoder will just reproduce the left and right channels in the left and right speakers. Any fancy Dolby Pro-Logic decoding won’t work, but you will hear the basic audio. If we did the opposite and signalled the programme as 2.0 (stereo), a 5.1 programme would be badly degraded, as you wouldn’t hear the centre and rear channels, which would probably mean you wouldn’t hear the dialogue!
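The reasoning can be sketched with a toy model in Python. This is only an illustration of the argument above, not how a real decoder is built: the function name, the sample values and the -3dB downmix coefficients are assumptions, and the LFE channel is left out for simplicity.

```python
import math

# Assumed -3 dB downmix coefficient (about 0.707) for centre and surrounds.
MINUS_3DB = 1 / math.sqrt(2)


def decode_to_stereo(channels: dict, signalled_mode: str) -> tuple:
    """Toy stereo output for one sample, given the channel mode in the metadata."""
    if signalled_mode == "2.0":
        # Only the front left/right pair is treated as programme audio.
        return channels["L"], channels["R"]
    if signalled_mode == "5.1":
        # Fold centre and surrounds into left and right.
        left = channels["L"] + MINUS_3DB * (channels["C"] + channels["Ls"])
        right = channels["R"] + MINUS_3DB * (channels["C"] + channels["Rs"])
        return left, right
    raise ValueError(f"unknown mode: {signalled_mode}")


# Hypothetical sample values: a stereo programme sitting in a 5.1 container,
# and a surround programme with its dialogue in the centre channel.
stereo_prog = {"L": 0.5, "R": 0.4, "C": 0.0, "Ls": 0.0, "Rs": 0.0}
surround_prog = {"L": 0.1, "R": 0.1, "C": 0.8, "Ls": 0.2, "Rs": 0.2}

print(decode_to_stereo(stereo_prog, "5.1"))    # left and right pass straight through
print(decode_to_stereo(surround_prog, "2.0"))  # centre (dialogue) and rears are lost
```

Signalling 5.1 costs a stereo programme nothing in this model, while signalling 2.0 throws away most of a surround programme, which is why 5.1 is the safer guess.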
The DRC is the next question, and a relatively easy one - we stick with the default setting, which applies quite a lot of dynamic range processing. This will make sure that any 5.1 programming comes out of your speakers in a way that works for all programming and all speakers, even if it doesn’t sound so impressive on high-end systems. And while stereo programming might be affected a bit, it won’t seriously degrade the audio. The final important question is the dialnorm. Whatever happens, if the dialnorm doesn’t match the programme, you’ll hear the sound either too loud or too quiet. Since not all programmes are at the same level, there is no ‘perfect’ value to choose, so we simply have to make a best guess. The choice we made is to use a dialnorm of -23dB, the same as for stereo. What this means is that stereo programmes should sound normal, while surround sound programmes will likely sound too quiet (by a varying amount depending on the programme). Again, we based this decision on the least-worst effect it would have; surround being too quiet is less bad than stereo programmes being too loud (which would have been the other option), as people generally find things jumping up in volume more annoying.
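Here is a rough sketch of the dialnorm arithmetic in Python. It assumes the usual Dolby Digital convention that a decoder aims for dialogue at -31dB and so turns the audio down by (31 + dialnorm) dB. The function names and the -27dB surround programme level are made up for illustration; only the -23dB value comes from this post.

```python
# Sketch of the loudness error when dialnorm does not match the programme.
# Assumes the decoder normalises dialogue to -31 dB (usual Dolby Digital convention).

TARGET_LEVEL_DB = -31          # level the decoder aims dialogue at (assumption)
REVERSION_DIALNORM_DB = -23    # value quoted in this post


def decoder_attenuation_db(dialnorm_db: int) -> int:
    """Attenuation the decoder applies for a signalled dialnorm of -1 to -31 dB."""
    if not -31 <= dialnorm_db <= -1:
        raise ValueError("dialnorm must be between -31 and -1 dB")
    return dialnorm_db - TARGET_LEVEL_DB


def loudness_error_db(actual_dialogue_db: float, signalled_dialnorm_db: int) -> float:
    """Positive means too loud, negative means too quiet, relative to the target."""
    reproduced = actual_dialogue_db - decoder_attenuation_db(signalled_dialnorm_db)
    return reproduced - TARGET_LEVEL_DB


# Hypothetical programme dialogue levels, for illustration only.
for name, level_db in [("stereo programme", -23), ("surround programme", -27)]:
    error = loudness_error_db(level_db, REVERSION_DIALNORM_DB)
    print(f"{name}: {error} dB relative to target")
```

A programme whose dialogue really sits at -23dB comes out on target, while anything mixed quieter than that ends up too quiet by the difference, which is the trade-off described above.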
So there you go, that’s metadata for you. We think we have a pretty strong system now, so that all surround and stereo programmes reach you with the best metadata settings possible, and even if things go wrong the results should be pretty good. We’ve also used some tricks with metadata to help us identify the source of a problem if one occurs, so as well as sounding better if things do go wrong, we can fix the problem faster. Some of you may be disappointed to hear it, but I don’t think we’ll be having any more Reverse Karaoke!
So that’s it from me! If anything particularly exciting happens in the world of BBC HD’s sound, I’ll try to let you know. And as I move up north to help develop a new Research and Development lab, I’ll try to tell you a bit about that too, as I think it’ll be an exciting journey. Watch out for a new series of posts from me about that on this website, and for updates on BBC HD and the BBC’s wider technology work, keep reading the BBC Internet Blog. Cheers!
I’m an engineer with the BBC and sharing information about my work, but this is my personal website.