The rejection notice arrives at 11 PM. Your file has been in broadcaster ingest for six hours. The campaign airs in three days. The automated QC system at a major European broadcaster has flagged your audio: True Peak violation, loudness non-compliance, sample rate mismatch. You have no idea what any of that means. Your audio engineer is in a different time zone. Your client is not going to be told about this.
This is the situation that audio post-production exists to prevent — and the one that happens when it is treated as the engineer's problem to manage rather than the PM's risk surface to understand.
You do not need to mix audio. You do not need to run a DAW. But you do need to know what a LUFS target is, why one master delivered to eight markets is usually non-compliant in at least two of them, and what question to ask before delivery that eliminates the entire category of broadcaster rejection. That knowledge is not in your production training. It sits in a technical layer that most PMs encounter only when something goes wrong.
This guide covers it before something goes wrong.
01. What Audio Post-Production Is — and What It Actually Does
Audio post-production is the set of processes applied to a recorded voice-over signal to prepare it for delivery. It works on the raw recording — the file that comes out of the studio or remote session — and transforms it from captured audio into broadcast-ready, technically compliant audio.
The pipeline is sequential and non-negotiable. You cannot skip a stage and patch it later. You cannot fix a dynamics problem at the delivery stage without compromising the mix. Every stage builds on what came before.
The Four-Stage Audio Post Workflow
VoiceArchive's audio post process follows four discrete stages:
Cleanup
Surgical work on the raw recording. Noise reduction removes room tone, hum, and environmental interference. Spectral repair addresses clicks, pops, and artefacts. Breath control shapes audible breathing between lines. De-essing reduces sibilance — the harsh S and SH sounds that distort on earbuds and headphones. On a luxury brand campaign, unaddressed sibilance is not a minor audio issue. It is a brand register problem.
Tone Shaping
EQ and tonal correction bring the voice into the frequency range required for the delivery platform and brand context. A voice recorded in a slightly bright room needs different treatment to a voice recorded in a controlled booth. Tone shaping is where the character of the final delivery is established. If it is not done at this stage, no amount of downstream processing recovers it.
Dynamics in Mastering
Compression and limiting bring the dynamic range of the recording into the window the delivery platform requires. Dynamics processing is irreversible once applied. This is the stage where the "delivered quiet and clean beats delivered loud and distorted" principle is enforced by competent engineers — and ignored by those who do not understand platform normalization.
Loudness and True Peak
The final compliance stage. The processed file is matched to the loudness target of the destination platform — measured in LUFS — and True Peak values are checked and limited. This is not a cosmetic adjustment. It determines whether the file is accepted or rejected at broadcaster ingest.
The Five-Step Pipeline
The four processing stages sit inside a broader operational pipeline — everything that happens from the moment talent files arrive to the moment the final deliverable leaves the building.
Receive Files
Incoming recordings from talent — studio sessions, remote recordings, multi-market files — are received and logged. Every file in the job must be accounted for before processing begins. Missing files caught here cost minutes. Missing files caught at delivery cost days.
First QC
Incoming files are checked against the brief before a single processing decision is made: format compliance, sample rate, session integrity, missing elements. A sample rate error caught at First QC is a five-minute fix. The same error caught by a broadcaster's automated ingest system at 2 AM is a campaign crisis. First QC is the gate that determines whether what arrives is actually workable.
Post-production
The four-stage processing workflow: Cleanup → Tone Shaping → Dynamics in Mastering → Loudness & True Peak. This is where the raw recording becomes broadcast-ready audio. Each stage is sequential and non-negotiable — skipping one corrupts the input for everything downstream. See the full breakdown in Section 04.
Mixing and Edit
The processed VO is assembled into the final deliverable: clip arrangement, sync to picture, VO balanced against music and SFX, level-matching across all campaign executions, versioning (30s/15s/6s cuts). This is a distinct step from post-production — and the one where a baked mix becomes uneditable. The distinction matters every time a client wants a line changed after delivery.
Delivery
Final files are exported to the specifications of each delivery destination — format, sample rate, bit depth, LUFS target, True Peak limit, naming conventions — and submitted. A campaign delivering to Europe, the US, Japan, and Australia requires four different master exports. One master delivered to all four is non-compliant in at least two markets.
02. Common Mistakes Across the Pipeline
The failures PMs encounter most often are not random. They follow a pattern, and most of them are caused by information that was never in the brief.
The Five Brief Gaps That Drive Rework
The most common audit finding in audio post is not a technical failure on the engineer's side. It is a brief that did not contain the information needed to produce the right output.
No delivery destination specified
"Broadcast" is not a delivery destination. EBU R128 for European broadcast, ATSC A/85 for US broadcast, ARIB for Japan, and OP-59 for Australia and New Zealand are different standards with different loudness targets. A PM who writes "for broadcast" in the brief has given the engineer no actionable information.
No loudness target requested
Without a specified LUFS target, the engineer delivers to a default that may or may not match the receiving platform. The cost of discovering this at ingest is a full redelivery cycle.
No revision window defined
If the brief does not include a revision window — and by extension, a requirement for session files to be retained — the engineer has no obligation to retain an OMF or AAF. When a line change request arrives two weeks after delivery, the files may no longer exist. The result is a full re-record, re-process, and re-mix. Not because anyone made a mistake. Because the brief did not ask for what was needed.
No format specification shared
The delivery format — file type, bit depth, sample rate, mono vs. stereo, codec — should be confirmed before post-production begins, not requested after the fact.
Sample rate not specified
The standard for broadcast is 48kHz. Many DAWs default to 44.1kHz (the CD standard). A file at 44.1kHz delivered to a broadcaster running 48kHz ingest will fail automated QC. It is one of the top five rejection reasons across major European broadcasters.
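The five gaps above can be checked mechanically before a job is booked. The sketch below is a hypothetical helper, not a VoiceArchive tool: the field names (`delivery_destinations`, `loudness_target_lufs`, `revision_window_days`, `format_spec`, `sample_rate_hz`) and the list of "too vague" destination words are assumptions made for illustration.

```python
# Hypothetical brief checker. Field names are illustrative, not a real schema.
REQUIRED_FIELDS = {
    "delivery_destinations": "No delivery destination specified",
    "loudness_target_lufs": "No loudness target requested",
    "revision_window_days": "No revision window defined",
    "format_spec": "No format specification shared",
    "sample_rate_hz": "Sample rate not specified",
}

# Assumption: words that name a medium but not a standard.
VAGUE_DESTINATIONS = {"broadcast", "online", "tv"}


def brief_gaps(brief):
    """Return a list of gap descriptions for a brief given as a dict."""
    gaps = [label for field, label in REQUIRED_FIELDS.items() if not brief.get(field)]
    # "Broadcast" alone names no standard, so flag it even though the field is filled.
    destinations = brief.get("delivery_destinations") or []
    if destinations and all(d.strip().lower() in VAGUE_DESTINATIONS for d in destinations):
        gaps.append("Delivery destination too vague to pick a standard")
    return gaps


example_brief = {
    "delivery_destinations": ["broadcast"],
    "format_spec": "WAV, 24-bit, stereo",
    "sample_rate_hz": 48000,
}
for gap in brief_gaps(example_brief):
    print(gap)
```

For the example brief, the helper reports the missing loudness target, the missing revision window, and the vague destination. A brief that returns an empty list still needs a human read, but it can no longer fail on the five gaps listed here.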
What Broadcaster Rejection Actually Looks Like
Major European broadcasters — BBC, ZDF, France Télévisions, RAI — operate automated QC ingest systems. A file that fails does not reach a human reviewer. It is rejected automatically, with a notice issued between four and twenty-four hours after submission.
The Five Most Common Rejection Reasons
1. Loudness non-compliance
2. True Peak violations
3. Sample rate mismatch
4. Codec or format errors
5. Loudness Range violations on mixed content
The time cost of a rejection cycle: detection (4–48 hours) plus correction (2–4 hours) plus resubmission (4–48 hours). With three days to air, one rejection cycle is a campaign crisis. With two days to air, it is a missed air date.
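The arithmetic behind that claim is worth running once. The sketch below adds up the three ranges quoted above; taking the midpoint of each range as a "typical" cycle is an assumption made here, not a figure from any broadcaster.

```python
# Ranges taken from the text above, as (min_hours, max_hours).
DETECTION = (4, 48)
CORRECTION = (2, 4)
RESUBMISSION = (4, 48)
STAGES = (DETECTION, CORRECTION, RESUBMISSION)


def cycle_hours():
    """Return (best, midpoint, worst) total hours for one rejection cycle."""
    best = sum(low for low, _ in STAGES)
    worst = sum(high for _, high in STAGES)
    midpoint = (best + worst) / 2  # assumption: midpoint stands in for a typical cycle
    return best, midpoint, worst


best, midpoint, worst = cycle_hours()
print(f"one cycle: best {best} h, midpoint {midpoint} h, worst {worst} h")
for days_to_air in (3, 2):
    slack = days_to_air * 24 - midpoint
    print(f"{days_to_air} days to air: {slack:+.0f} h of slack at the midpoint")
```

One cycle runs from 10 to 100 hours. At the 55-hour midpoint, three days to air leaves 17 hours of slack and two days leaves none, which is the difference between a crisis and a missed air date.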
03. What Clients Actually Get From Audio Post Done Right
The output of audio post-production is not a processed file. It is a delivery decision made before the client ever hears the finished campaign.
The Perceptual Reality
The listener does not consciously hear "bad audio." They hear "untrustworthy brand." The attribution is subconscious — and it is consistent.
UCL research on acoustic environment and speaker credibility established that the same speaker delivering identical content is rated significantly lower in competence and trustworthiness when the acoustic conditions are poor. Edison Research data identifies poor audio quality as the most cited reason audiences abandon audio content — ahead of poor content quality.
For a PM, this translates directly: audio post that is done correctly means the creative work is not undermined by technical failures the audience cannot name but will feel. Audio post that is done incorrectly means the voice talent, the script, the music, and the strategic brief are all working against a signal that makes the brand feel smaller than it is.
What Consistency Across Markets Requires
A campaign running across eight markets is not one campaign delivered eight times. It is eight executions that must hold together as a single coherent brand experience. Inconsistent mic placement across markets produces different tonal characters from market to market. Inconsistent loudness levels mean each execution sounds different on back-to-back listening.
Delivering consistent output across markets requires consistent process, not consistent talent. The audio post workflow is what creates that consistency — not the studio each market records in.
04. Inside the Engine Room: The Audio Post Workflow
What happens between "raw recording" and "broadcast-ready file" is not an artistic process. It is a technical production process with defined stages, defined targets, and defined failure modes. Understanding it at operational level — not engineering level — is what gives you the language to brief it correctly and review deliverables with competence.
What Happens at Each Stage
Each stage is a set of discrete operations. Here is what is happening inside each one — and what it protects the PM from if done correctly.
| Stage | Process | What It Does |
|---|---|---|
| 01 Cleanup | Breaths | Removes unwanted and distracting breath sounds between lines. Uncontrolled breaths undermine professional delivery, particularly in broadcast and luxury brand contexts. |
| | Clicks & Pops | Identifies and eliminates clicks, spit, and mouth noises. These artefacts are often inaudible on studio monitors but severe on earbuds and car speakers. |
| | Room Resonances | Manages low-frequency resonances from the recording environment. Unchecked room resonance creates a muddy, unprofessional sound that no downstream EQ fully corrects. |
| | Steady Noise | Reduces continuous background noise: air conditioning, computer hum, electrical interference. The noise floor of the recording must be clean before any further processing begins. |
| | Hums & Tones | Addresses electrical hum and tonal interference. Removing these gives a foundation clean enough for accurate mastering; left in, loudness targets are hit against a noisy floor. |
| 02 Tone Shaping | EQ | Balances lows, mids, and highs for clarity. A well-EQ'd voice is smooth and easy to listen to, so the listener processes the message, not the signal. |
| | Balance | Ensures vocal consistency across takes and edits. A campaign with 12 markets that all sound tonally different is a brand consistency problem, not just a technical one. |
| | Frequency | Targeted frequency adjustment to reduce muddiness and enhance crispness. This is where the voice is shaped for the delivery platform: broadcast EQ differs from podcast EQ. |
| | Intelligibility | Enhances listener comprehension of the voice-over. In advertising, message delivery failure is campaign failure. Intelligibility is not an aesthetic goal, it is a functional one. |
| 03 Dynamics in Mastering | Gentle Compression | Applies smooth level control to maintain vocal consistency and clarity across the dynamic range of a performance. Eliminates jarring volume changes between loud and quiet lines. |
| | De-essing | Reduces harsh sibilance (S/SH sounds). Critical for campaigns delivered to streaming and social platforms, where the majority of consumption happens on earbuds at close range. |
| | Dynamic Range | Balances loudness variations across the recording. Without this, the mix has no coherent loudness signature: each line sits at a different level and the delivery feels uncontrolled. |
| | Audio Enhancement | Processing to enhance vocal presence and engagement. Makes the performance more arresting without altering its character, and is particularly useful for re-energising flat remote recordings. |
| | Limiter | Prevents clipping by controlling signal peaks. Clipping is irreversible distortion. A limiter is the last line of defence before the signal reaches the loudness measurement stage. |
| 04 Loudness & True Peak | Target LUFS | The processed file is matched to the loudness target of the delivery platform. Every broadcast and streaming platform publishes a LUFS target. The engineer must know it before the mastering stage begins, not after the file has been submitted. |
| | Safe Headroom | Maintains a safe True Peak (dBTP) threshold to avoid clipping at the inter-sample level. True Peak differs from standard sample-peak metering: a file reading -0.5 dBFS on a sample-peak meter can still exceed the True Peak limit. |
| | Monitoring | Loudness is measured with platform-approved tools (LUFS meters, True Peak meters). An engineer without True Peak-aware tooling in their chain can deliver a file that appears compliant and still fails ingest. |
| | Consistency | Ensures consistent loudness across all voice-over segments within a delivery. A spot that shifts in perceived volume mid-way through fails the audience before it fails the QC system. |
| | Final Check | A complete pre-delivery review against the brief specifications: format, sample rate, bit depth, LUFS, True Peak, naming conventions. The final check is what First QC catches on the way in, and what the engineer must catch on the way out. |
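The Safe Headroom row is the least intuitive one in the table, so here is a small demonstration of why sample peak and True Peak can disagree. It is a toy illustration, not a compliant True Peak meter: real meters oversample the digital signal, whereas this sketch simply evaluates the same sine wave on a denser grid. The tone, phase, and grid sizes are arbitrary choices.

```python
import math


def peak_dbfs(samples):
    """Highest absolute value in the list, in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))


def sine(n_points, points_per_cycle, phase):
    """Evaluate a full-scale sine wave on a regular grid."""
    return [math.sin(2 * math.pi * n / points_per_cycle + phase) for n in range(n_points)]


# Four samples per cycle with a 45-degree phase offset:
# every sample lands beside the crest of the wave, never on it.
coarse = sine(64, 4, math.pi / 4)

# The same waveform evaluated eight times more densely (32 points per cycle),
# standing in for what an oversampling True Peak meter would reconstruct.
dense = sine(512, 32, math.pi / 4)

sample_peak_db = peak_dbfs(coarse)
true_peak_estimate_db = peak_dbfs(dense)
print(f"sample peak: {sample_peak_db:.2f} dBFS")
print(f"peak on the dense grid: {true_peak_estimate_db:.2f} dBFS")
```

The stored samples peak about 3 dB below full scale, while the waveform they describe reaches full scale between them. A sample-peak meter reports the first number. The broadcaster's QC system measures something much closer to the second.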
Why Stage Order Is Not Negotiable
Every stage in the audio post workflow outputs a signal that the next stage processes. Cleanup before tone shaping means the EQ is working on a clean signal, not on noise. Tone shaping before dynamics means the compressor is responding to the actual character of the voice, not to artefacts that should have been removed. Dynamics before loudness means the LUFS measurement reflects the true programme loudness, not loudness that has been artificially inflated or suppressed.
If any stage is skipped or done out of order, the downstream stages are working on incorrect input. The problem does not become visible until the file reaches broadcast ingest — or until the client hears the spot and it does not sound like the brand.
What the First QC Gate Actually Does
First QC is the stage that most PMs do not know exists, and the one that has the highest return on investment in the pipeline. Incoming files are checked against: format, sample rate, session integrity, and any brief-specified technical parameters.
A sample rate error caught at First QC is a five-minute fix. The same error caught by a broadcaster's automated ingest system at 2 AM, three days before air date, is not.
That 9/10 number is not the result of talented engineers working quickly. It is the result of a First QC gate, a four-stage workflow, and a team that reads the spec sheet before the session begins.
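For a sense of how little First QC costs, here is a minimal sketch of the header check using only Python's standard `wave` module. The `SPEC` values and the file name are hypothetical, and the `wave` module reads uncompressed PCM WAV only, so treat this as an outline of the idea rather than a production QC tool.

```python
import os
import tempfile
import wave

# Hypothetical brief values; a real job would take these from the delivery spec.
SPEC = {"sample_rate_hz": 48000, "bit_depth": 24, "channels": 1}


def first_qc(path, spec=SPEC):
    """Compare a PCM WAV header against the brief and return a list of mismatches."""
    with wave.open(path, "rb") as wav:
        found = {
            "sample_rate_hz": wav.getframerate(),
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
        }
    return [
        f"{key}: expected {spec[key]}, found {found[key]}"
        for key in spec
        if found[key] != spec[key]
    ]


# Demo input: one second of mono silence at 44.1 kHz / 16-bit.
demo_path = os.path.join(tempfile.mkdtemp(), "incoming_vo.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(44100)
    wav.writeframes(b"\x00\x00" * 44100)

problems = first_qc(demo_path)
for problem in problems:
    print(problem)
```

The demo file is flagged for both sample rate and bit depth before any processing time has been spent on it.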
05. What Mixing and Edit Is and Why It Matters
This is the distinction most PMs do not have, and it is the one that causes the most expensive rework.
Audio post-production processes the individual signal — it works on a single VO recording and makes it technically and sonically correct. Mixing and editing assembles the final deliverable: clip arrangement, synchronisation to picture, VO balanced against music and SFX, level-matching across all campaign executions, and versioning — the 30-second, 15-second, and 6-second cuts that each require their own mix pass.
A baked stereo mix cannot be surgically altered. When a mix is printed — bounced to a single stereo file — it is a fixed output. You cannot pull one element out of it. If the client wants one line of VO changed after the mix has been delivered, the process is: re-record the line → re-process → re-edit → re-mix → re-deliver. On a 10-market campaign, one line change without session management can cascade into forty or more individual file deliveries.
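The cascade is plain multiplication. The sketch below counts the files touched by one line change, assuming the changed line appears in every cut. The ten markets and three cut lengths come from the text; the two delivery destinations per market are an assumption added for illustration.

```python
from itertools import product

markets = [f"market_{i:02d}" for i in range(1, 11)]  # 10 markets, as in the text
cuts = ["30s", "15s", "6s"]                          # cut lengths named in the text
destinations = ["broadcast", "social"]               # assumption: two destinations per market

# Every combination is a separate file that has to be re-mixed and re-delivered.
deliverables = list(product(markets, cuts, destinations))
print(len(deliverables), "files to re-deliver")
```

Change any of the three lists and the total moves with it, which is why the number of destinations and cuts belongs in the brief alongside the number of markets.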
Studio Vocabulary PMs Need
Key Terms
- The mix: the combined stereo output. The "final" that goes to the broadcaster or platform.
- VO stem: the isolated, processed voice-over track. Essential for re-versioning.
- Baked / printed: the mix has been bounced to a single file. No individual elements are accessible from that file, and they can only be recovered if the original session was retained.
- Revision rounds: the number of client feedback cycles built into the session budget and schedule. Must be briefed upfront. If not briefed, there is no guarantee the session files will be retained.
If you are in a review session and someone says "we can just reprint the mix," that is a signal that the session is being treated as a single-pass deliverable. Ask where the stems are.
06. The Secret Language of Audio: OMF, AAF, and Stems
The single most preventable source of audio rework in post-production is the absence of an OMF or AAF request in the brief. This is not a technical detail. It is a contractual and operational one. And it belongs in the brief, not in the revision email.
What These Files Are
A baked WAV file is a photograph of a painting. An OMF (Open Media Framework) or AAF (Advanced Authoring Format) is the original layered file: the clips, edits, and fades of the session preserved and editable in a compatible DAW. Plug-in settings and other DAW-specific processing generally do not travel with the interchange file, which is one reason to ask for the project files to be retained as well. These formats are the interchange standard between professional audio workstations. They exist specifically to make sessions portable and revisable.
Stems are a middle option: rendered layer groups (VO stem, music stem, SFX stem) delivered as separate files. Not as flexible as OMF/AAF — you cannot make clip-level edits — but they allow re-balancing without a full re-mix. Stems are the minimum for any campaign that may require market-specific versioning.
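To make "re-balancing without a full re-mix" concrete, here is a toy sketch of what stems allow. The stem names, the handful of sample values, and the -6 dB change are invented for illustration; real stems are full audio files, but the principle is the same sum.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear multiplier."""
    return 10 ** (db / 20)


def mix_stems(stems, gains_db):
    """Sum equal-length stems sample by sample after applying a per-stem gain in dB."""
    length = len(next(iter(stems.values())))
    return [
        sum(stems[name][i] * db_to_gain(gains_db.get(name, 0.0)) for name in stems)
        for i in range(length)
    ]


# Toy stems: a few samples each, values chosen arbitrarily.
stems = {
    "vo":    [0.20, 0.40, -0.30, 0.10],
    "music": [0.10, -0.10, 0.10, -0.10],
    "sfx":   [0.00, 0.05, 0.00, -0.05],
}

original_mix = mix_stems(stems, {})               # everything at unity gain
music_down = mix_stems(stems, {"music": -6.0})    # client asks for quieter music
print(original_mix)
print(music_down)
```

With stems, the music change is one parameter. With only `original_mix` in hand, there is no operation that lowers the music without also lowering the voice.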
The Critical Constraint
The OMF/AAF request must be made before recording begins. A session that was not built for interchange export cannot be retrofitted. If the engineer did not set up the session to export a clean OMF or AAF, there is no backwards path. This is not a refusal. It is a technical limitation.
On a 10-market campaign that requires corrections in three markets: without OMF/AAF, that is three full pipeline runs — re-record, re-process, re-mix, re-deliver, per market. With OMF/AAF, it is three targeted edits in the existing session.
The Brief Language to Use
"Please retain project files for 90 days post-campaign and provide an OMF or AAF of the session with the final delivery. We require revision access within this window."
That sentence, included in the initial brief, eliminates an entire category of rework.
07. LUFS Standards by Platform — The Reference Table
LUFS (Loudness Units relative to Full Scale) is the measurement standard used by every broadcast and streaming platform to specify the loudness of delivered audio. LKFS, the term used in ATSC and ARIB documents, is the same measurement under a different name. Every platform publishes a LUFS target. Every platform's automated ingest system checks against it.
Broadcast Standards
| Market / Standard | Loudness Target | True Peak Limit |
|---|---|---|
| Europe (EBU R128) | -23 LUFS | -1 dBTP |
| United States (ATSC A/85) | -24 LKFS | -2 dBTP |
| Japan (ARIB) | -24 LKFS | — |
| Australia / New Zealand (OP-59) | -24 LKFS | — |
Streaming and Social Standards
| Platform | Loudness Target | True Peak Limit |
|---|---|---|
| Spotify | -14 LUFS | -1 dBTP |
| YouTube | -14 LUFS | -1 dBTP |
| Apple Music / Podcasts | -16 LUFS | -1 dBTP |
| Netflix (dialogue) | -27 LUFS | -2 dBTP |
| Facebook / Meta | -24 LUFS | — |
| TikTok | -14 LUFS | — |
| Safe cross-platform social target | -16 LUFS | -1 dBTP |
IVR and Telephony — A Different Standard Entirely
IVR and telephony audio is not measured in LUFS. It uses dBFS RMS. Standard telephony targets -9 to -12 dBFS RMS. Delivery format is typically 8kHz or 16kHz, mono, 16-bit for legacy systems. A file that is LUFS-compliant for broadcast will not necessarily be correct for IVR. If your campaign includes any IVR or phone-based channel, specify this explicitly in the brief.
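For reference, RMS level is simple to compute. The sketch below measures a test tone and checks it against the window quoted above. It assumes the convention in which a full-scale sine wave reads about -3 dBFS RMS; some tools use a convention that reads 3 dB higher, so confirm which one the telephony vendor means. The tone and its amplitude are arbitrary.

```python
import math


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0), full-scale sine = about -3 dB."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)


def in_telephony_window(level_db, low=-12.0, high=-9.0):
    """True if the level falls inside the -12 to -9 dBFS RMS window from the text."""
    return low <= level_db <= high


# One second of a 400 Hz tone at 8 kHz, half of full-scale amplitude.
tone = [0.5 * math.sin(2 * math.pi * 400 * n / 8000) for n in range(8000)]
level = rms_dbfs(tone)
print(f"{level:.2f} dBFS RMS, in window: {in_telephony_window(level)}")
```

The tone lands just inside the loud edge of the window. Speech is far less steady than a tone, which is why a LUFS-compliant broadcast file can sit well outside this window.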
The Auto-Normalization Trap
Streaming platforms normalize incoming audio. This does not save a badly mastered file.
If the VO was heavily compressed to sound loud, the platform reduces its overall volume when normalizing — but the dynamic compression artefacts are already baked into the signal. Delivered quiet and clean beats delivered loud and distorted, every time.
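The reason normalization cannot rescue an over-compressed file is that it only changes the volume. The sketch below uses crest factor (peak level relative to RMS level) as a rough measure of how much dynamic life is left in a signal, and hard clipping as a crude stand-in for heavy compression. Both choices are simplifications made for illustration.

```python
import math


def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB: higher means more dynamic contrast."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)


# A clean tone delivered at a modest level.
clean = [0.25 * math.sin(2 * math.pi * n / 100) for n in range(1000)]

# "Mastered loud": boosted six times and hard clipped at full scale.
loud = [max(-1.0, min(1.0, 6.0 * s)) for s in clean]

# Platform normalization modelled as a single turn-down applied to the whole file.
normalized = [0.25 * s for s in loud]

print(f"clean:      {crest_factor_db(clean):.2f} dB")
print(f"loud:       {crest_factor_db(loud):.2f} dB")
print(f"normalized: {crest_factor_db(normalized):.2f} dB")
```

The loud version has a lower crest factor than the clean one, and turning it down leaves that figure unchanged. The platform gives back the level, not the dynamics.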
What "One Master for Eight Markets" Actually Means
A campaign delivering to Europe, the US, Japan, and Australia requires four different master exports. One master at -23 LUFS delivered to all four markets is non-compliant in at least two of them. The engineer needs to know every delivery destination before the mastering stage begins.
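A rough way to see why one master does not travel: work out the gain each destination would need and what that gain does to True Peak. The sketch below uses three rows from the reference tables above and relies on the fact that a plain gain change moves loudness and True Peak by the same number of dB. The measured values for the master are hypothetical, and the function is an illustration, not a mastering tool.

```python
# Targets copied from the reference tables above:
# (loudness target in LUFS/LKFS, True Peak limit in dBTP).
DESTINATIONS = {
    "Europe (EBU R128)": (-23.0, -1.0),
    "United States (ATSC A/85)": (-24.0, -2.0),
    "Spotify": (-14.0, -1.0),
}


def plan_exports(measured_lufs, measured_dbtp):
    """For each destination return (gain_db, shifted_true_peak, fits_with_gain_only)."""
    plan = {}
    for name, (target, tp_limit) in DESTINATIONS.items():
        gain_db = target - measured_lufs        # gain needed to hit the loudness target
        shifted_tp = measured_dbtp + gain_db    # True Peak moves by the same amount
        plan[name] = (gain_db, shifted_tp, shifted_tp <= tp_limit)
    return plan


# Hypothetical master: mixed for European broadcast.
plan = plan_exports(measured_lufs=-23.0, measured_dbtp=-1.0)
for name, (gain_db, shifted_tp, fits) in plan.items():
    verdict = "gain change only" if fits else "needs its own mastering pass"
    print(f"{name}: {gain_db:+.1f} dB, True Peak {shifted_tp:+.1f} dBTP, {verdict}")
```

Each destination ends up needing a different export. Where the required gain pushes True Peak over the limit, a volume change is not enough and the file needs its own limiting and mastering pass.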
08. Specific Voice Challenges — What Post Can Resolve (and What It Cannot)
Audio post can resolve a significant range of technical problems in a recorded performance. It cannot resolve problems that originated in the recording environment or the performance itself.
What Post-Production Resolves
Sibilance is the most pervasive issue on digital campaign audio. Harsh S and SH sounds cause distortion on earbuds and headphones. De-essing in the cleanup stage addresses this. But severe sibilance caused by mic placement or performance may require multiple passes and can affect the natural character of the voice. Brief it early.
Plosives — the burst of air on P and B sounds — behave differently across playback systems. Barely audible on studio monitors, they can be severe on laptop speakers or car stereos. Post-production can reduce plosives, but the most reliable fix is a pop filter in the recording session.
Room reverb and reflections reduce speech intelligibility. Post-production can apply reverb reduction, but a badly reverberant recording environment leaves artefacts that no amount of processing fully removes.
What Post-Production Cannot Fix
Beyond the reach of post
- Inconsistent mic placement across markets. Post-production can correct individual files, but matching sixteen different tonal characters to a single brand voice adds significant time and budget. Include microphone setup guidance in the recording spec.
- Performance issues. Pace, emphasis, and emotional register are performance decisions. Post-production does not change these. If a market's recording lands with the wrong energy for the brand, the fix is a retake.
- Fundamentally poor recording quality. A recording made on consumer hardware in an uncontrolled environment can be made marginally more listenable. It cannot be made broadcast-compliant. First QC catches this — which is exactly where it belongs.
Work With VoiceArchive's Audio Post Team
The brief language, the platform standards, the OMF/AAF question — these are what separate campaigns that air on time from campaigns that generate 11 PM rejection notices. Tell us your delivery destinations. We will give you the LUFS targets, the format specs, and the brief language before the session goes into production.