The Audio Post-Production Guide for Project Managers — VoiceArchive

What you need to know to brief it right, catch errors early, and keep your campaign on air.

VoiceArchive  |  Production Workflow

Real-world risk

The rejection notice arrives at 11 PM. Your file has been in broadcaster ingest for six hours. The campaign airs in three days. The automated QC system at a major European broadcaster has flagged your audio: True Peak violation, loudness non-compliance, sample rate mismatch. You have no idea what any of that means. Your audio engineer is in a different time zone. Your client is not going to be told about this.

This is the situation that audio post-production exists to prevent — and the one that happens when it is treated as the engineer's problem to manage rather than the PM's risk surface to understand.

You do not need to mix audio. You do not need to run a DAW. But you do need to know what a LUFS target is, why one master delivered to eight markets is usually non-compliant in at least two of them, and what question to ask before delivery that eliminates the entire category of broadcaster rejection. That knowledge is not in your production training. It sits in a technical layer that most PMs encounter only when something goes wrong.

This guide covers it before something goes wrong.


01. What Audio Post-Production Is — and What It Actually Does

Audio post-production is the set of processes applied to a recorded voice-over signal to prepare it for delivery. It works on the raw recording — the file that comes out of the studio or remote session — and transforms it from captured audio into broadcast-ready, technically compliant audio.

The pipeline is sequential and non-negotiable. You cannot skip a stage and patch it later. You cannot fix a dynamics problem at the delivery stage without compromising the mix. Every stage builds on what came before.

The Four-Stage Audio Post Workflow

VoiceArchive's audio post process follows four discrete stages:

1. Cleanup

Surgical work on the raw recording. Noise reduction removes room tone, hum, and environmental interference. Spectral repair addresses clicks, pops, and artefacts. Breath control shapes audible breathing between lines. De-essing reduces sibilance — the harsh S and SH sounds that distort on earbuds and headphones. On a luxury brand campaign, unaddressed sibilance is not a minor audio issue. It is a brand register problem.

2. Tone Shaping

EQ and tonal correction bring the voice into the frequency range required for the delivery platform and brand context. A voice recorded in a slightly bright room needs different treatment to a voice recorded in a controlled booth. Tone shaping is where the character of the final delivery is established. If it is not done at this stage, no amount of downstream processing recovers it.

3. Dynamics in Mastering

Compression and limiting bring the dynamic range of the recording into the window the delivery platform requires. Dynamics processing is irreversible once applied. This is the stage where the "delivered quiet and clean beats delivered loud and distorted" principle is enforced by competent engineers — and ignored by those who do not understand platform normalization.

4. Loudness and True Peak

The final compliance stage. The processed file is matched to the loudness target of the destination platform — measured in LUFS — and True Peak values are checked and limited. This is not a cosmetic adjustment. It determines whether the file is accepted or rejected at broadcaster ingest.

The Five-Step Pipeline

The four processing stages sit inside a broader operational pipeline — everything that happens from the moment talent files arrive to the moment the final deliverable leaves the building.

1. Receive Files

Incoming recordings from talent — studio sessions, remote recordings, multi-market files — are received and logged. Every file in the job must be accounted for before processing begins. Missing files caught here cost minutes. Missing files caught at delivery cost days.

2. First QC

Incoming files are checked against the brief before a single processing decision is made: format compliance, sample rate, session integrity, missing elements. A sample rate error caught at First QC is a five-minute fix. The same error caught by a broadcaster's automated ingest system at 2 AM is a campaign crisis. First QC is the gate that determines whether what arrives is actually workable.

3. Post-production

The four-stage processing workflow: Cleanup → Tone Shaping → Dynamics in Mastering → Loudness & True Peak. This is where the raw recording becomes broadcast-ready audio. Each stage is sequential and non-negotiable — skipping one corrupts the input for everything downstream. See the full breakdown in Section 04.

4. Mixing and Edit

The processed VO is assembled into the final deliverable: clip arrangement, sync to picture, VO balanced against music and SFX, level-matching across all campaign executions, versioning (30s/15s/6s cuts). This is a distinct step from post-production — and the one where a baked mix becomes uneditable. The distinction matters every time a client wants a line changed after delivery.

5. Delivery

Final files are exported to the specifications of each delivery destination — format, sample rate, bit depth, LUFS target, True Peak limit, naming conventions — and submitted. A campaign delivering to Europe, the US, Japan, and Australia requires four different master exports. One master delivered to all four is non-compliant in at least two markets.


02. Common Mistakes Across the Pipeline

The failures PMs encounter most often are not random. They follow a pattern, and most of them are caused by information that was never in the brief.

The Five Brief Gaps That Drive Rework

The most common audit finding in audio post is not a technical failure on the engineer's side. It is a brief that did not contain the information needed to produce the right output.

1. No delivery destination specified

"Broadcast" is not a delivery destination. EBU R128 for European broadcast, ATSC A/85 for US broadcast, ARIB for Japan, and OP-59 for Australia and New Zealand are different standards with different loudness targets. A PM who writes "for broadcast" in the brief has given the engineer no actionable information.

2. No loudness target requested

Without a specified LUFS target, the engineer delivers to a default that may or may not match the receiving platform. The cost of discovering this at ingest is a full redelivery cycle.

3. No revision window defined

If the brief does not include a revision window — and by extension, a requirement for session files to be retained — the engineer has no obligation to retain an OMF or AAF. When a line change request arrives two weeks after delivery, the files may no longer exist. The result is a full re-record, re-process, and re-mix. Not because anyone made a mistake. Because the brief did not ask for what was needed.

4. No format specification shared

The delivery format — file type, bit depth, sample rate, mono vs. stereo, codec — should be confirmed before post-production begins, not requested after the fact.

5. Sample rate not specified

The standard for broadcast is 48 kHz. Many DAWs default to 44.1 kHz (the CD standard). A file at 44.1 kHz delivered to a broadcaster running 48 kHz ingest will fail automated QC. It is one of the top five rejection reasons across major European broadcasters.
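The five gaps above lend themselves to a mechanical check before a brief is sent. The following is a minimal sketch, not a VoiceArchive tool: the `AudioBrief` record, its field names, and the `brief_gaps` helper are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class AudioBrief:
    """Hypothetical brief record; the field names are illustrative only."""
    delivery_destination: Optional[str] = None    # e.g. "EBU R128", not just "broadcast"
    loudness_target_lufs: Optional[float] = None
    revision_window_days: Optional[int] = None    # implies session files are retained
    format_spec: Optional[str] = None             # file type, bit depth, channels, codec
    sample_rate_hz: Optional[int] = None


def brief_gaps(brief: AudioBrief) -> list[str]:
    """Return the names of brief fields that are missing or too vague to act on."""
    gaps = [f.name for f in fields(brief) if getattr(brief, f.name) is None]
    # "Broadcast" alone names no standard, so it is treated as a gap as well.
    dest = brief.delivery_destination
    if dest is not None and dest.strip().lower() in {"broadcast", "for broadcast"}:
        gaps.append("delivery_destination")
    return gaps


vague = AudioBrief(delivery_destination="broadcast", sample_rate_hz=48000)
print(brief_gaps(vague))
# ['loudness_target_lufs', 'revision_window_days', 'format_spec', 'delivery_destination']
```

An empty list means the brief carries all five pieces of information; it says nothing about whether the values themselves are right for the destination.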

What Broadcaster Rejection Actually Looks Like

Major European broadcasters — BBC, ZDF, France Télévisions, RAI — operate automated QC ingest systems. A file that fails does not reach a human reviewer. It is rejected automatically, with a notice issued between four and twenty-four hours after submission.

The Five Most Common Rejection Reasons

  1. Loudness non-compliance
  2. True Peak violations
  3. Sample rate mismatch
  4. Codec or format errors
  5. Loudness Range violations on mixed content

Deadline risk

The time cost of a rejection cycle: detection (4–48 hours) plus correction (2–4 hours) plus resubmission (4–48 hours). With three days to air, one rejection cycle is a campaign crisis. With two days to air, it is a missed air date.
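The arithmetic behind that warning is simple enough to write down. This sketch uses only the hour ranges quoted above; the function names are illustrative, and the "worst case must fit" rule is an assumption about how cautiously a PM would want to plan.

```python
# Hour ranges quoted in this guide for one rejection cycle: (best case, worst case).
DETECTION_H = (4, 48)
CORRECTION_H = (2, 4)
RESUBMISSION_H = (4, 48)


def rejection_cycle_hours():
    """Sum the best-case and worst-case hours across the three phases."""
    phases = (DETECTION_H, CORRECTION_H, RESUBMISSION_H)
    best = sum(p[0] for p in phases)
    worst = sum(p[1] for p in phases)
    return best, worst


def fits_before_air(hours_to_air):
    """True only if even the worst-case cycle completes before the air date."""
    _, worst = rejection_cycle_hours()
    return worst <= hours_to_air


best, worst = rejection_cycle_hours()
print(best, worst)               # 10 100
print(fits_before_air(3 * 24))   # False: the 100-hour worst case exceeds 72 hours
```

A single cycle therefore costs between 10 and 100 hours, which is why three days of margin cannot be relied on to absorb one rejection.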


03. What Clients Actually Get From Audio Post Done Right

The output of audio post-production is not a processed file. It is a delivery decision made before the client ever hears the finished campaign.

The Perceptual Reality

The listener does not consciously hear "bad audio." They hear "untrustworthy brand." The attribution is subconscious — and it is consistent.

UCL research on acoustic environment and speaker credibility established that the same speaker delivering identical content is rated significantly lower in competence and trustworthiness when the acoustic conditions are poor. Edison Research data identifies poor audio quality as the most cited reason audiences abandon audio content — ahead of poor content quality.

For a PM, this translates directly: audio post that is done correctly means the creative work is not undermined by technical failures the audience cannot name but will feel. Audio post that is done incorrectly means the voice talent, the script, the music, and the strategic brief are all working against a signal that makes the brand feel smaller than it is.

What Consistency Across Markets Requires

A campaign running across eight markets is not one campaign delivered eight times. It is eight executions that must hold together as a single coherent brand experience. Inconsistent mic placement across markets produces different tonal characters from market to market. Inconsistent loudness levels mean each execution sounds different on back-to-back listening.

Key principle

Delivering consistent output across markets requires consistent process, not consistent talent. The audio post workflow is what creates that consistency — not the studio each market records in.


04. Inside the Engine Room: The Audio Post Workflow

What happens between "raw recording" and "broadcast-ready file" is not an artistic process. It is a technical production process with defined stages, defined targets, and defined failure modes. Understanding it at operational level — not engineering level — is what gives you the language to brief it correctly and review deliverables with competence.

What Happens at Each Stage

Each stage is a set of discrete operations. Here is what is happening inside each one — and what it protects the PM from if done correctly.

Stage | Process | What It Does
01 Cleanup | Breaths | Removes unwanted and distracting breath sounds between lines. Uncontrolled breaths undermine professional delivery, particularly in broadcast and luxury brand contexts.
01 Cleanup | Clicks & Pops | Identifies and eliminates clicks, spit, and mouth noises. These artefacts are often inaudible on studio monitors but severe on earbuds and car speakers.
01 Cleanup | Room Resonances | Manages low-frequency resonances from the recording environment. Unchecked room resonance creates a muddy, unprofessional sound that no downstream EQ fully corrects.
01 Cleanup | Steady Noise | Reduces continuous background noise: air conditioning, computer hum, electrical interference. The noise floor of the recording must be clean before any processing begins.
01 Cleanup | Hums & Tones | Addresses electrical hum and tonal interference. Removing these leaves a foundation clean enough for accurate mastering; if they remain, loudness targets are hit against a noisy floor.
02 Tone Shaping | EQ | Balances lows, mids, and highs for clarity. A well-EQ'd voice is smooth and easy to listen to: the listener processes the message, not the signal.
02 Tone Shaping | Balance | Ensures vocal consistency across takes and edits. A campaign with 12 markets that all sound tonally different is a brand consistency problem, not just a technical one.
02 Tone Shaping | Frequency | Targeted frequency adjustment to reduce muddiness and enhance crispness. This is where the voice is shaped for the delivery platform: broadcast EQ differs from podcast EQ.
02 Tone Shaping | Intelligibility | Enhances listener comprehension of the voice-over. In advertising, message delivery failure is campaign failure: intelligibility is not an aesthetic goal, it is a functional one.
03 Dynamics in Mastering | Gentle Compression | Applies smooth level control to maintain vocal consistency and clarity across the dynamic range of a performance. Eliminates jarring volume changes between loud and quiet lines.
03 Dynamics in Mastering | De-essing | Reduces harsh sibilance (S/SH sounds). Critical for campaigns delivered to streaming and social platforms, where the majority of consumption happens on earbuds at close range.
03 Dynamics in Mastering | Dynamic Range | Balances loudness variations across the recording. Without this, the mix has no coherent loudness signature: each line sits at a different level and the delivery feels uncontrolled.
03 Dynamics in Mastering | Audio Enhancement | Processing to enhance vocal presence and engagement. Makes the performance more arresting without altering its character, which is particularly useful for re-energising flat remote recordings.
03 Dynamics in Mastering | Limiter | Prevents clipping by controlling signal peaks. Clipping is irreversible distortion. A limiter is the last line of defence before the signal reaches the loudness measurement stage.
04 Loudness & True Peak | Target LUFS | The processed file is matched to the loudness target of the delivery platform. Every broadcast and streaming platform publishes a LUFS target, and the engineer must know it before the mastering stage begins, not after the file has been submitted.
04 Loudness & True Peak | Safe Headroom | Maintains a safe True Peak (dBTP) threshold to avoid clipping at the inter-sample level. True Peak differs from standard peak metering: a file reading -0.5 dBFS on a sample-peak meter can still exceed the True Peak limit.
04 Loudness & True Peak | Monitoring | Loudness is measured with platform-approved tools (LUFS meters, True Peak meters). An engineer without True Peak-aware tooling in their chain can deliver a file that appears compliant and still fails ingest.
04 Loudness & True Peak | Consistency | Ensures consistent loudness across all voice-over segments within a delivery. A spot that shifts in perceived volume mid-way through fails the audience before it fails the QC system.
04 Loudness & True Peak | Final Check | A complete pre-delivery review against the brief specifications: format, sample rate, bit depth, LUFS, True Peak, naming conventions. It mirrors First QC: First QC checks what comes in, and the final check is what the engineer must catch on the way out.
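The Safe Headroom point, that a file can read -0.5 dBFS on a sample-peak meter and still exceed a True Peak limit, can be shown with a small experiment. The sketch below is an illustration under stated assumptions, not a compliant meter: ITU-R BS.1770 describes True Peak measurement with 4x oversampling and its own filter, whereas this code substitutes a simple Hann-windowed sinc interpolator. The function names and the test tone are hypothetical.

```python
import math


def sample_peak_dbfs(samples):
    """Peak level of the stored samples only, in dBFS."""
    return 20 * math.log10(max(abs(s) for s in samples))


def true_peak_dbtp(samples, oversample=4, half_width=16):
    """Estimate the peak of the reconstructed waveform by interpolating
    between samples (simplified stand-in for a BS.1770 True Peak meter)."""
    peak = 0.0
    n = len(samples)
    for i in range(half_width, n - half_width):      # skip the edges
        for k in range(oversample):
            t = i + k / oversample                    # fractional sample position
            acc = 0.0
            for j in range(i - half_width + 1, i + half_width + 1):
                x = t - j
                if x == 0:
                    w = 1.0
                else:
                    sinc = math.sin(math.pi * x) / (math.pi * x)
                    hann = 0.5 * (1 + math.cos(math.pi * x / half_width))
                    w = sinc * hann
                acc += samples[j] * w
            peak = max(peak, abs(acc))
    return 20 * math.log10(peak)


# A tone at a quarter of the sample rate, phased so that every stored sample
# falls beside the crest of the wave rather than on it.
amplitude = 10 ** (-0.5 / 20) / math.sin(math.pi / 4)
samples = [amplitude * math.sin(math.pi * n / 2 + math.pi / 4) for n in range(128)]

print(round(sample_peak_dbfs(samples), 2))   # -0.5
print(round(true_peak_dbtp(samples), 1))     # roughly 3 dB higher, above 0 dBTP
```

The stored samples never exceed -0.5 dBFS, yet the waveform they describe peaks well above full scale. That gap is what True Peak-aware tooling exists to catch.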

Why Stage Order Is Not Negotiable

Every stage in the audio post workflow outputs a signal that the next stage processes. Cleanup before tone shaping means the EQ is working on a clean signal, not on noise. Tone shaping before dynamics means the compressor is responding to the actual character of the voice, not to artefacts that should have been removed. Dynamics before loudness means the LUFS measurement reflects the true programme loudness, not loudness that has been artificially inflated or suppressed.

If any stage is skipped or done out of order, the downstream stages are working on incorrect input. The problem does not become visible until the file reaches broadcast ingest — or until the client hears the spot and it does not sound like the brand.

What the First QC Gate Actually Does

First QC is the stage that most PMs do not know exists, and the one that has the highest return on investment in the pipeline. Incoming files are checked against: format, sample rate, session integrity, and any brief-specified technical parameters.
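Part of that check can be automated. As a minimal sketch, assuming plain PCM WAV input, Python's standard `wave` module can read the header fields that First QC compares against the brief. The helper names and the temporary test file are hypothetical, and compressed or floating-point files would need a different reader.

```python
import os
import tempfile
import wave


def wav_format_issues(path, expected_rate_hz=48000,
                      expected_channels=None, expected_bits=None):
    """Compare a PCM WAV header against the brief; return a list of mismatches."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        bits = wav.getsampwidth() * 8
    issues = []
    if rate != expected_rate_hz:
        issues.append(f"sample rate {rate} Hz, brief says {expected_rate_hz} Hz")
    if expected_channels is not None and channels != expected_channels:
        issues.append(f"{channels} channel(s), brief says {expected_channels}")
    if expected_bits is not None and bits != expected_bits:
        issues.append(f"{bits}-bit, brief says {expected_bits}-bit")
    return issues


def write_silence(path, rate_hz, channels=1, sampwidth=2, frames=100):
    """Write a short silent WAV so the check has something to read."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sampwidth)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00" * frames * channels * sampwidth)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "take_44k.wav")
    write_silence(path, 44100)                 # the DAW-default mistake
    issues = wav_format_issues(path, expected_rate_hz=48000)

print(issues)   # ['sample rate 44100 Hz, brief says 48000 Hz']
```

A header check of this kind only confirms what the file claims to be. Session integrity and missing elements still need a person or a fuller toolchain.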

A sample rate error caught at First QC is a five-minute fix. The same error caught by a broadcaster's automated ingest system at 2 AM, three days before air date, is not.

  • 9/10: first-pass approval rate (last 12 months)
  • 90K+: jobs delivered across global markets
  • 20 yrs: experience across every broadcast standard

That 9/10 number is not the result of talented engineers working quickly. It is the result of a First QC gate, a four-stage workflow, and a team that reads the spec sheet before the session begins.


05. What Mixing and Edit Is, and Why It Matters

This is the distinction most PMs do not have, and it is the one that causes the most expensive rework.

Audio post-production processes the individual signal — it works on a single VO recording and makes it technically and sonically correct. Mixing and editing assembles the final deliverable: clip arrangement, synchronisation to picture, VO balanced against music and SFX, level-matching across all campaign executions, and versioning — the 30-second, 15-second, and 6-second cuts that each require their own mix pass.

Critical constraint

A baked stereo mix cannot be surgically altered. When a mix is printed — bounced to a single stereo file — it is a fixed output. You cannot pull one element out of it. If the client wants one line of VO changed after the mix has been delivered, the process is: re-record the line → re-process → re-edit → re-mix → re-deliver. On a 10-market campaign, one line change without session management can cascade into forty or more individual file deliveries.
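How one line change multiplies into dozens of files is easiest to see as a product. In this sketch the three cut lengths (30s, 15s, 6s) come from this guide, while `masters_per_cut` is a hypothetical parameter standing in for however many separate masters each cut needs, for example one for broadcast and one for social.

```python
def redeliveries(markets, cuts_per_market, masters_per_cut=1):
    """Number of files that must be re-exported when one VO line changes
    and every affected mix has to be rebuilt from scratch."""
    return markets * cuts_per_market * masters_per_cut


print(redeliveries(10, 3))       # 30: ten markets, three cut lengths
print(redeliveries(10, 3, 2))    # 60: the same, with two masters per cut
```

Depending on how many versions each market carries, the total lands on either side of forty, and every one of those files passes through the full re-record, re-process, re-edit, re-mix chain unless the session was retained.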

Studio Vocabulary PMs Need

Key Terms

  • The mix: the combined stereo output. The "final" that goes to the broadcaster or platform.
  • VO stem: the isolated, processed voice-over track. Essential for re-versioning.
  • Baked / printed: the mix has been bounced to a single file. Individual elements are no longer accessible in that file; they can be recovered only if the session or the stems were retained.
  • Revision rounds: the number of client feedback cycles built into the session budget and schedule. Must be briefed upfront. If not briefed, there is no guarantee the session files will be retained.

PM action

If you are in a review session and someone says "we can just reprint the mix," that is a signal that the session is being treated as a single-pass deliverable. Ask where the stems are.


06. The Secret Language of Audio: OMF, AAF, and Stems

The single most preventable source of audio rework in post-production is the absence of an OMF or AAF request in the brief. This is not a technical detail. It is a contractual and operational one. And it belongs in the brief, not in the revision email.

What These Files Are

A baked WAV file is a photograph of a painting. An OMF (Open Media Framework) or AAF (Advanced Authoring Format) export is the layered session itself: every clip, edit, and fade preserved and editable in a compatible DAW. Plug-in processing generally does not travel with these formats, which is one reason the brief should also ask for the project files to be retained. OMF and AAF are the interchange standard between professional audio workstations. They exist specifically to make sessions portable and revisable.

Stems are a middle option: rendered layer groups (VO stem, music stem, SFX stem) delivered as separate files. Not as flexible as OMF/AAF — you cannot make clip-level edits — but they allow re-balancing without a full re-mix. Stems are the minimum for any campaign that may require market-specific versioning.
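What "re-balancing without a full re-mix" means in practice: with stems, changing the level of one layer is a gain change followed by a sum. The sketch below works on tiny lists of float samples purely for illustration; the function names, stem names, and sample values are hypothetical, and a real mix would be done in a DAW.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear multiplier."""
    return 10 ** (db / 20)


def mix_stems(stems, gains_db):
    """Sum equal-length stems sample by sample after a per-stem gain in dB."""
    length = len(next(iter(stems.values())))
    if any(len(s) != length for s in stems.values()):
        raise ValueError("all stems must have the same length")
    mix = [0.0] * length
    for name, samples in stems.items():
        gain = db_to_gain(gains_db.get(name, 0.0))   # unlisted stems stay at 0 dB
        for i, s in enumerate(samples):
            mix[i] += gain * s
    return mix


stems = {"vo": [0.5, -0.5, 0.25], "music": [0.2, 0.2, -0.2]}
original = mix_stems(stems, {})
music_down = mix_stems(stems, {"music": -6.0})   # client asks for quieter music
print(original)
print(music_down)
```

The VO samples are untouched in both versions. With only a baked stereo file, there would be no separate `music` list to turn down.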

The Critical Constraint

Must act before recording begins

The OMF/AAF request must be made before recording begins. A session that was not built for interchange export cannot be retrofitted. If the engineer did not set up the session to export a clean OMF or AAF, there is no backwards path. This is not a refusal. It is a technical limitation.

On a 10-market campaign that requires corrections in three markets: without OMF/AAF, that is three full pipeline runs — re-record, re-process, re-mix, re-deliver, per market. With OMF/AAF, it is three targeted edits in the existing session.

The Brief Language to Use

"Please retain project files for 90 days post-campaign and provide an OMF or AAF of the session with the final delivery. We require revision access within this window."

That sentence, included in the initial brief, eliminates an entire category of rework.


07. LUFS Standards by Platform — The Reference Table

LUFS (Loudness Units relative to Full Scale) is the measurement used by broadcast and streaming platforms to specify the loudness of delivered audio. LKFS, the term used in the US and Japanese standards, is the same measurement under a different name. Broadcasters and most major streaming platforms publish a loudness target, and automated ingest systems check against it.

Broadcast Standards

Market / Standard | Loudness Target | True Peak Limit
Europe (EBU R128) | -23 LUFS | -1 dBTP
United States (ATSC A/85) | -24 LKFS | -2 dBTP
Japan (ARIB) | -24 LKFS | not stated
Australia / New Zealand (OP-59) | -24 LKFS | not stated

Streaming and Social Standards

Platform | Loudness Target | True Peak Limit
Spotify | -14 LUFS | -1 dBTP
YouTube | -14 LUFS | -1 dBTP
Apple Music / Podcasts | -16 LUFS | -1 dBTP
Netflix (dialogue) | -27 LUFS | -2 dBTP
Facebook / Meta | -24 LUFS | not stated
TikTok | -14 LUFS | not stated
Safe cross-platform social target | -16 LUFS | -1 dBTP

IVR and Telephony — A Different Standard Entirely

Different measurement system

IVR and telephony audio is not measured in LUFS. It uses dBFS RMS. Standard telephony targets -9 to -12 dBFS RMS. Delivery format is typically 8 kHz or 16 kHz, mono, 16-bit for legacy systems. A file that is LUFS-compliant for broadcast will not necessarily be correct for IVR. If your campaign includes any IVR or phone-based channel, specify this explicitly in the brief.
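For reference, an RMS level in dBFS is a much simpler measurement than LUFS: it is the average power of the samples, expressed in decibels against full scale. The sketch below assumes float samples in the range -1 to 1 and uses the plain mathematical convention, under which a full-scale sine reads about -3 dBFS. Some meters reference a full-scale sine as 0 dBFS instead and will read about 3 dB higher, so confirm which convention the telephony vendor uses. The function name and the test tone are illustrative.

```python
import math


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0), for float samples in [-1, 1]."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")                  # digital silence
    return 10 * math.log10(mean_square)       # equals 20 * log10(RMS)


rate = 8000                                   # typical legacy telephony sample rate
tone = [0.5 * math.sin(2 * math.pi * 400 * n / rate) for n in range(rate)]
level = rms_dbfs(tone)
print(round(level, 2))                        # -9.03
```

A half-scale tone sits at about -9 dBFS RMS, the top of the telephony window quoted above. Nothing in this calculation involves the perceptual weighting or gating that a LUFS meter applies, which is why the two numbers are not interchangeable.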

The Auto-Normalization Trap

Streaming platforms normalize incoming audio. This does not save a badly mastered file.

Common misconception

If the VO was heavily compressed to sound loud, the platform reduces its overall volume when normalizing — but the dynamic compression artefacts are already baked into the signal. Delivered quiet and clean beats delivered loud and distorted, every time.
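The reason normalization cannot rescue an over-processed file is that it is only a gain change, and a gain change leaves the shape of the waveform alone. The sketch below makes that visible with crest factor (peak level minus RMS level). It uses hard clipping as a crude stand-in for heavy limiting, and the signals and function names are illustrative assumptions rather than a model of any platform's normalizer.

```python
import math


def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB: a rough indicator of how much dynamics remain."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)


def apply_gain_db(samples, gain_db):
    """Scale every sample by the same factor, as a level change does."""
    g = 10 ** (gain_db / 20)
    return [g * s for s in samples]


clean = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
# Stand-in for over-processing: push the signal 12 dB hot, then clip at full scale.
crushed = [max(-1.0, min(1.0, s)) for s in apply_gain_db(clean, 12.0)]
# What a normalizer's gain reduction does to the crushed file.
turned_down = apply_gain_db(crushed, -5.0)

print(round(crest_factor_db(clean), 2))        # about 3 dB for a clean sine
print(round(crest_factor_db(crushed), 2))      # far lower: the peaks are flattened
print(round(crest_factor_db(turned_down), 2))  # same as crushed: gain changes nothing
```

The turned-down file is quieter, but its flattened peaks are exactly as flat as before.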

What "One Master for Eight Markets" Actually Means

A campaign delivering to Europe, the US, Japan, and Australia requires four different master exports. One master at -23 LUFS delivered to all four markets is non-compliant in at least two of them. The engineer needs to know every delivery destination before the mastering stage begins.
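The check an engineer runs mentally here can be sketched in a few lines. The targets below are three of the broadcast rows from the reference table, kept short for the example. The `tolerance_lu` parameter is a hypothetical placeholder, since real tolerances differ by standard and by broadcaster, and the function names are illustrative.

```python
# Three broadcast targets quoted in this guide (LUFS and LKFS are the same measurement).
BROADCAST_TARGETS = {
    "Europe (EBU R128)": -23.0,
    "United States (ATSC A/85)": -24.0,
    "Japan (ARIB)": -24.0,
}


def off_target_markets(master_lufs, targets, tolerance_lu=0.5):
    """List markets whose target differs from the master by more than the tolerance.
    The 0.5 LU default is an assumption, not a value taken from any standard."""
    return [market for market, target in targets.items()
            if abs(master_lufs - target) > tolerance_lu]


def gain_to_target_db(master_lufs, target_lufs):
    """Level change needed to move a master onto a market's target."""
    return target_lufs - master_lufs


print(off_target_markets(-23.0, BROADCAST_TARGETS))
# ['United States (ATSC A/85)', 'Japan (ARIB)']
print(gain_to_target_db(-23.0, -24.0))   # -1.0
```

A loudness offset alone does not make a compliant master, because True Peak limits and format requirements also differ per market. That is why each destination gets its own export and its own final check.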


08. Specific Voice Challenges — What Post Can Resolve (and What It Cannot)

Audio post can resolve a significant range of technical problems in a recorded performance. It cannot resolve problems that originated in the recording environment or the performance itself.

What Post-Production Resolves

Sibilance is the most pervasive issue on digital campaign audio. Harsh S and SH sounds cause distortion on earbuds and headphones. De-essing in the cleanup stage addresses this. But severe sibilance caused by mic placement or performance may require multiple passes and can affect the natural character of the voice. Brief it early.

Plosives — the burst of air on P and B sounds — behave differently across playback systems. Barely audible on studio monitors, they can be severe on laptop speakers or car stereos. Post-production can reduce plosives, but the most reliable fix is a pop filter in the recording session.

Room reverb and reflections reduce speech intelligibility. Post-production can apply reverb reduction, but a badly reverberant recording environment leaves artefacts that no amount of processing fully removes.

What Post-Production Cannot Fix

Beyond the reach of post

  • Inconsistent mic placement across markets. Post-production can correct individual files, but matching sixteen different tonal characters to a single brand voice adds significant time and budget. Include microphone setup guidance in the recording spec.
  • Performance issues. Pace, emphasis, and emotional register are performance decisions. Post-production does not change these. If a market's recording lands with the wrong energy for the brand, the fix is a retake.
  • Fundamentally poor recording quality. A recording made on consumer hardware in an uncontrolled environment can be made marginally more listenable. It cannot be made broadcast-compliant. First QC catches this — which is exactly where it belongs.

Work With VoiceArchive's Audio Post Team

The brief language, the platform standards, the OMF/AAF question — these are what separate campaigns that air on time from campaigns that generate 11 PM rejection notices. Tell us your delivery destinations. We will give you the LUFS targets, the format specs, and the brief language before the session goes into production.
