Boxes or Cans

Traditionally, most engineers mix on speakers and use headphones to check their mixes. The thinking has always been that mixes done over speakers will translate better to headphones than vice versa, because of fundamental differences in headphone monitoring: stereo image perception works differently, and headphones tend to reveal certain details in the mix more readily. But nowadays, given that many music lovers use headphones as their primary listening environment, mixers may have cause to rethink this conventional wisdom. This is especially true where the speakers or room are less than ideal acoustically, as in many small project studios or home studios, or where logistics make it less convenient or practical to mix over speakers: when doing so would disturb others, or when the mixer is a musician or engineer on the road with only a laptop and a pair of phones to provide a consistent mixing environment.

If a mixer does decide to switch to phones, or even just to take on a particular mix project with headphones as the primary monitoring environment, one of the first things to do is calibrate the headphones for a flat (or flatter) response, ideally matching the similarly calibrated response of the main speakers. But while that’s an important step, fundamental differences between speaker and headphone monitoring remain, and the mixer needs to be well aware of them to ensure the widest compatibility for the finished mixes.

Inter-aural Affairs

The way stereo sound from speakers reaches our ears is inherently different from how it works in headphone monitoring. With speakers, the two actual sound sources (the left and right speakers) radiate into the same physical space, so while in theory the sound from the left speaker is intended for the left ear and the sound from the right speaker is intended for the right ear, what actually happens is that both ears hear the sound from both speakers.

Inter-Aural Crosstalk

The left ear receives the sound wave from the left speaker, but it also receives the wave from the right speaker. Since the right speaker’s signal comes from a very slightly greater distance, the level is slightly less at the left ear than the level of the left speaker signal. And again due to that slightly greater distance, the right speaker wave is slightly delayed, arriving a little later at the left ear than the left speaker wave. The same thing occurs with the right ear and left speaker—the left speaker’s wave arrives at the right ear a little later and at a slightly lower level than the right speaker’s wave.
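
To put a rough number on that delay, here is a minimal sketch using Woodworth's spherical-head approximation, ITD = (r / c)(θ + sin θ). This model is not part of the discussion above; it is swapped in purely for illustration. The head radius (8.75 cm), the speed of sound (343 m/s) and the ±30° speaker placement are assumed typical values, and the function name is hypothetical.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Estimate the inter-aural time difference (in seconds) for a distant source.

    Uses Woodworth's spherical-head approximation, valid for azimuths
    between 0 and 90 degrees. Head radius and speed of sound are assumed
    typical values, not measurements.
    """
    theta = math.radians(azimuth_deg)
    return head_radius_m / speed_of_sound * (theta + math.sin(theta))

# A speaker 30 degrees off centre (half of the usual 60 degree spread).
itd = woodworth_itd(30.0)
print(f"ITD: {itd * 1e6:.0f} microseconds")
print(f"That is about {itd * 48000:.1f} samples at 48 kHz")
```

Under these assumptions the far-ear signal arrives roughly a quarter of a millisecond late. The approximation covers only the time difference; the level difference depends on frequency and is not modelled here.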

This is called Inter-Aural Crosstalk. It consists of an inter-aural level difference and an inter-aural time difference, and it limits how wide a stereo image speakers can reproduce. Specifically, it’s the reason the stereo image from speakers is restricted to the (usually) 60° angle between the speakers, rather than spreading beyond the actual speaker locations to a full 180° width, as a real acoustic sound field would.

And of course the additional delayed signal at each ear combines with the intended signal for that ear, creating interference effects like comb filtering.
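
The comb filtering can be illustrated with an idealized model: a signal summed with one delayed, attenuated copy of itself has notches at odd multiples of 1 / (2 × delay). The sketch below assumes a crosstalk delay of about 0.26 ms and a crosstalk level of -6 dB. Both figures are illustrative assumptions, and the function names are hypothetical. Real crosstalk is frequency-dependent, and the two speakers rarely carry identical signals, so actual ripple is less regular than this.

```python
import math

def comb_notch_frequencies(delay_s, max_hz=20000.0):
    """Notch frequencies when a signal is summed with one delayed copy.

    Notches fall at odd multiples of 1 / (2 * delay).
    """
    notches = []
    k = 0
    while True:
        freq = (2 * k + 1) / (2.0 * delay_s)
        if freq > max_hz:
            break
        notches.append(freq)
        k += 1
    return notches

def comb_ripple_db(copy_gain_db):
    """Return (peak_db, notch_db) relative to the direct signal alone."""
    if copy_gain_db >= 0.0:
        raise ValueError("the delayed copy must be quieter than the direct signal")
    g = 10.0 ** (copy_gain_db / 20.0)
    return 20.0 * math.log10(1.0 + g), 20.0 * math.log10(1.0 - g)

# Assumed values: 0.26 ms delay, delayed copy 6 dB below the direct sound.
for freq in comb_notch_frequencies(0.00026):
    print(f"notch near {freq:.0f} Hz")
peak_db, notch_db = comb_ripple_db(-6.0)
print(f"peaks {peak_db:+.1f} dB, notches {notch_db:+.1f} dB")
```

With these assumed values the first notch lands a little below 2 kHz, which suggests why the interference is audible as a tonal colouration rather than as a distinct echo.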

Stereo perception in speaker monitoring is more complex than simple inter-aural crosstalk, though. The head itself affects the sound waves reaching the ears, masking sound from a particular source more heavily at the more distant ear. Here, the sound wave has to travel around the head, and is subject to the damping effect of that object in its path. Even the shape of our ears plays a role, with the pinna (the outer ear) focusing sound waves into the ear canal. These aspects affect not only our perception of left-right stereo width, but also of height: our ability to sense how high the sound source is relative to our ears.

Inside Your Head (phones)

But for mixing purposes, inter-aural crosstalk, and its effect on the width of the stereo image, is the main thing that differentiates headphone monitoring from speaker monitoring. In the phones, there is no inter-aural crosstalk: the driver in each ear cup sits right up against the corresponding ear, so the left ear hears only the left signal, and the right ear hears only the right.

Speaker vs Headphone Monitoring

Without the crosstalk, the stereo image is wider and the interference effects that crosstalk causes are absent. The result is a clearer sound that allows for greater perception of detail.

This extra detail is especially noticeable with certain aspects of a typical mix: the amount and depth of reverb and ambience, the audibility of the delayed signal in subtle short delay effects like doubling or chorusing, and the general audibility of subtly mixed background parts. Sometimes a background part (for example, an instrumental counterpoint that was mixed on speakers to be just barely audible, or not really perceptible on its own but adding extra thickness to the main part) will come through more clearly on headphones, which may not have been the intent.

Back & Forth

And that gets to the question of how those basic differences can affect the ability of a mix to sound right on both speakers and headphones. Mixers have long been afraid that if they mix entirely or primarily on headphones, the extra width of the stereo image and the extra clarity and detail that can be perceived in the cans—thanks to the lack of inter-aural crosstalk—will lead them to make decisions that won’t work well enough in speaker listening. They worry that panning choices based on the wider stereo image in the phones will result in a more congested sound field when sound is heard from speakers.

And they may be concerned that subtle mix balances and reverb/effect levels dialed in with the benefit of the extra clarity in the phones will not come through clearly enough on speakers, making the mix sound flatter or lacking in the arranging and processing details they worked so hard to come up with. On the other hand, if they mix on speakers they may feel that all they have to do is check that certain subtle aspects of the mix don’t sound too prominent in the cans. And headphones do have their disadvantages: though the sound field is wider, the centre sounds like it’s in your head rather than out in front, which can make it harder to gauge the front-back depth of the mix (depth that is typically simulated by the subtle use of delay and early-reflection reverb patterns).

Headphone Helper

Of course, a mixer who works in phones as their primary monitoring environment can get accustomed to these differences, and they’ll still be checking their mixes on speakers. But it is possible to narrow the fundamental perception difference between speakers and phones somewhat to bring the two experiences closer together, and that could potentially be a tipping point for someone thinking about shifting to mixing on phones as the main workspace.

If a little crosstalk is injected into the headphone signal, emulating the acoustic effect that occurs with speakers, that could help to minimize the differences in perception and make it more likely that subtle mix choices made in the phones will translate accurately when the mix is monitored over speakers. This may allow the mixer to take advantage of the other benefits of headphone listening, such as clarity, isolation and a calibrated response free of room effects, without having to make significant changes when the mix is eventually checked on speakers.

But you wouldn’t want to just randomly bus some opposite channel crosstalk from one side to the other. Like speaker calibration, it should be done carefully, incorporating both the inter-aural level and time differences, to mimic as closely as possible the actual degree and character of the acoustic crosstalk effect.
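
To make the idea concrete, here is a minimal crossfeed sketch in plain Python. It is not how CanOpener or any other commercial product works; it only illustrates the two ingredients named above. Each ear receives a delayed (time difference) and attenuated (level difference) copy of the opposite channel. The delay, gain and low-pass cutoff defaults are illustrative assumptions, and all function names are hypothetical. The low-pass filter is an extra assumption standing in for the way the head damps high frequencies more than lows.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Simple one-pole low-pass, a crude stand-in for head shadowing."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state = (1.0 - a) * x + a * state
        out.append(state)
    return out

def delay_samples(samples, n):
    """Delay a signal by n whole samples, keeping the original length."""
    if n <= 0:
        return list(samples)
    pad = min(n, len(samples))
    return [0.0] * pad + list(samples[:len(samples) - pad])

def crossfeed(left, right, sample_rate=48000,
              delay_s=0.00026, gain_db=-6.0, cutoff_hz=700.0):
    """Blend a delayed, attenuated, low-passed copy of each channel
    into the opposite channel. All defaults are assumed values."""
    if len(left) != len(right):
        raise ValueError("left and right must be the same length")
    gain = 10.0 ** (gain_db / 20.0)
    n = round(delay_s * sample_rate)
    into_left = delay_samples(one_pole_lowpass(right, cutoff_hz, sample_rate), n)
    into_right = delay_samples(one_pole_lowpass(left, cutoff_hz, sample_rate), n)
    norm = 1.0 / (1.0 + gain)  # keep the summed level from exceeding the input
    out_left = [norm * (l + gain * b) for l, b in zip(left, into_left)]
    out_right = [norm * (r + gain * b) for r, b in zip(right, into_right)]
    return out_left, out_right

# A click panned hard left: the right ear now hears it too, later and quieter.
click_left = [1.0] + [0.0] * 63
click_right = [0.0] * 64
out_l, out_r = crossfeed(click_left, click_right)
first_arrival = next(i for i, v in enumerate(out_r) if v != 0.0)
print(f"right ear first hears the click {first_arrival} samples late")
```

A dedicated tool would be expected to model the level and time differences far more carefully than this, which is the point of using one rather than improvising a bus send.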

CanOpener

You can get that with CanOpener Studio, a plug-in from Goodhertz, makers of a collection of audio plug-ins.

CanOpener Studio

CanOpener creates inter-aural crosstalk properly, and can help bring the resulting perceptual qualities to headphone monitoring. It includes controls to set how strong the crossfeed effect will be (I prefer it at the maximum setting with the phones I work with), and a simple bass and treble EQ to counter any unwanted tonal balance changes that may result from the extra crosstalk (I turn the bass down half a dB). There are also a couple of useful graphs depicting the stereo image, but other than that the plug-in just does what it’s designed to do without the need for complex measurements or major tweaking.

The crossfeed effect is very subtle, as it should be! Aside from a little bass emphasis, you don’t really notice it until you turn it off and then the headphone sound seems to very subtly get a little more constricted/in-the-head. It’s definitely not a 3D/widening effect – remember, it’s not supposed to be an effect at all! But it does seem to add a very subtle, almost subliminal sense of openness. With the inter-aural crossfeed doing its job, mixing choices like panning decisions, background levels and ambience and depth should sound more consistent when the mix is heard on actual speakers, and that’s the ultimate goal.

Reference 4 & CanOpener in Use

Of course, you’ll still want to check headphone mixes done on your Reference 4-calibrated, CanOpener-equipped phones on your main (ideally calibrated) speakers, on other speakers, and on other headphones without the benefit of any crossfeed processing, to get a good sense of what other listeners may eventually hear. But the basic mix done on the primary headphones should translate a little more successfully to any listening setup than if it were done on uncalibrated headphones with the normal lack of crosstalk. The combination of flat frequency response and a more speaker-like perception should make mixing in headphones a much more viable alternative, even for the sceptics and traditionalists among us.