Boxes or Cans

Historically, most engineers have mixed on speakers and used headphones only to listen for details they may have missed on monitors. The thinking has always been that mixes done over speakers translate better to headphones than vice versa, because headphones present the stereo image differently and tend to highlight certain details of a mix. Now that many music lovers use headphones as their primary listening environment, mixers have begun to rethink this conventional wisdom. Headphone mixing also suits situations where the monitoring environment is less than ideal, as in many project studios. Often it’s simply inconvenient or impractical to mix over speakers. We might be traveling with only a laptop and a pair of phones, or we may not want to bother others around us.

If a mixer decides to use headphones as their monitor system, one of the first things to do would be to calibrate the headphones for flat/ideal response so that the frequency response of the headphones can be trusted to translate well to the outside world. While calibration is an important step, there will still be fundamental differences between monitoring on speakers vs. headphones that need to be considered. 

Interaural Affairs

The way sound from stereo speakers reaches our ears is inherently different from monitoring with headphones. With speakers, the two actual sound sources—the left and right speakers—produce sound into the same physical space, so while in theory the sound from the left speaker is intended for the left ear and the sound from the right speaker is intended for the right ear, what actually happens is that both ears hear the sound from both speakers to a greater or lesser extent.

Interaural crosstalk from stereo speakers. The left ear hears the left speaker plus some of the signal from the right speaker. The right ear hears the right speaker plus some of the left speaker’s signal.

The left ear receives the sound wave from the left speaker, but it also receives some sound from the right speaker. To the left ear, the right speaker signal comes from a very slightly greater distance, so the right speaker level is slightly lower at the left ear compared to the left speaker signal. Also due to that slightly greater distance, the right speaker wave arrives a little later at the left ear than the left speaker wave (up to 600µsec). The same thing occurs with the right ear and left speaker—the left speaker’s wave arrives at the right ear a little later and at a slightly lower level than the right speaker’s wave. And of course, the additional delayed signal at each ear combines with the intended signal for that ear, creating interference effects like comb filtering.
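To get a rough feel for the comb filtering mentioned above, one simplified model sums the direct signal with a single delayed, attenuated copy. The sketch below is only that model: the 250 µs delay and the 0.7 crosstalk gain are illustrative assumptions (not measurements), and the function names are made up for this example. It ignores the head, the ears, and the room.

```python
import math

def comb_notch_frequencies(delay_s, max_hz=20000.0):
    """Frequencies (Hz) where a copy delayed by delay_s arrives half a cycle late."""
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2.0 * delay_s)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches

def comb_magnitude_db(freq_hz, delay_s, gain):
    """Level in dB of (direct signal + gain * delayed copy) at freq_hz."""
    phase = 2.0 * math.pi * freq_hz * delay_s
    re = 1.0 + gain * math.cos(phase)
    im = -gain * math.sin(phase)
    return 20.0 * math.log10(math.hypot(re, im))

# Assumed values, for illustration only.
ASSUMED_DELAY_S = 250e-6   # extra travel time of the far speaker's wave
ASSUMED_GAIN = 0.7         # far speaker's level relative to the near speaker

for f in comb_notch_frequencies(ASSUMED_DELAY_S):
    level = comb_magnitude_db(f, ASSUMED_DELAY_S, ASSUMED_GAIN)
    print(f"notch near {f:.0f} Hz, about {level:.1f} dB")
```

With the assumed 250 µs delay, the first notch falls near 2 kHz and the others follow at odd multiples of that frequency. A longer delay moves the notches lower and packs them closer together.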

Of course, stereo perception in speaker monitoring is more complex than simple interaural crosstalk. The head itself affects the sound waves reaching the ears, shadowing the sound from a given source more strongly at the more distant ear, because the wave has to travel around the head and is damped by that object in its path. Even the shape of our ears plays a role, with the pinna (the outer ear) focusing sound waves into the ear canal. These aspects affect not only our perception of left-right stereo width but also height: our ability to sense how high the sound source is relative to our ears.

Inside Your Head(phones)

For audio producers, the interaural crosstalk (and its effects on the stereo image) is the main thing that differentiates headphone monitoring from speaker monitoring. In ‘phones, there is no interaural crosstalk: each driver sits in its own ear cup, isolated against the corresponding ear, so the left ear hears only the left signal and the right ear hears only the right. Without crosstalk and the interference effects it causes, headphones produce a wider soundstage and clearer sound, which allows for greater perception of detail.

Speaker vs Headphone Monitoring. In headphones (right) there is no interaural crosstalk, so each ear only hears the sound from its earcup.

This extra detail provided by headphones is especially noticeable in certain aspects of a mix; reverbs and ambiences are more easily heard, the delayed signal in effects like doubling or chorusing is more audible, and subtly mixed background parts can be clearly heard. During mixing, we may mix parts together on speakers so that they combine to create a certain blend or effect, but on headphones, we can more clearly hear the individual elements, and the impression of the blend can be less effective.

Back & Forth

The question then becomes, can we predict how those basic differences will affect the way our mixes translate to speakers and headphones? Mixers have long been afraid that if they mix primarily on headphones, the extra width of the stereo image and the extra clarity and detail—thanks to the lack of interaural crosstalk—will lead them to make decisions that won’t translate over speakers. They worry that panning choices based on the wider stereo image in the phones will result in a more congested soundfield when a mix is played back on speakers.

They also worry that subtle mix balances and effects dialed in with the benefit of the extra clarity in the ‘phones will result in certain parts or effects not coming through clearly enough on speakers, leaving the mix sounding flat and lacking the arrangement and processing details that were carefully crafted. On the other hand, someone who mixes on speakers may feel that all they have to do is check that certain subtle aspects of the mix don’t sound too prominent in the cans.

Headphones also have their disadvantages. The soundfield may be 180° wide, but the center image sounds like it’s inside your head rather than out in front (floating between the speakers), making it potentially harder to gauge front-back depth in the mix. This lack of a phantom center out in front can affect our judgment of the volume of the center-panned lead vocal or other lead instruments against the arrangement.

Headphone Helper

A mixer who works primarily in ‘phones can get somewhat accustomed to these differences, but they will still need to check mixes on speakers. It is possible, though, to narrow the fundamental perception difference between speakers and ‘phones somewhat and bring the two experiences closer together. That could potentially be a tipping point for someone thinking about shifting to mixing on ‘phones as their main monitor system.

A small amount of crosstalk could be injected into the headphone signal, emulating the acoustic effect (interaural crosstalk) that occurs with speakers, and that could help to minimize the differences in perception between speakers and headphones. Subtle mix choices made in the phones would translate with greater accuracy when the mix is monitored over speakers and the mixer could still take advantage of the benefits of headphone listening such as clarity, isolation, and an acoustically controlled environment in which to work.

You wouldn’t want to just randomly bus some crosstalk from one channel to the other. It should be done carefully, incorporating both the interaural level and time differences, to mimic as closely as possible the actual degree and character of the acoustic crosstalk effect.
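As a bare-bones illustration of combining a level difference and a time difference, the sketch below mixes a delayed, attenuated copy of each channel into the opposite one. It is not how any particular product works: the function name, the 300 µs delay, and the -6 dB level are assumptions chosen for the example. Real crossfeed designs typically also filter the crossfed signal to account for the head, which this sketch leaves out.

```python
def crossfeed(left, right, sample_rate=48000, delay_s=0.0003, gain_db=-6.0):
    """Add a delayed, attenuated copy of each channel to the opposite channel.

    delay_s stands in for the interaural time difference and gain_db for the
    interaural level difference. Both defaults are illustrative assumptions.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    delay = int(round(delay_s * sample_rate))  # time difference in whole samples
    gain = 10.0 ** (gain_db / 20.0)            # level difference as linear gain
    out_left, out_right = [], []
    for n in range(len(left)):
        # Before the delay has elapsed, nothing has crossed over yet.
        from_right = right[n - delay] if n >= delay else 0.0
        from_left = left[n - delay] if n >= delay else 0.0
        out_left.append(left[n] + gain * from_right)
        out_right.append(right[n] + gain * from_left)
    return out_left, out_right

# A click panned hard left: it shows up in the right channel later and quieter.
click_left = [1.0] + [0.0] * 31
silence_right = [0.0] * 32
out_left, out_right = crossfeed(click_left, silence_right)
arrival = next(i for i, v in enumerate(out_right) if v != 0.0)
print("crossfed click arrives", arrival, "samples late at level", round(out_right[arrival], 3))
```

Note that the crossfed copy raises the overall level slightly, so a practical version would likely need some gain compensation as well.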


CanOpener Studio, a plugin from Goodhertz, simulates the effects of loudspeakers on headphones. It doesn’t make your headphones sound like a specific brand of speaker; it creates the impression that your headphones are speakers, and not attached directly to your head.

CanOpener Studio

CanOpener properly creates interaural crosstalk and brings the resulting perceptual qualities to headphone monitoring. The controls allow you to set the strength of the crossfeed effect (I prefer it at the maximum setting with the phones I work with). You can also apply a simple bass and treble EQ to counter any unwanted tonal balance changes that may result from the extra crosstalk (I turn the bass down half a dB). There are also a couple of stereo image meters, but other than that the plugin just does what it’s designed to do without the need for complex measurements or major tweaking.

The crossfeed effect is very subtle, as it should be! Aside from a little bass emphasis, you may not notice it until you turn it off, and then the headphone sound becomes a little more constricted. This is not a 3D/widening effect – remember, it’s not supposed to be an effect at all! CanOpener adds a very subtle, almost subliminal sense of openness and natural directivity of sound. With the interaural crossfeed doing its job, mixing choices like panning decisions, reverb levels, and the subtle interaction of parts should translate well when the mix is played on speakers.

SoundID Reference and CanOpener on the monitor bus in Pro Tools

CanOpener can be a game-changer for people who want to (or need to) work on headphones. Our recommendation if you are using a DAW is to first insert Sonarworks SoundID Reference so that your headphones start with a calibrated frequency response, then insert CanOpener after SoundID Reference to add the crossfeed effect to the calibrated headphone sound. If you are using SoundID Reference on the audio system output of your computer (rather than as a DAW plugin), you can still safely use CanOpener as a plugin in your DAW, even though its processing happens before the SoundID Reference calibration. Both SoundID Reference and CanOpener are linear processors, so there won’t be any unwanted coloration or distortion regardless of the order in which the processing occurs.
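The order-independence point rests on a general property of fixed linear filters: run in series, they give the same result in either order. Here is a minimal single-channel sketch of that property. The two short FIR kernels are made up for illustration and are not taken from SoundID Reference or CanOpener.

```python
def convolve(signal, kernel):
    """Apply an FIR filter by direct linear convolution."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def process(signal, first, second):
    """Run the signal through two filters in series."""
    return convolve(convolve(signal, first), second)

# Hypothetical kernels standing in for two fixed linear processors.
CALIBRATION_FIR = [0.5, 0.25]
CROSSFEED_FIR = [1.0, 0.0, 0.5]

signal = [1.0, -2.0, 3.0]
calibration_first = process(signal, CALIBRATION_FIR, CROSSFEED_FIR)
crossfeed_first = process(signal, CROSSFEED_FIR, CALIBRATION_FIR)

# The largest sample-by-sample difference between the two orderings.
max_diff = max(abs(a - b) for a, b in zip(calibration_first, crossfeed_first))
print("largest difference between orderings:", max_diff)
```

Any difference between the two orderings should be no larger than floating-point rounding. This toy covers only one channel with fixed filters; it does not model stereo routing or calibration that differs between the left and right channels.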

You still might want to check mixes done on your SoundID Reference-calibrated, CanOpener-equipped phones on your main speakers and other speaker systems, just as you would with any mix. I find CanOpener makes mixing decisions easier when I’m using headphones, and when the mix is done and I turn off CanOpener, the mix completely holds together. Mixes I complete on headphones with SoundID Reference and CanOpener translate extremely well to speaker systems, and after months of working this way, I rarely feel the need to check my headphone mixes on alternate playback systems.

The combination of flat frequency response (with SoundID Reference) and a more speaker-like perception (with CanOpener) should make the option of mixing in headphones a viable alternative, even for the skeptics and traditionalists among us.

Additional reading for getting the most from your headphones:

Getting the most from mixing on headphones

Studio Headphone Guide

Headphone amp considerations