Stage and Sound: I Can't Hear The Words!
- By Jon Burton
In sound engineering, fundamental principles often emerge from seemingly simple advice. One such
cornerstone is elegantly straightforward: 'Point the speakers at the audience.' While it might
appear obvious, this principle encapsulates crucial aspects of audio system design, acoustic
behaviour, and the relationship between sound reproduction and human perception.
Let's start with the audience: what do they want to hear? I think for most of us the only complaint
we ever receive as engineers is 'I can't hear the words'. Audiences like to sing along with their
artists, and they appreciate and value vocal clarity. When we consider music, especially popular
music, melody and lyrics are very important.
Melody in music is often carried by multiple instruments, blended, often with percussion providing
a beat, into a cohesive whole. Singing adds another layer of melody, but also lyrical content:
words that we like to hear and to understand. Even if the meaning is not profound or interesting,
we still like to hear them, and often to sing along.
Being able to hear the lyrics in a song often takes a greater degree of clarity than following the
melody does. We can often pick out the tune yet be frustrated that we cannot understand the words.
Our brains can follow
the instruments, even when blended and not individually recognisable, as they all contribute to what
we understand as music. The melody of the vocal joins this collection of tones to complete the
picture. However, we are also trying to recognise words and meanings that are much harder to pick
out, requiring a clarity that we may not need to follow melody alone. It is for this reason that as
engineers we usually prioritise the vocal over all other musical instruments. We need to be able to
hear the differences between the vowels and consonants, the sounds that make up the words we are
trying to recognise.
Now, you may counter that you often listen to, and enjoy, music whose words are in a language you
don't understand. Yes, I would agree; however, our brains are programmed to recognise speech and
will try to grasp for meaning, even if only to discover it is not a meaning they have been trained
to comprehend.
Speech requires the sound system to have clarity. It needs to reproduce the frequency range of the
human voice at a sufficient sound pressure level to be clearly heard by the audience. IEC 60268-16
is the international standard for the objective rating of speech intelligibility and looks at the
frequency range of 125 Hz to 8 kHz, a good starting point for discussion. The human voice can't
sing that high; C6 is just over 1 kHz. But the extra frequencies help to provide the timbre, the
unique quality or tone colour that allows us to distinguish between different sounds, even when they
have the same pitch, loudness and duration. The higher frequencies help us to recognise the
phonemes, the smallest units of sound that make up words in spoken language. The highest frequencies
are also the hardest to project across a large audience, for many reasons.
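To put figures on that, here is a minimal Python sketch, assuming equal-tempered tuning with
A4 = 440 Hz. It computes the fundamental of C6 and its first few harmonics, several of which land
in the upper octave bands that IEC 60268-16 examines:

```python
# A minimal sketch, assuming equal temperament with A4 = 440 Hz (MIDI note 69).
def note_freq(midi_note: int) -> float:
    """Fundamental frequency of a MIDI note: f = 440 * 2**((n - 69) / 12)."""
    return 440.0 * 2 ** ((midi_note - 69) / 12)

c6 = note_freq(84)  # C6 is MIDI note 84: ~1046.5 Hz, "just over 1 kHz"
print(f"C6 fundamental: {c6:.1f} Hz")

# The first few harmonics of C6; most sit well above the fundamental,
# in the 2-8 kHz region where timbre and phoneme cues live.
print([round(c6 * n) for n in range(1, 8)])  # [1047, 2093, ..., 7326]
```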
When we set up our speaker system, which may be made up of two or three types of driver, we need to
think about what each section does. In a typical three-way system, a larger low-frequency driver
handles everything below roughly 250-300 Hz, reproducing the fundamental bass notes and kick drum
and providing the depth and weight that form the foundation of the mix. The mid-range driver,
operating from approximately 300 Hz up to about 2 kHz, carries most of the melody: the critical
vocal fundamentals and the primary melodic content of most instruments. The high-frequency driver
takes over above 2 kHz, delivering the overtones, transients and harmonic content that give the
voice its timbral quality, help us pick it out from the other instruments, and are essential for
articulation.
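To make those band boundaries concrete, here is a minimal sketch of a three-way split in Python,
assuming crossover points of 300 Hz and 2 kHz and simple fourth-order Butterworth filters. Real
loudspeaker processors use matched crossover alignments such as Linkwitz-Riley, so treat this only
as a listening aid:

```python
# A rough three-way band split, assuming crossovers at 300 Hz and 2 kHz.
import numpy as np
from scipy.signal import butter, sosfilt

FS = 48_000  # sample rate in Hz (assumed)

def three_way_split(signal: np.ndarray, low_xover=300.0, high_xover=2000.0):
    """Split a mono signal into low, mid and high bands."""
    lo = butter(4, low_xover, btype="lowpass", fs=FS, output="sos")
    mid = butter(4, [low_xover, high_xover], btype="bandpass", fs=FS, output="sos")
    hi = butter(4, high_xover, btype="highpass", fs=FS, output="sos")
    return sosfilt(lo, signal), sosfilt(mid, signal), sosfilt(hi, signal)

# Example: split one second of noise and compare the energy in each band.
noise = np.random.default_rng(0).standard_normal(FS)
for name, band in zip(("low", "mid", "high"), three_way_split(noise)):
    print(name, f"{np.sqrt(np.mean(band**2)):.3f} RMS")
```

Run any mono track through it and solo each band to preview the experiment described next.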
As an experiment, if you can, turn off the different bands on your system and have a listen. What
information are they carrying? Can you pick out the tune using just the mid-range? Yes? Can you hear
the words? Now listen to the high-frequency driver on its own. What do you get? Some understanding
of the words? But how much melody? What about the bass? What is it contributing? We need the entire
range for a good sound, but without the high frequencies, we have a melody but little understanding
of meaning.
If you then walk around your speaker stack, what can you hear? As you move away from the front you
can still hear the bass and probably pick out the melody. As you get to the back the bass will still
be there, but the melody will be harder to pick out, and your ability to hear the words will
probably be lost.
The relationship between frequency and directivity follows a fundamental acoustic principle: lower
frequencies exhibit omnidirectional radiation patterns due to their longer wavelengths. When the
wavelength exceeds the dimensions of the source (the speaker cabinet), the sound pressure levels
remain relatively consistent around it. As frequency increases, wavelengths shorten, and the sound
naturally becomes more directional. At high frequencies, typically above 2kHz, we enhance this
directivity by coupling drivers to acoustic horns, which control dispersion patterns and increase
efficiency through acoustic impedance matching.
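The numbers make the point. A quick sketch using the standard relation wavelength = c / f, with
the speed of sound taken as 343 m/s:

```python
# Wavelength from frequency: lambda = c / f, with c ~ 343 m/s in air at 20 C.
C = 343.0
for f in (100, 300, 1000, 2000, 8000):
    print(f"{f:>5} Hz -> {C / f:.2f} m")

# 100 Hz gives ~3.43 m, far larger than the cabinet, so the sound wraps
# around it; 2 kHz gives ~0.17 m, smaller than a horn mouth, so it can be
# aimed where you want it.
```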
Look at your horn. What shape is it? This will dictate what area it can cover. A horn is usually
designed to be wider than it is tall, since audiences normally spread out in the horizontal rather
than the vertical plane. Have a listen to some quiet music. Walk slowly across the front of your
cabinet until you begin to lose the hi-hats. This marks the edge of your driver's coverage. If the
horn is not pointing at the audience, they cannot hear it. They may still pick up the melody, but
they may struggle to hear the words. If so, they will let you know. It may be the only complaint
you ever get, but it's the most important one: 'I can't hear the words'.
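If you want to put a number on that walk, the geometry is simple, as in this rough sketch. The
helper below is hypothetical: it assumes a symmetrical horn, and that you noted your distance from
the cabinet and your lateral offset when the hi-hats disappeared:

```python
import math

# Hypothetical helper: estimate a horn's horizontal coverage from a listening
# walk. d is your distance from the cabinet front in metres; x is how far
# off-axis you were when the hi-hats dropped away (symmetrical horn assumed).
def coverage_angle(d: float, x: float) -> float:
    return 2 * math.degrees(math.atan2(x, d))

# Losing the hi-hats 5 m either side of the axis at 10 m out suggests ~53 degrees.
print(f"{coverage_angle(10.0, 5.0):.0f} degrees")
```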
Profile:
Jon Burton is a sound engineer with over thirty years of concert touring experience. He has worked
with a wide range of artists, from Bryan Ferry to Radiohead, including 20 years as FOH engineer
for The Prodigy. Having had no prior formal education in sound, in 2017 Jon completed an MSc in
Music Technology. Jon is currently studying for a PhD at the University of Derby, UK, where he
works as a Senior Lecturer in Entertainment Engineering. Jon is a founder member of HELA, an
international certification for hearing health awareness at live music events. Jon is also a
partner in the Laundry Rooms recording studio complex in Sheffield, UK.