Separating the Scientific Facts from Science Fiction

Just like we’ve found with speaker cables and
audio interconnects, snake oil and gimmicks are also alive and well in the
world of loudspeakers. With over 400
loudspeaker brands in the consumer market competing for a very small piece of
the action, it’s no wonder some manufacturers feel the need to differentiate themselves, often using Ivory Tower tactics and pseudoscience to proclaim product
superiority. In fact, some loudspeaker “science” reads as less believable than science fiction. In this article and a one-on-one interview between Gene and Hugo, we break
down and discuss some of the common nonsense we’ve found surrounding consumer
loudspeaker products.

The exclusive interview below includes a great discussion about the Loudspeaker Myths covered in this article and also goes into more detail, including personal experiences that should be both informative and entertaining.

Must Watch Speaker Myths Interview with Gene DellaSala (left) and Hugo Rivera (right)

Myth #1: Speaker Break-In

If you convince a consumer to keep a speaker for a long period of time the likelihood of product return diminishes.

Some
manufacturers claim their products need hundreds of hours of break-in to reach
peak performance. We have a series of
articles that investigate this very topic.
The bottom line is that it’s rarely true and most of the time just plain
nonsense. Higher-mass drivers such as
woofers and subwoofers, whose spiders rely heavily on stiffening
agents instead of a thicker, more rigid material, do in fact lose stiffness in
their suspensions with use, particularly when used for long periods at high
excursion. This is the kind of
excursion that tests a designer’s skill and pushes the driver to its extremes
for long periods of use. During use, the
spiders and surrounds loosen up, but it usually
happens in minutes, not hours, weeks or months. In the majority of
cases, the stiffness of the air inside the speaker cabinet plays a larger role in
impeding the cone movement than the spider and surround.

In short, at some point it will not matter
how much you break in that driver because when you put a big cone in a small
box, the stiffness of the air in that box is going to completely dominate the overall
stiffness and resonance of the system.
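
To put rough numbers on this, here is a minimal Python sketch that treats a sealed box as two springs working together: the suspension and the trapped air. The function names, the 30 Hz free-air resonance, the 4:1 air-to-suspension stiffness ratio and the 20% loss of suspension stiffness are all illustrative assumptions, not measurements of any real driver.

```python
import math

def sealed_box_resonance(fs_hz, air_to_suspension_ratio):
    """System resonance of a driver in a sealed box.

    fs_hz: free-air resonance of the driver.
    air_to_suspension_ratio: stiffness of the air in the box divided by the
    stiffness of the suspension (equivalently Vas / box volume).
    """
    return fs_hz * math.sqrt(1.0 + air_to_suspension_ratio)

def break_in_shift(fs_hz, air_to_suspension_ratio, suspension_stiffness_left):
    """Return (free-air shift, in-box shift) in Hz after the suspension loosens.

    suspension_stiffness_left: fraction of the original suspension stiffness
    that remains after break-in (e.g. 0.8 means it lost 20%).
    """
    # Moving mass is unchanged, so resonance scales with sqrt(stiffness).
    new_fs = fs_hz * math.sqrt(suspension_stiffness_left)
    # The air spring is unchanged, so its ratio to the softer suspension grows.
    new_ratio = air_to_suspension_ratio / suspension_stiffness_left
    before = sealed_box_resonance(fs_hz, air_to_suspension_ratio)
    after = sealed_box_resonance(new_fs, new_ratio)
    return fs_hz - new_fs, before - after

free_air_shift, in_box_shift = break_in_shift(30.0, 4.0, 0.8)
print(f"Free-air resonance drops by {free_air_shift:.1f} Hz")
print(f"In-box resonance drops by {in_box_shift:.1f} Hz")
```

With these assumed values the free-air resonance moves by roughly 10%, while the in-box resonance moves by only about 2%, because the air spring is unaffected by break-in.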

However, if you convince consumers to keep a
pair of speakers in their home for a long period of time, they typically become
acclimated to the sound and less likely to return the product.

Editorial Note on Driver Break-In from Shane Rich of RBH Sound

The time for break-in really depends on the level the speaker or subwoofer is
being played at and the driver complement in the system. Not all drivers are the same. Some drivers will exhibit little if any
“break-in” at all. Others, especially
those with more compliant suspensions, can exhibit change that is readily
apparent from a measurement standpoint (e.g., Fs drops by 3-4 Hz on a woofer). Our 6.5” mid-woofer does break in and that
usually happens pretty quickly if used in a two-way speaker because bass
frequencies will require enough excursion that the suspension stretches
slightly in a relatively short time.
However, use that same driver in a three-way system as a mid-range and
the break-in will happen over a longer period of time due to the fact that the
driver is not playing the low frequencies as it would in a two-way system.

For more information see:
Speaker Break-In Fact or Fiction

Myth #2: The Double Blind and ABX Test Religion

There
are some companies and audiophiles that believe the only valid test in audio
for comparing products is through a strict Double Blind Test (DBT). While I will not argue with the benefits of
reducing visual biases through the proper usage of blind testing, in my
opinion, the DBT is the single most abused term used in the audio industry.
Hardly anyone ever truly follows a strict DBT protocol. Double Blind Testing is designed to eke out
both small and large differences between very similar stimuli while removing
perceptual biases, which in the case of comparing loudspeakers are predominantly
visual.

This is a critical component to eliminating
bias in controlled situations like those in the medical field. However, in comparative audio testing where
differences between competing speakers are usually quite dramatic, I personally
feel the necessity can sometimes be a bit overplayed by DBT proponents and
often miscategorized by those with a clear agenda to promote their
products.

Most DBTs run by manufacturers or consumers don’t follow a true DBT protocol.

Over our 14-year history of testing
loudspeakers at Audioholics, we’ve run both controlled Single Blind Tests (SBTs) and sighted
listening tests. I think it’s a good
idea to mix it up to compare and understand the perceptual differences but just
because a test is run blind doesn’t mean it is also bias free or even less
biased than a properly run sighted listening test. Some folks are better than others at running
these types of tests and minimizing potential sources of bias. However, biases will ALWAYS exist even if not
deliberate or realized.

Editorial Note about SBT vs DBT by Dr. Floyd Toole

The Single Blind Test (SBT) removes
the visual cue, the Double Blind Test (DBT) prevents the experimenter from biasing
the result. In the course of this the selection of loudspeaker, music and
the coded identities of the products being tested are all randomized.
Each new music selection opens what is in effect a new test.
Location of the speaker and listener in the room can singly or together
be as influential as anything else, which is why randomizing locations between
tests must be done, or, as is done at Harman, using positional substitution.
This involves a computer controlled pneumatic shuffler for quietly and quickly
(about 3 seconds) moving loudspeakers so that those that are active are always
in the same locations in the room; the unused ones are parked a few feet away.
The room effects then become constant factors.

3 Blind Rats – Who Gets the Cheese?

Manufacturers NEVER lose their own Blind Tests?

Things get a bit unclear when listening tests are run by manufacturers, who all
have a vested interest in their
product. They want their
product to “win” or, at the very least, to not lose. They’ll take a
tie, especially if it is a tie with a much more expensive speaker. This sets them up for a scenario where they
can NEVER lose. At worst they will be “similarly good”. When you see the term “similarly good”, you should read it as: “We could not prove the superiority of
our product, so we changed the parameters of the test to make it harder to tell
that the competitor’s speaker was better, so detection would happen far less often. In this way we can assert a rough equivalence
to a better product.”

Manufacturers rarely reveal a big source of error in their blind tests – familiarity bias.

Years ago there was a speaker company
which claimed spending beyond $2,000/pair bought purely cosmetic upgrades in speakers,
and that minimal or no performance improvements could be achieved. Today, that company no longer makes this
claim as they are selling speakers north of $5,000/pair. We doubt this is a coincidence. While we can all agree there comes a point of
diminishing returns for any type of product, it’s a fruitless task to attempt
to define it by an exact dollar amount that increases each time a company comes
out with a higher priced product.

Some so-called DBTs run by loudspeaker companies have been found to have a huge source of bias never
disclosed in their results. They often
use their own employees or trained listeners especially familiar with the
company’s products in such tests. This
introduces familiarity bias which a blind screen covering the speaker will not
eliminate. If we already know who likes
what, then although the test may be blind, the result is likely known at the
start when the listeners used have well-known preferences.

I always find it amusing when a speaker
company claims their speakers win DBTs (run by them of course). One has to wonder if each manufacturer’s
own product is winning, then whose is actually losing? We’ve had speakers from pro-DBT companies in
our own shootouts—both sighted and blind—and they’ve either ranked in the
middle of the pack or towards the bottom, NEVER clearly superior to other
brands that didn’t follow the DBT religion.
It always puzzles me that nobody ever claims we should conduct DBT when
comparing the performance of automobiles or chainsaws ;-)

A true DBT has to be run and analyzed by a
completely neutral party under laboratory controls. This is ALMOST never done. To do it right involves great expense and
time. The reality is most loudspeaker
companies’ so-called DBTs are NOT double blind at all. They are just blind tests with their own set
of biases rarely disclosed, and possibly never realized by the testers
themselves.

Editorial Note About True Double Blind Tests (DBTs)

A Double Blind Test (DBT) is one in which neither the participants nor the
experimenters know who is receiving a particular treatment. They were really
designed to be used in the medical and scientific arena because of their
usefulness in helping prevent bias due to demand characteristics or the placebo
effect found in all experiments. A
demand characteristic is a subtle cue that makes participants aware of what the
experimenter expects to find or how participants are expected to behave. In
audio this could be the familiarity with the horn-like sound of certain
brands, etc. Demand characteristics can
change the outcome of an experiment because participants will often alter their
behavior to conform to the experimenter’s expectations. In order to make sure
even further bias is eliminated, a control group in a scientific experiment is
necessary. This is a group separated from the rest of the experiment where the
independent variable being tested (i.e. speaker preference based on looks or
brand) cannot influence the results. This isolates the independent variable’s
effects on the experiment and can help rule out alternate explanations of the
experimental results.

What Our Experience Tells Us as Audio Reviewers

For what it’s worth, we have our own opinion on this topic based on 15-20 years of reviewing AV gear and observing other listeners and reviewers. Most listeners we’ve brought into tests at our facility can usually NOT
identify what speaker is playing in a sighted test if only two pairs are being
compared at a time to minimize positional differences. We often line up 3-4 pairs at a time to
confuse people as to which products are playing. This is especially true when the listeners
are sitting 15-20 feet away from the speakers with the grilles in place so no
cone movements can be observed. While we
have observed some variance in results between sighted and blind tests, our
testing hasn’t found it to be as dramatic as often claimed. The argument is that the prettier, more
prestigious brand speaker will ALWAYS win in a sighted test. What does that say about the
product integrity of a manufacturer making such a claim? It almost implies they are insecure about their
look and brand name. Why wouldn’t a manufacturer want to produce a
product that sounds great and looks great to add more appeal and increase owner
satisfaction?

The majority of the listeners we’ve had on
our panels didn’t really consider cosmetics during the listening phase nor did
they even know the brands being tested.
In fact, there have been many cases where the so-called “ugly” speaker
unanimously won the whole test. I’ve had
people tell me that they preferred the sound of the ugly speaker but would
rather have the prettier speaker in their rooms upon closer visual inspection.

Still I can’t argue with the fact that on a
busy showroom floor, an uneducated consumer will likely gravitate towards the
prettier speaker with more drivers. This
gives credence to the insistent DBT camp for running blind speaker tests
especially when the consumer has such an obvious bird’s eye view of the competing
products. Given the choice between blind
or sighted listening comparisons, with all other things being equal, blind is
definitely the way to go.

Potential Limitations of Running ABX Speaker Testing

ABX
tests were created to test the ability of listeners to hear a difference, any
difference, between two sounds. They are
useful for amplifier, wire and other such comparisons where differences are
vanishingly small. According to Dr.
Floyd Toole, they are poor at loudspeaker evaluations, as, in fact, are any
simple A vs. B tests. If the comparison
sounds both have a similar defect it may not be noticed. That is why multiple A/B tests in randomized
format should be used but they are very time consuming. Better is a multiple 3-
or 4-way simultaneous comparison of products which is what Dr. Toole has used
from the outset 40 years ago during his NRC days, and is still being used today
at Harman.

Editorial Note on ABX Testing

An
ABX test is a method of
comparing two choices of sensory stimuli (in this case two different
loudspeakers) to detect and determine differences between them. A listener is
presented with two known samples (sample A reference and sample B reference)
followed by one unknown sample X that is randomly selected from either A or B.
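
Because X is always either A or B, a listener who hears no difference will still be right about half the time. The short Python sketch below shows how likely a given score is from pure guessing; the 16-trial session length and the function name are assumptions chosen for illustration, not part of any ABX protocol described here.

```python
from math import comb

def chance_of_guessing(trials, correct):
    """Probability of getting at least `correct` of `trials` ABX trials right
    by pure guessing (each trial is a 50/50 coin flip)."""
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials

for correct in (9, 12, 14):
    p = chance_of_guessing(16, correct)
    print(f"{correct}/16 correct: chance of doing this by guessing = {p:.3f}")
```

A score only slightly above half is easy to reach by luck, while 12 of 16 correct happens by guessing less than 4% of the time.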

Switching Confusion

One could argue there are limitations to running comparative ABX speaker tests
that hold true for both sighted and blind tests, such as instantaneous switching,
which can be confusing and overwhelming to the listening panel. Even if the speakers are level matched, we’ve
found inexperienced listeners often prefer brighter and
bassier sound to tonal accuracy. This is a
good reason why companies like Harman offer software to help train consumers to
become better listeners. Try this out
for yourself!

See: Six Free Software Applications Every Audiophile Should Download

In our opinion, it’s so important to engage
in extended listening tests with a product over a course of several days and
weeks, not a simple 15 minute ABX comparison test. However one could also argue that prolonged listening
tests can actually train the brain to filter out and accept the flaws of a
loudspeaker. This is why it’s so important to have a solid reference. Very few
people have enough experience with the sound of a live unamplified performance
to frame such a reference.

Level Matching

The notion that you can balance the level of two speakers with pink noise, and
then send in a musical signal with unknown spectral content to two speakers
with differing frequency responses and still have a level match is a bit naive.
You’d have to equalize the speakers for identical frequency response at the
listening position as well as level match them for a level-independent
test. If the frequency responses are
different between the speakers, and the signal does not have the exact same spectrum as
the one used for level matching, their playback levels will also be different
99.999% of the time. However,
according to Dr. Floyd Toole, most of the serious problems with loudspeakers
are traceable to resonances. His
research indicates that an exact loudness balance is not necessary.
Absolute perfection will be elusive, but as time has passed, most
loudspeakers have drifted towards “flattish” overall response, so the
problem is less severe. In the end listeners have been able to identify
superior products with impressive consistency.
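
The level-matching problem can be illustrated with a rough Python sketch. It splits the spectrum into three equal-width bands (bass, midrange, treble), so pink noise carries equal power in each. The two speaker responses and the "bass-heavy music" spectrum are made-up numbers used only to show the effect.

```python
import math

def output_level_db(signal_band_power, speaker_gain_db):
    """Total level (dB, arbitrary reference) of a signal through a speaker.

    Both arguments list one value per band (here: bass, midrange, treble).
    """
    total = sum(p * 10 ** (g / 10) for p, g in zip(signal_band_power, speaker_gain_db))
    return 10 * math.log10(total)

# Hypothetical speakers: A is flat, B is up 3 dB in the bass and down 3 dB in the treble.
speaker_a = [0.0, 0.0, 0.0]
speaker_b = [3.0, 0.0, -3.0]

pink_noise = [1.0, 1.0, 1.0]          # equal power in each band
bass_heavy_music = [0.8, 0.15, 0.05]  # made-up spectrum for illustration

# Trim speaker B so both speakers measure the same on pink noise.
trim_db = output_level_db(pink_noise, speaker_a) - output_level_db(pink_noise, speaker_b)
speaker_b_trimmed = [g + trim_db for g in speaker_b]

def mismatch_db(signal):
    """How much louder the trimmed speaker B plays this signal than speaker A."""
    return output_level_db(signal, speaker_b_trimmed) - output_level_db(signal, speaker_a)

print(f"Trim applied to B: {trim_db:+.2f} dB")
print(f"Mismatch on pink noise: {mismatch_db(pink_noise):+.2f} dB")
print(f"Mismatch on bass-heavy music: {mismatch_db(bass_heavy_music):+.2f} dB")
```

In this toy case the two speakers match on pink noise by construction, yet speaker B plays the bass-heavy material nearly 2 dB louder.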

Editorial Note about Loudspeaker Resonances by Dr. Floyd Toole

The essence of the
timbre of musical and vocal sounds is resonances – these must be captured and
preserved. There are other factors, but this is the big one. The
job of the loudspeaker is to reproduce those sounds – those resonances -
without adding any of its own, thus monotonously coloring everything it
reproduces. Research has revealed that levels of resonances in loudspeakers
that are just perceptible are perceived as colorations when listening to various kinds of
music. The idea is to be able to recognize in a spinorama (see below) the presence of
resonances and to be able to evaluate the likelihood that they will be audible
when listening to music. Knowing this enables design engineers to manage
the compromises that are a part of all loudspeakers, especially those at the
bottom of the price scale. At high prices there are simply no excuses for
audible resonances.

Low-Q resonances are detectable at very small deviations.
This is why broadband spectral variations matter so much. A tilt is
detectable at about 0.1 dB/octave or 1 dB from 20 Hz to 20 kHz. We are much
more forgiving of high-Q resonances, the ones that ring on and look so alarming
in waterfall diagrams. Why?
Because in order for them to be energized, a musical spectral component
must hit the frequency exactly and stay there long enough to transfer energy.
In the ever changing musical sounds, especially anything with vibrato,
such instances are rare. It’s as simple as that.
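
The tilt figure quoted above is simple arithmetic, sketched here in Python: 20 Hz to 20 kHz spans just under ten octaves, so a slope of 0.1 dB per octave adds up to about 1 dB end to end. The function names are ours, for illustration only.

```python
import math

def octaves_between(low_hz, high_hz):
    """Number of octaves (doublings of frequency) between two frequencies."""
    return math.log2(high_hz / low_hz)

def total_tilt_db(slope_db_per_octave, low_hz=20.0, high_hz=20000.0):
    """Total level change across the band for a constant dB-per-octave slope."""
    return slope_db_per_octave * octaves_between(low_hz, high_hz)

print(f"Octaves from 20 Hz to 20 kHz: {octaves_between(20.0, 20000.0):.2f}")
print(f"Total tilt at 0.1 dB/octave: {total_tilt_db(0.1):.2f} dB")
```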

Further, loudness grows much more rapidly at low frequencies than
at middle to high frequencies, making bass level especially critical. This
is a huge variable in recordings and movies because of the lack of standardization
in control rooms as can be seen in Figure 2.4 of Sound Reproduction: The Acoustics and Psychoacoustics of
Loudspeakers and Rooms. The notion
that a system can be set up and sound perfect for all recordings is naive.

2010 Audioholics Speaker Shootout (Sighted and Blind Testing)

Placement Interference

Setting up a panel of speakers to instantly compare also often compromises
placement and imaging since you now have speakers closely placed together,
which causes baffle diffraction. In
addition, the effect that one speaker’s box and driver complement (A) has on the different
speaker (B) NEXT to it is not the same as the effect that speaker (B) is going
to exert on speaker (A) or any other speaker.
Change the speaker’s size, shape or driver complement and you change the
effect it has on its neighboring speaker.

This makes real-time side by side comparisons
VERY difficult to implement.
Instantaneous comparison tests (sighted and blind) are a very useful
metric but in our opinion not a replacement for extended listening tests with a
speaker system properly set up in the listening environment to enjoy without the
pressure of comparing differences to another product.

Instantaneous speaker comparisons are a useful metric but in our opinion they don’t replace extended critical listening tests.

Drawbacks in Running Mono Speaker Tests?

We understand that Harman runs mono (single) speaker comparisons using positional substitution so that each time a speaker is switched, the competitor product is placed directly in the same location as the last speaker tested. This method is designed to remove the room and the stereo effect from the contest. According to Dr. Floyd Toole’s research, a speaker that wins in mono will ALWAYS win in stereo too.

Why not add the room and stereo effect to the contest? If that gives one speaker an advantage over another, why should that not count in the contest? Is eliminating the room and stereo effect fair to a speaker which has incorporated those elements into its design philosophy? Why not incorporate room reflections and stereo effect into the test since those are elements of our listening rooms?

I would caution the reader that tests performed by running a single mono speaker placed in the center of the room will be favorable to certain types of speakers and unfavorable to other types of speakers based on their directivity (radiation pattern). This is why, in our opinion, there is still merit in properly setting up each pair of speakers per manufacturer’s guidelines in a room for extended listening sessions to truly appreciate their potential performance under conditions more closely representing the real-world use of the loudspeakers.

Editorial Note about Mono Testing:

Harman has their reasons for running mono single speaker tests. Stereo introduces a great deal of comb filtering in the speaker, and combining that with the comb filtering native to design flaws/compromises in the speaker unit makes distinguishing the source very difficult and would increase the listening time needed to determine the actual rating of the speaker.

Putting the speaker in the room center eliminates the room boundary effects relative to placing it in a corner, for example. No one would (hopefully) try this approach with an Allison or a Klipschorn, but for the typical box speaker, the result of placing it in the corner can make it very colored.

Designing for specific in-room use is useful, and our concern here is that real-world effects (like the confusion of comb filtering), when eliminated from the test, may emphasize certain traits in a speaker over others which may end up being just as important under real-world conditions.

For more information see:
How to Skew a Blind Listening Test
Revealing Flaws in a Loudspeaker Demo and Double Blind Test

Loudspeaker Myths: Anechoic Chambers, the NRC and Flat Frequency Response

Myth #3: An Anechoic Chamber is a Must for Designing and Measuring Loudspeaker Performance

Everybody would love to have an anechoic chamber at their disposal. They are extremely useful in allowing
a designer to accurately and consistently run a series of measurements very
quickly, without having to worry about room effects. This not only reduces design time but also measurement error.

However, with recent advances in measurement
software, a competent designer or loudspeaker reviewer can remove the room
effects of a measurement by gating the measurement. This gives you a true anechoic representation
of frequency response above the room transition frequency (usually 200-300 Hz). In order to gain high enough resolution (1/20th octave or greater)
to see loudspeaker resonances, the measurement should be done outdoors or in a
very LARGE room to minimize truncation in the time domain caused by
gating.
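
Why a bigger space helps can be sketched in a few lines of Python. The sketch uses the common rule of thumb that a gated measurement resolves frequencies no finer than about one over the gate length, and it assumes the first reflection comes from the floor with the speaker and microphone at the same height. The 1 m microphone distance and the three heights are arbitrary examples.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate, at room temperature

def reflection_free_window_s(mic_distance_m, height_m):
    """Time between the direct sound and the first floor reflection.

    Assumes the speaker and microphone sit at the same height above a
    reflective floor, and that the floor is the nearest surface.
    """
    direct_path = mic_distance_m
    reflected_path = math.sqrt(mic_distance_m ** 2 + (2.0 * height_m) ** 2)
    return (reflected_path - direct_path) / SPEED_OF_SOUND_M_S

def gated_resolution_hz(window_s):
    """Approximate frequency resolution (and lowest usable frequency) of a gated measurement."""
    return 1.0 / window_s

for height in (1.2, 3.0, 6.0):
    window = reflection_free_window_s(1.0, height)
    print(f"height {height:.1f} m: window {window * 1000:.1f} ms, "
          f"resolution about {gated_resolution_hz(window):.0f} Hz")
```

The further away the nearest reflecting surface is, the longer the gate can stay open and the finer the frequency resolution becomes.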

For lower frequencies you can use a technique
called “ground plane” which is a VERY accurate method for measuring bass
frequencies. You can then splice the
measurements together to form a complete acoustical response of the speaker
system.

Ground plane is a tried and true methodology
for accurately measuring subwoofers. If
anyone tells you otherwise, realize they are likely doing so to create an
air of exclusivity around their product, implying that only they can accurately measure its true
performance using their methodology. Ground-plane and tall-tower measurements are both used as the basis for calibrating
anechoic chambers at low frequencies, so there is no dispute about their
accuracy if well done – except among the biased and ignorant.

If someone claims a loudspeaker can’t be properly designed without the use of an anechoic chamber, they are either being intellectually dishonest or lack the knowledge of how to properly measure loudspeakers by other means.

Now if one is purely engaging in scientific
research to develop a correlation between how loudspeakers measure and what
listeners prefer, then access to an anechoic chamber is paramount. It is what allowed Dr. Floyd Toole and Sean
Olive to develop their groundbreaking research back at the NRC decades ago that is still an evolving practice at
Harman today. It doesn’t mean you can’t design a great performing loudspeaker system without one however. Many DIY’ers are producing excellent loudspeakers using state of the art parts with a solid understanding of loudspeaker mechanics and how to carefully measure and analyze them.

Myth #4: A Speaker Should Measure like a Flat Line from 20Hz to 20kHz

The ideal loudspeaker would measure like a straight line from 20Hz to 20kHz,
right? Well, maybe, but how is the
measurement being taken? At what
loudness and what measurement resolution?

The Ideal Loudspeaker Measurement?

Most speakers roll off before they reach 20 Hz, so they are
compromised in reproducing the fundamental of a low tone relative to the
harmonics of that tone. The relative harmonic distortion is therefore going
to increase by the same ratio that the system favors the harmonics over
the fundamental bass frequencies.

If the manufacturer decides to make a speaker
flat to 20 Hz by making its mass very high to artificially reach down that low
in a small box, or equalizes the system to artificially boost the low end
rather than by creating a system where the native efficiency of the driver is
kept in full down to the lowest frequency needed to reproduce all the sound we
can hear (arguably down to 16 Hz), then he or she has made a serious compromise
in the reproduction of the lowest frequencies we can hear. The reality is that more and more synthesizers are
reaching down that low, and more and more musical artists (rap artists too) are
creating super-low frequency content that is played at very high sound pressure levels (SPLs) at live
events. This trend is being augmented
by the demand for more and more subwoofers of increasing output and power
handling by artists performing at live events.
While 40 years ago bass below 40 Hz was an oddity, today very high
output bass in the upper 20 Hz region is not at all uncommon at SPLs at
or above 120 dB.

However, this doesn’t stop manufacturers from
manipulating their measurement graphs either by employing a large vertical axis
or by averaging or smoothing the measurements.

What about compression?
How many times have you seen a
shoebox-size subwoofer with an 8 or 10″ driver claiming below 20Hz
extension? The manufacturer will almost never tell you at what SPL and
distance that measurement was taken. Sure it can likely hit 20Hz at say
80dB at 1 meter but what does it do at meaningful output levels? 80dB
at 20Hz is barely audible (see chart). When looking at manufacturers’ loudspeaker
measurements, pay close attention to the test conditions, output levels,
measurement distance, vertical axis and if smoothing was applied to the
graph.

For more information on this topic see:
Loudspeaker Measurements Useful vs Bogus
Audioholics Loudspeaker Measurement Standard

How Many Measurements Are Needed?

Some companies claim it takes thousands of frequency response measurements to
put together a true response of the product in order to predict how it will perform in a real room.
While it’s extremely important to understand how a speaker radiates
sound both on-axis and off-axis to formulate a family of curves, the reality is
it DOESN’T take thousands of measurements.
However, if you mislead the consumer to believe it does, and that it has to be
done anechoically, well then you’ve just convinced them that your product is different from, and more exclusive than, your
competitors’.

You can get a very good idea of how a speaker
radiates by measuring on-axis and off-axis in 15 degree increments horizontally
and vertically until you hit 90 degrees off-axis. That’s six measurements vertically and six
measurements horizontally if the driver topology is symmetrical on the front
baffle or 12 measurements if it isn’t.
So you are looking at a total of 12 or up to 24 measurements to get a
really good idea of on and off axis performance and any anomalies that need to
be corrected either in the crossover integration or diffraction off the baffle.

If you want to get a full 360 degree view of
performance to develop a sound power response it would still take at most
36 vertical and 36 horizontal measurements at 10 degree increments totaling 72
measurements, not thousands. Harman
actually runs a total of 70 measurements via a spinorama to collect both
horizontal and vertical data of a speaker at a distance of 2 meters with 1/20th
octave resolution. From that they can
generate all of the on-axis and off-axis information needed to develop a
directivity index and sound power response.
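
The measurement counts above can be checked with a small Python sketch. The function names are ours. The idea that the on-axis and 180-degree positions are shared between the horizontal and vertical orbits is an assumption on our part; it is one plausible way to reconcile 72 positions with the 70 measurements quoted for the spinorama.

```python
def off_axis_angles(step_deg, max_deg):
    """Off-axis angles on ONE side of the on-axis position."""
    return list(range(step_deg, max_deg + 1, step_deg))

def quick_survey_count(step_deg=15, max_deg=90, symmetrical=True):
    """Off-axis measurements for a horizontal plus vertical survey out to max_deg."""
    per_side = len(off_axis_angles(step_deg, max_deg))
    sides_per_plane = 1 if symmetrical else 2
    return 2 * per_side * sides_per_plane  # 2 planes: horizontal and vertical

def full_orbit_count(step_deg=10):
    """Measurements for full 360-degree horizontal and vertical orbits.

    Assumption: the 0-degree (on-axis) and 180-degree positions are common to
    both orbits, so each is measured only once.
    """
    per_orbit = 360 // step_deg
    shared_positions = 2
    return 2 * per_orbit - shared_positions

print(quick_survey_count(symmetrical=True))   # 12
print(quick_survey_count(symmetrical=False))  # 24
print(full_orbit_count())                     # 70
```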

If the speaker has any frequency response
anomalies, you will find them within the first 90 degrees. There is NO magic here. When looking at loudspeaker measurements, Dr.
Toole’s research indicates that a single spatially averaged response curve is useful
in helping to separate speaker resonances (bad) from acoustical interference
(not so bad, or irrelevant depending on what causes it). However, you still need to see each
individual on-axis and off-axis measurement to truly understand how the speaker
is radiating in the room.

It
reminds me of a familiar quote I once heard from an enlightened audio
colleague:

Not
everything that matters can be measured and not everything that can be measured
matters.

It is my belief that some speakers that audiophiles
often consider to be bright are in fact bright because the manufacturer placed
too much emphasis on flat anechoic response and power response without thoroughly
testing the product in real-world room environments or carefully optimizing
driver integration during the design phase.

Measurements are a great starting point in
understanding how a speaker will perform in a room but nothing replaces extended controlled
listening tests in real rooms for final voicing of a product.

Measurements are a great starting point in
understanding how a speaker will perform in a room, and how the drivers
integrate into the system. But, in my opinion, nothing replaces controlled
listening tests with the target goal of reproducing a live unamplified
performance as the reference. People
will also react strongly to a speaker that is up 3 dB at 5kHz and down 3 dB at 500
Hz vs. a speaker that is exactly the opposite. One will be ‘bright and thin,’
while the other will be ‘dark and heavy,’ but both are ± 3 dB—a great
response. This is very similar to how
two sports cars can produce similar track results but offer completely
different driving experiences and preferences. It is also why specifications
and even measurements don’t tell us the whole story.

Myth #5: The National Research Council – the Buck Stops Here

The
National Research Council (NRC) is a Canadian government laboratory, operating
in many scientific disciplines, which exists to perform some basic research, but
mainly to provide a scientific resource for governmental departments, services
and technical facilities for the benefit of industry and the public. From the
1970s to the early 90s a program of audio research, created and managed by Dr.
Floyd Toole, contributed usefully to the understanding of listener preferences
in sound quality and how these were represented in technical measurements of
loudspeakers and in rooms. Despite some manufacturers claiming they were
intimately involved in the audio research conducted at the NRC, it was actually
initiated and supervised by Dr. Floyd Toole.
No manufacturers were there contributing to the science.

In that process much was learned about
how to conduct meaningful double-blind subjective evaluations, as well as how
to collect comprehensive anechoic data on loudspeakers that could predict sound
fields in listening rooms. Much was learned about the psychoacoustics relating
what was heard and what was measured, providing a basis for subsequent research
leading to reliable correlations with subjective ratings of sound quality.
After 1991, the research begun at NRC continued in the research group at Harman
International, which is currently under the supervision of Dr. Sean Olive, who
began working with Dr. Toole at the NRC in the mid-80s. The measurement
method created at the NRC evolved into what is called the spinorama, and it is
embodied in the recently issued ANSI/CEA-2034 “Standard Method of
Measurement for In-Home Loudspeakers”. More work remains to be done,
but much progress has been made.

Listening Window Response, Spinorama and Sound Power by Dr. Floyd Toole

Dr. Toole pioneered the audio research at the NRC. No other
manufacturers participated despite the marketing claims often found on
some loudspeaker manufacturer websites.

The
listening window is a combination of 0, ± 15 degree vertical, and ± 15 and
30 degree horizontal measurements (the NRC measurements used 15 degree increments, Harman uses
10 degrees). This describes the average direct sound arriving at a group of
listeners. This is one component of the full set of measurements, called the spinorama. However, the listening window response does NOT include the first reflections.

The spinorama is a 360 degree set of 70 frequency response measurements (10 degree increments on horizontal and vertical axes)
intended to capture the complete sound that is radiated into a room –
and which will therefore arrive at a listener in that room. The measurements are intended to allow us to estimate what happens in rooms. THESE measurements DO take first reflections into consideration, which are the second-loudest sounds to arrive at a listener.
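The count of 70 follows from measuring two full orbits in 10 degree steps and counting the shared points only once. A minimal sketch of that arithmetic, assuming the two orbits meet at the on-axis point and at the point directly behind the speaker (the function name and the position labels are illustrative, not taken from the standard):

```python
def spinorama_positions(step_deg=10):
    """List (orbit, angle) measurement positions, counting shared points once."""
    # One full rotation in step_deg increments, e.g. -170 ... 180 for 10 degrees.
    angles = range(-180 + step_deg, 181, step_deg)
    horizontal = {("h", a) for a in angles}
    vertical = {("v", a) for a in angles}
    # Assumption: 0 degrees (on-axis) and 180 degrees (directly behind)
    # lie on both orbits, so they are only measured once.
    shared = {("v", 0), ("v", 180)}
    return sorted(horizontal | (vertical - shared))

positions = spinorama_positions()
print(len(positions))  # 36 horizontal + 36 vertical - 2 shared = 70
```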

We
start with the on-axis frequency response which will be the first sound
to arrive at a single listener in the sweet spot, at which the speaker
should be aimed. The listening window is intended to be an estimate of
the first sound to arrive at a group of listeners – in a good
loudspeaker there is little difference between this and the on-axis
response. Next, we look at the combined energy of the early
reflections, estimated by knowing the angular ranges over which wall,
ceiling and floor reflections occur – this curve is called the “early
reflections” curve.

Then we estimate the total sound power, the energy
radiated through a sphere surrounding the loudspeaker, by weighting the
individual frequency responses and combining them (this is NOT the
simple average of all the frequency responses that is often mistakenly claimed).
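To illustrate why a weighted combination differs from a simple average, here is a rough sketch at a single frequency. The levels and weights below are made up for illustration; the standard tabulates its own weights, and those are not reproduced here. The sketch also assumes the curves are combined as power (pressure squared) rather than as decibel values.

```python
import math

def weighted_power_average_db(levels_db, weights):
    """Combine dB levels as weighted mean power, then convert back to dB."""
    total_weight = sum(weights)
    mean_power = sum(
        w * 10 ** (level / 10) for level, w in zip(levels_db, weights)
    ) / total_weight
    return 10 * math.log10(mean_power)

# Hypothetical levels at one frequency, measured 0, 30, 60 and 90 degrees off-axis.
levels = [90.0, 88.0, 84.0, 78.0]
# Hypothetical weights: points near the axis stand in for a smaller patch
# of the surrounding sphere than points further around.
weights = [0.1, 0.5, 0.9, 1.0]

simple_db_average = sum(levels) / len(levels)
weighted_result = weighted_power_average_db(levels, weights)
print(f"simple dB average: {simple_db_average:.2f} dB")
print(f"weighted power average: {weighted_result:.2f} dB")
```

With these illustrative numbers the two results differ by roughly half a decibel, and the gap grows as the off-axis response becomes less uniform.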

So, out of it all we
get estimates of the three principal classes of sounds arriving at
listeners: direct, early reflected, and late reflected or reverberant.
With this data it is possible to predict with good accuracy (a) the
average steady-state room curve (above the transition frequency of
course) measured in the listening area and (b) the subjective preference
rating as determined in double-blind, positional-substitution listening
tests. This assumes that the room is “typical” – not having aberrant
acoustical treatment.

But what does this mean exactly?

This testing practice is certainly a good way (though not with anything near 100% certainty, mind you) of predicting subjective speaker preferences when speakers are placed in ordinary untreated listening rooms. However, we can’t help but wonder how these results change in rooms that have more controlled acoustical properties and don’t mask subtle details with too many reflections. Most living rooms in Florida, for example, have vaulted ceilings and tiled floors opening to larger rooms, which makes them poorly suited to critical listening.

What About Distortion?

In speaking with Dr. Toole, we discussed how the NRC measures loudspeaker
distortion. The NRC uses stepped tones
for measuring distortion while the current (proprietary) Harman system uses
tones/chirps. With tones one can also measure harmonic distortion, which is
almost useless, but better than nothing because if it is very low, things might
be tolerable and if it is very high, things are likely to be intolerable.

Harmonic distortion measurements also tell us whether the system is fundamentally
well designed. While it may be true that
the absolute total harmonic distortion (THD) number is a useless metric to determine the human annoyance
value of any given level of distortion, it is true that a system which is well
made is usually going to have a lower THD than a system which is poorly made
and/or designed. While THD in itself is
a poor metric, it is a relatively good measure of how well the product is made
in general. It’s harder to find a system with a high THD that sounds clean than a system with a low THD that sounds clean.
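As a minimal sketch of why the THD number alone says little about audibility, the usual definition (root-sum-square of the harmonic amplitudes divided by the fundamental) can be computed as below. The amplitudes are hypothetical. Two very different harmonic spectra produce the same figure, because the calculation discards which harmonics carry the energy.

```python
import math

def thd_percent(fundamental, harmonics):
    """THD as root-sum-square of harmonic amplitudes over the fundamental."""
    return 100 * math.sqrt(sum(h * h for h in harmonics)) / fundamental

# Hypothetical amplitudes relative to a fundamental of 1.0.
# List entries are the 2nd, 3rd, 4th and 5th harmonics in order.
all_second_harmonic = thd_percent(1.0, [0.05, 0.0, 0.0, 0.0])
all_fifth_harmonic = thd_percent(1.0, [0.0, 0.0, 0.0, 0.05])

print(f"{all_second_harmonic:.1f}% vs {all_fifth_harmonic:.1f}%")
```

Both cases report the same percentage even though the distortion products fall at very different distances from the tone that generated them.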

Distortion remains one of the unanswered puzzles in loudspeakers still to this day.

Distortion remains one of the unanswered puzzles in loudspeakers
because the signal that generates the distortion determines the audible effects
of the distortion, so properties of the listener need to be incorporated into
the “measurement” system. Pure tones are virtually unknown in
music, as are multi-tone test signals. Music is the test signal that
matters and it is indeterminate. In the end, a workable distortion
measurement scheme must incorporate a perceptual model of simultaneous masking,
wherein the signal that generates the distortion also causes portions of the
distortion products to be inaudible. Any measurement that does not include the
concept of masking is destined not to correlate well with audibility. It
is a topic that was not addressed at NRC, and because of its complexity
not everybody is equipped to embark on this kind of research. It is worth doing
though.

Bottom Line about the NRC Research

I believe the NRC audio research offers some great target goals for loudspeakers that should
be considered when designing a speaker system.
In fact, I would go so far as to say it is the research done by Dr.
Toole and his staff that is largely responsible for the modernization of
loudspeakers today. Before Dr. Toole’s research, speaker design was considered more of an art form than a science. His research made listening tests more objective, bringing real science to the field for understanding perceptual preferences that was once lacking.

Most manufacturers
now recognize the importance of minimizing diffraction by better integrating
the drivers and crossover networks and narrowing the baffle area. It is rare today to find a legitimately high-fidelity
consumer loudspeaker in a big, wide 1970s-style speaker cabinet with
tweeters firing off opposite ends of the baffle. Most competent loudspeaker designers don’t do this anymore, especially in a horizontal placement of drivers.

Not everyone agrees on how a speaker should radiate sound in a room, which is why there are still over 400 brands to choose from.

It’s important to realize that not
everyone agrees with the primary design goals of how a speaker should radiate
in a room. Some prefer to design a
product to be omni-polar where it radiates similarly around the speaker while
others focus their efforts on designing a speaker that controls dispersion
in efforts to get more direct sound energy at the listener. The latter tends to give you better, more
focused imaging and clarity at the primary listening seat while the former
tends to give everyone good sound but not as highly focused. Listening preferences and design goals vary,
which is why there is still a viable market for the 400+ speaker brands out
there. People should evaluate a
speaker’s radiation pattern in context with how closely they’ll sit to the
speakers. Imaging isn’t as relevant
in the absolute farfield (i.e. sitting so far from the speakers that you
hear more of the reflected sound than the direct sound).

Loudspeaker Myths: Crossovers, Bracing, Drivers Oh My!

Myth
#6: Better Crossover Parts Don’t Matter

Well
this is true if in fact the quality of the drivers, cabinetry and actual crossover
design implementation are all subpar. In
that case, replacing electrolytic caps with poly caps or iron core inductors
with air core may make little or no audible difference.

Polypropylene Capacitors (left pic) ; Electrolytic Capacitor (right pic)

Electrolytic capacitors can have their place
in crossovers if they aren’t in series with a high frequency driver, namely the
tweeter. The problem with electrolytic
capacitors is they exhibit very non-linear resistance behavior at high
frequencies. This is why poly or Mylar
capacitors are the preferred choice.

Air Core Inductor (left pic) ; Iron Core Inductor (right pic)

Laminated core inductors are convenient to
use at bass frequencies where the inductance value has to be large, often making it
impractical to use air core. As long as
the laminated core power handling is at least 2X higher than the rated power of
the speaker system, the end results will likely be fine. However, air cores are preferred for mid/high
frequency components because they are more linear and don’t suffer the
saturation issues that laminated cores do, which add distortion. In addition, air cores don’t exhibit the deleterious
distortion caused by hysteresis, which is common in iron cores. That is another
reason why air cores are the preferred choice, though they are always more
expensive.
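To put rough numbers on why bass-frequency inductors get large, here is a sketch using the textbook first-order crossover formulas. It assumes a purely resistive 8 ohm load, which real drivers are not, and the corner frequencies are hypothetical.

```python
import math

def series_capacitor_uf(corner_hz, load_ohms):
    """First-order high-pass: C = 1 / (2*pi*f*R), returned in microfarads."""
    return 1e6 / (2 * math.pi * corner_hz * load_ohms)

def series_inductor_mh(corner_hz, load_ohms):
    """First-order low-pass: L = R / (2*pi*f), returned in millihenries."""
    return 1e3 * load_ohms / (2 * math.pi * corner_hz)

LOAD_OHMS = 8.0  # assumed nominal, purely resistive driver impedance

tweeter_cap = series_capacitor_uf(3000, LOAD_OHMS)   # about 6.6 uF
woofer_coil = series_inductor_mh(300, LOAD_OHMS)     # about 4.2 mH
upper_coil = series_inductor_mh(3000, LOAD_OHMS)     # about 0.42 mH

print(f"tweeter series capacitor at 3 kHz: {tweeter_cap:.2f} uF")
print(f"woofer series inductor at 300 Hz:  {woofer_coil:.2f} mH")
print(f"series inductor at 3 kHz:          {upper_coil:.2f} mH")
```

Under these assumptions the inductor for a 300 Hz corner needs ten times the inductance of one for a 3 kHz corner, which means far more wire when it is wound without a core.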

Again, none of this matters with a poorly-executed crossover design. We’ve seen
far too many speakers that skimped on the actual crossover design,
allowing the drivers to operate out of their intended bandwidth or failing to properly
integrate components, which just made them sound like mush regardless of whether they produced
good farfield measurements.

For more information on this topic see:

Loudspeaker Crossovers Identifying Myths and Facts about
Good Design Practices

Myth
#7: The Simpler Crossover is ALWAYS
Better

Some
companies rely too much on farfield frequency response without looking at the
individual nearfield driver response curves.
You can, for example, decide to run a midrange in a 3-way design with no crossover components at all to fill in a mid-bass response gap NEVER realizing how much
intermodulation distortion that little driver will now be producing when high
power low frequency content is running through it. At the same time, if you’re operating that
driver out of its designed bandwidth without a crossover component, it will
exhibit nasty effects of cone break up as you can see in the measurement
below.

Woofer Nearfield Frequency Response of a Two-Way Bookshelf Speaker Operating its Woofer with NO XOVER
Note that the nearfield woofer data is raw (i.e., effects of
port and baffle step are not applied).

The picket fence frequency response above
3kHz is where the cone is actually breaking up, causing lots of distortion that
will often be missed in a simple farfield frequency response measurement or
sine-sweep least mean squares (LMS) distortion test. These distortion artifacts can often lead to listening fatigue and/or smearing of the sound.

Farfield Response of the Same Speaker (note the out of phase behavior between the drivers above 5kHz)

It’s interesting looking at the
breakup node at 7kHz in the woofer response and the subsequent null in summed
response. The tweeter must be completely out of phase and have a peak of
its own at this frequency!

Ironically, this could have all been fixed for just a few dollars by adding a low pass filter (LPF) on the woofer. This raises the question: did the profit margins or overgeneralized “science” get in the way of producing a truly good sounding and performing product?

A good designer will look at the individual nearfield driver responses to ensure proper system integration.

In a three-way system with a dedicated
woofer, midrange and tweeter, not having the good sense to put at least a
capacitor in series with the midrange causes compromises in both efficiency and
distortion. Efficiency decreases because the load
impedance drops without any increase in output, and the
midrange over-excurses for a given program, which in turn increases
distortion.
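As a rough illustration of what even a single series capacitor does for the midrange, the sketch below computes the response of an ideal first-order high-pass filter. The 500 Hz corner is a hypothetical value, and the driver is treated as a simple resistive load, so the real-world result will differ.

```python
import math

def highpass_gain_db(freq_hz, corner_hz):
    """Magnitude response of an ideal first-order high-pass filter, in dB."""
    ratio = freq_hz / corner_hz
    return 20 * math.log10(ratio / math.sqrt(1 + ratio ** 2))

CORNER_HZ = 500.0  # hypothetical corner set by the series capacitor

for freq in (50, 100, 500, 2000):
    print(f"{freq:>5} Hz: {highpass_gain_db(freq, CORNER_HZ):6.1f} dB")
```

In this idealized case, bass content a decade below the corner reaches the midrange about 20 dB down, instead of at full level as it would with no capacitor at all.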

We’ve seen numerous loudspeaker companies
defend their 2- or 3-element crossover (i.e. resistor/capacitor network only) as
being preferred to a more complex crossover network that their competitors
employ. They argue that they custom designed their
drivers to better integrate with each other, therefore not needing a crossover with
steep slopes or an elaborate design to improve overall system impedance.
In truth, they make an argument they themselves don’t believe, but insist on
this path, since it is the least expensive means to a compromised end. When you see a very simple network, it is
usually the result of a budgetary decision, not a performance decision.

Budget Designed Crossover (left pic);    a High Quality Crossover (right pic)

Can you guess which crossover was in the speaker that produced the measurements above?

The KISS principle doesn’t always work when
it comes to building a crossover network for a loudspeaker. Take pause if
you open the speaker box and see a 2- or 3-element crossover like the left
picture above, recognizing that this was done, in our opinion, for cost-reducing
purposes and/or out of design incompetence. While the speaker can still offer
respectable performance, it is likely not state of the
art like you would find in more robust and often more
expensive alternatives.

Myth
#8: Less Cabinet Bracing is Better Because It Lowers Panel Resonance

This
again is nonsense as we’ve demonstrated in the articles below. Lowering panel resonance is a bad idea
because it typically places it right within the driver bandwidth that is
producing the largest amplitude vibrations.
Devices like accelerometers and even high power impedance measurements
can give you a good idea of where resonances are that need to be squashed but
you can also tell a lot about a cabinet design by simply knocking on the side
and top panels all around the cabinet.
If it sounds like a high pitch thwack, that’s a good thing. If it sounds like a low frequency thump that
doesn’t instantly decay, that means the panels aren’t very rigid which will
typically result in boomy or bloated bass response or chesty
midrange. Properly bracing a cabinet
requires a good understanding of acoustics and budget not just for the panels
but for the labor of properly installing them into the cabinet.

A good cabinet will have sufficient stiffness and bracing to raise the panel resonance above the driver bandwidth.
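The link between stiffness and resonance frequency can be sketched with a single mass on a spring, where f = sqrt(k / m) / (2 * pi). This is a deliberate oversimplification, since a real panel has many vibration modes, and the stiffness and mass values below are hypothetical.

```python
import math

def resonance_hz(stiffness_n_per_m, mass_kg):
    """Resonance of a lumped mass-spring system: f = sqrt(k / m) / (2 * pi)."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2 * math.pi)

PANEL_MASS_KG = 1.0          # hypothetical effective moving mass of a panel
UNBRACED_STIFFNESS = 4.0e5   # hypothetical effective stiffness, N/m

unbraced = resonance_hz(UNBRACED_STIFFNESS, PANEL_MASS_KG)
# Assumption: bracing quadruples the effective stiffness and adds no mass.
braced = resonance_hz(4 * UNBRACED_STIFFNESS, PANEL_MASS_KG)

print(f"unbraced: {unbraced:.0f} Hz, braced: {braced:.0f} Hz")
```

In this simplified model, quadrupling stiffness doubles the resonance frequency, so pushing a resonance well above the driver's bandwidth takes a large increase in stiffness.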

Panel resonance manifests as launching of
sound waves from an axis that is not the design axis. This affects the polar/power response and is, in
our belief, a major deterrent to good imaging. The resonance decay is longer too. Speaker cabinets that resonate do not
disappear into a room.

It
is worth noting that there can be movement in a portion of a cabinet that is
cancelled by opposite polarity movement in another. Numerical simulation
using coupled physics between structural mechanics and fluid mechanics and
physical testing in a controlled environment are ways to determine if there are
actually significant unwanted acoustic emissions. Harman uses a laser vibrometer to scan all
surfaces at all frequencies.

For more information on this topic see:

Loudspeaker Cabinets Identifying Myths and Facts about
Good Design Practices

Myth
#9: We Make our Own Drivers For Superior
Performance and Consistency

I
believe statements like this are quite insulting to all of the very talented
acoustical engineers working for reputable driver companies. The fact of the matter is VERY few
loudspeaker manufacturers truly make their own drivers as can be seen in our
article below.

Scan Speak 9500 Tweeter (left pic) ; Tweeter from Loudspeaker Manufacturer (right pic)

Which tweeter do you think has better power handling, lower Fs, and better sound?

For more information on this topic see:
Do Loudspeaker Manufacturers Really Make their Own Drivers?

Loudspeaker Drivers: Myths vs Facts to Identify Good Parts from Bad

Most companies will NOT reinvent the
wheel. They will use standard baskets,
ferrites, top plates, etc. They will
typically call one of the driver vendors and ask them to customize an OEM part
by either changing the voice coil impedance or the cosmetics of the faceplate.

Loudspeaker driver companies are fully capable of making the very best drivers and are often better equipped than loudspeaker manufacturers themselves.

The idea that the major reputable driver
companies like SEAS, Scan Speak, Focal, etc. can’t produce a superior product
with tight tolerances is not only an arrogant presumption, it’s also untrue. They have been doing it longer – for generations! They must compete in the marketplace against
the other companies that have also been doing it for generations.

There are certainly cases where a designer may
choose to customize a specific part to meet their design requirements since
nothing off the shelf can meet their needs.
Choosing to make your own parts has nothing to do with any supposed inability of
driver companies to produce reliable and consistently good parts. Most of the best loudspeaker designs in the
world use very high quality OEM parts and achieve excellent consistency in
design and performance. The reality, again, is that claiming “we make our own drivers” gives the
consumer the illusion of exclusivity to help your product stand out.

Myth
#10: The Constantly Evolving Speaker

I call
this myth the evolving speaker for the simple reason that some companies love
to throw a spin on their products every couple of years just to make them new
and appealing again. They often make
cosmetic changes or slight changes to drivers or the crossovers. While this can result in better performance,
be a bit leery if a company goes through revision cycles as often as a parent
changes a baby’s diaper. Speaker
technology doesn’t evolve nearly as quickly as electronics. A properly designed speaker built a decade
ago is still relevant today. No recent
earth shattering science has reinvented the wheel or taught us how to build a
better mousetrap. However, if a company
gets a bad review or sees their sales drop one quarter, or if they want an
excuse to raise prices, you may often see the next revision of their speaker
hit the shelves as the best thing since apple pie or sliced bread. Again, take these claims with a grain of salt.

The real changes in the past few
decades have been in measuring the speaker more than in designing
it. Tools like Klippel measurement systems and Finite Element Analysis (FEA) for
modeling things on a computer have changed far more than the product itself or
its quality.

Bonus
Myth #11 “Digital” Ready Speakers

Watch out for speakers described as
“Digital Ready”. This terminology was popular back in the
early 80s when the CD player first hit the market and manufacturers were
clamoring to get a piece of the digital pie.
Nowadays, “Digital” usually
tends to be associated with “White Van” speakers built in some Chinese factory
or somebody’s garage loaded with cheap drivers and electronics, designed to
look impressive with flashy brochures and outrageously high retail prices to
confuse the inexperienced buyer.

For
more info on these brands, visit Scam Shield.

Conclusion

I
think it’s important to realize that virtually every loudspeaker company is
going to try to sell you a convincing story as to why their speakers are
“better”. There is certainly nothing
wrong with having pride in your company’s products and services. It is important to realize that many of the
claims that companies make, even in the name of “science,” must be met with
cautious skepticism, just like a glowing review from an AV magazine online or
in print. Discerning the science from science fiction will help you make a more informed purchasing decision. I love a good Sci-Fi show as much as the next guy, but let’s keep the Technobabble out of loudspeakers.

Points of Consideration:

Not all manufacturers’ “science” is
equal even if they build their company core philosophy off the discoveries
made at the NRC decades ago.

Manufacturers have limited budgets
to buy samples of competing products to directly compare against. Thus small sample sizes can often lead to
over-generalized conclusions of product superiority.

Marketing can often triumph over
engineering in a final product design cycle.
Manufacturers don’t operate on a fixed profit ratio, regardless of whether they
sell direct or through brick & mortar channels, so price may not be a good
indicator of performance.

We (consumers and manufacturers) all
have our particular biases and preferences on how things should sound.

We aren’t robots. Sound is NOT
always the only determining factor of speaker preference. We must be able to live with the speakers and
also have spousal approval. There is value in that!

Don’t
discount the small speaker company in favor of the giants out there. Some of the smaller companies offer a greater
variety of customization, both in aesthetics and in performance options. They often don’t simply slap together a
speaker box from a machined piece of medium-density fiberboard (MDF) but instead hand-build and test each
product leaving their facility.

Read
the reviews, observe the measurements, listen to the opinions on the forums,
but most importantly…. test the products for an extended period of time in your
own listening environment to see if a particular speaker is right for your
needs.

Remember these great words from Captain Spock: “Logic is the beginning of wisdom, not the end.”

Acknowledgements

I would like to personally thank
the following people for their contributions and/or peer review of this
article, all of whom are true experts in their respective fields. Their
contributions enabled us to make the most comprehensive and accurate article
possible on the very complex topic of loudspeaker design and testing dealt with herein.
