For my next pair of speakers, I was going to get it right. Really right. At least that was the plan. I wanted efficient loudspeakers that could effortlessly reproduce the sensation of a full-fledged symphony orchestra in my living room, such as when seated in the center of the main floor and just a few rows back from the stage. After plenty of reading on the Lansing Heritage Forum and similar sources of internet wisdom, I wound up with 4 JBL 2235H 15" woofers, 2 JBL 2123H 10" midranges, and a pair of Aurum Cantus G1 ribbon tweeters—especially since the tweeters were on sale at Parts Express. Sources for the JBLs were the local JBL representative Morgan Sound and eBay.
My speaker design software (LspCAD) simulates a pair of 2235H in parallel with an efficiency of 99 dB/2.83V/m, likewise for a single 2123H, while the G1 is rated 102 dB/W/m. Moreover, the simulation had it that in an appropriately tuned vented enclosure, the pair of 2235H should reach 117 dB SPL at 1 m with the cone excursion staying within xmax above 16 Hz, hence at 3 m distance a stereo pair should do 110 dB SPL. 110 dB SPL is about what has been recorded as the loudest peaks of a symphony orchestra, and 3 m (10 ft) is how far I'm usually seated from my speakers. xmax loosely denotes the speaker cone's range of travel within which it is well-behaved (low distortion), and 16 Hz represents the lowest note on a large pipe organ (lowest C on a 32' stop). For the record, the simulation of the 2123H handily exceeds 117 dB SPL at 1 m and within xmax above 200 Hz, while the G1 should get there with some 32 Watts out of its 100 W RMS rating.
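The level arithmetic in this paragraph can be checked with a few lines of Python. This is only a rough sketch under assumptions that are mine, not LspCAD's: free-field inverse-square falloff (no room gain), and the two stereo channels adding as uncorrelated sources (+3 dB). The function names are made up for illustration.

```python
import math

def spl_at_distance(spl_1m_db, distance_m):
    """Free-field SPL at distance_m, given the SPL at 1 m (inverse-square law)."""
    return spl_1m_db - 20.0 * math.log10(distance_m)

def sum_uncorrelated(spl_db, n_sources):
    """SPL of n equal, uncorrelated sources (their powers add)."""
    return spl_db + 10.0 * math.log10(n_sources)

def watts_for_spl(target_db, sensitivity_db_1w_1m):
    """Electrical power needed to reach target_db at 1 m."""
    return 10.0 ** ((target_db - sensitivity_db_1w_1m) / 10.0)

one_speaker_3m = spl_at_distance(117.0, 3.0)          # one speaker, 3 m away
stereo_pair_3m = sum_uncorrelated(one_speaker_3m, 2)  # both channels together
g1_power_w = watts_for_spl(117.0, 102.0)              # ribbon at 117 dB / 1 m

print(f"one speaker at 3 m: {one_speaker_3m:.1f} dB SPL")
print(f"stereo pair at 3 m: {stereo_pair_3m:.1f} dB SPL")
print(f"G1 power for 117 dB at 1 m: {g1_power_w:.1f} W")
```

Under these assumptions the pair lands at roughly 110 dB SPL and the ribbon needs roughly 32 W, in line with the figures quoted above.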
The tricky part will be the crossover choice for the transition from the 10" midrange to the ribbon tweeter. Cross the midrange too high and it starts to beam; cross the tweeter too low and I risk frying the ribbon, especially when using the prodigious power handling and efficiency of the midrange and woofers. Suggestions for the midrange I read were as low as 1.2 kHz; Aurum Cantus' recommendation for the tweeter was 1.5 kHz (18 dB/oct) in the past, while currently their website recommends 2 kHz or above.
Along with that I seemed to have in mind that the crossover frequency fc should be chosen such that its wavelength λc equals the center-to-center distance between the midrange and the tweeter. Given the physical dimensions of the two transducers, and my intention to build separate enclosures for the two (such that I could time-align their acoustic centers after building the enclosures), I determined a minimum center-to-center distance of 230 mm (about 9 1/16").
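Converting that spacing into a frequency is a one-liner. The sketch below assumes a speed of sound of 343 m/s (dry air at about 20 °C); that value and the function name are my assumptions.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed (dry air, about 20 degrees C)

def frequency_for_wavelength(wavelength_m, c=SPEED_OF_SOUND):
    """Frequency in Hz whose wavelength in air equals wavelength_m."""
    return c / wavelength_m

fc = frequency_for_wavelength(0.230)        # wavelength = driver spacing
fc_double = frequency_for_wavelength(0.460)  # wavelength = twice the spacing

print(f"wavelength 230 mm -> {fc:.0f} Hz")
print(f"wavelength 460 mm -> {fc_double:.0f} Hz")
```

With this assumption a wavelength of 230 mm corresponds to about 1491 Hz.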
Time to do some simulations in hopes that this may provide decisive insight. I started with a 4th order Linkwitz-Riley (24 dB/oct) filter because this is probably about the most complex crossover I'd be willing to implement with passive components, while its relatively steep slopes would maximize the protection of the fragile ribbon. To start with, I chose fc = 1491 Hz, corresponding to λc = 230 mm, the minimum vertical offset ΔY of the drivers. I had LspCAD display both horizontal and vertical polar frequency responses, for a range of frequencies bracketing fc. To simplify things for now, and in hopes I would gain some understanding, I assumed an infinite listening distance and point sources for the level-matched transducers (in practice, I would be listening from a distance closer than infinity, and the transducers would have a non-zero diameter or—in case of the ribbon—width and height. Also, the ribbon's 102 dB/W/m level would have to be padded down to match the midrange's efficiency). Here is the somewhat startling result:
Horizontal (left) and vertical (right) frequency responses (polar plots), simulated for point sources crossed over at fc = 1491 Hz with a 4th order Linkwitz-Riley filter and ΔY = 230 mm
To read these diagrams, imagine a line from the center of the tweeter to your ear. At 0° this line is perfectly perpendicular to the flange of the tweeter (on-axis). Now imagine moving your ear left or right (left diagram) or up or down (right diagram). Depending on the angle (in range ±90°), orientation (horizontal vs. vertical), and frequency (see color legend), the above diagrams simulate how much sound will make it to your ear (off-axis response).
What do we see? For any horizontal angle, the off-axis response is the same as the on-axis response at any frequency. We should hear the same sound if we moved our head left or right. But for just about any vertical angle, the off-axis response is lower than the on-axis response at most frequencies—in fact, at 1491 Hz and 30° up or down (green trace), the response is just about nil. We should hear no sound at all at these angles and 1491 Hz (!).
To understand this drop-out in the vertical polar response let's look at the speaker from its side. From this point of view it becomes obvious that an observer at a vertical angle α does not hear the midrange at the same time as the tweeter:
Hearing the midrange and tweeter at a vertical angle α
Observed from infinity but at an angle α, and the transducers vertically spaced by ΔY, it takes the sound a bit more time to travel from the other transducer to the observing ear. By elementary trigonometry, the extra distance ΔZ that the sound has to travel is
ΔZ = ΔY·sin(α).
For our drop-out, α = 30°, sin(α) = ½, and ΔZ = ΔY/2. The way I chose fc, ΔY = λc, hence
ΔZ = λc/2.
At fc the sound from the midrange arrives half a wavelength later than the sound from the tweeter. It arrives out-of-phase, hence the cancellation.
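To convince myself of this, the two drivers can be modelled as equal-level, in-phase point sources observed from infinity, which is the same simplification as in the simulation at the crossover frequency. The sketch below is my own toy model, not LspCAD's algorithm, and the function name is made up.

```python
import cmath
import math

def relative_level_db(alpha_deg, spacing_m, wavelength_m):
    """Level of two equal, in-phase point sources at vertical angle alpha,
    relative to the on-axis level, for an observer at infinity."""
    delta_z = spacing_m * math.sin(math.radians(alpha_deg))  # extra path
    phase_lag = 2.0 * math.pi * delta_z / wavelength_m       # in radians
    total = (1.0 + cmath.exp(-1j * phase_lag)) / 2.0         # 1.0 on-axis
    magnitude = abs(total)
    if magnitude < 1e-12:          # treat numerical dust as full cancellation
        return float("-inf")
    return 20.0 * math.log10(magnitude)

spacing = 0.230      # delta-Y in metres
wavelength = 0.230   # lambda_c chosen equal to delta-Y
for angle in (0, 10, 20, 30, 40):
    print(f"{angle:3d} deg: {relative_level_db(angle, spacing, wavelength):8.2f} dB")
```

The level falls away from 0 dB on-axis and collapses at 30°, where the path difference is exactly half a wavelength.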
On second thought, this result may not be that intuitive, even after duly applying trigonometry. For instance, without trigonometry I wouldn't find it intuitive that the sound traveling the extra 115 mm (about 4 17/32") between here and infinity would have any effect at all. Moreover, I don't find it intuitive that after traveling through the filter in the first place, the electrical signal arrives at both the midrange and the tweeter in phase. If it didn't, my simple use of trigonometry couldn't explain the cancellation observed at ±30°. Turns out that this is a property of 4th order filters. In other words, the crossover does not introduce any Δφ between the midrange and the tweeter. It does, nevertheless, introduce a Δφ between its input and its output. But for any given frequency the Δφ it introduces is the same for both the midrange and the tweeter, hence the above explanation for our drop-out still applies (why it is that for 4th order filters the low- and highpass are always in phase I still have to learn).
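One way to look at that open question is the textbook form of the 4th order Linkwitz-Riley filter: two cascaded 2nd order Butterworth sections per branch. With the normalized frequency s = j·f/fc, the highpass is simply the lowpass multiplied by s⁴, and (j·f/fc)⁴ is a positive real number, so it changes the magnitude but not the phase. The sketch below checks this numerically; it assumes ideal filters and says nothing about the drivers' own phase response.

```python
import cmath
import math

def lr4_branches(f_hz, fc_hz):
    """Ideal 4th-order Linkwitz-Riley lowpass and highpass responses."""
    s = 1j * f_hz / fc_hz                                 # normalized j*omega
    denominator = (s * s + math.sqrt(2.0) * s + 1.0) ** 2
    lowpass = 1.0 / denominator
    highpass = s ** 4 / denominator
    return lowpass, highpass

def phase_difference_deg(f_hz, fc_hz):
    """Phase of the highpass relative to the lowpass, in degrees."""
    lowpass, highpass = lr4_branches(f_hz, fc_hz)
    return math.degrees(cmath.phase(highpass / lowpass))

fc = 1491.0
for f in (200.0, 745.0, 1491.0, 3000.0, 12000.0):
    lowpass, highpass = lr4_branches(f, fc)
    print(f"{f:7.0f} Hz: phase difference {phase_difference_deg(f, fc):+.6f} deg, "
          f"summed magnitude {abs(lowpass + highpass):.4f}")
```

In this idealized model the two branches stay in phase at every frequency and sum to unity magnitude.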
While I carefully chose the frequencies to be used in the above diagrams (LspCAD allows 5 individual frequencies and promptly re-normalizes them to what it deems more workable), these diagrams don't convey the complete picture. What we want to look at is the magnitude of the sound as a function of both its frequency and the angle of observation. LspCAD's alternative is to plot the complete frequency spectrum for a discrete number of angles, albeit allowing more angles than frequencies in the polar plot. Here's what this looks like:
Horizontal (top) and vertical (bottom) off-axis frequency responses (surface plots), simulated for point sources crossed over at fc = 1491 Hz with a 4th order Linkwitz-Riley filter and ΔY = 230 mm
For the chosen angles (±90° in increments of 15°), the horizontal off-axis frequency response is perfectly flat at any frequency, while the vertical off-axis response shows a few minor dips above and below the 0° line and almost hides the previously observed cancellations: Carefully follow the lines at ±30° and notice where they seem to disappear behind neighbouring lines.
I found it rather remarkable that both the polar and the surface plot essentially convey the same aspect of speaker behaviour, but merely looking at the latter I might have missed the cancellations I observed in the former. This is not to say that one is better or more suitable than the other. I just have to learn how to make the best of either method of visualization. For instance, knowing that off-axis the response cancels at ±30° I could have chosen to have the off-axis plot display angles in range ±30° only but instead in increments of 5°. This would clearly illustrate the cancellations, but in the process I might have missed what's happening above +30° and below -30°:
Vertical off-axis frequency response (surface plot, ±30° window) simulated for point sources crossed over at fc = 1491 Hz with a 4th order Linkwitz-Riley filter and ΔY = 230 mm
So far I have looked at a particular crossover frequency fc for a 4th order Linkwitz-Riley filter and a given ΔY = λc between the midrange and the tweeter. This combination has shown a deep frequency response dip both above and below the axis of observation. Next I'll have a look at a range of crossover frequencies to see if there is maybe a less intrusive spot for this dip, and if so at what
Vertical off-axis frequency response (polar and surface plot) simulated for point sources crossed over at frequencies corresponding to λc in range 4·ΔY to ¼·ΔY at intervals of ½ octaves, keeping the 4th order Linkwitz-Riley filter and ΔY = 230 mm
Everything else being equal, the preceding simulation suggests that cancellation in the vertical off-axis response is unavoidable unless I choose a crossover frequency equivalent to λc > 2·ΔY (or fc < 745 Hz, benign for the midrange but out of the question for the ribbon tweeter). Recall that for the first simulation, cancellation occurred at
ΔZ = ΔY·sin(α)
ΔZ = λc/2
α = arcsin(λc/(2·ΔY))
for which there is no solution if
λc/(2·ΔY) > 1
λc > 2·ΔY
as predicted by the simulation. Conversely, we should expect cancellation at odd multiples of λc/2, or
ΔZ = (2·n + 1)·λc/2, for n = 0, 1, 2, ...
corresponding to angles
αn = arcsin((2·n + 1)·λc/(2·ΔY)), for n = 0, 1, 2, ...
which explains multiple cancellations for n ≥ 1 and (2·n + 1)·λc/(2·ΔY) ≤ 1 or
λc ≤ 2·ΔY/(2·n + 1)
which is illustrated by the simulation for λc ≤ ½·ΔY.
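The formula for the cancellation angles is easy to tabulate. The following sketch evaluates it for a few ratios λc/ΔY, including the half-octave steps bracketing the interesting cases. The function name is my own, and the model is the same idealized one as above: point sources, in phase, observed from infinity.

```python
import math

def cancellation_angles_deg(lambda_over_dy):
    """Vertical angles (degrees, one side of the axis) where the path
    difference is an odd multiple of half a wavelength."""
    if lambda_over_dy <= 0.0:
        raise ValueError("lambda_over_dy must be positive")
    angles = []
    n = 0
    while True:
        sin_alpha = (2 * n + 1) * lambda_over_dy / 2.0
        if sin_alpha > 1.0:   # no real angle left: stop
            break
        angles.append(math.degrees(math.asin(sin_alpha)))
        n += 1
    return angles

for ratio in (4.0, 2.0, 1.0, 2.0 ** -0.5, 0.5, 0.25):
    angles = ", ".join(f"{a:.1f}" for a in cancellation_angles_deg(ratio))
    print(f"lambda_c = {ratio:.3f} * dY: {angles or 'no cancellation'}")
```

For λc = ΔY this gives the single cancellation at 30°, and for λc = ½·ΔY it gives two, at about 14.5° and 48.6°, each mirrored below the axis.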
What to do? I won't even think about crossing over the tweeter below 745 Hz, and I physically can't mount the drivers closer together than a center-to-center distance ΔY of about 230 mm. Mounting the tallish ribbon tweeter horizontally is not a good alternative, either, because of its dispersion characteristic (which is a posh way of saying that it D times the linear cone travel xmax (one way). If I wanted a cone with half the diameter to move the same amount of air, I am reducing the area to a quarter, hence I would have to find a driver with quadruple the cone travel.
There is another variable that I have not looked at very closely. Recall that it is a property of 4th order filters that the low- and highpass are always in phase, and hence the validity of all the above math for determining cancellation. What if I used a different filter order? For instance, 1st order filters (6 dB/oct) are out-of-phase by 90°, and so are 3rd order filters (18 dB/oct), the latter with inverted tweeter polarity. Before thinking about the mathematical consequences, let's simulate a range of filter orders.
Vertical off-axis frequency response (polar and surface plot, left and top right) simulated for point sources crossed over at λc = ΔY for various slopes from 6 dB/oct to 96 dB/oct, keeping ΔY = 230 mm, and simulating group delay (bottom right)
Several things caught my eyes right away:
clean the vertical off-axis response may look, exceeds the Blauert and Laws criteria for audibility of group delay.
Group delay is a concept for which I have yet to develop an intuitive understanding. Formally, it is defined as
τ(ω) = −dφ/dω
i.e. the infinitesimal rate of change of phase φ with respect to angular frequency ω = 2·π·f, with the negative sign chosen such that an actual delay yields a positive number. In practice, I first observed this concept on different woofer alignments (sealed vs. ported enclosures). Generally, the lower the frequency, the more the woofer appeared to drag behind, with certain sealed enclosures showing a monotonic but slow increase towards lower frequencies, while certain vented enclosures exhibited a rather marked peak. I suspect I'll need to understand the speaker and its enclosure as a forced but damped harmonic oscillator, at which point a frequency dependent phase shift (delay) should become evident, and hence the group delay.
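The definition can be tried out numerically without any simulation package. The sketch below approximates −dφ/dω with a central difference, applied to an ideal 4th order Linkwitz-Riley lowpass (two cascaded 2nd order Butterworth sections). The filter model, the 1491 Hz corner and the function names are my assumptions for illustration; a real driver adds its own phase on top.

```python
import cmath
import math

def lr4_lowpass(f_hz, fc_hz):
    """Ideal 4th-order Linkwitz-Riley lowpass response at f_hz."""
    s = 1j * f_hz / fc_hz
    butterworth2 = 1.0 / (s * s + math.sqrt(2.0) * s + 1.0)
    return butterworth2 * butterworth2

def group_delay_s(transfer, f_hz, df_hz=0.01):
    """Approximate tau = -d(phi)/d(omega) by a central difference.
    The phase step is taken from the ratio of two nearby responses,
    which sidesteps phase unwrapping."""
    ratio = transfer(f_hz + df_hz) / transfer(f_hz - df_hz)
    dphi = cmath.phase(ratio)
    domega = 2.0 * math.pi * (2.0 * df_hz)
    return -dphi / domega

fc = 1491.0

def lowpass(f_hz):
    return lr4_lowpass(f_hz, fc)

for f in (100.0, 500.0, 1000.0, 1491.0, 3000.0, 6000.0):
    print(f"{f:6.0f} Hz: group delay {group_delay_s(lowpass, f) * 1000.0:.3f} ms")
```

For this idealized filter the delay is about 0.30 ms at low frequencies and at fc, with a mild hump in between, and it falls off above the corner.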
But why does this matter? Think of it this way: group delay causes some frequencies of the spectrum to be delayed relative to others. It's like the high frequencies arrive first, followed by the low frequencies. Intuitively, this doesn't sound right, but it seems unavoidable for woofers or any cone driver I have managed to simulate so far. What surprised me is that the phenomenon extends into crossovers with slopes > 6 dB/oct. I should probably try to understand crossovers as forced damped oscillators as well. Like the dips in the off-axis response, I may have to live with group delay. In turn, this raises the question: at what point does this become audible? There is some speculation about the audibility threshold for low frequencies; what is known is the threshold of audibility at higher frequencies, from the previously mentioned studies by Blauert and Laws.
Back to the cancellations in the off-axis response: I can't get rid of them in an MT configuration (single midrange, single tweeter), but I could minimize their side-effects using a filter with the highest slope within the Blauert and Laws limits (48 dB/oct). Using a 48 dB/oct filter is not a purely academic exercise provided I am ready to tri-amp my speakers, i.e. use an individual amplifier per driver and channel, and implement the crossover at the line level (pre-amp) stage. In practice, I'd have to get out of my listening chair, position my head to within a couple of degrees of 30° above or below the 0° axis, and wait for e.g. the pianist to hit F#6 (1480 Hz) to notice the remaining cancellations.
Last but not least, I could look at MTM configurations (after Joseph D'Appolito). In an MTM configuration, two midranges are used, with a single tweeter in-between. The immediate advantage of this configuration is the increased efficiency provided by dual midrange drivers—not that this was a problem to begin with for my choice of transducers. If there is any advantage to the off-axis response, I should be able to simulate it—particularly before getting serious on eBay with bidding on a second pair of JBL 2123H.
Vertical off-axis frequency response (polar and surface plot, left and top right) simulated for point sources crossed over in an MTM array (after Joseph D'Appolito) at λc = ΔY for various slopes from 6 dB/oct to 96 dB/oct, keeping ΔY = 230 mm, and simulating group delay (bottom right)
Several things caught my eyes again right away:
If I were to look at the vertical polar plot only, I'd probably go for the 6 dB/oct (1st order) Butterworth filter, which is the exact opposite when compared to my preference for the MT configuration. As I learned while doing these simulations, 1st order filters are the only ones that have a chance to transmit square waves as square. But this won't get me to consider crossing over the G1 ribbon at 1491 Hz and a mere 6 dB/oct—not at the power levels I was planning on using it.
Anything else left to look at? LspCAD lets me specify radii of cone drivers, or width and height of rectangular transducers such as the G1. I calculated a radius of 100 mm for the 2123H and measured 14.5 mm x 150 mm (W x H) for the G1, set the filter topology back to 4th order Linkwitz-Riley, and was in for a surprise:
Horizontal (top) and vertical (bottom) off-axis frequency responses simulated with actual driver dimensions, crossed over at fc = 1491 Hz with a 4th order Linkwitz-Riley filter and ΔY = 230 mm
Ouch! After I double-checked against any obvious data entry errors, I re-tried the various filter slopes from the previous simulations, but without obtaining substantially different results. It always looked like the vertical off-axis response is very narrow in the tweeter's range of frequencies.
The reason for this is the respectable vertical diaphragm dimension of the G1 ribbon. Looked at in the vertical plane, it is as if I used a tweeter with 150 mm (about 6") piston diameter. While this tweeter has no problem whatsoever transmitting frequencies beyond 20 kHz, it will do so from every point on its diaphragm. But unless I listen to it on-axis, sound emanating from one point of the diaphragm won't arrive at my ears at the same time as sound emanating from any other point of that same diaphragm. The problem is the same as previously illustrated for hearing the midrange and tweeter at an angle. All that has changed is the size and the frequencies involved. To illustrate the point, I have simulated the vertical off-axis response for a range of imaginary ribbon tweeters with progressively smaller vertical diaphragm dimensions:
Vertical off-axis frequency response (polar and surface plot) simulated for an imaginary ribbon tweeter with heights ranging from 150 mm down to 15 mm
The behaviour that the 150 mm tall ribbon tweeter exhibits in the vertical plane is called beaming. I suppose this is a descriptive term: Like a flashlight throwing a narrow beam of light into darkness, a cone transducer will disperse sound in a narrow beam if it is asked to transduce wavelengths that are much smaller than its diameter. I do not yet fully understand the sonic consequences of combining a tweeter having progressively restricted vertical dispersion with a midrange having unrestricted dispersion, at least up to its crossover point. Somehow, it doesn't seem right, particularly after having obsessed over the off-axis response of different crossovers, midrange to tweeter distances, and MT vs. MTM configurations.
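A rough feel for how narrow the beam gets can be had from the textbook model of a uniform line source, whose far-field response follows sin(x)/x with x = π·h·sin(α)/λ. This is an idealization I am assuming here (every point of the ribbon radiating equally and in phase, 343 m/s speed of sound), not necessarily what LspCAD computes, and the function names are made up.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def sinc_crossing(target, steps=60):
    """Solve sin(x)/x = target for x in (0, pi) by bisection;
    sin(x)/x falls steadily from 1 to 0 over that interval."""
    low, high = 1e-9, math.pi
    for _ in range(steps):
        mid = 0.5 * (low + high)
        if math.sin(mid) / mid > target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

def minus_6db_angle_deg(height_m, f_hz, c=SPEED_OF_SOUND):
    """Off-axis angle where an idealized line source of the given height
    is 6 dB down, or None if that never happens within 90 degrees."""
    x = sinc_crossing(10.0 ** (-6.0 / 20.0))
    sin_alpha = x * c / (math.pi * height_m * f_hz)
    if sin_alpha > 1.0:
        return None
    return math.degrees(math.asin(sin_alpha))

for height_mm in (150, 75, 15):
    for f_hz in (8000.0, 16000.0):
        angle = minus_6db_angle_deg(height_mm / 1000.0, f_hz)
        text = "never 6 dB down" if angle is None else f"{angle:.1f} deg"
        print(f"height {height_mm:3d} mm, {f_hz / 1000.0:.0f} kHz: {text}")
```

In this model a 150 mm ribbon is 6 dB down at roughly 10° off-axis at 8 kHz and roughly 5° at 16 kHz, while a 15 mm ribbon stays wide open at 8 kHz.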
What are the alternatives? I have seen silk dome tweeters with diameters as small as 19 mm (3/4"), but I don't recall any of them with the efficiency and power handling of the Aurum Cantus G1. Installing multiple tweeters to increase efficiency and power handling is completely out of the question, as this would lead to comb filtering. Comb filtering is a descriptive term for the effect that happens as a result of multiple cancellations, a generalization of the off-axis listening experiment illustrated above.
I could use a horn with a compression driver, such as are used in professional sound reinforcement applications. Modern horn geometries provide for constant directivity over a well-defined angle of dispersion. Within this angle the amount of sound is about the same, while it decreases rapidly outside. Think of a floodlight with a 90° wide beam. But before I would be confident to use this as an alternative, I'd still have to learn more about this topic. What worries me in particular is the non-linearity introduced by the pressure changes in the compression chamber. Following the ideal gas law
p·V ~ T
the pressure p exerted by the diaphragm to compress and decompress air is inversely proportional to the volume V of the air between the diaphragm and the phasing plug towards the throat of the horn. The positive half of a sine wave is transmitted with a different magnitude than the negative one. This becomes audible as distortion, with the distortion increasing with frequency.
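The asymmetry is easy to put into numbers. The sketch below takes the relation above at face value and additionally assumes constant temperature, so that p·V stays constant; real compression chambers are closer to adiabatic, which changes the numbers but not the asymmetry. The ±10 % volume swing is an arbitrary illustration value, not a property of any real driver.

```python
def pressure_change(relative_volume_change, p0=1.0):
    """Pressure change when the volume goes from V0 to
    V0 * (1 + relative_volume_change) at constant temperature."""
    return p0 / (1.0 + relative_volume_change) - p0

compression = pressure_change(-0.10)  # diaphragm moves in, volume shrinks
expansion = pressure_change(+0.10)    # diaphragm moves out, volume grows

print(f"compression: {compression:+.4f} * p0")
print(f"expansion:   {expansion:+.4f} * p0")
```

Equal diaphragm travel in both directions yields a larger pressure rise than pressure drop, which is the asymmetry between the two halves of the sine wave described above.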
Compared to that, maybe the beaming of the ribbon tweeter is not that bad after all, however ugly the vertical off-axis response may look at first. To get an idea of the area within which I should be safe (also known as the sweet spot), I tried to determine the 6 dB down point for various frequencies. As far as I gather, a relative level of 6 dB down is used to define the angle of dispersion, and with an angle I can extrapolate to the listening distance by basic trigonometry:
Vertical off-axis frequency response (overlaid plot) simulated for a range of angles for an imaginary ribbon tweeter with height 150 mm
This simulation indicates that at 8 kHz the response is 6 dB down at 10° below axis (magenta trace), while at about 16 kHz all it takes is 5° (blue trace). I determine the height of my listening window (sweet spot) as follows
H = 2·Z·tan(α)
which for α = 10° and Z = 3 m works out to 1.06 m (41 2/3"), while 5° yields a window height of about 0.52 m (20 2/3"). For this to work, it is imperative that the tweeter be positioned at ear height. Moreover, the more I slouch in my BarcaLounger, the more I will lose the higher frequencies of the spectrum... By comparison, in the horizontal direction I should be safe.
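For other angles or listening distances, the window height follows from the same formula. The helper name below is my own.

```python
import math

def window_height_m(alpha_deg, distance_m):
    """Height of the listening window H = 2 * Z * tan(alpha)."""
    return 2.0 * distance_m * math.tan(math.radians(alpha_deg))

for alpha in (10.0, 5.0):
    height = window_height_m(alpha, 3.0)
    print(f"alpha = {alpha:4.1f} deg at 3 m: {height:.2f} m ({height / 0.0254:.1f} in)")
```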
Now then, what does it mean to get it right? All I know at this point is that with the height of the chosen ribbon tweeter I don't see an advantage of an MTM configuration over an MT configuration. Mounting the tweeter as closely as possible to the midrange should be beneficial. Using a 4th order Linkwitz-Riley filter looks like a reasonable compromise between complexity, vertical off-axis response, and group delay, especially with a passive filter in mind. Keep in mind that all these conclusions stem from simulations which have been done assuming an infinite listening distance and zero phase drivers. In practice, I'd be sitting closer than infinity, which messes up time-alignment, and the individual transducers' phase is not always 0°, which further messes up off-axis response and notably crossovers.
It's all relative. What may be right for one set of priorities may be wrong for another set. Then what are the priorities? I know some of them—the ones I have made my design goals: 110 dB SPL at 3 m within xmax. Some of the priorities I have a vague idea about. For instance, what does a tweeter with a narrow vertical dispersion sound like in reality? Could it actually be beneficial that less sound bounces off the floor and ceiling before arriving at my ears and mixing with the direct sound? I will have to build it, listen to it, and learn to hear a difference between this and other implementations. Some priorities I simply have to learn more about, such as constant directivity horns or time-alignment and transient response, before I can even think about building anything in that direction.
It's a bootstrapping problem: Learn before making sawdust, except for those things that require sawdust before they can be learned.
later reflections (i.e. sound reflected by walls surrounding the listener). Moreover, for psychoacoustical reasons the earliest reflections should stimulate the ear opposite to the respective speaker: Sound from the right speaker is reflected at the left wall before entering the left ear, and vice-versa. If I understand this correctly, the purpose of this method is to optimize imaging (which could be achieved in an anechoic chamber) while maintaining a sense of ambiance (which cannot be portrayed by an anechoic [acoustically dead] chamber).