Scientific Publications

Real-time low-latency music source separation using hybrid spectrogram-tasnet

There have been significant advances in deep learning for music demixing in recent years. However, there has been little attention given to how these neural networks can be adapted for real-time low-latency applications, which could be helpful for hearing aids, remixing audio streams and live shows. In this paper, we investigate the various challenges involved in adapting current demixing models in the literature for this use case. Subsequently, inspired by the Hybrid Demucs architecture, we propose the Hybrid Spectrogram Time-domain Audio Separation Network (HS-TasNet), which utilises the advantages of spectral and waveform domains. For a latency of 23 ms, the HS-TasNet obtains an overall signal-to-distortion ratio (SDR) of 4.65 on the MusDB test set, and increases to 5.55 with additional training data. These results demonstrate the potential of efficient demixing for real-time low-latency music applications.
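
As a rough illustration of what a 23 ms latency budget means in samples, the sketch below converts a processing frame length into buffering latency. The 44.1 kHz sample rate and the 1024-sample frame are assumptions chosen for illustration; the abstract states neither.

```python
def frame_latency_ms(frame_samples: int, sample_rate_hz: float) -> float:
    """Time needed to buffer one frame of audio, in milliseconds."""
    return 1000.0 * frame_samples / sample_rate_hz


SAMPLE_RATE_HZ = 44_100  # assumption: not stated in the abstract
FRAME_SAMPLES = 1024     # hypothetical frame length, for illustration only

latency = frame_latency_ms(FRAME_SAMPLES, SAMPLE_RATE_HZ)
print(f"{latency:.1f} ms")  # 23.2 ms
```

Under these assumptions a single frame already consumes the whole latency budget, which hints at why models that rely on long context windows are hard to adapt to this use case.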

Authors: Satvik Venkatesh, Arthur Benilov, Philip Coleman, Frederic Roskam

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

April 2024

Requirements and Solutions for Audio Networking in Sound Reinforcement Systems

Market demands for new business models and for multifunctionality of venues with fully digitalized technical structures foster a fast development of fully networked AV and sound reinforcement systems. At the same time, the ProAV industry is experiencing a phase of disillusionment about the suitability and future-proofing of network solutions for professional applications. While most established media network solutions build on standard legacy Ethernet, it turns out that this technology does not meet the requirements of large-scale sound reinforcement systems regarding audio performance, usability, reliability, and scalability. Leading sound reinforcement manufacturers have tackled this fundamental problem and, since 2016, have developed a collaborative approach named MILAN, which is based on open, deterministic IEEE AVB/TSN network technology. The paper describes the user and market requirements that led to this decision and explains how MILAN and its underlying technology fulfil even the highest demands for audio performance, reliability, ease of use and scalability.

Authors: Henning Kaltheuner, Genio Kronauer, Morten Lave and Etienne Corteel

International Acoustics & Sound Reinforcement Conference 2024

January 2024

On the factors influencing groove fidelity in immersive live music events

Spatial audio is employed more and more often in large-scale live music events. In events of this kind, loudspeakers can be widely spaced apart, which may result in large time differences of arrival between certain sources. These timing differences may in turn affect the perceived rhythmic quality of music, or groove, as the synchronization between instruments is modified. This paper presents the results of a perceptual experiment that investigated how different factors, such as the nature of the instrument or the musical genre, impact the perceived groove modification resulting from sound propagation time differences. The results indicate that different instruments can show more or less sensitivity to time shifts, even in the same musical excerpt. Based on these findings, we derive mixing and sound system design guidelines that aim at preserving an optimal musical quality for the majority of the audience.

Authors: Thomas Mouterde, Nicolas Epain, Samuel Moulin, Etienne Corteel

AES 2024 International Acoustics & Sound Reinforcement Conference

January 2024

On the perception of time-alignment between full-range speakers and subwoofers for sound reinforcement

Most sound reinforcement systems employ a combination of full-range speakers and subwoofers to deliver a consistent sound pressure level over the audience, while maximizing the frequency bandwidth. A time alignment between the main (full-range) and sub (subwoofers) systems is generally required to ensure an efficient summation at low frequencies. This study investigates how time misalignments between the main and sub systems affect the perceived sound quality. We conducted a listening test whereby the listeners were asked to rate the sound quality as a function of the relative delay between main and sub systems. In addition, test participants were requested to qualify the nature of the perceived artifacts using spectral or temporal attributes. Our results suggest that the overall perceived quality does not decrease linearly with increasing delays, and that it reflects the presence of both spectral and temporal degradations. Lastly, temporal degradations are perceived more often when the sub system is delayed with respect to the main system, unlike spectral degradations for which the direction of the delay has very little influence.

Authors: Thomas Mouterde, Samuel Moulin, Nicolas Epain, Etienne Corteel

AES 2024 International Acoustics & Sound Reinforcement Conference

January 2024

Exploring perceptual annoyance and colouration assessment in active acoustic environments

In active acoustics, signals from microphones within a room are processed and fed to loudspeakers in the same room, creating an extended reverberation time and modified room perception. The system’s performance is limited by the audibility and acceptability of colouration at gains close to instability. Some listening tests have been presented in the literature to assess perceptual colouration, but thresholds for when the colouration becomes annoying or unacceptable have not previously been established. In this paper, we revisit the prediction of the gain before instability and show how this can be used to equalize an active acoustics system. Then, we present new listening tests where listeners were asked to rate the audibility and annoyance of changes introduced by 8-channel active acoustics systems in two rooms at various simulated gains. We show that the annoyance depends on the initial room acoustics as well as the loop gain; perceptual thresholds for slightly annoying degradation varied from −5.4 dB to −8.5 dB, relative to instability. These thresholds are discussed in the context of objective measurements calculated from the impulse responses. The resonance perception is linked to the gain where the reverberation time starts to grow much more quickly in some frequency bands than others. It is also shown to be well predicted by the standard deviation of the magnitude response, with a value of 0.62 corresponding to slightly annoying degradation.

Authors: Philip Coleman, Nicolas Epain, Satvik Venkatesh, Frédéric Roskam

AES 2024 International Acoustics & Sound Reinforcement Conference

January 2024

The L-Acoustics Education platform, an online tool for blended learning in live sound

L-Acoustics is a French audio brand known for having introduced several disruptive technologies in the live sound sector, such as full-range line sources in 1992 or, more recently, large-scale immersive audio. To accompany end-users in mastering these new tools, training and education have always been a core foundation of the company. In addition to a program that is now clearly designed from the perspective of vocational education, the L-Acoustics education team has fully embraced the company DNA of innovation and has launched an online platform to support its blended learning strategy. This interactive tool is the focus of this text, which describes how it augments the learner experience in both instructor-led sessions and autonomous learning activities, with learning quizzes, videos, tutorials, certification tests and an online space for the learning community to exchange.

Authors: François Montignies, Etienne Corteel, Lucile Diemert, Thomas Mouterde

10th Convention of the European Acoustics Association

September 2023

The L-Acoustics program for vocational education in loudspeaker system and immersive audio

L-Acoustics has introduced several disruptive technologies in the live sound industry, such as full-range line sources in 1992 or, more recently, large-scale immersive audio. To accompany end-users in mastering these new tools, training and education have always been a core foundation of the company. This paper explains the educational challenges associated with the live sound sector and how L-Acoustics has decided to overcome them. The development of a program targeted at end-users, with structured content and elaborated methodologies, has allowed the evolution from traditional product training to vocational education. The different job profiles, with their associated learning objectives and courses, support the development of live sound professionals. In addition, the program proves to be a good complement to initial education curriculums that prepare students for the multiple job opportunities in the live sound industry.

Authors: François Montignies, Etienne Corteel, and Katerina Panagopoulou

International Conference on Audio Education of the Audio Engineering Society

September 2023

On the perception of musical groove in large-scale events with immersive sound

Immersive audio is increasingly used in large-scale live music events. The dimensions of the audience area impose that propagation times from several loudspeakers to a given audience position can be significantly different. This may be perceived by listeners as a loss of time synchronization between sound sources, which in turn affects the perception of musical groove. In this paper, we first investigate the range of propagation time differences that can occur with large-scale loudspeaker deployments. The results of a listening test confirm that time differences may degrade the rhythmic characteristics. The degradations may depend on the musical content but not on the spatialization. Mixing guidelines and methodologies are finally proposed to overcome the potential issues.
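
The propagation time differences mentioned above follow directly from path-length differences. The sketch below is a minimal free-field estimate; the speed of sound (about 343 m/s in air at 20 °C) and the listener distances are illustrative assumptions, not figures taken from the paper.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at 20 °C


def arrival_time_difference_ms(near_distance_m: float,
                               far_distance_m: float,
                               speed_of_sound_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Delay of the farther loudspeaker relative to the nearer one, in ms."""
    return 1000.0 * (far_distance_m - near_distance_m) / speed_of_sound_m_s


# Hypothetical listener: 10 m from one loudspeaker, 40 m from another.
delta = arrival_time_difference_ms(10.0, 40.0)
print(f"{delta:.1f} ms")  # 87.5 ms
```

A 30 m path difference therefore shifts one source by close to 90 ms relative to the other at that seat, which is the kind of offset whose effect on groove the listening test evaluates.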

Authors: Thomas Mouterde, Nicolas Epain, Samuel Moulin, and Etienne Corteel

International Conference on Spatial and Immersive Audio of the Audio Engineering Society

August 2023

Simulating low frequency noise pollution using the parabolic equations in sound reinforcement loudspeaker systems

Sound system designers are used to optimizing loudspeaker systems for the audience experience with free-field simulation software. However, noise pollution reduction must also be considered during the design phase and the propagation of sound may be affected by inhomogeneous atmospheric conditions, such as wind, temperature gradients, and ground impedance. This paper proposes a method to simulate the impact of the environment on sound pressure levels at large distances created by loudspeaker systems using parabolic equations, considering a reference left-right main system associated with either flown or ground-stacked subwoofers. Results show a higher variability of the sound pressure level with systems using ground-stacked subwoofers. The influence of the crossover frequency between main and subwoofers is discussed in this paper.

Authors: Mouterde, Thomas; Perrot, Joris; Lihoreau, Bertrand; Corteel, Etienne

153rd Convention of the Audio Engineering Society

October 2022

Spectral and spatial perceptions of comb-filtering for sound reinforcement applications

Most sound reinforcement systems consist of multiple loudspeaker systems arranged strategically to cover the entire audience area. This study investigates the spectral and spatial perceptions of interferences that can be experienced in the shared coverage area between two full-range loudspeakers. A listening test was conducted to determine the effect of lag source delay, relative level, and angular separation on the perception of spectral coloration and spatial impressions (width, localization shift, image separation). The results show that spectral coloration is considerably reduced when sources are spatially separated, even with a small azimuth angle (10°). It was also found that coloration audibility depends on the interaction between the audio track and the delay introduced. Finally, the type of perceived spatial degradation depends mainly on the spatial separation and on the relative level of the source arriving later in time (lag source).
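
For orientation, the comb filter produced by a lead source plus a delayed, attenuated lag source can be sketched as a simple two-path sum. This is a textbook single-point model assumed here for illustration; it ignores the angular separation that the study finds to be important, and the delay and level values are hypothetical.

```python
import cmath
import math


def notch_frequencies_hz(delay_s: float, count: int) -> list[float]:
    """First `count` notch frequencies of a two-path sum: (2n + 1) / (2 * delay)."""
    return [(2 * n + 1) / (2.0 * delay_s) for n in range(count)]


def comb_magnitude_db(freq_hz: float, delay_s: float, lag_level_db: float) -> float:
    """Magnitude of lead + lag, where the lag is delayed and scaled by lag_level_db."""
    lag_gain = 10.0 ** (lag_level_db / 20.0)
    response = 1.0 + lag_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    # Floor the magnitude so a perfect cancellation does not hit log10(0).
    return 20.0 * math.log10(max(abs(response), 1e-12))


DELAY_S = 0.001      # hypothetical 1 ms lag
LAG_LEVEL_DB = -6.0  # hypothetical lag level relative to the lead

notches = notch_frequencies_hz(DELAY_S, 3)
notch_depth = comb_magnitude_db(notches[0], DELAY_S, LAG_LEVEL_DB)
peak_height = comb_magnitude_db(1.0 / DELAY_S, DELAY_S, LAG_LEVEL_DB)
print(notches, round(notch_depth, 1), round(peak_height, 1))
```

With these values the notches fall near 500, 1500 and 2500 Hz, and lowering the lag level makes both the notches and the peaks shallower, which is one way to read the study's relative-level factor.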

Authors: Moulin, Samuel & Corteel, Etienne

AES Spring Convention 2022

May 2022

Spatial Rendering over Distributed Fills Systems in Immersive Live Sound Reproduction

Fills systems in live sound reproduction aim at providing coverage in audience areas that can’t be addressed by the main system. In areas close to the audience, such as first rows close to the audience or under balconies, distributed systems should be used since a single source cannot provide coverage for the entire area. These systems are usually fed with a mono downmix providing no spatial information. In this paper, multiple spatial rendering algorithms for distributed fills are investigated. A framework is proposed to evaluate the performance of these algorithms in terms of spatial unmasking and audio-visual consistency. The spatial fills algorithm performs well on these two perceptual dimensions while ensuring coverage and alignment with the main system independently of the positioning of audio objects. Other algorithms tend to favor one or the other dimension or may fail to ensure coverage for every audio object position.

Authors: Corteel, E.; Moulin, S.; Roskam, F.

Reproduced Sound 2021, Proceedings of the Institute of Acoustics, Vol. 43. Pt. 2. 2021


On the comparison of flown and ground-stacked subwoofer configurations regarding noise pollution

In addition to audience experience and hearing health concerns, noise pollution issues are increasingly considered in large-scale sound reinforcement for outdoor events. Among other factors, subwoofer positioning relative to the main system influences sound pressure levels at large distances, which may be considered as noise pollution. In this paper, free-field simulations are first performed, showing that subwoofer positioning affects rear and side rejection but has a limited impact on noise level in front of the system. Then, the impact of wind on sound propagation at low frequencies is investigated. Simulation results show that wind affects ground-stacked subwoofers more than flown subwoofers, leading to higher sound levels downwind in the case of ground-stacked subwoofers. Reference: T. Mouterde and E. Corteel, "On the comparison of flown and ground-stacked subwoofer configurations regarding noise pollution," Paper 10533, (2021 October).

Authors: Mouterde, Thomas & Corteel, Etienne

AES Fall Show 2021

October 2021

3D audio for Live Sound

3D audio has only recently started to penetrate the live sound industry. This is due to specificities of live sound, especially touring, where a performance may be delivered in vastly diverse environments and address audiences of up to tens of thousands. Accommodating these challenges to offer consistent high-quality results requires specific technologies and practices. This chapter describes the specific constraints of live sound that lead to a selection of algorithms that are robust enough to adapt to various scales and loudspeaker layouts. A framework is proposed for evaluating the performance of panning algorithms in the live sound context. The importance of the design of the loudspeaker system is outlined. A methodology and evaluation criteria are proposed to successfully scale 3D sound to large audiences. The creation of virtual environments in large-scale venues and their portability from venue to venue create specific constraints that are described and addressed. Finally, key applications and references are presented. Reference: Etienne Corteel, Guillaume Le Nost, Frédéric Roskam, 3D Audio for Live Sound, in 3D Audio (1st edition), Routledge, 2021.

July 2021

Demystifying the effects of loudspeaker cables

Nowadays, professional sound installations such as sport facilities, theme parks and entertainment venues require not only a public address system based on 100 V / 70 V line speakers, but also sound quality, high sound pressure level and improved frequency bandwidth, which results in the specification of low-impedance systems. As cable lengths may sometimes go beyond a hundred meters, this can result in high power losses when driving a full-spectrum amplified signal. This paper presents an electric model of speaker cables that includes two electromagnetic phenomena, skin effect and inductive effect. The following parameters have been investigated: cable lengths up to two hundred meters and cable gauges from 4 mm² to 10 mm², in relation to various speakers with different impedance loads. An in-depth interpretation then follows of the cable's impact on the audio signal when connected between an amplifier and a loudspeaker voice coil. The simulations of power loss are presented along with an acoustic measurement highlighting the accuracy of the model. This powerful tool may be an asset for sound designers and integrators to predict the impact of long cable runs on loudspeaker output. Reference: Bertin, Nicolas & Montignies, Francois. (2015). Demystifying the effects of loudspeaker cables. Institute of Acoustics, 2015.
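
To give a sense of scale for the losses discussed above, the sketch below estimates the level drop caused by cable resistance alone. It is a deliberately simplified DC model: it ignores the skin effect and inductive effect that the paper's model includes, treats the loudspeaker as a purely resistive load, and uses an approximate copper resistivity. The cable length, gauge and load are hypothetical example values.

```python
import math

COPPER_RESISTIVITY_OHM_M = 1.72e-8  # approximate value for copper at 20 °C


def cable_resistance_ohm(run_length_m: float, cross_section_mm2: float) -> float:
    """DC resistance of a two-conductor cable (out and back) of the given run length."""
    area_m2 = cross_section_mm2 * 1e-6
    return COPPER_RESISTIVITY_OHM_M * 2.0 * run_length_m / area_m2


def level_loss_db(load_ohm: float, cable_ohm: float) -> float:
    """Voltage-divider loss at the loudspeaker terminals, in dB (negative = loss)."""
    return 20.0 * math.log10(load_ohm / (load_ohm + cable_ohm))


# Hypothetical case: 100 m run of 4 mm² cable into a nominal 8-ohm load.
resistance = cable_resistance_ohm(100.0, 4.0)
loss = level_loss_db(8.0, resistance)
print(f"{resistance:.2f} ohm, {loss:.2f} dB")  # 0.86 ohm, -0.89 dB
```

Because a real loudspeaker impedance varies with frequency, and skin and inductive effects grow with frequency, the actual loss is frequency dependent; that dependence is what the paper's fuller model addresses.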

Authors: Bertin, Nicolas & Montignies, Francois

November 2015

Audience effect on the response of a loudspeaker system in the low frequency range, part 1: magnitude

The response of a loudspeaker system is affected by the presence of the audience. However, loudspeaker system tuning is performed without an audience, applying equalization filters and delays for time alignment of system components. The validity of these decisions with an audience is of primary importance. In this paper, the magnitude response of a loudspeaker system is simulated at low frequencies using the Finite Element Method over a flat listening area for multiple source heights and audience densities. The results show that the audience modifies notches due to the floor reflection for a flown source and creates a build-up associated with a low-pass behavior for ground-stacked sources. The implications for typical loudspeaker system configurations are presented and discussed.

Authors: Mouterde, Thomas; Corteel, Etienne; Melon, Manuel

AES Convention Paper #10398

Article presented at the 149th Convention 2020 October 22, Online

Non-linear acoustic losses prediction in vented loudspeaker using computational fluid dynamic simulation

Bass-reflex designs can exhibit strong non-linear behavior around their resonant frequency, with significant acoustic losses and parasitic noise emission. These phenomena are mainly due to turbulence and flow separation at the port’s inlet and outlet. This work proposes a method to predict the resulting non-linear acoustic losses for a given loudspeaker, enclosure volume and port geometry. The approach consists of coupling computational fluid dynamics (CFD) simulation with non-linear modelling of the loudspeaker motion. Four different port geometries mounted on one given loudspeaker enclosure are tested. The computed acoustic losses are compared with measurements and show good agreement. The results show that the proposed method can predict non-linear losses with an average error of less than 1 dB around the Helmholtz frequency.
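
For reference, the Helmholtz frequency mentioned above can be estimated with the standard linear lumped-element formula. This sketch is not the paper's CFD-based method and says nothing about non-linear losses; the enclosure volume, port area and effective port length (physical length plus end corrections) are hypothetical example values.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at 20 °C


def helmholtz_frequency_hz(enclosure_volume_m3: float,
                           port_area_m2: float,
                           effective_port_length_m: float,
                           speed_of_sound_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Linear estimate: f = c / (2 * pi) * sqrt(S / (V * L_eff))."""
    ratio = port_area_m2 / (enclosure_volume_m3 * effective_port_length_m)
    return speed_of_sound_m_s / (2.0 * math.pi) * math.sqrt(ratio)


# Hypothetical enclosure: 50 L volume, 50 cm² port, 15 cm effective port length.
tuning = helmholtz_frequency_hz(0.050, 0.005, 0.15)
print(f"{tuning:.1f} Hz")  # 44.6 Hz
```

Air velocity in the port is highest around this tuning frequency, which is consistent with the abstract locating the strongest non-linear behavior there.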

Authors: Pene, Yves; Horyn, Yoachim; Combet, Christophe

AES Convention Paper #10359

Article presented at the 148th Convention 2020 June 1-5, Online

Optimum measurement locations for large-scale loudspeaker system tuning based on first-order reflections analysis

This paper investigates how first-order reflections impact the response of sound reinforcement systems over large audiences. In the field, only a few acoustical measurements can be performed to drive tuning decisions. The challenge is then to select the right measurement locations so that they provide an accurate representation of the loudspeaker system response. Simulations of each first-order reflection (e.g., floor or side wall reflection) are performed to characterize the average frequency response and its variability over the target audience area. Then, the representativeness of measurements performed at a reduced number of locations is investigated. Results indicate that a subset of eight measurement locations spread over the target audience area represents a rational solution to characterize the loudspeaker system response.

Authors: Moulin, Samuel; Corteel, Etienne; Montignies, François

AES Convention Paper #10234

Article presented at the 147th Convention 2019 October 16–19, New York, USA

On the efficiency of flown vs. ground stacked subwoofer configurations

Modern live loudspeaker systems consist of broadband sources, often using variable curvature line sources, combined with subwoofers. While it is common practice to fly the broadband sources to improve energy distribution in the audience, most subwoofer configurations remain ground-stacked because of practical constraints and alleged efficiency loss of flown configurations. This article aims at evaluating the efficiency of flown subwoofers for large audiences as compared to their ground-stacked counterparts. We use finite element simulations to determine the influence of several factors: baffling effect, trim height. We show that flown configurations remain efficient at the back of the venue while reducing the SPL excess at the front of the audience.

Authors: Corteel, Etienne; Coste Dombre, Hugo; Combet, Christophe; Horyn, Yoachim; Montignies, François

AES Convention Paper #10051

Article presented at the 145th Convention 2018 October 17–20, New York, NY, USA

Large scale open air sound reinforcement in extreme atmospheric conditions

Extreme atmospheric conditions have a profound effect on sound propagation. This paper presents two installations where this problem must be accounted for: the main stage of the Coachella Valley Music and Arts Festival and the Hollywood Bowl. The approach presented here combines an optimized sound system design with signal processing for partial compensation of the remaining loss in selected areas.

Authors: Corteel, Etienne; Sugden, Scott; Montignies, François

AES Convention Paper #P2.3

Article presented at the AES International Conference on Sound Reinforcement – Open Air Venues (August 2017)

The Distributed Edge Dipole (DED) model for cabinet diffraction effects

A simple model is proposed to account for the effects of cabinet edge diffraction on the radiated sound field for direct-radiating loudspeaker components when mounted in an enclosure. The proposed approach is termed the Distributed Edge Dipole (DED) model since it is developed based on the Kirchhoff Approximation (KA) using distributed dipoles with their axes perpendicular to the baffle edge as the elementary diffractive sources. The DED model is first tested against measurements for a thin circular baffle and is then applied to a real-world loudspeaker that has a thick, rectangular baffle. The forward sound pressure level and the entire angular domain are investigated, and predictions of the DED model show good agreement with experimental measurements.

Authors: Urban, Marcel; Heil, Christian; Pignon, C.; Combet, C.; Bauman, P.

AES Journal, Vol. 52, n°10 - 2004 October

October 15, 2004