SONICOM: Transforming Auditory-Based Social Interaction in AR/VR
Partners: Imperial College London (Coordinator), Sorbonne Université, Austrian Academy of Sciences, University of Milan, National and Kapodistrian University of Athens, University of Glasgow, University of Málaga, Dreamwaves (Austria), Reactify (UK), USound (Austria).
Programme: Horizon 2020
Call: FETPROACT-EIC-07-2020
Grant Agreement No.: 101017743
Duration: January 2021 – June 2026
Principal Investigator at UMA: Arcadio Reyes-Lecuona
Research team at UMA: Luis Molina-Tanco, María Cuevas-Rodríguez, Daniel González-Toledo, Pablo Gutiérrez-Parera
Overview
SONICOM aims to redefine auditory-based social interactions in Augmented Reality (AR) and Virtual Reality (VR) environments. The project leverages Artificial Intelligence (AI) to develop personalized, realistic immersive audio technologies. Beyond auditory perception, SONICOM investigates the behavioral, cognitive, and physiological effects of spatial audio, advancing how we experience sound in both virtual and real-world settings.
Key Objectives
- Personalized Auditory Experiences: Develop sound models tailored to individual users.
- Enhanced AR/VR Communication: Study how spatial audio affects collaboration and interaction.
- Reproducible Research Tools: Provide open-source frameworks for auditory and psychoacoustic science.
- Behavioral Insights: Investigate how immersive audio influences cognition, behavior, and perception.
UMA’s Role
At the University of Málaga, the team led by Prof. Arcadio Reyes-Lecuona is at the forefront of developing the Binaural Rendering Toolbox (BRT). This open-source library performs real-time spatial audio rendering based on Head-Related Transfer Functions (HRTFs) and Binaural Room Impulse Responses (BRIRs). The BRT is central to psychoacoustic research and immersive AR/VR applications, enabling precise and customizable auditory experiences.
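To illustrate the core idea behind such rendering, the sketch below converts a mono source to binaural by convolving it with a pair of head-related impulse responses (HRIRs, the time-domain counterpart of HRTFs). This is a minimal conceptual example under simplifying assumptions, not the BRT API: all names are illustrative, and a real renderer adds HRIR interpolation, crossfading, distance simulation, and room rendering via BRIRs.

// Minimal sketch of HRTF-based binaural rendering: convolve a mono
// source with the left/right head-related impulse responses (HRIRs)
// selected for the source direction. Illustrative only; not the BRT API.
#include <cstddef>
#include <vector>

// Direct-form FIR convolution of a signal block with one HRIR channel.
std::vector<float> convolve(const std::vector<float>& x,
                            const std::vector<float>& h)
{
    std::vector<float> y(x.size() + h.size() - 1, 0.0f);
    for (std::size_t n = 0; n < x.size(); ++n)
        for (std::size_t k = 0; k < h.size(); ++k)
            y[n + k] += x[n] * h[k];
    return y;
}

struct BinauralBlock { std::vector<float> left, right; };

// Render one block of a mono source through an HRIR pair. A real
// renderer would interpolate HRIRs as the head moves and crossfade
// between blocks; both are omitted here for brevity.
BinauralBlock renderBinaural(const std::vector<float>& mono,
                             const std::vector<float>& hrirLeft,
                             const std::vector<float>& hrirRight)
{
    return { convolve(mono, hrirLeft), convolve(mono, hrirRight) };
}

int main()
{
    std::vector<float> mono(256, 0.0f);
    mono[0] = 1.0f;                               // unit impulse input
    std::vector<float> hl = {1.0f, 0.5f, 0.25f};  // toy left HRIR
    std::vector<float> hr = {0.8f, 0.4f, 0.2f};   // toy right HRIR
    BinauralBlock out = renderBinaural(mono, hl, hr);
    return out.left.size() == out.right.size() ? 0 : 1;
}

Convolving with a measured BRIR instead of an anechoic HRIR follows the same pattern, with the room response of the measured space included in the impulse response.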
Publications by UMA
González-Toledo D; Cuevas-Rodríguez M; Vicente T; Picinali L; Molina-Tanco L; Reyes-Lecuona A
Spatial release from masking in the median plane with non-native speakers using individual and mannequin head related transfer functions Journal Article
In: The Journal of the Acoustical Society of America, vol. 155, no. 1, pp. 284-293, 2024, ISSN: 0001-4966.
@article{10.1121/10.0024239,
title = {Spatial release from masking in the median plane with non-native speakers using individual and mannequin head related transfer functions},
author = {Daniel González-Toledo and María Cuevas-Rodríguez and Thibault Vicente and Lorenzo Picinali and Luis Molina-Tanco and Arcadio Reyes-Lecuona},
url = {https://doi.org/10.1121/10.0024239},
doi = {10.1121/10.0024239},
issn = {0001-4966},
year = {2024},
date = {2024-01-01},
urldate = {2024-01-01},
journal = {The Journal of the Acoustical Society of America},
volume = {155},
number = {1},
pages = {284-293},
abstract = {Spatial release from masking (SRM) in speech-on-speech tasks has been widely studied in the horizontal plane, where interaural cues play a fundamental role. Several studies have also observed SRM for sources located in the median plane, where (monaural) spectral cues are more important. However, a relatively unexplored research question concerns the impact of head-related transfer function (HRTF) personalisation on SRM, for example, whether using individually-measured HRTFs results in better performance if compared with the use of mannequin HRTFs. This study compares SRM in the median plane in a speech-on-speech virtual task rendered using both individual and mannequin HRTFs. SRM is obtained using English sentences with non-native English speakers. Our participants show lower SRM performances compared to those found by others using native English participants. Furthermore, SRM is significantly larger when the source is spatialised using the individual HRTF, and this effect is more marked for those with lower English proficiency. Further analyses using a spectral distortion metric and the estimation of the better-ear effect, show that the observed SRM can only partially be explained by HRTF-specific factors and that the effect of the familiarity with individual spatial cues is likely to be the most significant element driving these results.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
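For context on the metric reported above: spatial release from masking (SRM) is conventionally quantified as the improvement in speech reception threshold (SRT) obtained when target and maskers are spatially separated rather than co-located. The following LaTeX sketch states this standard definition; the paper's exact experimental procedure may differ in detail.

% Standard definition of spatial release from masking (SRM).
% SRT is the signal-to-noise ratio at which a listener reaches a
% criterion level of speech intelligibility; lower SRTs are better,
% so a positive SRM indicates a benefit from spatial separation.
\[
\mathrm{SRM} = \mathrm{SRT}_{\text{co-located}} - \mathrm{SRT}_{\text{separated}}
\]
% Example: an SRT of -2 dB co-located and -8 dB separated
% gives SRM = 6 dB.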
Reyes-Lecuona A; Cuevas-Rodríguez M; González-Toledo D; Molina-Tanco L; Poirier-Quinot D; Picinali L
Hearing loss and hearing aid simulations for accessible user experience Proceedings Article
In: Proceedings of the XXIII International Conference on Human Computer Interaction, Association for Computing Machinery, Lleida, Spain, 2024, ISBN: 9798400707902.
@inproceedings{10.1145/3612783.3612816,
title = {Hearing loss and hearing aid simulations for accessible user experience},
author = {Arcadio Reyes-Lecuona and María Cuevas-Rodríguez and Daniel González-Toledo and Luis Molina-Tanco and David Poirier-Quinot and Lorenzo Picinali},
url = {https://doi.org/10.1145/3612783.3612816},
doi = {10.1145/3612783.3612816},
isbn = {9798400707902},
year = {2024},
date = {2024-01-01},
urldate = {2024-01-01},
booktitle = {Proceedings of the XXIII International Conference on Human Computer Interaction},
publisher = {Association for Computing Machinery},
address = {Lleida, Spain},
series = {Interaccion '23},
abstract = {This paper presents an open-source real-time hearing loss and hearing aids simulator implemented within the 3D Tune-In Toolkit C++ library. These simulators provide a valuable tool for improving auditory accessibility, promoting inclusivity, and fostering new research. The hearing loss simulator accurately simulates various types and levels of hearing loss, while the hearing aid simulator replicates different hearing aid technologies, allowing for the simulation of real-world hearing aid experiences. Both simulators are implemented to work in real-time, allowing for immediate feedback and adjustment during testing and development. As an open-source tool, the simulators can be customised and modified to meet specific needs, and the scientific community can collaborate and improve upon the algorithms. The technical details of the simulators and their implementation in the C++ library are presented, and the potential applications of the simulators are discussed, showing that they can be used as a valuable support software for UX designers to ensure the accessibility of their products to individuals with hearing impairment. Moreover, these simulators can be used to raise awareness about auditory accessibility issues. Overall, this paper also aims to provide some insight into the development and implementation of accessible technology for individuals with hearing impairments.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
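As a rough illustration of one ingredient of such a simulator, the sketch below attenuates each frequency band of a signal according to the hearing loss specified in an audiogram (in dB HL). This is a deliberate simplification and an assumption on our part; the simulator described in the paper also models non-linear effects, and none of the names below come from the 3D Tune-In Toolkit API.

// Hedged sketch: per-band attenuation driven by an audiogram.
// Assumes the signal has already been split into one time-domain
// sub-band per audiogram frequency (250, 500, 1000, 2000, 4000,
// 8000 Hz), e.g. by a filter bank implemented elsewhere.
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Convert a hearing level in dB HL to a linear attenuation factor.
float lossToGain(float dbHL) { return std::pow(10.0f, -dbHL / 20.0f); }

// Attenuate each sub-band by the loss measured at its audiogram frequency.
void attenuateBands(std::vector<std::vector<float>>& bands,
                    const std::array<float, 6>& audiogramDbHL)
{
    for (std::size_t b = 0; b < bands.size() && b < audiogramDbHL.size(); ++b) {
        const float g = lossToGain(audiogramDbHL[b]);
        for (float& sample : bands[b]) sample *= g;
    }
}

int main()
{
    // Six toy sub-bands of constant samples, and a sloping
    // high-frequency loss (values in dB HL).
    std::vector<std::vector<float>> bands(6, std::vector<float>(3, 1.0f));
    attenuateBands(bands, {0.0f, 10.0f, 20.0f, 40.0f, 60.0f, 70.0f});
    return 0;
}

A real simulator applies these gains inside a multi-band filter bank and adds non-linear processing (e.g. loudness recruitment); this sketch only shows the audiogram-to-gain mapping.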
Cuevas-Rodríguez M; González-Toledo D; Gutiérrez-Parera P; Reyes-Lecuona A
An experiment replication on dynamic 3D sound localisation using auditory Virtual Reality Proceedings Article
In: Proceedings of the 10th Convention of the European Acoustics Association, 2023.
@inproceedings{ForumAcusticum2023Experiment,
title = {An experiment replication on dynamic 3D sound localisation using auditory Virtual Reality},
author = {María Cuevas-Rodríguez and Daniel González-Toledo and Pablo Gutiérrez-Parera and Arcadio Reyes-Lecuona},
url = {https://dael.euracoustics.org/confs/landing_pages/fa2023/000744.html},
year = {2023},
date = {2023-09-01},
urldate = {2023-09-01},
booktitle = {Proceedings of the 10th Convention of the European Acoustics Association},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
González-Toledo D; Molina-Tanco L; Cuevas-Rodríguez M; Majdak P; Reyes-Lecuona A
The Binaural Rendering Toolbox. A Virtual Laboratory for Reproducible Research in Psychoacoustics Proceedings Article
In: Proceedings of the 10th Convention of the European Acoustics Association, 2023.
@inproceedings{ForumAcusticum2023BRT,
title = {The Binaural Rendering Toolbox. A Virtual Laboratory for Reproducible Research in Psychoacoustics},
author = {Daniel González-Toledo and Luis Molina-Tanco and María Cuevas-Rodríguez and Piotr Majdak and Arcadio Reyes-Lecuona},
url = {https://dael.euracoustics.org/confs/landing_pages/fa2023/001042.html},
year = {2023},
date = {2023-09-01},
urldate = {2023-09-01},
booktitle = {Proceedings of the 10th Convention of the European Acoustics Association},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Picinali L; Katz B F; Geronazzo M; Majdak P; Reyes-Lecuona A; Vinciarelli A
The SONICOM Project: Artificial Intelligence-Driven Immersive Audio, From Personalization to Modeling [Applications Corner] Journal Article
In: IEEE Signal Processing Magazine, vol. 39, no. 6, pp. 85-88, 2022.
@article{9931551,
title = {The SONICOM Project: Artificial Intelligence-Driven Immersive Audio, From Personalization to Modeling [Applications Corner]},
author = {Lorenzo Picinali and Brian FG Katz and Michele Geronazzo and Piotr Majdak and Arcadio Reyes-Lecuona and Alessandro Vinciarelli},
url = {https://ieeexplore.ieee.org/document/9931551},
doi = {10.1109/MSP.2022.3182929},
year = {2022},
date = {2022-01-01},
journal = {IEEE Signal Processing Magazine},
volume = {39},
number = {6},
pages = {85-88},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Arrebola F; González-Toledo D; García-Jiménez P; Molina-Tanco L; Cuevas-Rodríguez M; Reyes-Lecuona A
Simulación en tiempo real de las reflexiones tempranas mediante el método de las imágenes [Real-time simulation of early reflections using the image source method] Proceedings Article
In: 53 Congreso Español de Acústica - TECNIACUSTICA 2022, 2022.
@inproceedings{arrebola2022simulacion,
title = {Simulación en tiempo real de las reflexiones tempranas mediante el método de las imágenes},
author = {F. Arrebola and D. González-Toledo and P. García-Jiménez and L. Molina-Tanco and M. Cuevas-Rodríguez and A. Reyes-Lecuona},
url = {https://documentacion.sea-acustica.es/publicaciones/Elche22/ID-99.pdf},
year = {2022},
date = {2022-01-01},
urldate = {2022-01-01},
booktitle = {53 Congreso Español de Acústica - TECNIACUSTICA 2022},
abstract = {An open-source tool is presented for simulating the early reflections of a room using the image source method. The reflection order and the maximum distance of the image sources are user-controllable, and the room geometry and the absorption profiles of the walls are also configurable. The tool is an extension of the 3D Tune-In Toolkit, to which propagation-delay simulation and an implementation of the image source method have been added. This reverberation simulation can be combined with other renderers through a hybrid approach.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
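To make the technique concrete: in a rectangular (shoebox) room, each first-order image source is the real source mirrored across one wall, and it contributes a delayed, attenuated copy of the source signal. The sketch below computes first-order image positions and their propagation delays under that shoebox assumption; the names are illustrative, and the extension described above additionally handles configurable geometry, absorption profiles, and higher reflection orders.

// Hedged sketch of the image source method, first order only, for a
// shoebox room spanning [0, Lx] x [0, Ly] x [0, Lz]. Illustrative names;
// not the 3D Tune-In Toolkit implementation.
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

// Mirror the source across each of the six walls.
std::vector<Vec3> firstOrderImages(const Vec3& src, const Vec3& room)
{
    return {
        {            -src.x,              src.y,              src.z }, // x = 0
        { 2 * room.x - src.x,             src.y,              src.z }, // x = Lx
        {             src.x,             -src.y,              src.z }, // y = 0
        {             src.x,  2 * room.y - src.y,             src.z }, // y = Ly
        {             src.x,              src.y,             -src.z }, // z = 0
        {             src.x,              src.y,  2 * room.z - src.z }, // z = Lz
    };
}

// Propagation delay of one image source at the listener, in samples.
float delaySamples(const Vec3& img, const Vec3& listener,
                   float sampleRate, float speedOfSound = 343.0f)
{
    const float dx = img.x - listener.x;
    const float dy = img.y - listener.y;
    const float dz = img.z - listener.z;
    const float dist = std::sqrt(dx * dx + dy * dy + dz * dz);
    return dist / speedOfSound * sampleRate;
}

int main()
{
    const Vec3 room{5.0f, 4.0f, 3.0f};      // room dimensions in metres
    const Vec3 src{1.0f, 2.0f, 1.5f};
    const Vec3 listener{3.0f, 2.0f, 1.5f};
    const auto images = firstOrderImages(src, room);
    return delaySamples(images[0], listener, 48000.0f) > 0.0f ? 0 : 1;
}

Each image's signal would additionally be attenuated by the absorption of the wall it reflects from, which is where the configurable absorption profiles mentioned in the abstract come in.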
Cuevas-Rodríguez M; González-Toledo D; Reyes-Lecuona A; Picinali L
Impact of non-individualised head related transfer functions on speech-in-noise performances within a synthesised virtual environment Journal Article
In: The Journal of the Acoustical Society of America, vol. 149, no. 4, pp. 2573–2586, 2021, ISSN: 0001-4966.
@article{Imp/2021,
title = {Impact of non-individualised head related transfer functions on speech-in-noise performances within a synthesised virtual environment},
author = {María Cuevas-Rodríguez and Daniel González-Toledo and Arcadio Reyes-Lecuona and Lorenzo Picinali},
url = {https://asa.scitation.org/doi/10.1121/10.0004220},
doi = {10.1121/10.0004220},
issn = {0001-4966},
year = {2021},
date = {2021-04-01},
urldate = {2021-04-01},
journal = {The Journal of the Acoustical Society of America},
volume = {149},
number = {4},
pages = {2573–2586},
abstract = {When performing binaural spatialisation, it is widely accepted that the choice of the head related transfer functions (HRTFs), and in particular the use of individually measured ones, can have an impact on localisation accuracy, externalization, and overall realism. Yet the impact of HRTF choices on speech-in-noise performances in cocktail party-like scenarios has not been investigated in depth. This paper introduces a study where 22 participants were presented with a frontal speech target and two lateral maskers, spatialised using a set of non-individual HRTFs. Speech reception threshold (SRT) was measured for each HRTF. Furthermore, using the SRT predicted by an existing speech perception model, the measured values were compensated in the attempt to remove overall HRTF-specific benefits. Results show significant overall differences among the SRTs measured using different HRTFs, consistently with the results predicted by the model. Individual differences between participants related to their SRT performances using different HRTFs could also be found, but their significance was reduced after the compensation. The implications of these findings are relevant to several research areas related to spatial hearing and speech perception, suggesting that when testing speech-in-noise performances within binaurally rendered virtual environments, the choice of the HRTF for each individual should be carefully considered.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Reyes-Lecuona A; Márquez-Moncada A; Bottcher H L; González-Toledo D; Cuevas-Rodríguez M; Molina-Tanco L
Binaural Audio and Rotation Gain in Virtual Environments Journal Article
In: Interaccion – Revista Digital de AIPO, vol. 2, iss. 2, pp. 54-62, 2021.
@article{Reyes-Lecuona2021c,
title = {Binaural Audio and Rotation Gain in Virtual Environments},
author = {Arcadio Reyes-Lecuona and Ana Márquez-Moncada and Hauke Luis Bottcher and Daniel González-Toledo and María Cuevas-Rodríguez and Luis Molina-Tanco},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
journal = {Interaccion - Revista Digital de AIPO},
volume = {2},
issue = {2},
pages = {54-62},
keywords = {},
pubstate = {published},
tppubtype = {article}
}