Abstract

There is some uncertainty as to whether objective metrics for predicting the perceived quality of audio source separation are sufficiently accurate. This issue was investigated by employing a revised experimental methodology to collect subjective ratings of sound quality and interference of singing-voice recordings that have been extracted from musical mixtures using state-of-the-art audio source separation. A correlation analysis between the experimental data and the measures of two objective evaluation toolkits, BSS Eval and PEASS, was performed to assess their performance. The artifacts-related perceptual score of the PEASS toolkit had the strongest correlation with the perception of artifacts and distortions caused by singing-voice separation. Both the source-to-interference ratio of BSS Eval and the interference-related perceptual score of PEASS showed comparable correlations with the human ratings of interference.
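
As a rough illustration of the kind of correlation analysis described above, the following minimal sketch computes Pearson and Spearman coefficients between listener ratings and an objective metric. It is not the authors' analysis code: the abstract does not say which correlation coefficient was used, the function names are hypothetical, and the numbers are made up purely for demonstration. Only the Python standard library is assumed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for illustration only (NOT data from the study):
# mean listener ratings of interference, and an objective score such as
# the BSS Eval source-to-interference ratio in dB, for five stimuli.
subjective_interference = [20.0, 35.0, 50.0, 65.0, 90.0]
objective_sir_db = [2.1, 5.4, 4.8, 11.0, 17.3]

r = pearson(subjective_interference, objective_sir_db)
rho = spearman(subjective_interference, objective_sir_db)
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```

A rank-based coefficient such as Spearman's is often a reasonable choice when the relation between a metric and listener ratings is expected to be monotonic but not necessarily linear. For the toy numbers above, the Spearman coefficient is 0.9, because only one pair of stimuli is ordered differently by the two measures.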

BibTeX


@inproceedings{Ward_2018,
  year = {2018},
  month = apr,
  publisher = {{IEEE}},
  author = {Ward, Dominic and Wierstorf, Hagen and Mason, Russell D. and Grais, Emad M. and Plumbley, Mark D.},
  title = {{{BSS EVAL or PEASS?} Predicting the Perception of Singing-Voice Separation}},
  booktitle = {2018 {IEEE} International Conference on Acoustics, Speech and Signal Processing ({ICASSP})},
  address = {Calgary, Canada},
  openaccess = {http://epubs.surrey.ac.uk/845998/},
  keywords = {maruss}
}


Supplementary Material

Please see the website for this study:

https://cvssp.github.io/perceptual-study-source-separation/

All source files associated with this work are available at:

https://github.com/CVSSP/perceptual-study-source-separation