@inproceedings{grais2019referenceless,
  abstract = {Current performance evaluation for audio source separation depends on comparing the processed or separated signals with reference signals. Common performance evaluation toolkits are therefore not applicable in real-world situations where the ground-truth audio is unavailable. In this paper, we propose a performance evaluation technique that does not require reference signals to assess separation quality. The proposed technique uses a deep neural network (DNN) to map the processed audio to its quality score. Our experimental results show that the DNN can predict the sources-to-artifacts ratio from the blind source separation evaluation toolkit [1] for singing-voice separation without the need for reference signals.},
  title = {Referenceless Performance Evaluation of Audio Source Separation using Deep Neural Networks},
  author = {Grais, E. M. and Wierstorf, H. and Ward, D. and Mason, R. D. and Plumbley, M. D.},
  booktitle = {Proceedings of the 27th European Signal Processing Conference (EUSIPCO)},
  address = {A Coruña, Spain},
  month = sep,
  year = {2019},
  publisher = {IEEE},
  keywords = {maruss},
  openaccess = {http://epubs.surrey.ac.uk/852063/}
}