Deep Neural Networks (DNN) have become a popular approach for speech enhancement, and singing voice separation. DNNs are typically trained to estimate a timefrequency mask using ground truth examples. In this submission, we combine DNN estimation as a first step with traditional refinement via F0 estimation, using the YINFFT algorithm.


  author = {Roma, G. and Grais, Emad and Simpson, A. J. R and Plumbley, M. D},
  year = {2016},
  month = aug,
  title = {Singing Voice Separation using Deep Neural Networks and f0 Estimation},
  booktitle = {MIREX 2016},
  url = {http://www.music-ir.org/mirex/abstracts/2016/RSGP1.pdf},
  keywords = {"maruss"}