Convolutive Non-Negative Matrix Factorization for CQT Transform using Itakura-Saito Divergence
Fabio Louvatti do Carmo, Evandro Ottoni Teatini Salles
DOI: 10.14209/sbrt.2017.29
Event: XXXV Simpósio Brasileiro de Telecomunicações e Processamento de Sinais (SBrT2017)
Keywords: Non-negative Matrix Factorization; Monaural separation; Blind source separation; Constant-Q Transform; Itakura-Saito
Abstract
This paper proposes a modification of Nonnegative Matrix Factorization (NMF) for the single-channel audio source separation problem. NMF is widely used for this problem because it is easy to implement and has parts-based separation properties. However, the original NMF uses the Short-Time Fourier Transform (STFT) as the spectral representation of the data, which takes the form of a matrix, and it does not support data on a nonregular grid, such as the Constant-Q Transform (CQT). The CQT has strong appeal in audio processing because it reasonably approximates the human auditory system. Besides using the CQT as the spectral representation, this paper presents a convolutive NMF approach using the Itakura-Saito divergence (ISD) that works with irregularly sampled data, here called NRCNMF-IS. The scale invariance of the ISD is attractive for audio applications. NRCNMF-IS was tested and compared with its matrix version. Measured with performance metrics, the statistical results show that using the CQT as the spectral representation yields better results than the STFT representation.
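
The scale invariance of the ISD mentioned in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the standard element-wise definition d_IS(x|y) = x/y - log(x/y) - 1 summed over all entries, and the function name `is_divergence` and the random test data are hypothetical.

```python
import numpy as np

def is_divergence(v, v_hat):
    """Itakura-Saito divergence between two strictly positive arrays,
    summed over all entries (assumed standard definition)."""
    v = np.asarray(v, dtype=float)
    v_hat = np.asarray(v_hat, dtype=float)
    if np.any(v <= 0) or np.any(v_hat <= 0):
        raise ValueError("inputs must be strictly positive")
    ratio = v / v_hat
    return float(np.sum(ratio - np.log(ratio) - 1.0))

# Hypothetical data standing in for a power spectrogram and its approximation.
rng = np.random.default_rng(0)
v = rng.uniform(0.1, 1.0, size=(4, 5))
v_hat = rng.uniform(0.1, 1.0, size=(4, 5))

d = is_divergence(v, v_hat)
# Scaling both arguments by the same factor leaves the divergence unchanged,
# because it depends only on the ratio v / v_hat.
d_scaled = is_divergence(1000.0 * v, 1000.0 * v_hat)
```

Because the divergence depends only on the ratio of its arguments, low-energy and high-energy coefficients contribute on the same relative footing, which is one common reason it is considered suitable for audio spectra with a large dynamic range.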