Online Gaussian Process for Nonstationary Speech Separation

Hsin-Lung Hsieh and Jen-Tzung Chien


    A practical speech enhancement system must recover speech from mixed signals that are corrupted by nonstationary sources and nonstationary mixing conditions. The source voices may come from different moving speakers, who may abruptly appear or disappear and whose order may be permuted continuously. To handle these scenarios, in which the number of sources varies, we present a new method for nonstationary speech separation. An online Gaussian process independent component analysis (OLGP-ICA) is developed to characterize the temporal structure of the time-varying mixing system and to capture the evolving statistics of the independent sources from signals observed online. A variational Bayes algorithm is established to estimate the evolving parameters for dynamic source separation. In the experiments, the proposed OLGP-ICA is compared with other ICA methods and is shown to be effective in recovering speech and music signals in a nonstationary speaking environment.
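To make the nonstationary setting concrete, the following minimal sketch generates toy observations from a time-varying instantaneous mixture x_t = A_t s_t. It is not the OLGP-ICA algorithm and is not taken from the paper. The noise-free 2x2 model, the slowly rotating mixing matrix (standing in for moving speakers), the toy waveforms, and all function names are assumptions chosen purely for illustration.

```python
import math

def mixing_matrix(t, n_frames):
    """Hypothetical 2x2 mixing matrix A_t that rotates slowly over time
    (an assumed stand-in for moving speakers)."""
    theta = 0.5 * math.pi * t / n_frames  # drifts from 0 toward pi/2
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def sources(t, n_frames):
    """Two toy sources; the second is silent for the first half of the
    recording, imitating a source that abruptly appears."""
    s1 = math.sin(2 * math.pi * 5 * t / n_frames)
    if t >= n_frames // 2:
        # Square wave, active only in the second half.
        s2 = math.copysign(1.0, math.sin(2 * math.pi * 3 * t / n_frames))
    else:
        s2 = 0.0
    return [s1, s2]

def mix(n_frames=200):
    """Generate observations x_t = A_t s_t frame by frame."""
    observations = []
    for t in range(n_frames):
        A = mixing_matrix(t, n_frames)
        s = sources(t, n_frames)
        observations.append([A[0][0] * s[0] + A[0][1] * s[1],
                             A[1][0] * s[0] + A[1][1] * s[1]])
    return observations

x = mix()
```

Because A_t changes at every frame and the second source switches on midway, a demixing matrix estimated once from the whole recording cannot undo the mixture at every frame. This is the situation that motivates estimating the parameters online.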


Source Speech and Music Signals: Signal 1, Signal 2

Mixed Signals: Signal 1, Signal 2

Demixed Signals: Signal 1, Signal 2