1. Sound source separation under source number uncertainty

Two separation results with a 4-channel microphone array are presented: (i) 2 sound sources and (ii) 5 sound sources. Our method dispenses with any parameter tuning that depends on the number of sources; the same configuration is used in both cases.
Audio playback has been verified with the Chrome browser. Headphones are recommended.
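The robustness to the source count comes from the Bayesian nonparametric prior used in our method (see the publications below): the model keeps a large pool of potential sources and lets the data activate only as many as needed. As a rough illustration of the idea only, here is a minimal sketch of a truncated Dirichlet-process stick-breaking construction; all names and parameter values are illustrative, not taken from the actual system:

```python
import numpy as np

def stick_breaking(alpha, n_max, rng):
    """Truncated stick-breaking construction of Dirichlet-process weights.

    Breaks a unit-length stick into n_max pieces; the weights decay so
    that only a data-driven number of components carries noticeable mass,
    which is why no source-count parameter needs tuning.
    """
    betas = rng.beta(1.0, alpha, size=n_max)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    return betas * remaining

rng = np.random.default_rng(0)
w = stick_breaking(alpha=2.0, n_max=20, rng=rng)
print("weights:", np.round(w, 3))
print("effective components (weight > 1%):", int(np.sum(w > 0.01)))
```

Whether 2 or 5 sources are present, the same prior applies; components beyond the true source count simply receive negligible weight.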
  1. Separation of 2 sources
    Input mixture
    Result audio (female)
    Result audio (male)

  2. Separation of 5 sources
    Input mixture
    Result audio (-90 deg.)
    Result audio (-60 deg.)
    Result audio (0 deg.)
    Result audio (60 deg.)
    Result audio (90 deg.)

2. Sound source separation and dereverberation under source number uncertainty

Reverberation often degrades separation quality. Our method models reverberation as the propagation of observed signals from past time frames into the current time frame. As before, the mixture has 4 channels, with two cases: (i) two sources and (ii) five sources. Results with and without (w/ and w/o) dereverberation are presented.
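The dereverberation idea above — predicting the reverberant tail of the current frame from observations in past frames and subtracting it — can be sketched in a drastically simplified form: one channel, one STFT frequency bin, and plain least squares in place of the actual multichannel, iteratively estimated model. The function and parameter names below are our own, not from the actual system:

```python
import numpy as np

def dereverb_bin(X, delay=2, order=5):
    """Suppress late reverberation in one frequency bin of a single-channel STFT.

    X: complex array of shape (n_frames,).
    Models reverberation as a linear combination of observations from past
    frames (lags delay .. delay+order-1) leaking into the current frame,
    estimates the prediction coefficients by least squares, and subtracts
    the predicted tail. The delay keeps the direct sound from being
    predicted (and cancelled) along with the reverberation.
    """
    T = len(X)
    rows = [X[t - delay - order + 1 : t - delay + 1][::-1]
            for t in range(delay + order - 1, T)]
    A = np.array(rows)                     # delayed past observations
    y = X[delay + order - 1 :]             # current frames
    g, *_ = np.linalg.lstsq(A, y, rcond=None)
    out = X.copy()
    out[delay + order - 1 :] = y - A @ g   # remove the predicted tail
    return out

# Synthetic check: a single echo tap of 0.5 at a lag of 2 frames.
rng = np.random.default_rng(1)
s = rng.standard_normal(400) + 1j * rng.standard_normal(400)
x = s.copy()
for t in range(2, 400):
    x[t] = s[t] + 0.5 * x[t - 2]
d = dereverb_bin(x, delay=2, order=1)      # recovers s up to estimation error
```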
  1. Results of 2 sources
    • w/o dereverberation: echoes and reflections of the other sound remain in the audio signal.
      Reverberant input mixture
      Result audio (female)
      Result audio (male)
    • w/ dereverberation: the reflected sound is suppressed.
      Input mixture (the same as above)
      Result audio (female)
      Result audio (male)
  2. Results of 5 sources
    • w/o dereverberation
      Reverberant input mixture
      Result audio (-90 deg.)
      Result audio (-60 deg.)
      Result audio (0 deg.)
      Result audio (60 deg.)
      Result audio (90 deg.)
    • w/ dereverberation
      Input mixture (the same as above)
      Result audio (-90 deg.)
      Result audio (-60 deg.)
      Result audio (0 deg.)
      Result audio (60 deg.)
      Result audio (90 deg.)

3. Sound source separation with a moving microphone array

We tackle sound source separation in a dynamic environment where the relative positions between the sound sources and microphones change over time. The uncertainty of the dynamic environment is partly constrained: each source moves within a disjoint direction range, as illustrated below. To separate sound sources in this setting, we apply our Bayesian nonparametric separation and localization method, which splits the moving sound sources into segments along the time axis. To reconstruct each source signal, we merge the separated segments that are localized within the respective direction range of that source.
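The merging step can be sketched as follows: each separated segment carries a time span, an estimated direction from the joint localization, and its separated waveform, and segments are summed per direction range. The field layout, range values, and toy waveforms below are illustrative assumptions, not the actual system's data format:

```python
import numpy as np

def merge_by_direction(segments, deg_lo, deg_hi, total_samples, sr):
    """Sum all segments whose localized direction falls in [deg_lo, deg_hi)."""
    out = np.zeros(total_samples)
    for seg in segments:
        if deg_lo <= seg["direction_deg"] < deg_hi:
            i = int(seg["start_sec"] * sr)
            out[i : i + len(seg["audio"])] += seg["audio"]
    return out

sr = 16000
segments = [
    # Two early segments from different directions, one later segment.
    {"start_sec": 0.0, "direction_deg": -45.0, "audio": np.ones(sr)},
    {"start_sec": 0.0, "direction_deg": 40.0,  "audio": np.ones(sr)},
    {"start_sec": 1.0, "direction_deg": -30.0, "audio": np.ones(sr)},
]
# Disjoint direction ranges, one per source (illustrative values).
left  = merge_by_direction(segments, -90.0, 0.0, 2 * sr, sr)
right = merge_by_direction(segments,   0.0, 90.0, 2 * sr, sr)
```

Because the direction ranges are disjoint, every segment is assigned to exactly one reconstructed source signal.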
Overview
Overview of separation in a dynamic environment: the blue and red sound sources move in disjoint direction ranges.
Example: a mobile robot
Example: a mobile robot passes between two loudspeakers. The left speaker plays music, whereas the right speaker plays calls of frogs, crickets, etc.

Separated results of the mobile robot example.
Observed audio (moving during 3-8 sec)
Left: music
Right: frogs and crickets

Related publications

  1. Takuma Otsuka, Katsuhiko Ishiguro, Hiroshi Sawada, Hiroshi G. Okuno: "Bayesian Nonparametrics for Microphone Array Processing," IEEE Transactions on Audio, Speech and Language Processing, Vol. 22, No. 2, pp. 493-504, 2014. DOI: 10.1109/TASLP.2013.2294582
  2. Takuma Otsuka, Katsuhiko Ishiguro, Hiroshi Sawada, Hiroshi G. Okuno: "Unified Auditory Functions based on Bayesian Topic Model," Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2012), pp. 2370-2376, 2012.
  3. Takuma Otsuka, Katsuhiko Ishiguro, Hiroshi Sawada, Hiroshi G. Okuno: "Bayesian Unification of Sound Source Localization and Separation with Permutation Resolution," Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI-12), pp. 2038-2045, 2012.