SAIGE

SAIGE is a group of researchers working on machine learning algorithms for signal processing problems.

04/13/2023

We are serious about BBQ at SAIGE. A home party at Minje’s place.

03/10/2023

Two additional presentations! Our recent journal papers on "self-supervised learning for personalized speech enhancement" (IEEE JSTSP, led by Aswin Sivaraman) and "AdaBoost-based hashing for efficient speech enhancement" (IEEE TASLP, led by Dr. Sunwoo Kim) were accepted for presentation.

02/17/2023

SAIGE members (Anastasia Kuznetsova, Haici Yang, Darius Petermann, Aswin Sivaraman, and Minje Kim) authored four papers accepted for publication. Kudos to the authors! See you all in Rhodes, Greece!

11/07/2022

SAIGE welcomes new members with warm food and shirts. The photo was taken in August, and they are already well into their research projects.

Learning to Hash for Source Separation – 08/24/2022

We care a lot about the efficiency of AI models for on-device use. Our latest effort in this area was published in IEEE Trans. on ASLP, where we propose a hashing-based method for speech enhancement/source separation. Our algorithm learns efficient and effective binary representations that are used to perform speech denoising in a bitwise (i.e., hardware-friendly) fashion. The paper also provides a comprehensive view of our method from the perspective of the kernel method, along with an interpretation as a neural network model. Please check out our paper:
Demo and source code: https://saige.sice.indiana.edu/research-projects/bwss-blsh/
IEEE Xplore: https://ieeexplore.ieee.org/document/9053052
Author version (PDF): https://saige.sice.indiana.edu/wp-content/uploads/taslp2022_skim.pdf
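To illustrate the general flavor of bitwise denoising (this is not the paper's method: the paper learns its projections, e.g., via AdaBoost, whereas this sketch substitutes random projections, and the dictionary, masks, and dimensions below are all made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "dictionary" of spectral frames with associated (hypothetical) masks.
n_frames, n_bins, n_bits = 500, 64, 32
dictionary = rng.standard_normal((n_frames, n_bins))
masks = rng.random((n_frames, n_bins))
projections = rng.standard_normal((n_bins, n_bits))  # random, not learned

def hash_codes(frames):
    """Binarize each frame: sign of projections -> packed bit codes."""
    bits = (frames @ projections) > 0       # (N, n_bits) booleans
    return np.packbits(bits, axis=1)        # compact uint8 bit codes

codes = hash_codes(dictionary)

def denoise_frame(noisy_frame, k=5):
    """Estimate a mask by averaging the masks of the k nearest dictionary
    frames in Hamming space (XOR + popcount, i.e., purely bitwise ops)."""
    q = hash_codes(noisy_frame[None, :])
    hamming = np.unpackbits(codes ^ q, axis=1).sum(axis=1)
    nearest = np.argsort(hamming)[:k]
    return masks[nearest].mean(axis=0)

mask = denoise_frame(rng.standard_normal(n_bins))
```

The point of the Hamming-space lookup is that similarity search reduces to XOR and bit counting, which is why this style of model is hardware-friendly.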


05/04/2022

Please check out Darius Petermann's cool presentation on SpaIn-Net, a music source separation model that is mindful of the instruments' spatial locations. SpaIn-Net is robust even if the spatial information is not precise. ;) https://iu.mediaspace.kaltura.com/media/t/1_mboimmw7

Don’t Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization – 04/25/2022

At SAIGE we are dead serious about source separation, but in one of our new papers we suggest you "don't separate, learn to remix" if all you want is a remix. We propose a novel interactive remixing system: an end-to-end model that jointly optimizes separation and remixing. The point is not to focus on source separation unless it's absolutely necessary. Please take a look at Haici Yang's virtual presentation for more information. This was the result of an exciting collaboration with Nick Bryan at Adobe Research. https://iu.mediaspace.kaltura.com/media/t/1_l7n8iw7o


02/18/2022

SpaIn-Net is a spatially-informed network for music source separation. It takes the user's rough guess about the stereophonic location of each musical instrument as input and uses it to improve separation. More details, a demo, the source code, and our paper about the SpaIn-Net project are here: https://saige.sice.indiana.edu/research-projects/spain-net/

BLOOM-Net: Scalability Matters – 02/15/2022

We named one of our new deep learning models after our beloved hometown, Bloomington, IN! In this paper, we present BLOOM-Net, which flexibly scales its architecture to fit devices from small to large while retaining strong speech enhancement performance at every scale. More details on this open-sourced project can be found here: https://saige.sice.indiana.edu/research-projects/bloom-net/
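A minimal sketch of the scalability idea (this is not BLOOM-Net's actual architecture or its blockwise training procedure; the block structure, dimensions, and weights below are purely illustrative): a stack of residual refinement blocks, each carrying its own mask head, so inference can stop after any block.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_blocks, hidden = 64, 4, 64

# Hypothetical weights: each block refines a hidden state and has its own
# mask estimator, so a device can run as many blocks as its budget allows.
blocks = [
    {
        "W": rng.standard_normal((hidden, hidden)) * 0.1,
        "mask_head": rng.standard_normal((hidden, n_bins)) * 0.1,
    }
    for _ in range(n_blocks)
]
embed = rng.standard_normal((n_bins, hidden)) * 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def enhance(noisy_spec, depth):
    """Run only the first `depth` blocks; apply that block's mask."""
    h = np.tanh(noisy_spec @ embed)
    for blk in blocks[:depth]:
        h = h + np.tanh(h @ blk["W"])      # residual refinement
    mask = sigmoid(h @ blocks[depth - 1]["mask_head"])
    return mask * noisy_spec

spec = np.abs(rng.standard_normal(n_bins))
small = enhance(spec, depth=1)   # tiny device: one block
large = enhance(spec, depth=4)   # large device: full stack
```

The same parameters serve every depth, which is what lets one trained model cover a whole range of device sizes.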


02/04/2022

Our T-ASLP paper is officially published (led by Kai Zhen)! It's a comprehensive consolidation of our neural speech coding projects, with lots of in-depth insights and new discoveries. Check out our paper if you are curious about how deep learning can be used for speech/audio coding.
IEEE Xplore: https://ieeexplore.ieee.org/document/9622124
Authors' PDF: https://saige.sice.indiana.edu/wp-content/uploads/taslp2022_kzhen.pdf


01/24/2022

SAIGE members authored SEVEN papers accepted for publication at ICASSP 2022. Great teamwork among the SAIGE members, as well as exciting external collaborations! We all sincerely hope to be there in Singapore in person this year!

Sunwoo Kim, Minje Kim, "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable And Efficient Speech Enhancement"

Darius Petermann, Minje Kim, "SpaIn-Net: Spatially-Informed Stereophonic Music Source Separation"

Haici Yang, Shivani Firodiya, Nicholas Bryan, Minje Kim, "Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization"

Haici Yang, Sanna Wager, Spencer Russell, Mike Luo, Minje Kim, Wontak Kim, "Upmixing via Style Transfer: a Variational Autoencoder for Disentangling Spatial Images and Musical Content"

Hao Zhang, Srivatsan Kandadai, Harsha Rao, Minje Kim, Tarun Pruthi, Trausti Kristjansson, "Deep Adaptive AEC: Hybrid of Deep Learning and Adaptive Acoustic Echo Cancellation"

Aswin Sivaraman, Scott Wisdom, Hakan Erdogan, John R. Hershey, "Adapting Speech Separation to Real-World Meetings Using Mixture Invariant Training"

Darius Petermann, Gordon Wichern, Jonathan Le Roux, Zhong-Qiu Wang, "The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks"

11/15/2021

A late fall excursion to Yellowwood State Forest.

10/15/2021

Our final paper is on knowledge distillation for personalized speech enhancement. Sunwoo Kim will be at the P3 poster session to take questions. Here is his video presentation, which you can check out ahead of time: https://share.descript.com/view/LX2kLyPs4x3


10/15/2021

Next week at WASPAA, Aswin Sivaraman will present his paper on speaker-informed personalized speech enhancement (P3: Array Processing, Room Acoustics, Enhancement, and Audio Events; Demonstrations). Here's the video lecture for a sneak peek. https://iu.mediaspace.kaltura.com/media/t/1_4us4qtm1

10/15/2021

Our HARP-Net paper will be presented at the "P5: Spatial Audio, ANC/Echo, Coding, and Music; Demonstrations" session at WASPAA next week. Darius Petermann will be there to take questions, but here is the presentation video for a sneak peek! https://iu.mediaspace.kaltura.com/media/t/1_amczhtck

08/31/2021

Aswin Sivaraman is virtually presenting his paper at Interspeech tomorrow (7pm CET or 1pm US Eastern). It's on self-supervised learning and data purification for personalized speech enhancement. For a sneak peek, here's a 3-min intro: https://iu.mediaspace.kaltura.com/media/t/1_f8fxu8sx


08/03/2021

Aswin Sivaraman will present our third WASPAA 2021 paper, which discusses a zero-shot adaptation method to improve the performance and efficiency of speech enhancement models. It automatically finds the best-matching "specialist" model from the noisy speech utterances at test time, thus achieving "personalized" speech enhancement without asking for the test user's information. Check out the paper here: https://saige.sice.indiana.edu/wp-content/uploads/waspaa2021_asivaraman.pdf
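A rough sketch of the specialist-selection idea, under strong simplifying assumptions (the specialist bank, the gating mechanism, and all weights below are invented for illustration; the paper's actual selection criterion and models differ):

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_specialists = 64, 4

# Hypothetical bank of small "specialist" enhancement models (here, fixed
# masks) and one gating vector per specialist, all with made-up weights.
specialist_masks = rng.random((n_specialists, n_bins))
gates = rng.standard_normal((n_specialists, n_bins))

def enhance_zero_shot(noisy_frames):
    """Pick the specialist whose gate scores the noisy input highest,
    using only the noisy utterance itself (nothing asked of the user)."""
    profile = np.abs(noisy_frames).mean(axis=0)   # utterance-level summary
    scores = gates @ profile                      # one score per specialist
    best = int(np.argmax(scores))
    return specialist_masks[best] * noisy_frames, best

enhanced, chosen = enhance_zero_shot(rng.standard_normal((100, n_bins)))
```

Because only one small specialist runs per utterance, the approach can be both more accurate (each specialist covers a narrower condition) and cheaper than one large generalist.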

07/28/2021

One of the big themes in SAIGE is "personalized" AI models, because personalized models work better than a generic one while remaining small and efficient on edge devices. Sunwoo Kim is presenting at WASPAA 2021 about our new speech enhancement algorithm that distills knowledge from a big teacher model to personalize a small student model. Check out the paper https://saige.sice.indiana.edu/wp-content/uploads/waspaa2021_skim.pdf and the demo and code https://saige.sice.indiana.edu/research-projects/kd-pse/
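The teacher-to-student distillation step can be sketched as follows (a minimal toy version, not the paper's models or training recipe: the two-layer networks, dimensions, and learning rate are all assumptions; the key idea shown is that the student regresses onto the teacher's masks computed from the user's noisy speech, so no clean targets are needed):

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, hidden_t, hidden_s = 64, 128, 16

# Hypothetical frozen "teacher": a large generic enhancement model.
Wt1 = rng.standard_normal((n_bins, hidden_t)) * 0.1
Wt2 = rng.standard_normal((hidden_t, n_bins)) * 0.1

def teacher_mask(x):
    return 1 / (1 + np.exp(-(np.tanh(x @ Wt1) @ Wt2)))

# Small student to be personalized on the test user's noisy frames.
Ws1 = rng.standard_normal((n_bins, hidden_s)) * 0.1
Ws2 = rng.standard_normal((hidden_s, n_bins)) * 0.1

def student(x):
    h = np.tanh(x @ Ws1)
    return 1 / (1 + np.exp(-(h @ Ws2))), h

noisy = rng.standard_normal((256, n_bins))   # user's noisy frames
target = teacher_mask(noisy)                 # distillation pseudo-labels
loss0 = np.mean((student(noisy)[0] - target) ** 2)

lr = 0.5
for _ in range(300):
    pred, h = student(noisy)
    g = (pred - target) * pred * (1 - pred) / len(noisy)  # sigmoid backprop
    gh = (g @ Ws2.T) * (1 - h ** 2)                       # tanh backprop
    Ws2 -= lr * h.T @ g
    Ws1 -= lr * noisy.T @ gh

loss1 = np.mean((student(noisy)[0] - target) ** 2)
```

After training, only the small student is deployed on the edge device; the big teacher is needed just once, at personalization time.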

07/22/2021

Our new neural audio coding system, HARP-Net, will be presented by Darius Petermann at WASPAA 2021. We call it HARP-Net, because the model architecture looks like... a harp (it actually means "Hyper Autoencoded Reconstruction Propagation")! Check out the paper https://saige.sice.indiana.edu/wp-content/uploads/waspaa2021_dpetermann.pdf, the code https://github.com/darius522/waspaa2021, and the demo https://darius522.github.io/harpnet_examples/
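A minimal sketch of autoencoding the skip connections (HARP-Net itself is a convolutional codec with quantization; the dense layers and all dimensions below are purely illustrative): because a codec can only transmit codes, each encoder-decoder shortcut is compressed through its own small bottleneck instead of being copied directly.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(n_in, n_out):
    return rng.standard_normal((n_in, n_out)) * 0.1

# Hypothetical dimensions for a tiny dense autoencoder.
enc1, enc2 = layer(512, 256), layer(256, 64)   # encoder (64-d bottleneck)
dec2, dec1 = layer(64, 256), layer(256, 512)   # mirrored decoder
# The skip connection is itself autoencoded, so the shortcut information
# survives transmission as a small extra side code.
skip_enc, skip_dec = layer(256, 32), layer(32, 256)

def codec(frame):
    h1 = np.tanh(frame @ enc1)             # encoder layer 1
    code = np.tanh(h1 @ enc2)              # main bottleneck code
    skip_code = np.tanh(h1 @ skip_enc)     # autoencoded skip path
    # --- decoder side: only `code` and `skip_code` are transmitted ---
    h1_hat = np.tanh(skip_code @ skip_dec)
    h = np.tanh(code @ dec2) + h1_hat      # skip re-injected at decoding
    return h @ dec1

recon = codec(rng.standard_normal(512))
```

The extra per-layer side codes trade a little bitrate for the reconstruction detail that a plain bottleneck autoencoder loses.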

06/14/2021

Haici Yang presented her first first-authored paper at ICASSP 2021. It’s about a neural audio codec that can control multiple sources in the input audio. Check out her talk here: https://iu.mediaspace.kaltura.com/media/t/1_tedt8knf


Address


700 N Woodlawn Avenue
Bloomington, IN
47408