-
PhD course at Aalborg University
Self-Supervised Learning, November 18-20, 2024. Concluded successfully with 40 participants.
-
Slides for the tutorial at NLDL 2023
-
Video for the expert session at ICASSP 2022
-
Doctoral course at University of Oulu, Finland
Self-Supervised Learning for Multimodal Data: From Models to Loss Functions (Z.-H. Tan)
-
Postdoc
We are excited to announce a two-year postdoc position in self-supervised and weakly-supervised learning for signals, e.g. speech, audio, text, and images. While the success of deep learning relies largely on substantial amounts of labeled data, in practice data are often abundant but unlabeled or inadequately labeled. This project focuses on developing weakly-supervised and self-supervised learning methods that harness these data resources, and on gaining deeper insight into the underlying mechanisms of these methods. The position is funded by the Pioneer Centre for AI, Denmark. Deadline: 8 November 2023. Online application here.
-
PhD stipend
We have one PhD position available in Self-Supervised Learning for Decoding of Complex Signals, funded by the Pioneer Centre for AI. The project aims to develop novel semi-supervised and self-supervised methods for modeling signals of various modalities (e.g., speech, audio, vision, text) and to analyse the complexity of the developed models. There are good opportunities to do research at other units and the headquarters of the Pioneer Centre, as well as abroad. Deadline: 6 June 2023. Online application here.
-
2022 Workshop on Self-Supervised Learning for Signal Decoding, Aalborg, Denmark, 14-14 October 2022.
-
Some recent works
Sarthak Yadav and Zheng-Hua Tan, "Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations," INTERSPEECH 2024, 1-5 September 2024, Kos Island, Greece. Source code available.
Sarthak Yadav, Sergios Theodoridis, Lars Kai Hansen, and Zheng-Hua Tan, "Masked Autoencoders with Multi-Window Attention Are Better Audio Learners," The Twelfth International Conference on Learning Representations (ICLR), Vienna, Austria, 7-11 May 2024. 31% acceptance rate.
Holger Severin Bovbjerg, Jesper Jensen, Jan Østergaard, and Zheng-Hua Tan, "Self-supervised Pretraining for Robust Personalized Voice Activity Detection in Adverse Conditions," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Seoul, Korea, 14-19 April 2024.
Holger Severin Bovbjerg and Zheng-Hua Tan, "Improving Label-Deficient Keyword Spotting Using Self-Supervised Pretraining," ICASSP 2023 Satellite Workshop: SASB 2023: Self-Supervision in Audio, Speech and Beyond, Rhodes Island, Greece, June 4-10, 2023.
Yuying Xie, Thomas Arildsen, and Zheng-Hua Tan, "Improved Disentangled Speech Representations Using Contrastive Learning in Factorized Hierarchical Variational Autoencoder," The 31st European Signal Processing Conference (EUSIPCO 2023), September 4-8, 2023, Helsinki, Finland.
Achintya Kr. Sarkar and Zheng-Hua Tan, "Vocal tract length perturbation for text-dependent speaker verification with autoregressive prediction coding," IEEE Signal Processing Letters 28 (2021): 364-368.
Yuying Xie, Thomas Arildsen, and Zheng-Hua Tan, "Disentangled speech representation learning based on factorized hierarchical variational autoencoder with self-supervised objective," 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2021.
Achintya Kumar Sarkar, Zheng-Hua Tan, Hao Tang, Suwon Shon, and James Glass, "Time-contrastive learning based deep bottleneck features for text-dependent speaker verification," IEEE/ACM Transactions on Audio, Speech, and Language Processing 27, no. 8 (2019): 1267-1279.
Achintya Kr. Sarkar and Zheng-Hua Tan, "Time-Contrastive Learning Based DNN Bottleneck Features for Text-Dependent Speaker Verification," NIPS 2017 Time Series Workshop, Dec. 8, 2017, Long Beach, CA, USA.