View publication

Sleep staging is a clinically important task for diagnosing various sleep disorders but remains challenging to deploy at scale because it requires clinical expertise, among other reasons. Deep learning models can perform the task but at the expense of large labeled datasets, which are unfeasible to procure at scale. While self-supervised learning (SSL) can mitigate this need, recent studies on SSL for sleep staging have shown performance gains saturate after training with labeled data from only tens of subjects, hence are unable to match peak performance attained with larger datasets. We hypothesize that the rapid saturation stems from applying a pretraining scheme that only pretrains a portion of the architecture, i.e., the feature encoder but not the temporal encoder; therefore, we propose adopting an architecture that seamlessly couples the feature and temporal encoding and a suitable pretraining scheme that pretrains the entire model. On a sample sleep staging dataset, we find that the proposed scheme offers performance gains that do not saturate with the labeled training dataset size (e.g., 3-5% improvement in balanced accuracy across low- to high-labeled data settings), which translate into significant reductions in the amount of labeled training data needed for high performance (e.g., by 800 subjects). Based on our findings, we recommend adopting this SSL paradigm for subsequent work on SSL for sleep staging.

Related readings and updates.

More Speaking or More Speakers?

Self-training (ST) and self-supervised learning (SSL) methods have demonstrated strong improvements in automatic speech recognition (ASR). In spite of these advances, to the best of our knowledge, there is no analysis of how the composition of the labelled and unlabelled datasets used in these methods affects the results. In this work we aim to analyse the effect of number of speakers in the training data on a recent SSL algorithm (wav2vec 2.0)…
See paper details

MAEEG: Masked Auto-encoder for EEG Representation Learning

This paper was accepted at the Workshop on Learning from Time Series for Health at NeurIPS 2022. Decoding information from bio-signals such as EEG, using machine learning has been a challenge due to the small data-sets and difficulty to obtain labels. We propose a reconstruction-based self-supervised learning model, the masked auto-encoder for EEG (MAEEG), for learning EEG representations by learning to reconstruct the masked EEG features using a…
See paper details