
Human speech production involves coordinated respiratory action to elicit acoustic speech signals. Typically, speech is produced when air is forced from the lungs and modulated by the vocal tract, and these actions are interspersed with moments of inhalation that refill the lungs. Respiratory rate (𝑅𝑅), the number of breaths one takes in a minute, is a vital metric used to assess the overall health, fitness, and general well-being of an individual. Existing approaches to measuring 𝑅𝑅 require specialized equipment or training. Studies have demonstrated that machine learning algorithms can estimate 𝑅𝑅 using bio-sensor signals as input. Speech-based estimation of 𝑅𝑅 can offer an effective way to measure this vital metric without requiring any specialized equipment or sensors. This work investigates a machine learning based approach to estimating 𝑅𝑅 from speech segments obtained from subjects speaking into a close-talking microphone device. Data were collected from N=26 individuals, where the ground-truth 𝑅𝑅 was obtained through commercial-grade chest belts and then manually corrected for any errors. A convolutional long short-term memory network (Conv-LSTM) is proposed to estimate respiration time-series data from the speech signal. We demonstrate that pre-trained representations obtained from a foundation model, such as WAV2VEC2, can be used to estimate the respiration time series with low root-mean-squared error and high correlation coefficient when compared with the baseline. The model-driven time series can be used to estimate 𝑅𝑅 with a low mean absolute error (𝑀𝐴𝐸) ≈ 1.6 breaths/min.
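The abstract does not say how 𝑅𝑅 is derived from the estimated respiration time series, so the following is only a minimal sketch of one common option: taking the dominant spectral peak of the waveform within a plausible breathing band and scoring the result with 𝑀𝐴𝐸. The function names, the 25 Hz sampling rate, the 5 to 40 breaths/min band, and the synthetic sinusoidal signals are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np


def estimate_rr(resp, fs, lo_bpm=5.0, hi_bpm=40.0):
    """Estimate respiratory rate (breaths/min) from a respiration waveform.

    Assumed method (not from the paper): pick the strongest spectral
    peak inside a plausible breathing band.
    """
    x = np.asarray(resp, dtype=float)
    x = x - x.mean()                       # remove the DC offset
    window = np.hanning(len(x))            # taper to reduce spectral leakage
    n_fft = 8 * len(x)                     # zero-pad for a finer frequency grid
    spectrum = np.abs(np.fft.rfft(x * window, n=n_fft))
    freqs_bpm = np.fft.rfftfreq(n_fft, d=1.0 / fs) * 60.0  # Hz -> breaths/min
    band = (freqs_bpm >= lo_bpm) & (freqs_bpm <= hi_bpm)
    return float(freqs_bpm[band][np.argmax(spectrum[band])])


def mean_absolute_error(pred, truth):
    """Mean absolute error between predicted and reference rates."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.mean(np.abs(pred - truth)))


# Synthetic stand-ins for model-estimated respiration signals (hypothetical data).
fs = 25.0                                  # assumed sampling rate in Hz
t = np.arange(0, 60, 1.0 / fs)             # one minute of samples
true_rates = [12.0, 15.0, 18.0]            # reference rates in breaths/min
rng = np.random.default_rng(0)
estimates = [
    estimate_rr(np.sin(2 * np.pi * (r / 60.0) * t)
                + 0.1 * rng.standard_normal(t.size), fs)
    for r in true_rates
]
mae = mean_absolute_error(estimates, true_rates)
print(estimates, mae)
```

A spectral peak is only one choice; counting inhalation peaks in the time domain is another, and it may behave better on short or irregular segments where a single dominant frequency is not well defined.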
