Abstract: In this study, we explore the use of Vector Quantized Variational Autoencoders (VQ-VAEs) for real-time audio spectrogram inpainting, with a focus on minimizing environmental impact. We ...
Abstract: Birds are essential indicators of ecological health, yet traditional identification methods rely heavily on expert knowledge and visual observation. This paper presents an AI-powered system ...
This study proposes a novel heterogeneous stacking ensemble learning model that fuses phonocardiogram (PCG) spectrogram texture features and deep features to detect heart failure with preserved ejection ...
Diffusion Speech is a diffusion-based text-to-speech model. Our speech synthesis pipeline is simple. We use a diffusion transformer (DiT) model to predict the duration of each phoneme. Then we ...
The development of machine learning for cardiac care is severely hampered by privacy restrictions on sharing real patient electrocardiogram (ECG) data. Although generative AI offers a promising ...