Embodied AI Reading Station
Tag

#audio-speech (22 papers)

| Year | Title | Topic | Venue |
|------|-------|-------|-------|
| 2025 | VLAS: VLA Model With Speech Instructions | Multimodal Ecology | ICLR |
| 2025 | Wave-Former: Through-Occlusion 3D Reconstruction via Wireless Shape Completion | RF Perception & Mapping | arXiv |
| 2024 | Proactive Hearing Assistants that Isolate Egocentric Conversations | Auditory & Acoustic | UIST |
| 2024 | NeuralAids: Wireless Hearables With Programmable Speech AI Accelerators | Auditory & Acoustic | MobiCom |
| 2024 | Stable Audio | Auditory & Acoustic | ICML |
| 2024 | Universal Source Separation with Weakly Labelled Data | Auditory & Acoustic | TASLP |
| 2024 | OneLLM | Multimodal Ecology | CVPR |
| 2024 | Argus: Multi-View Egocentric Human Mesh Reconstruction Based on Stripped-Down Wearable mmWave Add-on | RF Perception & Mapping | SenSys |
| 2024 | Enabling Visual Recognition at Radio Frequency (PanoRadar) | RF Perception & Mapping | MobiCom |
| 2023 | Creating speech zones with self-distributing acoustic swarms | Auditory & Acoustic | Nature |
| 2023 | AudioLM | Auditory & Acoustic | TASLP |
| 2023 | EnCodec | Auditory & Acoustic | TMLR |
| 2023 | MusicLM | Auditory & Acoustic | arXiv |
| 2023 | Robust Speech Recognition via Large-Scale Weak Supervision | Auditory & Acoustic | ICML |
| 2023 | SeamlessM4T | Auditory & Acoustic | arXiv |
| 2023 | AudioPaLM | Multimodal Ecology | arXiv |
| 2022 | SoundStream: An End-to-End Neural Audio Codec | Auditory & Acoustic | IEEE/ACM TASLP |
| 2021 | Meta-StyleSpeech | Auditory & Acoustic | ICML |
| 2020 | Conformer | Auditory & Acoustic | Interspeech |
| 2020 | Dual-path RNN | Auditory & Acoustic | ICASSP |
| 2019 | Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation | Auditory & Acoustic | IEEE/ACM TASLP |
| 2019 | Connecting Touch and Vision via Cross-Modal Prediction | Multimodal Ecology | CVPR |