Tag

#audio-speech (22 papers)

year	title	topic	venue
2025	VLAS: VLA Model With Speech Instructions	Multimodal Ecology	ICLR
2025	Wave-Former: Through-Occlusion 3D Reconstruction via Wireless Shape Completion	RF Perception & Mapping	arXiv
2024	Proactive Hearing Assistants that Isolate Egocentric Conversations	Auditory & Acoustic	UIST
2024	NeuralAids: Wireless Hearables With Programmable Speech AI Accelerators	Auditory & Acoustic	MobiCom
2024	Stable Audio	Auditory & Acoustic	ICML
2024	Universal Source Separation with Weakly Labelled Data	Auditory & Acoustic	IEEE/ACM TASLP
2024	OneLLM	Multimodal Ecology	CVPR
2024	Argus: Multi-View Egocentric Human Mesh Reconstruction Based on Stripped-Down Wearable mmWave Add-on	RF Perception & Mapping	SenSys
2024	Enabling Visual Recognition at Radio Frequency (PanoRadar)	RF Perception & Mapping	MobiCom
2023	Creating speech zones with self-distributing acoustic swarms	Auditory & Acoustic	Nature
2023	AudioLM	Auditory & Acoustic	IEEE/ACM TASLP
2023	EnCodec	Auditory & Acoustic	TMLR
2023	MusicLM	Auditory & Acoustic	arXiv
2023	Robust Speech Recognition via Large-Scale Weak Supervision	Auditory & Acoustic	ICML
2023	SeamlessM4T	Auditory & Acoustic	arXiv
2023	AudioPaLM	Multimodal Ecology	arXiv
2022	SoundStream: An End-to-End Neural Audio Codec	Auditory & Acoustic	IEEE/ACM TASLP
2021	Meta-StyleSpeech	Auditory & Acoustic	ICML
2020	Conformer	Auditory & Acoustic	Interspeech
2020	Dual-path RNN	Auditory & Acoustic	ICASSP
2019	Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation	Auditory & Acoustic	IEEE/ACM TASLP
2019	Connecting Touch and Vision via Cross-Modal Prediction	Multimodal Ecology	CVPR