Embodied AI Reading Station
Tag

#dataset (40 papers)

Year | Title | Topic | Venue
2025 | Diffusion Policy Policy Optimization (DPPO) | Diffusion Policy | ICLR
2025 | FlowPolicy: 3D Flow-based Policy via Consistency Flow Matching | Diffusion Policy | AAAI
2025 | FAST: Efficient Action Tokenization for VLA | Diffusion Policy | RSS
2025 | Tactile Beyond Pixels (Sparsh-X) | Multimodal Ecology | CoRL
2025 | Isaac Lab | Simulation & Sim2Real | arXiv
2025 | SpatialVLA | End-to-End VLA | arXiv
2024 | mmCLIP: Boosting mmWave-based Zero-shot HAR via Signal-Text Alignment | RF Perception & Mapping | SenSys 2024
2024 | Stable Audio | Auditory & Acoustic | ICML
2024 | Universal Source Separation with Weakly Labelled Data | Auditory & Acoustic | TASLP
2024 | ALOHA 2 | Imitation Learning | Tech Report
2024 | OneLLM | Multimodal Ecology | CVPR
2024 | Sparsh: Self-supervised Touch Representations | Multimodal Ecology | CoRL
2024 | GenSim | High-Level Planning | ICLR
2024 | Tree-Planner | High-Level Planning | ICLR
2024 | Diffusion Model is a Good Pose Estimator from 3D RF-Vision | RF Perception & Mapping | CVPR
2024 | BEHAVIOR-1K | Simulation & Sim2Real | CoRL
2024 | DeepSeek-VL: Towards Real-World Vision-Language Understanding | VLM Foundation | arXiv
2024 | Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks | VLM Foundation | CVPR
2024 | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | VLM Foundation | CVPR
2024 | Improved Baselines with Visual Instruction Tuning | VLM Foundation | CVPR
2024 | What matters when building vision-language models? | VLM Foundation | NeurIPS
2024 | Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling | VLM Foundation | arXiv
2024 | LLaVA-NeXT-Interleave | VLM Foundation | arXiv
2024 | LLaVA-OneVision: Easy Visual Task Transfer | VLM Foundation | arXiv
2024 | Pixtral 12B | VLM Foundation | arXiv
2023 | MusicLM | Auditory & Acoustic | arXiv
2023 | RH20T | Datasets & Benchmarks | RSS Workshop
2023 | AudioPaLM | Multimodal Ecology | arXiv
2023 | OBELICS | VLM Foundation | NeurIPS
2023 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | VLM Foundation | arXiv
2023 | TWM: Transformer-based World Models | World Model & Video Policy | ICLR
2022 | ProcTHOR | Simulation & Sim2Real | NeurIPS
2021 | What Matters in Learning from Offline Human Demonstrations for Robot Manipulation | Datasets & Benchmarks | CoRL
2021 | 3DRIMR: 3D Reconstruction and Imaging via mmWave Radar based on Deep Learning | RF Perception & Mapping | IPCCC
2021 | Habitat 2.0 | Simulation & Sim2Real | NeurIPS
2020 | Dual-path RNN | Auditory & Acoustic | ICASSP
2020 | robosuite: A Modular Simulation Framework and Benchmark for Robot Learning | Datasets & Benchmarks | arXiv
2020 | RadarSLAM: Radar based Large-Scale SLAM in All Weathers | RF Perception & Mapping | BMVC
2019 | Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning | Datasets & Benchmarks | CoRL
2019 | RLBench: The Robot Learning Benchmark & Learning Environment | Datasets & Benchmarks | RA-L