Tag

#diffusion (58 papers)

year	title	topic	venue
2025	Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control	World Model & Video Policy	arXiv
2025	DiT-Policy	Diffusion Policy	ICRA
2025	Diffusion Policy Policy Optimization (DPPO)	Diffusion Policy	ICLR
2025	FlowPolicy: 3D Flow-based Policy via Consistency Flow Matching	Diffusion Policy	AAAI
2025	FAST: Efficient Action Tokenization for VLA	Diffusion Policy	RSS
2025	pi_0.5: VLA with Open-World Generalization	Diffusion Policy	arXiv
2025	Generalizable Humanoid Manipulation with 3D Diffusion Policies (iDP3)	Imitation Learning	RSS
2025	SmolVLA	Imitation Learning	arXiv
2025	TLA: Tactile-Language-Action	Multimodal Ecology	ICRA
2025	Wave-Former: Through-Occlusion 3D Reconstruction via Wireless Shape Completion	RF Perception & Mapping	arXiv
2025	DexVLA	End-to-End VLA	arXiv
2025	OpenHelix	End-to-End VLA	arXiv
2025	OpenVLA-OFT	End-to-End VLA	RSS
2025	Dreamer V3: Mastering Diverse Domains through World Models	World Model & Video Policy	Nature
2025	1X World Model Challenge	World Model & Video Policy	arXiv
2025	Cosmos World Foundation Model Platform	World Model & Video Policy	arXiv
2025	Navigation World Models	World Model & Video Policy	CVPR
2024	Stable Audio	Auditory & Acoustic	ICML
2024	DROID	Datasets & Benchmarks	RSS
2024	RoboCasa	Datasets & Benchmarks	RSS
2024	3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations	Diffusion Policy	RSS
2024	Consistency Policy: Accelerated Visuomotor Policies via Consistency Distillation	Diffusion Policy	RSS
2024	EquiBot: SIM(3)-Equivariant Diffusion Policy	Diffusion Policy	CoRL
2024	Affordance-based Robot Manipulation with Flow Matching	Diffusion Policy	IROS
2024	pi_0: Vision-Language-Action Flow Model	Diffusion Policy	arXiv
2024	ALOHA 2	Imitation Learning	Tech Report
2024	DexCap	Imitation Learning	RSS
2024	HumanPlus	Imitation Learning	CoRL
2024	Mobile ALOHA	Imitation Learning	CoRL
2024	Universal Manipulation Interface	Imitation Learning	RSS
2024	Behavior Generation with Latent Actions (VQ-BeT)	Imitation Learning	ICML
2024	RoboFlamingo	High-Level Planning	ICLR
2024	Diffusion Model is a Good Pose Estimator from 3D RF-Vision	RF Perception & Mapping	CVPR
2024	3D Diffusion Policy (DP3)	End-to-End VLA	RSS
2024	Octo: An Open-Source Generalist Robot Policy	End-to-End VLA	RSS
2024	3D-VLA	End-to-End VLA	ICML
2024	GR-2: Generative Video-Language-Action Model	End-to-End VLA	arXiv
2024	RDT-1B: Diffusion Foundation Model for Bimanual Manipulation	End-to-End VLA	ICLR
2024	RoboMamba	End-to-End VLA	NeurIPS
2024	TinyVLA	End-to-End VLA	RA-L
2024	Long-CLIP: Unlocking the Long-Text Capability of CLIP	VLM Foundation	ECCV
2024	Genie: Generative Interactive Environments	World Model & Video Policy	ICML
2024	UniSim	World Model & Video Policy	ICLR
2023	3DShape2VecSet: 3D Shape Representation for Diffusion Models	VLM Foundation	SIGGRAPH
2023	MusicLM	Auditory & Acoustic	arXiv
2023	BridgeData V2	Datasets & Benchmarks	CoRL
2023	LIBERO	Datasets & Benchmarks	NeurIPS
2023	RH20T	Datasets & Benchmarks	RSS Workshop
2023	Diffusion Policy: Visuomotor Policy Learning via Action Diffusion	Diffusion Policy	RSS
2023	Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware (ACT/ALOHA)	Imitation Learning	RSS
2023	AnyTeleop	Imitation Learning	CoRL
2023	RoboCat	Imitation Learning	TMLR
2023	VoxPoser	High-Level Planning	CoRL
2023	RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches	End-to-End VLA	ICLR
2023	GAIA-1	World Model & Video Policy	arXiv
2022	Behavior Transformers: Cloning k Modes with One Stone	Imitation Learning	NeurIPS
2021	What Matters in Learning from Offline Human Demonstrations for Robot Manipulation	Datasets & Benchmarks	CoRL
2021	ManiSkill	Simulation & Sim2Real	NeurIPS