Topic VI · World Model & Video Policy

World Model & Video Policy


14 papers · 2 founder · 5 classic · 7 frontier

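Teach an AI to rehearse in its head: given the current frame and an action, it imagines what will happen in the next second. Once that ability is in place, planning can run inside imagination, with no need to crash a real robot.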
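To make "planning in imagination" concrete, here is a minimal sketch under loud assumptions: the corridor environment, the tabular model, and every function name below are hypothetical stand-ins, not taken from any paper in this list. Real world models learn the transition function with neural networks on images; this toy simply memorises logged transitions and then searches over action sequences inside the model.

```python
from itertools import product

ACTIONS = (-1, 0, 1)  # hypothetical action set: step left, stay, step right


def real_step(pos, action):
    """The 'real' environment: a corridor of cells 0..9 with walls at both ends."""
    return min(9, max(0, pos + action))


def learn_model(transitions):
    """Tabular world model: memorise (state, action) -> next state from logged data."""
    model = {}
    for pos, action, next_pos in transitions:
        model[(pos, action)] = next_pos
    return model


def imagine(model, pos, plan):
    """Roll a plan forward inside the model only; the real environment is never called."""
    for action in plan:
        pos = model[(pos, action)]
    return pos


def plan_in_imagination(model, start, goal, horizon):
    """Try every action sequence in imagination; keep the one that ends closest to the goal."""
    return min(
        product(ACTIONS, repeat=horizon),
        key=lambda plan: abs(imagine(model, start, plan) - goal),
    )


# Log one real transition per (state, action) pair, then plan with no further real steps.
logged = [(p, a, real_step(p, a)) for p in range(10) for a in ACTIONS]
model = learn_model(logged)
best_plan = plan_in_imagination(model, start=2, goal=5, horizon=3)
print(best_plan)  # → (1, 1, 1)
```

Exhaustive search only works because the toy has 27 candidate plans; with longer horizons or continuous actions one would sample plans or train a policy inside the model instead.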

Primer · 3 introductory papers

Read these three first.

World Models founds the field → Dreamer V3 sets the cross-domain training paradigm → Genie learns a latent action space from unlabeled video.

1
World Models 2018 · NeurIPS · ⭐⭐⭐
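Let the AI practice a game by "daydreaming" it over and over in its own head, then send it into the real game once it is trained. It actually wins.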
2
Dreamer V3: Mastering Diverse Domains through World Models 2025 · Nature · ⭐⭐⭐⭐
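With one fixed configuration, a single AI learns more than 150 tasks without any hyperparameter changes, and is the first to collect diamonds in Minecraft on its own.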
3
Genie: Generative Interactive Environments 2024 · ICML · ⭐⭐⭐⭐
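Genie watches a pile of gameplay recordings, guesses for itself "which key was pressed" between each pair of frames, then uses that "key press" to draw the next frame, turning dead video into playable mini-games.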
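The latent-action idea in the Genie entry can be sketched on a toy, with heavy assumptions: each "frame" here is just a 2D sprite position, and plain k-means over frame-to-frame changes stands in for Genie's learned latent action model, which works on pixels with neural networks. All names (`learn_latent_actions`, `play`, `true_moves`) are hypothetical and exist only for this illustration.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two same-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(codebook, delta):
    """Index of the codebook entry closest to a frame-to-frame change."""
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], delta))


def learn_latent_actions(frames, num_actions, iters=5):
    """Cluster consecutive-frame changes into `num_actions` discrete codes."""
    deltas = [tuple(b - a for a, b in zip(f0, f1))
              for f0, f1 in zip(frames, frames[1:])]
    # Farthest-point initialisation keeps the sketch deterministic.
    codebook = [deltas[0]]
    while len(codebook) < num_actions:
        codebook.append(
            max(deltas, key=lambda d: min(sq_dist(d, c) for c in codebook)))
    for _ in range(iters):  # plain k-means refinement
        groups = [[] for _ in codebook]
        for d in deltas:
            groups[nearest(codebook, d)].append(d)
        codebook = [tuple(sum(x) / len(g) for x in zip(*g)) if g else c
                    for g, c in zip(groups, codebook)]
    labels = [nearest(codebook, d) for d in deltas]
    return codebook, labels


def play(frame, codebook, action):
    """'Press' a latent action: predict the next frame from the current one."""
    return tuple(x + dx for x, dx in zip(frame, codebook[action]))


# Unlabeled "video": a sprite moved by hidden key presses, with slight jitter.
true_moves = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # never shown to the learner
frames = [(0.0, 0.0)]
for i in range(40):
    dx, dy = true_moves[i % 4]
    jitter = 0.05 * ((i % 3) - 1)
    x, y = frames[-1]
    frames.append((x + dx + jitter, y + dy))

codebook, labels = learn_latent_actions(frames, num_actions=4)
next_frame = play(frames[0], codebook, labels[0])
```

The learner never sees `true_moves`; it only recovers four codes that behave like the hidden keys, which is the sense in which a recording becomes playable. Which code index maps to which key is arbitrary.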

Distribution · by year

How the 14 papers spread across 2018 to 2025.

Legend: founder · classic · frontier

All papers · sorted by era

All 14 papers in World Model & Video Policy.

era	year	title	venue
founder	2018	World Models	NeurIPS
founder	2020	Dream to Control: Learning Behaviors by Latent Imagination	ICLR
classic	2021	Mastering Atari with Discrete World Models	ICLR
classic	2022	DayDreamer	CoRL
classic	2023	Transformers are Sample-Efficient World Models	ICLR
classic	2023	TWM: Transformer-based World Models	ICLR
classic	2025	Dreamer V3: Mastering Diverse Domains through World Models	Nature
frontier	2023	GAIA-1	arXiv
frontier	2024	Genie: Generative Interactive Environments	ICML
frontier	2024	UniSim	ICLR
frontier	2025	1X World Model Challenge	arXiv
frontier	2025	Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control	arXiv
frontier	2025	Cosmos World Foundation Model Platform	arXiv
frontier	2025	Navigation World Models	CVPR