Emergent Semantic Representations in World Models through Physical Interaction without Linguistic Supervision

AI & ML · May 29, 2026 · via arXiv

arXiv:2605.28865v1. Abstract: What does a world model learn from physical exploration, without any linguistic supervision? We argue the answer is organized by a single principle: the geometric structure of the physical world. Training a VAE-based world model on random embodied exploration, we find that its latent space develops spatial semantic structure that mirrors physical geometry -- direction accuracy 0.677 ± 0.029 versus 0.547 for a randomly initialized encoder, and posit
