Chuck Darwin<p>If scientists want to determine whether an LLM has formed an accurate model of the world, measuring the accuracy of its predictions doesn’t go far enough.<br>For example, a transformer can predict valid moves in a game of Connect 4 nearly every time without understanding any of the rules.<br>So, the team developed two new metrics that can test a transformer’s world model. The researchers focused their evaluations on a class of problems called deterministic finite automata, or DFAs. <br>A DFA is a problem with a sequence of states, like intersections one must traverse to reach a destination, and a concrete way of describing the rules one must follow along the way.<br>They chose two problems to formulate as DFAs: <br>navigating on streets in New York City <br>and playing the board game Othello.</p><p>“We needed test beds where we know what the world model is. Now, we can rigorously think about what it means to recover that world model,” Vafa explains.<br>The first metric they developed, called <a href="https://c.im/tags/sequence" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>sequence</span></a> <a href="https://c.im/tags/distinction" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>distinction</span></a>, says a model has formed a coherent world model if it sees two different states, like two different Othello boards, and recognizes how they are different. 
Sequences, that is, ordered lists of data points, are what transformers use to generate outputs.<br>The second metric, called <a href="https://c.im/tags/sequence" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>sequence</span></a> <a href="https://c.im/tags/compression" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>compression</span></a>, says a transformer with a coherent world model should know that two identical states, like two identical Othello boards, have the same sequence of possible next steps.<br>They used these metrics to test two common classes of transformers: one trained on data generated from randomly produced sequences, and the other trained on data generated by following strategies.<br>Surprisingly, the researchers found that transformers trained on random choices formed more accurate world models, perhaps because they saw a wider variety of potential next steps during training. <br>“In Othello, if you see two random computers playing rather than championship players, in theory you’d see the full set of possible moves, even the bad moves championship players wouldn’t make,” Vafa explains.<br>Even though the transformers generated accurate directions and valid Othello moves in nearly every instance, <br>the two metrics revealed that only one generated a coherent world model for Othello moves, <br>and none performed well at forming coherent world models in the wayfinding example.<br><a href="https://news.mit.edu/2024/generative-ai-lacks-coherent-world-understanding-1105" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">news.mit.edu/2024/generative-a</span><span class="invisible">i-lacks-coherent-world-understanding-1105</span></a></p>
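<p>The two metrics are easier to see in code. Below is a minimal Python sketch, not the paper’s implementation: the toy three-state DFA, the function names, and the stand-in “model” are all invented for illustration. The idea is that a DFA’s states are fully characterized by which continuations they allow, so a model with a coherent world model should compress sequences that reach the same state (same possible next steps) and distinguish sequences that reach different states. For simplicity this sketch compares only one-step-ahead next-symbol sets; the real metrics compare longer continuations.</p>

```python
# Toy ground-truth "world": a 3-state DFA over the alphabet {'a', 'b'}.
# (Invented for illustration; the paper uses Othello and NYC navigation DFAs.)
DFA = {  # state -> {symbol: next_state}
    0: {'a': 1, 'b': 0},
    1: {'a': 2, 'b': 0},
    2: {'a': 2},          # 'b' is invalid from state 2
}

def run(seq, start=0):
    """Return the DFA state reached after consuming seq, or None if invalid."""
    state = start
    for sym in seq:
        if sym not in DFA[state]:
            return None
        state = DFA[state][sym]
    return state

def valid_next(seq):
    """The set of symbols the DFA actually allows after seq."""
    state = run(seq)
    return set(DFA[state]) if state is not None else set()

def model_next(seq):
    """Stand-in for a trained model's predicted next steps. Here it just
    queries the DFA, so both metrics pass; for a real transformer this
    would threshold the model's next-token probabilities."""
    return valid_next(seq)

def sequence_compression(s1, s2):
    """Two sequences that reach the SAME state should get the same
    predicted next-step set from the model."""
    assert run(s1) == run(s2), "compression is only tested on same-state pairs"
    return model_next(s1) == model_next(s2)

def sequence_distinction(s1, s2):
    """Two sequences that reach DIFFERENT states should be told apart:
    the model should allow some step after one that it rejects after
    the other. (One-step lookahead only, a simplification.)"""
    assert run(s1) != run(s2), "distinction is only tested on different-state pairs"
    return model_next(s1) != model_next(s2)

print(sequence_compression('ab', 'b'))   # both reach state 0 -> True
print(sequence_distinction('aa', 'a'))   # states 2 vs. 1 -> True
```

<p>The stand-in model passes both checks by construction; the point of the metrics is that a transformer can predict valid moves almost perfectly (high next-step accuracy) while still failing them, which is exactly what the researchers observed in the wayfinding case.</p>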