Essays about game development, thinking and books

Reasoning LLMs are Wandering Solution Explorers ru en

Illustration of the problem (c) ChatGPT

Illustration of the problem (c) ChatGPT

An interesting paper has appeared on arXiv, arguing that modern reasoning LLMs engage more in "random wandering in the solution space" than in "systematic problem-solving".

The main text is about 10 pages of fairly accessible reading — I recommend checking it out.

Here's what the authors did:

  1. Formalized the notions of "systematic exploration of the solution space" and "random wandering within the solution space".
  2. Built a very simple and illustrative model of how these mechanisms work.
  3. Using that model, they showed that random wandering can easily be mistaken for systematic exploration — especially when you have plenty of compute power.
  4. They also showed that the effectiveness of random wandering degrades sharply as the task complexity exceeds available resources.
  5. Formalized real-world problems as strictly defined tasks with structured solution spaces.
  6. Tested modern LLMs on these tasks and demonstrated that their behavior more closely resembles random wandering than systematic exploration.

I mostly agree with the authors' idea, though I wouldn't say the paper is flawless. There's a chance they didn't use LLMs quite correctly, and that the tasks were formalized in a way unfavorable to them.

However, the primary value of the paper isn't in its final conclusions, but in its excellent formalization of the solution-search process, the concepts of "random wandering" and "systematic exploration", and especially their simplified behavioral model.

If you're interested in the question of "whether LLMs actually think" (or, more broadly, in methods of problem-solving) this paper offers a promising angle of attack on the problem.