Essays about game development, thinking and books

Reasoning LLMs are Wandering Solution Explorers

Illustration of the problem (c) ChatGPT

An interesting paper has appeared on arXiv, arguing that modern reasoning LLMs engage more in "random wandering in the solution space" than in "systematic problem-solving".

The main text is about 10 pages of fairly accessible reading — I recommend checking it out.

Here's what the authors did:

  1. Formalized the notions of "systematic exploration of the solution space" and "random wandering within the solution space".
  2. Built a very simple and illustrative model of how these mechanisms work.
  3. Using that model, they showed that random wandering can easily be mistaken for systematic exploration — especially when you have plenty of compute power.
  4. They also showed that the effectiveness of random wandering degrades sharply as the task complexity exceeds available resources.
  5. Formalized real-world problems as strictly defined tasks with structured solution spaces.
  6. Tested modern LLMs on these tasks and demonstrated that their behavior more closely resembles random wandering than systematic exploration.
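To get a feel for point 3, here is a minimal toy sketch. It is not the model from the paper, just an illustration under my own assumptions: the solution space is a flat set of `n_candidates` options with exactly one correct answer, a "systematic" explorer never checks the same candidate twice, and a "wandering" explorer picks candidates uniformly at random with repeats. The function names are hypothetical.

```python
def systematic_success(n_candidates: int, budget: int) -> float:
    """Chance of finding the single correct candidate when every step
    checks a candidate that has not been checked before."""
    return min(budget, n_candidates) / n_candidates


def wandering_success(n_candidates: int, budget: int) -> float:
    """Same chance when every step picks a candidate uniformly at random,
    so already-checked candidates may be visited again."""
    return 1.0 - (1.0 - 1.0 / n_candidates) ** budget


if __name__ == "__main__":
    n = 1000  # assumed size of the toy solution space
    for budget in (100, 1000, 5000):
        s = systematic_success(n, budget)
        w = wandering_success(n, budget)
        print(f"budget={budget:5d}  systematic={s:.3f}  wandering={w:.3f}")
```

In this toy setup, a budget several times larger than the space makes the wanderer look almost as reliable as the systematic explorer. With a budget equal to the size of the space, the systematic explorer is guaranteed to succeed, while the wanderer succeeds only about 63% of the time.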

I mostly agree with the authors' idea, though I wouldn't say the paper is flawless. There's a chance they didn't use LLMs quite correctly, and that the tasks were formalized in a way unfavorable to them.

However, the primary value of the paper isn't in its final conclusions, but in its excellent formalization of the solution-search process, the concepts of "random wandering" and "systematic exploration", and especially their simplified behavioral model.

If you're interested in the question of "whether LLMs actually think" (or, more broadly, in methods of problem-solving) this paper offers a promising angle of attack on the problem.

Vantage on management: Engineering is science is engineering

A visual depiction of engineering and scientific approaches.

In the previous post, we discussed that engineering is a creative activity that cannot be reduced to following instructions. Therefore, to manage engineering teams, it is necessary to use practices intended for creative teams.

And what could be more creative than a jazz band, or rather, science?

That's why, in this post, I will try to show that engineering is conceptually much closer to science than it may seem at first glance, and that in the modern world, these disciplines are converging ever faster. I would even bet that the boundary between them will eventually disappear.

Read more

Vantage on management: No instructions for engineering

"Engineer Seated" © ChatGPT + [Vrubel](https://en.wikipedia.org/wiki/Mikhail_Vrubel)

I have wanted to write this post for about 5 years, and I still don't fully understand why it needs to be written: in my opinion, these things are obvious.

However, there are also things in the practice and theory of work that I don't understand. For example:

Why are most management theories derived from the experience of physical, instruction-driven production rather than from the experience of engineering and scientific teams? Here, "instruction-driven" means that the work consists of following detailed instructions.

Of course, people have written many books with sets of specific practices, in the spirit of "How I was an Engineering Manager" or "How we do management at Google". However, those are not theories; they are sets of practices for specific cases. To apply these practices wisely, one must have the corresponding theory in mind.

Why do management practices for instruction-driven teams keep seeping into the management of creative teams? From attempts to lock in output quotas to using team velocity as a KPI. From trying to utilize 100% of an engineer's time to (implicitly) demanding a blood oath on every estimate. Not to mention denying autonomy in decision-making, imposing rigid schedules, and forcing work in the office.

Both questions are, of course, rhetorical.

The answer to the first one: "That's how it historically evolved" — until the 1980s, it indeed made sense to derive management, crudely speaking, from the organization of manual labor on factory floors. And even then, it wasn't always the case — fortunately, NASA took a different path. But that was half a century ago; we now live literally in the future compared to that time, yet we continue to rely on its concepts — and that's the answer to the second question.

Meanwhile, cause-and-effect relationships are still there: no matter how strong your team or how brilliant your idea, if you force them through an ill-suited mechanism — alien concepts, alien processes — you'll end up with a poor product and suffering people.

That's why in this and the next couple of posts, I want to discuss the role of creativity in engineering work: why it's critically important and where to look for inspiration in managing creative teams.

Read more

Vantage on management: What to read, when, and why

Books featured in the post.

At my last job, I sometimes found it challenging to convey my brilliant management ideas to colleagues. After some reflection, I realized I had drifted into an overly narrow and specific conceptual field, which often forced me to translate concepts from my internal representations into more or less commonly accepted terms on the fly. Not only is this hard, but people don't always see you as a smart person while you're doing it :-D

Thus, I decided to catch up with the latest achievements of human thought and, about a year ago, stocked up on 9 top-rated management books, with the idea of surveying the situation from a bird's-eye view, standardizing my glossary, and gathering arguments for the future.

I spent the whole of last year reading these books, but, against my usual habit, didn't write any reviews:

  • At my pace of writing, a review of 9 books would have taken a whole month.
  • All the books are top-tier — lots of readers, high ratings, plenty of reviews online — no point in repeating.
  • The books are essentially textbooks and/or manifestos. Writing reviews of obviously solid textbooks just doesn't make much sense.

Instead of multiple reviews, I decided to prepare a single overview post with brief descriptions of each book, recommendations on when to read it, and a few notes. You're reading it right now.

Disclaimer

The comments on the books are somewhat biased — they reflect my personal opinion.

So, let's get started.

Read more

Summary of GPT-5 presentation without marketing pixie dust

Someone "accidentally" doubled one of the bars in the chart. Someone "accidentally" reduced one of the bars in the chart by about three times.

Let he who thinks this is a coincidence cast the first stone at me.

For the record, in the scientific community, disputes over such plots can go as far as ostracizing the authors. But this is business, marketing — so it's fine, right?

  1. The LLM has become, on average, slightly (by a few percent) smarter.
  2. In some aspects, the LLM has become significantly smarter (by tens of percent).
  3. In some aspects, the LLM has become a bit dumber(!).
  4. The API has become cheaper, or not: it depends on how you use it.
  5. OpenAI is deliberately misleading people about the capabilities of the new model — see the screenshots.

=> The world's LLM leader has begun to bog down, and the rest will likely follow.

When everything is going well and you make another breakthrough, you don't cheat with the pictures.

This doesn't mean that progress has stopped. Still, it does mean that the growth of technology is transitioning from the explosive phase of "discovering new things" to a more or less steady phase of "optimizing technologies in a million directions, where there are only enough hands for a hundred."

We are close to the "disillusionment" phase of the Gartner hype cycle.

In this regard, I would like to remind you of my AI future prognosis — so far, it's holding up.

Let me add a few more thoughts.

Read more