Essays about game development, thinking and books

Top LLM frameworks may not be as reliable as you think

Nearly a month ago, I decided to add Gemini support to Feeds Fun and did some research on top LLM frameworks, since I didn't want to reinvent the wheel.

As a result, I found an embarrassing bug (in my opinion, of course) in the Gemini integration in LlamaIndex. Judging by the code, it is also present in Haystack and in the LangChain plugin. The root of the problem is in the Google SDK for Python.

When a new Gemini client is initialized, the framework code overwrites the API key of every client created earlier, because the key is stored in a singleton by default.

This is fatal if you have a multi-tenant application and unnoticeable in all other cases. Multi-tenant means that your application serves multiple users.

For example, in my case, in Feeds Fun, a user can enter their own API key to improve the quality of the service. Imagine the funny situation that could arise: a user enters an API key to process their own news, but their (paid) tokens are spent on all users of the service.
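To make the failure mode concrete, here is a minimal sketch of the pattern. It does not use the real Google SDK or any framework code: `configure`, `LeakyClient`, and `IsolatedClient` are hypothetical names invented for illustration, and the module-level dictionary only stands in for "an API key stored in a singleton".

```python
# Hypothetical stand-in for an SDK that keeps its API key in module-level state.
_global_config = {"api_key": None}


def configure(api_key: str) -> None:
    """Mimic an SDK-wide configure() call: overwrite the single shared key."""
    _global_config["api_key"] = api_key


class LeakyClient:
    """Hypothetical framework wrapper that calls configure() in its constructor."""

    def __init__(self, api_key: str) -> None:
        configure(api_key)  # silently replaces the key for all earlier clients

    def key_used_for_request(self) -> str:
        # The key is read from the shared state, not from this instance.
        return _global_config["api_key"]


class IsolatedClient:
    """Hypothetical alternative: each client keeps its own key."""

    def __init__(self, api_key: str) -> None:
        self._api_key = api_key

    def key_used_for_request(self) -> str:
        return self._api_key


# Two tenants create their clients one after another.
alice = LeakyClient("alice-key")
bob = LeakyClient("bob-key")
print(alice.key_used_for_request())  # bob-key: Alice's requests use Bob's key

safe_alice = IsolatedClient("alice-key")
safe_bob = IsolatedClient("bob-key")
print(safe_alice.key_used_for_request())  # alice-key
```

With a single user, the shared key and the per-client key are always the same, which is why the problem goes unnoticed outside multi-tenant setups.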

I reported this bug as a security issue only to LlamaIndex, and there has been no reaction for 3 weeks. I'm too lazy to reproduce and report it for Haystack and LangChain, so this is your chance to report a bug to a top repository. All the info is below; reproducing it is not difficult.

This error is notable for many reasons:

  1. The assessment of the criticality of the error depends a lot on taste, experience, and context. For me, in the projects I worked on, this is a critical security issue. However, it seems that this is not critical at all for most current projects that use LLMs, which leads to some thoughts about mainstream LLM-adjacent development.
  2. This is a good indicator of a low level of code quality control: code reviews, tests, all processes. After all, this is an integration with one of the major API providers. The problem could have been found in many different ways, but none worked.
  3. This is a good illustration of a flawed approach to development: "copy-paste from a tutorial and push to prod". To make such a mistake, you had to ignore both the basic architecture of your project and the logic of calling the code you are copying.

Ultimately, I gave up on these frameworks and implemented my own client over the HTTP API.

My conclusion from this mess is: you can't trust the code under the hood of modern LLM frameworks. You need to double-check and proofread it. Just because they state that they are "production-ready" doesn't mean they are really production-ready.

Let me tell you more about the bug.

Read more

Grainau: hiking and beer at 3000 meters

How it all looks from the ground.

For her vacation, Yuliya decided to show me the beautiful German mountains and took me for a couple of days to Grainau — it's a piece of Bavaria that's almost like Switzerland. At least, it is similar to the pictures of Switzerland that I've seen :-D

In short, it's a lovely place with a measured pace of life. If you need to catch your breath, calm your nerves, and enjoy nature, then this is the place for you. But if you can't live without parties, you'll get bored quickly.

What's there:

  • The highest mountain in Germany plus a couple of glaciers.
  • There's skiing in winter. If you really need it, you can find a place to ski in summer, but the descent is short, and the lifts are turned off.
  • A large clean lake and a couple of smaller ones.
  • A huge number of trails for hiking.
  • A huge number of waterfalls, streams, and a couple of mountain rivers.
  • Restaurants with beer.
  • Beautiful fallen trees in the forests, private property, fences, cows with bells, and "racing tractors" (I don't know a better name for this phenomenon, but tractors move fast there :-D).

That's the short version; now for the details.

Read more

Concept document for a space exploration MMO

The expected poster for the game. (c) DALL-E

As a hobby, I write concept documents for games. This is the first one in English. I have a few more in Russian and will eventually translate them.

One more concept for The Tale 2.0.

Title

Lords Captains MMO

Yep, it's a rip-off from Warhammer 40k and Rogue Trader, but it will do for the concept.

One-liner

Explore the infinite universe on a starship with millions of souls on board, unite and develop abandoned worlds.

Platforms

Browsers, mobile.

Genre

Exploration-driven trade-political MMO PVE sandbox.

Closest analogs

EVE, Sim City, Crusader Kings, 4X games, Rogue Trader.

Read more

Procedural news headlines without complex text generation

A screenshot of the interface for selecting a news connotation (from the prototype of the game about a news agency). News: the arrest of a teenage witch for drunk driving.

From the player preference survey, I gradually moved on to working on a game prototype.

The game will be about a news agency. You will be the chief editor, and your task is to manipulate public opinion by investigating events and choosing a connotation of news: where to draw the public's attention, what to hide, in what tone to present themes, etc.

Therefore, the whole game will be around the text of news.

Creating large blocks of detailed text for each news item looks pointless: the game is not about reading the news but about managing it. Therefore, it makes sense to build interaction only around headlines.

But how can we make displaying news both interesting and simple?

Read more

Preferences of strategy players

Looking at the survey data and trying to find something useful.

I recently conducted a survey about the preferences of strategy players.

In the previous post, we cleaned up the data, and in this one, we will try to find insights within it.

In this post, you will find an interactive dashboard with a bunch of charts, where you can compare two samples of your choice. There are many samples to suit every taste, so feel free to explore and share the patterns you find on Telegram and Discord.

But be careful with conclusions. There is little data, in some cases very little. For example, the sample sizes of male and female respondents differ about tenfold, so you should be very careful when interpreting the differences between them.

In general, do not take this post as a full-fledged study. I'm sure many analysts would have torn my hands off for such a thing, then sewn them back on and torn them off again :-D Use the post as an interface to the data, and draw your own conclusions.

Read more