Nearly a month ago, I decided to add Gemini support to Feeds Fun and did some research on top LLM frameworks — I didn't want to write my own bicycle.
As a result, I found an embarrassing bug (in my opinion, of course) in the integration with Gemini in LLamaIndex. Judging by the code, it is also present in Haystack and in the plugin for LangChain. And the root of the problem is in the Google SDK for Python.
When initializing a new client for Gemini, the framework code overwrites/replaces API keys in all clients created before. Because the API key, by default, is stored in a singleton.
It is death-like, if you have a multi-tenant application, and unnoticeable in all other cases. Multi-tenant means that your application works with multiple users.
For example, in my case, in Feeds Fun, a user can enter their API key to improve the quality of the service. Imagine what a funny situation could happen: a user entered an API key to process their news but spent tokens (paid for) for all service users.
I reported this bug only in LLamaIndex as a security issue, and there has been no reaction for 3 weeks. I'm too lazy to reproduce and report for Haystack and LangChain. So this is your chance to report a bug to a top repository. All the info will be below, reproducing is not difficult.
This error is notable for many reasons:
Ultimately, I gave up on these frameworks and implemented my own client over HTTP API.
My conclusion from this mess is: you can't trust the code under the hood of modern LLM frameworks. You need to double-check and proofread it. Just because they state that they are "production-ready" doesn't mean they are really production-ready.
Let me tell you more about the bug.
Recently OpenAI released GPT-4o-mini — a new flagship model for the cheap segment, as it were.
Of course, I immediately started migrating my news reader to this model.
In short, it's a cool replacement for GPT-3.5-turbo. I immediately replaced two LLM agents with one without changing prompts, reducing costs by a factor of 5 without losing quality.
However, then I started tuning the prompt to make it even cooler and began to encounter nuances. Let me tell you about them.
This is a translation of a post from 2020
This is a step-by-step guide to generating dungeons in Python. If you are not a programmer, you may be interested in reading how to design a dungeon [ru].
I spent a few evenings testing the idea of generating space bases.. The space base didn't work out, but the result looks like a good dungeon. Since I went from simple to complex and didn't use rocket science, I converted the code into a tutorial on generating dungeons in Python.
By the end of this tutorial, we will have a dungeon generator with the following features:
The entire code can be found on github.
There won't be any code in the post — all the approaches used can be easily described in words. At least, I think so.
Each development stage has a corresponding tag in the repository, containing the code at the end of the stage.
The aim of this tutorial is not only to teach how to generate dungeons but to demonstrate that seemingly complex tasks can be simple when properly broken down into subtasks.
I continue participating in World Builders school. For the last month, I've created a technical prototype of game mechanics for manipulating public opinion.
You play as the chief editor of a news agency, who sends journalists on quests and publishes articles based on the results of investigations focusing on themes that you want to promote.
The top video is in Russian, so I'll go through the main points below.
From the player preference survey, I gradually moved on to working on a game prototype.
The game will be about a news agency. You will be the chief editor, and your task is to manipulate public opinion by investigating events and choosing a connotation of news: where to draw the public's attention, what to hide, in what tone to present themes, etc.
Therefore, the whole game will be around the text of news.
Creating large blocks of detailed text for each news item looks pointless — the game is not about reading news but about managing them. Therefore, it makes sense to build interaction only around headlines.
But how can we make the displaying of news both interesting and simple?