Top LLM frameworks may not be as reliable as you think

Nearly a month ago, I decided to add Gemini support to Feeds Fun and did some research on top LLM frameworks, since I didn't want to reinvent the wheel.

As a result, I found an embarrassing bug (in my opinion, of course) in the Gemini integration in LlamaIndex. Judging by the code, it is also present in Haystack and in the plugin for LangChain. The root of the problem is in the Google SDK for Python.

When a new Gemini client is initialized, the framework code overwrites the API key in all previously created clients, because the API key is, by default, stored in a singleton.

This is deadly if you have a multi-tenant application and unnoticeable in all other cases. Multi-tenant means that your application serves multiple users.

For example, in my case, in Feeds Fun, a user can enter their own API key to improve the quality of the service. Imagine the funny situation that could follow: a user enters an API key to process their own news but ends up paying for the tokens spent by all the users of the service.
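The failure mode described above can be sketched in a few lines. This is a simplified illustration, not code from any of the frameworks or from the Google SDK: `_GlobalConfig`, `configure`, `SharedKeyClient`, and `IsolatedKeyClient` are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for an SDK that keeps the API key in module-level
# (singleton) state. None of these names come from a real library.
class _GlobalConfig:
    api_key: Optional[str] = None


def configure(api_key: str) -> None:
    """Store the key globally, the way a singleton-style SDK would."""
    _GlobalConfig.api_key = api_key


class SharedKeyClient:
    """Buggy wrapper: it 'remembers' its key only through the global config."""

    def __init__(self, api_key: str) -> None:
        # Silently replaces the key for every client created earlier.
        configure(api_key)

    def key_used_for_request(self) -> Optional[str]:
        return _GlobalConfig.api_key


@dataclass
class IsolatedKeyClient:
    """Safer wrapper: each instance keeps its own key."""

    api_key: str

    def key_used_for_request(self) -> str:
        return self.api_key


# Two tenants create their clients one after another.
alice = SharedKeyClient("alice-key")
bob = SharedKeyClient("bob-key")
leaked_key = alice.key_used_for_request()  # Alice's requests now use Bob's key

safe_alice = IsolatedKeyClient("alice-key")
safe_bob = IsolatedKeyClient("bob-key")
safe_key = safe_alice.key_used_for_request()  # still Alice's own key
```

With the shared variant, whoever initialized a client last pays for everyone; with per-instance storage, the order of initialization does not matter.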

I reported this bug as a security issue only to LlamaIndex, and there has been no reaction for 3 weeks. I'm too lazy to reproduce and report it for Haystack and LangChain, so this is your chance to report a bug to a top repository. All the info is below; reproducing it is not difficult.

This error is notable for several reasons:

The assessment of how critical the error is depends a lot on taste, experience, and context. For me, in the projects I have worked on, this is a critical security issue. However, it seems not to be critical at all for most current projects that use LLMs, which prompts some thoughts about mainstream LLM-adjacent development.
This is a good indicator of weak code quality control: code reviews, tests, all the processes. After all, this is an integration with one of the major API providers. The problem could have been caught in many different ways, but none of them worked.
This is a good illustration of a flawed approach to development: "copy-paste from a tutorial and push to prod". To make such a mistake, you have to ignore both the basic architecture of your project and the logic of the code you are copying.

Ultimately, I gave up on these frameworks and implemented my own client on top of the HTTP API.

My conclusion from this mess is: you can't trust the code under the hood of modern LLM frameworks. You need to double-check and proofread it. Just because they state that they are "production-ready" doesn't mean they are really production-ready.

Let me tell you more about the bug.

Grainau: hiking and beer at 3000 meters

For her vacation, Yuliya decided to show me the beautiful German mountains and took me for a couple of days to Grainau — it's a piece of Bavaria that's almost like Switzerland. At least, it is similar to the pictures of Switzerland that I've seen :-D

In short, it's a lovely place with a measured pace of life. If you need to catch your breath, calm your nerves, and enjoy nature, then this is the place for you. But if you can't live without parties, you'll get bored quickly.

What's there:

The highest mountain in Germany plus a couple of glaciers.
There's skiing in winter. If you really need it, you can find a place to ski in summer, but the descent is short, and the lifts are turned off.
A large clean lake and a couple of smaller ones.
A huge number of trails for hiking.
A huge number of waterfalls, streams, and a couple of mountain rivers.
Restaurants with beer.
Beautiful fallen trees in the forests, private property, fences, cows with bells, and "racing tractors" (I don't know how to name this phenomenon better, but tractors are moving fast there :-D).

That's the short version; now for the details.

Computational mechanics & ε- (epsilon) machines

I found a few new concepts worth keeping track of.

Computational mechanics

There is a computational mechanics that deals with numerical modeling of mechanical processes, and there is an article about it on Wikipedia. This post is not about it.

This post is about the computational mechanics that studies abstractions of complex processes: how emergent behavior arises from the sum of the behavior or statistics of low-level processes. For example, why the Great Red Spot on Jupiter is stable, or why the result of a processor's calculations does not depend on the properties of each electron in it.

ε- (epsilon) machine

An ε-machine is a conceptual device that can exist in a finite set of states and can predict its future state (or state distribution?) based on the current one.

Computational mechanics allows (or should allow) us to represent complex systems as a hierarchy of ε-machines. This creates a formal language for describing complex systems and emergent behavior.

For example, our brain can be represented as an ε-machine. Formally, the state of the brain never repeats (voltages on neurons, positions of neurotransmitter molecules, etc.), but there is a huge number of situations in which we do the same thing under the same conditions.
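To make the idea of "a finite set of states that predicts what comes next" more concrete, here is a toy sketch. It is only an illustration under my own assumptions, not the formal construction from computational mechanics: the two states, the transition table, and all function names are hypothetical. The machine emits 0s and 1s and never produces two 0s in a row, and its prediction of the next symbol depends only on the current state.

```python
import random

# Hypothetical two-state machine: each state maps to a list of
# (probability, emitted_symbol, next_state) options.
TRANSITIONS = {
    "A": [(0.5, "1", "A"), (0.5, "0", "B")],
    "B": [(1.0, "1", "A")],  # after a 0, the next symbol is always 1
}


def step(state, rng):
    """Sample one (symbol, next_state) pair from the current state."""
    roll = rng.random()
    cumulative = 0.0
    for probability, symbol, next_state in TRANSITIONS[state]:
        cumulative += probability
        if roll < cumulative:
            return symbol, next_state
    # Guard against floating-point rounding: fall back to the last option.
    _, symbol, next_state = TRANSITIONS[state][-1]
    return symbol, next_state


def generate(length, seed=0, start="A"):
    """Run the machine for `length` steps and return the emitted symbols."""
    rng = random.Random(seed)
    state = start
    symbols = []
    for _ in range(length):
        symbol, state = step(state, rng)
        symbols.append(symbol)
    return "".join(symbols)


def next_symbol_distribution(state):
    """Prediction that depends only on the current state, not on the history."""
    distribution = {}
    for probability, symbol, _ in TRANSITIONS[state]:
        distribution[symbol] = distribution.get(symbol, 0.0) + probability
    return distribution


sequence = generate(200)
```

However long the emitted history gets, knowing whether the machine is in state "A" or "B" is enough to predict the next symbol as well as possible for this process; that compression of history into a few states is the part of the concept the sketch tries to show.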

Here is a popular science explanation: https://www.quantamagazine.org/the-new-math-of-how-large-scale-order-emerges-20240610/

P.S. I will try to dig into the scientific articles. I will tell you if I find something interesting and practical.

P.P.S. I have long been thinking along similar lines. Unfortunately, the twists of life do not allow me to seriously dig into science and mathematics. I am always happy when I encounter the results of other people's digging.

2024-06-11

evolution, systems, theory, thinking

Two years of writing RFCs — statistics

Slightly more than two years ago, I became a Lead/Engineering Manager for Palta's payment team. I left the company at the end of 2023 for another sabbatical [ru].

It is time to sum up. I will start with my favorite initiative.

From the first month, I promoted the idea of preceding major changes with text documents — RFC — Request for Comments.

In this post, I will analyze two years of applying this practice to share the experience, summarize the results, and have convincing arguments for my next job.

Hello, World!

Nice to meet you, friends!

My name is Aliaksei, but feel free to call me Tiendil. It has been my nickname for the last 20 years or so :-)

A few words about me:

By occupation, I am a software developer, mostly backend, mainly in Python.
For most of my career, I've been working in game development, on big projects and on my own indie games.
I like playing games, reading books, and writing long-reads about partially complex topics.

You can find more about me:

On the about page
In my GitHub profile: github.com/tiendil
In my LinkedIn profile: linkedin.com/in/tiendil

This is my first blog post in English, but not the first one in general. I have blogged in Russian for a long time and have always wanted to share my thoughts with the English-speaking world. At last, I found some time to adapt my blog, and here we are!

Most of the future posts will be bilingual (English & Russian). Also, with time, I'll translate my most interesting old posts.

Once again, nice to meet you! Feel free to contact me by any means.

2024-02-21

blog, practice