Nearly a month ago, I decided to add Gemini support to Feeds Fun and did some research on top LLM frameworks, because I didn't want to reinvent the wheel.
As a result, I found an embarrassing bug (in my opinion, of course) in the Gemini integration in LlamaIndex. Judging by the code, it is also present in Haystack and in the plugin for LangChain. The root of the problem is in the Google SDK for Python.
When initializing a new client for Gemini, the framework code overwrites the API keys in all clients created before, because the API key is stored in a singleton by default.
This is fatal if you have a multi-tenant application and unnoticeable in all other cases. Multi-tenant means that your application works with multiple users.
For example, in my case, in Feeds Fun, a user can enter their own API key to improve the quality of the service. Imagine the funny situation that could follow: a user enters an API key to process their own news and ends up paying for the tokens of every user of the service.
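To make the failure mode concrete, here is a toy reproduction of the pattern. It is not the real SDK or framework code: `configure`, `SdkModel`, and `FrameworkClient` are hypothetical names, and the only assumption carried over from the description above is that the key lives in shared module-level state that every new client overwrites.

```python
# Toy model of the bug: NOT the real Google SDK or any framework's code.
# All names here are hypothetical and exist only for illustration.

# Stand-in for an SDK that keeps its configuration in module-level state.
_global_config = {"api_key": None}


def configure(api_key: str) -> None:
    """Store the key in the shared, process-wide configuration."""
    _global_config["api_key"] = api_key


class SdkModel:
    """Stand-in for an SDK model object that reads the key at request time."""

    def key_used_for_request(self) -> str:
        return _global_config["api_key"]


class FrameworkClient:
    """Stand-in for a framework wrapper that takes a per-client key."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key
        configure(api_key)  # the problematic step: overwrites the shared key
        self._model = SdkModel()

    def key_used_for_request(self) -> str:
        return self._model.key_used_for_request()


alice = FrameworkClient(api_key="alice-key")
bob = FrameworkClient(api_key="bob-key")

# Alice's client still remembers her key, but requests go out with Bob's.
print(alice.api_key)                 # alice-key
print(alice.key_used_for_request())  # bob-key
```

With a single key per process nothing looks wrong, which is why the problem only shows up once two users bring their own keys.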
I reported this bug only to LlamaIndex, as a security issue, and there has been no reaction for 3 weeks. I'm too lazy to reproduce and report it for Haystack and LangChain, so this is your chance to report a bug to a top repository. All the info is below; reproducing it is not difficult.
This error is notable for many reasons:
Ultimately, I gave up on these frameworks and implemented my own client over the HTTP API.
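A minimal sketch of what such a client might look like, with the key stored on the instance and nowhere else. This is not the Feeds Fun implementation: the class name, the default model name, and the use of the public `generateContent` REST endpoint with the `x-goog-api-key` header are my assumptions for illustration.

```python
import json
import urllib.request
from dataclasses import dataclass

# Assumed endpoint of the public Gemini REST API; check the current docs.
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"
)


@dataclass(frozen=True)
class GeminiHttpClient:
    """Hypothetical client: every instance carries its own API key."""

    api_key: str
    model: str = "gemini-1.5-flash"  # assumed model name, for illustration only

    def build_request(self, prompt: str) -> urllib.request.Request:
        # The key travels with this request only; no global state is touched.
        body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]})
        return urllib.request.Request(
            GEMINI_URL.format(model=self.model),
            data=body.encode("utf-8"),
            headers={
                "Content-Type": "application/json",
                "x-goog-api-key": self.api_key,
            },
            method="POST",
        )

    def generate(self, prompt: str) -> dict:
        # Performs the actual network call and returns the parsed JSON reply.
        with urllib.request.urlopen(self.build_request(prompt)) as response:
            return json.load(response)


alice = GeminiHttpClient(api_key="alice-key")
bob = GeminiHttpClient(api_key="bob-key")

# Creating Bob's client does not change what Alice's client sends.
alice_request = alice.build_request("Summarize this news item.")
```

Because the key is an instance field, two tenants can hold clients side by side in one process without affecting each other.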
My conclusion from this mess: you can't trust the code under the hood of modern LLM frameworks. You need to double-check and proofread it. Just because they claim to be "production-ready" doesn't mean they really are.
Let me tell you more about the bug.
Recently, I unexpectedly encountered the justice system in the USA.
What conclusions can be drawn from this:
I've been using ChatGPT almost since the release of the fourth version (so for over a year now). Over this time, I've gotten pretty good at writing queries to this thing.
At some point, OpenAI allowed customizing chats with your own text instructions (look for Customize ChatGPT in the menu). Over time, I added more and more commands there, and recently the size of the instructions exceeded the allowed maximum :-)
Also, it turned out that a universal instruction set is not such a good idea: you need to adjust the instructions for different kinds of tasks; otherwise, they won't be as useful as they could be.
Therefore, I moved the instructions to GPT bots instead of customizing my chat. OpenAI calls them GPTs. They are the same chats but with a higher limit on the size of the customized instructions and the ability to upload additional texts as a knowledge base.
Someday, I'll make a GPT for this blog, but for now, I'll tell you about two GPTs I use daily:
For each, I'll provide the basic prompt with my comments.
By the way, OpenAI recently opened a GPT store. I'd be grateful if you liked my GPTs. Of course, only if they are useful to you.
I bought "The Net And The Butterfly" by mistake when I was in St. Petersburg about 5 years ago and organized a book-shopping day. I bought about 10 kilograms of books :-D, grabbed this one on autopilot without reading the contents. I thought the book would be about the network effect and the spreading of ideas, but it turned out to be about how to "manage" a brain relying on one of the neural networks in it. Which network? For the book and its content it does not matter at all.
My opinion of "The Net And The Butterfly" is twofold. On the one hand, I cannot deny its usefulness; on the other… the material could have been presented 100 times better and 3 times shorter. Sometimes, the authors walk on thin ice and risk falling into information peddling/marketing fraud.
Slightly more than two years ago, I became a Lead/Engineering Manager for Palta's payment team. I left the company at the end of 2023 for another sabbatical [ru].
It is time to sum up. I will start with my favorite initiative.
From the first month, I promoted the idea of preceding major changes with text documents: RFCs (Requests for Comments).
In this post, I will analyze two years of applying this practice to share the experience, summarize the results, and have convincing arguments for my next job.