Essays about game development, thinking and books

AI notes 2024: The current state

I continue my notes on AI at the end of 2024.

In the previous posts, we discussed two theses:

  • By analyzing the decisions of major AI developers, such as OpenAI or Google, we can make fairly accurate assumptions about the state of the AI industry.
  • All current progress is based on a single base technology — generative knowledge bases, which are large probabilistic models.

Based on these theses, let's look at the current state of the industry.

Read more

Fun case of speeding up data retrieval from PostgreSQL with Psycopg

Speed of data retrieval from PostgreSQL for each of the optimizations, as a percentage of the base implementation's speed. Note how the number of retrieved rows has little effect on execution time.

Once every year or two, I have to remember that Python, umm... is not C++. It usually happens suddenly, like this time.

After a thoughtful analysis, I got tired of waiting 10 seconds for news to load on feeds.fun, so I rolled up my sleeves and started optimizing. I almost jumped into an epic refactoring, but remembered just in time: measure first, then cut. In this case, the advice is literal: I took a profiler, py-spy, and checked what exactly was causing the slowdown.

It turned out that the problem was not in the logic as a whole, but in one very specific place where the code extracts ~100,000 rows from a PostgreSQL table, give or take 20,000. The indexes were in place, and I ran the tests with the database on a RAM disk, so everything should have been fine on the database side.

Don't be surprised by such a large number of rows:

  • Firstly, I have a substantial news flow.
  • Secondly, the reader currently assigns about 100 tags to each news item, so even 1,000 news items already produce ~100,000 tag rows.

Armed with py-spy and the psycopg sources, I went through three stages of optimization, reducing the function's execution time roughly 4x solely by changing the format of the requested columns in the SELECT query and the result-processing code.
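To make the starting point concrete, here is a minimal sketch of the kind of baseline fetch being optimized, written with psycopg 3. The table and column names (news_entry_tags, entry_id, tag) and the DSN handling are my illustrative assumptions, not the actual feeds.fun code.

```python
import psycopg

# Hypothetical baseline: pull ~100,000 (entry_id, tag) rows as Python tuples
# and group them in application code. All names here are illustrative only.
def load_tags(dsn: str) -> dict[int, list[str]]:
    tags_by_entry: dict[int, list[str]] = {}
    with psycopg.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT entry_id, tag FROM news_entry_tags")
            for entry_id, tag in cur.fetchall():
                tags_by_entry.setdefault(entry_id, []).append(tag)
    return tags_by_entry
```

The optimizations described in the post keep this overall shape: only the format of the columns in the SELECT and the row-processing loop change.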

In the following text, I will tell you about the sequence of curious discoveries I made in the process.

Attention!

This post is not a study of Psycopg or Python performance, but a description of a specific experience with a specific task and specific data.

It would be incorrect to judge the performance of Psycopg, or Python in general, based on a single specific case.

Read more

Prompt engineering: building prompts from business cases

Ponies are building prompts (c) ChatGPT

As you know, one of the features of my news reader is automatic tag generation using LLMs. That's why I periodically do prompt engineering: I want the tags to be better and the monthly bills to be lower.
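For context, here is a minimal sketch of what LLM-based tag generation can look like. The model name, prompt wording, and the use of the openai client are illustrative assumptions on my part, not the reader's actual implementation.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative instruction only; the whole point of prompt engineering
# is how this text is phrased.
SYSTEM_PROMPT = "Assign topical tags to the news item below. Return one tag per line."

def generate_tags(news_text: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": news_text},
        ],
    )
    content = response.choices[0].message.content or ""
    return [line.strip() for line in content.splitlines() if line.strip()]
```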

So, I fine-tuned the prompts to the point where everything seemed to work, but there was still this nagging feeling that something was off: the correct tags were detected well, but on top of them, many useless ones were generated, and sometimes even completely wrong ones.

There are a few options in such cases:

  1. Collect training data and fine-tune the model to generate only correct tags.
  2. Build a chain of actors, where one will create tags, and the other will filter out the unnecessary ones.
  3. Try to radically rework the prompt.

Options 1 and 2 were out of the question due to lack of time and money. Also, my current strategy is to rely exclusively on ready-made AI solutions since keeping up with the industry alone is impossible. So, I had no choice but to go with the third option.

Progress was slow, but after a recent post about generative knowledge bases, something clicked in my head, the problem turned inside out, and over the course of the morning, I drafted a new prompt that’s been performing significantly better so far.

So, let's look at the problem with the old prompt and how the new one fixed it.

Read more

AI notes 2024: Generative knowledge base

I continue my notes on AI at the end of 2024.

Today, I want to discuss the disruptive technology that underlies modern AI achievements. Or the concept, or the meta-technology, whichever term works best for you.

You’ve probably never come across the logic described below on the internet (except for the introduction about disruptive technologies); engineers and mathematicians might get a bit annoyed by the oversimplifications and corner-cutting. But this is the lens through which I view the industry, assess what's possible and what's less likely, and so on. My blog, my rules, my dictionary :-D

So, keep in mind, this is my personal view, not a generally accepted one.

Read more

AI notes 2024: Industry transparency

Nearly a year and a half ago, I published a major forecast on artificial intelligence [ru]. Read it if you haven't already; it still holds up well.

Recently, I've decided to expand on the forecast, but a sizeable, comprehensive post isn't coming together, so there will be a series of smaller notes.

I'll start with industry transparency: the current AI movement has several impressive aspects that I'd like to discuss.

Read more