
How to Process 10K Bank Transactions Through an LLM

April 14, 2024 • Matthew Duong • Self Hosting • 3 min read

What are LLM Context Limits and How to Overcome Them?

LLMs like GPT have changed how we think about AI, generating text and insights from both their extensive training data and the context you supply. However, these models have a limitation known as the "context window", which can be thought of as their "working memory". This article explores why context windows matter, where their limits show up, and strategies for working around them.

What is the Context Window?

The context window of an LLM is the amount of text it can "remember", or consider, when generating a response. This limit is measured in "tokens". For English text, these rough equivalences hold:

  • 1 token ≈ 4 characters
  • 1 token ≈ 0.75 words
  • 100 tokens ≈ 75 words
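These are only rules of thumb; for exact counts you can tokenize text locally with OpenAI's tiktoken library. A minimal sketch:

```python
# Minimal sketch: exact token counts via OpenAI's tiktoken library.
# pip install tiktoken
import tiktoken

def count_tokens(text: str, model: str = "gpt-4") -> int:
    """Return the number of tokens `text` occupies for the given model."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

sentence = "LLMs like GPT have changed how we think about AI."
print(count_tokens(sentence))  # roughly len(sentence) / 4
```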

The latest GPT-4 models can consider significantly more tokens at once than their predecessors. Detailed context window sizes for the various models are listed in OpenAI's model guide.

Why is the Context Window Important?

Imagine you want to analyze a 500-word document, but your model's context window can only accommodate 250 words at a time. Only the latter half of the document remains in the model's context, so any analysis or response it generates will only reflect the second half. The result is an incomplete understanding and responses that can seem out of context.
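To make the truncation concrete, here is a minimal sketch (word-based for simplicity; real models truncate in tokens, and exact behavior varies by provider):

```python
def fit_to_window(document: str, window_words: int = 250) -> str:
    """Illustration only: keep the last `window_words` words, mimicking
    how earlier text falls out of a full context window. Real models
    measure in tokens, not words."""
    words = document.split()
    return " ".join(words[-window_words:])

doc = " ".join(f"word{i}" for i in range(500))  # a 500-"word" document
visible = fit_to_window(doc)
print(visible.split()[0])  # "word250" -- the first half is gone
```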

Imagine you're cooking a complex recipe but can only remember a few steps and ingredients at a time. Like a limited context window, this might cause you to forget key spices or miss important steps, potentially leading to a less successful dish.

Another analogy for a small context window is trying to compose a coherent text message on your phone while under the influence of alcohol. When you revisit what you've written the next morning, you often find the sentences disjointed and lacking cohesion.

How do you pack more into your Context Window?

At a high level, the goal of overcoming context window limitations is to reduce the number of tokens required to convey the exact same message and meaning. There are two benefits:

  1. Preserve context on a large amount of data or text.
  2. Reduce API charges, since most LLMs bill on a per-token basis (see the rough estimate below).
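As a back-of-the-envelope illustration of the second point (the per-token price below is an assumption, not a quote; check your provider's current rates):

```python
# Back-of-the-envelope savings estimate. The price below is an
# assumption for illustration, not a quote from any provider.
PRICE_PER_1K_INPUT_TOKENS = 0.01  # USD, assumed

def input_cost(tokens: int) -> float:
    """Cost in USD to send `tokens` input tokens at the assumed rate."""
    return tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS

raw_tokens = 400_000      # e.g. 10K transactions at ~40 tokens each
compressed_tokens = 600   # e.g. 12 monthly summary rows
print(f"raw: ${input_cost(raw_tokens):.2f}")                # raw: $4.00
print(f"compressed: ${input_cost(compressed_tokens):.4f}")  # compressed: $0.0060
```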

Data Compression

In my day job, we are developing an assistant for a client that interfaces with a database of financial records. The assistant can "call" tools, which in this scenario are database queries that fetch data. A typical request might ask for the "evolution of accounts in 2023", and the assistant would display the financial data as a graph.
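As a rough sketch of how such a tool might be exposed to the model, here is an OpenAI-style function definition; the name get_account_evolution and its parameters are illustrative, not the client's actual schema:

```python
# Illustrative OpenAI-style tool definition. The function name and
# parameters are hypothetical, not the client's actual schema.
get_account_evolution_tool = {
    "type": "function",
    "function": {
        "name": "get_account_evolution",
        "description": "Fetch account balance history for a date range.",
        "parameters": {
            "type": "object",
            "properties": {
                "account_id": {"type": "string"},
                "start_date": {"type": "string", "description": "ISO date"},
                "end_date": {"type": "string", "description": "ISO date"},
            },
            "required": ["account_id", "start_date", "end_date"],
        },
    },
}
```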

However, the data comes back in the form of raw transactions like this:

| ID   | Date       | Description                 | Debit    | Credit   |
|------|------------|-----------------------------|----------|----------|
| 0001 | 2024-04-01 | Opening Balance             | 0.00     | 0.00     |
| 0002 | 2024-04-02 | Purchase of office supplies | 150.00   | 0.00     |
| 0003 | 2024-04-03 | Client Invoice #123         | 0.00     | 2,000.00 |
| 0004 | 2024-04-04 | Bank Fee                    | 15.00    | 0.00     |
| 0005 | 2024-04-05 | Rent Payment                | 1,200.00 | 0.00     |
| 0006 | 2024-04-06 | Sale of Product XYZ         | 0.00     | 1,500.00 |
| 0007 | 2024-04-07 | Coffee for office           | 45.00    | 0.00     |
| 0008 | 2024-04-08 | Internet bill               | 100.00   | 0.00     |
| 0009 | 2024-04-09 | Transfer to savings         | 500.00   | 0.00     |
| ...  | ...        | ...                         | ...      | ...      |
| 9999 | 2024-12-28 | Client Invoice #124         | 0.00     | 750.00   |

You can imagine this table extending to hundreds or thousands of transactions, well over any context window limit. To manage the data efficiently, the assistant runs it through an aggregation pipeline that rolls the transactions up into monthly balance summaries with growth rates, significantly reducing the volume of data sent to the model. The result of such an aggregation looks like the following:

| Month     | Balance (USD) | MoM Growth Rate (%) |
|-----------|---------------|---------------------|
| January   | 10,000.00     | -                   |
| February  | 12,345.67     | 23.46               |
| March     | 9,876.54      | -19.99              |
| April     | 13,210.78     | 33.79               |
| May       | 10,005.89     | -24.27              |
| June      | 14,567.12     | 45.59               |
| July      | 11,234.56     | -22.85              |
| August    | 15,000.33     | 33.61               |
| September | 12,250.00     | -18.34              |
| October   | 16,789.45     | 37.06               |
| November  | 13,555.55     | -19.29              |
| December  | 18,000.99     | 32.85               |
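Here is a minimal sketch of such an aggregation using pandas; the column names mirror the transaction table above, while the opening balance and the netting logic are simplified assumptions rather than the production pipeline:

```python
import pandas as pd

# A few transactions in the shape of the table above.
transactions = pd.DataFrame({
    "date": pd.to_datetime(["2024-04-02", "2024-04-05", "2024-04-06"]),
    "debit": [150.00, 1200.00, 0.00],
    "credit": [0.00, 0.00, 1500.00],
})

OPENING_BALANCE = 10_000.00  # assumed starting balance

# Net movement per month, then a running balance and MoM growth rate.
monthly_net = (
    transactions
    .assign(net=transactions["credit"] - transactions["debit"])
    .set_index("date")
    .resample("ME")["net"]  # month-end frequency ("M" in older pandas)
    .sum()
)
balance = OPENING_BALANCE + monthly_net.cumsum()
summary = pd.DataFrame({
    "balance": balance.round(2),
    "mom_growth_pct": (balance.pct_change() * 100).round(2),
})

print(summary)  # one row per month instead of one per transaction
```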

The Future of Context Windows

In my opinion, the issue of context window limits in large language models is a temporary technical limitation. Drawing a parallel to the early days of computing where memory was scarce, I see a similar trajectory with context windows. As technology advances, these limits are likely to expand significantly, making current optimization concerns for context windows less relevant over time.

Similarly, the evolution of programming practices offers a relevant analogy. Early programmers had to optimize code to fit within strict CPU and memory limits imposed by the hardware. Today, with substantial improvements in hardware, those constraints are far less pressing. Likewise, as context windows in LLMs expand, the focus on optimizing within these limits will likely fade, letting engineers focus on the business problems themselves.
