I recently did performance testing of Tigerbeetle for a financial transactions c...

rbatiati · 2025-11-11T13:28:02 1762867682

Hi! Rafael from TigerBeetle here!

> One way to keep those faster speeds in Tigerbeetle for real-time workloads is microbatching incoming real-time transactions to Tigerbeetle at an interval of every second or lower, to take advantage of Tigerbeetle's blazing fast batch processing speeds.

We don’t recommend artificially holding transfers just for batching purposes. René actually had to implement a batching worker API to work around a limitation in Python’s FastAPI, which handled requests per process, and he’s been very clear that it would be better reimplemented in Go.

Unlike most connection-oriented database clients, the TigerBeetle client doesn’t use a connection pool, because there’s no concept of a “connection” in TigerBeetle’s VSR protocol.

This means that, although you can create multiple client instances, in practice fewer is better. You should have a single long-lived client instance per process, shared across tasks, coroutines, or threads (think of a web server handling many concurrent requests).

In such a scenario, the client can efficiently pack multiple events into the same request, while your application logic focuses solely on business-event-oriented chains of transfers. Typically, each business event involves only a handful of transfers, which isn't a problem of underutilization, as they'll be submitted together with other concurrent events as soon as possible.

However, if you’re dealing with a non-concurrent workload, for example, a batch process that bills thousands of customers for their monthly invoices, then you can simply submit all transfers at once.
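For the non-concurrent case, "submit all transfers at once" still has to respect whatever per-request limit the client enforces. Here is a minimal sketch of splitting a large job into requests, assuming a hypothetical `Transfer` type and a `maxBatch` value borrowed from the 8,189 figure quoted elsewhere in this thread. Neither is the real client's API, and the actual submit call is left out.

```go
package main

import "fmt"

// Transfer is a hypothetical stand-in for a ledger transfer; it is not the
// TigerBeetle client's type.
type Transfer struct {
	ID     uint64
	Amount uint64
}

// chunk splits items into consecutive slices of at most size elements.
// The returned slices share memory with items; nothing is copied.
func chunk(items []Transfer, size int) [][]Transfer {
	if size <= 0 {
		return nil
	}
	var out [][]Transfer
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	// Pretend monthly billing run: 20,000 invoices, one transfer each.
	invoices := make([]Transfer, 20000)
	for i := range invoices {
		invoices[i] = Transfer{ID: uint64(i + 1), Amount: 100}
	}

	// Assumed per-request limit; check the real limit for your deployment.
	const maxBatch = 8189

	for i, batch := range chunk(invoices, maxBatch) {
		// A real job would hand each batch to the client here.
		fmt.Printf("request %d: %d transfers\n", i, len(batch))
	}
}
```

Each element of the result would go out as one request, so a job of this size needs three round trips rather than twenty thousand.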

jorangreef · 2025-11-11T12:21:11 1762863671

Joran from TigerBeetle!

> For processing transactions one-by-one individually

If you're artificially restricting the load going into TigerBeetle by sending transactions in one at a time, then I think predictable latency (and not TPS) would be a better metric.

For example, TB's multi-region/multi-AZ fault-tolerance will work around gray failure (fail slow of hardware, as opposed to fail stop) in your network links or SSDs. You're also getting significantly stronger durability guarantees with TB [0][1].

It sounds like you were benchmarking on EBS? We recommend NVMe. We have customers running extremely tight 1 second SLAs, seeing microsecond latencies, even for one at a time workloads. Before TB, they were bottlenecking on PG. After TB, they saturated their central bank limit.

I would also be curious what scale you tested at. We test TB to literally 100 billion transactions. It's going to be incredibly hard to replicate that with PG's storage engine. PG is a great string DBMS but it's simply not optimized for integers the way TB is. Granted, your scale likely won't require it, but if you're comparing TPS then you should at least compare sustained scale.

There's also the safety factor of trying to reimplement TB's debit/credit primitives over PG to consider. Rolling it yourself. For example, did you change PG's defaults away from Read-Committed to Serializable and enable checksums in your benchmarks? (PG's checksums, even if you enable them, are still not going to protect you from misdirected I/O like the recent XFS bug.) Even the business logic is deceptively hard, there are thousands of lines of complicated state machine code, and we've invested literally millions into testing and audits.

Finally, it's important that your architecture as a whole, including the gateways around TB, treats concurrency as first class and isn't "one at a time". Otherwise your bottleneck is probably not going to be TigerBeetle.

[0] https://www.youtube.com/watch?v=_jfOk4L7CiY

[1] https://jepsen.io/analyses/tigerbeetle-0.16.11

NathanaelRea · 2025-11-11T02:23:15 1762827795

Doesn't the Tigerbeetle client automatically batch requests?

kelseydh · 2025-11-11T03:24:17 1762831457

We didn't observe any automatic batching when testing Tigerbeetle with their Go client. I think we initiated a new Go client for every new transaction when benchmarking, which is typically how one uses such a client in app code. This follows with our other complaint: it handles so little you will have to roll a lot of custom logic around it to batch realtime transactions quickly.

matdehaast · 2025-11-11T08:12:33 1762848753

I'm a bit worried you think instantiating a new client for every request is common practice. If you did that to Postgres or MySQL clients, you would also have degradation in performance.

PHP created mysqli and PDO to deal with this specifically, because recreating client connections per request is known to be expensive.

kelseydh · 2025-11-11T11:14:59 1762859699

Ok your comment made me double check our benchmarking script in Go. Can confirm we didn't instantiate a new client with each request.

For transparency here's the full Golang benchmarking code and our results if you want to replicate it: https://gist.github.com/KelseyDH/c5cec31519f4420e195114dc9c8...

We shared the code with the Tigerbeetle team (who were very nice and responsive btw), and they didn't raise any issues with the script we wrote of their Tigerbeetle client. They did have many comments about the real-world performance of PostgreSQL in comparison, which is fair.

matdehaast · 2025-11-12T05:05:11 1762923911

Thanks for the code and clarification. I'm surprised the TB team didn't pick it up, but your individual transfer test is a pretty poor representation. All you are testing there is how many batches you can complete per second, giving no time for the actual client to batch the transfers. This is because when you call CreateTransfers in Go, it blocks synchronously.

For example, it is as if you created an HTTP server that only allows one concurrent request. Or having a queue where only 1 worker will ever do work. Is that your workload? Because I'm not sure I know of many workloads that are completely sync with only 1 worker.

To get a better representation for individual_transfers, I would use a WaitGroup:

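  var wg sync.WaitGroup
  var mu sync.Mutex
  completedCount := 0

  for i := 0; i < len(transfers); i++ {
    wg.Add(1)
    go func(index int, transfer Transfer) {
      defer wg.Done()

      // Blocks only this goroutine, so the shared client can batch concurrent calls.
      res, err := client.CreateTransfers([]Transfer{transfer})
      if err != nil {
        log.Printf("Request for transfer %d failed: %v", index, err)
      }
      for _, r := range res {
        if r.Result != 0 {
          // r.Index is the position inside this one-element batch, so log index instead.
          log.Printf("Error creating transfer %d: %s", index, r.Result)
        }
      }

      // Guard the shared progress counter.
      mu.Lock()
      completedCount++
      if completedCount%100 == 0 {
        fmt.Printf("%d\n", completedCount)
      }
      mu.Unlock()
    }(i, transfers[i])
  }

  wg.Wait()
  fmt.Printf("All %d transfers completed\n", len(transfers))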

This will actually allow the client to batch the requests internally and be more representative of the workloads you would get. Note, the above is not the same as doing the batching manually yourself. You could call CreateTransfers on the client concurrently from multiple call sites, and it would still auto-batch them.

jorangreef · 2025-11-12T07:24:10 1762932250

Appreciate your kind words, Kelsey!

I searched the recent history of our community Slack but it seems it may have been an older conversation.

We typically do code review work only for our customers so I’m not sure if there was some misunderstanding.

Perhaps the assumption that because we didn’t say anything when you pasted the code, therefore we must have reviewed the code?

Per my other comment, your benchmarking environment is also a factor. For example, were you running on EBS?

These are all things that our team would typically work with you on to accelerate you, so that you get it right the first time!

kelseydh · 2025-11-12T22:24:42 1762986282

Yeah it was back in February in your community Slack, I did receive a fairly thorough response from you and others about it. However then there were no technical critiques of the Go benchmarking code, just how our PostgreSQL comparison would fall short in real OLTP workloads (which is fair).

jorangreef · 2025-11-13T05:33:14 1763011994

Yes, thanks!

I don’t think we reviewed your Go benchmarking code at the time—and that there were no technical critiques probably should not have been taken as explicit sign off.

IIRC we were more concerned at the deeper conceptual misunderstanding, that one could “roll your own” TB over PG with safety/performance parity, and that this would somehow be better than just using open source TB, hence the discussion focused on that.

NathanaelRea · 2025-11-11T03:33:06 1762831986

Interesting, I thought I had heard that this is automatically done, but I guess it's only through concurrent tasks/threads. It is still necessary to batch in application code.

https://docs.tigerbeetle.com/coding/clients/go/#batching

But nonetheless, it seems weird to test it with singular queries, because Tigerbeetle's whole point is shoving 8,189 items into the DB as fast as possible. So if you populate that buffer with only one item you're throwing away all that space and efficiency.

kelseydh · 2025-11-11T05:17:22 1762838242

We certainly are losing that efficiency, but this is typically how real-time transactions work. You write real-time endpoints to send off transactions as they come in. Needing to roll more than that is a major introduction of complexity.

We concluded where Tigerbeetle really shines is if you're a large entity like a central bank or corporation sending massive transaction files between entities. Tigerbeetle is amazing for moving large numbers of batch transactions at once.

We found other quirks with Tigerbeetle that made it difficult as a drop-in replacement for handling transactions in PostgreSQL. E.g. Tigerbeetle's primary ID key isn't UUIDv7 or ULID, it's a custom id they engineered for performance. The max metadata you can save on a transaction is a 128-bit unsigned integer on the user_data_128 field. While this lets them achieve lightning fast batch transaction processing benchmarks, the database allows for the saving of so little metadata you risk getting bottlenecked by all the attributes you'll need to wrap around the transaction in PostgreSQL to make it work in a real application.

jorangreef · 2025-11-11T12:36:43 1762864603

> you risk getting bottlenecked by all the attributes you'll need to wrap around the transaction in PostgreSQL to make it work in a real application.

The performance killer is contention, not writing any associated KV data—KV stores scale well!

But you do need to preserve a clean separation of concerns in your architecture. Strings in your general-purpose DBMS as "system of reference" (control plane). Integers in your transaction processing DBMS as "system of record" (data plane).

Dominik Tornow wrote a great blog post on how to get this right (and let us know if our team can accelerate you on this!):

https://tigerbeetle.com/blog/2025-11-06-the-write-last-read-...
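As a rough illustration of that split, here is a sketch in which descriptive strings live in one store and the ledger record carries only fixed-size integers plus a 128-bit link back to them. Everything here is hypothetical: the `ID128`, `LedgerTransfer`, `Metadata`, and `App` types and the in-memory maps stand in for a real general-purpose DBMS and a real ledger, and the field names only loosely echo `user_data_128`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// ID128 stands in for a 128-bit identifier (Go has no native uint128).
type ID128 [16]byte

// idFrom builds an ID128 from a small integer, for demonstration only.
func idFrom(v uint64) ID128 {
	var id ID128
	binary.BigEndian.PutUint64(id[8:], v)
	return id
}

// LedgerTransfer holds only fixed-size integer fields (hypothetical layout).
type LedgerTransfer struct {
	ID          ID128
	DebitAcct   ID128
	CreditAcct  ID128
	Amount      uint64
	UserData128 ID128 // key of the matching Metadata entry
}

// Metadata holds the free-form attributes the ledger does not store.
type Metadata struct {
	Description string
	CustomerRef string
}

// App wires the two stores together; both are plain maps in this sketch.
type App struct {
	reference map[ID128]Metadata       // strings: general-purpose store
	ledger    map[ID128]LedgerTransfer // integers: transaction store
}

func NewApp() *App {
	return &App{
		reference: make(map[ID128]Metadata),
		ledger:    make(map[ID128]LedgerTransfer),
	}
}

// Record stores the metadata under the transfer's link key and the
// integer-only transfer under its own ID.
func (a *App) Record(t LedgerTransfer, m Metadata) {
	a.reference[t.UserData128] = m
	a.ledger[t.ID] = t
}

// Describe joins the two stores for display.
func (a *App) Describe(id ID128) (string, bool) {
	t, ok := a.ledger[id]
	if !ok {
		return "", false
	}
	m := a.reference[t.UserData128]
	return fmt.Sprintf("%d units: %s (%s)", t.Amount, m.Description, m.CustomerRef), true
}

func main() {
	app := NewApp()
	app.Record(
		LedgerTransfer{
			ID:          idFrom(1),
			DebitAcct:   idFrom(100),
			CreditAcct:  idFrom(200),
			Amount:      2500,
			UserData128: idFrom(9001),
		},
		Metadata{Description: "November invoice", CustomerRef: "cust-42"},
	)
	if s, ok := app.Describe(idFrom(1)); ok {
		fmt.Println(s)
	}
}
```

The point of the sketch is only that the ledger row never grows with the application's attributes. Anything descriptive is looked up by key when it is needed.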

jorangreef · 2025-11-11T12:32:19 1762864339

> We didn't observe any automatic batching when testing Tigerbeetle with their Go client.

This is not accurate. All of TigerBeetle's clients auto-batch under the hood, which you can verify from the docs [0] and the source [1], provided your application has at least some concurrency.

> I think we initiated a new Go client for every new transaction when benchmarking

The docs are careful to warn that you shouldn't be throwing away your client like this after each request:

  The TigerBeetle client should be shared across threads (or tasks, depending on your paradigm), since it automatically groups together batches of small sizes into one request. Since TigerBeetle clients can have at most one in-flight request, the client accumulates smaller batches together while waiting for a reply to the last request.

Again, I would double check that your architecture is not accidentally serializing everything. You should be running multiple gateways and they should each be able to handle concurrent user requests. The gold standard to aim for here is a stateless layer of API servers around TigerBeetle, and then you should be able to push pretty good load.

[0] https://docs.tigerbeetle.com/coding/requests/#automatic-batc...

[1] The core batching logic powering all language clients: https://github.com/tigerbeetle/tigerbeetle/blob/main/src/cli...

kelseydh · 2025-11-11T17:24:09 1762881849

Thanks for reaching out. I shared this benchmarking script with your team when we tested Tigerbeetle, but this is it again: https://gist.github.com/KelseyDH/c5cec31519f4420e195114dc9c8...

Was there something wrong with our test of the individual transactions in our Go script that caused the drop in transaction performance we observed?

jorangreef · 2025-11-11T18:22:07 1762885327

Thanks Kelsey!

We’d love to roll up our sleeves and help you get it right. Please drop me an email.

lossolo · 2025-11-11T22:05:38 1762898738

So what was wrong with his isolated benchmark code that he shared here?

matdehaast · 2025-11-12T05:12:42 1762924362

Not from Tigerbeetle, but having looked at his code this is what I saw https://news.ycombinator.com/item?id=45896559

nickmonad · 2025-11-11T02:21:48 1762827708

Did the company end up using it?

kelseydh · 2025-11-11T02:31:14 1762828274

We didn't rule out using Tigerbeetle, but the drop in non-batch performance was disappointing and a reason we haven't prioritised switching our transaction ledger from PostgreSQL to Tigerbeetle.

There was also poor Ruby support for Tigerbeetle at the time, but that has improved recently and there is now a (3rd party) Ruby client: https://github.com/antstorm/tigerbeetle-ruby/

jorangreef · 2025-11-11T12:49:25 1762865365

I think the drop in non-batch performance was more a function of the PoC than of TB. Would love to see what our team could do for you here! Feel free to reach out to peter@tigerbeetle.com