Database Performance Lessons
from Factory Town

When focusing on database performance, I can’t think of anything more important than having visibility into your query workload .

We will explore this idea by drawing on examples from the world of business and manufacturing, and a game called Factory Town.

Database Performance: Measure the Queries!

Important rules for Database Performance

The golden rule for performance improvement comes from famous Management Consultant Peter Drucker: “What isn’t measured, can’t be improved.”

A business book called The Goal highlights the importance of that rule.

The Goal also introduced the Theory of Constraints. This theory is another important concept for any activity related to database performance tuning.

In a nutshell, the Theory of Constraints says that to improve the performance of a system you must focus on improving the single bottleneck holding up the system.

When that bottleneck is fixed, you can move on to the next one.

The video above uses a game called Factory Town to highlight these concepts.

Factory Town production

The game Factory Town is all about making improvements to production pipelines to maximize output and profits.

In the game, I have a complex setup of pastures that will make various animal products such as milk, eggs, chicken, and beef.

My goal is to make as much of these products as possible.

Now, each pasture requires animal feed to produce its assigned item.

Animal feed is produced from grain mills, which in turn requires grain from a farm!

Additionally, each building requires workers to produce the items.

While I could provide one farm to one grain mill to one pasture, that would not be an efficient use of resources.

If you watched the video, you can clearly see there are bottlenecks to shipping the chicken and eggs, while the beef is being produced too slowly.

Factory Town screenshot of pasture bottleneck
Chicken bottleneck in Factory Town

The point to be made, though, is that I need to have visibility into the inputs going into the pastures and the product output before I can begin to optimize production.

By the way, the product output is also called throughput.

And the game easily provides the visibility needed into product throughput to pinpoint bottlenecks.

Queries drive database performance

So what does this have to do with database performance?

Well, database performance is all about database workload production.

And the workload of databases result from queries asked from it.

Therefore, you must measure the queries going into the database to improve the product of your database.

After all, if it isn’t measured, it can’t be improved.

How to get query data in MySQL

MySQL provides several options for finding your query data.

The primary options are the slow query log, and in more recent versions, the Performance Schema.

Using these tools, you can find a way to observe query throughput.

This query data is helpful for answering many workload questions, such as:

  • Is this workload new?
  • What’s happening right now?

Unfortunately, these native tools cannot answer questions like:

  • What was happening yesterday compared to the day before? Or last week?
  • Is this workload at 2am normal?

For these type of questions, you will need an external system to collect query information to understand query performance over time.

Conclusion

A Database Administrator can learn a lot from the world of business and manufacturing when considering database performance.

In order to begin database optimizations, a key requirement is to measure the query throughput that the database processes.

If your goal is to systematically improve your database performance, make sure you have the systems in place to collect and measure query performance over time!