OpenStack Swift Database Performance Improvements, Part 2

In my previous OpenStack post, I laid the groundwork for my proposed solution for improving the performance of the OpenStack Swift database. The solution has two parts: MySQL and Database Chunking.

OpenStack Performance Improvement: MySQL

For the first part of the solution, I propose replacing SQLite with MySQL as the database engine. As the name implies, SQLite is fine for small databases, but it has performance problems with larger OpenStack databases. MySQL is a good fit for this problem, since a database is represented as a file system directory and database tables are represented as files (this holds for MyISAM tables, and for InnoDB tables when file-per-table tablespaces are enabled).

OpenStack Performance Improvement: Database Chunking

For the second part of the solution, I propose using Database Chunking, that is, breaking up the Container and Object Tables into chunks so that database queries run against reasonably sized tables. The table structure would be tiered, and each table would be of type “index” or “data”. An Index Table points to tables at the next level down, which are again either “index” or “data”. Data Tables contain the actual table data and are thus leaf nodes in the table schema.

The optimal size of each table chunk would have to be determined through experimentation, but for purposes of argument, let us assume a table chunk size of 100,000 rows. So for an empty Account, the Container Table schema would be:

Container Index 1 -> empty

After the first container is created, the Container Table schema would be:

Container Index 1 -> Container Data 1
Container Data 1 -> 1 row

When container number 100,001 gets created, the Container Table schema would be:

Container Index 1 -> Container Data 1
                     Container Data 2
Container Data 1 -> 100,000 rows
Container Data 2 -> 1 row
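
To make the chunk arithmetic concrete, here is a minimal Python sketch of how a container could be mapped to a Data Table and a row within it. It assumes containers are numbered sequentially in creation order and that Data Tables are filled one after another, which is a simplification for illustration only. The function name locate and the constant CHUNK_SIZE are hypothetical and not part of Swift.

```python
CHUNK_SIZE = 100_000  # rows per table chunk, the figure assumed in the text


def locate(container_number, chunk_size=CHUNK_SIZE):
    """Return (data_table_number, row_number), both 1-based.

    Assumes containers are numbered 1, 2, 3, ... in creation order and
    that each Data Table is filled completely before the next one starts.
    """
    if container_number < 1:
        raise ValueError("container numbers start at 1")
    table, row = divmod(container_number - 1, chunk_size)
    return table + 1, row + 1


print(locate(1))        # (1, 1)
print(locate(100_000))  # (1, 100000)
print(locate(100_001))  # (2, 1)
```

The last call matches the schema above: container number 100,001 lands in the first row of Container Data 2.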

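So a single Container Index 1 can map 100,000 x 100,000 = 10 billion containers: 100,000 index entries, each pointing to a Data Table of 100,000 rows. When container number 10,000,000,001 gets created, a second index tier is needed, and the Container Table schema would be: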

Container Index 1 -> Container Index 2
                     Container Index 3
Container Index 2 -> Container Data 1
                     ...
                     Container Data 100,000
Container Index 3 -> Container Data 100,001
Container Data 1 -> 100,000 rows
...
Container Data 100,000 -> 100,000 rows
Container Data 100,001 -> 1 row
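
The growth of the index tiers can be sketched the same way. The Python below is only an illustration under the same assumptions as the text (a chunk size of 100,000, sequentially numbered containers, tables filled in order). The function names index_levels_needed and index_path are hypothetical. Each extra index tier multiplies the capacity by the chunk size, so one tier covers 10 billion containers and two tiers cover 10^15.

```python
CHUNK_SIZE = 100_000  # rows per table chunk, the figure assumed in the text


def index_levels_needed(total_containers, chunk_size=CHUNK_SIZE):
    """Smallest number of index tiers whose capacity covers total_containers."""
    levels = 1                          # even an empty Account has Container Index 1
    capacity = chunk_size * chunk_size  # index entries x rows per Data Table
    while capacity < total_containers:
        levels += 1
        capacity *= chunk_size          # each extra tier multiplies capacity
    return levels


def index_path(container_number, levels, chunk_size=CHUNK_SIZE):
    """Return the 1-based entry to follow in each index tier, root first."""
    data_table = (container_number - 1) // chunk_size  # 0-based Data Table
    slots = []
    for _ in range(levels):
        data_table, slot = divmod(data_table, chunk_size)
        slots.append(slot + 1)
    if data_table:
        raise ValueError("not enough index tiers for this container number")
    return slots[::-1]


print(index_levels_needed(10_000_000_000))     # 1
print(index_levels_needed(10_000_000_001))     # 2
print(index_path(10_000_000_001, levels=2))    # [2, 1]
```

The path [2, 1] reads as: follow the second entry of Container Index 1 (Container Index 3 in the schema above), then the first entry of that table, which is Container Data 100,001.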

This Database Table Schema will scale essentially without limit, since each additional index tier multiplies the capacity by the chunk size. These database changes should therefore keep database performance consistent for Accounts and Containers. But what about Objects? In my next blog post, I will first identify problems with Swift Object data storage and then present a solution.

In the meantime, how do you deal with Objects?