In this post, I want to evaluate Group Replication scaling capabilities as we increase the number of nodes and the number of user connections. The setup is identical to the one in my post “Evaluating Group Replication Scaling Capabilities in MySQL”, but this time I will use an I/O-bound workload.

For this test, I will deploy multiple bare metal servers, with each node and the client running on its own dedicated server, all connected to each other by a 10Gb network.

I will use 3-node and 5-node Group Replication setups. In both cases, the load is directed to only ONE node, but I expect the five-node setup to add some replication overhead.

Hardware specifications:

For the benchmark, I use a database prepared with sysbench-tpcc at 1000 warehouses (1000W).
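I won't reproduce the exact prepare command here (the scripts are in the GitHub repo mentioned below), but a sysbench-tpcc prepare step for a 1000W dataset typically looks roughly like this; the host, credentials, and the 10 tables x 100 warehouses split are assumptions for illustration:

```
# Sketch only: build a ~1000-warehouse TPC-C dataset with sysbench-tpcc.
# Host, credentials, and the tables/scale split are placeholder values.
./tpcc.lua --db-driver=mysql --mysql-host=gr-node1 \
    --mysql-user=sbtest --mysql-password=sbtest --mysql-db=sbtest \
    --tables=10 --scale=100 --threads=16 \
    prepare
```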

The configs, scripts, and raw results are available on our GitHub. The workload is I/O-bound; that is, the data size (about 100GB) exceeds innodb_buffer_pool_size (25GB).
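For reference, a minimal sketch of the per-node settings relevant to this test could look like the following; the group name, hostnames, and ports are illustrative values only, and the real configs are in the repo above:

```
# Illustrative per-node my.cnf fragment only; see the GitHub repo for the real configs.
[mysqld]
server_id                       = 1
gtid_mode                       = ON
enforce_gtid_consistency        = ON
binlog_checksum                 = NONE   # required for Group Replication before 8.0.21
plugin_load_add                 = 'group_replication.so'
group_replication_group_name    = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'
group_replication_start_on_boot = OFF
group_replication_local_address = 'gr-node1:33061'
group_replication_group_seeds   = 'gr-node1:33061,gr-node2:33061,gr-node3:33061'

# Buffer pool is kept well below the ~100GB dataset to keep the workload I/O-bound.
innodb_buffer_pool_size         = 25G
```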

For the MySQL version, I use MySQL 8.0.19.

Results

Let’s review the results I’ve got. First, let’s take a look at how performance changes when we increase user threads from 1 to 256 for three nodes.
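The run phase is essentially a sweep over thread counts against the single write node; a sketch of what it could look like is below (host, credentials, and run time are placeholders, not the exact values from the repo). The --report-interval=1 option is also what provides the 1-second resolution data used further down:

```
# Sketch only: sweep user threads from 1 to 256 against one Group Replication node.
for threads in 1 2 4 8 16 32 64 128 256; do
    ./tpcc.lua --db-driver=mysql --mysql-host=gr-node1 \
        --mysql-user=sbtest --mysql-password=sbtest --mysql-db=sbtest \
        --tables=10 --scale=100 \
        --threads=$threads --time=1800 --report-interval=1 \
        run > "tpcc_3nodes_${threads}threads.txt"
done
```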

(Chart: Group Replication Scaling, throughput vs. user threads, three nodes)

It is interesting to see how the results become unstable as we increase the number of threads. For more detail, let's draw the chart with an individual scale for each thread count:

As we can see, there is a lot of variation starting at 64 threads. Let's check 64 and 128 threads at a 1-second resolution.

It looks as if a cyclical process is going on, with periodic drops to 0, which may be related to this bug.

3 nodes vs. 5 nodes

Now let’s take a look at the performance with five nodes (compared to three nodes):


There does not seem to be a huge difference; only in the stable range of 8-16 threads can we see a decline for five nodes. From 64 to 256 threads, where the variance prevails, it is hard to notice any difference.

Handling Sustained Incoming Rate

As there are occasional drops at high thread counts, I wanted to check how the cluster handles a sustained incoming rate of about 75% of the average throughput. For this, I test with 64 threads and an incoming rate of 3100 transactions per second (the average throughput for three nodes at 64 threads is 4100 tps).
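Sysbench can throttle requests to a fixed average rate with its --rate option; a sketch of how such a run could be launched is below (host, credentials, and run time are again placeholder values):

```
# Sketch only: 64 threads with the incoming rate capped at ~3100 transactions/sec.
./tpcc.lua --db-driver=mysql --mysql-host=gr-node1 \
    --mysql-user=sbtest --mysql-password=sbtest --mysql-db=sbtest \
    --tables=10 --scale=100 \
    --threads=64 --rate=3100 --time=1800 --report-interval=1 \
    run
```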

First, let’s see how the system handles this throughput.

We can see that most of the time the throughput holds at 3100 tps, but there are still intermittent drops, as well as jumps back up as the system returns to the regular incoming rate. Now, more interesting: how does this affect latency?

While, again, most of the time the system shows a 95% response time of about 25ms, during the drops it jumps up to 1400ms, which obviously results in a bad application and user experience.

Conclusion

From my findings, it seems that Group Replication handles extra nodes quite well under an I/O-bound workload too, but high thread counts are still problematic. I am open to suggestions on how performance with many threads can be improved, so please leave your comments below.