In this blog series, I will discuss common questions I receive as a database Solutions Engineer at Percona. In this role, I speak with a wide array of MySQL and MongoDB users responsible for everything from extremely large and complex environments to small single-server environments. Typically we are contacted when the customer is about to embark on an architecture migration or redesign, or when they have performance issues in their production environment. The purpose of this blog is to put together a list of common questions I field while speaking with active MySQL and MongoDB users.

We are considering a migration to AWS. What solution is right for us: EC2, RDS, or Aurora?

We get this question a lot. Moving to AWS is a hot trend. Fellow Solutions Engineer Rick Golba wrote a blog post dedicated to the specifics of each of these Amazon offerings, as well as the freedom you give up moving down the tiers. That loss of control is the primary concern when considering these cloud-based solutions. With Aurora, you give up a large amount of control of your database environment. With an EC2 deployment, you can keep most of it. However, there are other considerations to make.

Datasize

The largest benefit of choosing one of these Amazon offerings is reducing the cost associated with managing a physical database environment. This does not eliminate the necessary task of right-sizing your environment. Doing so can make a huge difference in the yearly costs associated with acquiring a large Amazon instance. It can also open up options when choosing between EC2, RDS, and Aurora, as there are certain limitations and restrictions with regard to table size and total data size. Here is a quick reference:

  • Amazon RDS – 6 TB*
  • Aurora – 64 TB**
  • EC2 – Depends***

* Max table size from Amazon’s documentation.

** Max size of Aurora cluster volume.

*** There are too many options to list just one.

There are several strategies when it comes to right-sizing your environment. The first and easiest way is to archive old, unused data. Percona Toolkit offers a tool called pt-archiver that can assist with this process. This tool allows you to archive unused MySQL rows into other tables or a file. The documentation for pt-archiver is here. Another strategy used by large organizations is to employ different databases for different tasks. The advantage of this strategy is that you can use the right database for a specific use case. The disadvantage is the overhead of having experts to manage each of these varying database types and instances. This requires a significant amount of engineering effort that is not suitable for smaller deployments.
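To make the archiving idea concrete, here is a minimal sketch of moving old rows into an archive table in small batches. It is not how pt-archiver is implemented, and in practice you would use pt-archiver itself against MySQL. The table names (orders, orders_archive), the column names, and the cutoff date are all hypothetical, and the sketch uses Python's built-in sqlite3 module only so that it runs without a MySQL server.

```python
import sqlite3


def archive_old_rows(conn, cutoff, batch_size=1000):
    """Move rows older than `cutoff` from `orders` into `orders_archive`.

    Rows are copied and deleted in small batches, one transaction per
    batch, so no single transaction holds locks for long.
    Table and column names are hypothetical.
    """
    total = 0
    while True:
        rows = conn.execute(
            "SELECT id, created_at, payload FROM orders "
            "WHERE created_at < ? ORDER BY id LIMIT ?",
            (cutoff, batch_size),
        ).fetchall()
        if not rows:
            break
        with conn:  # commits on success, rolls back on error
            conn.executemany(
                "INSERT INTO orders_archive (id, created_at, payload) "
                "VALUES (?, ?, ?)",
                rows,
            )
            conn.executemany(
                "DELETE FROM orders WHERE id = ?",
                [(row[0],) for row in rows],
            )
        total += len(rows)
    return total


def build_demo_db():
    """Create an in-memory database with a few sample rows (illustrative data)."""
    conn = sqlite3.connect(":memory:")
    schema = "(id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)"
    conn.execute("CREATE TABLE orders " + schema)
    conn.execute("CREATE TABLE orders_archive " + schema)
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "2015-01-01", "a"), (2, "2015-06-01", "b"), (3, "2017-03-01", "c")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    db = build_demo_db()
    moved = archive_old_rows(db, cutoff="2016-01-01", batch_size=1)
    print("archived rows:", moved)
```

Working in small batches keeps each transaction short, which matters on a busy production table. The batch size here is an arbitrary illustration, not a recommendation.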

Some people might ask, “Why right-size my environment?” Most of the time, all of that data is not needed in a production database. There is likely data that is never touched taking a significant amount of space. When you lower your datasize, more Amazon options become possible. In addition to this, the operational tasks associated with managing your database environment become easier. If you’ve managed to turn a bloated table into a more manageable one, you might see increased performance as well. This reduces costs when it comes to a cloud migration.

Compatibility

Amazon is compatible with most MySQL deployments, but there are some exceptions. Amazon Aurora is currently compatible with MySQL 5.6. If you are interested in MySQL 5.7 features such as storing data with the JSON datatype, then Aurora might not be the right option. For a full list of MySQL 5.7 features, see the MySQL documentation. Amazon RDS and EC2 are both compatible with MySQL 5.7. One limitation of RDS is that it is not compatible with MongoDB. Amazon does offer its own cloud-hosted NoSQL solution called DynamoDB, but migration is not as seamless as it is with Amazon’s MySQL offerings. The best option for migrating to the cloud with MongoDB is an EC2 instance.

Summary

Percona has assisted with Amazon optimizations and migrations for many customers through our consulting services. Our architects have in-depth knowledge of high-performing MySQL deployments in the cloud and can assist with both your design and implementation/migration. One example of this success is Wattpad. Through performance optimizations recommended by Percona, Wattpad was able to reduce the size of their Amazon instance and save money over the course of the year.

Can we replace our enterprise monitoring solution with Percona Monitoring and Management (PMM)?

As with most answers in the database world, the short answer is “it depends.” Percona Monitoring and Management (PMM) offers a robust array of monitoring features for your database environment and is perfectly capable of replacing certain features of enterprise-grade MySQL and MongoDB monitoring platforms. Here is a short list of what PMM brings to the table:

  • Free and Open Source. Our CEO Peter Zaitsev is dedicated to keeping this true. PMM uses existing open-source elements and integrates some of Percona’s own plugins to form a complete, robust monitoring solution.
  • MongoDB Integration. If you have both MySQL and MongoDB deployments, you can see metrics and query analysis for both in one place.
  • Remotely Monitor MySQL in the Cloud. PMM is compatible with RDS.
  • Visual Query Analysis. Quickly identify problem queries.
  • Find and Fix Expensive Queries. Analyze expensive queries without needing scripts or command line tools.
  • InnoDB Monitoring. Get in-depth stats on InnoDB metrics.
  • Disk Monitoring. Be aware of system level metrics in addition to MySQL and MongoDB metrics.
  • Cluster Monitor. The recent addition of Orchestrator to PMM added this functionality.
  • Replication Dashboard. Orchestrator can also show the status of replication in an intuitive GUI.

If the list above satisfies your monitoring needs, then you should definitely be using PMM. Our development team is actively working to enhance this product and appreciates input from the community using this solution. The PMM forums are a great place to ask questions or offer feedback/suggestions.

Is moving to synchronous replication a solution for us?

At first glance, a synchronous replication solution seems to solve all of the limitations that come with a standard MySQL deployment. It brings with it loads of great features, such as high availability, multi-master nodes that are each capable of handling writes, and read scaling. However, there are several things to consider when answering this question.

Will a simpler solution meet your needs?

One of Percona’s Technical Account Managers, Michael Patrick, wrote a fantastic blog post about choosing an HA solution. Typically the reason for moving to a clustered solution is high availability. If you’ve been bitten by downtime due to a failed master and a slow transition to a slave, moving to a cluster could be a knee-jerk reaction. However, solutions like MHA or MySQL Orchestrator might ease these pains sufficiently while adding little complexity to the environment.

Are your application and database design compatible with a clustered solution?

You must make some application-based considerations when moving to a clustered solution. One consideration is storage engine limitations with clustered solutions. Percona XtraDB Cluster and MariaDB Cluster both require InnoDB. MySQL Cluster requires the NDB storage engine. By committing to a clustered solution, other storage engine options become unavailable.

Another application consideration is how clustered solutions handle synchronous write set replication. If your application has write hot-spots, deadlocks will occur when simultaneous write transactions touch the same rows. There are ways to deal with these, such as re-engineering the database structure to remove the hot-spot or allowing the application layer to retry these transactions. If neither of these is an option, a clustered solution might not fit your environment.
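As an illustration of the retry option, here is a minimal sketch of an application-level wrapper that re-runs a transaction when it fails with a deadlock. The DeadlockError class and the run_with_retry helper are hypothetical names invented for this example. With a real MySQL driver you would catch that driver's own exception and check for MySQL error 1213 (deadlock). The attempt count and delays are arbitrary starting points, not tuned values.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver's deadlock error (MySQL error 1213)."""


def run_with_retry(txn, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Run `txn` (a callable performing one whole transaction), retrying on deadlock.

    Uses exponential backoff with random jitter so that competing writers
    do not all retry at the same instant. Re-raises the error once
    `max_attempts` has been reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockError:
            if attempt == max_attempts:
                raise
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_transaction():
        # Simulates a transaction that deadlocks twice before succeeding.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise DeadlockError("simulated deadlock")
        return "committed"

    print(run_with_retry(flaky_transaction), "after", attempts["count"], "attempts")
```

The callable passed in should contain the entire transaction, from begin to commit, so that a retry starts from a clean state rather than resuming halfway through.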

Is your database spread across multiple geographic regions?

You can deploy cluster solutions across WAN environments. However, these deployments introduce latency. If your application can tolerate the longer round-trip times caused by a cluster being spread across multiple geographic regions, this will not be a problem. However, if this delay is not tolerable, a WAN cluster might not be the right solution. There are multiple strategies for alleviating this pain point when deploying a cluster across WAN environments; a webinar given by Percona XtraDB Cluster’s Lead Software Engineer, Krunal Bauskar, covers this topic. One example is asynchronous replication between geographic regions with a cluster in each. The benefit is that the cluster in each geographic region avoids the WAN latency delay. The downside is the addition of many more nodes (likely three for each data center), which also complicates the environment.

Closing Thoughts

I plan to continue this blog series with more frequently asked questions that I receive when talking to MySQL and MongoDB users. If you would like to speak with an account representative (or me!) to see how Percona can help you meet your database performance needs, feel free to reach out.