Note: There is an updated post on this topic here.
Interviewing people for our Job Openings I like to ask them a basic question – if you have a server with 16GB of RAM which will be dedicated for MySQL with large Innodb database using typical Web workload what settings you would adjust and interestingly enough most people fail to come up with anything reasonable. So I decided to publish the answer I would like to hear extending it with basics of Hardware OS And Application optimization to optimize MySQL database.
I call this Innodb Performance Optimization Basics so these are general guidelines which work well for wide range of applications, though the optimal settings of course depend on the workload.
Hardware
If you have large Innodb database size Memory is paramount. 16G-32G is the cost efficient value these days. From CPU standpoint 2*Dual Core CPUs seems to do very well, while with even just two Quad Core CPUs scalability issues can be observed on many workloads. Though this depends on the application a lot. The third is IO Subsystem – directly attached storage with plenty of spindles and RAID with battery backed up cache is a good bet. Typically you can get 6-8 hard drives in the standard case and often it is enough, while sometimes you may need more. Also note new 2.5″ SAS hard drives. They are tiny but often faster than bigger ones. RAID10 works well for data storage and for read-mostly cases when you still would like some redundancy RAID5 can work pretty well as well but beware of random writes to RAID5.
Operating System
First – run 64bit operating system. We still see people running 32bit Linux on 64bit capable boxes with plenty of memory. Do not do this. If using Linux setup LVM for database directory to get more efficient backup. EXT3 file system works OK in most cases, though if you’re running in particular roadblocks with it try XFS. You can use noatime and nodiratime options if you’re using innodb_file_per_table and a lot of tables though benefit of these is minor. Also make sure you wrestle OS so it would not swap out MySQL out of memory.
MySQL Innodb Settings
The most important ones are:
innodb_buffer_pool_size 70-80% of memory is a safe bet. I set it to 12G on 16GB box.
UPDATE: If you’re looking for more details, check out detailed guide on tuning innodb buffer pool
innodb_log_file_size – This depends on your recovery speed needs but 256M seems to be a good balance between reasonable recovery time and good performance
innodb_log_buffer_size=4M 4M is good for most cases unless you’re piping large blobs to Innodb in this case increase it a bit.
innodb_flush_log_at_trx_commit=2 If you’re not concern about ACID and can loose transactions for last second or two in case of full OS crash than set this value. It can dramatic effect especially on a lot of short write transactions.
innodb_thread_concurrency=8 Even with current Innodb Scalability Fixes having limited concurrency helps. The actual number may be higher or lower depending on your application and default which is 8 is decent start
innodb_flush_method=O_DIRECT Avoid double buffering and reduce swap pressure, in most cases this setting improves performance. Though be careful if you do not have battery backed up RAID cache as when write IO may suffer.
innodb_file_per_table – If you do not have too many tables use this option, so you will not have uncontrolled innodb main tablespace growth which you can’t reclaim. This option was added in MySQL 4.1 and now stable enough to use.
Also check if your application can run in READ-COMMITED isolation mode – if it does – set it to be default as transaction-isolation=READ-COMMITTED. This option has some performance benefits, especially in locking in 5.0 and even more to come with MySQL 5.1 and row level replication.
There are bunch of other options you may want to tune but lets focus only on Innodb ones today. You can check about tuning other options here or read one of our MySQL Presentations.
Application tuning for Innodb
Especially when coming from MyISAM background there would be some changes you would like to do with your application. First make sure you’re using transactions when doing updates, both for sake of consistency and to get better performance. Next if your application has any writes be prepared to handle deadlocks which may happen. Third you would like to review your table structure and see how you can get advantage of Innodb properties – clustering by primary key, having primary key in all indexes (so keep primary key short), fast lookups by primary keys (try to use it in joins), large unpacked indexes (try to be easy on indexes).
With these basic innodb performance tunings you will be better off than the majority of Innodb users which take MySQL with defaults and run it on hardware without battery backed up cache with no OS changes and have no changes done to application which was written keeping MyISAM tables in mind. This should help optimize MySQL database performance for your organization.
More Resources
Posts
- How to Choose the MySQL innodb_log_file_size
- InnoDB performance optimization basics (redux)
- Here’s my favorite secret MySQL 5.6 feature. What’s yours?
- What is a big innodb_log_file_size?
- How to change innodb_log_file_size safely
- Measuring the amount of writes in InnoDB redo logs
I’m happy to say that through reading this site regularly and getting suggestions from the forums I’ve been able to consistently shave off seconds of load time from my site over the past year bringing page load times to an almost instant state. It does take patience in testing new settings, especially when dealing with older slower 32bit hardware, but the payoffs are there and the lessons learned are priceless. My old slow query log was filled with thousands of unsolvable mysteries every day and the slow query time was only set to 10 seconds! Now that I’ve tuned everything up in the settings and have a better understanding of what each setting does in the my.cnf, I have it set to 3 seconds and only find that just around 100-200 queries a day are slower than that (usually because i dont have a failover server during backups which are causing locks that slow things down. working on it!)
I’ve seen great speed improvements using just these tips alone. What I don’t see here which is something that many novice administrators or tuners may not know is that if you set your buffers and settings too high and restart your mysql server, mysql wont instantly complain. What I think happens is it either ignores these settings completely and uses defaults or it uses them, discovers that they dont work for the session, reverts to the defaults or recovers in some other way which is slow. This can seriously impair your performance!
My only wishes for mysql would be that they would allow you to log queries which trigger counters of things like sort_merge_pass, full joins and tmp tables on disk so you could actually better find the queries causing slowdowns or poorly written queries in your applications, AS WELL AS a tool that would allow you to see how your buffers were being used in a visual way rather than just guessing through examining the raw numbers. These two changes would make administration lightyears more advanced than it is now for novice or intermediate developers/admins. Out of 801,000 tmp tables created, only 3,762 of those were on disk. It still bugs me that I can’t just look at a log and find them to fix them. I do have 0 Select_full_join and 0 Sort_merge_passes though finally.
What is most confidence inspiring is thinking about the day when i can take the kid gloves off and run my database on a 64bit machine with a more acceptable amount of ram. After being hamstrung this long with 32bit chips, I can’t wait to see how things perform with the newest tech out there!
I have to disagree with the 70-80% of RAM usage for the buffer pool. When I asked Heikki about it at yours and his talk during the conference he admitted that was based on his test box with 1G of RAM. I’ve seen people with 64G of RAM blindly following the 80% rule and only using about 50G of RAM for the buffer poll, leaving 14G unused!
I tend to tell people to leave a few GB for the operating system, and let the buffer pool use the rest. 4G might not be too unreasonable on a 16G box, depending on what else is going on, but I’d probably start with 2G and work up if needed. It’s super important to use O_DIRECT when tuning this, otherwise the OS will snatch up all of your free RAM for fs caching.
P.S.
Good post though 🙂 Agrees with much of what I tell people at Yahoo.
I’d just like to point out that Peter is giving you a sneak peek at the upcoming second edition of High Performance MySQL here. This post is like the cliff notes version of the InnoDB tuning advice in the book. So if you like Peter’s posts, get the book when it comes out.
Howdy,
Echoing what Jay says, I wouldn’t suggest a percentage for the buffer pool, rather a relatively fixed size, as the percentage doesn’t scale well as memory sizes have grown. I usually go for 14G on a 16G box, potentially reducing it if more than normal amounts of memory are needed for other things (say, a very high number of temp tables).
Regards,
Jeremy
Jay, Jeremy
I guess “how much to use for Innodb Buffer Pool” is the question answer to which may depend a lot. As I mentioned I provide some basic guidelines in this post which I would like to be simple and 70-80% is a good answer in this case. It works for most typical range of boxes, say 4GB-32GB and it is safe even though you’re not getting the every single penny of performance.
Your advice of leave a bit for MySQL and OS needs and give the rest to Innodb Buffer Pool is good but how one would know how much memory is needed for these ?
Also note not everything may work as you would expect it in theory. For example even with O_DIRECT OS may be swapping out portions of MySQL due to IO pressure which may come from logs, disk based sorts or disk based temporary table.
Another thing you need to keep into account is caching Innodb logs. As IO to Innodb logs is unaligned you better have them fit in the cache otherwise you will be getting read-around-write stalls every so often.
But you’re right of course for 64GB you would want the buffer pool to be significantly higher than 50G
Great posting. Can you do me a favor and expand on this please??? “Also make sure you wrestle OS so it would not swap out MySQL out of memory” I know what you mean by this..just don’t know how to do it..We run 64-bit Linux (debian actually).
thanks,
Keith
First. Check “si so” columns in VMSTAT – if you have some swap used but there is no swapping activity I would not worry, it is when these values are significant (sometimes in burst) you’re in trouble.
O_DIRECT is a great if you’re using Innodb. You also can use large pages to make MyISAM key buffer and Query Cache not swapable (and get some other benefits) there are some instructions here:
http://www.mysqlperformanceblog.com/2006/06/08/mysql-server-variables-sql-layer-or-storage-engine-specific/
you can use –memlock with varying success – a lot seems to be dependent on Linux Kernel version if it works properly. You can also try to echo 0 > /proc/sys/vm/swappiness though in my experience it does not really work well for preventing swapping.
Jeffrey,
You should have been looking at another post:
http://www.mysqlperformanceblog.com/2007/10/31/new-patch-for-mysql-performance/
We just created the patch which allow to log query flags with queries so you can see which queries caused on disk temporary tables and which required file sort. Now you just need small script to filter through the log.
We surely will modify data aggregation scripts so they can use this log format.
I’ve been doing all of this stuff for years… or so I thought. 🙂 Buried in there, you say ‘having primary key in all indexes’. Can you elaborate more?
Let’s take a sample table:
CREATE TABLE
users
(UserID
smallint(4) unsigned NOT NULL auto_increment,Email
varchar(255) NOT NULL,PRIMARY KEY (
UserID
),KEY
Email
(Email
)) ENGINE=InnoDB;
Are you saying that this would be better when doing queries for UserID based on Email:
CREATE TABLE
users
(UserID
smallint(4) unsigned NOT NULL auto_increment,Email
varchar(255) NOT NULL,PRIMARY KEY (
UserID
),KEY
Email
(Email
,UserID
)) ENGINE=InnoDB;
?
If so, it looks like I (wrongly?) assumed that the Primary Key was always referenced by other indexes. I’ve never seen this be a problem, that I know of, but now I’m wondering…
Thanks!
These kinds of posts are great; really helpful to get some insight to the mysteries of innodb and mysql tuning.
However, my only gripe is that it all feels a bit like random ‘lets tweak this and see’, rather than putting a test suite behind it with your own hardware.
Ben,
Of course to get last percent of performance out of your system you need to setup benchmarks (which well match your real workload) and do experiments. However you’re better to start somewhere other than default MySQL configuration to get results fast and also you do not always have time to spend a lot of time on this. So view this as starting point for Innodb configuration from which you tune it further.
Don,
What I’m saying is if UserID is primary key in Innodb table the key on (Email) is internally (Email,UserID) because PK value is always stored in the index and rows are stored by it for same key value.
This means the UserID key part of id also can be used for covering index, where clause and I think it is being fixed for filesort now. See this post for examples:
http://www.mysqlperformanceblog.com/2006/10/03/mysql-optimizer-and-innodb-primary-key/
Are there any rules when specifying a server’s RAM based on the database size? Is 16GB still useful if your database is 6GB? 12GB?
Mike,
Good question. Of course if your database is 6GB and you have 16GB of memory you will likely have more memory than you can efficiently use. You can allocate it as Innodb buffer pool and it will be as “free pages” or you can set buffer pool to lower value, say 7GB and let it be Free on OS side. Over time OS will find something to cache where but in practice that would not be efficient use anyway. If you plan your data size to growth I would set it to higher value so you do not have to revisit it many times adjusting as your database growths.
Of course if there is a mix between MyISAM and Innodb it is other story.
Peter,
Oh, great, that’s how I always assumed it was. Whew. Thanks for clarifying!
No one else has commented, so maybe some think it’s self-evident, but I could some casual (new) readers being confused or misled. Where you said, “We still see people running 32bit Linux or 64bit capable boxes with plenty of memory. Do not do this”, I’m assuming you meant “on”, not “or”. 🙂
Thanks Charlie,
Fixed now.
peter, re #9
That’s great news!! I didn’t expect to see something materialize so quickly. I will definitely check that out and appreciate the heads up and effort.
best regards
— Jeff
Trivial: but the atime stuff reminded me that nodiratime isn’t required, see http://lwn.net/Articles/245097/
Thank you Matt,
Honestly I typically did not use it either but I got it somewhere and added is as this is one of the thing which should not hurt.
hi,
just wanted to know the role of mmap in innodb and how is it set … also if anyone can help in the same regard with MyISAM….
thank you very much ..
take care…
ajay.
First off I love the blog and would like to thank all of those who contribute.
I did want to point out though that the innodb_flush_logs_at_trx_commit setting you have listed is spelled incorrectly. If I’m not mistaken the setting is innodb_flush_log_at_trx_commit (log should not pluralized). Thought I would make an effort to point this out given the recent posting on the about checking MySQL Config files.
Keep up the fantastic work.
Kirby
Kirby,
Thank you – fixed.
“We still see people running 32bit Linux on 64bit capable boxes with plenty of memory. Do not do this.”
Could you please explain why.
Thanks,
Thiru.
Oh, thank you for the many excellent posts! 🙂
If you run 32bit Linux you will be limited to 32bit address space for MySQL which will limit how much memory you can use.
Plus it will be slower for kernel to access large memory.
[..]Of course if there is a mix between MyISAM and Innodb it is other story.[…]
Do you still recommand thoses settings for a 65% INNODB, 35% MyISAM database ? Does MyISAM performance will be affected ? I’ll soon be switching for a MySQL dedicated server with 16Go of Ram, this post is really interesting to me.
transaction-isolation=READ-COMITTED
please make that COMMITTED with a double M.
Thanks. Fixed.
Hello, I have read many places that InnoDB is supposed to be faster for inserts. I created the following simple script that inserts 5,000,000 records into a four column table. I run it once inserting into a MyISAM table, and run it a second time inserting into an InnoDB table. Every time I run the test (even after changing the innodb_buffer_pool_size). The MyISAM table finishes approximately 4 times faster than the InnoDB table. This is significant. Now, the first thing I’ve noticed is that my machine is SIGNIFICANTLY less powerful than the machines you are discussing, however, I have not read where machine performance dictates the percentage of increase of InnoDB vs MyISAM. (Although I don know it it memory intensive.)
I have a machine with ~768M RAM and 250G drive. It is a dedicated machine for a SMALL website. I don’t think I’ll ever have more that 16 million rows in any given table. (innodb_buffer_pool_size=550M)
Here is the script:
<?php
function microtime_float(){
list($usec, $sec) = explode(” “, microtime());
return ((float)$usec + (float)$sec);
}
$time_start = microtime_float();
echo “Start Time for InnoDB: ” . $time_start . “\n”;
$db = mysql_connect(“localhost”,”user”,”password”);
for ($i=1;$i
MyISAM results: 1532.69 seconds. (25.54 minutes)
InnoDB results: 6815.43 seconds. (1 hour, 53.59 minutes)
I also changed the innodb_flush_method=O_DIRECT. I did not see significant gains (if any), but I must have deleted the nohup.out file.
Any advise would be greatly appricated. I apologize if this is too much to ask for a given forum.
I just noticed there is a significant part of the script missing, here is the script:
I wrote less than or equal to because it seems the site stopped writing all text after the less than symbol in my first post. I hope this post makes it.
Thanks again.
I can’t seem to get the code to show up. It’s inserting NULL, $i, $i, microtime() into the table five million times, incrementing each time.
Lance – MyISAM can be faster than Innodb to insert the data and Innodb can be faster than MyISAM to insert the data – it all depends on what you’re looking at. For inserts in single table by single user MyISAM most likely will win because of Transactional overhead in Innodb, however if you have large system and many concurrent inserts or parallel long running queries Innodb is likely to be faster because MyISAM has table level locks.
Great Article.
1- I have an insert/update intensive application with millions of insert/update operations per day (planned). The web client that reads from the database requires several indexes, however having those indexes slows down the insert/update operations. Is it advisable to set up replication such that the MASTER database does not have any indexes (except those required for updates), and the SLAVE has all the required indexes for the web clients? This should theretically get faster inserts/updates on MASTER, and fast retrievals on the SLAVE. Please advise.
2- How much of a difference does it make to have the MySQL server on a dedicated machine? Is there an article that addresses this issue?
Wehenever I try to change innodb_log_file_size I get my database tables to be “corrupt”. Reverting to the former value fixes them… Any ideas?
To change innodb_log_file_size you need to shut down mysql clearly remove old log files and start it again with new value so MySQL will create new log files otherwise it will complain about wrong log file size and Innodb will fail to initialize.
Just to make sure, you mean I can safely delete (after MySQL shutdown) the following files:
ib_logfile0
ib_logfile1
ibdata1
?
ibdata1…. colnector you dont wanna delete that one 🙂
That was the reason to ask + trap 🙂
So deleting ib_logfileX is the thing to do?
You may consider setting innodb_flush_log_at_trx_commit=0 instead of 2. It’ll be somewhat faster yet less reliable. Here’s the complete explanation: http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#option_mysqld_innodb_flush_log_at_trx_commit
Excellent article, many thanks 🙂
I’ve found that not only does increasing the value of innodb_buffer_pool_size increase performance, but for me it has also made the difference between the database working or not working:
I created a test table with quite a lot of data in it. With the default value for this setting, once I’d got to a certain size I couldn’t insert any new rows at all without mysqld (v5.0.27) falling over and rolling back the latest insert.
Thanks for the article!
BTW, what about for very small writes like financial realtime data? Actually some tests we did, Innodb was not well suited, but I think in the time they did some tests, innodb was not configured accordingly. The problem is that our application need “immediately” to write a small amount of data (usually 40 – 100 byte packet) many times in a millisec. Normally we have 70% writes vs 30% reads. But writes are made in 4 – 8 threads meanwhile reads are done through 100 – 200 threads.
Do you think Innodb or XtraDB (well tunned) could be used in this case?
Thanks!
Hi,
I am working on a system which has currently around 20gb of data (increasing at the rate of ~200mb/day). We need to save complete documents in the system so basically one column has around 70-80% of the data. Our server is Quad Dual Core 4GB Ram, Server 2003 and using MySQL 5.0. Query cache size is 156M and limit is 8M. All tables except one are INNO DB and its buffer pool size is 1024M.
The end user client module is read-only so lots and lots of repeated queries therefore I think query cache helps a lot but one of the problems I am facing is somewhat similar as discussed above. Sometimes the system halts for few mins. I fear that if I disable the query cache it will slow down my system which already is just OK.
Apart from that I also want to handle the future size of the db which is increasing rapidly. I have got hold of High Performance MySQL but any hints where should I be looking at first?
Any help will he highly appreciated. I can also give more details of my system in case if it will help. Thanks a lot!
Firstly, do you know what’s happening when it’s freezing?
Are most of the reads trying to read recently written records, or old stuff? If old stuff, you could shard on a modulus of TO_DAYS() output (for example, depending on your application) so on any given day you’d only be writing to one table but the reads could come from many others (if a lot of reads read old data). Remember that writing to a table will invalidate any entries in the query cache that relied on the content of that table, so this approach could improve the effectiveness of your cache (again, depending on your application). Sharding by day might not be appropriate for your application though – I couldn’t possibly say as I don’t know what your app does.
There are many ways you could improve on this but without further knowledge of your application it’s going to be hard to say.
And of course server configuration is important too.
Read that book – I’ve got a copy and it’s excellent.
With the brand spanking new innodb plugin, will all these apply or will we need to consider changing variables a bit?
What is a loose transaction versus a tight one? You wrote “can loose transactions for last second or two.” What does that mean. I’ve worked in this industry for over 30 years, and I’ve never heard that term before.
“loose” should be read as “lose.”
It would be nice if the comment about needing to worry about deadlock were elaborated.
Is it due to the assumption of multiple reads and updates in the transactions we’re assuming will be used or is some other factor in play?
“loose” should be read as “lose.”
What is a loose transaction versus a tight one? You wrote “can loose transactions for last second or two.” What does that mean. I’ve worked in this industry for over 30 years, and I’ve never heard that term before.