Brian recently posted an article comparing UUID and auto_increment primary keys, basically advertising to use UUID instead of primary keys. I wanted to clarify this a bit as I’ve seen it being problems in so many cases.
First lets look at the benchmark – we do not have full schema specified in the article itself so it results are not absolutely clear but we already can have certain conclusions.
Data size is very small. What is the biggest problem with UUID ? It is the fact inserts are going in random locations in index tree which will require a lot of IO in case of index tree not fitting into memory. This is not simply the case of 32 bytes vs 4 bytes for key value – if you would use integer key and insert data in random order you would have the same problems.
In fact if you store UUID in binary form you can bring it down to 16 bytes so size is not really the problem.
What is about Secondary Keys ? For Innodb tables UUID have extremely poor impact on your secondary key because all rows are referred by primary key value. This especially hurts for a lot of short integer keys because they can become many times longer.
Parallel Inserts UUID are often advertised as allowing to spread the load from single buffer page which auto_increment is constantly hitting, this is true but at large date size it is way overturned by BTREE buffer. The most efficient approach here is not to use auto_increment but use certain partitioned sequences, for example you can have 256 growing sequences (with high byte used for sequence number) which will have 256 “hotspots” – good enough to make load parallel but still small enough to be well cached. It is also worth to note Innodb (which is storage engine which usually considered best for parallel insert/updates) has pretty much table level locks when it comes to auto_increment columns, which is however completely separate problem which can be fixed in MySQL 5.1
Data Clustering This again applies to Innodb tables aspect of primary key selection – you often can gain a lot by selecting primary key which provides data clustering which matches your application needs. It may be (user_id,msg_id) in some cases but in many auto_increment already gives us what we want – if we access “recent” items more frequently and data set is large auto_increment would work much better than UUID because UUID will have recent data scattered all across.
Lookups Later in comments Brian mentions the point of benchmarks was rather lookup by primary key speed. Lookup speed can be similar or can vary a lot. If we have completely random lookups it should rather close for data in memory case or data on disk case with auto_increment having advantage when UUID larger BTREE does not fit in memory any more and so more data reads is required. However for other distributions situation can be different, see previous point about clustering.
Benchmarks In general I stand the same point – be careful applying benchmarks you find in the Internet to your own case, it will be misleading more frequently than not, especially if you do not have technical depth to understand all implications and assumptions. I promised to publish some insert benchmark for this case with larger data size so here they are:
I’ve created MyISAM tables containing just integer auto_increment primary key and containing char(36) value and used for UUID primary key and when I populated it with 268.435.456 rows (large enough for that 512M box to be disk bound). For auto_increment key load process took 1 hour 50 minutes giving load speed of 40305 rows/sec. For UUID process took over 12 hours and is still going. From MySQL status I can see it is loading about 200 rows/sec and the it is still slowing down a bit as key file growths. So in this little case we have about 200 times performance difference which is worth to consider
Can UUID be handled efficiently ? They would not be as efficient as Ints simply because size matters but you can do them pretty efficient by using binary compression. Plus they should be stored in optimized sort order to minimize how random they are –
there is timestamp based part in UUID which has similar properties to auto_increment and which could be used to have values generated at same point in time physically local in BTREE index.
This is actually the reason why UUID can do much better than SHA1 as Kevin proposes in the same post – hashes if it is SHA1, MD5 or CRC32.
I couldn’t agree more.
You’re missing out on a few key points.
One. UUIDs are about as pointless as sequence generators for my goals. UUIDs don’t have to be centralized which is certainly a plus.
I think the real power is in GUIDs.
With truncated SHA1 (GUIDs) you can do some cool stuff.
1. If you’re using client side sharding of data like LiveJournal, Google, Flickr, etc then you can route keys. Since hashcodes are determistic you can figure out that this key resides on hostA. You can do the host mapping via mod or other techniques.
2. You can build complex graph relationships and exist checks without large amounts of memory. For example if you have a URL which is 255 chars long you can compute the hashcode on the client and then lookup that value from an index JUST on the GUID column. Since you need a unique index ANYWAY you can save a LOT of IO since you’re now using a smaller column. You also get to potentially drop another index (more memory) and since the datatype is small you can fit more of that data from the btree into memory.
3. Lookups for one to many relationships can skip a step. For example if you have one product and many buyers instead of FIRST fetchign the product ID from the product table via name you can totally skip this step and just compute the GUID on the client and then query it from the buyers table. Less IO is a good thing.
One drawback is that you have to compute the keys and this costs CPU. It’s worth noting though that you can compute about 10k URLs per second on modern hardware. Probably even faster if you use the modern Linux crypto stuff. I haven’t benchmarked it on Java to see if the system call is worth the hit.
The big win for me though is the key routing. It just makes it easy to build larger clustered databases.
I just with MySQL had support for showing varbinary values on the console without screwing it up :-/
Kevin
Kevin,
First the stuff MySQL calls UUID are GUID at least if you use Wikipedia definition:
http://en.wikipedia.org/wiki/Globally_Unique_Identifier but I see what you really mean by following description.
Now regarding your comments
1) Routing keys is great but do you it is another point. For example having 10 millions of users and spreading them on bunch of servers do you need to use SHA1 or similar ? I would not – I would instead use mapping table and use integer keys for users themselves, because these are much faster.
2) You’re describing completely different problem which is not about using GUID vs AutoIncrement. In your case you want the key to identify the object so you will not have to do lookup by URL etc. In this case SHA1 surely may work great.
3) I see your point. This can be helpful in the case when there are many objects both in products and in vendors tables. In most cases however the smaller table will be well cached in memory (memcached etc) which makes it sort of free. And when You can do lookup in product table faster by use of clustering for example.
In any case I think you’re missing main point – this corresponds to simple swap of auto_increment to UUID without changing application design or anything which is in my opinion often bad idea. If you use SHA1 as “natural” unique keys based on object contents you can design application differently and get better performance in some cases – this is not the question.
Sorry, off topic with the post, but related to the blog’s subject.
You have probably seen these benchmarks comparing PostgreSQL and MySQL performance, which show how bad MySQL scales on multicore + multithread environments:
http://tweakers.net/reviews/649/7
The problem with high concurrency starts with 4 cores and becomes dramatic with 6-8 cores. Well, you probably have also seen these ones comparing MySQL in Linux and FreeBSD:
http://jeffr-tech.livejournal.com/6268.html
Now it seems less clear if the problem is in MySQL or in Linux, since MySQL scales fine in FreeBSD.
Well, it seems that finally the problem has been identified in glibc:
http://ozlabs.org/~anton/linux/sysbench/
It also seems that the poor performance of glibc’s malloc has been previously documented:
http://www.mail-archive.com/[email protected]/msg01239.html
In fact, that’s why there seems to be a few alternatives out there replace malloc:
http://www.cs.umass.edu/~emery/hoard/hoard-documentation.html
http://sourceforge.net/projects/umem/
http://code.google.com/p/google-perftools/wiki/GooglePerformanceTools
So I guess that until glibc fixes this problem it’s a good idea to replace malloc with one of the above solutions if you’re using MySQL in a 4+ cores system with high concurrency.
Hey Peter……
In response……..
1. Yeah… you probably could do this. You’d have to do this in regions though. IF you’re inserting a LOT of data (which we do basically 100% of the time). You’d have to use regions of say 10k entries but this would also mean your routing table would grow in size which involves seeks as it is.
This doesn’t solve the UPDATE problem though.
Say for example you have a key you want to update. You FIRST have to do a SELECT across ALL the tables in your cluster to find the record. Then you take the ID and update the value. This is O(N+1)…. With hashcode routing you can compute the key on the client and know right away where the record is stored. O(1) 🙂
2. Sure…. I wasn’t specifically talking about GUIDs vs AutoIncrement but the value of using PKs based on hashcodes.
3. I can’t tell if you’re agreeing with me on #3 or not! 🙂
OK…. maybe I went off on a tangential discussion 🙂 I just haven’t heard many people using hash routing for storing key values.
Kevin,
1) Why would I do it in regions ? I think from all the types of mapping it is the most inconvenient one because it is slow. Regarding data load – at BoardReader.COM we have about 1 bil of forum posts right now and more than that of links loaded, with some 5 millions of new posts per day coming in. I’m not sure how it compares to your data sizes though. We have partitioning implemented by site_id for forum posts and links are partitioned using domain id. besides partitioning this helps a lot with data locality such as it is easy to retrieve all links for given domain and it is fast because they are physically local. With Hashing…. I guess you can find given URL for example pretty fast but urls from the same domain will be complicated to find and scattered.
I’m not sure what problem of update you’re speaking about we do updates all the time. Yes there is some form of index of course on each of the tables where partitioned is performed to do the lookup. The mapping table is just to find which table to use. In fact for local index on the table SHA1 may be pretty good alternative, even though for our needs CRC32 works better and here is why.
If we do URL lookup we always need to fetch data about it not only check if it is in the database, so we need to do row read anyway if the row is found. With CRC32 index we get very fast responses if URL is not where because index is small and collisions are rare and if row is where you most typically have to read only one row anyway.
2) OK I see. So I simply say it is different discussion than as Brian was mainly speaking about performance.
3) Well. I agree with you in the sense it can be the case for certain data sets with certain range of operations. But in others other approaches may work better as I illustrated. World of performance is never black and white.
Regarding people using solutions – honestly in many cases people use generic solutions or solutions they find working rather than most optimal solution which exits and in many cases it is fine.
Luis,
Good comment to the wrong post. But in general you’re using wrong tweakers benchmarks there are new one with “fixed” MySQL version. But even “fixed” mysql is far from scaling perfectly.
I have not checked that FreeBSD comparison myself in details regarding its correctness but in general there are always ways to blame something else. It is Linux being guilty ? When why Oracle and PostgreSQL scale on Linux ?
In fact sometimes even CPU and hardware architecture matters a lot ie I found Nocona based (old Xeons) scale much worse compares to Woodcrest ones.
Interesting note on the Malloc tests. It is interesting it changes well with Google Malloc as in fact we’ve benchmarked number of malloc alternatives including hoard.org and bumch of alternative ones with no serious effect.
This was not surprising especially as MySQL does internal memory management. On this high query rate however even few per thread allocations MySQL does may matter.
O(1) Updates??? Awesome! What do you mean by hashcode routing though? Does this happen automatically? If I replaced my primary keys with UUID() generated varchars can I just use a normal UPDATE statement and expect the operation to happen in O(1)? If not, can someone please post a link on where one can read more about hashcode routing?
Ok I misinterpreted that, someone else clarified it for me. I thought there was a way to do O(1) updates….not the mapping 🙂
Kevin,
Do you have any good links or more information on how to go about doing “client side sharding of data.” This sounds like what I need to do. I am creating a app that will need quick storage and retrieval of 8-10 Billion rows of data. Splitting up the data onto several servers with a deterministic way of finding it again is what I need.
Any help would be greatly appreciated.
Jacob,
I can tell you what we do for http://www.boardreader.com with some billion rows which need quick retrieval.
The data is partitioned in “table groups” which are mapped to the servers. We use 64bit identifiers with lower byte used to store table group.
Search is done using “Sphinx” search engine and we basically need to find rows by IDs to show result set in most cases.
It appears that the benchmark done in this article was storing the UUID as a ascii representation of a binary number when in fact the UUID should have bin stored in binary form. This is equivalent to searching using a 4 byte integer for the auto_increment primary key and then using a persons First, Middle and Last name to search for the UUID implementation.
Write a function that converts the value returned by the UUID function to binary and store it in a binary(128) column. Write yourself another function that will cast it back to a characters string with hyphenation if you need to display it.
However, the primary use for UUIDs or GUIDs is data portability, not speed for searching. If you have worked for large companies where you have redundant data stored in many locations you have to manage this primary key much closer and have the ability to generate something that you know will be unique across the company. Integers and auto increment will not cut it.
Well it’s too bad the comparison wasn’t done using the UUID in binary format, autogenerating the GUID on the client side using Jimmy Nilsson’s GUID.COMB. Can you do that comparison peter? NHibnerate has an implementation of GUID.COMB.
I have seen those problems with UUID’s in MySQL when stored as text. Ideally, MySQL would have a UUID column type to store the values as binary rather than strings but convert to string anytime the row is returned. The ability to insert an ID in hex text and convert it to binary would be a must too. That’s the biggest problem with using binary fields: you can’t read them or copy and paste them without conversion. That change would provide better performance in terms of storage and indexing (a 16-byte column instead of 36).
In an effort to improve the situation, I created a simple UUID class capable of generating random UUID’s with an option to store them in Base64 rather than hex. Since the length of the UUID is always constant, I was able to trim off the extra = in the Base64 conversion and come up with a case-sensitive 22-byte UUID representation (VTIW7xOgReOGrL3vMRjm4Q, for example). The performance increase was enormous, and the overhead is much smaller (22 bytes vs. 16) and you are able to convert to binary and hex at any time.
Later, as I thought more of the problem, I realized the Base64 encoding was inefficient for a 128-bit number. In order to get maximum efficiency out of Base64, the number of bits needs to be divisible by 6. So I created a new identifier that was only 72 bits. (Yes, the collision probability goes up, but it is still one in 4.7 * 10^21.) These UID’s (as I call them) only take 12 bytes to store in a binary column and strike a very good balance between speed and uniqueness. They can also be translated to GUID’s (########-0000-0000-0000-00##########) and back when needed. (If 72-bits is not enough, use 96 bits to make a 16-byte UID).
Hi, Peter
I noticed your benchmark which shows 200 times performance differences between auto-increment and UUID(). I’m wondering if you did the same thing for InnoDB as well. We are using MySQL 5.0 and InnoDB, we got badly lock conflict when inserting rows into InnoDB tables with auto-increment column. So, we are considering to switch to UUID() as a PK. I did a very simple testing: wrote a stored procedure, which has a loop to insert a row into a table. the results for InnoDB are shown as below: inserting 100,000 rows into InnoDB tables, for a table with auto-increment, it tooks 254 sec; for a table without auto-increment but use UUID(), it took 263 sec; the same testing for MyISAM tables, I got 54 sec vs. 68 sec. the 54 sec is similar to what you got. What I did wrong?
My biggest gripe about UUID() is that it doesn’t generate random (or at least pseudo-random) values. A reason you would want to use UUID/GUID over auto incrementing integers is when you’re synchronizing data from multiple dispersant sources. If you’re constrained by incremental integers for primary keys you’re constantly updating IDs on record inserts during synchronization. If the UUID() values were random (like a Guid in .NET) there should be very little chance of collisions.
I would also like to see a UUID data type (which would use 16-byte binary representation instead of VARCHAR(36))
Here is the solution that I’m considering for my project:
1. Concatenate the numeric user id (1-9 digits) and the UNIX timestamp (10 digits)
–> [i.e. “3333” . “1111111111” = “33331111111111”] ***GUARANTEED UNIQUE.
2. Convert to base-36
–> [ “33331111111111” –> “a3gsei3”] — (if I had 100 million users, the longest ID would still only be 10 characters)
3. Store as binary in CHAR().
My best guess is that this strategy is a win-win (for my situation) over GUIDs and INTs. I have guaranteed-unique ids (since they are tied to the user id and the timestamp) that are available without querying the database. PLUS, they are SIGNIFICANTLY shorter than GUIDs (they are 1/3 the size).
Admittedly, I’m relatively new to highly-scalable database architecture, so I’d appreciate any thoughts or feedback. Thanks.
Oh boy. UUIDs are 128 bits == 16 bytes.
Store them in a BINARY(16) column.
Don’t mess around with CHAR(32) or even VARCHAR(36).
@Vitaly Karasik, my thoughts exactly. In fact you can make them even smaller if you strip out the information you already “know” when you store it. Portions of the UUID never change when generated on the same server, so if you have some other way of uniquely identifying the server you can toss that out when you store it and just add it back in when it’s needed. That brings the size down to a mere 10 bytes, which isn’t bad at all.
BTW in PHP that looks like this:
$uuid = substr($uuid, 0, 8) . substr($uuid, 9, 4) . substr($uuid, 14, 4) . substr($uuid, 19, 4);
where $uuid is the result of doing a SELECT UUID() in MySQL.
I am building an application that can function when network is not available, i.e. off line. I would then replicate the record back to central computer. The problem I face is the unique number to assign to the row. If I use int, I need to allocate a section of number of each of the instance of offline application. I am thinking UUID to solve the problem. Is this the right choice to solve this problem?
As I understood, the last part of UUIDs changes least on one machine, e.g:
1388681F-7348-11DF-83CD-D0B6181E9833
13CF3551-7348-11DF-8556-D1B6181E9833
13F8B66F-7348-11DF-BBC6-D2B6181E9833
141E8E03-7348-11DF-BFCE-D3B6181E9833
1438CCD7-7348-11DF-807F-D4B6181E9833
To improve InnoDB insert performance, primary keys should be inserted with least changing bits first, correct?
I wrote two functions to encode and decode UUIDs, please find them below.
Best,
-jens
DELIMITER $$
CREATE FUNCTION ENCODE_UUID (uuid CHAR(36))
RETURNS BINARY(16)
DETERMINISTIC
BEGIN
RETURN REVERSE(UNHEX(REPLACE(uuid, ‘-‘,”)));
END
$$
DELIMITER $$
CREATE FUNCTION DECODE_UUID (uuid BINARY(16))
RETURNS CHAR(36)
DETERMINISTIC
BEGIN
RETURN concat(
HEX(LEFT(REVERSE(uuid),4)),’-‘,
HEX(MID(REVERSE(uuid),5,2)),’-‘,
HEX(MID(REVERSE(uuid),7,2)),’-‘,
HEX(MID(REVERSE(uuid),9,2)),’-‘,
HEX(RIGHT(REVERSE(uuid),6))
);
END
$$
SELECT DECODE_UUID(ENCODE_UUID(‘0935C9C3-7345-11DF-BADD-A8B4181E9833’))
If you really need to find something between Jens Wagner’s solution Jay Paroline’s in PHP/MySQL, I suggest this approach.
First simply SELECT REVERSE(UNHEX(REPLACE(UUID(),’-‘,”))); this will give us the UNHEXed and REVERSEd UUID in the way that Jens Wagner above suggests. Then you can place this value into your database as you please.
Then when you retrieve the UUID from the database later, you call SELECT HEX(REVERSE(‘{$uuid}’)); And to make the UUID look more “natural,” simply run strtolower() to remove the caps.
Another thing I suspect may happen if you use UUIDs as PK is that every insert would cause mySQL to re-index since they’re not sequential by default as a normal numeric PK is. I think you’d take a performance hit on all INSERT operations.
Why make things so complicated? Innodb and MyISAM allows for a compound primary key. Why can’t you simply add a Server_Id column to each table?
CREATE TABLE
cust
(Cust_Id
int(10) unsigned NOT NULL AUTO_INCREMENT,Server_Id
smallint(6) NOT NULL DEFAULT ‘1’,First_Name
char(15) DEFAULT NULL,Last_Name
char(20) DEFAULT NULL,PRIMARY KEY (
Cust_Id
,Server_Id
)) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1
When each database is defined, run an app that alters the default value of the Server_Id of each table. (You probably also want a ServerId table with a row set to the same value). When the table data gets merged from other servers, you still have unique rows and you know which server created the row. The user app would identify each row as ccccc.sss where ccccc is the Cust_Id and sss is the server id. So the user sees customer id of 1234.21 and the program behind the scene just splits it into Cust_Id=1234 and Server_Id=21.
It is fast and it is readable. Of course it all depends on the administrator setting up unique Server_Id values for each server but if they are on a network you can write an app to do it and verify it.
Mike
UUID is 2 integers long, store it as a composite key of two ints, take the upper part from UUID to the first column and the lower part for the second composite column.
@bar4ka Composite keys are not a good idea as they are very slow. Almost all DBA’s will tell you to avoid composite keys.
Thanks, Good Article