For the purpose of this post, I will use UUID (Universally Unique Identifier) to mean both UUID and GUID (Globally Unique Identifier, Microsoft's implementation of UUID).

I've been thinking a lot about UUIDs lately. The system I've inherited at work is plagued by their usage. To many developers, the UUID seems like a totally awesome way to establish the identity of a record in a system. I mean, how cool is it that you can generate an ID unique to every system in the world?

The purpose of this post is to discuss appropriate and inappropriate uses of UUIDs. My goal is to encourage engineers to think about the general consequences of data type selection for identifiers in their architectures.

To many databases, UUIDs are just 36-character strings. Databases like MySQL do not have a native implementation of the data structure. This means the column that carries the value must be at least 36 characters wide (VARCHAR(36)). When you consider the text encoding (e.g. the MySQL character set) used to represent strings, this could mean as much as 2-3 bytes per character with a multi-byte character set, depending on how the storage engine allocates space. That can mean 72 bytes or more per identifier! While this doesn't seem like a big deal, consider that at 72 bytes each, every 13,889 identifiers will consume roughly 1MB of storage. If you are using foreign keys, and they are also UUIDs, that's another 1MB of storage for each foreign key column.

If you use a UUID as an identifier for a table, you're going to have to index it. The problem is their size and randomness. Indexes are trees that grow and branch as you add more data. Sequential values tend to index well because they don't require large sections of the index to be realigned (for example, splitting a node into a new branch of values; see B-tree). UUIDs are generally non-sequential, and they are very large compared to an integer. This means the more UUIDs you insert, the larger the insertion penalty will be.

Don't take my word for it, just look at the statistics.

[Chart: insert time in hours, long integer versus UUID identifiers]

Notice the nearly constant insertion time of a long integer versus that of a UUID. More importantly, note that the scale on the left is "insert time in hours".

You will almost certainly see better performance on queries using integers as well, though the difference might not be as pronounced. Auto-incremented integers will almost always be smaller, meaning scans against tables and records will be more efficient. On the other hand, the index that stores UUID identifiers will grow much larger, and at a faster rate, than an integer index. This means a UUID index will become disk bound (because it can't fit completely into memory) more quickly than an integer index.

Are UUIDs the right data structure for the task?

The point of a UUID is to have a universally unique identifier. There are generally two reasons to use UUIDs:

1. You do not want a database (or some other authority) to centrally control the identity of records.
2. There's a chance that multiple components may independently generate a non-unique identifier.

These concerns should generally only arise when you are in a concurrent or distributed environment.

The first reason is about avoiding an unnecessary call to an external system. You don't want a high-throughput message broker like RabbitMQ asking MySQL for an identifier every time it publishes a message.
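To make the numbers in this post concrete, here is a minimal Python sketch using only the standard `uuid` module. It generates an identifier locally (no call to a database) and reproduces the storage arithmetic. The helper `identifiers_per_megabyte` is a hypothetical name introduced for illustration, and 1MB is assumed to mean 1,000,000 bytes, which is what the 13,889 figure implies once the result is rounded up. The 8-byte integer comparison assumes a 64-bit long integer column.

```python
import uuid

# Generate an identifier locally; no central authority is consulted.
record_id = uuid.uuid4()

text_form = str(record_id)      # canonical hyphenated form, 36 characters
binary_form = record_id.bytes   # the underlying 128-bit value, 16 bytes


def identifiers_per_megabyte(bytes_per_id: int, megabyte: int = 1_000_000) -> int:
    """Return how many whole identifiers of the given size fit in one megabyte."""
    return megabyte // bytes_per_id


print(len(text_form), len(binary_form))  # 36 16
print(identifiers_per_megabyte(72))      # 13888 (36 characters at 2 bytes each)
print(identifiers_per_megabyte(8))       # 125000 (64-bit integer, assumed)
```

Under these assumptions, a 72-byte textual UUID takes nine times the space of an 8-byte integer identifier, before any index overhead is counted.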