When I wrote the new tack it was from the standpoint that the table I’d been asked to endorse didn’t have a Primary Key at all, it was a heap 😦 That’s less than something which is performance oriented and I think I covered some options; the Composite Key and the Surrogate Key.
Today I’d like to dig further into the Composite Key a bit. A Composite Key would be more than one column (range) of data coming together to provide a unique identifier for a given row (domain). The Primary Key would be defined on the combination of the values.
In this scenario, the PK (Primary Key) would provide uniqueness but how would it be maintained over the lifespan of the table, assuming CRUD operations would be performed against that table. (CReate, Update, Delete = CRUD) In this case, consider the storage on disk. How does that data end up stored and what, if anything can we as the designers do to help ensure we have good fast access to the data we need? Also, what indexing strategies can be employed to ensure fast access?
The answers become a bit more complicated in the CK (Composite Key) scenario, because the designer needs to decide if the PK will be Clustered, or Non-Clustered. Looking at the options more closely can help, but to do that you need to know what Clustered and Non-Clustered mean. Clustering on a PK means that’s the order the data is stored on disk, some of that depends on the collation, but that’s another topic altogether. For now, understand that the Cluster Key or PK in this case may or may not be Clustered.
Going back to the usage of the data and the table itself, we recall that the CRUD operations will occur. This table isn’t a table holding a lot of static data, or lookup values. CRUD will be accessing the data via the PK or another index, making updates, deletes and additions. If the PK is Clustered, and the data is ordered that way on disk, does it make sense to use a Composite Key? Will the CRUD operations cause the data to become fragmented? Spread all across the disk in several different places? Could that make CRUD operations run more slowly as the arm reading the data has to move here and there? YOU BET!! Could you make the Clustered Index another index? I’m here to tell you that you can, or there are other options, like a Surrogate Key and a Unique Index on what would have been the Composite Key.
And a quick note on Flash arrays, and SAN. Don’t think that Flash can save bad DB design, it can’t. Flash is a good mask for some things, and a good SAN can overcome some cracks in design, but it cannot overcome index fragmentation and that is the main point being flushed out here.