Science Meets Art – A Marriage of the Mind Part 1

Some may approach the concept of database design as something of a textbook exercise.  Putting together the pieces, normalizing things, cutting and moving parts of data into different places.  What makes a database design inspired? What takes the concept of the entity to the next level, whereby the queries are simple to write and the data stays very clean? Simply stated, that comes from a delicate marriage of understanding the data and the science of database architecture.  An art is born.

Looking at the science of database design, an understanding of domains (Primary Keys), Ranges (Column values), indexing, statistics, and the query engine.  It’s essentially the toolbox of data storage, one layer up from the base of block level disk access.  This discussion is aligned towards MS SQL Server, but most OLTP engines follow similar paradigms.  The tools of the trade involves laying out tables to store data.  Dr. Codd, the mind behind relational data storage, arrived at the concept of normal forms.  He essentially arrived at 6 normal forms, which range from a single value in a single table to one big blob of data.  It’s become commonly accepted that the design form OLTP databases attempt to achieve is third normal form. The basic meaning is that each element of data gets stored in one place and one place only.  There are exceptions to every rule, but most examples seen here will follow that rule.  What does that mean really?  This is where the art begins to creep into the picture.

The art involves understanding the data and adapting it to the tools available.  Mapping and organizing the data into logical groupings is a way to start, but first all the data needs to be dropped into the coin sorter.  Thinking of this as a hierarchy sometimes helps to identify the top levels of each logical grouping.  Once the top levels are identified then the data which corresponds to each top level can identified, and some questions need to be asked.  Which elements of data relate to the top levels directly, one for one?  Those elements, with certain exceptions, should describe the top node in the hierarchy directly and be stored within the same table as the top level of the hierarchy.  For example, the Customer should likely contain a name of some sort; company, first and last, etc.  It may also contain something such as the date the record (customer) was created.  Those elements of data would be flushed out when the concept of the customer was envisioned.  However, a word of caution, don’t be too quick to put all the data elements into one big blob of a record, a hundred columns wide.  Why?  It really comes down to maintenance.  How would an application display and make such things logical given the real estate of a common user interface?  Also, different data points should have different CRUD rules (Create, Retrieve, Update and Delete).  Again, another word of caution, don’t allow the application design to dictate database design.  Storage isn’t usage.

Next time I’ll attempt to blow out these concepts some more.  Until then I welcome your comments,..

Clean data storage to you 🙂

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: