Be the Frank Lloyd Wright of Your Data Warehouse


I studied architecture in middle school (yes, I am that big of a nerd) and fell in love with the patterns of Frank Lloyd Wright designs. Widely regarded as one of the best architects in U.S. history, Wright’s designs are legendary for their attention to the smallest of details. He drew the plans, selected the…

The Truth, the Whole Truth, and Nothing but the Truth…


How you finish that statement tells a great deal about a person. If you finish with “so help me God”, then you’re likely either a Christian, or spend way too much time watching courtroom dramas. I’m kidding – we all know there’s no such thing as too much Matlock or Perry Mason. Alternatively, if you’re…

Regression to the Mean Machine


Old habits die hard. That, in a nutshell, is the concept behind regression to the mean. To understand this concept, let’s first define what mean, means (and forgive me for sounding like Bill Clinton during the Lewinski affair). Mean is the highfalutin way statisticians say “average”. With regression to the mean, the philosophy is that…

Roll Up Your Sleeves, Not Your Counts


Rolling up your shirtsleeves has long been associated with the idea of working hard. I have zero issues with working hard – but you should work smart as well. I was reminded of this recently in a discussion with a colleague regarding a table that stores distinct record counts by day. The idea behind the…

Because You Know It’s All About the Grain


The most important question to answer when designing a fact table is to determine what the grain of the transaction you wish to record is. Quite literally, it’s all about the base grain. If you select too high of a grain, the you lose your ability to drill down into more detailed analysis. If you…

Just the Facts, Jack!


While we’ve previously discussed the importance of determining the grain of your fact table, we really didn’t cover exactly what data elements should go into the table. One of the reasons much of this site will concentrate on dimensions is because the dimensional data keys are integral to the fact table build. Once you determine the…

I’m Sorry, You’re Just Not My Type 1 Data…


Type 1 data is something you’ll hear frequently when talking about OLTP (OnLine Transactional Processing) systems, and it can be common within a data warehouse as well. What is it? Type 1 data is essentially a permanent “current record” within a dimension. When a change occurs to a dimensional record, the change overwrites the previous…

I’m Looking for More of a Type 2


We covered Type 1 Data in a previous article. So now that you know what Type 1 data is, what is Type 2? Also known as a slowly changing dimension (SCD), Type 2 dimensions simply are dimensional records that represent a unique value at a specific and unique point in time.  The key advantage to this…