For those not familiar with the children’s story, you can click here for an entertaining rendition, or read further. The story begins with three little pigs, each of whom chooses to build a home – one of straw, one of sticks, and one of bricks.
The first little pig builds his home out of straw. He completes it very quickly with minimal labor, leaving himself ample time to enjoy himself. Like many nomadic peoples of old, he is simply looking to solve the problem of today – namely, he wants a roof over his head in as expedient a manner as possible. The concerns of tomorrow are not for him. This is the consummate “just get it done” pig, and he would likely live paycheck to paycheck.
The second little pig, being of a more industrious sort, chooses to build his house out of sticks. He’s still able to complete it with relative ease, but it certainly requires more work than that of the first little pig. He has to chop, shape, and plan more effectively to ensure its construction works. His is more of a semi-permanent way of thinking. He isn’t the kind of pig who would worry overmuch about retirement, but he would certainly have a savings account.
The third little pig decides to build his house from bricks. This gives his brothers ample time to deride him for his efforts, as he continues to labor away while they are off enjoying themselves. He has to gather mud, build a kiln, bake bricks, and then mortar them all together. This is a very labor-intensive and time-consuming process, but at the end, he is pleased with the results. He wants to settle down and stay, and nothing says permanence like stone or bricks. This is the kind of pig who is willing to sacrifice today in order to ensure his future. This little pig almost certainly would have a well-funded 401(k) and a hefty savings account.
At this point in the story, a wolf enters and demands the first little pig let him into his house. The first little pig, being a pig and completely unwilling to share his shelter with a wolf, refuses the request (and in fairness, the wolf looks hungry, so I can’t blame him for refusing it). The wolf decides enough is enough, and blows the house down, sending the little pig fleeing to the transient shelter of his brother’s house of sticks.
The wolf, who I imagine is still hungry, uses his excellent sense of smell to track the first little pig to the house of sticks (at least, this is how my predictive analytics model thinks that would work with a real wolf). As before, he demands entry, and this time, two voices sound a refusal in unison. The wolf, meanwhile, is not only hungry, but angry – he’s full-fledged HANGRY – and he promptly blows this house in, too. Of course, at this point, I would guess he needs a brief break before continuing his pursuit, as he’s already huffed and puffed his lungs free of oxygen. Maybe he’s secretly a smoker. Regardless of the actual cause, the first two little pigs are able to escape and flee to the third little pig’s house.
The third little pig invites his brothers in while they await the arrival of the wolf, who is sure to follow them there. Sure enough, a short time later the wolf arrives and demands entry. Now he hears three little piggies gleefully refuse him entry. The wolf is furious now, and he feels like he’s starving. So, as before, he huffs and he puffs and he blows… but the house doesn’t budge. Again and again he tries, to no avail. The brick house is too strong for him to break it down. The wolf, however, is a cunning beast, and decides to infiltrate the home with his best Santa imitation – right down the chimney. Unfortunately for him, the third little pig was prepared, and the wolf comes down the chimney right into a pot of boiling water. He subsequently is cooked and eaten by the three little pigs.
So, how does this fit with data warehousing, you might ask? Well, the last word in data warehouse is HOUSE. In the tale, the wolf represents the unknown dangers and difficulties we regularly encounter in life. In a data asset, you’re going to encounter unexpected angles of analysis – constantly. You’re going to have people pounding at your door wanting answers (“I’ll huff, and I’ll puff”) – and if your data structure isn’t strong enough, it may very well topple under the weight of all of those queries (“and I’ll blow your house in”). You’re going to get requests to add onto your current structures as well. Much like the houses in the fictional tale, a great deal of your success with a warehouse long term will depend on the effort and planning you put into it up front.
To continue the analogy a bit further, I would say that a straw house in the business world is a really cool and complex spreadsheet. I’ve seen this countless times in business, and indeed, LinkedIn pings me regularly with Excel “Analytics” classes. Much like a sod house on the Great Plains, it can provide quite a bit of functionality, but becomes cumbersome the more you add onto it. On the plus side, it’s quick, cheap, and requires little technical expertise. However, you’re not going to pass on a sod house to future generations, nor are you going to build it with bay windows and hardwood floors. A spreadsheet solution is generally the first step in understanding that there is a set of data that needs some additional analysis. However, by its very nature, it’s not meant to be a permanent analytics solution.
Moving on, I would posit that the house of sticks in the business view would be an ODS (operational data store) solution, or perhaps even a snowflake schema. I’m seeing these pushed more and more often, as they can be created relatively quickly by consulting companies, but in my opinion, they can be costly and burdensome to maintain long term. Of course, consulting companies aren’t worried about the long term, because if the project fails three or four years down the line as it grows in the depth and scope of its data, they can be hired to come in once more and design a new solution. This is not much different from the construction companies throwing up new subdivisions of cheaply made wood-frame homes throughout the country. They even burn faster than older homes. Similarly, ODS and snowflake schemas have more moving parts (tables and joins), making it easier (in my opinion) for something to break and for the whole solution to become unsustainable.
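To make the “more moving parts” point a little more concrete, here is a minimal sketch using Python’s built-in sqlite3 module. Everything in it is an assumption made up for illustration: the retail scenario, the table and column names (sf_sales, star_fact_sales, and so on), and the sample rows. It asks the same question, total sales by category, of a snowflaked product dimension and of a flat one, and counts the joins each query needs.

```python
import sqlite3

# Hypothetical retail example: all table names, column names, and rows
# below are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Snowflake style: the product dimension is normalized into three tables.
    CREATE TABLE sf_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE sf_brand    (brand_id INTEGER PRIMARY KEY, brand_name TEXT, category_id INTEGER);
    CREATE TABLE sf_product  (product_id INTEGER PRIMARY KEY, product_name TEXT, brand_id INTEGER);
    CREATE TABLE sf_sales    (product_id INTEGER, amount REAL);

    -- Star style: one flat product dimension next to the fact table.
    CREATE TABLE star_dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT,
                                   brand_name TEXT, category_name TEXT);
    CREATE TABLE star_fact_sales  (product_key INTEGER, amount REAL);

    INSERT INTO sf_category VALUES (1, 'Tools'), (2, 'Toys');
    INSERT INTO sf_brand    VALUES (1, 'Acme', 1), (2, 'Zoom', 2);
    INSERT INTO sf_product  VALUES (1, 'Hammer', 1), (2, 'Wrench', 1), (3, 'Kite', 2);
    INSERT INTO sf_sales    VALUES (1, 10.0), (2, 15.0), (3, 7.5), (1, 5.0);

    INSERT INTO star_dim_product VALUES
        (1, 'Hammer', 'Acme', 'Tools'),
        (2, 'Wrench', 'Acme', 'Tools'),
        (3, 'Kite',   'Zoom', 'Toys');
    INSERT INTO star_fact_sales VALUES (1, 10.0), (2, 15.0), (3, 7.5), (1, 5.0);
""")

# Sales by category in the snowflake layout: walk product -> brand -> category.
SNOWFLAKE_QUERY = """
    SELECT c.category_name, SUM(s.amount)
    FROM sf_sales s
    JOIN sf_product p  ON p.product_id = s.product_id
    JOIN sf_brand b    ON b.brand_id = p.brand_id
    JOIN sf_category c ON c.category_id = b.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
"""

# The same question in the star layout: a single hop to the dimension.
STAR_QUERY = """
    SELECT d.category_name, SUM(f.amount)
    FROM star_fact_sales f
    JOIN star_dim_product d ON d.product_key = f.product_key
    GROUP BY d.category_name
    ORDER BY d.category_name
"""

snowflake_rows = conn.execute(SNOWFLAKE_QUERY).fetchall()
star_rows = conn.execute(STAR_QUERY).fetchall()
snowflake_joins = SNOWFLAKE_QUERY.count("JOIN")
star_joins = STAR_QUERY.count("JOIN")

print("snowflake:", snowflake_rows, "joins:", snowflake_joins)
print("star:     ", star_rows, "joins:", star_joins)
```

Both queries return the same totals, but the snowflake version has three joins to keep correct where the star version has one. In a toy example that difference is trivial; the concern in the paragraph above is what happens when it is multiplied across dozens of dimensions and years of changes.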
Last, but definitely not least, I would liken the brick house to a true Kimball-esque star schema data model. The beauty of building with brick and stone is that you know, with certainty, that it’s going to outlast most other materials. A well-developed and well-planned star schema model is the same. Both are costlier to build and certainly take longer, but you can add on as needed without fear of the structure collapsing on itself. Most importantly, by planning properly, the reduction in maintenance costs will enable you to build a sustainable solution that will last long after your career is over. I’m pretty certain the architects of the Giza pyramids were well satisfied with their work, as were the architects of the Taj Mahal and the Great Wall of China. I know the world has been amazed by them for hundreds (or thousands) of years. Why settle for anything less in your solutions? Take a page out of the third little pig’s book – build it right, and hunt the wolf in the comfort of your own HOUSE.
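As a rough illustration of “adding on as needed,” here is a minimal sketch, again using Python’s built-in sqlite3 module. The tables (fact_sales, dim_product, dim_store), their columns, and the sample rows are all hypothetical, and backfilling the new key with UPDATE statements is a simplification; a real warehouse would normally assign it during the load. The idea is simply that a new dimension can be bolted onto the fact table while a report written before the change keeps returning the same answer.

```python
import sqlite3

# Hypothetical star schema: every name and row here is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY,
                              product_name TEXT, category_name TEXT);
    CREATE TABLE fact_sales  (product_key INTEGER, amount REAL);

    INSERT INTO dim_product VALUES (1, 'Hammer', 'Tools'), (2, 'Kite', 'Toys');
    INSERT INTO fact_sales  VALUES (1, 10.0), (1, 5.0), (2, 7.5);
""")

# An existing report, written before anyone asked about stores.
SALES_BY_CATEGORY = """
    SELECT d.category_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product d ON d.product_key = f.product_key
    GROUP BY d.category_name
    ORDER BY d.category_name
"""
before = conn.execute(SALES_BY_CATEGORY).fetchall()

# The "addition": a new store dimension plus one new key column on the fact table.
conn.executescript("""
    CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, store_name TEXT);
    INSERT INTO dim_store VALUES (1, 'Downtown'), (2, 'Airport');

    ALTER TABLE fact_sales ADD COLUMN store_key INTEGER;
    -- Simplified backfill for the sketch only.
    UPDATE fact_sales SET store_key = 1 WHERE product_key = 1;
    UPDATE fact_sales SET store_key = 2 WHERE product_key = 2;
""")

# The old report is untouched by the change...
after = conn.execute(SALES_BY_CATEGORY).fetchall()

# ...and the new angle of analysis is one more single-hop join.
by_store = conn.execute("""
    SELECT s.store_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_store s ON s.store_key = f.store_key
    GROUP BY s.store_name
    ORDER BY s.store_name
""").fetchall()

print("before:", before)
print("after: ", after)
print("by store:", by_store)
```

The existing query never mentions the new column or table, so it does not need to be rewritten when the model grows.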