Type 1 data is something you’ll hear frequently when talking about OLTP (OnLine Transactional Processing) systems, and it can be common within a data warehouse as well. What is it?
Type 1 data is essentially a permanent “current record” within a dimension. When a change occurs to a dimensional record, the change overwrites the previous record within the data warehouse. For example, if a customer moves and changes their address, the previous address is deleted from the warehouse.
Original Record
Key | Customer Name | Address | City | State | Zip |
1 | Slim Jim | 505 Tasty St. | Delicious | Texas | 12345 |
New Record (Replaces Original Record)
Key | Customer Name | Address | City | State | Zip |
1 | Slim Jim | 123 Snack Ave. | Yum | Arkansas | 54321 |
The advantage of this approach is that the data remains small, and it’s easy for end users to query their customer base without concern for misunderstanding the data when they potentially see what appear to be duplicate records within the dimension. The disadvantage of this is that no historical data is stored within the dimension. This can be problematic for things like customer address, as it will falsely match previous fact records with a current address. As a result, trend analysis data can get skewed as your dimensional data shifts. If I need to account for this, it’s probably that I’m Looking for More of a Type 2.
2 Comments Add yours