From his article:
Something that’s always confused me is the near-religion of data normalization among programmers and database admins. Every developer I’ve ever worked with has told me that when you’re building a database, you need to normalize your data — basically this means organizing your data in such a way that removes redundancy — and failure to do so would result in public ridicule and possible revocation of access to any computing device. But I’ve always wondered, given that hard drives are cheap and getting cheaper, what’s the problem with using more storage space in exchange for greater speed?
To me, he’s missing the point entirely. Normalization isn’t about saving hard drive space. It’s about ensuring the integrity of the data. It’s about using database capabilities to make applications easier to manage and maintain. Any small speed benefit derived from non-normalized data disappears quickly when an application has to start sorting through redundant data.
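To make the integrity point concrete, here's a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and data are all hypothetical, invented for illustration; the point is the classic update anomaly: when the same fact is copied into many rows, a partial update leaves the database contradicting itself, whereas a normalized schema stores the fact exactly once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Denormalized: the customer's email is repeated on every order row.
conn.execute("""
    CREATE TABLE orders_denorm (
        order_id INTEGER PRIMARY KEY,
        customer_name TEXT,
        customer_email TEXT,   -- redundant copy on each row
        item TEXT
    )
""")
conn.executemany(
    "INSERT INTO orders_denorm VALUES (?, ?, ?, ?)",
    [(1, "Ada", "ada@old.example", "keyboard"),
     (2, "Ada", "ada@old.example", "mouse")],
)

# An update that only touches one row creates contradictory data:
conn.execute(
    "UPDATE orders_denorm SET customer_email = 'ada@new.example' "
    "WHERE order_id = 1"
)
emails = {row[0] for row in conn.execute(
    "SELECT customer_email FROM orders_denorm WHERE customer_name = 'Ada'"
)}
# emails now holds two different addresses for the same customer,
# and nothing in the schema can tell you which one is correct.

# Normalized: the email lives in exactly one place, so it cannot diverge.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT,
        email TEXT
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        item TEXT
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'ada@old.example')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, 1, "keyboard"), (2, 1, "mouse")])
# One UPDATE, one row, and every order sees the new address.
conn.execute(
    "UPDATE customers SET email = 'ada@new.example' WHERE customer_id = 1"
)
```

This is the kind of cleanup work the speed argument ignores: once the redundant copies disagree, no amount of cheap disk space tells you which row to believe.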
I don’t think Mr. Kottke has ever had to work with a database that wasn’t normalized. Or with a database in which an unnormalized table has 100+ columns, most of them empty; it’s hellish to maintain. Or had to clean up a large application riddled with redundant data.
On the surface, normalization may seem more complex. And, yes, fifth normal form is a little crazy. However, sensible normalization is absolutely essential to maintaining data integrity in an application.