RDBMS Database Normalization vs. Denormalization for Speed and Performance

the new hot shot coder / friend (he really is a solid coder) was calling me out on my database architecture: why do you have redundant data? it'll slow the database down. you don't need the data in this table if it's connected through a join. i was telling him that for display logic, it's faster to avoid the CPU + disk hit of querying a join.

On Jun 1, 2011, at 1:19 AM, Xxxx Xxxxxxxx wrote:

What you know about Boyce Codd Normal form.

http://en.wikipedia.org/wiki/Database_normalization


My response:

yeh yeh… i was a systems major in cawlige. i’ve been building production DB driven sites for 10+ years…

i do follow RDBMS and normalization. but i'd argue against strict normalization any time speed matters: reading a 1/0 bit flag off the row is definitely faster than chasing a one-to-one or many-to-one relationship, because the join takes an extra disk + cpu hit.

one nice thing about InnoDB: it builds the indexes and relationships and caches them in memory. that requires lots of memory, but it makes shit faster for the relationship queries.
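the bit-flag argument above can be sketched with sqlite in python. the tables and column names here are made up for illustration (not from any real schema of mine): `users.is_premium` is the denormalized 1/0 flag, and `premium_members` is the normalized version of the same fact, which you have to join to recover.

```python
import sqlite3

# Hypothetical schema to illustrate the trade-off: a denormalized
# 1/0 flag on the row vs. resolving the same fact through a join.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT,
                        is_premium INTEGER DEFAULT 0);       -- denormalized flag
    CREATE TABLE premium_members (user_id INTEGER PRIMARY KEY);  -- normalized
    INSERT INTO users VALUES (1, 'alice', 1), (2, 'bob', 0);
    INSERT INTO premium_members VALUES (1);
""")

# Denormalized: single-table read, no join.
flag = cur.execute("SELECT is_premium FROM users WHERE id = 1").fetchone()[0]

# Normalized: same answer, but it costs a second table lookup.
joined = cur.execute("""
    SELECT premium_members.user_id IS NOT NULL
    FROM users LEFT JOIN premium_members
        ON users.id = premium_members.user_id
    WHERE users.id = 1
""").fetchone()[0]

print(flag, joined)  # both report premium status; the flag avoids the join
```

the cost shows up on writes: when premium status changes, the flag has to be updated in place, whereas the normalized table only ever inserts or deletes one row.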

this guy pretty much sums up my thoughts…

Database Denormalization – What is it?

Rookie database administrators are taught in school to always keep their databases normalized. A normalized database is one where you minimize duplication of data by keeping groups of data in separate tables, linking between the tables via keys (backed by indexes).

For example, as part of the SoftwareProjects CRM application, we store customer contact information in a ‘customers‘ table, and financial transactions in a ‘transactions‘ table. Both tables are keyed to a ‘customer_id‘, allowing easy lookup of all transactions belonging to an individual customer.
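the normalized layout the article describes can be sketched like this. everything beyond the `customer_id` key (the other column names, the data) is an assumption for the example, not SoftwareProjects' actual schema.

```python
import sqlite3

# Sketch of the normalized 'customers' / 'transactions' layout,
# both keyed on customer_id. Column names beyond the key are invented.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (txn_id INTEGER PRIMARY KEY,
                               customer_id INTEGER, amount REAL);
    CREATE INDEX idx_txn_customer ON transactions (customer_id);
    INSERT INTO customers VALUES (42, 'Acme Corp');
    INSERT INTO transactions VALUES (1, 42, 19.99), (2, 42, 5.00);
""")

# All transactions for one customer, resolved through the shared key.
rows = cur.execute("""
    SELECT c.name, t.amount
    FROM customers c JOIN transactions t USING (customer_id)
    WHERE c.customer_id = 42
""").fetchall()
print(rows)  # each result row pays for touching two tables
```

clean and easy to keep in sync, exactly as the article says; the join is the price.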

Whenever we take on new clients, we typically find clean, normalized tables with multiple indexes connecting tables together. At first glance the concept of database normalization makes perfect sense – it’s clean, easy to understand and makes it easy to keep data in sync.

In the real world of high traffic websites and users demanding sub-second response times, database normalization is a fatal mistake.

A de-normalized database is one where you keep redundant, frequently-accessed data duplicated across several tables, making it possible to fulfill a user request without having to look at more than a single table.

Remember: The key to high performance database access is sticking to single-table SELECT queries with short indexes.
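the single-table version of the previous example would look something like this: the customer name is copied onto every transaction row (an illustrative schema, not from the article), so the per-customer report never joins.

```python
import sqlite3

# Denormalized version: customer_name is a redundant copy on each
# transaction row, so the report is one indexed, single-table SELECT.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE transactions (txn_id INTEGER PRIMARY KEY,
                               customer_id INTEGER,
                               customer_name TEXT,   -- redundant copy
                               amount REAL);
    CREATE INDEX idx_txn_customer ON transactions (customer_id);
    INSERT INTO transactions VALUES (1, 42, 'Acme Corp', 19.99),
                                    (2, 42, 'Acme Corp', 5.00);
""")

# One table, one short index, no join.
rows = cur.execute("""
    SELECT customer_name, amount FROM transactions WHERE customer_id = 42
""").fetchall()
print(rows)
```

the flip side, which the article glosses over: if the customer renames, every duplicated copy has to be updated, so writes get slower and sync bugs become possible.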


————————————————————————————————-

from here: http://www.softwareprojects.com/resources/programming/t-database-performance-optimization-denormalization-1410.html

with pretty good discussions below.

http://www.mysqlperformanceblog.com/2008/12/22/high-performance-click-analysis-with-mysql/
