What is DB Normalization? And Why Do We Normalize It?

Haebin Ethan Jeong·2020년 8월 31일
0

What is DB Normalization?

  • A technique of organizing the data in the database. It is a systematic approach of decomposing tables to eliminate data redundancy.
  • In other words, it's making sure each table has the only minimal fields and to get rid of dependencies.
    • Data Dependency: the application directly access the data in memory and has to make some assumptions about how the data is organised.
  • Normalization would eliminate the possibility of having null fields.

Scenario - When to Use

Prompt: Imagine you have an employee record, and each employee belongs to a department. If you store the department as a field along with the other data of the employee, you have a problem - what happens if a department is removed? You have to update all the department fields, and there's opportunity for error. And what if some employees does not have a department. Now there will be null values.

Solution:

  • In the emplolyee table, the fields could be id, name, social security number, but these attributes have nothing to do with the "department".

  • Only "employee_id" will tell us what department the employee belongs to.

  • So we need an another table for which department an employee is in.

  • 1NF (First normal form):
    - Unnormalized form:
    EMPLOYEE ( < employee_id >, name, social_security, department_name
    - Normalized form:
    EMPLOYEE ( < employee_id >, name, social_security)

  • Another table that we store which department the employee belongs to?

    • We could do EMPLOYEE_DEPARTMENT ( < employee_id >, department_name ), but the department_name might change. Right?
    • So we do this...
    • EMPLOYEE_DEPARTMENT ( < employee_id >, department_id )
    • DEPARTMENT ( < department_id >, department_name )
      - This is Two and Third Normal Form (2NF and 3NF).

Summary

  • Basically, we have total of 3 tables - EMPLOYEE, EMPLOYEE_DEPARTMENT, and DEPARTMENT.
  • We could put all the information (like which department does the employee belong to, or social security number) in one row. But this is not normalized because it could produce null values if department disappears or changes its name.
  • So we create the middle table called EMPLOYEE_DEPARTMENT and store all these data separately in order to prevent the null values.

HELPFUL VIDEO: https://www.youtube.com/watch?v=KqvIGYjcLQ4

profile
I'm a Junior studying Economics and Computer Science at Vanderbilt University.

1개의 댓글

comment-user-thumbnail
2023년 4월 4일

who need help improving their coding skills hawaiian necklace , daily game solve cross

답글 달기