데이터베이스-relational data model/ domain, attribute, tuple/ relation, relational database / relation 특징/ null의 의미/ key 개념과 종류/ constraints 개념과 종류

snowball moon·2023년 10월 9일

DB&SQL

목록 보기

2/3

알아야 할 배경지식 set

set은 서로 다른 elements를 가지는 collection으로
하나의 set에서 elements의 순서는 중요하지 않다.
e.g. {1,3,11,4,7}/{3,4,11,7,1}

Cartesian product
A x B = { (a, b) | a∈A ∧ b∈B }

(1,p),(1,q),(1,r),(2,p)(2,q),(2,r)이 나올 수 있다.

set이 2개 일 때에는 binary relation인데

binary relation ⊆ AxB

A와 B의 Cartesian product 부분집합이 된다.

set이 n개 일 때에는 n-ary relation으로

n-ary relation ⊆ X1 x X2 ... Xn

이것 또한 Cartesian product의 부분집합이다.

이렇게 각각의 list들을 tuple이라고 부를 수 있는데 n개의 집합에 대한 튜플은 n-tuple이라고 한다.

수학에서 사용되는 relation의 개념은 아래와 같다.

-subset of Cartesian product
(Cartesian product의 부분집합이다.)
-set of tuples
(tuple들의 집합이다.)

relational data model에 적용하기

relational data model에서 set은 domain을 의미한다.
(elements 또는 value의 집합이 domain이다.)
relational data model도 tuple을 가질 수 있다.

예시 (domain 정의하기)

students_ids : 학번 집합, 7자리 integer 정수
human_names : 사람 이름 집합, 문자열
university_grades : 대학교 학년 집합, {1,2,3,4}
major_names : 대학에서 배우는 전공 이름 집합
phone_numbers : 핸드폰 번호 집합

phone_numbers는 두 개를 만들었는데 학생이 연락되지 않을 때를 대비하여 비상 연락망을 저장하기 위한 용도이다.
phone_numbers의 역할을 구분해주기 위해 relational data model에서는 attribute라는 개념을 사용한다.
(attribute는 domain에서 수행하는 역할에 대한 이름을 따로 붙여주는 것이다.)

위의 그림은 이해를 위한 것이고 relational data model에서는 보통 아래와 같은 table로 표현한다.

domain안에서 각각의 역할의 이름이 부여된 것을 attribute라고 하고, attribute 각각의 값들로 이루어진 list를 tuple이라고 한다. 전체를 relation이라고 하고 table로 표현한다. 이러한 relation에는 name이 있다.

주요 개념&설명
domain - set of atomic values
domain name - domain 이름
attribute - domain이 relation에서 맡은 역할 이름
tuple - 각 attribute의 값으로 이루어진 리스트(일부 값은 NULL일 수 있음)
relation - set of tuples
relation name - relation의 이름

(domain은 값들의 집합인데 여기서 값은 더 이상 나눌 수 없는 값이어야 한다.)

relation schema

-relation의 구조를 나타냄
-relation의 이름과 attributes 리스트로 표기
e.g. STUDENT(id, name, grade, major, phone_num, emer_phone_num) ->여기서 시각적으로 표현되지 않지만 attributes와 관련된 constraints도 포함

degree of a relation

-relation schema에서 attributes의 수
e.g. STUDENT(id, name, grade, major, phone_num, emer_phone_num) -> 여기서 degree는 6임

relation (or relation state)

추상적으로 개념적으로 전체를 relation이라고 할 수 있지만 실제 데이터, 실제 튜플들의 집합, 임의의 시점에서의 튜플들의 집합도 relation 혹은 relation state라고 할 수 있다.
(이때는 특정 데이터에 한정해서 relation이라고 함)

relational datebase

-relation data model에 기반하여 구조화된 database
-relational database는 여러 개의 relations로 구성됨

relational database schema

relation schemas set과 integrity constraints(무결성 제약조건) set으로 구성 됨

->integrity constraint(무결성 제약조건)이란?

database의 정확성, 일관성을 보장하기 위해 저장, 삭제, 수정 등을 제약하기 위한 조건을 의미함
(개체 무결성, 참조 무결성, 도메인 무결성, 고유 무결성, NULL 무결성, 키 무결성이 있음)

relation의 특징

1.relation은 중복된 tuple을 가질 수 X(relation is set of tuples)

2.relation의 tuple을 식별하기 위해 attribute의 부분집합을 key로 설정

3.relation에서 tuple의 순서는 중요X(튜플의 순서가 바뀌어도 relation의 의미는 달라지지 않는다. 즉 순서를 정하는 방법은 여러가지이다.)

4.하나의 relation에서 attribute의 이름 중복X

5.하나의 tuple에서 attribute의 순서는 중요X

6.attribute는 atomic 해야 함(composible or multivalued attribute X)

->서울특별시/마포구/공덕동으로 더 나눠질 수 있는 attribute이다. 이렇게 하나로 묶인 것을 composible attribute라고 하는데 나눠서 저장해야한다.

NULL의 의미

1)값이 존재하지 않는다.
2)값이 존재하나 아직 그 값이 무엇인지 알지 못한다.
3)해당 사항과 관련이 없다

->NULL은 중의적인 의미를 가진다.(쓰지 않는 것이 좋다.)

superkey

:relation에서 tuples를 *unique하게 식별할 수 있는 attributes set

e.g. PLAYER(id, name, team_id, back_number, birth_date)의 superkey는
{id, name, team_id, number, birth_date}, {id, name}, {name, team_id, back_number}, ... etc

식별할 수 있는 고유한 값이 있으면 됨

candidate key

:어느 한 attribute라도 제거하면 unique하게 tuples를 식별할 수 없는 super key
->key or minimal superkey

e.g. PLAYER(id, name, team_id, back_number, birth_date)의 candidate key는 {id}, {team_id, back_number}->독립적으로 있을 때 식별이 불가함

primary key

:relation에서 tuples를 unique하게 식별하기 위해 선택된 candidate key

e.g. PLAYER(id, name, team_id, back_number, birth_date)의 primary key는 {id} or {team_id, back_number}
(보통 attribute 수가 적은 것을 primary key로 선정)

unique key

:primary key가 아닌 candidate keys
->alternate key

e.g. PLAYER(id, name, team_id, back_number, birth_date)의 unique key는 {team_id, back_number}
(선택되지 못한 나머지가 unique key가 됨)

foreign key

:다른 relation의 PK를 참조하는 attributes set

e.g. PLAYER(id, name, team_id, number, birth_date)와 TEAM(id, name, manager)가 있을 때
foreign key는 PLAYER의 {team_id}
(여기서 PLAYER의 team_id는 TEAM에 id를 참조하게 된다.)

constraints

: relational database의 relations들이 언제나 항상 지켜줘야 하는 제약 사항

implicit constraints

-relational data model 자체가 가지는 constraints
-relation은 중복되는 tuple을 가질 수 없음
-relation 내에서는 같은 이름의 attribute를 가질 수 없음

schema-based constraints

-주로 DDL을 통해 schema에 직접 명시할 수 있는 constraints
-explicit constraints

schema-based constraints 종류

1) domain constraints

-attribute의 value는 해당 attribute의 domain에 속한 value여야 한다.
(여기서 100학년은 말이 되지 않는다.)

2) key constraints

-서로 다른 tuples는 같은 value의 key를 가질 수 없다.
(id가 primary key라고 했을 때 서로 다른 tuple인데 같은 key값을 가지는 것은 안된다.

3) NULL value constraint

-attribute가 NOT NULL로 명시 되었다면 NULL을 값으로 가질 수 없다.

4) entity integrity constraint

-primary key는 value에 NULL을 가질 수 없다.
(tuple을 식별하는 키는 NULL값을 가지면 안된다.)

5) referential integrity constraint

-FK와 PK와 domain이 같아야 하고 PK에 없는 values를 FK가 값으로 가질 수 없다.
(참조하고 있는 primary key에 존재하는 값을 참조해야한다.)

database의 정확성, 일관성을 보장하기 위해 제약조건을 만들었다.

References

시니어 백엔드 개발자가 알려주는 데이터베이스 개론 & SQL
https://www.inflearn.com/course/%EB%B0%B1%EC%97%94%EB%93%9C-%EB%8D%B0%EC%9D%B4%ED%84%B0%EB%B2%A0%EC%9D%B4%EC%8A%A4-%EA%B0%9C%EB%A1%A0/dashboard

snowball moon

이전 포스트

데이터베이스-DB/DBMS/DB system/data models/schema & state/three-schema architecture/database language

다음 포스트

데이터베이스-SQL 기본 개념/데이터베이스 생성/테이블 생성/attribute data type 소개/primary key 설정/unique constraint 설정/not null constraint 설정/attribute의 default 값 설정/check constraint 설정/foreign key 설정/constraint 이름 명시/테이블 변경(alter table)/테이블 삭제 (drop table) 커리큘럼