BCNF
- 모든 functional dependency å -> ß에 대해
- trivial (ß ⊂ å)
- å가 superkey
Testing for BCNF
어떤 non-trivial한 dependency å->ß가 BCNF를 violate하는지 체크하기 위해서는
- compute å+ (the attribute closure of å)
and
- verify that it includes all attributes of R
(that is, it is superkey of R)
Simplified Test
- relation R이 BCNF인지 확인하기 위해서,
F에 있는 given dependency들만 BCNF를 violate하는지 체크하면 된다.
굳이 F+에 있는 것들까지 체크할 필요는 없다.
- If none of the dependencies in F causes a violation of BCNF, then none of the dependencies in F+ will cause a violation of BCNF either.
However,
- simplified test using only F
is incorrect when testing a relation in a decomposition of R
- 이럴 때는 F+ 를 요구된다.
하지만 계속 나왔듯이 F+는 exponential time이 걸리기 때문에 비효율적이다
Example
- Consider R = (A, B, C, D, E)
with F = { A->B, BC->D}
- Decompose R into R1 = (A, B) and R2 = (A, C, D, E)
- R1은 BCNF이다 (A가 superkey니까?)
- R2에는 적용할 수 있는 dependency가 없다.
그렇다면 trivial한 것 밖에 없다.
그러니까 BCNF이구나
라고 하는 게 incorrect
- F closure에 있는 dependency에 대해 검사해보자.
A->B => AC->BC + BC->D => AC->D
(AC)+ = ACD이므로 AC는 NOT superkey
=> AC->D is nontrivial dependency
따라서 R2는 BCNF가 아니다
Testing Decomposition for BCNF
R의 decomposition중 Ri가 BCNF를 만족하는지 확인하기 위해서는
-
test Ri for BCNF w.r.t the restriction of F+ to Ri
(that is, all FDs in F+ that contain only attributes from Ri)
or
-
Use the original set dependencies F that hold on R, but with the following test :
- 만약 이 조건이 어떤 å 때문에 안된다면,
å -> (å+ - å) ∩ Ri
can be shown to hold on Ri(???),
어쨌든 얘가 Ri violates BCNF
Example
아까 했던 예시를 사용해보자
-
R2 = (A, C, D, E), F = {A->B, BC->D}
-
R2의 subset å들을 다 찾으려면 24=16번에 대해 찾아야 한다.
여기서 약간의 overhead가 있지만, F+ 찾고 있는 것 보다는 낫다
-
å = {A, C}라고 하면,
(AC)+ = ABCD이고,
이건 조건 두개를 모두 만족시키지 못한다
따라서 AC -> D가 R2가 BCNF 못되게 하는 nontrivial한 dependency이다
BCNF Decomposition Algorithm
result = {R}
done = false;
while (not done) do
if (there is a schema Ri in result that is NOT in BCNF)
then begin
let å–>ß be a nontrivial functional dependency that holds on Ri
such that å does not contain Ri and å ∩ ß = ∅
result = (result - Ri) ∪ (Ri - ß) ∪ (å, ß);
end
else done = true
Note : each Ri is in BCNF, and decomposition is lossless-join
Example
-
class(course_id, title, dept_name, credits, sec_id, semester, year, building, room_number, capacity, time_slot_id)
-
Functional dependencies :
- course_id -> title, dept_name, credits
- building, room_number -> capacity
- course_id, sec_id, semester, year -> building, room_number, time_slot_id
-
Candidate key = {course_id, sec_id, semester, year}
-
BCNF Decomposition 시작
-
course_id -> title, dept_name, credits 에서,
course_id는 superkey가 아니다
=> 얘가 BCNF 아닌 원인 dependency이다.
result = ∅ ∪ {coure_id, sec_id, semester, year, building, room_number, capacity, time_slot_id} ∪ {course_id, title, dept_name, credits
We replace class by
- course(course_id, title, dept_name, credits)
- class-1(course_id, sec_id, semester, year, building, room_number, capacity, time_slot_id)
- course is in BCNF
- 이걸 어떻게 알까?
모든 subset에 대해 test를 진행해야 할까?
음.. 여기에 있는 dependency에 대해 å+를 구하고, 그게 모두를 포함하고 있으면 넘어가는 것 같다 일단?...
- class-1 is NOT in BCNF
building, room_number -> capacity에서,
{building, room_number} is NOT a superkey for class-1
We replace class-1 by
- classroom(building, room_number, capacity)
- section(course_id, sec_id, semester, year, building, room_number, time_slot_id)
- classroom and section are in BCNF
3NF
- BCNF의 장단점
- 장 : strong한 정규화
- 단 : dependency preserving이 안될 수도 있다. dependency check를 하기 위해서는 반드시 join이 필요하고, 그럼 update할 때마다 join을 해야 해서 overhead가 발생한다
- Solution : 조금 더 약한 normal form을 만든다. 3NF
- Allows some redundancy
- 각각의 relation에 대해 join 없이도 dependency check가 가능하다
- There is always a lossless-join, dependency preserving decomposition into 3NF
- Example
- R = dept_advisor(s_ID ,i_ID, dept_name)
F = {s_ID, dept_name -> i_ID, i_ID -> dept_name}
Candidate key = {s_ID, dept_name} and {i_ID, s_ID}
- R is in 3NF
- s_ID, dept_name -> i_ID
- (s_ID, dept_name)이 superkey이다
- i_ID -> dept_name
- ß - å = dept_name이 candidate key에 포함되어 있다
Testing for 3NF
-
F에 있는 dependency들만 check하면 된다.
Need not to check all FDs in F+
-
å가 superkey라면, 각 dependency å->ß를 check할 때, attribute closure를 이용한다
- å가 superkey가 아니라면, ß에 있는 each attribute가 candidate key에 포함되어 있는지 verify해야 한다.
- 이 testsms cadidate key를 찾아야 하기 때문에 좀 expensive하다
- Testing for 3NF has been shown to be NP-hard
- 신기하게도, decomposition into 3NF는 polynomial time에 실행될 수 있다.
3NF Decomposition Algorithm
Let Fc be a canonical cover for F;
i = 0;
for each functional dependency å -> ß in Fc do
begin
i++;
Ri = åß
end
if none of the schemas Rj, 1 ≤ j ≤ i contains a candidate key for R
then begin
i++;
Ri = any candidate key for R
end
repeat
if any schema Rj is contained in another shcema Rk
then
Rj = Ri
i--;
return (R1, R2, ..., Ri)
- 각각의 Ri는 3NF이다
- Decomposition은 dependency preserving하고, lossless-join하다
- 이에 대한 증명은 넘어가자
Example
- Relation schema :
cust_banker_branch = (customer_id, employee_id, branch_name, type)
- functional dependency :
- customer_id, employee_id -> branch_name, type
- employee_id -> branch_name
- customer_id, branch_name -> employee_id
- compute the canonical cover
- branch_name is extraneous in first depenency
- (right side case)
- F' = {
customer_id, employee_id -> type,
employee_id -> branch_name,
customer_id, branch_name -> employee_id }
- (customer_id, employee_id)+
= customer_id, employee_id, type, branch_name
- done
- No other attribute is extraneous
- Go to for loop
then
R1 = (customer_id, employee_id, type)
R2 = (employee_id, branch_name)
R3 = (customer_id, branch_name, employee_id)
- 모두 candidate key를 포함하고 있기 때문에 추가적인 schema(candidate key)를 만들 필요는 없다
-
R2는 R3에 모두 포함되어 있기 때문에 제거한다
-
The resultant simplified 3NF schema is :
(customer_id, employee_id, type)
(cumtomer_id, branch_name, employee_id)
BCNF vs. 3NF
- It is always possible to decompose a relation
into a set of relation that are in 3NF such that :
- The decomposition is lossless
- The dependencies are preserved
- It is alway possible to decompose a relation
into a set of relation that are in BCNF such that :
- The decomposition is lossless
- It may not be possible to preservee dependencies
Design Goals
- Goal for a relational database design is :
- BCNF
- Lossless join
- Dependency preservation
- 만약 이렇게 하기가 어렵다면, 다음 중 하나까지는 accept한다
- Lack of depenency preservation (BCNF를 유지하거나)
- Redundancy due to use of 3NF (dependency preserving을 유지)