Database System Indexing 1

eojin16·2023년 3월 21일

Indexing 데이터베이스 데이터베이스시스템

데이터베이스(Database)

목록 보기

1/2

Sample Table Information

Q1.

a. Create two indexes on “table1”

Indexed attributes are “sorted” and “unsorted”
PostgreSQL automatically makes clustered and non-clustered index
based on attribute’s data distribution
- Which is clustered index or non-clustered index?
Type “\h CREATE INDEX” for detailed index creation syntax

A1.

Sorted 된 쪽이 clustered. Unsorted가 non-clustered. 정렬되어 있어야만 clustered index 사용 가능하기 때문.

Q2.

PostgreSQL supports following index-based query execution plans

Seq scan: All rows in a table are read sequentially
Index scan: Some (or all) rows in a table are read after traversing an index
Index only scan: Query is processed in an index, not accessing table data

‘EXPLAIN ANALYZE’ statement shows the query plan and execution time of the query

E.g., > EXPLAIN ANALYZE SELECT * FROM table1;

a. Make (and execute) three queries each of which uses seq scan, index scan, and index only scan respectively

b. Make two queries using clustered index and non-clustered index, then compare their execution times

c. Execute and compare the following two queries:

SELECT sorted, rndm FROM table1 WHERE sorted>1999231 AND
rndm=1005;
SELECT sorted, rndm FROM table1 WHERE sorted<1999231 AND
rndm=1005;
Explain why their query plans are different

A2.

a.

index scan

Seq scan

Index only scan

b.

Non-clustered 쪽이 실행 속도가 느림을 알 수 있다.

c.

두 query의 실행 결과는 다음과 같다.

쿼리 플랜이 다른 이유는 첫 쿼리가 index scan할 때 cost가 상당히 줄어들기 때문에. 두번째 쿼리는 별로 차이 없기 때문에 내부적으로 쿼리 플래너가 다른 판단을 했음.

Q3-Lab Setup.

Type on psql command line

SET enable_bitmapscan=false;
\timing

Create a synthetic data set that has 5,000,000 rows

CREATE TABLE pool (val integer);
INSERT INTO pool(val) SELECT * FROM (SELECT
generate_series(1,5000000)) as T;

Q3.

Consider two cases below. Which case will take a longer time?

Inserting tuples in a table, and then creating index
Creating index, and then inserting tuples in a table

Compare the execution time 𝑡1 and 𝑡2

𝑡1 = 𝑡1.insert + 𝑡1.create-index.
- Tuple insertion → Index creation
𝑡2 = 𝑡2.create-index + 𝑡2.insert
- Index creation → Tuple insertion

3 - Hints

Create an empty table named “table10” and “table20”

CREATE TABLE table10 (val integer);
CREATE TABLE table20 (val integer);

Use “table10” to measure 𝑡! and “table20” to measure 𝑡"

Inserting tuples into tables
Utilize the “pool” table for a synthetic dataset
Insert all the tuples of “pool” into “table10” and “table20”
- E.g., > INSERT INTO table10 (SELECT * FROM pool);

A3.

T1

T2

t1<t2

eojin16

Hello.

다음 포스트