1. Overview of DBMS

anna ny·2023년 3월 9일

Database Internals

Database Internals

목록 보기

4/4

DBMS (Database Management System)

is a module system, and consists of the following subsystem:

Transportation layer (request)
Query Processor
Execution engine
Storage engine

In-memory DBMS

Stores most of data in volatile RAM
Backups data on Disk for durability (as asynchronous batch to reduce I/O)

Pros
- Easier to manage data over disk (through virtualization)
- Low cost of accessing data
Cons
- Volatile data
- Expensive over SSD/HDD

Disk-based DBMS

Uses specific data structure for accessing disk
- Usually a wide-short tree
Uses a pointer to access data in memory

Row-format DBMS

Data stored on the disk is usually accessed with block-level
Stores values of every column of a record in a block
- Efficient when trying to read a specific record
- Not efficient when trying to read multiple records

Column-format DBMS

Stores data by column-level continuously in memory
Appropriate for aggregation analytics

Storage Engine

is a software component in charge of

saving data in memomry and disk
providing simple API to manipulate data in detail
so that user can CRUD records
providing useful functions like schema, query language, indexing, and transaction

consists of the following components:

Transaction manager
Lock manager
Access method
Buffer manager
Recovery manager

Data

is saved as a sequence of bytes (for storage engine)

can be defined as int32 or ASCII on the upper level subsystem

Buffering

Intentionally uses memory(buffer) to store the data before writing to disk
e.g) Adding In-memory buffer in B-tree node (reduce I/O)

Mutability

Means being able to write updated value at the same location
In immutable structure, data cannot be updated once it's written in the file

Copy-On-Write
Difference between LSM and B tree(In-place Update)

Ordering

Stores data in disk ordered by key

Efficient for searching specific data record and range scan
Not optimized for write operation

Data File vs Index File

Data File

Based on the following structure:

Index-Organized Table (IOT)
- Stores actual data record in an index
- Data is sorted by key
Heap-Organized Table
- Stores data in an order of insertion
- No need to reconfigure file once page is added
Hash-Organized Table
- Stores records in the bucket of hash value of each key

Index File

Does mapping keys to the location of the according record in the data file

Clustered Index
- Order of actual data record and one of the index is the same
- Data record is stored in either index file or clustered file
- Usually used for priamry key
Non-clustered Index
- Data record is stored in other files and not ordered by the index key
- Also called as secondary index

이전 포스트

2. Overview of B-tree

0개의 댓글