1. Overview of DBMS

anna ny·2023년 3월 9일

Database Internals

목록 보기
4/4
post-thumbnail

DBMS (Database Management System)

is a module system, and consists of the following subsystem:

  • Transportation layer (request)
  • Query Processor
  • Execution engine
  • Storage engine

In-memory DBMS

Stores most of data in volatile RAM
Backups data on Disk for durability (as asynchronous batch to reduce I/O)

  • Pros
    • Easier to manage data over disk (through virtualization)
    • Low cost of accessing data
  • Cons
    • Volatile data
    • Expensive over SSD/HDD

Disk-based DBMS

  • Uses specific data structure for accessing disk
    • Usually a wide-short tree
  • Uses a pointer to access data in memory

Row-format DBMS

  • Data stored on the disk is usually accessed with block-level
  • Stores values of every column of a record in a block
    • Efficient when trying to read a specific record
    • Not efficient when trying to read multiple records

Column-format DBMS

  • Stores data by column-level continuously in memory
  • Appropriate for aggregation analytics

Storage Engine

is a software component in charge of

  • saving data in memomry and disk
  • providing simple API to manipulate data in detail
  • so that user can CRUD records
  • providing useful functions like schema, query language, indexing, and transaction

consists of the following components:

  • Transaction manager
  • Lock manager
  • Access method
  • Buffer manager
  • Recovery manager

Data

is saved as a sequence of bytes (for storage engine)

  • can be defined as int32 or ASCII on the upper level subsystem

Buffering

Intentionally uses memory(buffer) to store the data before writing to disk
e.g) Adding In-memory buffer in B-tree node (reduce I/O)

Mutability

Means being able to write updated value at the same location
In immutable structure, data cannot be updated once it's written in the file

  • Copy-On-Write
  • Difference between LSM and B tree(In-place Update)

Ordering

Stores data in disk ordered by key

  • Efficient for searching specific data record and range scan
  • Not optimized for write operation

Data File vs Index File

Data File

Based on the following structure:

  • Index-Organized Table (IOT)
    • Stores actual data record in an index
    • Data is sorted by key
  • Heap-Organized Table
    • Stores data in an order of insertion
    • No need to reconfigure file once page is added
  • Hash-Organized Table
    • Stores records in the bucket of hash value of each key

Index File

Does mapping keys to the location of the according record in the data file

  • Clustered Index
    • Order of actual data record and one of the index is the same
    • Data record is stored in either index file or clustered file
    • Usually used for priamry key
  • Non-clustered Index
    • Data record is stored in other files and not ordered by the index key
    • Also called as secondary index

0개의 댓글