DB 7. NoSQL

skh951225 · April 6, 2023

Database, KOCW

NoSQL

NoSQL stands for "Not Only SQL".
It refers to databases that differ from the traditional RDB.
They fall broadly into four models: Key-Value, Column-Family, Document, and Graph.

Key-Value

A value is looked up by its key, much like a dictionary.
Keys are used to access opaque blobs of data.
A value can contain any type of data.
Pros: scalable, simple API (get, put, delete)
Cons: no content-based search, because the store treats each value as opaque and cannot look inside it
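
As a rough illustration of the get/put/delete API described above, here is a minimal in-memory sketch. The class name `KeyValueStore` and the sample keys are made up for this example and do not correspond to any particular product.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store exposing only get, put, and delete."""

    def __init__(self) -> None:
        # Values are kept as raw bytes: the store never inspects them.
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed.
        return self._data.pop(key, None) is not None


store = KeyValueStore()
store.put("user:1", b'{"name": "kim", "city": "Seoul"}')
store.put("user:2", b'{"name": "lee", "city": "Busan"}')
profile = store.get("user:1")
```

Note that nothing in this API can answer "which users live in Seoul?". The client would have to fetch each value and decode it itself, which is what the lack of content-based search means in practice.
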
A representative example is Redis.
- Open source in-memory key-value store with optional durability
- Focuses on high-speed reads and writes of common data structures in RAM
- Allows simple lists, sets, and hashes to be stored as values and manipulated
- Many features that developers like
  - expiration, transactions, partitioning
  - pub/sub: a message published to a channel is delivered to everyone subscribed to it
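
The expiration and pub/sub ideas can be sketched in a few lines of plain Python. This is only a toy model of the concepts, not how Redis implements them; `ExpiringStore`, `PubSub`, and the injected `clock` function are assumptions made for this example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Optional, Tuple


class ExpiringStore:
    """Key-value store whose entries can carry a time-to-live (TTL)."""

    def __init__(self, clock: Callable[[], float]) -> None:
        self._clock = clock  # injected so the example is deterministic
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}

    def put(self, key: str, value: str, ttl: Optional[float] = None) -> None:
        deadline = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, deadline)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if deadline is not None and self._clock() >= deadline:
            del self._data[key]  # lazily drop the expired entry on read
            return None
        return value


class PubSub:
    """Delivers each published message to every subscriber of the channel."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        callbacks = self._subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)  # number of subscribers that received it


now = [0.0]  # fake clock, in seconds
cache = ExpiringStore(clock=lambda: now[0])
cache.put("session:42", "alice", ttl=30)
before = cache.get("session:42")  # still present
now[0] = 31.0
after = cache.get("session:42")   # expired, so None

bus = PubSub()
inbox: List[str] = []
bus.subscribe("news", inbox.append)
delivered = bus.publish("news", "hello")
```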

Column-Family Stores

A key consists of row, column family, column name, and timestamp.
A row contains multiple column families, and each column family contains multiple columns.
The timestamp is needed to store multiple versions of a value (values change over time).
Versioned blobs are stored in one large table.
Queries can be done on rows, column families, and column names.
Data is kept in a table structure, as in an RDBMS.
Not optimized for joins.
A single row can have millions of columns, in which case the table is most likely a sparse matrix.
Adding columns is relatively easy.
Related columns can also be grouped one level further into a super column.
For performance, it is best to group columns with similar characteristics into the same column family.
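
To make the four-part key concrete, here is a small sketch of a versioned, sparse table keyed by (row, column family, column name) with timestamped values. The class `ColumnFamilyTable` and the sample rows are hypothetical and ignore everything a real store does about disk layout and distribution.

```python
from collections import defaultdict
from typing import DefaultDict, List, Optional, Tuple

CellKey = Tuple[str, str, str]  # (row, column family, column name)


class ColumnFamilyTable:
    """Sparse table: only cells that were written take up space."""

    def __init__(self) -> None:
        # Each cell keeps a list of (timestamp, value) versions, newest first.
        self._cells: DefaultDict[CellKey, List[Tuple[int, bytes]]] = defaultdict(list)

    def put(self, row: str, family: str, column: str, timestamp: int, value: bytes) -> None:
        versions = self._cells[(row, family, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0], reverse=True)

    def get(self, row: str, family: str, column: str,
            timestamp: Optional[int] = None) -> Optional[bytes]:
        # Returns the newest version, or the newest one at or before `timestamp`.
        for ts, value in self._cells.get((row, family, column), []):
            if timestamp is None or ts <= timestamp:
                return value
        return None

    def columns(self, row: str, family: str) -> List[str]:
        # Column names are per row, so different rows may have different columns.
        return sorted(c for (r, f, c) in self._cells if r == row and f == family)


table = ColumnFamilyTable()
table.put("user1", "profile", "email", 100, b"old@example.com")
table.put("user1", "profile", "email", 200, b"new@example.com")
table.put("user1", "profile", "nickname", 150, b"kim")
table.put("user2", "activity", "last_login", 300, b"2023-04-06")

latest = table.get("user1", "profile", "email")
as_of_150 = table.get("user1", "profile", "email", timestamp=150)
missing = table.get("user2", "profile", "email")
```

Here `user2` has no `profile` columns at all, which costs nothing in this layout. Adding a new column is just another `put`.
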
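Pros: good scale-out, versioning
Cons:
1. Cannot query blob content
- Values are stored as opaque blobs, so queries cannot search inside them.
2. Row and column designs are critical
- With a poor design, data is distributed unevenly, which can overload some nodes or degrade query performance.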
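
The second drawback can be illustrated with a small simulation. It assumes a hypothetical cluster of four nodes that range-partitions rows by the first hexadecimal character of the row key; the functions `node_for` and `salted` are invented for this sketch and are not taken from any real system.

```python
import hashlib
from collections import Counter

NUM_NODES = 4


def node_for(row_key: str) -> int:
    """Assumed range partitioning: the first hex character picks the node."""
    return int(row_key[0], 16) * NUM_NODES // 16


def salted(row_key: str) -> str:
    """Prefix the key with one hash-derived hex character to spread writes."""
    prefix = hashlib.sha256(row_key.encode("utf-8")).hexdigest()[0]
    return f"{prefix}-{row_key}"


# Time-ordered row keys all share the same leading characters.
keys = [f"20230406-{i:06d}" for i in range(1000)]

plain_load = Counter(node_for(key) for key in keys)
salted_load = Counter(node_for(salted(key)) for key in keys)
```

With the plain keys, every row lands on node 0, so that node takes all of the load. With the salted keys the rows spread over all four nodes, at the price that rows which are adjacent in time are no longer adjacent in key order, which makes range scans harder.
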
A representative example is Bigtable.
- Tightly coupled with MapReduce
- Technically a "sparse matrix" where most cells have no data
- Generating a list of all columns is non-trivial (it is not column-wise storage)
- e.g., Google Bigtable, Hadoop HBase
HBase
- Written in Java, with MapReduce support
- Column-oriented data store
- Designed with a particular focus on compatibility with Hadoop
- High-level query language (Pig)
- Strong support from many vendors
Cassandra
- Apache open source column-family DB supported by DataStax
- Peer-to-peer distribution model (all nodes have equal status), as opposed to a centralized model
- Strong reputation for linear scale-out (millions of writes/second)
- Linear scale-out: performance grows in proportion to the number of nodes
- Database-side security
- Written in Java; works well with HDFS and MapReduce

Graph Store

Data is stored in a series of nodes, relationships and properties
Queries are really graph traversals
Ideal when relationships between data are key (e.g., social networks)
Pros: fast network search, works with public linked data sets
Cons: poor scalability when graphs don't fit into RAM, specialized query languages (RDF uses SPARQL)
- RDF (Resource Description Framework): represents data as Subject, Predicate (property), and Object
- SPARQL: the standard query language for RDF
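
The subject/predicate/object representation and the idea that queries are traversals can be sketched as follows. The sample triples and the helper names `match` and `reachable` are invented for this example; the pattern matching is only similar in spirit to a SPARQL triple pattern and is not SPARQL itself.

```python
from collections import deque
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

triples: List[Triple] = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("carol", "knows", "dave"),
    ("alice", "livesIn", "Seoul"),
]


def match(s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def reachable(start: str, predicate: str, max_hops: int) -> Set[str]:
    """Breadth-first traversal along one predicate, up to max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for _, _, neighbor in match(s=node, p=predicate):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    seen.discard(start)
    return seen


residences = match(p="livesIn")
within_two_hops = reachable("alice", "knows", max_hops=2)
```

`within_two_hops` contains bob (a friend) and carol (a friend of a friend), but not dave, who is three hops away.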

Document Store

Data is stored in nested hierarchies (e.g., JSON, XML)
Logical data remains stored together as a unit (no shredding); it could be split across tables in an RDB, but that is cumbersome
Any item in the document can be queried
Pros: no object-relational mapping layer, ideal for search
Cons: complex to implement, incompatible with SQL
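
As a sketch of "any item in the document can be queried", the example below keeps each record as one nested unit and looks up values by a dotted path. The documents, field names, and the helpers `lookup` and `find` are assumptions for illustration and do not reflect any specific database's query API.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]

# Each record stays together as one nested document instead of being
# split across several tables.
documents: List[Document] = [
    {"_id": 1, "title": "NoSQL notes",
     "author": {"name": "kim", "city": "Seoul"}, "tags": ["db", "nosql"]},
    {"_id": 2, "title": "Hashing",
     "author": {"name": "lee", "city": "Busan"}, "tags": ["db"]},
]

_MISSING = object()  # sentinel distinguishing "no such field" from None


def lookup(doc: Document, path: str) -> Any:
    """Follow a dotted path such as 'author.city' into a nested document."""
    current: Any = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def find(path: str, expected: Any) -> List[Document]:
    """Return documents whose field equals `expected` or, for lists, contains it."""
    results = []
    for doc in documents:
        value = lookup(doc, path)
        if value is _MISSING:
            continue
        if value == expected or (isinstance(value, list) and expected in value):
            results.append(doc)
    return results


in_seoul = find("author.city", "Seoul")
tagged_db = find("tags", "db")
```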
MongoDB
- Open source JSON data store created by 10gen
- Master-slave scale-out model
- Strong developer community
- Sharding built in, automatic (sharding: a technique that splits data into multiple pieces for storage)
- Implemented in C++ with many APIs (C++, JavaScript, Java, Perl, Python, etc.)
