A distributed system is a computing model where multiple computers are connected through a network and work together as a single system. In a distributed system, managing data is crucial, so the choice of database plays a key role.
A database is essential for storing, retrieving, modifying, and managing the system's data. Specifically, distributed systems must consider data consistency, availability, and partition tolerance, the three properties addressed by the CAP theorem.
Moreover, databases should ensure data consistency and reliability through the `ACID properties`. The CAP theorem and ACID properties will be discussed in more detail in the following sections.
ACID refers to a set of core principles that ensure data consistency and reliability. It consists of four key components: Atomicity, Consistency, Isolation, and Durability.
ACID specifically outlines how a database transaction, which is the smallest logical unit in database operations, should be conducted.
To better illustrate the idea of ACID, consider an example from the finance sector, where strict adherence to these principles is essential.
Atomicity ensures that the operations within a single transaction are never partially reflected in the database: the transaction either succeeds completely or fails completely.
In other words, if any one of the operations in a transaction fails, the whole transaction fails.
For instance, if operation #2 of the financial operations below fails, operation #1 is not reflected either:
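As a minimal sketch of this behaviour, assume operation #1 debits one account and operation #2 credits another. The account names, amounts, and the `transfer` helper are illustrative assumptions rather than part of the original example, and SQLite stands in for whatever database is actually used.

```python
import sqlite3


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one atomic transaction (hypothetical helper)."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            # Operation #1: debit the source account.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated failure before operation #2")
            # Operation #2: credit the destination account.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 7000), ("B", 3000)])
conn.commit()

failed = transfer(conn, "A", "B", 1000, fail_midway=True)
after_failure = balances(conn)   # the debit from operation #1 was rolled back
succeeded = transfer(conn, "A", "B", 1000)
after_success = balances(conn)   # both operations were applied together
```

After the simulated failure the balances are unchanged, because the debit was rolled back along with the rest of the transaction.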
Consistency ensures that when a transaction is successfully completed, the database remains in a consistent state.
Specifically, if a transaction violates rules defined in the database, such as constraints, the transaction must be canceled to maintain consistency.
For instance, a financial transaction that exceeds the remaining balance of an account would be rejected.
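The balance rule above can be expressed as a database constraint. The following is a sketch under assumptions: the table layout, the `withdraw` helper, and the amounts are hypothetical, and SQLite's `CHECK` constraint is only one of several ways to define such a rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the rule the database enforces: no negative balances.
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('A', 500)")
conn.commit()


def withdraw(conn, name, amount):
    """Return True if the withdrawal committed, False if it was canceled."""
    try:
        with conn:  # rolls back automatically when the constraint is violated
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


overdraft_ok = withdraw(conn, "A", 800)  # exceeds the remaining balance
valid_ok = withdraw(conn, "A", 200)      # stays within the balance
final_balance = conn.execute(
    "SELECT balance FROM accounts WHERE name = 'A'"
).fetchall()[0][0]
```

The overdraft is canceled by the database itself, so the stored balance only ever reflects transactions that respect the rule.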
Isolation ensures that multiple transactions executed concurrently remain independent of each other. In the strictest setting, the outcome is the same as if the transactions had run sequentially, at the cost of performance.
No operation outside a transaction can view or interfere with intermediate data during the transaction's execution.
For instance, during a money transfer out of Account A, if the combined balance of the accounts involved is $10,000, there may be moments within the transaction when the total does not equal $10,000. However, other transactions must always see the total balance as $10,000.
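The transfer scenario above can be observed with two database connections. This is a sketch under assumptions: the second account `B`, the 7,000/3,000 split, and the temporary file are hypothetical, and the behaviour shown is that of SQLite in its default journal mode, not of every database or isolation level.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bank.db")

# isolation_level=None disables implicit transactions, so BEGIN/COMMIT are explicit.
writer = sqlite3.connect(path, isolation_level=None)
reader = sqlite3.connect(path, isolation_level=None)

writer.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
writer.execute("INSERT INTO accounts VALUES ('A', 7000), ('B', 3000)")


def total(conn):
    # fetchall() exhausts the cursor so no read lock lingers on the file.
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchall()[0][0]


writer.execute("BEGIN")
writer.execute("UPDATE accounts SET balance = balance - 1000 WHERE name = 'A'")

total_inside_writer = total(writer)   # intermediate state, visible only to the writer
total_seen_by_reader = total(reader)  # other connections still see committed data

writer.execute("UPDATE accounts SET balance = balance + 1000 WHERE name = 'B'")
writer.execute("COMMIT")

total_after_commit = total(reader)

writer.close()
reader.close()
```

Inside the transfer the total briefly drops to 9,000, but the second connection reads 10,000 both during and after the transaction.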
Durability ensures that once a transaction is successfully completed, its effects are permanently recorded.
Even if a system failure occurs, the results of a successful transaction must always be reflected in the database. Typically, transactions are logged, and a transaction is considered successful only once its log is securely stored. If a failure occurs later, the database can be recovered using these logs.
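The log-then-acknowledge idea can be sketched with a toy key-value store. `TinyStore`, its JSON-lines log format, and the file name are hypothetical and chosen only for illustration; real databases use far more elaborate logging and also handle partially written records, which this sketch does not.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical store that logs each transaction before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._recover()

    def _recover(self):
        # Rebuild the in-memory state by replaying every logged transaction.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                self.data.update(json.loads(line))

    def commit(self, changes):
        # 1. Append the transaction to the log and force it onto disk.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(changes) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Only now is the transaction treated as successful and applied.
        self.data.update(changes)


log_path = os.path.join(tempfile.mkdtemp(), "transactions.log")

store = TinyStore(log_path)
store.commit({"A": 6000, "B": 4000})
del store  # simulate a crash: the in-memory state is lost

recovered = TinyStore(log_path)  # recovery replays the log
```

Because the log is flushed before the commit returns, a new instance can rebuild the committed state from the file alone.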
CAP refers to a theorem stating that, of Consistency, Availability, and Partition Tolerance, a distributed system can guarantee at most two at the same time.
Geeks for Geeks Available at here
Consistency ensures that all nodes (databases) within a network have identical and most up-to-date copies of a replicated data item accessible to various transactions.
Availability means that every read or write request for a data item receives a successful response, although the response may not contain the latest data.
Partition Tolerance means that the system remains operational even under network failures that cut off communication between nodes.
e.g. centralised database: gives up partition tolerance, so it does not suit databases running in different networks or systems that must scale over high traffic.
e.g. inventory databases: for inventories that require high reliability, consistency can be chosen over availability.
e.g. social media databases: likes may not have to be as consistent as inventories; hence, availability can be chosen over consistency.
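The trade-off between the inventory and social media cases can be sketched with a toy two-replica model. The `Cluster` class, its `"CP"`/`"AP"` modes, and the numbers are hypothetical simplifications; real systems rely on quorums, retries, and reconciliation that are not modelled here.

```python
class Replica:
    def __init__(self):
        self.value = 0


class Cluster:
    """Two hypothetical replicas of one data item, with a switchable partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False

    def write(self, replica, value):
        """Return True if the write was accepted."""
        if not self.partitioned:
            # Healthy network: the write reaches both replicas.
            self.a.value = self.b.value = value
            return True
        if self.mode == "CP":
            # Prefer consistency: refuse writes that cannot be replicated.
            return False
        # Prefer availability: accept locally and let the replicas diverge.
        replica.value = value
        return True


# Inventory-style store: stock counts must agree everywhere.
inventory = Cluster("CP")
inventory.write(inventory.a, 10)
inventory.partitioned = True
inventory_accepted = inventory.write(inventory.a, 9)

# Social-media-style store: a slightly stale like count is acceptable.
likes = Cluster("AP")
likes.write(likes.a, 10)
likes.partitioned = True
likes_accepted = likes.write(likes.a, 11)
```

During the partition the inventory store rejects the write and both replicas keep agreeing, while the likes store accepts it and one replica temporarily serves an older count.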