• RDS (Relational Database Service) is a managed DB service for databases that use SQL as a query language.
Advantages of using RDS versus deploying a DB on EC2
- RDS is a managed service:
- Automated provisioning, OS patching
- Continuous backups and restore to specific timestamp (Point in Time Restore)!
- Monitoring dashboards
- Read replicas for improved read performance
- Multi AZ setup for DR (Disaster Recovery)
- Maintenance windows for upgrades
- Scaling capability (vertical and horizontal)
- Storage backed by EBS
Amazon Aurora
- “AWS cloud optimized” and claims 5x performance improvement over MySQL on RDS, over 3x the performance of Postgres on RDS
- Aurora costs more than RDS (20% more) but is more efficient
Amazon Aurora Serverless
- Automated database instantiation and auto-scaling based on actual usage
- Use cases: good for infrequent, intermittent or unpredictable workloads...
RDS Deployments: Read Replicas, Multi-AZ
RDS Deployments: Multi-Region
- Disaster recovery
- Local performance for global reads
- Drawback: replication cost
ElastiCache
DynamoDB
- NoSQL Database
- Serverless Database
- Fast and consistent in performance
- Single-digit millisecond latency - low latency retrieval
DynamoDB Accelerator - DAX
- Fully Managed in-memory cache for DynamoDB
- 10x performance improvement
- Secure, highly scalable & highly available
- DAX works only with DynamoDB, whereas ElastiCache can be used for other databases
=> DynamoDB tables cannot be joined with other tables
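A minimal sketch of the idea behind an in-memory cache sitting in front of a table, assuming a read-through design with a time-to-live. `ReadThroughCache`, `fake_table`, and the 300-second TTL are made up for illustration; this is not the DAX API, only the general pattern:

```python
import time


class ReadThroughCache:
    """Toy read-through cache in front of a slower table lookup (hypothetical)."""

    def __init__(self, load_from_table, ttl_seconds=300.0):
        self._load = load_from_table   # function that reads the real table
        self._ttl = ttl_seconds        # how long a cached item stays fresh
        self._items = {}               # key -> (value, expiry time)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._items.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]            # served from memory
        self.misses += 1
        value = self._load(key)        # fall through to the table
        self._items[key] = (value, now + self._ttl)
        return value


# Stand-in for a DynamoDB table: a plain dict keyed by user id.
fake_table = {"user#1": {"name": "Ada"}, "user#2": {"name": "Linus"}}
cache = ReadThroughCache(fake_table.get)

first = cache.get("user#1")    # miss: goes to the table
second = cache.get("user#1")   # hit: answered from memory
```

Repeated reads of the same key skip the table entirely, which is where the latency gain of a cache comes from.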
Global Tables
- Make a DynamoDB table accessible with low latency in multiple regions
- Active-Active replication (read/write to any AWS Region)
=> You can read from and write to the table in any of its AWS Regions, and each write is replicated to the other Regions; that is what makes the replication active-active.
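Active-active replication can be pictured with a toy model. This is a sketch under loud assumptions: replication here is instant and in-process, and conflicts are resolved by "last writer wins" on a caller-supplied timestamp. The `Region` class, `link` helper, and region names are hypothetical and are not how DynamoDB is implemented internally:

```python
class Region:
    """One regional replica of a toy global table (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.items = {}   # key -> (timestamp, value)
        self.peers = []   # other regions to replicate to

    def write(self, key, value, timestamp):
        # Accept the write locally, then push it to every other region.
        self._apply(key, value, timestamp)
        for peer in self.peers:
            peer._apply(key, value, timestamp)

    def _apply(self, key, value, timestamp):
        # Assumed conflict rule: keep the item with the newest timestamp.
        current = self.items.get(key)
        if current is None or timestamp >= current[0]:
            self.items[key] = (timestamp, value)

    def read(self, key):
        entry = self.items.get(key)
        return None if entry is None else entry[1]


def link(*regions):
    # Make every region a replication peer of every other region.
    for region in regions:
        region.peers = [r for r in regions if r is not region]


us = Region("us-east-1")
eu = Region("eu-west-3")
link(us, eu)

us.write("cart#42", ["book"], timestamp=1)          # write in one region
eu.write("cart#42", ["book", "pen"], timestamp=2)   # later write in another
```

Both regions accept writes, and after replication both return the same latest value for `cart#42`.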
Redshift
- Redshift is based on PostgreSQL
- OLAP - Online Analytical Processing(analytics and data warehousing)
- Load data once every hour
- Columnar storage of data (not row based)
- Massively Parallel Query Execution (MPP), highly available
=> fast computation
- BI tools such as Amazon QuickSight or Tableau integrate with it
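Why columnar storage suits analytics can be shown with a small sketch. The sales records and function names below are invented for illustration, and plain Python lists only stand in for on-disk layout:

```python
# The same three sales records, stored two ways.
rows = [
    {"order_id": 1, "country": "FR", "amount": 20.0},
    {"order_id": 2, "country": "US", "amount": 35.5},
    {"order_id": 3, "country": "FR", "amount": 12.5},
]

# Columnar layout: one contiguous list per column.
columns = {
    "order_id": [1, 2, 3],
    "country": ["FR", "US", "FR"],
    "amount": [20.0, 35.5, 12.5],
}


def total_row_based(records):
    # Walks every whole record just to pull out one field from each.
    return sum(record["amount"] for record in records)


def total_columnar(table):
    # Reads only the single column the query needs.
    return sum(table["amount"])


row_total = total_row_based(rows)
col_total = total_columnar(columns)
```

Both layouts give the same total; the columnar one gets there without loading `order_id` or `country`, which matters when a table has many columns and billions of rows.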
Redshift Serverless
- Automatically provisions and scales data warehouse underlying capacity
- Run analytics workloads without managing data warehouse infrastructure
- Pay only for what you use (save costs)
- Use cases: Reporting, dashboarding applications, real-time analytics
Amazon EMR
- Elastic MapReduce
- helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
- takes care of all the provisioning and configuration
- Use cases: data processing, machine learning, web indexing, big data...
Amazon Athena
- Serverless query service to analyze data stored in Amazon S3
- Uses the SQL language
- Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
- Use cases: Business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc.
[Exam Tip: analyze data in S3 using serverless SQL, use Athena]
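A rough sketch of what an Athena query over VPC Flow Logs might look like from Python. The helper only builds the keyword arguments for Athena's StartQueryExecution call; the table name `vpc_flow_logs`, database `logs_db`, and results bucket are hypothetical, and the query assumes a table has already been defined over the log files in S3:

```python
def build_athena_request(sql, database, output_location):
    """Return keyword arguments shaped for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Hypothetical table, database, and bucket names.
request = build_athena_request(
    sql=(
        "SELECT srcaddr, COUNT(*) AS hits "
        "FROM vpc_flow_logs "
        "WHERE action = 'REJECT' "
        "GROUP BY srcaddr ORDER BY hits DESC LIMIT 10"
    ),
    database="logs_db",
    output_location="s3://example-athena-results/",
)

# With boto3 installed and AWS credentials configured, this could be sent as:
#   import boto3
#   boto3.client("athena").start_query_execution(**request)
```

The data stays in S3 and there is no server to manage: you submit SQL and Athena writes the results to the output location.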
Amazon QuickSight
- Serverless machine learning-powered business intelligence service to create interactive dashboards
- Use cases:
• Business analytics
• Building visualizations
• Perform ad-hoc analysis
• Get business insights using data
DocumentDB
- DocumentDB is the AWS equivalent of MongoDB (which is a NoSQL database)
=> to store, query, and index JSON data
- Fully Managed, highly available with replication across 3 AZ
Amazon Neptune
- Fully managed graph database
- A popular graph dataset would be a social network
- Highly available across 3 AZ, with up to 15 read replicas
- Great for knowledge graphs (Wikipedia), fraud detection, recommendation engines, social networking
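To see why a social network is a natural graph dataset, here is a tiny sketch of a "who to follow" recommendation done as a two-hop graph traversal. The users and the `recommend` function are made up, and a plain dictionary stands in for a graph database; it is not Neptune's query interface:

```python
# Tiny social graph as an adjacency map (hypothetical users).
follows = {
    "alice": {"bob", "carol"},
    "bob": {"carol", "dave"},
    "carol": {"dave"},
    "dave": set(),
}


def recommend(graph, user):
    """Suggest accounts followed by the people `user` follows."""
    direct = graph.get(user, set())
    suggestions = set()
    for friend in direct:
        suggestions |= graph.get(friend, set())
    # Drop the user and anyone already followed.
    return sorted(suggestions - direct - {user})


alice_suggestions = recommend(follows, "alice")
```

Questions like this are about relationships between records rather than the records themselves, which is the kind of query a graph database is built for.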
Amazon Timestream
- Fully managed, fast, scalable, serverless time series database
- Automatically scales up/down to adjust capacity
- 1000s of times faster & 1/10th the cost of relational databases
Amazon QLDB
- QLDB stands for “Quantum Ledger Database”
- A ledger is a book recording financial transactions
- Fully managed, serverless, highly available, replication across 3 AZ
- Used to review the history of all the changes made to your application data
- Immutable system: no entry can be removed or modified, cryptographically verifiable
- 2-3x better performance than common ledger blockchain frameworks, manipulate data using SQL
=> Difference with Amazon Managed Blockchain: no decentralization component, in accordance with financial regulation rules
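What "immutable and cryptographically verifiable" can mean in practice is easiest to see with a hash chain. This is a generic illustration and an assumption about the general technique, not QLDB's actual journal format; all function names are hypothetical:

```python
import hashlib
import json


def entry_hash(previous_hash, data):
    # Hash the previous entry's hash together with this entry's data.
    payload = json.dumps({"prev": previous_hash, "data": data}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(ledger, data):
    previous_hash = ledger[-1]["hash"] if ledger else ""
    ledger.append({"data": data, "hash": entry_hash(previous_hash, data)})


def verify(ledger):
    # Recompute every hash in order; any edited entry breaks the chain.
    previous_hash = ""
    for entry in ledger:
        if entry["hash"] != entry_hash(previous_hash, entry["data"]):
            return False
        previous_hash = entry["hash"]
    return True


ledger = []
append(ledger, {"account": "A", "change": 100})
append(ledger, {"account": "A", "change": -30})
intact = verify(ledger)                    # chain is consistent

ledger[0]["data"]["change"] = 1_000_000    # tamper with history
tampered_ok = verify(ledger)               # stored hash no longer matches
```

Because each hash depends on everything before it, changing an old entry is detectable by anyone who re-runs the verification.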
Amazon Managed Blockchain
- multiple parties can execute transactions without the need for a trusted, central authority.
- join public blockchain networks
- create own scalable private network
- Hyperledger Fabric, Ethereum
AWS Glue
- Managed extract, transform, and load (ETL) service
- prepare and transform data for analytics
- serverless
- Glue Data Catalog: a catalog of your datasets in your AWS infrastructure => holds a reference of everything: the column names, the field names, the field types, etc. => Redshift
DMS (Database Migration Service)
- Quickly and securely migrate databases to AWS, resilient, self healing