Take this opportunity as experience and prepare properly for about six months.
Prepare English properly as well!
1. Recruiter screen
2. Technical phone interview (09/19)
2.1 Catch-up chat (09/20): received the interviewer feedback. Rejected. It looks difficult to pass the next round; practice more and reach out again when I feel ready.
3. Onsite interview (5 sessions: 4x DS/Algo, 1x Googleyness and Leadership)
4. Hiring committee review
5. Hiring manager team match
6. Offer review
7. Offer delivery
1. Recruiter screen
2. Online Assessment (08/28)
3. Technical Phone Screen (09/14)
4. Virtual final onsite interview (09/24, 10/03)
5. Hiring committee review
void start(number)
int getAck()
void send(number)
// Function implementation problem.
Hand-coded in something like a notepad (Google interview doc), not an IDE.
Test scenario
start(10)
getAck() // return 10
send(10)
getAck() // return 11
send(12)
send(13)
getAck() // return 11
send(11)
getAck() // return 14
After the first implementation, I was asked to improve it from an efficiency standpoint.
I was not actually improving the time complexity.
data structure => list -> dictionary -> set
Was asked to optimize for the case where send() is called an enormous number of times and getAck() is called very rarely.
Received feedback that I did not propose the optimal solution.
I should also have been faster at proposing a solution.
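For reference, a rough sketch of what I believe the optimal approach looks like: send() only records the number in a set (O(1)), and getAck() lazily advances the ack pointer, so its cost is amortized O(1) even when send() dominates. The class and method names simply mirror the problem statement above.

    class AckTracker:
        def start(self, number):
            self.ack = number        # next sequence number we still expect
            self.pending = set()     # numbers sent while ahead of the ack point

        def send(self, number):
            self.pending.add(number)  # O(1); do nothing else here

        def getAck(self):
            # Advance the ack point past every contiguous pending number.
            # Each number is added and removed at most once, so this is
            # amortized O(1) per call even with very many send() calls.
            while self.ack in self.pending:
                self.pending.remove(self.ack)
                self.ack += 1
            return self.ack

Running the test scenario above against this sketch, the four getAck() calls return 10, 11, 11, 14 as expected.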
Received the interviewer feedback from the recruiter. Even if I moved on to the next stage, it would not be easy. They suggested(?) preparing more and resuming the process when I feel confident. Rejected!
https://leetcode.com/list?selectedList=eqpfz2de (68 problems without premium)
dp[x+k] = dp[x]
System design problems
Non-Functional Requirements
The system should be highly available
Our rate limiter should not introduce substantial latencies
Throttling is the process of controlling the usage of the APIs by customers during a given period. Requests over the limit are rejected with HTTP status "429 - Too Many Requests".
What are different types of throttling?
Hard Throttling
Soft Throttling
Elastic or Dynamic Throttling
What are different types of algorithms used for Rate Limiting?
High level design for Rate Limiter
Basic System Design and Algorithm
Sliding Window algorithm
Sliding Window with Counters
Data Sharding and Caching
Should we rate limit by IP or by user?
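A minimal sketch of the "Sliding Window with Counters" idea above, assuming a per-user limit over a fixed window; the limit, window size, and bucket size are illustrative parameters, not values from the original design.

    import time
    from collections import defaultdict

    class SlidingWindowCounterLimiter:
        def __init__(self, max_requests=3, window_sec=60, bucket_sec=1):
            self.max_requests = max_requests
            self.window_sec = window_sec
            self.bucket_sec = bucket_sec
            self.counters = defaultdict(dict)   # user_id -> {bucket_start: count}

        def allow(self, user_id):
            now = time.time()
            buckets = self.counters[user_id]
            cutoff = now - self.window_sec
            # Drop per-bucket counters that fell out of the sliding window.
            for ts in [t for t in buckets if t <= cutoff]:
                del buckets[ts]
            if sum(buckets.values()) >= self.max_requests:
                return False   # caller responds with "429 - Too Many Requests"
            bucket = int(now // self.bucket_sec) * self.bucket_sec
            buckets[bucket] = buckets.get(bucket, 0) + 1
            return True

Keeping one counter per small bucket instead of one timestamp per request is what keeps memory bounded when a user sends many requests.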
Sharding based on UserID
This approach has a couple of issues:
To recover from these situations, either we have to repartition/redistribute our data or use consistent hashing to balance the load between servers.
Sharding based on VideoID
Our hash function will map each VideoID to a random server where we will store that Video’s metadata.
This approach solves our problem of popular users but shifts it to popular videos.
We can further improve our performance by introducing a cache to store hot videos in front of the database servers.
Video Deduplication
Load Balancing => should use Consistent Hashing among cache servers; it will also help in balancing the load between cache servers.
Cache
Content Delivery Network (CDN)
A CDN is a system of distributed servers that deliver web content to a user based on the user’s geographic locations, the origin of the web page, and a content delivery server.
Fault Tolerance => should use Consistent Hashing
Consistent hashing will not only help in replacing a dead server but also help in distributing load among servers.
Design Basics
Horizontal vs. Vertical Scaling,
Horizontal example: Cassandra and MongoDB
vertical example: mysql
redundancy
Load balancing helps you scale horizontally across an ever-increasing number of servers, but caching will enable you to make vastly better use of the resources you already have as well as making otherwise unattainable product requirements feasible.
Application server cache
Content Delivery (or Distribution) Network (CDN)
Cache Invalidation
if the data is modified in the database, it should be invalidated in the cache; if not, this can cause inconsistent application behavior.
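A tiny sketch of the invalidate-on-write pattern described above; db and cache are hypothetical client objects with update/delete methods, not a specific library API.

    def update_user(db, cache, user_id, fields):
        # Hypothetical db/cache clients: write to the database first,
        # then drop the now-stale cache entry so the next read reloads it.
        db.update("users", user_id, fields)
        cache.delete(f"user:{user_id}")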
Cache eviction policies
First In First Out (FIFO): The cache evicts the first block accessed first without any regard to how often or how many times it was accessed before.
Last In First Out (LIFO): The cache evicts the block added most recently first without any regard to how often or how many times it was accessed before.
Least Recently Used (LRU): Discards the least recently used items first.
Most Recently Used (MRU): Discards, in contrast to LRU, the most recently used items first.
Least Frequently Used (LFU): Counts how often an item is needed. Those that are used least often are discarded first.
Random Replacement (RR): Randomly selects a candidate item and discards it to make space when necessary.
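As a concrete example of one of these policies, here is a minimal LRU cache sketch built on an ordered dict; the capacity value is just illustrative.

    from collections import OrderedDict

    class LRUCache:
        def __init__(self, capacity=3):
            self.capacity = capacity
            self.items = OrderedDict()   # least recently used entries first

        def get(self, key):
            if key not in self.items:
                return None
            self.items.move_to_end(key)  # mark as most recently used
            return self.items[key]

        def put(self, key, value):
            if key in self.items:
                self.items.move_to_end(key)
            self.items[key] = value
            if len(self.items) > self.capacity:
                self.items.popitem(last=False)  # evict the least recently used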
Data partitioning is a technique to break a big database (DB) into many smaller parts.
a. Horizontal Partitioning
we put different rows into different tables
The key problem with this approach is that if the value whose range is used for Partitioning isn’t chosen carefully, then the partitioning scheme will lead to unbalanced servers.
b. Vertical Partitioning
we divide our data to store tables related to a specific feature in their own server.
Vertical Partitioning is straightforward to implement and has a low impact on the application. The main problem with this approach is that if our application experiences additional growth, then it may be necessary to further partition a feature-specific DB across various servers.
c. Directory-Based Partitioning
This loosely coupled approach means we can perform tasks like adding servers to the DB pool or changing our partitioning scheme without having an impact on the application.
Partitioning Criteria
a. Key or Hash-based Partitioning
b. List partitioning
c. Round-robin partitioning
d. Composite Partitioning
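A minimal sketch of key (hash) based partitioning, assuming a fixed number of shards; the shard count is illustrative. Note that changing the shard count remaps almost every key, which is the problem consistent hashing (covered later in these notes) addresses.

    NUM_SHARDS = 4   # illustrative shard count

    def shard_for(key: str) -> int:
        # Key/hash-based partitioning: hash the record key and take it
        # modulo the number of shards. Adding or removing a shard changes
        # the mapping for almost every key, forcing a large redistribution.
        return hash(key) % NUM_SHARDS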
Redundancy is the duplication of critical components or functions of a system with the intention of increasing the reliability of the system
Replication means sharing information to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.
SQL: relational, structured, and predefined schemas
NoSQL: non-relational databases, unstructured, distributed, dynamic schema
Querying: SQL databases use SQL for defining and manipulating data, which is very powerful.
In NoSQL, queries are focused on a collection of documents; this is sometimes called UnQL (Unstructured Query Language).
Scalability: NoSQL databases are horizontally scalable; we can add more servers easily.
Reliability or ACID (Atomicity, Consistency, Isolation, Durability): relational databases are ACID compliant.
Most NoSQL solutions sacrifice ACID compliance for performance and scalability.
SQL VS. NoSQL - which one to use?
Reasons to use SQL database
Reasons to use NoSQL database
The CAP theorem states that it is impossible for a distributed system to provide all three of the following desirable properties:
Consistency (C): All nodes see the same data at the same time.
Availability (A): Refers to the system's ability to remain accessible even if one or more nodes in the system go down.
Partition tolerance (P): The system continues to operate despite a partition, i.e., a communication break (or a network failure) between any two nodes in the system.
According to the CAP theorem, any distributed system needs to pick two out of the three properties.
In the presence of a network partition, a distributed system must choose either Consistency or Availability.
Data partitioning: It is the process of distributing data across a set of servers. It improves the scalability and performance of the system.
Data replication: It is the process of making multiple copies of data and storing them on different servers. It improves the availability and durability of the data across the system.
Data partition and replication strategies lie at the core of any distributed system
Distributed systems can use Consistent Hashing to distribute data across nodes. Consistent Hashing maps data to physical nodes and ensures that only a small set of keys move when servers are added or removed.
Consistent Hashing stores the data managed by a distributed system in a ring.
Virtual nodes
Instead of assigning a single token to a node, the hash range is divided into multiple smaller ranges, and each physical node is assigned several of these smaller ranges. Each of these subranges is considered a Vnode. With Vnodes, instead of a node being responsible for just one token, it is responsible for many tokens (or subranges).
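A rough sketch of a consistent-hashing ring with virtual nodes; the MD5 hash and the vnode count are illustrative choices, not part of any particular system's implementation.

    import bisect
    import hashlib

    class ConsistentHashRing:
        def __init__(self, vnodes=100):
            self.vnodes = vnodes
            self.ring = []   # sorted list of (token, physical node) pairs

        def _hash(self, key):
            return int(hashlib.md5(key.encode()).hexdigest(), 16)

        def add_node(self, node):
            # Each physical node owns several tokens (Vnodes) on the ring.
            for i in range(self.vnodes):
                bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

        def remove_node(self, node):
            # Only the keys in this node's subranges move to other nodes.
            self.ring = [(t, n) for t, n in self.ring if n != node]

        def get_node(self, key):
            if not self.ring:
                return None
            # Walk clockwise: pick the first token at or after the key's hash.
            idx = bisect.bisect_left(self.ring, (self._hash(key), "")) % len(self.ring)
            return self.ring[idx][1]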
Consistent Hashing use cases
Amazon’s Dynamo and Apache Cassandra use Consistent Hashing to distribute and replicate data across nodes.
Heartbeating is one of the mechanisms for detecting failures in a distributed system. If there is a central server, all servers periodically send a heartbeat message to it. If there is no central server, all servers randomly choose a set of servers and send them a heartbeat message every few seconds. This way, if no heartbeat message is received from a server for a while, the system can suspect that the server might have crashed. If there is no heartbeat within a configured timeout period, the system can conclude that the server is not alive anymore and stop sending requests to it and start working on its replacement.
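A minimal sketch of the heartbeat idea above from the receiving side; the timeout value is illustrative, and record() is assumed to be called whenever a heartbeat message arrives.

    import time

    class HeartbeatMonitor:
        def __init__(self, timeout_sec=10):
            self.timeout_sec = timeout_sec
            self.last_seen = {}   # server_id -> timestamp of last heartbeat

        def record(self, server_id):
            # Called whenever a heartbeat message arrives from a server.
            self.last_seen[server_id] = time.time()

        def dead_servers(self):
            # Servers silent for longer than the timeout are treated as down:
            # stop routing requests to them and start working on a replacement.
            now = time.time()
            return [s for s, t in self.last_seen.items()
                    if now - t > self.timeout_sec]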
Background
How can a distributed system ensure data integrity, so that the client receives an error instead of corrupt data?
Solution
Calculate a checksum and store it with data.
When a system is storing some data, it computes a checksum of the data and stores the checksum with the data. When a client retrieves data, it verifies that the data it received from the server matches the checksum stored. If not, then the client can opt to retrieve that data from another replica.
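A small sketch of this checksum pattern; SHA-256 is used here only as an example hash.

    import hashlib

    def store_with_checksum(blob: bytes) -> dict:
        # Compute a checksum when storing the data and keep it alongside.
        return {"data": blob, "checksum": hashlib.sha256(blob).hexdigest()}

    def verify(record: dict) -> bool:
        # If the stored checksum does not match, the client should fetch
        # the data from another replica instead of using corrupt data.
        return hashlib.sha256(record["data"]).hexdigest() == record["checksum"]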