Scalability involves beginning with only the resources you need and designing your architecture to automatically respond to changing demand by scaling out or in. As a result, you pay for only the resources you use. You don’t have to worry about a lack of computing capacity to meet your customers’ needs.
If you’ve tried to access a website that wouldn’t load and frequently timed out, the website might have received more requests than it was able to handle. This situation is similar to waiting in a long line at a coffee shop where only one barista is present to take orders from customers.
Amazon EC2 Auto Scaling enables you to automatically add or remove Amazon EC2 instances in response to changing application demand. By automatically scaling your instances in and out as needed, you can maintain application availability.
Within Amazon EC2 Auto Scaling, you can use two approaches: dynamic scaling and predictive scaling.
In the cloud, computing power is flexible, allowing you to scale resources programmatically. By using Amazon EC2 Auto Scaling, you can automatically add or remove EC2 instances based on demand. When configuring an Auto Scaling group, you set a minimum capacity (e.g., at least one instance), a desired capacity (e.g., two instances for normal operation), and a maximum capacity (e.g., up to four instances for high demand). Because you pay only for the instances that are actually running, this setup keeps both performance and costs aligned with real demand.
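As a rough sketch (not part of the original notes), the same minimum, desired, and maximum capacities could be expressed with the AWS SDK for Python (boto3). The group name, launch template, and subnet IDs below are placeholder assumptions:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Create an Auto Scaling group that starts at 2 instances, never drops below 1,
# and never grows beyond 4 (names and IDs are illustrative placeholders).
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="coffee-shop-asg",
    LaunchTemplate={
        "LaunchTemplateName": "coffee-shop-template",  # assumed to exist already
        "Version": "$Latest",
    },
    MinSize=1,          # minimum capacity
    DesiredCapacity=2,  # normal operation
    MaxSize=4,          # ceiling for high demand
    VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",  # placeholder subnets
)
```

With dynamic scaling, a scaling policy would then adjust the desired capacity between these bounds as demand changes.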
Elastic Load Balancing (ELB) is a service that automatically distributes incoming application traffic across multiple resources, such as Amazon EC2 instances. The primary purposes and roles of ELB are:
Traffic Distribution: ELB acts as a single point of contact for incoming traffic, distributing it evenly across all available resources (e.g., EC2 instances) to prevent any single instance from becoming overloaded.
High Availability: By distributing traffic across multiple instances, ELB ensures that your application remains available even if one or more instances fail or are removed. This redundancy helps maintain continuous operation.
Scalability: ELB works in conjunction with Auto Scaling to manage changes in traffic load. As Auto Scaling adds or removes instances based on demand, ELB automatically adjusts to distribute traffic to the new set of instances.
Health Monitoring: ELB can monitor the health of registered instances and route traffic only to healthy instances, ensuring that requests are always handled by fully operational resources.
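To make the traffic-distribution and health-monitoring roles concrete, here is a minimal boto3 sketch of an Application Load Balancer with a health-checked target group; the names, VPC, and subnet IDs are assumptions, not values from the notes:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Target group: the load balancer routes traffic only to targets that pass
# the /health check (path and IDs are illustrative placeholders).
tg = elbv2.create_target_group(
    Name="coffee-shop-targets",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",
    HealthCheckProtocol="HTTP",
    HealthCheckPath="/health",
)
target_group_arn = tg["TargetGroups"][0]["TargetGroupArn"]

# Application Load Balancer: the single point of contact for incoming traffic.
lb = elbv2.create_load_balancer(
    Name="coffee-shop-alb",
    Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],
    Type="application",
)

# Listener: forward HTTP requests on port 80 to the target group.
elbv2.create_listener(
    LoadBalancerArn=lb["LoadBalancers"][0]["LoadBalancerArn"],
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": target_group_arn}],
)
```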
Elastic Load Balancing (ELB) and Auto Scaling are distinct but complementary AWS services, each with its own role in managing application traffic and infrastructure:
Elastic Load Balancing (ELB): acts as the single point of contact for incoming traffic and distributes requests across whatever resources are currently registered and healthy.
Auto Scaling: adds or removes Amazon EC2 instances based on demand, changing the pool of resources available to receive that traffic.
Although Elastic Load Balancing and Amazon EC2 Auto Scaling are separate services, they work together to help ensure that applications running in Amazon EC2 can provide high performance and availability.
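One way the two services are wired together (shown here only as an assumed boto3 sketch, continuing the placeholder names above) is by attaching the load balancer's target group to the Auto Scaling group, so instances the group launches or terminates are automatically registered with or removed from the load balancer:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Attach the target group to the Auto Scaling group: newly launched instances
# are registered with the load balancer automatically, and terminated instances
# are taken out of rotation (the ARN is an illustrative placeholder).
autoscaling.attach_load_balancer_target_groups(
    AutoScalingGroupName="coffee-shop-asg",
    TargetGroupARNs=[
        "arn:aws:elasticloadbalancing:us-east-1:123456789012:"
        "targetgroup/coffee-shop-targets/abcdef1234567890"
    ],
)
```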
Low-demand period
Suppose that a few customers have come to the coffee shop and are ready to place their orders.
If only a few registers are open, this matches the demand of customers who need service. The coffee shop is less likely to have open registers with no customers. In this example, you can think of the registers as Amazon EC2 instances.
High-demand period
Throughout the day, as the number of customers increases, the coffee shop opens more registers to accommodate them.
Additionally, a coffee shop employee directs customers to the most appropriate register so that requests are evenly distributed across the open registers. You can think of this coffee shop employee as a load balancer.
Applications are made of multiple components. The components communicate with each other to transmit data, fulfill requests, and keep the application running.
Suppose that you have an application with tightly coupled components. These components might include databases, servers, the user interface, business logic, and so on. This type of architecture can be considered a monolithic application.
In this approach to application architecture, if a single component fails, other components fail, and possibly the entire application fails.
(Image source: RevDeBug)
To help maintain application availability when a single component fails, you can design your application through a microservices approach.
In a microservices approach, application components are loosely coupled. In this case, if a single component fails, the other components continue to work because they communicate through loosely coupled interfaces rather than depending directly on one another. The loose coupling prevents a single failure from cascading into the failure of the entire application.
When designing applications on AWS, you can take a microservices approach with services and components that fulfill different functions. Two services facilitate application integration: Amazon Simple Notification Service (Amazon SNS) and Amazon Simple Queue Service (Amazon SQS).
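As a hedged illustration of this loose coupling (the queue name and message format are assumptions, not part of the notes), an order-taking component could drop messages onto an Amazon SQS queue while a separate fulfillment component reads from it; either side can fail, restart, or scale without taking the other down. Amazon SNS plays a similar role for publish/subscribe notifications.

```python
import boto3

sqs = boto3.client("sqs")

# A queue sits between the order-taking component and the fulfillment
# component, so neither depends on the other being available right now.
queue_url = sqs.create_queue(QueueName="coffee-orders")["QueueUrl"]

# Producer: the order-taking component sends a message and moves on.
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='{"drink": "latte", "size": "medium"}',
)

# Consumer: the fulfillment component polls the queue when it is ready.
response = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=5
)
for message in response.get("Messages", []):
    print("Processing order:", message["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```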