HTTP basics

한윤서 · November 24, 2021
Web

1. Basics of Computer network

Computer network and key words

Computer network

An interconnection of multiple devices (= hosts) that are connected by multiple paths for the purpose of sending and receiving data. The links connecting the nodes are known as communication channels.

Open Systems Interconnection (OSI)

A reference model that specifies standards for communication protocols and the functionality of each layer

Protocols

A set of rules or algorithms that define the way 2 entities can communicate across the network. Different protocols are defined at each layer of the OSI model.

Unique identifiers of network

Host name

Each device in the network is associated with a unique device name known as its hostname.

IP Address

The network address of the system across the network. To identify each device on the Internet, a device is given an IPv4 (version 4) address as a unique identifier; IANA coordinates the global allocation of these addresses. An IPv4 address has a length of 32 bits, so 2^32 (about 4.3 billion) addresses are available.

IP has the core function of delivering packets of information from a source to a target device.
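As a small illustration of the 32-bit arithmetic, the sketch below uses Python's standard `ipaddress` module to show that a dotted IPv4 address is just one 32-bit integer. The address `192.0.2.1` is an assumption chosen for the example (it comes from a range reserved for documentation).

```python
import ipaddress

# An IPv4 address is a 32-bit unsigned integer written as four 8-bit fields.
addr = ipaddress.IPv4Address("192.0.2.1")  # example address, not from the original post

as_int = int(addr)           # the same address as a single 32-bit number
total_addresses = 2 ** 32    # size of the whole IPv4 address space

print(as_int)
print(total_addresses)       # 4294967296

# Converting the integer back gives the dotted form again.
round_trip = ipaddress.IPv4Address(as_int)
print(round_trip)
```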

Port

A port is a logical channel through which data can be sent to or received from an application within a device. Any host (device) may have multiple applications running, and each of these applications is identified by the port number on which it is running. A port number is a 16-bit integer, so 2^16 (65,536) port numbers are available.

Socket

Unique combination of : IP address + Port number
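A minimal sketch of this idea, assuming a hypothetical `SocketAddress` type and example addresses that are not part of the original post: two applications on the same host share an IP address but are told apart by their port numbers.

```python
from typing import NamedTuple

MAX_PORT = 2 ** 16 - 1  # a port number is 16 bits wide, so the largest value is 65535


class SocketAddress(NamedTuple):
    """Hypothetical helper type: the (IP address, port) pair that identifies an endpoint."""
    ip: str
    port: int


def make_socket_address(ip: str, port: int) -> SocketAddress:
    # Reject values that do not fit in 16 bits.
    if not 0 <= port <= MAX_PORT:
        raise ValueError(f"port must be between 0 and {MAX_PORT}, got {port}")
    return SocketAddress(ip, port)


# Same host, two different applications (example values).
web = make_socket_address("192.0.2.1", 80)
mail = make_socket_address("192.0.2.1", 25)
print(web, mail, web == mail)
```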

OSI model

The OSI model is a conceptual model which enables diverse communication systems to communicate using standard protocols. It is based on the concept of splitting up a communication system into 7 abstract layers.

From top to bottom :

7. Application layer

The layer that directly interacts with data from the user. Software applications (ex : web browsers, email clients) rely on this layer to initiate communication. However, the application itself is not part of the layer : the layer is responsible for the protocols and data manipulation that the software relies on to present meaningful data to users.
Ex) HTTP, SMTP (Simple Mail Transfer Protocol)

6. Presentation layer

Responsible for preparing data to be used by the application layer. Layer 6 is responsible for translation, encryption and compression of data.

  • Translation : Two communicating devices may be using different encoding methods. Layer 6 translates the incoming data into a syntax that the application layer of the receiving device can understand.
  • Encryption : Layer 6 is responsible for adding encryption on the sender's end and decoding the encryption on the receiver's end.
  • Compression : The layer also compresses data it receives from the application layer before delivering it to layer 5, to improve speed and efficiency.
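The translation and compression steps above can be sketched with Python's standard library. This is only an illustration of the two ideas, not how any particular protocol stack implements layer 6; the sample strings are assumptions, and encryption is left out because it needs more than a few lines.

```python
import zlib

text = "HTTP basics: héllo"  # sample text, chosen for the example

# Translation: the sender uses UTF-8, the receiver expects UTF-16 (little endian).
utf8_bytes = text.encode("utf-8")
utf16_bytes = utf8_bytes.decode("utf-8").encode("utf-16-le")

# Compression: repetitive data shrinks before it is handed to the next layer.
payload = ("HTTP basics " * 50).encode("utf-8")
compressed = zlib.compress(payload)
restored = zlib.decompress(compressed)  # the receiving side reverses the step

print(len(payload), len(compressed))
print(utf16_bytes.decode("utf-16-le") == text)
```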

5. Session layer

Responsible for opening and closing communication between the 2 devices.
Session : The time between when communication is opened and closed
The session layer ensures that the session stays open long enough to transfer all the data being exchanged, and then closes the session in order to avoid wasting resources.

4. Transport layer

Responsible for end-to-end communication between 2 devices.

  • Sending side : Takes data from the session layer and breaks up into chunks called segments before sending to layer 3.
  • Receiving side : Reassembles segments into data the session layer can consume

Also responsible for flow control and error control.

  • Flow control : Determines the optimal speed of transmission so that a sender with a fast connection does not overwhelm a receiver with a slow one
  • Error control : Ensures that the data received is complete and requests retransmission if it isn't
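The sending and receiving sides described above can be sketched as two small functions. The names `segment` and `reassemble`, the segment size, and the sample message are assumptions for illustration; real transport protocols track much more state than a byte offset.

```python
import random


def segment(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Sending side: split data into (offset, chunk) segments of at most `size` bytes."""
    return [(offset, data[offset:offset + size]) for offset in range(0, len(data), size)]


def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Receiving side: put segments back in order by offset and join them."""
    return b"".join(chunk for _, chunk in sorted(segments))


message = b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n"  # sample payload
segments = segment(message, 8)

random.shuffle(segments)  # simulate segments arriving out of order
print(reassemble(segments) == message)
```

The offset stored with each chunk plays the role of a sequence number: it is what lets the receiver restore the original order.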

3. Network layer

Responsible for facilitating data transfer between two different networks (unnecessary if the 2 devices are communicating on the same network).

  • Sending side : Breaks segments into smaller units called packets
  • Receiving side : Reassembles the packets

The network layer also does routing : it finds the best physical path for the data to reach its destination.

2. Data link layer

Responsible for facilitating data transfer between 2 devices that are on the same network. It defines the format of data on the network (packets are packaged into units called frames).

1. Physical layer

Includes the physical equipment involved in data transfer (ex : cables and switches).
Converts data into a bit stream and transfers it over the physical medium.

TCP basics

TCP (Transmission Control Protocol) is a transport-layer protocol that ensures data is delivered, and handles packet ordering and error checking.

Characteristics of TCP

  • Connection-oriented : The 2 hosts must establish a connection before any data is transferred (3-way handshaking).
  • Provides high reliability and in-order delivery
    • Flow control : Controls the amount of data the sending side transmits so that the receiver is not overwhelmed
    • Congestion control : Monitors the state of the network and adjusts the amount of data the sending side transmits
    • Error detection : Checks that the received data is complete and requests retransmission if it isn't
  • Full duplex : Each host can be sender and receiver at the same time
  • Byte stream : Data is treated as a continuous stream of bytes, which TCP splits into units called segments
  • Used by HTTP, FTP, SMTP

3-way handshaking

The method TCP uses to set up a connection between hosts. It is done by exchanging SYN and ACK packets.

  • Initially the server is in the LISTEN state : The server waits for a connection request from a client
  • SYN_SENT : The client generates a sequence number and stores it in a SYN packet. The packet is sent to the server to request a connection
  • SYN_RECEIVED : The server that received the SYN stores (client's sequence number + 1) as the acknowledgment number. It also generates its own sequence number. Both are sent back to the client in a single SYN+ACK packet
  • Client ESTABLISHED : The client that received the SYN+ACK checks that the acknowledgment number is its own sequence number + 1 -> Successful connection -> (server's sequence number + 1) is stored in an ACK packet and sent
  • Server ESTABLISHED : The server receives the client's ACK and checks that the acknowledgment number is its own sequence number + 1 -> The two hosts are connected and can communicate.
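The number bookkeeping in the handshake can be followed with a small simulation. This is a sketch under stated assumptions: the function name, the dictionary layout, and the initial sequence numbers 1000 and 5000 are made up for illustration, and no real packets are sent. In each step, the acknowledgment number is the other side's sequence number plus 1.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list[dict]:
    """Return the three packets exchanged, given each side's initial sequence number."""
    # 1) Client -> server: SYN carrying the client's sequence number.
    syn = {"flags": "SYN", "seq": client_isn}

    # 2) Server -> client: SYN+ACK acknowledging client's number + 1,
    #    and carrying the server's own sequence number.
    syn_ack = {"flags": "SYN+ACK", "seq": server_isn, "ack": (syn["seq"] + 1) % SEQ_SPACE}

    # The client checks that the server acknowledged the expected number.
    if syn_ack["ack"] != (client_isn + 1) % SEQ_SPACE:
        raise ConnectionError("unexpected acknowledgment number from server")

    # 3) Client -> server: ACK acknowledging server's number + 1.
    ack = {"flags": "ACK", "seq": syn_ack["ack"], "ack": (syn_ack["seq"] + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]


packets = three_way_handshake(client_isn=1000, server_isn=5000)
for packet in packets:
    print(packet)
```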

UDP (User Datagram Protocol)

Unlike TCP, it does not guarantee the reliability of data. Its characteristics :

  • Connectionless : No process for setting up or tearing down a connection
  • Not reliable and does not guarantee the order of data
  • Fast compared to TCP
  • Used in DNS, DHCP, video/audio streaming
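To make the connectionless behavior concrete, the sketch below sends one datagram between two UDP sockets on the local machine using Python's standard `socket` module. The loopback address and the message are assumptions for the example. There is no connect or handshake step: the sender simply addresses each datagram to an (IP, port) pair.

```python
import socket

# Receiver: bind to the loopback interface and let the OS choose a free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # UDP gives no delivery guarantee, so never wait forever
address = receiver.getsockname()

# Sender: no connection setup, just send a datagram to the receiver's address.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over UDP", address)

data, source = receiver.recvfrom(1024)
print(data, source)

sender.close()
receiver.close()
```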

2. Overview of HTTP

HTTP is a protocol for fetching resources such as HTML documents, and it is the foundation of data exchange on the Web. It is a client-server protocol, which means requests are initiated by the recipient (usually the web browser). A complete document is reconstructed from the different sub-documents fetched.

Messages sent by the client to the server : requests
Messages sent by the server to the client as answers : responses

Components of HTTP-based systems

HTTP : Client-server protocol. Requests are sent by one entity : user-agent (which is mostly the web browser). Each individual request is sent to server which handles it and provides the response. Between client and server, there are numerous entities, collectively called proxies.

Client : user-agent

User-agent : A tool that acts on behalf of the user. This role is primarily performed by the web browser. The browser is always the entity initiating the request (never the server).

To display a Web page, the browser sends an original request to fetch the HTML document that represents the page. It then parses this file, making additional requests corresponding to execution scripts, layout information (CSS) to display, and sub-resources contained within the page. Web browser then combines these resources to present the complete document, the Web page. Scripts executed by the browser can fetch more resources in later phases and the browser updates the Web page accordingly.

A Web page is a hypertext document -> Some parts of the displayed content are links which can be activated to fetch a new Web page, allowing users to direct their user-agent and navigate through the Web. The browser translates these directions into HTTP requests and then interprets the HTTP responses to present the user with a clear result.

Web server

The server serves the document as requested by the client. A server appears virtually as a single machine, but it can actually be a collection of servers sharing the load, or complex software that interrogates other computers and generates the document, totally or partially, on demand.

Proxies

Between the Web browser and the server, numerous computers and machines relay the HTTP messages. Most of these operate at the transport, network or physical levels. Those operating at the application layer are called proxies.

Transparent proxies : Forward the requests they receive without altering them
Non-transparent proxies : Change the request in some way before passing it along to the server

Functions:

  • Caching : Cache can be public or private
  • Filtering : Antivirus scan or parental controls
  • Load balancing : Allow multiple servers to serve different requests
  • Authentication : Control access to different resources

Aspects of HTTP

HTTP is simple

HTTP messages can be read and understood by humans, providing easier testing for developers, and reduced complexity for newcomers.

HTTP is extensible

HTTP headers make this protocol easy to extend and experiment with. New functionality can even be introduced by a simple agreement between a client and a server about a new header's semantics.

HTTP is stateless, but not sessionless

Stateless : There is no link between 2 requests being successively carried out on the same connection. This causes problems for users attempting to interact with certain pages coherently (ex : shopping baskets on e-commerce sites).

However, HTTP cookies (sent using HTTP headers) can be added to the workflow, allowing a session to be created so that each HTTP request shares the same context or state.
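A minimal sketch of how a cookie ties separate requests to one session, using Python's standard `http.cookies` module. The cookie name `session_id`, its value, and the in-memory `carts` dictionary are assumptions made up for this example; a real server would generate unpredictable session identifiers and store them elsewhere.

```python
from http.cookies import SimpleCookie

# Hypothetical server-side session store: session id -> shopping basket.
carts = {"abc123": ["keyboard"]}

# First response: the server hands the session id to the browser.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "abc123"
set_cookie_header = response_cookie.output(header="Set-Cookie:")
print(set_cookie_header)

# Later request: the browser sends the cookie back in a Cookie header,
# which lets the server find the same basket again.
request_cookie = SimpleCookie()
request_cookie.load("session_id=abc123")
session_id = request_cookie["session_id"].value
cart = carts[session_id]
print(cart)
```

HTTP itself still remembers nothing between the two requests; the state lives in the server's store, and the cookie is only the key used to look it up.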

HTTP and Connections

Connectionless Protocol - HTTP/1.0

Before a client and server can exchange an HTTP request/response pair, they must establish a TCP connection. The default behavior of HTTP/1.0 is to open a separate TCP connection for each HTTP request/response pair. This has the advantage of reducing pressure on the server, but it is inefficient in terms of cost and time because every new connection repeats the TCP handshake.

Keep Alive - HTTP/1.0 , HTTP/1.1

To keep the connection open, a new mechanism was introduced. The client includes the header Connection : keep-alive in its request, and the server includes the same header in its response to the browser. The connection is maintained until one of the hosts stops sending the header.

Persistent connection - HTTP/1.1

HTTP/1.1 then made persistent connections the default : the connection stays open until Connection : close is included in either the request or the response headers.
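The defaults of the two versions can be summarized as a small decision function. This is a simplified sketch: the function name is hypothetical, and it only looks at the protocol version and the Connection header, ignoring everything else a real implementation considers.

```python
def connection_persists(http_version: str, headers: dict[str, str]) -> bool:
    """Decide whether the TCP connection stays open after this message."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    # The Connection header may hold several comma-separated tokens.
    tokens = {token.strip().lower() for token in normalized.get("connection", "").split(",")}

    if http_version == "HTTP/1.1":
        return "close" not in tokens      # persistent unless told otherwise
    if http_version == "HTTP/1.0":
        return "keep-alive" in tokens     # closes unless keep-alive is requested
    return False


print(connection_persists("HTTP/1.0", {}))
print(connection_persists("HTTP/1.0", {"Connection": "keep-alive"}))
print(connection_persists("HTTP/1.1", {}))
print(connection_persists("HTTP/1.1", {"Connection": "close"}))
```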

HTTP flow

1) Open a TCP connection
2) Client sends an HTTP message : It contains the type of request, the host and other information that tells the server what data to respond with

GET / HTTP/1.1
Host: developer.mozilla.org
Accept-Language: fr

3) The response from the server is read : It indicates whether the request was successful or not

HTTP/1.1 200 OK
Date: Sat, 09 Oct 2010 14:28:02 GMT
Server: Apache
Last-Modified: Tue, 01 Dec 2009 20:18:22 GMT
ETag: "51142bc1-7449-479b075b2891b"
Accept-Ranges: bytes
Content-Length: 29769
Content-Type: text/html
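Because HTTP/1.1 messages are plain text, steps 2 and 3 can be sketched without any networking: one function builds the request shown above and another reads the status line and headers of a response. The function names are hypothetical and the parser is deliberately simplified (no body handling, no repeated headers).

```python
def build_request(method: str, path: str, headers: dict[str, str]) -> bytes:
    """Build the text of an HTTP/1.1 request with no body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines end with CRLF, and an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_response_head(raw: bytes) -> tuple[str, int, str, dict[str, str]]:
    """Split a response into version, status code, reason phrase and headers."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


request = build_request("GET", "/", {"Host": "developer.mozilla.org", "Accept-Language": "fr"})
print(request.decode("ascii"))

raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Sat, 09 Oct 2010 14:28:02 GMT\r\n"
    b"Content-Length: 29769\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
)
version, code, reason, headers = parse_response_head(raw_response)
print(version, code, reason, headers["content-type"])
```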

3. HTTP references

HTTP Headers

HTTP headers let the client and server pass additional information with an HTTP request or response. Headers consist of a series of properties and their values in the following structure : property : value

Types of headers

  • Request headers : Contain info about the resources to be fetched
  • Response headers : Hold additional information about the response, like its location or about the server providing it.
  • Representation headers : Contain information about the body of the resource, like its MIME type, or encoding/compression applied.

Key properties in Request header

  • Host : Server's domain name and TCP port number
    Host : en.wikipedia.org:8080
  • Content-Type : Media type of the body when using the POST/PUT method. Indicates the original media type of the resource
    Content-Type : text/html; charset=UTF-8
    Media-type is written in the MIME type form
    MIME type : Has a structure of type/subtype (ex : text/html, image/png)
  • If-Modified-Since : Makes the request conditional -> the server sends back the requested resource, with a 200 status, only if it has been last modified after the given date.
    If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT
  • Origin : Indicates the origin of the request(domain or IP).
    Origin: http://www.example-social-network.com
  • Cookie : Contains stored HTTP cookies associated with the server (i.e. previously sent by the server with the Set-Cookie header)
    Cookie: $Version=1; Skin=new;
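The If-Modified-Since check can be sketched as a date comparison using `email.utils.parsedate_to_datetime` from the standard library, which understands the HTTP date format. The function name is hypothetical. The 304 (Not Modified) status is not covered in this post; it is the status a server returns when the resource has not changed since the given date.

```python
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since: str, last_modified: str) -> int:
    """Return 200 if the resource changed after the given date, otherwise 304."""
    since = parsedate_to_datetime(if_modified_since)   # date sent by the client
    modified = parsedate_to_datetime(last_modified)    # date known to the server
    return 200 if modified > since else 304


# The resource changed after the client's copy was fetched: send it again.
changed = conditional_status(
    "Sat, 29 Oct 1994 19:43:31 GMT", "Tue, 15 Nov 1994 12:45:26 GMT"
)

# The resource is older than the client's copy: nothing new to send.
unchanged = conditional_status(
    "Tue, 15 Nov 1994 12:45:26 GMT", "Sat, 29 Oct 1994 19:43:31 GMT"
)
print(changed, unchanged)
```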

Key properties in Response header

  • Access-Control-Allow-Origin : Indicates whether response can be shared with requesting code from given origin (CORS)
  • Set-Cookie : Sends a cookie from the server to the user agent, so that the user agent can send it back to the server later.
    Set-Cookie: <cookie-name>=<cookie-value>
  • Last-Modified : Contains the date when the origin server believes the resource was last modified. Used as a validator to determine if the resource is the same as the previously stored one
    Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT
  • Location : Indicates the URL to redirect a page to. It only has a meaning when served with a 3xx (redirection) or 201 (created) status response.
    Location: <url>
  • Allow : Lists the set of methods supported by a resource
    Allow: GET, POST, HEAD

HTTP request methods

HTTP defines a set of request methods to indicate the desired action to be performed for a given resource.

  • GET : Requests a representation of specified resource. Only retrieves data
  • HEAD : Identical to GET but without the response body
  • POST : Used to store new data on the server. The new data is put in the request body and sent to the server.
  • PUT : Replaces all current representations of the target resource with the request payload. The body contains the replacement data
  • DELETE : Deletes the specified resource
  • PATCH : Applies partial modifications to a resource stored on the server
  • OPTIONS : Describes the communication options available for the target resource

Note : When using GET, the browser can cache (remember) the resource and later retrieve the data from the cache instead of the server, which improves performance
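As a rough illustration of how the method and the body go together, the sketch below builds (but never sends) a few requests with `urllib.request.Request` from the standard library. The URLs and JSON bodies are assumptions invented for the example.

```python
from urllib.request import Request

item_url = "http://example.org/items/1"  # hypothetical resource URL

# GET: only retrieves data, so there is no body.
get_req = Request(item_url)

# POST: the new data travels in the request body.
post_req = Request(
    "http://example.org/items",
    data=b'{"name": "pen"}',
    headers={"Content-Type": "application/json"},
)

# PUT: the body holds the full replacement for the resource.
put_req = Request(item_url, data=b'{"name": "pencil"}', method="PUT")

# DELETE: identifies the resource to remove.
delete_req = Request(item_url, method="DELETE")

for req in (get_req, post_req, put_req, delete_req):
    print(req.get_method(), req.full_url, req.data)
```

Note that `Request` picks GET when no body is given and POST when one is, unless `method` is set explicitly.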

HTTP status code

HTTP response status codes indicate whether a specific HTTP request has been successfully completed. Responses are grouped in five classes:

  • 1XX (Informational responses) : The request was received and processing continues
  • 2XX (Success) : Request executed successfully
    • 200 (OK) : Request succeeded
    • 201 (Created) : Request succeeded and a new resource was created as a result. Normally the response to POST
    • 202 (Accepted) : Request has been received but not yet acted upon
  • 3XX (Redirection messages) : Client has to take additional actions to finish the request
    • 300 (Multiple Choices) : Request has more than 1 possible response and the client must choose 1
    • 301 (Moved Permanently) : URL of the requested resource has changed permanently and the new URL is given in the response
  • 4XX (Client error responses) : Error on the client side
    • 401 (Unauthorized) : Client must authenticate itself to get the requested response
    • 403 (Forbidden) : Client does not have access rights to the content
    • 404 (Not Found) : Server cannot find the requested resource (in the browser -> URL not recognized)
  • 5XX (server error responses)
    • 500 (Internal Server Error)
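Since the class is given by the first digit, grouping a status code is a one-line integer division. The sketch below pairs that with `http.HTTPStatus` from the standard library, which knows the standard reason phrases; the function name and the class labels are assumptions for the example.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a status code to its class using the first digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")


for code in (200, 201, 301, 404, 500):
    # HTTPStatus(code).phrase is the standard reason phrase for that code.
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```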