Krafton Jungle Week 6 - Web Server Retrospective

김태성 · February 29, 2024

Krafton Jungle 4th Cohort Dev Log


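This week was quite hard.
Not physically hard. Network programming was new to me, so was applying it to build the tiny server, and I did not even know the functions that go into writing a server, so I spent the first day or two just staring at the code wondering what any of it meant.

Since I was making no progress there, I ended up putting a lot of my focus into solving algorithm problems.
The lowest-rated problems in my solved list were already at Silver 2-3, so even a Platinum problem only gave me about 7 points... and yet I raised my score by nearly 100 points and got promoted from Gold 1 to Platinum.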

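Even so, I think I did get something out of this week.

The basics of the OSI 7-layer model and the TCP/IP layer structure
How data leaves a host, is wrapped into messages and packets on the way down the stack, is transmitted, and is unwrapped again on the receiving side
I did not even know how to use Docker, yet I used port forwarding to connect my phone to the local server on my computer over Ethernet
The book I ordered arrived, so I read a bit about node.js and mongodb, and so on

A lot happened, I was thrown by code I had never seen before, and the week went by in a blur.

Still, I have now read through the code at least once, so this retrospective is a good time to review which part does what.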

This retrospective is divided into three parts.

Echo server
tiny server
proxy server

I will only do a code review for echo and tiny. For proxy I will just outline the rough flow of the code and move on.

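Echo Server

An echo server is a server that returns whatever data the client sends, unchanged. As the name Echo suggests, think of it as your own words coming back to you as an echo.

So an echo server has to do the following.

The server opens
The client connects
The client sends data to the server
The server sends that data back to the client.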

Echo server

#include "csapp.h"

void echo(int connfd);

int main(int argc , char **argv)
{
    int listenfd , connfd;
    socklen_t clientlen;
    struct sockaddr_storage clientaddr;
    char client_hostname[MAXLINE], client_port[MAXLINE];

    if (argc !=2){
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(0);
    }

    listenfd = Open_listenfd(argv[1]);
    while(1) {
        clientlen = sizeof(struct sockaddr_storage);
        connfd = Accept(listenfd , (SA *) &clientaddr , &clientlen);
        Getnameinfo((SA *) &clientaddr, clientlen , client_hostname, MAXLINE,
                    client_port , MAXLINE, 0);
        printf("connected to (%s, %s)\n", client_hostname, client_port);
        echo(connfd);
        Close(connfd);
    }
    exit(0);
}

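The echo server implements only a very simple feature, so apart from the echo helper declared at the top it needs nothing beyond a single int main.

Going through the code from the top: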

    int listenfd , connfd;
    socklen_t clientlen;
    struct sockaddr_storage clientaddr;
    char client_hostname[MAXLINE], client_port[MAXLINE];

    if (argc !=2){
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(0);
    }

After declaring the variables it needs,
if argc != 2 (that is, the command line is not exactly two tokens, [program name, port number]), it prints a usage message to stderr asking for the port number and exits.

    listenfd = Open_listenfd(argv[1]);
    while(1) {
        clientlen = sizeof(struct sockaddr_storage);
        connfd = Accept(listenfd , (SA *) &clientaddr , &clientlen);
        Getnameinfo((SA *) &clientaddr, clientlen , client_hostname, MAXLINE,
                    client_port , MAXLINE, 0);
        printf("connected to (%s, %s)\n", client_hostname, client_port);
        echo(connfd);
        Close(connfd);
    }

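If the program name and port number were given correctly,
Open_listenfd creates a listening socket and returns its file descriptor as listenfd.

The server then loops in the while statement. Accept blocks until a client connects, and once one does, the server prints the client's host and port and calls echo.
When echo returns, it closes connfd and goes back to waiting for the next client.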

Echo

This code simply sends the data that arrived at the server back to the client. It does not do much else.

#include "csapp.h"

void echo(int connfd) 
{
    size_t n; 
    char buf[MAXLINE]; 
    rio_t rio;

    Rio_readinitb(&rio, connfd);
    while((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0) { //line:netp:echo:eof
	printf("server received %d bytes\n", (int)n);
	Rio_writen(connfd, buf, n);
    }
}

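It is simple code.
First it declares its variables and attaches the rio read buffer to connfd (Rio_readinitb).
Then, for each line read with Rio_readlineb, it uses Rio_writen to send the same n bytes back over the descriptor connfd, until the client closes the connection and the read returns 0.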
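
To see the same read-then-write-back loop without csapp.h, here is a minimal sketch using plain read and write. The name echo_fd is my own for illustration, it is not part of csapp, and unlike Rio it does not retry on EINTR or buffer by lines.

```c
#include <assert.h>
#include <sys/socket.h>  /* only needed for the socketpair usage described below */
#include <unistd.h>

/* Hypothetical helper (not from csapp): echoes everything read from fd
 * back to the same fd until the peer closes its side.
 * Returns the number of bytes echoed, or -1 on a read/write error. */
long echo_fd(int fd)
{
    char buf[1024];
    ssize_t n;
    long total = 0;

    assert(fd >= 0);
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                 /* write may be short, so loop */
            ssize_t w = write(fd, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;            /* n == 0 means EOF */
}
```

One way to try it without a network is socketpair(AF_UNIX, SOCK_STREAM, 0, sv): write a line to sv[0], call shutdown(sv[0], SHUT_WR) so the loop sees EOF, run echo_fd(sv[1]), and read the echoed bytes back from sv[0].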

Echo client

#include "csapp.h"

int main(int argc , char **argv)
{
    int clientfd;
    char *host , *port , buf[MAXLINE];
    rio_t rio;

    if (argc != 3) {
        fprintf(stderr, "usage : %s <host> <port>\n", argv[0]);
        exit(0);
    }
    host = argv[1];
    port = argv[2];

    clientfd = Open_clientfd(host,port);
    Rio_readinitb(&rio , clientfd);

    while (Fgets(buf , MAXLINE , stdin) != NULL){
        Rio_writen(clientfd , buf , strlen(buf));
        Rio_readlineb(&rio , buf , MAXLINE);
        Fputs(buf , stdout);
    }
    Close(clientfd);
    exit(0);
}

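This is the client code.
Like the server, it starts by declaring variables and checking its arguments (program name, host, port number), then sets host and port from argv[].

As in the server, it creates clientfd and then attaches rio to clientfd.

It then loops in the while statement: it reads the user's input with Fgets, sends it to the server with Rio_writen,
reads the echoed line back with Rio_readlineb, and prints it to stdout with Fputs.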

Tiny

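The tiny server is a bare-bones server: it strips out nearly every feature a real web server has and keeps only the minimum needed to run.
As a result the whole source is only 264 lines (including comments) and its structure is relatively easy to follow, which makes it good material for getting started with servers. Even so, it is a server that actually runs, so reading it is not easy.

The overall flow is as follows.

When main runs, it opens a listening socket as listenfd just like the Echo server, accepts a connection, and hands it to the doit function.
doit reads the request line and the headers (the headers are read but ignored), then goes through parse_uri and on to either serve_static or serve_dynamic.
If an error occurs along the way, it branches to clienterror to handle it.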

main

int main(int argc, char **argv) 
{
    int listenfd, connfd;
    char hostname[MAXLINE], port[MAXLINE];
    socklen_t clientlen;
    struct sockaddr_storage clientaddr;

    /* Check command line args */
    if (argc != 2) {
	fprintf(stderr, "usage: %s <port>\n", argv[0]);
	exit(1);
    }

    if (Signal(SIGCHLD, sigchild_handler) == SIG_ERR)
    unix_error("signal child handler error");


    listenfd = Open_listenfd(argv[1]);
    while (1) {
	clientlen = sizeof(clientaddr);
	connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen); //line:netp:tiny:accept
        Getnameinfo((SA *) &clientaddr, clientlen, hostname, MAXLINE, 
                    port, MAXLINE, 0);
        printf("Accepted connection from (%s, %s)\n", hostname, port);
	doit(connfd);                                             //line:netp:tiny:doit
	Close(connfd);                                            //line:netp:tiny:close
    }
}

This is the main function. Like the previous servers, it checks its arguments and opens listenfd.
It then accepts each connection and passes it to doit.

doit

void doit(int fd) 
{
    int is_static;
    struct stat sbuf;
    char buf[MAXLINE], method[MAXLINE], uri[MAXLINE], version[MAXLINE];
    char filename[MAXLINE], cgiargs[MAXLINE];
    rio_t rio;

    /* Read request line and headers */
    Rio_readinitb(&rio, fd);
    if (!Rio_readlineb(&rio, buf, MAXLINE))  //line:netp:doit:readrequest
        return;
    printf("%s", buf);
    sscanf(buf, "%s %s %s", method, uri, version);       //line:netp:doit:parserequest
    if (strcasecmp(method, "GET")) {                     //line:netp:doit:beginrequesterr
        clienterror(fd, method, "501", "Not Implemented",
                    "Tiny does not implement this method");
        return;
    }                                                    //line:netp:doit:endrequesterr
    read_requesthdrs(&rio);                              //line:netp:doit:readrequesthdrs

    /* Parse URI from GET request */
    is_static = parse_uri(uri, filename, cgiargs);       //line:netp:doit:staticcheck
    if (stat(filename, &sbuf) < 0) {                     //line:netp:doit:beginnotfound
	clienterror(fd, filename, "404", "Not found",
		    "Tiny couldn't find this file");
	return;
    }                                                    //line:netp:doit:endnotfound

    if (is_static) { /* Serve static content */          
	if (!(S_ISREG(sbuf.st_mode)) || !(S_IRUSR & sbuf.st_mode)) { //line:netp:doit:readable
	    clienterror(fd, filename, "403", "Forbidden",
			"Tiny couldn't read the file");
	    return;
	}
	serve_static(fd, filename, sbuf.st_size);        //line:netp:doit:servestatic
    }
    else { /* Serve dynamic content */
	if (!(S_ISREG(sbuf.st_mode)) || !(S_IXUSR & sbuf.st_mode)) { //line:netp:doit:executable
	    clienterror(fd, filename, "403", "Forbidden",
			"Tiny couldn't run the CGI program");
	    return;
	}
	serve_dynamic(fd, filename, cgiargs);            //line:netp:doit:servedynamic
    }
}

It is a very long piece of code.
But split into pieces, it is mostly the same pattern repeated.

    Rio_readinitb(&rio, fd);
    if (!Rio_readlineb(&rio, buf, MAXLINE))  //line:netp:doit:readrequest
        return;
    printf("%s", buf);
    sscanf(buf, "%s %s %s", method, uri, version);       //line:netp:doit:parserequest
    if (strcasecmp(method, "GET")) {                     //line:netp:doit:beginrequesterr
        clienterror(fd, method, "501", "Not Implemented",
                    "Tiny does not implement this method");
        return;
    }                                                    //line:netp:doit:endrequesterr
    read_requesthdrs(&rio);                              //line:netp:doit:readrequesthdrs

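After reading the request line through rio, it uses strcasecmp to check that the method is GET. Anything else gets a 501.
This tiny code ignores the request headers, so read_requesthdrs only reads them and throws them away.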
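
As a small sketch of that request-line step, the version below does the same sscanf and strcasecmp work but caps each field with a width in the format string. The names parse_request_line, is_get, and FIELD_MAX are made up for this example and are not in Tiny.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>

#define FIELD_MAX 64  /* assumed buffer size for each field in this sketch */

/* Hypothetical helper: splits "GET /home.html HTTP/1.0\r\n" into its three
 * fields. Each output buffer must hold FIELD_MAX bytes.
 * Returns 1 when all three fields were found, 0 otherwise. */
int parse_request_line(const char *line, char *method, char *uri, char *version)
{
    assert(line && method && uri && version);
    /* %63s stops at whitespace and stores at most FIELD_MAX - 1 chars + NUL */
    return sscanf(line, "%63s %63s %63s", method, uri, version) == 3;
}

/* strcasecmp returns 0 on a match, so "is GET" is the == 0 case. */
int is_get(const char *method)
{
    return strcasecmp(method, "GET") == 0;
}
```

Tiny itself writes if (strcasecmp(method, "GET")), which is true exactly when is_get would be false. That inverted reading is what confused me the first time through.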

    is_static = parse_uri(uri, filename, cgiargs);       //line:netp:doit:staticcheck
    if (stat(filename, &sbuf) < 0) {                     //line:netp:doit:beginnotfound
	clienterror(fd, filename, "404", "Not found",
		    "Tiny couldn't find this file");
	return;
    }                                                    //line:netp:doit:endnotfound

    if (is_static) { /* Serve static content */          
	if (!(S_ISREG(sbuf.st_mode)) || !(S_IRUSR & sbuf.st_mode)) { //line:netp:doit:readable
	    clienterror(fd, filename, "403", "Forbidden",
			"Tiny couldn't read the file");
	    return;
	}
	serve_static(fd, filename, sbuf.st_size);        //line:netp:doit:servestatic
    }
    else { /* Serve dynamic content */
	if (!(S_ISREG(sbuf.st_mode)) || !(S_IXUSR & sbuf.st_mode)) { //line:netp:doit:executable
	    clienterror(fd, filename, "403", "Forbidden",
			"Tiny couldn't run the CGI program");
	    return;
	}
	serve_dynamic(fd, filename, cgiargs);            //line:netp:doit:servedynamic
    }

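parse_uri turns the uri into a filename, and if stat cannot find that file, a 404 is sent.

Then, if the content is static (is_static = 1),
it enters the if branch, and if the file is not a regular file or is not readable, it sends a 403 error.
Otherwise it calls serve_static.

If the content is not static (is_static = 0),
it takes the else branch, likewise sends a 403 if the file is not a regular file or is not executable,
and otherwise calls serve_dynamic.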

parse_uri

int parse_uri(char *uri, char *filename, char *cgiargs) 
{
    char *ptr;

    if (!strstr(uri, "cgi-bin")) {  /* Static content */ //line:netp:parseuri:isstatic
	strcpy(cgiargs, "");                             //line:netp:parseuri:clearcgi
	strcpy(filename, ".");                           //line:netp:parseuri:beginconvert1
	strcat(filename, uri);                           //line:netp:parseuri:endconvert1
	if (uri[strlen(uri)-1] == '/')                   //line:netp:parseuri:slashcheck
	    strcat(filename, "home.html");               //line:netp:parseuri:appenddefault
	return 1;
    }
    else {  /* Dynamic content */                        //line:netp:parseuri:isdynamic
	ptr = index(uri, '?');                           //line:netp:parseuri:beginextract
	if (ptr) {
	    strcpy(cgiargs, ptr+1);
	    *ptr = '\0';
	}
	else 
	    strcpy(cgiargs, "");                         //line:netp:parseuri:endextract
	strcpy(filename, ".");                           //line:netp:parseuri:beginconvert2
	strcat(filename, uri);                           //line:netp:parseuri:endconvert2
	return 0;
    }
}

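This function analyzes the uri and reports whether the content is dynamic or static.
If the uri does not contain 'cgi-bin', the content is static, so
it clears cgiargs, builds filename as "." followed by the uri (appending home.html when the uri ends in '/'), and returns 1 (static).

Otherwise it looks up a pointer to the '?'. If one is found, everything after it is copied into cgiargs and the '?' is overwritten with '\0' to cut the uri there.
If ptr is NULL, the else branch sets cgiargs to an empty string.

It then builds filename the same way and returns 0 (dynamic).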
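
To make the two branches concrete, here is a sketch of the same split that leaves the uri untouched and bounds every copy with snprintf. The function name split_uri and the size parameters are my own; Tiny's parse_uri uses strcpy/strcat and edits the uri in place.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of parse_uri: returns 1 for static content,
 * 0 for dynamic content, and fills filename / cgiargs (truncating if
 * they do not fit in fsz / csz bytes). */
int split_uri(const char *uri, char *filename, size_t fsz,
              char *cgiargs, size_t csz)
{
    const char *q;

    assert(uri && filename && cgiargs && fsz > 0 && csz > 0);

    if (strstr(uri, "cgi-bin") == NULL) {            /* static content */
        size_t len = strlen(uri);
        int is_dir = (len > 0 && uri[len - 1] == '/');
        cgiargs[0] = '\0';
        snprintf(filename, fsz, ".%s%s", uri, is_dir ? "home.html" : "");
        return 1;
    }

    q = strchr(uri, '?');                            /* dynamic content */
    if (q != NULL) {
        snprintf(cgiargs, csz, "%s", q + 1);         /* text after '?' */
        snprintf(filename, fsz, ".%.*s", (int)(q - uri), uri);
    } else {
        cgiargs[0] = '\0';
        snprintf(filename, fsz, ".%s", uri);
    }
    return 0;
}
```

With this sketch, "/" becomes ./home.html with empty cgiargs, and "/cgi-bin/adder?1&2" becomes ./cgi-bin/adder with cgiargs 1&2.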

serve_static

This function is used when the uri is static.

void serve_static(int fd, char *filename, int filesize)
{
    int srcfd;
    char *srcp, filetype[MAXLINE], buf[MAXBUF];

    /* Send response headers to client */
    get_filetype(filename, filetype);    //line:netp:servestatic:getfiletype
    sprintf(buf, "HTTP/1.0 200 OK\r\n"); //line:netp:servestatic:beginserve
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Server: Tiny Web Server\r\n");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Content-length: %d\r\n", filesize);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Content-type: %s\r\n\r\n", filetype);
    Rio_writen(fd, buf, strlen(buf));    //line:netp:servestatic:endserve

    /* Send response body to client */
    srcfd = Open(filename, O_RDONLY, 0); //line:netp:servestatic:open
    // srcp = Mmap(0, filesize, PROT_READ, MAP_PRIVATE, srcfd, 0); //line:netp:servestatic:mmap
    srcp = (char*)Malloc(filesize);
    Rio_readn(srcfd, srcp, filesize);
    Close(srcfd);                       //line:netp:servestatic:close
    Rio_writen(fd, srcp, filesize);     //line:netp:servestatic:write
    // Munmap(srcp, filesize);             //line:netp:servestatic:munmap
    free(srcp);
}

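It looks long, but the first part just gets the filetype and sends the header strings (rio_writen).
There is nothing deep here: each buf formatted with sprintf is written to the client with rio_writen.

After that it opens the file as srcfd, mallocs a buffer the size of the file,
reads the file contents into it, closes srcfd, sends the contents, and frees the malloc'd buffer.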
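
The four sprintf/Rio_writen pairs could also be collapsed into one bounded formatting call followed by a single write. The sketch below only builds the header text; build_headers is a name I made up, and the actual sending (Rio_writen in Tiny) is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats the same four response header lines that
 * serve_static sends, ending with the blank line that separates headers
 * from the body. Returns the header length, or -1 if buf is too small. */
int build_headers(char *buf, size_t size, const char *filetype, long filesize)
{
    int n;

    assert(buf && filetype && size > 0);
    n = snprintf(buf, size,
                 "HTTP/1.0 200 OK\r\n"
                 "Server: Tiny Web Server\r\n"
                 "Content-length: %ld\r\n"
                 "Content-type: %s\r\n\r\n",
                 filesize, filetype);
    return (n >= 0 && (size_t)n < size) ? n : -1;
}
```

The returned length is what would be passed to the write call, so strlen is not needed afterwards.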

get_filetype

void get_filetype(char *filename, char *filetype) 
{
    if (strstr(filename, ".html"))
	strcpy(filetype, "text/html");
    else if (strstr(filename, ".gif"))
	strcpy(filetype, "image/gif");
    else if (strstr(filename, ".png"))
	strcpy(filetype, "image/png");
    else if (strstr(filename, ".jpg"))
	strcpy(filetype, "image/jpeg");
    else if (strstr(filename, ".mp4"))
    strcpy(filetype, "video/mp4");
    else
	strcpy(filetype, "text/plain");
}

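It maps file extensions to the Content-type that will be reported.
Add a branch for another extension and files of that type are served with the right type as well.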
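
One thing I noticed is that strstr matches the extension anywhere in the name, so a file like a.html.png would be reported as text/html. A sketch of a table-driven alternative that only looks at the last extension is below. lookup_filetype and mime_table are names invented for this example.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>

struct mime_entry {
    const char *ext;   /* extension including the dot */
    const char *type;  /* Content-type to report */
};

/* Same mappings as get_filetype above; add a row to support a new type. */
static const struct mime_entry mime_table[] = {
    { ".html", "text/html"  },
    { ".gif",  "image/gif"  },
    { ".png",  "image/png"  },
    { ".jpg",  "image/jpeg" },
    { ".mp4",  "video/mp4"  },
};

/* Hypothetical replacement for get_filetype: compares only the text after
 * the last '.', ignoring case, and falls back to text/plain. */
const char *lookup_filetype(const char *filename)
{
    const char *dot;
    size_t i;

    assert(filename != NULL);
    dot = strrchr(filename, '.');
    if (dot != NULL) {
        for (i = 0; i < sizeof(mime_table) / sizeof(mime_table[0]); i++) {
            if (strcasecmp(dot, mime_table[i].ext) == 0)
                return mime_table[i].type;
        }
    }
    return "text/plain";
}
```

Returning a pointer to a constant string instead of copying into filetype is a design choice of this sketch, not something Tiny does.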

serve_dynamic

void serve_dynamic(int fd, char *filename, char *cgiargs) 
{
    char buf[MAXLINE], *emptylist[] = { NULL };

    /* Return first part of HTTP response */
    sprintf(buf, "HTTP/1.0 200 OK\r\n"); 
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Server: Tiny Web Server\r\n");
    Rio_writen(fd, buf, strlen(buf));
  
    if (Fork() == 0) { /* Child */ //line:netp:servedynamic:fork
	/* Real server would set all CGI vars here */
	setenv("QUERY_STRING", cgiargs, 1); //line:netp:servedynamic:setenv
	Dup2(fd, STDOUT_FILENO);         /* Redirect stdout to client */ //line:netp:servedynamic:dup2
	Execve(filename, emptylist, environ); /* Run CGI program */ //line:netp:servedynamic:execve
    }
}

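This function serves dynamic content.
Unlike static, it passes 'QUERY_STRING' to the CGI program through the environment.
The child created by Fork redirects its stdout to the client with Dup2 and is then replaced by the CGI program through Execve.
For details, see the fork/execve sections of csapp.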

clienterror

void clienterror(int fd, char *cause, char *errnum, 
		 char *shortmsg, char *longmsg) 
{
    char buf[MAXLINE];

    /* Print the HTTP response headers */
    sprintf(buf, "HTTP/1.0 %s %s\r\n", errnum, shortmsg);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Content-type: text/html\r\n\r\n");
    Rio_writen(fd, buf, strlen(buf));

    /* Print the HTTP response body */
    sprintf(buf, "<html><title>Tiny Error</title>");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "<body bgcolor=""ffffff"">\r\n");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "%s: %s\r\n", errnum, shortmsg);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "<p>%s: %s\r\n", longmsg, cause);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "<hr><em>The Tiny Web server</em>\r\n");
    Rio_writen(fd, buf, strlen(buf));
}

As the name says, this function sends an error message to the client when a request cannot be served.
There is no difficult logic.

Proxy

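For the proxy I will just touch on the outline and move on.

The things this proxy has to do are:

Socket communication
HTTP protocol parsing
Request relaying
Caching
Error handling
Security

main follows the same sequence as the earlier servers, with multithreading code added.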

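doit parses the request once the client is connected. It then looks in the cache (a cache block is locked while it is being checked and unlocked afterwards). On a miss, since the request has already been parsed, it creates a descriptor and connects to the server socket.
It then builds the data to send to the server (request line and headers), relays the response to the client, and caches it.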

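Socket communication is done the same way as in the earlier servers.
There is one difference from the basic server, though, and that is multithreading.
For multithreading, Pthread_create is used to start a thread for each connection, and semaphores have to be locked and unlocked around shared data so that threads do not collide with each other.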
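
As a minimal sketch of that lock/unlock idea, the example below has several threads increment one shared counter, with a binary semaphore around the increment. It uses the raw POSIX calls (pthread_create, sem_wait, sem_post) instead of the csapp wrappers (Pthread_create, P, V), assumes Linux, and the names worker, run_workers, NTHREADS, and NITERS are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define NTHREADS 4
#define NITERS   10000

static sem_t counter_mutex;  /* binary semaphore protecting counter */
static long counter;         /* shared by all worker threads */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < NITERS; i++) {
        sem_wait(&counter_mutex);   /* P: enter the critical section */
        counter++;
        sem_post(&counter_mutex);   /* V: leave the critical section */
    }
    return NULL;
}

/* Starts NTHREADS workers, waits for all of them, returns the final count. */
long run_workers(void)
{
    pthread_t tids[NTHREADS];
    int rc;

    counter = 0;
    rc = sem_init(&counter_mutex, 0, 1);  /* initial value 1 = unlocked */
    assert(rc == 0);
    for (int i = 0; i < NTHREADS; i++) {
        rc = pthread_create(&tids[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tids[i], NULL);
    sem_destroy(&counter_mutex);
    return counter;
}
```

Without the sem_wait/sem_post pair the increments can interleave and the final count may come out lower than NTHREADS * NITERS. Depending on the toolchain this may need to be compiled with -pthread.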

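HTTP parsing: to forward a request that arrived at the proxy on to the server, the headers have to be generated (make_request). [I am not sure this is accurate, so please double-check.]

Request relaying means connecting the data between the client and the server (taking the request from the client, sending it to the server, and passing the response back).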

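Caching speeds things up by answering repeated requests without contacting the server again. As written above, when multithreading is used, semaphores are needed so that threads do not collide on the cache.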
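
To show the eviction idea on its own, here is a small single-threaded LRU sketch. It is not the scheme in my submitted code, which ages blocks by decrementing a counter. This version stamps each slot with a logical clock and evicts the smallest stamp. All names (lru_get, lru_put, SLOTS, KEY_MAX) are invented for the example, and it stores an int instead of a response body. A threaded version would still need the locking described above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SLOTS   3    /* tiny on purpose so eviction is easy to see */
#define KEY_MAX 64

struct slot {
    char key[KEY_MAX];
    int value;
    unsigned long last_used;  /* logical time of the last get/put */
    int used;
};

static struct slot slots[SLOTS];
static unsigned long clock_tick;

/* Looks up key; on a hit stores the value, refreshes the stamp, returns 1. */
int lru_get(const char *key, int *value)
{
    assert(key && value);
    for (int i = 0; i < SLOTS; i++) {
        if (slots[i].used && strcmp(slots[i].key, key) == 0) {
            slots[i].last_used = ++clock_tick;
            *value = slots[i].value;
            return 1;
        }
    }
    return 0;
}

/* Stores key/value, evicting the least recently used slot when full. */
void lru_put(const char *key, int value)
{
    int victim = -1;

    assert(key != NULL);
    for (int i = 0; i < SLOTS; i++)                 /* 1) key already cached */
        if (slots[i].used && strcmp(slots[i].key, key) == 0) { victim = i; break; }
    for (int i = 0; victim < 0 && i < SLOTS; i++)   /* 2) else first empty slot */
        if (!slots[i].used) victim = i;
    if (victim < 0) {                               /* 3) else oldest stamp */
        victim = 0;
        for (int i = 1; i < SLOTS; i++)
            if (slots[i].last_used < slots[victim].last_used) victim = i;
    }
    snprintf(slots[victim].key, KEY_MAX, "%s", key);
    slots[victim].value = value;
    slots[victim].used = 1;
    slots[victim].last_used = ++clock_tick;
}
```

With three slots, putting a, b, c, then reading a, then putting d should evict b, because a was refreshed by the read.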

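Error handling can reuse what the earlier server had.

As for security... I have nothing I can write.
I was never taught anything about security, and as far as I know it would mean changing how data is exchanged or using a security token such as jwt.

Below is the code I submitted.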

#include <stdio.h>
#include "csapp.h"
#include "hash.c"


/* Recommended max cache and object sizes */
#define MAX_CACHE_SIZE 1049000
#define MAX_OBJECT_SIZE 102400
#define LRU_MAGIC_NUMBER 9999
#define CACHE_OBJS_COUNT 10
#define VERBOSE         1
#define CONCURRENCY     1 // 0: sequential, 1: multithreaded, 2: multiprocess

/*cache function*/
void cache_init();
int cache_find(char *url);
int cache_eviction();
void cache_LRU(int index);
void cache_uri(char *uri,char *buf);
void readerPre(int i);
void readerAfter(int i);


typedef struct { // cache_block struct
    char cache_obj[MAX_OBJECT_SIZE];
    char cache_url[MAXLINE];
    int LRU; // Least recently used: how recently this block was stored
    int isEmpty;

    int readCnt;            /*count of readers*/
    sem_t wmutex;           /* semaphore guarding access to the cache block */
    sem_t rdcntmutex;       /* semaphore guarding access to the readCnt variable */

} cache_block;

typedef struct { // cache struct
    cache_block cacheobjs[CACHE_OBJS_COUNT];  /* array of cache_block structs: holds the actual cache blocks (10) */
    int cache_num; // number of blocks currently cached
} Cache;

Cache cache;


/* You won't lose style points for including this long line in your code */
// https://developer.mozilla.org/ko/docs/Glossary/Request_header
static const char *request_hdr_format = "%s %s HTTP/1.0\r\n";
static const char *host_hdr_format = "Host: %s\r\n";
static const char *user_agent_hdr =
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.3) Gecko/20120305 "
    "Firefox/10.0.3\r\n";
static const char *connection_hdr = "Connection: close\r\n";
static const char *proxy_connection_hdr = "Proxy-Connection: close\r\n";
static const char *Accept_hdr = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n";
static const char *EOL = "\r\n";


void doit(int connfd);
void clienterror(int fd, char *cause, char *errnum, char *shortmsg, char *longmsg);
void parse_uri(char *uri,char *hostname,char *path,int *port);
int make_request(rio_t* client_rio, char *hostname, char *path, int port, char *hdr, char *method);
#if CONCURRENCY == 1 
void *thread(void *vargp);  // Pthread_create defines the return type of the thread routine
#endif


/* What the proxy needs: socket communication, HTTP protocol parsing, request relaying, caching, error handling, security */

int main(int argc, char **argv) {
  int listenfd, *clientfd; // socket file descriptors: listening descriptor, client descriptor
  char hostname[MAXLINE], port[MAXLINE];
  socklen_t clientlen; // size of the client address structure
  struct sockaddr_storage clientaddr; // client socket address
  char client_hostname[MAXLINE], client_port[MAXLINE];  // host and port of the client the proxy receives from and responds to
  pthread_t tid;  // thread ID filled in by Pthread_create (unsigned long)

  /* Check command line args:
  if the number of arguments is wrong, print an error message and exit */
  if (argc != 2) {
    fprintf(stderr, "usage: %s <port>\n", argv[0]); // print "usage: <program name> <port>" to stderr
    exit(1);
  }

  cache_init();

  listenfd = Open_listenfd(argv[1]);  // open the listening socket on the port given on the command line

  while (1) { // loop forever
    clientlen = sizeof(clientaddr); // address length to pass to accept
    clientfd = (int *)Malloc(sizeof(int));  // many descriptors will exist at once, so each gets its own memory and cannot be overwritten
    *clientfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);  // accept the connection: the socket descriptor the proxy, acting as a server, shares with this client

    // get the client's host name and port number into client_hostname and client_port
    Getnameinfo((SA *)&clientaddr, clientlen, client_hostname, MAXLINE, client_port, MAXLINE, 0);

    if (VERBOSE){ // when VERBOSE is 1, print the client's connection info
      printf("Connected to (%s, %s)\n", client_hostname, client_port); 
    }

    // behave differently depending on the CONCURRENCY option
    #if CONCURRENCY == 0 // sequential mode: call doit to handle the request, then close clientfd
      doit(*clientfd);
      Close(*clientfd);
      Free(clientfd);
    
    #elif CONCURRENCY == 1 // thread mode: run thread() in a new thread; it calls doit, closes clientfd, and frees the pointer
      Pthread_create(&tid, NULL, thread, (void *)clientfd);

    #elif CONCURRENCY == 2  // process mode: fork a child to handle the request
      if (Fork() == 0) {  // child: close listenfd, serve the client, close clientfd, exit
        Close(listenfd);
        doit(*clientfd);
        Close(*clientfd);
        exit(0);
      }
      Close(*clientfd);  // parent: the child has its own copy of the descriptor
      Free(clientfd);
    #endif
  }

  return 0;
}

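#if CONCURRENCY == 1 
  void *thread(void *argptr) {
    int clientfd = *((int *)argptr);
    Free(argptr);                      // the descriptor has been copied, so release the block main allocated
    Pthread_detach((pthread_self()));  // detached: resources are reclaimed when the thread ends
    doit(clientfd);
    Close(clientfd);
    return NULL;
  }
#endif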


/* Proxy doit: handles one client connection. Forwards the client's request to the server and relays the server's response back to the client. */
void doit(int client_fd){
  char hostname[MAXLINE], path[MAXLINE]; // hostname and file path on the server the proxy will contact
  int port;                              // port number of that server

  char buf[MAXLINE], hdr[MAXLINE]; // line buffer, and the request that will be sent to the server
  char method[MAXLINE], uri[MAXLINE], version[MAXLINE]; // method, uri, and version of the client request

  int server_fd;

  rio_t client_rio;
  rio_t server_rio;

  Rio_readinitb(&client_rio, client_fd); // attach a read buffer to the client connection
  if (rio_readlineb(&client_rio, buf, MAXLINE) <= 0) // read the request line; nothing to do if the client sent none
    return;
  sscanf(buf, "%s %s %s", method, uri, version); // parse the request line

  if (strcasecmp(method, "GET") && strcasecmp(method, "HEAD")){ // strcasecmp() returns 0 on a match, so this is true when the method is neither GET nor HEAD
    if (VERBOSE) { // when VERBOSE is 1, print a detailed log message
      printf("[PROXY]501 ERROR\n");
    }
    clienterror(client_fd, method, "501", "Not Implemented", "Proxy does not implement this method");
    return;
  } 


  char url_store[MAXLINE];  /* same size as uri, so the copy cannot overflow */
  strcpy(url_store,uri);  /* keep the original url: it is the cache key */
  if(strcasecmp(method,"GET")){  // HEAD passed the check above, but only GET is actually implemented
    clienterror(client_fd, method, "501", "Not Implemented", "Proxy does not implement this method");
    return;
  }

  /*the uri is cached ? */
  int cache_index;/*in cache then return the cache content*/
  if((cache_index=cache_find(url_store))!=-1){ // cache_find returns the index of the cache block holding url_store, or -1
      readerPre(cache_index); // take the reader lock so no thread writes this block while it is read; several threads may still read it at once
      Rio_writen(client_fd,cache.cacheobjs[cache_index].cache_obj,strlen(cache.cacheobjs[cache_index].cache_obj));  // send the cached content to the client through client_fd
      readerAfter(cache_index); // reading is done, so release the reader lock
      return;
  }

  parse_uri(uri, hostname, path, &port); // parse the requested uri into hostname, path, and port (via pointer)
    
  char port_value[100];
  sprintf(port_value,"%d",port); // convert port to a string with %d and store it in port_value
  server_fd = Open_clientfd(hostname, port_value); // create the socket descriptor connected to the server

  if (!make_request(&client_rio, hostname, path, port, hdr, method)) { // building the request failed
    if (VERBOSE) {
      printf("[PROXY]501 ERROR\n");
    }
    clienterror(client_fd, method, "501", "Not Implemented", "Proxy could not build the request");
    Close(server_fd);
    return;
  }
  
  Rio_readinitb(&server_rio, server_fd);  // attach a read buffer to the server socket
  Rio_writen(server_fd, hdr, strlen(hdr)); // send the request to the server
  
  char cachebuf[MAX_OBJECT_SIZE]; // temporary copy of the response, to be stored in the cache
  cachebuf[0] = '\0';             // strcat below needs an empty string to start from
  int sizebuf = 0; // size of the response received so far
  size_t n; // length of the line Rio_readlineb just read
  while ((n=Rio_readlineb(&server_rio, buf, MAXLINE)) > 0) { // read the response line by line while there is more (n > 0)
    sizebuf+=n; // update the total size read so far
    if(sizebuf < MAX_OBJECT_SIZE) {
      strcat(cachebuf,buf); // append this line to cachebuf (string functions: only safe for text responses without NUL bytes)
    }
    Rio_writen(client_fd, buf, n);   // relay the n bytes just read to the client
  } 
  Close(server_fd);

  /*store it*/
  if(sizebuf < MAX_OBJECT_SIZE){
    cache_uri(url_store,cachebuf); // store the response in cachebuf in the cache block for url_store
  }
}


/*
 * clienterror - returns an error message to the client
 */
/* $begin clienterror */
void clienterror(int fd, char *cause, char *errnum, char *shortmsg, char *longmsg) 
{
    char buf[MAXLINE], body[MAXBUF];

    /* Build the HTTP response body in one bounded call
       (sprintf(body, "%s...", body) reads and writes the same buffer, which is undefined behavior) */
    snprintf(body, sizeof(body),
             "<html><title>Tiny Error</title>"
             "<body bgcolor=\"ffffff\">\r\n"
             "%s: %s\r\n"
             "<p>%s: %s\r\n"
             "<hr><em>The Tiny Web server</em>\r\n",
             errnum, shortmsg, longmsg, cause);

    /* Print the HTTP response */
    sprintf(buf, "HTTP/1.0 %s %s\r\n", errnum, shortmsg); // status code and status message of the response
    Rio_writen(fd, buf, strlen(buf)); // send the status line in buf to the client
    sprintf(buf, "Content-type: text/html\r\n");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Content-length: %d\r\n\r\n", (int)strlen(body));
    Rio_writen(fd, buf, strlen(buf));
    Rio_writen(fd, body, strlen(body)); // send the error page stored in body to the client
}
/* $end clienterror */


void parse_uri(char *uri,char *hostname, char *path, int *port) {
  /*
   The uri may come in as
   / or /cgi-bin/adder,
   or as http://11.22.33.44:5001/home.html.
   It has to be split correctly into hostname, port, and path.
   This function parses the given URI, extracts the host name, path, and port number, and stores them in the given variables.
  */

  *port = 80;
  strcpy(path, "/");  // default path, used when the URI has no path part
  if (VERBOSE) {
    printf("uri=%s\n", uri);
  }
  
  char *parsed;
  parsed = strstr(uri, "//"); // look for "//" in the uri string
  
  // If "//" is absent, the uri does not carry a scheme in front of the host.
  // (= the URI has the form "/path" or "/cgi-bin/adder")
  if (parsed == NULL) {
    parsed = uri;
  }
  else { // "//" is present, so the URI includes a host name and possibly a port
    parsed = parsed + 2;  // skip the two characters of "//"; parsed now points at the start of the host name
  }
  char *parsed2 = strstr(parsed, ":"); // find ":" in parsed to separate the host name from the port number

  if (parsed2 == NULL) { // no ':', so there is no port
    parsed2 = strstr(parsed, "/");
    if (parsed2 == NULL) {
      sscanf(parsed,"%s",hostname); // no "/" either, so the URI holds only a host name: copy it into hostname
    } 
    else { // "/" is present: extract the host name and the path into hostname and path
        *parsed2 = '\0';
        sscanf(parsed,"%s",hostname);
        *parsed2 = '/';
        sscanf(parsed2,"%s",path);
    }
  } else { // ':' is present, so there is a port
      *parsed2 = '\0'; // overwrite ":" with a NUL character to cut out the host name
      sscanf(parsed, "%s", hostname);
      sscanf(parsed2+1, "%d%s", port, path); // read the port number after ":" into port and the rest into path
  }
  if (VERBOSE) {
    printf("hostname=%s port=%d path=%s\n", hostname, *port, path);
  }
}


int make_request(rio_t* client_rio, char *hostname, char *path, int port, char *hdr, char *method) {
  // build the HTTP request that forwards the client's request from the proxy to the server
  char req_hdr[MAXLINE], additional_hdr[MAXLINE], host_hdr[MAXLINE]; // request line, extra headers, and Host header
  char buf[MAXLINE];
  char *HOST = "Host";
  char *CONN = "Connection";
  char *UA = "User-Agent";
  char *P_CONN = "Proxy-Connection";
  additional_hdr[0] = '\0';  // both buffers are appended to or measured below,
  host_hdr[0] = '\0';        // so they must start out as empty strings
  sprintf(req_hdr, request_hdr_format, method, path); // format the request line from request_hdr_format, method, and path

  while (1) { // process the headers sent by the client until a blank line (EOL) or EOF
    if (Rio_readlineb(client_rio, buf, MAXLINE) == 0) break; // a result of 0 means EOF, so stop
    if (!strcmp(buf,EOL)) break;  // a bare "\r\n" line marks the end of the headers

    if (!strncasecmp(buf, HOST, strlen(HOST))) { // buf is the Host header
      strcpy(host_hdr, buf); // keep the client's Host header
      continue;
    }

    if (strncasecmp(buf, CONN, strlen(CONN)) && strncasecmp(buf, UA, strlen(UA)) && strncasecmp(buf, P_CONN, strlen(P_CONN))) {
       // not one of the prepared headers (Host, Connection, User-Agent, Proxy-Connection): collect it as an extra header
       // (buf already ends with \r\n; note that additional_hdr is collected but not included in hdr below)
      strcat(additional_hdr, buf);  
    }
  }

  if (!strlen(host_hdr)) { // the client's headers did not include a Host header
    sprintf(host_hdr, host_hdr_format, hostname); // build one from host_hdr_format and hostname
  }

  sprintf(hdr, "%s%s%s%s%s%s", 
    req_hdr,   // METHOD URL VERSION
    host_hdr,   // Host header
    user_agent_hdr,
    connection_hdr,
    proxy_connection_hdr,
    EOL
  );

  if (strlen(hdr)) // return 1 when the generated request is not empty
    return 1;
  return 0;
}



/**************************************
 * Cache Function
 * https://github.com/yeonwooz/CSAPP-Labs
 * 
 * 
 * P: locks the critical section when a thread enters
 * V: unlocks the critical section when a thread leaves
 **************************************/

void cache_init(){ // cache initialization: resets the cache blocks and their bookkeeping variables
    cache.cache_num = 0; // number of objects stored in the cache
    int i;
    for(i=0;i<CACHE_OBJS_COUNT;i++){ // repeat CACHE_OBJS_COUNT (10) times
        cache.cacheobjs[i].LRU = 0; // set the LRU (Least Recently Used) value of block i to 0
        cache.cacheobjs[i].isEmpty = 1;
        Sem_init(&cache.cacheobjs[i].wmutex,0,1); // wmutex protects writes; initialized to 1 (unlocked)
        Sem_init(&cache.cacheobjs[i].rdcntmutex,0,1); // rdcntmutex protects readCnt; initialized to 1 (unlocked)
        cache.cacheobjs[i].readCnt = 0; // number of threads currently reading this block
    }
}

void readerPre(int i){ // called before reading cache block i
    P(&cache.cacheobjs[i].rdcntmutex); // lock rdcntmutex (P) so no other thread can modify readCnt at the same time
    cache.cacheobjs[i].readCnt++; 
    if(cache.cacheobjs[i].readCnt==1) P(&cache.cacheobjs[i].wmutex); // the first reader locks wmutex (P), which keeps writers out
    V(&cache.cacheobjs[i].rdcntmutex); // release rdcntmutex (V)
}

void readerAfter(int i){ // called after reading cache block i is finished
    P(&cache.cacheobjs[i].rdcntmutex);
    cache.cacheobjs[i].readCnt--;
    if(cache.cacheobjs[i].readCnt==0) V(&cache.cacheobjs[i].wmutex); // the last reader (readCnt back to 0) releases wmutex, so a waiting writer can proceed
    V(&cache.cacheobjs[i].rdcntmutex);

}

void writePre(int i){
    P(&cache.cacheobjs[i].wmutex); // lock wmutex (P)
}

void writeAfter(int i){
    V(&cache.cacheobjs[i].wmutex); // release wmutex (V)
}

/*find url is in the cache or not
cache_find() searches the cache for the given URL and reports whether it is present */
int cache_find(char *url){
    int i;
    for(i=0;i<CACHE_OBJS_COUNT;i++){
        readerPre(i);
        int hit = (cache.cacheobjs[i].isEmpty==0) && (strcmp(url,cache.cacheobjs[i].cache_url)==0); // block is in use (isEmpty = 0) and its url matches
        readerAfter(i); // always release the reader lock; breaking out while holding it would block writers on this block forever
        if(hit) break; // the url is in the cache
    }
    if(i>=CACHE_OBJS_COUNT) return -1; /*can not find url in the cache*/
    return i; // index of the cache block that holds the URL
}

/*find the empty cacheObj or which cacheObj should be evictioned
Picks the most suitable cache block and returns its index as the one to replace.
 Following the LRU (Least Recently Used) policy, this means the block that has gone unused the longest.
*/
int cache_eviction(){
    int min = LRU_MAGIC_NUMBER; // start min at LRU_MAGIC_NUMBER
    int minindex = 0; // index of the block with the smallest LRU value seen so far
    int i;
    for(i=0; i<CACHE_OBJS_COUNT; i++)
    {
        readerPre(i);
        if(cache.cacheobjs[i].isEmpty == 1){/*choose if cache block empty*/
            minindex = i;
            readerAfter(i);
            break; // an empty block was found, so use it without evicting anything
        }
        if(cache.cacheobjs[i].LRU< min){    /* block is in use and its LRU value is smaller than the current minimum */
            min = cache.cacheobjs[i].LRU;   /* remember the new minimum so later blocks are compared against it */
            minindex = i;
            readerAfter(i);
            continue;
        }
        readerAfter(i);
    }

    return minindex;
}

/*update the LRU number except the new cache one
 Decrements the LRU value of every cache block before and after the given index.
 The given index stays marked as most recent, while all other blocks age by one.*/
void cache_LRU(int index){
    int i;
    for(i=0; i<index; i++)    { // blocks before index
        writePre(i);
        if(cache.cacheobjs[i].isEmpty==0){ // block is in use
            cache.cacheobjs[i].LRU--; // age this block: a smaller LRU value means it was stored longer ago
        }
        writeAfter(i);
    }
    i++; // skip index itself
    for(i; i<CACHE_OBJS_COUNT; i++)    { // blocks after index, up to CACHE_OBJS_COUNT (total number of cache blocks)
        writePre(i);
        if(cache.cacheobjs[i].isEmpty==0){
            cache.cacheobjs[i].LRU--;
        }
        writeAfter(i);
    }
}
/*cache the uri and content in cache
Stores a URI and the response data for that URI in the cache*/
void cache_uri(char *uri,char *buf) {
    int i = cache_eviction(); // get the index (i) of the block to replace

    writePre(i);/*writer P: begin write*/

    strcpy(cache.cacheobjs[i].cache_obj,buf); // copy buf into cache.cacheobjs[i].cache_obj: store the response data at this index
    strcpy(cache.cacheobjs[i].cache_url,uri);
    cache.cacheobjs[i].isEmpty = 0; // mark this block as in use (0, false)
    cache.cacheobjs[i].LRU = LRU_MAGIC_NUMBER; // set the LRU value that marks this block as most recent
    cache_LRU(i);  // update the LRU bookkeeping: this index stays most recent and the other blocks age

    writeAfter(i);/*writer V: end write*/
}