5. Hands-on: Developing a model handler for inference

s2ul3 · October 14, 2022

Deploying and serving AI models with AWS

Model serving

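To deploy a trained model as a REST API, the model must first be serialized and then wrapped in a web framework.
When serving, keep the data distribution and processing steps consistent with those used during training.
Choose a serving framework that suits the environment the model is deployed to.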

Process

1. Model Training

Data Preprocessing
Model fitting
Evaluation

2. Serializing Model

Save trained model

3. Serving Model

Load trained model
- Define inference
Deployment
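
Steps 2 and 3 boil down to a save/load round trip. The following is only a minimal sketch under stated assumptions, not the course's code: the standard-library pickle module stands in for joblib, and the `params` dictionary and the file name `toy_model.pkl` are hypothetical placeholders for a real trained model.

```python
import os
import pickle
import tempfile

# Hypothetical "trained model": in practice this would be a fitted estimator.
params = {"weights": {"재미있는": 1.5, "없습니다": -2.0}, "bias": 0.1}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "toy_model.pkl")

    # 2. Serializing Model: save the trained model to disk.
    with open(model_path, "wb") as f:
        pickle.dump(params, f)

    # 3. Serving Model: load the trained model back (de-serialization).
    with open(model_path, "rb") as f:
        loaded_params = pickle.load(f)

# The loaded object carries the same parameters that were saved.
print(loaded_params == params)
```

Only load serialized files that come from a source you trust, since unpickling can execute arbitrary code.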

So far we have covered step 3, Serving Model, up through loading the trained model.
This time we practice the next part, Define inference.

Skeleton of model handler to serve model

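# BaseHandler is assumed to be defined elsewhere (it is not shown in this post).
class ModelHandler(BaseHandler):
    def __init__(self):
        pass

    # 1. Initialize configuration and load the model and preprocessor
    def initialize(self, **kwargs):
        pass

    # 2. Preprocess the input and convert it into a form the model accepts
    def preprocess(self, data):
        pass

    # 3. Run inference with the loaded model
    def inference(self, data):
        pass

    # 4. Post-process the model's output
    def postprocess(self, data):
        pass

    # 5. Run the steps above for a request and return the result
    def handle(self, data):
        pass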

1. Handle

handle(): receives the request data, runs the steps above in sequence (preprocess, inference, postprocess), and returns the response. The API only ever calls handle().

    def handle(self, data):
        # do above processes
        model_input = self.preprocess(data)
        model_output = self.inference(model_input)
        return self.postprocess(model_output)
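
To make the call order concrete, here is a small sketch with a hypothetical `ToyHandler`. Its steps are stand-ins (the "model" just scores text by length) and are not the handler built in this post; the point is only that a single call to handle() drives all three steps in order.

```python
class ToyHandler:
    """Hypothetical handler whose steps record the order in which they run."""

    def __init__(self):
        self.calls = []

    def preprocess(self, data):
        self.calls.append("preprocess")
        # Trim surrounding whitespace from each text.
        return [text.strip() for text in data]

    def inference(self, model_input):
        self.calls.append("inference")
        # Stand-in for a model: score each text by its length.
        return [len(text) for text in model_input]

    def postprocess(self, model_output):
        self.calls.append("postprocess")
        # Turn raw scores into labels for the response.
        return ["long" if score > 5 else "short" for score in model_output]

    def handle(self, data):
        model_input = self.preprocess(data)
        model_output = self.inference(model_input)
        return self.postprocess(model_output)


handler = ToyHandler()
response = handler.handle(["  hi  ", "hello world"])
print(response, handler.calls)
```

Because the caller touches only handle(), the individual steps can change without affecting the API layer.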

2. Initialization

initialize()

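Initializes data processing, the model, configuration, and so on.
1. Initialize configuration and similar settings
2. (If the model is a neural network) build and initialize the network
3. Load the pretrained model and preprocessor (de-serialization)
Note
- Load the model as a global variable. (Otherwise the model is reloaded on every inference, which wastes resources.)
- Load the model before any request is handled.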

    def initialize(self):
        # De-serializing model and loading vectorizer
        import joblib
        self.model = joblib.load('model/ml_model.pkl')
        self.vectorizer = joblib.load('model/ml_vectorizer.pkl')
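
The note about loading the model only once can be illustrated with a small sketch. Everything here is hypothetical: `load_model()` stands in for an expensive call such as joblib.load(), and `load_calls` exists only to count how often it runs.

```python
load_calls = []


def load_model():
    """Hypothetical stand-in for an expensive call such as joblib.load()."""
    load_calls.append("load")
    return {"name": "toy-model"}


class OnceHandler:
    def __init__(self):
        self.initialize()

    def initialize(self):
        # Runs once per handler instance, before any request is served.
        self.model = load_model()

    def handle(self, data):
        # Reuses self.model; nothing is loaded here.
        return [self.model["name"] for _ in data]


# Module-level (global) instance: created once when the module is imported.
handler = OnceHandler()

# Several requests reuse the same already-loaded model.
for _ in range(3):
    handler.handle(["request"])

print(len(load_calls))
```

If the handler were instead created inside the request function, `load_model()` would run on every request.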

3. Preprocess

preprocess()

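Cleans the raw input and converts it into a form the model can accept.
1. Cleaning the raw input: cleanse the data and apply the same scaling and processing that was used when the model was trained.
2. Converting to model input: vectorization, converting tokens to ids, and so on.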

    def preprocess(self, text):
        # cleansing raw text
        model_input = self.clean_text(text)
        # vectorizing cleaned text
        model_input = self.vectorizer.transform(model_input)
        return model_input
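
The post does not show clean_text(), so the following is only a guess at what such a helper might look like. The cleaning rules (keep Hangul, ASCII letters, digits, and spaces) are assumptions; in a real handler they must mirror whatever cleaning was applied at training time.

```python
import re


def clean_text(text):
    """Hypothetical cleaner: returns a list of cleaned strings.

    Accepts a single string or a list of strings, so the result can be
    passed to a vectorizer's transform(), which expects an iterable of documents.
    """
    texts = [text] if isinstance(text, str) else list(text)
    cleaned = []
    for t in texts:
        # Replace everything except Hangul syllables, ASCII letters, digits, and spaces.
        t = re.sub(r"[^가-힣a-zA-Z0-9 ]", " ", t)
        # Collapse repeated whitespace and trim the ends.
        t = re.sub(r"\s+", " ", t).strip()
        cleaned.append(t)
    return cleaned


print(clean_text(["정말 재미있는 영화입니다.", "Hello,   world!!"]))
```

Wrapping a single string in a list matters because a vectorizer would otherwise iterate over the string's characters rather than treat it as one document.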

4. Inference

inference()

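Feeds the input into the model to make a prediction.
Returns the predicted probability distribution using the model's own predict method (here, predict_proba).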

    def inference(self, model_input):
        # get predictions from model as probabilities
        model_output = self.model.predict_proba(model_input)
        return model_output

5. Postprocess

postprocess()

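Post-processes the model's predictions to fit the response.
1. Post-process the predicted results
2. A model usually returns something like a probability distribution, so this step mostly converts it into the information the response needs.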

    def postprocess(self, model_output):
        # process predictions to predicted label and output format
        predicted_probabilities = model_output.max(axis=1)
        predicted_ids = model_output.argmax(axis=1)
        predicted_labels = [self.id2label[id_] for id_ in predicted_ids]
        return predicted_labels, predicted_probabilities
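
A worked example may help show what max(axis=1) and argmax(axis=1) do here. This is a sketch with made-up numbers: the `id2label` mapping and the probabilities are assumptions, and the final `response` dictionary is just one possible output format, not the one used in this post.

```python
import numpy as np

# Hypothetical mapping; the handler is assumed to define its own id2label.
id2label = {0: "negative", 1: "positive"}

# One row per input text, one column per class (the shape predict_proba returns).
model_output = np.array([[0.1, 0.9],
                         [0.8, 0.2]])

# Highest probability in each row.
predicted_probabilities = model_output.max(axis=1)
# Column index (class id) of that highest probability.
predicted_ids = model_output.argmax(axis=1)
predicted_labels = [id2label[id_] for id_ in predicted_ids]

# Plain Python types are easier to serialize into a JSON response.
response = {
    "labels": predicted_labels,
    "probabilities": predicted_probabilities.tolist(),
}
print(response)
```

Converting the NumPy array with tolist() is a design choice for APIs that return JSON, since NumPy arrays are not JSON-serializable by default.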

Completed model handler code

class MLModelHandler(ModelHandler):
    def __init__(self):
        super().__init__()
        # load the model once, when the handler is created
        self.initialize()

    def initialize(self):
        # De-serializing model and loading vectorizer
        import joblib
        self.model = joblib.load('model/ml_model.pkl')
        self.vectorizer = joblib.load('model/ml_vectorizer.pkl')

    def preprocess(self, text):
        # cleansing raw text
        # (clean_text() is not defined in this class; assumed to come from ModelHandler)
        model_input = self.clean_text(text)
        # vectorizing cleaned text
        model_input = self.vectorizer.transform(model_input)
        return model_input

    def inference(self, model_input):
        # get predictions from model as probabilities
        model_output = self.model.predict_proba(model_input)
        return model_output

    def postprocess(self, model_output):
        # process predictions to predicted label and output format
        # (id2label is not defined in this class; assumed to come from ModelHandler)
        predicted_probabilities = model_output.max(axis=1)
        predicted_ids = model_output.argmax(axis=1)
        predicted_labels = [self.id2label[id_] for id_ in predicted_ids]
        return predicted_labels, predicted_probabilities

    def handle(self, data):
        # run preprocess -> inference -> postprocess for one request
        model_input = self.preprocess(data)
        model_output = self.inference(model_input)
        return self.postprocess(model_output)

Testing ML model handler

Let's test whether the completed model handler works properly.
Save the code written above, exit (exit()), and then enter the code below in a fresh Python session.

(pytorch) ubuntu@ip-172-31-47-106:~/kdt-ai-aws$ python
Python 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:56:21) 
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from model import MLModelHandler # load the model as a global variable
>>> ml_handler = MLModelHandler()
>>> ml_handler
<model.MLModelHandler object at 0x7f31853a6d60>
>>> text = ['정말 재미있는 영화입니다.', '정말 재미가 없습니다.']
>>> text
['정말 재미있는 영화입니다.', '정말 재미가 없습니다.']
>>> result = ml_handler.handle(text)
>>> result
(['positive', 'negative'], array([0.98683823, 0.79660478]))
As a result, the first text was correctly predicted as positive and the second as negative.