Deploying a KFServing InferenceService
- Normally a model would be deployed as a KFServing InferenceService,
- which relies on Knative internally and therefore gets its features out of the box
- Autoscaling, canary rollout, routing, etc.
- Here, instead, the goal is to deploy the TensorFlow Serving image directly on Kubernetes.
Below, the model is deployed directly rather than through KFServing's InferenceService CR, so Knative is not used.
- The TensorFlow Serving image that KFServing builds on is deployed directly to serve the model.
- Intent: set up an environment that serves models with the TensorFlow Serving image on Kubernetes with Istio.
- Basic structure
- The model is stored in Google Cloud Storage (see the export sketch below)
- A Kubernetes Service and Deployment plus an Istio VirtualService and DestinationRule are deployed (manifest below)
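TensorFlow Serving expects the --model_base_path to contain numbered version subdirectories (e.g. gs://<BUCKET_NAME>/1/saved_model.pb) and serves the highest version it finds. As a rough sketch of getting a model into that layout, the following exports a small untrained Keras model straight to the bucket; <BUCKET_NAME> is the same placeholder used in the manifest, and writing to gs:// assumes GCS credentials are available locally.

# Sketch: export a SavedModel into the layout tensorflow_model_server expects
# under --model_base_path, i.e. gs://<BUCKET_NAME>/<version>/saved_model.pb.
# <BUCKET_NAME> is a placeholder; writing directly to gs:// assumes GCS
# credentials (e.g. GOOGLE_APPLICATION_CREDENTIALS) are configured locally.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),          # flattened 28x28 MNIST image
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# "1" is the model version directory; the server loads the highest version.
tf.saved_model.save(model, "gs://<BUCKET_NAME>/1")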
apiVersion: v1
kind: Service
metadata:
  labels:
    app: mnist
  name: mnist-service
  namespace: kubeflow
spec:
  ports:
  - name: grpc-tf-serving   # gRPC endpoint of tensorflow_model_server
    port: 9000
    targetPort: 9000
  - name: http-tf-serving   # REST endpoint of tensorflow_model_server
    port: 8500
    targetPort: 8500
  selector:
    app: mnist
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: mnist
  name: mnist-v1
  namespace: kubeflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mnist
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "true"
      labels:
        app: mnist
        version: v1
    spec:
      containers:
      - args:
        - --port=9000
        - --rest_api_port=8500
        - --model_name=mnist
        - --model_base_path=gs://<BUCKET_NAME>
        command:
        - /usr/bin/tensorflow_model_server
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /secret/gcp-credentials/user-gcp-sa.json
        image: tensorflow/serving
        imagePullPolicy: IfNotPresent
        livenessProbe:
          initialDelaySeconds: 30
          periodSeconds: 30
          tcpSocket:
            port: 9000
        name: mnist
        ports:
        - containerPort: 9000
        - containerPort: 8500
        resources:
          limits:
            cpu: "4"
            memory: 4Gi
            nvidia.com/gpu: 1   # actually using the GPU requires the GPU image variant (e.g. tensorflow/serving:latest-gpu)
          requests:
            cpu: "1"
            memory: 1Gi
        volumeMounts:
        - mountPath: /secret/gcp-credentials
          name: gcp-credentials
      volumes:
      - name: gcp-credentials
        secret:
          secretName: user-gcp-sa
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  labels:
    app: mnist
  name: mnist-service
  namespace: kubeflow
spec:
  gateways:
  - kubeflow-gateway
  hosts:
  - '*'
  http:
  - match:
    - method:
        exact: POST
      uri:
        prefix: /tfserving/models/mnist
    rewrite:
      uri: /v1/models/mnist:predict
    route:
    - destination:
        host: mnist-service
        port:
          number: 8500
        subset: v1
      weight: 100
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    app: mnist
  name: mnist-service
  namespace: kubeflow
spec:
  host: mnist-service
  subsets:
  - labels:
      version: v1
    name: v1
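Once the manifests above are applied with kubectl apply, a prediction request goes through the kubeflow-gateway at the /tfserving/models/mnist prefix, and the VirtualService rewrites it to TensorFlow Serving's REST endpoint /v1/models/mnist:predict. A minimal client sketch follows; <GATEWAY_HOST> is an assumption (the externally reachable address of the Istio ingress gateway) and is not part of the manifest.

# Minimal client sketch. <GATEWAY_HOST> is a placeholder for the externally
# reachable address of the Istio ingress gateway backing kubeflow-gateway.
import numpy as np
import requests

GATEWAY_HOST = "http://<GATEWAY_HOST>"

# TF Serving's REST predict API expects {"instances": [...]}; here one
# flattened 28x28 MNIST image (must match the SavedModel's input signature).
payload = {"instances": np.zeros((1, 784), dtype="float32").tolist()}

# POST to the VirtualService prefix; Istio rewrites it to /v1/models/mnist:predict.
resp = requests.post(f"{GATEWAY_HOST}/tfserving/models/mnist", json=payload)
print(resp.json())   # {"predictions": [[... 10 class scores ...]]}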