This post is a summary of the official documentation; I have not actually used Flagger yet.
❯ helm repo add flagger https://flagger.app
"flagger" has been added to your repositories
# apply the Canary CRDs
❯ kubectl apply -f https://raw.githubusercontent.com/fluxcd/flagger/main/artifacts/flagger/crd.yaml
customresourcedefinition.apiextensions.k8s.io/canaries.flagger.app created
customresourcedefinition.apiextensions.k8s.io/metrictemplates.flagger.app created
customresourcedefinition.apiextensions.k8s.io/alertproviders.flagger.app created
❯ k get crd
NAME CREATED AT
alertproviders.flagger.app 2022-05-19T15:54:19Z # 👍
authorizationpolicies.security.istio.io 2022-05-19T15:05:50Z
canaries.flagger.app 2022-05-19T15:54:19Z # 👍
demoes.demoapp.my.domain 2022-04-24T10:46:39Z
destinationrules.networking.istio.io 2022-05-19T15:05:50Z
envoyfilters.networking.istio.io 2022-05-19T15:05:50Z
gateways.networking.istio.io 2022-05-19T15:05:50Z
istiooperators.install.istio.io 2022-05-19T15:05:50Z
metrictemplates.flagger.app 2022-05-19T15:54:19Z # 👍
peerauthentications.security.istio.io 2022-05-19T15:05:50Z
proxyconfigs.networking.istio.io 2022-05-19T15:05:50Z
requestauthentications.security.istio.io 2022-05-19T15:05:50Z
serviceentries.networking.istio.io 2022-05-19T15:05:50Z
sidecars.networking.istio.io 2022-05-19T15:05:50Z
telemetries.telemetry.istio.io 2022-05-19T15:05:50Z
virtualservices.networking.istio.io 2022-05-19T15:05:50Z
wasmplugins.extensions.istio.io 2022-05-19T15:05:50Z
workloadentries.networking.istio.io 2022-05-19T15:05:50Z
workloadgroups.networking.istio.io 2022-05-19T15:05:50Z
# deploy with the Istio provider
❯ helm upgrade -i flagger flagger/flagger \
--namespace=istio-system \
--set crd.create=false \
--set meshProvider=istio \
--set metricsServer=http://prometheus:9090
# For an Istio multi-cluster shared control plane
❯ helm upgrade -i flagger flagger/flagger \
--namespace=istio-system \
--set crd.create=false \
--set meshProvider=istio \
--set metricsServer=http://istio-cluster-prometheus:9090 \ # 🙊
--set controlplane.kubeconfig.secretName=istio-kubeconfig \ # 🙊
--set controlplane.kubeconfig.key=kubeconfig # 🙊
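To confirm the release is up before going further, something like this should do (assuming the chart lands as a Deployment named flagger in istio-system, as installed above):
❯ kubectl -n istio-system get deploy flagger
❯ kubectl -n istio-system logs deploy/flagger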
- Install Flagger
- Deploy the app (Deployment or DaemonSet; a sketch follows below)
- Write a Canary CR
However, I am not sure whether the Canary keeps monitoring the target continuously, and whether the deployment order above actually matters is something I still need to check.
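For the app-deployment step, a minimal sketch of getting a target workload running in an Istio-injected namespace (the test namespace and the podinfo image follow the docs' examples; the exact commands are my shorthand, not copied from the docs):
❯ kubectl create ns test
❯ kubectl label namespace test istio-injection=enabled
❯ kubectl -n test create deployment podinfo --image=stefanprodan/podinfo:3.0.1
kubectl create deployment gives the workload a single app: podinfo selector, which also satisfies the label-selector requirement described further down.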
The Canary targets a Deployment or DaemonSet, and the Service is generated by Flagger:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
name: "app에 대한 canary CR 이름 작성"
spec:
# app-specific settings ---
targetRef:
apiVersion: apps/v1
kind: Deployment
name: <"여기에 Deployment 이름 작성"
service:
port: 9898
# ---
analysis:
interval: 1m
threshold: 10
maxWeight: 50
stepWeight: 5
metrics:
- name: request-success-rate
thresholdRange:
min: 99
interval: 1m
- name: request-duration
thresholdRange:
max: 500
interval: 1m
webhooks:
- name: load-test
url: http://flagger-loadtester.test/
metadata:
cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/"
Flagger creates deployment/<targetRef.name>-primary and hpa/<autoscalerRef.name>-primary as the primary resources.
It then runs the canary analysis, and when the analysis passes, the primary is promoted to the new version.
spec:
progressDeadlineSeconds: 60
targetRef:
apiVersion: apps/v1
kind: Deployment
name: podinfo
autoscalerRef:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
name: podinfo
Note that the target's label selector must contain exactly one label:
app: <DEPLOYMENT-NAME>
apiVersion: apps/v1
kind: Deployment
metadata:
name: podinfo
spec:
selector: # 🙊
matchLabels: # 🙊
app: podinfo # 🙊
template:
metadata:
labels:
app: podinfo
Our deployments go out with multiple selector labels, though. Is that compatible?
selector:
matchLabels:
app: "app-name"
code: "service-code"
env: "dev"
part-of: "app-name"
It seems fine even if an already-deployed Deployment carries multiple selector labels; just pick a single label for the Canary to key on.
apiVersion: apps/v1
kind: Deployment
metadata:
name: podinfo
spec:
selector:
matchLabels:
app: podinfo
affinity: podinfo
template:
metadata:
labels:
app: podinfo
affinity: podinfo
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchLabels:
affinity: podinfo
topologyKey: topology.kubernetes.io/zone
Selector formats that Flagger supports by default:
app: <DEPLOYMENT-NAME>
name: <DEPLOYMENT-NAME>
app.kubernetes.io/name: <DEPLOYMENT-NAME>
If you want to use a different label as the selector, there are two options:
- pass -selector-labels=my-app-label as a container argument
- set --set selectorLabels=my-app-label on the Helm chart
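For the Helm option, the value would just be appended to the install command from earlier; a sketch (my-app-label being the placeholder label key):
❯ helm upgrade -i flagger flagger/flagger \
--namespace=istio-system \
--set crd.create=false \
--set meshProvider=istio \
--set metricsServer=http://prometheus:9090 \
--set selectorLabels=my-app-label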
For ConfigMaps and Secrets referenced by the deployment, Flagger will create a copy of each object using the -primary suffix and will reference these objects in the primary deployment.
So, just like the Deployment itself, copies with the -primary suffix are used.
By annotating a ConfigMap or Secret with flagger.app/config-tracking: disabled, the same resource can be shared instead of being copied.
There are also two ways to disable config tracking globally:
- pass -enable-config-tracking=false as a container argument
- set --set configTracking.enabled=false on the Helm chart
That said, the docs note that adding the annotation per resource usually fits the use case better than the global setting.
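A sketch of the per-resource opt-out on a ConfigMap (the name and data are made up for illustration):
apiVersion: v1
kind: ConfigMap
metadata:
  name: podinfo-config # hypothetical ConfigMap referenced by the Deployment
  annotations:
    flagger.app/config-tracking: disabled # share this object instead of creating a -primary copy
data:
  LOG_LEVEL: info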
Flagger will pause the traffic increase while the target and primary deployments are scaled up or down.
HPA can help reduce the resource usage during the canary analysis.
When the autoscaler reference is specified,
any changes made to the autoscaler are only made active in the primary autoscaler when a rollout for the deployment starts and completes successfully.
Optionally, you can create two HPAs, one for canary and one for the primary to update the HPA without doing a new rollout.
As the canary deployment will be scaled to 0, the HPA on the canary will be inactive.
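For reference, the autoscalerRef shown earlier points at an ordinary HPA named after the Deployment; a minimal sketch (replica bounds and the CPU target are arbitrary):
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80 # arbitrary threshold for the sketch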
By default this assumes the Service name is the same as the Deployment name, but since you can set the Service name explicitly in canary.spec.service, that seems fine.
spec:
service:
name: podinfo # when set explicitly, this may differ from the Deployment name
port: 9898
portName: http
targetPort: 9898
portDiscovery: true
portName is also optional and defaults to http.
If you use gRPC, portName must be set to grpc.
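So for a gRPC service the block would look roughly like this (the port number is just an example):
spec:
  service:
    port: 9999 # example gRPC port
    portName: grpc # so the mesh treats this traffic as gRPC
    targetPort: 9999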
port discovery
If port discovery is enabled, Flagger scans the target workload and extracts the container ports excluding the port specified in the canary service and service mesh sidecar ports. These ports will be used when generating the ClusterIP services.
Service objects generated from canary.spec.service:
<service.name>.<namespace>.svc.cluster.local (selector app=<name>-primary)
<service.name>-primary.<namespace>.svc.cluster.local (selector app=<name>-primary)
<service.name>-canary.<namespace>.svc.cluster.local (selector app=<name>)
This ensures that traffic to podinfo.test:9898 will be routed to the latest stable release of your app. The podinfo-canary.test:9898 address is available only during the canary analysis and can be used for conformance testing or load testing.
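Once a Canary has been initialized, those three ClusterIP Services should show up with something like this (test namespace assumed):
❯ kubectl -n test get svc
# expecting podinfo, podinfo-primary and podinfo-canary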
The annotations and labels applied to the generated objects can be customized as well. The apex metadata is added to both the Kubernetes Service and the generated service mesh/ingress object (VirtualServices and TraefikServices). For Istio-specific settings, refer to the FAQ.
spec:
service:
port: 9898
apex:
annotations:
test: "test"
labels:
test: "test"
canary:
annotations:
test: "test"
labels:
test: "test"
primary:
annotations:
test: "test"
labels:
test: "test"
Checking the state of Canary resources
❯ k get canaries --all-namespaces
NAMESPACE NAME STATUS WEIGHT LASTTRANSITIONTIME
test podinfo Progressing 15 2019-06-30T14:05:07Z
prod frontend Succeeded 0 2019-06-30T16:15:07Z
prod backend Failed 0 2019-06-30T17:05:07Z
A successful state for this resource looks like this:
status:
canaryWeight: 0
failedChecks: 0
iterations: 0
lastAppliedSpec: "14788816656920327485"
lastPromotedSpec: "14788816656920327485"
conditions:
- lastTransitionTime: "2019-07-10T08:23:18Z"
lastUpdateTime: "2019-07-10T08:23:18Z"
message: Canary analysis completed successfully, promotion finished.
reason: Succeeded
status: "True"
type: Promoted
The possible state values are listed in the docs.
A failed canary will have the promoted status set to false, the reason set to failed, and the last applied spec will be different from the last promoted one.
There is also a CI example:
# update the container image
kubectl set image deployment/podinfo podinfod=stefanprodan/podinfo:3.0.1
# wait for Flagger to detect the change
ok=false
until ${ok}; do
kubectl get canary/podinfo | grep 'Progressing' && ok=true || ok=false
sleep 5
done
# wait for the canary analysis to finish
kubectl wait canary/podinfo --for=condition=promoted --timeout=5m
# check if the deployment was successful
kubectl get canary/podinfo | grep Succeeded
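Beyond grepping the status, the analysis events can be inspected directly; these should work given the setup above (Flagger running in istio-system):
❯ kubectl -n test describe canary/podinfo
❯ kubectl -n istio-system logs deploy/flagger -f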
Template
analysis:
# schedule interval (default 60s)
interval:
# max number of failed metric checks before rollback
threshold:
# max traffic percentage routed to canary
# percentage (0-100)
maxWeight:
# canary increment step
# percentage (0-100)
stepWeight:
# promotion increment step
# percentage (0-100)
stepWeightPromotion:
# total number of iterations
# used for A/B Testing and Blue/Green
iterations:
# threshold of primary pods that need to be available to consider it ready
# before starting rollout. this is optional and the default is 100
# percentage (0-100)
primaryReadyThreshold: 100
# threshold of canary pods that need to be available to consider it ready
# before starting rollout. this is optional and the default is 100
# percentage (0-100)
canaryReadyThreshold: 100
# canary match conditions
# used for A/B Testing
match:
- # HTTP header
# key performance indicators
metrics:
- # metric check
# alerting
alerts:
- # alert provider
# external checks
webhooks:
- # hook
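As a concrete instance of the match/iterations fields above, an A/B-testing style analysis on Istio could look roughly like this (the header name, iteration count and thresholds are illustrative):
analysis:
  interval: 1m
  threshold: 10
  iterations: 10 # fixed number of iterations instead of weight shifting
  match:
    - headers:
        x-canary:
          exact: "insider" # only requests carrying this header hit the canary
  metrics:
    - name: request-success-rate
      thresholdRange:
        min: 99
      interval: 1m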