04 Understanding Google Cloud Security and Operations

Bean · October 9, 2023

Google Cloud Digital Leader

Financial Governance in the Cloud

Fundamentals of cloud cost management

Cloud technology can give organizations the means to make more dynamic decisions and accelerate innovation, but cloud cost management requires attention and real-time monitoring.

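CapEx → OpEx: spending shifts from up-front capital expenditure to ongoing operating expenditure, so costs must be monitored continuously.

When cloud usage is not controlled effectively, inefficiencies follow.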

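To solve this problem, consider the solution through three lenses: People, Process, Technology

People: finance, technology, and business teams need to collaborate; staff the effort with experts from each
Process: understand why cloud resources are being used, where the costs come from, and how the costs will be charged
Technology: Visibility, Accountability, Control, Intelligence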

Total cost of ownership

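TCO: the total you get when every cost of owning and running a system is added together

The cloud can be the more cost-effective option: instead of spending resources on infrastructure, an organization can put more of them into development. (This is one point of view.)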

Best practices for managing Google Cloud costs

Identify the individual or team that will manage costs
Learn the difference between invoices and cost tools
invoice - a bill that shows how much was spent
Use cost management tools for accountability

management tool - explains why the invoice came out the way it did
- Visibility: built-in reporting tools, custom dashboards, pricing calculator
  Cost trends and forecast costs should be transparent
- Accountability: tools that make it possible to share who spent what, and how
- Control: only authorized individuals can use cloud resources; spending limits are set
- Intelligence: smart spending decisions with intelligent recommendations
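
The Accountability and Control points above can be illustrated with a small sketch. It is not a Google Cloud API: the `Budget` class, the `labels` field on each line item, and the 50% / 90% / 100% alert thresholds are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical monthly budget with alert thresholds (fractions of the limit)."""
    name: str
    monthly_limit: float
    thresholds: tuple = (0.5, 0.9, 1.0)  # assumed alert points: 50%, 90%, 100%


def crossed_thresholds(budget: Budget, spend: float) -> list:
    """Control: return every alert threshold the current spend has reached."""
    ratio = spend / budget.monthly_limit
    return [t for t in budget.thresholds if ratio >= t]


def spend_by_label(line_items: list, label: str) -> dict:
    """Accountability: total cost per value of a label such as 'team'."""
    totals = {}
    for item in line_items:
        key = item.get("labels", {}).get(label, "unlabeled")
        totals[key] = totals.get(key, 0.0) + item["cost"]
    return totals


if __name__ == "__main__":
    items = [
        {"cost": 10.0, "labels": {"team": "data"}},
        {"cost": 5.0, "labels": {"team": "web"}},
        {"cost": 2.5},  # a resource nobody labeled
    ]
    print(spend_by_label(items, "team"))
    print(crossed_thresholds(Budget("dev-project", 1000.0), 920.0))
```

Grouping by label answers "who spent it", while the thresholds give an early warning before the limit is reached.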

Security in the cloud

Fundamental terms

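Privacy
The data an individual or organization can access, and who it can be shared with
Security
The policies and procedures for keeping data safe
Compliance
Meeting standards set by third parties (regulatory authorities, international standards organizations)
The more sensitive data a field handles, as in healthcare or finance, the more this matters
Availability
How much of the time the cloud provider guarantees access to data and services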

Google Cloud’s commitments

You own your data, not Google
Google does not sell customer data to third parties
All customer data is encrypted by default
Google Cloud guards against insider access to your data
We never give any government entity “backdoor” access to your data
Our privacy practices are audited against international standards

Today’s cybersecurity challenges

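Traditional on-premises systems use a perimeter-based security approach.
Once someone gets inside the perimeter, they are treated as trusted → they can access all data
With IoT, every node is connected to the network, so every node becomes an entry point.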

Criminal Attack
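Phishing attacks - emails with malicious attachments that trick users into giving up passwords or sharing sensitive data
Physical damage
Physical damage to hard disks or natural disasters; the organization still bears responsibility for the lost data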
Malware, viruses, and ransomware attacks
Data can be lost, damaged, or destroyed by viruses or malware.
Unsecured third-party systems
Lack of expert knowledge

The shared responsibility model

When an organization adopts the cloud, the cloud service provider typically becomes the data processor. The organization is the data controller.

Google Cloud’s multilayer approach to security

Hardware - Google builds its own hardware, including a custom security chip called Titan
Software - a server is not admitted into service until its health is confirmed
Storage - data is encrypted
1. Data is broken into many pieces in memory.
2. These pieces, or “chunks”, are each encrypted with their own data encryption key, or ‘DEK’.
3. These DEKs are then encrypted a second time (“wrapped”) with a key encryption key, or ‘KEK’.
4. Encrypted chunks and wrapped DEKs are distributed across Google’s infrastructure.
Even with one encryption key, an attacker gets only a fragment and cannot reach the whole data
Identity - zero-trust model; identity is verified at every step
Network - encryption in transit
Operations - system monitoring
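
The four storage steps above (chunk, encrypt each chunk with its own DEK, wrap each DEK with a KEK, store the pieces) can be sketched as follows. This is only an illustration of the structure: `_toy_cipher` is a deliberately simple stand-in written for this example and is NOT secure, the 8-byte chunk size is arbitrary, and none of the names come from Google’s implementation. A real system would use an authenticated cipher such as AES-GCM and a key management service.

```python
import hashlib
import secrets

CHUNK_SIZE = 8  # tiny on purpose, so a short message yields several chunks


def _toy_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher: XOR with a SHA-256-derived keystream. NOT secure.

    Applying it twice with the same key returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt(data: bytes, kek: bytes) -> list:
    """Return a list of (encrypted_chunk, wrapped_dek) records."""
    records = []
    for i in range(0, len(data), CHUNK_SIZE):          # step 1: split into chunks
        chunk = data[i:i + CHUNK_SIZE]
        dek = secrets.token_bytes(32)                  # step 2: one DEK per chunk
        encrypted_chunk = _toy_cipher(dek, chunk)
        wrapped_dek = _toy_cipher(kek, dek)            # step 3: wrap DEK with KEK
        records.append((encrypted_chunk, wrapped_dek)) # step 4: store the pair
    return records


def decrypt(records: list, kek: bytes) -> bytes:
    """Unwrap each DEK with the KEK, then decrypt its chunk."""
    out = b""
    for encrypted_chunk, wrapped_dek in records:
        dek = _toy_cipher(kek, wrapped_dek)
        out += _toy_cipher(dek, encrypted_chunk)
    return out


if __name__ == "__main__":
    kek = secrets.token_bytes(32)
    records = encrypt(b"customer record, sensitive", kek)
    print(len(records), "chunks stored")
    print(decrypt(records, kek))
```

Because every chunk has its own DEK, obtaining one DEK exposes one chunk only, which is the point the notes make about fragments.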

Cloud Identity and Access Management

IAM - Who, Can do what, On which resource

Who: Google Account, service account
Can do what: defined by an IAM role (editor, viewer, owner)
IAM has three kinds of roles: Basic, Predefined, Custom
Google Cloud recommends the least-privilege model
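
The "who, can do what, on which resource" idea can be modeled in a few lines. This is a simplified sketch, not the IAM API: the permission names (`read`, `write`, `manage_access`), the member strings, and the resource names are hypothetical, and real roles map to far more granular permissions.

```python
# Hypothetical mapping from the three basic roles to coarse permissions.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "editor": {"read", "write"},
    "owner": {"read", "write", "manage_access"},
}


def is_allowed(bindings: list, who: str, action: str, resource: str) -> bool:
    """Return True if some binding grants `who` a role that permits
    `action` on `resource`. Each binding is (member, role, resource)."""
    return any(
        member == who and res == resource and action in ROLE_PERMISSIONS[role]
        for member, role, res in bindings
    )


if __name__ == "__main__":
    bindings = [
        ("user:analyst@example.com", "viewer", "project-a"),
        ("serviceAccount:deployer@example.com", "editor", "project-a"),
    ]
    # Least privilege: the analyst can read but not write.
    print(is_allowed(bindings, "user:analyst@example.com", "read", "project-a"))
    print(is_allowed(bindings, "user:analyst@example.com", "write", "project-a"))
```

Under least privilege, each member gets the narrowest role that still covers its tasks, so the analyst above holds viewer rather than editor.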

Resource hierarchy

In a cloud environment, a project is the basis for enabling and using Google Cloud capabilities.
Managing APIs, Enabling billing, Adding/removing collaborators, Enabling other Google services

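Resource hierarchy - shows how the IT team organizes the business’s Google Cloud environment and how that service structure maps to the organization’s actual structure.

Domain and Organization - everything managed in Google Cloud falls under these.
The domain is managed through Cloud Identity and handles user profiles
The organization is managed through the Cloud Console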

Projects, folders and labels
A project belongs to the organization, not to the user who created it. Projects are used to group cloud resources, and permissions can be inherited.
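
Permission inheritance can be pictured as walking from a resource up to the organization and collecting every binding on the way. The sketch below assumes a simple parent-pointer tree; the `Node` class and all names in the example are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One level of the hierarchy: organization, folder, or project."""
    name: str
    parent: Optional["Node"] = None
    bindings: set = field(default_factory=set)  # {(member, role), ...}


def effective_bindings(node: Node) -> set:
    """Union of the bindings set on this node and on all of its ancestors."""
    result = set()
    current = node
    while current is not None:
        result |= current.bindings
        current = current.parent
    return result


if __name__ == "__main__":
    org = Node("example-org", bindings={("group:auditors@example.com", "viewer")})
    folder = Node("finance", parent=org)
    project = Node(
        "billing-app",
        parent=folder,
        bindings={("user:dev@example.com", "editor")},
    )
    # The project inherits the auditors' viewer role from the organization.
    print(sorted(effective_bindings(project)))
```

A role granted at the organization level therefore reaches every folder and project beneath it, which is why broad roles are best granted sparingly near the top.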

Monitoring Cloud IT Services and Operations

IT development and operations challenges

503 Service Unavailable - planned maintenance or unexpected system failure.

Unexpected system failure ← may be the result of a team structure issue

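Developers need to be agile. Their goals are to release new features frequently, increase core business value with those features, and ship fixes quickly to improve the overall user experience. Operators, on the other hand, have to keep systems stable, so they often prefer to work more slowly to ensure reliability and consistency.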

Organizations need to adapt their IT operations

Adjust expectations for service availability
Adopt best practices from DevOps and Site Reliability Engineering
This lets teams respond with agility

Availability

Trying to maintain 100% availability drives costs up sharply
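
To see why each step toward 100% gets harder, it helps to turn an availability target into the downtime it allows. The sketch below does that arithmetic; the 30-day window is an assumption chosen for illustration. At 99.9% over 30 days the allowance is about 43.2 minutes, and each additional nine cuts it by a factor of ten.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted by an availability target over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes per 30 days")
```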

Standard practices - use standard practices to measure the service availability customers receive

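Service level agreement // SLA
The level committed to in a contract; a baseline for quality, availability, reliability
Service level objective // SLO
The target value
Service level indicator // SLI
A measure of the service level actually being delivered; common SLIs include reliability, latency, and error rate
error budget - the amount of error that can accumulate over a period. The space between the SLA and the SLO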
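
The relationship between the three terms can be shown with a small calculation: an SLI measured from request counts is compared first with the SLO, then with the SLA. Following the definition above, the gap between the two is treated as the error budget. The targets of 99.95% and 99.9% and the function names are assumptions for illustration.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: percentage of requests that were served successfully."""
    return 100.0 * good_requests / total_requests


def service_status(sli: float, slo: float, sla: float) -> str:
    """Place a measured SLI relative to the internal SLO and the contractual SLA."""
    if sli >= slo:
        return "meeting SLO"
    if sli >= sla:
        return "below SLO, still within SLA"  # error budget is being spent
    return "SLA breached"


if __name__ == "__main__":
    SLO, SLA = 99.95, 99.9  # hypothetical targets; the SLO is stricter than the SLA
    sli = availability_sli(good_requests=99_930, total_requests=100_000)
    print(f"SLI = {sli:.2f}% -> {service_status(sli, SLO, SLA)}")
```

Keeping the SLO stricter than the SLA gives the team a warning zone in which to slow releases before the contract is affected.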

DevOps and SRE

DevOps - a philosophy that seeks to build a more collaborative and accountable culture within developer and operations teams

The five objectives of DevOps

Reduce silos - break down the barriers between teams to strengthen collaboration
Accept failure as normal
Implement gradual change - roll changes out incrementally
Leverage tooling and automation
Measure everything

Site Reliability Engineering (SRE) - the practical implementation of that philosophy: a set of working practices that combines software engineering with operations
The goal of SRE: ultra-scalable and highly reliable software systems.

Reduce silos - shared ownership
Accept failure as normal
Implement gradual change - reduce the cost of failure
Leverage tooling and automation - toil automation
toil - manual, repetitive work tied to running a production service that could be automated
Measure everything

To foster these practices, organizations need: Goal setting, Transparency, Data-driven decision making
SRE shifts the mindset from 100% availability to a high but realistic target such as 99.99%.

DevOps and SRE look toward the same goal: breaking down the barriers between development and operations organizations and delivering better services.

Google Cloud resource monitoring tools

Google Cloud’s operations tools fall into two major categories

Operations-focused tools
- Cloud Monitoring - foundation of SRE
- Cloud Logging - fully managed service for log data, used to gain insight and identify the root cause of issues
- Error Reporting
- Service Monitoring
Application performance management tools
- Cloud Debugger - inspects the state of a running application in real time
- Cloud Trace - distributed tracing system for finding bugs in containerized or microservice architectures
- Cloud Profiler
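
Log-based tools are most useful when applications emit structured entries. A minimal sketch follows, assuming an environment where each line written to standard output is collected and parsed as JSON, with `severity` and `message` treated as well-known fields. The `log` helper and the extra field names are hypothetical and use only the standard library, not a Google client library.

```python
import json
import sys


def log(severity: str, message: str, **fields) -> str:
    """Write one JSON log line to stdout and return it.

    Extra keyword arguments become searchable fields on the entry.
    """
    entry = {"severity": severity, "message": message, **fields}
    line = json.dumps(entry)
    print(line, file=sys.stdout)
    return line


if __name__ == "__main__":
    log("INFO", "checkout started", order_id="A-1001")
    log("ERROR", "payment provider timeout", order_id="A-1001", latency_ms=5000)
```

Carrying the same `order_id` on every entry makes it possible to follow one request across services when looking for a root cause.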
