Documentation is often overlooked in data engineering and data warehousing.
Why document?
- Sharing data details with other consumers
- Centralizing sources of documentation
- Providing details for updates / changes / etc
- Creating examples, suggestions for use, SLA details
Creating documentation in dbt
- Can provide documentation with model definitions
- Can add documentation about columns within models
- Automatically shows data lineage / DAG
- Documents any tests / validations
- Lets you view generated warehouse information
  - Column data types
  - Data sizes
version: 2
models:
  - name: taxi_rides_raw
    description: Yellow Taxi raw data
    access: public
  - name: avg_fare_per_day
    description: Average ride per day
    access: public
Generating documentation in dbt
- dbt docs
- dbt docs -h: shows help for the dbt docs command
- dbt docs generate
  - Creates the documentation website based on the project
  - Should be run after dbt run
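The run-then-generate order can also be scripted. The following is a minimal Python sketch, assuming dbt is installed and on the PATH and that the project writes its artifacts to the default target/ directory. The names build_docs, runner, and DOC_ARTIFACTS are hypothetical and introduced here only for illustration.

```python
import subprocess
from pathlib import Path

# Files that `dbt docs generate` writes into the target directory.
DOC_ARTIFACTS = ("index.html", "manifest.json", "catalog.json")


def build_docs(project_dir=".", runner=subprocess.run):
    """Run `dbt run`, then `dbt docs generate`; return the names of missing artifacts.

    `runner` is a hypothetical injection point so the command sequence can be
    exercised without a real dbt installation.
    """
    for command in (["dbt", "run"], ["dbt", "docs", "generate"]):
        # check=True raises CalledProcessError, so docs are not generated
        # if the models failed to build.
        runner(command, cwd=project_dir, check=True)

    # Assumption: the project uses the default target/ output directory.
    target = Path(project_dir) / "target"
    return [name for name in DOC_ARTIFACTS if not (target / name).exists()]
```

Calling build_docs(".") from the project root would run both commands in order and return an empty list when all three files were produced.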
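Accessing documentation
- Web browser
- dbt docs serve
  - Starts a web server on the local system and provides access to the documentation
  - Convenient, but should only be used locally and in development environments; it was not designed with security in mind
- Copy content to another hosting service
  - dbt Cloud
  - Amazon S3
  - Nginx / Apache / etc.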
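Because the generated site is static, hosting it elsewhere mostly comes down to copying a few files. The sketch below gathers index.html, manifest.json, and catalog.json into a separate folder that could then be uploaded to a bucket or placed under a web server's document root. It assumes the default target/ directory; collect_docs_site, SITE_FILES, and the publish folder are hypothetical names, and the upload step itself is left out.

```python
import shutil
from pathlib import Path

# The static docs site consists of these three generated files.
SITE_FILES = ("index.html", "manifest.json", "catalog.json")


def collect_docs_site(target_dir, publish_dir):
    """Copy the generated docs files into publish_dir and return the copied paths."""
    target = Path(target_dir)
    publish = Path(publish_dir)
    publish.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in SITE_FILES:
        source = target / name
        if not source.is_file():
            # Fail early instead of publishing an incomplete site.
            raise FileNotFoundError(
                f"{source} is missing; run `dbt docs generate` first"
            )
        copied.append(Path(shutil.copy2(source, publish / name)))
    return copied
```

Copying only these files, rather than the whole target/ directory, keeps compiled SQL and run logs out of the published site.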
Documentation example
- View
- Models
- Description information
- Column details
- Lineage graphs
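The same information shown on the docs site (models, descriptions, column details, lineage) is stored in the generated manifest.json, so it can also be inspected without a browser. The following is a rough sketch, assuming the manifest sits at the default target/manifest.json path; summarize_models and load_manifest are hypothetical helpers, and only a handful of manifest fields are read.

```python
import json
from pathlib import Path


def load_manifest(path="target/manifest.json"):
    """Load a dbt manifest from disk (the default path is an assumption)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_models(manifest):
    """Return one summary dict per model found in a dbt manifest."""
    summaries = []
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        summaries.append({
            "name": node["name"],
            "description": node.get("description", ""),
            # Map each documented column name to its description.
            "columns": {
                col_name: col.get("description", "")
                for col_name, col in node.get("columns", {}).items()
            },
            # Upstream nodes: the edges drawn in the lineage graph.
            "depends_on": node.get("depends_on", {}).get("nodes", []),
        })
    return sorted(summaries, key=lambda m: m["name"])
```

Running summarize_models(load_manifest()) after dbt docs generate would list each model with its description, documented columns, and upstream dependencies.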