Course Review

우상욱 · March 2, 2024

DBT


What we've learned


  • dbt commands
    • dbt run, dbt test, dbt -h
  • Projects in dbt
    • General folder structure
    • dbt_project.yml
  • Creating dbt models with SQL
    • Defining the model in SQL files
    • Modifying the configuration in YAML
  • Building on top of existing models
    • Using the Jinja {{ ref() }} function to build lineage
  • Validating our data through testing
    • Creating and applying various tests
      • Built-in
      • Singular tests in SQL
      • Generic tests
  • Documentation
    • Documenting dbt objects automatically via YAML
    • Generating and serving documentation via dbt docs
  • Creating dbt sources
    • Defining lineage
    • Adding testing and documentation
  • Loading reference data with dbt seeds
    • Creating a new seed via YAML configuration
    • Loading CSV files into the data warehouse with dbt seed
  • Creating SCD2-style tracking with dbt snapshot
    • Adding and tracking changes to a dataset via dbt
  • Troubleshooting concepts
    • Models
    • Tests
    • Snapshots
    • Projects
  • Production considerations
    • Using dbt build
    • Fixing errors in unfamiliar projects
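
As a quick recap of the model and lineage items above, here is a minimal sketch of a model that builds on another model through ref(). The file name, the stg_orders model, and the column names are hypothetical and only serve as illustration.

```sql
-- models/orders_daily.sql (hypothetical file, model, and column names)
{{ config(materialized='table') }}

select
    order_date,
    count(*) as order_count
from {{ ref('stg_orders') }}  -- ref() resolves the upstream relation and records the dependency
group by order_date
```

Because the upstream model is referenced with ref() rather than a hard-coded table name, dbt can order the two models correctly during dbt run or dbt build and show the edge in the lineage graph.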

Potential next topics


  • Incremental models
    • Loading partial data changes into the warehouse without reloading all data
  • More advanced SQL
    • CTEs
  • Jinja commands and macros
    • {{ env_var() }}, {% for %}
  • Building models with Python
  • Documentation blocks
  • Production & automation
    • Adding hooks to run tasks automatically
    • Integrating with orchestrators (such as Airflow)
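
To give a feel for the first item in this list, the following is a rough sketch of an incremental model, assuming a hypothetical upstream model stg_events with an event_id key and an event_time column. It is one common pattern, not the only way to filter new rows.

```sql
-- models/events_incremental.sql (hypothetical file, model, and column names)
{{ config(materialized='incremental', unique_key='event_id') }}

select
    event_id,
    event_time,
    payload
from {{ ref('stg_events') }}
{% if is_incremental() %}
  -- on incremental runs, only pick up rows newer than what is already loaded
  where event_time > (select max(event_time) from {{ this }})
{% endif %}
```

On the first run, or with dbt run --full-refresh, the filter is skipped and the whole table is built. On later runs only the newer rows are selected and merged into the existing table.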
