DBT sources

우상욱·2024년 3월 2일
0

DBT

목록 보기
11/16

What is a dbt source?


  • Name and description of data loaded by EL process
    dbt에서 소스는 EL 프로세스에 의해 로드된 데이터의 이름을 지정하고, 설명하는 기능을 나타냅니다.
    이는 데이터 웨어하우스에 있거나, 들어오려고하는 데이터에 추가 정보를 적용하는 것입니다.
  • Helps define data lineage
  • Tests
  • Documentation

Sources


  • Provides data lineage information
  • Describes the flow of data in warehouse
  • Use the Jinja {{ source() }} function
    • Similar to the {{ ref() }} function
  • Simplifies accessing raw data
select *
from
  {{ source('raw', 'orders') }}

Defining a source


  • In the .yml file
  • File can be models/model_properties.yml
    • Or any other .yml file in the directory
  • Under the source: section
  • Name the source with the - name: option
  • Define each source table with a - name: option under tables:
  • Different options available depending on the data warehouse type, refer to dbt documentation
version: 2

sources:
  - name: raw
    tables:
      - name: phone_orders
      - name: web_orders

Accessing sources


  • Use the `{{ source() }}
  • {{ source(source_name, table_name) }}
  • Provides the proper name of the table in the compiled query
select * from
  {{ source('raw', 'phone_orders') }}
UNION
select * from
  {{ source('raw', 'web_orders') }}
-- dbt compiled
select * from
  'raw'.'phone_orders'
UNION
select * from
  'raw'.'web_orders'

Testing sources


  • You can apply tests to sources
  • Same methods as applying to models
  • Defined in the sources: section instead of models:
  • In .yml file where sources defined
version: 2

sources:
  - name: raw
    tables:
      - name: phone_orders
        columns:
          - name: id
            tests:
              - not_null
              - unique
          - name: web_orders

Documentation


  • Document using the same tools as models
  • Defined in the sources: section
version: 2

sources:
  - name: raw
    tables:
      - name: phone_orders
        description: >
          Sales orders by phone, daily
        columns:
          - name: id
            tests:
              - not_null
              - unique

Model VS Sourc


profile
데이터엔지니어

0개의 댓글