By default, pip install dbt installs dbt Cloud, so dbt-core must be installed explicitly:
pip install dbt-core
# A warehouse adapter is also required; for Snowflake (the adapter used in this setup):
pip install dbt-snowflake
# check
> dbt --version
Core:
- installed: 1.7.13
- latest: 1.8.0 - Update available!
Your version of dbt-core is out of date!
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation
> dbt init {project_name}
## output
> tree
.
├─dags
├─config
├─dbt
│ ├─logs
│ └─{project_name}
│   ├─analyses
│   ├─macros
│   ├─models
│   │ └─example
│   ├─seeds
│   ├─snapshots
│   └─tests
├─docker-compose.yaml
├─Dockerfile
└─entrypoint.sh
profiles.yml
profiles.yml location on Windows: C:\Users\Username\.dbt
profiles.yml location on Ubuntu: /home/airflow/.dbt
profiles.yml settings
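As an illustration only — every value below is a placeholder, and the snowflake type is an assumption based on the adapter that shows up in the run logs later in this post — a minimal profiles.yml might look like:

```yaml
{project_name}:                 # must match the `profile:` key in dbt_project.yml
  target: dev
  outputs:
    dev:
      type: snowflake           # assumed adapter; replace to match your warehouse
      account: <account_identifier>
      user: <username>
      password: <password>
      role: <role>
      database: <database>
      warehouse: <warehouse>
      schema: <schema>
      threads: 4
```

Run dbt debug afterwards to verify that the profile resolves and the connection works.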
packages to install
> tree
.
├─dags
├─config
├─dbt
│ ├─logs
│ └─{project_name}
│   ├─analyses
│   ├─macros
│   ├─models
│   │ └─example
│   ├─seeds
│   ├─snapshots
│   ├─tests
│   ├─dbt_project.yml
│   └─packages.yml
├─docker-compose.yaml
├─Dockerfile
└─entrypoint.sh
Create a packages.yml file at the same level as dbt_project.yml.
packages:
  - package: fivetran/sap_source
    version: [">=0.1.0", "<0.2.0"]
  - package: fivetran/fivetran_utils
    version: [">=0.4.0", "<0.5.0"]
  - package: dbt-labs/dbt_utils
    version: [">=1.3.0", "<2.0.0"]
  - package: dbt-labs/spark_utils
    version: [">=0.3.0", "<0.4.0"]
Run dbt deps to install the packages:
> dbt deps
select
customer_id,
count(*) as order_count
from
orders
group by
customer_id;
# Activate the environment where dbt is installed
> conda activate airflow
# Run all models
> dbt run
# Run specific model
> dbt run --models {model_name}
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- else -%}
        {{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
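This override makes dbt use a configured custom schema name as-is, instead of its default behavior of concatenating it as {target_schema}_{custom_schema}. The macro's branch logic, sketched as a hypothetical Python helper for illustration only (not part of dbt):

```python
def generate_schema_name(custom_schema_name, default_schema):
    """Mirror of the overridden dbt macro: fall back to the target
    schema when no custom schema is configured, otherwise use the
    custom name with surrounding whitespace trimmed."""
    if custom_schema_name is None:
        return default_schema
    return custom_schema_name.strip()

print(generate_schema_name(None, "analytics"))   # -> analytics
print(generate_schema_name(" L1 ", "analytics")) # -> L1
```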
-- example
{{
    config(
        materialized='incremental',
        unique_key='id',
        incremental_strategy='delete+insert',
        schema=generate_schema_name('L1'),
        alias='new_table'
    )
}}
WITH TARGET_DATA AS (
    SELECT *
    FROM tablename
    {% if is_incremental() %}
    WHERE START_DATE < current_date AND END_DATE > current_date
    {% endif %}
),
...
SOURCE_DATA AS (...)
SELECT *
FROM SOURCE_DATA
[Model explanation]
[Caution❗️]
The WHERE filter wrapped in {% if is_incremental() %} ~ {% endif %} is applied only on incremental runs: is_incremental() is false on the first run, when the target table does not yet exist, or when --full-refresh is passed, and in those cases the model is fully rebuilt.
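What the delete+insert strategy does on an incremental run can be illustrated with a toy Python sketch (hypothetical in-memory data, not dbt code): rows in the target whose unique_key appears in the new batch are deleted, then the whole batch is inserted.

```python
def delete_insert(target, batch, key="id"):
    """Toy model of dbt's delete+insert incremental strategy:
    drop target rows whose key appears in the new batch,
    then append every batch row."""
    incoming_keys = {row[key] for row in batch}
    kept = [row for row in target if row[key] not in incoming_keys]
    return kept + batch

existing = [{"id": 1, "val": "old"}, {"id": 2, "val": "keep"}]
new_batch = [{"id": 1, "val": "new"}, {"id": 3, "val": "added"}]
print(delete_insert(existing, new_batch))
# -> [{'id': 2, 'val': 'keep'}, {'id': 1, 'val': 'new'}, {'id': 3, 'val': 'added'}]
```

Note that id=1 is replaced rather than duplicated, which is exactly why unique_key must be set for this strategy.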
# Rebuild the model from scratch
> dbt run --full-refresh --select {model_name}
# Normal incremental run
> dbt run --select {model_name}

airflow@:/opt/airflow/dbt/saleshub$ dbt run --models dim_project_info
03:45:43 Running with dbt=1.7.13
03:45:47 Registered adapter: snowflake=1.7.3
03:45:50 Encountered an error:
'project://macros/generate_schema_name.sql'
...
KeyError: 'project://macros/generate_schema_name.sql'
Since I moved the dbt profiles.yml file to a different location inside the Airflow container, dbt's cache / compiled artifacts got out of sync.
Solution 1) Clean up the dbt cache
# in the airflow container
$ dbt clean
$ dbt compile