📖 빅쿼리에서 날짜 다루기
🔆 날짜 데이터 유형
| 날짜 데이터 유형 | | 예시 |
|---|
| DATE | 연도-월-일자 | 2017-11-25 |
| DATETIME | 연도-월-일자 시간 | 2017-11-25T11:10:33 |
| TIMESTAMP | 연도-월-일자 시간(타임존 정보) | 2017-11-25 11:10:33 UTC |
| TIME | 시간 | 11:10:33 |
🔆 DATE() / DATETIME() / TIMESTAMP()
SELECT
DATE(2016, 12, 25) AS date_ymd,
DATE(DATETIME '2016-12-25 23:59:59') AS date_dt,
DATE(TIMESTAMP '2016-12-25 05:30:00+07', 'America/Los_Angeles') AS date_tstz;

SELECT
DATETIME(2008, 12, 25, 05, 30, 00) AS datetime)ymdhms,
DATETIME(TIMESTAMP '2008-12-25 05:30:00+00', 'America/Los_Angeles') AS datetime_tstz;

☁️ DATE() / DATETIME() / TIMESTAMP() 실습
SELECT
order_purchase_timestamp,
DATE(order_purchase_timestamp) AS ord_date,
DATETIME(order_purchase_timestamp) AS ord_dt,
TIMESTAMP(order_purchase_timestamp) AS ord_ts,
TIMESTAMP(DATE(order_purchase_timestamp)) AS ord_date_ts,
TIME(order_purchase_timestamp) AS ord_time
FROM `olist.olist_orders`

🔆 날짜에서 일부분만 추출하고 싶다면?
EXTRACT(‘추출할 부분’ FROM ‘날짜 컬럼’)
| 추출할 부분 | 의미 | 결과(정수) |
|---|
| YEAR | 연도 | 2017 |
| MONTH | 월 | 11 |
| DAY | 일 | 25 |
| DAYOFWEEK | 요일(0:일요일 ~ 6: 토요일) | 1 |
| WEEK | 주(기본 일요일 시작) | 13 |
| WEEK(MONDAY) | 주(월요일 시작) | 12 |
| QUARTER | 분기 | 2 |
| HOUR | 시 | 1 |
| MINUTE | 분 | 30 |
| SECOND | 초 | 55 |

SELECT
order_purchase_timestamp,
EXTRACT(YEAR FROM order_purchase_timestamp) AS YEAR,
EXTRACT(QUARTER FROM order_purchase_timestamp) AS QUARTER,
EXTRACT(MONTH FROM order_purchase_timestamp) AS MONTH,
EXTRACT(DAY FROM order_purchase_timestamp) AS DAY,
EXTRACT(DAYOFWEEK FROM order_purchase_timestamp) AS DAYOFWEEK,
EXTRACT(WEEK FROM order_purchase_timestamp) AS WEEK,
EXTRACT(WEEK(MONDAY) FROM order_purchase_timestamp) AS WEEK_MONDAY,
EXTRACT(HOUR FROM order_purchase_timestamp) AS HOUR,
EXTRACT(MINUTE FROM order_purchase_timestamp) AS MINUTE,
EXTRACT(SECOND FROM order_purchase_timestamp) AS SECOND
FROM `olist.olist_orders`
WHERE
DATE(order_purchase_timestamp) = '2017-11-26'

🔆 날짜에서 원하는 정보까지만 남기고 싶다면?
DATE_TRUNC(‘날짜’, ‘남기는 부분’)
DATETIME_TRUNC(‘날짜’, ‘남기는 부분’)
TIMESTAMP_TRUNC(‘날짜’, ‘남기는 부분’)

| 추출할 부분 | 의미 | 결과(날짜) | 예시(2017-11-26 16:13:13 UTC) |
|---|
| YEAR | 연도 | 해당 연도의 첫 날 | 2017-01-01 00:00:00 UTC |
| MONTH | 월 | 해당 연도, 월의 첫 날 | 2017-11-01 00:00:00 UTC |
| DAY | 일 | 해당 날짜의 0시 0분 | 2017-11-26 00:00:00 UTC |
| WEEK | 주(기본 일요일 시작) | 해당 주차의 첫 날(일) | 2017-11-26 00:00:00 UTC |
| WEEK(MONDAY) | 주(월요일 시작) | 해당 주차의 첫 날(월) | 2017-11-20 00:00:00 UTC |
| QUARTER | 분기 | 해당 분기의 첫 날 | 2017-10-01 00:00:00 UTC |
| HOUR | 시 | 해당 날짜의 시 | 2017-11-26 16:00:00 UTC |
| MINUTE | 분 | 해당 날짜의 시, 분 | 2017-11-26 16:13:00 UTC |
| SECOND | 초 | 해당 날짜의 시, 분, 초 | 2017-11-26 16:13:13 UTC |
☁️ TIMESTAMP_TRUNC 실습
SELECT
order_purchase_timestamp,
TIMESTAMP_TRUNC(order_purchase_timestamp, YEAR) AS YEAR,
TIMESTAMP_TRUNC(order_purchase_timestamp,QUARTER) AS QUARTER,
TIMESTAMP_TRUNC(order_purchase_timestamp,MONTH) AS MONTH,
TIMESTAMP_TRUNC(order_purchase_timestamp,DAY) AS DAY,
TIMESTAMP_TRUNC(order_purchase_timestamp,WEEK) AS WEEK,
TIMESTAMP_TRUNC(order_purchase_timestamp,WEEK(MONDAY)) AS WEEK_MONDAY,
TIMESTAMP_TRUNC(order_purchase_timestamp,HOUR) AS HOUR,
TIMESTAMP_TRUNC(order_purchase_timestamp,MINUTE) AS MINUTE,
TIMESTAMP_TRUNC(order_purchase_timestamp,SECOND) AS SECOND
FROM `olist.olist_orders`
WHERE
DATE(order_purchase_timestamp) = '2017-11-26'

SELECT
order_purchase_timestamp,
EXTRACT(YEAR FROM order_purchase_timestamp) AS YEAR1,
TIMESTAMP_TRUNC(order_purchase_timestamp, YEAR) AS YEAR2,
EXTRACT(MONTH FROM order_purchase_timestamp) AS MONTH1,
TIMESTAMP_TRUNC(order_purchase_timestamp, MONTH) AS MONTH2
FROM `olist.olist_orders`
WHERE
DATE(order_purchase_timestamp) = '2017-11-26'

🔆 한번에 주석 처리하는 방법
- 원하는 블록 구간 drag →
ctrl + /
☁️ 종합 쿼리 2번째에 날짜 정보 붙이기
SELECT
ord.order_approved_at,
DATE(ord.order_approved_at) AS ord_date,
DATETIME(ord.order_approved_at) AS ord_dt,
TIMESTAMP(ord.order_approved_at) AS ord_ts,
ord.order_id,
ord.order_status,
ord.customer_id,
cust.customer_unique_id,
tb.ord_amt,
tb.prd_cnt
FROM `olist.olist_orders` AS ord
LEFT JOIN `olist.olist_customers` AS cust
ON ord.customer_id = cust.customer_id
INNER JOIN tb
ON ord.order_id = tb.order_id

SELECT
ord.order_approved_at,
DATE(ord.order_approved_at) AS ord_date,
DATETIME(ord.order_approved_at) AS ord_dt,
TIMESTAMP(ord.order_approved_at) AS ord_ts,
EXTRACT(YEAR FROM ord.order_approved_at) AS ord_year,
EXTRACT(MONTH FROM ord.order_approved_at) AS ord_month,
TIMESTAMP_TRUNC(ord.order_approved_at, YEAR) AS ord_year_t,
TIMESTAMP_TRUNC(ord.order_approved_at, MONTH) AS ord_year_m,
ord.order_id,
ord.order_status,
ord.customer_id,
cust.customer_unique_id,
tb.ord_amt,
tb.prd_cnt
FROM `olist.olist_orders` AS ord
LEFT JOIN `olist.olist_customers` AS cust
ON ord.customer_id = cust.customer_id
INNER JOIN tb
ON ord.order_id = tb.order_id

☁️ 종합 쿼리 3번째에 날짜 정보 붙이기
SELECT
ord_year,
SUM(ord_amt) AS `총 매출`,
COUNT(DISTINCT order_id) AS `총 주문수`,
COUNT(prd_cnt) AS `총 판매상품수`,
SUM(ord_amt) / COUNT(DISTINCT order_id) AS `주문당 평균 가격`,
COUNT(prd_cnt) / COUNT(DISTINCT order_id) AS `평균판매상품수`,
SUM(ord_amt) / COUNT(prd_cnt) AS `제품개당평균가격`,
COUNT(DISTINCT customer_unique_id) AS `주문고객수`,
COUNT(DISTINCT order_id) / COUNT(DISTINCT customer_unique_id) AS `주문빈도`
FROM base
GROUP BY ord_year
ORDER BY ord_year
- Null 값이 있는 이유 : olist_orders 원본에서 일부 행의 날짜가 Null값이기 때문

📖 Appendix, 빅쿼리의 유용한 함수
🔆 column이 중요한 빅쿼리

🔆 EXCEPT : 필요없는 컬럼은 빼고 불러오자
SELECT * EXCEPT('제외할 컬럼')
🔆 REPLACE : 불러올 때 간단하게 바꿔주자
SELECT * REPLACE('대체할 값' AS '컬럼명')
☁️ 실습
SELECT *
FROM `olist.olist_order_items`
LIMIT 100;

SELECT * EXCEPT(shipping_limit_date, freight_value)
FROM `olist.olist_order_items`;

SELECT * REPLACE(price * 10000 AS price)
FROM `olist.olist_order_items`;

🔆 데이터 타입을 바꾸는 CAST 그리고 SAFE_CAST
CAST('값' AS '데이터타입')
SAFE_CAST('값' AS '데이터타입')
CAST('123' AS INT64)
SAFE_CAST('123' AS INT64)
CAST('abc' AS INT64)
SAFE_CAST('abc' AS INT64)
🔆 오류를 발생시키지 않는 SAFE 연산
- 에러 있을 경우 에러를 발생 시키지 않고 NULL로 반환함
SAFE_ADD(X, Y) = X + Y
SAFE_SUBTRACT(X, Y) = X - Y
SAFE_MULTIPLY(X, Y) = X * Y
SAFE_DIVIDE(X, Y) = X / Y

☁️ CAST & SAFE_CAST 실습
SELECT
CAST('123' AS INT64) AS cast_result,
SAFE_CAST('123' AS INT64) AS safe_result;

SELECT
CAST('abc' AS INT64) AS cast_result;

SELECT
SAFE_CAST('abc' AS INT64) AS safe_result;

☁️ SAFE_DIVIDE
SELECT
SAFE_DIVIDE(5, 0)

SELECT
SAFE_DIVIDE(5, 0),
IFNULL(SAFE_DIVIDE(5, 0), 0)
