SELECT
date(order_purchase_timestamp) purchase_date,
sum(
case
when (
order_estimated_delivery_date >= order_delivered_customer_date
) then 1
else 0
end
) success,
sum(
case
when (
order_estimated_delivery_date < order_delivered_customer_date
) then 1
else 0
end
) fail
FROM
olist_orders_dataset
WHERE
order_delivered_customer_date is not null
and order_estimated_delivery_date is not null
and date(order_purchase_timestamp) BETWEEN '2017-01-01' and '2017-01-31'
GROUP BY
date(order_purchase_timestamp)
ORDER BY
date(order_purchase_timestamp) asc
조건에 부합하면 1, 부합하지 않으면 0을 반환해 sum으로 카운트 셀 수 있다.
SELECT
date(order_purchase_timestamp) purchase_date,
case
when (
order_delivered_customer_date <= order_estimated_delivery_date
) then count(*)
else 0
end success,
case
when (
order_delivered_customer_date > order_estimated_delivery_date
) then count(*)
else 0
end fail
case when 안에 count() 함수를 쓸 수 없다.
success에 몰빵되어 카운트 되었는데,
👉 COUNT(*)는 그룹 단위에서 한 번만 실행되기 때문에, 첫 번째로 조건을 만족하는 곳에서 그룹 전체 개수를 반환
👉 첫 번째 행이 success 조건을 만족하면 COUNT(*) 실행 → 그룹 전체 개수를 success에 넣음
👉 첫 번째 행이 fail 조건을 만족하지 않으면 COUNT(*) 자체가 실행되지 않아서 fail은 0이 됨
SELECT
date(order_purchase_timestamp) purchase_date,
count(
case
when (
order_estimated_delivery_date >= order_delivered_customer_date
) then order_id
end
) success,
count(
case
when (
order_estimated_delivery_date < order_delivered_customer_date
) then order_id
end
) fail
FROM
olist_orders_dataset
WHERE
order_delivered_customer_date is not null
and order_estimated_delivery_date is not null
and date(order_purchase_timestamp) BETWEEN '2017-01-01' and '2017-01-31'
GROUP BY
date(order_purchase_timestamp)
ORDER BY
date(order_purchase_timestamp) asc
sum이 아닌 count로 묶을수도 있다.