음수랑 NULL이랑 구분할 것!
# 7일의 데이터 출력 ; self join 해서 날짜 차가 6인 조합을 구한다. 31 -> 1로 넘어가는 날이 있기 떄문에 int 형으로 변환해서 푸는 방법은 기각
-- SELECT c1.visited_on, c1.amount
-- FROM Customer c1 INNER JOIN Customer c2
-- ON DATEDIFF()
-- SELECT visited_on
-- FROM Customer
-- WHERE DAY(visited_on) - 6 > 0;
# 01 & 07 ; 01 ~ 07까지의 날짜에 해당하는 amount 총합
-- SELECT visited_on, amount
-- FROM Customer
-- WHERE visited_on between (SELECT TOP 1 visited_on FROM Customer WHERE DAY(visited_on) - 6 > 0)
-- AND (SELECT TOP 1 visited_on FROM Customer);
# 01. make one date to one row
-- SELECT visited_on, SUM(amount) as sum FROM Customer GROUP BY visited_on;
# 02. DATE DIFF = 6
SELECT c1.visited_on, c1.sum, c2.visited_on, c2.sum
FROM (SELECT visited_on, SUM(amount) as sum FROM Customer GROUP BY visited_on) as c1,
(SELECT visited_on, SUM(amount) as sum FROM Customer GROUP BY visited_on) as c2
WHERE DATEDIFF(c1.visited_on, c2.visited_on) BETWEEN 0 and 6;
| visited_on | sum | visited_on | sum |
|---|---|---|---|
| 2019-01-07 | 150 | 2019-01-01 | 100 |
| 2019-01-06 | 140 | 2019-01-01 | 100 |
| 2019-01-05 | 110 | 2019-01-01 | 100 |
| 2019-01-04 | 130 | 2019-01-01 | 100 |
| 2019-01-03 | 120 | 2019-01-01 | 100 |
| 2019-01-02 | 110 | 2019-01-01 | 100 |
| 2019-01-01 | 100 | 2019-01-01 | 100 |
| 2019-01-08 | 80 | 2019-01-02 | 110 |
| 2019-01-07 | 150 | 2019-01-02 | 110 |
| 2019-01-06 | 140 | 2019-01-02 | 110 |
| 2019-01-05 | 110 | 2019-01-02 | 110 |
| 2019-01-04 | 130 | 2019-01-02 | 110 |
| 2019-01-03 | 120 | 2019-01-02 | 110 |
| 2019-01-02 | 110 | 2019-01-02 | 110 |
| 2019-01-09 | 110 | 2019-01-03 | 120 |
| 2019-01-08 | 80 | 2019-01-03 | 120 |
| 2019-01-07 | 150 | 2019-01-03 | 120 |
| 2019-01-06 | 140 | 2019-01-03 | 120 |
| 2019-01-05 | 110 | 2019-01-03 | 120 |
| 2019-01-04 | 130 | 2019-01-03 | 120 |
| 2019-01-03 | 120 | 2019-01-03 | 120 |
| 2019-01-10 | 280 | 2019-01-04 | 130 |
| 2019-01-09 | 110 | 2019-01-04 | 130 |
| 2019-01-08 | 80 | 2019-01-04 | 130 |
| 2019-01-07 | 150 | 2019-01-04 | 130 |
| 2019-01-06 | 140 | 2019-01-04 | 130 |
| 2019-01-05 | 110 | 2019-01-04 | 130 |
| 2019-01-04 | 130 | 2019-01-04 | 130 |
| 2019-01-10 | 280 | 2019-01-05 | 110 |
| 2019-01-09 | 110 | 2019-01-05 | 110 |
| 2019-01-08 | 80 | 2019-01-05 | 110 |
| 2019-01-07 | 150 | 2019-01-05 | 110 |
| 2019-01-06 | 140 | 2019-01-05 | 110 |
| 2019-01-05 | 110 | 2019-01-05 | 110 |
| 2019-01-10 | 280 | 2019-01-06 | 140 |
| 2019-01-09 | 110 | 2019-01-06 | 140 |
| 2019-01-08 | 80 | 2019-01-06 | 140 |
| 2019-01-07 | 150 | 2019-01-06 | 140 |
| 2019-01-06 | 140 | 2019-01-06 | 140 |
| 2019-01-10 | 280 | 2019-01-07 | 150 |
| 2019-01-09 | 110 | 2019-01-07 | 150 |
| 2019-01-08 | 80 | 2019-01-07 | 150 |
| 2019-01-07 | 150 | 2019-01-07 | 150 |
| 2019-01-10 | 280 | 2019-01-08 | 80 |
| 2019-01-09 | 110 | 2019-01-08 | 80 |
| 2019-01-08 | 80 | 2019-01-08 | 80 |
| 2019-01-10 | 280 | 2019-01-09 | 110 |
| 2019-01-09 | 110 | 2019-01-09 | 110 |
| 2019-01-10 | 280 | 2019-01-10 | 280 |
GROUP BY가 실행될 때, 1 row로 결과가 도출되게 된다. 이 때, 완전 압축되는 것은 아니고 만약 HAVING 절에 사용되는 집계 함수가 있으면 임시 데이터로 값을 가지고 있다가 집계 함수를 실행한 후에 완전 압축된다. 이를 중간 단계(Intermediate State)라고 한다.
SELECT c1.visited_on as visited_on, sum(c2.sum) as amount, ROUND(AVG(c2.sum), 2) as average_amount
FROM (SELECT visited_on, SUM(amount) as sum FROM Customer GROUP BY visited_on) as c1,
(SELECT visited_on, SUM(amount) as sum FROM Customer GROUP BY visited_on) as c2
WHERE DATEDIFF(c1.visited_on, c2.visited_on) BETWEEN 0 and 6
GROUP BY c1.visited_on
HAVING COUNT(c2.visited_on) = 7;
| visited_on | amount | average_amount |
|---|---|---|
| 2019-01-07 | 860 | 122.86 |
| 2019-01-08 | 840 | 120 |
| 2019-01-09 | 840 | 120 |
| 2019-01-10 | 1000 | 142.86 |