실험 환경
결론부터 말하면
(2025-11-14 추가)
aurora serverlessV2 (ram 2gb)로 해봤는데, exists가 더 빨랐습니다.
음.. 뭐지
SELECT COUNT(*) FROM item i1_0
WHERE EXISTS (
SELECT 1
FROM color_item c1_0
JOIN color_item_size_stock c2_0
ON c1_0.color_item_id = c2_0.color_item_id
AND c2_0.stock > 0
WHERE c1_0.item_id = i1_0.item_id
);
//exlpain analyze
-> Aggregate: count(0)
(cost=2.55e+9 rows=1)
(actual time=835..835 rows=1 loops=1)
-> Nested loop inner join
(cost=1.27e+9 rows=12.7e+9)
(actual time=725..829 rows=100000 loops=1)
-> Covering index scan on i1_0 using FK2n9w8d0dp4bsfra9dcg0046l4
(cost=10632 rows=95818)
(actual time=3.28..22.5 rows=100000 loops=1)
-> Single-row index lookup on <subquery2> using <auto_distinct_key> (item_id=i1_0.item_id)
(cost=100063..100063 rows=1)
(actual time=0.00778..0.00788 rows=1 loops=100000)
-> Materialize with deduplication
(cost=100063..100063 rows=132995)
(actual time=722..722 rows=100000 loops=1)
-> Nested loop inner join
(cost=86763 rows=132995)
(actual time=2.98..583 rows=400000 loops=1)
-> Filter: (c2_0.stock > 0)
(cost=40215 rows=132995)
(actual time=0.443..155 rows=400000 loops=1)
-> Table scan on c2_0
(cost=40215 rows=399024)
(actual time=0.0379..116 rows=400000 loops=1)
-> Single-row index lookup on c1_0 using PRIMARY (color_item_id=c2_0.color_item_id)
(cost=0.25 rows=1)
(actual time=855e-6..890e-6 rows=1 loops=400000)
select count(distinct i1_0.item_id) from item i1_0
join color_item c1_0 on i1_0.item_id=c1_0.item_id
join color_item_size_stock c2_0 on c1_0.color_item_id=c2_0.color_item_id
where c2_0.stock>0;
//exlpain analyze
-> Aggregate: count(distinct i1_0.item_id)
(cost=146611 rows=1)
(actual time=1110..1110 rows=1 loops=1)
-> Nested loop inner join
(cost=133311 rows=132995)
(actual time=0.0591..990 rows=400000 loops=1)
-> Nested loop inner join
(cost=86763 rows=132995)
(actual time=0.0516..600 rows=400000 loops=1)
-> Filter: (c2_0.stock > 0)
(cost=40215 rows=132995)
(actual time=0.0409..160 rows=400000 loops=1)
-> Table scan on c2_0
(cost=40215 rows=399024)
(actual time=0.0396..120 rows=400000 loops=1)
-> Single-row index lookup on c1_0 using PRIMARY (color_item_id=c2_0.color_item_id)
(cost=0.25 rows=1)
(actual time=871e-6..909e-6 rows=1 loops=400000)
-> Single-row covering index lookup on i1_0 using PRIMARY (item_id=c1_0.item_id)
(cost=0.25 rows=1)
(actual time=748e-6..786e-6 rows=1 loops=400000)
EXISTS 는 첫 매칭 row 찾으면 멈춤 → 필요 없는 탐색 생략해서 빠름
정확히 뭐 때문에 더 빠른지 알아보려 analyze 비교했지만 모르겠습니다.
//추가한 인덱스
ALTER TABLE color_item_size_stock
ADD INDEX idx_color_item_id_stock (color_item_id, stock);
//스캔 방식 바뀐 부분
//적용 전
Table scan on c2_0
(cost=40215 rows=399024)
(actual time=0.0396..120 rows=400000 loops=1)
//적용 후
Covering index scan on c2_0 using idx_color_item_id_stock
(cost=40215 rows=399024)
(actual time=0.0236..98.6 rows=400000 loops=1)
두 쿼리 모두 c2 테이블 스캔 방식이 바뀌었습니다. 근데 속도는 동일했습니다.
→ 인덱스 안 쓰기로 결정
SELECT COUNT(*)
FROM (
SELECT distinct i1_0.item_id
FROM color_item_size_stock c2_0
JOIN color_item c1_0 ON c1_0.color_item_id = c2_0.color_item_id
JOIN item i1_0 ON i1_0.item_id = c1_0.item_id
WHERE c2_0.stock > 0
) AS t;
//explain analyze
-> Aggregate: count(0)
(cost=176540..176540 rows=1)
(actual time=1118..1118 rows=1 loops=1)
-> Table scan on t
(cost=161575..163240 rows=132995)
(actual time=1101..1112 rows=100000 loops=1)
-> Materialize
(cost=161575..161575 rows=132995)
(actual time=1101..1101 rows=100000 loops=1)
-> Table scan on <temporary>
(cost=146611..148276 rows=132995)
(actual time=1079..1091 rows=100000 loops=1)
-> Temporary table with deduplication
(cost=146611..146611 rows=132995)
(actual time=1079..1079 rows=100000 loops=1)
-> Nested loop inner join
(cost=133311 rows=132995)
(actual time=0.0758..946 rows=400000 loops=1)
-> Nested loop inner join
(cost=86763 rows=132995)
(actual time=0.0679..583 rows=400000 loops=1)
-> Filter: (c2_0.stock > 0)
(cost=40215 rows=132995)
(actual time=0.0498..161 rows=400000 loops=1)
-> Table scan on c2_0
(cost=40215 rows=399024)
(actual time=0.0484..123 rows=400000 loops=1)
-> Single-row index lookup on c1_0 using PRIMARY (color_item_id=c2_0.color_item_id)
(cost=0.25 rows=1)
(actual time=839e-6..874e-6 rows=1 loops=400000)
-> Single-row covering index lookup on i1_0 using PRIMARY (item_id=c1_0.item_id)
(cost=0.25 rows=1)
(actual time=696e-6..732e-6 rows=1 loops=400000)
페이징 쿼리와 달리, 세미조인이 안 일어나서 느린 것 같습니다.
SELECT i1_0.item_id, i1_0.name, i1_0.original_price, i1_0.now_price
FROM item i1_0
WHERE EXISTS (
SELECT 1
FROM color_item c1_0
JOIN color_item_size_stock c2_0 ON c1_0.color_item_id = c2_0.color_item_id
WHERE c2_0.stock > 0
AND c1_0.item_id = i1_0.item_id
)
ORDER BY i1_0.item_id DESC
LIMIT 0,10;
//explain analyze
-> Limit: 10 row(s)
(cost=142577 rows=10)
(actual time=718..718 rows=10 loops=1
-> Nested loop inner join
(cost=142577 rows=1.33e+6)
(actual time=718..718 rows=10 loops=1)
-> Index scan on i1_0 using PRIMARY (reverse)
(cost=0.11 rows=10)
(actual time=0.414..0.418 rows=10 loops=1)
-> Single-row index lookup on <subquery2> using <auto_distinct_key> (item_id=i1_0.item_id)
(cost=100063..100063 rows=1)
(actual time=71.7..71.7 rows=1 loops=10)
-> Materialize with deduplication
(cost=100063..100063 rows=132995)
(actual time=717..717 rows=100000 loops=1)
-> Nested loop inner join
(cost=86763 rows=132995)
(actual time=0.0476..583 rows=400000 loops=1)
-> Filter: (c2_0.stock > 0)
(cost=40215 rows=132995)
(actual time=0.0349..163 rows=400000 loops=1)
-> Table scan on c2_0
(cost=40215 rows=399024)
(actual time=0.033..125 rows=400000 loops=1)
-> Single-row index lookup on c1_0 using PRIMARY (color_item_id=c2_0.color_item_id)
(cost=0.25 rows=1)
(actual time=835e-6..871e-6 rows=1 loops=400000)
인덱스가 적용됐지만 느림
SELECT DISTINCT i1_0.item_id,
i1_0.name,
i1_0.original_price,
i1_0.now_price
FROM item i1_0
JOIN color_item c1_0
ON i1_0.item_id = c1_0.item_id
JOIN color_item_size_stock c2_0
ON c1_0.color_item_id = c2_0.color_item_id
WHERE c2_0.stock > 0
ORDER BY i1_0.item_id DESC
LIMIT 0, 10;
//explain analyze
-> Limit: 10 row(s)
(cost=119055..119058 rows=9.26)
(actual time=7.16..7.16 rows=10 loops=1)
-> Table scan on <temporary>
(cost=119055..119058 rows=9.26)
(actual time=6.47..6.47 rows=10 loops=1)
-> Temporary table with deduplication
(cost=119055..119055 rows=9.26)
(actual time=6.46..6.46 rows=10 loops=1)
-> Limit table size: 10 unique row(s)
-> Nested loop inner join
(cost=119054 rows=9.26)
(actual time=0.735..0.993 rows=19 loops=1)
-> Nested loop inner join
(cost=24002 rows=13.9)
(actual time=0.715..0.772 rows=19 loops=1)
-> Index scan on i1_0 using PRIMARY (reverse)
(cost=0.0767 rows=7)
(actual time=0.7..0.716 rows=10 loops=1)
-> Covering index lookup on c1_0 using FKaiqat78ya0grkolk7yfk015xp (item_id=i1_0.item_id)
(cost=0.25 rows=1.99)
(actual time=0.00332..0.00514 rows=1.9 loops=10)
-> Limit: 1 row(s)
(cost=0.498 rows=0.664)
(actual time=0.0111..0.0112 rows=1 loops=19)
-> Filter: (c2_0.stock > 0)
(cost=0.498 rows=0.664)
(actual time=0.011..0.011 rows=1 loops=19)
-> Index lookup on c2_0 using FKfsw9w4rngl3jbfn8oo71ot83d (color_item_id=c1_0.color_item_id)
(cost=0.498 rows=1.99)
(actual time=0.0104..0.0104 rows=1 loops=19)
카운트 쿼리와 다르게, 페이징 쿼리는 exits보다 distinct가 더 빠른걸 보실 수 있습니다.
원인은 Limit table size: 10 unique row(s) 인 것 같습니다.
입증하기 위해, limit 조건을 없애고 비교해보니 맞음
→ exists (0.9s), distinct (1.7s)
(exits, distinct 둘 다 인덱스 적용된 상황)
SELECT i1_0.item_id, i1_0.name, i1_0.original_price, i1_0.now_price
FROM item i1_0
WHERE EXISTS (
SELECT 1
FROM color_item c1_0
JOIN color_item_size_stock c2_0 ON c1_0.color_item_id = c2_0.color_item_id
WHERE c1_0.item_id = i1_0.item_id
AND c2_0.stock > 0
)
ORDER BY i1_0.now_price DESC
LIMIT 0,10;
//explain analyze
-> Limit: 10 row(s)
(cost=1.27e+9 rows=10)
(actual time=801..801 rows=10 loops=1)
-> Nested loop inner join
(cost=1.27e+9 rows=12.7e+9)
(actual time=801..801 rows=10 loops=1)
-> Sort: i1_0.now_price DESC
(cost=10632 rows=95818)
(actual time=93.1..93.1 rows=10 loops=1)
-> Table scan on i1_0
(cost=10632 rows=95818)
(actual time=0.0471..45.3 rows=100000 loops=1)
-> Single-row index lookup on <subquery2> using <auto_distinct_key> (item_id=i1_0.item_id)
(cost=100063..100063 rows=1)
(actual time=70.7..70.7 rows=1 loops=10)
-> Materialize with deduplication
(cost=100063..100063 rows=132995)
(actual time=707..707 rows=100000 loops=1)
-> Nested loop inner join
(cost=86763 rows=132995)
(actual time=0.0673..577 rows=400000 loops=1)
-> Filter: (c2_0.stock > 0)
(cost=40215 rows=132995)
(actual time=0.0531..160 rows=400000 loops=1)
-> Table scan on c2_0
(cost=40215 rows=399024)
(actual time=0.0497..121 rows=400000 loops=1)
-> Single-row index lookup on c1_0 using PRIMARY (color_item_id=c2_0.color_item_id)
(cost=0.25 rows=1)
(actual time=829e-6..865e-6 rows=1 loops=400000)
now_price 단일 인덱스가 적용되도 속도는 동일했습니다.
SELECT DISTINCT i1_0.item_id,
i1_0.name,
i1_0.original_price,
i1_0.now_price
FROM item i1_0
JOIN color_item c1_0
ON i1_0.item_id = c1_0.item_id
JOIN color_item_size_stock c2_0
ON c1_0.color_item_id = c2_0.color_item_id
WHERE c2_0.stock > 0
ORDER BY i1_0.now_price DESC
LIMIT 0, 10;
//explain analyze
-> Limit: 10 row(s)
(actual time=2057..2057 rows=10 loops=1)
-> Sort: i1_0.now_price DESC, limit input to 10 row(s) per chunk
(actual time=2057..2057 rows=10 loops=1)
-> Table scan on <temporary>
(cost=199447..201034 rows=126720)
(actual time=2004..2037 rows=100000 loops=1)
-> Temporary table with deduplication
(cost=199447..199447 rows=126720)
(actual time=2004..2004 rows=100000 loops=1)
-> Nested loop inner join
(cost=186775 rows=126720)
(actual time=0.959..1475 rows=200000 loops=1)
-> Nested loop inner join
(cost=53706 rows=190742)
(actual time=0.936..445 rows=200000 loops=1)
-> Table scan on i1_0
(cost=10632 rows=95818)
(actual time=0.0549..58.6 rows=100000 loops=1)
-> Covering index lookup on c1_0 using FKaiqat78ya0grkolk7yfk015xp (item_id=i1_0.item_id)
(cost=0.25 rows=1.99)
(actual time=0.0029..0.0036 rows=2 loops=100000)
-> Limit: 1 row(s)
(cost=0.498 rows=0.664)
(actual time=0.0049..0.00493 rows=1 loops=200000)
-> Filter: (c2_0.stock > 0)
(cost=0.498 rows=0.664)
(actual time=0.00476..0.00476 rows=1 loops=200000)
-> Index lookup on c2_0 using FKfsw9w4rngl3jbfn8oo71ot83d (color_item_id=c1_0.color_item_id)
(cost=0.498 rows=1.99)
(actual time=0.00459..0.00459 rows=1 loops=200000)
최신순 정렬은 인덱스가 있어서 빨랐지만,
가격순 정렬은 인덱스가 없어서 정렬하느라 느린 것 같습니다.
바로 추가해보겠습니다.
//추가한 인덱스
ALTER TABLE item ADD INDEX idx_now_price(now_price);
//인덱스 적용 후 explain analyze
-> Limit: 10 row(s)
(cost=119055..119058 rows=9.26)
(actual time=0.335..0.337 rows=10 loops=1)
-> Table scan on <temporary>
(cost=119055..119058 rows=9.26)
(actual time=0.335..0.336 rows=10 loops=1)
-> Temporary table with deduplication
(cost=119055..119055 rows=9.26)
(actual time=0.333..0.333 rows=10 loops=1)
-> Limit table size: 10 unique row(s)
-> Nested loop inner join
(cost=119054 rows=9.26)
(actual time=0.0967..0.292 rows=19 loops=1)
-> Nested loop inner join
(cost=24002 rows=13.9)
(actual time=0.0801..0.138 rows=19 loops=1)
-> Index scan on i1_0 using idx_now_price (reverse)
(cost=0.0767 rows=7)
(actual time=0.0706..0.103 rows=10 loops=1)
-> Covering index lookup on c1_0 using FKaiqat78ya0grkolk7yfk015xp (item_id=i1_0.item_id)
(cost=0.25 rows=1.99)
(actual time=0.00245..0.00318 rows=1.9 loops=10)
-> Limit: 1 row(s)
(cost=0.498 rows=0.664)
(actual time=0.00775..0.00779 rows=1 loops=19)
-> Filter: (c2_0.stock > 0)
(cost=0.498 rows=0.664)
(actual time=0.0076..0.0076 rows=1 loops=19)
-> Index lookup on c2_0 using FKfsw9w4rngl3jbfn8oo71ot83d (color_item_id=c1_0.color_item_id)
(cost=0.498 rows=1.99)
(actual time=0.00737..0.00737 rows=1 loops=19)
//인덱스 적용하고, 스캔 방식 바뀐 부분
//적용 전
Table scan on i1_0
(cost=10632 rows=95818)
(actual time=0.0549..58.6 rows=100000 loops=1)
//적용 후 (sort 사라짐)
Index scan on i1_0 using idx_now_price (reverse)
(cost=0.0767 rows=7)
(actual time=0.0706..0.103 rows=10 loops=1)
인덱스 적용 → Limit table size: 10 unique row(s) 생기고 빨라짐
결론
select i1_0.item_id,i1_0.name,i1_0.original_price,i1_0.now_price from item i1_0
where i1_0.item_id in
(select distinct i2_0.item_id from item i2_0
join color_item c1_0 on i2_0.item_id=c1_0.item_id
join color_item_size_stock c2_0 on c1_0.color_item_id=c2_0.color_item_id
where c2_0.stock>0)
order by i1_0.item_id desc limit 0,10;
//explain analyze
-> Limit: 10 row(s)
(cost=47957 rows=9.26)
(actual time=0.0771..0.206 rows=10 loops=1)
-> Nested loop semijoin
(cost=47957 rows=9.26)
(actual time=0.0722..0.2 rows=10 loops=1)
-> Nested loop inner join
(cost=23955 rows=7)
(actual time=0.0482..0.0893 rows=10 loops=1)
-> Index scan on i1_0 using PRIMARY (reverse)
(cost=0.0767 rows=7)
(actual time=0.035..0.0444 rows=10 loops=1)
-> Single-row covering index lookup on i2_0 using PRIMARY (item_id=i1_0.item_id)
(cost=0.25 rows=1)
(actual time=0.00415..0.00418 rows=1 loops=10)
-> Nested loop inner join
(cost=95050 rows=1.32)
(actual time=0.0109..0.0109 rows=1 loops=10)
-> Covering index lookup on c1_0 using FKaiqat78ya0grkolk7yfk015xp (item_id=i1_0.item_id)
(cost=0.25 rows=1.99)
(actual time=0.0022..0.0022 rows=1 loops=10)
-> Filter: (c2_0.stock > 0)
(cost=0.659 rows=0.664)
(actual time=0.00834..0.00834 rows=1 loops=10)
-> Index lookup on c2_0 using FKfsw9w4rngl3jbfn8oo71ot83d (color_item_id=c1_0.color_item_id)
(cost=0.659 rows=1.99)
(actual time=0.00808..0.00808 rows=1 loops=10)
select i1_0.item_id,i1_0.name,i1_0.original_price,i1_0.now_price from item i1_0
where i1_0.item_id in
(select distinct i2_0.item_id from item i2_0
join color_item c1_0 on i2_0.item_id=c1_0.item_id
join color_item_size_stock c2_0
on c1_0.color_item_id=c2_0.color_item_id
where c2_0.stock>0)
order by i1_0.now_price desc limit 0,10;
//explain analyze
-> Limit: 10 row(s)
(cost=80841 rows=10)
(actual time=97.1..97.2 rows=10 loops=1)
-> Nested loop semijoin
(cost=80841 rows=126720)
(actual time=97.1..97.2 rows=10 loops=1)
-> Nested loop inner join
(cost=44168 rows=95818)
(actual time=97..97.1 rows=10 loops=1)
-> Sort: i1_0.now_price DESC
(cost=10632 rows=95818)
(actual time=97..97 rows=10 loops=1)
-> Table scan on i1_0
(cost=10632 rows=95818)
(actual time=0.0434..47 rows=100000 loops=1)
-> Single-row covering index lookup on i2_0 using PRIMARY (item_id=i1_0.item_id)
(cost=0.25 rows=1)
(actual time=0.0066..0.00664 rows=1 loops=10)
-> Nested loop inner join
(cost=95050 rows=1.32)
(actual time=0.0133..0.0133 rows=1 loops=10)
-> Covering index lookup on c1_0 using FKaiqat78ya0grkolk7yfk015xp (item_id=i1_0.item_id)
(cost=0.25 rows=1.99)
(actual time=0.00257..0.00257 rows=1 loops=10)
-> Filter: (c2_0.stock > 0)
(cost=0.659 rows=0.664)
(actual time=0.0104..0.0104 rows=1 loops=10)
-> Index lookup on c2_0 using FKfsw9w4rngl3jbfn8oo71ot83d (color_item_id=c1_0.color_item_id)
(cost=0.659 rows=1.99)
(actual time=0.01..0.01 rows=1 loops=10)
now_price 단일 인덱스 없는 상황인데 빠릅니다.
세미조인이 일어나서 빠른 것 같습니다.
따로 실험해봤습니다.
복합 인덱스 컬럼 순서에 따른 경우의 수
https://one-armed-boy.tistory.com/entry/%EB%B3%B5%ED%95%A9-%EC%9D%B8%EB%8D%B1%EC%8A%A4-%ED%99%9C%EC%9A%A9-%EA%B2%BD%EC%9A%B0%EC%9D%98-%EC%88%98
커버링 인덱스 경우의 수
https://jojoldu.tistory.com/476