Improving Infinite Scroll Performance with No Offset

이진우 · November 25, 2024


The situation

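Infinite scroll is the technique of letting the user browse the next page simply by scrolling down, as in the video above.

It is usually implemented with Spring Data JPA's Slice, which, unlike Page, has the advantage of not issuing a count query.

So until now I had implemented this part in the conventional way, as shown below.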

service

public CommunityPostSearchWithSliceResponseDto searchCommunityPostByCommunityBoardId(final Long communityBoardId,
                                                                                         final Pageable pageable){

        Slice<CommunityPostSearchDBResponseDto> communityPosts =
                communityPostRepository.findCommunityPostKeyWordSearchDBByCommunityBoardId(communityBoardId,pageable);


        List<CommunityPostSearchResponseDto> communityPostSearchResponseDtoList =
                communityPosts.stream().map(communityPostSearchDBResponseDto -> {
                    return  CommunityPostSearchResponseDto.of(communityPostSearchDBResponseDto.getCommunityPostId(),
                            communityPostSearchDBResponseDto.getTitle(),
                            replyRepository.totalReplyCount(communityPostSearchDBResponseDto.getCommunityPostId()),
                            communityPostSearchDBResponseDto.getWriterNickName(),
                            communityPostSearchDBResponseDto.getCreatedAt());
                }).collect(Collectors.toList());

        return CommunityPostSearchWithSliceResponseDto.of(communityPostSearchResponseDtoList,communityPosts.hasNext());

    }

repository

    @Query("select new com.gaduationproject.cre8.domain.community.dto.CommunityPostSearchDBResponseDto(cp.id,cp.title,m.nickName,cp.createdAt) "
            + "from CommunityPost cp join cp.writer m where cp.communityBoard.id=:communityBoardId")
    Slice<CommunityPostSearchDBResponseDto> findCommunityPostKeyWordSearchDBByCommunityBoardId(@Param("communityBoardId") final Long communityBoardId,
                                                                                               final Pageable pageable);

The problem

However, this approach has a drawback.

The code above ends up sending the database a query of the following form:

select
       cp1_0.community_post_id,
       cp1_0.title,
       w1_0.nick_name,
       cp1_0.created_at 
   from
       community_post cp1_0 
   join
       member w1_0 
           on w1_0.member_id=cp1_0.writer_id 
   where
       cp1_0.community_board_id=1
   order by
       cp1_0.community_post_id desc 
   limit
       50010,10;

Because this query has an offset/limit structure, the database reads offset + limit rows and then throws away the first offset rows.
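The read-then-discard behaviour can be imitated in plain Java. This is only a sketch, not how MySQL is implemented: a TreeMap stands in for an index ordered by id, and the class name OffsetScanSketch, the method page and the counter rowsVisited are hypothetical names introduced for illustration. The row count of 50,019 is chosen to match the explain analyze output further down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class OffsetScanSketch {

    /** Rows visited by the last call to page(); a stand-in for "rows read by the database". */
    static int rowsVisited;

    /** Imitates "order by id desc limit offset, limit" by walking the ids in descending order. */
    static List<Long> page(NavigableMap<Long, String> postsById, int offset, int limit) {
        rowsVisited = 0;
        List<Long> result = new ArrayList<>();
        for (Long id : postsById.descendingKeySet()) {
            if (rowsVisited == offset + limit) {
                break;                 // offset + limit rows have been read, stop
            }
            rowsVisited++;
            if (rowsVisited > offset) {
                result.add(id);        // only rows after the offset are kept
            }
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> posts = new TreeMap<>();
        for (long id = 1; id <= 50_019; id++) {
            posts.put(id, "post " + id);
        }
        // prints "10 rows returned, 10 visited"
        System.out.println(page(posts, 0, 10).size() + " rows returned, " + rowsVisited + " visited");
        // prints "9 rows returned, 50019 visited"
        System.out.println(page(posts, 50_010, 10).size() + " rows returned, " + rowsVisited + " visited");
    }
}
```

Both calls return at most one page of ids, but the second one has to walk through every row in front of the offset first.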

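In this case the columns in the select list are not all served from an index (that is, there is no covering index), so the usual drawback of a non-clustered index applies: each row costs two index lookups, one in the non-clustered index and one in the clustered index, and this has to be done for all offset + limit rows.

If every selected column could be served from the index, the unnecessary second lookup through the clustered index for the offset rows that get discarded could be avoided.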

Either way, unnecessary work is certainly being done.

This is because, in the logical execution order of SQL, order by and limit are applied last, so the rows in front of the offset have to be read regardless.

Verification

This can be confirmed in several ways.
For the experiment I first stored roughly 50,000 community posts in the DB.

Load test with JMeter

Page 0, 50 users requesting at once

When the page is 0 the offset is 0, so reading offset + limit posts amounts to reading only limit posts.

The JMeter load test is therefore relatively fast.

Page 5001, 50 users requesting at once

Here too offset + limit posts are read, which means 50010 + A posts, where A is the number of posts actually returned for that page.

The JMeter load test is therefore relatively slow.
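The offset of 50010 follows directly from the page number. The sketch below assumes a page size of 10 (as the limit 10 in the queries suggests) and zero-based page numbers, which is how Spring Data's Pageable counts pages. PageOffsetSketch and offsetOf are hypothetical names used only for this illustration.

```java
public class PageOffsetSketch {

    /** Offset implied by a zero-based page number and a page size. */
    static long offsetOf(int page, int size) {
        return (long) page * size;
    }

    public static void main(String[] args) {
        System.out.println(offsetOf(0, 10));     // prints 0
        System.out.println(offsetOf(5001, 10));  // prints 50010
    }
}
```

So the further the user scrolls, the larger the number of rows that are read only to be skipped.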

MySQL explain analyze

explain analyze select
        cp1_0.community_post_id,
        cp1_0.title,
        w1_0.nick_name,
        cp1_0.created_at 
    from
        community_post cp1_0 
    join
        member w1_0 
            on w1_0.member_id=cp1_0.writer_id 
    where
        cp1_0.community_board_id=1
    order by
        cp1_0.community_post_id desc 
	limit 10
	offset 50010;

explain analyze  select
        cp1_0.community_post_id,
        cp1_0.title,
        w1_0.nick_name,
        cp1_0.created_at 
    from
        community_post cp1_0 
    join
        member w1_0 
            on w1_0.member_id=cp1_0.writer_id 
    where
        cp1_0.community_board_id=1
    order by
        cp1_0.community_post_id desc 
	limit 0,10;

Compare the two queries.

With offset 50010

-> Limit/Offset: 10/50010 row(s) 
(cost=11518 rows=0) (actual time=168..168 rows=9 loops=1)
     -> Nested loop inner join  (cost=11518 rows=24900) 
     (actual time=0.189..166 rows=50019 loops=1)
         -> Filter: (cp1_0.writer_id is not null)  (cost=2803 ro...

With offset 0

-> Limit: 10 row(s)  
(cost=11518 rows=10) (actual time=0.0891..0.0995 rows=10 loops=1)
     -> Nested loop inner join  (cost=11518 rows=24900) 
     (actual time=0.0884..0.0983 rows=10 loops=1)
         -> Filter: (cp1_0.writer_id is not null)  (cost=2803 rows=2...

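This confirms that the larger the page number gets, the worse the infinite scroll I implemented performs: the offset 50010 query actually read 50,019 joined rows and took about 168 ms, while the offset 0 query read 10 rows in about 0.1 ms.

Note: the explain keyword

Plain explain gives the same result for both of the queries above.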

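Improvement

So with offset, performance gets worse as the page number grows.

There is a way to solve this that fits cases like the one above, where the sort order is fixed to newest-first.

It is the No offset approach, which, as the name says, does not use offset.

Instead of an offset, a condition on the id is put in the where clause. (The PK here is auto-generated.)

As an SQL query, for example: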

select
        cp1_0.community_post_id,
        cp1_0.title,
        w1_0.nick_name,
        cp1_0.created_at 
    from
        community_post cp1_0 
    join
        member w1_0 
            on w1_0.member_id=cp1_0.writer_id 
    where
        cp1_0.community_post_id<12
        and cp1_0.community_board_id=1
    order by
        cp1_0.community_post_id desc 
    limit
        10;

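This returns the same rows that the offset-based SQL query above would return for the corresponding page.

The id value 12 in the where clause is the id of the last post on the previous page.

This is a value the frontend can easily pass to the backend.

By stating the community_post_id explicitly in the where clause, the clustered index can be used to locate the wanted rows more quickly, so every page gets the same fast lookup. (Since from and where are evaluated first, the rows that are not needed can be cut away early.)

The first page, however, has no previous id, so I handle this case with null and Querydsl. (Alternatively, the backend and frontend could agree on a convention, for example issuing a separate query when lastPostId is 0.)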
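The idea can be sketched without a database. This is an in-memory imitation under assumptions, not the actual repository code: a TreeMap plays the role of the primary-key index, and NoOffsetSketch and nextPage are hypothetical names. A null lastId means "first page", matching the convention described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class NoOffsetSketch {

    /** Imitates "where id < lastId order by id desc limit n"; a null lastId means the first page. */
    static List<Long> nextPage(NavigableMap<Long, String> postsById, Long lastId, int limit) {
        // headMap(lastId, false) jumps straight to the keys strictly below lastId
        NavigableMap<Long, String> candidates =
                (lastId == null) ? postsById : postsById.headMap(lastId, false);
        List<Long> result = new ArrayList<>();
        for (Long id : candidates.descendingKeySet()) {
            if (result.size() == limit) {
                break;
            }
            result.add(id);
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> posts = new TreeMap<>();
        for (long id = 1; id <= 25; id++) {
            posts.put(id, "post " + id);
        }
        Long lastId = null;            // the first request carries no id
        List<Long> ids;
        do {
            ids = nextPage(posts, lastId, 10);
            System.out.println(ids);
            if (!ids.isEmpty()) {
                lastId = ids.get(ids.size() - 1);  // what the client sends with the next request
            }
        } while (ids.size() == 10);
    }
}
```

With 25 posts the loop prints the ids 25 to 16, then 15 to 6, then 5 to 1. Each call visits only the rows it returns, no matter how deep the client has scrolled.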

Improved code

Service

public CommunityPostSearchWithSliceResponseDto searchCommunityPostByCommunityBoardIdAndLastPostId(final Long communityBoardId,
           final Long lastPostId, final Pageable pageable){

       Slice<CommunityPostSearchDBResponseDto> communityPosts =
               communityPostRepository.showCommunityPostWithNoOffSet(lastPostId,communityBoardId,pageable);


       List<CommunityPostSearchResponseDto> communityPostSearchResponseDtoList =
               communityPosts.stream().map(communityPostSearchDBResponseDto -> {
                   return  CommunityPostSearchResponseDto.of(communityPostSearchDBResponseDto.getCommunityPostId(),
                           communityPostSearchDBResponseDto.getTitle(),
                           replyRepository.totalReplyCount(communityPostSearchDBResponseDto.getCommunityPostId()),
                           communityPostSearchDBResponseDto.getWriterNickName(),
                           communityPostSearchDBResponseDto.getCreatedAt());
               }).collect(Collectors.toList());

       return CommunityPostSearchWithSliceResponseDto.of(communityPostSearchResponseDtoList,communityPosts.hasNext());

   }

A lastPostId parameter has been added!

Repository

@RequiredArgsConstructor
public class CommunityPostCustomRepositoryImpl implements CommunityPostCustomRepository{

    private final JPAQueryFactory queryFactory;

    @Override
    public Slice<CommunityPostSearchDBResponseDto> showCommunityPostWithNoOffSet(final Long lastCommunityPostId,final Long communityBoardId,final Pageable pageable){

        List<CommunityPostSearchDBResponseDto> results = queryFactory.select(
                        Projections.constructor(CommunityPostSearchDBResponseDto.class,
                                communityPost.id, communityPost.title,communityPost.writer.nickName,communityPost.createdAt))
                .from(communityPost)
                .join(communityPost.writer)
                .where(ltStoreId(lastCommunityPostId),
                        findByCommunityBoardId(communityBoardId))
                .orderBy(communityPost.id.desc())
                .limit(pageable.getPageSize()+1)
                .fetch();

     
        return checkLastPage(pageable,results);



    }


    private BooleanExpression ltStoreId(final Long lastCommunityPostId) {
        if (lastCommunityPostId == null) {
            return null;
        }

        return communityPost.id.lt(lastCommunityPostId);
    }

    private BooleanExpression findByCommunityBoardId(final Long communityBoardId){

        return communityPost.communityBoard.id.eq(communityBoardId);

    }

    private Slice<CommunityPostSearchDBResponseDto> checkLastPage(final Pageable pageable, final List<CommunityPostSearchDBResponseDto> results) {

        boolean hasNext = false;

        if (results.size() > pageable.getPageSize()) {
            hasNext = true;
            results.remove(pageable.getPageSize());
        }

        return new SliceImpl<>(results, pageable, hasNext);
    }

}

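Using a dynamic query, a null lastPostId is taken to mean the first page; in that case the method returns null, which removes the id condition from the where clause.
The query fetches size + 1 rows from the start, and the extra row is used to tell the frontend whether a next page exists.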
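The size + 1 trick can be isolated from Spring and Querydsl. The sketch below is a simplified stand-in for checkLastPage, not a replacement for it: HasNextSketch, SliceResult and toSlice are hypothetical names, and SliceResult only mimics the two pieces of a Slice that matter here.

```java
import java.util.ArrayList;
import java.util.List;

public class HasNextSketch {

    /** Minimal stand-in for a Slice: one page of content plus a hasNext flag. */
    static final class SliceResult<T> {
        final List<T> content;
        final boolean hasNext;

        SliceResult(List<T> content, boolean hasNext) {
            this.content = content;
            this.hasNext = hasNext;
        }
    }

    /** Expects up to pageSize + 1 fetched rows; the extra row only signals that more data exists. */
    static <T> SliceResult<T> toSlice(List<T> fetched, int pageSize) {
        boolean hasNext = fetched.size() > pageSize;
        // Copy so the caller's list is left untouched, dropping the extra row if present
        List<T> content = new ArrayList<>(hasNext ? fetched.subList(0, pageSize) : fetched);
        return new SliceResult<>(content, hasNext);
    }

    public static void main(String[] args) {
        // 11 rows came back for a page size of 10, so another page exists
        SliceResult<Long> middle =
                toSlice(List.of(21L, 20L, 19L, 18L, 17L, 16L, 15L, 14L, 13L, 12L, 11L), 10);
        System.out.println(middle.content.size() + " " + middle.hasNext);  // prints "10 true"

        // Only 3 rows came back, so this is the last page
        SliceResult<Long> last = toSlice(List.of(3L, 2L, 1L), 10);
        System.out.println(last.content.size() + " " + last.hasNext);      // prints "3 false"
    }
}
```

Fetching one extra row answers the "is there more?" question without a count query.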

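Results

JMeter

50 users requesting at once (last page)

For reference, before applying No Offset, the result in the same environment was as follows.

Simple query performance comparison: one user, 100 requests

Simply run the query 100 times and compare the average values.

1. Before no offset - last page

The average is tolerable, but the TPS is low at 6.

2. After no offset - last page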

explain analyze

-> Limit: 10 row(s)  (cost=2.47 rows=0.9) 
(actual time=0.0631..0.0802 rows=9 loops=1)
     -> Nested loop inner join  (cost=2.47 rows=0.9) 
     (actual time=0.0623..0.0788 rows=9 loops=1)
         -> Filter: ((cp1_0.community_board_id = 1) and (cp1_0.community_...

explain

Comparing filtered and rows with the earlier plan shows a considerable improvement.
