Offset Pagination vs Cursor Pagination

양성준 · April 18, 2025

Pagination

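Pagination means fetching a large data set in fixed-size chunks and retrieving only what is needed, instead of loading everything at once.
This reduces system resource usage and network overhead, which improves the user experience.
- Loading tens of thousands of rows from the database into the application at once can exhaust memory.
- Transferring that much data adds network overhead.

There are two main ways to implement pagination.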

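Offset-based

select * from post
order by created_at desc
limit [page size]
offset [page number * page size];

limit : number of rows to fetch, i.e. the page size
offset : number of rows to skip before the page starts (page number * page size, with page numbers starting at 0)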

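Advantages

Intuitive and easy to implement - with Spring Data JPA it is straightforward using Pageable (Page, Slice, Pageable)
Users can pick a specific page and jump to it.
The total number of pages is known.

Disadvantages

Duplicate or missing data

When writes are frequent and rows are added while the user is browsing, moving to the next page can show rows that were already seen.
When rows are deleted, unseen rows shift forward by one position and may never be shown on the next page.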
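
The duplicate case can be reproduced without a database. The sketch below is a minimal illustration that assumes an in-memory list sorted newest-first in place of the post table; the class and method names (OffsetDriftDemo, page) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetDriftDemo {

    // Emulates "ORDER BY created_at DESC LIMIT size OFFSET page * size" on a list
    // that is already sorted newest-first.
    static List<String> page(List<String> newestFirst, int page, int size) {
        int from = Math.min(page * size, newestFirst.size());
        int to = Math.min(from + size, newestFirst.size());
        return new ArrayList<>(newestFirst.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> posts = new ArrayList<>(List.of("p5", "p4", "p3", "p2", "p1"));

        List<String> first = page(posts, 0, 2);   // p5, p4
        posts.add(0, "p6");                        // a new post arrives while the user is reading
        List<String> second = page(posts, 1, 2);  // p4, p3: p4 is shown a second time

        System.out.println(first + " then " + second);
    }
}
```

Deleting p5 instead of inserting p6 has the opposite effect: page 1 then starts at p2 and p3 is never shown.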

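Performance degradation

With offset-based pagination, the database reads every row up to the offset and then walks through the requested number of rows before cutting the result.
- In other words, it reads all rows before the offset and returns only the part that is needed.
- With offset 10000 and limit 20, it ends up reading 10,020 rows and returning only the last 20 -> load on the DB
The further back the page, the more rows have to be read, so performance degrades.

=> Cursor-based pagination addresses both of these problems.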

Implementation

Controller

  @GetMapping
  public ResponseEntity<PageResponse<FindMessageResponseDTO>> getChannelMessages(
      @RequestParam("channelId") UUID id,
      @PageableDefault(size = 10, sort = "createdAt", direction = Sort.Direction.ASC) Pageable pageable) {
    Slice<FindMessageResult> messageResults = messageService.findAllByChannelId(id, pageable);
    Slice<FindMessageResponseDTO> messageResponseDTOPage = messageResults.map(
        messageMapper::toFindMessageResponseDTO);

    return ResponseEntity.ok(pageResponseMapper.fromSlice(messageResponseDTOPage));
  }
  
 
  public record PageResponse<T>(
    List<T> content,
    int number,
    int size,
    boolean hasNext,
    Long totalElements // nullable
) {
}


@Mapper(componentModel = "spring")
public interface PageResponseMapper {

  default <T> PageResponse<T> fromPage(Page<T> page) {
    return new PageResponse<>(
        page.getContent(),
        page.getNumber(),
        page.getSize(),
        page.hasNext(),
        page.getTotalElements()
    );
  }

  default <T> PageResponse<T> fromSlice(Slice<T> slice) {
    return new PageResponse<>(
        slice.getContent(),
        slice.getNumber(),
        slice.getSize(),
        slice.hasNext(),
        null
    );
  }
}

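Wrapping the result in PageResponse puts the response data in the content list and lets the response carry paging information as well.
- Because the List is wrapped once more, fields can be added later without changing the shape of the response.
The mapper takes a Slice or a Page and maps it to PageResponse.
- MapStruct does not know the structure of a generic wrapper type, so the mapping code has to be written by hand.
In the controller, page, size, and sort can be received as @RequestParam values, or a Pageable parameter can be accepted and passed along.
(Use @PageableDefault if default values are needed.)
The content type inside a Page or Slice can be converted with its map method (page.map, slice.map).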

Service, Repository

  public Slice<FindMessageResult> findAllByChannelId(UUID channelId,
      Pageable pageable) {
    Slice<Message> messages = messageRepository.findAllByChannelId(channelId,
        pageable);
    return messages.map(messageMapper::toFindMessageResult);
  }
  
  Slice<Message> findAllByChannelId(UUID channelId, Pageable pageable);

The Service receives the Pageable passed from the Controller and hands it to the Repository, takes the paged query result,
converts the entities to DTOs, and passes them back to the Controller.

{
    "content": [
        {
            "id": "dc55fbd9-8b91-4e4a-94a6-bb5ee00d712e",
            "createdAt": "2025-04-18T09:15:21.691898Z",
            "updatedAt": "2025-04-18T09:15:21.691898Z",
            "attachments": [],
            "content": "안녕하세요",
            "channelId": "17b67b98-a541-4344-be0a-d2f95bc2936d",
            "author": {
                "id": "0f079068-79a2-4884-998e-c71ef3efd266",
                "profile": {
                    "id": "e3b25241-f85d-416a-adf6-981ea7f57003",
                    "filename": "프로필용이미지1.jpg",
                    "size": 152238,
                    "contentType": "image/jpeg"
                },
                "username": "yyjjmm2005",
                "email": "yyjjmm2005@naver.com",
                "online": false
            }
        },
        {
            "id": "ce350fe2-c4ab-4993-9afb-cec08d4deed1",
            "createdAt": "2025-04-18T09:15:29.593678Z",
            "updatedAt": "2025-04-18T09:15:29.593678Z",
            "attachments": [],
            "content": "테스트",
            "channelId": "17b67b98-a541-4344-be0a-d2f95bc2936d",
            "author": {
                "id": "6780ff72-0fd7-4c56-9cb8-73cb6dc85c20",
                "profile": {
                    "id": "5708d884-a189-42fe-a095-7af77b24be89",
                    "filename": "프로필용이미지2.jpg",
                    "size": 396698,
                    "contentType": "image/jpeg"
                },
                "username": "yyjjmm2006",
                "email": "yyjjmm2006@naver.com",
                "online": false
            }
        },
        {
            "id": "c425ba77-51fa-46fc-8847-0fd3098e5226",
            "createdAt": "2025-04-18T09:15:36.541587Z",
            "updatedAt": "2025-04-18T09:15:36.541587Z",
            "attachments": [],
            "content": "잘되나요?",
            "channelId": "17b67b98-a541-4344-be0a-d2f95bc2936d",
            "author": {
                "id": "d03da4b3-cfbd-444a-bf25-7d5b740e0a87",
                "profile": null,
                "username": "yyjjmm2007",
                "email": "yyjjmm2007@naver.com",
                "online": false
            }
        }
    ],
    "number": 0,
    "size": 10,
    "hasNext": false,
    "totalElements": null
}

totalElements is null because a Slice was used.

Slice vs Page

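A Page includes totalPages, so an additional count(*) query is executed.
- The extra query costs performance, so use it only when the total number of pages is needed.
A Slice does not include totalPages, so no count query is issued.
- It performs better than Page.
- Use it for simple infinite scroll, or wherever only the existence of a next page matters and totalPages is not needed.

=> For this reason, infinite scroll is commonly built with Slice combined with cursor-based paging.
(The total page count is unnecessary, and responses need to be fast.)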
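
A common way to answer "is there a next page?" without a count query is to fetch one row more than the page size. The sketch below shows that idea with a hypothetical SimpleSlice record and an in-memory list standing in for the table; it is an illustration of the technique, not Spring Data's implementation.

```java
import java.util.List;

class SliceSketch {

    // Hypothetical stand-in for a Slice: the page content plus a hasNext flag.
    record SimpleSlice<T>(List<T> content, boolean hasNext) {}

    // Stand-in for "... LIMIT :limit OFFSET :offset" against an in-memory table.
    static <T> List<T> fetch(List<T> table, int offset, int limit) {
        int from = Math.min(offset, table.size());
        int to = Math.min(from + limit, table.size());
        return table.subList(from, to);
    }

    // Ask for size + 1 rows; the extra row only signals that another page exists.
    static <T> SimpleSlice<T> slice(List<T> table, int offset, int size) {
        List<T> rows = fetch(table, offset, size + 1);
        boolean hasNext = rows.size() > size;
        List<T> content = hasNext ? rows.subList(0, size) : rows;
        return new SimpleSlice<>(List.copyOf(content), hasNext);
    }

    public static void main(String[] args) {
        List<Integer> table = List.of(1, 2, 3, 4, 5);
        System.out.println(slice(table, 0, 2)); // two rows, and a next page exists
        System.out.println(slice(table, 4, 2)); // one row, and no next page
    }
}
```

The total row count is never computed, which is why totalElements stays null in the Slice-based response.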

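Cursor-based

Instead of an offset, this approach uses a cursor and returns the next N rows relative to it.
It relies on a DB index to jump straight to the rows of the requested page (an index is required).

Cursor?
- The identifier value of the last row returned to the user
- What matters is which row the wanted data comes after
- Unlike offset, it does not read all preceding rows and cut them off; it reads only the rows after the cursor
- The cursor column should be unique, comparable, and immutable (typically the unique id)
  - Without an index that matches the sort key and the cursor condition, the table is fully scanned and the performance benefit of the cursor is lost.
  - The RDBMS automatically creates a sorted index for the primary key id, so using the id is usually the best choice.
  - To use a column other than the id, an index matching the sort key and the cursor condition has to be created separately
    - For cursor paging on created_at, for example, without such an index the query SELECT * FROM messages WHERE created_at < :cursor ORDER BY created_at DESC LIMIT 20; cannot use an index for either WHERE created_at < ? or ORDER BY created_at DESC, so it falls back to a full scan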

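 SELECT *
 FROM post
 WHERE [conditions]
 AND id < [last seen id] -- the last id in the previous result
 ORDER BY id DESC
 LIMIT [page size]

Rows are read starting right after the given id, LIMIT rows at a time, so the previous pages do not have to be read at all.
-> This is a large performance advantage.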

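Advantages

Rows are looked up relative to an id value, so even with frequent writes, rows are not duplicated or skipped between pages.
Only rows after the cursor are read, not the earlier ones, so it performs well on large data sets.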
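
Both advantages can be seen in a small in-memory sketch. It assumes a TreeMap keyed by a numeric id in the role of the primary-key index; CursorPagingDemo and page are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class CursorPagingDemo {

    // Emulates "WHERE id < :cursor ORDER BY id DESC LIMIT :size".
    // A null cursor means "first page". The TreeMap plays the role of the index on id.
    static List<Long> page(NavigableMap<Long, String> posts, Long cursor, int size) {
        NavigableMap<Long, String> older =
                (cursor == null) ? posts : posts.headMap(cursor, false);
        List<Long> ids = new ArrayList<>();
        for (Long id : older.descendingKeySet()) {
            if (ids.size() == size) break;
            ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> posts = new TreeMap<>();
        for (long id = 1; id <= 5; id++) posts.put(id, "post " + id);

        List<Long> first = page(posts, null, 2);      // ids 5 and 4
        posts.put(6L, "post 6");                       // written between the two requests
        Long cursor = first.get(first.size() - 1);     // 4, the last id of the first page
        List<Long> second = page(posts, cursor, 2);    // ids 3 and 2, with no repeat of 4

        System.out.println(first + " then " + second);
    }
}
```

The new row with id 6 does not shift the second page, because the page boundary is a value rather than a position.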

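Disadvantages

Scrolling past the last page returns nothing, because no more data exists.
- For this reason it is better to include a field that tells the client no further data exists (hasNext)
When the WHERE clause carries several conditions, it can perform worse than offset.
Without a suitable index a full scan occurs, so it is no faster than offset and can be slower.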

Implementation

1. Using the id value

Controller

  @GetMapping
  public ResponseEntity<PageResponse<FindMessageResponseDTO>> getChannelMessages(
      @RequestParam("channelId") UUID id,
      // The client passes the nextCursor value from the previous response as cursor.
      @RequestParam(value = "cursor", required = false) Long cursor,
      @RequestParam(value = "limit", defaultValue = "10") int limit) {

    Slice<FindMessageResult> messageResults;
    // The first request carries no cursor, so it is served by the initial query.
    if (cursor == null) {
      messageResults = messageService.findAllByChannelIdInitial(id, limit);
    } else {
      messageResults = messageService.findAllByChannelIdAfterCursor(id, cursor, limit);
    }
    Slice<FindMessageResponseDTO> messageResponseDTOPage = messageResults.map(
        messageMapper::toFindMessageResponseDTO);

    return ResponseEntity.ok(pageResponseMapper.fromSlice(messageResponseDTOPage));
  }
  
  
  public record PageResponse<T>(
    List<T> content,
    Object nextCursor,  // Object or String; String is the most common choice because a serialized value is easy to pass back and forth
    int size,
    boolean hasNext,
    Long totalElements // nullable
) {
}

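Pageable is an interface designed around offset-based paging, so it is better to accept limit directly.
On the first page the cursor arrives as null, so the request has to be branched on that state.
(Passing a null cursor into the query caused a parameter binding error with Hibernate -> PostgreSQL.)
nextCursor is the last element of the current page, the reference value used when requesting the next page.
The client stores the nextCursor it received and includes it again in the next request.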

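// nextCursor is still empty before the first request, so the cursor parameter is omitted then
const params = new URLSearchParams({ channelId, limit: "10" });
if (nextCursor) params.set("cursor", nextCursor);

fetch(`/messages?${params}`)
  .then((res) => res.json())
  .then((data) => {
    // render the messages
    // keep nextCursor for the next request
    nextCursor = data.nextCursor;
  });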

This is why the cursor is null on the first request -> the repository needs a separate query for that case.

To compute nextCursor, the PageResponseMapper needs additional logic that derives the value and maps it.

PageResponseMapper

default <T extends HasId> PageResponse<T> fromSlice(Slice<T> slice) {
    // Extract nextCursor: the id of the last element in content
    T last = slice.getContent().isEmpty() ? null :
        slice.getContent().get(slice.getContent().size() - 1);

    Object nextCursor = last != null ? last.getId() : null;

    return new PageResponse<>(
        slice.getContent(),
        nextCursor,
        slice.getSize(),
        slice.hasNext(),
        null
    );
  }
  

// Interface that makes it easy to pull out the nextCursor
public interface HasId {
  Object getId();
}

// DTO held by the Slice that is mapped to PageResponse
public record FindMessageResponseDTO(
    UUID id,
    Instant createdAt,
    Instant updatedAt,
    List<FindBinaryContentResult> attachments,
    String content,
    UUID channelId,
    FindUserResult author
) implements HasId {

  @Override
  public Object getId() {
    return id;
  }
}

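nextCursor is simply the id of the last content element.
The response is used in several places, so it is implemented as a generic wrapper class.
- To call getId(), a HasId interface is defined and implemented by the DTOs that are mapped into PageResponse
- getId() extracts the id, which becomes the nextCursor value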
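
Since the cursor is most often exchanged as a String, one option is to serialize the raw value into an opaque, URL-safe token. The sketch below is only one possible encoding; CursorCodec and its methods are hypothetical and not part of the project code above.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.Base64;

class CursorCodec {

    // Turns the raw cursor value into a URL-safe token the client treats as opaque.
    static String encode(String rawCursor) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(rawCursor.getBytes(StandardCharsets.UTF_8));
    }

    // Reverses encode(); throws IllegalArgumentException on malformed input.
    static String decode(String token) {
        return new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Instant createdAt = Instant.parse("2025-04-18T09:15:21.691898Z");
        String token = encode(createdAt.toString());
        Instant restored = Instant.parse(decode(token));
        System.out.println(token + " -> " + restored);
    }
}
```

With a token like this the client never depends on whether the cursor is an id or a timestamp, so the server can change the cursor column without changing the API.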

Service

  public Slice<FindMessageResult> findAllByChannelIdInitial(UUID channelId, int limit) {
    // Always page 0; the requested limit is the page size
    Pageable pageable = PageRequest.of(0, limit);
    Slice<Message> messages = messageRepository.findAllByChannelIdInitial(channelId, pageable);
    return messages.map(messageMapper::toFindMessageResult);
  }

  public Slice<FindMessageResult> findAllByChannelIdAfterCursor(UUID channelId,
      Long cursor, int limit) {
    // Pageable is used only to apply the limit to the JPQL query
    Pageable pageable = PageRequest.of(0, limit);
    Slice<Message> messages = messageRepository.findAllByChannelId(channelId, cursor, pageable);
    return messages.map(messageMapper::toFindMessageResult);
  }

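Cursor-based paging normally does not use Pageable, which assumes an offset,
but because the query is written in JPQL, a Pageable object is created and passed in (standard JPQL is an abstract query language with no LIMIT or OFFSET clause, so the limit cannot be written in the query itself; Pageable applies it automatically).
- With native SQL, limit and offset can be passed to the query directly.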

Repository

  // Query used when there is no cursor
  @Query("select m from Message m"
      + " where m.channel.id = :channelId"
      + " order by m.id ASC")
  Slice<Message> findAllByChannelIdInitial(@Param("channelId") UUID channelId, Pageable pageable);

  // Query used when a cursor is present
  @Query("select m from Message m"
      + " where m.channel.id = :channelId"
      + " and m.id > :cursor" // ascending order, so the next page holds ids greater than the cursor
      + " order by m.id ASC")
  Slice<Message> findAllByChannelId(@Param("channelId") UUID channelId,
      @Param("cursor") Long cursor, Pageable pageable);
  // Cursor paging in this order needs a composite index on channelId + id ASC

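The cursor is null on the first request, which is why a separate query without the cursor condition is provided.
Depending on how the frontend builds the scroll, DESC vs ASC and < cursor vs > cursor may differ.
Because rows are also filtered by channelId, a composite index that includes channel_id and is ordered ascending has to be created.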

CREATE INDEX idx_channel_id_id_asc
ON messages(channel_id, id ASC);

(channel_id1, id1)
(channel_id1, id2)
(channel_id1, id3)
(channel_id2, id1)
(channel_id2, id2)
(channel_id3, id1)
...

A composite index on (channel_id, id) can only be searched from the left column to the right.
The lookup first seeks on channel_id and then scans sequentially in sorted id order.
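
The seek-then-scan behaviour can be imitated with a sorted set. The sketch below assumes numeric ids for brevity and uses a TreeSet ordered by (channel_id, id) in place of the real index; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class CompositeIndexSketch {

    // One index entry: (channel_id, id), both simplified to long for the sketch.
    record Entry(long channelId, long id) {}

    // Entries are ordered by channel_id first and by id second, like (channel_id, id ASC).
    static final Comparator<Entry> ORDER =
            Comparator.comparingLong(Entry::channelId).thenComparingLong(Entry::id);

    // Emulates "WHERE channel_id = ? AND id > ? ORDER BY id ASC LIMIT ?":
    // seek to the first entry after (channelId, cursor), then read forward.
    static List<Long> nextIds(NavigableSet<Entry> index, long channelId, long cursor, int limit) {
        List<Long> ids = new ArrayList<>();
        for (Entry e : index.tailSet(new Entry(channelId, cursor), false)) {
            if (e.channelId() != channelId || ids.size() == limit) break;
            ids.add(e.id());
        }
        return ids;
    }

    public static void main(String[] args) {
        NavigableSet<Entry> index = new TreeSet<>(ORDER);
        for (long channel = 1; channel <= 3; channel++) {
            for (long id = 1; id <= 4; id++) index.add(new Entry(channel, id));
        }
        System.out.println(nextIds(index, 2, 1, 2)); // ids 2 and 3 of channel 2
    }
}
```

Entries of other channels are never visited before the seek point, which is what the leading channel_id column buys.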

In this mission, however, the ids were UUIDs on PostgreSQL 15, and because they are UUID v4, sorting by id through the index does not give a meaningful (chronological) order.

UUID v4 VS UUID v7

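v4 is random, so its sort order carries no time information -> it is not a good fit for cursor paging over a timeline.
v7 is time-based, so it sorts in creation order. PostgreSQL ships a built-in v7 generator (uuidv7()) only from version 18; on earlier versions the value has to be generated in the application or through an extension.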
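
The difference is easy to check in Java. The JDK only generates v4, so the v7 helper below is a hand-rolled illustration of the layout (48-bit millisecond timestamp, then the version, then random bits), not a library API; a production system would use a vetted generator.

```java
import java.security.SecureRandom;
import java.util.UUID;

class UuidOrderingSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    // Hand-rolled v7-style UUID: 48-bit Unix millisecond timestamp, version nibble 7,
    // and random bits elsewhere. Illustrative only.
    static UUID v7(long epochMillis) {
        long msb = (epochMillis << 16)              // timestamp in the top 48 bits
                | 0x7000L                           // version 7
                | (RANDOM.nextLong() & 0x0FFFL);    // 12 random bits
        long lsb = (RANDOM.nextLong() & 0x3FFFFFFFFFFFFFFFL)
                | 0x8000000000000000L;              // variant bits "10"
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        UUID earlier = v7(1_744_967_721_000L);
        UUID later = v7(1_744_967_722_000L);

        System.out.println(UUID.randomUUID().version());   // version of a JDK-generated random UUID
        System.out.println(earlier.version());              // version of the hand-rolled UUID
        System.out.println(earlier.compareTo(later) < 0);   // whether the earlier timestamp sorts first
    }
}
```

Because the timestamp occupies the most significant bits, an index on such ids is ordered by creation time, which is what a cursor over a timeline needs.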

=> For this reason, cursor-based paging is implemented below with created_at instead of the UUID.

2. Using another column (created_at)

Controller

  @GetMapping
  public ResponseEntity<PageResponse<FindMessageResponseDTO>> getChannelMessages(
      @RequestParam("channelId") UUID id,
      @RequestParam(value = "cursor", required = false) Instant cursor,
      // The client passes the nextCursor value from the previous response as cursor.
      @RequestParam(value = "limit", defaultValue = "10") int limit) {
    Slice<FindMessageResult> messageResults;

    // On the first request there is no cursor, so the first page is returned instead of a cursor-relative page
    if (cursor == null) {
      messageResults = messageService.findAllByChannelIdInitial(id, limit);
    } else {
      messageResults = messageService.findAllByChannelIdAfterCursor(id, cursor, limit);
    }
    Slice<FindMessageResponseDTO> messageResponseDTOPage = messageResults.map(
        messageMapper::toFindMessageResponseDTO);

    return ResponseEntity.ok(pageResponseMapper.fromSlice(messageResponseDTOPage));
  }

PageResponseMapper

  default <T extends HasCursor> PageResponse<T> fromSlice(Slice<T> slice) {
    // Extract nextCursor: the createdAt of the last element in content
    T last = slice.getContent().isEmpty() ? null :
        slice.getContent().get(slice.getContent().size() - 1);

    Object nextCursor = last != null ? last.getCursor() : null;

    return new PageResponse<>(
        slice.getContent(),
        nextCursor,
        slice.getSize(),
        slice.hasNext(),
        null
    );
  }


// Interface that makes it easy to pull out the nextCursor
public interface HasCursor {
  Object getCursor();
}

public record FindMessageResponseDTO(
    UUID id,
    Instant createdAt,
    Instant updatedAt,
    List<FindBinaryContentResult> attachments,
    String content,
    UUID channelId,
    FindUserResult author
) implements HasCursor {

  @Override
  public Object getCursor() {
    return createdAt;
  }
}

Service

  // When there is no cursor
  public Slice<FindMessageResult> findAllByChannelIdInitial(UUID channelId, int limit) {
    // Always page 0; the requested limit is the page size
    Pageable pageable = PageRequest.of(0, limit);
    Slice<Message> messages = messageRepository.findAllByChannelIdInitial(channelId, pageable);
    return messages.map(messageMapper::toFindMessageResult);
  }

  // When a cursor is present
  public Slice<FindMessageResult> findAllByChannelIdAfterCursor(UUID channelId, Instant cursor,
      int limit) {
    // Pageable is used only to apply the limit to the JPQL query
    Pageable pageable = PageRequest.of(0, limit);
    Slice<Message> messages = messageRepository.findAllByChannelIdAfterCursor(channelId, cursor,
        pageable);
    return messages.map(messageMapper::toFindMessageResult);
  }

Repository

  // Query used when there is no cursor: the newest page
  @Query("select m from Message m"
      + " where m.channel.id = :channelId"
      + " order by m.createdAt DESC")
  Slice<Message> findAllByChannelIdInitial(@Param("channelId") UUID channelId, Pageable pageable);


  // Query used when a cursor is present: messages older than the cursor
  @Query("select m from Message m"
      + " where m.channel.id = :channelId"
      + " and m.createdAt < :cursor" // the first request has no cursor and uses the query above
      + " order by m.createdAt DESC")
  Slice<Message> findAllByChannelIdAfterCursor(@Param("channelId") UUID channelId,
      @Param("cursor") Instant cursor, Pageable pageable);
  // Cursor paging in this order needs a composite index on channelId + createdAt

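On the first request the cursor does not exist, so the query without the cursor condition is used; the condition is applied only when a cursor is present.
Depending on how the frontend builds the scroll, DESC vs ASC and < cursor vs > cursor may differ.
(Here the newest page is rendered first and the next page is shown when the user scrolls up; that is, messages older than the oldest message of the current page are loaded and rendered above it.)
Cursor paging in this order needs a composite index on channelId + createdAt ASC.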

 CREATE INDEX idx_channel_id_created_at_asc
 ON messages(channel_id, created_at ASC);

Hibernate: 
    /* select
        m 
    from
        Message m 
    where
        m.channel.id = :channelId 
        and m.createdAt > :cursor 
    order by
        m.createdAt ASC */ 
select
    m1_0.id,
    m1_0.author_id,
    m1_0.channel_id,
    m1_0.content,
    m1_0.created_at,
    m1_0.updated_at 
from
    messages m1_0 
where
    m1_0.channel_id=? 
    and m1_0.created_at>? 
order by
    m1_0.created_at 
fetch first ? rows only

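The generated SQL filters on the cursor and fetches only the first rows, with no offset, so cursor-based paging works as intended. (This log was captured with the ascending variant of the query: createdAt > :cursor, ORDER BY createdAt ASC.)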
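
The whole round trip (first page without a cursor, then repeated requests with the previous nextCursor until hasNext is false) can be sketched in memory. The example assumes that every created_at value is distinct, since a strict < comparison would skip rows that share the cursor's timestamp; CursorPage and the method names are hypothetical.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class CreatedAtCursorSketch {

    // Hypothetical page shape, loosely following PageResponse (content, nextCursor, hasNext).
    record CursorPage(List<Instant> content, Instant nextCursor, boolean hasNext) {}

    // Emulates "WHERE created_at < :cursor ORDER BY created_at DESC" with limit + 1 rows
    // fetched so hasNext can be derived. A null cursor returns the newest page.
    static CursorPage page(List<Instant> createdAts, Instant cursor, int limit) {
        List<Instant> rows = createdAts.stream()
                .filter(t -> cursor == null || t.isBefore(cursor))
                .sorted(Comparator.reverseOrder())
                .limit(limit + 1L)
                .toList();
        boolean hasNext = rows.size() > limit;
        List<Instant> content = hasNext ? rows.subList(0, limit) : rows;
        Instant next = content.isEmpty() ? null : content.get(content.size() - 1);
        return new CursorPage(content, next, hasNext);
    }

    // Client-side loop: keep requesting with the previous nextCursor until hasNext is false.
    static List<Instant> readAll(List<Instant> createdAts, int limit) {
        List<Instant> seen = new ArrayList<>();
        Instant cursor = null;
        CursorPage current;
        do {
            current = page(createdAts, cursor, limit);
            seen.addAll(current.content());
            cursor = current.nextCursor();
        } while (current.hasNext());
        return seen;
    }

    public static void main(String[] args) {
        Instant base = Instant.parse("2025-04-18T09:15:00Z");
        List<Instant> createdAts = new ArrayList<>();
        for (int i = 0; i < 5; i++) createdAts.add(base.plusSeconds(i));
        System.out.println(readAll(createdAts, 2)); // newest to oldest, each timestamp once
    }
}
```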

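Summary

When to use offset-based

Data changes so rarely that duplicate rows are not a concern
The list is not shown to regular users, so a duplicated row is not a real problem
There is no reason for a search engine to index it, for users to jump to the last page, or for links to old data to be shared
The row count is small enough that performance is not a concern
In these cases there is no need to consider cursor-based paging. Service pages with many users should use cursor-based paging consistently to avoid performance issues, while back-office or admin pages can use the more convenient offset-based approach.

When to use cursor-based

For almost every other list, cursor-based pagination is the better choice.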

Reference - https://0soo.tistory.com/130
