JPA Query Optimization Chronicles3 (Eng)

yboy·2024년 8월 24일
0

Learning Log 

목록 보기
36/41
post-thumbnail

Motivation

To summarize the query improvements made in our ongoing project, I’ve decided to document the process. So, let’s kick off Query Optimization Chronicles Series 3!

Contents

Problematic Logic - N+1

Service

public CoachReservationsResponse findByCoachId(Long coachId) {
        validateCoachId(coachId);
        List<Reservation> reservations = reservationRepository.findAllByCoachIdAndStatusNot(coachId, DONE);
        return classifyReservationsAndReturnDto(reservations);
    }

Repository

@Query("SELECT r FROM Reservation AS r "
        + "INNER JOIN r.schedule AS s "
        + "INNER JOIN s.coach AS c "
        + "ON c.id = :coachId "
        + "WHERE r.reservationStatus NOT IN :status")
List<Reservation> findAllByCoachIdAndStatusNot(Long coachId, ReservationStatus status);

The above logic is for the main view that allows coaches to check their reservation list. The JPQL query used here is problematic.

The query retrieves only the IDs of the reservations and the associated entities. The reason for fetching just the IDs is that the strategy for loading related entities is set to lazy loading. (It would be the same even if set to eager loading.) However, when returning the response, detailed information for these related entities (Crew, Schedule, Coach) is also fetched using these IDs. The part where this fetching occurs is...

private static CoachReservationDtoWithSheetStatus from(Reservation reservation) {
        return new CoachReservationDtoWithSheetStatus(
                reservation.getId(),
                reservation.getScheduleDateTime(),
                reservation.getCrew().getId(),
                reservation.getCrew().getName(),
                reservation.getCrew().getImage(),
                reservation.getSheetStatus()
        );
    }

As shown above, this is the internal DTO logic for providing the response to be displayed on the main view. When using the JPQL query to obtain the Reservation entities, additional queries are triggered at the point of calling getXXX().getXXX() on the associated entities. This results in n extra queries being executed.

The above shows the N additional queries that were actually executed on the development server. The number of extra queries increases in proportion to the number of reservations the coach has, which is a risky situation where the number of queries is dependent on the number of reservations.

Solution - Applied a Fetch Join

after Applying a Fetch Join

@Query("SELECT r FROM Reservation AS r "
            + "JOIN FETCH r.schedule AS s "
            + "JOIN FETCH r.crew AS cr "
            + "INNER JOIN s.coach AS co "
            + "ON co.id = :coachId "
            + "WHERE r.reservationStatus NOT IN :status")
    List<Reservation> findAllByCoachIdAndStatusNot(Long coachId, ReservationStatus status);

The simplest way to resolve the N+1 problem is by applying a Fetch Join. Let's briefly review what Fetch Join is.

Fetch Join

  • A feature provided in JPQL for performance optimization.
  • It prevents lazy loading from occurring.
  • It reduces the number of SQL calls, thereby optimizing performance.

Fetch Join is the simplest and most effective method in JPQL for resolving the N+1 problem.

The reason Fetch Join is used for performance optimization is that it retrieves not only the IDs of associated entities but also their column (field) values in a single query.

In contrast, Inner Join or Outer Join only fetches the IDs of associated entities in the initial query. When lazy loading is applied, additional queries (e.g., select * from Coach where id = ?) are issued when the associated entity is accessed (e.g., calling getXXX() on the entity). With eager loading, additional queries are immediately executed after the initial query, using the IDs fetched in the initial query.

After applying the Fetch Join, the queries executed, as shown above, retrieve not only the IDs of the associated entities but also all their field information. As a result, the N+1 problem was effectively resolved.

Cautions with Fetch Join

While Fetch Join provides significant performance benefits, it is not a panacea. Fetch Join is effective when maintaining an object graph, but if you need to join multiple tables to produce results that differ from the entity's structure, it is more effective to retrieve only the necessary fields from the tables and return them as a DTO, rather than forcing a Fetch Join. Overusing Fetch Join can lead to performance issues, as it might fetch excessive amounts of unnecessary data all at once, potentially causing performance degradation.

0개의 댓글