How do we solve N + 1 in JPA?

uncle.ra · October 22, 2023

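📗 Before we begin

Last time, in "What is N + 1 in JPA and why does it occur?", we looked at what the problem is and where it comes from.
This time, let's see how the N + 1 problem can be solved. 🤔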

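📗 A moment to think

Why does the N + 1 problem occur? As discussed last time,

it is because a JPA entity query, by default, fetches only the entity being queried.
In other words, whether the association is eager or lazy, the associated child entities are not loaded by that same single query.

So how can we solve N + 1?

Either load everything at once (a single query),
or, after the first query, issue (N / BatchSize, rounded up) additional queries instead of N additional queries.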
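The difference between these options comes down to how many SELECT statements are issued. As a rough sketch (plain arithmetic, no JPA involved; the class and method names are made up for illustration), the counts can be compared like this:

```java
// Simplified arithmetic only; nothing here talks to JPA or a database.
final class QueryCountSketch {

    // 1 query for the parents + 1 query per parent for its children.
    static int nPlusOne(int parents) {
        return 1 + parents;
    }

    // A single joined query loads parents and children together.
    static int fetchJoin() {
        return 1;
    }

    // 1 query for the parents + one IN query per batch of parent ids.
    static int batched(int parents, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return 1 + (parents + batchSize - 1) / batchSize; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println("N+1:        " + nPlusOne(40));    // 41
        System.out.println("fetch join: " + fetchJoin());     // 1
        System.out.println("batched:    " + batched(40, 20)); // 3
    }
}
```

Note the rounding: 41 parents with a batch size of 20 would need 3 child queries, so 4 queries in total.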

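📗 Loading everything at once with fetch join

fetch join

A JPQL feature provided for performance optimization that loads the associated child entities in the same query. Let's first solve the N+1 problem with fetch join.

Let's reuse Post and Comment from the previous post.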


	@Test
	void checkNPlusOneProblemByEntityManager() {

		// Create postA
		Post postA = new Post("postA", "postA");
		em.persist(postA);

		// Create postB
		Post postB = new Post("postB", "postB");
		em.persist(postB);

		// Create comment1 for postA
		Comment comment1 = new Comment("comment1");
		comment1.addPost(postA);
		em.persist(comment1);

		// Create comment2 for postA
		Comment comment2 = new Comment("comment2");
		comment2.addPost(postA);
		em.persist(comment2);

		// Create comment3 for postB
		Comment comment3 = new Comment("comment3");
		comment3.addPost(postB);
		em.persist(comment3);

		em.flush(); // Write the pending changes to the DB
		em.clear(); // Clear the persistence context

		// Use fetch join to load the child comments together with each Post
		List<Post> foundPosts = em.createQuery("select p from Post p join fetch p.comments", Post.class)
			.getResultList();

		for (Post post : foundPosts) {
			System.out.println("post.getContent() = " + post.getContent() + " / " + post.getComments().size());
		}
		assertThat(foundPosts.size()).isEqualTo(2);
	}

Running the test above shows that a single SELECT loads everything, including the child Comment entities.

Hibernate: 
    select
        p1_0.id,
        c1_0.post_id,
        c1_0.comment_id,
        c1_0.content,
        p1_0.content,
        p1_0.title 
    from
        post p1_0 
    join
        comment c1_0 
            on p1_0.id=c1_0.post_id

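Wait! Shouldn't a fetch join over the OneToMany relationship between Post and Comment produce duplicate data?

I also expected that fetch joining an xToMany relationship would return duplicate Post entries, because the join multiplies the rows, and tried it with that in mind. But no duplicates appeared. 😳
I found the reason in the Hibernate 6.0 docs.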

From Hibernate ORM 6, distinct is always passed to the SQL query and the flag QueryHints#HINT_PASS_DISTINCT_THROUGH has been removed.

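In other words, from Hibernate ORM 6 on, a distinct in the query is always passed through to the SQL, and the QueryHints#HINT_PASS_DISTINCT_THROUGH flag no longer exists. The same migration guide also states that Hibernate 6 always filters duplicate parent entity references itself when a child collection is join fetched, so distinct is no longer needed for that purpose.
I ran the test on Hibernate 6.2.9, which is why the data comes back correctly even without any explicit duplicate handling.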
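To make the duplicate issue concrete, here is a minimal sketch in plain Java (no Hibernate involved; `JoinRowDedupSketch` and `Row` are hypothetical names) of what the joined rows look like for postA with two comments and postB with one, and what de-duplicating the parents means. It only models the idea, not how Hibernate implements it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class JoinRowDedupSketch {

    // One row of the joined result: (post id, comment id).
    static final class Row {
        final long postId;
        final long commentId;

        Row(long postId, long commentId) {
            this.postId = postId;
            this.commentId = commentId;
        }
    }

    // Parent ids as they appear in the result set: one entry per joined row.
    static List<Long> parentsPerRow(List<Row> rows) {
        List<Long> result = new ArrayList<>();
        for (Row row : rows) {
            result.add(row.postId);
        }
        return result;
    }

    // Parent ids with repeats removed, keeping first-seen order.
    static List<Long> distinctParents(List<Row> rows) {
        return new ArrayList<>(new LinkedHashSet<>(parentsPerRow(rows)));
    }

    public static void main(String[] args) {
        // postA (id 1) has comments 1 and 2, postB (id 2) has comment 3.
        List<Row> rows = List.of(new Row(1, 1), new Row(1, 2), new Row(2, 3));
        System.out.println(parentsPerRow(rows));   // [1, 1, 2]
        System.out.println(distinctParents(rows)); // [1, 2]
    }
}
```

The first list is what a naive row-by-row mapping would return (postA twice); the second is the shape of the result the test above observed.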

So can't we just fetch join every association?

No. fetch join has its limits too.

Limit 1. MultipleBagFetchException

💡 Two or more xToMany child collections declared as List cannot be loaded together with fetch join.

To test this, let's add a Tag entity that, like Comment, has a OneToMany relationship with Post.

// Post.java

@Entity
@NoArgsConstructor(access = AccessLevel.PROTECTED)
@Getter
@Setter
public class Post {
	@Id
	@GeneratedValue(strategy = GenerationType.IDENTITY)
	private Long id;

	@Lob
	private String content;

	private String title;

	@OneToMany(mappedBy = "post", fetch = FetchType.LAZY)
	private List<Comment> comments = new ArrayList<>();

	@OneToMany(mappedBy = "post", fetch = FetchType.LAZY)
	private List<Tag> tags = new ArrayList<>();

	public Post(String content, String title) {
		this.content = content;
		this.title = title;
	}

	/*
	 * Association helper method
	 * */
	public void addComment(Comment comment) {
		this.getComments().add(comment);
		comment.addPost(this);
	}

	/*
	 * Association helper method
	 * */
	public void addTag(Tag tag) {
		this.getTags().add(tag);
		tag.addPost(this);
	}
}

// Tag.java
@Entity
@NoArgsConstructor(access = AccessLevel.PROTECTED)
@Getter
@Setter
public class Tag {

	@Id
	@GeneratedValue(strategy = GenerationType.IDENTITY)
	private Long id;

	@Lob
	private String content;

	public Tag(String content) {
		this.content = content;
	}

	@ManyToOne
	@JoinColumn(name = "post_id")
	private Post post;

	@Override
	public String toString() {
		return "Tag{" +
			"id=" + id +
			", content='" + content + '\'' +
			'}';
	}

	public void addPost(Post post) {
		this.post = post;
	}

}

Now let's write the test code.

	@Test
	void checkNPlusOneProblemByEntityManagerWithTag() {

		// Create postA
		Post postA = new Post("postA", "postA");
		em.persist(postA);

		// Create postB
		Post postB = new Post("postB", "postB");
		em.persist(postB);

		// Create comment1 for postA
		Comment comment1 = new Comment("comment1");
		comment1.addPost(postA);
		em.persist(comment1);

		// Create comment2 for postA
		Comment comment2 = new Comment("comment2");
		comment2.addPost(postA);
		em.persist(comment2);

		// Create comment3 for postB
		Comment comment3 = new Comment("comment3");
		comment3.addPost(postB);
		em.persist(comment3);

		// Create tag1 for postA
		Tag tag1 = new Tag("tag1");
		tag1.addPost(postA);
		em.persist(tag1);

		// Create tag2 for postA
		Tag tag2 = new Tag("tag2");
		tag2.addPost(postA);
		em.persist(tag2);

		// Create tag3 for postB
		Tag tag3 = new Tag("tag3");
		tag3.addPost(postB);
		em.persist(tag3);

		em.flush();
		em.clear();

		try {
			// Query that loads Post together with both of its OneToMany children, Comment and Tag
			List<Post> foundPosts = em.createQuery("select p from Post p join fetch p.comments join fetch p.tags",
					Post.class)
				.getResultList();
		} catch (MultipleBagFetchException e) {
			e.printStackTrace();
		}
	}

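Here we simply queried Post and used fetch join to load both of its OneToMany children, Comment and Tag, in one go. The result is shown below.

MultipleBagFetchException

When is a MultipleBagFetchException thrown?

I found the answer by referring to this post.

First, let's look at what a Bag is.

A Bag is a collection that allows duplicates and has no order.
The Java Collections Framework does not define a Bag, so Hibernate treats a List (one without an @OrderColumn) as a Bag.

So my understanding is that MultipleBagFetchException is thrown when two or more xToMany child collections declared as List are loaded with fetch join in the same query.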

To check this, let's change the type of the tags field to a Set and try again.

@Entity
@NoArgsConstructor(access = AccessLevel.PROTECTED)
@Getter
@Setter
public class Post {
	@Id
	@GeneratedValue(strategy = GenerationType.IDENTITY)
	private Long id;

	@Lob
	private String content;

	private String title;

	@OneToMany(mappedBy = "post", fetch = FetchType.LAZY)
	private List<Comment> comments = new ArrayList<>();

	/*
	 * Changed from List to Set to check
	 * when MultipleBagFetchException is thrown
	 * */
	@OneToMany(mappedBy = "post", fetch = FetchType.LAZY)
	private Set<Tag> tags = new HashSet<>();

	public Post(String content, String title) {
		this.content = content;
		this.title = title;
	}

	/*
	 * Association helper method
	 * */
	public void addComment(Comment comment) {
		this.getComments().add(comment);
		comment.addPost(this);
	}

	/*
	 * Association helper method
	 * */
	public void addTag(Tag tag) {
		this.getTags().add(tag);
		tag.addPost(this);
	}

}

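With this change, the test above runs without the exception. Note, however, that the SQL still joins both collections, so for each post the result set contains one row per comment-tag combination.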
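Why two bags are a problem can be illustrated without Hibernate. The following is a simplified model in plain Java (the class `TwoCollectionJoinSketch` and its methods are hypothetical, not Hibernate internals): joining one post with two collections pairs every comment with every tag, and a collection that allows duplicates cannot tell the repeated rows from real duplicates, whereas a Set collapses them.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class TwoCollectionJoinSketch {

    // Rows produced by joining one post with both of its collections:
    // every comment is paired with every tag.
    static List<String[]> joinedRows(List<String> comments, List<String> tags) {
        List<String[]> rows = new ArrayList<>();
        for (String comment : comments) {
            for (String tag : tags) {
                rows.add(new String[] {comment, tag});
            }
        }
        return rows;
    }

    // Collecting one column into a List (bag semantics) keeps every repeat.
    static List<String> asBag(List<String[]> rows, int column) {
        List<String> bag = new ArrayList<>();
        for (String[] row : rows) {
            bag.add(row[column]);
        }
        return bag;
    }

    // Collecting the same column into a Set collapses the repeats.
    static Set<String> asSet(List<String[]> rows, int column) {
        return new LinkedHashSet<>(asBag(rows, column));
    }

    public static void main(String[] args) {
        List<String[]> rows = joinedRows(List.of("comment1", "comment2"), List.of("tag1", "tag2"));
        System.out.println(rows.size());    // 4 rows for 2 comments x 2 tags
        System.out.println(asBag(rows, 0)); // [comment1, comment1, comment2, comment2]
        System.out.println(asSet(rows, 0)); // [comment1, comment2]
    }
}
```

The row count grows multiplicatively with each additional collection, which is worth keeping in mind even when the Set version runs without an exception.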

Limit 2. Do not use a paging query when fetch joining xToMany child entities.

At first I was going to write "you cannot use a paging query!", but I changed it to "you should not use a paging query!".
To see why, let's again run a test with Post and Comment.

	@Test
	void checkPagingQuery() {

		// Create 20 Posts
		for (int i = 1; i <= 20; i++) {
			Post post = new Post("post" + i, "post" + i);
			em.persist(post);
		}

		for (Post post : postRepository.findAll()) {
			// Create 3 comments for each post
			for (int i = 1; i <= 3; i++) {
				Comment comment = new Comment("comment" + i);
				comment.addPost(post);
				em.persist(comment);
			}
		}

		// when & then

		List<Post> foundPosts = em.createQuery("SELECT p FROM Post p JOIN FETCH p.comments", Post.class)
			.setFirstResult(0)
			.setMaxResults(10)
			.getResultList();

		assertThat(foundPosts.size()).isEqualTo(10);
	}

For the test, I created 20 Posts with 3 Comments each. Querying them with join fetch and paging, the test passes without complaint.

🤔 Doesn't a passing test mean it works fine?

But let's look at the query that was executed.

Hibernate: 
    select
        p1_0.id,
        c1_0.post_id,
        c1_0.comment_id,
        c1_0.content,
        p1_0.content,
        p1_0.title 
    from
        post p1_0 
    join
        comment c1_0 
            on p1_0.id=c1_0.post_id

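Where did limit and offset go?!
This is exactly why a paging query should not be used here.

💡 When paging is combined with a fetch join over an xToMany relationship, Hibernate loads every Post and Comment first and only then applies the paging in memory.

That is why I said it should not be used.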
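One way to see why a SQL-level limit cannot simply be applied to the joined rows is to model the result set in plain Java. The sketch below is only an illustration (the class `CollectionPagingSketch` and its methods are hypothetical and do not reflect Hibernate's implementation): with 20 posts and 3 comments each, the first 10 joined rows cover only 4 posts, so a correct page of 10 posts requires loading all rows and cutting the page afterwards.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class CollectionPagingSketch {

    // Joined rows for `posts` posts with `commentsPerPost` comments each;
    // each entry is the post id carried by one joined row.
    static List<Integer> joinedPostIds(int posts, int commentsPerPost) {
        List<Integer> rows = new ArrayList<>();
        for (int postId = 1; postId <= posts; postId++) {
            for (int c = 0; c < commentsPerPost; c++) {
                rows.add(postId);
            }
        }
        return rows;
    }

    // Row-level limit: the posts a LIMIT on the joined rows would cover.
    static Set<Integer> postsInFirstRows(List<Integer> rows, int maxRows) {
        return new LinkedHashSet<>(rows.subList(0, Math.min(maxRows, rows.size())));
    }

    // Entity-level paging in memory: read all rows, de-duplicate, then cut the page.
    static List<Integer> pageInMemory(List<Integer> rows, int first, int max) {
        List<Integer> distinct = new ArrayList<>(new LinkedHashSet<>(rows));
        int from = Math.min(first, distinct.size());
        int to = Math.min(from + max, distinct.size());
        return distinct.subList(from, to);
    }

    public static void main(String[] args) {
        List<Integer> rows = joinedPostIds(20, 3);
        System.out.println(rows.size());                       // 60 joined rows
        System.out.println(postsInFirstRows(rows, 10).size()); // 4 posts (the 4th incomplete)
        System.out.println(pageInMemory(rows, 0, 10).size());  // 10 posts, after reading all 60 rows
    }
}
```

The in-memory variant returns the right page, but only after every row has been transferred from the database, which is the cost the test above hides.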

Conversely, does the paging make it into the SQL when fetch joining an xToOne relationship?

I tested it with Comment.

List<Comment> foundComments = em.createQuery("SELECT c FROM Comment c JOIN FETCH c.post", Comment.class)
	.setFirstResult(0)
	.setMaxResults(10)
	.getResultList();

Hibernate: 
    select
        c1_0.comment_id,
        c1_0.content,
        p1_0.id,
        p1_0.content,
        p1_0.title 
    from
        comment c1_0 
    join
        post p1_0 
            on p1_0.id=c1_0.post_id offset ? rows fetch first ? rows only

For an xToOne relationship, we can see that the paging is applied in the SQL itself.

📗 Issuing additional queries with a WHERE IN clause after the first query

Given these limits, you might think N+1 cannot be fully solved after all, but there is another option.

After the first query, use a WHERE IN clause so that the children are loaded in (N / BatchSize, rounded up) additional queries rather than N.

For the test, I created 40 Posts with 3 Comments each.
Then I added one property to src/test/resources/application.yml.

spring:
  jpa:
    properties:
      hibernate.default_batch_fetch_size: 20

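The hibernate.default_batch_fetch_size property
collects the queries that lazy loading would issue for child entities, up to the configured batch size, and runs them as a single query with an IN clause.
(With eager loading, this IN-clause batching is not available.)

Now let's check with the test code below how the queries are executed.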

	@Test
	void testWhereIn() {

		// Create 40 Posts
		for (int i = 1; i <= 40; i++) {
			Post post = new Post("post" + i, "post" + i);
			em.persist(post);
		}

		for (Post post : postRepository.findAll()) {
			// Create 3 comments for each post
			for (int i = 1; i <= 3; i++) {
				Comment comment = new Comment("comment" + i);
				comment.addPost(post);
				em.persist(comment);
			}
		}

		em.flush(); // Write the pending changes to the DB
		em.clear(); // Clear the persistence context

		// when & then

		List<Post> foundPosts = em.createQuery("SELECT p FROM Post p", Post.class).getResultList();

		for (Post post : foundPosts) {
			// Touch the lazily loaded comments so that they are actually fetched
			System.out.println("comment size: " + post.getComments().size());
		}

		assertThat(foundPosts.size()).isEqualTo(40);
	}

Let's check the SELECT queries issued when the test runs.

// Fetch all Posts
Hibernate: 
    select
        p1_0.id,
        p1_0.content,
        p1_0.title 
    from
        post p1_0
        
// Fetch the Comments of 20 Posts
Hibernate: 
    select
        c1_0.post_id,
        c1_0.comment_id,
        c1_0.content 
    from
        comment c1_0 
    where
        c1_0.post_id in (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
        
// comment size: 3
// ...

// Fetch the Comments of 20 Posts
Hibernate: 
    select
        c1_0.post_id,
        c1_0.comment_id,
        c1_0.content 
    from
        comment c1_0 
    where
        c1_0.post_id in (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)

1 query for all Posts,
2 Comment queries, each covering 20 Posts,
for a total of 1 + (40 / 20) => 1 + 2 => 3 queries.
With N+1, it would have been 1 + 40 => 41 queries.
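The grouping behind those two Comment queries can be sketched in plain Java. This is a simplified model under stated assumptions, not Hibernate's code: `BatchInClauseSketch`, `partition`, and `inClause` are hypothetical names, and Hibernate may size or pad its batches differently from the even split shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class BatchInClauseSketch {

    // Split the parent ids into consecutive chunks of at most batchSize.
    static List<List<Long>> partition(List<Long> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<Long>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    // Build a WHERE clause with one placeholder per id in the batch.
    static String inClause(int idCount) {
        return "where c1_0.post_id in (" + String.join(",", Collections.nCopies(idCount, "?")) + ")";
    }

    public static void main(String[] args) {
        List<Long> postIds = new ArrayList<>();
        for (long id = 1; id <= 40; id++) {
            postIds.add(id);
        }
        // 40 ids with a batch size of 20 give 2 batches, i.e. 2 child queries.
        for (List<Long> batch : partition(postIds, 20)) {
            System.out.println(batch.size() + " ids: " + inClause(batch.size()));
        }
    }
}
```

Each batch corresponds to one child query, so the total is the single parent query plus the number of batches.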

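📗 Summary

There are two broad approaches to solving the N + 1 problem.

Loading the associated child entities together in a single query
Using a WHERE IN clause so that 1 + (N / BatchSize, rounded up) queries are executed

If fetch join alone can load what you need in one query, use it.
But fetch join has its limits.

When there are two or more xToMany child collections declared as List, a MultipleBagFetchException is thrown.
When xToMany child entities are loaded with fetch join, using the paging API causes a performance problem because the paging happens in memory.

Because of these two limits, there is the alternative of using a WHERE IN clause to load the data in 1 + (N / BatchSize, rounded up) queries.

Combine the two appropriately to solve N + 1.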