Site icon SpringHow

N+1 Selects problem in Hibernate and How to Avoid it

Let us talk about the infamous N+1 selects problem in hibernate entities and how to solve it with some examples. You may have also heard it as an N+1 queries problem in some places.

What is the N+1 selects problem?

The N+1 selects problem is a performance anti-pattern where an application makes N+1 small queries to the database instead of 1 query fetching all the required data.

Let us take an example of blog posts and their comments. In this system, we have Post and Comment entities. Each comment has a reference to their respective posts. This means that the relationship from Post to Comment entities follows one to many mapping.

If you are thinking of Comment to Post as Many to one relationship, you are also correct.

The table values for these two entities look like below.

insert into post (id,title,content,author) values(1, 'Some title 1','Some long content 1','John Doe' );
insert into post (id,title,content,author) values(2, 'Some title 2','Some long content 2','John Doe' );
insert into post (id,title,content,author) values(3, 'Some title 3','Some long content 3','John Doe' );
insert into post (id,title,content,author) values(4, 'Some title 4','Some long content 4','John Doe' );
insert into post (id,title,content,author) values(5, 'Some title 5','Some long content 5','John Doe' );

insert into comment (id, content, author, post_id) values (1, 'some comment on post1 ', 'John Doe', 1);
insert into comment (id, content, author, post_id) values (2, 'some comment on post1 ', 'John Doe', 1);
insert into comment (id, content, author, post_id) values (3, 'some comment on post2 ', 'John Doe', 2);
insert into comment (id, content, author, post_id) values (4, 'some comment on post3 ', 'John Doe', 3);
insert into comment (id, content, author, post_id) values (5, 'some comment on post4 ', 'John Doe', 4);
insert into comment (id, content, author, post_id) values (6, 'some comment on post4 ', 'John Doe', 4);
insert into comment (id, content, author, post_id) values (7, 'some comment on post4 ', 'John Doe', 4);
insert into comment (id, content, author, post_id) values (8, 'some comment on post5 ', 'John Doe', 5);Code language: SQL (Structured Query Language) (sql)

As you can see, a single post contains multiple comments, and each comment points to a post. So let’s set up a spring boot project with these entities and query them from the database.

public interface CommentRepository extends JpaRepository<Comment, Integer> {

    List<Comment> findAll();
}Code language: PHP (php)

If you want to learn more about jpa repositories, read the introduction to Spring JPA.

Also, I wrote a command-line runner so I can trigger the query on application startup.

@SpringBootApplication
public class SpringBootNPlusOneApplication implements CommandLineRunner {
    public static final Logger logger = LoggerFactory.getLogger(SpringBootNPlusOneApplication.class);

    @Autowired
    CommentRepository commentRepository;


    public static void main(String[] args) {
        SpringApplication.run(SpringBootNPlusOneApplication.class, args);
    }

    @Transactional
    @Override
    public void run(String... args) throws Exception {
        logger.info("Finding all comment objects");
        List<Comment> comments = commentRepository.findAll();
        for (Comment comment : comments) {
            logger.info("Comment [{}] from Post [{}]",comment.getContent(), comment.getPost().getTitle());
        }

    }
}
Code language: Java (java)

With all of these setups, let us enable SQL logs in the spring boot application and run our application.

As you can see, the underlying hibernate framework is creating one query to load all comments. But it also loads each post on a separate query.

Imagine you have 2000 comments and 500 posts. So There would be one query that takes 2000 comments from the database. But there will also be 500 more queries to the database for fetching each post referenced by those comments. In the real world, that’s too much work for the database. Also, it would take more network round-trip, which in turn, increases the overall processing time. This is what we call an N+1 query problem or N+1 selects problem in hibernate.

How to solve N+1 queries problem?

The cause of the N+1 problem is the eager loading nature of Hibernate. Hibernate will always load the @ManyToOne children by default (FetchType.EAGER). So it will load each post object if it already hasn’t loaded it from the database.

Let’s check out the possible solution for hibernate’s N+1 select problem and how they may fit.

At initial thought, you can lazy load the child. So to do this set the annotation as @ManyToOne(fetch = FetchType.LAZY). Under the hood, lazy loading creates proxy objects for child objects. As and when we access the child objects, hibernate will fire the queries and load them. With that being said, this may look like a good idea, but lazy loading is not the perfect solution.

This is due to the underlying logic behind lazy loading. When lazy loading is enabled, hibernate will create proxies for the “Comment.post” fields. When the post field is accessed, hibernate will fill the proxy with values from the database. Yes, a query will still happen for each Post object.

We can see this behavior in the logs as well.

As you see here, We still see queries to the POST table. But it happens as we loop through the comment object to access comment.getPost().getTitle().

The solution

So how do we solve the N+1 problem? For this, we need to get back to some basics of SQL. When we have to load data from two separate tables, we can use joins. So instead of using multiple queries, we can write a single query like below.

select *
from comment c
         inner join post p on c.post_id = p.idCode language: SQL (Structured Query Language) (sql)

Similarly, we can join fetch results in Spring Data JPA. To do that, you need to add a custom @Query annotation. Here is the repository after the changes.

public interface CommentRepository extends JpaRepository<Comment, Integer> {

    @Query("select c from Comment c join fetch c.post")
    List<Comment> findAll();

}Code language: PHP (php)

As you can see here, we are using the join fetch keyword to load the post object in the same query. Let’s run our application again and check the results.

eliminating N+1 selects problem in hibernate

As you see in the screenshot, there is only one query to the database. It makes an inner join to the post table in the same query. Also, no new queries went to DB while processing the comments.

How to find N+1 Problems?

The easiest way to find N+1 problems is through SQL logs. It is easier to show SQL logs in Spring Boot and analyze them. So while testing your code, make sure you enable trace logs to find such occurrences of N+1.

The systematic approach for finding the N+1 problem would be to start looking at your entity mappings and repository queries. Whenever there is a list of objects from the database, you might want to check the business logic if their child entities are accessed. If yes, then you should opt-in for “join fetch”.

Why you should avoid N+1 problems?

The reasons are simple.

  1. N+1 problems create more queries to database. This means database will be overloaded.
  2. More queries to the database impacts the performance of the database as well as the application server. Each query and processing the response requires more CPU cycles.
  3. Each extra query would increase the over all processing time. As each query need to send and receive query and results, the processing time increases proportionately.
  4. Longer processing time means more open connections from connection pools. This impacts other requests as they have to wait longer to get connections.

Summary

To wrap it up, We learned what the N+1 problem in hibernate is and how to solve them using join fetch. If you want to try out these examples, you can always find the code in our GitHub Account.

Related

Exit mobile version