Nov 19, 2018

Hibernate Acceleration by Snapshots

A Google search for “Hibernate and performance” yields innumerable articles describing how to fix performance issues. This post is not yet another list of tuning tips. Instead, we will demonstrate how to remove bottlenecks in a Hibernate project by using the JPA support in Spring Boot in tandem with in-JVM-memory snapshots, which can provide speedups of orders of magnitude. We use the Sakila sample database throughout this post. The database contains, among other things, films and actors and the relations between them.

A Straightforward Hibernate and Spring Application

A straightforward way of interacting with a relational database is to use the Spring Data JPA support and let Spring handle dependency injection and project setup. This allows for a pure Java implementation without any XML code to set up the ORM. For example, a properly annotated plain Film class is all that is needed to map the database table of films into the Java object domain.

@Entity
public class Film {
   @Id
   @Column(name="film_id")
   private int id;
   private String title;
   private String description;
   private int releaseYear;

   // ... more fields, getters and setters, etc.
}

Perhaps the most straightforward way of retrieving such entities is by means of a Repository

@Repository
public interface FilmRepository extends CrudRepository<Film, Integer> {
}

which allows us to write an application with minimal boilerplate that operates on Films with code such as the following.

@SpringBootApplication
public class HibernateSpeedmentApplication implements CommandLineRunner {

   public static void main(String[] args) {
       SpringApplication.run(HibernateSpeedmentApplication.class, args);
   }

   @Autowired
   FilmRepository filmRepository;

   // ... code using the filmRepository
}

Here Spring injects the filmRepository, which can then be used as follows to stream over all films and sum their lengths.


public Long getTotalLengthHibernate() {
   return StreamSupport.stream(filmRepository.findAll().spliterator(), false)
       .mapToLong(com.example.hibernatespeedment.data.Film::getLength)
       .sum();
}

Clearly, this is an inefficient way of summing all the film lengths, since it entails fetching all film entities to the JVM and then summing a single property. Since we just retrieve a single value and are not interested in updating any data we would be better off with a Data Transfer Object that only contains the film length. That would require us to write some code that SELECTs the length column server side in the database. When we realize that we want some of the logic of this operation to be moved to the database, it makes a lot of sense to compute the whole sum in the database instead of transferring the film lengths. We then arrive at the following piece of code.


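public Long getTotalLengthHibernateQuery() {
   EntityManager em = entityManagerFactory.createEntityManager();
   try {
       // JPQL aggregate; SUM over an integral field yields a Long
       Query query = em.createQuery("SELECT SUM(f.length) FROM Film f");
       return (Long) query.getSingleResult();
   } finally {
       em.close(); // application-managed entity managers must be closed
   }
}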

Now the application logic contains an explicit split of work between the JVM and the database, including a query string that is more or less opaque to the compiler.
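
For comparison, the intermediate Data Transfer Object step mentioned above could look roughly like the following sketch. The FilmLength class and the FilmLengthDemo helper are hypothetical names introduced here for illustration, and the rows are hard-coded stand-ins; how the DTOs would actually be populated from the database is left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical DTO: carries only the one column the computation needs.
final class FilmLength {
    private final int length;

    FilmLength(int length) {
        this.length = length;
    }

    int getLength() {
        return length;
    }
}

final class FilmLengthDemo {
    // Sums the lengths on the JVM side, as the DTO approach would.
    static long totalLength(List<FilmLength> lengths) {
        return lengths.stream()
            .mapToLong(FilmLength::getLength)
            .sum();
    }

    public static void main(String[] args) {
        // Stand-in rows; a real application would read these from the database.
        List<FilmLength> rows = Arrays.asList(
            new FilmLength(86), new FilmLength(48), new FilmLength(50));
        System.out.println(totalLength(rows)); // prints 184
    }
}
```

Compared with full entities, each row now carries a single int, but every length is still transferred to the JVM, which is why pushing the SUM to the database is the better choice.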

Remove a Bottleneck with Speedment

While the stream construct with which we started out in the section above was very inefficient, it is appealing in that it abstracts away the details of the database operations. The ORM Speedment has a Streams-based API that allows stream operations to be efficient. The Speedment-based application code is very similar to the Hibernate example, with the exception that the Repository is replaced by a Manager and this manager provides streams of entities. Thus, the corresponding Speedment application code would be as follows.

@Autowired
FilmManager filmManager;

public Long getTotalLengthSpeedment() {
   return filmManager.stream()
       .mapToLong(Film.LENGTH.asLong())
       .sum();
}

There are several advantages of deciding on the SQL details at runtime rather than in the application code, including type safety and lower maintenance cost for a more concise business logic code base. The perhaps most prominent advantage of the clean abstraction from database operations, however, is that it allows the runtime to provide acceleration. As a matter of setup configuration and with no modification of any application logic, an optional plugin to the Speedment runtime allows partial snapshots of the database to be prefetched to an in-memory data store, providing several orders of magnitude application speedup without rewriting any part of the application logic.
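
To make the snapshot idea concrete, here is a minimal plain-Java sketch. It is not how Speedment implements its in-memory store; the Snapshot class and the counting supplier that stands in for the database are assumptions made for illustration. The point it shows is that the source is consulted once, after which any number of stream queries are answered from memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.stream.Stream;

// Hypothetical illustration of a snapshot; not Speedment's implementation.
final class Snapshot<T> {
    private final List<T> rows;

    // Calls the loader exactly once and keeps an immutable copy in memory.
    Snapshot(Supplier<List<T>> loader) {
        this.rows = Collections.unmodifiableList(new ArrayList<>(loader.get()));
    }

    // Every stream is served from memory; the loader is never called again.
    Stream<T> stream() {
        return rows.stream();
    }
}

final class SnapshotDemo {
    public static void main(String[] args) {
        AtomicInteger roundTrips = new AtomicInteger();
        // Stand-in for a database query; counts simulated round trips.
        Supplier<List<Integer>> database = () -> {
            roundTrips.incrementAndGet();
            return Arrays.asList(86, 48, 50);
        };

        Snapshot<Integer> lengths = new Snapshot<>(database);
        long total = lengths.stream().mapToLong(Integer::longValue).sum();
        long longest = lengths.stream().mapToLong(Integer::longValue).max().orElse(0);

        // Two queries, one round trip.
        System.out.println(total + " " + longest + " " + roundTrips.get()); // prints 184 86 1
    }
}
```

Because the application code only sees a stream, swapping the live source for the snapshot does not change the business logic, which is the property the paragraph above relies on.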

For this particular example, the Query based Hibernate solution was approximately 5 times faster than the naive approach of streaming over the full set of entities. The Speedment powered solution returned a result 50 times faster than the Query based Hibernate solution, that is, roughly 250 times faster than the naive approach. If you try it out, your mileage may vary depending on setup. Still, an in-memory snapshot can be expected to be orders of magnitude faster than round-tripping to the database with an explicit query, which in turn will be significantly faster than fetching the full table to the JVM as the naive implementation does.

Coexistence - Using the Right Tool for the Job

While in-memory acceleration does deliver unparalleled speed, it is no panacea. For some tables, a speedup of several orders of magnitude may not yield any noticeable effect on the overall application. For other tables, querying an in-memory snapshot of data may not be acceptable due to transactional dependencies. For example, operating on a snapshot may be perfect for the whole dataset in a business intelligence system, a dashboard of KPIs or a tool for exploring historical trade data. On the other hand, the resulting balance after a bank account deposit needs to be immediately visible to the online bank, and thus serving the bank account from a snapshot would be a terrible idea.

In many real-world scenarios one would need a solution where some data is served from a snapshot while data from other tables is always fetched from the database. In such a common hybrid case, the code that directly fetches data from the database may use the same Speedment API as the snapshot querying code, but for projects already using Hibernate it works perfectly well to combine Speedment and Hibernate.
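
The trade-off behind the hybrid case can be sketched in a few lines of plain Java. AccountTable is a hypothetical stand-in for a live database table, not part of Hibernate or Speedment; the example only shows why a deposit must be read through the live path while a report can tolerate a point-in-time copy.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a live database table of account balances.
final class AccountTable {
    private final Map<String, Long> balances = new HashMap<>();

    void deposit(String account, long amount) {
        balances.merge(account, amount, Long::sum);
    }

    // Live read: always reflects the latest deposit.
    long liveBalance(String account) {
        return balances.getOrDefault(account, 0L);
    }

    // Snapshot read model: a point-in-time copy that later deposits do not touch.
    Map<String, Long> snapshot() {
        return new HashMap<>(balances);
    }
}

final class HybridDemo {
    public static void main(String[] args) {
        AccountTable accounts = new AccountTable();
        accounts.deposit("alice", 100);

        Map<String, Long> reportView = accounts.snapshot(); // e.g. for a KPI dashboard
        accounts.deposit("alice", 50);

        System.out.println(accounts.liveBalance("alice")); // prints 150
        System.out.println(reportView.get("alice"));       // prints 100 (stale by design)
    }
}
```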

Since both Hibernate and Speedment rely on JDBC under the hood, they will ultimately use the same driver and may therefore work in tandem in an application. In a Hibernate powered application, the decision to move to Speedment for bottlenecks can therefore be local to the parts of the application that will benefit the most. The rest of the Hibernate application will coexist with the code that leverages Speedment.

It is easy to try this out for yourself. The Sakila database is Open Source and freely available for download. Speedment is available in a free version; use the Initializer to download it.




Note: For Spring autowiring of the Speedment FilmManager to work, we also need a configuration class, which may look as follows.

@Configuration
public class Setup {
    @Bean
    public SakilaApplication createApplication() {
        SakilaApplication app = new SakilaApplicationBuilder()
            .withBundle(DataStoreBundle.class)
            .withUsername("sakila")
            .withPassword("sakila")
            .build();
        app.getOrThrow(DataStoreComponent.class).load();
        return app;
    }

    @Bean
    public FilmManager createFilmManager(SakilaApplication app) {
        return app.getOrThrow(FilmManager.class);
    }
}
