In version 5.2, Hibernate moved to Java 8 as its baseline. In keeping with the functional paradigm that Java 8 introduced with lambdas and streams, Hibernate 5.2 also supports handling a query result set as a stream. Admittedly a small addition to the API, streams add significant value by allowing the Hibernate user to leverage stream parallelism and functional programming without creating any custom adaptors.
This post elaborates on the superficially small but fundamentally important streams feature added in Hibernate 5.2 and then discusses how the Java 8 stream ORM Speedment takes the functional paradigm further by removing the language barrier, thus enabling a clean declarative design.
The following text will assume general knowledge of relational databases and the concept of ORM in particular. Without a basic knowledge of Java 8 streams and lambdas the presentation will probably seem overly abstract since basic features will be mentioned without further elaboration.
Imperative Processing of a Query Result
The table we use is a table of Hares, where a Hare has a name and an id.
CREATE TABLE `hare` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `name` varchar(45) NOT NULL,
    PRIMARY KEY (`id`)
);
To avoid discussing the query language per se, we use an example of a simplistic HQL query that creates a result set containing all the contents of a table of the database. The naïve approach to finding the item we are looking for would be to iterate over the data of the table as follows.
List<Hare> hares = session.createQuery("SELECT h FROM Hare h", Hare.class).getResultList();
for (Hare hare : hares) {
    if (hare.getId() == 1) {
        System.out.println(hare.getName());
    }
}
Note how the design of the query result handling is fully imperative. The implementation clearly states a step-by-step instruction of how to iterate over the elements and what to do with each element. At the end of the day, when it is time to run the program, all programs are in a sense imperative since the processor will need a very explicit sequence of instructions to execute. The imperative approach to programming may therefore seem the most intuitive.
Declaring the Goal, Receiving the Path
In contrast to the imperative design, the declarative approach focuses on what is to be done rather than on how to do it. This not only tends to create more concise and elegant programs, but also introduces a fundamental advantage: it allows the computer to figure out the transition from what to how. Sometimes without even thinking about it, many programmers are used to this approach in the realm of relational databases, since the query language SQL is one of the most popular instances of declarative programming. Relieved of the details of exactly how the database engine will retrieve the data, the designer can focus on what data to get, and then of course what to do with it after it is retrieved.
Java 8 streams and lambdas allow for a declarative approach to handling collections of data. Instead of listing a sequence of instructions to be carried out, the user of a stream first creates a pipeline of abstract operations to be carried out and when presented with a terminated pipeline, the stream implementation will figure out the imperative details.
Even before Hibernate 5.2, our running example could be ported to the Java 8 domain of streams by just adding a simple method call in the chain of operations since the List itself has a stream method.
List<Hare> hares = session.createQuery("SELECT h FROM Hare h", Hare.class).getResultList();
hares.stream()
    .filter(h -> h.getId() == 1)
    .forEach(h -> System.out.println(h.getName()));
While this example may seem similar to the imperative iteration in the previous design, the fundamental difference is that this program first creates a representation of the operations to be carried out and then lazily evaluates it. Thus, nothing actually happens to the items of the List until the pipeline is complete and its terminal operation (here forEach) is invoked. We express what we want in terms of a functional composition of basic operations but do not lock down any decisions about how to execute the resulting function.
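The lazy evaluation can be observed with plain JDK streams. The following is a minimal sketch, not tied to Hibernate: the class name LazyPipelineDemo and the logging list are made up for this illustration. It records when the filter actually runs relative to the point where the pipeline has been built.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class LazyPipelineDemo {

    /** Builds a pipeline, logs every filter invocation, and returns the log. */
    public static List<String> run() {
        List<String> log = new ArrayList<>();
        List<Integer> ids = Arrays.asList(1, 2, 3);

        // Building the pipeline only describes the work; the filter is not invoked here.
        Stream<Integer> pipeline = ids.stream()
            .filter(id -> {
                log.add("filter " + id);
                return id == 1;
            });
        log.add("pipeline built");

        // The terminal operation triggers the actual evaluation.
        pipeline.forEach(id -> log.add("found " + id));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The first entry of the log is "pipeline built", and the filter entries only appear after it, which shows that the filter ran when forEach was called and not when filter was added to the chain.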
Since a major feature of functional programming is the compositional design, a more typical streams approach would be to chain stepwise operations on the data. To extract the name of the item, we may map the getter on the stream as follows.
List<Hare> hares = session.createQuery("SELECT h FROM Hare h", Hare.class).getResultList();
hares.stream()
    .filter(h -> h.getId() == 1)
    .map(Hare::getName)
    .forEach(System.out::println);
Streaming a Result Set
With Hibernate 5.2, the query itself can produce a stream. The following minimal change in code has the important advantage of not loading the entire table into an intermediate representation from which to source the stream.
session.createQuery("SELECT h FROM Hare h", Hare.class).stream()
    .filter(h -> h.getId() == 1)
    .map(Hare::getName)
    .forEach(System.out::println);
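To make the difference from the List-based version concrete, here is a small sketch using only the JDK. FakeCursor is a hypothetical stand-in for a database cursor and is not part of Hibernate; it simply counts how many rows have been pulled. The sketch also closes the stream with try-with-resources, on the assumption that a stream backed by an open cursor holds resources that should be released when processing is done.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Optional;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class CursorStreamDemo {

    /** Hypothetical stand-in for a database cursor yielding the ids 1..size on demand. */
    public static final class FakeCursor implements Iterator<Integer>, AutoCloseable {
        private final int size;
        private int fetched = 0;
        private boolean closed = false;

        public FakeCursor(int size) { this.size = size; }

        @Override public boolean hasNext() { return fetched < size; }

        @Override public Integer next() {
            if (!hasNext()) throw new NoSuchElementException();
            return ++fetched;
        }

        @Override public void close() { closed = true; }

        public int fetched() { return fetched; }
        public boolean isClosed() { return closed; }
    }

    /** Wraps the cursor in a sequential stream; closing the stream closes the cursor. */
    public static Stream<Integer> stream(FakeCursor cursor) {
        Spliterator<Integer> rows =
            Spliterators.spliteratorUnknownSize(cursor, Spliterator.ORDERED);
        return StreamSupport.stream(rows, false).onClose(cursor::close);
    }

    public static void main(String[] args) {
        FakeCursor cursor = new FakeCursor(1_000_000);
        Optional<Integer> hit;
        try (Stream<Integer> ids = stream(cursor)) {
            // Rows are pulled one at a time; findFirst stops as soon as a match is seen.
            hit = ids.filter(id -> id == 2).findFirst();
        }
        System.out.println("hit=" + hit
            + ", rows fetched=" + cursor.fetched()
            + ", closed=" + cursor.isClosed());
    }
}
```

Because the stream pulls rows on demand, a short-circuiting operation such as findFirst never touches the remaining rows, whereas a List-based source would have been fully materialized before the first filter ran.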
Selection by the Source
The optimization desperately needed for this code is of course to adjust the query to allow the database to create a result set closer to the desired result of the operation. Focusing on just filtering the rows of the database and leaving the extraction of the columns to the JVM, the now familiar code snippet can be updated to the following.
session.createQuery("SELECT h FROM Hare h WHERE h.id = 1", Hare.class).stream()
    .map(Hare::getName)
    .forEach(System.out::println);
Note that this short piece of a program contains two declarative parts that require separate design with different kinds of considerations. Since the program is divided between what happens before and after the stream is created, any optimization will have to consider what happens on both sides of that barrier.
While this is indeed considerably more elegant than the first example (which admittedly, for pedagogical reasons, was designed to showcase potential for improvement rather than to represent a real solution to a problem), the barrier poses a fundamental problem in terms of declarative design. It can rightfully be claimed that the program is still an imperative program composed of two declarative subroutines: first execute the query and then execute the Java part of the program. We may choose to refer to this as the language barrier, since the interface between the two declarative languages creates a barrier across which functional abstraction will not take place.
Enter Speedment - Going Fully Declarative
The functional approach of streams discussed above brings several advantages:
- the seamless generalization to parallelism (expressing a design as a pipeline of operations is a great starting point for building a set of parallel pipes),
- design by composition (reuse and modularization of code is encouraged by a paradigm of composing solutions as a composition of smaller operations),
- higher order functions (behavior expressed as lambdas can be used as language entities such as parameters to methods) and
- declarative programming (the application designer focuses on what is needed, the framework or stream primitives design determines the details about how, allowing lazy evaluation and shortcuts).
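Several of these points can be seen together in a few lines of plain JDK code. The sketch below is only an illustration under assumed names (CompositionDemo, select and the sample constants are invented here): small predicates and functions are composed into larger ones, passed to a method as parameters, and the same pipeline is run either sequentially or in parallel.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class CompositionDemo {

    // Small reusable building blocks expressed as values.
    public static final Predicate<String> NOT_EMPTY = s -> !s.isEmpty();
    public static final Predicate<String> SHORT = s -> s.length() <= 5;
    public static final Function<String, String> TRIM = String::trim;
    public static final Function<String, String> UPPER = String::toUpperCase;

    /** Higher-order method: the behavior arrives as parameters. */
    public static List<String> select(List<String> names,
                                      Function<String, String> transform,
                                      Predicate<String> keep,
                                      boolean parallel) {
        // The pipeline is identical in both modes; only the source of the stream differs.
        return (parallel ? names.parallelStream() : names.stream())
            .map(transform)
            .filter(keep)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(" Harry ", "", "Henrietta", "Bo");
        // Composition: larger operations are built from smaller ones.
        Function<String, String> normalize = TRIM.andThen(UPPER);
        Predicate<String> keep = NOT_EMPTY.and(SHORT);

        System.out.println(select(names, normalize, keep, false));
        System.out.println(select(names, normalize, keep, true));
    }
}
```

Since the pipeline only states what should happen to each element, switching to parallel execution does not require any change to the operations themselves.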
We have shown how the new Hibernate API of version 5.2 adds basic support for streams, which allows for a declarative approach to describing the operations applied to the dataset retrieved from the database. While this is a fundamental insight and improvement, the Hibernate design with a foundation in an explicit query language limits the reach of the declarative features of the resulting programs due to the language barrier constituted by the interface between two languages.
The logical next step along the path from imperative to declarative design would be to break the language barrier, and that is what the Java stream ORM Speedment does.
In the Speedment framework, the resulting SQL query is the responsibility of the framework. Thus, a program leveraging Speedment does not use any explicit query language. Instead, all the data operations are expressed as a pipeline of operations on a stream of data and the framework will create the SQL query. Returning to our example, a Speedment based design could be expressed as follows.
hares.stream()
    .filter(h -> h.getId() == 1)
    .map(Hare::getName)
    .forEach(System.out::println);
The hares manager is the source of the stream of Hares. No SQL will be run or even created until the pipeline of operations is terminated. In the general case, the Speedment framework cannot optimize a SQL query followed by lambda filters since the lambda may contain any functionality. Therefore, the executed SQL query for this example will be a query for all data in the Hares table since the behavior of the first filter cannot be analysed by the framework. To allow the framework to optimize the pipeline, there is a need for a data structure representing the operations in terms of basic known building blocks instead of general lambda operations. This is supported by the framework and is expressed in a program as follows.
hares.stream()
    .filter(Hare.ID.equal(1))
    .map(Hare.NAME.getter())
    .forEach(System.out::println);
The pipeline of operations is now a clean data structure declaratively describing the operations without any opaque runnable code, in contrast to a filter with a lambda. As a result, the SQL query that will be run is no longer a selection of all items of the table, but instead a query of the type "SELECT * FROM hare WHERE ID=1". By removing the language barrier, a fully declarative design is achieved. The program states "Find me the names of the hares of the database with ID 1" and it is up to the Speedment framework and the database engine to cooperate in figuring out how to turn that program into a set of instructions to execute.
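The distinction between an opaque lambda and a predicate that doubles as a data structure can be sketched with plain Java. The code below is a toy illustration and makes no claim about how Speedment is implemented: the classes Hare and IdEquals and the method toSql are hypothetical names invented for this example.

```java
import java.util.function.Predicate;

public class DescribedPredicateDemo {

    /** Minimal row type for the illustration. */
    public static final class Hare {
        private final int id;
        private final String name;

        public Hare(int id, String name) { this.id = id; this.name = name; }

        public int getId() { return id; }
        public String getName() { return name; }
    }

    /** A filter that can be executed in the JVM and also inspected as data. */
    public static final class IdEquals implements Predicate<Hare> {
        private final int value;

        public IdEquals(int value) { this.value = value; }

        @Override public boolean test(Hare hare) { return hare.getId() == value; }

        // The value is an int, so plain concatenation is safe in this toy example;
        // a real framework would be expected to use bound parameters instead.
        public String toSqlCondition() { return "id = " + value; }
    }

    /** Builds the query a framework could issue for the given filter. */
    public static String toSql(Predicate<Hare> filter) {
        String base = "SELECT id, name FROM hare";
        if (filter instanceof IdEquals) {
            // Known building block: the condition can be pushed to the database.
            return base + " WHERE " + ((IdEquals) filter).toSqlCondition();
        }
        // Opaque lambda: nothing is known about it, so all rows must be
        // fetched and the filter applied in the JVM.
        return base;
    }

    public static void main(String[] args) {
        System.out.println(toSql(h -> h.getId() == 1));
        System.out.println(toSql(new IdEquals(1)));
    }
}
```

Both filters behave identically when applied to a Hare in the JVM, but only the second one exposes enough structure for the surrounding code to translate it into a WHERE clause.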
This discussion uses a very simplistic example to illustrate a general point. Please see the Speedment API Quick Start for more elaborate examples of what the framework can do.
Edit: This text is also published at DZone: Streams in Hibernate and Beyond.