Gradle Natives Plugin Update

19 September 2016 ~ blog, java, groovy, gradle

A few years ago I wrote a post about my Gradle-Natives plugin, called "Going Native with Gradle". The plugin was the result of some failed attempts at game programming and it pretty much stopped there; however, it seems there are some users who found it useful. In the years since it was written, it sat, collected bugs, and recently became useless due to external library changes and the rigidity of the plugin functionality. Trying to be a good open source citizen, I figured it was a good time to do some updates that will hopefully keep the plugin viable for a while.

With the new release, I figured it would be good to go back to the original post and attempt a similar example. Using the example code from the LWJGL Getting Started Guide we have:

import org.lwjgl.*;
import org.lwjgl.glfw.*;
import org.lwjgl.opengl.*;

import static org.lwjgl.glfw.Callbacks.*;
import static org.lwjgl.glfw.GLFW.*;
import static org.lwjgl.opengl.GL11.*;
import static org.lwjgl.system.MemoryUtil.*;

public class HelloWorld {

	// The window handle
	private long window;

	public void run() {
		System.out.println("Hello LWJGL " + Version.getVersion() + "!");

		try {
			init();
			loop();

			// Free the window callbacks and destroy the window
			glfwFreeCallbacks(window);
			glfwDestroyWindow(window);
		} finally {
			// Terminate GLFW and free the error callback
			glfwTerminate();
			glfwSetErrorCallback(null).free();
		}
	}

	private void init() {
		// Setup an error callback. The default implementation
		// will print the error message in System.err.
		GLFWErrorCallback.createPrint(System.err).set();

		// Initialize GLFW. Most GLFW functions will not work before doing this.
		if ( !glfwInit() )
			throw new IllegalStateException("Unable to initialize GLFW");

		// Configure our window
		glfwDefaultWindowHints(); // optional, the current window hints are already the default
		glfwWindowHint(GLFW_VISIBLE, GLFW_FALSE); // the window will stay hidden after creation
		glfwWindowHint(GLFW_RESIZABLE, GLFW_TRUE); // the window will be resizable

		int WIDTH = 300;
		int HEIGHT = 300;

		// Create the window
		window = glfwCreateWindow(WIDTH, HEIGHT, "Hello World!", NULL, NULL);
		if ( window == NULL )
			throw new RuntimeException("Failed to create the GLFW window");

		// Setup a key callback. It will be called every time a key is pressed, repeated or released.
		glfwSetKeyCallback(window, (window, key, scancode, action, mods) -> {
			if ( key == GLFW_KEY_ESCAPE && action == GLFW_RELEASE )
				glfwSetWindowShouldClose(window, true); // We will detect this in our rendering loop
		});

		// Get the resolution of the primary monitor
		GLFWVidMode vidmode = glfwGetVideoMode(glfwGetPrimaryMonitor());
		// Center our window
		glfwSetWindowPos(
			window,
			(vidmode.width() - WIDTH) / 2,
			(vidmode.height() - HEIGHT) / 2
		);

		// Make the OpenGL context current
		glfwMakeContextCurrent(window);
		// Enable v-sync
		glfwSwapInterval(1);

		// Make the window visible
		glfwShowWindow(window);
	}

	private void loop() {
		// This line is critical for LWJGL's interoperation with GLFW's
		// OpenGL context, or any context that is managed externally.
		// LWJGL detects the context that is current in the current thread,
		// creates the GLCapabilities instance and makes the OpenGL
		// bindings available for use.
		GL.createCapabilities();

		// Set the clear color
		glClearColor(1.0f, 0.0f, 0.0f, 0.0f);

		// Run the rendering loop until the user has attempted to close
		// the window or has pressed the ESCAPE key.
		while ( !glfwWindowShouldClose(window) ) {
			glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); // clear the framebuffer

			glfwSwapBuffers(window); // swap the color buffers

			// Poll for window events. The key callback above will only be
			// invoked during this call.
			glfwPollEvents();
		}
	}

	public static void main(String[] args) {
		new HelloWorld().run();
	}

}

If we put this in a simple Gradle project with build.gradle as:

plugins {
    id 'com.stehno.natives' version '0.2.4'
    id 'java'
    id 'application'
}

version = "0.0.1"
group = "com.stehno"
mainClassName = 'hello.HelloWorld'

sourceCompatibility = 8
targetCompatibility = 8

repositories {
    jcenter()
}

dependencies {
    compile 'org.lwjgl:lwjgl:3.0.0'
    compile 'org.lwjgl:lwjgl-platform:3.0.0:natives-windows'
    compile 'org.lwjgl:lwjgl-platform:3.0.0:natives-linux'
    compile 'org.lwjgl:lwjgl-platform:3.0.0:natives-osx'
}

task wrapper(type: Wrapper) {
    gradleVersion = "2.14"
}

We can view the native libraries for all platforms using ./gradlew listNatives:

:listNatives
Native libraries found for configurations (compile, runtime)...
 - lwjgl-platform-3.0.0-natives-linux.jar:
        [LINUX] libjemalloc.so
        [LINUX] liblwjgl.so
        [LINUX] libglfw.so
        [LINUX] libopenal.so
 - lwjgl-platform-3.0.0-natives-osx.jar:
        [MAC] liblwjgl.dylib
        [MAC] libjemalloc.dylib
        [MAC] libglfw.dylib
        [MAC] libopenal.dylib
 - lwjgl-platform-3.0.0-natives-windows.jar:
        [WINDOWS] lwjgl.dll
        [WINDOWS] lwjgl32.dll
        [WINDOWS] OpenAL.dll
        [WINDOWS] jemalloc.dll
        [WINDOWS] glfw.dll
        [WINDOWS] glfw32.dll
        [WINDOWS] jemalloc32.dll
        [WINDOWS] OpenAL32.dll

and we can build and run the HelloWorld application with ./gradlew clean build run. This raises the question of whether the plugin is needed at all, since at this point the application works and we have not used the plugin. I will leave that to developers who actually work with this stuff and may use the plugin - I am just updating the existing functionality.

You can include the native libraries in the build using ./gradlew clean build includeNatives, which will unpack the native libraries into the project build directory.

There are still a number of configuration options available through the natives DSL extension, such as including and excluding libraries, as well as limiting the scan to certain configurations and platforms, but I will leave those for the official documentation. Without any additional configuration you get all of the native libraries from the compile and runtime configurations for all platforms unpacked into the build/natives directory.
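As a rough sketch of the shape of that DSL (the property names here are my assumptions based on the description above - see the official documentation for the real ones):

natives {
    configurations = ['compile']      // assumed: limit the scanned configurations
    platforms = ['windows', 'linux']  // assumed: limit the target platforms
    exclude = ['libjemalloc.so']      // assumed: skip specific libraries
}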

This plugin is still pretty raw, but hopefully it is useful enough to make some developers' lives easier.

HTTP Builder NG for Groovy and Java

18 September 2016 ~ blog, groovy

The HttpBuilder project has been a go-to library for the simplification of HTTP requests for years; however, development on the project has stalled and seemingly died. A friend of mine (Dave Clark) decided to pick up where the project left off and to bring it up-to-date with modern Groovy and Java 8 support. The HTTP Builder NG project is a major update and refactor of the original project. I joined on to help with development, documentation and testing. In my opinion, this effort has brought the library back from the dead, better than ever. In this post, I will walk through accessing a simple REST service using the HttpBuilder with both Groovy and Java examples - yes, the new version of the library supports standard Java 8 coding.

First, we need a REST service to work with. I have thrown together a simple set of endpoints using the Spark Web Framework to make a "message of the day" service. There are three endpoints:

  • GET /message - retrieves the current stored message

  • POST /message - saves the text field of the JSON body content as the new message

  • DELETE /message - deletes the current message

There is not much to it, but it should be enough to play with. You can find the code in the repo for this post (https://github.com/cjstehno/httpb-demo). Start up the server by running:

./gradlew run

in the root of the project. The server will be running on http://localhost:4567.
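For reference, the heart of such a service is only a few lines of Spark code. This is a sketch of the shape, not the actual code from the repo, and the JSON rendering details are assumptions:

import static spark.Spark.*

import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import spark.Route

def message = 'n/a'

// render the current message in the JSON shape shown later in the post (format assumed)
def render = { res ->
    res.type('application/json')
    JsonOutput.toJson(text: message, timestamp: new Date().format("yyyy-MM-dd'T'HH:mm:ssZ"))
}

get('/message', { req, res -> render(res) } as Route)

post('/message', { req, res ->
    message = new JsonSlurper().parseText(req.body()).text
    render(res)
} as Route)

delete('/message', { req, res ->
    message = 'n/a'
    render(res)
} as Route)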

Let’s start off by retrieving the current message from the server. We need a base configured HttpBuilder object to make requests from:

HttpBuilder http = HttpBuilder.configure {
    request.uri = 'http://localhost:4567'
}

Then, we need to make a GET request to the /message path:

def result = http.get {
    request.uri.path = '/message'
}

When you run this code, you will get the following:

[text:n/a, timestamp:2016-09-16T12:47:55+0000]

which is a Map of the parsed JSON data coming back from the server - the HttpBuilder recognizes the application/json response content and parses it for you. In this case all we really want is the text, so let’s transform the response data a bit:

String text = http.get(String){
    request.uri.path = '/message'
    response.success { FromServer from, Object body->
        body.text
    }
}

We have added an expected result type of String and a response.success() handler. This handler will be called when a successful response code is received (code < 400). When it is called, it will pull the text field out of our body object, which, in this case, is the already-parsed JSON data. The return value from the success() handler is returned as the result - the text of the current message. When you run this version of the code you get the current message text:

n/a

This is the default "empty" message content. How do we update the message to something more interesting? The service exposes POST /message which will take the text field of the request body content and use it as the new message. We can write a POST request just as easily as our GET request:

String updated = http.post(String) {
    request.uri.path = '/message'
    request.contentType = 'application/json'
    request.body = { text 'HttpBuilder is alive!' }
    response.success { FromServer from, Object body ->
        body.text
    }
}

Again, we will expect the text of the new message back from the server, but this time we are calling the post() method with a JSON content type. Note that our body content is using the Groovy JsonBuilder closure format; it could have just as easily been a Map of the data to be encoded. Similar to the response decoding, the request body is automatically encoded based on the content type.
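For example, the same body expressed as a Map would be:

request.body = [text: 'HttpBuilder is alive!']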

If you run the code now, you will get:

HttpBuilder is alive!

You could also call the get() method again and verify that it is the current message.
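There is also a response.failure() counterpart for the non-success cases; a minimal sketch (the statusCode property here assumes the FromServer accessor - check the API docs):

String text = http.get(String) {
    request.uri.path = '/message'
    response.failure { FromServer from, Object body ->
        "request failed: ${from.statusCode}"
    }
}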

As a final example with our service, let’s call the DELETE /message endpoint to reset the message back to its "empty" state. A DELETE request is just as simple:

String deleted = http.delete(String){
    request.uri.path = '/message'
    response.success { FromServer from, Object body ->
        body.text
    }
}

The result will be the new message after deletion:

n/a

which is the "empty" state.

One thing we notice now that we have written all of the verb calls is that there are a lot of similarities between them. They all call the same path and they all handle the successful response content in the same manner. I am not a fan of duplication, so we can move the common configuration up into the main configure method:

HttpBuilder http = HttpBuilder.configure {
    request.uri = 'http://localhost:4567/message'
    response.success { FromServer from, Object body ->
        body.text
    }
}

and our verb methods now contain only what they need to do their work:

String message = http.get(String) {}

String updated = http.post(String) {
    request.contentType = 'application/json'
    request.body = { text 'HttpBuilder is alive!' }
}

String deleted = http.delete(String) {}

Nice and clean. Now wait, I know, I promised something similar in plain old Java - well, Java 8 anyway… ok, you can do the same operations in Java with fairly similar expressiveness:

HttpBuilder http = HttpBuilder.configure(config -> {
    config.getRequest().setUri("http://localhost:4567/message");
    config.getResponse().success(new BiFunction<FromServer, Object, String>() {
        @Override public String apply(FromServer fromServer, Object body) {
            return ((Map<String, Object>) body).get("text").toString();
        }
    });
});

String message = http.get(String.class, config -> {});

System.out.println("Starting content: " + message);

// update the content

String updated = http.post(String.class, config -> {
    config.getRequest().setContentType("application/json");
    config.getRequest().setBody(singletonMap("text", "HttpBuilder works from Java too!"));
});

System.out.println("Updated content: " + updated);

// delete the content

String deleted = http.delete(String.class, config -> {});

System.out.println("Post-delete content: " + deleted);

Notice that the Java 8 lambdas make the syntax about as simple as the Groovy DSL. When you run this version of the client you get:

Starting content: n/a
Updated content: HttpBuilder works from Java too!
Post-delete content: n/a

In Java or Groovy, the library makes HTTP interactions much easier to work with. Check out the project and feel free to submit bug reports and feature requests, or even suggested details to be documented.

ND4J Matrix Math

12 August 2016 ~ blog, groovy

In my last post (Commons Math - RealMatrix), I discussed the matrix operations support provided by the Apache Commons Math API. In doing my research I also stumbled on a library that is much closer in functionality to the Python NumPy library (commonly used in Machine Learning examples). The ND4J library is a scientific computing library for the JVM, meant to be used in production environments, which means routines are designed to run fast with minimum RAM requirements.

The main draw for me was that its support for array-style element-by-element operations is much deeper than the matrix operations provided by the Apache Commons Math API, and much closer to what I was seeing in the Python code I was working with, which makes conversion simpler.

With NumPy in Python you can multiply two arrays such that the result is the multiplication of each value of the array by the corresponding value in the second array. This is not so simple with matrices (as shown in my last post). With ND4J, it becomes much simpler:

def arrA = Nd4j.create([1.0, 2.0, 3.0] as double[])
def arrB = Nd4j.create([2.0, 4.0, 6.0] as double[])
def arrC = arrA.mul(arrB)
println "$arrA + $arrB = $arrC"

will result in:

[1.00, 2.00, 3.00] * [2.00, 4.00, 6.00] = [ 2.00,  8.00, 18.00]

which is as we would expect from the Python case. ND4J also has the ability to do two-dimensional (matrix-style) arrays:

def matA = Nd4j.create([
    [1.0, 2.0, 3.0] as double[],
    [4.0, 5.0, 6.0] as double[]
] as double[][])
println "Matrix: $matA\n"

which will produce:

Matrix: [[1.00, 2.00, 3.00],
 [4.00, 5.00, 6.00]]
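Element-wise operations and true matrix multiplication both work on these as well. A quick sketch continuing with the matA defined above (mul is element-wise, mmul is matrix multiplication):

def doubled = matA.mul(2.0)                 // element-wise multiply by a scalar
def matProd = matA.mmul(matA.transpose())   // matrix multiplication: (2x3) x (3x2) = 2x2
println "Doubled: $doubled"
println "Product: $matProd"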

All of the other mathematical operations I mentioned in the previous post are available and with data structures that feel a lot more rich and refined for general use. This is barely scratching the surface of the available functionality. Also, the underlying code is native-based and has support for running on CUDA cores for higher performance. This library is definitely one to keep in mind for cases when you have a lot of array and matrix-based operations.

Apache Commons Math - RealMatrix

11 August 2016 ~ blog, groovy

I have been reading Machine Learning in Action, which is a great book; however, all of the examples are in Python. While Python seems like a decent language, it is not one of my primary languages. With that in mind, I have been converting the Python examples into Groovy so that a few months from now when I come back and try to understand what I learned, I will be able to decipher the code.

What I have found is that, at least in the examples for this book, there are numerous Matrix-based operations. My Linear Algebra was a long time ago, but I think I remember some of the basics; however, it’s nice to have a Java-based implementation available in the Apache Commons Math RealMatrix classes (there are others, but this is what I have been focusing on). It took me a little time to get back up to speed, especially since the Python examples will have something like:

someMatrix = someMatrix * anotherMatrix + thirdMatrix * number

where the resulting matrix is the product of two matrices added to the product of a matrix and a scalar number. Conceptually, this boils down to:

  1. Multiply someMatrix by anotherMatrix

  2. Multiply every element of thirdMatrix by the scalar number value

  3. Add every element of the matrix in step 1 with the element at the same position of the matrix from step 2.

Not hard to grasp, and not even all that hard to code; however, this is something I’d rather push off to someone else’s implementation instead of writing it myself. That is where the Commons Math library comes into play. The RealMatrix interface defines matrix support for double values - there is also a more flexible FieldMatrix<T> interface, but double values work well as an example. Let’s start by setting up a simple Groovy script for playing with matrices. Create a file named matrix-math.groovy and add the following to it:

matrix-math.groovy
@Grapes(
   @Grab('org.apache.commons:commons-math3:3.6.1')
)

import org.apache.commons.math3.linear.*

def mat = new Array2DRowRealMatrix(4,3)

println mat

This script will download the artifacts for the Apache Commons Math library and create a simple RealMatrix with 4 rows and 3 columns, which is then printed to the console. When you run it, you should see:

Array2DRowRealMatrix{{0.0,0.0,0.0},{0.0,0.0,0.0},{0.0,0.0,0.0},{0.0,0.0,0.0}}

which represents our empty 4x3 matrix. While this is not a bad representation, it would be nicer if we could see the rows and columns of data laid out for our poor human eyes. The library provides a RealMatrixFormat for this. Add the following to the script:

def formatter = new RealMatrixFormat('{', '}', '{', '}', ',\n ', ',\t')

println formatter.format(mat)

Note that the println line replaces the existing one. Now we get a better, more human-readable representation:

{{0,    0,      0},
 {0,    0,      0},
 {0,    0,      0},
 {0,    0,      0}}

Interesting so far, but we would really like some data. With the existing matrix, you can add data in rows or columns by index, similar to an array:

mat.setRow(0, [1.0, 2.0, 3.0] as double[])
mat.setColumn(1, [9.0, 8.0, 7.0, 6.0] as double[])

Now when you run the code, notice that you get the first row and the second column populated with the provided data:

{{1,    9,      3},
 {0,    8,      0},
 {0,    7,      0},
 {0,    6,      0}}

Also notice that the column data overwrote the row data we set for the second column (index 1). In our row data it was 2.0, but the 9.0 value from the column was applied after and is the final value. The other main method of creating a matrix is by providing the data directly in the constructor. Say we want to create a matrix with the same dimensions, but with a sequential collection of values, such as:

1  2  3
4  5  6
7  8  9
10 11 12

You can do the following in the code:

def seqMat = new Array2DRowRealMatrix([
    [1.0, 2.0, 3.0] as double[],
    [4.0, 5.0, 6.0] as double[],
    [7.0, 8.0, 9.0] as double[],
    [10.0, 11.0, 12.0] as double[]
] as double[][])

println formatter.format(seqMat)

This code creates a matrix with an array of arrays, where the inner arrays are the rows of data. When printed out, you get the following:

{{1,    2,      3},
 {4,    5,      6},
 {7,    8,      9},
 {10,   11,     12}}

Now, let’s do some operations on our matrices. You can do common math operations on two matrices. Adding two matrices:

def sum = mat.add(seqMat)
println formatter.format(sum)

This gives you the element-by-element sum of the values and yields:

{{2,    11,     6},
 {4,    13,     6},
 {7,    15,     9},
 {10,   17,     12}}

Subtracting one matrix from another:

def diff = seqMat.subtract(mat)
println formatter.format(diff)

Gives:

{{0,    -7,     0},
 {4,    -3,     6},
 {7,    1,      9},
 {10,   5,      12}}

Multiplication of matrices is not what you might intuitively think it is, unless you are up on your Linear Algebra. Since there are whole wiki pages devoted to Matrix Multiplication, I won’t go into it here beyond stating that it can be done when the number of columns in the first matrix matches the number of rows in the second (our two 4x3 matrices do not line up that way). Not being a tutorial on Linear Algebra, I am going to leave it at that.
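If you want to see it in action anyway, multiplying a matrix by its transpose always lines up; a quick sketch using the seqMat defined above:

def mmProd = seqMat.multiply(seqMat.transpose()) // (4x3) x (3x4) yields a 4x4 matrix
println formatter.format(mmProd)

You can also multiply a matrix by a scalar number: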

def prod = mat.scalarMultiply(2)
println formatter.format(prod)

Which multiplies every element by the given value and results in:

{{2,    18,     6},
 {0,    16,     0},
 {0,    14,     0},
 {0,    12,     0}}

Similarly, there is a scalarAdd(double) method.
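With add(), multiply() and scalarMultiply() in hand, the Python expression from the top of the post maps to a single chained call. A sketch, assuming someMatrix, anotherMatrix and thirdMatrix are RealMatrix instances with compatible dimensions:

someMatrix = someMatrix.multiply(anotherMatrix).add(thirdMatrix.scalarMultiply(number))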

Other useful operations may be performed on matrices. You can "transpose" the matrix:

def trans = seqMat.transpose()
println formatter.format(trans)

This rotates the values of the matrix to turn rows into columns, as in our example:

{{1,    2,      3},
 {4,    5,      6},
 {7,    8,      9},
 {10,   11,     12}}

becomes

{{1,    4,      7,      10},
 {2,    5,      8,      11},
 {3,    6,      9,      12}}

There are a handful of other built-in operations available to matrices that are probably useful if you know what you are doing, but at this point, I do not. Another useful construct is the set of "walker" methods that allow you to walk through the elements of the matrix in various ways, allowing you to modify the elements or simply read them. Let’s take our initial matrix as an example: first we will multiply every element by 2.0 in place, and then collect the resulting values into an external collection.

For the in-place modification we need a RealMatrixChangingVisitor:

class MultiplicationVisitor extends DefaultRealMatrixChangingVisitor {

    double factor

    double visit(int row, int column, double value){
        value * factor
    }
}

mat.walkInOptimizedOrder(new MultiplicationVisitor(factor:2.0))
println formatter.format(mat)

This visitor simply multiplies each value by the provided factor and returns it, which will update the value in the matrix. The resulting matrix has the following:

{{2,    18,     6},
 {0,    16,     0},
 {0,    14,     0},
 {0,    12,     0}}

You can also walk a matrix without the ability to change the internal values. This requires a RealMatrixPreservingVisitor:

class CollectingVisitor extends DefaultRealMatrixPreservingVisitor {

    List values = []

    void visit(int row, int column, double value){
        values << value
    }
}

def collectingVisitor = new CollectingVisitor()
mat.walkInOptimizedOrder(collectingVisitor)
println collectingVisitor.values

In this case, the values are collected into a list and no matrix value is modified. You get the following result:

[2.0, 18.0, 6.0, 0.0, 16.0, 0.0, 0.0, 14.0, 0.0, 0.0, 12.0, 0.0]

This contains a list of all the values from our original matrix after the previous visitor has modified it.

Matrix operations can seem quite complicated; however, they are not bad with a helpful library. So far the Commons Math API seems pretty useful for these more advanced math concepts.

The entire script for this tutorial is provided below for completeness:

matrix-math.groovy
@Grapes(
   @Grab('org.apache.commons:commons-math3:3.6.1')
)

import org.apache.commons.math3.linear.*

def formatter = new RealMatrixFormat('{', '}', '{', '}', ',\n ', ',\t')

def mat = new Array2DRowRealMatrix(4,3)
mat.setRow(0, [1.0, 2.0, 3.0] as double[])
mat.setColumn(1, [9.0, 8.0, 7.0, 6.0] as double[])

println formatter.format(mat)
println()

def seqMat = new Array2DRowRealMatrix([
    [1.0, 2.0, 3.0] as double[],
    [4.0, 5.0, 6.0] as double[],
    [7.0, 8.0, 9.0] as double[],
    [10.0, 11.0, 12.0] as double[]
] as double[][])

println formatter.format(seqMat)
println()

def sum = mat.add(seqMat)
println formatter.format(sum)
println()

def diff = seqMat.subtract(mat)
println formatter.format(diff)
println()

def prod = mat.scalarMultiply(2)
println formatter.format(prod)
println()

def trans = seqMat.transpose()
println formatter.format(trans)
println()

class MultiplicationVisitor extends DefaultRealMatrixChangingVisitor {

    double factor

    double visit(int row, int column, double value){
        value * factor
    }
}

mat.walkInOptimizedOrder(new MultiplicationVisitor(factor:2.0))
println formatter.format(mat)
println()

class CollectingVisitor extends DefaultRealMatrixPreservingVisitor {

    List values = []

    void visit(int row, int column, double value){
        values << value
    }
}

def collectingVisitor = new CollectingVisitor()
mat.walkInOptimizedOrder(collectingVisitor)
println collectingVisitor.values
println()

Gradle Dependencies Behind the Wall

10 July 2016 ~ blog, groovy, gradle

Some companies like to take full control of their build environments and disallow builds that pull artifacts from external sources, so that only approved internal artifact repositories containing only approved artifacts are used. While the validity of this is debatable, it exists and in my experience tends to add roadblocks to development, especially when working with new frameworks and libraries.

Consider the scenario where you are working on a project that uses a newer version of the Spring Framework than has been previously used in the company. Now you need to get the new Spring artifacts into your approved repository, which requires an issue ticket of some sort and at least one or two architects to approve it. I am sure I am not shocking you when I say that Spring has numerous dependencies if you are doing anything interesting with it, and they are all transitive. How do you get a list of the dependencies that you need to have added without an arduous cataloging of artifacts and their dependencies, or numerous iterations of the list-ticket-approval workflow (which is not generally speedy)? You write a Gradle plugin to do it for you.

I have added a checkAvailability task to my Dependency Checker Plugin. This task allows you to do your development work using the standard jcenter or mavenCentral artifact repositories so that you can get things working, but when you are ready to lock down your dependencies you can run:

./gradlew checkAvailability -PrepoUrls=http://artifacts.mycompany.com/repository

Which will list out the dependencies missing from the specified repository without affecting your build. The reported console entries will look something like:

Availability check for (commons-lang:commons-lang:2.1.2): FAILED

You can provide additional configuration to further customize the task:

checkAvailability {
    repoUrls = ['http://artifacts.mycompany.com/repository']
    configurations = ['runtime']
    ignored = ['com.something:thingifier:1.2.3']
    failOnMissing = true
}

This configuration will specify the default repoUrls to be used, which may still be overridden on the command line. The configurations property allows you to limit the dependency configurations searched (to only runtime in this case). The ignored property allows specified artifacts to be ignored even if they are missing. And finally, the failOnMissing property will, when set to true, cause the build to fail after reporting all the missing dependencies - the default is false so that it will only list the status of the dependencies and allow the build to continue.

Now, armed with a full list of the dependencies missing from your internal artifact repository, you can create your issue ticket and get the approvals once and get back to actual work faster.

Custom Spring Boot Shell Banner

25 March 2016 ~ blog, groovy, spring

I did a Groovy User Group talk recently related to my Spring Boot Remote Shell blog post, and while putting the talk together I stumbled across a bug in the integration between Spring Boot and the CRaSH shell (see Spring-Boot-3988). The custom banner you can add to your Spring Boot application (as /resources/banner.txt) is not applied by default to your CRaSH shell, so you get the boring Spring logo every time you start up the shell. I had worked with the CRaSH shell previously and remembered that the banner was customizable, so I did a little digging and figured out how to code a work-around - I also added this information to the bug ticket. I considered contributing a pull request, but I am not sure how this would be coded into the default application framework.

The work-around is pretty simple and straightforward if you have worked with the CRaSH shell before. You use its method of customization and then have it pull in your Spring Boot custom banner. In your /src/main/resources/commands directory you add a login.groovy file, which CRaSH will load with every shell connection. The file allows the customization of the banner and the prompt. We can then load our Spring banner from the classpath. The basic code is as follows:

login.groovy
welcome = { ->
    def hostName;
    try {
        hostName = java.net.InetAddress.getLocalHost().getHostName();
    } catch (java.net.UnknownHostException ignore) {
        hostName = 'localhost';
    }

    String banner = YourApplication.getResourceAsStream('/banner.txt').text

    return """
${banner}
Logged into $hostName @ ${new Date()}
"""
}

prompt = { ->
    return "% ";
}

It’s a silly little thing to worry about, but sometimes it’s the little things that make an application feel more like your own.

I have created a pull request in the spring-boot project to address this issue…​ we’ll see what happens.

Groovy Dependency Injection

19 March 2016 ~ blog, groovy

Dependency Injection frameworks were a dime a dozen for a while - everybody had their own and probably a spare just in case. For the most part the field has settled down to a few big players; the Spring Framework and Google Guice are the only two that come to mind. While both of these have their pluses and minuses, they both have a certain level of overhead in libraries and understanding. Sometimes you want to throw something together quickly, or you are in a scenario where you can’t use one of these off-the-shelf libraries. I had to do this recently, and while I still wanted to do something spring/guice-like, I could not use either of them - but I did have Groovy available.

Note
I want to preface the further discussion here to say that I am not suggesting you stop using Spring or Guice or whatever you are using now in favor of rolling your own Groovy DI - this is purely a sharing of information about how you can if you ever need to.

Let’s use as an example a batch application used to process some game scores and report on the min/max/average values. We will use a database (H2) just to show a little more configuration depth, and I will use the TextFileReader class from my Vanilla project to keep things simple and focused on DI rather than logic.

First, we need the heart of our DI framework, the configuration class. Let’s call it Config; we will also need a means of loading external configuration properties, and this is where our first Groovy helper comes in: the ConfigSlurper. The ConfigSlurper does what it sounds like: it slurps up a configuration file with a Groovy-like syntax and converts it to a ConfigObject. To start with, our Config class looks something like this:

class Config {
    private final ConfigObject config

    Config(final URL configLocation) {
        config = new ConfigSlurper().parse(configLocation)
    }
}

The backing configuration file we will use, looks like this:

inputFile = 'classpath:/scores.csv'

datasource {
    url = 'jdbc:h2:mem:test'
    user = 'sa'
    pass = ''
}

This will live in a file named application.cfg and as can be seen, it will store our externalized config properties.

Next, let’s configure our DataSource. Both Spring and Guice have a similar "bean definition" style and, I am sure based on those influences, I came up with something similar here:

@Memoized(protectedCacheSize = 1, maxCacheSize = 1)
DataSource dataSource() {
    JdbcConnectionPool.create(
        config.datasource.url,
        config.datasource.user,
        config.datasource.pass
    )
}

Notice that I used the @Memoized Groovy transformation annotation. This ensures that once the "bean" is created, the same instance is reused, and since I will only ever have one, I can limit the cache size and make sure it sticks around. As an interesting side-item, I created a collector annotation version of the memoized functionality and named it @OneInstance, since @Singleton was already taken.

@Memoized(protectedCacheSize = 1, maxCacheSize = 1)
@AnnotationCollector
@interface OneInstance {}

It just keeps things a little cleaner:

@OneInstance DataSource dataSource() {
    JdbcConnectionPool.create(
        config.datasource.url,
        config.datasource.user,
        config.datasource.pass
    )
}

Lastly, notice how the ConfigObject is used to retrieve the configuration property values - very clean and concise.

Next, we need an input file to read and a TextFileReader to read it, so we will configure those as well.

@OneInstance Path inputFilePath() {
    if (config.inputFile.startsWith('classpath:')) {
        return Paths.get(Config.getResource(config.inputFile - 'classpath:').toURI())
    } else {
        return new File(config.inputFile).toPath()
    }
}

@OneInstance TextFileReader fileReader() {
    new TextFileReader(
        filePath: inputFilePath(),
        firstLine: 2,
        lineParser: new CommaSeparatedLineParser(
            (0): { v -> v as long },
            (2): { v -> v as int }
        )
    )
}

I added a little configuration sugar so that you can define the input file as a classpath file or an external file. The TextFileReader is set up to parse the CSV data file as three columns of data: an id (long), a username (string) and a score (int). The data file looks like this:

# id,username,score
100,bhoser,4523
200,ripplehauer,235
300,jegenflur,576
400,bobknows,997

The last thing we need in the configuration is our service, which will do the data management and the stat calculations; we’ll call it the StatsService:

@TypeChecked
class StatsService {

    private Sql sql

    StatsService(DataSource dataSource) {
        sql = new Sql(dataSource)
    }

    StatsService init() {
        sql.execute('create table scores (id bigint PRIMARY KEY, username VARCHAR(20) NOT NULL, score int NOT NULL )')
        this
    }

    void input(long id, String username, int score) {
        sql.executeUpdate(
            'insert into scores (id,username,score) values (?,?,?)',
            id,
            username,
            score
        )
    }

    void report() {
        def row = sql.firstRow(
            '''
            select
                count(*) as score_count,
                avg(score) as average_score,
                min(score) as min_score,
                max(score) as max_score
            from scores
            '''
        )

        println "Count  : ${row.score_count}"
        println "Min    : ${row.min_score}"
        println "Max    : ${row.max_score}"
        println "Average: ${row.average_score}"
    }
}

I’m just going to dump it out there since it’s mostly SQL logic to load the data into the table and then report the stats out to the standard output. We will wire this in like the others in Config:

@OneInstance StatsService statsService() {
    new StatsService(dataSource()).init()
}

With that, our configuration is done. Now we need to use it in an application, which we’ll call Application:

class Application {

    static void main(args){
        Config config = Config.fromClasspath('/application.cfg')

        StatsService stats = config.statsService()
        TextFileReader reader = config.fileReader()

        reader.eachLine { Object[] line->
            stats.input(line[0], line[1], line[2])
        }

        stats.report()
    }
}

We instantiate a Config object, call the bean accessor methods and use the beans to do the desired work. I added the fromClasspath(String) helper method to simplify loading config from the classpath.
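A minimal version of that helper might look like this (a sketch; the real method may differ):

static Config fromClasspath(final String resourcePath) {
    new Config(Config.getResource(resourcePath))
}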

Like I said, this is no full-time replacement for a real DI framework; however, when I was in a pinch, this came in pretty handy and worked really well. Also, it was easy to extend the Config class in the testing source so that certain parts of the configuration could be overridden and mocked as needed during testing.

Note
The demo code for this post is on GitHub: cjstehno/groovy-di.

Dependency Duplication Checking

12 March 2016 ~ blog, groovy, gradle

Sometimes it takes a critical mass threshold of running into the same issue repeatedly to really do something about it. How often, when working with a dependency manager like Gradle or Maven, have you run into some runtime issue only to find that it was caused by a build dependency that you had two (or more) different versions of at runtime? More often than you would like, I am sure. It can be a real surprise when you actually go digging into your aggregated dependency list only to find out you have more than one duplicate dependency just waiting to become a problem.

What do I mean by duplicate dependency? Basically, it’s just what it sounds like: you have the same dependency with two different versions. Something like:

org.codehaus.groovy:groovy-all:2.4.4
org.codehaus.groovy:groovy-all:2.4.5

Most likely, your project defines one of them and some other dependency brought the other along for the ride. It is usually pretty easy to resolve these extra dependencies; in Gradle you can run the dependencies task to see which dependency is bringing the extra library in:

./gradlew dependencies > deps.txt

I like to dump the output to a text file for easier viewing. Then, once you find the culprit, you can exclude the transitive dependency:

compile( 'com.somebody:coollib:2.3.5' ){
    exclude group:'org.codehaus.groovy', module:'groovy-all'
}

Then you can run the dependencies task again to ensure that you got rid of it. Generally, this is a safe procedure; however, sometimes you get into a situation where different libraries depend on different versions that have significant code differences - that’s when the fun begins, and it usually ends in having to upgrade or downgrade various dependencies until you get a set that works and is clean.

What is the problem with having multiple versions of the same library in your project? Sometimes nothing, sometimes everything. The classloader will load whichever one is defined first in the classpath. If your project needs a class Foo with a method bar() and the version you expect to use has it but the previous version does not, bad things can happen at runtime.
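To make that concrete, here is the failure mode sketched out (Foo and bar() are made up for illustration):

// compiled against version 2.0 of the library, where bar() exists
Foo foo = new Foo()
foo.bar() // NoSuchMethodError at runtime if the 1.0 jar wins the classpath race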

Ok, now we know generally how to solve the multiple dependency problem, so we’re done, right? Sure, for a month or so. Unless your project is done and no longer touched, new dependencies and duplicates will creep in over time. I did this duplication purge on a project at work a few months ago, and just last week I took a peek at the aggregated dependency list and was truly not so shocked to see three duplicated libraries. One of which was probably the cause of some major performance issues we were facing. That’s what inspired me to solve the problem, at least to the point of letting you know when duplications creep in.

I created the dependency-checker Gradle plugin. It is available in the Gradle Plugin Repository. At this point, it has one added task, checkDependencies which, as the name suggests, searches through all the dependencies of the project to see if you have any duplicates within a configuration. If it finds duplicates, it will write them to the output log and fail the build.
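Applying it is the usual plugins block entry; note that the id and version shown here are assumptions - check the plugin’s listing in the Gradle Plugin Repository for the actual coordinates:

plugins {
    id 'com.stehno.dependency-checker' version '0.1.0' // id and version assumed
}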

Currently, you need to run the task for the checking to occur. I would like to get it to run with the default check task, or build task, but the code I had for that was not working - later version I guess. You can add that functionality into your own build by adding one or two lines to your build.gradle file:

tasks.check.dependsOn checkDependencies
tasks.build.dependsOn checkDependencies

These will make the appropriate tasks depend on the dependency check so that it will be run with every build - that way you will know right away that you have a potential problem.

I did take a tour around Google and the plugin repository just to make sure there was nothing else providing this functionality - so hopefully I am not duplicating anyone else’s work.

Vanilla TextFileReader/Writer

06 March 2016 ~ blog, groovy, vanilla

Something I have found myself doing quite often over my whole career as a developer is reading and writing simple text file data. Whether it is a quick data dump or a data set to be loaded from a 3rd party, it is something I end up doing a lot, and usually it is coded mostly from scratch since, surprisingly enough, there are very few tools available for working with formatted text files. Sure, there are a few for CSV, but quite often I get a request to read or write a format that is kind of similar to CSV, but just different enough that it breaks a standard CSV parser for whatever reason. Recently, I decided to add some utility components to my Vanilla project with the aim of making these readers and writers simpler to build.

Let’s start off with the com.stehno.vanilla.text.TextFileWriter and say we have a data source of Person objects in our application that the business wants dumped out to a text file (so they can import it into some business tools that only ever seem capable of importing simple text files). In the application, the data structure looks something like this:

class Person {
    String firstName
    String middleName
    String lastName
    int age
    float height
    float weight
}

With the TextFileWriter you need to define a LineFormatter, which will be used to format the generated lines of text, one per object written. The LineFormatter defines two methods: String formatComment(String) for formatting a comment line, and String formatLine(Object) for formatting a data line. A simple implementation is provided: the CommaSeparatedLineFormatter generates comment lines prefixed with a # and expects a Collection object, which it formats as a CSV line.

The available implementation will not work for our case, so we will need to define our own LineFormatter. We want the formatted data lines to be of the form:

# Last-Name,First-Name-Middle-Initial,Attrs
Smith,John Q,{age:42, height:5.9, weight:230.5}

Yes, that’s a bit of a convoluted format, but I have had to generate worse. Our LineFormatter ends up being something like this:

class PersonLineFormatter implements LineFormatter {

    @Override
    String formatComment(String text) {
        "# $text" (1)
    }

    @Override
    String formatLine(Object object) {
        Person person = object as Person
        "${person.lastName},${person.firstName} ${person.middleName[0]},{age:${person.age}, height:${person.height}, weight:${person.weight}}" (2)
    }
}
  1. We specify the comment as being prefixed by a # symbol.

  2. Write out the Person object as the formatted String

We see that implementing the LineFormatter keeps all the application-specific logic isolated from the common operation of actually writing the file. Now we can use our formatter as follows:

TextFileWriter writer = new TextFileWriter(
    lineFormatter: new PersonLineFormatter(),
    filePath: new File(outputDir, 'people.txt')
)

writer.writeComment('Last-Name,First-Name-Middle-Initial,Attrs')

Collection<Person> people = peopleDao.listPeople()

people.each { Person p->
    writer.write(p)
}

This will write out the text file in the desired format with very little new coding required.

Generally, writing out text representations of application data is not really all that challenging, since you have access to the data you need and some control over the formatting of the objects to be represented. The real challenge is usually going in the other direction, when you are reading in a data file from some external source; this is where the com.stehno.vanilla.text.TextFileReader becomes useful.

Let’s say you receive a request to import the data file we described above - maybe it was generated by the same business tools I mentioned earlier. We have something like this:

# Last-Name,First-Name-Middle-Initial,Attrs
Smith,John Q,{age:42, height:5.9, weight:230.5}
Jones,Robert M,{age:38, height:5.6, weight:240.0}
Mendez,Jose R,{age:25, height:6.1, weight:232.4}
Smalls,Jessica X,{age:30, height:5.5, weight:175.2}

The TextFileReader requires a LineParser to parse the input file lines into objects; it defines three methods: boolean parsable(String), which is used to determine whether or not the line should be parsed; Object[] parseLine(String), which is used to parse the line of text; and Object parseItem(Object, int), which is used to parse an individual element of the line. There is a default implementation provided: the CommaSeparatedLineParser will parse simple comma-separated lines of text into arrays of Objects based on configured item converters; however, this will not work in the case of our file since there are commas in the data items themselves (the JSON-like format of the last element). So we need to implement one. Our LineParser will look something like the following:

class PersonLineParser implements LineParser {

    boolean parsable(String line){
        line && !line.startsWith(HASH) (1)
    }

    Object[] parseLine(String line){ (2)
        int idx = 0
        def elements = line.split(',').collect { parseItem(it, idx++) }

        [
            new Person(
                firstName:elements[1][0],
                middleName:elements[1][1],
                lastName:elements[0],
                age:elements[2],
                height:elements[3],
                weight:elements[4],
            )
        ] as Object[]
    }

    // Smith,John Q,{age:42, height:5.9, weight:230.5}
    // 0    ,1     ,2      ,3          ,4
    Object parseItem(Object item, int index){ (3)
        switch(index){
            case 0:
                return item as String
            case 1:
                return item.split(' ')
            case 2:
                return item.split(':')[1] as int
            case 3:
                return item.split(':')[1] as float
            case 4:
                return item.split(':')[1][0..-2] as float
        }
    }
}
  1. We want to ignore blank lines or lines that start with a # symbol.

  2. We extract the line items and build the Person object

  3. We convert the line items to our desired types

It’s not pretty, but it does the job and keeps all the line parsing logic out of the main file loading functionality. Our code to read in the file would look something like:

setup:
TextFileReader reader = new TextFileReader(
    filePath: new File(inputDir, 'people.txt'),
    lineParser: new PersonLineParser(),
    firstLine: 2 (1)
)

when:
def people = []

reader.eachLine { Object[] data ->
    people << data[0]
}
  1. We skip the first line, since it will always be the header

The provided implementations for both the LineFormatter and LineParser will not account for every scenario, but hopefully they will hit some of them and provide a guideline for implementing your own. If nothing else, these components help to streamline the reading and writing of formatted text data so that you can get it done and focus on other more challenging development tasks.

Spring Boot Remote Shell

07 November 2015 ~ blog, groovy, spring

Spring Boot comes with a ton of useful features that you can enable as needed, and in general the documentation is pretty good; however, sometimes it feels like they gloss over a feature that you eventually realize is much more useful than it originally seemed. The remote shell support is one of those features.

Let’s start off with a simple Spring Boot project based on the example provided with the Boot documentation. Our build.gradle file is:

build.gradle
buildscript {
    repositories {
        jcenter()
    }

    dependencies {
        classpath 'org.springframework.boot:spring-boot-gradle-plugin:1.2.7.RELEASE'
    }
}

version = "0.0.1"
group = "com.stehno"

apply plugin: 'groovy'
apply plugin: 'spring-boot'

sourceCompatibility = 8
targetCompatibility = 8

mainClassName = 'com.stehno.SampleController'

repositories {
    jcenter()
}

dependencies {
    compile "org.codehaus.groovy:groovy-all:2.4.5"

    compile 'org.springframework.boot:spring-boot-starter-web'
}

task wrapper(type: Wrapper) {
    gradleVersion = "2.8"
}

Then, our simple controller and starter class looks like:

SampleController.groovy
@Controller
@EnableAutoConfiguration
public class SampleController {

    @RequestMapping('/')
    @ResponseBody
    String home() {
        'Hello World!'
    }

    static void main(args) throws Exception {
        SpringApplication.run(SampleController, args)
    }
}

Run it using:

./gradlew clean build bootRun

and you get your run of the mill "Hello world" application. For our demonstration purposes, we need something a bit more interesting. Let’s make the controller something like a "Message of the Day" server which will return a fixed configured message. Remove the hello controller action and add in the following:

String message = 'Message for you, sir!'

@RequestMapping('/') @ResponseBody
String message() {
    message
}

which will return the static message "Message for you, sir!" for every request. Running the application now will still be pretty uninteresting, but wait, it gets better.

Now, we would like to have the ability to change the message as needed without rebuilding or even restarting the server. There are a handful of ways to do this; however, I’m going to discuss one of the seemingly less-used options… the CRaSH shell integration provided in Spring Boot (43. Production Ready Remote Shell).

To add the remote shell support in Spring Boot, you add the following line to your dependencies block in your build.gradle file:

compile 'org.springframework.boot:spring-boot-starter-remote-shell'

Now, when you run the application, you will see an extra line in the server log:

Using default password for shell access: 44b3556b-ff9f-4f82-9f1b-54a16da471d5

Since no password was configured, Boot has provided a randomly generated one for you (obviously you would configure this in a real system). You now have an SSH connection available to your application. Using the ssh client of your choice you can log in using:

ssh -p 2000 user@localhost

Which will ask you for the provided password. Once you have logged in you are connected to a secure shell running inside your application. You can run help at the prompt to get a list of available commands, which will look something like this:

> help
Try one of these commands with the -h or --help switch:

NAME       DESCRIPTION
autoconfig Display auto configuration report from ApplicationContext
beans      Display beans in ApplicationContext
cron       manages the cron plugin
dashboard  a monitoring dashboard
egrep      search file(s) for lines that match a pattern
endpoint   Invoke actuator endpoints
env        display the term env
filter     a filter for a stream of map
java       various java language commands
jmx        Java Management Extensions
jul        java.util.logging commands
jvm        JVM informations
less       opposite of more
mail       interact with emails
man        format and display the on-line manual pages
metrics    Display metrics provided by Spring Boot
shell      shell related command
sleep      sleep for some time
sort       sort a map
system     vm system properties commands
thread     JVM thread commands
help       provides basic help
repl       list the repl or change the current repl

As you can see, you get quite a bit of functionality right out of the box. I will leave the discussion of each of the provided commands to another post. What we are interested at this point is adding our own command to update the message displayed by our controller.

The really interesting part of the shell integration is the fact that you can extend it with your own commands.

Create a new directory src/main/resources/commands which is where your extended commands will live, and then add a simple starting point class for our command:

message.groovy
package commands

import org.crsh.cli.Usage
import org.crsh.cli.Command
import org.crsh.command.InvocationContext

@Usage('Interactions with the message of the day.')
class message {

    @Usage('View the current message of the day.')
    @Command
    def view(InvocationContext context) {
        return 'Hello'
    }
}

The @Usage annotations provide the help/usage documentation for the command, while the @Command annotation denotes that the view method is a command.

Now, when you run the application and list the shell commands, you will see our new command added to the list:

message    Interactions with the message of the day.

If you run the command as message view you will get the static "Hello" message returned to you on the shell console.

Okay, we need the ability to view our current message of the day. The InvocationContext has attributes which are populated by Spring, one of which is spring.beanfactory, a reference to the Spring BeanFactory for your application. We can access the current message of the day by replacing the content of the view method with the following:

BeanFactory beans = context.attributes['spring.beanfactory']
return beans.getBean(SampleController).message

where we find our controller bean and simply read the message property. Running the application and the shell command now yields:

Message for you, sir!

While that is pretty cool, we are actually here to modify the message, not just view it, and this is just as easy. Add a new command named update:

@Usage('Update the current message of the day.')
@Command
def update(
    InvocationContext context,
    @Usage('The new message') @Argument String message
) {
    BeanFactory beans = context.attributes['spring.beanfactory']
    beans.getBean(SampleController).message = message
    return "Message updated to: $message"
}

Now, rebuild/restart the server and start up the shell. If you execute:

message update "This is cool!"

You will update the configured message, which you can verify using the message view command, or better yet, you can hit your server and see that the returned message has been updated…​ no restart required. Indeed, this is cool.

Tip
You can find a lot more information about writing your own commands in the CRaSH documentation for Developing Commands. There is a lot of functionality that I am not covering here.

At this point, we are functionally complete. We can view and update the message of the day without requiring a restart of the server. But there are still some added goodies provided by the shell, especially around shell UI support - yes, it’s text, but it can still be pretty, and one of the ways CRaSH allows you to pretty things up is with colors and formatting via styles and the UIBuilder (which is sadly under-documented).

Let’s add another property to our controller to make things more interesting. Just add a Date lastUpdated = new Date() field. This will give us two properties to play with. Update the view action as follows:

SampleController controller = context.attributes['spring.beanfactory'].getBean(SampleController)

String message = controller.message
String date = controller.lastUpdated.format('MM/dd/yyyy HH:mm')

out.print new UIBuilder().table(separator: dashed, overflow: Overflow.HIDDEN, rightCellPadding: 1) {
    header(decoration: bold, foreground: black, background: white) {
        label('Date')
        label('Message')
    }

    row {
        label(date, foreground: green)
        label(message, foreground: yellow)
    }
}

We still retrieve the instance of the controller as before; however, now our output rendering is a bit more complicated, though still pretty understandable. We are creating a new UIBuilder for a table and then applying the header and row contents to it. It’s actually a very powerful construct; I just had to dig around in the project source code to figure out how to make it work.

You will also need to update the update command to set the new date field:

SampleController controller = context.attributes['spring.beanfactory'].getBean(SampleController)
controller.message = message
controller.lastUpdated = new Date()

return "Message updated to: $message"

Once you have that built and running you can run the message view command and get a much nicer multi-colored table output.

> message view
Date             Message
-------------------------------------------------------------
11/05/2015 10:37 And now for something completely different.

Which wraps up what we are trying to do here and even puts a bow on it. You can find more information on the remote shell configuration options in the Spring Boot documentation in Appendix A: Common Application Properties. This is where you can configure the port, change the authentication settings, and even disable some of the default provided commands.
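For example, a few of the shell properties look like the following (names taken from the Boot 1.x documentation; double-check them against Appendix A for your version):

# application.properties
shell.ssh.port=2000
shell.auth=simple
shell.auth.simple.user.name=user
shell.auth.simple.user.password=secret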

The remote shell support is one of the more interesting, but underused features in Spring Boot. Before Spring Boot was around, I was working on a project where we did a similar integration of CRaSH shell with a Spring-based server project and it provided a wealth of interesting and useful opportunities to dig into our running system and observe or make changes. Very powerful.

Multi-Collection Pagination

31 October 2015 ~ blog

A few years ago, I was working on a project where we had collections of data spread across multiple rows of data…​ and then we had to provide a paginated view of that data. This research was the result of those efforts. The discussion here is a bit more rigorous than I usually go into, so if you just want the implementation code jump to the bottom.

Introduction

Consider that you have a data set representing a collection of collections:

[
    [ A0, A1, A2, A3, A4, A5 ],
    [ B0, B1, B2, B3, B4, B5 ],
    [ C0, C1, C2, C3, C4, C5 ]
]

We want to retrieve the data in a paginated fashion where the subset (page) with index P and subset size (page size) S is used to retrieve only the desired elements in the most efficient means possible.

Consider also that the data sets may be very large and that the internal collections may not be directly associated with the enclosing collection (e.g. two different databases).

Also consider that the subsets may cross collection boundaries or contain fewer than the desired number of elements.

Lastly, requests for data subsets will more likely be discrete events - one subset per request - rather than iterating over all results.

For a page size of four (S = 4) you would have the following five pages:

P0 : [ A0, A1, A2, A3 ]
P1 : [ A4, A5, B0, B1 ]
P2 : [ B2, B3, B4, B5 ]
P3 : [ C0, C1, C2, C3 ]
P4 : [ C4, C5 ]

Computations

The overall collection is traversed to determine how many elements are contained within each sub-collection; this may be pre-computed or done at runtime. Three counts are calculated or derived for each sub-collection:

  • Count (CI) - the number of elements in the sub-collection.

  • Count-before (CB) - the total count of all sub-collection elements counted before this collection, but not including this collection.

  • Count-with (CW) - the total count of all sub-collection elements counted before and including this collection.

For our example data set we would have:

[
    { CI:6, CB:0, CW:6 [ A0, A1, A2, A3, A4, A5 ] },
    { CI:6, CB:6, CW:12 [ B0, B1, B2, B3, B4, B5 ] },
    { CI:6, CB:12, CW:18 [ C0, C1, C2, C3, C4, C5 ] }
]

This allows for a simple means of selecting only the sub-collections we are interested in; those containing the desired elements based on the starting and ending indices for the subset (START and END respectively). These indices can easily be calculated as:

START = P * S

END = START + S - 1
Note
The indices referenced here are for the overall collection, not the individual sub-collections.

The desired elements will reside in sub-collections whose inclusive count (CW) is greater than the starting index and whose preceding count (CB) is less than or equal to the ending index, or:

CW > START and CB ≤ END

For the case of selecting the second subset of data (P = 1) with a page size of four (S = 4) we would have:

START = 4

END = 7

This will select the first two of the three sub-collections as "interesting" sub-collections containing at least some of our desired elements, namely:

{ CI:6, CB:0, CW:6 [ A0, A1, A2, A3, A4, A5 ] },
{ CI:6, CB:6, CW:12 [ B0, B1, B2, B3, B4, B5 ] }

What remains is to gather from these sub-collections (call them SC[0], SC[1]) the desired number of elements (S).

To achieve this, a local starting and ending index must be calculated while iterating through the "interesting" sub-collections to gather the elements until either the desired amount is obtained (S) or there are no more elements available.

  1. Calculate the initial local starting index (LOCAL_START) by subtracting the non-inclusive preceding count value of the first selected collection (SC[0]) from the overall starting index.

  2. Iterate the selected collections (in order) until the desired amount has been gathered

This is more clearly represented in pseudo code as:

LOCAL_START = START - SC[0].CB
REMAINING = S

for-each sc in SC while REMAINING > 0

    if( REMAINING < (sc.size() - LOCAL_START) )
        LOCAL_END = LOCAL_START + REMAINING - 1
    else
        LOCAL_END = sc.size()-1

    FOUND = sc.sub( LOCAL_START, LOCAL_END )
    G.addAll( FOUND )
    REMAINING = REMAINING - FOUND.size()
    LOCAL_START = 0

end

Where the gathered collection of elements (G) is your resulting data set containing the elements for the specified data page.

It must be stated that the ordering of the overall collection and the sub-collections must be consistent across multiple data requests for this procedure to work properly.

Implementation

Ok now, enough discussion. Let’s see what this looks like with some real Groovy code. First, we need our collections of collections data to work with:

def data = [
    [ 'A0', 'A1', 'A2', 'A3', 'A4', 'A5' ],
    [ 'B0', 'B1', 'B2', 'B3', 'B4', 'B5' ],
    [ 'C0', 'C1', 'C2', 'C3', 'C4', 'C5' ]
]

Next, we need to implement the algorithm in Groovy:

int page = 1
int pageSize = 4

// pre-computation

int before = 0
def prepared = data.collect {d ->
    def result = [
        countIn: d.size(),
        countBefore: before,
        countWith: before + d.size(),
        values:d
    ]

    before += d.size()

    return result
}

// main computation

int start = page * pageSize
int end = start + pageSize - 1

// select only the "interesting" sub-collections: CW > START and CB <= END
def selected = prepared.findAll { sc ->
    sc.countWith > start && sc.countBefore <= end
}

def localStart = start - selected[0].countBefore
def remaining = pageSize

def gathered = []

selected.each { sc ->
    if( remaining ){
        def localEnd
        if( remaining < (sc.values.size() - localStart) ){
            localEnd = localStart + remaining - 1
        } else {
            localEnd = sc.values.size() - 1
        }

        def found = sc.values[localStart..localEnd]
        gathered.addAll(found)

        remaining -= found.size()
        localStart = 0
    }
}

println "P$page : $gathered"

which yields

P1 : [A4, A5, B0, B1]

and if you look all the way back up to the beginning of the article, you see that this is the expected data set for page 1 of the example data.

It’s not a scenario I have run into often, but it was a bit of a tricky one to unravel. The pre-computation steps ended up being the key to keeping it simple and stable.


Older posts are available in the archive.