A Brief Overview

Jiiify is built using Vert.x, an event-driven, non-blocking, reactive tool-kit. Vert.x is polyglot and supports Java, JavaScript, Groovy, Ruby, Ceylon, Scala, and Kotlin out of the box. There are also third-party implementations for Python and other languages. Most of the Jiiify code, at this point, is written in Java. Vert.x is an Eclipse Foundation project (it was started at VMware and later continued with Red Hat's backing) and has good online documentation and forums for communication (a mailing list and an IRC channel). The community seems robust and the Vert.x developers are responsive. The number of modules for version 3 of the Vert.x tool-kit continues to grow at a good pace.

Vert.x implements what its developers call the Multi-Reactor Pattern, which means there are multiple event loops that can scale across the cores on a machine. There are "handlers", which respond to events delivered on the event loop(s), and "verticles", which are the deployable units of code that register handlers and send messages to one another. Any particular handler will not be executed concurrently, and while it runs nothing else on its event loop can run, so it is very important for a program's performance not to block the event loop. To support tasks that are blocking and need a little more time to complete, Vert.x also has the concept of "worker verticles". These run on threads drawn from a worker pool rather than on an event loop, so they have their own pool of resources from which to draw. So, one might configure a worker verticle to have four, or twenty, instances and they will live in their own named pool, independent of the main event loop.
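
The split between a non-blocking event loop and a separate pool for slow work can be sketched in plain Java. This is only an analogy built on java.util.concurrent, not the Vert.x API and not Jiiify's code; the class name EventLoopSketch, the thread names, and the 50 ms sleep that stands in for image processing are all assumptions made for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

/** Plain-Java analogy (hypothetical): an event loop that hands blocking work to a worker pool. */
final class EventLoopSketch {
    // A single thread stands in for one event loop: tasks submitted to it never run concurrently.
    private final ExecutorService eventLoop = Executors.newSingleThreadExecutor(named("event-loop"));
    // A separate fixed-size pool stands in for the configured worker instances.
    private final ExecutorService workers;

    EventLoopSketch(int workerInstances) {
        workers = Executors.newFixedThreadPool(workerInstances, named("worker"));
    }

    /** Runs the slow task on the worker pool, then handles its result back on the event loop thread. */
    CompletableFuture<String> process(String imageId) {
        return CompletableFuture
            .supplyAsync(() -> slowImageWork(imageId), workers)
            .thenApplyAsync(
                result -> result + " (handled on " + Thread.currentThread().getName() + ")",
                eventLoop);
    }

    // Placeholder for blocking work; the sleep is an assumption, not real image processing.
    private static String slowImageWork(String imageId) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "tiles for " + imageId;
    }

    void shutdown() {
        workers.shutdown();
        eventLoop.shutdown();
    }

    // Daemon threads so a forgotten shutdown() cannot keep the JVM alive.
    private static ThreadFactory named(String name) {
        return runnable -> {
            Thread thread = new Thread(runnable, name);
            thread.setDaemon(true);
            return thread;
        };
    }

    public static void main(String[] args) {
        EventLoopSketch sketch = new EventLoopSketch(4);
        System.out.println(sketch.process("image-1").join());
        sketch.shutdown();
    }
}
```

The point of the design is that the event loop thread only ever does the cheap final step, so it stays free to serve other requests while the slow work is in progress elsewhere.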

Jiiify divides its work into two camps: serving tiles, manifests, and info.json files that have already been created, and generating new ones (tiles and other derivative images, info.json files, and manifests). The first camp is handled by handlers that live on the main event loop and perform their work very quickly. The second camp is handled by a combination of verticles (for info.json files, manifests, etc.) and worker verticles (for image processing). The number of worker verticles configured depends on the number of cores and the amount of memory on the machine. Jiiify tries to calculate how many image processing worker verticles it can run based on a system's resources. In addition to streamlining its processing, so that a single worker verticle uses as few resources as possible, Jiiify tries not to overestimate what it can do with the system's resources. Overestimating would result in OutOfMemoryErrors, as too many TIFF files would be loaded into memory before they could be processed and sent on their way.
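
One way such a calculation might look is sketched below. This is not Jiiify's actual formula, which is not given here; the class name WorkerSizing, the per-image memory estimate, and the fraction of the heap reserved for image data are all hypothetical parameters chosen for illustration.

```java
/** Hypothetical sizing rule: cap image workers by both core count and available heap. */
final class WorkerSizing {

    /**
     * @param cores         number of available processor cores
     * @param maxHeapBytes  maximum heap the JVM may use
     * @param bytesPerImage assumed memory needed to hold one source image (a guess, not a Jiiify value)
     * @param heapFraction  assumed share of the heap that may be spent on source images, in (0, 1]
     * @return the number of image workers to start, never less than one
     */
    static int imageWorkerCount(int cores, long maxHeapBytes, long bytesPerImage, double heapFraction) {
        if (cores < 1 || bytesPerImage < 1 || heapFraction <= 0 || heapFraction > 1) {
            throw new IllegalArgumentException("invalid sizing parameters");
        }
        // How many images fit into the memory budget at once.
        long byMemory = (long) (maxHeapBytes * heapFraction) / bytesPerImage;
        // Never start more workers than cores, and always start at least one.
        long count = Math.min(cores, byMemory);
        return (int) Math.max(1, count);
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        int workers = imageWorkerCount(
            runtime.availableProcessors(), runtime.maxMemory(), 256L * 1024 * 1024, 0.5);
        System.out.println("image workers: " + workers);
    }
}
```

Note that the floor of one worker means a machine with too little memory for even a single image would still start a worker; a real implementation might prefer to fail fast in that case.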

As a caveat, I will say the reactive style of programming was new to me when I adopted Vert.x. I have found working with Vert.x enjoyable (as their documentation suggests developers will), but have also bumped into areas where I definitely need more experience. For instance, Vert.x offers a Pump interface to help control I/O flow, but I've not yet integrated it into Jiiify (which reads in whole streams before outputting them). I'm sure there is room for improvement and I'm excited about the opportunity to continue learning how to best take advantage of all Vert.x offers.

Since Vert.x works with an event loop, Jiiify tries to take advantage of this by calculating the tiles and derivative images that need to be created and then messaging requests for them to the worker verticles that do the actual work. Jiiify reads the TIFF file in when the first worker verticle responds to the tiling request and keeps an in-memory cache of the TIFF to be used by all the subsequent worker verticles. When the processing of that image is done, the TIFF cache is discarded and the worker verticles move on to the next TIFF in their queue. This saves having to re-read the TIFF image from disk repeatedly and helps reduce the draw on system resources. One of the reasons Jiiify is still considered alpha software is that this process is continually being tweaked, and alternative methods for balancing speed with system resources are being implemented and evaluated. Some of the things tried in the past include putting an ImageIO layer over Kakadu, accessing Kakadu through Runtime.exec, and using OpenCV through JNI. There is still room for improvement and the experimentation will continue.
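
The read-once, share, then forget behavior described above could be sketched roughly as follows. This is an assumption about the shape of such a cache, not Jiiify's implementation: the class SourceImageCache, its method names, and the idea of counting outstanding tiles to decide when to discard the image are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache: loads a source image once, shares it, and drops it after the last tile. */
final class SourceImageCache {

    private static final class Entry {
        final byte[] bytes;
        final AtomicInteger remainingTiles;

        Entry(byte[] bytes, int tileCount) {
            this.bytes = bytes;
            this.remainingTiles = new AtomicInteger(tileCount);
        }
    }

    private final ConcurrentHashMap<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, byte[]> loader;

    /** @param loader reads the source image bytes for an ID, for example from disk */
    SourceImageCache(Function<String, byte[]> loader) {
        this.loader = loader;
    }

    /** Returns the cached bytes, invoking the loader only for the first caller of a given image. */
    byte[] acquire(String imageId, int tileCount) {
        return cache.computeIfAbsent(imageId, id -> new Entry(loader.apply(id), tileCount)).bytes;
    }

    /** Records one finished tile; returning null from the remapping function removes the entry. */
    void tileDone(String imageId) {
        cache.computeIfPresent(imageId,
            (id, entry) -> entry.remainingTiles.decrementAndGet() <= 0 ? null : entry);
    }

    boolean isCached(String imageId) {
        return cache.containsKey(imageId);
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        SourceImageCache cache = new SourceImageCache(id -> {
            loads.incrementAndGet();      // count how often the "disk" is read
            return new byte[] {1, 2, 3};  // stand-in for TIFF bytes
        });
        for (int tile = 0; tile < 3; tile++) {
            cache.acquire("image-1", 3);
            cache.tileDone("image-1");
        }
        System.out.println("loads=" + loads.get() + ", still cached=" + cache.isCached("image-1"));
    }
}
```

A worker that asks for the image after it has been discarded would simply trigger a fresh load in this sketch, which trades a repeated disk read for a bounded memory footprint.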

Jiiify's Data Store

Since Jiiify manages its own derivative image files, it needs a way of storing and accessing them. It does this currently by implementing Pairtree Object Storage with different backends. The Pairtree specification was developed by the California Digital Library (CDL) as a way to put things on disk without overwhelming a single directory with too many files. Jiiify is using the vertx-pairtree library, written in conjunction with Jiiify, to store its files "on disk" -- either on a system's native file system (locally attached or NFS mounted) or in an S3 bucket.
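
To make the Pairtree idea concrete, here is a small sketch of how an identifier is mapped to a nested directory path under the Pairtree specification: certain characters are hex-encoded with a caret, three common characters are swapped for file-system-safe ones, and the result is split into two-character directory names. The class PairtreePath and its methods are hypothetical and are not the vertx-pairtree API; the sketch also stops at the directory path and does not cover how an object is stored inside it.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing the Pairtree identifier-to-path mapping. */
final class PairtreePath {
    // Visible ASCII characters that the Pairtree specification hex-encodes as ^hh.
    private static final String HEX_ENCODED = "\"*+,<=>?\\^|";

    /** Cleans an identifier: hex-encode unsafe bytes, then map '/' to '=', ':' to '+', '.' to ','. */
    static String clean(String id) {
        StringBuilder cleaned = new StringBuilder();
        for (byte raw : id.getBytes(StandardCharsets.UTF_8)) {
            int b = raw & 0xFF;
            if (b < 0x21 || b > 0x7E || HEX_ENCODED.indexOf(b) >= 0) {
                cleaned.append('^').append(String.format("%02x", b));
            } else if (b == '/') {
                cleaned.append('=');
            } else if (b == ':') {
                cleaned.append('+');
            } else if (b == '.') {
                cleaned.append(',');
            } else {
                cleaned.append((char) b);
            }
        }
        return cleaned.toString();
    }

    /** Splits the cleaned identifier into two-character directories under pairtree_root. */
    static String toPath(String id) {
        String cleaned = clean(id);
        StringBuilder path = new StringBuilder("pairtree_root");
        for (int i = 0; i < cleaned.length(); i += 2) {
            path.append('/').append(cleaned, i, Math.min(i + 2, cleaned.length()));
        }
        return path.toString();
    }

    public static void main(String[] args) {
        // prints "pairtree_root/ar/k+/=1/30/30/=x/t1/2t/3"
        System.out.println(toPath("ark:/13030/xt12t3"));
    }
}
```

Because each directory level holds at most two characters of the identifier, no single directory ends up with an unmanageable number of entries, which is the problem Pairtree was designed to avoid.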

The S3 Pairtree implementation is newer, and we're still working on thoroughly testing it (though it's based on the work of the SuperS3t project, so it's not completely untested). Additional documentation on the vertx-pairtree library can be found on its site.