How it started
Recently at work, for reasons too complicated to go into here, we have been looking for a Docker-less way to build Docker images for our mostly Java-based applications. Jib from Google is one strong candidate we are trying out.
What is Jib?
Jib builds optimized Docker and OCI images for your Java applications without a Docker daemon - and without deep mastery of Docker best-practices. It is available as plugins for Maven and Gradle and as a Java library.
It all worked nicely until we came to one of our modules, which has to be built on a relatively complex base image from the local Docker daemon.
Jib does support reading the base image from the local Docker daemon. However, rather puzzlingly, for this particular base image the build simply got stuck at 25%, saying ‘Processing base image’.
Even more mysteriously, it got stuck or unstuck seemingly arbitrarily when we removed or added instructions in the Dockerfile used to build the base image and rebuilt it. Almost any line could make the difference, from one that simply sets an environment variable to one that runs a complicated command.
At first we thought it might be due to the size of the base image or the number of files in it. In the end neither was the case: even a base image built from a simple Dockerfile that only sets environment variables could cause the build to get stuck.
It almost appeared as if there were an implicit limit on the number of instructions in a base image that Jib can use. And, adding to the fun, different types of instructions seemed to weigh differently against that limit.
Root cause
After some digging into Jib’s source code, I found that the cause is an implementation oversight in how Jib inspects the base image summary it reads from the local Docker daemon.
Jib creates a process running a docker inspect command and invokes Process.waitFor() immediately afterwards, without reading from stdout. It only tries to read from stdout after waitFor() returns, which happens only once the process has terminated.
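As a result, if the process writes enough to stdout to fill the buffer, it blocks until someone reads from stdout to drain the buffer. Since Jib reads only after waitFor() returns, and waitFor() returns only after the process exits, each side ends up waiting on the other: a deadlock in which waitFor() never returns. That would also explain the apparent instruction limit: presumably each Dockerfile instruction adds to the metadata that docker inspect prints, some more than others, until the output no longer fits in the buffer.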
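The blocking behaviour can be reproduced without Docker at all. The following is a minimal simulation, not Jib's code: an in-JVM pipe with a small fixed buffer stands in for the process's stdout pipe, a writer thread stands in for docker inspect, and a join with a timeout stands in for waitFor() so that the demo terminates. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Hypothetical demo class: simulates a child process writing to a bounded pipe
// that nobody reads from.
final class PipeDeadlockDemo {

    /**
     * Returns true if a writer producing totalBytes finishes within timeoutMillis
     * while nothing reads from a pipe whose buffer holds pipeCapacity bytes.
     */
    static boolean writerFinishesWithoutReader(int totalBytes, int pipeCapacity, long timeoutMillis)
            throws IOException, InterruptedException {
        PipedInputStream readEnd = new PipedInputStream(pipeCapacity);
        PipedOutputStream writeEnd = new PipedOutputStream(readEnd);

        // Stand-in for the child process: it just writes its output.
        Thread writer = new Thread(() -> {
            try {
                for (int i = 0; i < totalBytes; i++) {
                    writeEnd.write('x'); // blocks once the pipe buffer is full
                }
                writeEnd.close();
            } catch (IOException e) {
                // Interrupted while blocked on a full pipe: give up quietly.
            }
        });
        writer.setDaemon(true);
        writer.start();

        // Stand-in for waitFor(): wait for the writer without reading anything.
        // A timeout is used here only so that the demo itself cannot hang.
        writer.join(timeoutMillis);
        boolean finished = !writer.isAlive();

        writer.interrupt(); // release the writer if it is still blocked
        return finished;
    }
}
```

With a 1024-byte buffer, a writer that produces 100 bytes finishes immediately, while one that produces 100000 bytes is still blocked when the timeout expires. That mirrors a base image whose inspect output is small enough to work and one whose output is just large enough to hang.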
The fix
Well… the fix itself is rather simple: keep reading from the process’s output while it runs, and only call waitFor() on the process once that output has been consumed.
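As a rough sketch of that ordering (this is not the code from the PR, and the class and method names are invented for illustration), a helper could drain stdout to end-of-file first and call waitFor() only afterwards. The sketch assumes stderr can simply be passed through to the parent process so that it cannot fill a pipe of its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: read the child's stdout fully, then wait for it to exit.
final class ProcessOutputReader {

    /** Reads the stream until end-of-file and returns everything as UTF-8 text. */
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = in.read(chunk)) != -1) {
            captured.write(chunk, 0, read);
        }
        return new String(captured.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Runs the command and returns its stdout; fails on a non-zero exit code. */
    static String runAndCapture(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Assumption: stderr is not needed by the caller, so let it go to the
        // parent's stderr instead of an unread pipe.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();

        String output;
        try (InputStream stdout = process.getInputStream()) {
            // Draining first keeps the pipe buffer from filling up, so the
            // child is never blocked on a write and can run to completion.
            output = drain(stdout);
        }

        // Safe now: the child has closed stdout, so it is exiting or has exited.
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command exited with code " + exitCode + ": " + command);
        }
        return output;
    }
}
```

A call might look like ProcessOutputReader.runAndCapture(List.of("docker", "inspect", "my-base-image")), where my-base-image is a placeholder image name. Reading on the calling thread is the simplest option; if both stdout and stderr had to be captured separately, each would need its own reader thread.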
Until Google merges my PR and releases a new version of Jib, you will need to build Jib from source yourself with the changes proposed in the PR.