Remote debugging containerized Jenkins builds: debugShell()

Providing build environments via some kind of container technology is quite common nowadays. Taking Docker as an example, it offers various benefits: Because the environment is declared textually in a Dockerfile, you can put it under version control with a tool like git and track every change. Each build runs inside its own fresh container and is therefore isolated and reproducible. If you choose a clustering solution like Docker Swarm or Kubernetes, you’re also able to dynamically scale the available build power at run time.

Any downsides?

Of course this comes at a cost. Besides the need to build, maintain and support additional infrastructure, usability definitely suffers from the perspective of developers and DevOps engineers. With a central build server that executes its builds on that same machine, debugging problems seems way easier: When a build fails, we simply browse its workspace and check the relevant files to investigate potential causes. If that’s not enough, we can always SSH into the machine and continue debugging locally.

On the other hand, there are some challenges involved when executing a build inside a container on some node somewhere on a container platform. Given the nature of ephemeral build environments, where separate containers are created and torn down for each individual build, accessing build artifacts or workspace files becomes impossible. Using SSH is also problematic: Usually there is no SSH daemon listening inside the container. Even if there were, it would only be reachable from the outside if you exposed the daemon’s port. And even then, you don’t know exactly which node the container is running on.

Proposed solution and tl;dr

We’ll create a TCP shell server inside the container that a client can connect to. To overcome the connectivity issues, we’ll set up a dynamic reverse proxy on a publicly reachable machine that tunnels the connection into the container. Following this approach, we’ll actually gain some additional features along the way. Using the right combination of tools enables us to:

  1. Set breakpoints within a Jenkinsfile
  2. Gain an interactive shell inside the buildcontainer
  3. Inherit the actual build environment within that shell: Setting a breakpoint inside withMaven(), for example, leaves us with the corresponding configuration when calling mvn.
  4. Modify the workspace during build time

Tool introduction

This is a brief introduction to the tools and the role each of them plays in achieving the features laid out above.

  1. frp (fast reverse proxy): Exposes TCP sockets that sit behind a NAT, e.g. inside a container. The server part runs on a publicly reachable machine. The client runs inside the container, connects to the server and thereby establishes a reverse tunnel to a socket inside the container. That socket is then reachable via the public machine.
  2. socat (SOcket CAT): Multipurpose socket relay utility. Used to let a developer connect to the buildcontainer and to establish the interactive shell session.
  3. Jenkins shared library: Used to create a custom build step that functions as a breakpoint: it halts the running build, sets up the shell and establishes the reverse tunnel.

Setup

The following shows the basic idea of how to implement the remote debugging functionality. For those of you who want to dive in deeper: Everything laid out here can be fully explored by checking out the corresponding repository over at GitHub.

frp

As frp is responsible for establishing a communication link between the client and the buildcontainer, it needs to be set up first. As already mentioned, we require a publicly reachable node on which the server component (frpd; the upstream frp project ships this binary as frps) is installed. Let’s call it jenkins.gee-whiz.de. frp’s feature set is quite extensive, but a minimal server configuration that fits our needs would look like this:

[common]
bind_port = 7000
privilege_allow_ports = 17000-17100

We define a listening port of 7000 to which the clients inside the buildcontainers connect, as well as a range of ports frpd may use on the public node for inbound tunneled connections. On the client side of things, i.e. inside the buildcontainer, a suitable frpc configuration would be:

[common]
server_addr = jenkins.gee-whiz.de
server_port = 7000

[shell]
local_ip = 127.0.0.1
local_port = 22222
remote_port = 17000

Besides the server’s address and port in the common section, we define the local endpoint that should be exposed. We’re planning to let socat listen on 127.0.0.1:22222. socat will provide the interactive shell a client is able to connect to. In this case, the remote port is explicitly set to 17000. To let frpd automatically choose a free port within the specified range on the public node, the remote port must be set to 0.
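
Since the client configuration only differs in a handful of values, it can be generated on the fly inside the buildcontainer. The following is a minimal sketch, not the code from the repository: the function name write_frpc_config and its argument order are made up for illustration, and the emitted keys are exactly the ones shown in the configuration above.

```shell
#!/bin/sh
# Hypothetical helper: emit a minimal frpc configuration on stdout.
# Arguments: server address, server port, local port, remote port.
# If the remote port is omitted it defaults to 0, which asks the
# frp server to pick a free port from its allowed range.
write_frpc_config() {
    cat <<EOF
[common]
server_addr = $1
server_port = $2

[shell]
local_ip = 127.0.0.1
local_port = $3
remote_port = ${4:-0}
EOF
}

# Example: same endpoints as above, but let the server choose the remote port.
write_frpc_config jenkins.gee-whiz.de 7000 22222 0
```

Redirecting the output to a file, for example /etc/frp/frpc.ini, would yield a configuration that frpc can be started with via its -c option.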

When running the client against the server, the resulting tunnel would basically look like this:

[ jenkins.gee-whiz.de:17000 ] --> [ some_docker_server ] --> [ inside_container:22222 ]

socat

Socat will be used in two ways: First, it will act as a server inside the container that is responsible for providing the interactive shell. Second, it will be run on the client to connect to said server via the reverse tunnel created by frp.

On the server side of things we’re going to spawn a TCP socket on port 22222 and redirect incoming traffic to a bash instance. We also need to set some additional options as laid out in this blog post to create a fully interactive PTY. This lets you do things you would expect from a terminal, like signal handling (ctrl+c), tab-completion or using a text editor like vim. This is the command to spawn the server process:

$ socat exec:'bash -li',pty,stderr,setsid,sigint,sane tcp-listen:22222

Once we have created both the frp reverse tunnel and the socat server, we’re able to establish the connection from a client to the container. This can be done from an arbitrary Linux system with socat installed, or by using something like Mintty with MSYS2 or Cygwin on Windows. The native Windows cmd.exe technically also works, but lacks a few of the interactive shell features mentioned above. As on the server side, we also set a few additional options to make the shell fully interactive:

$ socat file:$(tty),raw,echo=0,escape=0x0f tcp:jenkins.gee-whiz.de:17000
$ ls -al
total 4
drwxr-xr-x 1 jenkins jenkins  30 Feb 14 08:46 .
drwxr-xr-x 1 jenkins jenkins 300 Feb 14 08:46 ..
drwxr-xr-x 1 jenkins jenkins 136 Feb 14 08:46 .git
-rw-r--r-- 1 jenkins jenkins 676 Feb 14 08:46 Jenkinsfile

Side note: Instead of socat you may also use a real SSH daemon, which allows any standard SSH client to connect. You would also automatically gain features like authentication, encryption and file transfer.

Jenkins shared library

Doing everything above manually inside a sh step would be way too cumbersome. Instead we’ll use the Jenkins shared library mechanism to neatly encapsulate all of those actions in a single custom step. This enables us to halt the execution of the Jenkinsfile and start the debug shell with just a single command. We’ll call this custom step debugShell().

We’re not diving into all the details of creating shared libraries here. Also note that this version is heavily stripped down to get the basic idea across. Among other things, the full version adds automatic port determination and therefore multi-user capability.

This is the content of the file debugShell.groovy inside the vars directory of our shared library:

#!/usr/bin/env groovy

def call() {
    echo "------------------------------"
    echo "Spawning debug shell socket"
    echo "------------------------------"
    echo "Reverse connection endpoint to localhost:22222 created on: jenkins.gee-whiz.de:17000"
    echo "Connect via: socat file:\$(tty),raw,echo=0,escape=0x0f tcp:jenkins.gee-whiz.de:17000"
    echo "Listening for incoming connection and pausing execution until connection terminates"
    echo "------------------------------"
    
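    sh '/tmp/debugshell/create.sh'
    // Start the frp client in the background to establish the reverse tunnel
    sh 'frpc -c /etc/frp/frpc.ini &'
    // Give the tunnel a moment to come up
    sh 'sleep 1'
    // Serve an interactive bash on port 22222; this blocks until the client disconnects
    sh 'socat exec:"bash -li",pty,stderr,setsid,sigint,sane tcp-listen:22222'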
}

Once debugShell() has been called, we can use socat as a client to connect to the buildcontainer. To do so, just execute the echoed command. As the created shell inherits the environment of the current build at the exact location where the function is called, we gain access to every environment variable as well as to injected configurations and credentials. We’re also able to edit files inside the workspace with our favorite editor. As an example, we could modify the content of a pom.xml file and run Maven on our changes. Upon exiting the debug shell, the build resumes.

Putting it all together

Surely the approach laid out above was suitable for conveying the basic gist of the setup. When running it in production, though, we add a few more things:

  • Automatically determine an unused port for socat’s listening socket inside the container
  • Use a remote_port of 0 to let the frp server automatically determine an unused port within the specified port range on the publicly reachable machine
  • Generate a corresponding frp client configuration
  • Feed back the chosen port and the endpoint as a whole to the user as a log statement
  • Use a custom bashrc to print some useful information when a client connects:
      Connected to buildcontainer a163fc77fac0 on Swarm-a163fc77fac0 for job https://jenkins.gee-whiz.de/some_project/some_repository/master/75/ from 127.0.0.1:57964
      Building commit b6050f00bbd7c9a10ebabbd1151dad7130e32e71 on branch master originating from ssh://git@git.gee-whiz.de:2222/some_project/some_repository.git
    
  • Be able to run the debugShell in the background while the build continues
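
Two of these points lend themselves to a short shell sketch: picking an unused local port and printing a banner when a client connects. This is an illustration under assumptions, not the code from the repository. The function names first_free_port, listening_ports and print_debug_banner are made up, listening_ports assumes that ss from iproute2 is available inside the container, and the banner relies on variables such as BUILD_URL, NODE_NAME, GIT_COMMIT, GIT_BRANCH and GIT_URL, whose availability depends on the Jenkins and plugin setup.

```shell
#!/bin/sh
# Print the first port in [start, end] that is not contained in the
# space-separated list of used ports; return 1 if the range is exhausted.
first_free_port() {
    start=$1; end=$2; used=" $3 "
    port=$start
    while [ "$port" -le "$end" ]; do
        case "$used" in
            *" $port "*) ;;                 # already taken, try the next one
            *) echo "$port"; return 0 ;;
        esac
        port=$((port + 1))
    done
    return 1
}

# Collect the locally listening TCP ports as a space-separated list.
# Assumption: ss (iproute2) is installed in the build image.
listening_ports() {
    ss -Htln 2>/dev/null | awk '{ n = split($4, a, ":"); print a[n] }' | tr '\n' ' '
}

# Print some context for the developer, e.g. when sourced from a custom bashrc.
# Unset variables fall back to "unknown" instead of producing empty fields.
print_debug_banner() {
    echo "Connected to buildcontainer $(uname -n) on ${NODE_NAME:-unknown} for job ${BUILD_URL:-unknown}"
    echo "Building commit ${GIT_COMMIT:-unknown} on branch ${GIT_BRANCH:-unknown} originating from ${GIT_URL:-unknown}"
}

# Example with a fixed list of used ports:
first_free_port 22222 22232 "22222 22223"   # prints 22224
```

In a real setup the list of used ports would come from listening_ports, as in first_free_port 22222 22232 "$(listening_ports)". Note that such a check is inherently racy: another process may grab the port between the check and the moment socat binds it.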

Go investigate all the details on GitHub, if you like. Or drop us an email if you have any questions or suggestions.