Context
When designing and teaching computer science labs, one needs to let students start easily, while ensuring that the lab can be done the same way by everyone and that the teacher can troubleshoot it easily. Between the labs I was taught and the ones I taught myself, I have witnessed several ways of distributing lab environments to students, each with its ups and downs. The easiest is to simply ask the students to install the necessary tools and software, either on their personal machines or on those available in the lab room. However, this does not guarantee that the students get the same, reproducible experience, which makes troubleshooting by the teacher harder. Another way is to provide virtual machines, but creating and installing such VMs can be time-consuming, and some students may lack the resources to run VMs smoothly. Finally, hosting the lab on remote VMs for students is another possibility, but that requires servers with quite a lot of resources.
I therefore decided to try to experiment with a new kind of lab, where:
- Each student can start straight away without any kind of installation,
- Each student has their progress saved automatically,
- Each student still has access to the lab after classes for a set amount of time,
- The solution is lightweight for me to host and easy to deploy.
My dream was an on-demand, containerized environment for students, accessible through SSH. Students could spin up ready-to-use environments simply by connecting with SSH, which is readily available or easily installable on most Linux distributions and macOS, and now available by default on recent versions of Windows. During classes, the infrastructure needs to be able to serve all students at the same time; after classes, however, I leave the lab up for weeks, and it should then consume at most a few containers’ worth of resources at any given time.
Setup
This part presents the different tools I used to set up my lab. You can find all the configuration files I used in this repository.
ContainerSSH
ContainerSSH is a tool that launches a new container for each new SSH connection. It works with Docker, Podman, or Kubernetes, and is driven by a YAML configuration file. Below is an example close to the one I used for my lab.
Nothing fancy about this file: it provides the IP and port on which the ContainerSSH server will listen, the URLs of the authentication and configuration servers (we will come back to these later), the container image to use, the container engine’s configuration, and a few more things.
```yaml
ssh:
  listen: 0.0.0.0:2222
  banner: |
    Welcome to the Buffer Overflow TP
  hostkeys:
    - ./host.key
log:
  level: debug
auth:
  password:
    method: webhook
    webhook:
      url: http://127.0.0.1:5000
configserver:
  url: http://127.0.0.1:5000
backend: docker
docker:
  connection:
    host: unix:///run/user/1000/podman/podman.sock
  execution:
    container:
      image: docker.io/mh4ckt3mh4ckt1c4s/tp-bof-tsp
      cmd:
        - /bin/sh
      user: user
```
You will have to generate an SSH host key with the `openssl genrsa > host.key` command.
The ContainerSSH documentation suggests using Docker, but I prefer Podman for its out-of-the-box rootless configuration. Install Podman, then enable the Podman user service with `systemctl enable --now --user podman.socket` as the user that will run the ContainerSSH service. You can then get the path of the socket with `systemctl status --user podman.socket` and put the correct value into the config file. In the example above, the socket belongs to the user with UID 1000.
For my first setup I was lazy and ran everything in a tmux session: just run `containerssh -config config.yaml` to get started.
Python authentication and configuration server
ContainerSSH can use a custom authentication server to allow (or deny) access to a container when a new SSH connection is initiated. It can also optionally use a configuration server to tailor the container to the user that just connected. ContainerSSH provides a Go library to easily develop these servers, but as the API is really simple and nicely documented, I chose to implement them in Python using Flask.
Authentication server
Several methods are available for authentication, such as Kerberos or OIDC, but I chose to implement a simpler webhook-based method with my own HTTP server.
This authentication method uses two routes: `/password` for password-based authentication and `/pubkey` for key-based authentication. Here is the code of my Flask server for this part:
```python
import base64

from flask import Flask, jsonify, request

app = Flask(__name__)
PASSWORD = "changeme"  # the shared password given to the students

@app.route('/password', methods=['POST'])
def handle_password():
    data = request.json
    username = data['username']
    remote_address = data['remoteAddress']
    connection_id = data['connectionId']
    client_version = data['clientVersion']
    password_base64 = data['passwordBase64']
    if base64.b64decode(password_base64).decode() == PASSWORD:
        return jsonify({'success': True, 'authenticatedUsername': username})
    return jsonify({'success': False})

@app.route('/pubkey', methods=['POST'])
def handle_pubkey():
    return jsonify({'success': False})
```
I chose to provide a fixed password for my students, with no username verification, as I was behind the school firewall. Of course, I do not recommend this if you expose your server to the outside world.
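You can sanity-check the handler without a running ContainerSSH instance by reproducing the payload it sends (the password arrives base64-encoded). A minimal sketch, assuming the same fixed-password check as above (`check_password` and the `"changeme"` secret are stand-ins of mine, not part of the real setup):

```python
import base64

PASSWORD = "changeme"  # stand-in for the real shared secret

def check_password(payload: dict) -> bool:
    # Mirrors the /password handler: decode passwordBase64 and compare.
    return base64.b64decode(payload["passwordBase64"]).decode() == PASSWORD

payload = {
    "username": "alice",
    "passwordBase64": base64.b64encode(b"changeme").decode(),
}
print(check_password(payload))  # prints True; a wrong password prints False
```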
Configuration server
The configuration server is a little more interesting. If authentication succeeds, ContainerSSH makes another request to this server, on the `/` endpoint, with information about the newly authenticated user. The server can answer with a list of parameters to pass to the container runtime before spawning the container (this list of parameters is available in the backend documentation).
This is interesting because, in my use case, I can configure the container to save each student’s progress by mounting the container’s `/home` folder onto a folder named after the student. I use the username to determine whether the student has already connected: if a folder with their name exists, I mount it; otherwise, I create a new empty folder and mount that.
```python
@app.route('/', methods=['POST'])
def handle_json_request():
    data = request.json
    authenticated_username = data.get('authenticatedUsername')
    # Check if the folder exists, otherwise create it
    folder_path = os.path.join('./tp_data', authenticated_username)
    if not os.path.exists(folder_path):
        os.umask(0o000)
        os.makedirs(folder_path, mode=0o777, exist_ok=True)
        os.umask(0o022)
    folder_path = os.path.abspath(folder_path)
    return jsonify({"config": {"docker": {"execution": {"host": {"binds": [f"{folder_path}:/home/user:z"]}}}}})
```
I had trouble with the mapping of permissions between the container and the host, so I create the folders with 777 permissions, but I am sure this could be fixed with a little more investigation.
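One likely culprit is rootless Podman's user-namespace mapping: container UIDs land inside the host user's subordinate UID range from `/etc/subuid`. A hypothetical helper computing which host UID a container UID maps to (parsing logic only; whether chown-ing the folder to that UID replaces the 777 workaround is something I have not verified):

```python
def host_uid_for(container_uid: int, subuid_entry: str) -> int:
    # A subuid entry looks like "student:100000:65536" — user name,
    # start of the subordinate range, and range length. With the default
    # rootless mapping, container uid 0 maps to the invoking user, and
    # container uid N (N >= 1) maps to start + N - 1.
    _, start, count = subuid_entry.split(":")
    start, count = int(start), int(count)
    if not 1 <= container_uid <= count:
        raise ValueError("uid outside the mapped range")
    return start + container_uid - 1

print(host_uid_for(1000, "student:100000:65536"))  # → 100999
```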
Security
ContainerSSH provides documentation on how to harden its SSH server and secure the HTTP connections it handles, along with a more generic hardening guide. Being protected by the school’s firewall, I did not dive into these options, but it’s nice to know they are available for more complex setups.
ContainerSSH also proposes a guide for securing the container backend (Docker/Podman or Kubernetes), i.e. the containers the students will work in. I will focus on the recommendations of the Docker/Podman guide, since that is the setup I adopted.
The first piece of advice is to secure the Podman socket using certificates, but as Podman runs on the same machine as ContainerSSH in my case, I did not bother to put this in place. The next advice is to run the containers as a non-root user, which I already achieve by using a user-enabled Podman socket.
The other recommendations are about limiting the containers’ CPU, RAM, disk access, number of processes, etc., in order to protect against resource exhaustion or DoS attacks. Again, as I was dealing with (not that many) students, I did not bother to put these limits in place. Sadly, I should have known better: I suspect that students deliberately tried to escape the containers they were in, causing slowdowns for all users. I will definitely add these limits for future labs and see if that improves the situation.
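One place such limits could live is the configuration server's answer, alongside the bind mounts. A minimal sketch, assuming the `host` section accepts the usual Docker `HostConfig`-style fields (`memory`, `nanoCpus`, and `pidsLimit` are my assumption about the field names; check the backend documentation before relying on them):

```python
def limited_config(folder_path: str) -> dict:
    # Hypothetical per-container resource limits; the exact field names
    # under "host" are an assumption based on Docker's HostConfig.
    host = {
        "binds": [f"{folder_path}:/home/user:z"],
        "memory": 256 * 1024 * 1024,  # 256 MiB of RAM per container
        "nanoCpus": 500_000_000,      # half a CPU core
        "pidsLimit": 64,              # guards against fork bombs
    }
    return {"config": {"docker": {"execution": {"host": host}}}}
```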
Creating the container image
ContainerSSH advises building upon the container image they provide, which installs an agent inside your image and enables better support for some ContainerSSH features. You can do this with the following lines at the beginning of your `Containerfile`:
```dockerfile
FROM containerssh/agent AS agent
# Using Debian here, but you can use nearly whatever image you'd like
FROM docker.io/debian:12
COPY --from=agent /usr/bin/containerssh-agent /usr/bin/containerssh-agent
```
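From there, the image can be completed like any other. A hypothetical continuation (the package list is illustrative, not my actual lab image) that installs a few tools and creates the unprivileged `user` account referenced in the ContainerSSH config:

```dockerfile
# Hypothetical continuation: tools for the lab and the "user" account
# that the ContainerSSH config spawns the container as.
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc gdb make && \
    rm -rf /var/lib/apt/lists/*
RUN useradd --create-home --shell /bin/sh user
WORKDIR /home/user
```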
Feedback
My experience
I ran this setup for two different student sessions, and overall it went very well. The students were happy with the workflow, which let them use whichever computer they wanted (their personal one or a lab room machine) without constraints. I still think there is room for improvement, however, especially if I want to systematize the process across all my labs.
What I am looking to try next
When I first experimented with ContainerSSH, I was aiming for a goal I could not achieve: allowing users to connect to a container and use VS Code with Remote SSH inside it. This would have eased the way students interact with the lab, and could have many other applications in general. However, I was not able to do it, mainly because each new SSH session starts a new container, even when the same person connects twice, which breaks VS Code’s Remote SSH. This problem is currently tracked in this issue. Furthermore, I am not sure that fixing it will be enough, as ContainerSSH does not offer all the features of a full-blown SSH server, some of which VS Code Remote SSH may rely on.
Another problem I encountered is that ContainerSSH does not support local images, so I had to push my image to DockerHub and make ContainerSSH use it. Not only is this inconvenient for container images we may not want to make publicly accessible, it also makes ContainerSSH dependent on an internet connection, and makes iterating on the image slower than if it were available locally (especially as ContainerSSH seems to pull the image for each SSH connection, which is not optimal at all). I opened an issue to see if local image support can be implemented. As an alternative, it seems that ContainerSSH supports connecting to registries that require authentication (see this issue, although I’m not sure how to configure this).
Finally, I am looking to implement a better authentication system. I was thinking of linking the authentication server to the school’s SSO system, but this is unlikely to be put in place. Another possibility would be to collect a database of usernames and SSH keys at the beginning of the year and use this for authentication, but this is not without drawbacks (of course some students will forget their username or lose their SSH key…)
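The key-based variant could reuse the `/pubkey` webhook shown earlier. A hypothetical sketch, assuming the webhook payload carries the client key in OpenSSH `authorized_keys` format (my assumption about the payload) and a roster collected at the start of the year (`ROSTER` and `is_authorized` are names I made up):

```python
# Hypothetical roster mapping usernames to the OpenSSH public keys
# collected from the students at the beginning of the year.
ROSTER = {
    "alice": "ssh-ed25519 AAAAexamplekeyalice alice@school",
}

def is_authorized(username: str, public_key: str) -> bool:
    # Accept the connection only if the presented key matches the one
    # on file for that username.
    return ROSTER.get(username) == public_key
```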
Conclusion
Overall, it was a very nice experience: ContainerSSH really helped me put in place a lab that is lightweight, secure, and easy to access for students. Browsing the ContainerSSH repository, I saw that a lot of other users have the same use case as I do, so one may hope to see this approach become more popular in the future.
The solution is of course not perfect, but I hope to be able to improve it in the future. I also have some additional options to explore in order to strengthen its security. Finally, I would like to experiment with the Kubernetes backend and try to set up a more robust deployment (something better than a tmux session, at least).
I also learned a few things about containers along the way, so that’s another plus.