Installing Docker on Debian with nftables

I’m going to assume you have a working and secured Debian install; I’ll be starting from where this article ended. The big issue we’ll face during setup is getting Docker to work with nftables. Actually, I’m not even going to try to get them to work together. I’m just going to configure nftables myself. Unfortunately that will require far more reading and research than I really have time for right now, but I don’t see many options unless I strip out nftables, and that’s not going to happen either!

Installing Docker

I highly recommend that Docker be installed from the official Docker apt repository, the instructions for which are found here. It’s also possible to install from the Debian repositories, but the version you’ll get will be woefully out of date (Docker is currently on version 27.1 but the Debian stable repository still has version 20.10).

Fixing the Firewall

The Problem

The last step of the installation instructions is to run the hello-world container.

sudo docker run hello-world

Due to the problems with Docker and nftables I was expecting this to fail, but it worked just fine. On a hunch I decided to take a look at the firewall (sudo nft list ruleset) and what did I see? A whole host of new tables, chains and rules. I didn’t realize it, but the base Debian system comes with iptables-nft, a shim that converts iptables rules to nftables rules, which Docker used to create the entries it needs.

The next test I did was to restart the firewall (sudo systemctl restart nftables.service) and see what it looked like then. As I expected, all the Docker rules were gone. Interestingly the hello-world container still ran fine. My guess is that’s because it doesn’t require any network access.

On a hunch again I restarted Docker and checked the firewall. All the Docker rules were back. That’s interesting because it means that, as long as I remember to restart Docker after I restart the firewall, this setup will likely work (I assume this can be automated with systemd). There are some really big caveats though. I don’t know that Docker is writing working rules at this point. Docker writes quite a lot of rules and I’m not good enough with nftables yet to fully understand what it’s doing. Finally, Docker has a nasty habit of exposing things that I’d rather not have exposed.
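A minimal sketch of how that systemd automation might look, assuming the units are named docker.service and nftables.service (as they are with the official packages on Debian). The function name and the drop-in file name are made up for illustration. PartOf= propagates a stop or restart of nftables.service to docker.service, and After= makes sure Docker comes back up once the firewall has been reloaded, so it gets to rewrite its rules.

```shell
#!/bin/sh
# Write a systemd drop-in that ties docker.service to nftables.service.
# $1 is the drop-in directory. On a real system that would be
# /etc/systemd/system/docker.service.d (assumption: Docker's unit is
# called docker.service).
write_docker_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    # The file name is arbitrary (hypothetical), it only has to end in .conf.
    cat > "$dropin_dir/restart-with-nftables.conf" <<'EOF'
[Unit]
# A stop or restart of nftables.service is propagated to docker.service.
PartOf=nftables.service
# Start Docker after the firewall so Docker's rules are written last.
After=nftables.service
EOF
}

# Example (needs root, so it is left commented out):
#   write_docker_dropin /etc/systemd/system/docker.service.d
#   systemctl daemon-reload
```

Note that this only covers a restart of the nftables service. Reloading the ruleset by hand with nft -f would still wipe Docker’s rules without restarting Docker.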

So, what’s the solution? I’m honestly not sure at this point. I’m not running a lot of services and they don’t require complicated network setups so I’m tempted to just try managing the firewall by hand.

A Solution – Manage the Docker Network Manually

Configure the Docker Daemon

The Docker daemon can be configured by creating the file /etc/docker/daemon.json and populating it with custom settings (see here). The next step of the setup is to prevent Docker from creating its own firewall rules and to set up a custom Docker network. In order to do this you’ll need to read and understand at least the section on packet filtering and firewalls in the Docker documentation. A good overview of the settings available for the dockerd daemon is also required. Long story short, a configuration file like this will set up Docker to not use iptables and will switch the non-routable IP range it uses.

{
  "iptables" : false,
  "ip6tables" : false,
  "bip": "10.0.0.1/24",
  "fixed-cidr": "10.0.0.0/25",
  "default-address-pools": [
    {
      "base":"10.10.0.0/16",
      "size":24
    }
  ]
}

I’ll explain what this is doing as best as I understand it. The lines "iptables" : false and "ip6tables" : false prevent Docker from trying to create its own iptables rules. The Docker manual has a stern warning telling you this is likely to break your containers, and it probably will.

The next line, "bip": "10.0.0.1/24", configures the IP address of the default bridge (typically called docker0). It is very briefly mentioned in the Docker documentation here. I have read that the bip setting must specify an IP address ending in a 1 and that it has to be provided with a subnet mask.

The "fixed-cidr": "10.0.0.0/25" line restricts the range of IP addresses that Docker hands out to containers on the default bridge network. It must be a subset of the range specified in the bip setting. The best information I can find about this is an old question which references information in the manual that has been removed.

Following that is a block called "default-address-pools" which specifies where networks can be created for containers. This range has to be outside the range specified in the bip setting. It can be very large (it is by default) but here I’m limiting it to a /16 split into /24 networks, which gives 256 networks of 256 addresses each (254 usable hosts per network).
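The two range constraints and the pool arithmetic are easy to get wrong, so here is a small sanity check in plain shell. It is only a sketch: the helper functions ip_to_int and cidr_within are my own names, not part of Docker, and the values are simply copied from the daemon.json above.

```shell
#!/bin/sh
# Values copied from the daemon.json above.
bip="10.0.0.1/24"
fixed_cidr="10.0.0.0/25"
pool_base="10.10.0.0/16"
pool_size=24

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit 0) when CIDR $1 lies entirely inside CIDR $2.
cidr_within() {
    inner_ip=${1%/*}; inner_len=${1#*/}
    outer_ip=${2%/*}; outer_len=${2#*/}
    # The inner prefix must be at least as long as the outer one.
    [ "$inner_len" -ge "$outer_len" ] || return 1
    # Network mask of the outer range, e.g. /24 -> 255.255.255.0.
    mask=$(( (4294967295 << (32 - outer_len)) & 4294967295 ))
    inner_net=$(( $(ip_to_int "$inner_ip") & mask ))
    outer_net=$(( $(ip_to_int "$outer_ip") & mask ))
    [ "$inner_net" -eq "$outer_net" ]
}

# fixed-cidr must be a subset of the bip range.
cidr_within "$fixed_cidr" "$bip" && echo "fixed-cidr is inside bip"

# Two CIDR blocks overlap only if one contains the other.
if ! cidr_within "$pool_base" "$bip" && ! cidr_within "$bip" "$pool_base"; then
    echo "address pool does not overlap bip"
fi

# How many networks the pool yields and how big each one is.
base_len=${pool_base#*/}
networks=$(( 1 << (pool_size - base_len) ))
addresses=$(( 1 << (32 - pool_size) ))
echo "$networks networks of $addresses addresses ($(( addresses - 2 )) usable hosts each)"
```

With the values above this reports that fixed-cidr sits inside bip, that the pool does not overlap bip, and that the pool yields 256 networks of 256 addresses.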

Restart Docker and check that it starts cleanly and that it hasn’t created anything in the firewall.

Upon restarting I found that Docker had still created some of the firewall rules. I found a number of people claiming that Docker doesn’t respect the --iptables=false setting, but my problem was that I hadn’t, at that point, added the "ip6tables" : false setting to the json file. I’ve not once seen that setting specified, so I wonder if people are, like me at first, seeing some Docker tables and not noticing that they are only for IPv6.

As a quick check that everything is set up correctly, take a look at the bridge setup with sudo docker network inspect bridge. You should see a range of settings including:

"Config": [
        {
            "Subnet": "10.0.0.0/24",
            "IPRange": "10.0.0.0/25",
            "Gateway": "10.0.0.1"
        }
    ]

Create a Network

Create a bridge network within the specified range for pools.

sudo docker network create web --subnet 10.10.0.0/24 -o com.docker.network.bridge.name=web0

Listing Docker networks will show you your new network:

$ sudo docker network list

NETWORK ID     NAME         DRIVER    SCOPE
c85f29752472   bridge       bridge    local
4342a17f02e6   host         host      local
455cbcd633ce   none         null      local
9542e0911dad   web          bridge    local

Deploy Nginx

Just about the simplest compose deployment of Nginx is shown below. This publishes the Nginx ports 80 and 443 to the host machine which, if you have allowed access on the input chain of the firewall, will now be publicly accessible.

Just place this file, called compose.yaml, in a sensible place. For example create a directory called docker and inside that a directory called nginx so the path would be something like ~/docker/nginx/compose.yaml.

Inside the nginx directory I created a config directory which is where I’ll store the nginx.conf file. I also created a child conf.d directory which is where most of the configuration will take place. Both the nginx.conf file and the conf.d directory are mapped into the container at start. You can either write those files yourself or, more simply, start the container and then copy them out with a command such as this (you might need to tweak the destination path):

sudo docker cp nginx:/etc/nginx/conf.d/default.conf ~/docker/nginx/config/conf.d/default.conf

While I’m sure it would be possible to link everything up using the dynamic IP addresses that Docker supplies, I much prefer to give each service a static IP address, so I set it here. I also move the container from a dynamically created subnet into the web subnet created above.

services:
  nginx:
    container_name: nginx
    image: nginx:latest
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./config/nginx.conf:/etc/nginx/nginx.conf
      - ./config/conf.d:/etc/nginx/conf.d
    networks:
      web:
        ipv4_address: 10.10.0.2

networks:
  web:
    external: true

The firewall rule that allows access to this service is just a standard rule on the input chain (I’m assuming you have a firewall setup similar to this). For example this rule will work:

tcp dport { http, https } accept;

This works because when you publish a port in Docker it, by default, publishes it to all interfaces. Personally, I’m not a big fan of that behaviour but that’s the way it works and I’m taking advantage of it here. Later I’ll

Deploy a Debian Container

You don’t need to deploy this container but it helps to understand how Docker will interact with nftables. Right now (assuming you have a firewall similar to the one described here) it’s not possible for containers to access the outside world. Create a new folder ~/docker/debian and add a compose file with the following settings:

services:
  debian:
    container_name: debian
    image: debian:latest
    restart: unless-stopped
    tty: true
    networks:
      web:
        ipv4_address: 10.10.0.10

networks:
  web:
    external: true

Notice this is very similar to the Nginx configuration, just without the published ports and volumes and with the addition of a tty. The container isn’t set up to do anything in particular, so without the tty setting it would exit immediately after starting. Adding the tty line keeps the container’s shell running. Start the container in detached mode:

sudo docker compose up -d

List running containers:

$ sudo docker container ls
CONTAINER ID   IMAGE           COMMAND                  CREATED          STATUS                         PORTS                                                                      NAMES
0eb03c7334bb   debian:latest   "bash"                   38 seconds ago   Restarting (0) 7 seconds ago                                                                              debian

Open a shell in the container (remember this command is an alias of docker container exec if you are looking for the documentation):

sudo docker exec -it debian bash

You should get a new command prompt, root@container_id:/#, which means you are now in the container (to leave just enter the command exit). Ideally at this point I’d examine the network of the container using the ip command, but as this is a minimal container that command isn’t present and there’s no easy way to add it at the moment. If you run apt update though, you’ll get a selection of error messages telling you that the system failed to fetch updates. This is because, at the moment, the firewall neither forwards nor NATs packets from the containers.

To make the firewall forward packets from the container to the outside world you need to set up masquerading. Your firewall will look something like this at the end:

#!/usr/sbin/nft -f

flush ruleset

define home_ip = w.x.y.z
define wan_if = eth0
define dck_web_if = web0
define server_ip = a.b.c.d

table inet filter {
        chain input {
                type filter hook input priority filter; policy drop;
                iifname lo accept;
                ct state established,related accept;
                ct state invalid drop;
                icmp type echo-request counter limit rate 1/second accept;
                tcp dport ssh accept;
                tcp dport { http, https } accept;
        }

        chain forward {
                type filter hook forward priority filter; policy drop;
                ct state established,related accept;
                ct state invalid drop;
                iifname $dck_web_if accept;
        }

        chain output {
                type filter hook output priority filter; policy accept;
        }

        chain postrouting {
                type nat hook postrouting priority srcnat; policy accept;
                oifname $wan_if masquerade;
        }
}

The changes from the basic firewall, described here, are the addition of the postrouting chain and some new forwarding rules. The postrouting chain tells the firewall to masquerade packets leaving through the external interface. The changes to the forward chain tell the firewall to accept packets that are being forwarded from the Docker containers on the web0 bridge. You could make this rule broader if you had multiple subnets you wanted to forward from. I also created a few variables at the top just to keep things neat.

Note that if you don’t want to publish ports from containers (as I’ve done with Nginx) then you could alternatively add rules to a prerouting chain to do destination NAT (port mapping). This tells the firewall to map a port on the external interface to an address and port somewhere in the Docker network.
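As a rough sketch of what those destination NAT rules could look like, the shell function below prints the nftables lines for one port mapping. The function name is made up, the rules have not been tested against this exact firewall, and I’m assuming the $wan_if and $dck_web_if variables from the ruleset above and a kernel recent enough to support NAT in an inet table (which the postrouting chain above already relies on).

```shell
#!/bin/sh
# Print nftables rules that map a TCP port on the external interface to a
# container. Arguments: host port, container IPv4 address, container port.
# $wan_if and $dck_web_if are nftables variables defined at the top of the
# ruleset. They are escaped so the shell passes them through untouched.
print_dnat_rules() {
    host_port=$1
    container_ip=$2
    container_port=$3
    cat <<EOF
# Goes in a new chain declared with:
#   type nat hook prerouting priority dstnat; policy accept;
iifname \$wan_if tcp dport $host_port dnat ip to $container_ip:$container_port;

# Goes in the forward chain so the translated packets are let through.
iifname \$wan_if oifname \$dck_web_if ip daddr $container_ip tcp dport $container_port accept;
EOF
}

# Map port 80 on the host to the Nginx container from earlier.
print_dnat_rules 80 10.10.0.2 80
```

With this approach the ports section can be dropped from the compose file, and the matching accept rule on the input chain is no longer needed because the packets travel through the forward chain instead.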

For a basic setup that’s pretty much all there is to it. Docker can create some very complex networks, which will require more work on the firewall if you are to keep the containers secure. For simple setups, though, it’s about the same amount of work to set up the firewall yourself as it is to let Docker do it and then understand what it has done.

Don’t forget to stop and remove the Debian container (careful with the second command):

sudo docker compose down
sudo docker container prune
