NGINX Proxy Manager with Docker Swarm + Host Networking
I've been using NGINX Proxy Manager (NPM) to expose my web services for a while now, but I had one big problem: when running NPM in a Docker Swarm, the overlay network doesn't let NPM see the IP addresses of the clients connecting to it. That means you can't make access rules based on IP ranges, which is limiting if you route sensitive services through NPM that you wouldn't want just anyone to reach. For example, I have a ton of personal services and admin pages that shouldn't be public, but which I still want to reach from within my LAN.
The solution? Host networking mode. When a container runs with host networking, it bypasses Docker Swarm's overlay networking and binds directly to the host's interfaces. This means NPM can actually see the IP addresses of the clients accessing the servers it proxies. Good! But now you've inadvertently lost high availability. Without the overlay network, only the single host running NPM can answer on ports 80 and 443; with the overlay network, any host in the swarm could answer for NPM, regardless of which host it was running on. That's a problem, since I can only port forward to one IP from my router, and I had been pointing it at a virtual IP address created with Keepalived. Now I'd have to either port forward to the IP of whichever host is running NPM, or find some other solution. The requirement is simple: give NPM access to host networking while keeping it as highly available as it was on the overlay network.
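For reference, a host-networked NPM service on a swarm can be created along these lines. This is just a rough sketch, not my exact deployment: the service name matches the script further down, the image and options are illustrative, and volumes/database configuration are left out.
# Rough sketch: NPM as a single-replica swarm service attached to the host
# network, so it binds 80/443 directly on whichever node runs it.
# The initial node.hostname constraint is optional; the notify script below
# takes over managing it. Volumes and database configuration are omitted.
docker service create \
  --name nginx_proxy-manager \
  --replicas 1 \
  --network host \
  --constraint node.hostname==HOST01 \
  jc21/nginx-proxy-manager:latest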
In my setup, I have 4 hosts in my swarm, and with Keepalived I have a virtual IP, aka a VIP, that can move between them. At the router, I port forward NPM traffic, 80 and 443, to this VIP. If the master goes down, the VIP hops to a backup host. Keepalived pairs nicely with Docker Swarm: as long as at least one swarm host is online, services on the swarm stay reachable without ever changing the port forwarding IP as machines come and go. Since Swarm services can be scheduled on any host in the cluster, the combination of Keepalived and swarm networking gives you high availability.
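To check which host currently holds the VIP, you can just look at the interface on each node (the interface name and address here match the example keepalived.conf further down; substitute your own):
# If this prints the address, this host is the current Keepalived master / VIP holder.
ip addr show eno1 | grep 192.168.1.10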
So in summary, running NPM in host mode:
- Can see IP addresses of clients
- Can make access rules with these IPs
- Can only be accessed via 80, 443 on the host running it
- Can't reliably be accessed via the VIP unless NPM happens to be running on the VIP holder
- As a result, port forwarding must point to the machine running NPM, and if NPM moves, the IP must be changed at the router.
NPM in overlay networking mode:
- Can't see IP addresses of clients
- Can't make access rules with IPs
- Can be accessed via 80, 443 on all swarm hosts, since it uses swarm networking
So what do? Thankfully, through the power of vibe-coding, we can take a crack at a scripted solution. After a long chat with an AI agent, I found that Keepalived can run a notify script whenever the VIP moves from host to host, such as when the previous VIP master goes down. That means we can write a bash script to do some magic.
The goal: create a script that ensures NPM is always running on whichever machine holds the VIP. That way I can keep the VIP as the port forward destination and retain high availability, all while running in host networking mode. Brilliant! Sadly, this wasn't straightforward, since AI agents aren't the smartest at predicting the edge cases where a script will fail. I was able to muscle my way through and write a script that does exactly what I want, but it took some effort. Let me run through the process.
First, create notify.sh and give it executable permissions. Then modify keepalived.conf to run that script whenever the VIP changes hosts.
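In practice that boils down to something like this (assuming the paths used in the examples below; note that restarting Keepalived on the current master will briefly hand the VIP to a backup):
# Create the notify script and make it executable.
sudo nano /etc/keepalived/notify.sh     # paste the script below
sudo chmod +x /etc/keepalived/notify.sh

# After editing /etc/keepalived/keepalived.conf, restart Keepalived to pick up the changes.
sudo systemctl restart keepalived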
Below is an example /etc/keepalived/notify.sh
#!/bin/bash
# Keepalived passes these arguments to the notify script.
TYPE=$1       # "INSTANCE" or "GROUP"
NAME=$2       # instance name
STATE=$3      # MASTER, BACKUP, or FAULT
PRIORITY=$4   # priority

SERVICE_NAME="nginx_proxy-manager" # CHANGE THIS to your real service name, e.g. npm_npm or whatever shows in 'docker service ls'

if [ "$STATE" = "MASTER" ]; then
    logger "MASTER: Scaling $SERVICE_NAME to 0 (kill stuck task)"
    docker service scale "$SERVICE_NAME"=0

    logger "MASTER: Removing ALL constraints from $SERVICE_NAME"
    docker service update "$SERVICE_NAME" --constraint-rm node.hostname==HOST01
    docker service update "$SERVICE_NAME" --constraint-rm node.hostname==HOST02
    docker service update "$SERVICE_NAME" --constraint-rm node.hostname==HOST03
    docker service update "$SERVICE_NAME" --constraint-rm node.hostname==HOST04

    logger "Keepalived: Became MASTER – rescheduling $SERVICE_NAME to $(hostname)"
    docker service update --force --constraint-add node.hostname=="$(hostname)" "$SERVICE_NAME"

    logger "MASTER: Scaling $SERVICE_NAME back to 1 on this node"
    docker service scale "$SERVICE_NAME"=1
elif [ "$STATE" = "BACKUP" ]; then
    # Optional: remove the constraint here if needed (Swarm will handle rescheduling on the next update)
    logger "Keepalived: Became BACKUP on $(hostname)"
fi

What it does: this runs on a host right after the VIP moves to it. The script checks whether it has become MASTER (i.e. the owner of the VIP) and, if so, first shuts NPM down and does a cleanup step, then adds a constraint to the service forcing it to run only on the node matching this host's hostname. That forces NPM onto the VIP owner when it starts. It then scales NPM back up.
The cleanup step is important: if it isn't done, every time NPM hops from host to host via this script it accumulates more constraints, eventually making it impossible to schedule. The cleanup step removes all existing constraints before adding the new one. I explicitly remove each possible hostname one at a time, since none of the methods the AI agents recommended for removing all constraints at once ever worked.
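To verify that constraints aren't piling up on the service, you can inspect its current placement constraints directly (using the service name from the script above):
# Print the placement constraints currently set on the service.
docker service inspect --format '{{ .Spec.TaskTemplate.Placement.Constraints }}' nginx_proxy-manager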
Below is an example of a modified /etc/keepalived/keepalived.conf, specifically for the master. Each host needs its own config modified accordingly, adding the global_defs block as well as the notify line at the bottom.
global_defs {
    # Enable script security (recommended)
    enable_script_security
    # Explicitly run scripts as root (since keepalived_script user doesn't exist)
    script_user root
}

vrrp_instance VI_1 {
    state MASTER
    interface eno1
    virtual_router_id 51
    priority 255
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass MYPASSWORD
    }

    unicast_peer {
        192.168.1.21
        192.168.1.22
        192.168.1.23
    }

    virtual_ipaddress {
        192.168.1.10
    }

    notify "/etc/keepalived/notify.sh"
}

Does it work? Yes! If I test it by running sudo systemctl stop keepalived on the current master, I can watch NPM stop and get rescheduled onto the new VIP holder. It's magic!
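To confirm where the task actually ended up after a failover, you can ask Swarm from any manager node (again using my service name; swap in yours):
# Show which node the NPM task is running on and its current state.
docker service ps nginx_proxy-manager --format 'table {{.Name}}\t{{.Node}}\t{{.CurrentState}}'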
All this just so I could create access list rules for IP ranges... I learned a lot doing this, and things are more secure. I'd say it was worth all the effort.