
I have a Debian 12 server (public IP 85.xxx.xxx.xxx on enp6s0) running a bunch of LXC containers on a network bridge, cbr0.

Since the public IP is dynamic I had to set up forward + prerouting rules with dnat so that incoming requests reach the containers. E.g. ports 80/443 are dnatted to the container 10.10.0.1. Here are my nftables rules:

flush ruleset

table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;

    ct state {established, related} accept
    iifname lo accept
    iifname cbr0 accept
    ip protocol icmp accept
    ip6 nexthdr icmpv6 accept
  }
  chain forward {
    type filter hook forward priority 0; policy accept;
  }
  chain output {
    type filter hook output priority 0;
  }
}

table ip filter {
  chain forward {
    type filter hook forward priority 0; policy drop;

    oifname enp6s0 iifname cbr0 accept
    iifname enp6s0 oifname cbr0 ct state related, established accept
    # Webproxy
    iifname enp6s0 oifname cbr0 tcp dport 80 accept
    iifname enp6s0 oifname cbr0 udp dport 80 accept
    iifname enp6s0 oifname cbr0 tcp dport 443 accept
    iifname enp6s0 oifname cbr0 udp dport 443 accept
  }
}

table ip nat {

  chain postrouting {
    type nat hook postrouting priority 100; policy accept;
  }

  chain prerouting {
    type nat hook prerouting priority -100; policy accept;
    # Webproxy
    iifname enp6s0 tcp dport 80 dnat to 10.10.0.1:80
    iifname enp6s0 udp dport 80 dnat to 10.10.0.1:80
    iifname enp6s0 tcp dport 443 dnat to 10.10.0.1:443
    iifname enp6s0 udp dport 443 dnat to 10.10.0.1:443
  }

}

Now the problem is hairpin NAT. I have multiple containers hosting websites, and sometimes some of them need to communicate with other containers using domain names. When they run a DNS query for those domains they get the host's public IP and the communication fails (the classic hairpin NAT / NAT loopback problem).

How can I fix this situation without resorting to DNS hacks? Is there a way to make nftables forward those requests internally while the public IP stays dynamic? How does the iifname match above play into this?

Thank you.

2 Answers


What you probably need is a 'FIB EXPRESSION' from the nft man page. This is the rule I use for my DNAT (you will need to adjust it for your setup):

    chain PREROUTING {
            type nat hook prerouting priority dstnat; policy accept;
            iiftype ppp fib daddr type local dnat ip to meta l4proto . th dport map @PRE-DNAT-IPV4
    }

fib daddr type local is the part that matches any destination address (daddr) that is a local address of the NAT box.

Please note that you may also need to provide an appropriate SNAT mapping for this to work (otherwise return packets may be sent directly over the bridge interface with an internal source address, which won't match any connection on the originating VM).
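
To make that more concrete for the setup in the question, a rough sketch could look like the following (this assumes the bridge subnet is 10.10.0.0/24 and the web container is 10.10.0.1, as suggested by the addresses in the question; adjust names and addresses as needed):

    table ip nat {
      chain prerouting {
        type nat hook prerouting priority -100; policy accept;
        # match traffic addressed to any of the host's own addresses,
        # whichever interface it arrived on, and redirect the web ports
        fib daddr type local tcp dport { 80, 443 } dnat to 10.10.0.1
        fib daddr type local udp dport { 80, 443 } dnat to 10.10.0.1
      }
      chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        # hairpinned (DNATed) bridge-to-bridge traffic needs its source rewritten,
        # otherwise replies bypass the host and the connection fails
        ip saddr 10.10.0.0/24 ip daddr 10.10.0.1 ct status dnat masquerade
      }
    }

The fib match avoids hard-coding the dynamic public address, and the ct status dnat condition keeps the masquerade limited to hairpinned connections.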

  • Why iiftype ppp? Isn't that Point-to-Point Protocol? Should I replace it with ether, since my connection isn't PPP? Anyway, when you say "provide appropriate SNAT mapping", do you mean changing the iifname enp6s0 part in the prerouting chain to something like iifname { enp6s0, cbr0 }? Or something else? Thank you.
    – TCB13
    Sep 30 at 12:25
  • This is the rule I use, so adjust as necessary for your setup. You may not need iiftype at all. An SNAT rule will be necessary if the DNATed packet leaves the NAT box through the same interface on which it arrived (so the reply hits the NAT box on its way back to the originator).
    – Tomek
    Sep 30 at 14:15

Okay, I managed to fix this and also simplified my config.

  1. The forward chain doesn't need to explicitly accept traffic for specific ports; I can just tell it to forward anything that has a dnat rule specified later on:
table ip filter {
  chain forward {
    type filter hook forward priority 0; policy drop;
    # cbr0->anywhere (including to itself)
    iifname cbr0 accept
    # publicif->cbr0 only related connections
    iifname enp6s0 oifname cbr0 ct state related, established accept
    # publicif->cbr0 allow fwd for things that have dnat rules
    iifname enp6s0 oifname cbr0 ct status dnat accept
  }
}
  2. Use a FIB expression as described in the other answer, and make sure traffic to be dnatted can come from both enp6s0 and cbr0:
table ip nat {

  chain postrouting {
    type nat hook postrouting priority 100; policy accept;
    ip saddr 10.0.0.0/24 masquerade
  }

  chain prerouting {
    type nat hook prerouting priority -100; policy accept;

    # Webproxy
    iifname { enp6s0, cbr0 } tcp dport { 80, 443 } fib daddr type local dnat to 10.0.0.1
    iifname { enp6s0, cbr0 } udp dport { 80, 443 } fib daddr type local dnat to 10.0.0.1
 
  }

}

Now it all seems to work; however, there's a slight detail: when my webproxy container gets requests from the internet (enp6s0) it logs the real remote machine's IP as the source of those requests, but when they come from another container the source IP appears as 10.0.0.254, which is the host's IP on cbr0.
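
As far as I can tell this is inherent to hairpin NAT: the hairpinned connection has to be masqueraded on its way back into cbr0, otherwise the target container would answer the source container directly while the source still expects a reply from the public IP. If you want to make that explicit and keep the rewrite limited to where it is actually needed, the postrouting chain could be split up roughly like this (just a sketch, assuming the bridge subnet is 10.0.0.0/24 and nothing else relies on the broad masquerade rule):

  chain postrouting {
    type nat hook postrouting priority 100; policy accept;
    # bridge -> internet: containers go out with the dynamic public IP
    ip saddr 10.0.0.0/24 oifname enp6s0 masquerade
    # bridge -> bridge via dnat (hairpin): source becomes the host's bridge IP
    ip saddr 10.0.0.0/24 oifname cbr0 ct status dnat masquerade
  }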

  • I am glad you managed to solve it. There is room for further simplification: you can probably drop the match on iifname; you can fold the two webproxy rules into one using the meta l4proto and th dport expressions (a sketch follows after these comments); and I would use snat instead of masquerade, as it probably has slightly lower overhead (but requires specifying the snat address).
    – Tomek
    Oct 1 at 20:36
  • Regarding meta l4proto: isn't that slower / less safe than two separate tcp/udp rules (spinics.net/lists/netfilter/msg57618.html)? I'm also not sure I can use snat instead of masquerade, since there's traffic going from the internal bridge cbr0 to the internet via enp6s0 that should get the dynamic public IP. To be fair, I'm not even sure that's required, or whether that change to the source IP happens elsewhere, because there's also a table ip io.systemd.nat that handles it for postrouting. Thank you.
    – TCB13
    Oct 2 at 10:35
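
For reference, the folded rule from the comment above could look roughly like this (only a sketch; whether one combined rule is preferable to two separate tcp/udp rules is exactly what the last comment questions):

  chain prerouting {
    type nat hook prerouting priority -100; policy accept;
    # Webproxy: one rule covering both tcp and udp on ports 80/443
    meta l4proto { tcp, udp } th dport { 80, 443 } fib daddr type local dnat to 10.0.0.1
  }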
