Links Load balancing

Prerequisites :

Netfilter :

  • CONNMARK
  • nth (or the statistic module on recent kernels)
  • condition (for failover, available in xtables-addons)
  • Iproute2

System :

A Linux gateway and 2 internet links (whatever the technology):

  • Link 1: bandwidth 1500, fraction 3
  • Link 2: bandwidth 500, fraction 1

Link 1 therefore carries 3/4 of the connections and link 2 carries 1/4.
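
The fractions above come from dividing both bandwidths by their greatest common divisor. A minimal sketch of that arithmetic follows; the function names gcd and link_weights are made up for this illustration.

```shell
#!/usr/bin/env bash
# Greatest common divisor of two positive integers (Euclid's algorithm).
gcd() {
    local a=$1 b=$2 t
    while (( b != 0 )); do
        t=$(( a % b )); a=$b; b=$t
    done
    echo "$a"
}

# link_weights BW1 BW2
# Prints "W1 W2 POOL", where W1 and W2 are the per-link fractions and
# POOL is the counter size needed to respect the ratio.
link_weights() {
    local g
    g=$(gcd "$1" "$2")
    echo "$(( $1 / g )) $(( $2 / g )) $(( ($1 + $2) / g ))"
}

link_weights 1500 500   # prints: 3 1 4
```

With the two links of this setup, the result is fraction 3 for link 1, fraction 1 for link 2, and a counter of 4.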

Objective

The objective is load balancing with failover between the two links, at connection level. The setup shown here is
for a NATed LAN.

Algorithm and setup

Mark system

We mark new connections in PREROUTING using MARK, and we use CONNMARK to restore the mark
on the later packets of the same connection, also in PREROUTING.
We use the nth (or statistic) module to build a pool:

  • mark 1 for link 1 outgoing
  • mark 2 for link 2 outgoing

In our example, we use a counter of 4 to respect the link bandwidth ratio:

  • 1 : mark 1
  • 2 : mark 2
  • 3 : mark 1
  • 4 : mark 1

This gives something like this:
[bash]iptables -A PREROUTING -t mangle -j CONNMARK --restore-mark
iptables -A PREROUTING -t mangle -m mark --mark 0x0 -m nth --counter 1 \
--every 4 --packet 1 -j MARK --set-mark 1
iptables -A PREROUTING -t mangle -m mark --mark 0x0 -m nth --counter 1 \
--every 4 --packet 2 -j MARK --set-mark 2
iptables -A PREROUTING -t mangle -m mark --mark 0x0 -m nth --counter 1 \
--every 4 --packet 3 -j MARK --set-mark 1
iptables -A PREROUTING -t mangle -m mark --mark 0x0 -m nth --counter 1 \
--every 4 --packet 4 -j MARK --set-mark 1
iptables -A POSTROUTING -t mangle -j CONNMARK --save-mark[/bash]
The syntax is different on recent kernels (2.6.24 and later, at least), where you need to use the statistic module:
[bash]iptables -A PREROUTING -t mangle -j CONNMARK --restore-mark
iptables -A PREROUTING -t mangle -m mark --mark 0x0 -m statistic \
--mode nth --every 4 --packet 0 -j MARK --set-mark 1
iptables -A PREROUTING -t mangle -m mark --mark 0x0 -m statistic \
--mode nth --every 4 --packet 1 -j MARK --set-mark 2
iptables -A PREROUTING -t mangle -m mark --mark 0x0 -m statistic \
--mode nth --every 4 --packet 2 -j MARK --set-mark 1
iptables -A PREROUTING -t mangle -m mark --mark 0x0 -m statistic \
--mode nth --every 4 --packet 3 -j MARK --set-mark 1
iptables -A POSTROUTING -t mangle -j CONNMARK --save-mark[/bash]
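
One point is worth checking before relying on the ratio. As far as we understand the statistic match, each rule using --mode nth keeps its own counter, and that counter only advances when the earlier matches on the same rule (here -m mark --mark 0x0) succeed. A rule therefore only counts the packets that the previous rules left unmarked. The sketch below is a plain bash model of that behaviour, not kernel code: it pushes 256 new connections through two rule layouts and counts the resulting marks. The name simulate_pool, the starting phase of the counters, and the second layout (decreasing --every values 4, 3, 2, 1, all with --packet 0) are assumptions made for this illustration.

```shell
#!/usr/bin/env bash
# simulate_pool "EVERY..." "PACKET..." "MARK..." N
# Each list holds one value per rule, in rule order. Models N new
# connections (first packet only) crossing the rules; a rule is only
# evaluated, and its counter only advanced, while the packet is unmarked.
simulate_pool() {
    local -a every=($1) packet=($2) mark=($3)
    local n=$4 i p m pos
    local -a seen=() hits=(0 0 0)   # hits[0]: unmarked, hits[1], hits[2]
    for ((i = 0; i < ${#every[@]}; i++)); do seen[i]=0; done
    for ((p = 0; p < n; p++)); do
        m=0
        for ((i = 0; i < ${#every[@]}; i++)); do
            pos=$(( seen[i] % every[i] ))      # position in this rule's cycle
            seen[i]=$(( seen[i] + 1 ))
            if (( pos == packet[i] )); then    # rule matches: set mark, stop
                m=${mark[i]}
                break
            fi
        done
        hits[m]=$(( hits[m] + 1 ))
    done
    echo "unmarked=${hits[0]} mark1=${hits[1]} mark2=${hits[2]}"
}

# Layout used above: --every 4 with --packet 0, 1, 2, 3
simulate_pool "4 4 4 4" "0 1 2 3" "1 2 1 1" 256
# Assumed alternative: --every 4, 3, 2, 1, always --packet 0
simulate_pool "4 3 2 1" "0 0 0 0" "1 2 1 1" 256
```

In this model the first layout marks 127 connections for link 1, 48 for link 2 and leaves 81 unmarked, while the second gives 192 and 64, which is the intended 3/4 and 1/4 split. Comparing the per-rule packet counters shown by iptables -t mangle -L PREROUTING -v -n on a test gateway is a simple way to see which behaviour your kernel actually has.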

See the page connmark to understand CONNMARK usage.

Fail over

We use the condition module, which is available in xtables-addons.

The mark system is modified to provide failover. Instead of one line, we have two lines for each item of the nth/statistic pool.
For example, for item 1:

-m condition --condition link1_up -j MARK --set-mark 1
-m condition ! --condition link1_up -j MARK --set-mark 2

Thus, when link 1 is down, packets get mark 2 and go out via link 2.

This gives:
[bash]# Custom chain in the mangle table that picks a mark for new connections
iptables -t mangle -N MARKING
iptables -A PREROUTING -t mangle -j CONNMARK --restore-mark
iptables -A PREROUTING -t mangle -m mark --mark 0x0 -j MARKING

# Item 1: link 1, falling back to link 2
iptables -A MARKING -t mangle -m condition --condition link1_up \
-m nth --counter 1 --every 4 --packet 1 -j MARK --set-mark 1
iptables -A MARKING -t mangle -m condition ! --condition link1_up \
-m nth --counter 1 --every 4 --packet 1 -j MARK --set-mark 2

# Item 2: link 2, falling back to link 1
iptables -A MARKING -t mangle -m condition --condition link2_up \
-m nth --counter 1 --every 4 --packet 2 -j MARK --set-mark 2
iptables -A MARKING -t mangle -m condition ! --condition link2_up \
-m nth --counter 1 --every 4 --packet 2 -j MARK --set-mark 1

# Item 3: link 1, falling back to link 2
iptables -A MARKING -t mangle -m condition --condition link1_up \
-m nth --counter 1 --every 4 --packet 3 -j MARK --set-mark 1
iptables -A MARKING -t mangle -m condition ! --condition link1_up \
-m nth --counter 1 --every 4 --packet 3 -j MARK --set-mark 2

# Item 4: link 1, falling back to link 2
iptables -A MARKING -t mangle -m condition --condition link1_up \
-m nth --counter 1 --every 4 --packet 4 -j MARK --set-mark 1
iptables -A MARKING -t mangle -m condition ! --condition link1_up \
-m nth --counter 1 --every 4 --packet 4 -j MARK --set-mark 2

iptables -A POSTROUTING -t mangle -j CONNMARK --save-mark[/bash]
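
The condition module does not test the links itself: each condition is a file under /proc/net/nf_condition/, where writing 1 lets the rule match and 0 disables it. Something has to keep those files up to date. Below is a minimal sketch of such a watchdog. The function name update_condition, the COND_DIR variable, the interface names and the probe address in the commented loop are all hypothetical; adapt the check command to whatever best reflects the health of your links.

```shell
#!/usr/bin/env bash
# Directory holding the condition files (overridable, e.g. for testing).
COND_DIR=${COND_DIR:-/proc/net/nf_condition}

# update_condition NAME CHECK_CMD [ARGS...]
# Runs the check command and writes 1 to the condition file when it
# succeeds, 0 when it fails.
update_condition() {
    local name=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo 1 > "$COND_DIR/$name"
    else
        echo 0 > "$COND_DIR/$name"
    fi
}

# Example loop, to be run as root on the gateway (hypothetical values):
# while sleep 10; do
#     update_condition link1_up ping -c 1 -W 2 -I eth1 192.0.2.1
#     update_condition link2_up ping -c 1 -W 2 -I eth2 192.0.2.1
# done
```

Probing a host beyond the first hop, rather than only checking that the interface is up, also catches failures further along the provider's network.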

IProute

The objective is to:

  • Route packets with mark 1 to a table whose default gateway is via LINK1
  • Route packets with mark 2 to a table whose default gateway is via LINK2

The syntax is the following:
[bash]ip route add default via GW_LINK1 table LINK1
ip route add default via GW_LINK2 table LINK2
ip rule add fwmark 1 table LINK1
ip rule add fwmark 2 table LINK2[/bash]
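
Named tables such as LINK1 and LINK2 only work once they are mapped to a numeric ID, usually in /etc/iproute2/rt_tables. The sketch below prints the commands for one link so they can be reviewed before being run as root. The function name print_link_routing and the table IDs 101 and 102 are arbitrary choices for this example; GW_LINK1 and GW_LINK2 remain placeholders for the real gateway addresses.

```shell
#!/usr/bin/env bash
# print_link_routing NAME TABLE_ID MARK GATEWAY
# Prints (does not execute) the commands that declare the routing table,
# give it a default route, and send packets carrying MARK to it.
print_link_routing() {
    local name=$1 id=$2 mark=$3 gw=$4
    echo "echo '$id $name' >> /etc/iproute2/rt_tables"
    echo "ip route add default via $gw table $name"
    echo "ip rule add fwmark $mark table $name"
}

print_link_routing LINK1 101 1 GW_LINK1
print_link_routing LINK2 102 2 GW_LINK2
```

After applying the commands, ip rule show and ip route show table LINK1 list what the kernel actually uses.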

NAT

For this to work, we need to translate internal IPs on the way out. Packets are dispatched:

  • those with mark 1 get the IP of link 1.
  • those with mark 2 get the IP of link 2.

This gives:
[bash]iptables -A POSTROUTING -t nat -m mark --mark 1 -j SNAT --to-source IP_LINK1
iptables -A POSTROUTING -t nat -m mark --mark 2 -j SNAT --to-source IP_LINK2[/bash]
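
Once marking, routing and NAT are in place, one way to check the split is to count tracked connections per connection mark. The sketch below assumes the conntrack command-line tool is installed and that its listing shows a mark=N field for each entry; the function name count_marks is made up for this example.

```shell
#!/usr/bin/env bash
# count_marks: reads conntrack entries on stdin and prints one line
# "<count> mark=<n>" per connection mark found.
count_marks() {
    # The leading space keeps fields such as secmark= from being counted.
    grep -o ' mark=[0-9]*' | sort | uniq -c | awk '{print $1, $2}'
}

# Typical use on the gateway (needs root):
#   conntrack -L 2>/dev/null | count_marks
```

With the 3/4 and 1/4 setup of this page, mark=1 should appear roughly three times as often as mark=2 once enough connections have been opened.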

25 thoughts on “Links Load balancing”

  1. Thank you very much for a wonderful tutorial 🙂 Helped me a lot. Please note that the ‘-t mangle’ option is missing in some commands, because of which they’ll fail.

    And the ip route lines should be

    ip route add default via GW_LINK1 table LINK1
    ip route add default via GW_LINK2 table LINK2

    Thanks again 🙂

  2. Hi there – I believe this could also work with more than two links right? Also, would this support persistence over the link where the connection is initiated?

    Thanks a lot!

  3. Yes, there is no limitation on the number of links, simply change the counter size and the distribution of links. And don’t forget to sync this with ‘ip rule’.
    The persistence is achieved thanks to CONNMARK, which transfers a per-connection mark to the mark of the packet. This guarantees that the mark on a packet, and hence its link, is the same for all packets of a connection. This part is one of the trickiest and it is all done in these lines:

    iptables -A PREROUTING -t mangle -j CONNMARK --restore-mark
    iptables -A PREROUTING -t mangle -m mark --mark 0x0 -m nth --counter 1 \
    --every 4 --packet 1 -j MARK --set-mark 1
    iptables -A POSTROUTING -t mangle -j CONNMARK --save-mark

    If we have a packet belonging to a new connection, the first rule will not restore a mark, since none has been set yet. Thus the packet has mark 0 and matches one of the counter rules. This match sets a mark on the packet which corresponds to a link. This mark is saved to the connection mark by the last rule (--save-mark). Thus, when a packet of an existing connection comes in, the restore-mark rule copies the connection mark to the packet mark and the counter rules no longer match.

  4. Hi!
    Do you have the same thing for recent kernels?
    I can’t figure out how to convert this solution for new versions of iptables. There is no -j MARKING, no --counter and so on… ;(

  5. P.S. How does iptables know which one is eth0 and which one is eth1?
    Maybe you have a complete working example, from beginning to end?
    Sorry for the noob questions 🙁

  6. Hi,

    On recent kernels, you have to use the statistic module for the counter, not the nth module. The syntax is described in the page.

    MARKING is a custom chain created by the -N command. It is not an old, lost extension 😉

  7. Iptables does not care about eth0 and eth1. It sets a mark, and it is the routing that knows about interfaces.

  8. Hi everybody,

    Would you please let me know what my configuration should be for my CentOS system, on which I have two external (public) IPs and want to load-balance with failover?
    Actually, I am confused whether I should write LINK1/LINK2 as in the above config, or replace them with WANIP1/WANIP2 or the names of the interfaces (eth0/eth1).

    Kindly help!!!

  9. I face the following error on the second line:
    iptables v1.4.7: Couldn’t load match `nth’:/lib64/xtables/libipt_nth.so: cannot open shared object file: No such file or directory

    what is the problem?

  10. Masood: the nth module is not available anymore, use statistic instead as described in the page.

  11. Thanks Regit. But, there is another problem with the last line:

    iptables -A POSTROUTING -j CONNMARK --save-mark
    iptables: No chain/target/match by that name.

  12. Hi guys, I want to try to load balance and fail over my squid. I have prepared this iptables conf; can I have your opinion?

    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    -A INPUT -p icmp -j ACCEPT
    -A INPUT -i lo -j ACCEPT
    -A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
    -A INPUT -m state --state NEW -m udp -p udp --dport 161 -j ACCEPT
    COMMIT
    *mangle
    :FORWARD ACCEPT [0:0]
    :INPUT ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    :PREROUTING ACCEPT [0:0]
    :POSTROUTING ACCEPT [0:0]

    #GUEST
    -A PREROUTING -i eth3 -j MARK --set-mark 1
    #LAN
    -A PREROUTING -j CONNMARK --restore-mark
    -A PREROUTING -m mark --mark 0x0 -J MARKING
    -A PREROUTING -i eth1 -p tcp --dport 80 -m condition --condition link1_up \ --mode nth --every 4 --packet 0 -j MARK --set-mark 2
    -A PREROUTING -i eth1 -p tcp --dport 80 -m condition --condition link2_up \ --mode nth --every 4 --packet 1 -j MARK --set-mark 3
    -A PREROUTING -i eth1 -p tcp --dport 80 -m condition --condition link1_up \ --mode nth --every 4 --packet 2 -j MARK --set-mark 2
    -A PREROUTING -i eth1 -p tcp --dport 80 -m condition --condition link2_up \ --mode nth --every 4 --packet 3 -j MARK --set-mark 3
    #WIRELESS
    -A PREROUTING -i eth2 -p tcp --dport 80 -m mark --mark 0x0 -m statistic \ --mode nth --every 4 --packet 0 -j MARK --set-mark 2
    -A PREROUTING -i eth2 -p tcp --dport 80 -m mark --mark 0x0 -m statistic \ --mode nth --every 4 --packet 1 -j MARK --set-mark 3
    -A PREROUTING -i eth2 -p tcp --dport 80 -m mark --mark 0x0 -m statistic \ --mode nth --every 4 --packet 2 -j MARK --set-mark 2
    -A PREROUTING -i eth2 -p tcp --dport 80 -m mark --mark 0x0 -m statistic \ --mode nth --every 4 --packet 3 -j MARK --set-mark 3

    -A PREROUTING -A POSTROUTNG -t mangle -j CONNMARK --save-mark
    -A PREROUTING -m mark --mark 2 -j ACCEPT
    -A PREROUTING -m mark --mark 3 -j ACCEPT
    COMMIT
    *nat
    :PREROUTING ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    :POSTROUTING ACCEPT [0:0]
    COMMIT

    The squid servers are on a different LAN from the clients 🙂

  13. Hello lingeek.

    I’ve never tested the LARTC version. It seems easy to use, which is a good point for that solution. One advantage of what is explained on my page is that you are able to tune the load balancing with respect to the link bandwidth.

  14. Hi,
    Actually, there is one problem with the LARTC version.
    Take a Linux router with two ISPs (eth1 and eth2 in the LARTC load balancing) and one local interface (eth0). Say I have a TCP connection from a local machine to a server on the internet, and it goes through eth1. My local IP is translated to the public IP of eth1 (masquerading) and the connection is made with the server.

    Now, every minute, my application sends a request to the server to share some data (obviously through eth1). But as soon as the kernel calls the gc, which flushes the route cache, the traffic goes through eth2.

    As both ISPs are load balanced, it is possible that the traffic goes through a different ISP after the route cache is flushed. The problem is that the traffic is still NATed/masqueraded with the eth1 public IP while being routed through eth2.

    Somehow the NAT/masquerading functionality does not work hand in hand with the route cache flush for already established connections.

    A small clue to resolve this will be a great help.

    Thanks.

  15. Hi Lingeek,

    Just use these rules for nat :

    iptables -t nat -I POSTROUTING -o interface_link1 -j MASQUERADE
    iptables -t nat -I POSTROUTING -o interface_link2 -j MASQUERADE

    Use SNAT only if you have several IPs on each interface, or if you don’t specify the interface as in the example above. Here you don’t need to NAT packets according to the mark, because of these rules:
    ip rule add fwmark 1 lookup table LINK1
    ip rule add fwmark 2 lookup table LINK2

    I have one question.

    The condition module just checks the value in a file:

    To allow the rule to match:
    echo 1 > /proc/net/nf_condition/link1_up

    To disable the rule:
    echo 0 > /proc/net/nf_condition/link1_up

    How do you manage to write the right value in the file depending on whether the interface is up or down?

    Do you use a script to do this?
    Thank you.

  17. Please CMIIW: in the failover section, the first line should be
    iptables -t mangle -N MARKING

