Vyos and bufferbloat

What is Buffer Bloat?

Buffer bloat first came to my attention via a Slashdot post that publicized Jim Gettys' blog. I eventually found Mr. Brouer's master's thesis. See both for more information.

The fundamental problem is that excessively large buffers in intermediate routers delay the ACK packets that are needed to keep the data flowing in the opposite direction. The easy solution is to reduce the size of those buffers, which is straightforward in Vyos. A more complex solution places small SYN and ACK packets into their own higher-priority queue, which in Vyos requires some custom Linux traffic control (tc) commands.

Vyos

vyos@vyos:~$ configure
vyos@vyos# show interfaces ethernet eth1
 address dhcp
 duplex auto
 traffic-policy {
     out 510sg
 }

vyos@vyos# show traffic-policy
 shaper 510sg {
     bandwidth 600kbit
     class 2 {
         bandwidth 30%
         burst 15k
         ceiling 100%
         description "syn ack bufferbloat"
         queue-limit 4
         queue-type fair-queue
     }
     class 10 {
         bandwidth 15%
         burst 15k
         ceiling 100%
         description "voip rtp traffic"
         match voip-rtp {
             ip {
                 dscp 46
             }
         }
         queue-limit 4
         queue-type fair-queue
     }
     class 20 {
         bandwidth 5%
         burst 15k
         ceiling 100%
         description "voip sip traffic"
         match voip-sip {
             ip {
                 dscp 26
             }
         }
         queue-limit 4
         queue-type fair-queue
     }
     default {
         bandwidth 50%
         burst 15k
         ceiling 100%
         queue-limit 4
         queue-type fair-queue
     }
 }
vyos@vyos#  exit
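
As a rough check on what "queue-limit 4" buys in the shaper above, the worst-case wait behind a full queue can be estimated with shell arithmetic. This is only a sketch, not part of the router configuration: the function name queue_delay_ms, the 1500-byte packet size, and the 100-packet comparison queue are assumptions for illustration, and it treats the queue as draining at the full 600kbit shaper rate.

```shell
#!/bin/sh
# Worst-case delay, in milliseconds, for a packet arriving behind a full queue.
# Assumption: every queued packet has the same size and the queue drains
# at the full shaper rate.
queue_delay_ms() {
    packets=$1      # queue depth in packets (the Vyos queue-limit)
    bytes=$2        # assumed size of each queued packet, in bytes
    rate_kbit=$3    # drain rate in kbit/s (tc counts 1 kbit as 1000 bits)
    # bits queued divided by kbit/s yields milliseconds
    echo $(( packets * bytes * 8 / rate_kbit ))
}

queue_delay_ms 4 1500 600      # queue-limit 4 at 600kbit
queue_delay_ms 100 1500 600    # hypothetical 100-packet queue for comparison
```

Under these assumptions a full 4-packet queue drains in about 80 ms, while a 100-packet queue would hold an ACK for about 2 seconds.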

The first part of the solution, reducing the buffer sizes, is achieved via the standard Vyos configuration setting "queue-limit 4". To achieve the second part, I need to put the SYN and ACK packets into their own higher-priority queue. The stock Vyos system has no facility to match on the TCP flags or the IP packet length, so some modification is needed. There are two approaches to modifying a Vyos router: extend the configuration system, or apply Linux traffic control commands directly from a script. With the first approach, I could modify the entire configuration system to add configuration nodes for something like:

match ipv4-syn {
    ip {
        tcp {
            flags SYN SYN
            length less 256
        }
    }
}
match ipv4-ack {
    ip {
        tcp {
            flags ACK ACK
            length less 256
        }
    }
}
match ipv6-syn {
    ipv6 {
        tcp {
            flags SYN SYN
            length less 256
        }
    }
}
match ipv6-ack {
    ipv6 {
        tcp {
            flags ACK ACK
            length less 256
        }
    }
}

In any case, I would still need to parse that configuration and generate the actual underlying Linux traffic control commands. Deferring that work, I concentrate here on the actual tc commands. I assume that for each interface, I will have a "class 2" with no Vyos match statements. I further assume that /opt/vyatta/share/perl5/Vyatta/Qos/TrafficShaper.pm has been patched with "my $prio = 3;" to start all the Vyos tc filter priorities at 3, leaving 1 and 2 for my use.
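
Because the filters rely on priorities 1 and 2 being free, it is worth confirming that the patch is in place before running them. The check below is a minimal sketch: the function name prio_patched is hypothetical, and it assumes the patched line reads exactly "my $prio = 3;" as quoted above.

```shell
#!/bin/sh
# Succeeds when the given file contains the line that starts the
# Vyos tc filter priorities at 3. A missing file counts as not patched.
prio_patched() {
    grep -q 'my \$prio = 3;' "$1" 2>/dev/null
}

pm=/opt/vyatta/share/perl5/Vyatta/Qos/TrafficShaper.pm
if prio_patched "$pm"; then
    echo "filter priorities start at 3, so 1 and 2 are free"
else
    echo "not patched or file not found: $pm"
fi
```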

I need to run a script to configure the tc filters. Since I already use /etc/dhcp3/dhclient-exit-hooks.d/bgp to fix an issue with policy based routing, I extended that script to add this traffic control filtering. However, any location will do, as long as you manually run that script after making (almost) any change to the Vyos configuration.
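
For readers who also hang this off dhclient, the hook might look something like the sketch below. It assumes the hook file is sourced by dhclient-script, which sets the "reason" variable; the function name should_reapply and the path in filter_script are hypothetical, and the choice of which reasons trigger a re-run is my guess rather than a requirement.

```shell
#!/bin/sh
# Succeeds for the dhclient reasons after which the interface has a
# usable lease and the tc filters are worth re-applying.
should_reapply() {
    case "$1" in
        BOUND|RENEW|REBIND|REBOOT) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical location of the script that issues the tc filter commands.
filter_script=/config/scripts/bufferbloat-filters.sh

# "reason" is provided by dhclient-script; default to empty when run by hand.
if should_reapply "${reason:-}" && [ -x "$filter_script" ]; then
    "$filter_script"
fi
```

Run by hand with no "reason" set, this does nothing, so after a manual Vyos configuration change the filter script still has to be invoked directly.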

I wish to thank nuclearcat for http://www.nuclearcat.com/mediawiki/index.php/U32_tips_tricks, since that page contained (and linked to) http://ace-host.stuart.id.au/russell/files/tc/doc/cls_u32.txt which held many of the keys to making this work. In particular the discussion of hash tables, linking, and the header offsets was important.

This solution handles both ipv4 and ipv6 packets that are not fragmented. In the case of ipv6, the TCP header must immediately follow the fixed ipv6 header; packets that carry ipv6 extension headers such as routing, fragmentation, authentication, etc. are not matched.

class=2     # assume dummy (no match nodes) "traffic-policy shaper name class 2" on all external interfaces for syn/ack bufferbloat traffic control
priorip=1   # assume priority 1 is available for us, needs patch for /opt/vyatta/share/perl5/Vyatta/Qos/TrafficShaper.pm to start priority at 3
priorip6=2  # assume priority 2 is available for us, needs patch for /opt/vyatta/share/perl5/Vyatta/Qos/TrafficShaper.pm to start priority at 3
protoip="ip"
protoip6="ipv6"
synack=2    # hash table id, arbitrary number
synack6=3   # hash table id, arbitrary number
devs="eth1 eth2 eth3 tun0"
for d in $devs; do
    # see http://www.nuclearcat.com/mediawiki/index.php/U32_tips_tricks
    # setup base filter
    tc filter del dev $d parent 1: prior $priorip protocol $protoip
    tc filter add dev $d parent 1: prior $priorip protocol $protoip u32
    # make a linked hash table
    tc filter del dev $d parent 1: prior $priorip protocol $protoip handle $synack: u32
    tc filter add dev $d parent 1: prior $priorip protocol $protoip handle $synack: u32 divisor 1
    # tcp syn bit
    tc filter add dev $d parent 1: prior $priorip protocol $protoip u32 ht $synack: \
        match u8 0x02 0x02 at 13 \
        flowid 1:$class
    # tcp ack bit
    tc filter add dev $d parent 1: prior $priorip protocol $protoip u32 ht $synack: \
        match u8 0x10 0x10 at 13 \
        flowid 1:$class
    # ipv4/icmp
    tc filter add dev $d parent 1: prior $priorip protocol $protoip u32 \
        match ip protocol 1 0xff \
        flowid 1:$class
    # ipv4/tcp, total len<256, tos=0x10 == minimum delay
    tc filter add dev $d parent 1: prior $priorip protocol $protoip u32 \
        match ip protocol 6 0xff \
        match u16 0x0000 0xff00 at 2 \
        match ip tos 0x10 0xff \
        flowid 1:$class
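    # ipv4/tcp, total len<128, not fragmented
    # the offset line skips the variable-length ipv4 header (IHL * 4 bytes), so the
    # linked table sees the tcp header at offset 0 and its flags byte at offset 13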
    tc filter add dev $d parent 1: prior $priorip protocol $protoip u32 \
        match ip protocol 6 0xff \
        match u16 0x0000 0xff80 at 2 \
        match ip nofrag \
        offset at 0 mask 0x0f00 shift 6 eat \
        link $synack:

    # setup base filter
    tc filter del dev $d parent 1: prior $priorip6 protocol $protoip6
    tc filter add dev $d parent 1: prior $priorip6 protocol $protoip6 u32
    # make a linked hash table
    tc filter del dev $d parent 1: prior $priorip6 protocol $protoip6 handle $synack6: u32
    tc filter add dev $d parent 1: prior $priorip6 protocol $protoip6 handle $synack6: u32 divisor 1
    # tcp syn bit
    tc filter add dev $d parent 1: prior $priorip6 protocol $protoip6 u32 ht $synack6: \
        match u8 0x02 0x02 at 13 \
        flowid 1:$class
    # tcp ack bit
    tc filter add dev $d parent 1: prior $priorip6 protocol $protoip6 u32 ht $synack6: \
        match u8 0x10 0x10 at 13 \
        flowid 1:$class
    # ipv6/icmpv6
    tc filter add dev $d parent 1: prior $priorip6 protocol $protoip6 u32 \
        match ip6 protocol 58 0xff \
        flowid 1:$class
    # ipv6/tcp, payload len<128, priority=0x10 == minimum delay
    tc filter add dev $d parent 1: prior $priorip6 protocol $protoip6 u32 \
        match ip6 protocol 6 0xff \
        match u16 0x0000 0xff80 at 4 \
        match ip6 priority 0x10 0xff \
        flowid 1:$class
    # ipv6/tcp, payload len<64, not fragmented since the next header is a tcp header
    # this does not handle packets with other ipv6 extension headers that might be
    # present between the ipv6 header and the tcp header
    # "offset plus 40" skips the fixed 40-byte ipv6 header before linking to the
    # tcp flags table
    tc filter add dev $d parent 1: prior $priorip6 protocol $protoip6 u32 \
        match ip6 protocol 6 0xff \
        match u16 0x0000 0xffc0 at 4 \
        offset plus 40 eat \
        link $synack6:

    echo " "
    echo "tc filter device $d after add"
    tc filter show dev $d
    echo " "
done
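
The masks and shifts in the filters above are easy to get wrong, so it can help to check them with plain shell arithmetic before loading them into tc. The helpers below are a sketch that only mimics the u32 bit tests; the function names ihl_bytes, len_matches, and flags_match are hypothetical and nothing here talks to the kernel.

```shell
#!/bin/sh
# Header length in bytes from the first 16 bits of an ipv4 header,
# mirroring "offset at 0 mask 0x0f00 shift 6": the IHL nibble counts
# 32-bit words, so shifting right by 6 instead of 8 multiplies by 4.
ihl_bytes() {
    echo $(( ($1 & 0x0f00) >> 6 ))
}

# Succeeds when a 16-bit length passes "match u16 0x0000 MASK",
# that is, when none of the bits selected by MASK are set.
len_matches() {
    [ $(( $1 & $2 )) -eq 0 ]
}

# Succeeds when a tcp flags byte passes "match u8 VALUE MASK at 13".
flags_match() {
    [ $(( $1 & $3 )) -eq $(( $2 & $3 )) ]
}

ihl_bytes 0x4500    # version 4, IHL 5: 20-byte header
ihl_bytes 0x4600    # version 4, IHL 6: 24-byte header with options

for len in 63 64 127 128 255 256; do
    for mask in 0xffc0 0xff80 0xff00; do
        if len_matches "$len" "$mask"; then r="match"; else r="no match"; fi
        echo "length $len, mask $mask: $r"
    done
done

# 0x12 is SYN+ACK, 0x18 is PSH+ACK, 0x04 is a bare RST
for flags in 0x12 0x18 0x04; do
    if flags_match "$flags" 0x02 0x02 || flags_match "$flags" 0x10 0x10; then
        echo "flags $flags: classified as syn/ack"
    else
        echo "flags $flags: falls through to later filters"
    fi
done
```

The length loop shows why 0xffc0, 0xff80, and 0xff00 correspond to "less than 64", "less than 128", and "less than 256": each mask selects exactly the bits that must be zero for the length to stay under the limit.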