2014-11-25 Vyos VRRP issues

Introduction

Vyos VRRP allows multiple Vyos routers to act as a single virtual router. Note that these multiple Vyos routers may be running on actual physical hardware, or may themselves be running inside a virtual environment like KVM or VMware.

Vyos VRRP is built on keepalived, which uses the macvlan kernel module to create the VRRP interfaces. See http://backreference.org/2014/03/20/some-notes-on-macvlanmacvtap/ for an introduction. Although Vyos includes an older VRRP mode, we only consider rfc3768 compatibility mode here.

Which interface?

Consider a pair of Vyos routers, talking to 10.0.0.1 as the external internet gateway, with a small switch connecting them to the outside gateway. From the viewpoint of the ISP 10.0.0.1 device, our (virtual) router has address 10.0.0.2 and some associated mac address generated by VRRP. Our two routers can talk to each other on the 192.168.0.0/24 network using the actual hardware mac addresses of the underlying ethernet interfaces. VRRP will move the 10.0.0.2 ip address and associated mac address between our two Vyos routers.

set interfaces ethernet eth0 address '192.168.0.253/24'
set interfaces ethernet eth0 vrrp vrrp-group 10 advertise-interval '2'
set interfaces ethernet eth0 vrrp vrrp-group 10 hello-source-address '192.168.0.253'
set interfaces ethernet eth0 vrrp vrrp-group 10 preempt 'true'
set interfaces ethernet eth0 vrrp vrrp-group 10 preempt-delay '20'
set interfaces ethernet eth0 vrrp vrrp-group 10 priority '200'
set interfaces ethernet eth0 vrrp vrrp-group 10 'rfc3768-compatibility'
set interfaces ethernet eth0 vrrp vrrp-group 10 sync-group 'total'
set interfaces ethernet eth0 vrrp vrrp-group 10 virtual-address '10.0.0.2'
set protocols static route 0.0.0.0/0 next-hop '10.0.0.1'

So we end up with two interfaces, eth0 and eth0v10, on each of our pair of routers. We can call eth0 the lower interface, and eth0v10 the upper interface. So which interface is used by packets entering and leaving our router? Ultimately all the traffic goes through the lower interface, eth0. But from the viewpoint of ARP and of the NAT and firewall configuration, we want that traffic to appear on the upper interface.
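The mac address on the upper interface is not arbitrary. RFC 3768 defines the IPv4 virtual router mac address as 00-00-5E-00-01-{VRID}. As a small sketch, assuming the vrrp-group number (10 here) is used as the VRID, the address on eth0v10 can be computed like this. The function name is only for illustration.

```python
def vrrp_virtual_mac(vrid: int) -> str:
    """Return the IPv4 virtual router MAC address defined by RFC 3768 for a VRID."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    # Fixed prefix 00:00:5e:00:01, last octet is the VRID in hex.
    return "00:00:5e:00:01:{:02x}".format(vrid)


# Assumption: vrrp-group 10 from the configuration above is VRID 10.
print(vrrp_virtual_mac(10))
```

Both routers in the pair compute the same address, which is why the ISP device sees a single mac address regardless of which router is currently master.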

Patches

Vyos includes modifications to the Linux macvlan kernel module to add MACVLAN_MODE_VRRP, in addition to the four modes (private, vepa, bridge and passthru) supported by the stock macvlan module. Vyos also includes modifications to the keepalived package to use vrrp mode rather than private mode. Vyos now shuts down the upper device when in backup or fault state, and all the multicast communication between the two Vyos VRRP instances is over the lower devices.

MACVLAN_MODE_VRRP makes the incoming packets appear to arrive on the lower device, rather than the upper device. However, outgoing packets may be sent from either the lower or upper devices. This makes the NAT configuration asymmetric.

set nat destination rule 10 inbound-interface 'eth0'
set nat source rule 10 outbound-interface 'eth0v10'

Routing and ARP

Consider the routing table when this router is the master. Outgoing packets get routed out of the upper device.

Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
       I - ISIS, B - BGP, > - selected route, * - FIB route

S>* 0.0.0.0/0 [1/0] via 10.0.0.1, eth0v10
C>* 10.0.0.0/24 is directly connected, eth0v10
C>* 127.0.0.0/8 is directly connected, lo
C>* 192.168.0.0/24 is directly connected, eth0

But how does it find the mac address of the 10.0.0.1 gateway? Via ARP, just like any other system. So the broadcast ARP request packet is sent out eth0v10, with the rfc3768 source mac address. The ARP reply packet arrives on the physical eth0 device. The macvlan module picks it up since it was sent to the eth0v10 mac address. That module then changes the incoming device back to eth0 and passes it up the protocol stack. The ARP code then rejects it since ARP is waiting for a reply on eth0v10, not on eth0. The ARP cache for eth0v10 never gets updated, and we cannot reach the gateway.

Why does the inside interface work?

We have this ARP problem on the outside internet facing interface. Why don't we have the same issue on the inside interface driving our local network?

set interfaces ethernet eth1 address '192.168.1.253/24'
set interfaces ethernet eth1 vrrp vrrp-group 20 advertise-interval '2'
set interfaces ethernet eth1 vrrp vrrp-group 20 description 'internal gateway'
set interfaces ethernet eth1 vrrp vrrp-group 20 preempt 'true'
set interfaces ethernet eth1 vrrp vrrp-group 20 preempt-delay '20'
set interfaces ethernet eth1 vrrp vrrp-group 20 priority '200'
set interfaces ethernet eth1 vrrp vrrp-group 20 'rfc3768-compatibility'
set interfaces ethernet eth1 vrrp vrrp-group 20 sync-group 'total'
set interfaces ethernet eth1 vrrp vrrp-group 20 virtual-address '192.168.1.254/24'

On the inside, we used the same /24 block for the lower physical eth1 interface, and the upper virtual eth1v20 interface, and the ip address on the lower interface is smaller than the ip address on the upper interface. That generates this routing table.

Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
       I - ISIS, B - BGP, > - selected route, * - FIB route

C * 192.168.1.0/24 is directly connected, eth1v20
C>* 192.168.1.0/24 is directly connected, eth1

Note that the lower interface is selected as the route for that /24, not the upper interface. Clearly the arp replies are making it into the arp cache. I have not investigated with tcpdump, so I am not sure what mac addresses or interfaces are used here for the ARP traffic.
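Whether a lower address and a virtual address share a subnet, as they do on the inside but not on the outside, can be checked with Python's ipaddress module. This is only a sketch. The /24 on 10.0.0.2 is an assumption taken from the connected route shown in the earlier routing table, since the virtual-address line does not give a prefix length, and the helper name is made up for this example.

```python
import ipaddress


def same_subnet(lower_cidr: str, upper_cidr: str) -> bool:
    """Return True if the lower and upper interface addresses are in overlapping networks."""
    lower_net = ipaddress.ip_interface(lower_cidr).network
    upper_net = ipaddress.ip_interface(upper_cidr).network
    return lower_net.overlaps(upper_net)


# Outside: eth0 address vs. eth0v10 virtual address (prefix /24 assumed).
print(same_subnet("192.168.0.253/24", "10.0.0.2/24"))
# Inside: eth1 address vs. eth1v20 virtual address.
print(same_subnet("192.168.1.253/24", "192.168.1.254/24"))
```

The first call prints False and the second prints True, which matches the routing tables: the outside has separate connected routes for each interface, while the inside has two connected routes for the same /24.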

Keepalived and VRRP

Only the master (of the two Vyos VRRP instances) has an active upper virtual VRRP interface. The backup instance takes that interface down. Only the master instance sends multicast traffic, and the backup instance(s) listen for those multicast advertisements on the lower physical interface. However, the backup instances still have a configured upper virtual VRRP interface, using the same VRRP mac address as the active master instance.

The Vyos modification to keepalived causes it to specify MACVLAN_MODE_VRRP when creating the virtual VRRP interface.

A solution

The ARP problem was caused by MACVLAN_MODE_VRRP moving too much incoming traffic from the upper virtual interface back down to the lower physical interface. The only traffic that needs to be moved down is the VRRP multicast traffic. My patch leaves all other traffic, including the ARP traffic, on the upper virtual interface.
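The difference in behavior can be pictured with a toy model. This is not the kernel code and not the patch itself. It assumes, purely for illustration, that VRRP advertisements are recognized by their destination mac address 01:00:5e:00:00:12 (the Ethernet mapping of the VRRP multicast group 224.0.0.18); how the real patch identifies them may differ. The function and parameter names are hypothetical.

```python
# Ethernet multicast address corresponding to the VRRP group 224.0.0.18.
VRRP_MULTICAST_MAC = "01:00:5e:00:00:12"


def receive_device(dst_mac: str, patched: bool,
                   lower: str = "eth0", upper: str = "eth0v10") -> str:
    """Toy model: the device an incoming frame appears on after macvlan handling."""
    if dst_mac.lower() == VRRP_MULTICAST_MAC:
        # VRRP advertisements are always handed to the lower device.
        return lower
    # Unpatched vrrp mode moves everything down; the patched mode leaves it on the upper device.
    return upper if patched else lower


# An ARP reply addressed to the virtual mac address of vrrp-group 10.
arp_reply_dst = "00:00:5e:00:01:0a"
print(receive_device(arp_reply_dst, patched=False))  # appears on the lower device
print(receive_device(arp_reply_dst, patched=True))   # appears on the upper device
```

In the unpatched case the ARP reply shows up on eth0, where ARP is not expecting it. In the patched case it stays on eth0v10, and only the VRRP advertisements are moved to eth0.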

Assume the following constraints. We only consider rfc3768 mode. For all VRRP interfaces, the lower and upper device ip addresses must not be in the same subnet. This will cause outgoing packets on the virtual ip to be routed out the upper device. The ARP reply packets will then arrive on the upper interface, and everything else just works.

This changes the behavior of the NAT rules - both incoming and outgoing NAT must now reference the upper interface.

set nat destination rule 10 inbound-interface 'eth0v10'
set nat source rule 10 outbound-interface 'eth0v10'

This also affects the firewall configuration. We now apply the firewall directly to the vrrp group.

set interfaces ethernet eth0 vrrp vrrp-group 10 firewall ...

There are other Vyos interface configuration settings that cannot currently be applied to VRRP interfaces. Some of these don't apply to such a virtual interface, and others work properly when applied to the lower interface.

set interfaces ethernet eth0 ?
Possible completions:
+  address      IP address
   bond-group   Assign interface to bonding group
 > bridge-group Add this interface to a bridge group
   description  Description
 > dhcpv6-options
                DHCPv6 options
   disable      Disable interface
   disable-flow-control
                Disable Ethernet flow control (pause frames)
   disable-link-detect
                Ignore link state changes
   duplex       Duplex mode
 > firewall     Firewall options
   hw-id        Media Access Control (MAC) address
 > ip           IPv4 routing parameters
 > ipv6         IPv6 routing parameters
   mac          Media Access Control (MAC) address
   mirror       Incoming packet mirroring destination
   mtu          Maximum Transmission Unit (MTU)
 > policy       Policy route options
+> pppoe        PPPOE unit number
   redirect     Incoming packet redirection destination
   smp_affinity CPU interrupt affinity mask
   speed        Link speed
 > traffic-policy
                Traffic-policy for interface
+> vif          Virtual Local Area Network (VLAN) ID
+> vif-s        QinQ TAG-S Virtual Local Area Network (VLAN) ID
 > vrrp         Virtual Router Redundancy Protocol (VRRP)

We should add configuration nodes to allow these, just like the firewall above.

set interfaces ethernet eth0 vrrp vrrp-group 10 ip ...
set interfaces ethernet eth0 vrrp vrrp-group 10 ipv6 ...
set interfaces ethernet eth0 vrrp vrrp-group 10 mtu ...
set interfaces ethernet eth0 vrrp vrrp-group 10 policy ...
set interfaces ethernet eth0 vrrp vrrp-group 10 redirect ...

I doubt that we need to be able to add vrrp interfaces to bond or bridge groups, or to mirror the traffic to another interface. Traffic policy seems to work properly when applied to the lower interface. Flow control, link detect, duplex modes, hardware id and smp affinity only seem to apply to the lower interface. dhcpv6 options don't seem to apply to such a virtual interface.

Future work

It would also be nice to update keepalived to the latest version, and add VRRP for IPv6. In particular, you might have a pair of routers sending IPv6 router advertisements (RA) on the inside network.