diff options
Diffstat (limited to 'docs/configexamples/ha.rst')
-rw-r--r-- | docs/configexamples/ha.rst | 580 |
1 files changed, 580 insertions, 0 deletions
diff --git a/docs/configexamples/ha.rst b/docs/configexamples/ha.rst new file mode 100644 index 00000000..702cb2b2 --- /dev/null +++ b/docs/configexamples/ha.rst @@ -0,0 +1,580 @@ +############################# +High Availability Walkthrough +############################# + +This document walks you through a complete HA setup of two VyOS machines. This +design is based on a VM as the primary router, and a physical machine as a +backup, using VRRP, BGP, OSPF and conntrack sharing. + +The aim of this document is to walk you through setting everything up so you +and up at a point where you can reboot any machine and not lose more than a few +seconds worth of connectivity. + +Design +====== + +This is based on a real life, in production design. One of the complex issues +is ensuring you have redundant data INTO your network. We do this with a pair +of Cisco Nexus switches, and using Virtual PortChannels that are spanned across +them. This as an added bonus, also allows for complete switch failure without +an outage. How you achieve this yourself is left as an exercise to the reader +but our setup is documented here. + +Walkthrough suggestion +---------------------- + +The ``commit`` command is implied after every section. If you make an error, +``commit`` will warn you and you can fix it before getting too far into things. +Please ensure you commit early and commit often. + +If you are following through this document, it is strongly suggested you +complete the entire document, ONLY doing the virtual router1 steps, and then +come back and walk through it AGAIN on the backup hardware router. + +This ensures you don't go to fast, or miss a step. However, it will make your +life easier to configure the fixed IP address and default route now on the +hardware router. + +Example Network +--------------- + +In this document, we have been allocated 203.0.113.0/24 by our upstream +provider, which we are publishing on VLAN100. + +They want us to establish a BGP session to their routers on 192.0.2.11 and +192.0.2.12 from our routers 192.0.2.21 and 192.0.2.22. They are AS 65550 and +we are AS65551. + +Our routers are going to have a floating IP address of 203.0.113.1, and use +.2 and .3 as their fixed IPs. + +We are going to use 10.200.201.0/24 for an 'internal' network on VLAN201. + +When traffic is originated from the 10.200.201.0/24 network, it will be +masqueraded to 203.0.113.1 + +For connection between sites, we are running a WireGuard link to two REMOTE +routers, and using OSPF over those links to distribute routes. That remote +site is expected to send traffic from anything in 10.201.0.0/16 + +VLANs +----- + +These are the vlans we wll be using: + +* 50: Upstream, using the 192.0.2.0/24 network allocated by them. +* 100: 'Public' network, using our 203.0.113.0/24 network. +* 201: 'Internal' network, using 10.200.201.0/24 + +Hardware +-------- + +* switch1 (Nexus 10gb Switch) +* switch2 (Nexus 10gb Switch) +* compute1 (VMware ESXi 6.5) +* compute2 (VMware ESXi 6.5) +* compute3 (VMware ESXi 6.5) +* router2 (Random 1RU machine with 4 NICs) + +Note that router1 is a VM that runs on one of the compute nodes. + +Network Cabling +--------------- + +* From Datacenter - This connects into port 1 on both switches, and is tagged + as VLAN 50 +* Cisco VPC Crossconnect - Ports 39 and 40 bonded between each switch +* Hardware Router - Port 8 of each switch +* compute1 - Port 9 of each switch +* compute2 - Port 10 of each switch +* compute3 - Port 11 of each switch + +This is ignoring the extra Out-of-band management networking, which should be +on totally different switches, and a different feed into the rack, and is out +of scope of this. + +.. note:: Our implementation uses VMware's Distributed Port Groups, which allows + VMware to use LACP. This is a part of the ENTERPRISE licence, and is not + available on a Free licence. If you are implementing this and do not have + access to DPGs, you should not use VMware, and use some other virtualization + platform instead. + + +Basic Setup (via console) +========================= + +Create your router1 VM so it is able to withstand a VM Host failing, or a +network link failing. Using VMware, this is achieved by enabling vSphere DRS, +vSphere Availability, and creating a Distributed Port Group that uses LACP. + +Many other Hypervisors do this, and I'm hoping that this document will be +expanded to document how to do this for others. + +Create an 'All VLANs' network group, that passes all trunked traffic through +to the VM. Attach this network group to router1 as eth0. + +.. note:: VMware: You must DISABLE SECURITY on this Port group. Make sure that + ``Promiscuous Mode``\ , ``MAC address changes`` and ``Forged transmits`` are + enabled. All of these will be done as part of failover. + +Bonding on Hardware Router +-------------------------- + +Create a LACP bond on the hardware router. We are assuming that eth0 and eth1 +are connected to port 8 on both switches, and that those ports are configured +as a Port-Channel. + +.. code-block:: none + + set interfaces bonding bond0 description 'Switch Port-Channel' + set interfaces bonding bond0 hash-policy 'layer2' + set interfaces bonding bond0 member interface 'eth0' + set interfaces bonding bond0 member interface 'eth1' + set interfaces bonding bond0 mode '802.3ad' + + +Assign external IP addresses +---------------------------- + +VLAN 100 and 201 will have floating IP addresses, but VLAN50 does not, as this +is talking directly to upstream. Create our IP address on vlan50. + +For the hardware router, replace ``eth0`` with ``bond0``. As (almost) every +command is identical, this will not be specified unless different things need +to be performed on different hosts. + +.. code-block:: none + + set interfaces ethernet eth0 vif 50 address '192.0.2.21/24' + +In this case, the hardware router has a different IP, so it would be + +.. code-block:: none + + set interfaces ethernet bond0 vif 50 address '192.0.2.22/24' + +Add (temporary) default route +----------------------------- + +It is assumed that the routers provided by upstream are capable of acting as a +default router, add that as a static route. + +.. code-block:: none + + set protocols static route 0.0.0.0/0 next-hop 192.0.2.11 + commit + save + + +Enable SSH +---------- + +Enable SSH so you can now SSH into the routers, rather than using the console. + +.. code-block:: none + + set service ssh + commit + save + +At this point you should be able to SSH into both of them, and will no longer +need access to the console (unless you break something!) + + +VRRP Configuration +================== + +We are setting up VRRP so that it does NOT fail back when a machine returns into +service, and it prioritizes router1 over router2. + +Internal Network +---------------- + +This has a floating IP address of 10.200.201.1/24, using virtual router ID 201. +The difference between them is the interface name, hello-source-address, and +peer-address. + +**router1** + +.. code-block:: none + + set interfaces ethernet eth0 vif 201 address 10.200.201.2/24 + set high-availability vrrp group int hello-source-address '10.200.201.2' + set high-availability vrrp group int interface 'eth0.201' + set high-availability vrrp group int peer-address '10.200.201.3' + set high-availability vrrp group int no-preempt + set high-availability vrrp group int priority '200' + set high-availability vrrp group int virtual-address '10.200.201.1/24' + set high-availability vrrp group int vrid '201' + + +**router2** + +.. code-block:: none + + set interfaces ethernet bond0 vif 201 address 10.200.201.3/24 + set high-availability vrrp group int hello-source-address '10.200.201.3' + set high-availability vrrp group int interface 'bond0.201' + set high-availability vrrp group int peer-address '10.200.201.2' + set high-availability vrrp group int no-preempt + set high-availability vrrp group int priority '100' + set high-availability vrrp group int virtual-address '10.200.201.1/24' + set high-availability vrrp group int vrid '201' + + +Public Network +-------------- + +This has a floating IP address of 203.0.113.1/24, using virtual router ID 113. +The virtual router ID is just a random number between 1 and 254, and can be set +to whatever you want. Best practices suggest you try to keep them unique +enterprise-wide. + +**router1** + +.. code-block:: none + + set interfaces ethernet eth0 vif 100 address 203.0.113.2/24 + set high-availability vrrp group public hello-source-address '203.0.113.2' + set high-availability vrrp group public interface 'eth0.100' + set high-availability vrrp group public peer-address '203.0.113.3' + set high-availability vrrp group public no-preempt + set high-availability vrrp group public priority '200' + set high-availability vrrp group public virtual-address '203.0.113.1/24' + set high-availability vrrp group public vrid '113' + +**router2** + +.. code-block:: none + + set interfaces ethernet bond0 vif 100 address 203.0.113.3/24 + set high-availability vrrp group public hello-source-address '203.0.113.3' + set high-availability vrrp group public interface 'bond0.100' + set high-availability vrrp group public peer-address '203.0.113.2' + set high-availability vrrp group public no-preempt + set high-availability vrrp group public priority '100' + set high-availability vrrp group public virtual-address '203.0.113.1/24' + set high-availability vrrp group public vrid '113' + + +Create VRRP sync-group +---------------------- + +The sync group is used to replicate connection tracking. It needs to be assigned +to a random VRRP group, and we are creating a sync group called ``sync`` using +the vrrp group ``int``. + +.. code-block:: none + + set high-availability vrrp sync-group sync member 'int' + +Testing +------- + +At this point, you should be able to see both IP addresses when you run +``show interfaces``\ , and ``show vrrp`` should show both interfaces in MASTER +state (and SLAVE state on router2). + +.. code-block:: none + + vyos@router1:~$ show vrrp + Name Interface VRID State Last Transition + -------- ----------- ------ ------- ----------------- + int eth0.201 201 MASTER 100s + public eth0.100 113 MASTER 200s + vyos@router1:~$ + + +You should be able to ping to and from all the IPs you have allocated. + +NAT and conntrack-sync +====================== + +Masquerade Traffic originating from 10.200.201.0/24 that is heading out the +public interface. + +.. note:: We explicitly exclude the primary upstream network so that BGP or + OSPF traffic doesn't accidentally get NAT'ed. + +.. code-block:: none + + set nat source rule 10 destination address '!192.0.2.0/24' + set nat source rule 10 outbound-interface 'eth0.50' + set nat source rule 10 source address '10.200.201.0/24' + set nat source rule 10 translation address '203.0.113.1' + + +Configure conntrack-sync and disable helpers +-------------------------------------------- + +Most conntrack modules cause more problems than they're worth, especially in a +complex network. Turn them off by default, and if you need to turn them on +later, you can do so. + +.. code-block:: none + + set system conntrack modules ftp disable + set system conntrack modules gre disable + set system conntrack modules nfs disable + set system conntrack modules pptp disable + set system conntrack modules sip disable + set system conntrack modules tftp disable + +Now enable replication between nodes. Replace eth0.201 with bond0.201 on the +hardware router. + +.. code-block:: none + + set service conntrack-sync accept-protocol 'tcp,udp,icmp' + set service conntrack-sync event-listen-queue-size '8' + set service conntrack-sync failover-mechanism vrrp sync-group 'sync' + set service conntrack-sync interface eth0.201 + set service conntrack-sync mcast-group '224.0.0.50' + set service conntrack-sync sync-queue-size '8' + +Testing +------- + +The simplest way to test is to look at the connection tracking stats on the +standby hardware router with the command ``show conntrack-sync statistics``. +The numbers should be very close to the numbers on the primary router. + +When you have both routers up, you should be able to establish a connection +from a NAT'ed machine out to the internet, reboot the active machine, and that +connection should be preserved, and will not drop out. + +OSPF Over WireGuard +=================== + +Wireguard doesn't have the concept of an up or down link, due to its design. +This complicates AND simplifies using it for network transport, as for reliable +state detection you need to use SOMETHING to detect when the link is down. + +If you use a routing protocol itself, you solve two problems at once. This is +only a basic example, and is provided as a starting point. + +Configure Wireguard +------------------- + +There is plenty of instructions and documentation on setting up Wireguard. The +only important thing you need to remember is to only use one WireGuard +interface per OSPF connection. + +We use small /30's from 10.254.60/24 for the point-to-point links. + +**router1** + +Replace the 203.0.113.3 with whatever the other router's IP address is. + +.. code-block:: none + + set interfaces wireguard wg01 address '10.254.60.1/30' + set interfaces wireguard wg01 description 'router1-to-offsite1' + set interfaces wireguard wg01 ip ospf authentication md5 key-id 1 md5-key 'i360KoCwUGZvPq7e' + set interfaces wireguard wg01 ip ospf cost '11' + set interfaces wireguard wg01 ip ospf dead-interval '5' + set interfaces wireguard wg01 ip ospf hello-interval '1' + set interfaces wireguard wg01 ip ospf network 'point-to-point' + set interfaces wireguard wg01 ip ospf priority '1' + set interfaces wireguard wg01 ip ospf retransmit-interval '5' + set interfaces wireguard wg01 ip ospf transmit-delay '1' + set interfaces wireguard wg01 peer OFFSITE1 allowed-ips '0.0.0.0/0' + set interfaces wireguard wg01 peer OFFSITE1 endpoint '203.0.113.3:50001' + set interfaces wireguard wg01 peer OFFSITE1 persistent-keepalive '15' + set interfaces wireguard wg01 peer OFFSITE1 pubkey 'GEFMOWzAyau42/HwdwfXnrfHdIISQF8YHj35rOgSZ0o=' + set interfaces wireguard wg01 port '50001' + + +**offsite1** + +This is connecting back to the STATIC IP of router1, not the floating. + +.. code-block:: none + + set interfaces wireguard wg01 address '10.254.60.2/30' + set interfaces wireguard wg01 description 'offsite1-to-router1' + set interfaces wireguard wg01 ip ospf authentication md5 key-id 1 md5-key 'i360KoCwUGZvPq7e' + set interfaces wireguard wg01 ip ospf cost '11' + set interfaces wireguard wg01 ip ospf dead-interval '5' + set interfaces wireguard wg01 ip ospf hello-interval '1' + set interfaces wireguard wg01 ip ospf network 'point-to-point' + set interfaces wireguard wg01 ip ospf priority '1' + set interfaces wireguard wg01 ip ospf retransmit-interval '5' + set interfaces wireguard wg01 ip ospf transmit-delay '1' + set interfaces wireguard wg01 peer ROUTER1 allowed-ips '0.0.0.0/0' + set interfaces wireguard wg01 peer ROUTER1 endpoint '192.0.2.21:50001' + set interfaces wireguard wg01 peer ROUTER1 persistent-keepalive '15' + set interfaces wireguard wg01 peer ROUTER1 pubkey 'CKwMV3ZaLntMule2Kd3G7UyVBR7zE8/qoZgLb82EE2Q=' + set interfaces wireguard wg01 port '50001' + +Test WireGuard +-------------- + +Make sure you can ping 10.254.60.1 and .2 from both routers. + +Create Export Filter +-------------------- + +We only want to export the networks we know we should be exporting. Always +whitelist your route filters, both importing and exporting. A good rule of +thumb is **'If you are not the default router for a network, don't advertise +it'**. This means we explicitly do not want to advertise the 192.0.2.0/24 +network (but do want to advertise 10.200.201.0 and 203.0.113.0, which we ARE +the default route for). This filter is applied to ``redistribute connected``. +If we WERE to advertise it, the remote machines would see 192.0.2.21 available +via their default route, establish the connection, and then OSPF would say +'192.0.2.0/24 is available via this tunnel', at which point the tunnel would +break, OSPF would drop the routes, and then 192.0.2.0/24 would be reachable via +default again. This is called 'flapping'. + +.. code-block:: none + + set policy access-list 150 description 'Outbound OSPF Redistribution' + set policy access-list 150 rule 10 action 'permit' + set policy access-list 150 rule 10 destination any + set policy access-list 150 rule 10 source inverse-mask '0.0.0.255' + set policy access-list 150 rule 10 source network '10.200.201.0' + set policy access-list 150 rule 20 action 'permit' + set policy access-list 150 rule 20 destination any + set policy access-list 150 rule 20 source inverse-mask '0.0.0.255' + set policy access-list 150 rule 20 source network '203.0.113.0' + set policy access-list 150 rule 100 action 'deny' + set policy access-list 150 rule 100 destination any + set policy access-list 150 rule 100 source any + + +Create Import Filter +-------------------- + +We only want to import networks we know about. Our OSPF peer should only be +advertising networks in the 10.201.0.0/16 range. Note that this is an INVERSE +MATCH. You deny in access-list 100 to accept the route. + +.. code-block:: none + + set policy access-list 100 description 'Inbound OSPF Routes from Peers' + set policy access-list 100 rule 10 action 'deny' + set policy access-list 100 rule 10 destination any + set policy access-list 100 rule 10 source inverse-mask '0.0.255.255' + set policy access-list 100 rule 10 source network '10.201.0.0' + set policy access-list 100 rule 100 action 'permit' + set policy access-list 100 rule 100 destination any + set policy access-list 100 rule 100 source any + set policy route-map PUBOSPF rule 100 action 'deny' + set policy route-map PUBOSPF rule 100 match ip address access-list '100' + set policy route-map PUBOSPF rule 500 action 'permit' + + +Enable OSPF +----------- + +Every router **must** have a unique router-id. +The 'reference-bandwidth' is used because when OSPF was originally designed, +the idea of a link faster than 1gbit was unheard of, and it does not scale +correctly. + +.. code-block:: none + + set protocols ospf area 0.0.0.0 authentication 'md5' + set protocols ospf area 0.0.0.0 network '10.254.60.0/24' + set protocols ospf auto-cost reference-bandwidth '10000' + set protocols ospf log-adjacency-changes + set protocols ospf parameters abr-type 'cisco' + set protocols ospf parameters router-id '10.254.60.2' + set protocols ospf route-map PUBOSPF + + +Test OSPF +--------- + +When you have enabled OSPF on both routers, you should be able to see each +other with the command ``show ip ospf neighbour``. The state must be 'Full' +or '2-Way', if it is not then there is a network connectivity issue between the +hosts. This is often caused by NAT or MTU issues. You should not see any new +routes (unless this is the second pass) in the output of ``show ip route`` + +Advertise connected routes +========================== + +As a reminder, only advertise routes that you are the default router for. This +is why we are NOT announcing the 192.0.2.0/24 network, because if that was +announced into OSPF, the other routers would try to connect to that network +over a tunnel that connects to that network! + +.. code-block:: none + + set protocols ospf access-list 150 export 'connected' + set protocols ospf redistribute connected + + +You should now be able to see the advertised network on the other host. + +Duplicate configuration +----------------------- + +At this pont you now need to create the X link between all four routers. Use a +different /30 for each link. + +Priorities +---------- + +Set the cost on the secondary links to be 200. This means that they will not +be used unless the primary links are down. + +.. code-block:: none + + set interfaces wireguard wg01 ip ospf cost '10' + set interfaces wireguard wg02 ip ospf cost '200' + + +This will be visible in 'show ip route'. + +BGP +=== + +BGP is an extremely complex network protocol. An example is provided here. + +.. note:: Router id's must be unique. + +**router1** + + +The ``redistribute ospf`` command is there purely as an example of how this can +be expanded. In this walkthrough, it will be filtered by BGPOUT rule 10000, as +it is not 203.0.113.0/24. + +.. code-block:: none + + set policy prefix-list BGPOUT description 'BGP Export List' + set policy prefix-list BGPOUT rule 10 action 'deny' + set policy prefix-list BGPOUT rule 10 description 'Do not advertise short masks' + set policy prefix-list BGPOUT rule 10 ge '25' + set policy prefix-list BGPOUT rule 10 prefix '0.0.0.0/0' + set policy prefix-list BGPOUT rule 100 action 'permit' + set policy prefix-list BGPOUT rule 100 description 'Our network' + set policy prefix-list BGPOUT rule 100 prefix '203.0.113.0/24' + set policy prefix-list BGPOUT rule 10000 action 'deny' + set policy prefix-list BGPOUT rule 10000 prefix '0.0.0.0/0' + set policy route-map BGPOUT description 'BGP Export Filter' + set policy route-map BGPOUT rule 10 action 'permit' + set policy route-map BGPOUT rule 10 match ip address prefix-list 'BGPOUT' + set policy route-map BGPOUT rule 10000 action 'deny' + set policy route-map BGPPREPENDOUT description 'BGP Export Filter' + set policy route-map BGPPREPENDOUT rule 10 action 'permit' + set policy route-map BGPPREPENDOUT rule 10 set as-path-prepend '65551 65551 65551' + set policy route-map BGPPREPENDOUT rule 10 match ip address prefix-list 'BGPOUT' + set policy route-map BGPPREPENDOUT rule 10000 action 'deny' + set protocols bgp 65551 address-family ipv4-unicast network 192.0.2.0/24 + set protocols bgp 65551 address-family ipv4-unicast redistribute connected metric '50' + set protocols bgp 65551 address-family ipv4-unicast redistribute ospf metric '50' + set protocols bgp 65551 neighbor 192.0.2.11 address-family ipv4-unicast route-map export 'BGPOUT' + set protocols bgp 65551 neighbor 192.0.2.11 address-family ipv4-unicast soft-reconfiguration inbound + set protocols bgp 65551 neighbor 192.0.2.11 remote-as '65550' + set protocols bgp 65551 neighbor 192.0.2.11 update-source '192.0.2.21' + set protocols bgp 65551 parameters router-id '192.0.2.21' + + +**router2** + +This is identical, but you use the BGPPREPENDOUT route-map to advertise the +route with a longer path. |