Diffstat (limited to 'docs/appendix/examples/ha.rst')
-rw-r--r-- docs/appendix/examples/ha.rst 580
1 files changed, 0 insertions, 580 deletions
diff --git a/docs/appendix/examples/ha.rst b/docs/appendix/examples/ha.rst
deleted file mode 100644
index 702cb2b2..00000000
--- a/docs/appendix/examples/ha.rst
+++ /dev/null
@@ -1,580 +0,0 @@
-#############################
-High Availability Walkthrough
-#############################
-
-This document walks you through a complete HA setup of two VyOS machines. This
-design is based on a VM as the primary router, and a physical machine as a
-backup, using VRRP, BGP, OSPF and conntrack sharing.
-
-The aim of this document is to walk you through setting everything up so you
-end up at a point where you can reboot any machine and not lose more than a few
-seconds' worth of connectivity.
-
-Design
-======
-
-This is based on a real-life, in-production design. One of the complex issues
-is ensuring you have redundant data paths INTO your network. We do this with a
-pair of Cisco Nexus switches, using Virtual PortChannels that span both of
-them. As an added bonus, this also allows for complete switch failure without
-an outage. How you achieve this yourself is left as an exercise to the reader,
-but our setup is documented here.
-
-Walkthrough suggestion
-----------------------
-
-The ``commit`` command is implied after every section. If you make an error,
-``commit`` will warn you and you can fix it before getting too far into things.
-Please ensure you commit early and commit often.
-
-If you are following through this document, it is strongly suggested you
-complete the entire document, ONLY doing the virtual router1 steps, and then
-come back and walk through it AGAIN on the backup hardware router.
-
-This ensures you don't go too fast or miss a step. However, it will make your
-life easier to configure the fixed IP address and default route on the
-hardware router now.
-
-Example Network
----------------
-
-In this document, we have been allocated 203.0.113.0/24 by our upstream
-provider, which we are publishing on VLAN100.
-
-They want us to establish a BGP session to their routers on 192.0.2.11 and
-192.0.2.12 from our routers 192.0.2.21 and 192.0.2.22. They are AS 65550 and
-we are AS 65551.
-
-Our routers are going to have a floating IP address of 203.0.113.1, and use
-.2 and .3 as their fixed IPs.
-
-We are going to use 10.200.201.0/24 for an 'internal' network on VLAN201.
-
-When traffic is originated from the 10.200.201.0/24 network, it will be
-masqueraded to 203.0.113.1.
-
-For connection between sites, we are running a WireGuard link to two REMOTE
-routers, and using OSPF over those links to distribute routes. That remote
-site is expected to send traffic from anything in 10.201.0.0/16.
-
-VLANs
------
-
-These are the VLANs we will be using:
-
-* 50: Upstream, using the 192.0.2.0/24 network allocated by them.
-* 100: 'Public' network, using our 203.0.113.0/24 network.
-* 201: 'Internal' network, using 10.200.201.0/24.
-
-Hardware
---------
-
-* switch1 (Nexus 10gb Switch)
-* switch2 (Nexus 10gb Switch)
-* compute1 (VMware ESXi 6.5)
-* compute2 (VMware ESXi 6.5)
-* compute3 (VMware ESXi 6.5)
-* router2 (Random 1RU machine with 4 NICs)
-
-Note that router1 is a VM that runs on one of the compute nodes.
-
-Network Cabling
----------------
-
-* From Datacenter - This connects into port 1 on both switches, and is tagged
- as VLAN 50
-* Cisco VPC Crossconnect - Ports 39 and 40 bonded between each switch
-* Hardware Router - Port 8 of each switch
-* compute1 - Port 9 of each switch
-* compute2 - Port 10 of each switch
-* compute3 - Port 11 of each switch
-
-This ignores the extra out-of-band management networking, which should be on
-totally different switches and a different feed into the rack, and is out of
-scope for this document.
-
-.. note:: Our implementation uses VMware's Distributed Port Groups, which allow
- VMware to use LACP. This is a part of the ENTERPRISE licence, and is not
- available on a Free licence. If you are implementing this and do not have
- access to DPGs, you should use some other virtualization platform instead
- of VMware.
-
-
-Basic Setup (via console)
-=========================
-
-Create your router1 VM so it is able to withstand a VM Host failing, or a
-network link failing. Using VMware, this is achieved by enabling vSphere DRS,
-vSphere Availability, and creating a Distributed Port Group that uses LACP.
-
-Many other hypervisors can do this, and we hope that this document will be
-expanded to describe how to do it on them.
-
-Create an 'All VLANs' network group, that passes all trunked traffic through
-to the VM. Attach this network group to router1 as eth0.
-
-.. note:: VMware: You must DISABLE SECURITY on this Port group. Make sure that
- ``Promiscuous Mode``\ , ``MAC address changes`` and ``Forged transmits`` are
- enabled. The router will do all of these things as part of failover.
-
-Bonding on Hardware Router
---------------------------
-
-Create a LACP bond on the hardware router. We are assuming that eth0 and eth1
-are connected to port 8 on both switches, and that those ports are configured
-as a Port-Channel.
-
-.. code-block:: none
-
- set interfaces bonding bond0 description 'Switch Port-Channel'
- set interfaces bonding bond0 hash-policy 'layer2'
- set interfaces bonding bond0 member interface 'eth0'
- set interfaces bonding bond0 member interface 'eth1'
- set interfaces bonding bond0 mode '802.3ad'
-
-
-Assign external IP addresses
-----------------------------
-
-VLANs 100 and 201 will have floating IP addresses, but VLAN 50 does not, as it
-talks directly to upstream. Create our IP address on VLAN 50.
-
-For the hardware router, replace ``ethernet eth0`` with ``bonding bond0``. As
-(almost) every command is identical, this will not be specified unless
-different things need to be performed on different hosts.
-
-.. code-block:: none
-
- set interfaces ethernet eth0 vif 50 address '192.0.2.21/24'
-
-In this case, the hardware router has a different IP, so it would be:
-
-.. code-block:: none
-
- set interfaces bonding bond0 vif 50 address '192.0.2.22/24'
-
-Add (temporary) default route
------------------------------
-
-It is assumed that the routers provided by upstream are capable of acting as a
-default router. Add one of them as a static route.
-
-.. code-block:: none
-
- set protocols static route 0.0.0.0/0 next-hop 192.0.2.11
- commit
- save
-
-
-Enable SSH
-----------
-
-Enable SSH so you can now SSH into the routers, rather than using the console.
-
-.. code-block:: none
-
- set service ssh
- commit
- save
-
-At this point you should be able to SSH into both of them, and will no longer
-need access to the console (unless you break something!)
-
-
-VRRP Configuration
-==================
-
-We are setting up VRRP so that it does NOT fail back when a machine returns into
-service, and it prioritizes router1 over router2.
-
-Internal Network
-----------------
-
-This has a floating IP address of 10.200.201.1/24, using virtual router ID 201.
-The two routers differ only in the interface name, hello-source-address,
-peer-address and priority.
-
-**router1**
-
-.. code-block:: none
-
- set interfaces ethernet eth0 vif 201 address 10.200.201.2/24
- set high-availability vrrp group int hello-source-address '10.200.201.2'
- set high-availability vrrp group int interface 'eth0.201'
- set high-availability vrrp group int peer-address '10.200.201.3'
- set high-availability vrrp group int no-preempt
- set high-availability vrrp group int priority '200'
- set high-availability vrrp group int virtual-address '10.200.201.1/24'
- set high-availability vrrp group int vrid '201'
-
-
-**router2**
-
-.. code-block:: none
-
- set interfaces bonding bond0 vif 201 address 10.200.201.3/24
- set high-availability vrrp group int hello-source-address '10.200.201.3'
- set high-availability vrrp group int interface 'bond0.201'
- set high-availability vrrp group int peer-address '10.200.201.2'
- set high-availability vrrp group int no-preempt
- set high-availability vrrp group int priority '100'
- set high-availability vrrp group int virtual-address '10.200.201.1/24'
- set high-availability vrrp group int vrid '201'
-
-
-Public Network
---------------
-
-This has a floating IP address of 203.0.113.1/24, using virtual router ID 113.
-The virtual router ID is an arbitrary number between 1 and 255, and can be set
-to whatever you want. Best practices suggest you try to keep them unique
-enterprise-wide.
-
-**router1**
-
-.. code-block:: none
-
- set interfaces ethernet eth0 vif 100 address 203.0.113.2/24
- set high-availability vrrp group public hello-source-address '203.0.113.2'
- set high-availability vrrp group public interface 'eth0.100'
- set high-availability vrrp group public peer-address '203.0.113.3'
- set high-availability vrrp group public no-preempt
- set high-availability vrrp group public priority '200'
- set high-availability vrrp group public virtual-address '203.0.113.1/24'
- set high-availability vrrp group public vrid '113'
-
-**router2**
-
-.. code-block:: none
-
- set interfaces bonding bond0 vif 100 address 203.0.113.3/24
- set high-availability vrrp group public hello-source-address '203.0.113.3'
- set high-availability vrrp group public interface 'bond0.100'
- set high-availability vrrp group public peer-address '203.0.113.2'
- set high-availability vrrp group public no-preempt
- set high-availability vrrp group public priority '100'
- set high-availability vrrp group public virtual-address '203.0.113.1/24'
- set high-availability vrrp group public vrid '113'
-
-
-Create VRRP sync-group
-----------------------
-
-The sync group is what conntrack-sync uses as its failover mechanism. It needs
-at least one VRRP group as a member, and any group will do, so we are creating
-a sync group called ``sync`` containing the VRRP group ``int``.
-
-.. code-block:: none
-
- set high-availability vrrp sync-group sync member 'int'
-
-Testing
--------
-
-At this point, you should be able to see both IP addresses when you run
-``show interfaces``\ , and ``show vrrp`` should show both groups in MASTER
-state on router1 (and BACKUP state on router2).
-
-.. code-block:: none
-
- vyos@router1:~$ show vrrp
- Name Interface VRID State Last Transition
- -------- ----------- ------ ------- -----------------
- int eth0.201 201 MASTER 100s
- public eth0.100 113 MASTER 200s
- vyos@router1:~$
-
-
-You should be able to ping to and from all the IPs you have allocated.
-
-NAT and conntrack-sync
-======================
-
-Masquerade traffic originating from 10.200.201.0/24 that is heading out the
-upstream interface.
-
-.. note:: We explicitly exclude the primary upstream network so that BGP or
- OSPF traffic doesn't accidentally get NAT'ed.
-
-.. code-block:: none
-
- set nat source rule 10 destination address '!192.0.2.0/24'
- set nat source rule 10 outbound-interface 'eth0.50'
- set nat source rule 10 source address '10.200.201.0/24'
- set nat source rule 10 translation address '203.0.113.1'
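-
-On the hardware router the same rule would presumably be bound to the bonded
-VLAN interface instead. A minimal sketch, assuming the ``bond0`` naming used
-earlier in this walkthrough (only the ``outbound-interface`` line differs):
-
-.. code-block:: none
-
- # Assumption: router2 reaches upstream via bond0.50
- set nat source rule 10 destination address '!192.0.2.0/24'
- set nat source rule 10 outbound-interface 'bond0.50'
- set nat source rule 10 source address '10.200.201.0/24'
- set nat source rule 10 translation address '203.0.113.1'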
-
-
-Configure conntrack-sync and disable helpers
---------------------------------------------
-
-Most conntrack modules cause more problems than they're worth, especially in a
-complex network. Turn them off by default, and if you need to turn them on
-later, you can do so.
-
-.. code-block:: none
-
- set system conntrack modules ftp disable
- set system conntrack modules gre disable
- set system conntrack modules nfs disable
- set system conntrack modules pptp disable
- set system conntrack modules sip disable
- set system conntrack modules tftp disable
-
-Now enable replication between nodes. Replace eth0.201 with bond0.201 on the
-hardware router.
-
-.. code-block:: none
-
- set service conntrack-sync accept-protocol 'tcp,udp,icmp'
- set service conntrack-sync event-listen-queue-size '8'
- set service conntrack-sync failover-mechanism vrrp sync-group 'sync'
- set service conntrack-sync interface eth0.201
- set service conntrack-sync mcast-group '224.0.0.50'
- set service conntrack-sync sync-queue-size '8'
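-
-For reference, a sketch of the one line that differs on the hardware router,
-assuming the ``bond0`` naming used earlier. All other conntrack-sync lines
-stay as shown above:
-
-.. code-block:: none
-
- # Assumption: router2 uses bond0.201 for the internal VLAN
- set service conntrack-sync interface bond0.201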
-
-Testing
--------
-
-The simplest way to test is to look at the connection tracking stats on the
-standby hardware router with the command ``show conntrack-sync statistics``.
-The numbers should be very close to the numbers on the primary router.
-
-When you have both routers up, you should be able to establish a connection
-from a NAT'ed machine out to the internet and reboot the active machine, and
-that connection should be preserved rather than dropping out.
-
-OSPF Over WireGuard
-===================
-
-WireGuard doesn't have the concept of an up or down link, due to its design.
-This complicates AND simplifies using it for network transport, because for
-reliable state detection you need SOMETHING else to detect when the link is down.
-
-If you use a routing protocol itself, you solve two problems at once. This is
-only a basic example, and is provided as a starting point.
-
-Configure WireGuard
--------------------
-
-There are plenty of instructions and documentation on setting up WireGuard. The
-only important thing you need to remember is to only use one WireGuard
-interface per OSPF connection.
-
-We use small /30's from 10.254.60.0/24 for the point-to-point links.
-
-**router1**
-
-Replace 203.0.113.3 in the ``endpoint`` line with the remote router's IP address.
-
-.. code-block:: none
-
- set interfaces wireguard wg01 address '10.254.60.1/30'
- set interfaces wireguard wg01 description 'router1-to-offsite1'
- set interfaces wireguard wg01 ip ospf authentication md5 key-id 1 md5-key 'i360KoCwUGZvPq7e'
- set interfaces wireguard wg01 ip ospf cost '11'
- set interfaces wireguard wg01 ip ospf dead-interval '5'
- set interfaces wireguard wg01 ip ospf hello-interval '1'
- set interfaces wireguard wg01 ip ospf network 'point-to-point'
- set interfaces wireguard wg01 ip ospf priority '1'
- set interfaces wireguard wg01 ip ospf retransmit-interval '5'
- set interfaces wireguard wg01 ip ospf transmit-delay '1'
- set interfaces wireguard wg01 peer OFFSITE1 allowed-ips '0.0.0.0/0'
- set interfaces wireguard wg01 peer OFFSITE1 endpoint '203.0.113.3:50001'
- set interfaces wireguard wg01 peer OFFSITE1 persistent-keepalive '15'
- set interfaces wireguard wg01 peer OFFSITE1 pubkey 'GEFMOWzAyau42/HwdwfXnrfHdIISQF8YHj35rOgSZ0o='
- set interfaces wireguard wg01 port '50001'
-
-
-**offsite1**
-
-This is connecting back to the STATIC IP of router1, not the floating.
-
-.. code-block:: none
-
- set interfaces wireguard wg01 address '10.254.60.2/30'
- set interfaces wireguard wg01 description 'offsite1-to-router1'
- set interfaces wireguard wg01 ip ospf authentication md5 key-id 1 md5-key 'i360KoCwUGZvPq7e'
- set interfaces wireguard wg01 ip ospf cost '11'
- set interfaces wireguard wg01 ip ospf dead-interval '5'
- set interfaces wireguard wg01 ip ospf hello-interval '1'
- set interfaces wireguard wg01 ip ospf network 'point-to-point'
- set interfaces wireguard wg01 ip ospf priority '1'
- set interfaces wireguard wg01 ip ospf retransmit-interval '5'
- set interfaces wireguard wg01 ip ospf transmit-delay '1'
- set interfaces wireguard wg01 peer ROUTER1 allowed-ips '0.0.0.0/0'
- set interfaces wireguard wg01 peer ROUTER1 endpoint '192.0.2.21:50001'
- set interfaces wireguard wg01 peer ROUTER1 persistent-keepalive '15'
- set interfaces wireguard wg01 peer ROUTER1 pubkey 'CKwMV3ZaLntMule2Kd3G7UyVBR7zE8/qoZgLb82EE2Q='
- set interfaces wireguard wg01 port '50001'
-
-Test WireGuard
---------------
-
-Make sure you can ping 10.254.60.1 and .2 from both routers.
-
-Create Export Filter
---------------------
-
-We only want to export the networks we know we should be exporting. Always
-whitelist your route filters, both importing and exporting. A good rule of
-thumb is **'If you are not the default router for a network, don't advertise
-it'**. This means we explicitly do not want to advertise the 192.0.2.0/24
-network (but do want to advertise 10.200.201.0/24 and 203.0.113.0/24, which we
-ARE the default router for). This filter is applied to
-``redistribute connected``. If we WERE to advertise 192.0.2.0/24, the remote
-machines would see 192.0.2.21 available via their default route, establish the
-connection, and then OSPF would say '192.0.2.0/24 is available via this
-tunnel', at which point the tunnel would break, OSPF would drop the routes, and
-192.0.2.0/24 would be reachable via default again. This is called 'flapping'.
-
-.. code-block:: none
-
- set policy access-list 150 description 'Outbound OSPF Redistribution'
- set policy access-list 150 rule 10 action 'permit'
- set policy access-list 150 rule 10 destination any
- set policy access-list 150 rule 10 source inverse-mask '0.0.0.255'
- set policy access-list 150 rule 10 source network '10.200.201.0'
- set policy access-list 150 rule 20 action 'permit'
- set policy access-list 150 rule 20 destination any
- set policy access-list 150 rule 20 source inverse-mask '0.0.0.255'
- set policy access-list 150 rule 20 source network '203.0.113.0'
- set policy access-list 150 rule 100 action 'deny'
- set policy access-list 150 rule 100 destination any
- set policy access-list 150 rule 100 source any
-
-
-Create Import Filter
---------------------
-
-We only want to import networks we know about. Our OSPF peer should only be
-advertising networks in the 10.201.0.0/16 range. Note that this is an INVERSE
-MATCH: route-map PUBOSPF denies whatever access-list 100 permits, so you
-'deny' a network in access-list 100 in order to accept its routes.
-
-.. code-block:: none
-
- set policy access-list 100 description 'Inbound OSPF Routes from Peers'
- set policy access-list 100 rule 10 action 'deny'
- set policy access-list 100 rule 10 destination any
- set policy access-list 100 rule 10 source inverse-mask '0.0.255.255'
- set policy access-list 100 rule 10 source network '10.201.0.0'
- set policy access-list 100 rule 100 action 'permit'
- set policy access-list 100 rule 100 destination any
- set policy access-list 100 rule 100 source any
- set policy route-map PUBOSPF rule 100 action 'deny'
- set policy route-map PUBOSPF rule 100 match ip address access-list '100'
- set policy route-map PUBOSPF rule 500 action 'permit'
-
-
-Enable OSPF
------------
-
-Every router **must** have a unique router-id.
-The ``reference-bandwidth`` is raised because OSPF's default cost calculation
-assumes a 100 Mbit/s reference link, so without it every faster link ends up
-with the same cost.
-
-.. code-block:: none
-
- set protocols ospf area 0.0.0.0 authentication 'md5'
- set protocols ospf area 0.0.0.0 network '10.254.60.0/24'
- set protocols ospf auto-cost reference-bandwidth '10000'
- set protocols ospf log-adjacency-changes
- set protocols ospf parameters abr-type 'cisco'
- set protocols ospf parameters router-id '10.254.60.2'
- set protocols ospf route-map PUBOSPF
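-
-Because the router ID has to differ on every router, the other end of the link
-needs its own value. A sketch, assuming you reuse each router's tunnel address
-as its router ID (the address below is illustrative, not taken from the
-production setup):
-
-.. code-block:: none
-
- # Assumption: this router owns 10.254.60.1 on wg01
- set protocols ospf parameters router-id '10.254.60.1'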
-
-
-Test OSPF
----------
-
-When you have enabled OSPF on both routers, they should be able to see each
-other with the command ``show ip ospf neighbor``. The state must be 'Full'
-or '2-Way'; if it is not, there is a network connectivity issue between the
-hosts. This is often caused by NAT or MTU issues. You should not see any new
-routes (unless this is the second pass) in the output of ``show ip route``.
-
-Advertise connected routes
-==========================
-
-As a reminder, only advertise routes that you are the default router for. This
-is why we are NOT announcing the 192.0.2.0/24 network, because if that was
-announced into OSPF, the other routers would try to connect to that network
-over a tunnel that connects to that network!
-
-.. code-block:: none
-
- set protocols ospf access-list 150 export 'connected'
- set protocols ospf redistribute connected
-
-
-You should now be able to see the advertised network on the other host.
-
-Duplicate configuration
------------------------
-
-At this point you need to create the cross ('X') links between all four
-routers. Use a different /30 for each link.
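-
-As a rough sketch of what one additional link could look like on router1,
-assuming a second remote router called ``OFFSITE2`` and the next free /30
-(10.254.60.4/30). The peer name, port, endpoint and public key are placeholders
-that do not come from the original setup:
-
-.. code-block:: none
-
- # Hypothetical second link; substitute your own endpoint and key
- set interfaces wireguard wg02 address '10.254.60.5/30'
- set interfaces wireguard wg02 description 'router1-to-offsite2'
- set interfaces wireguard wg02 ip ospf network 'point-to-point'
- set interfaces wireguard wg02 peer OFFSITE2 allowed-ips '0.0.0.0/0'
- set interfaces wireguard wg02 peer OFFSITE2 endpoint '<offsite2-address>:50002'
- set interfaces wireguard wg02 peer OFFSITE2 persistent-keepalive '15'
- set interfaces wireguard wg02 peer OFFSITE2 pubkey '<offsite2-public-key>'
- set interfaces wireguard wg02 port '50002'
-
-The OSPF authentication and timer settings shown for ``wg01`` would be repeated
-unchanged on the new interface.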
-
-Priorities
-----------
-
-Set the cost on the secondary links to be 200. This means that they will not
-be used unless the primary links are down.
-
-.. code-block:: none
-
- set interfaces wireguard wg01 ip ospf cost '10'
- set interfaces wireguard wg02 ip ospf cost '200'
-
-
-This will be visible in ``show ip route``.
-
-BGP
-===
-
-BGP is an extremely complex network protocol. An example is provided here.
-
-.. note:: Router IDs must be unique.
-
-**router1**
-
-
-The ``redistribute ospf`` command is there purely as an example of how this can
-be expanded. In this walkthrough, it will be filtered by BGPOUT rule 10000, as
-it is not 203.0.113.0/24.
-
-.. code-block:: none
-
- set policy prefix-list BGPOUT description 'BGP Export List'
- set policy prefix-list BGPOUT rule 10 action 'deny'
- set policy prefix-list BGPOUT rule 10 description 'Do not advertise prefixes longer than /24'
- set policy prefix-list BGPOUT rule 10 ge '25'
- set policy prefix-list BGPOUT rule 10 prefix '0.0.0.0/0'
- set policy prefix-list BGPOUT rule 100 action 'permit'
- set policy prefix-list BGPOUT rule 100 description 'Our network'
- set policy prefix-list BGPOUT rule 100 prefix '203.0.113.0/24'
- set policy prefix-list BGPOUT rule 10000 action 'deny'
- set policy prefix-list BGPOUT rule 10000 prefix '0.0.0.0/0'
- set policy route-map BGPOUT description 'BGP Export Filter'
- set policy route-map BGPOUT rule 10 action 'permit'
- set policy route-map BGPOUT rule 10 match ip address prefix-list 'BGPOUT'
- set policy route-map BGPOUT rule 10000 action 'deny'
- set policy route-map BGPPREPENDOUT description 'BGP Export Filter'
- set policy route-map BGPPREPENDOUT rule 10 action 'permit'
- set policy route-map BGPPREPENDOUT rule 10 set as-path-prepend '65551 65551 65551'
- set policy route-map BGPPREPENDOUT rule 10 match ip address prefix-list 'BGPOUT'
- set policy route-map BGPPREPENDOUT rule 10000 action 'deny'
- set protocols bgp 65551 address-family ipv4-unicast network 203.0.113.0/24
- set protocols bgp 65551 address-family ipv4-unicast redistribute connected metric '50'
- set protocols bgp 65551 address-family ipv4-unicast redistribute ospf metric '50'
- set protocols bgp 65551 neighbor 192.0.2.11 address-family ipv4-unicast route-map export 'BGPOUT'
- set protocols bgp 65551 neighbor 192.0.2.11 address-family ipv4-unicast soft-reconfiguration inbound
- set protocols bgp 65551 neighbor 192.0.2.11 remote-as '65550'
- set protocols bgp 65551 neighbor 192.0.2.11 update-source '192.0.2.21'
- set protocols bgp 65551 parameters router-id '192.0.2.21'
-
-
-**router2**
-
-This is identical, except that you use the BGPPREPENDOUT route-map to advertise
-the route with a longer AS path.
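-
-A sketch of the lines that would differ on router2, assuming it peers with the
-second upstream router (192.0.2.12) from its own fixed address (192.0.2.22).
-The choice of neighbor is an assumption. The policy definitions are the same
-as on router1:
-
-.. code-block:: none
-
- # Assumption: router2 peers with 192.0.2.12 and prepends its AS on export
- set protocols bgp 65551 neighbor 192.0.2.12 address-family ipv4-unicast route-map export 'BGPPREPENDOUT'
- set protocols bgp 65551 neighbor 192.0.2.12 address-family ipv4-unicast soft-reconfiguration inbound
- set protocols bgp 65551 neighbor 192.0.2.12 remote-as '65550'
- set protocols bgp 65551 neighbor 192.0.2.12 update-source '192.0.2.22'
- set protocols bgp 65551 parameters router-id '192.0.2.22'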