Intro: what is VyOS?
Short version: a virtual router, running as a virtual machine. Skip to Updating VyOS if you already know what VyOS is.
Long explanation: I use VyOS heavily in my SDDC labs for L3 routing as well as DHCP relay, clustered services and dynamic routing protocols (BGP), peering with both the NSX-V and NSX-T edges as well as across sites.
I believe it’s a great virtual router solution; I’ve been using it in production environments since 2010, back when it was Vyatta Core (VC). The company was acquired by Brocade in 2012 and subsequently sold to AT&T (more here). Here is a summary of what Vyatta was designed for:
The Vyatta system is intended as a replacement for Cisco IOS 1800 through ASR 1000 series Integrated Services Routers (ISR) and ASA 5500 security appliances, with a strong emphasis on the cost and flexibility inherent in an open source, Linux-based system running on commodity x86 hardware or in VMware ESXi, Microsoft Hyper-V, Citrix XenServer, Open Source Xen and KVM virtual environments.
https://en.wikipedia.org/wiki/Vyatta
I wrote about using Vyatta Core (VC) even in my NSX for Newbies series back in 2014 – Part 2
Now on to VyOS. The project was started in late 2013 as a community fork of the GPL portions of Vyatta Core 6.6R1 with the goal of maintaining a free and open source network operating system in response to the decision to discontinue the community edition of Vyatta. It provides not only traditional routing services but also VPNs, OSPFv2, OSPFv3, BGP, VRRP, DHCP, TFTP, scriptable CLI and extensive route policy mapping and filtering, high availability (clustered services) as well as built-in versioning.
Many companies have gone to market with enterprise-grade solutions based on VyOS. Even Dell EMC sells a few appliances based on VyOS.
Anyway, enough of history and introduction! I will write a separate article about my lab networking design, where I will demonstrate clustering services on VyOS and how much more powerful they are compared to VRRP.
Updating VyOS
Updating VyOS is very easy. Assuming you are happy to run the “latest and greatest” build (because hey, I’m talking about a lab, right?), you simply point the add system image command at vyos-rolling-latest.iso and the system does the rest. If you want a specific build instead, point the same command at that ISO (full list here: https://downloads.vyos.io/?dir=rolling/current/amd64).
Towards the end of Dec 2020 I was having trouble debugging BGP messages on VyOS 1.3-rolling-202003310117 (from March 2020). I reached out to the VyOS developers, who confirmed that many bugs had been fixed since March (fair enough), so I decided to update to the latest build on 1st of Jan 2021.
add system image https://downloads.vyos.io/rolling/current/amd64/vyos-rolling-latest.iso
Give it a reboot and you’re done. The system keeps every installed image, so you can boot back into the old one with the same configuration. That makes upgrading very safe: should things break on the new version, you can easily roll back to the old one.
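As a rough sketch of that rollback, assuming the VyOS 1.3 operational-mode image commands and using my previous build name (1.3-rolling-202003310117) as the image to return to, it would look something like this. Check the output of show system image for the exact image names on your own system before setting the default.

```
# Operational mode (not configure mode)
# List installed images and see which one boots by default
show system image

# Make the older image the default for the next boot
# (image name is an example; use one listed by the command above)
set system image default-boot 1.3-rolling-202003310117

# Boot into it
reboot
```

The upgraded image stays installed, so switching back to the newer build later is the same procedure with the other image name.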
The good news
Using the build 1.3 from 1st of Jan 2021 I was finally able to debug BGP protocol messages and logs using the following commands (that were not working before):
monitor protocol bgp enable keepalives
monitor protocol bgp enable neighbor-events
monitor protocol bgp enable updates
monitor protocol bgp
The bad news
The new version broke my BGP route advertisement! As I mentioned earlier, VyOS serves as my L3 core and BGP router, and I very often destroy and rebuild nested SDDC environments. As of VCF 4.0, during the installation process (called bring-up), Cloud Builder checks and validates the BGP peering and route distribution configuration.
The VyOS 1.3 release train hasn’t GA’ed yet, which means that with every rolling release you upgrade to you’re sort of in the dark as to what was fixed, what’s new and what’s still broken; there’s no such thing as release notes for rolling releases. You can only check the GitHub commits and fixes but, let’s be honest, those are difficult to decipher. So you just have to take a punt and test it out.
That said, happy to have sorted out my BGP debugging, I went ahead and kicked off a new SDDC bring-up. All went well… until it broke during the “Verify BGP Route Distribution” task; from vcf-bringup-debug.log I was getting
FAILED_TO_VALIDATE_BGP_ROUTE_DISTRIBUTION_RESULT Failed to validate the BGP Route Distribution result for edge node with ID 0a52bae4-b2e3-4631-9af3-8ac1d1da1efb
I couldn’t get my head around what was wrong, knowing my VyOS config had not changed.
But I immediately noticed something on my BGP summary table
Instead of seeing the number of prefixes received and sent for each neighbour, I could only see (Policy). So I failed over to my secondary VyOS node (hot standby), which I had deliberately not upgraded, and here’s the clear difference:
Not knowing what to check, I went back to the VyOS Slack channel, asked the developers and promptly discovered what was going on:
VyOS 1.3 rolling releases built after 23rd Dec 2020 and up until 1st of Jan 2021 shipped an updated FRRouting (FRR) engine, version 7.5, which implements RFC 8212. This RFC makes the BGP implementation stricter by requiring explicit configuration of both BGP import and export policies (via route maps) for any External BGP (EBGP) peer, for all enabled address families. In other words, BGP speakers adhering to RFC 8212 will not advertise or accept routes on an EBGP session unless explicitly configured to do so with an export/import policy.
Well, that was it! It explains why the NSX-T Edge Transport nodes had (Policy) listed there. If you pay attention, neighbour 10.11.8.2 does not show (Policy), because that’s my vRouter at “Site 2”, for which I already had route maps configured.
The VyOS developers realised that this stricter implementation caused problems because there’s no automatic upgrade path; in other words, route filters must be configured manually after migrating to FRR 7.5.
In light of that, and given that VyOS 1.3 is still on rolling releases (aka not GA), they decided to take a step back to make testing easier: any build you pick after 3rd of Jan 2021 ships FRR 7.3, and so no RFC 8212 enforcement.
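If you do land on a build that enforces RFC 8212, the manual fix is to attach an import and an export route map to every EBGP neighbour. Here is a minimal permit-all sketch, assuming the VyOS 1.3 configuration syntax. The local AS number (65001), the neighbour address (192.0.2.2) and the route-map name (ALLOW-ALL) are made-up placeholders, not values from my lab. A permit-all policy is fine for a lab, but you would want real filters anywhere else.

```
configure

# A route map whose only rule permits everything
set policy route-map ALLOW-ALL rule 10 action permit

# Attach it in both directions for the EBGP neighbour
# (repeat for each neighbour and each enabled address family)
set protocols bgp 65001 neighbor 192.0.2.2 address-family ipv4-unicast route-map import ALLOW-ALL
set protocols bgp 65001 neighbor 192.0.2.2 address-family ipv4-unicast route-map export ALLOW-ALL

commit
save
exit

# Back in operational mode: the neighbour should show prefix counts again instead of (Policy)
show ip bgp summary
```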