Catalyst 6509-E VSS Software Upgrade Gone Bad

This post was originally published on my old blog, showbrain.blogspot.com.

My work network has a pair of Cisco Catalyst 6509-E chassis configured as a Virtual Switching System (VSS) that serves as the network core. Last week a supervisor engine crashed, and we were left with some residual craziness in our CAM table. TAC suggested a reboot and a software upgrade, so we scheduled one for Sunday afternoon.

Usually, a software upgrade on the 6509 is relatively painless, but this one was anything but. The previous software load on the VSS pair was 12.2(33)SXI, but it was the modular version (keep this in mind; it’s important). The new software load suggested by TAC was 12.2(33)SXJ1, and as of SXJ the software is offered only in monolithic versions.

Assuming that all was well with these two versions, I started down the path of doing an enhanced Fast Software Upgrade (eFSU) of my VSS pair using the ISSU commands as listed in the Catalyst 6500 Release 12.2SX Software Configuration Guide – Virtual Switching Systems (VSS) on Cisco’s website. After issuing issu loadversion disk0:s72033-ipservicesk9_wan-mz.122-33.SXJ1.bin on the active console, I waited for the standby chassis to reload. Unfortunately, it entered a reboot loop because the new monolithic software was not ISSU-compatible with the modular software it was replacing.

Here is where it got hairy. At this point, I could neither abort nor complete the upgrade on the active supervisor. The switch wouldn’t let me change the boot system variable because some persistent state recorded that an ISSU upgrade was in progress, and that state survived even a power cycle of the chassis.

After 5 hours on the phone with TAC, we were able to clear that persistent state and finish the upgrade, but it was a very long downtime. The moral of the story: modular and monolithic IOS don’t mix well.
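
In hindsight, a quick sanity check on the image names before starting the eFSU might have flagged the mismatch. The following is only a rough sketch, not anything TAC or Cisco provided. It assumes the usual Catalyst 6500 naming convention in which modular IOS images carry a "-vz" tag and monolithic images carry "-mz". The function names are made up for illustration, and the modular filename in the usage lines is hypothetical, since I didn't record the exact name of the old image.

```python
import re

# Assumption: Catalyst 6500 image names use "-vz" for modular IOS
# and "-mz" for monolithic IOS (e.g. ..._wan-mz.122-33.SXJ1.bin).
_IMAGE_TAG = re.compile(r"-(vz|mz)\.")


def image_kind(filename: str) -> str:
    """Classify an image filename as 'modular', 'monolithic', or 'unknown'."""
    match = _IMAGE_TAG.search(filename)
    if match is None:
        return "unknown"
    return "modular" if match.group(1) == "vz" else "monolithic"


def efsu_looks_safe(current_image: str, target_image: str) -> bool:
    """Return True only when both images are recognized and of the same kind.

    This is a naming heuristic only; it does not replace checking the
    ISSU compatibility information published for the two releases.
    """
    current = image_kind(current_image)
    target = image_kind(target_image)
    return current != "unknown" and current == target


if __name__ == "__main__":
    # Hypothetical name for the old modular image; the target is the
    # file actually used in the upgrade described above.
    old = "disk0:s72033-ipservicesk9_wan-vz.122-33.SXI.bin"
    new = "disk0:s72033-ipservicesk9_wan-mz.122-33.SXJ1.bin"
    print(image_kind(old), image_kind(new), efsu_looks_safe(old, new))
```

A False result here would simply be a prompt to stop and plan a full reload-based upgrade (or ask TAC) instead of attempting an in-service upgrade between the two images.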