This document describes the operation and functions of the OS upgrade tooling that is part of the YCE environment. The tooling is aimed at performing targeted OS upgrades in bulk with minimal operator intervention while still maintaining full control of the process.
The OS upgrades can be performed on a wide variety of routing and switching devices and vendors. No claims can currently be made to support all Cisco devices since the tool was solely tested against the Cisco product families: 800, 1800, 2700, 3500, 4000, 4500, 6500 and 7200. However, it is assumed that only very minor adaptations are required should wider device support be required.
The following vendors have been successfully tested:
|Vendor||product family / models|
|Cisco||800, 1800, 2700, 3500, 4000, 4500, 6500 and 7200 IOS, XE, Nexus|
|HP||A-series / Comware 5 / Comware 7|
|Huawei||S5700, CE6850, AR169FGW-L|
To minimize operator intervention, two core functions were incorporated that are unique to the YCE solution. First, there is the method of assigning the proper set of credentials to each device to be upgraded. Second is the ability to predefine the targeted OS version and feature set of any device based on its hardware and its ‘role’ in the network.
Upgrading the OS image of a device is a multi-stage process that starts with an automated inventory. Its results include, where available, a recommended OS file for the device. The recommendation or the operator's choice of the OS file(s) then leads to the selection of the upgrade method: all-in-one-go, or the step-by-step method involving cleanup and upload, activation and reload, and finishing with further cleanup.
To be able to recommend an OS version for the upgrade, the tool uses information that the network engineers or operators entered in the OS versions table. Also, the OS files needed should be acquired and copied to the YCE tftp os-environment by the operators.
The OS selection information resides in the Os_images table of the YCE database and is composed of the various OS files, versions and feature sets, the device's hardware name and a ‘Domain’. These domains also play a role in linking the proper credentials to a device. Suffice it to say for now that each node is assigned a domain, and that these domains are also keyed to the target OS version.
Make sure to define the Storage_device field in the Model details form for the Vendor type in question so the image will get uploaded to the right path on the node.
After completion of the inventory phase of the OS upgrade, both the device's hardware name and its domain are known allowing the automatic selection of the OS file. Should the same device type be used in different roles within a domain, the node-type (or role) in the domain could provide fine-grained selection criteria.
Operators can always override the automatic selection within the OS files corresponding to the device's hardware name.
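The selection described above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the `OsImage` record layout and the rule that an entry naming the node type wins over a generic entry are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OsImage:
    # Hypothetical, simplified view of one OS version entry.
    hardware: str             # hardware name as lifted from the device
    domain: str               # domain the entry is keyed to
    node_type: Optional[str]  # optional role; None means "any role"
    os_file: str


def recommend_os(images, hardware, domain, node_type=None):
    """Return the recommended OS file, or None if nothing matches."""
    candidates = [i for i in images
                  if i.hardware == hardware and i.domain == domain]
    # Assumption: an entry that names the node type beats a generic one.
    for image in candidates:
        if node_type is not None and image.node_type == node_type:
            return image.os_file
    for image in candidates:
        if image.node_type is None:
            return image.os_file
    return None


def override_choices(images, hardware):
    """OS files an operator may pick manually: all entries for this hardware."""
    return sorted({i.os_file for i in images if i.hardware == hardware})
```

The operator override is modeled as a separate function because it ignores the domain and node type and filters on the hardware name only.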
All referenced OS files are assumed to be available on the YCE server(s). The root directory of the YCE TFTP server is /var/opt/shared/public. The OS images must be located in the 'os' directory of the TFTP server (/var/opt/shared/public/os). The protocol used to transfer a file is determined by the File transfer protocol selected in the model details of the hardware model the node(s) in question belong to:
The OS images will be uploaded by the device using a command like:
copy tftp://172.29.34.16/os/c7200-js-mz.124-3.bin disk0:/c7200-js-mz.124-3.bin
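As an illustration of how such a command could be composed from the server address, the OS file name and the Storage_device value, consider the sketch below. The function names are hypothetical, only the TFTP case is covered, and the exact command syntax differs per vendor.

```python
def build_copy_command(server_ip: str, os_file: str, storage_device: str) -> str:
    """Compose an IOS-style 'copy tftp:' command for one OS file."""
    # Images live in the 'os' directory under the TFTP root.
    source = f"tftp://{server_ip}/os/{os_file}"
    # Accept "disk0", "disk0:" and "disk0:/" as the same storage device.
    target = f"{storage_device.rstrip(':/')}:/{os_file}"
    return f"copy {source} {target}"


def build_copy_commands(server_ip, os_files, storage_device):
    """One command per file, in the order the files are listed."""
    return [build_copy_command(server_ip, f, storage_device) for f in os_files]
```

For example, `build_copy_command("172.29.34.16", "example.bin", "disk0:")` yields `copy tftp://172.29.34.16/os/example.bin disk0:/example.bin`, where `example.bin` is a placeholder file name.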
All files listed in the Os_files field of the Os_images table will be copied in this manner. Compressed or archived OS files (tar) which should be expanded or extracted on the device are not supported. This limitation was introduced when it was discovered that many undesired html files were copied to the bootflash storage in this manner.
The YCE TFTP server is an in-house developed multithreaded tftp server that is optimized for handling large files between network devices. The tftp server is also fully integrated with the YCE database to generate full configurations directly from the database and on-the-fly when requested by a device.
A TFTP maintenance web tool is part of the YCE tooling environment. It is used to upload and maintain all TFTP directories, including the OS environment.
An important prerequisite for performing any action in bulk is the automatic retrieval of the login and privilege credentials for each device. These credentials include two username/password sets for the login (one Tacacs based and one local, tried in succession) and the enable secret.
For the fully YCE managed devices this information is available by default, but not for the remaining devices. These devices can be imported from an external source like the CMDB that includes the required information. This information entails the fully-qualified-domain-name, its Domain and optionally the node-type designating its role in the topology or architecture.
It is this domain name that will be mapped to a YCE ‘Domain’, from which the relevant credentials will be retrieved.
The YCE database supports two types of data: the internal YCE tables that are used for the modeling and population of the network, and the NMS tables that are used for custom data. The YCE tables are restricted to read-only access for any tool other than the YCE Client application. The NMS tables are fully accessible and may be customized. The ‘CmdbNodes’ table resides in this part of the database and can be used to import device details for nodes that may be partially managed by YCE.
Nodes listed in this table can have their OS’s upgraded and their configurations manipulated on a basic level (i.e. the use of ‘Basic Command Jobs’, configuration updates without the use of node-context parameters). Other YCE integrations also access the data in this table for purposes like Maintenance Event Suppression, Seed file generation, CMDB validation reports, Monitoring validation, Inventory reports, etc. Preferably this data is renewed on a regular basis from a CMDB or other source.
The fields that are relevant to the OS upgrade tool are:
|AssetName||The hostname of the device. Mandatory since it is the primary key of the table|
|NodeName||The fully-qualified-domain-name of the device. Mandatory|
|ServiceCode||The ‘service’ the node is a part of. Mandatory, it must match a ‘Domain’|
|NodeType||The role of the device in the design. Optional, may be used in selection rule for OS target file|
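The field rules in the table above lend themselves to a simple pre-import check. The sketch below is an assumption about how such a check could look; the function name and the representation of a row as a dictionary are illustrative, only the field names come from the table.

```python
MANDATORY_FIELDS = ("AssetName", "NodeName", "ServiceCode")


def validate_cmdb_node(row, known_domains):
    """Return a list of problems; an empty list means the row is usable."""
    problems = [f"{field} is missing"
                for field in MANDATORY_FIELDS if not row.get(field)]
    service_code = row.get("ServiceCode")
    # The ServiceCode must match an existing Domain to obtain credentials.
    if service_code and service_code not in known_domains:
        problems.append(f"ServiceCode {service_code!r} does not match a Domain")
    return problems
```

NodeType is deliberately not checked because it is optional.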
The YCE ‘Domains’ are maintained using the YCE Client application. The Domains used by the fully managed nodes are already present and will have the proper values assigned: they were used to create the configurations and should therefore be current. Since the ServiceCodes listed in the CMDB must match a Domain, additional Domains may need to be set up before the credentials are available to the OS upgrade tool.
The fields relevant to the OS upgrade tool are:
|Domain||has to match the ServiceCode of the NMS CmdbNodes table||Y|
|Domain_name||Friendly name of the domain. It is listed in the OS tools domain filter option||Y|
|Enable_secret||The password granting configuration mode privileges||Y|
|Rme_user||Username of standard Tacacs enabled user. This username will be attempted before the local_user. Normally a functional user will be created for automation purposes.||N|
|Rme_passwd||The password used by the Rme_user||Y|
|Local_user||Username of a standard local user. This username will be used if the Rme_user was denied access. The local user will generally be a functional user that will be inactive while the Tacacs service is operational.||Y|
|Local_passwd||The password of the Local_user||Y|
Sample of Domain table entries to cover CMDB ServiceCodes:
These Domains need to be entered only once, but their password entries must be kept current.
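The way a node's ServiceCode leads to its credentials, and the order in which the two logins are attempted, can be sketched as below. The `Domain` dataclass and the `login_attempts` function are hypothetical; only the field names and the Rme-before-local order are taken from the table above.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    # Field names mirror the Domain table fields listed above.
    domain: str
    enable_secret: str
    rme_user: str
    rme_passwd: str
    local_user: str
    local_passwd: str


def login_attempts(domains, service_code):
    """Return ([(user, password), ...] in the order tried, enable secret).

    `domains` maps a Domain key to its record; the ServiceCode of a
    CmdbNodes entry must match one of those keys.
    """
    domain = domains.get(service_code)
    if domain is None:
        raise LookupError(f"no Domain matches ServiceCode {service_code!r}")
    attempts = [
        (domain.rme_user, domain.rme_passwd),      # Tacacs user first
        (domain.local_user, domain.local_passwd),  # local user as fallback
    ]
    return attempts, domain.enable_secret
```

A missing Domain raises an error here; in the tool this situation corresponds to the status "Node credentials not available".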
Once the nodes are selected, the OS upgrade view is shown listing the current status and options for each device. This overview lists the OS upgrade status of each device showing the target OS selection and the upgrade phase. The information on which the overview is based is maintained in the database table ‘Ios_upgrades’.
The overview lists the hostname and the ‘Node type’ of each selected device. The node type is copied from the YCE database or ‘CmdbNodes’ table and could be used to select the target OS. The ‘Node model’ might initially show the value that YCE uses to model the hardware, but will be modified once the true Cisco hardware name is lifted from the device during the Inventory.
The Inventory phase will set the recommended target OS file based on the node model, the node type and the domain as defined in the OS versions. The operator can select a different OS should various versions be available. Before the inventory or if no OS version matches the node model, only ‘-none’ will be available. The OS versions can at any time be modified or appended. The ‘Update/Refresh’ button will reevaluate the target OS selection.
The column ‘Upgrade Status’ shows the current upgrade phase along with the descriptive text. About 80 different phases are distinguished. An overview is listed in a pop-up window when the column header is clicked (see par ). The background color represents the nature of the current status: green for ok, red for fail.
The “Sched” (for ‘schedule’) checkbox must be checked to perform an action on that node. By default all devices have their ‘Sched’ checkbox deselected resulting in no actions. Nodes that do not have the active or planned status (YCE devices only) do not show a checkbox. The ‘Select all’ button ticks (or un-ticks) the checkbox of all devices in the window.
The action is defined by the radio button in the subsequent columns. The available radio options differ per phase and scenario. They represent the following actions:
|Inv||Inventory||Perform hardware and OS inventory of the device|
|Psh||Push||Place the target OS file(s) on the device storage|
|Act||Activate||Activate the target OS and reload the device, then check its status|
|Fin||Finish||Cleanup the obsolete OS file(s) from the device storage|
|P/A||Push/Activate||Push/Activate/Finish all in one go: Storage shortage may force the removal of the old OS before uploading the target OS.|
|Del||Delete||Delete status and inventory information of the device from the ‘Ios_upgrades’ table.|
Each of these actions is detailed in the sections below.
The final (rightmost) item in the table is a link named ‘details’. It will bring up a pop-up window with a table of all variables maintained by the OS upgrade tool for the device.
Inventory
The Inventory action precedes any other IOS action but is usually required just once. During the inventory an interactive session (using telnet, see limitations) is established with the device and relevant information about the hardware, the running IOS and the files on the storage device(s) is retrieved. This includes information on node model, memory, boot device, bootloader, storage size, storage usage, modules and supervisor. The inventory requires a full set of credentials (two sets of user/passwd + enable), even if the device setup requires only part of the credentials.
For a new device the radio button for the inventory (‘Inv’) will already be selected while any other choices are absent. Since the inventory must precede any other action, no other actions are made available. This behavior is common to the OS upgrade tool: the phases or steps dictate the choices made available to the user. The recommended next step is highlighted using a green background and is usually already selected.
Upon completion of the inventory the tool attempts to select the target OS based on hardware model, domain and node-type. The requirements for the selection are fully controlled by the operators by defining the ‘OS versions’ using these parameters.
If a target OS can be selected, the details of the OS version are copied to the OS upgrade details of the device (using the table ‘Ios_upgrades’). When no target OS could be selected, a warning in the status will be displayed and no further actions will be available. Adding or modifying the OS versions and clicking the ‘Update/Refresh’ button will re-evaluate the OS selection.
If the target OS details were changed in the OS versions by an operator, these updates will not be used by the upgrade of devices that still use copies of the older definition. Changing selection to a different OS version and back might update the data, but deleting the device data and performing a new inventory is also used often to clear its status.
An inventory can be repeated as often as an operator desires. The inventory will update the information lifted from the device, but never the information on the target OS.
Assigning the target OS includes analyzing the OS files available on the device storage and marking the undesired ones for deletion. Then, based on the resulting free space and the possible presence of the target OS file on the device, the upgrade phase is re-determined or warnings are issued. The recommended action may then be indicated (the green highlight) while its radio button stays deselected. This forces the operator to take notice and resolve the issue.
The target OS evaluation takes place automatically after an inventory. Should an operator select the target OS manually then the evaluation must be requested by the operator using the ‘Update/Refresh’ button.
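The evaluation can be pictured with the following sketch. It is an assumption about the logic rather than the tool's code: the function signature is invented, `files` is taken to hold only the deletable OS and html files with their sizes, and the thresholds simply follow the rule that the ‘Strict’ scenario applies when the current and target OS cannot be stored side by side. The returned numbers are the phase codes 14, 17, 20 and 60 used by the tool.

```python
def evaluate_upgrade(files, current_os, target_os, target_size, storage_size):
    """Return (phase, files_marked_for_deletion).

    files: {file name: size in bytes} as found on the device storage.
    """
    if target_os is None:
        return 14, []                      # No IOS selected
    keep = {current_os, target_os}
    obsolete = sorted(name for name in files if name not in keep)
    used_after_cleanup = sum(size for name, size in files.items()
                             if name in keep)
    free = storage_size - used_after_cleanup
    # Nothing needs to be uploaded if the target is already on the device.
    needed = 0 if target_os in files else target_size
    if needed <= free:
        return 20, obsolete                # Normal upgrade scenario
    # Strict: the current OS has to be removed before the target fits.
    if needed <= free + files.get(current_os, 0):
        return 60, obsolete                # Strict upgrade scenario
    return 17, obsolete                    # Device space insufficient
```

Filesystem overhead and multi-file OS images are ignored in this sketch.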
Actions can be scheduled for overnight execution or immediately. While an action is scheduled or running no other actions are allowed. The scheduled action is indicated using a blue highlight and an ‘X’ in the place of the radio button.
Scheduled OS actions that are canceled will still show their scheduled status in the OS overview window. These incorrect statuses cannot be corrected without deleting the OS upgrade record.
To delete a record, select the device checkbox, select the ‘Del’ radio button and click the ‘Start’ button. The deletion takes place immediately, ignoring the schedule settings. Nodes that are checked at the same time for regular actions are scheduled as usual.
OS Upgrade Scenarios
Two OS upgrade scenarios are available to the operator. The ‘Normal’ scenario performs the upgrade in three steps following the inventory: Push, Activate and Finish. Each step must be initiated and scheduled by an operator, allowing full control over the process.
The alternate scenario, ‘Strict’, is intended to deal with devices where the available storage space is insufficient to hold the current and target OS files at the same time.
Normal
Psh: OS files other than the current and target get deleted, as do any html files. The free disk space is verified and the target OS file(s) are copied using tftp.
Act: The target OS file is configured to become the boot image. Any pre-activation commands the operator defined are executed and the configuration is saved. Then the reload follows. The tool revisits the node once it is back online and verifies that the running OS matches the target OS version.
Fin: Deletes the now obsolete OS file from the storage device. Up to this moment the operator can perform a rollback by selecting the older OS file and activating it.
Every step includes the updating of the inventory information regarding files, memory and storage space from which it re-validates the upgrade criteria.
Strict
The ‘Strict’ scenario is intended to deal with devices where the available storage space is insufficient to hold the current and target OS files at the same time. In these cases the three steps are rolled into one. The scenario starts by deleting the current and all other unneeded OS files (the ‘Finish’ step) before uploading the target OS (the Push) and the reload (the Activation).
These actions are performed as a single sequence without any operator intervention. Since this scenario is much less reliable it is not recommended unless truly required. To signal the use of the strict scenario, a yellow highlight is used.
Conditional actions
The various actions have inter-dependencies that rule their availability and the highlight color. The table below lists, for the different phases, the presentation of the corresponding options. An ‘x’ represents an activated radio button, an ‘o’ a deactivated radio button.
A green or yellow highlight indicates which action is the recommended one. In case of an error situation the action must be repeated but requires explicit activation of the radio button by the operator.
|Ready for inventory||10||||(x)||(o)|
|Ready for Push||20||||(o)||(x)||(o)||(o)|
|Ready for activation||30||||(o)||(x)||(o)|
|Reload in progress||40||||X||(o)|
|Ready for finishing||50||||(o)||(x)||(o)|
|Ready for strict upgrade||60||||(o)||(x)||(o)|
|Strict upgrade scheduled||61||||X||(o)|
|Strict push failed||62-69||||(o)||(o)||(o)|
|Reload in progress||70||||X||(o)|
|IOS upgrade completed||80||||(o)||(o)|
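The relation between phase and recommended action in the table above can be expressed as a small lookup. The sketch below is illustrative only: the names are hypothetical, the recommended actions follow the phase descriptions (for example, phase 20 "Ready for Push" recommends ‘Psh’), and the set of locked phases combines the rows marked ‘X’ above with the "pending (scheduled)" codes 11, 21, 31 and 51 from the status legend.

```python
# Phase code -> action that is highlighted as the recommended next step.
RECOMMENDED_ACTION = {
    10: "Inv",  # Ready for inventory
    20: "Psh",  # Ready for Push
    30: "Act",  # Ready for activation
    50: "Fin",  # Ready for finishing
    60: "P/A",  # Ready for strict upgrade
}

# Phases in which an action is scheduled or running (shown as 'X').
LOCKED_PHASES = {11, 21, 31, 40, 51, 61, 70}


def recommended_action(phase):
    """Return the recommended action for a phase, or None if there is none."""
    return RECOMMENDED_ACTION.get(phase)


def is_locked(phase):
    """True while an action is scheduled or running for the device."""
    return phase in LOCKED_PHASES
```

Failure phases such as 62-69 return no recommendation here, which matches the rule that a failed action has to be re-selected explicitly by the operator.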
The table below lists all the OS upgrade phases and status information. This table is available in a pop-up window when the upgrade status link is clicked.
|Data collection||0||Initial node selection|
|6||Peregrine information failed|
|7||CRC information failed|
|8||Previous inventory information failed|
|9||Node credentials not available|
|10||Node credentials available|
|Schedule||10||11||Inventory pending (scheduled)|
|Access||12||Access failed: No ping|
|12||Access failed: No login|
|12||Access failed: Incorrect node|
|12||Access failed: No enable|
|Inventory||13||Inventory: Version failed|
|13||Inventory: Bootvar failed|
|13||Inventory: Bootdev failed|
|13||Inventory: Module failed|
|Selection||14||No IOS selected|
|15||IOS files not available on server|
|17||Device space insufficient|
|20||Normal upgrade scenario|
|60||Strict upgrade scenario|
|Schedule||20||21||Push pending (scheduled)|
|Access||22||Access failed: No ping|
|22||Access failed: No login|
|22||Access failed: Incorrect node|
|22||Access failed: No enable|
|Verify inventory||23||Inventory outdated|
|Delete file||24||Freeup failed|
|25||Disk space insufficient|
|Push new||26||IOS push failed|
|Verify new||26||IOS verification failed|
|Schedule||30||31||Activation pending (scheduled)|
|Access||32||Access failed: No ping|
|32||Access failed: No login|
|32||Access failed: Incorrect node|
|32||Access failed: No enable|
|Verify IOS||33||IOS status outdated|
|Activate||34||Activation IOS failed|
|Pre-commands||35||Pre-activation commands failed|
|Reload||40||Reload command failed|
|Wait||40||Reload in progress|
|Access||40||41||Activation failed: No ping|
|41||Activation failed: No login|
|41||Activation failed: Incorrect node|
|41||Activation failed: No enable|
|Verify active ios||42||IOS activation failed|
|Post commands||43||Post-activation commands failed|
|50||IOS activation completed|
|Schedule||50||51||Finishing pending (scheduled)|
|Access||52||Access failed: No ping|
|52||Access failed: No login|
|52||Access failed: Incorrect node|
|52||Access failed: No enable|
|Verify active ios||53||Incorrect active IOS|
|Delete previous||55||Old IOS delete failed|
|80||IOS migration completed|
|Schedule||60||61||Psh/Act pending (scheduled)|
|Access||62||Access failed: No ping|
|62||Access failed: No login|
|62||Access failed: Incorrect node|
|62||Access failed: No enable|
|Verify inventory||63||Inventory outdated|
|Delete files||64||Freeup failed|
|65||Disk space insufficient|
|Delete old||66||Active IOS delete failed|
|Push new||67||IOS push failed|
|Verify new||67||IOS verification failed|
|Activate||68||Activation IOS failed|
|Pre-commands||69||Pre-activation commands failed|
|Reload||70||Reload command failed|
|Wait||70||Reload in progress|
|Access||70||71||Activation failed: No ping|
|71||Activation failed: No login|
|71||Activation failed: Incorrect node|
|71||Activation failed: No enable|
|Verify active ios||72||IOS activation failed|
|Post commands||73||Post-activation commands failed|
|80||IOS activation completed|
Sample of the pop-up window with the status legend:
For each device the OS parameters collected during the inventory, its credentials and target OS information are stored in the YCE database (the ‘Ios_upgrades’ table). The ‘details’ link for each listed device in the OS overview window activates a pop-up window with the current information for the device.
Several known limitations are the result of design choices:
Some minor hardware dependencies or special commands might be required. Fixes will be made available at short notice should feedback be provided.
File transfer using SCP or SFTP is currently under consideration by the development team. Customer requirements could prioritize its development.
Scheduled OS actions that are canceled will still show their scheduled status in the OS overview window. These incorrect statuses cannot be corrected without deleting the OS upgrade record.
When placing a target OS file on the device storage media, the medium selected will be the same where the current OS file was located. Changing the (boot) storage medium as part of the OS upgrade is currently not supported.
All actions are executed using scheduled jobs initiated by the OS upgrades tool. During execution, the scheduler finds in the job file one of the five executables that corresponds to the action indicated by the operator. The action executables only require the Ios_upgrades and Ios_version tables for their information. These records were prepared by the OS upgrades tool using the Node sources from YCE and the CmdbNodes table. Credentials for each device are lifted from the Domain table, for which each device needs a reference. OS files are copied to the devices using TFTP. The tftp files are maintained by the operator using a separate web tool.
The OS upgrade tool can be accessed under Operate: click the “Tools” menu option and then choose “OS upgrades” on the horizontal Tools menu. This tool requires an access level of “engineer” or higher.
The initial form of the OS upgrades tool allows for the selection of device names based on the Domain name or by direct entry. When entering multiple devices, separate them using spaces, newlines, commas, semicolons or tabs. Please note that only the shorter hostname (assetname) will be accepted, not the fully qualified domain name. Nodes that are not present in either the YCE database or the CmdbNodes table will not be accepted.
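The direct-entry rules can be illustrated with a short parser. This is a sketch under assumptions: the function name is hypothetical, a name containing a dot is treated as a fully qualified domain name, and the lookup against the YCE database and CmdbNodes table is left out.

```python
import re


def parse_device_names(text):
    """Split free-form input into (accepted hostnames, rejected names)."""
    # Separators: spaces, tabs, newlines, commas and semicolons.
    names = [n for n in re.split(r"[ \t\r\n,;]+", text) if n]
    accepted, rejected = [], []
    for name in names:
        # Only the short hostname (assetname) is accepted, not an FQDN.
        (rejected if "." in name else accepted).append(name)
    return accepted, rejected
```

For the input `"sw01, sw02;sw03\tsw04\nrtr.example.net"` the four short names are accepted and `rtr.example.net` is rejected.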
To select nodes based on the domain name, click the “Filter domain” button; the nodes in the selected domain will then be shown. To select multiple nodes, hold down the ctrl key while selecting.
To add this selection to the selected nodes click the “»” button.
In the following example we will walk through the different phases of the OS upgrade process.
First we need to run an inventory on the device to discover the version of the running OS:
Notice the enabled “Now” option. Without this option enabled the job would be scheduled to run the next day at 05:00. To schedule the inventory job, click the “Start” button. You'll be redirected to a page with the job number and the date and time it is scheduled for.
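The effect of the “Now” option on the start time can be sketched as follows. The function is hypothetical and only models the default described above (next day at 05:00); other schedule settings the tool may offer are not considered.

```python
from datetime import datetime, time, timedelta


def scheduled_start(submitted: datetime, run_now: bool) -> datetime:
    """Return the moment a job submitted at `submitted` will start."""
    if run_now:
        return submitted
    # Default: the next calendar day at 05:00.
    next_day = submitted.date() + timedelta(days=1)
    return datetime.combine(next_day, time(5, 0))
```

A job submitted on 1 March at 14:30 without “Now” would thus start on 2 March at 05:00.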
Click the “Back” button to return to the status overview page. As you can see, the Upgrade Status has changed to “14 - No IOS selected”.
To see which OS version is running, click the “details” link on the right. The Current_ios field tells us this node is running v5.3.2.007.
We select a newer OS version, in this case V5.3.3.011s. Again note the enabled “Now” option and click the “Start” button. The push job transfers the image(s) in question to the node using the transfer protocol defined in the Model details of the vendor type of the node in question.
The OS update is scheduled. Click the “Back” button to return to the status overview page.
Now that the image file has been transferred to the switch, the Upgrade Status has changed to “30 - IOS Updated” and the “Act” action has already been selected. We can activate the new image file by first enabling the “Now” option and then clicking the “Start” button.
The OS activation has been scheduled. Click the “Back” button to return to the status overview page.
The new image file has been activated on the switch and the Upgrade Status has changed to “Reload in progress”. Click the “Update/Refresh” button to refresh the status.
The reboot has taken place and the new OS version has been successfully verified. The Upgrade Status has changed to “OS activation completed” and the “Fin” action is already selected, so we only need to enable the “Now” option and click the “Start” button to finalize the OS upgrade operation.
The OS Finish has been scheduled. Click the “Back” button to return to the status overview page.
The old OS image on the device is no longer needed and has therefore been deleted. The Upgrade Status has changed to “Freeup completed”.
To view the node's OS details click the “details” link which will bring up the following popup:
To delete the row from the status overview page, select the “Del” action, enable the “Now” option and click the “Start” button.