menu:admin:system:status
menu:admin:system:status [2019/07/16 12:43] – ↷ Page moved from admin:system:status to menu:admin:system:status jbosch | menu:admin:system:status [2024/07/03 12:31] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
+ | {{indexmenu_n> | ||
+ | ===== System status ===== | ||
+ | |||
+ | ==== Introduction ==== | ||
+ | |||
+ | This tool reports on the user-selected YCE server, covering its setup, the various daemons and processes, and some additional details. It also allows users to stop and start the YCE daemons. | ||
+ | |||
+ | The tool is accessible for any user with the global ‘Manager’ permissions and is located in the Admin menu by the name " | ||
+ | |||
+ | The report is divided into six sections: | ||
+ | * YCE Server setup | ||
+ | * YCE processes | ||
+ | * Database status | ||
+ | * Filesystems | ||
+ | * YCE process list | ||
+ | * YCE usage | ||
+ | |||
+ | Each of these sections lists some relevant details on the YCE server selected. | ||
+ | |||
+ | ==== Operation ==== | ||
+ | When the tool is initially started, a header list is shown with the various YCE servers along with their details. The server name forms a button to select the server to report on. The default server selected is always the current (front-end) server the user is working on. To highlight the currently selected server, the background color of its details is a lighter blue. | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | Following the header is a line with connection details to illustrate the connection status to the selected server. The connection uses the ‘Yce exchange service’ that is normally active on all YCE servers. | ||
+ | |||
+ | When the connection succeeds, the connection line is followed by the report, each section preceded by a bold header. Subsections are preceded by a header in blue text. | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | Each section is described in the following paragraphs. | ||
+ | |||
+ | === Connection details === | ||
+ | The line with connection details informs the user of the status of the request. All requests are issued over the YCE exchange interface, an XML-based synchronous request-response system between all YCE servers. | ||
+ | |||
+ | The process serving this interface (yce_xch) is therefore a prerequisite for both the system report and its actions. The connection details line informs the user of the availability and connectivity of the yce_xch service. | ||
+ | |||
+ | When the service or server cannot be reached, the connection line shows this status: | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | When performing actions (using the buttons embedded in the report), the action is executed using the same method of connecting, executing and reporting. When done, the system status report is requested automatically and appended. | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | ==== YCE Server setup ==== | ||
+ | The first section in the report has two subsections: | ||
+ | |||
+ | === YCE overview === | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | The server overview should correspond to the header of the page. In fact, it will match exactly for the local server (the front-end server the user is using) since they are both taken from the same source: the yce configuration file for the server ''/ | ||
+ | |||
+ | This file is created by the setup tool during system configuration ''/ | ||
+ | |||
+ | It is essential, however, that all servers have the same ‘view’ of the YCE environment. If the page header shows a different setup than the report, the configuration setup should be corrected for the erroneous server, or preferably for all servers. | ||
+ | |||
+ | Keep in mind that the page header is taken from the local server configuration file. | ||
+ | |||
+ | == This server == | ||
+ | This subsection shows the hostname, short name and ip-address of the server as retrieved using the '' | ||
+ | |||
+ | One line describes the (Linux Red Hat) OS. The output of the command | ||
+ | '' | ||
+ | |||
+ | The final line in this section is the output from the '' | ||
+ | |||
+ | ==== YCE processes ==== | ||
+ | This section will probably be the most consulted one, since it validates the running YCE daemons. The report uses a subsection per daemon to show its status, its matching pid-file (for locking purposes) and the number of children included with the daemon. Each line is preceded by a validation remark: '' | ||
+ | |||
+ | In addition to the status lines, buttons can be shown to manipulate its operation. | ||
+ | |||
+ | {{menu: | ||
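The pid-file validation described above can be illustrated with a small sketch. It is not the code the System status tool uses; it only shows the general technique of comparing a pid-file against the process table on a POSIX system. The function names and status strings are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional


def pid_from_file(pid_file: Path) -> Optional[int]:
    """Return the PID stored in pid_file, or None if it is missing or malformed."""
    try:
        return int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def daemon_status(pid_file: Path) -> str:
    """Classify a daemon by its pid-file (status strings are illustrative)."""
    pid = pid_from_file(pid_file)
    if pid is None:
        return "no valid pid-file"
    return "running" if is_running(pid) else "stale pid-file"
```

A pid-file that names a process which no longer exists is reported as stale, which is the situation a validation remark in the report would warn about.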
+ | |||
+ | === YCE daemon configuration === | ||
+ | Before elaborating on the action these buttons represent, some background information is required on the configuration of the YCE daemons. | ||
+ | |||
+ | When the configuration setup script is executed ''/ | ||
+ | |||
+ | When completed, configuration files for all servers are generated. Amongst these configuration files some are named ''< | ||
+ | |||
+ | It is this configuration file that is being used to determine the YCE daemon statuses. | ||
+ | |||
+ | <file - hostname_psmon.conf file> | ||
+ | # YCE psmon configuration | ||
+ | # | ||
+ | # | ||
+ | # filename: ' | ||
+ | # PSmon config for ' | ||
+ | # | ||
+ | # | ||
+ | # YCE Server overview: | ||
+ | # Name | ||
+ | # ---- | ||
+ | # kunoichi | ||
+ | # shinobi | ||
+ | |||
+ | # | ||
+ | # File created by ' | ||
+ | # | ||
+ | # | ||
+ | <Process mysqld> | ||
+ | disabled | ||
+ | ignoreflag | ||
+ | spawncmd | ||
+ | killcmd | ||
+ | pidfile | ||
+ | instances | ||
+ | pctcpu | ||
+ | noemail | ||
+ | </ | ||
+ | <Process httpd> | ||
+ | disabled | ||
+ | ignoreflag | ||
+ | spawncmd | ||
+ | killcmd | ||
+ | pidfile | ||
+ | instances | ||
+ | pctcpu | ||
+ | noemail | ||
+ | </ | ||
+ | <Process yce_skulker.pl> | ||
+ | disabled | ||
+ | ignoreflag | ||
+ | spawncmd | ||
+ | killcmd | ||
+ | pidfile | ||
+ | instances | ||
+ | pctcpu | ||
+ | noemail | ||
+ | </ | ||
+ | <Process yce_sched.pl> | ||
+ | disabled | ||
+ | ignoreflag | ||
+ | spawncmd | ||
+ | killcmd | ||
+ | instances | ||
+ | pctcpu | ||
+ | noemail | ||
+ | </ | ||
+ | <Process yce_tftpd.pl> | ||
+ | disabled | ||
+ | ignoreflag | ||
+ | spawncmd | ||
+ | killcmd | ||
+ | instances | ||
+ | pctcpu | ||
+ | noemail | ||
+ | </ | ||
+ | <Process yce_xch.pl> | ||
+ | disabled | ||
+ | ignoreflag | ||
+ | spawncmd | ||
+ | killcmd | ||
+ | instances | ||
+ | pctcpu | ||
+ | noemail | ||
+ | </ | ||
+ | <Process yce_ibd.pl> | ||
+ | disabled | ||
+ | ignoreflag | ||
+ | spawncmd | ||
+ | killcmd | ||
+ | instances | ||
+ | pctcpu | ||
+ | noemail | ||
+ | </ | ||
+ | |||
+ | Frequency | ||
+ | Disabled | ||
+ | AdminEmail | ||
+ | </ | ||
+ | |||
+ | The syntax of the file is straightforward: XML-style process definitions, each with several attribute/value pairs. YCE processes not required for the server role have the '' | ||
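To make the block structure concrete, here is a minimal parsing sketch. It assumes each definition is closed by a `</Process>` tag and that each attribute and its value are separated by whitespace; the listing above does not show these details in full. The sample daemon name and values are hypothetical and are not taken from a real YCE server.

```python
import re
from typing import Dict

# Assumption: blocks look like <Process name> ... </Process>
BLOCK_RE = re.compile(
    r"<Process\s+(?P<name>[^>]+)>(?P<body>.*?)</Process>", re.DOTALL
)


def parse_process_blocks(text: str) -> Dict[str, Dict[str, str]]:
    """Return {process name: {attribute: value}} for every Process block."""
    processes: Dict[str, Dict[str, str]] = {}
    for match in BLOCK_RE.finditer(text):
        settings: Dict[str, str] = {}
        for line in match.group("body").splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            parts = line.split(None, 1)  # attribute, then the rest as value
            settings[parts[0].lower()] = parts[1].strip() if len(parts) > 1 else ""
        processes[match.group("name").strip()] = settings
    return processes


# Hypothetical sample, for illustration only
SAMPLE = """
<Process example_daemon.pl>
    disabled    False
    instances   1
    noemail     True
</Process>
"""
```

Calling `parse_process_blocks(SAMPLE)` yields one entry keyed by `example_daemon.pl`, with its attributes as a dictionary of strings.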
+ | |||
+ | More on these ignore-flags in a moment. First you will need to understand how this file is used by the service manager: the psmon-daemon. | ||
+ | |||
+ | === Service manager === | ||
+ | At system startup the YCE service manager ''/ | ||
+ | |||
+ | When needed, a process is restarted automatically, or taken down if it misbehaves. Essentially, | ||
+ | |||
+ | To ensure the psmon-daemon is permanently running, it is added to the crontab of ‘root’, which relaunches it every hour. | ||
+ | |||
+ | == Ignore flags == | ||
+ | For maintenance purposes, a process must sometimes be stopped temporarily before it is restarted. To prevent the restart from taking place before the user or maintenance task is ready, the service manager needs to be informed that the process should not be monitored. This is achieved by setting an ignore-flag for the appropriate process. | ||
+ | |||
+ | While this ignore-flag exists, the service manager will not touch this process or its siblings. When the daemon dies, it is not automatically relaunched. The various ignore-flag files are all located in the ''/ | ||
+ | |||
+ | The standard procedure for maintenance on a YCE daemon is therefore: create the ignore-flag file, stop the daemon, perform the maintenance task, remove the ignore-flag. The service manager will then start the daemon automatically within the next 20 seconds. | ||
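The four steps of this procedure can be sketched as a context manager. This is an illustration only: the location and name of the ignore-flag file and the way a daemon is stopped are passed in by the caller, because they are not specified here. `maintenance`, `flag_file` and `stop_daemon` are hypothetical names.

```python
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Iterator


@contextmanager
def maintenance(flag_file: Path, stop_daemon: Callable[[], None]) -> Iterator[None]:
    """Hold an ignore-flag for the duration of a maintenance task."""
    flag_file.touch()                  # 1. create the ignore-flag file
    try:
        stop_daemon()                  # 2. stop the daemon (caller-supplied)
        yield                          # 3. the caller performs the maintenance
    finally:
        try:
            flag_file.unlink()         # 4. remove the ignore-flag
        except FileNotFoundError:
            pass                       # already removed, e.g. via the report button
```

The flag is removed in a `finally` clause, so the service manager resumes monitoring even when the maintenance task raises an error. A caller who wants the daemon to stay down after a failure would need to handle that case differently.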
+ | |||
+ | If a daemon must be restarted without additional maintenance tasks, it suffices to stop the daemon and wait a few seconds for the service manager to relaunch it. | ||
+ | |||
+ | To facilitate these procedures, the YCE processes report includes buttons to Set or Remove the ignore flag per daemon. Once set, the report will list a warning for its presence. | ||
+ | |||
+ | === Notes on process operations === | ||
+ | Some actions provided by the buttons in this section have limitations or repercussions. Those are listed below. | ||
+ | |||
+ | == yce_tftp == | ||
+ | The YCE tftp daemon serves its users on port 69. It requires ‘root’ privileges to be able to bind to these low port-numbers. Therefore, stopping the '' | ||
+ | |||
+ | Using the button '' | ||
+ | |||
+ | To restart the '' | ||
+ | |||
+ | == yce_xch == | ||
+ | The yce_xch daemon is used as a north-bound interface for NMS systems, but also for inter-server tasks of the YCE system itself. One of these is the execution of the system status report and its additional actions. The '' | ||
+ | |||
+ | Setting the '' | ||
+ | |||
+ | ==== Database status ==== | ||
+ | The Database status section has four subsections. | ||
+ | |||
+ | == DSN == | ||
+ | The first lists the current data source name (DSN) as used by the server. It contains, among other things, the IP address of the database server. The DSN is read from the file ''/ | ||
+ | |||
+ | The yce_skulker is tasked with the monitoring of the database availability and synchronization status of the YCE master/ | ||
+ | |||
+ | == Replication status == | ||
+ | The lines in this subsection tell the status of the master/ | ||
+ | |||
+ | The status of the IO state and SQL state are given separately, but both need to be running to get an active replication status. Additional information is listed when failure is detected and can include the offending SQL statement in case of a replication conflict. | ||
+ | |||
+ | {{menu: | ||
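The rule that both states must be running can be expressed in a few lines. The sketch below assumes the status is available as a dictionary with the field names of MySQL's `SHOW SLAVE STATUS` output (`Slave_IO_Running`, `Slave_SQL_Running`, `Last_IO_Error`, `Last_SQL_Error`). Whether the report reads exactly these fields is an assumption, and `replication_summary` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def replication_summary(status: Dict[str, str]) -> Tuple[bool, List[str]]:
    """Return (active, problems); replication is active only if IO and SQL both run."""
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    problems: List[str] = []
    if not io_ok:
        problems.append("IO: " + (status.get("Last_IO_Error") or "not running"))
    if not sql_ok:
        problems.append("SQL: " + (status.get("Last_SQL_Error") or "not running"))
    return io_ok and sql_ok, problems
```

An IO state of `Connecting` counts as not running here, which matches the case where the remote database is unreachable while the SQL state itself reports no error.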
+ | |||
+ | == Database sync status == | ||
+ | The database sync status gives the result of the yce_skulker interpretation of its continuous synchronization tests. It lists the primary and standby database IP addresses and which of these is the current active database for this server. | ||
+ | |||
+ | == License status == | ||
+ | YCE licenses come in two varieties, the package licenses and the activation licenses. The latter are listed here along with their status as monitored by the '' | ||
+ | |||
+ | === Sample database states === | ||
+ | When either database can be up, down, active or inactive, tracing the corrective action can be confusing. The example below clarifies the messages listed when one database is down and the other operational. | ||
+ | |||
+ | In this example, the primary database for the ‘shinobi’ server is brought down (e.g. for backup purposes). This causes ‘shinobi’ to switch to the database on ‘kunoichi’, | ||
+ | |||
+ | Step 0: all’s well | ||
+ | Shinobi' | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | Kunoichi' | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | Step 1: stop ‘shinobi’ database | ||
+ | Set the ignore-flag, | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | Then stop the mysqld database | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | You get an error because the database cannot verify you are a valid user for the system status report. The database is gone and the switch has not yet occurred. | ||
+ | |||
+ | Request the report again from ‘shinobi’. The processes show the missing mysql database process: | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | The database status on ‘shinobi’ shows it runs on ‘kunoichi’ now. | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | Step 2: review ‘kunoichi’ database | ||
+ | Request the report for ‘kunoichi’. It shows a database replication status with errors: | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | The first error reports that the replication was halted: its remote, ‘shinobi’, failed. The problem appears to be in the IO state, since it is ‘connecting’. The detailed message indicates that the connection to the master failed but is being retried. | ||
+ | No error messages are shown for the SQL state, since the problem does not relate to it. If it did, additional messages on the SQL cause would be given. | ||
+ | |||
+ | Step 3: restore ‘shinobi’ database | ||
+ | Remove the ignore-flag for mysqld or start the database directly. | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | If you leave the ignore-flag, | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | Immediately the master/ | ||
+ | Shinobi status: | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | Kunoichi status: | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | After about a minute (or more if a lot of data needs to be synced), the ‘shinobi’ report shows that the current database is once again ‘shinobi’ and the Primary is Active. | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | Note: During the database re-synchronization phase the active licenses may show a warning because license validation occurs only at large intervals. This situation will correct itself and has no repercussions because licenses are never hard enforced. | ||
+ | |||
+ | ==== Filesystems ==== | ||
+ | The size and usage of the various filesystems mounted by the server are listed. It is the output of the command '' | ||
+ | |||
+ | |||
+ | {{menu: | ||
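For comparison, a similar usage figure can be computed per mount point with Python's standard library. This is only a sketch of the idea and not the command the report runs; `usage_percent` is a hypothetical helper, and its result may differ slightly from tools that account for reserved blocks.

```python
import shutil


def usage_percent(path: str) -> float:
    """Return the used share of the filesystem containing path, in percent."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    if usage.total == 0:
        return 0.0  # some pseudo filesystems report a size of zero
    return 100.0 * usage.used / usage.total
```

For example, `usage_percent("/")` gives the fill level of the root filesystem.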
+ | ==== Process list ==== | ||
+ | The YCE process list reports the process table of all YCE related processes. | ||
+ | The top subsection lists all YCE daemon processes and their siblings. The bottom subsection lists all remaining ‘yce’-owned processes as well | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | ==== YCE usage ==== | ||
+ | The YCE usage section shows a snapshot of the most active processes of the ‘yce’ user. The output of the command '' | ||
+ | |||
+ | {{menu: | ||
+ | |||
+ | ==== Wiki updates ==== | ||
+ | |||
+ | The NetYCE Wiki installation consists of two parts: the DokuWiki engine setup for NetYCE wikis and the actual Wiki content. Both can be downloaded from this page and are updated **daily**. Normally, only the Wiki content part needs to be downloaded regularly and installed on your local YCE server(s). | ||
+ | |||
+ | {{: | ||
+ | {{: | ||
+ | |||
+ | These NetYCE wiki installation distribution files can be installed from the NetYCE web-based front-end using the '' | ||
+ | Both parts need to be installed this way. | ||
+ | |||
+ | |||
+ | // **NOTE** \\ | ||
+ | This front-end functionality is not yet available in the current releases. Within a few days the Wiki installation option, the URL configuration and the http-server setup options - required to access the Wiki - will become available in a NetYCE patch update. Alternatively, | ||
+ | // | ||