QuakeWatch Status Processor
(Version 0.33Beta)





Introduction

     The QuakeWatch Status Processor monitors QuakeWatch Server ('QWServer') programs and may be configured to send status reports to any number of email recipients.  Status reports may be sent to a log file, and an output file may be set up to always contain the latest report.  (Making the output file web-accessible can provide easy access to the latest status report.)
     This program establishes "client" connections to each of its monitored QuakeWatch Servers, and will flag a warning when communication with any server is lost.  The connections are also used to retrieve detailed status-report information from each monitored server.

     The QuakeWatch Status Processor ('QWStatusProcessor') distribution is available in archived form as a ".zip" file.  The following files and directories should be present after the archive is unpacked:

QWStatusProcessor.jar - Java archive file containing the classes for the program.
run.bat - Windows batch file for launching the program.
startProc - Unix script for launching the server program.
stopProc - Unix script for terminating the server program.
conf/QWStatusProcConfig.xml - A sample configuration file.
doc/QWStatusProcessor.html - The QuakeWatch Status Processor manual.
QWStatusProcessorSrc.zip - An archive file containing the source code for the program.

     Use of the QuakeWatch Status Processor requires that a Java Virtual Machine (JVM) version 1.4 or higher be available on the host system.  An installable version of the Java Runtime Environment (JRE) containing the latest JVM may be downloaded from Sun.

     The following command may be used to query the version of the program:

java -jar QWStatusProcessor.jar -v

     The configuration file must be customized for each installation.  See the sections that follow for more information.



Configuration File

     The QWStatusProcConfig.xml file (found in the "conf" directory) contains the configuration parameters for the QuakeWatch Status Processor.  See the "Example Configuration File" section for a sample file.  An alternate location and/or name for the configuration file may be specified by supplying "--configFile" as the first command-line parameter and the path/file name for the configuration file as the second command-line parameter to the program.  When any parameter in the configuration file is changed, the program must be restarted for the change to take effect.
     The configuration file is in XML format, with all of the information inside of a "<QWStatusProcConfig>" element.  The main configuration parameters reside inside of the "<Settings>" element, and are described below.  See the "Monitored Server Configuration" and "Recipient Configuration" sections for details on other elements used in the configuration file.

programInstanceName = "name"
     The ID name string for this installation of the program.  This ID name will be shown in emails and reports generated by the program.  If this parameter is not specified then no ID name will be shown.

emailFromName = "name"
     The "real" name that will appear in the "From:" field of email messages sent out by this program.

emailFromAddress = "emailAddress"
     The email address that will appear in the "From:" field of email messages sent out by this program.  For the sending of email messages to succeed, this setting must contain a valid email address.

smtpServerAddress = "serverAddress"
     The address of the SMTP email server to be used for sending email messages.  For the sending of email messages to succeed, this setting must contain the address of an SMTP email server that will accept send-mail requests from the machine on which the program is running.

maxSendMailRetries = ##
     Sets the maximum number of times the program will attempt to send an alert email message after all previous attempts have failed.

reportOutputLogName = "name"
     The name of the report-output log file.  Status reports may be generated and sent to this file on a periodic basis, with each new report appended to the end of the file.  A date code in the form "_yyyymmdd" will be inserted into the name, and a new file name will be used each day.  If this parameter is not specified then the file name "log/ReportOutput_yyyymmdd.txt" will be used.

repLogOutIntervalMins = ##
     The update interval, in minutes, for the generation and saving of status reports to the report-output log file.  Additional status reports may be generated and saved when "warning" conditions are detected.  This parameter may be specified as a floating point value (such as 1.5 minutes).  Setting the value to 0.0 will result in no status reports being saved to the report-output log file.  If this parameter is not specified then the value will default to 10.0 minutes.

repOutMaxAgeInDays = ##
     The maximum age, in days, for report-output log files generated by the program.  Report-output log files older than this value will be automatically deleted, unless the value is zero, in which case the log files will never be deleted.  If this parameter is not specified then the value will default to 30 days.

repOutCurrentFileName = "name"
     The name of the current-report output file.  Status reports may be generated and sent to this file on a periodic basis, with each new report overwriting any previous data in the file.  If this parameter is not specified or is set to an empty string ("") then no status reports will be saved to the current-report output file.  Making the current-report output file web-accessible can provide easy access to the latest status report.

repCurOutIntervalMins = ##
     The update interval, in minutes, for the generation and saving of status reports to the current-report output file.  Additional status reports may be generated and saved when "warning" conditions are detected.  This parameter may be specified as a floating point value (such as 1.5 minutes).  Setting the value to 0.0 will result in no status reports being saved to the current-report output file.  If this parameter is not specified then the value will default to 1 minute.

repAllGoodDelaySecs = ##
     Specifies the delay (in seconds) before a status-report entry declaring all monitored-server connections as "good" is output.  When the program sees any monitored-server connection as "bad", followed by seeing all connections as "good", a status-report entry is output stating "good connections to all servers restored."  The entry is delayed by the given number of seconds, during which time all monitored-server connections must remain "good" (or else the entry is cancelled).  If this parameter is not specified then the value will default to 60 seconds.

logFileName = "logFileName"
     The name of the log file to be written to by the program.  A date code in the form "_yyyymmdd" will be inserted into the name, and a new log file name will be used each day.  If this parameter is not specified then the log file name will default to "log/QWStatusProc.log".

logFileLevel = "level"
     The message level for log file output.  The allowed values are "NO_MSGS", "Error", "Warning", "Info", "Debug", "Debug2", "Debug3", "Debug4", "Debug5" and "ALL_MSGS".  If this parameter is not specified then the value will default to "Debug".

consoleLevel = "level"
     The message level for console output.  The allowed values are "NO_MSGS", "Error", "Warning", "Info", "Debug", "Debug2", "Debug3", "Debug4", "Debug5" and "ALL_MSGS".  If this parameter is not specified then the value will default to "Info".

logFilesMaxAgeInDays = ##
     The maximum age, in days, for log files generated by the program.  Log files older than this value will be automatically deleted, unless the value is zero, in which case the log files will never be deleted.  If this parameter is not specified then the value will default to 30 days.

consoleRedirectFileName = "fileName"
     An optional specification for the console-output redirect file name.  A date code in the form "_yyyymmdd" will be inserted into the name (for example, "QWStatusProcConsole_yyyymmdd.txt").  If this parameter is not specified then the console output will not be redirected.  The use of a console-output redirect file is highly recommended so that unexpected error messages (e.g., out-of-memory error messages) will be captured and saved.

consoleFilesSwitchIntvlDays = ##
     The switch interval, in days, to be used for the console-output redirect files specified by the "consoleRedirectFileName" parameter.  After the given number of days, the current redirect file is closed and a new one is started, unless the value is zero, in which case the initial redirect file is used indefinitely.  If this parameter is not specified then the value will default to 30 days.

consoleFilesMaxAgeInDays = ##
     The maximum age, in days, for the console-output redirect files specified by the "consoleRedirectFileName" parameter.  Redirect files older than this value will be automatically deleted, unless the value is zero, in which case the redirect files will never be deleted.  If this parameter is not specified then the value will default to 366 days.
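
     As an illustration, the fragment below sketches a "<Settings>" element that combines several of the report-output and console-redirect parameters described above.  The file names are assumptions chosen for the example, and the numeric values simply restate the documented defaults; none of this is taken from a shipped configuration.

```xml
<Settings>
      # Report-output log file (a "_yyyymmdd" date code is inserted into the name):
    reportOutputLogName = "log/ReportOutput.txt"
      # Append a status report to the log every 10 minutes (0.0=none):
    repLogOutIntervalMins = 10.0
      # Delete report-output log files older than 30 days (0=never delete):
    repOutMaxAgeInDays = 30
      # Current-report file, overwritten once per minute (0.0=none):
    repOutCurrentFileName = "log/CurrentReport.txt"
    repCurOutIntervalMins = 1.0
      # Connections must stay "good" for 60 seconds before the "restored" entry:
    repAllGoodDelaySecs = 60
      # Console-output redirect file, switched every 30 days, kept for 366 days:
    consoleRedirectFileName = "log/QWStatusProcConsole.txt"
    consoleFilesSwitchIntvlDays = 30
    consoleFilesMaxAgeInDays = 366
</Settings>
```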

The default (and preferred) mechanism for retrieving status-report information from monitored QuakeWatch Servers is via the direct "client" connection that is automatically established to the servers.  If this mechanism is not workable in a given installation, status-report information may also be delivered via status-report files (using 'scp') from QuakeWatch servers to an input directory that is polled by the QWStatusProcessor.  The input of these files may be configured using the following parameters:

inputDirName = "directoryName"
     The directory that will be polled for status-report input files.  If this parameter is not specified or is set to an empty string ("") then no polling for status-report input files will be performed.

storageDirName = "directoryName"
     The directory into which files will be placed after they have been processed.  If this parameter is not specified then the value will default to "msgSave".

processDirName = "directoryName"
     The directory that the program will move input files into while they are being processed.  If this parameter is not specified then the value will default to "msgProcess".

inputPollDelayMS = ##
     The number of milliseconds that the program will delay between polls of the input directory.  If this parameter is not specified or is set to 0 then no polling for status-report input files will be performed.

storageAgeSecs = ##
     The number of seconds that the program will hold processed files in the "storage" directory before deleting them.  A value of 0 will configure the program to delete the files immediately after processing them.  If this parameter is not specified then the value will default to 86400 seconds (1 day).

reportTimeoutMins = ##
     Sets the missing-report timeout value, in minutes.  After the specified number of minutes have elapsed since the last status report from a given server, a warning entry will be output by this program.  This parameter may be specified as a floating point value (such as 12.5 minutes).  If this parameter is not specified or is set to 0.0 then no warning entries will be output for missing-report timeouts.

dropSvrTimeoutMins = ##
     Sets the drop-server timeout value, in minutes.  After the specified number of minutes have elapsed since the last status report from a given server, tracking of the server will no longer be performed by this program (no more warning entries for missing reports).  This parameter may be specified as a floating point value (such as 30.5 minutes).  If this parameter is not specified or is set to 0.0 then no drop-server timeout will be used.
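
     The fragment below sketches how the file-based input parameters might appear together.  It is assumed here that they belong in the "<Settings>" element alongside the other main parameters.  The "msgInput" directory name and the 5000-millisecond poll delay are hypothetical values chosen for illustration; the remaining values restate the documented defaults or the values used in the "Example Configuration File" section.

```xml
<Settings>
      # Directory polled for status-report input files (hypothetical name):
    inputDirName = "msgInput"
      # Poll the input directory every 5 seconds (hypothetical value, 0=no polling):
    inputPollDelayMS = 5000
      # Directory holding files while they are being processed:
    processDirName = "msgProcess"
      # Directory holding files after they have been processed:
    storageDirName = "msgSave"
      # Delete processed files after 1 day (0=delete immediately):
    storageAgeSecs = 86400
      # Warn after 60 minutes without a report from a server (0.0=none):
    reportTimeoutMins = 60.0
      # Stop tracking a server after 120 minutes without a report (0.0=none):
    dropSvrTimeoutMins = 120.0
</Settings>
```

     With these values, a server that stops delivering status reports would be flagged with a warning entry after 60 minutes and would no longer be tracked after 120 minutes.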




Monitored Server Configuration

     Within the "<QWStatusProcConfig>" element in the configuration file may be any number of "<ServerMonitor>" elements.  Each one specifies a QuakeWatch Server that is to be monitored and reported on.  The configuration parameters described below may be specified in a "<ServerMonitor>" element.

serverHostAddress = "hostAddress"
     The host-address specification for the server to be monitored.  The address may be in name (e.g., "www.caltech.edu") or numeric-IP (e.g., "131.215.220.4") form.

serverPortNumber = ##
     The port-number specification for the server to be monitored.  If this parameter is not specified then the value will default to 39977.

webServicesServerFlag = true/false
     Set 'true' to configure the connection as being to a QuakeWatch Web Services Server (as opposed to a CORBA-based QWServer).  If this parameter is not specified then the value will default to 'false' (CORBA-based QWServer).

loginInfoFileName = "filename"
     The name of the server login information file.  This file may be used to specify the username and password used to validate the connection to the server.  If this parameter is set to an empty string ("") or if the specified file is missing then a blank username and password will be used.  If this parameter is not specified then the filename will default to "conf/ServerLoginInfo.dat".  For more information on this file, see the "QuakeWatch Server Login Information File" section.
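
     Because any number of "<ServerMonitor>" elements may be given, a configuration that monitors two servers might look like the sketch below.  The host names "qws1.example.com" and "qws2.example.com", the port number 8080, and the login file name "conf/Server2LoginInfo.dat" are placeholders invented for this example.

```xml
<ServerMonitor>
      # First server: CORBA-based QWServer on the default port
    serverHostAddress = "qws1.example.com"
    serverPortNumber = 39977
    webServicesServerFlag = false
</ServerMonitor>
<ServerMonitor>
      # Second server: QuakeWatch Web Services Server (port is a placeholder)
    serverHostAddress = "qws2.example.com"
    serverPortNumber = 8080
    webServicesServerFlag = true
      # Separate login information file for this server (placeholder name)
    loginInfoFileName = "conf/Server2LoginInfo.dat"
</ServerMonitor>
```

     The first element does not specify "loginInfoFileName", so the default file "conf/ServerLoginInfo.dat" would apply to that server.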



Recipient Configuration

     Within the "<QWStatusProcConfig>" element in the configuration file may be any number of "<Recipient>" elements.  Each one specifies an email recipient to receive notifications from the program.  The configuration parameters described below may be specified in a "<Recipient>" element.

recipientName = "name"
     Sets the name to be associated with the recipient.

emailAddressList = "emailAddress(es)"
     Specifies one or more email addresses for the recipient.  If more than one address is specified then commas are used to separate the addresses.

emailsEnabledFlag = true/false
     When this parameter is set to 'true', the recipient is enabled and may be sent email messages.  When set to 'false', the recipient will never be sent email messages.

startupEmailFlag = true/false
     When this parameter is set to 'true', the recipient will be sent an email when the program starts up and when it shuts down.

sendMailIntervalMins = ##
     Specifies the interval (in minutes) between the periodic sending of status-report emails to the recipient.  For instance, if this parameter is set to 60.0 then a status-report email will be sent once per hour.  Note that if the 'immedSendOnMsgsFlag' parameter is set to 'true' then additional emails will be sent when "warning" conditions are detected.  This parameter may be specified as a floating point value (such as 30.5 minutes).  If this parameter is not specified then the value will default to 1440.0 (once per day).

immedSendOnMsgsFlag = true/false
     If this parameter is set to 'true' then "immediate" emails will be sent to the recipient when "warning" conditions are detected.  Examples of "warning" conditions include:  loss of communication with a monitored server; warning messages logged by a server.  Note that the value of the 'minImmedIntervalMins' parameter may affect when "immediate" emails are sent out.  If this parameter is not specified then the value will default to 'false'.

minImmedIntervalMins = ##
     Specifies the minimum interval (in minutes) between "immediate" emails sent to the recipient.  If the 'immedSendOnMsgsFlag' parameter is set to 'true' and an "immediate" email has been sent, the next "immediate" email will be delayed if the specified number of minutes have not elapsed.  This parameter may be specified as a floating point value (such as 1.5 minutes).  If this parameter is not specified then the value will default to 10.0 minutes.

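     The following sketch shows a "<Recipient>" element for a recipient with two addresses who receives an hourly status report plus rate-limited "immediate" emails.  The recipient name and the example.com addresses are placeholders invented for this example.

```xml
<Recipient>
    recipientName = "OnCallOperators"
      # Multiple addresses are separated by commas:
    emailAddressList = "ops1@example.com,ops2@example.com"
    emailsEnabledFlag = true
      # No email when the program starts up or shuts down:
    startupEmailFlag = false
      # Periodic status-report email once per hour:
    sendMailIntervalMins = 60.0
      # Also send "immediate" emails when "warning" conditions are detected:
    immedSendOnMsgsFlag = true
      # Minimum interval between "immediate" emails (minutes):
    minImmedIntervalMins = 10.0
</Recipient>
```

     With these settings, if a "warning" condition is detected two minutes after an "immediate" email was sent, the next "immediate" email would be delayed until the 10-minute minimum interval has elapsed.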


QuakeWatch Server Login Information File

The QuakeWatch Server login information file contains the username and password used to validate a connection to a QuakeWatch Server.  The password may be specified in plain text (in which case the password will be replaced with an encrypted version after the first time it is used by the program).  This file is specified via the "loginInfoFileName" parameter.

The first entry in the file is always the username:

username = "yourUser"

The second entry is the password, which may be specified using one of the following:

password = "yourPwd"
stdEncPassword = "yourStdEncPwd"
encryptedPassword = "yourEncryptedPwd"

The 'password' entry specifies the password in plain text.  The first time the login information file is used it will be rewritten with the "password" entry replaced by an 'encryptedPassword' entry containing an encrypted version of the password.

The 'stdEncPassword' entry specifies the password in a "standard" encoded form (using Unix-style "crypt").  The first time the login information file is used it will be rewritten with the "stdEncPassword" entry replaced by an 'encryptedPassword' entry containing an encrypted version of the "standard" encoded password.

The 'encryptedPassword' entry specifies the password in an "encrypted" form.

If more than one type of password entry is given, only one will be used, with the order of precedence as shown above.

The following shows an example of the contents of a login information file:

username = "yourUser"
password = "yourPwd"



Example Configuration File

<QWStatusProcConfig>
  <Settings>
      # Instance name for this installation of the program:
    programInstanceName = "DefaultName"
      # Name of console output redirect file:
    consoleRedirectFileName = "log/QWStatusProcConsole.txt"
      # Name of the log file:
    logFileName = "log/QWStatusProc.log"
      # Level of log messages sent to the log file:
    logFileLevel = "Debug"
      # Level of log messages sent to the console:
    consoleLevel = "Info"
      # Maximum age for log files (days, 0=infinite):
    logFilesMaxAgeInDays = 30
      # Missing report timeout (minutes, 0.0=none):
    reportTimeoutMins = 60.0
      # Drop-server timeout (minutes, 0.0=none):
    dropSvrTimeoutMins = 120.0
      # 'From' real-name for messages:
    emailFromName = "QWStatusProcessor"
      # 'From' email-address for messages:
    emailFromAddress = "name@example.com"
      # Address of SMTP server for sending mail:
    smtpServerAddress = "mail.example.com"
      # Update interval for report-output to log file:
    repLogOutIntervalMins = 10.0
      # Update interval for output to current-report file:
    repCurOutIntervalMins = 1.0
      # Pathname of output file to receive current report:
    repOutCurrentFileName = "log/CurrentReport.txt"
  </Settings>
  <ServerMonitor>
      # host address of server:
    serverHostAddress = "localhost"
      # port number of server:
    serverPortNumber = 39977
      # true for QW web services server:
    webServicesServerFlag = false
      # file holding server-login information:
    # loginInfoFileName = "conf/ServerLoginInfo.dat"
  </ServerMonitor>
  <Recipient>
    recipientName = "TestRecipient"
    emailAddressList = "name@example.com"
    emailsEnabledFlag = false
      # true to enable program-startup email to recipient:
    startupEmailFlag = true
      # Update interval for sending email reports:
    sendMailIntervalMins = 1440.0
      # Enable immediate send on tracked log messages from servers:
    immedSendOnMsgsFlag = true
      # Minimum interval between "immediate" sends:
    minImmedIntervalMins = 10.0
  </Recipient>
</QWStatusProcConfig>




See Also

QuakeWatch Server Installation at http://www.cisn.org/software/QWServer
CISN Display Installation at http://www.cisn.org/software/cisndisplay.html


9/24/2013 - Eric Thomas, Instrumental Software Technologies, Inc. - info@isti.com