ELabs Cluster power-up checklist

Perform the following steps whenever one or more cluster machines are rebooted or powered up after maintenance, to confirm that everything is working as expected.


Boot Sequence checks

  1. boot the data server, data2
  2. boot the database server, data1, and verify that both postgres and mysqld are running:
    1. sudo /etc/init.d/postgres status
    2. sudo /etc/init.d/mysqld status
  3. boot each individual cluster node: wwwXX where XX=10 to 17 (does the order matter?)
  4. check RAID mounts: no complaints and all disks healthy?
  5. mount /nfs with 'mount -a' (to be set with run level?)
  6. verify mounts & partitions --> print a success/failure message
  7. start http servers: Tomcat, Apache (or verify they are running)
  8. scan ports: correct? (how?)
    1. confirm http (how?)
    2. confirm memory available (how?)
  9. security checks: ??
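
Step 6 above asks for a success/failure message after verifying mounts. The following is a minimal sketch of one way to do that, not the cluster's actual script. It assumes a Linux host where /proc/mounts lists the active mounts; the function names and the MOUNT_TABLE variable are made up for illustration, and /nfs is the only mount point this checklist names.

```shell
#!/bin/sh
# Sketch: report whether each expected mount point appears in a mount table.
# MOUNT_TABLE defaults to /proc/mounts (Linux assumption); override it to test.
MOUNT_TABLE="${MOUNT_TABLE:-/proc/mounts}"

# is_mounted DIR: succeed if DIR is the second field of some line in MOUNT_TABLE.
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "$MOUNT_TABLE"
}

# check_mounts DIR...: print one SUCCESS/FAILURE line per DIR and
# return non-zero if any DIR is missing from the table.
check_mounts() {
    rc=0
    for dir in "$@"; do
        if is_mounted "$dir"; then
            echo "SUCCESS: $dir is mounted"
        else
            echo "FAILURE: $dir is not mounted"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (extend the list with the real RAID mount points):
# check_mounts /nfs
```

This only confirms that a mount is present; it does not check RAID or disk health from step 4.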

Sanity checks

  1. confirm login.mcs.anl.gov available (and/or terra, harley, shakey individually)
  2. ping all cluster machines
  3. all hosts: login --> dataX, wwwXX via 'terra' (or run Eric's hey-you-guys uptime)
  4. all servers: URL offers --> wwwXX.i2u2.org/elab/cosmic/project.jsp
  5. confirm all forwards: 'quarknet.fnal.gov/e-lab' and 'www.i2u2.org/elab/cosmic'
  6. confirm all cross mounts
  7. database responding: users & data
  8. all servers: run JMeter tests
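
The "ping all cluster machines" check above can be scripted. This is a rough sketch under stated assumptions, not an existing tool: the host list (data1, data2, www10 to www17) comes from the boot sequence, but the short hostnames are assumed to resolve from wherever the script runs, and the -W timeout flag is assumed to behave as in Linux iputils ping (seconds). The function names are hypothetical.

```shell
#!/bin/sh
# cluster_hosts: print data1, data2, and www10..www17, one per line.
cluster_hosts() {
    echo data1
    echo data2
    n=10
    while [ "$n" -le 17 ]; do
        echo "www$n"
        n=$((n + 1))
    done
}

# ping_all: send one ping to each host, print one line per host, and
# return non-zero if any host fails to answer.
ping_all() {
    rc=0
    for host in $(cluster_hosts); do
        # -c 1: single packet; -W 2: wait up to 2 s (Linux iputils assumption)
        if ping -c 1 -W 2 "$host" >/dev/null 2>&1; then
            echo "up:   $host"
        else
            echo "DOWN: $host"
            rc=1
        fi
    done
    return "$rc"
}

# Example call:
# ping_all || echo "at least one cluster machine did not answer"
```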

LIGO e-Lab checks

  1. verify Bluestone works for "User" login: http://www13.i2u2.org/tla_test
  2. verify Bluestone works for "Guest" login: http://www13.i2u2.org/tla_test

QuarkNet Fellows Library

  1. verify it is up and working correctly: http://www13.i2u2.org/cosmic/library
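
As a first pass on the URL checks above (LIGO e-Lab and Fellows Library), one could at least confirm that each URL answers with a successful HTTP status. The sketch below assumes curl is installed; the function names are made up for illustration. It does not test the "User" or "Guest" logins, which still need to be checked by hand.

```shell
#!/bin/sh
# http_status URL: print the HTTP status code for URL.
# curl reports 000 when it gets no HTTP response at all.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# check_url URL: print OK for a 2xx/3xx status, FAIL (and return 1) otherwise.
check_url() {
    code=$(http_status "$1")
    case "$code" in
        2??|3??) echo "OK   $code $1" ;;
        *)       echo "FAIL $code $1"; return 1 ;;
    esac
}

# Example calls, using the URLs from this checklist:
# check_url http://www13.i2u2.org/tla_test
# check_url http://www13.i2u2.org/cosmic/library
```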
Topic revision: r8 - 2019-05-22, AdminUser