QuarkNet Servers
The e-Labs website and e-Labs are served from three SuperMicro servers located at Notre Dame's Center for Research Computing (CRC). The servers were purchased through Fermilab in July 2014 and stored there until Q4 2015, when we moved IT operations from Argonne to Notre Dame; at that point they were moved to Notre Dame and placed into production.
Purchase Order with specs.
As labeled by the CRC, the physical ("bare metal") servers are
- i2u2-vmhost01
- i2u2-vmhost02
- i2u2-store01
i2u2-vmhost01 and i2u2-vmhost02 are ostensibly identical general-purpose servers. i2u2-store01 is the primary data storage server, with 28 TB.
The CRC has installed Red Hat Enterprise Linux on these servers, from which their resources are apportioned into several virtual machines (VMs) described below. All interaction that e-Lab developers have with the servers is in terms of the VMs, so you typically don't need to know the physical machine names unless something goes wrong.
Something goes wrong: In December 2016, two of the drives on i2u2-vmhost02 failed, along with its power supply unit. CRC engineers warned us that the failure of one more drive would wipe out the VMs stored on that server. These VMs were moved to i2u2-vmhost01 (with reduced RAM) for safety until i2u2-vmhost02 could be repaired. The servers were still under a 3-year parts warranty from SuperMicro.
The CRC handled the warranty submission to SuperMicro, which shipped replacement parts. The server was repaired by the end of January 2017, and the affected VMs were returned to normal service over the following week (i2u2-data, being critical to the website's function, had to wait until the next weekend to be moved).
The engineers recommend against purchasing SuperMicro equipment in the future, since its products tend to be less robust than "Tier 1" equipment.
VMs
All e-Lab IT functions are performed on virtual machines (VMs) created on the two i2u2-vmhost bare-metal servers listed above. The CRC is in charge of the virtualization software running on the underlying RHEL OS, so contact them if you need a new VM, a clone of an existing VM, or anything similar.
The VMs are
| (VM-name).crc.nd.edu | Public IP | Description |
| i2u2-prod | 129.74.246.110 | Server for e-Labs site www.i2u2.org |
| i2u2-dev | 129.74.246.106 | For development prior to deployment on i2u2-prod |
| i2u2-db | N/A | Database server for user data to i2u2-prod / dev |
| i2u2-data | N/A | Database server for physics data to i2u2-prod / dev |
| i2u2-quarknet | 129.74.246.125 | Server for quarknet.i2u2.org |
| i2u2-wiki | 129.74.246.153 | Server for wiki.i2u2.org (this one) and bugzilla.i2u2.org |
| i2u2-ligo | N/A | Temporary server to help fix a problem with LIGO in 2016. Since deleted |
| i2u2-jupyter | N/A | Jupyter Notebook server |
More details on the VMs (private page)
Obtaining access to the VMs
Backups
The drives on all physical servers are kept in a RAID array as a first measure against data loss.
The VMs themselves are backed up nightly to tape.
Maintenance and Security
Cron jobs
Updates
The VMs need to have their packages updated regularly using
$ sudo apt-get update
and
$ sudo apt-get upgrade
in order to stay secure. After apt-get upgrade, a restart may be required.
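A minimal sketch of the update routine described above, wrapped in shell functions. The function names are made up for illustration, and the marker file /var/run/reboot-required is an assumption: it is the usual Debian/Ubuntu convention for signaling that a restart is needed, but confirm that our VMs actually create it before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a reboot marker file exists.
# The default path is the usual Debian/Ubuntu convention (an assumption
# about our VMs, not something confirmed on this page).
reboot_needed() {
    marker="${1:-/var/run/reboot-required}"
    [ -f "$marker" ]
}

# Hypothetical wrapper: refresh package lists, upgrade, then report
# whether a restart appears to be pending. It never reboots by itself.
update_vm() {
    sudo apt-get update && sudo apt-get upgrade || return 1
    if reboot_needed; then
        echo "Restart required: schedule it for a quiet period."
    else
        echo "No reboot marker found."
    fi
}
```

Running update_vm on a VM only reports the result; the restart itself is left as a manual step so that it can be timed around users.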
Restarts
Naturally, you want to avoid restarting public VMs while users are logged in. In either i2u2-prod (www.i2u2.org) or i2u2-dev, you can log in as the administrator and select "Session Tracking" to see who's currently logged in. It's typical to have many users logged into the Cosmic Ray e-Lab on i2u2-prod, for example.
Once SSH'd into the VM itself, you can also check the Tomcat logs at /home/quarkcat/tomcatlogs/ to see who's doing what on the website. The terminal commands users, ps, and w are also useful to see who else is logged into the VM directly and what they're doing (this should only be other developers or sysadmins).
Restarts to i2u2-prod, i2u2-data, and i2u2-db are best done at night (and preferably over the weekend) to avoid disrupting users.
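The checks on direct VM logins can be bundled into a quick pre-restart look, sketched below. The function names are hypothetical, and since the names of the files inside /home/quarkcat/tomcatlogs/ aren't given here, the sketch only lists the most recently modified ones rather than opening any particular log. It does not replace the "Session Tracking" check for website users.

```shell
#!/bin/sh
# Hypothetical helper: reads the output of `users` on stdin and prints
# the number of distinct login names.
count_distinct_logins() {
    tr -s ' ' '\n' | sed '/^$/d' | sort -u | wc -l | tr -d ' '
}

# Hypothetical pre-restart check: who is logged into the VM directly,
# what they are doing, and which Tomcat logs changed most recently.
pre_restart_check() {
    echo "Distinct shell logins: $(users | count_distinct_logins)"
    w
    ls -lt /home/quarkcat/tomcatlogs/ | head -n 5
}
```

If pre_restart_check shows anyone besides you, it's worth asking them before restarting.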
SSL Certificates
We maintain Let's Encrypt SSL certificates for www.i2u2.org on i2u2-prod and for bugzilla.quarknet.org and wiki.quarknet.org on i2u2-wiki. The CRC maintains SSL certs for crc.nd.edu domains (e.g. i2u2-dev.crc.nd.edu) on the servers where we don't serve anything under our own domains.
ELabs SSL Certificates
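Let's Encrypt certificates expire after 90 days, so it can be handy to check from any machine when the certificate currently being served runs out. The sketch below uses the standard openssl command-line tool; the function names are made up for illustration, and this is only one way to look at the expiry date, not a description of how our renewals are set up.

```shell
#!/bin/sh
# Hypothetical helper: strips the "notAfter=" label that
# `openssl x509 -enddate` puts in front of the expiry date.
strip_enddate_prefix() {
    sed 's/^notAfter=//'
}

# Hypothetical helper: prints the expiry date of the certificate that
# a host serves on port 443, e.g. cert_enddate www.i2u2.org
cert_enddate() {
    host="$1"
    openssl s_client -connect "$host:443" -servername "$host" \
        </dev/null 2>/dev/null \
        | openssl x509 -noout -enddate \
        | strip_enddate_prefix
}
```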
Old Servers
Confused by references to servers you've never heard of, like www18, www13, or data4? These were the names of servers we used when the e-Labs site was served from Argonne.
Learn More
Troubleshooting
Unknown MySQL server host 'i2u2-db.crc.nd.edu'
This can happen when an e-Lab or CIMA attempts to pull data from i2u2-db, and it's generally a DNS resolution error (that is, the calling VM can't turn "i2u2-db.crc.nd.edu" into an IP address). First do the obvious cursory checks:
- SSH into i2u2-db to make sure it's up and connected to the network
- Run
$ ping -c 5 i2u2-db.crc.nd.edu
from the calling VMs (likely i2u2-prod and/or i2u2-dev)
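To separate "the name doesn't resolve" from "the host doesn't answer", the ping check can be preceded by a lookup through the system resolver. A minimal sketch, assuming getent is available on the calling VM (it normally is on Linux); the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the name resolves through the
# system resolver (/etc/hosts, DNS, ...).
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Hypothetical check, meant to be run on the calling VM: resolve the
# name first, and ping only if resolution works.
check_db_host() {
    host="${1:-i2u2-db.crc.nd.edu}"
    if resolves "$host"; then
        echo "$host resolves to: $(getent hosts "$host" | awk '{ print $1; exit }')"
        ping -c 5 "$host"
    else
        echo "$host does not resolve; suspect DNS on this VM." >&2
        return 1
    fi
}
```

If check_db_host fails at the resolution step while i2u2-db itself is up, the problem is on the calling VM's side.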
Assuming the above checks are nominal, these solutions have worked:
- Restart Apache. If you can't restart Apache (for example, if there are active e-Lab users):
  - Use the direct IP. In the relevant code, try replacing "i2u2-db.crc.nd.edu" with the IP address of i2u2-db as given on the VM info page. This is mostly useful with CIMA, where you can make direct changes to the code without deployment.
  - Look at /etc/resolv.conf on the calling VM. On one occasion, this file became incorrect, possibly after being overwritten by resolvconf. Ask the CRC engineers to confirm that the current file is correct; as of March 2017 the correct configuration is given here.
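When looking at /etc/resolv.conf, the lines that matter most are the nameserver entries. The sketch below pulls out just those, so they are easy to compare against whatever the CRC engineers confirm as correct. The function name is hypothetical, and the correct addresses are deliberately not reproduced here.

```shell
#!/bin/sh
# Hypothetical helper: prints the nameserver addresses configured in a
# resolv.conf-style file (default: /etc/resolv.conf), one per line.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

Running list_nameservers on both i2u2-prod and i2u2-dev and comparing the output is a quick way to spot a file that has been overwritten on only one of them.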
If none of these work, you can try restarting MySQL using either of
$ sudo service mysql restart
$ sudo /etc/init.d/mysql restart
If that fails, reboot the calling VMs if possible.
-- Main.JoelG - 2016-12-07