Stress test for www18/cluster: Scheduled a phone call with Phong to go over these issues and come up with a plan.
We need to VPN to those machines and set up some tools (like the ability to restart them remotely).
We also need a setting that prevents the code from sending jobs to nodes that are not available.
Some of the nodes have hardware issues (old power supplies) and sometimes fail. Our plan is to agree on a procedure so that we can respond when these machines fail.
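As a rough illustration of the "do not send jobs to unavailable nodes" setting, here is a minimal Python sketch. It is not the cluster's actual code: the function names, the manual `disabled` list, and the TCP probe on port 22 are all assumptions made for the example.

```python
import socket


def is_reachable(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 22 is an assumption; any port the nodes are known to listen on works.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def available_nodes(nodes, disabled=(), probe=is_reachable):
    """Return the nodes that are not manually disabled and that answer the probe.

    `disabled` stands in for the setting discussed in the notes: an operator
    can list nodes there to take them out of rotation by hand.
    """
    blocked = set(disabled)
    return [node for node in nodes if node not in blocked and probe(node)]
```

The job dispatcher would then pick targets from `available_nodes(all_nodes, disabled=...)` instead of from the full node list. The probe is passed in as a parameter so it can be replaced with whatever health check the cluster ends up using.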
CMS
CMS plots in the logbook:
Phong is looking into this code.
Cosmic
Bug 455: Code is ready. The rollout will update the VDC and the code for the nodes, and is scheduled for the evening of Thu 28 March. Steps for this rollout:
Back up the database before doing any updates. If updatevdc actually modifies the database, we need to be able to roll back if something goes wrong.
Deploy code from branches/2.0 on www18:
./deploy-from-svn branches/2.0
Update the VDC: on www18, run the steps Tom provided in the email (but not as quarkcat).
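The ordering of the steps above matters: nothing should be deployed unless the backup succeeded. One possible way to enforce that is sketched below in Python. The `run_rollout` helper is hypothetical, and the step list is supplied by the caller, since the notes do not give the backup command or the steps from Tom's email.

```python
import subprocess


def run_rollout(steps, runner=subprocess.run):
    """Run (name, command) steps in order and stop at the first failure.

    Returns the names of the completed steps. Raises RuntimeError as soon as
    a command exits with a non-zero status, so later steps never run.
    """
    done = []
    for name, cmd in steps:
        result = runner(cmd, check=False)
        if result.returncode != 0:
            raise RuntimeError(f"step {name!r} failed; completed so far: {done}")
        done.append(name)
    return done
```

A call might look like `run_rollout([("backup", backup_cmd), ("deploy", ["./deploy-from-svn", "branches/2.0"])])`, where `backup_cmd` is a placeholder for whatever command backs up the database. The VDC update would be added as a further step once the commands from the email are filled in.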