Present: Eric Myers, Tibi, Liz and Nick, Tom Jordan
LIGO Analysis and the Grid
Eric and Tibi talked off-line last week, which was very useful. Eric has added a lot to the wiki. He also sent out an e-mail to the list detailing his progress, which is reproduced here:
Here is a report of news from LIGO and my recent progress. This can just be
copied into the minutes unless there are questions to add or discuss.
I'm sorry it's going out so shortly before today's telecon.
First the news. Two of our three summer teachers are now at Hanford and
starting to work on their summer internships. You can see what they are up
to or even say hello in the Gladstone Room on the mock-up site. Visit
http://i2u2.spy-hill.net/ and then go to the Meeting and Discussion Rooms.
The Gladstone room is where they will be working on LIGO analyses, but there
will also be discussions on pedagogy in the Teacher's Lounge.
Wiki
Tibi and I had a good chat last week, for about 2 hours, and as a result I
think we both have a better idea of what needs to be done to get a LIGO
analysis to run "on the grid". To help clarify some of the requirements I
created two new wiki pages in the LIGO e-Lab area.
The "Grid Interface API" page describes what I have been expecting to see
for an API for launching tasks on the grid. It's just a sketch and open to
change, but I hope it's a good starting point.
The "Parameter Files" page describes some general considerations on how we
could handle parameter files, with some examples drawn from other systems
which use parameter files. This is in the LIGO section, but I've tried to
make it general enough that it might have wider application. For example,
if we have a general interface for ROOT scripts to fetch parameter values
from one or more of these parameter file formats then this could be used by
LIGO, CMS, and anybody else who uses ROOT. The page is not done yet, but
I'd welcome comments and suggestions.
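To make the idea concrete, here is a minimal sketch of how a script could fetch a named value from a simple key=value parameter file, one of the formats under discussion on the wiki page. The `get_param` helper, the file layout, and the channel name are all assumptions for illustration, not the agreed interface.

```shell
# Minimal sketch: read one value from a key=value parameter file.
get_param() {
    # usage: get_param FILE KEY  -> prints the value of KEY, if present
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# A hypothetical parameter file for an analysis run:
cat > params.txt <<'EOF'
# analysis parameters
channel = H0:PEM-LVEA_SEISX
days    = 90
EOF

get_param params.txt channel   # -> H0:PEM-LVEA_SEISX
get_param params.txt days      # -> 90
```

A ROOT script (or anything else) could call such a helper, which is what would let LIGO, CMS, and other ROOT users share one parameter-handling layer.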
Tibi then created a "Grid Execution" page to summarize the steps needed to
run a LIGO analysis on a grid node.
I also started a "LIGO Grid" page to enumerate LIGO Grid activities. This
is primarily being done by a separate group of people (mainly at Caltech) so
I'm not an expert, but I wanted a place to collect what I know and learn.
Ideally our efforts will also mesh with theirs when appropriate. The main
thing to note there is LDR, which I think we would likely want to use to
distribute LIGO environmental data to grid nodes for our analyses once
everything is working.
Dataflow
LIGO data are now on the Argonne cluster, in /disks1/myers/data/ligo/frames.
I have there the most recent 90 days of minute trends of the ELabs channels.
This is about 1GB of data. The full 3 years of data we have available is
around 14GB, but I wanted a smaller subset, and this is the same subset I'm
using for testing in New York.
New frame files are produced about once an hour at Hanford, and so a cron
job on www13 uses rsync to update the collection every hour. We will always
have the newest 90 days available for our initial tests. Another cron job
which runs once a day trims the collection to just the past 90 days. This
can be removed when we are ready for the full RDS.
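The two maintenance jobs above can be sketched as follows, demonstrated here on a temporary directory. The real jobs run on www13 against /disks1/myers/data/ligo/frames; the rsync source URL and frame-file names are assumptions.

```shell
# Sketch of the hourly sync and daily trim, run against a scratch directory.
FRAMES=$(mktemp -d)

# Hourly cron job: mirror the newest frame files from Hanford.
# (Commented out here because the source host below is hypothetical.)
# rsync -a rsync://frames.hanford.example/minute-trend/ "$FRAMES/"

# Stand-ins for synced data: one stale file and one fresh file.
touch -d '100 days ago' "$FRAMES/old.gwf"
touch "$FRAMES/new.gwf"

# Daily cron job: trim the collection to the most recent 90 days.
find "$FRAMES" -type f -name '*.gwf' -mtime +90 -delete

ls "$FRAMES"   # only new.gwf remains
```

When we are ready for the full RDS, dropping the `find` job is all it takes to stop trimming.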
This gets the data to Argonne for testing there, but I still don't know how
the data will get to the remote grid node for an analysis. In the long run
I expect we'll use RDS, but to get something working now we'll need
something else. It's something to start thinking about.
Over the past week I made some internal revisions to the script
run_dmtroot.sh, which actually runs a ROOT script with the DMT macros and
ROOT-enabled libraries. A big part of the script sets environment
variables to point to key components: ROOTSYS points to the local
installation of a stock version of ROOT, while ROOT_DIR points to a
directory containing our own scripts for this Analysis Tool. Now
DMT_ROOT_LIBS points to the ROOT-enabled libraries, while DMT_ROOT_MACROS
points to the LIGO macros for ROOT (this was formerly from LIGOTOOLS).
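The environment setup in run_dmtroot.sh looks roughly like the sketch below. The directory paths are placeholders, not the actual installation locations.

```shell
# Sketch of the key environment variables set by run_dmtroot.sh
# (all paths below are placeholder assumptions):
export ROOTSYS=/usr/local/root              # stock ROOT installation
export ROOT_DIR=/usr/local/i2u2/scripts     # our own Analysis Tool scripts
export DMT_ROOT_LIBS=/usr/local/dmt/lib     # ROOT-enabled DMT libraries
export DMT_ROOT_MACROS=/usr/local/dmt/macros  # LIGO macros for ROOT
                                              # (formerly from LIGOTOOLS)

# Make ROOT and the DMT libraries visible to the run:
export PATH="$ROOTSYS/bin:$PATH"
export LD_LIBRARY_PATH="$ROOTSYS/lib:$DMT_ROOT_LIBS${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```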
Doing this has pointed out that the directory structure which had started
developing to hold different components is backwards and needs to be
reversed. I will likely do that after I return from vacation.
I like to checkpoint my work when I get to a good point to do so, and so I
checked all my changes in as the v0.35 branch and made this the "test"
release on both tekoa and the spy-hill sites. This is sort of like a
"rollout", I guess. I won't make this the "production" release until it's
been tested further, and not until after I get back from vacation.
Meanwhile, now I'll add more to the development release.
As I have already told Tibi, I will create a simple non-graphics ROOT script
which cycles through one or more frame files and outputs simple status
information. This will be very useful for testing without the requirement
of a graphics interface.
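Such a script could be run in ROOT's batch mode, which needs no display. The flags are standard ROOT options (-l skips the splash screen, -b disables graphics, -q quits when the script finishes); the script name frame_check.C is hypothetical.

```shell
# Running a non-graphics ROOT script in batch mode (script name is a
# placeholder for the frame-cycling test script described above):
CMD="root -l -b -q"
SCRIPT='frame_check.C'
if command -v root >/dev/null 2>&1; then
    $CMD "$SCRIPT"
else
    echo "ROOT not installed here; would run: $CMD $SCRIPT"
fi
```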
I've built both GDS and ROOT on several machines and I think I have a better
understanding of what needs to be built, and how (and why we did it
differently on tekoa -- because I used pre-existing LIGO software on that
machine). More importantly, I think I know what small subset of libraries
and macros has to go to the machine on which an analysis is run. I'll be
trying this out on www13 today and tomorrow.
Vacation
I will be away on a family vacation from 23-30 June, so I won't get anything
done next week. I may try to join in the phone call if I can do so, but I
don't know yet. I may be able to check e-mail once or twice a day, but
maybe not.
Cosmic Analysis
Nick and Tom talked about how to improve the handling of geometry files.
Tom and Eric discussed the general need for handling parameters and the possibility of using a database for geometry files (Cosmics) or channel info (LIGO).
Load Balancing
Load-balancing - 14, 15, 16, and 17 ready; he has not added 10 yet.
Tibi still needs to test with JMeter. He will work in the evenings because there are a lot of workshops going on.
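For the load-balancing tests, JMeter is normally driven from the command line in non-GUI mode (-n: no GUI, -t: the test plan, -l: where to log results). The plan and log file names here are hypothetical.

```shell
# Sketch of a non-interactive JMeter run against the load-balanced site
# (file names are placeholders):
JCMD="jmeter -n -t elab_load.jmx -l elab_results.jtl"
if command -v jmeter >/dev/null 2>&1; then
    $JCMD
else
    echo "JMeter not installed here; would run: $JCMD"
fi
```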
Problems with Cosmic e-Lab
No obvious errors during workshops.
Action Item: Tibi will communicate with Ben and Mihael about cleaning up data0.
According to Ben, data0 holds only backup data (100% full); it should not affect the functioning of the e-Lab. data1 has space (79% full).
Question about disk0 and disk1: is the uploaded data going to the disk that is 100% full? According to Tibi, no.
To see the current disk usage, log on to www13 and run the df command.
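For example, with human-readable sizes; filtering for the data partitions assumes they appear as data0/data1 in the mount table.

```shell
# Disk usage check; keeps the header line plus any data0/data1 partitions:
df -h | grep -E 'Filesystem|data[01]'
```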
Rollout
Rollout is this Friday. Coordinate with Bob Peterson because of the workshop after Argonne.
Google results
Searching for "I2U2" on Google gives links to www11 and www10. We would prefer links just to www.
Tibi may try to find out whether he can set things up so that www appears instead of www11, etc.
This may happen naturally after the load balancing has been in place for a while. We may want to provide
redirects in place of proxy connections, as appropriate, to steer the googlebot (and other search spiders) to the right place.
-- Main.LizQuigg - 20 Jun 2007
-- Main.EricMyers - 20 Jun 2007
-- Main.EricMyers - 03 Jul 2007