CMS
Revision as of 13:34, 9 April 2007
Goals
- L-Store plugin for ROOT, so that ROOT I/O can read (write?) L-Store files
- CMS then uses REDDnet as temporary storage for user analysis (Tier 3 and below)
- Other CMS applications are possible; begin with the above.
Benchmarks
- IBP --> CMSSW streaming tests
- CMSSW_1_2_0
- input: 1.5GB root file, 100 events
- ROOT/L plugin
- extends the TFile class
- Naive implementation
- relays block requests directly from ROOT
- no blocksize specified in ROOT (optimizing blocksize may reduce latency significantly)
- no caching or read-ahead (adding these may also reduce latency)
- PING measures network latency
- IPERF & File Download measure network bandwidth
- RUNNING ON VAMPIRE, 2GHz CPU:
Data Source | # Depots (stripe) | URL | PING time (ms) | IPERF in | IPERF out | File (lors) Download | CMSSW default (mins)
---|---|---|---|---|---|---|---|
local gpfs | 0 | /gpfs2/ | 11 | ||||
Vanderbilt REDDnet Depots | 10 | vudepot1.accre.vanderbilt.edu | 0.16 | 1.0 | |||
across campus IBP Depot | 1 | vpac12.phy.vanderbilt.edu | 0.46 | 3.5 | |||
Remote IBP Depots | 5 | ounce.cs.utk.edu | 13 | 20 | |||
pound.cs.utk.edu | 13 | ||||||
acre.cs.utk.edu | 13 | ||||||
umich-depot01.ultralight.org | 83 | ||||||
ibp.its.uiowa.edu | 35 |
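The naive plugin described above relays every block request as its own network round trip, so the PING time is paid once per request. A minimal sketch of the read-ahead idea, assuming a hypothetical `fetch(offset, length)` callable that stands in for one depot request (it is not the actual ROOT/L or LoRS API), and an arbitrary window size:

```python
class ReadAheadReader:
    """Serve small reads from a larger window fetched in one round trip.

    `fetch` is a hypothetical stand-in for a single depot request.
    """

    def __init__(self, fetch, file_size, window=1 << 20):
        self.fetch = fetch            # fetch(offset, length) -> bytes
        self.file_size = file_size
        self.window = window          # bytes requested per round trip
        self.buf_start = 0
        self.buf = b""
        self.round_trips = 0

    def read(self, offset, length):
        out = bytearray()
        end = min(offset + length, self.file_size)
        while offset < end:
            buf_end = self.buf_start + len(self.buf)
            if not (self.buf_start <= offset < buf_end):
                # Cache miss: fetch a whole window starting at this offset.
                n = min(self.window, self.file_size - offset)
                self.buf = self.fetch(offset, n)
                self.buf_start = offset
                self.round_trips += 1
                buf_end = self.buf_start + len(self.buf)
                if not self.buf:
                    break             # source returned nothing; stop early
            take = min(end, buf_end) - offset
            lo = offset - self.buf_start
            out += self.buf[lo:lo + take]
            offset += take
        return bytes(out)


# Toy data source: 16 KiB held in memory instead of on a depot.
data = bytes(range(256)) * 64

def fetch(offset, length):
    return data[offset:offset + length]

reader = ReadAheadReader(fetch, len(data), window=4096)
chunks = [reader.read(i * 256, 256) for i in range(64)]
print("small reads:", len(chunks), "round trips:", reader.round_trips)
```

In this toy run, 64 sequential 256-byte reads need only 4 round trips, where a direct relay would need 64. Whether a real analysis job benefits depends on how sequential its access pattern is.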
- RUNNING AT CALTECH, 2.4GHz Opteron:
Data Source | # Depots (stripe) | URL | PING time (ms) | IPERF in | IPERF out | 1.5GB File Download time (m:s) | 1.5GB File Download rate (Mbits/s) | CMSSW default (mins) | CMSSW >blksize (mins)
---|---|---|---|---|---|---|---|---|---|
local disk | 0 | /dev/hda3 | 1.0 | ||||||
Vanderbilt REDDnet depots | 10 | vudepot1.accre.vanderbilt.edu | 78 | 100 | |||||
U Mich REDDnet depots | 2 | umich-depot01.ultralight.org | 63 | 4:00 | 51 | 33 | |||
Caltech REDDnet depots | 2 | caltech-depot01.ultralight.org | 0.60 | 0:16 | 770 | 1.0 | |||
UT Knoxville IBP depot | 1 | acre.cs.utk.edu | 93 | ||||||
U Iowa IBP depots | 1 | ibp.its.uiowa.edu | 70 |
Current Work in Progress
- Figure out how to get the necessary code included in CMSSW
- Talk to Bill Tannenbaum, Philippe Canal,...
- include L-Store code in the CMSSW distribution so it is built correctly on all platforms for use with the rest of the CMS software.
- that way users have no software to download themselves, no configuration scripts to change, etc.
- how to test and validate before checking in?
- how to check code in?
- Figure out all issues needed to integrate with the CMS data distribution model
- PhEDEx, TFILE, DBS/DLS,...
- Switch the ROOT plugin to use the L-Store version of libxio
Demos
Demos at March 2007 OSG Grid Meeting (UC San Diego)
Can use Vladimir's or Dmitri's analysis for all of the demos below.
Interactive Root
- First upload a file to depots spread across the WAN, and use LORSView to show where and how the pieces go.
- Then read it back in ROOT and show that it works.
- Mainly an introduction to the issues.
100 Node ACCRE Run
- each reads its own file from WAN set of depots.
- show speed versus local copies of file (data tethered analysis).
100 CPU Grid Job
- similar to ACCRE Run, each job reads its own file from WAN depots.
- jobs are distributed across Open Science Grid sites
- demonstrates complete lack of data tether.
To Do To Get Ready
- Run all of the above many times before actual demo!
- Get LORSView upload working
- Figure out how to submit 100 CPU Grid Job.
- Want to run all 100 ACCRE jobs simultaneously? Need to work with ACCRE on that...
Get Rick Cavanaugh to run his analysis
- need most of the stuff needed for "Summer 2007 demo" but maybe not all fully in place.
- he runs it himself.
- work with him so he understands full functionality possible.
- work with him to develop ideas for better implementing Summer 2007 demo
- what docs are needed
- best approach to getting users using it
- etc.
Summer 2007 Demo
- A "scratch space" demo for CMS users.
- Use deployed REDDnet resources which should become available June 2007
- Load REDDnet with "hot" data files, convince a few users to try them out
- Must have L-Store code fully integrated with CMS software
General Testing
verify ROOT/L works
- package up plugin for CMS L-Store test community
- gain experience via benchmarking
- finalize API (add write access?)
- check the plugin in to ROOT CVS
- it will take a while for this ROOT addition to propagate into CMSSW
- explore CMSSW procedures for checkin of LORS and/or LSTORE
- it will take months for L-Store to be available for check-in
increasing level of stress tests:
(validate and benchmark)
do various combinations of the following:
- single jobs vs simultaneous jobs
- many jobs: at one cluster vs across the grid
- simultaneous jobs hitting the same IBP depot, accessing one or many files
- simultaneous jobs hitting same file at one depot or striped across many depots
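The "various combinations" above can be enumerated mechanically so that no case is skipped. A minimal sketch, in which the axis names and labels are illustrative groupings of the bullets above rather than an agreed test plan:

```python
from itertools import product

# Hypothetical axes distilled from the stress-test bullets.
AXES = {
    "concurrency": ["single job", "simultaneous jobs"],
    "placement": ["one cluster", "across the grid"],
    "files": ["one file", "many files"],
    "layout": ["one depot", "striped across many depots"],
}

def build_matrix(axes):
    """Return every combination of axis values as a list of dicts."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]

matrix = build_matrix(AXES)
for number, case in enumerate(matrix, start=1):
    print(number, ", ".join(f"{key}={value}" for key, value in case.items()))
```

Some combinations are not meaningful (a single job cannot be spread across the grid, for example) and would be filtered out by hand before scheduling runs.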
Also need to benchmark and profile various types of jobs:
- I/O intensive skims
- CPU intensive jobs
- show benchmarks/demonstrate which jobs work well with L-Store and which jobs won't work well (if any). Have thorough benchmarks for the worst-case scenario.
- gather numbers to discuss impact on bandwidth as L-Store usage explodes.
- will people feel more free to do unnecessary computations?
assemble interactive analysis demos:
- host variety of interesting datasets
- need to identify these datasets
- make a wiki
- with instructions
- links to necessary data catalogs
- L-Store
- DBS/DLS
- gather visually interesting ROOT Macros
- event/detector displays
- histograms/results
- any FW-lite tools (even development versions) to try
assist user-analysis batch production:
- identify and host a wide variety of datasets
- calibration datasets
- various backgrounds, pileup
- variety of signal samples
- populate catalogs to find datasets
- web tools to assist this
- how to find datasets
- how to upload results
- how to register results in catalogs
- how to coordinate with L-Store and DBS/DLS
provide info on joining CMS/L (via L-Store? via REDDnet? UltraLight?)
- how to add an IBP depot
Long Term Needs
- L-Store version of libxio
- Stable, production REDDnet depot deployment
- including request tracker support!
- L-Store software fully integrated with the CMS software and distributed with it.
- this means we need the source code for the L-Store version of libxio, checked into the CMS distribution
SRM interface
- in principle, this is important for CMS usage.
- Need to get new support person on board with this and up to speed.