August 18, 2006
Latest revision as of 09:33, 18 August 2006
Attending
- Blackwell (SFA)
- Hagewood (Nevoa)
- Beck, Moore, Chris Sellers, David LaBissoniere (UTK)
- Sheldon, Tackett (VU)
- Swany (UDel/CERN)
Current Agenda
Application Communities
- Update on work with Electron Microscopy (Micah/Chris)
Pheobe's application is running in 32-bit mode on ACCRE (or at least it is close), but she wants it to run on her 64-bit PowerPC machines, which is more difficult. Chris has been integrating the IBP I/O routines into her application. Larry Dawson suggested that Chris use a compiler that ACCRE has for the 64-bit PowerPC...
- News from CERN (Martin)?
CERN is amenable to our installing our software there on the Datatag machines. If Harvey approves it, we can get the routing changed, which is necessary because it is a closed network.
In the long run, we should install our own equipment. Martin and Paul may try to meet up at CERN in late September.
Another possibility is to talk to the Open Lab people. They are interested in storage and are doing some work with xrootd. We could put some depots on machines there if the Datatag is problematic.
- AmericaView (PR)
PR has made contact with people, and we should be able to make some contact with them at the Chicago I2 meeting.
Deployment and Operations
- Visit to Fermi (Bobby)
- Deployment to ORNL (Bobby)
- Status of AMPATH testing (Surya/Hunter)
  - Progress on Brazilian depot deployment at well-connected locations
  - Plans for future tests
News from the AMPATH Runs:
I am in Seattle, so am sending my side of things by email. If you do need further clarifications, you can call me on my cell (6152758519).
1. So far we have had two machines available to us at Rio. I have used both lors and lstore to do test runs. Since the C lors library is about 1.5 times faster than the Java lors library, I have been using the C library for the test runs.
2. The network setup at Rio is something I do not quite understand. Initially I was able to get about 110 Mbits/sec per box, thus filling about 25% of the pipe. Each machine was running anywhere between 6 and 8 clients transferring a 250 MB file with 20 MByte block sizes. The TCP window size on the depots was configured to 256 MBytes, whereas on the clients it was set to 128 MBytes. The weirdness lies in the fact that we have seen occasional peaks of about 350 Mbits/sec with the two machines. In fact, this week I have seen a sustained rate of around 160 Mbits/sec using just one client. This result coincided with an infrastructure failure at the Rio end. At the time we saw this burst, only the machine I was using was running, with very little else going on on their campus network. This leads me to a thought: there must be something going on at other times that is limiting the performance of the individual machines. We are definitely sharing these boxes, so it's hard to identify whether it is a network issue or something else. iperf runs have, though, established that we can see about 600 Mbits/sec, so we are nowhere close to that figure.
3. In any case, right now they have a major infrastructure breakdown (no air conditioning), so we have only one box.
4. Of course, having no root access to the box also means that it takes time to tweak anything on that side.
5. Thanks to Takeo we have a box. But I have been seeing only 17 Mbits/sec (yes, it is bits and not bytes). Something needs to be adjusted. I have asked Takeo to change the TCP window size to 128 MBytes. Haven't heard back.
6. Plans: Rio has promised 6 machines. Once I get São Paulo I will keep running tests and update this group. My gut feeling is that it is a setup issue on that side. There is no question regarding whether we can saturate a 1 gig pipe or not.
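The figures in the email can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it computes the share of a link used by a set of transfers, and the standard relation that a single TCP connection cannot move more than one window per round trip. The function names and the round-trip time are assumptions made for the example; no RTT to Rio was reported in the notes.

```python
from typing import Iterable


def link_utilization(rates_mbits: Iterable[float], link_mbits: float) -> float:
    """Fraction of the link filled by the given per-box rates (Mbits/sec)."""
    return sum(rates_mbits) / link_mbits


def window_limited_rate_mbits(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on one TCP connection: one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


def window_needed_bytes(target_mbits: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: window needed to sustain target_mbits."""
    return target_mbits * 1e6 * rtt_seconds / 8


if __name__ == "__main__":
    # Two boxes at about 110 Mbits/sec each on a 1 gig pipe (from the email).
    print(f"utilization: {link_utilization([110, 110], 1000):.0%}")

    # ASSUMPTION: a 100 ms round trip, chosen only for illustration.
    assumed_rtt = 0.1
    needed = window_needed_bytes(1000, assumed_rtt)
    print(f"window to fill 1000 Mbits/sec at {assumed_rtt} s RTT: {needed:.0f} bytes")
```

If the windows really are set as large as reported, this bound would not explain rates in the tens or hundreds of Mbits/sec, which is consistent with the suspicion of a setup or sharing issue on the far side.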
Technology: Hardware, Software
- lodn_cp (Micah)
Micah has been testing on the EWOK machines at ORNL, over InfiniBand.
- L-Store and REDDnet hardware (Alan)
Progress on lstore-cp and other commands: this is working now.
Micah: should we settle on one way to specify the URLs (one that specifies what the protocol is, the host name)? We are eager to have a good, well-designed protocol... can we grab the work you are doing for L-Store and make this the standard way things are done in lodn? This could also give us interoperability.
Alan agrees.
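As an illustration of what a single URL convention would buy both tools, here is a minimal sketch that splits a URL into the protocol, host, port and path it names, using Python's standard `urllib.parse`. The `lstore://` scheme, the host name and the port in the example are hypothetical; the notes do not record the format that was agreed on.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class StorageTarget(NamedTuple):
    protocol: str        # URL scheme, e.g. a hypothetical "lstore" or "lodn"
    host: str            # host name of the server or depot
    port: Optional[int]  # None when the URL does not give a port
    path: str            # path of the file within the namespace


def parse_storage_url(url: str) -> StorageTarget:
    """Split a URL into the pieces a copy tool would need."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"URL must name a protocol and a host: {url!r}")
    return StorageTarget(parts.scheme, parts.hostname, parts.port, parts.path)


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    print(parse_storage_url("lstore://depot.example.org:6714/data/run42.dat"))
```

With one parser of this kind shared by lodn and L-Store, a command such as lodn_cp or lstore-cp could pick its transport from the protocol field instead of each tool inventing its own syntax.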
Capricorn box tests have been going well. We are not doing any RAID at the hardware level; this is done in our software. What about performance? (Micah asks.) There is not a huge penalty for recovery operations, say. Reads and writes do not cost much; for reads there is no cost (data and parity are separate). He then talked about the erasure code formalism (I missed some of this due to a phone call I had to take, apologies).
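To make the point about separate data and parity concrete, here is a minimal sketch of the simplest erasure code, a single XOR parity block. This is an assumption chosen for illustration, not the scheme L-Store actually uses (the notes do not record it). Normal reads touch only the data blocks, and the parity block is needed only when one block has to be rebuilt.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(data_blocks: List[bytes]) -> bytes:
    """Compute one parity block, stored separately from the data blocks."""
    return reduce(xor_blocks, data_blocks)


def recover_block(blocks: List[Optional[bytes]], parity: bytes) -> bytes:
    """Rebuild the single missing block (marked None) from survivors and parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one missing block")
    survivors = [block for block in blocks if block is not None]
    return reduce(xor_blocks, survivors, parity)


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]
    parity = make_parity(data)
    # Simulate losing the second block and rebuilding it.
    rebuilt = recover_block([data[0], None, data[2]], parity)
    print(rebuilt == data[1])
```

A write has to update the parity as well as the data, which is where the modest write cost comes from; recovery reads every surviving block once.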
Organization and Funding Opportunities
- Discussions about an NSF Physics at the Information Frontier (PIF)?
We need to set up a phone call with Stefan at UTK next week. Maybe late morning on Thursday?
Events, Education, Outreach
- Upcoming OSG meeting (Paul, Surya)
Surya's Talk (http://mimir.accre.vanderbilt.edu/cgi-bin/public/DocDB/ShowDocument?docid=81) that he will present at the OSG meeting on L-Store, REDDnet...
- SC06?
  - Status of Ultralight plans (Alan)
- Plans for Internet2 Side meeting (Terry, Micah)
Discussion has started with Laurie Berns of I2, and it looks good for our having a meeting before or after. They would give us a room. The title would be "REDDnet tools and applications". We are bringing REDDnet up, and the goal of the meeting would be to share what we have been doing, talk about the tools that are available, and give people help in terms of participating. South and Central Americans do tend to come to this meeting, so this might be one place we can make some headway.
We expect to hear from Laurie pretty soon.
We could also propose a breakout session during the meeting, say an hour long. This would give us a chance to talk with the regular conference attendees. Proposals are due August 31st.
Action Items
- Paul: set up a phone call re PIF with Stefan for next week.