SC06: Difference between revisions

 
(31 intermediate revisions by 7 users not shown)
Line 5: Line 5:
**Tony Messacappa
**Tony Messacappa


== Demo Participation ==
==Talks and Demonstrations==
===L-Store: Storage Virtualization for the Open Science Grid===
'''Alan Tackett, Vanderbilt University/ACCRE'''
 
L-Store allows storage devices distributed across a wide geographic region to
function as a single file system.  Specifically, L-Store provides a distributed namespace for
storing arbitrarily sized data objects.  It is
agnostic to the data transfer and storage mechanism, although currently only IBP is supported.
L-Store uses a Chord-based DHT implementation to provide metadata scalability.
Multiple metadata servers can be used to increase reliability and fault tolerance, and
metadata server nodes can be added and removed in real time.
L-Store supports Weaver erasure encoding of data, so that files can be stored across multiple
storage devices and recovered even if some of these devices fail.
 
L-Store/IBP is highly performant. In our booth we will be demonstrating sustained writes to a rack
of disk servers with an aggregate transfer rate of 3-4 GBytes/sec.  The plot below shows
that we were able to sustain traffic at nearly 3.3 GBytes/sec for a two-hour period (and we could
have continued for several more hours):
 
[[Image:35Gbs.png]]
 
 
 
At SC06, we successfully filled a 10 GigE pipe with traffic between Caltech (Pasadena) and our depot
on the floor at Tampa:
 
[[Image:pasadena2tampa.png]]
 
===Nevoa Storage System: Managing scalable storage virtualization infrastructure===
'''Hunter Hagewood, Nevoa Networks'''
The deployment of distributed systems in both wide and local area networks requires a more extensive set of management tools in order to attain the levels of reliability typically found in self-contained systems. Nevoa Networks currently offers such tools through its Nevoa StorCore product line. Nevoa StorCore allows you to easily configure, monitor, share, and manage the storage resources of your logistical network. Partitioning pooled storage capacity, defining reliability policies, automating fault detection decisions, and tiering or peering with other logistical networks are just some of the functions Nevoa StorCore offers.
This demonstration will include a walkthrough of Nevoa StorCore's features, examples of national and international deployments, and an introduction to Nevoa Explorer Personal Edition, the client component of the Nevoa Storage System for end users.
 
===How to create a multi-user, wide area facility for data intensive visualization===
'''Contact: Jian Huang, University of Tennessee <huang@cs.utk.edu>'''
Participants: VU, ORNL
 
Contrary to conventional wisdom, we show that a distributed collection of heterogeneous computing resources can be used effectively to create a scalable infrastructure for high performance, data intensive visualization. In this demonstration, we will show how a large scale visualization of cutting edge simulation datasets can be shared among multiple, distributed users working on the same pool of distributed resources. The demonstration will use IBP depots, equipped with our Visualization Cookbook Library (VCL), at the ACCRE booth, at other booths on the SC06 show floor and on REDDnet's wide area infrastructure. As shown, good performance can be attained with only a standard Internet connection and in spite of the presence of several other simultaneous users.


===SRM===
===SRM===
Contact: Alex Sim of LBL <ASim@lbl.gov>  
'''Contact: Alex Sim of LBL <ASim@lbl.gov>'''


This is only for SRM v2.2 implementations with the updated WSDL.  
This is only for SRM v2.2 implementations with the updated WSDL. We'll use srm-tester to run put/put_done/bring-online/get/release/remove, and the same with space reservation (with a space token). In addition, we'll have srmLs. Then, we'll have srmCopy for a remote copy from a gridftp server and from another SRM server.  Lastly, we'll have two SRMs involved in a coordinated remote copy by get_1/put_2/gridftp_between_1_2/release_1/put_done_2/srmLs_2/remove_2.
We'll use srm-tester to run put/put_done/bring-online/get/release/remove, and the same with space reservation (with a space token). In addition, we'll have srmLs. Then, we'll have srmCopy for a remote copy from a gridftp server and from another SRM server.  Lastly, we'll have two SRMs involved in a coordinated remote copy by get_1/put_2/gridftp_between_1_2/release_1/put_done_2/srmLs_2/remove_2.


All tests will be repeated, and no left-over files will remain because each run removes the files of the previous test. We'll prepare a GUI to show intermediate progress, and the results will be shown in the GUI itself as well as on the web.
All tests will be repeated, and no left-over files will remain because each run removes the files of the previous test. We'll prepare a GUI to show intermediate progress, and the results will be shown in the GUI itself as well as on the web.
Line 22: Line 58:
*RAL, CERN - CASTOR
*RAL, CERN - CASTOR
*VU - LStore
*VU - LStore
Any methods that are not implemented or not supported at a site will be screened in advance and skipped for the demo.  The servers are expected to be up and running during SC06 (11/12-11/16/2006).
Any methods that are not implemented or not supported at a site will be screened in advance and skipped for the demo.  The servers are expected to be up and running during SC06 (11/12-11/16/2006).''
   
   
I'll need to ask you for your institutional logo, project title, etc. when we prepare the slides. If you do not want to participate in the SC06 demo, please let me know.
I'll need to ask you for your institutional logo, project title, etc. when we prepare the slides. If you do not want to participate in the SC06 demo, please let me know.
 
==Schedule of Talks and Demonstrations==
===Monday===
===Tuesday===
*1:00-2:00 -- Multi-user, Data Intensive Visualization
===Wednesday===
*1:00-2:00 -- Multi-user, Data Intensive Visualization
===Thursday===
*1:00-2:00 -- Multi-user, Data Intensive Visualization
==Computing resources used in demonstrations==
 
===Vanderbilt/ACCRE Booth===
[[Image:thankyous.jpg|right|250px]]
'''100 TBytes of online disk storage, 40 Gb/s networking, and a 96-CPU compute cluster'''
 
* 40 Capricorn Technologies 3 TB disk servers with dual-core Athlon processors.
 
* Foundry sx 800 Switch with 192 1-GigE ports and 4 10-GigE ports.
 
* Four 10-GigE Foundry ports connected to the showroom floor (SCinet) and the external network.
 
* A 96 CPU "client" cluster consisting of:
** 2 dual-CPU, dual-core Opteron servers
** 9 dual-CPU Opteron servers
** 14 IBM JS20 dual-CPU PowerPC blades
** 7 IBM x336 servers (dual-CPU Intel)
** 14 IBM x335 servers (dual-CPU Intel)
 
===Collaborating booths at SC06===


===Distributed visualization Facility===
===Current REDDnet nodes, national and international===
Contact: Jian Huang of UTK (huang@cs.utk.edu)
Probable participants: VU, ORNL


''Creating a multiuser visualization facility based on distributed and heterogeneous resources has proved to be an elusive challenge for the high performance computing and visualization community.  We have developed a highly deployable solution to this problem. Our solution uses a collection of freely available, unscheduled and unreserved storage and computing resources, distributed across the network, to provide the essential infrastructure that such a visualization facility needs.  Our demonstration will show that large-scale visualization of cutting-edge simulation datasets can be shared among multiple distributed users from the same pool of distributed resources, requiring only a standard Internet connection for each user.''
* ANSP - Academic Network at São Paulo (Brazil)
* California Institute of Technology
* Fermi National Accelerator Laboratory
* Universidade do Estado do Rio de Janeiro (Brazil)
* University of Florida
* University of Michigan
* Vanderbilt University

Latest revision as of 15:17, 30 November 2006

Meetings

  • Discussion with ORNL and Ultralight to set up REDDnet nodes at ORNL
  • People we should think about meeting with
    • Jackie Chen and/or her group
    • Tony Messacappa

Talks and Demonstrations

L-Store: Storage Virtualization for the Open Science Grid

Alan Tackett, Vanderbilt University/ACCRE

L-Store allows storage devices distributed across a wide geographic region to function as a single file system. Specifically, L-Store provides a distributed namespace for storing arbitrarily sized data objects. It is agnostic to the data transfer and storage mechanism, although currently only IBP is supported. L-Store uses a Chord-based DHT implementation to provide metadata scalability. Multiple metadata servers can be used to increase reliability and fault tolerance, and metadata server nodes can be added and removed in real time. L-Store supports Weaver erasure encoding of data, so that files can be stored across multiple storage devices and recovered even if some of these devices fail.
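
To illustrate the metadata placement idea, here is a minimal Python sketch of successor lookup on a Chord-style hash ring. It is not L-Store's implementation: the class name `MetadataRing`, the server names, the object path, the 32-bit ring size, and the use of SHA-1 are all assumptions made for this example, and a real Chord deployment routes through finger tables rather than a global sorted list.

```python
import hashlib
from bisect import bisect_left, insort

RING_BITS = 32  # assumed ring size, chosen only to keep identifiers short


def ring_id(key: str) -> int:
    """Map a server name or object path to a position on the ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


class MetadataRing:
    """Hypothetical ring of metadata servers (not the L-Store API)."""

    def __init__(self, servers):
        # Keep (ring position, server name) pairs sorted by position.
        self._nodes = sorted((ring_id(s), s) for s in servers)

    def add(self, server: str) -> None:
        """A metadata server joins the ring."""
        insort(self._nodes, (ring_id(server), server))

    def remove(self, server: str) -> None:
        """A metadata server leaves the ring."""
        self._nodes.remove((ring_id(server), server))

    def owner(self, path: str) -> str:
        """Return the first server at or after the key's ring position."""
        ids = [node_id for node_id, _ in self._nodes]
        # Wrap around to the first node when the key is past the last one.
        index = bisect_left(ids, ring_id(path)) % len(self._nodes)
        return self._nodes[index][1]


ring = MetadataRing(["meta-a", "meta-b", "meta-c"])  # hypothetical names
path = "/demo/run42/output.dat"                      # hypothetical object
before = ring.owner(path)
ring.add("meta-d")  # a server joins while the ring is in use
after = ring.owner(path)
print(before, "->", after)  # the key either stays put or moves to meta-d
```

The property that matters for adding and removing servers on the fly is that a membership change only moves the keys adjacent to the affected server; every other key keeps its owner.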

L-Store/IBP is highly performant. In our booth we will be demonstrating sustained writes to a rack of disk servers with an aggregate transfer rate of 3-4 GBytes/sec. The plot below shows that we were able to sustain traffic at nearly 3.3 GBytes/sec for a two-hour period (and we could have continued for several more hours):

[Image: 35Gbs.png]
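
For a sense of scale, the quoted rate can be converted into total volume and per-server load. The short Python sketch below uses decimal units and assumes that all 40 disk servers listed for the booth shared the load evenly; the text does not say how many servers took part, so the per-server figure is only indicative.

```python
RATE_GBYTES_PER_S = 3.3    # sustained rate quoted in the text
DURATION_S = 2 * 60 * 60   # the two-hour period
DISK_SERVERS = 40          # assumption: every booth disk server took part

# Total data written during the run, in decimal terabytes.
total_tbytes = RATE_GBYTES_PER_S * DURATION_S / 1000

# The same rate expressed in gigabits per second.
rate_gbits_per_s = RATE_GBYTES_PER_S * 8

# Average write rate each server would see under an even split.
per_server_mbytes_per_s = RATE_GBYTES_PER_S * 1000 / DISK_SERVERS

print(f"written in two hours: {total_tbytes:.2f} TB")
print(f"aggregate rate:       {rate_gbits_per_s:.1f} Gb/s")
print(f"per disk server:      {per_server_mbytes_per_s:.1f} MB/s")
```

Under these assumptions the run wrote roughly 24 TB at about 26 Gb/s, or a little over 80 MB/s per server.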


At SC06, we successfully filled a 10 GigE pipe with traffic between Caltech (Pasadena) and our depot on the floor at Tampa:

[Image: pasadena2tampa.png]

Nevoa Storage System: Managing scalable storage virtualization infrastructure

Hunter Hagewood, Nevoa Networks

The deployment of distributed systems in both wide and local area networks requires a more extensive set of management tools in order to attain the levels of reliability typically found in self-contained systems. Nevoa Networks currently offers such tools through its Nevoa StorCore product line. Nevoa StorCore allows you to easily configure, monitor, share, and manage the storage resources of your logistical network. Partitioning pooled storage capacity, defining reliability policies, automating fault detection decisions, and tiering or peering with other logistical networks are just some of the functions Nevoa StorCore offers. This demonstration will include a walkthrough of Nevoa StorCore's features, examples of national and international deployments, and an introduction to Nevoa Explorer Personal Edition, the client component of the Nevoa Storage System for end users.

How to create a multi-user, wide area facility for data intensive visualization

Contact: Jian Huang, University of Tennessee <huang@cs.utk.edu>

Participants: VU, ORNL

Contrary to conventional wisdom, we show that a distributed collection of heterogeneous computing resources can be used effectively to create a scalable infrastructure for high performance, data intensive visualization. In this demonstration, we will show how a large scale visualization of cutting edge simulation datasets can be shared among multiple, distributed users working on the same pool of distributed resources. The demonstration will use IBP depots, equipped with our Visualization Cookbook Library (VCL), at the ACCRE booth, at other booths on the SC06 show floor and on REDDnet's wide area infrastructure. As shown, good performance can be attained with only a standard Internet connection and in spite of the presence of several other simultaneous users.

SRM

Contact: Alex Sim of LBL <ASim@lbl.gov>

This is only for SRM v2.2 implementations with the updated WSDL. We'll use srm-tester to run put/put_done/bring-online/get/release/remove, and the same with space reservation (with a space token). In addition, we'll have srmLs. Then, we'll have srmCopy for a remote copy from a gridftp server and from another SRM server. Lastly, we'll have two SRMs involved in a coordinated remote copy by get_1/put_2/gridftp_between_1_2/release_1/put_done_2/srmLs_2/remove_2.

All tests will be repeated, and no left-over files will remain because each run removes the files of the previous test. We'll prepare a GUI to show intermediate progress, and the results will be shown in the GUI itself as well as on the web.

The 6 current implementations at 7 sites are as follows:

  • CERN - DPM
  • FNAL - dCache
  • INFN - StoRM
  • LBNL - SRM
  • RAL, CERN - CASTOR
  • VU - LStore

Any methods that are not implemented or not supported at a site will be screened in advance and skipped for the demo. The servers are expected to be up and running during SC06 (11/12-11/16/2006).
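
The test plan described above (fixed step sequences, repeated runs, and per-site screening of unsupported methods) can be sketched as a small driver. This is a hypothetical Python outline, not srm-tester itself: `run_sequence`, `fake_execute`, and the `unsupported` set are names invented for this example, and the step labels are simply the ones used in the text, with `bring_online` standing for the bring-online step.

```python
# Step names as listed in the demo description.
BASIC_SEQUENCE = ["put", "put_done", "bring_online", "get", "release", "remove"]
COORDINATED_COPY = [
    "get_1", "put_2", "gridftp_between_1_2",
    "release_1", "put_done_2", "srmLs_2", "remove_2",
]


def run_sequence(steps, execute, unsupported=frozenset(), repeats=1):
    """Run each step in order, skipping methods screened out for a site.

    Returns a log of (round number, step name, success flag) tuples.
    """
    log = []
    for round_no in range(1, repeats + 1):
        for step in steps:
            if step in unsupported:
                continue  # screened in advance, so skipped for this site
            log.append((round_no, step, execute(step)))
    return log


def fake_execute(step: str) -> bool:
    """Stand-in for a real SRM call; always reports success."""
    return True


# Example: a site without bring-online support, with the test run twice.
log = run_sequence(
    BASIC_SEQUENCE, fake_execute, unsupported={"bring_online"}, repeats=2
)
for entry in log:
    print(entry)
```

Because each sequence ends with a remove step, repeating it starts every round from a clean state, which matches the stated goal of leaving no files behind.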

I'll need to ask you for your institutional logo, project title, etc. when we prepare the slides. If you do not want to participate in the SC06 demo, please let me know.

Schedule of Talks and Demonstrations

Monday

Tuesday

  • 1:00-2:00 -- Multi-user, Data Intensive Visualization

Wednesday

  • 1:00-2:00 -- Multi-user, Data Intensive Visualization

Thursday

  • 1:00-2:00 -- Multi-user, Data Intensive Visualization

Computing resources used in demonstrations

Vanderbilt/ACCRE Booth

Thankyous.jpg

100 TBytes of online disk storage, 40 Gb/s networking, and a 96-CPU compute cluster

  • 40 Capricorn Technologies 3 TB disk servers with dual-core Athlon processors.
  • Foundry sx 800 Switch with 192 1-GigE ports and 4 10-GigE ports.
  • Four 10-GigE Foundry ports connected to the showroom floor (SCinet) and the external network.
  • A 96 CPU "client" cluster consisting of:
    • 2 dual-CPU, dual-core Opteron servers
    • 9 dual-CPU Opteron servers
    • 14 IBM JS20 dual-CPU PowerPC blades
    • 7 IBM x336 servers (dual-CPU Intel)
    • 14 IBM x335 servers (dual-CPU Intel)
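
The headline figures for the booth can be cross-checked against the itemized list. The Python sketch below assumes that the first cluster entry counts each core as a CPU (two sockets of dual-core Opterons, so four per server) and that every other entry counts two CPUs per server; the variable names are illustrative only.

```python
# (description, number of servers, CPUs counted per server)
client_cluster = [
    ("dual-CPU, dual-core Opteron", 2, 4),  # assumption: 2 sockets x 2 cores
    ("dual-CPU Opteron", 9, 2),
    ("IBM JS20 PowerPC blade", 14, 2),
    ("IBM x336 (Intel)", 7, 2),
    ("IBM x335 (Intel)", 14, 2),
]

total_cpus = sum(servers * cpus for _, servers, cpus in client_cluster)
raw_disk_tbytes = 40 * 3  # 40 disk servers at 3 TB each
uplink_gbits = 4 * 10     # four 10-GigE ports

print(f"client CPUs:    {total_cpus}")
print(f"raw disk:       {raw_disk_tbytes} TB")
print(f"10-GigE uplink: {uplink_gbits} Gb/s")
```

The CPU and network totals match the headline of 96 CPUs and 40 Gb/s. The raw disk total of 120 TB is higher than the 100 TBytes quoted; the page does not explain the difference.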

Collaborating booths at SC06

Current REDDnet nodes, national and international

  • ANSP - Academic Network at São Paulo (Brazil)
  • California Institute of Technology
  • Fermi National Accelerator Laboratory
  • Universidade do Estado do Rio de Janeiro (Brazil)
  • University of Florida
  • University of Michigan
  • Vanderbilt University