Main Page: Difference between revisions

From ReddNet
 
(68 intermediate revisions by 10 users not shown)
Line 1: Line 1:
[[Image:reddnetmap.gif|right|450px]]
__NOTOC__


== REDDnet Documentation ==  
== {{Template:REDDnet}}:  Enabling Data Intensive Science in the Wide Area ==


* [http://mimir.accre.vanderbilt.edu/cgi-bin/public/DocDB/ShowDocument?docid=73 REDDnet NSF MRI Proposal]
[[Image:reddnetmap.gif|right|500px]]{{Template:REDDnet}} (Research and Education Data Depot network) is an NSF-funded infrastructure project designed to provide a large distributed storage facility for data intensive collaboration among the nation's researchers and educators in a wide variety of application areas. Its mission is to provide "working storage" to help manage the logistics of moving and staging large amounts of data in the wide area network, e.g. among collaborating researchers who are either trying to move data from one collaborator (person or institution) to another or who want to share large data sets for limited periods of time (ranging from a few hours to a few months) while they work on them. REDDnet is not designed or intended to be a replacement for reliable archival or long-term personal storage, and users must make separate arrangements to ensure that the data they are sharing via REDDnet's "best effort" storage is also preserved independently with stronger guarantees.


* [http://mimir.accre.vanderbilt.edu/cgi-bin/public/DocDB/ShowDocument?docid=84 L-Store Presentation at the University of Sao Paulo, July, 2006]  
One example comes from the [http://cms.cern.ch/ CMS] collaboration, a high energy physics experiment that will be taking data soon at the Large Hadron Collider (LHC) at [http://public.web.cern.ch/ CERN].  Groups of researchers, distributed across the country and the world, will want to use data products derived from the raw data produced by collisions in the LHC to do a variety of tasks, from calibrating the detector to searching for new physics.  They will want the newest data products available for anywhere from a month to a few months, after which they can be archived to make way for the next batch of data. Although all the data will be stored long term at CERN and [http://www.fnal.gov/ Fermilab], these groups would benefit greatly if this data could be made more readily available for processing on their distributed computing infrastructure, especially on the [http://www.opensciencegrid.org/ Open Science Grid]. REDDnet is the kind of resource needed to deal with the data logistics of this application.


* [http://mimir.accre.vanderbilt.edu/cgi-bin/public/DocDB/ShowDocument?docid=82 L-Store Presentation at LBNL, Sept. 9, 2006]
Another example, from the [http://www.americaview.org/ AmericaView] project, might occur in the aftermath of an earthquake in California or a hurricane on the Gulf Coast, where researchers across the country will want access to the geospatial image data
from satellites covering the affected region. For a few months after the event, this data could be uploaded to {{Template:REDDnet}} and made available to this community with much higher levels of performance and availability.


*'''The ACCRE Booth at [[SC06]] will highlight REDDnet technology'''
Initially, {{Template:REDDnet}} will deploy more than 700 terabytes of distributed storage with an emphasis on scalability, speed and fault tolerance. As of spring 2008, roughly 160 TB are deployed.
As a demonstration of speed, at the
Supercomputing 2006 Conference in Tampa, Florida, {{Template:REDDnet}} sustained transfers at a rate of 10 gigabits per second between Caltech and the convention floor. These transfers were limited by the bandwidth of the network connection. At the same conference, {{Template:REDDnet}} demonstrated fault tolerance by striping data across thirty depots and then successfully reading the data even after nine of these depots were turned off.


== Component Technologies and Partners ==
== Research Projects Using {{Template:REDDnet}} ==


* [http://www.lstore.org/mwiki/index.php/Main_Page L-Store], the Logistical Storage project at ACCRE (Vanderbilt)
* [http://www.americaview.org/ AmericaView] - Satellite remote sensing data and technologies in support of applied research, K-16 education, workforce development, and technology transfer.


* [http://loci.cs.utk.edu/ LoCI], the Logistical Networking and Internetworking Laboratory at the University of Tennessee
* [http://cms.cern.ch/ CMS] - Elementary Particle Physics at the [http://public.web.cern.ch/ CERN] Large Hadron Collider.


* the [http://www.ultralight.org/ UltraLight] Project, an Ultrascale Information System for Data Intensive Research
* Structural Biology - Image reconstruction of large macromolecular assemblies through a collaborative effort of Vanderbilt and Lawrence Berkeley National Laboratory researchers.


* the Vanderbilt [http://www.vanderbilt.edu/americas/ Center for the Americas]
* [http://www.phy.ornl.gov/tsi/ Terascale Supernova Initiative] - a multidisciplinary collaboration to develop models for core collapse supernovae and related enabling technologies.


== REDDnet@Work ==
* [http://www.ngda.org/ National Geospatial Digital Archive] (NGDA) - a collecting network for the archiving of geospatial images and data.


* [[REDDnet at Work Page]] -- Organization, [[REDDnet Meetings and Minutes Page|Meeting Notes]], Work Plans, Events, etc.
* [http://www.vanderbilt.edu/americas/English/pagemanager.php?page=Merin.php Retinopathy] - Diabetic Eye Disease Screening in Peru and Bolivia
*[[REDDnet Tools and Applications Meeting 2006]] December 4, 8:00am-5:00pm, Hyatt Regency McCormick Place, Chicago, IL. In coordination with the [http://events.internet2.edu/2006/fall-mm/index.html Fall 2006 Internet2 Member Meeting]
 
<br />


== Collaborators ==


=== Core Institutions ===
{{Template:REDDnet_Collaborators}}
 
<table width="600px" border=0 cellspacing="0" cellpadding="0">
<tr><td>
[[Image:vubw.jpg|center|Vanderbilt]]
</td><td>
[[Image:utorange.gif|70px|center|Tennessee]]
</td><td>
[[Image:SFA.gif|70px|center|Stephen F. Austin]]
</td><td>
[[Image:nevoa.png|60px|center|nevoa]]
</td><td>
[[Image:NCstate.gif|50px|center|N. C. State]]
</td><td>
[[Image:udel.gif|55px|center|Delaware]]
</td></tr>
<tr><td align="center">
Vanderbilt
</td><td align="center">
Tennessee
</td><td align="center">
S. F. Austin
</td><td align="center">
Nevoa Networks
</td><td align="center">
N. C. State
</td><td align="center">
Delaware
</td></tr>
 
</table><BR>
 
=== Collaborating Host Institutions ===
 
<table width="700px" border=0 cellspacing="0" cellpadding="0">
<tr><td>
[[Image:usp.gif|90px|center|USP]]
</td><td>
[[Image:uerj.jpg|70px|center|UERJ]]
</td><td>
[[Image:michigan.jpg|60px|center|Michigan]]
</td><td align="center">
[[Image:fermilab.gif|55px|center|Florida]]
</td><td align="center">
[[Image:fnal.gif|55px|center|Fermilab]]
</td><td align="center">
[[Image:citlogo.gif|55px|center|Caltech]]
</td><td>
[[Image:AMPATH.gif|55px|center|AMPATH]]
</td></tr>
<tr><td align="center">
São Paulo
</td><td align="center">
Rio de Janeiro
</td><td align="center">
Michigan
</td><td align="center">
Florida
</td><td align="center">
Fermilab
</td><td align="center">
Caltech
</td><td align="center">
AMPATH
</td></tr>
 
</table><BR>


== Support ==


[[Image:NSF.gif|50px]]  <B>This work is supported by NSF Grant PHY-0619847 and by the Vanderbilt [http://www.vanderbilt.edu/americas/ Center for the Americas]</B>

Latest revision as of 21:04, 22 April 2009


REDDnet: Enabling Data Intensive Science in the Wide Area


REDDnet (Research and Education Data Depot network) is an NSF-funded infrastructure project designed to provide a large distributed storage facility for data intensive collaboration among the nation's researchers and educators in a wide variety of application areas. Its mission is to provide "working storage" to help manage the logistics of moving and staging large amounts of data in the wide area network, e.g. among collaborating researchers who are either trying to move data from one collaborator (person or institution) to another or who want to share large data sets for limited periods of time (ranging from a few hours to a few months) while they work on them. REDDnet is not designed or intended to be a replacement for reliable archival or long-term personal storage, and users must make separate arrangements to ensure that the data they are sharing via REDDnet's "best effort" storage is also preserved independently with stronger guarantees.

One example comes from the CMS collaboration, a high energy physics experiment that will be taking data soon at the Large Hadron Collider (LHC) at CERN. Groups of researchers, distributed across the country and the world, will want to use data products derived from the raw data produced by collisions in the LHC to do a variety of tasks, from calibrating the detector to searching for new physics. They will want the newest data products available for anywhere from a month to a few months, after which they can be archived to make way for the next batch of data. Although all the data will be stored long term at CERN and Fermilab, these groups would benefit greatly if this data could be made more readily available for processing on their distributed computing infrastructure, especially on the Open Science Grid. REDDnet is the kind of resource needed to deal with the data logistics of this application.

Another example, from the AmericaView project, might occur in the aftermath of an earthquake in California or a hurricane on the Gulf Coast, where researchers across the country will want access to the geospatial image data from satellites covering the affected region. For a few months after the event, this data could be uploaded to REDDnet and made available to this community with much higher levels of performance and availability.

Initially, REDDnet will deploy more than 700 terabytes of distributed storage with an emphasis on scalability, speed and fault tolerance. As of spring 2008, roughly 160 TB are deployed. As a demonstration of speed, at the Supercomputing 2006 Conference in Tampa, Florida, REDDnet sustained transfers at a rate of 10 gigabits per second between Caltech and the convention floor. These transfers were limited by the bandwidth of the network connection. At the same conference, REDDnet demonstrated fault tolerance by striping data across thirty depots and then successfully reading the data even after nine of these depots were turned off.

Research Projects Using REDDnet

  • AmericaView - Satellite remote sensing data and technologies in support of applied research, K-16 education, workforce development, and technology transfer.
  • CMS - Elementary Particle Physics at the CERN Large Hadron Collider.
  • Structural Biology - Image reconstruction of large macromolecular assemblies through a collaborative effort of Vanderbilt and Lawrence Berkeley National Laboratory researchers.
  • Retinopathy - Diabetic Eye Disease Screening in Peru and Bolivia


Collaborators

Core Institutions


  • Vanderbilt
  • Tennessee
  • Stephen F. Austin
  • ORNL
  • Nevoa Networks
  • N. C. State
  • Delaware


Collaborating Host Institutions


  • São Paulo (USP)
  • Rio de Janeiro (UERJ)
  • Michigan
  • Florida
  • Fermilab
  • Caltech
  • AMPATH
  • FIU
  • Library of Congress
  • SDSC
  • Stanford
  • UCSB

Support

This work is supported by NSF Grant PHY-0619847 and by the Vanderbilt Center for the Americas.