REDDnet Application Integration Working Group
Work Plan
General Plan of work, etc. to be outlined here?
Applications
AmericaView
AmericaView REDDnet Implementation Plan
Background
The Research and Education Data Depot Network (REDDnet) “will create a wide area storage facility for data intensive collaboration, consisting of a set of eight large storage nodes, strategically positioned across the nation’s high performance research networks, and configured with a distributed storage management technology, called Logistical networking (LN), expressly designed to attack major problems of data logistics.” The hardware infrastructure project, funded by NSF MRI, includes three application groups. One of these is remote sensing data archiving and distribution, represented by the AmericaView Remote Sensing Consortium.
TexasView, the AmericaView member consortium for Texas, has been working for several years to harness Logistical Networking for remote sensing data distribution. Initial work involved adapting the Logistical Download Network (LoDN) for use with the TexasView GloVis archive responder (download manager). The Advanced Automated Archive Responder Gadget accepted scene selections from GloVis and offered either immediate or deferred LoDN data delivery as well as conventional HTTP delivery. The limited success of this effort was attributed to two factors:
First, the “best effort” nature of IBP depots meant that data had to be uploaded from the archive to the depots on demand. The IBP solution to this, i.e. replicating multiple copies across several depots, seemed unsuited for permanent storage of remote sensing data. Further, the excessive storage required for multiple copies of large datasets seemed impractical. As a result, the time required to complete the upload portion of the transaction greatly diminished the time savings afforded by the faster transfer rate.
Second, inherent instability in the LoDN server caused user frustration. After a failed transfer, most users abandoned the LoDN option and used the slower but more reliable HTTP download for future transactions.
REDDnet
The AmericaView REDDnet project addresses both of these shortcomings. First, with L-Store technology, storage is guaranteed. This means that a single copy of the data files can be stored reliably on the REDDnet depots. In addition, with 320 TB of IBP-enabled storage available in the REDDnet network, data storage capacity will no longer be a consideration.
Second, the LoDN server instability will be mitigated by replacing the LoDN server with an integrated facility that hands exnodes and a download client to the user.
Objectives
The goal of the AmericaView REDDnet project is to demonstrate the application of logistical networking technology to archiving and distributing a national archive of remote sensing data. To achieve this goal, the following objectives have been identified:
- Develop an ingest software tool that automates uploading remote sensing archives to REDDnet and archives exnodes in a GloVis-compatible storage system.
- Develop GloVis archive responder software (download manager) that accepts a standard GloVis scene list, retrieves the appropriate exnode(s), delivers the exnodes to the client, and downloads and launches software to retrieve and assemble the original file.
- Load all available AmericaView remote sensing data into REDDnet and expose those data to the general public.
Software Requirements
Upload Tool
The Upload Tool walks an archive data structure, identifies “deliverable units”, uploads these to REDDnet, receives an exnode, and stores the exnode for later use. This process should be merged with existing code used to generate GloVis scene lists and download GloVis browse and metadata. Requirements for the Upload Tool include:
- Ability to autonomously walk a directory structure and identify which files constitute a “deliverable unit”. This should work across a variety of directory structures and file collections, although some restrictions will probably have to be imposed.
- Integrate exnodes with existing GloVis data structure.
- Incorporate existing browse and metadata download functionality.
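The directory walk and exnode storage described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the planned implementation: the grouping rule (files in one directory that share the text before the first "_" or "." form one deliverable unit), the ".xnd" file naming, and the function names find_deliverable_units and ingest are all hypothetical. The actual upload to REDDnet is left as a caller-supplied function because the upload interface is not specified here.

```python
import os
from collections import defaultdict
from typing import Callable, Dict, List


def find_deliverable_units(archive_root: str) -> Dict[str, List[str]]:
    """Walk archive_root and group files into deliverable units.

    Assumption (hypothetical rule): files in the same directory whose
    names share the text before the first "_" or "." form one unit.
    Keys look like "<relative dir>/<scene id>".
    """
    units: Dict[str, List[str]] = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(archive_root):
        rel_dir = os.path.relpath(dirpath, archive_root).replace(os.sep, "/")
        for name in sorted(filenames):
            scene_id = name.replace(".", "_").split("_", 1)[0]
            key = scene_id if rel_dir == "." else rel_dir + "/" + scene_id
            units[key].append(os.path.join(dirpath, name))
    return dict(units)


def ingest(archive_root: str, exnode_dir: str,
           upload: Callable[[List[str]], str]) -> List[str]:
    """Upload each unit and store the returned exnode text on disk.

    `upload` is a placeholder for whatever performs the REDDnet upload;
    it takes the unit's file paths and returns the exnode as text.
    """
    os.makedirs(exnode_dir, exist_ok=True)
    written = []
    for key, files in sorted(find_deliverable_units(archive_root).items()):
        exnode_text = upload(files)
        # Hypothetical naming: one exnode file per deliverable unit.
        out_path = os.path.join(exnode_dir, key.replace("/", "_") + ".xnd")
        with open(out_path, "w", encoding="utf-8") as handle:
            handle.write(exnode_text)
        written.append(out_path)
    return written
```

Keeping the grouping rule in one small function means the "restrictions" on directory layout mentioned above would only need to be expressed in one place.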
Archive Responder
The Archive Responder will accept a standard GloVis scene list, retrieve the appropriate exnode(s) and pass them to the client, then download and launch an exnode retrieval client. Requirements include:
- ability to be “branded” by the hosting entity.
- interaction with both a previously downloaded retrieval client and with a download on demand retrieval client.
- very simple interface – easy to use.
- client TCP/IP stack tuning to improve performance (is this currently done in the LoDN client piece?)
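The scene-list-to-exnode lookup at the core of the Archive Responder might look something like the sketch below. It rests on assumptions that are not confirmed anywhere in this plan: that a scene list can be read as one scene identifier per line (the real GloVis scene list format may differ), and that exnodes are stored as "<scene id>.xnd" files in a single directory. The function names are hypothetical.

```python
import os
from typing import Dict, List, Tuple


def parse_scene_list(text: str) -> List[str]:
    """Extract scene IDs from scene list text.

    Assumption: one scene per line, ID is the first whitespace-separated
    token; blank lines and lines starting with "#" are ignored.
    """
    scene_ids = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            scene_ids.append(line.split()[0])
    return scene_ids


def resolve_exnodes(scene_ids: List[str],
                    exnode_dir: str) -> Tuple[Dict[str, str], List[str]]:
    """Map scene IDs to stored exnode files.

    Returns (found, missing): found maps scene ID to exnode path, and
    missing lists the IDs with no stored exnode so the responder can
    report them instead of failing the whole request.
    """
    found: Dict[str, str] = {}
    missing: List[str] = []
    for scene_id in scene_ids:
        path = os.path.join(exnode_dir, scene_id + ".xnd")
        if os.path.isfile(path):
            found[scene_id] = path
        else:
            missing.append(scene_id)
    return found, missing
```

Returning the missing scenes separately is one way to keep the interface simple for the user: available scenes can still be delivered while unavailable ones are reported.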
AmericaView participation
The initial work on the AmericaView REDDnet project will be completed by TexasView with volunteer assistance from other StateViews. Once the system is operational, all StateViews will be given the opportunity to load their data into REDDnet and/or host a REDDnet-GloVis instance. The latter will give each StateView the opportunity to “brand” both the GloVis interface and the download interface for their instance. It will appear that each StateView is hosting the entire AmericaView archive, while in fact each is merely providing a window into the REDDnet archive.
Loading data from each StateView will be much simpler if some limitations are placed on file formats and band combinations. We might consider defining a standard set of formats, but this is problematic because there are so many different formats available. One approach would be to offer only one file for each scene. This file might be a complete NLAPS data set (zipped), a full dataset in TIFF format (zipped), or a set of three 3-band JPEG composites (zipped). In all cases, only one zip file would be delivered to the client.
However this issue is resolved, the upload procedure must be relatively simple if we are to get widespread adoption by the AmericaView community. The download procedure must be even more straightforward.
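As an illustration of the one-zip-file-per-scene approach discussed above, a StateView could package a scene's files before upload along these lines. This is a minimal sketch using Python's standard zipfile module; the function name package_scene and the "<scene id>.zip" naming are assumptions made for the example, not an agreed convention.

```python
import os
import zipfile
from typing import List


def package_scene(scene_id: str, files: List[str], out_dir: str) -> str:
    """Bundle all files for one scene into a single zip archive.

    Files are stored under their base names, so the client unpacks a
    flat set of files regardless of the source directory layout.
    """
    os.makedirs(out_dir, exist_ok=True)
    zip_path = os.path.join(out_dir, scene_id + ".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            archive.write(path, arcname=os.path.basename(path))
    return zip_path
```

The same function would apply whether the scene is an NLAPS data set, a TIFF dataset, or a set of JPEG composites, since it makes no assumption about the file formats inside the archive.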
Comments:
Please include comments, ideas, suggestions, etc. here.
ATLAS
- TBD (McKee)
CMS
- xrootd integration into L-Store, to allow CMS analysis to use REDDnet storage (Engh)
- The SRM L-Store interface should allow CMS to use REDDnet storage for CMS production running (MC, reconstruction, analysis). This possibility needs to be further explored and developed with Ian Fisk, Brian Bockelman (Cavanaugh, Engh)
Library of Congress
- TBD (Moore)
Local Vanderbilt
- Need to set up a local Vanderbilt demo/outreach project ... with Piston's group? Pheobe?