Development ideas
Suggested changes to existing protocol
Re-order parameters in IBP_STATUS command
The existing IBP v1.4 implementation is:
- version IBP_STATUS RID IBP_ST_INQ password TIMEOUT \n
- version IBP_STATUS RID IBP_ST_CHANGE password TIMEOUT \n max_hard max_soft max_duration \n
- version IBP_STATUS IBP_ST_RES TIMEOUT \n
Notice that two of the commands have the primary command, IBP_STATUS, then a resource ID (RID), followed by a sub-command (IBP_ST_INQ, IBP_ST_CHANGE), while the last version has no RID, just a sub-command, IBP_ST_RES. The current implementation can only be parsed by first reading in the whole line and then counting the number of arguments. The argument count is then used to determine which command is actually being issued. A more natural version of the commands would always have the sub-command immediately follow the IBP_STATUS command, as sketched below.
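A minimal sketch of the re-ordered forms, assuming the arguments themselves stay unchanged:

- version IBP_STATUS IBP_ST_INQ RID password TIMEOUT \n
- version IBP_STATUS IBP_ST_CHANGE RID password TIMEOUT \n max_hard max_soft max_duration \n
- version IBP_STATUS IBP_ST_RES TIMEOUT \n

With the sub-command always in the second position, a parser can dispatch on it as soon as it is read instead of counting arguments first.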
Change in RID format
The current definition of an RID is an integer, as defined in struct ibp_depot. The size of an integer is architecture dependent and hence not portable. An alternative would be to define the RID as a character string, which would provide flexibility in its implementation and use. The current IBP client libraries already treat the RID as an opaque character string for all commands except IBP_Allocate().
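A minimal C sketch of the change; the field names and sizes below are schematic, not the actual layout used by the IBP client library:

 /* Schematic of the current definition: the RID is a plain int,
  * whose size and representation are architecture dependent. */
 struct ibp_depot {
     char host[256];   /* depot hostname (size is illustrative) */
     int  port;        /* depot port */
     int  rid;         /* resource ID as an integer -- not portable */
 };

 /* Proposed alternative: treat the RID as an opaque character string,
  * matching how the client libraries already handle it everywhere
  * except IBP_Allocate(). */
 #define IBP_MAX_RID_LEN 128   /* illustrative bound */
 struct ibp_depot_str {
     char host[256];
     int  port;
     char rid[IBP_MAX_RID_LEN];  /* resource ID as an opaque string */
 };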
Provide interface to IBP data structures
No explicit interfaces are provided for any of the various IBP data structures. A more flexible approach would be to add API calls to manipulate these structures indirectly.
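A sketch of what such an indirect interface might look like in C; every function name here is hypothetical, invented for illustration:

 /* Hypothetical accessor API: the structure stays opaque to callers,
  * so its internal layout can change without breaking client code. */
 struct ibp_depot;   /* definition hidden inside the library */

 struct ibp_depot *ibp_depot_create(const char *host, int port, const char *rid);
 const char       *ibp_depot_get_rid(const struct ibp_depot *dp);
 int               ibp_depot_set_rid(struct ibp_depot *dp, const char *rid);
 void              ibp_depot_destroy(struct ibp_depot *dp);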
IBP_MCOPY current status
The documentation for this command is sparse. It looks like numerous different multicast methods were implemented, but there is very little documentation describing them. Should this command be dropped?
NFU
There is very little documentation describing the NFU implementation in the current LoCI depot, and the documentation that is provided has errors and is not fully supported. The concept of the NFU is very powerful, and I wonder if it should be split out as a separate specification altogether. Hunter's Java implementation is quite elegant. In his implementation the NFU calls are actually Java JAR files stored as allocations. These allocations are then registered with the NFU manager, with hooks for checksums for data integrity. Having the NFU call operate in a Java container is extremely appealing: Java can sandbox the NFU call to limit its resource consumption (memory, CPU, threads, etc.), making it much more difficult for an NFU call to inadvertently or maliciously take down the depot or NFU manager. Also, because of Java's portability, deploying new NFU calls becomes trivial.
Resource Discovery
Add a call to retrieve the list of resource IDs.
--Hunterh 13:38, 1 February 2008 (CST)
Security
Add support for SSL
Self-explanatory
Auth/AuthZ for IBP_ALLOC command
This command has the potential for abuse and could result in a "Denial of Space" attack on the depot. If the concept of an "account" is added, one could then come up with additional methods to share resources, for example by adding the concept of an account quota. It also provides a tracking mechanism for who is *creating* allocations.
Virtual Capabilities(vcap)
The current implementation only allows a single set of caps for an allocation, so once a user has access to a cap it can never be revoked. Virtual caps are designed to solve this problem. The idea is that a user presenting the IBP_MANAGE cap could request that the depot issue a new set of caps with a shorter duration. These new vcaps could then be provided to a third party. At any time the original cap owner can revoke access to the allocation by simply using the IBP_MANAGE command to delete the vcap. Another useful feature to consider is restricting the vcap to a specific byte range of the original cap.
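A sketch of possible IBP_MANAGE sub-commands for this; the sub-command names and argument layout are invented for illustration:

- version IBP_MANAGE manage_cap IBP_VCAP_CREATE duration offset length TIMEOUT \n
- version IBP_MANAGE manage_cap IBP_VCAP_DELETE vcap TIMEOUT \n

The first form would return a new, shorter-lived set of caps, optionally restricted to the byte range given by offset and length; the second revokes a previously issued vcap.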
IBP "Accounts"
In order for several of these ideas to work, a new set of commands would need to be added to manage the accounts.
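What those commands might look like is sketched below; the command names and arguments are purely illustrative:

- version IBP_ACCOUNT_CREATE account password quota TIMEOUT \n
- version IBP_ACCOUNT_REMOVE account password TIMEOUT \n
- version IBP_ACCOUNT_INQ account password TIMEOUT \n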
Data Integrity
Validation along the entire data path
The current implementation allows for validation at the end points only. This is accomplished by having the data originator calculate a checksum before uploading the data. This checksum can be appended to the uploaded data or stored externally in the exnode. The consumer can then download the data, calculate the checksum, and compare it to what is stored. This approach is not well suited to live data streams, since the raw data has to be buffered until the consumer can download it for verification.
An alternative approach would be to standardize on a checksum algorithm and have the client calculate the checksum as the data is being streamed to the depot, while the depot simultaneously calculates the checksum as it receives the data. The sender would pass on its checksum for validation by the receiver. Any discrepancy occurring during the network transfer would be immediately detected while the data is still in the sender's original buffer. The depot could then store this checksum as part of the allocation for later use. Most operating systems will immediately detect a write failure, but not necessarily bit rot when reading, unless the disk is part of a RAID array. Likewise, when a reader requests data, the reverse process can occur: the depot and receiver both calculate the checksum as the data is being sent. The depot would additionally compare the stored original checksum with what was just calculated in order to detect disk errors. If no errors occurred, the depot would go ahead and send the checksum down to the receiver for validation. This process is computationally efficient since the data is never re-read; the checksum is just part of the transfer pipeline.
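A minimal C sketch of folding the checksum into the transfer pipeline. Adler-32 is used here only as a stand-in for whatever algorithm the protocol standardizes on, and the network I/O is elided:

 #include <stddef.h>
 #include <stdint.h>

 /* Incremental Adler-32 (stand-in algorithm): the running value is
  * updated block by block, so the data is never re-read. */
 static uint32_t adler32_update(uint32_t sum, const unsigned char *buf, size_t len)
 {
     uint32_t a = sum & 0xffff, b = (sum >> 16) & 0xffff;
     for (size_t i = 0; i < len; i++) {
         a = (a + buf[i]) % 65521;
         b = (b + a) % 65521;
     }
     return (b << 16) | a;
 }

 /* Sender side: checksum each block as it goes out on the wire, then
  * pass the final value on for validation.  The depot and the receiver
  * run the identical loop over the bytes they take off the wire and
  * compare results. */
 uint32_t send_with_checksum(const unsigned char *data, size_t len)
 {
     uint32_t sum = 1;                 /* Adler-32 initial value */
     for (size_t off = 0; off < len; ) {
         size_t n = (len - off > 65536) ? 65536 : len - off;
         sum = adler32_update(sum, data + off, n);
         /* write(sock, data + off, n);   actual network I/O elided */
         off += n;
     }
     return sum;
 }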
Building this validation procedure into the protocol simplifies the data-integrity work required of higher-level tools, which could use these checksums to verify replicated copies and detect data changes. The checksums should be treated as opaque strings and could be accessed by additional IBP_MANAGE sub-commands:
- IBP_GET_CHECKSUM - Return the allocation's checksum
- IBP_VALIDATE_CHECKSUM - Re-calculate the checksum
Using a single checksum for an entire allocation is not efficient if random I/O on an allocation is allowed. In that case, changing a single byte of a 10MB allocation would require re-processing the entire allocation. Another option would be to generate a checksum for every 64KB of data (I picked this value out of the blue, so feel free to suggest something different). This means each allocation could have multiple checksums; if a single byte were changed, only 64KB of data would have to be re-processed. If the checksum field on the client is treated as an opaque string, then having one or multiple checksums is irrelevant: both cases can be treated the same.
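A small C sketch of the bookkeeping, assuming the 64KB granularity mentioned above: a random write only dirties the checksum blocks it overlaps.

 #include <stddef.h>

 #define CSUM_BLOCK_SIZE (64 * 1024)   /* 64KB granularity; negotiable, per the text */

 /* Compute which checksum blocks a write of `len` bytes at offset `off`
  * touches; only these blocks need their checksums recomputed. */
 void dirty_blocks(size_t off, size_t len, size_t *first, size_t *last)
 {
     *first = off / CSUM_BLOCK_SIZE;
     *last  = (off + len - 1) / CSUM_BLOCK_SIZE;
 }

For example, changing one byte of a 10MB allocation dirties exactly one of its 160 blocks.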
See the discussion tab for comments.
Miscellaneous
Support UDP transfers
What about using the UDT implementation, since it can mimic FAST, web100, and other TCP congestion control methods?