Summary

Decision to have a single active version of the Archive storage system (NGAS) for each data type.  This changes the initial archive design, which had multiple active storage sites and chose the optimum one for each download based on geographic or network considerations.

Approved: Brian Glendenning [X]

Date:<Date Approved>

Consulted: <List of people consulted>

Justification

In August, at the DMS architecture busy week, the need for an NGAS storage node in Charlottesville to store the SRDPs generated in CV was identified.  As we moved toward procurement, several questions arose about what the requirements are and how the node should be deployed.  The NRAO Archive architecture originally called for the CV and Socorro versions to both be active, with data served from either one.  This was intended to have three benefits:

   * Load balancing: Between the two compute centers, allowing active jobs to be scheduled wherever there was capacity.
   * Decreased download times: By selecting the geographically closer center, network latency could be minimized, improving network throughput.
   * Hot failover: If one site was unavailable, the other could immediately begin servicing requests and minimize downtime.

If we eventually want to pursue this architecture, then the correct deployment strategy would be to place the NGAS node in CV and the backup node in Socorro.  This deployment would serve as a prototype of the eventual design and allow us to develop the operations and software tools necessary to support it.

It is the consensus opinion that pursuing this initial design is not a sound investment of resources.  VLA and VLBA data can be replicated to CV, so using the CV cluster to process VLA/VLBA data is possible, but replicating ALMA data to Socorro is more challenging because of the complex ALMA Archive Configuration.  True load balancing would therefore not be possible, and uni-directional load balancing is probably not politically tenable.  Advances in data transfer protocols and the production of SRDPs reduce the benefit of further decreasing download times.  Finally, the type of catastrophic failure that hot failover would protect against has not occurred to date.

Risks

The original design was intended to mitigate several risks described in the justification.  These have become less probable over time, and the decision to change the design significantly simplifies the system.

Impact

Because we are changing the design to match the current implementation, there is no additional impact.

Communication

  •  Notification to the consulted list through publication of this page.
  •  New design captured in the DMS/SRDP Architecture Deployment document. 
