Comment: revamped to include ALMA and the generalized product system

We need to settle on metadata for the calibrations we have ingested for the EVLA. We will be handling calibration information for both EVLA and ALMA (with the potential for adding VLBA at some point in the future). Unfortunately, while EVLA calibrations can easily be linked to their execution_block of origin, that is not the case with ALMA. For ALMA, calibrations are done at the MOUS level (and thus are tied to multiple EBs).

As part of creating the ALMA OUS structure within the AAT/PPI & the generalization of archive products (constituents?), here is a draft for discussion of the calibrations (not calibration_tables) table:

| Column Name | Column Type | Comments |
|---|---|---|
| calibration_id | integer | Automatically generated unique Id |
| execution_block_id | integer | (nullable) Foreign Key to the Execution Block information from which this calibration was derived. |
| alma_ous_id | integer | (nullable) Foreign Key to the Alma OUS from which the calibration was derived. |
| products_id | integer | Foreign Key to the products table for the generalized system. |
| filegroup_id | integer | Filegroup for the file(s) of this calibration. |
| project_code | varchar | Project to which the calibration belongs. |
| casa_version | varchar | Which version of CASA was used. This potentially has several effects, including restores. |
| parallelized_casa | boolean | Was CASA used in MPI mode? That has implications for how the calibration information is stored. |
| date_ingested | datetime | When the calibration ingestion was performed. |
| qa_comments | varchar | Comments provided by the DA upon Ingestion. Optionally, this could be replaced with a workflow_id if the comments are stored there. |


A calibration is the result of either an Execution Block (EVLA) or an MOUS (ALMA). So one (and only one) of those two structural columns should be null for any row (will need to be enforced via software design, I believe).
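If the target database supports CHECK constraints, the "exactly one of the two is null" rule could also be backed up at the schema level rather than in software alone. The following is only a minimal sketch under that assumption: it uses an in-memory SQLite database as a stand-in for the real archive database, a trimmed-down hypothetical version of the draft table, and a hypothetical helper named try_insert.

```python
import sqlite3

# Hypothetical, trimmed-down version of the draft calibrations table.
# SQLite is only a stand-in here; the real archive database may differ.
SCHEMA = """
CREATE TABLE calibrations (
    calibration_id     INTEGER PRIMARY KEY,
    execution_block_id INTEGER,
    alma_ous_id        INTEGER,
    -- Exactly one of the two structural columns must be NULL.
    CHECK ((execution_block_id IS NULL) <> (alma_ous_id IS NULL))
);
"""


def try_insert(conn, execution_block_id, alma_ous_id):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO calibrations (execution_block_id, alma_ous_id) "
                "VALUES (?, ?)",
                (execution_block_id, alma_ous_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```

With this in place, a row carrying only an execution_block_id (EVLA) or only an alma_ous_id (ALMA) is accepted, while a row with both set or both null is rejected. The same rule would still be worth checking in the ingestion software so that errors surface before the insert.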


From Here To There:

Currently, the calibration_tables table consists of: calibration_table_id, file_id, and a blank 'TBD' column.  We'll require a number of transformations for the 8k+ entries already existing (all EVLA):

  • The table itself, the primary key column, and the generating sequence need to be renamed
    • details to be figured out
  • file_id → filegroup_id should be manageable since they're directly tied to each other in the files table
    • SELECT filegroup FROM files WHERE file_id=?;
  • alma_ous_id will be null for all existing entries.
  • execution_block_id is obtainable via the existing filegroup structure, as is the project_code
    • SELECT execution_block_id, project_code FROM execution_blocks
      JOIN filegroups ON execution_blocks.filegroup_id = filegroups.parent_filegroup_id
      WHERE filegroups.filegroup_id=?;
  • TODO: products_id
  • The other two metadata fields can be blank
  • Drop/reuse metadata_tbd
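
The two lookups above can be combined into a single pass over the existing entries. This is a rough sketch only, run against an in-memory SQLite database with toy data. The table layouts are guesses reduced to the columns mentioned on this page, placing project_code on execution_blocks is an assumption, and the function name migrated_rows and all sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Assumed minimal layouts; the real tables have more columns.
CREATE TABLE filegroups (filegroup_id INTEGER PRIMARY KEY,
                         parent_filegroup_id INTEGER);
CREATE TABLE files (file_id INTEGER PRIMARY KEY, filegroup INTEGER);
CREATE TABLE execution_blocks (execution_block_id INTEGER PRIMARY KEY,
                               project_code TEXT,
                               filegroup_id INTEGER);
CREATE TABLE calibration_tables (calibration_table_id INTEGER PRIMARY KEY,
                                 file_id INTEGER,
                                 metadata_tbd TEXT);

-- Toy data: one EB (filegroup 10) with a child filegroup 11 holding
-- the calibration file 7. All values are invented.
INSERT INTO filegroups VALUES (10, NULL), (11, 10);
INSERT INTO execution_blocks VALUES (500, 'TEST-001', 10);
INSERT INTO files VALUES (7, 11);
INSERT INTO calibration_tables VALUES (1, 7, NULL);
""")


def migrated_rows(conn):
    """Derive the new calibrations columns for every existing entry."""
    return conn.execute("""
        SELECT ct.calibration_table_id AS calibration_id,
               eb.execution_block_id,
               NULL                    AS alma_ous_id,  -- all existing rows are EVLA
               f.filegroup             AS filegroup_id,
               eb.project_code
        FROM calibration_tables ct
        JOIN files f             ON f.file_id = ct.file_id
        JOIN filegroups fg       ON fg.filegroup_id = f.filegroup
        JOIN execution_blocks eb ON eb.filegroup_id = fg.parent_filegroup_id
        ORDER BY ct.calibration_table_id
    """).fetchall()


rows = migrated_rows(conn)
```

Because these are inner joins, any existing entry whose file or parent filegroup cannot be resolved would silently drop out of the result. Comparing the row count against the 8k+ existing entries would be a sensible check before committing anything.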


ALMA Considerations:

This is predicated upon the assumption that we will create entries for a filegroup & set of files for each ALMA calibration. Their calibrations are not stored as single tar files like ours, but as a collection of loose files in their NGAS system. Using the names from ASA_PRODUCTS, we can save ourselves the lookup query that fetchAlmaCals performs and simply skip to drawing the files out of their NGAS system. This change will necessitate a rewrite of how we handle the ALMA calibrations. I've removed the reference to a particular EB (that's far too EVLA-specific for what we need). Instead, I suspect we'll need linking tables into our processing structure (see: here).