Comment: Added information on determining which ALMA calibrations are of merit, and how to look up the information we desire for populating a filegroup.

...

  • The table itself, the primary key column, and the generating sequence need to be renamed
    • details to be figured out
  • file_id → filegroup_id should be manageable since they're directly tied to each other in the files table
    • SELECT filegroup FROM files WHERE file_id=?;
  • alma_ous_id will be null for all existing entries.
  • execution_block_id is obtainable via the existing filegroup structure, as is the project_code
    • SELECT execution_blocks.execution_block_id, execution_blocks.project_code FROM execution_blocks
      JOIN filegroups ON execution_blocks.filegroup_id = filegroups.parent_filegroup_id
      WHERE filegroups.filegroup_id=?;
  • TODO: products_id
  • The other two metadata fields can be blank
  • Drop/reuse metadata_tbd
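The two lookups above can be chained for each legacy row. What follows is a minimal sketch only, using an in-memory SQLite database as a stand-in for ours. The tables hold just the columns named in the queries above, the sample values are made up, and migrate_lookup and demo_connection are hypothetical helper names.

```python
import sqlite3

def migrate_lookup(conn, file_id):
    """Return (filegroup_id, execution_block_id, project_code) for a legacy file_id.

    Returns None when the file_id is not present in the files table.
    """
    row = conn.execute(
        "SELECT filegroup FROM files WHERE file_id=?", (file_id,)
    ).fetchone()
    if row is None:
        return None
    filegroup_id = row[0]
    # Walk from the calibration filegroup to its parent, which belongs
    # to the execution block.
    eb = conn.execute(
        "SELECT execution_blocks.execution_block_id, execution_blocks.project_code "
        "FROM execution_blocks "
        "JOIN filegroups ON execution_blocks.filegroup_id = filegroups.parent_filegroup_id "
        "WHERE filegroups.filegroup_id=?",
        (filegroup_id,),
    ).fetchone()
    execution_block_id, project_code = eb if eb else (None, None)
    return filegroup_id, execution_block_id, project_code

def demo_connection():
    """Build a tiny stand-in database (made-up values, assumed column types)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE files (file_id INTEGER, filegroup INTEGER);
        CREATE TABLE filegroups (filegroup_id INTEGER, parent_filegroup_id INTEGER);
        CREATE TABLE execution_blocks (
            execution_block_id INTEGER, project_code TEXT, filegroup_id INTEGER);
        INSERT INTO files VALUES (101, 20);
        INSERT INTO filegroups VALUES (10, NULL), (20, 10);
        INSERT INTO execution_blocks VALUES (7, 'DEMO-CODE', 10);
    """)
    return conn
```

alma_ous_id is left out of the returned tuple because it will be null for all existing entries.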


ALMA

...

Calibrations:

Calibration of ALMA data happens after an MOUS has been fully observed. At that point the pipeline is run, and the results are analyzed for QA2 purposes. An ALMA calibration is not official until science images have been archived for it, so the MOUSes of interest to us are those for which the following count is greater than zero:

select count(*) from ALMA.ASA_PRODUCT_FILES where FILE_CLASS='science' and ASA_OUS_ID=?;

However, there are cases (roughly 15%) where the DAs must take an active role in performing the calibration. The pipeline cannot necessarily restore those results, so we should reject calibrations for which the following count is nonzero:

select count(*) from ASA_PRODUCT_FILES where ASA_OUS_ID='.....' and FILE_CLASS='script' and NGAS_FILE_ID LIKE '%scriptFor%Cal%';

We should consider what calibration status to assign in this case, as the data seems to be inappropriate for pipeline processing. 
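The two checks above can be combined into a single per-MOUS decision. This is a rough sketch against an in-memory SQLite stand-in for ASA_PRODUCT_FILES, not the real ALMA database. The status labels ("not_ready", "manual", "restorable") are placeholders, since the status to assign is still undecided, and the sample rows are invented.

```python
import sqlite3

SCIENCE_COUNT = (
    "SELECT count(*) FROM ASA_PRODUCT_FILES "
    "WHERE FILE_CLASS='science' AND ASA_OUS_ID=?"
)
MANUAL_SCRIPT_COUNT = (
    "SELECT count(*) FROM ASA_PRODUCT_FILES "
    "WHERE ASA_OUS_ID=? AND FILE_CLASS='script' "
    "AND NGAS_FILE_ID LIKE '%scriptFor%Cal%'"
)

def classify_mous(conn, ous_id):
    """Classify one MOUS using the two counts described above."""
    science = conn.execute(SCIENCE_COUNT, (ous_id,)).fetchone()[0]
    if science == 0:
        # No archived science images: the calibration is not official yet.
        return "not_ready"
    manual = conn.execute(MANUAL_SCRIPT_COUNT, (ous_id,)).fetchone()[0]
    # A manual calibration script means the pipeline may not be able to restore.
    return "manual" if manual else "restorable"

def demo_connection():
    """Stand-in table with made-up rows covering the three cases."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ASA_PRODUCT_FILES "
        "(ASA_OUS_ID TEXT, FILE_CLASS TEXT, NGAS_FILE_ID TEXT)"
    )
    conn.executemany(
        "INSERT INTO ASA_PRODUCT_FILES VALUES (?, ?, ?)",
        [
            ("ous_A", "science", "a.image.fits"),
            ("ous_A", "script", "a.scriptForPI.py"),
            ("ous_B", "science", "b.image.fits"),
            ("ous_B", "script", "b.scriptForCalibration.py"),
            ("ous_C", "calibration", "c.caltables.tgz"),
        ],
    )
    return conn
```

One caveat: SQLite's LIKE is case-insensitive for ASCII by default, so the stand-in is slightly more permissive than a case-sensitive LIKE would be on the real database.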

In order to complete the calibrations table, we need to create a filegroup, and populate it with the relevant file information.  The list of files we desire is generated by:

select FILE_ID,FILE_SIZE from ngas.ngas_files where file_id in (select ngas_file_id from alma.asa_product_files where ASA_OUS_ID=? and FILE_CLASS in ('calibration','script'));

That will allow us to populate the files table, possibly after translating between ALMA's convention for file sizes and ours. Note that the filename and ngas_id are identical, and both are needed for the table.
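As an illustration of that step, the (FILE_ID, FILE_SIZE) pairs returned by the query can be turned into rows for the files table. This is a sketch under assumptions: build_file_rows and the "filesize" key are hypothetical names, and convert_size is a placeholder for whatever size translation turns out to be needed (it defaults to no change).

```python
def build_file_rows(ngas_rows, filegroup_id, convert_size=lambda size: size):
    """Map (FILE_ID, FILE_SIZE) pairs onto dicts destined for the files table.

    ngas_rows     -- iterable of (file_id, file_size) tuples from the NGAS query
    filegroup_id  -- the filegroup created for this ALMA calibration
    convert_size  -- hypothetical hook for translating ALMA's size convention
    """
    rows = []
    for file_id, file_size in ngas_rows:
        rows.append({
            "filegroup": filegroup_id,
            # The filename and ngas_id are identical, and both are recorded.
            "filename": file_id,
            "ngas_id": file_id,
            "filesize": convert_size(file_size),  # assumed column name
        })
    return rows
```

For example, build_file_rows([("demo.caltables.tgz", 2048)], 42) yields one row whose filename and ngas_id are both "demo.caltables.tgz".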

NOTE: With the NAASC processing moving toward a split imaging & calibration system, it might simplify matters to screen out some of the extraneous files from the imaging pipeline which are caught by the above query. In particular, there can be auxproducts and pipeline_manifests for the hifa_image pipeline, but those can also easily be ignored in the restore set-up software.

This is predicated upon the assumption that we will create entries for a filegroup and a set of files for each ALMA calibration. Their calibrations are not stored as single tar files like ours, but as a collection of loose files in their NGAS system. Using the names from ASA_PRODUCTS, we can save ourselves the lookup query that fetchAlmaCals performs and skip straight to drawing the files out of their NGAS system. This change will necessitate a rewrite of how we handle the ALMA calibrations.