

Performing a restore for ALMA is somewhat more complicated than the EVLA process.  In many cases, ALMA processing involves more than one Execution Block (EB) at a time, so all of the original EBs must be acquired along with the calibration products.  In addition, the PPR for handling multiple EBs is more complicated.  The typical 'work unit' for ALMA is the Member Observation Unit Set (MOUS).  This is an organizational structure for data which is used for automated calibration and imaging with the CASA Pipeline, and it has an identifier of its own.  For data processed after October 2017, it is far more reasonable to perform an MOUS-level restore because the calibration products are stored separately from the imaging products within the ALMA NGAS system, which greatly simplifies the organizational work involved.  There are plans for products ingested earlier to be split up and reingested, but we have been given no timescale for when that project might start.  


How is a restore requested?

Typically, you'll be given the MOUS UID (for example: uid://A001/X1284/X265f, uid://A001/X12a3/X80e, uid://A001/X12cc/X4a, or uid://A001/X1284/X266) upon which to perform the restore.  That identifier will allow you to collect all the relevant information to determine if a restore can be performed reasonably, as well as enable the extraction of relevant data from NGAS & generation of a Pipeline Processing Request (PPR).  

What if I only have an EB (or ASDM) UID?

You can retrieve the MOUS UID from a given ASDM UID via the following query (this avoids the potentially problematic ASA_SCIENCE table):

SELECT sbs.MOUS_STATUS_UID
FROM ALMA.BMMV_SCHEDBLOCK sbs
JOIN ALMA.SHIFTLOG_ENTRIES shifts ON shifts.SE_SB_ID = sbs.ARCHIVE_UID
JOIN ALMA.AQUA_EXECBLOCK ebs ON ebs.EXECBLOCKUID = shifts.SE_EB_UID
WHERE ebs.EXECBLOCKUID = '....';

When can we perform a restore?

We require the ASDM(s) and the calibration products for an MOUS before we can process a restore.  However, calibrations for ALMA are not considered official until there are archived images for them.  Thus, we know that we can use a calibration for a restore when:

select count(*) from ALMA.ASA_PRODUCT_FILES where FILE_CLASS='science' and ASA_OUS_ID=?;   (result must be > 0)

and it is primarily a pipeline-calibration:

select count(*) from ASA_PRODUCT_FILES where ASA_OUS_ID='.....' and FILE_CLASS='script' and NGAS_FILE_ID LIKE '%scriptFor%Cal%';   (result must be == 0)

If both of these conditions hold, we can proceed: the calibration_status can be set to 'Calibrated' (i.e. a restore is an option).  Otherwise, the calibration_status should stay at 'Ready', which is already our default for ALMA data.
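A minimal sketch of that decision logic, assuming the two counts above have already been retrieved from the archive (the function name is hypothetical):

```python
def calibration_status(science_product_count, manual_cal_script_count):
    """Decide whether an MOUS is restorable.

    science_product_count: rows in ASA_PRODUCT_FILES with FILE_CLASS='science'
        for this ASA_OUS_ID (archived images exist, so the calibration is official).
    manual_cal_script_count: rows with FILE_CLASS='script' whose NGAS_FILE_ID
        matches '%scriptFor%Cal%' (evidence of a manual calibration).
    """
    if science_product_count > 0 and manual_cal_script_count == 0:
        return 'Calibrated'   # a restore is an option
    return 'Ready'            # our default for ALMA data
```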

What was that second condition?

Some ALMA data requires manual intervention in order to calibrate correctly.  The fraction of data for which this is required is fairly small by late Cycle 5 (roughly 15%), but if there are indicators that our basic restore process won't work, the system should catch & report that fact early on, rather than running the pipeline and letting it fail.  There is a more comprehensive methodology used at the ALMA Regional Centres (ARCs), but it combines both restores & complete recalibrations.  Those more complicated cases have been excluded from the AAT-PPI system.

How do we go about performing the restore?

The process overall is largely the same as for an EVLA restore:

  1. Set up directory structure & metadata.json
  2. Retrieve rawdata
  3. Retrieve calibration products & manifest
  4. Write the restore PPR
  5. Run CASA
  6. Deliver the Calibrated Measurement Set

Several of these steps differ in the details, however.  In particular: obtaining the calibration products from the NAASC NGAS machines, preparing an appropriate PPR (which requires more data than our typical ones), and setting up the appropriate call to CASA (including environment variables and script choice).  The following sections therefore focus on steps 2-5.

Retrieve raw data from the NAASC NGAS machines:

From the MOUS UID, one can obtain the list of relevant EBs:

SELECT EXECBLOCKUID FROM ALMA.AQUA_EXECBLOCK ebs
JOIN ALMA.SHIFTLOG_ENTRIES shifts ON ebs.EXECBLOCKUID = shifts.SE_EB_UID
JOIN ALMA.BMMV_SCHEDBLOCK  sbs    ON shifts.SE_SB_ID = sbs.ARCHIVE_UID
WHERE ebs.QA0STATUS = 'Pass'
AND sbs.MOUS_STATUS_UID ='.......';


That list can then be fed sequentially into the asdmExportLight script, which is part of the ALMA Common Software suite.  See the alma-datafetcher.sh script for the appropriate configuration steps.  Place the EBs underneath the rawdata directory as normal.
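As a sketch, that loop could look like the following; the exact asdmExportLight arguments and any ACS environment setup are assumptions, so check alma-datafetcher.sh before relying on it:

```python
import subprocess
from pathlib import Path

def export_command(eb_uid):
    """Command line for exporting one EB.  The argument form is an
    assumption; alma-datafetcher.sh has the real invocation and setup."""
    return ['asdmExportLight', eb_uid]

def fetch_ebs(eb_uids, rawdata_dir):
    """Run asdmExportLight for each EB from inside the rawdata directory,
    so each ASDM lands under rawdata as the pipeline expects."""
    rawdata = Path(rawdata_dir)
    rawdata.mkdir(parents=True, exist_ok=True)
    for uid in eb_uids:
        subprocess.run(export_command(uid), cwd=rawdata, check=True)
```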

Retrieve calibration products & manifest from the NAASC NGAS machines:

For initial testing I made use of a modified version of a script developed at the Joint ALMA Observatory (naasc_listfiles.py).  This script looks up the names of the 'calibration' and 'script' FILE_CLASS products for a given MOUS UID in the ASA_PRODUCT_FILES table, and performs a wget on each one to place it in the current working directory.  This should be performed in the products directory, as that is where the pipeline expects them.  If we are going to continue to use this more direct methodology, the script needs to be refactored into a component of pyat and extended to perform validation upon the retrieved files.  (Done; see fetchAlmaCals.)

As an alternative, it should be possible to use the 'exportProducts' tool, which is part of the ALMA Common Software suite, to achieve the same results, but I have not yet experimented with that script sufficiently. 

Once the calibration files are in place, we need to decompress and expand the *.hifa_calimage.auxproducts.tgz file in order to provide access to the files it contains.  The pipeline will not automatically handle these files being contained in a tar archive (the other .tgz files can be left alone).
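A minimal helper for that expansion step might look like this (illustrative, not part of pyat):

```python
import tarfile
from pathlib import Path

def expand_auxproducts(products_dir):
    """Unpack any *.hifa_calimage.auxproducts.tgz in place.

    Other .tgz files in products are deliberately left alone; only the
    auxproducts archive needs to be expanded for the pipeline.
    """
    products = Path(products_dir)
    matches = sorted(products.glob('*.hifa_calimage.auxproducts.tgz'))
    for tgz in matches:
        with tarfile.open(tgz, 'r:gz') as archive:
            archive.extractall(path=products)
    return [t.name for t in matches]
```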

Finally, the *.pipeline_manifest.xml file must be copied over to the rawdata directory.  

NOTE: The server name & port number have been extracted out into capo (the almaNgasSystem properties).  Those data were taken from the configuration file provided to me by Rachel Rosen.  Updated values can be obtained from /home/acs/config/archiveConfig.properties (you need to be on a CV machine to access it), in case something changes and our properties get out of date. 


Write the restore PPR:

This is the tricky part.  Because we need to create a complete PPR (we normally exclude the ProjectStructure section entirely, for instance), several additional pieces of information are required.  Probably not all of them are strictly required to be accurate, but I haven't tested the limits of what is allowed. 

PPR_restore_template.xml

Above is the basic layout of the restore PPR.  Much of the data simply needs to be added to the correct area (RootDirectory, ProjectCode, etc).  However, the ProcessingIntents and DataSet sections will require a bit more detail.  I'll handle each separately.

Queries for basic PPR data:

    • ProjectSummary
      • ProposalCode
        • available via the ProjectData object
        • select distinct PROJECT_CODE from ASA_PROJECT join ASA_SCIENCE on ASA_PROJECT.PROJECT_UID = ASA_SCIENCE.PROJECT_UID where  MEMBER_OUSS_ID='.....';
      • The rest can be handled with default values or 'unknown'
    • ProjectStructure
      • ObsUnitSetRef  – This refers to the Project + the partId corresponding to this MOUS within that project.
        • entityId
          • select OBSUNITSETUID from AQUA_OUS where OUS_STATUS_ENTITY_ID = '.....';
        • partId
          • select OBSUNITSETPARTID from AQUA_OUS where OUS_STATUS_ENTITY_ID = '.....';
      • ProjectStatusRef
        • entityId
          • select OBS_PROJECT_STATUS_ID from OBS_UNIT_SET_STATUS where STATUS_ENTITY_ID='.....';
      • OUSStatusRef – This is just our MOUS of interest
        • entityId
          • .....
    • ProcessingRequests
      • RootDirectory
        • Path to the processing area (i.e. spool directory)
      • ProcessingProcedure
        • There may be a need to handle parameters to the hifa_restoredata call at some point (see the gainmap issue for VLASS, SSA-4893). 
    • DataSet
      • RelativePath
        • Path from the RootDirectory above to the directory housing products/rawdata/working
      • SchedBlockRef
        • entityId
          • select distinct SCHEDBLOCK_UID from ALMA.ASA_SCIENCE where MEMBER_OUSS_ID = '.....';
          • There is the possibility of multiple SBs linked to an MOUS.  The DOMAIN_ENTITY_STATE column of the SCHED_BLOCK_STATUS table should be able to tell us which we want.
      • SBStatusRef
        • entityId
          • select STATUS_ENTITY_ID from SCHED_BLOCK_STATUS where DOMAIN_ENTITY_ID='SchedBlockRef entityId';
      • AsdmIdentifer
        • Fill out one of these blocks with the ASDM_UID & the sanitized version for each EB in the MOUS.
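As a bookkeeping aid, the values gathered by the queries above could be checked for completeness before the template is filled.  The helper and field names below are hypothetical; the UID sanitization shown follows the usual ALMA filename convention:

```python
REQUIRED_PPR_FIELDS = (
    'ProposalCode',       # ASA_PROJECT.PROJECT_CODE
    'ObsUnitSetRef',      # AQUA_OUS.OBSUNITSETUID + OBSUNITSETPARTID
    'ProjectStatusRef',   # OBS_UNIT_SET_STATUS.OBS_PROJECT_STATUS_ID
    'OUSStatusRef',       # the MOUS UID itself
    'RootDirectory',      # path to the spool directory
    'RelativePath',       # RootDirectory -> products/rawdata/working
    'SchedBlockRef',      # ASA_SCIENCE.SCHEDBLOCK_UID
    'SBStatusRef',        # SCHED_BLOCK_STATUS.STATUS_ENTITY_ID
    'AsdmIdentifiers',    # one entry per EB: ASDM_UID + sanitized form
)

def missing_ppr_fields(values):
    """Return the fields we have not yet collected (absent, None, or empty)."""
    return [f for f in REQUIRED_PPR_FIELDS if values.get(f) in (None, '')]

def sanitize_uid(uid):
    """Filename-safe form of a UID, e.g. 'uid://A002/Xa/Xb' -> 'uid___A002_Xa_Xb'."""
    return uid.replace(':', '_').replace('/', '_')
```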

Handling Sessions in the PPR:

Warning: The discussion below is only partially correct.  The ASA_SCIENCE table does not provide a solid foundation on which to base the restore process.  It does not provide a complete listing of all ASDMs/EBs related to an MOUS, and has occasionally listed an ASDM as belonging to multiple MOUSs.  I've been working with Kana Sugimoto to understand what the pipeline does when creating session information in its initial PPRs.  She has provided a potential replacement query that I need to study further before rewriting both the PPR generation and this section. 


ALMA uses shorter scheduling blocks than the EVLA for flexibility, and as a consequence those scheduling blocks are commonly executed multiple times.  If an SB is run multiple times in a row, the resulting EBs are grouped together into a 'session', and this grouping is something that CASA needs to know for calibration (and therefore for any restoration).  In many cases, the following query will retrieve the relevant information for building sessions for the PPR:

SELECT DISTINCT ASA_SCIENCE.ASDM_UID, AQUA_EXECBLOCK.SESSIONID
FROM ALMA.AQUA_EXECBLOCK
JOIN ALMA.ASA_SCIENCE ON ASA_SCIENCE.ASDM_UID = AQUA_EXECBLOCK.EXECBLOCKUID
JOIN ALMA.AQUA_SESSION S ON AQUA_EXECBLOCK.SESSIONID = S.SESSIONID
WHERE ASA_SCIENCE.MEMBER_OUSS_ID = '.....'
ORDER BY S.ENDTIME ASC;

This will provide a list of the EBs in the order of their observation, along with an id value for their associated session.  To start, SESSION_1 will contain the first EB in the list.  If the next EB has the same session id number, it also belongs to SESSION_1; otherwise it belongs to a new session (SESSION_2).  Repeat until all EBs are associated with a session.  Then check the status of each EB:

select QA0STATUS from AQUA_EXECBLOCK where EXECBLOCKUID='ASDM_UID';

If the EB does not have a QA0STATUS of 'Pass', remove it from the session.  If that empties a session, the empty session still needs to be placed into the PPR.  Then for each session, write an Intents block with the session name (SESSION_1, SESSION_2, etc.) as the Keyword and the associated EBs in the Value field.  If there are no EBs, leave the Value portion of the block empty.  If there are multiple EBs, separate them with a ' | '.  
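The grouping and screening described above can be sketched as follows (a hypothetical helper, not part of our codebase):

```python
def build_sessions(eb_rows, qa0_status):
    """Group ordered EBs into sessions and format the PPR Intents values.

    eb_rows:    [(asdm_uid, session_id), ...] in observation order
                (the ORDER BY S.ENDTIME from the query above).
    qa0_status: {asdm_uid: QA0STATUS} from AQUA_EXECBLOCK.
    Returns:    [('SESSION_1', 'uidA | uidB'), ('SESSION_2', ''), ...];
                an emptied session is still kept, with an empty Value.
    """
    groups = []
    last_id = object()  # sentinel that never equals a real session id
    for uid, session_id in eb_rows:
        if session_id != last_id:  # a new session starts here
            groups.append([])
            last_id = session_id
        if qa0_status.get(uid) == 'Pass':  # screen out non-Pass EBs
            groups[-1].append(uid)
    return [('SESSION_%d' % (i + 1), ' | '.join(g))
            for i, g in enumerate(groups)]
```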

Caveat:  This will not always produce the correct session structure within the PPR, but it works in many cases.  There are instances where an attempted observation is not associated with its MOUS via the ASA_SCIENCE table.  In order to find those cases, however, we would need to extract the XML status blob for the MOUS:

select XML from OBS_UNIT_SET_STATUS where STATUS_ENTITY_ID='.....';

From that source, we could examine the SESSION tags to determine the EB to Session mapping, and then perform the QA0STATUS screening above.  However, this is not something we want to explore at this time.

Note: Comparing the sessions structure you create with that of the initial PPR provides a good test.


Once the PPR is complete, it needs to be placed in the working directory (a blank PPR.xml is generated in working at the start of our workflows). 


Run CASA:

There are a few differences in the details of running the CASA pipeline for an ALMA restore.  First, we need to use the 'runpipeline.py' script instead of 'runvlapipeline.py', our normal method of invoking CASA.  This is due to differences in how those two scripts interpret the PPR, but I don't know the details.  As a consequence, we need to define 3 environment variables:

    1. SCIPIPE_ROOTDIR    → This should point to the overall working area (i.e. the spool directory)
    2. SCIPIPE_LOGDIR       → This should point to a 'Logs' directory beneath SCIPIPE_ROOTDIR
    3. SCIPIPE_SCRIPTDIR  → This points to the pipeline recipes directory for the version of CASA+Pipeline we are using

Once those are set, we can invoke CASA with:

${CASA_HOME}/bin/casa --nogui --nologger --pipeline -c ${CASA_HOME}/pipeline/pipeline/runpipeline.py PPR.xml

This will be wrapped in an xvfb-run command as normal in a workflow. 
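Pulling those pieces together, a wrapper might assemble the environment and command line like this (the helper names are illustrative, and the xvfb-run wrapping is omitted):

```python
import os

def casa_environment(spool_dir, recipes_dir, base=None):
    """Build the three SCIPIPE_* variables described above on top of an
    existing environment (base defaults to os.environ)."""
    env = dict(base if base is not None else os.environ)
    env['SCIPIPE_ROOTDIR'] = spool_dir
    env['SCIPIPE_LOGDIR'] = os.path.join(spool_dir, 'Logs')
    env['SCIPIPE_SCRIPTDIR'] = recipes_dir
    return env

def casa_command(casa_home):
    """The invocation shown above, as an argument list."""
    return [os.path.join(casa_home, 'bin', 'casa'),
            '--nogui', '--nologger', '--pipeline', '-c',
            os.path.join(casa_home, 'pipeline', 'pipeline', 'runpipeline.py'),
            'PPR.xml']
```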

Do we still need ACS set up at this point, to provide access to the PPR & manifest schemas? 


