
Performing a restore for ALMA is somewhat more complicated than the EVLA process.  In many cases, ALMA processing involves more than one Execution Block (EB) at a time, so all of the original EBs must be acquired along with the calibration products.  In addition, the PPR for handling ALMA data must conform to a much more stringent schema than is used for EVLA data processing.

The typical 'work unit' for ALMA is the Member Observation Unit Set (MOUS).  This is an organizational structure for data (see: here) which is used for automated calibration and imaging with the CASA Pipeline, and each MOUS has a unique identifier by which the results of those pipeline runs can be found.

For data processed after October 2017, an MOUS-level restore is far more practical because the calibration products are stored separately from the imaging products within the ALMA NGAS system, which greatly simplifies the organizational work involved.  There are plans for products ingested earlier to be split up and reingested, but we have been given no timescale for when that project might start.

When can the AAT/PPI perform a restore?

The AAT/PPI performs periodic checks for new ALMA observations in the NAASC metadata database, and performs its own metadata ingestion to provide access to ALMA observations via the system's tools.  This process has recently been expanded to include checking for new calibrations which have been completed.  Metadata pertaining to calibrations which are appropriate for the AAT/PPI automated restore process is then ingested and made available via the search interface. 

A calibration in the NAASC database is considered 'complete' when there are files of class 'science' ingested (typically, these are images resulting from the calibrated data).  However, not all ALMA calibrations can be handled via the CASA pipeline.  In some instances (15-20%), human intervention is required in order to produce an appropriate calibration for the data.  These calibrations include special files which indicate that work beyond the pipeline was required.  Because the AAT/PPI does not have the same level of DA support as the NAASC, those calibrations are excluded from our restore system.  The link in the paragraph above leads to a more detailed discussion of the ALMA reingestion process used by the system.

How is an ALMA restore requested?

Fundamentally, a restore requires the MOUS UID (for example: uid://A001/X1284/X265f, uid://A001/X12a3/X80e, uid://A001/X12cc/X4a, or uid://A001/X1284/X266), from which the other relevant data can be extracted from the ALMA metadata database.  In the context of the AAT/PPI, there are some technical issues which necessitate some additional information.

The almaRestore Command:

Used for initial testing, this initiates a restore of the requested MOUS via our workflow system.  The tool looks up a couple of extra pieces of information (the project code, and an ASDM which belongs to the MOUS), and sends the following event to initiate the restore workflow:

{
    "eventName": "runAlmaBasicRestoreWorkflow",
    "type": "edu.nrao.archive.workflow.messaging.commands.StartWorkflow",
    "additionalPaths": [],
    "metadata": {
        "workflowName": "AlmaOusRestoreWorkflow",
        "processingSite": "NAASC",
        "deliveryFormat": "CMS",
        "telescope": "ALMA",
        "fileSetIds": ["uid://A002/Xd248b5/Xa7a"],
        "ousStatusId": "uid://A001/X12d1/X23e",
        "projectCodeOrDataType": "2017.1.00370.S",
        "cliCorrId": "f1a4dd4a-73ae-4aa7-ac30-9397d31dadab",
        "casaHome": "/home/casa/packages/RHEL6/release/casa-release-5.4.0-68"
    }
}

This particular restore requested a specific CASA version (the casaHome value) to be used, since there is a policy disconnect between the EVLA and ALMA about what constitutes an officially acceptable CASA version.  The cliCorrId is used internally to return information to the command runner about where their data processing is happening.  The remaining fields are standard values that allow the workflow to proceed properly.
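For illustration, the event above could be assembled along these lines.  This is only a sketch: the build_restore_event function is hypothetical, and in practice the almaRestore command looks up the project code and ASDM UID itself before publishing the event to the workflow messaging system.

```python
import json
import uuid

def build_restore_event(mous_uid, asdm_uid, project_code, casa_home):
    """Sketch of the StartWorkflow event the almaRestore command emits.
    Field names mirror the example event above."""
    return {
        "eventName": "runAlmaBasicRestoreWorkflow",
        "type": "edu.nrao.archive.workflow.messaging.commands.StartWorkflow",
        "additionalPaths": [],
        "metadata": {
            "workflowName": "AlmaOusRestoreWorkflow",
            "processingSite": "NAASC",
            "deliveryFormat": "CMS",
            "telescope": "ALMA",
            "fileSetIds": [asdm_uid],
            "ousStatusId": mous_uid,
            "projectCodeOrDataType": project_code,
            "cliCorrId": str(uuid.uuid4()),  # correlation id for reporting back to the caller
            "casaHome": casa_home,
        },
    }

event = build_restore_event(
    "uid://A001/X12d1/X23e",
    "uid://A002/Xd248b5/Xa7a",
    "2017.1.00370.S",
    "/home/casa/packages/RHEL6/release/casa-release-5.4.0-68",
)
print(json.dumps(event, indent=2))
```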

What if I only have an EB (or ASDM) UID?

You can retrieve the MOUS UID from a given ASDM UID via the following query:

SELECT sbs.MOUS_STATUS_UID FROM BMMV_SCHEDBLOCK sbs
JOIN ALMA.SHIFTLOG_ENTRIES shifts ON shifts.SE_SB_ID = sbs.ARCHIVE_UID
JOIN AQUA_EXECBLOCK ebs ON ebs.EXECBLOCKUID = shifts.SE_EB_UID
WHERE ebs.EXECBLOCKUID='....';


The AAT/PPI Front End:

The search interface for the AAT/PPI has been updated to handle ALMA MOUS structures, and indicates that a restore can be requested by the presence of a button underneath the 'cals' column.  This specialized button results in a workflow start event much like the one above, but with additional fields in the metadata pertaining to the final delivery location of the calibrated MSes, plus a few additional fields used by other workflows initiated via the search interface.  It is important to note that even if ALMA data has been calibrated and imaged, that does not mean that the data are appropriate for an automated restore process.  Please see the section above for how we determine which MOUSes are restore candidates.


What does the AAT/PPI ALMA restore workflow do?

In rough terms, an ALMA restore is largely similar to an EVLA restore:

  1. Set up directory structure & metadata.json
  2. Retrieve rawdata
  3. Retrieve & stage calibration products & manifest
  4. Write the restore PPR
  5. Run CASA
  6. Deliver the Calibrated Measurement Set

Several of these steps are going to differ in the details, however.  The details of obtaining data from the NAASC NGAS machines and the inputs to CASA are variations of the procedure used for the EVLA.  In contrast, preparing an appropriate PPR is a greater challenge in the ALMA processing case. 

Retrieve raw data from the NAASC NGAS machines:

Only those execution blocks which have a QA0 status of Pass are used in the ALMA processing system, and thus it is only those execution blocks which are required to perform a restore.  From the MOUS UID we are given to restore, we can obtain a list of ASDM UIDs via:

SELECT EXECBLOCKUID FROM ALMA.AQUA_EXECBLOCK ebs
JOIN ALMA.SHIFTLOG_ENTRIES shifts ON ebs.EXECBLOCKUID = shifts.SE_EB_UID
JOIN ALMA.BMMV_SCHEDBLOCK  sbs    ON shifts.SE_SB_ID = sbs.ARCHIVE_UID
WHERE ebs.QA0STATUS = 'Pass'
AND sbs.MOUS_STATUS_UID ='.......';


That list is then fed sequentially into the asdmExportLight script, which is part of the ALMA Common Software (ACS) suite.  See the alma-datafetcher.sh script for the appropriate configuration steps.  Place the ASDMs underneath the rawdata subdirectory of our working directory as normal.
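The fetch loop might be sketched as below.  Note the assumptions: asdmExportLight is taken to accept the ASDM UID as its sole argument and to write into the current directory; the real arguments and environment setup come from alma-datafetcher.sh.

```python
from pathlib import Path

def fetch_commands(asdm_uids, rawdata_dir):
    """Build one asdmExportLight invocation per QA0-passed EB.
    NOTE: the exact arguments are an assumption; consult
    alma-datafetcher.sh for the real configuration and invocation."""
    return [(["asdmExportLight", uid], Path(rawdata_dir)) for uid in asdm_uids]

for cmd, cwd in fetch_commands(["uid://A002/Xd248b5/Xa7a"], "rawdata"):
    # On a machine with ACS configured, this would be run with e.g.
    # subprocess.run(cmd, cwd=cwd, check=True)
    print(" ".join(cmd), "->", cwd)
```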

Retrieve calibration products & manifest from the NAASC NGAS machines:

For the handling of ALMA calibrations, there is the fetchAlmaCals tool.  This worker script interrogates the ASA_PRODUCT_FILES table for the files relevant to a restore of that MOUS (those of the 'calibration' and 'script' FILE_CLASS in particular) and performs a verified extraction of these files from the NAASC NGAS system into the products subdirectory of our working directory. 

Once the files are extracted from NGAS, there are a pair of additional steps which are done in preparation:

  1. Decompress and expand the *.hifa_calimage.auxproducts.tgz file in order to provide access to the files it contains.  The pipeline will not automatically handle these files being contained in a tar archive (the other .tgz files can be left alone).
  2. Copy the *.pipeline_manifest.xml file over to the rawdata directory, as that is where CASA expects it to be located.

There is a potential alternative solution: it should be possible to use the 'exportProducts' tool, which is part of the ALMA Common Software suite, to achieve similar results, but that method was unsuccessful during the prototyping phase and was abandoned for the time being.

NOTE: The master NGAS server name & port number have been extracted out into capo (the almaNgasSystem properties).  Those data were taken from the configuration file provided by Rachel Rosen.  Updated values can be obtained from /home/acs/config/archiveConfig.properties (accessible from most, if not all, CV machines), in case something changes and our properties get out of date. 


Write the restore PPR:

This is the tricky part.  Because we need to create a complete PPR (we normally exclude the ProjectStructure section entirely, for instance), several additional pieces of information are required.  Probably not all of them strictly need to be accurate, but I haven't tested the limits of what is allowed.

PPR_restore_template.xml

Above is the basic layout of the restore PPR.  Much of the data simply needs to be added to the correct area (RootDirectory, ProjectCode, etc.).  However, the ProcessingIntents and DataSet sections will require a bit more detail.  I'll handle each separately.

Queries for basic PPR data:

    • ProjectSummary
      • ProposalCode
        • available via the ProjectData object
        • select distinct PROJECT_CODE from ASA_PROJECT join ASA_SCIENCE on ASA_PROJECT.PROJECT_UID = ASA_SCIENCE.PROJECT_UID where  MEMBER_OUSS_ID='.....';
      • The rest can be handled with default values or 'unknown'
    • ProjectStructure
      • ObsUnitSetRef  – This refers to the Project + the partId corresponding to this MOUS within that project.
        • entityId
          • select OBSUNITSETUID from AQUA_OUS where OUS_STATUS_ENTITY_ID = '.....';
        • partId
          • select OBSUNITSETPARTID from AQUA_OUS where OUS_STATUS_ENTITY_ID = '.....';
      • ProjectStatusRef
        • entityId
          • select OBS_PROJECT_STATUS_ID from OBS_UNIT_SET_STATUS where STATUS_ENTITY_ID='.....';
      • OUSStatusRef – This is just our MOUS of interest
        • entityId
          • .....
    • ProcessingRequests
      • RootDirectory
        • Path to the processing area (i.e. spool directory)
      • ProcessingProcedure
        • There may be a need to handle parameters to the hifa_restoredata call at some point (see the gainmap issue for VLASS, SSA-4893).
    • DataSet
      • RelativePath
        • Path from the RootDirectory above to the directory housing products/rawdata/working
      • SchedBlockRef
        • entityId
          • select distinct SCHEDBLOCK_UID from ALMA.ASA_SCIENCE where MEMBER_OUSS_ID = '.....';
          • There is the possibility of multiple SBs linked to an MOUS.  The DOMAIN_ENTITY_STATE column of the SCHED_BLOCK_STATUS table should be able to tell us which we want.
      • SBStatusRef
        • entityId
          • select STATUS_ENTITY_ID from SCHED_BLOCK_STATUS where DOMAIN_ENTITY_ID='SchedBlockRef entityId';
      • AsdmIdentifier
        • Fill out one of these blocks with the ASDM_UID & the sanitized version for each EB in the MOUS.

Handling Sessions in the PPR:

Warning: The discussion below is only partially correct.  The ASA_SCIENCE table does not provide a solid foundation on which to base the restore process.  It does not provide a complete listing of all ASDMs/EBs related to an MOUS, and has occasionally listed an ASDM as belonging to multiple MOUSes.  I've been working with Kana Sugimoto to understand what the pipeline is doing for the creation of session information in their initial PPRs.  She has provided a potential replacement query that I need to study further before rewriting both the PPR generation and this section.


ALMA uses shorter scheduling blocks than the EVLA for flexibility, and as a consequence those scheduling blocks are commonly executed multiple times.  If an SB is run multiple times in a row, the resulting EBs are grouped together into a 'session', and this grouping is something that CASA needs to know for calibration (and therefore for any restoration).  In many cases, the following query will retrieve the relevant information for building sessions for the PPR:

SELECT DISTINCT ASA_SCIENCE.ASDM_UID, AQUA_EXECBLOCK.SESSIONID
FROM ALMA.AQUA_EXECBLOCK
JOIN ALMA.ASA_SCIENCE ON ASA_SCIENCE.ASDM_UID = AQUA_EXECBLOCK.EXECBLOCKUID
JOIN AQUA_SESSION S ON AQUA_EXECBLOCK.SESSIONID = S.SESSIONID
WHERE ASA_SCIENCE.MEMBER_OUSS_ID='.....'
ORDER BY S.ENDTIME ASC;

This will provide a list of the EBs in the order of their observation, along with an id value for their associated session.  To start, SESSION_1 will contain the first EB in the list.  If the next EB has the same session id number, then it also belongs to SESSION_1; otherwise it belongs to a new session (SESSION_2).  Repeat until all EBs are associated with a session.  Then check the status of each EB:

select QA0STATUS from AQUA_EXECBLOCK where EXECBLOCKUID='ASDM_UID';

If the EB does not have a status of Pass, remove it from the session.  If that empties a session, the empty session still needs to be placed into the PPR.  Then, for each session, write an Intents block with the session name (SESSION_1, SESSION_2, etc.) as the Keyword and the associated EBs in the Value field.  If there are no EBs, leave the Value portion of the block empty.  If there are multiple EBs, separate them with ' | '.
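The grouping and screening logic described above can be sketched in Python (the function name and data shapes are my own; rows are the (ASDM_UID, SESSIONID) pairs in the query's ORDER BY order):

```python
def build_session_intents(rows, qa0_status):
    """Group EBs into SESSION_n blocks for the PPR Intents section.
    rows: list of (asdm_uid, session_id) in observation order.
    qa0_status: dict mapping asdm_uid -> QA0 status string.
    Returns (Keyword, Value) pairs; EBs that did not pass QA0 are
    dropped, but the (possibly empty) session is kept."""
    sessions = []
    last_id = object()  # sentinel so the first row always opens SESSION_1
    for uid, session_id in rows:
        if session_id != last_id:
            sessions.append([])      # a new session id starts a new session
            last_id = session_id
        if qa0_status.get(uid) == "Pass":
            sessions[-1].append(uid)  # only QA0-passed EBs are listed
    return [("SESSION_%d" % (n + 1), " | ".join(ebs))
            for n, ebs in enumerate(sessions)]
```

For example, two consecutive EBs sharing session id 10 followed by one with session id 11 yield SESSION_1 and SESSION_2; if the second EB failed QA0, SESSION_1 contains only the first EB.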

Caveat:  This will not always produce the correct session structure within the PPR, but it works in many cases.  There are instances where an attempted observation is not associated with its MOUS via the ASA_SCIENCE table.  In order to find those cases, however, we would need to extract the XML status blob for the MOUS:

select XML from OBS_UNIT_SET_STATUS where STATUS_ENTITY_ID='.....';

From that source, we could examine the SESSION tags to determine the EB to Session mapping, and then perform the QA0STATUS screening above.  However, this is not something we want to explore at this time.

Note: Comparing the sessions structure you create with that of the initial PPR provides a good test.


Once the PPR is complete, it needs to be placed in the working directory (a blank PPR.xml is generated in working at the start of our workflows).


Run CASA:

There are a few differences in the details of running the CASA pipeline for an ALMA restore.  Firstly, we need to use the 'runpipeline.py' script instead of the 'runvlapipeline.py' script that is our normal method of invoking CASA.  This is due to differences in how those two scripts interpret the PPR, but I don't know the details.  As a consequence of the first change, we need to define three environment variables:

    1. SCIPIPE_ROOTDIR → This should point to the overall working area (i.e. the spool directory)
    2. SCIPIPE_LOGDIR → This should point to a 'Logs' directory beneath SCIPIPE_ROOTDIR
    3. SCIPIPE_SCRIPTDIR → This points to the pipeline recipes directory for the version of CASA+Pipeline we are using

Once those are set, we can invoke CASA with the complete command:

${CASA_HOME}/bin/casa --nogui --nologger --pipeline -c ${CASA_HOME}/pipeline/pipeline/runpipeline.py PPR.xml

This will be wrapped in an xvfb-run command as normal in a workflow. 
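Putting the environment variables and the command line together, a sketch of how the invocation might be assembled (the function is hypothetical, and the recipes path under casaHome is an assumption; point SCIPIPE_SCRIPTDIR at the recipes directory of the CASA+Pipeline version actually in use):

```python
import os

def casa_restore_command(casa_home, rootdir):
    """Assemble the environment and argv for the ALMA restore CASA run,
    ready to hand to subprocess.run(argv, env=env, cwd=working_dir).
    NOTE: the SCIPIPE_SCRIPTDIR location is an assumption."""
    env = dict(os.environ,
               SCIPIPE_ROOTDIR=rootdir,
               SCIPIPE_LOGDIR=os.path.join(rootdir, "Logs"),
               SCIPIPE_SCRIPTDIR=os.path.join(casa_home, "pipeline", "pipeline", "recipes"))
    argv = ["xvfb-run",  # CASA is wrapped in xvfb-run as normal
            os.path.join(casa_home, "bin", "casa"),
            "--nogui", "--nologger", "--pipeline",
            "-c", os.path.join(casa_home, "pipeline", "pipeline", "runpipeline.py"),
            "PPR.xml"]
    return env, argv
```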



