The AUDI imaging task will sometimes crash, most commonly when the restore on an early Cycle 5 dataset fails. For that specific case, SSA has written a workaround known as "The Kludge". Other failure modes may still need manual runs. Fully manual runs need a dedicated node (currently cvpost016) for the interactive job, because the almapipe credentials are supported on only one processing node at a time.

Step-by-step guide - "Classic" Kludged runs - restore with 5.1.1-5

  1. The run will send a failure email with error code 2. Error code 2 can also indicate other issues, such as incomplete ASDM downloads, so double-check that the ASDMs in the raw directory do not contain any ASDMBinary files ending in .missing.
  2. Starting in the spool/xxxx/xxxx/working directory, chmod the directory to make it group-writable (chmod g+w working), then run almaReimageCube with the restore and imaging CASA versions specified (full paths needed). Supply the job ID and the directory uid (not the MOUS uid) to the --request parameter, e.g.
     almaReimageCube --restore_casa /home/casa/packages/RHEL7/release/casa-release-5.1.1-5 --image_casa /home/casa/packages/RHEL7/release/casa-6.5.4-9-pipeline-2023.1.0.124 --request 475229560 uid___A002_Xc89480_X1a40
     Note that this only works in pipelines that have the separate imaging recipe (CASA 6+).
  3. The run should terminate as usual and the usual QA should be possible.
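
The .missing check in step 1 can be scripted. The following is a minimal sketch, not part of the AUDI tooling: the function name find_missing_binaries is hypothetical, and it assumes each ASDM keeps its binaries in a subdirectory named ASDMBinary.

```python
from pathlib import Path


def find_missing_binaries(raw_dir):
    """Return sorted paths of '*.missing' files found in any ASDMBinary directory.

    Assumption: each ASDM under raw_dir stores its binaries in a
    subdirectory called 'ASDMBinary'.
    """
    root = Path(raw_dir)
    return sorted(
        p
        for p in root.rglob("*.missing")
        if p.is_file() and p.parent.name == "ASDMBinary"
    )
```

An empty list from find_missing_binaries("rawdata") suggests the error code 2 did not come from an incomplete download; any returned path points at an ASDM that probably needs to be fetched again before retrying.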

Step-by-step guide - Kludged restores for Session mapping bug - restore with 5.4.2-8

  1. The run will send a failure email with error code 2. Error code 2 can also indicate other issues, such as incomplete ASDM downloads, so double-check that the ASDMs in the raw directory do not contain any ASDMBinary files ending in .missing.
  2. Starting in the spool/xxxx/xxxx/working directory, run almaReimageCube with the restore and imaging CASA versions specified (full paths needed). Supply the job ID and the directory uid (not the MOUS uid) to the --request parameter, e.g.
     almaReimageCube --restore_casa /home/casa/packages/RHEL7/release/casa-release-5.4.2-8 --image_casa /home/casa/packages/pipeline/casa-6.1.1-10-pipeline-2020.1.0.36 --request 475229560 uid___A002_Xc89480_X1a40
     Note that this only works in pipelines that have the separate imaging recipe (CASA 6+).
  3. The run should terminate as usual and the usual QA should be possible.
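
Both Kludged guides use the same almaReimageCube command line and differ only in the CASA paths. The sketch below assembles that command as an argument list and rejects relative CASA paths, since full paths are needed. The helper name build_reimage_command is hypothetical; only the flags shown in the steps above are used.

```python
import os
import shlex


def build_reimage_command(restore_casa, image_casa, job_id, dir_uid):
    """Assemble the almaReimageCube argument list used in the Kludged guides."""
    for label, path in (("restore_casa", restore_casa), ("image_casa", image_casa)):
        # The guide states that full paths are needed for both CASA versions.
        if not os.path.isabs(path):
            raise ValueError(f"{label} must be a full path, got {path!r}")
    return [
        "almaReimageCube",
        "--restore_casa", restore_casa,
        "--image_casa", image_casa,
        # --request takes the job ID followed by the directory uid (not the MOUS uid).
        "--request", str(job_id), dir_uid,
    ]


cmd = build_reimage_command(
    "/home/casa/packages/RHEL7/release/casa-release-5.4.2-8",
    "/home/casa/packages/pipeline/casa-6.1.1-10-pipeline-2020.1.0.36",
    475229560,
    "uid___A002_Xc89480_X1a40",
)
print(shlex.join(cmd))  # shell-ready form, for pasting or logging
```

The list form can be passed to subprocess.run(cmd, cwd=<working directory>, check=True) if the command is to be launched from Python rather than pasted into a shell.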

Step-by-step guide - fully manual runs

  1. Run the restore on the dataset using the appropriate version of CASA.
  2. Copy the <uid>casapipescript.py from the original failed imaging run's spool/<uid>/products directory into the working directory of the restore.
  3. Edit the imaging casapipescript: replace the hifa_restore call for the raw ASDM with hifa_importdata(vis=<calibrated msname[s]>, session=[sessionid], dbservice=False) for the restored MS, and add hifa_exportdata(imaging_products_only=True) at the end.
  4. Remove the products directory from the restore run.
  5. Start CASA with casa --pipeline
  6. In CASA, run execfile('<uid>casapipescript.py')
  7. Make a <jobid>/<uid> directory in the image-qa area.
  8. Copy the rawdata, products and working directories into the image-qa area.
  9. Do QA and run audiPass in the usual way.
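
The script edit in step 3 can be sketched as a small text rewrite. This is an illustration under stated assumptions, not an SSA tool: the function name patch_imaging_script is hypothetical, the restore call is assumed to sit on a single line starting with hifa_restore, and "at the end" is taken to mean directly after the last hif_/hifa_ task call, so that any trailing finally block stays last.

```python
def patch_imaging_script(text, ms_names, session_ids):
    """Swap the restore call for hifa_importdata and append hifa_exportdata.

    Assumptions: the restore call occupies one line beginning with
    'hifa_restore', and the export call belongs right after the last
    hif_/hifa_ task call in the script.
    """
    import_call = "hifa_importdata(vis={!r}, session={!r}, dbservice=False)".format(
        list(ms_names), list(session_ids)
    )
    export_call = "hifa_exportdata(imaging_products_only=True)"

    out = []
    last_task = None  # index in `out` of the most recent pipeline task call
    task_indent = ""
    replaced = False
    for line in text.splitlines():
        stripped = line.lstrip()
        indent = line[: len(line) - len(stripped)]
        if stripped.startswith("hifa_restore"):
            line = indent + import_call  # keep the original indentation
            replaced = True
        if stripped.startswith(("hif_", "hifa_")):
            last_task = len(out)
            task_indent = indent
        out.append(line)

    if not replaced:
        raise ValueError("no hifa_restore call found in the script")
    out.insert(last_task + 1, task_indent + export_call)
    return "\n".join(out) + "\n"
```

Write the result to a new file and compare it against the original before running it with execfile, since real scripts may not match the layout assumed here.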

Step-by-step guide - large cubes

  1. Run the restore on the dataset using the appropriate version of CASA, or grab the MSes from the working directory of the failed run.
  2. Typically the first attempt will have failed in findcont and needs to be rerun. The options are: (1) download the cont.dat file from the ALMA archive (in auxproducts) and rerun without findcont, or (2) insert hif_productsize into the script or PPR, see if it will rerun with mitigations, then copy the cont.dat file from that run. Option (2) can also be used to generate a casa_pipescript template for option (1).
  3. Typically you will need to interact with the user to find acceptable mitigations (e.g. smaller image size).
  4. Then proceed as for fully manual runs above. 
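
Before rerunning a large cube, it can help to confirm what the failed run left behind. The sketch below is a hypothetical helper (inventory_working_dir is not part of any pipeline) that lists the measurement sets and reports whether a cont.dat file is present. It assumes the measurement sets are directories ending in .ms and that the file is named exactly cont.dat in the working directory.

```python
from pathlib import Path


def inventory_working_dir(working_dir):
    """Report the measurement sets and cont.dat status of a working directory.

    Assumptions: measurement sets are directories whose names end in '.ms',
    and the continuum ranges file is called 'cont.dat'.
    """
    root = Path(working_dir)
    ms_dirs = sorted(p.name for p in root.glob("*.ms") if p.is_dir())
    return {
        "measurement_sets": ms_dirs,
        "has_cont_dat": (root / "cont.dat").is_file(),
    }
```

If has_cont_dat is False, one of the two options in step 2 is still needed to obtain the file before findcont can be skipped.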

