This work is a joint effort of SCG and ARDG to decompose imaging into small work units that can be processed as independent jobs in an HTC environment, making use of HTCondor's DAGMan.
Initial testing consists of running a single gridding cycle on a previously partitioned MS. The original scripts are located at /lustre/aoc/users/sbhatnag/11B-157/Continuum/IMAGING_CTB80/MTWBAWP/PARALLEL/HTCondor/SCRIPT_TEST
At this stage, all scripts assume a shared file system (Lustre) across all compute nodes. A short description of each script follows.
imaging.py | Python script that splits the input MS into smaller MSes and produces the DAG (in the file tclean.dag). It also holds the tclean parameters. |
mkres.py | Python script that sets up the SynthesisImager tool of CASA, runs the gridder on the input MS, and produces images with the given basename. The input MSes (via the DAG nodes) are the sub-MSes produced by imaging.py. |
tclean.dag | The DAG that converts sub-MSes to sub-images. Uses the tclean.htc HTCondor submit file. |
tclean.htc | The HTCondor submit file that uses CASA to run mkres.py with the (sub-)MS and image name. |
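To illustrate how these pieces fit together, a DAG of this shape could look like the sketch below. The node names, VARS names, and sub-MS file names are assumptions for illustration, not copied from the actual scripts; only the file names tclean.dag, tclean.htc, and mkres.py come from the description above.

```
# tclean.dag -- hypothetical sketch: one node per sub-MS
JOB imager0 tclean.htc
VARS imager0 subms="part0.ms" imagename="img0"
JOB imager1 tclean.htc
VARS imager1 subms="part1.ms" imagename="img1"
```

```
# tclean.htc -- hypothetical submit description running mkres.py inside CASA
universe   = vanilla
executable = /usr/bin/casa
arguments  = --nologger --nogui -c mkres.py $(subms) $(imagename)
log        = imager_$(Cluster).log
output     = imager_$(Cluster).out
error      = imager_$(Cluster).err
queue
```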
The convolution functions have to be obtained prior to partitioning the MS. They are contained in the (sub)directory cf.tt_tclean_allSPW_withW.ps. A copy of the original scripts, the data (before and after partitioning), and the convolution functions is located in the directory script_test_0 under /lustre/aoc/sciops/fmadsen/HTCondor/imaging_ctb80, which will be used as the root directory for subsequent testing.
A top-level diagram of the imaging process is shown below. It is updated as development progresses to reflect the software structure as well as important findings and remarks.
The DAG condor_imaging.dag under /lustre/aoc/sciops/fmadsen/HTCondor/imaging_ctb80/script_test_1 first runs the job MSpartition. That job partitions the MS based on the inputs and writes the sub-DAG allImagers.dag, which condor_imaging.dag runs as the child of MSpartition. The file allImagers.dag must exist at DAG submission, but it can be empty.
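The two-stage structure described above maps onto standard DAGMan syntax. A minimal sketch, assuming the SUBDAG EXTERNAL mechanism is what allows allImagers.dag to be (re)written after submission:

```
# condor_imaging.dag -- sketch of the two-stage structure
JOB MSpartition MSpartition.htc
SUBDAG EXTERNAL allImagers allImagers.dag
PARENT MSpartition CHILD allImagers
```

A SUBDAG EXTERNAL node is only submitted (and its DAG file only read) when the node starts, which is why allImagers.dag can be empty at top-level submission and filled in by the MSpartition job.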
The software structure has changed: dagutils.py (the former imaging.py with small changes for integration) is now the module that is imported into CASA and holds all the function definitions used by both MSpartition.py and allImagers.py.
With respect to "test_0", the following has been accomplished in "test_1":
- software integration: a common source of function definitions is used by the scripts that partition the MS and that run the first major cycle to produce the sub-images
- a DAG that runs MS partitioning as the first job and gridding/imaging of the sub-MSes as the second, and that scales easily to also run the addition of sub-images and deconvolution
The new software modules are:
condor_imaging.dag | top-level DAG, currently defining the job MSpartition and the SUBDAG allImagers |
dagutils.py | based on the former imaging.py; intended to be the centralized source of definitions for all stages of the imaging process |
MSpartition.py | simple script that imports dagutils.py and runs daggen, which partitions the original MS based on the maximum sub-MS size and generates the sub-DAG (allImagers.dag) with the jobs that produce the sub-images from all the sub-MSes |
MSpartition.htc | submit file containing the job definitions to run MSpartition.py |
allImagers.dag | empty at job submission; written by MSpartition.py and run as a SUBDAG by condor_imaging.dag after the job MSpartition completes |
allImagers.py | simple script that imports dagutils.py and runs mkImage on the input (sub-)MS to create a (sub-)image |
allImagers.htc | submit file containing the job definitions to run allImagers.py |
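The daggen step can be sketched in Python as below. This is an illustration of the pattern (derive the number of sub-MSes from a size cap, then emit one DAG node per sub-MS into allImagers.dag), not the actual dagutils.py code; the function signature, the size-based inputs, and the part/img naming are all assumptions.

```python
import math

def daggen(ms_size_gb, max_subms_gb, subdag_path="allImagers.dag"):
    """Illustrative stand-in for the daggen step in dagutils.py:
    choose the number of sub-MSes from a maximum sub-MS size and
    write one allImagers job per sub-MS into the sub-DAG file."""
    n_parts = math.ceil(ms_size_gb / max_subms_gb)
    lines = []
    for i in range(n_parts):
        node = f"imager{i}"
        lines.append(f"JOB {node} allImagers.htc")
        # each node hands its sub-MS and sub-image basename to allImagers.py
        lines.append(f'VARS {node} subms="part{i}.ms" imagename="img{i}"')
    with open(subdag_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return n_parts
```

For example, a 10 GB MS with a 3 GB cap would yield four sub-MSes and therefore four imager nodes in the generated sub-DAG.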
What has not yet changed in "test_1":
- convolution functions are obtained manually prior to running the DAG
- although the imaging parameters now have a single source in dagutils.py, they are still not exposed as parameters for general imaging
- sub-images are not (yet) added