...
- Explicitly list every file/directory in transfer_input_files (it doesn't grok regexps). This would be a large list. E.g.
- transfer_input_files = "working/VIP_iter0.gridwt, working/VIP_iter0.pb.tt0, working/VIP_iter0.psf.tt0, working/VIP_iter0.psf.tt1, working/VIP_iter0.psf.tt2, working/VIP_iter0.sumwt.tt0, working/VIP_iter0.sumwt.tt1, working/VIP_iter0.sumwt.tt2, working/VIP_iter0.weight.tt0, working/VIP_iter0.weight.tt1, working/VIP_iter0.weight.tt2"
- Can transfer_input_files take a manifest? E.g. a file containing the list of files to transfer.
- Make a temporary directory on the submit host, and transfer that (possibly tarring it up).
- Set the inputs and outputs for both data and working as variables in the unified DAG file for each DAG step. The task.sh script deletes and then recreates working-<dagstep>, copies the inputs into this directory, and transfers it to the scratch area via transfer_input_files=working-<dagstep>. When finished, it explicitly transfers out of working-<dagstep> the things we know changed, as listed by an outputs variable defined in the DAG file. task.sh uses rsync to merge the various data_inputs into one data directory and the various working_inputs into one working directory. Then at the end, task.sh moves data to data-<dagstep> and working to working-<dagstep>, and the appropriate dirs/files from these are transferred back to the submit host. The result of all this is that the data needed as input for a step (e.g. Task08) may need to be combined from multiple places (initial data and data output from Task07).
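For the first option above (listing every file explicitly), the list could be generated on the submit host rather than typed by hand. This is a minimal sketch, not something taken from the existing scripts. The helper name transfer_input_files_line and its arguments are hypothetical. It assumes the patterns are ordinary shell-style globs, and it keeps the quoted, comma-separated form used in the example above.

```python
import glob
import os


def transfer_input_files_line(patterns, base_dir="."):
    """Expand shell-style patterns under base_dir into one submit-file line.

    Hypothetical helper: the expansion happens on the submit host, since
    transfer_input_files itself is given a literal list of paths.
    """
    matches = []
    for pattern in patterns:
        # sorted() keeps the generated line stable from run to run
        found = sorted(glob.glob(os.path.join(base_dir, pattern)))
        # Store paths relative to base_dir, as in the hand-written example
        matches.extend(os.path.relpath(path, base_dir) for path in found)
    return 'transfer_input_files = "{}"'.format(", ".join(matches))


if __name__ == "__main__":
    # Prints a line listing whatever currently matches under ./working
    print(transfer_input_files_line(["working/VIP_iter0.*"]))
```

The printed line could then be pasted or templated into the submit file for the relevant DAG step. Note that glob matches directories as well as plain files, so directory-style image products are included.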
Task01
Doesn't alter the MS
...
- input: ../data
- input: VIP_iter1.*
- output: VIP_iter1.*
- output: ../data
Task08
Alters the MS
Tasks 08, 09, 10 and 11 take only minutes to run, so they could be combined into one DAG step.
flagdata(vis=vis, mode='rflag', datacolumn='residual_data', timedev='tdev.txt', freqdev='fdev.txt', action='calculate')
...