This page lists each task and what it needs in order to run under HTCondor.  I am assuming the tasks will run without a shared filesystem and without access to NRAO filesystems.  So any reference to /lustre/aoc, /users/<username>, or similar paths needs to be changed to be site-agnostic.

Every DAG or task creates .log, .out, and possibly .png files that we want to keep.  Files ending in .last, such as tclean.last, are also often created.  These are not necessary but can be useful for debugging.  I assume that almost all tasks require the Measurement Set (MS).  What I do not know is which tasks actually modify the MS.  run_tclean() defaults to using the corrected datacolumn.  Does that mean it is changing this column?
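
One way to find out empirically which tasks modify the MS is to snapshot the MS directory before and after each task and compare.  The following is only a sketch: the function names are hypothetical, and it compares file sizes and modification times rather than contents, so bookkeeping files that are rewritten without a real data change (lock files, for example) may show up as false positives.

```python
import os


def tree_fingerprint(root):
    """Return {relative_path: (size, mtime_ns)} for every file under root."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            rel = os.path.relpath(path, root)
            snapshot[rel] = (info.st_size, info.st_mtime_ns)
    return snapshot


def changed_files(before, after):
    """List paths that were added, removed, or differ in size or mtime."""
    all_paths = set(before) | set(after)
    return sorted(p for p in all_paths if before.get(p) != after.get(p))
```

Calling tree_fingerprint() on the MS directory before and after a task and passing both results to changed_files() gives the list of files the task touched.  An empty list suggests the task only reads the MS, in which case "data" does not need to be listed as an output.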

This document is not complete.  I am sure I am missing inputs and perhaps outputs as well.

In this document, "data", when referenced as an input or an output, is a directory containing the Measurement Set (e.g. VLASS1.2.sb36491855.eb36574404.58585.53016267361_split.ms/).

How do we handle wanting to start a job at a given task?  For example, say a job ran to completion but you want to re-run it after altering something in task17.  It would be unfortunate to have to re-run tasks 1 through 16.  It would be better to start at task17 and run through to the end of task25.  Doing this requires saving the output of each task.  But how?  Incremental or differential?  Using prolog and epilog scripts?  Something else?
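
One possible approach, sketched here under assumptions, is to keep one DAG node per task and mark the nodes before the restart point as DONE, a keyword DAGMan accepts on a JOB line to skip a node.  The node names, the one-submit-file-per-task naming (Task01.sub and so on), and the strictly linear ordering are all assumptions for illustration.

```python
def build_restart_dag(task_names, start_task):
    """Return DAG text for a linear chain, skipping tasks before start_task."""
    if start_task not in task_names:
        raise ValueError("unknown task: " + start_task)
    start = task_names.index(start_task)
    lines = []
    for i, name in enumerate(task_names):
        # Nodes marked DONE are treated by DAGMan as already completed.
        done = " DONE" if i < start else ""
        lines.append("JOB {0} {0}.sub{1}".format(name, done))
    # Linear dependencies: each task is the parent of the next one.
    for parent, child in zip(task_names, task_names[1:]):
        lines.append("PARENT {0} CHILD {1}".format(parent, child))
    return "\n".join(lines) + "\n"


tasks = ["Task{:02d}".format(n) for n in range(1, 26)]
dag_text = build_restart_dag(tasks, "Task17")
```

Marking a node DONE only skips its execution.  The outputs of tasks 1 through 16 still have to be available to task17, so this does not answer the question of how to save them.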

Task01

This task did not alter the MS.

run_tclean( 'iter0', cfcache=cfcache_nowb, robust=-2.0, uvtaper='3arcsec', calcres=False  )

Task02

This task creates VIP_iter0b.*, but I don't see those files referenced again in this script.  What does this task do that other tasks need?

run_tclean( 'iter0b', cfcache=cfcache_nowb, calcres=False  )


Task03

mask_from_catalog(inext=inext,outext="QLcatmask.mask",catalog_search_size=1.5,catalog_fits_file='../VLASS1Q.fits')

  • input: data
  • input: VLASS1Q.fits
  • output: mask_from_cat.crtf, VIP_QLcatmask.mask


Task04

run_tclean( 'iter1', robust=-2.0, uvtaper="3arcsec"  )

  • input: data
  • output: VIP_iter1.*


Task05

replace_psf('iter1','iter0')

This is just some python that deletes VIP_iter1.psf.* and copies VIP_iter0.psf.* to VIP_iter1.psf.*.  It is inefficient to ever make this task be its own DAG.  I suggest it always be in the same DAG as Task04.

  • input: VIP_iter0.psf.*, VIP_iter1.psf.*
  • output: VIP_iter1.psf.*
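
For reference, the behaviour described above can be sketched in a few lines.  This is not the actual replace_psf() source, which is not shown on this page.  The function name, the "VIP_" prefix default, and the workdir parameter are assumptions, and the sketch handles both plain files and directories because CASA images are directories.

```python
import glob
import os
import shutil


def replace_psf_sketch(target, source, prefix="VIP_", workdir="."):
    """Delete <prefix><target>.psf.* and copy <prefix><source>.psf.* over it."""
    target_stem = os.path.join(workdir, prefix + target + ".psf")
    source_stem = os.path.join(workdir, prefix + source + ".psf")
    # Remove every existing PSF product of the target iteration.
    for old in glob.glob(target_stem + ".*"):
        if os.path.isdir(old):
            shutil.rmtree(old)
        else:
            os.remove(old)
    # Copy each source PSF product, keeping its suffix (e.g. ".tt0").
    copied = []
    for src in sorted(glob.glob(source_stem + ".*")):
        dst = target_stem + src[len(source_stem):]
        if os.path.isdir(src):
            shutil.copytree(src, dst)
        else:
            shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

Because this only touches files in the working directory, it needs no MS and very little run time, which supports keeping it in the same DAG node as the preceding run_tclean() call.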


Task06

run_tclean( 'iter1', robust=-2.0, uvtaper="3arcsec", niter=20000, nsigma=5.0, mask="QLcatmask.mask", calcres=False, calcpsf=False  )

  • input: data
  • input: VIP_iter1.*, VIP_QLcatmask.mask
  • output: VIP_iter1.*
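
Without a shared filesystem, input/output lists like the one above have to become transfer_input_files and transfer_output_files entries in the submit description.  As far as I know, HTCondor does not expand wildcards in those entries, so a pattern such as VIP_iter1.* has to be expanded on the submit host.  The sketch below does that.  The function name and the executable and arguments values are hypothetical, and the output names must be listed explicitly because they do not exist yet at submit time.

```python
import glob


def submit_description(executable, arguments, input_patterns, output_names):
    """Build submit-file text, expanding input wildcards on the submit host."""
    inputs = []
    for pattern in input_patterns:
        matches = sorted(glob.glob(pattern))
        if not matches:
            raise FileNotFoundError("no input matches " + pattern)
        inputs.extend(matches)
    lines = [
        "executable = " + executable,
        "arguments = " + arguments,
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        # Directory names without a trailing slash are sent as directories.
        "transfer_input_files = " + ", ".join(inputs),
        "transfer_output_files = " + ", ".join(output_names),
        "queue",
    ]
    return "\n".join(lines) + "\n"
```

The same function could be called once per task with that task's inputs and outputs, which would also make missing entries in the lists on this page show up as errors at submit time.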


Task07

run_tclean( 'iter1', calcres=False, calcpsf=False, savemodel='modelcolumn', parallel=False  )

  • input: data
  • input: VIP_iter1.*
  • output: VIP_iter1.*
  • output: data

Task08

flagdata(vis=vis, mode='rflag', datacolumn='residual_data',timedev='tdev.txt',freqdev='fdev.txt',action='calculate')

replace_rflag_levels()

flagdata(vis=vis, mode='rflag', datacolumn='residual_data',timedev='tdev.txt',freqdev='fdev.txt',action='apply',extendflags=False)

flagdata(vis=vis, mode='extend', extendpols=True, growaround=True)

  • input: data
  • output: tdev.txt, fdev.txt


Task09

statwt(vis=vis,combine='field,scan,state,corr',chanbin=1,timebin='1yr', datacolumn='residual_data' )

  • input: data
  • output: data


Task10

gaincal(vis=vis,caltable='g.0',gaintype='T',calmode='p',refant='0',combine='field,spw',minsnr=5)

  • input: data
  • output: data


Task11

applycal(vis=vis,calwt=False,applymode='calonly',gaintable='g.0',spwmap=18*[2], interp='nearest')

  • input: data
  • output: data


Task12

run_tclean( 'iter0c', datacolumn='corrected', cfcache=cfcache_nowb, robust=-2.0, uvtaper='3arcsec', calcres=False  )

  • input: data
  • output: VIP_iter0c.*


Task13

run_tclean( 'iter0d', datacolumn='corrected', cfcache=cfcache_nowb, calcres=False  )

  • input: data
  • output: VIP_iter0d.*


Task14

run_tclean( 'iter1b', datacolumn='corrected', robust=-2.0, uvtaper="3arcsec" )

  • input: data
  • output: VIP_iter1b.*


Task15

replace_psf('iter1b','iter0c')

This is just some python that deletes VIP_iter1b.psf.* and copies VIP_iter0c.psf.* to VIP_iter1b.psf.*.  It is inefficient to ever make this task be its own DAG.  I suggest it always be in the same DAG as Task14.

  • input: VIP_iter1b.psf.*, VIP_iter0c.psf.*
  • output: VIP_iter1b.psf.*


Task16

run_tclean( 'iter1b', datacolumn='corrected', robust=-2.0, uvtaper="3arcsec", niter=20000, nsigma=5.0, mask="QLcatmask.mask", calcres=False, calcpsf=False  )

  • input: data
  • input: iter1b, VIP_QLcatmask.mask
  • output: iter1b


Task17

imsmooth(imagename=imagename_base+"iter1b.image.tt0", major='5arcsec', minor='5arcsec', pa='0deg', outfile=imagename_base+"iter1b.image.smooth5.tt0")

  • input: data
  • input: iter1b.image.tt0
  • output: iter1b.image.smooth5.tt0


Task18

exportfits(imagename=imagename_base+"iter1b.image.smooth5.tt0", fitsimage=imagename_base+"iter1b.image.smooth5.fits")

  • input: data
  • input: iter1b.image.smooth5.tt0
  • output: iter1b.image.smooth5.fits


Task19

subprocess.call(['/users/jmarvil/scripts/run_bdsf.py', imagename_base+'iter1b.image.smooth5.fits'],env={'PYTHONPATH':''})

This needs some modification. It calls a script from Josh's homedir and runs bdsf out of /lustre.
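
A possible site-agnostic variant is sketched below.  It assumes that run_bdsf.py is shipped with the job (for example via transfer_input_files) and therefore lands in the job's scratch directory, which HTCondor exposes to the job as _CONDOR_SCRATCH_DIR.  The function name is hypothetical, and the sketch does not address where bdsf itself is installed.

```python
import os


def bdsf_command(fits_image, script_name="run_bdsf.py"):
    """Return (argv, env) for calling the script from the job scratch dir."""
    # Fall back to the current directory when not running under HTCondor.
    scratch = os.environ.get("_CONDOR_SCRATCH_DIR", os.getcwd())
    argv = [os.path.join(scratch, script_name), fits_image]
    # Same environment override as the original subprocess.call().
    env = {"PYTHONPATH": ""}
    return argv, env
```

The result would be used as subprocess.call(argv, env=env), with fits_image set to imagename_base+'iter1b.image.smooth5.fits' as in the original call.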


Task20

edit_pybdsf_islands(catalog_fits_file=imagename_base+'iter1b.image.smooth5.cat.fits')

mask_from_catalog(inext=inext,outext="secondmask.mask",catalog_fits_file=imagename_base+'iter1b.image.smooth5.cat.edited.fits', catalog_search_size=1.5)

  • input: iter1b.image.smooth5.cat.fits
  • input: iter1b.image.smooth5.cat.edited.fits
  • output: secondmask.mask


Task21

immath(imagename=[imagename_base+'secondmask.mask',imagename_base+'QLcatmask.mask'],expr='IM0+IM1',outfile=imagename_base+'sum_of_masks.mask')

im.mask(image=imagename_base+'sum_of_masks.mask',mask=imagename_base+'combined.mask',threshold=0.5)

  • input: secondmask.mask, QLcatmask.mask
  • output: sum_of_masks.mask
  • input: sum_of_masks.mask
  • output: combined.mask


Task22

run_tclean( 'iter2', datacolumn='corrected' )

  • input: data
  • output: VIP_iter2.*


Task23

replace_psf('iter2', 'iter0d')

This is just some python that deletes VIP_iter2.psf.* and copies VIP_iter0d.psf.* to VIP_iter2.psf.*.  It is inefficient to ever make this task be its own DAG.  I suggest it always be in the same DAG as Task22.

  • input: VIP_iter2.psf.*, VIP_iter0d.psf.*
  • output: VIP_iter2.psf.*


Task24

run_tclean( 'iter2', datacolumn='corrected', scales=[0,5,12], nsigma=3.0, niter=20000, cycleniter=3000, mask="QLcatmask.mask", calcres=False, calcpsf=False  )

  • input: data
  • input: VIP_iter2.*, QLcatmask.mask
  • output: VIP_iter2.*


Task25

os.system('rm -rf *.workdirectory')

os.mkdir('iter2_intermediate_results')

os.system('cp -r *iter2* iter2_intermediate_results')

shutil.rmtree(imagename_base+'iter2.mask')

shutil.copytree(imagename_base+'combined.mask',imagename_base+'iter2.mask')

run_tclean( 'iter2', datacolumn='corrected', scales=[0,5,12], nsigma=3.0, niter=20000, cycleniter=3000, mask="", calcres=False, calcpsf=False  )

This does some file cleaning and then runs run_tclean.  Where do we want to do that file cleaning?  In the previous task?  On the submit host?

  • input: data
  • input: VIP_iter2.*
  • output: VIP_iter2.*
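
Wherever the file cleaning ends up running, it can be written without shelling out to rm and cp, which makes it easier to move between the execute host and the submit host.  The sketch below mirrors the steps above using only the standard library.  The function name, the workdir parameter, and the "VIP_" default for imagename_base are assumptions, and it needs Python 3.8 or later for dirs_exist_ok.  Unlike the cp command, it skips copying the destination directory into itself.

```python
import glob
import os
import shutil


def prepare_iter2(imagename_base="VIP_", workdir="."):
    """Clean work directories, back up iter2 products, swap in combined.mask."""
    # Equivalent of: rm -rf *.workdirectory
    for path in glob.glob(os.path.join(workdir, "*.workdirectory")):
        if os.path.isdir(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
    # Equivalent of: mkdir + cp -r *iter2* iter2_intermediate_results
    dest = os.path.join(workdir, "iter2_intermediate_results")
    os.makedirs(dest, exist_ok=True)
    for path in glob.glob(os.path.join(workdir, "*iter2*")):
        if os.path.abspath(path) == os.path.abspath(dest):
            continue  # do not copy the backup directory into itself
        target = os.path.join(dest, os.path.basename(path))
        if os.path.isdir(path):
            shutil.copytree(path, target, dirs_exist_ok=True)
        else:
            shutil.copy2(path, target)
    # Replace the iter2 mask with the combined mask.
    mask = os.path.join(workdir, imagename_base + "iter2.mask")
    shutil.rmtree(mask)
    shutil.copytree(os.path.join(workdir, imagename_base + "combined.mask"), mask)
    return dest
```

If this runs on the execute host, iter2_intermediate_results would also have to be listed as an output of the task in order to be transferred back.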
















