Wrapper

A wrapper script such as wrapper_zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.sh is created by build_makeflow_from_config.py and called by makeflow_htcondor.sh idr2_2.mf.  I see the following issues with it:


I am confused as to the difference between the .log, .out and .log.error files.



idr2.2.mf


Even though HTCondor copies do_EXTRACT_AUTOS.sh to the scratch area, the job doesn’t run that copy.  Instead, it runs the full-path version (/lustre/aoc/projects/hera/krowe/…).  This is because the .mf file runs /lustre/.../wrapper_zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.sh, which in turn runs /lustre/.../do_EXTRACT_AUTOS.sh.  To fix this, the .mf file will need to be generated differently.
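As a rough sketch (not the actual output of build_makeflow_from_config.py), the generated files would need to refer to the copies in the job's scratch directory by relative path, e.g.:

        # in the .mf recipe, run the wrapper that was transferred to scratch
        ./wrapper_zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.sh

        # in the wrapper, run the transferred copy of the task script
        ./do_EXTRACT_AUTOS.sh

rather than the full /lustre/aoc/projects/hera/krowe/… paths.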



Makeflow uses the classic Make syntax, like so:


target : prerequisites


        recipe


where it is expected that the recipe will update the target.  So, I think what you want is for the wrapper script to write its output to a file (perhaps a .log) and to make that file the target.  E.g.

zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.log: /home/nu_kscott/hera/test1/wrapper_zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.sh do_EXTRACT_AUTOS.sh _common.sh ../share/makeflow_sample/raw_data/zen.2458098.44615.HH.uvh5 hera_calibration_packages.tar.gz extract_autos.py

        ./wrapper_zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.sh > zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.log 2>&1

What is different?  The recipe now actually creates its target (the .log file), so Makeflow can use the normal Make dependency logic to decide whether the step still needs to run.


Python Package Size

The Python packages file (hera_calibration_packages.tar.gz) is about 250MB.  According to http://chtc.cs.wisc.edu/file-avail-largedata, CHTC would like this to be under 100MB in order to use the HTCondor File Transfer mechanism.  So we either need to reduce the file size (by removing packages, using better compression, or both) or ask CHTC to add it to their SQUID web proxy.
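As a starting point, the following sketch (the packages/ working directory and the output filename are assumptions) shows how to find the largest members of the tarball and test whether stronger compression alone gets it under 100MB:

        # list the largest members of the existing tarball (size is the third column)
        tar -tzvf hera_calibration_packages.tar.gz | sort -k3 -n | tail -20

        # repack with xz, which usually compresses Python packages better than gzip
        mkdir packages && tar -xzf hera_calibration_packages.tar.gz -C packages
        tar -cJf hera_calibration_packages.tar.xz -C packages .
        du -sh hera_calibration_packages.tar.gz hera_calibration_packages.tar.xz

If the .tar.xz version is small enough, the wrapper scripts would also need to extract it with tar -xJf instead of tar -xzf.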


raw_data Size

The sample uvh5 data files I have seen are about 5GB in size.  This is far too large for the HTCondor File Transfer mechanism according to http://chtc.cs.wisc.edu/file-avail-largedata.  Is there a way these files can be split into just what each individual job needs?  If not, they will have to live on the Large Data Staging filesystem, which will limit the pool of available execution hosts.