This is a proof of concept for running HERA calibration jobs at CHTC using HTCondor and no shared filesystem.
Mar. 26, 2020 krowe: I was able to run this through HTCondor at CHTC via makeflow -T condor small.mf
While I think it worked and produced a file (zen.2458098.44615.HH.autos.uvh5), makeflow itself never returned. It just seemed to hang even though the HTCondor job had finished.
wrapper_zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.sh
small.mf
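This file started as idr2_2.mf. It was renamed to small.mf and all but the first target were removed. The changes below were then made to get it to work at CHTC. In this diff, lines marked < are the CHTC version and lines marked > are the original.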
< zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.log: /home/nu_kscott/hera/test1/wrapper_zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.sh do_EXTRACT_AUTOS.sh _common.sh ../share/makeflow_sample/raw_data/zen.2458098.44615.HH.uvh5 hera_calibration_packages.tar.gz extract_autos.py
< ./wrapper_zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.sh > zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.log 2>&1
---
> zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.out: /lustre/aoc/projects/hera/krowe/hera_opm/pipelines/h1c/idr2/v2/task_scripts/do_EXTRACT_AUTOS.sh
> /lustre/aoc/projects/hera/krowe/makeflow_sample/makeflow/wrapper_zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.sh > /lustre/aoc/projects/hera/krowe/makeflow_sample/makeflow/zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.log 2>&1
wrapper_zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.sh
2,3c2,3
< source ~/.bashrc
< conda activate base
---
> #source ~/.bashrc
> #conda activate base
5,12c5,25
< cd /lustre/aoc/projects/hera/krowe/makeflow_sample/raw_data
< timeout 24h /lustre/aoc/projects/hera/krowe/hera_opm/pipelines/h1c/idr2/v2/task_scripts/do_EXTRACT_AUTOS.sh zen.2458098.44615.HH.uvh5
< if [ $? -eq 0 ]; then
< cd /lustre/aoc/projects/hera/krowe/makeflow_sample/makeflow
< touch zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.out
< else
< mv /lustre/aoc/projects/hera/krowe/makeflow_sample/makeflow/zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.log /lustre/aoc/projects/hera/krowe/makeflow_sample/makeflow/zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.log.error
< fi
---
> wget --no-verbose http://proxy.chtc.wisc.edu/SQUID/chtc/python37.tar.gz
> tar xfz python37.tar.gz
> tar xfz hera_calibration_packages.tar.gz
> date
> export PYTHONPATH=${PWD}/hera_calibration_packages
> export PATH=.:${PWD}/python/bin:${PATH}
>
> #CHTC's python37 tarball has bin/python3 and not bin/python
> (cd python/bin ; ln -s python3 python)
>
> #cd /lustre/aoc/projects/hera/krowe/makeflow_sample/raw_data
>
> timeout 24h ./do_EXTRACT_AUTOS.sh zen.2458098.44615.HH.uvh5
>
> # Perhaps wait and see where makeflow/condor put output and error.
> #if [ $? -eq 0 ]; then
> # cd /lustre/aoc/projects/hera/krowe/makeflow_sample/makeflow
> # touch zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.out
> #else
> # mv /lustre/aoc/projects/hera/krowe/makeflow_sample/makeflow/zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.log /lustre/aoc/projects/hera/krowe/makeflow_sample/makeflow/zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS.log.error
> #fi
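The success/failure handling that is commented out above depends on absolute /lustre paths, which do not exist on a CHTC execute node. A minimal sketch of how that logic might be rewritten with relative paths is shown here. The function name run_task is hypothetical and is not part of hera_opm or makeflow. It assumes the job runs in its scratch directory and that any log file sits in that same directory. Whether makeflow brings the resulting .out or .log.error file back also depends on what the rule in small.mf declares, which this sketch does not address.

```shell
#!/bin/bash
# Sketch only: run_task is a hypothetical helper, not part of hera_opm or makeflow.
# Assumption: the current directory is the job's scratch directory and the
# log file, if there is one, is named <base>.log in that directory.
run_task() {
    local base="$1"   # e.g. zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS
    shift             # the remaining arguments are the command to run
    local status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
        # Success: create the marker file using a relative path.
        touch "${base}.out"
    elif [ -e "${base}.log" ]; then
        # Failure: rename the log so the failure is visible, as the original wrapper did.
        mv "${base}.log" "${base}.log.error"
    fi
    # Pass the task's exit status back to the caller.
    return "$status"
}
```

In the wrapper this could replace the bare timeout line, for example: run_task zen.2458098.44615.HH.uvh5.EXTRACT_AUTOS timeout 24h ./do_EXTRACT_AUTOS.sh zen.2458098.44615.HH.uvh5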