
Please track notes from HTCondor week here, particularly new features we may want to investigate.  It could be some time before we can look at them and I don't want to forget.  If possible include:

  • Name or very brief description of the feature (e.g. DAGman flow)
  • Any links provided by the talk or that you can track down in semi-realtime
  • Broader description of each item and maybe some context of where it might be useful

The primary function of this is as a note to future versions of us so no need to go into great detail,  just want to avoid head scratching questions about 'what was that thing, that was going to help with that other thing'.

Try to create and follow some form of structure using headers and bullets.


Data Reuse Mechanism

In v8.9, job input files can be cached on the execute machine.

File Transfer Improvements

New file-transfer plugins support AWS S3 and Box.com.
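A sketch of what the S3 support looks like in a submit file, based on our reading of the 8.9 file-transfer docs. The bucket name, file names, and key-file paths below are made up for illustration; double-check the exact submit-command names against the manual before using this.

```
# Hypothetical submit-file fragment for S3 transfers (8.9-era
# command names; bucket/key paths are placeholders).
aws_access_key_id_file     = /home/nrao/.condor/s3_access_key_id
aws_secret_access_key_file = /home/nrao/.condor/s3_secret_access_key

transfer_input_files = s3://our-bucket/input.dat
output_destination   = s3://our-bucket/results/
```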

DAGMan Data Flow

A DAG can work like Make and run or skip jobs based on file timestamps. Does this replace what Makeflow does? It seems simpler than Makeflow. Dataflow will skip a node if its defined output file is newer than its input file. It is in the latest 8.9.7 release.

https://htcondor.readthedocs.io/en/latest/users-manual/file-transfer.html?highlight=dataflow#dataflow-jobs

But even though the input file is older than both the output files and the executable, and despite adding the knob to the 99-nrao file on the central manager/submit host and the execution host and running condor_reconfig, the job still runs. So either dataflow doesn't work or I don't know how to work it.
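For future reference, this is roughly the setup we were testing. The knob name is our reading of the 8.9.7 docs and should be double-checked; the file names are placeholders.

```
# In the config (99-nrao) on the CM/submit host and execute host,
# then condor_reconfig (knob name per our reading of the docs --
# verify against the 8.9.7 manual):
SHADOW_CHECKS_DATAFLOW = True

# Submit file: the expectation is that this job is skipped when
# output.dat is newer than input.dat (names are placeholders).
executable            = /bin/cp
arguments             = input.dat output.dat
transfer_input_files  = input.dat
transfer_output_files = output.dat
queue
```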



Docker

HTCondor provides Docker images (regular and mini):

docker run htcondor/cm
docker run -it htcondor/mini

Could we use Docker images for interactive use? That way the container could run a proper sshd. The University of Bonn (High Energy Physics) does something like this.

 condor_submit -i -append '+ContainerOS="CentOS7"'


Getting the docker universe working seems simple: install docker-io or docker-ce, add the condor user to the docker group, and condor_starter will detect that Docker is available.

  universe = docker

  docker_image = dev7_and_HEP_stack
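Putting those two lines into a complete (hypothetical) submit file might look like the following; the executable and log names are placeholders, only the universe and the dev7_and_HEP_stack image come from our notes.

```
# Minimal docker-universe submit file (executable/output names
# are placeholders for illustration).
universe       = docker
docker_image   = dev7_and_HEP_stack
executable     = ./run_test.sh
output         = out.$(Process)
error          = err.$(Process)
log            = job.log
request_memory = 1GB
queue
```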


Docker vs Singularity

HTCondor seems to prefer Singularity because it doesn't start processes from a daemon, so processes can be better tracked by HTCondor. condor_ssh_to_job works with Singularity.
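If memory serves, a vanilla-universe job can request a specific Singularity image with the +SingularityImage attribute; the image path below is a placeholder, and this should be verified against the docs before relying on it.

```
# Vanilla-universe job requesting a Singularity image
# (+SingularityImage per our recollection of the HTCondor docs;
# image path is a placeholder).
universe          = vanilla
+SingularityImage = "/images/dev7_and_HEP_stack.sif"
executable        = ./run_test.sh
queue
```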


Youtube

https://www.youtube.com/channel/UCd1UBXmZIgB4p85t2tu-gLw
Center for High Throughput Computing
These seem to be mostly intro-level material.

HTMap

Seems like a good approach for making the pipeline run imaging in the HTC environment.
https://htmap.readthedocs.io/en/latest/
HTMap is a library that wraps the process of mapping Python function calls out to an HTCondor pool. It provides tools for submitting, managing, and processing the output of arbitrary functions.
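A quick sketch of the map-a-function pattern HTMap wraps. The htmap call itself needs an HTCondor pool (and we haven't run it yet), so it is shown as a comment alongside the plain-Python equivalent; the function name is made up.

```python
def double(x):
    return 2 * x

# With HTMap installed and an HTCondor pool available, the docs'
# pattern is (assumption -- not yet tested by us):
#   import htmap
#   results = htmap.map(double, range(10))
#   print(list(results))  # blocks until the jobs finish

# The same map run locally, for comparison:
local_results = list(map(double, range(10)))
print(local_results)
```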

Python tutorial

https://mybinder.org/v2/gh/htcondor/htcondor-python-bindings-tutorials/master?urlpath=lab/tree/users/Submitting-and-Managing-Jobs.ipynb

Needs the python3-condor RPM or a pip install; it doesn't get installed by default with yum. I pointed this out to CHTC and they submitted a ticket. Should try installing locally with pip in a user's account:

pip3 install htcondor



/dev/shm

/dev/shm is now job-private like /tmp and /var/tmp in version 8.9. It was already restricted by cgroups, but in 8.9 jobs can't see other jobs' /dev/shm and it gets cleaned when a job exits. (tested krowe May 28, 2020)
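One way to re-run that check (a reconstruction, not necessarily what was done in May): submit this job twice, then look for the marker from the second job's shell; under 8.9 the second job should not see it.

```
# Drops a marker into /dev/shm and lingers; a concurrent job on
# the same execute machine should NOT see the marker under 8.9
# (marker name and sleep time are arbitrary).
universe   = vanilla
executable = /bin/sh
arguments  = "-c 'touch /dev/shm/marker && sleep 300'"
queue
```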


Versions

It is unlikely there will be a version 8.8.10; the next release will instead be 9.0.






