...

transfer_input_files = ftp://demo:password@test.rebex.net:/readme.txt
transfer_input_files = ftp://ftp:@ftp.gnu.org:/welcome.msg
transfer_input_files = ftp://ftp.gnu.org:/welcome.msg
transfer_input_files = ftp://ftp:@ftp.slackware.com:/welcome.msg
transfer_input_files = ftp://ftp.slackware.com:/welcome.msg
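
For context, a minimal submit description exercising one of these URLs might look something like the following sketch (the executable, arguments, and output file names here are just placeholders):

# Hypothetical submit description using an anonymous ftp:// input transfer
executable = /bin/cat
arguments = welcome.msg
transfer_input_files = ftp://ftp.gnu.org:/welcome.msg
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
output = welcome.out
error = welcome.err
log = welcome.log
queue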


...


GPUs_Capability

We have a host (testpost001) with both a Tesla T4 (Capability=7.5) and a Tesla L4 (Capability=8.9).  When I run condor_gpu_discovery -prop I see something like the following:

...

+SingularityImage = "/cvmfs/singularity.opensciencegrid.org/opensciencegrid/osgvo-el7:latest"

File Transfer Plugins and HTCondor-C

I see that when a job starts, the execution point (radial001) uses our nraorsync plugin to download the files.  This is fine and good.  When the job is finished, the execution point (radial001) uses our nraorsync plugin to upload the files, also fine and good.  But then the RADIAL schedd (radialhead) also runs our nraorsync plugin to upload files.  This causes problems because radialhead doesn't have the _CONDOR_JOB_AD environment variable and the plugin dies.  Why is the remote schedd running the plugin and is there a way to prevent it from doing so?

Greg understands this and will ask the HTCondor-C folks about it.

Greg thinks it is a bug and will talk to our HTCondor-C people.

2023-08-07: Greg said the HTCondor-C people agree this is a bug and will work on it.

2023-09-25 krowe: send Greg my exact procedure to reproduce this.

2023-10-02 krowe: Sent Greg an example that fails.  Turns out it is intermittent.

2024-01-22 krowe: will send email to the condor list

ANSWER: It was K. Scott all along.  I now have HTCondor-C working from the nmpost and testpost clusters to the radial cluster using my nraorsync plugin to transfer both input and output files.  The reason the remote AP (radialhead) was running the nraorsync plugin was because I defined it in the condor config like so.

FILETRANSFER_PLUGINS = $(FILETRANSFER_PLUGINS), /usr/libexec/condor/nraorsync_plugin.py

I probably did this early in my HTCondor-C testing not knowing what I was doing.  I commented this out, restarted condor, and now everything seems to be working properly.
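
For reference, the registration itself belongs only in the configuration of the execution points, not on the remote AP; a minimal sketch of that intent, using the same plugin path as above:

# On the execution points (e.g. radial001) only -- do NOT set this on the remote AP (radialhead)
FILETRANSFER_PLUGINS = $(FILETRANSFER_PLUGINS), /usr/libexec/condor/nraorsync_plugin.py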


...


Constant processing

Our workflows have a process called "ingestion" that puts data into our archive.  There are almost always ingestion processes running or needing to run, and we don't want them to get stalled because of other jobs.  Both ingestion and the other jobs run as the same user, "vlapipe".  I thought about setting a high priority in the ingestion submit description file, but that won't guarantee that ingestion always runs, especially since we don't do preemption.  So my current thinking is to have a dedicated node for ingestion.  Can you think of a better solution?

  • What about using the local scheduling universe so it runs on the Access Point?  The AP is a Docker container with only limited Lustre access, so this would be a bad option.
  • ANSWER: A dedicated node is a good solution given no preemption.

So on the node I would need to set something like the following

# High priority only jobs

HIGHPRIORITY = True

STARTD_ATTRS = $(STARTD_ATTRS) HIGHPRIORITY

START = ($(START)) && (TARGET.priority =?= "HIGHPRIORITY")
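
The ingestion submit descriptions would then need to carry the matching custom job attribute that the START expression above tests; a rough sketch, where the executable is a placeholder and "priority" is simply the attribute name used in that expression:

# Hypothetical fragment of an ingestion submit description
executable = ingest.sh
# Custom job ad attribute matched by the dedicated node's START expression
+priority = "HIGHPRIORITY"
queue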

Nov. 13, 2023 krowe: I need to implement this.  Make a node a HIGHPRIORITY node and have SSA put HIGHPRIORITY in the ingestion jobs.

2024-02-01 krowe: Talked to chausman today.  She thinks SSA will need this and that the host will need access to /lustre/evla like the aocngas-master and nmngas nodes do.  That might also mean a variable like HASEVLALUSTRE, either in addition to or instead of HIGHPRIORITY.


Priority for Glidein Nodes

We have a factory.sh script that glides in Slurm nodes to HTCondor as needed.  The problem is that HTCondor then seems to prefer these nodes to the regular HTCondor nodes, such that after a while there are several free regular HTCondor nodes while jobs are still running on the three glide-in nodes.  Is there a way to set a lower priority on glide-in nodes so that HTCondor only chooses them if the regular HTCondor nodes are all busy?  I am going to offline the glide-in nodes to see if that works, but that is a manual solution, not an automated one.

I would think NEGOTIATOR_PRE_JOB_RANK would be the trick, but we already set it to the following so that RANK expressions in submit description files are honored and negotiation prefers NMT nodes over DSOC nodes when possible.

NEGOTIATOR_PRE_JOB_RANK = (10000000 * Target.Rank) + (1000000 * (RemoteOwner =?= UNDEFINED)) - (100000 * Cpus) - Memory

ANSWER: NEGOTIATOR_PRE_JOB_RANK = (10000000 * Target.Rank) + (1000000 * (RemoteOwner =?= UNDEFINED)) - (100000 * Cpus) - Memory + (100000 * (site == "not-slurm"))
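
For that expression to do anything, both kinds of startds would have to advertise a site attribute; a hedged sketch of what the two configs could look like (the attribute name and values are assumptions taken from the expression above):

# On regular HTCondor nodes
site = "not-slurm"
STARTD_ATTRS = $(STARTD_ATTRS) site

# On glide-in (Slurm) nodes, e.g. set by the config that factory.sh deploys
site = "slurm"
STARTD_ATTRS = $(STARTD_ATTRS) site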


...

In progress

condor_remote_cluster

...