Open questions:
Sorry for the change in format, but I am finding the bullets in Confluence troublesome, so I am trying something else.
How can one see nodes that are entirely unclaimed?
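One possible approach, offered as a sketch rather than a confirmed answer: list every slot's machine and state (for example with `condor_status -af Machine State`) and keep only the machines where every slot reports Unclaimed. Grouping by machine may matter if partitionable slots are in use, since the parent slot can stay Unclaimed while its dynamic slots are Claimed. The host names and sample lines below are hypothetical.

```python
from collections import defaultdict


def fully_unclaimed(slot_lines):
    """Return sorted machine names whose slots are all in state Unclaimed."""
    states = defaultdict(set)
    for line in slot_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        machine, state = parts
        states[machine].add(state)
    # A machine qualifies only if Unclaimed is the one state seen on it.
    return sorted(m for m, seen in states.items() if seen == {"Unclaimed"})


# Hypothetical sample in the shape of "condor_status -af Machine State".
sample = [
    "node01 Unclaimed",
    "node01 Claimed",
    "node02 Unclaimed",
    "node03 Unclaimed",
    "node03 Unclaimed",
]

print(fully_unclaimed(sample))  # ['node02', 'node03']
```

In practice the input lines could be piped in from the command output instead of the hard-coded sample.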
I want a proper subset of machines to be for the HERA project. These machines will only run HERA jobs and HERA jobs will only run on these machines. This seems to work:
machine config | submit file |
---|---|
HERA = True | requirements = (HERA == True) |
STARTD_ATTRS = $(STARTD_ATTRS) HERA | +partition = "HERA" |
START = ($(START)) && (TARGET.partition =?= "HERA") | |
But is there a better way? I was hoping I could do something like this, but it seems to confuse condor, and the jobs run on any machine.
machine config | submit file |
---|---|
HERA = True | requirements = (HERA == True) |
STARTD_ATTRS = $(STARTD_ATTRS) HERA | HERA = True |
START = ($(START)) && (TARGET.HERA =?= True) | |
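A possible explanation, offered as a guess and not verified on this pool: unscoped attribute references in a ClassAd expression are looked up in the ad's own scope (MY) before the other ad (TARGET). If the job ad ends up carrying its own HERA attribute, then `HERA == True` in requirements would be satisfied by the job itself, on any machine. The toy Python model below only illustrates that lookup order. It is not HTCondor code, the ads are plain dicts, and whether the submit-file line really puts HERA into the job ad is an assumption.

```python
UNDEFINED = object()  # stand-in for the ClassAd "undefined" value


def lookup(name, my, target):
    """Resolve an unscoped attribute: the MY ad first, then the TARGET ad."""
    if name in my:
        return my[name]
    return target.get(name, UNDEFINED)


def job_requirements(job, machine):
    """Toy version of requirements = (HERA == True), evaluated from the job side.

    An undefined result is treated as "no match" here.
    """
    return lookup("HERA", job, machine) is True


hera_node = {"HERA": True}   # machine advertising HERA
other_node = {}              # machine without the attribute

tagged_job = {"HERA": True}  # assumes HERA ended up in the job ad
plain_job = {}               # job ad without HERA

print(job_requirements(tagged_job, other_node))  # True: the job matches itself
print(job_requirements(plain_job, other_node))   # False
print(job_requirements(plain_job, hera_node))    # True
```

If this is what is happening, writing TARGET.HERA in the requirements expression would make the machine-side attribute explicit.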
- Bug where James's jobs are all put on the same core. Here is top -u krowe showing the Last Used Cpu (SMP) after I submitted five sleep jobs to the same host.
  - Is this just a side effect of condor using cpuacct instead of cpuset in cgroup?
  - Is this a failure of the Linux kernel to schedule things on separate cores?
  - Is this because cpu.shares is set to 100 instead of 1024?
  - Check if CPU affinity is set in /proc/self/status
  - Is sleep cpu-intensive enough to properly test this? Perhaps submit a while 1 loop instead?
...
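The last two checks in the list above could be combined into a single test job. The following is a minimal sketch, assuming a Linux execute node with Python available: it prints the affinity the job sees and then spins in a tight loop so the process actually competes for a core. A sleeping process uses almost no CPU, so its last-used CPU in top may not say much about scheduling. The `BURN_SECONDS` value is an arbitrary choice for illustration.

```python
import os
import time

BURN_SECONDS = 1.0  # raise this to keep the job busy long enough to watch in top


def allowed_cpus(status_text):
    """Extract the Cpus_allowed_list value from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return line.split(":", 1)[1].strip()
    return None


def burn(seconds):
    """Spin in a tight loop for roughly `seconds`; return the iteration count."""
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        count += 1
    return count


if __name__ == "__main__":
    # /proc/self/status and sched_getaffinity are Linux-specific, so guard both.
    if os.path.exists("/proc/self/status"):
        with open("/proc/self/status") as fh:
            print("Cpus_allowed_list:", allowed_cpus(fh.read()))
    if hasattr(os, "sched_getaffinity"):
        print("sched_getaffinity:", sorted(os.sched_getaffinity(0)))
    burn(BURN_SECONDS)
```

Submitting five copies to the same host and watching them in top should show whether busy processes still pile onto one core.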