...
- Torque has a command called pbsnodes that can not only offline/drain a node but also keep a note about it that everyone can see in one place. I know I can use condor_off to drain a node, but is there a central place to keep notes so I can remember a month later why I set a certain node to drain?
- ANSWER: there is no place to keep such notes.
- Consider using condor_drain instead of condor_off. condor_off kills the startd once all jobs finish, so the node no longer shows up in condor_status. condor_drain leaves the node visible in condor_status.
- Bug where James's jobs are all put on the same core. Here is top -u krowe showing the Last Used Cpu (SMP) after I submitted five sleep jobs to the same host.
- Is this just a side effect of condor using cpuacct instead of cpuset in cgroup?
- Is this a failure of the Linux kernel to schedule things on separate cores?
- Is this because cpu.shares is set to 100 instead of 1024?
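One way to narrow down the three questions above is to compare, for each job process, the CPU it last ran on with the set of CPUs it is allowed to run on. The following is a minimal sketch, assuming a Linux host with /proc mounted. The function names (`last_cpu`, `allowed_cpus`, `cgroup_membership`, `report`) are hypothetical helpers written for this note and are not part of HTCondor.

```python
import os


def last_cpu(pid):
    """Return the CPU the process last ran on (field 39 of /proc/<pid>/stat)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # Field 2 (the command name) is wrapped in parentheses and may contain
    # spaces, so only split the text that follows the final ')'.
    fields = data[data.rfind(")") + 2:].split()
    # fields[0] is field 3 (state), so field 39 sits at index 36.
    return int(fields[36])


def allowed_cpus(pid):
    """Return the sorted list of CPUs the process may be scheduled on."""
    return sorted(os.sched_getaffinity(pid))


def cgroup_membership(pid):
    """Return the raw lines of /proc/<pid>/cgroup, or [] if unavailable."""
    try:
        with open(f"/proc/{pid}/cgroup") as f:
            return [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        return []


def report(pid):
    """Collect the placement facts for one process into a dict."""
    return {
        "pid": pid,
        "last_cpu": last_cpu(pid),
        "allowed_cpus": allowed_cpus(pid),
        "cgroups": cgroup_membership(pid),
    }


if __name__ == "__main__":
    # Demonstrated on the current process; in practice, pass the PIDs
    # of the five sleep jobs instead.
    print(report(os.getpid()))
```

If `allowed_cpus` lists only one CPU for every job, the placement comes from an affinity or cpuset restriction. If it lists many CPUs and the jobs still share one `last_cpu`, the kernel scheduler is probably just leaving them together, which is plausible for sleep jobs that almost never run. Note that cpu.shares is a relative weight for CPU time under contention and is not expected to pick a core by itself.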
...