...

  • Torque has a command, pbsnodes, that can not only offline or drain a node but also keep a note about it that everyone can see in one place.  I know I can use condor_off to drain a node, but is there a central place to keep notes so I can remember a month later why I set a certain node to drain?
    • ANSWER: there is no place to keep such notes.
    • Consider using condor_drain instead of condor_off.  condor_off kills the startd once all jobs finish, so the node no longer shows up in condor_status.  condor_drain leaves the node in condor_status.


  • Bug: James's jobs are all placed on the same core.  Here is top -u krowe showing the Last Used Cpu (SMP) column after I submitted five sleep jobs to the same host.
    • Is this just a side effect of condor using cpuacct instead of cpuset in cgroup?
    • Is this a failure of the Linux kernel to schedule things on separate cores?
    • Is this because cpu.shares is set to 100 instead of 1024?
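
One way to gather evidence for the questions above without watching top interactively is to read the same data directly from /proc. The sketch below is only an illustration, not something from the original notes: the function names last_cpu_from_stat and report are hypothetical, and it assumes a Linux host where field 39 of /proc/<pid>/stat is the processor the task last ran on, as described in proc(5). It also prints the CPU affinity mask, which may help tell a restricted set of allowed CPUs apart from a kernel scheduling choice.

```python
import os


def last_cpu_from_stat(stat_line: str) -> int:
    """Return the CPU a task last ran on, parsed from a /proc/<pid>/stat line."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')', so split on the last ')' before splitting on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); "processor" is field 39 per proc(5).
    return int(rest[39 - 3])


def report(pids):
    """Print the last-used CPU and the allowed CPU set for each PID (Linux only)."""
    for pid in pids:
        with open(f"/proc/{pid}/stat") as f:
            cpu = last_cpu_from_stat(f.read())
        # sched_getaffinity returns the set of CPUs the task may run on.
        allowed = sorted(os.sched_getaffinity(pid))
        print(f"pid={pid} last_cpu={cpu} allowed_cpus={allowed}")
```

Calling report() with the PIDs of the five sleep jobs would show whether they share a last-used CPU and whether their affinity masks cover more than one core. A single allowed CPU would point at affinity or cpuset settings; a wide mask with one shared last-used CPU would point elsewhere.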

...