...
- It looks like the hardware doesn't really matter. What matters is the cpuset.
Questions
? QUESTION: Does the number of threads per process (ps -T <pid>) change with different cpusets?
QUESTION: Check whether nodescheduler gives a whole NUMA node. Also test whether nodescheduler + mpicasa -n 8 gives the same image as the good -n 8 images (i.e. 8-0, not 6-2 or 5-3).
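One way to approach the thread-count question is to record, for each running process, both its thread count and the CPUs it is allowed to use, then compare across cpusets. The following is a minimal sketch, assuming a Linux host with /proc mounted. The helper names `thread_count` and `allowed_cpus` are hypothetical and not part of casa or nodescheduler. Counting the entries in /proc/<pid>/task should match the number of thread rows that ps -T <pid> lists.

```python
import os


def thread_count(pid):
    """Count the threads of a process by listing /proc/<pid>/task (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/task"))


def allowed_cpus(pid):
    """Return the Cpus_allowed_list field from /proc/<pid>/status, or None if absent."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("Cpus_allowed_list:"):
                return line.split(":", 1)[1].strip()
    return None


if __name__ == "__main__":
    # Hypothetical usage: inspect the current process; substitute a casa pid as needed.
    pid = os.getpid()
    print(pid, thread_count(pid), allowed_cpus(pid))
```

Running this against the same casa process under different cpusets would show whether the thread count changes with the cpuset.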
...
- nmpost011/8-15: cpuset.cpus: 1,5,7,11,13,17,19,23 cpuset.mems: 1 (dual 12core sockets)
- nmpost013/0-7: cpuset.cpus: 0,4,6,10,12,16,18,22 cpuset.mems: 0 (dual 12core sockets)
- nmpost021/0-7: cpuset.cpus: 0,2,4,6,8,10,12,14 cpuset.mems: 0 (dual 16core sockets)
- nmpost033/0-7: cpuset.cpus: 0,2,4,6,8,10,12,14 cpuset.mems: 0 (dual 16core sockets)
- nmpost033/8-15: cpuset.cpus: 1,3,5,7,9,11,13,15 cpuset.mems: 1 (dual 16core sockets)
- nmpost036/0-7: cpuset.cpus: 0,2,6,8,10,12,16,18 cpuset.mems: 0 (dual 20core sockets)
- nmpost036/8-15: cpuset.cpus: 1,3,7,9,11,13,17,19 cpuset.mems: 1 (dual 20core sockets)
- nmpost060/0-7: cpuset.cpus: 0,2 cpuset.mems: 0 (dual 16core sockets) Why is this cpuset only 0,2? The Torque L_Request = -L tasks=1:lprocs=8:memory=92gb:place=numanode looks like nodescheduler, but cpuset_string = nmpost060:0,2.
? QUESTION: Running jobs with nodescheduler
ANSWER: Using nodescheduler (which provides you with 8 cores) to reserve a node and then manually running casa with either -n 8 or -n 9 produces images that are pixel-identical to what you would get with a hand-crafted cpuset of 8 cores on the same socket and using -n 8 or -n 9. In other words, if you have been using nodescheduler to reserve nodes, I don't think your casa images are suspect.
? QUESTION: Test with 4-way parallelization whether a 4-0, 0-4, 2-2, 1-3, or 3-1 distribution impacts the resulting image using -n 4. Also try -n 5. cpuset=0,2,4,6 produces a different image than cpuset=1,2,4,6.
ANSWER:
- Using 4 cores and -n 4: 4-0 and 0-4 produce a different image than 2-2, 1-3, and 3-1.
- Using 4 cores and -n 5: all permutations tested (4-0, 0-4, 2-2, 1-3, 3-1) produce the same image.
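To make the distribution labels (4-0, 3-1, 8-0, and so on) easy to derive from a cpuset.cpus string, here is a minimal sketch. It assumes that even-numbered CPUs belong to NUMA node 0 and odd-numbered CPUs to NUMA node 1, which matches the cpuset.cpus / cpuset.mems pairs listed above but is not guaranteed on other hosts. It also assumes the label is written as node 0 count first, then node 1 count. The function names `parse_cpuset` and `socket_split` are hypothetical. On a real node, the authoritative mapping is in /sys/devices/system/node/node*/cpulist.

```python
def parse_cpuset(spec):
    """Expand a cpuset.cpus string such as "0,2,4-6" into a sorted list of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range like "4-6" covers both endpoints.
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def socket_split(spec):
    """Label a cpuset as "<node0 count>-<node1 count>".

    Assumption: even CPU ids are on NUMA node 0, odd CPU ids on NUMA node 1.
    """
    cpus = parse_cpuset(spec)
    node0 = sum(1 for cpu in cpus if cpu % 2 == 0)
    node1 = len(cpus) - node0
    return f"{node0}-{node1}"


if __name__ == "__main__":
    for spec in ("0,2,4,6", "1,2,4,6", "1,5,7,11,13,17,19,23"):
        print(spec, "->", socket_split(spec))
```

Under these assumptions, cpuset=0,2,4,6 is labelled 4-0 and cpuset=1,2,4,6 is labelled 3-1, which is consistent with those two cpusets producing different images with -n 4.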