Definitions
cpuset: The set of cores on which the job is allowed to run. On a dual-processor machine running Linux, all the even-numbered cores are on one socket and all the odd-numbered cores are on the other socket. E.g.
cpuset=0,2,4,6,8,10,12,14 # all the cores on one 8-core socket.
cpuset=0,1,2,3,4,5,6,7 # four cores on one socket and four on the other.
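To make the socket split of a given cpuset easy to check, here is a minimal Python sketch. It assumes the even/odd core numbering described above holds on the host; on other machines the layout may differ. The function names (parse_cpuset, cores_per_socket, current_cpuset) are hypothetical helpers, not part of CASA or Torque. Range syntax such as "0-3" is accepted as a convenience because the Linux cpuset list format allows it.

```python
import os


def parse_cpuset(spec):
    """Parse a cpuset string such as "0,2,4" or "0-3,8" into a sorted list of core ids."""
    cores = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cores.update(range(int(lo), int(hi) + 1))
        else:
            cores.add(int(part))
    return sorted(cores)


def cores_per_socket(spec):
    """Count cores per socket, ASSUMING even ids are one socket and odd ids the other."""
    cores = parse_cpuset(spec)
    even = sum(1 for c in cores if c % 2 == 0)
    return {"even": even, "odd": len(cores) - even}


def current_cpuset():
    """Cores the current process may run on (Linux only); None on other platforms."""
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))
    return None


print(cores_per_socket("0,2,4,6,8,10,12,14"))  # {'even': 8, 'odd': 0}
print(cores_per_socket("0,1,2,3,4,5,6,7"))     # {'even': 4, 'odd': 4}
```

Calling current_cpuset() from inside a batch job should show which cores the scheduler actually handed out, which can then be passed through cores_per_socket for comparison between runs.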
Conclusions
- casa-5 seems to produce the same image no matter what the cpuset is.
https://docs.google.com/spreadsheets/d/1aKCzeCOj1-50mC7I4fN2eMupPrfR6OH-6LN4jSK9LtQ/edit#gid=670565607
- casa-pipeline-release-5.6.1-8.el7 and casa-6.1.2-7-pipeline-2020.1.0.36 both use the same version of OpenMPI (1.10.4)
/home/casa/packages/RHEL7/release/casa-pipeline-release-5.6.1-8.el7/lib/mpi/bin/mpirun -version
/home/casa/packages/RHEL7/release/casa-6.1.2-7-pipeline-2020.1.0.36/lib/mpi/bin/mpirun -version
- With casa-6, the resulting image is dependent on the cpuset used.
https://docs.google.com/spreadsheets/d/1aKCzeCOj1-50mC7I4fN2eMupPrfR6OH-6LN4jSK9LtQ/edit#gid=2101591390
https://docs.google.com/spreadsheets/d/1aKCzeCOj1-50mC7I4fN2eMupPrfR6OH-6LN4jSK9LtQ/edit#gid=93665106
- When using 8 cores and mpicasa -n 9, casa-6 always produces the
same image regardless of the cpuset.
https://docs.google.com/spreadsheets/d/1aKCzeCOj1-50mC7I4fN2eMupPrfR6OH-6LN4jSK9LtQ/edit#gid=1339676938
- jobs jr-batch.9 and jr-nmpost005b.2 show that -n 9 is the same as
-n $machinefile when ppn is 9
- Running a manual job with access to all the cores (no cpuset) and
-n 9 produces the same result as jr-nmpost005.55 (all 8 even cores),
though I only have a few data points.
- nmpost005, nmpost006, and nmpost072 produce the same images given
the same input and using the same cpuset.
- The cores chosen by Torque don't seem to change for a given host,
though I only have a few data points. If they did vary occasionally,
that could explain the occasional differences I saw in my
end-to-end runs.
https://docs.google.com/spreadsheets/d/1aKCzeCOj1-50mC7I4fN2eMupPrfR6OH-6LN4jSK9LtQ/edit#gid=1234076945
- It seems that the specific cores chosen don't dictate the image
created, but the number of cores on each socket does.
- It looks like the hardware doesn't really matter. It's the cpuset.
? Does the number of threads per process (ps -T <pid>) change with
different cpusets?
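One way to approach the open question about threads per process is to record the thread count alongside the allowed cores for each worker process. The sketch below is Linux-only and reads /proc directly rather than parsing ps -T output; counting the entries in /proc/<pid>/task should give the same number of threads that ps -T lists. The helper names thread_count and allowed_cpus are hypothetical, and nothing here is specific to CASA.

```python
import os


def thread_count(pid):
    """Number of threads of `pid`, counted from /proc/<pid>/task (Linux only)."""
    task_dir = "/proc/{}/task".format(pid)
    try:
        return len(os.listdir(task_dir))
    except OSError:
        return None  # no /proc on this system, or the process has exited


def allowed_cpus(pid):
    """Value of Cpus_allowed_list from /proc/<pid>/status, or None if unavailable."""
    try:
        with open("/proc/{}/status".format(pid)) as fh:
            for line in fh:
                if line.startswith("Cpus_allowed_list:"):
                    return line.split(":", 1)[1].strip()
    except OSError:
        pass
    return None


if __name__ == "__main__":
    # Demonstrated on this script's own pid; for a real run, pass the pids
    # of the processes started by mpicasa instead.
    pid = os.getpid()
    print(pid, thread_count(pid), allowed_cpus(pid))
```

Logging both values for the same job under different cpusets would show whether the thread count changes with the cpuset.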