Definitions

cpuset: The set of cores on which the job is allowed to run. On a dual-processor machine running Linux, all the even-numbered cores are on one socket and all the odd-numbered cores are on the other socket.  E.g.

cpuset=0,2,4,6,8,10,12,14 # all the cores on one 8-core socket.
cpuset=0,1,2,3,4,5,6,7 # four cores on one socket and four on the other.
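
To make the definition concrete, here is a minimal Python sketch that expands a cpuset string and counts how many of its cores fall on each socket. It assumes the even/odd layout described above; the function names and the socket labels 0 and 1 are hypothetical, and the real layout of a given host should be confirmed (for example with lscpu) before relying on it.

```python
from collections import Counter


def parse_cpuset(cpuset: str) -> list[int]:
    """Expand a cpuset string such as "0,2,4" or "0-3,8" into core numbers."""
    cores = []
    for part in cpuset.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as "0-3" includes both endpoints.
            first, last = part.split("-")
            cores.extend(range(int(first), int(last) + 1))
        else:
            cores.append(int(part))
    return sorted(set(cores))


def cores_per_socket(cpuset: str) -> dict[int, int]:
    """Count cores per socket.

    ASSUMPTION: even-numbered cores are on one socket (labelled 0 here)
    and odd-numbered cores are on the other (labelled 1 here).
    """
    counts = Counter(core % 2 for core in parse_cpuset(cpuset))
    return {0: counts[0], 1: counts[1]}


print(cores_per_socket("0,2,4,6,8,10,12,14"))  # {0: 8, 1: 0}
print(cores_per_socket("0,1,2,3,4,5,6,7"))     # {0: 4, 1: 4}
```

The two calls correspond to the two example cpusets above: the first keeps all eight cores on one socket, the second splits them four and four.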

Conclusions

- casa-5 seems to produce the same image no matter what the cpuset is.
https://docs.google.com/spreadsheets/d/1aKCzeCOj1-50mC7I4fN2eMupPrfR6OH-6LN4jSK9LtQ/edit#gid=670565607

- casa-pipeline-release-5.6.1-8.el7 and casa-6.1.2-7-pipeline-2020.1.0.36 both use the same version of OpenMPI (1.10.4)
/home/casa/packages/RHEL7/release/casa-pipeline-release-5.6.1-8.el7/lib/mpi/bin/mpirun -version
/home/casa/packages/RHEL7/release/casa-6.1.2-7-pipeline-2020.1.0.36/lib/mpi/bin/mpirun -version

- With casa-6, the resulting image is dependent on the cpuset used.
https://docs.google.com/spreadsheets/d/1aKCzeCOj1-50mC7I4fN2eMupPrfR6OH-6LN4jSK9LtQ/edit#gid=2101591390
https://docs.google.com/spreadsheets/d/1aKCzeCOj1-50mC7I4fN2eMupPrfR6OH-6LN4jSK9LtQ/edit#gid=93665106

- When using 8 cores and mpicasa -n 9, casa-6 always produces the same image regardless of the cpuset.
https://docs.google.com/spreadsheets/d/1aKCzeCOj1-50mC7I4fN2eMupPrfR6OH-6LN4jSK9LtQ/edit#gid=1339676938

- jobs jr-batch.9 and jr-nmpost005b.2 show that -n 9 is the same as -n $machinefile when ppn is 9

- Running a manual job with access to all the cores (no cpuset) and -n 9 produces the same result as jr-nmpost005.55 (all 8 even cores),
though I only have a few data points.

- nmpost005, nmpost006, and nmpost072 produce the same images given the same input and using the same cpuset.

- The cores chosen by Torque don't seem to change for a given host, though I only have a few data points. If they did vary once in a
while, that could explain the occasional differences I saw in my end-to-end runs.
https://docs.google.com/spreadsheets/d/1aKCzeCOj1-50mC7I4fN2eMupPrfR6OH-6LN4jSK9LtQ/edit#gid=1234076945

- It seems that the specific cores chosen don't dictate the image created, but the number of cores on each socket does.

- It looks like the hardware doesn't really matter; the cpuset does.
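
The working hypothesis in the conclusions above (the per-socket core counts, not the specific cores, determine the image) can be written down as a small Python check. This is only a sketch of the hypothesis, not a verified rule: the function names are hypothetical, the even/odd socket layout is assumed from the Definitions section, and whether a mirrored split (e.g. 8+0 versus 0+8) gives the same image is not established by the data here, so the comparison below treats those as different.

```python
def socket_split(cores):
    """Return (even_count, odd_count) for a list of core numbers.

    ASSUMPTION: even-numbered cores share one socket and odd-numbered
    cores share the other, as described in the Definitions section.
    """
    even = sum(1 for core in cores if core % 2 == 0)
    return (even, len(cores) - even)


def same_image_expected(cpuset_a, cpuset_b):
    """Hypothesis only: equal per-socket core counts give the same image.

    Mirrored splits such as (8, 0) and (0, 8) are treated as different
    because the notes do not say whether they match.
    """
    return socket_split(cpuset_a) == socket_split(cpuset_b)


# Different cores, same split (4 on the even socket, 0 on the odd one).
print(same_image_expected([0, 2, 4, 6], [8, 10, 12, 14]))  # True
# Same number of cores, different split (8+0 versus 4+4).
print(same_image_expected([0, 2, 4, 6, 8, 10, 12, 14],
                          [0, 1, 2, 3, 4, 5, 6, 7]))       # False
```

Tagging each row of the result spreadsheets with its socket_split value would be one way to see whether images group by split as the hypothesis predicts.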


Questions

? Does the number of threads per process (ps -T <pid>) change with different cpusets?
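
One possible way to collect data for this question is to record the thread count and the allowed cores of each CASA process while a job runs. The Python sketch below does this on Linux by listing /proc/<pid>/task, which holds one entry per thread and should match what ps -T reports. The function names and the proc_root parameter are hypothetical and exist only for illustration.

```python
import os
from pathlib import Path


def thread_count(pid: int, proc_root: str = "/proc") -> int:
    """Count the threads of a process from <proc_root>/<pid>/task.

    On Linux, each thread of the process appears there as a directory
    named after its thread ID.
    """
    task_dir = Path(proc_root) / str(pid) / "task"
    return sum(1 for entry in task_dir.iterdir() if entry.name.isdigit())


def allowed_cores(pid: int) -> list[int]:
    """Return the cores the process is allowed to run on (Linux only)."""
    return sorted(os.sched_getaffinity(pid))


def describe(pid: int) -> str:
    """Format one line per process for comparison across cpusets."""
    return f"pid={pid} threads={thread_count(pid)} cores={allowed_cores(pid)}"
```

Calling describe(pid) for each mpicasa process under two different cpusets, and comparing the thread counts, would give a first answer.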


