...
Comparing CASA-5 and CASA-6 (casa-pipeline-validation-8) across the two CPU types available for batch processing in NM and CV shows that the newer CPUs (E5-2640v3) run a small calibration job (6.7GB) about 1.25 times faster than the older CPUs (E5-2670), with CASA-6 slower in every case. There was no significant run-time difference between NM and CV for similar hardware and software. Results are in minutes.
Here is the full pipeline script I used for all of these tests: casa_pipescript.py. For some tests, I commented out everything except hifv_importdata.
...
"*" Means it completed with tclean() errors
...
Full, new, serial pipeline with large dataset
RHEL7 - 350GB dataset with NM Lustre-2.5.5 (results are in minutes)
...
Running just hifv_importdata() on a larger dataset (350GB) shows that nmpost nodes run about 2% to 10% faster than similar cvpost nodes, with CASA-6 slower in every case.
RHEL7 - 350GB dataset with NM Lustre-2.10.8 (results are in minutes)
...
Mar. 17, 2020: I started using the same pipeline script that Brian is currently using.
...
Full, serial pipeline with large dataset
RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 or CASA 6.0.0.23a100.dev17 (results are in minutes)
CASA | NM (E5-2640v3) | CV (E5-2640v3) | NM (E5-2670) | CV (E5-2670) |
---|---|---|---|---|
5 | 3,045^350*^ | 3,011^362*^3 | 4,431^605*^ | 34,401^480*^ |
6 | 34,640016* | 3,466943* | 45,511671*4,392 | 5,253* |
"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"
"^" Means "SEVERE setjy No rows were selected"
Full, new, serial pipeline with large dataset and profiling metrics
Mar. 17, 2020: I started using the same pipeline script that Brian is currently using.
RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 or CASA 6.0.0.23a100.dev17 (results are in minutes)
CASA | NM (E5-2640v3) | CV (E5-2640v3) | NM (E5-2670) | CV (E5-2670) | |||||
---|---|---|---|---|---|---|---|---|---|
54 | 3,326*^ | 4, | 3,397^ | 3,410^ | 6 | 3,527 | 4,442485*^ | ||
6 | 4,172* | 5,572* |
"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"
"^" Means "SEVERE setjy No rows were selected"
...
...
...
...
Full, new serial pipeline with large dataset and times per pipeline task
The table below compares two profiling jobs against one of Brian's jobs (/lustre/aoc/sciops/bkent/pipetest/llama3/workingtest60_2) on the same hardware (E5-2670) in NM. Times were calculated from the CASA logs and are in minutes.
Large dataset (350GB) times are in minutes | CASA-5.6.3-9, Pipeline 43128 | CASA-6.0.0.23-pipeline-validation-17, Pipeline master-v0.1-145-ge322387-dirty | CASA-6.0.0.23-pipeline-validation-17, Pipeline master-v0.1-18-g2de4d78-dirty | CASA-6.0.0.23-pipeline-validation-17, Pipeline master-v0.1-18-g2de4d78-dirty |
Task | kent2-pr-c5-l-70 | kent2-pr-c6-l-70 | kent3b-no-c6-l-70 | CASA-6 Bkent |
hifv_importdata | 247 | 425 | 403 | 392 |
hifv_hanning | 175 | 188 | 334 | 460 |
hifv_flagdata | 272 | 323 | 374 | 452 |
hifv_vlasetjy | 75 | 199 | 255 | 357 |
hifv_priorcals | 254 | 281 | 539 | 494 |
hifv_testBPdcals | 74 | 84 | 98 | 123 |
hifv_flagbaddef | 0 | 1 | 0 | 0 |
hifv_checkflag | 68 | 70 | 69 | 69 |
hifv_semiFinalBPdcals | 75 | 153 | 154 | 154 |
hifv_checkflag | 189 | 254 | 250 | 253 |
hifv_solint | 66 | 89 | 105 | 105 |
hifv_fluxboot2 | 104 | 181 | 185 | 175 |
hifv_finalcals | 162 | 182 | 177 | 177 |
hifv_circfeedpolcal | 31 | 33 | 32 | 32 |
hifv_flagcal | 0 | 1 | 0 | 0 |
hifv_applycals | 205 | 212 | 358 | 437 |
hifv_checkflag | 1741 | 1840 | 2388 | 2930 |
hifv_statwt | 645 | 710 | 812 | 500 |
hifv_plotsummary | 101 | 346 | 350 | 350 |
TOTAL (minutes) | 4484 | 5573 | 6884 | 7460 |
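The per-task slowdown of CASA-6 relative to CASA-5 can be read off the table above by dividing the kent2-pr-c6-l-70 column by the kent2-pr-c5-l-70 column. The following is a minimal sketch using the minute values copied from that table. The names `TIMES`, `slowdowns`, and `total_ratio` are hypothetical, and skipping tasks that took 0 minutes under CASA-5 is an assumption made here to avoid dividing by zero; none of this is part of the original analysis.

```python
# Per-task run times in minutes, copied from the table above:
# (task, kent2-pr-c5-l-70 [CASA-5], kent2-pr-c6-l-70 [CASA-6]).
# hifv_checkflag runs three times, so a list of tuples is used instead of a dict.
TIMES = [
    ("hifv_importdata", 247, 425),
    ("hifv_hanning", 175, 188),
    ("hifv_flagdata", 272, 323),
    ("hifv_vlasetjy", 75, 199),
    ("hifv_priorcals", 254, 281),
    ("hifv_testBPdcals", 74, 84),
    ("hifv_flagbaddef", 0, 1),
    ("hifv_checkflag", 68, 70),
    ("hifv_semiFinalBPdcals", 75, 153),
    ("hifv_checkflag", 189, 254),
    ("hifv_solint", 66, 89),
    ("hifv_fluxboot2", 104, 181),
    ("hifv_finalcals", 162, 182),
    ("hifv_circfeedpolcal", 31, 33),
    ("hifv_flagcal", 0, 1),
    ("hifv_applycals", 205, 212),
    ("hifv_checkflag", 1741, 1840),
    ("hifv_statwt", 645, 710),
    ("hifv_plotsummary", 101, 346),
]


def slowdowns(times):
    """Return (task, CASA-6 / CASA-5) pairs, skipping tasks with a 0-minute CASA-5 time."""
    return [(task, c6 / c5) for task, c5, c6 in times if c5 > 0]


def total_ratio(times):
    """Return the ratio of summed CASA-6 minutes to summed CASA-5 minutes."""
    c5_total = sum(c5 for _, c5, _ in times)
    c6_total = sum(c6 for _, _, c6 in times)
    return c6_total / c5_total


if __name__ == "__main__":
    # Print tasks from largest to smallest slowdown, then the overall ratio.
    for task, ratio in sorted(slowdowns(TIMES), key=lambda p: p[1], reverse=True):
        print(f"{task:24s} {ratio:5.2f}x")
    print(f"{'overall':24s} {total_ratio(TIMES):5.2f}x")
```

The rounded CASA-6 values sum to 5572, one minute less than the TOTAL row, which presumably reflects totals computed from the logs before rounding to whole minutes.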
K. Scott finished three runs, separated by about an hour each, on Apr. 8, 2020 using Brian's large dataset (350GB), CASA-6.0.0.23-pipeline-validation-17, and Pipeline master-v0.1-18-g2de4d78-dirty. Each job requested 1 node with 8 cores and 96GB of memory, which is essentially one NUMA node. system.resources.memory was unset and _cf.validate_parameters = False. (Times are in minutes.)
Task | kent3a-no-c6-l-70 | kent3b-no-c6-l-70 | kent3c-no-c6-l-70 |
---|---|---|---|
hifv_importdata | 410 | 403 | 407 |
hifv_hanning | 364 | 334 | 359 |
hifv_flagdata | 381 | 374 | 386 |
hifv_vlasetjy | 263 | 255 | 256 |
hifv_priorcals | 513 | 539 | 511 |
hifv_testBPdcals | 97 | 98 | 98 |
hifv_flagbaddef | 0 | 0 | 0 |
hifv_checkflag | 68 | 69 | 68 |
hifv_semiFinalBPdcals | 153 | 154 | 152 |
hifv_checkflag | 251 | 250 | 250 |
hifv_solint | 105 | 105 | 106 |
hifv_fluxboot2 | 174 | 185 | 174 |
hifv_finalcals | 178 | 177 | 180 |
hifv_circfeedpolcal | 31 | 32 | 31 |
hifv_flagcal | 0 | 0 | 0 |
hifv_applycals | 353 | 358 | 366 |
hifv_checkflag | 2501 | 2388 | 2302 |
hifv_statwt | 832 | 812 | 806 |
hifv_plotsummary | 348 | 350 | 345 |
TOTAL (minutes) | 7023 | 6884 | 6799 |
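To put the run-to-run variation of these three identical jobs in numbers, one option is to compute the spread (max minus min) relative to the mean for a few rows of the table above. This is only a sketch: the selection of rows, the label "hifv_checkflag (final)" for the last and longest checkflag call, and the names `RUNS` and `relative_spread` are assumptions made for illustration.

```python
from statistics import mean

# Minutes copied from the table above, in run order (kent3a, kent3b, kent3c).
# "hifv_checkflag (final)" is a hypothetical label for the last checkflag row.
RUNS = {
    "hifv_hanning": (364, 334, 359),
    "hifv_checkflag (final)": (2501, 2388, 2302),
    "hifv_statwt": (832, 812, 806),
    "TOTAL": (7023, 6884, 6799),
}


def relative_spread(values):
    """Return (max - min) / mean, a simple measure of run-to-run variation."""
    return (max(values) - min(values)) / mean(values)


if __name__ == "__main__":
    for name, values in RUNS.items():
        print(f"{name:24s} mean={mean(values):7.1f} min  spread={relative_spread(values):6.1%}")
```

With three runs per task this is only a rough indicator; a standard deviation from so few samples would not be much more informative.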
...
...
Current Pipeline Script
Mar. 17, 2020: I started using the same pipeline script that Brian is currently using.
Full, new, serial pipeline with small dataset
RHEL7 - 6.7GB dataset with NM Lustre-2.10.x (results are in minutes)
I tested a CASA-6 job with and without cf.validate_parameters = False, and both jobs took the same amount of time, +/- 1 minute.
...
"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"
Full, new, serial pipeline with large dataset
RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 or CASA 6.0.0.23a100.dev17 (results are in minutes)