Comparing CASA-5 and CASA-6 (casa-pipeline-validation-8) across the two CPU types available for batch processing in NM and CV shows that the newer CPUs (E5-2640v3) run a small calibration job (6.7GB) about 1.25 times faster than the older CPUs (E5-2670), with CASA-6 slower in every case. There was no significant run-time difference between NM and CV for similar hardware and software. Results are in minutes.

Here is the full pipeline script I have used for all of these tests: casa_pipescript.py. For some tests, I commented out all but hifv_importdata.

...

"*" Means it completed with tclean() errors

...


Full, new, serial pipeline with large dataset

RHEL7 - 350GB dataset with NM Lustre-2.5.5 (results are in minutes)

...

Running just hifv_importdata() on a larger dataset (350GB) shows that nmpost nodes run about 2% to 10% faster than similar cvpost nodes, with CASA-6 slower in every case.

Mar. 17, 2020 I started using the same pipeline script that Brian is currently using.

RHEL7 - 350GB dataset with NM Lustre-2.10.8 (results are in minutes)

...

Full, serial pipeline with large dataset

RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 or CASA 6.0.0.23a100.dev17 (results are in minutes)

CASANM (E5-2640v3)CV (E5-2640v3)NM (E5-2670)CV (E5-2670)
53,045^350*^3,011^362*^34,431^605*^34,401^480*^
634,640016*3,466943*45,511671*4,3925,253*

"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"

"^" Means "SEVERE setjy No rows were selected"


Full, new, serial pipeline with large dataset and profiling metrics

Mar. 17, 2020 I started using the same pipeline script that Brian is currently using.

RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 or CASA 6.0.0.23a100.dev17 (results are in minutes)

4,442
CASANM (E5-2640v3)CV (E5-2640v3)NM (E5-2670)CV (E5-2670)
54

3,326*^

Image Added

,095^, 2,645^


4,

559^
3,397^3,410^63,527

485*^

Image Added



6

4,172*

Image Added


5,572*

Image Added


"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"

"^" Means "SEVERE setjy No rows were selected"


Current Pipeline Script

Mar. 17, 2020 I started using the same pipeline script that Brian is currently using.

Full, new, serial pipeline with small dataset

RHEL7 - 6.7GB dataset with NM Lustre-2.10.x (results are in minutes). I tested a CASA-6 job with and without cf.validate_parameters = False, and both jobs took the same amount of time, +/- 1 minute.

...

"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"

Full, new, serial pipeline with large dataset

RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 or CASA 6.0.0.23a100.dev17 (results are in minutes)

...

"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"

"^" Means "SEVERE setjy No rows were selected"

Full, new, serial pipeline with large dataset and profiling metrics

RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 or CASA 6.0.0.23a100.dev17 (results are in minutes)

...

3,326*^


4,485*^


...

4,172*


"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"

...


Full, new serial pipeline with large dataset and times per pipeline task

Comparing two profiling jobs against one of Brian's jobs (/lustre/aoc/sciops/bkent/pipetest/llama3/workingtest60_2) on the same hardware (E5-2670) in NM.  Times were calculated from the CASA logs.  Times are in minutes.
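As a rough illustration of how per-task times could be pulled from a CASA log, the sketch below pairs "Begin Task" and "End Task" lines and converts the elapsed time to minutes. The log line layout (a leading "YYYY-MM-DD HH:MM:SS" timestamp followed by a "Begin Task: name" or "End Task: name" marker), the function name task_minutes, and the sample lines are all assumptions made for illustration; this is not necessarily how the numbers on this page were produced.

```python
import re
from datetime import datetime

# Assumed line shape: "<YYYY-MM-DD HH:MM:SS> ... Begin Task: <name> ..."
# (and the same with "End Task"). Adjust the pattern to the real log format.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s.*\b(Begin|End) Task: (\w+)"
)


def task_minutes(log_lines, prefix="hifv_"):
    """Return [(task, minutes), ...] in the order the tasks finished.

    A list is used instead of a dict because some tasks
    (for example hifv_checkflag) run more than once.
    """
    started = {}
    durations = []
    for line in log_lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        kind, task = match.group(2), match.group(3)
        if not task.startswith(prefix):
            continue  # skip nested non-pipeline tasks
        if kind == "Begin":
            started[task] = stamp
        elif task in started:
            elapsed = stamp - started.pop(task)
            durations.append((task, elapsed.total_seconds() / 60.0))
    return durations


# Hypothetical sample lines, only to show the call.
SAMPLE_LOG = [
    "2020-04-08 10:00:00\tINFO\thifv_importdata::::casa\t##### Begin Task: hifv_importdata #####",
    "2020-04-08 10:05:00\tINFO\timportasdm::::casa\t##### Begin Task: importasdm #####",
    "2020-04-08 14:07:00\tINFO\thifv_importdata::::casa\t##### End Task: hifv_importdata #####",
]

for task, minutes in task_minutes(SAMPLE_LOG):
    print(f"{task}: {minutes:.0f} min")
```

Filtering on the hifv_ prefix keeps only the pipeline-level tasks, so nested CASA tasks called from inside a pipeline stage are not counted separately.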

Large dataset (350GB) times are in minutes

Software used for each column:

kent2-pr-c5-l-70: CASA-5.6.3-9, Pipeline 43128
kent2-pr-c6-l-70: CASA-6.0.0.23-pipeline-validation-17, Pipeline master-v0.1-145-ge322387-dirty
kent3b-no-c6-l-70: CASA-6.0.0.23-pipeline-validation-17, Pipeline master-v0.1-18-g2de4d78-dirty
CASA-6 Bkent: CASA-6.0.0.23-pipeline-validation-17, Pipeline master-v0.1-18-g2de4d78-dirty

Task | kent2-pr-c5-l-70 | kent2-pr-c6-l-70 | kent3b-no-c6-l-70 | CASA-6 Bkent
hifv_importdata | 247 | 425 | 403 | 392
hifv_hanning | 175 | 188 | 334 | 460
hifv_flagdata | 272 | 323 | 374 | 452
hifv_vlasetjy | 75 | 199 | 255 | 357
hifv_priorcals | 254 | 281 | 539 | 494
hifv_testBPdcals | 74 | 84 | 98 | 123
hifv_flagbaddef | 0 | 1 | 0 | 0
hifv_checkflag | 68 | 70 | 69 | 69
hifv_semiFinalBPdcals | 75 | 153 | 154 | 154
hifv_checkflag | 189 | 254 | 250 | 253
hifv_solint | 66 | 89 | 105 | 105
hifv_fluxboot2 | 104 | 181 | 185 | 175
hifv_finalcals | 162 | 182 | 177 | 177
hifv_circfeedpolcal | 31 | 33 | 32 | 32
hifv_flagcal | 0 | 1 | 0 | 0
hifv_applycals | 205 | 212 | 358 | 437
hifv_checkflag | 1741 | 1840 | 2388 | 2930
hifv_statwt | 645 | 710 | 812 | 500
hifv_plotsummary | 101 | 346 | 350 | 350
TOTAL (minutes) | 4484 | 5573 | 6884 | 7460
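To see which tasks account for most of the CASA-6 slowdown, one option is to compute the ratio of CASA-6 to CASA-5 time per task. The sketch below does this for the two profiling columns of the table above (kent2-pr-c5-l-70 and kent2-pr-c6-l-70). The function name slowdown and the 5-minute floor for skipping near-zero tasks are illustrative choices, not part of the original analysis.

```python
# (task, CASA-5 minutes, CASA-6 minutes), copied from the
# kent2-pr-c5-l-70 and kent2-pr-c6-l-70 columns of the table above.
TIMES = [
    ("hifv_importdata", 247, 425),
    ("hifv_hanning", 175, 188),
    ("hifv_flagdata", 272, 323),
    ("hifv_vlasetjy", 75, 199),
    ("hifv_priorcals", 254, 281),
    ("hifv_testBPdcals", 74, 84),
    ("hifv_flagbaddef", 0, 1),
    ("hifv_checkflag", 68, 70),
    ("hifv_semiFinalBPdcals", 75, 153),
    ("hifv_checkflag", 189, 254),
    ("hifv_solint", 66, 89),
    ("hifv_fluxboot2", 104, 181),
    ("hifv_finalcals", 162, 182),
    ("hifv_circfeedpolcal", 31, 33),
    ("hifv_flagcal", 0, 1),
    ("hifv_applycals", 205, 212),
    ("hifv_checkflag", 1741, 1840),
    ("hifv_statwt", 645, 710),
    ("hifv_plotsummary", 101, 346),
]
TOTAL_C5, TOTAL_C6 = 4484, 5573  # TOTAL row of the same table


def slowdown(rows, floor=5):
    """Return (task, CASA-6 / CASA-5 ratio), largest ratio first.

    Tasks shorter than `floor` minutes under CASA-5 are skipped,
    since their ratios are dominated by rounding.
    """
    ratios = [(task, c6 / c5) for task, c5, c6 in rows if c5 >= floor]
    return sorted(ratios, key=lambda item: item[1], reverse=True)


ranked = slowdown(TIMES)
for task, ratio in ranked[:5]:
    print(f"{task:24s} {ratio:.2f}x")
print(f"{'TOTAL':24s} {TOTAL_C6 / TOTAL_C5:.2f}x")
```

For these two columns the overall ratio is about 1.24, and hifv_plotsummary, hifv_vlasetjy, and hifv_semiFinalBPdcals show the largest relative increases.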




K. Scott finished three runs on Apr. 8, 2020 using Brian's large dataset (350GB), CASA-6.0.0.23-pipeline-validation-17, and Pipeline master-v0.1-18-g2de4d78-dirty, separated by about an hour each. Each job requested 1 node with 8 cores and 96GB of memory, essentially one NUMA node. system.resources.memory was unset and _cf.validate_parameters = False. (Times are in minutes.)

Task | kent3a-no-c6-l-70 | kent3b-no-c6-l-70 | kent3c-no-c6-l-70
hifv_importdata | 410 | 403 | 407
hifv_hanning | 364 | 334 | 359
hifv_flagdata | 381 | 374 | 386
hifv_vlasetjy | 263 | 255 | 256
hifv_priorcals | 513 | 539 | 511
hifv_testBPdcals | 97 | 98 | 98
hifv_flagbaddef | 0 | 0 | 0
hifv_checkflag | 68 | 69 | 68
hifv_semiFinalBPdcals | 153 | 154 | 152
hifv_checkflag | 251 | 250 | 250
hifv_solint | 105 | 105 | 106
hifv_fluxboot2 | 174 | 185 | 174
hifv_finalcals | 178 | 177 | 180
hifv_circfeedpolcal | 31 | 32 | 31
hifv_flagcal | 0 | 0 | 0
hifv_applycals | 353 | 358 | 366
hifv_checkflag | 2501 | 2388 | 2302
hifv_statwt | 832 | 812 | 806
hifv_plotsummary | 348 | 350 | 345
TOTAL (minutes) | 7023 | 6884 | 6799
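The three runs above give a feel for run-to-run variation on identical hardware and software. A minimal sketch for quantifying it, using only the TOTAL row and the final hifv_checkflag row from the table, is shown below. The helper name spread is hypothetical, and "range as a percentage of the mean" is just one simple choice of metric.

```python
from statistics import mean


def spread(values):
    """Return (mean, max - min, range as a percentage of the mean)."""
    avg = mean(values)
    rng = max(values) - min(values)
    return avg, rng, rng / avg * 100


# Minutes for kent3a, kent3b, kent3c, copied from the table above.
RUNS = {
    "TOTAL": [7023, 6884, 6799],
    "hifv_checkflag (last)": [2501, 2388, 2302],
}

for name, minutes in RUNS.items():
    avg, rng, pct = spread(minutes)
    print(f"{name:22s} mean={avg:.0f} range={rng} ({pct:.1f}% of mean)")
```

For the totals this works out to a range of 224 minutes, roughly 3% of the mean, with most of that range coming from the final hifv_checkflag stage.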