
Page for tracking an apparent slowdown of CASA-6 relative to CASA-5 for VLASS calibration: https://open-jira.nrao.edu/browse/PIPE-568

Comparing CASA-5 and CASA-6 (casa-pipeline-validation-8) on the two CPU types available for batch processing in NM and CV shows that the newer CPUs (E5-2640v3) run a small calibration job (6.7GB) about 1.25 times faster than the old CPUs (E5-2670), and that CASA-6 is slower in every case. There was no significant run-time difference between NM and CV for similar hardware and software. Results are in minutes.

Here is the full pipeline script I have used for all of these tests: casa_pipescript.py. For some tests, I commented out everything except hifv_importdata.

Full, serial pipeline with small dataset

RHEL7 - 6.7GB dataset with NM Lustre-2.5.5 (results are in minutes)

CASA  nmpost051 (E5-2640v3)  cvpost020 (E5-2640v3)  nmpost038 (E5-2670)  cvpost003 (E5-2670)
5     114, 117               110, 111               144, 143             140, 141
6     156*, 164*             156*, 158*             200*, 201*           197*, 199*
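The "about 1.25 times faster" figure can be checked against the table above with a few lines of arithmetic. The following is a minimal sketch, assuming each cell holds two independent runs that can be averaged; the dictionary layout and function names are illustrative and not part of the pipeline.

```python
from statistics import mean

# Run times in minutes, copied from the table above (two runs per cell).
# The "*" flags (tclean() errors) are dropped for the arithmetic.
runs = {
    ("CASA-5", "NM", "E5-2640v3"): [114, 117],
    ("CASA-5", "CV", "E5-2640v3"): [110, 111],
    ("CASA-5", "NM", "E5-2670"): [144, 143],
    ("CASA-5", "CV", "E5-2670"): [140, 141],
    ("CASA-6", "NM", "E5-2640v3"): [156, 164],
    ("CASA-6", "CV", "E5-2640v3"): [156, 158],
    ("CASA-6", "NM", "E5-2670"): [200, 201],
    ("CASA-6", "CV", "E5-2670"): [197, 199],
}


def cpu_speedup(casa, site):
    """Mean run time on the old CPU divided by mean run time on the new CPU."""
    old = mean(runs[(casa, site, "E5-2670")])
    new = mean(runs[(casa, site, "E5-2640v3")])
    return old / new


def casa6_slowdown(site, cpu):
    """Mean CASA-6 run time divided by mean CASA-5 run time on the same host type."""
    return mean(runs[("CASA-6", site, cpu)]) / mean(runs[("CASA-5", site, cpu)])


for casa in ("CASA-5", "CASA-6"):
    for site in ("NM", "CV"):
        print(f"{casa} {site}: new CPU is {cpu_speedup(casa, site):.2f}x faster")

for site in ("NM", "CV"):
    for cpu in ("E5-2640v3", "E5-2670"):
        print(f"{site} {cpu}: CASA-6 takes {casa6_slowdown(site, cpu):.2f}x as long as CASA-5")
```

With these numbers the CPU speedup falls between roughly 1.24 and 1.27, and CASA-6 takes roughly 1.4 times as long as CASA-5 on every host type.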


RHEL7 - 6.7GB dataset after NM upgrade Lustre-2.10.8 and CV results copied from last test (results are in minutes)

CASA  NM (E5-2640v3)  CV (E5-2640v3)  NM (E5-2670)  CV (E5-2670)
5     113, 110        110, 111        142, 141      140, 141
6     155*            156*, 158*      198*          197*, 199*

Mar. 3, 2020 krowe: I tried the nmpost051-casa6-rhel7 with the latest casa-pipeline-validation-17.  The run-time was the same as were the tclean() errors.

"*" Means it completed with tclean() errors


Just serial hifv_importdata() with large dataset

RHEL7 - 350GB dataset with NM Lustre-2.5.5 (results are in minutes)

You can see that running just hifv_importdata() on a larger data set (350GB) shows that nmpost nodes run about 2% to 10% faster than similar cvpost nodes with CASA-6 performing slower in every case.


RHEL7 - 350GB dataset with NM Lustre-2.10.8 (results are in minutes)


Full, serial pipeline with large dataset

RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 or CASA 6.0.0.23a100.dev17 (results are in minutes)

CASA  NM (E5-2640v3)  CV (E5-2640v3)  NM (E5-2670)  CV (E5-2670)
5     3,045^          3,011^          3,431^        3,401^
6     3,640           3,466           4,511         4,392

"^" Means "SEVERE setjy No rows were selected"
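To put the large-dataset numbers above in relative terms, the sketch below computes how much longer CASA-6 took than CASA-5 on each host type. The values are copied from the table; the helper name `percent_slower` is illustrative. Note that every CASA-5 run here logged the setjy SEVERE message, so the comparison may not be strictly like for like.

```python
# Run times in minutes from the table above; the "^" flag is dropped.
casa5 = {
    "NM E5-2640v3": 3045,
    "CV E5-2640v3": 3011,
    "NM E5-2670": 3431,
    "CV E5-2670": 3401,
}
casa6 = {
    "NM E5-2640v3": 3640,
    "CV E5-2640v3": 3466,
    "NM E5-2670": 4511,
    "CV E5-2670": 4392,
}


def percent_slower(baseline, candidate):
    """Extra run time of candidate relative to baseline, in percent."""
    return 100.0 * (candidate - baseline) / baseline


for host, minutes5 in casa5.items():
    minutes6 = casa6[host]
    print(
        f"{host}: CASA-5 {minutes5 / 60:.1f} h, CASA-6 {minutes6 / 60:.1f} h, "
        f"CASA-6 is {percent_slower(minutes5, minutes6):.1f}% slower"
    )
```

For these runs the CASA-6 penalty is roughly 15% to 20% on the E5-2640v3 nodes and roughly 29% to 31% on the E5-2670 nodes.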


Full, serial pipeline with large dataset and Profiling Metrics

RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 and CASA 6.0.0.23a100.dev17 (results are in minutes)

CASA  NM (E5-2640v3)  CV (E5-2640v3)  NM (E5-2670)  CV (E5-2670)
5     4,095^, 2,645^  4,559^          3,397^        3,410^
6     3,527
      4,442

"^" Means "SEVERE setjy No rows were selected"



Current Pipeline Script

Mar. 17, 2020 I started using the same pipeline script that Brian is currently using.

Full, new, serial pipeline with small dataset

RHEL7 - 6.7GB dataset with NM Lustre-2.10.x (results are in minutes). I tested a CASA-6 job with and without cf.validate_parameters = False, and both jobs took the same amount of time, +/- 1 minute.

"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"


Full, new, serial pipeline with large dataset

RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 or CASA 6.0.0.23a100.dev17 (results are in minutes)

CASA  NM (E5-2640v3)  CV (E5-2640v3)  NM (E5-2670)  CV (E5-2670)
5     3,350*^         3,362*^         4,605*^       4,480*^
6     4,016*          3,943*          5,671*        5,253*

"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"

"^" Means "SEVERE setjy No rows were selected"
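Because the cells above mix a run time with one or more error flags, it can help to split them before doing any arithmetic. The following is a minimal sketch, assuming cells always have the form of a comma-grouped integer followed by zero or more "*" or "^" characters; `parse_cell` and the `FLAGS` mapping are illustrative helpers, not part of CASA or the pipeline.

```python
import re

# Flag meanings as given in the footnotes above.
FLAGS = {
    "*": "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics",
    "^": "SEVERE setjy No rows were selected",
}


def parse_cell(cell):
    """Split a cell such as '3,350*^' into (minutes, set of flag characters)."""
    match = re.fullmatch(r"\s*(\d[\d,]*)([*^]*)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    minutes = int(match.group(1).replace(",", ""))
    return minutes, set(match.group(2))


hosts = ["NM E5-2640v3", "CV E5-2640v3", "NM E5-2670", "CV E5-2670"]
rows = {
    "CASA-5": ["3,350*^", "3,362*^", "4,605*^", "4,480*^"],
    "CASA-6": ["4,016*", "3,943*", "5,671*", "5,253*"],
}

for casa, cells in rows.items():
    for host, cell in zip(hosts, cells):
        minutes, flags = parse_cell(cell)
        print(f"{casa} {host}: {minutes} min ({minutes / 60:.1f} h)")
        for flag in sorted(flags):
            print(f"    {flag} = {FLAGS[flag]}")
```

Keeping the flags alongside the minutes makes it easy to see that only the CASA-5 runs hit the setjy message, while the flagging message appears in every run.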


Full, new, serial pipeline with large dataset and profiling metrics

RHEL7 - 350GB dataset with NM Lustre-2.10.x, CASA-pipeline-5.6.3-9 or CASA 6.0.0.23a100.dev17 (results are in minutes)

"*" Means "SEVERE pipeline.hifv.tasks.flagging No flag summary statistics"

"^" Means "SEVERE setjy No rows were selected"


