VIP Script times by step

All times are in hours.

Large data set VLASS1.2.sb36491855.eb36574404.58585.53016267361_datacolumn.ms with full parameters

Step	NRAO (steps-all-parallel9)	NRAO/CHTC (steps-all-parallel10)	NRAO/AWS (steps-all-parallel16)
01	9.4	9.2	12.3
05	60.2	killed at 72 hours	65.9
06	24		24.4
07	11.8		14.4 (leap second and timeout errors)
15	55.2		0.0 (exception error)
16	6.1
23	230.8
24	46
Total	443.5

Small data set test.ms with full parameters

Step	NRAO (steps-all-parallel12)	NRAO/CHTC (steps-all-parallel15)	NRAO/AWS (steps-all-parallel14)
01	1.8	2.0	1.9
05	8.6	56.8	5.1
06	3.0	3.9	2.0
07	2.0	2.3	2.2
15	6.9	56.3	4.3
16	1.4	1.7	1.4
23	8.3	47.8	5.3
24	14.1	66.0	16.8
Total	46.1	236.8	39.0
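
To see which steps account for the difference between sites, the per-step times can be turned into ratios against the NRAO run. The following is a minimal sketch, not part of the VIP script: the `times` dictionary is transcribed by hand from the small data set table, and the `slowdown` helper is a hypothetical name introduced only for illustration.

```python
# Small data set (test.ms) step times in hours, transcribed from the table above.
times = {
    "01": {"NRAO": 1.8, "CHTC": 2.0, "AWS": 1.9},
    "05": {"NRAO": 8.6, "CHTC": 56.8, "AWS": 5.1},
    "06": {"NRAO": 3.0, "CHTC": 3.9, "AWS": 2.0},
    "07": {"NRAO": 2.0, "CHTC": 2.3, "AWS": 2.2},
    "15": {"NRAO": 6.9, "CHTC": 56.3, "AWS": 4.3},
    "16": {"NRAO": 1.4, "CHTC": 1.7, "AWS": 1.4},
    "23": {"NRAO": 8.3, "CHTC": 47.8, "AWS": 5.3},
    "24": {"NRAO": 14.1, "CHTC": 66.0, "AWS": 16.8},
}


def slowdown(times, site, baseline="NRAO"):
    """Return {step: site_time / baseline_time} for every step (hypothetical helper)."""
    return {step: row[site] / row[baseline] for step, row in times.items()}


if __name__ == "__main__":
    for site in ("CHTC", "AWS"):
        for step, ratio in slowdown(times, site).items():
            print(f"step{step}  {site}/NRAO = {ratio:.2f}x")
```

With these numbers, steps 05, 15, 23 and 24 each take more than four times as long at CHTC as at NRAO, while steps 01, 06, 07 and 16 stay within about 1.3 times.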

CPUs at CHTC are noticeably slower than CPUs at NRAO. For example, their set of c20xx machines (e20{03..18}) each have two Intel Xeon Silver 4114 2.20GHz processors and 0.5TB to 1TB of memory, while their large memory machines (mem3, mem2001, mem2002) each have four Intel Xeon E7-4820 v4 2.00GHz processors and 2TB to 4TB of memory. Possible reasons for this slowdown:

cfcache on cephfs
Slower CPUs
Multiple users
Hyperthreading

I ran a small data set test with full parameters at CHTC that first copied cfcache from /staging to local disk, and step05 took only 16.7 hours, compared with 56.8 hours for step 05 in the CHTC column of the small data set table.
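
The copy-to-local-disk approach could be sketched as below. This is an illustration under stated assumptions rather than the code used for the test: it assumes cfcache is a directory tree, `stage_cfcache` is a hypothetical helper name, and the paths in the usage comment are placeholders.

```python
import shutil
from pathlib import Path


def stage_cfcache(src, local_root):
    """Copy the cfcache directory `src` under `local_root`; return the local copy's path.

    An existing destination is reused as-is, so an interrupted earlier copy
    would have to be removed by hand before rerunning.
    """
    src = Path(src)
    dest = Path(local_root) / src.name
    if not dest.exists():
        shutil.copytree(src, dest)
    return dest


# Hypothetical usage; both paths are placeholders, not the ones used in the test:
# local_cfcache = stage_cfcache("/staging/<user>/cfcache", "/path/to/local/scratch")
# The imaging step would then be pointed at local_cfcache instead of /staging.
```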
