Open questions:

  • As we feared, referencing the cache of convolution functions (cfcache) directly from /staging performed poorly. This is due to an fstat() pathology that shows up on distributed filesystems. Jobs ran 3 to 4 times faster when we copied cfcache from /staging to local disk: in a small-data-set test with full parameters at CHTC, step05 took only 16.7 hours with cfcache copied to local disk, versus 56.8 hours with cfcache read from /staging.
  • I had a job killed because it exceeded 72 hours, even though I set +LongJobs = true in the submit file.
  • What are the options for setting up HTCondor to both flock to CHTC and annex to AWS? Multiple submit hosts? Multiple central managers (CMs)? etc.
  • What are the clever solutions for submitting N different DAG jobs, each with different parameters?
    • T10t34
      • J220200-003000
        • bin, working, data
      • J220600-003000
        • bin, working, data
      • ...
    • T10t35
      • J170743-393000
        • bin, working, data
      • J171241-383000
        • bin, working, data
      • ...
    • ANSWERS:
      • Use the INCLUDE syntax in DAG files
      • Use the include syntax in submit files
      • Make a template of the files
      • Use a PRE script that populates things
      • Use the -usedagdir option of condor_submit_dag
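
A minimal sketch of the "template of files" answer combined with -usedagdir, assuming each field gets its own DAG file inside its own directory. The node names (step01, step02), the submit-file names, the file name field.dag, and the function write_field_dags are all hypothetical placeholders, not the actual pipeline.

```python
from pathlib import Path
from string import Template

# Hypothetical per-field DAG template. The step names and submit-file
# names are placeholders; the real pipeline has its own steps.
DAG_TEMPLATE = Template("""\
JOB step01 step01.sub
JOB step02 step02.sub
PARENT step01 CHILD step02
VARS step01 tile="$tile" field="$field"
VARS step02 tile="$tile" field="$field"
""")


def write_field_dags(root, fields_by_tile):
    """Create <root>/<tile>/<field>/{bin,working,data} and write a
    field.dag, filled in from the template, into each field directory.

    Returns the list of DAG file paths that were written.
    """
    dag_paths = []
    for tile, fields in fields_by_tile.items():
        for field in fields:
            field_dir = Path(root) / tile / field
            # Mirror the directory layout listed in the notes above.
            for sub in ("bin", "working", "data"):
                (field_dir / sub).mkdir(parents=True, exist_ok=True)
            dag_path = field_dir / "field.dag"
            dag_path.write_text(
                DAG_TEMPLATE.substitute(tile=tile, field=field)
            )
            dag_paths.append(dag_path)
    return dag_paths
```

Calling write_field_dags("runs", {"T10t34": ["J220200-003000", "J220600-003000"]}) would produce one field.dag per field. Those could then be submitted together with something like condor_submit_dag -usedagdir runs/*/*/field.dag, so that each DAG runs from its own directory. The template here only covers the DAG file; the submit files would need the same treatment or could read the tile and field values passed through VARS.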

...