This will start as a collection of notes about the ALMA Observation Unit Set system.  With luck it will gel into a new metadata structure which can be used for more appropriate processing of ALMA data (their pipeline only operates at the OUS level, not on individual Execution Blocks), as well as an organizational construct for larger-scale processing of VLA data.


Although the OUS was initially intended as a recursive structure, ALMA has decided to limit it to 3 layers with specific names.  In particular, the nature of ALMA (multiple arrays, and short-lived configurations) makes the grouping structures necessary for the decomposition of the Science Goals into actionable Scheduling Blocks.  The smallest grouping (MOUS) is the level where the pipeline calibration and imaging take place.  Those processing products are then tied to the OUS level, rather than to the individual EBs.

Currently, and for the foreseeable future (~10 years), ALMA pipeline processing will remain at the MOUS level.  Anything more advanced is left to the user.  

Since nothing's ever simple, there is further sub-structure within a MOUS: sessions, i.e. instances where a scheduling block (~2hrs max) is run multiple times in succession.  In this case, the later execution blocks skip some of the calibration scans (for things which vary on a slower timescale, such as flux or bandpass), and instead piggyback off the initial execution block's data for that information.  How to tell which EBs are in a single session is yet to be determined.
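
To make the piggybacking concrete, here is a minimal Python sketch.  It assumes the session membership is already known (which, as noted, is still an open question) and is handed in as an ordered list.  The names `ExecutionBlock`, `SLOW_INTENTS` and `calibration_sources` are made up for illustration and are not part of any ALMA software.

```python
from dataclasses import dataclass

# Calibrations assumed to vary slowly enough to be shared within a session.
SLOW_INTENTS = frozenset({"flux", "bandpass"})


@dataclass
class ExecutionBlock:
    eb_id: str            # hypothetical identifier
    intents: frozenset    # calibration intents actually observed in this EB


def calibration_sources(session):
    """Map each EB in an ordered session to the EB supplying each slow calibration.

    An EB that observed the calibrator itself uses its own scans; otherwise it
    falls back to the first EB of the session (None if that one lacks it too).
    """
    if not session:
        return {}
    first = session[0]
    result = {}
    for eb in session:
        sources = {}
        for intent in sorted(SLOW_INTENTS):
            if intent in eb.intents:
                sources[intent] = eb.eb_id
            elif intent in first.intents:
                sources[intent] = first.eb_id
            else:
                sources[intent] = None
        result[eb.eb_id] = sources
    return result
```

The point is only that a later EB in a session is not self-contained for calibration purposes, so whatever structure we build has to be able to record that dependency.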

ALMA keeps status information (mostly related to observation, or processing as well?) at the level of each OUS.  I suspect this is a blob of XML which is retrieved from the metadata database, but I haven't confirmed that.  None of the status XML files affect the pipeline run in any way, despite some being explicitly referenced in the ProjectStructure section of the PPR.


I suspect that with SRDP Jeff is going to want to resurrect the full recursive capabilities.  Do we need to keep the OUS name, or could we rename it to emphasize the processing aspect (and just note the mapping with ALMA OUSs)?


By analogy: An OUS is much like one of our filegroups.  It belongs to a project at the top level, and contains other OUSs + products, or it contains a set of one or more EBs + products at the base level.  We can organize this much like filegroups with a parent link and a project link in the table.
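
A rough Python sketch of that filegroup-like shape, assuming a top-level OUS carries the project link and every other OUS carries only a parent link.  The class and method names (`Ous`, `owning_project`, `all_eb_ids`) are placeholders, not an existing API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Ous:
    name: str
    project_id: Optional[str] = None     # set only on a top-level OUS
    parent: Optional["Ous"] = field(default=None, repr=False, compare=False)
    children: List["Ous"] = field(default_factory=list)
    eb_ids: List[str] = field(default_factory=list)   # only at the base level

    def add_child(self, child):
        """Attach a nested OUS and set its parent link."""
        child.parent = self
        self.children.append(child)
        return child

    def owning_project(self):
        """Walk the parent links up to the top-level OUS and return its project."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node.project_id

    def all_eb_ids(self):
        """Collect the EBs of this OUS and of everything nested below it."""
        ebs = list(self.eb_ids)
        for child in self.children:
            ebs.extend(child.all_eb_ids())
        return ebs
```

For example, building `top = Ous("top", project_id="P1")`, then `mid = top.add_child(Ous("mid"))` and `leaf = mid.add_child(Ous("leaf", eb_ids=["eb1", "eb2"]))` lets `leaf.owning_project()` find "P1" without the leaf storing the project itself.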

The OUS needs to keep some information of its own:  A name, a type of processing to be done (calibration, imaging, both, likely to be expanded later?), the products generated by the OUS (could be multiple types, can we get a definitive list?), and the constituent data (a 1→Many mapping table?).  Since the OUS is a metadata construct, it would likely contain references to the product metadata (image set, or calibration table) rather than a direct link to the resultant filegroup (although it could have filegroup references for uniformity...).  Are we going to want to keep time information at the OUS level (when requested, when approved/finished)?  What am I missing in terms of useful information at this level?  We should link to the workflow metadata table as well, since the OUS is intended for the type of automated processing the workflows provide.
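
One way the "calibration, imaging, both, more later" processing type could be stored is as bit flags packed into a single integer.  This is only a sketch: the flag names and values below are assumptions, not an agreed enumeration.

```python
from enum import IntFlag


class Processing(IntFlag):
    # Hypothetical values; each new type takes the next power of two.
    CALIBRATION = 1
    IMAGING = 2


# "Both" is the bitwise OR of the individual flags.
BOTH = Processing.CALIBRATION | Processing.IMAGING


def needs(stored_value, kind):
    """Return True if the integer stored on an OUS includes the given type."""
    return bool(Processing(stored_value) & kind)
```

Adding a type later only means adding another power of two, and existing stored values stay valid.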

What else should we have?





OUS were conceived as a means of organizing multi-configuration observing.  Their application to the VLA in that sense is somewhat limited.  They're also used to organize processing of the data, and that's where they have potential merit for non-ALMA telescopes.  Multi-EB processing is going to become a thing, so we need an organizational structure for results that don't belong to a single EB (quicklook images were an easy staple-on addition).  This isn't going to be done well with a single table; it will need a main metadata (MD) table and several linkage tables.


  • OUS Table Itself
    • ous_id
    • Structural (one null, one not null):
      • project_id
      • parent_ous_id
    • Informational:
      • Name
      • hasEBs? (boolean to indicate if it's a final-level OUS and can usefully be used to link to EBs)
      • hasProducts? (boolean to indicate if there are products.... may not be necessary)
      • purpose/type (integer of summed product types this OUS is intended to contain)
      • Time of Creation
      • Last Updated   (One of the time fields may play into proprietary period)
  • Product Type Enumeration Table
    • A product is the result of some processing upon a set of data, resulting in something ingested into the archive.
    • Calibration Tables
    • Image Sets
      • Subdivide by type?
    • Cubes
    • Catalogs?
    • Others as they are developed.
  • OUS ↔ EB linkages (one OUS, many EBs)
    • There are practical limitations to what types of EBs one could gather.  Start with:
      • link to ids for the execution_blocks table, not directly to filegroups.
      • Single Telescope
        • ones within the same project for EVLA (more realistic would limit to same receiver(s) and configuration)
        • use the MOUS groupings for ALMA
        • ??? for VLBA
        • ??? for GBT
      • Eventually we'll probably want to allow multi-config, multi-band, multi-project, multi-telescope
  • OUS ↔ Products linkages  (one OUS many products of many types). 
    • Each OUS can be processed multiple times to generate different products (or the same type of products multiple times).  The OUS will keep a combo of the product types, but we need the individual types separated here.   
    • Type clarifies which product table the ID refers to (calibration tables, image sets, cubes, etc)
    • Link should be at the metadata level, rather than pointing at filegroups
  • OUS ↔ Workflow linkages (maybe, but this will likely make life easier for SRDP)
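
The table outline above could be prototyped roughly as follows, using Python's `sqlite3` with an in-memory database.  This is a sketch under assumptions: the table and column names, the column types, and the seeded product type names are all placeholders; hasProducts is left out because it can be derived from the product linkage table; and the workflow linkage is omitted since it is still a "maybe".

```python
import sqlite3

SCHEMA = """
CREATE TABLE ous (
    ous_id        INTEGER PRIMARY KEY,
    project_id    TEXT,
    parent_ous_id INTEGER REFERENCES ous(ous_id),
    name          TEXT NOT NULL,
    has_ebs       INTEGER NOT NULL DEFAULT 0,
    purpose       INTEGER NOT NULL DEFAULT 0,   -- summed product types
    created_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    -- structural rule: exactly one of the two links is set
    CHECK ((project_id IS NULL) <> (parent_ous_id IS NULL))
);
CREATE TABLE product_type (
    product_type_id INTEGER PRIMARY KEY,
    name            TEXT NOT NULL UNIQUE
);
-- execution_block_id would point at the execution_blocks table (not defined here)
CREATE TABLE ous_eb (
    ous_id             INTEGER NOT NULL REFERENCES ous(ous_id),
    execution_block_id INTEGER NOT NULL,
    PRIMARY KEY (ous_id, execution_block_id)
);
-- product_type_id says which product metadata table product_id refers to
CREATE TABLE ous_product (
    ous_id          INTEGER NOT NULL REFERENCES ous(ous_id),
    product_type_id INTEGER NOT NULL REFERENCES product_type(product_type_id),
    product_id      INTEGER NOT NULL,
    PRIMARY KEY (ous_id, product_type_id, product_id)
);
"""


def make_db():
    """Create an in-memory database with the sketch schema and seed product types."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO product_type (name) VALUES (?)",
        [("calibration_tables",), ("image_sets",), ("cubes",)],
    )
    conn.commit()
    return conn


def products_for(conn, ous_id):
    """List (product type name, product id) pairs linked to one OUS."""
    rows = conn.execute(
        "SELECT t.name, p.product_id FROM ous_product p "
        "JOIN product_type t USING (product_type_id) "
        "WHERE p.ous_id = ? ORDER BY t.name, p.product_id",
        (ous_id,),
    )
    return rows.fetchall()
```

The CHECK constraint is what enforces "one null, one not null" for the structural columns: a row with neither a project nor a parent, or with both, is rejected at insert time.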



