Overview
The NRAO Archive holds data dating back to 1976 and currently holds just over 3 petabytes. Preserving this data and providing tools that let the full multi-messenger community use the data products efficiently is a key responsibility of the DMS department. Starting in FY2023, SSR, working collaboratively with DMS, will launch a new project to expand the use of the NRAO Archive as a Research Instrument. This project will focus on user-specified reprocessing, interoperability with other science archives, and increasing the types of projects supported. We expect this project to extend at least into 2024 and perhaps to 2025.
Scope Statement
The archive project will improve the user experience with the NRAO Archive for raw (visibility or otherwise) and image data from the VLA, VLBA, ALMA, and other telescopes. It will do so without accruing significant new technical debt, and it will retire existing technical debt where necessary to improve the user experience. Interdependencies between the Workspaces system and the archive will be addressed within this project; significant changes to Workspaces are out of scope, though limited enhancements are possible.
The archive predates architectural analysis at the NRAO. Therefore, this project will begin with a limited architectural review that captures the architectural drivers, quality attributes, and overall architecture of the archive, in order to inform the rest of the project.
Key Goals
- Improve the scripting interface to the NRAO archive
- Allow downloading of data
- Display additional metadata (and/or refactor database) to enable better filtering of data
- Download improvements (inclusion of wget commands to retrieve data).
- Search improvements.
- Position search fixes.
- Frequency-dependent FOV search
- Search based on integration time.
- VLA-metadata display improvements (helps enable VUDI)
- scan list improvements
- sources
- frequency setup(s)
- total integration times
- QA state
- weblogs
- VLBA data presentation and display
- Completion of the VLBA data collection: fixing corrupted Mark IV files and adding a small amount of missing GMVA, HSA, and Global VLBI data.
- Downloading large numbers (>500) of correlation files for VLBA projects in bulk via the 'Select All' option.
- Fixing the missing or incomplete metadata of some VLBA+EVLA (i.e., +Y1, +Y27) projects, without which download is impossible.
- Improve restore capability
- Increased robustness of ALMA MOUS ingestion of EBs and calibrations (ALMA Butler)
- Allow subsets to be downloaded (select on target, intents, spectral windows)
- Select CASA versions automatically
- Improve User-Defined Imaging
- Improvements to AUDI (ALMA User-Defined Imaging) and reimplementation into Workspaces
- Add VLA User-Defined Imaging Capability
- Enable the retirement of the legacy AAT/PPI
- IDM? (Identity Management; might be mostly done?)
- Support ngVLA archive workspaces (scope to be clarified)
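The "Frequency-dependent FOV search" goal above can be illustrated with a small sketch. It is not the archive's implementation: the function names, the haversine separation test, and the beam scaling factor of 1.02 are all assumptions chosen for illustration. The idea is that the primary-beam width scales roughly as wavelength divided by dish diameter, so whether a sky position falls inside an observation's field of view depends on the observing frequency.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def primary_beam_fwhm_deg(freq_hz, dish_diameter_m, taper_factor=1.02):
    """Approximate primary-beam FWHM in degrees as taper_factor * lambda / D.

    taper_factor is an assumed value; the true factor depends on how the
    dish is illuminated and differs between telescopes.
    """
    wavelength_m = SPEED_OF_LIGHT_M_S / freq_hz
    return math.degrees(taper_factor * wavelength_m / dish_diameter_m)


def angular_separation_deg(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def in_field_of_view(target, pointing, freq_hz, dish_diameter_m):
    """True if target (ra, dec) lies within half the FWHM of the pointing center."""
    radius_deg = primary_beam_fwhm_deg(freq_hz, dish_diameter_m) / 2
    return angular_separation_deg(*target, *pointing) <= radius_deg


# Hypothetical example: a target 0.1 degrees from the pointing center,
# observed with a 25 m dish at two different frequencies.
target = (180.0, 30.1)
pointing = (180.0, 30.0)
print(in_field_of_view(target, pointing, 1.4e9, 25.0))
print(in_field_of_view(target, pointing, 10e9, 25.0))
```

Under these assumptions the same target falls inside the field of view at 1.4 GHz but outside it at 10 GHz, which is why a position search with a fixed radius can return misleading matches.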
Definition of Done
- The archive is functionally equivalent to the legacy AAT, and the overall user experience is improved.
- As a user, I can more easily identify data I'm interested in on the basis of the metadata display and interactivity (e.g., CARTA), without having to download it first.
Broad Themes/Epics
- Product Versioning
- Data Access:
- ALMA Data (both visibility and image data)
- Large Project Ingestion and Support
- VLA/VLBA Image Archive
- Usability Improvements
- DA Data Editing Tools
- Metadata availability
- VO / Scriptable Interfaces
- User Driven Imaging
- VUDI
Scrum Team
- Product Owner, Archive: John Tobin
- Scrum Master: Daniel Lyons
- Team Members: Nathan Bokisch; Brittany Faciane; Charlotte Hausman; Daniel Lyons; Daniel Nemergut
Subject Matter Experts
- Mark Whitehead (Architecture)
- Robert Treacy (PMD)
Key Stakeholders ('Product Owner Village')
- John Tobin (ALMA Archive subsystem scientist)
- Amy Kimball (VLASS Ops)
- Adele Plunkett (ALMA Archive)
- Mark Lacy (VLASS, Workspaces)
- Emmanuel Momjian (New Mexico Ops)
- John Tobin (VLBA subsystem scientist)
- John Hibbard (ngVLA archive scientist)
- David Frayer (GBT)
- John Tobin (VLA)
- Stephan Witz (SSA Technologies/Tech Debt Avoidance)
Key Stakeholders
- Tony Remijan
- Jeff Kern
Links
NRAO Archive Tool Enhancements