Overview
Ingestion of the IDIFITS files from eLWA observations is new functionality for the AAT/PPI. The process is, in a nutshell:
...
Using the System
Assumptions:
- The eLWA team has access to the AAT/PPI command line installation areas (/users/vlapipe/workflows/)
- The vlapipe user (and group) has Read, Write, and Execute access.
- Everyone has Read access (to facilitate access by the AAT/PPI services)
- The defined staging area (see below) is on the same filesystem as the data to be ingested
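The same-filesystem assumption can be checked before an ingestion is started. The following is a minimal sketch, assuming that comparing the device IDs reported by `os.stat` is enough to tell filesystems apart on the host. The function name `same_filesystem` is illustrative and is not part of the AAT/PPI tools.

```python
import os
import tempfile


def same_filesystem(path_a: str, path_b: str) -> bool:
    """Return True when both existing paths report the same device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Demonstration with two directories created side by side.
with tempfile.TemporaryDirectory() as root:
    data_dir = os.path.join(root, "data")
    stage_dir = os.path.join(root, "stage_products")
    os.makedirs(data_dir)
    os.makedirs(stage_dir)
    print(same_filesystem(data_dir, stage_dir))
```

In practice the two arguments would be the directory holding the IDIFITS file and the staging area from the CAPO profile.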
The process is contained within a special-purpose workflow, which can be initiated with the 'elwaIngest' command, installed under the vlapipe account.
...
    usage: elwaIngest [-h] [-P PROFILE] [-p SDM_PATH] [-E EMAIL] filename

    ELWA IDIFITS Ingestion, version 4.0.0b1: Initiates an ingestion workflow to
    attach the provided IDIFITS file to its corresponding EVLA EB

    positional arguments:
      filename              Filename(s) to ingest

    optional arguments:
      -h, --help            show this help message and exit
      -P PROFILE, --profile PROFILE
                            profile name to use, e.g. dsoc-test, dsoc-prod
      -p SDM_PATH, --sdm_path SDM_PATH
                            Path to the IDIFITS file to ingest (overrides CAPO
                            setting)
      -E EMAIL, --email EMAIL
                            Optional email address for a completion message
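The help text above has the shape of Python `argparse` output, so the option set can be sketched as a parser. This is a reconstruction from the help text only, not the actual elwaIngest source; the long option name `--sdm_path` and the single `filename` positional are read from that text and may differ in the installed tool.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the elwaIngest option set from its help text (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="elwaIngest",
        description="ELWA IDIFITS Ingestion: attach the provided IDIFITS "
                    "file to its corresponding EVLA EB",
    )
    parser.add_argument("-P", "--profile",
                        help="profile name to use, e.g. dsoc-test, dsoc-prod")
    parser.add_argument("-p", "--sdm_path",
                        help="Path to the IDIFITS file to ingest "
                             "(overrides CAPO setting)")
    parser.add_argument("-E", "--email",
                        help="Optional email address for a completion message")
    parser.add_argument("filename", help="Filename(s) to ingest")
    return parser


# Parse an example command line without touching sys.argv.
args = build_parser().parse_args(
    ["-P", "dsoc-test", "-E", "someone@example.org",
     "buildIDI_TSUB0001_40203068.FITS_1"]
)
print(args.profile, args.filename, args.sdm_path, args.email)
```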
The path argument is provided for flexibility, but it is assumed that the default value in the CAPO profiles (dsoc-test, dsoc-prod) is the typical location. If that path is correct, the command can then be invoked with:
...
activate_profile dsoc-test
elwaIngest buildIDI_TSUB0001_40203068.FITS_1
elwaIngest -p /lustre/aoc/scipos/etc/etc/etc buildIDI_TSUB0001_40203068.FITS_1
Not as vlapipe:
/users/vlapipe/workflows/dsoc-test/bin/elwaIngest buildIDI_TSUB0001_40203068.FITS_1
Production:
As vlapipe:
activate_profile dsoc-prod
elwaIngest buildIDI_TSUB0001_40203068.FITS_1
Not as vlapipe:
/users/vlapipe/workflows/dsoc-prod/bin/elwaIngest buildIDI_TSUB0001_40203068.FITS_1
The workflow will stage the file for ingestion and perform some preparatory work. Then it will call ingestion to set up the metadata and place the files in NGAS (if desired). There is not currently an external sign of the ingestion, so I've hooked the utility into a simplistic feedback system I created for another purpose. You should be provided with the working directory for the ingestion workflow (where the logs for some pieces of the process will show up), and if you provide an email address (-E), you'll get an email about whether the process completed with an error or not.
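When the command is driven from a script, the argument list can be assembled from the same pieces used in the invocations above. The sketch below only builds the list and does not run anything. The helper name `elwa_ingest_argv` is hypothetical, and it assumes the `-p` and `-E` options behave as described in the help text; activating the profile (activate_profile) when running as vlapipe is not covered.

```python
from typing import List, Optional

# Installation area named in the assumptions section.
WORKFLOW_ROOT = "/users/vlapipe/workflows"


def elwa_ingest_argv(filename: str, profile: str, as_vlapipe: bool = True,
                     path: Optional[str] = None,
                     email: Optional[str] = None) -> List[str]:
    """Build the argument list for one elwaIngest call (hypothetical helper)."""
    # As vlapipe the bare command name is used; otherwise the full path
    # under the profile's bin directory.
    exe = "elwaIngest" if as_vlapipe else f"{WORKFLOW_ROOT}/{profile}/bin/elwaIngest"
    argv = [exe]
    if path is not None:
        argv += ["-p", path]    # override the CAPO source path
    if email is not None:
        argv += ["-E", email]   # request the completion email
    argv.append(filename)
    return argv


print(" ".join(elwa_ingest_argv("buildIDI_TSUB0001_40203068.FITS_1",
                                "dsoc-prod", as_vlapipe=False)))
```

The returned list can be passed to `subprocess.run` on a host where the command is installed.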
There is a set of values in the CAPO profiles for use with this workflow:
    #
    # ELWA Collection Settings
    #
    edu.nrao.archive.workflow.config.collection.ElwaSettings.ingestNgas = false
    edu.nrao.archive.workflow.config.collection.ElwaSettings.elwaSourcePath = /lustre/aoc/cluster/pipeline/nmtest/stage_products #jls_test/elwa
    edu.nrao.archive.workflow.config.collection.ElwaSettings.elwaServiceEndpoint = elwa_science_source?path=
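For quick inspection of these settings outside the workflow, the `key = value` lines can be read into a dictionary. This is a sketch under stated assumptions: it is not the CAPO loader, the function name `parse_capo_properties` is hypothetical, and the key names are assumed to read `ElwaSettings.ingestNgas` and `ElwaSettings.elwaServiceEndpoint`.

```python
def parse_capo_properties(text: str) -> dict:
    """Read 'key = value' lines, skipping blanks and '#' comment lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


PREFIX = "edu.nrao.archive.workflow.config.collection.ElwaSettings."
sample = """
#
# ELWA Collection Settings
#
edu.nrao.archive.workflow.config.collection.ElwaSettings.ingestNgas = false
edu.nrao.archive.workflow.config.collection.ElwaSettings.elwaServiceEndpoint = elwa_science_source?path=
"""

settings = parse_capo_properties(sample)
ingest_ngas = settings[PREFIX + "ingestNgas"].lower() == "true"
print(ingest_ngas)
print(settings[PREFIX + "elwaServiceEndpoint"])
```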
Under The Hood
It should be noted that the elwaIngest command isn't doing any processing itself. It only prepares the basic metadata and initiates the workflow. It is possible to provide some limited feedback (a working directory name where some log files are kept, and a success/fail email) with a bit of additional work.
What the workflow does in more detail:
...
- Find and link the IDIFITS file to the staging area (under stage_products directory, named after the file)
- Obtain the associated SDM's SPL (via service which reads the FITS header & queries the AAT)
- Write the Ingestion Manifest & collection metadata to a file in the staging area
- The manifest lists the IDIFITS file with its associated SPL for linking
- Collection metadata only consists of the collection's name
- Prepare ingestion artifacts
- Trigger ingestion
- Ingestion sends a 'complete' signal upon success
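The staging steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the workflow's code: the use of a hard link is inferred from the same-filesystem assumption, the manifest file name and its JSON layout are hypothetical, and the SPL value is a placeholder where the real workflow obtains it from a service.

```python
import json
import os
import tempfile


def stage_idifits(source_file: str, stage_root: str, spl: str) -> str:
    """Hard-link the file into <stage_root>/<file name>/ and write a manifest."""
    name = os.path.basename(source_file)
    stage_dir = os.path.join(stage_root, name)   # directory named after the file
    os.makedirs(stage_dir, exist_ok=True)
    staged = os.path.join(stage_dir, name)
    if not os.path.exists(staged):
        os.link(source_file, staged)             # hard link; needs a shared filesystem
    # Hypothetical manifest layout: the file plus the SPL it should be linked to.
    manifest = {"idifits_file": name, "associated_spl": spl}
    manifest_path = os.path.join(stage_dir, "manifest.json")
    with open(manifest_path, "w") as handle:
        json.dump(manifest, handle, indent=2)
    return manifest_path


# Demonstration in a throwaway directory with a placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "buildIDI_TSUB0001_40203068.FITS_1")
    with open(src, "wb") as handle:
        handle.write(b"placeholder")
    stage_root = os.path.join(tmp, "stage_products")
    print(os.path.basename(stage_idifits(src, stage_root, "placeholder-spl")))
```

Triggering ingestion and waiting for the 'complete' signal are left out, since the source does not describe how those calls are made.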