Overview

Ingestion of IDIFITS files from eLWA observations is new functionality for the AAT/PPI. The process is, in a nutshell:

...

Jira: SSA-6528 (DMS JIRA)

Collection Support: ELWA

Using the System

Assumptions: 

  • The eLWA team has access to the AAT/PPI command line installation areas (/users/vlapipe/workflows/)
  • The vlapipe user (and group) have Read, Write, and Execute access
  • Everyone has Read access (to facilitate access by the AAT/PPI services)
  • The defined staging area (see below) is on the same filesystem as the data to be ingested
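
The assumptions above can be checked before starting an ingestion. The following is a minimal sketch in Python, not part of the AAT/PPI tooling: the function name staging_problems is hypothetical, and it assumes that "same filesystem" can be judged by comparing the device IDs reported by os.stat.

```python
import os


def staging_problems(staging_dir, data_file):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    # The file to ingest has to be readable by whoever runs the command.
    if not os.access(data_file, os.R_OK):
        problems.append("data file is not readable: " + data_file)
    # The staging area has to be writable (and searchable) to create links in it.
    if not os.access(staging_dir, os.W_OK | os.X_OK):
        problems.append("staging area is not writable: " + staging_dir)
    # st_dev identifies the device (filesystem) that a path lives on.
    if os.path.exists(data_file) and os.path.exists(staging_dir):
        if os.stat(data_file).st_dev != os.stat(staging_dir).st_dev:
            problems.append("staging area and data file are on different filesystems")
    return problems
```

Calling staging_problems with the staging directory and the file to ingest returns an empty list when all three checks pass, and one message per failed check otherwise.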

The process is contained within a special-purpose workflow, which can be initiated with the 'elwaIngest' command, installed under the vlapipe account.

...

CLI Arguments

usage: elwaIngest [-h] [-P PROFILE] [-p SDM_PATH] [-E EMAIL]
                  filename

ELWA IDIFITS Ingestion, version 34.90.0b21b1: Initiates an ingestion workflow to attach the provided
IDIFITS file to its corresponding EVLA EB

positional arguments:
  filename              Filename(s) to ingest

optional arguments:
  -h, --help            show this help message and exit
  -P PROFILE, --profile PROFILE
                        profile name to use, e.g. dsoc-test, dsoc-prod
  -p SDM_PATH, --sdm_path SDM_PATH
                        Path to the IDIFITS file to ingest (overrides CAPO setting)
  -E EMAIL, --email EMAIL
                        Optional email address for a completion message
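
For readers who want to script around the command, the option layout above can be mirrored with Python's argparse. This is only an illustrative stand-in, not the actual elwaIngest source: the build_parser function is hypothetical, and the long option name for -p and the single positional filename are taken from the help text above as assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented elwaIngest options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="elwaIngest",
        description="Illustrative stand-in for the elwaIngest command line.",
    )
    parser.add_argument("-P", "--profile",
                        help="profile name to use, e.g. dsoc-test, dsoc-prod")
    # Assumption: the long option name is kept as shown in the help text.
    parser.add_argument("-p", "--sdm_path",
                        help="path to the IDIFITS file (overrides CAPO setting)")
    parser.add_argument("-E", "--email",
                        help="optional email address for a completion message")
    parser.add_argument("filename", help="filename to ingest")
    return parser


if __name__ == "__main__":
    # Parse a sample command line instead of sys.argv so the sketch runs standalone.
    sample = ["-P", "dsoc-test", "buildIDI_TSUB0001_40203068.FITS_1"]
    print(build_parser().parse_args(sample))
```

Options that are not given on the command line come back as None, which is a natural place to fall back to the CAPO profile value.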

The path argument is provided for flexibility, but the default value in the CAPO profiles (dsoc-test, dsoc-prod/nmprod) is assumed to be the typical location. If that path is correct, the command can be invoked with:

...

activate_profile dsoc-test

elwaIngest buildIDI_TSUB0001_40203068.FITS_1

elwaIngest -p /lustre/aoc/sciops/etc/etc/etc buildIDI_TSUB0001_40203068.FITS_1

Not as vlapipe: 

/users/vlapipe/workflows/dsoc-test/bin/elwaIngest buildIDI_TSUB0001_40203068.FITS_1


Production: 

As vlapipe:

activate_profile dsoc-prod

elwaIngest buildIDI_TSUB0001_40203068.FITS_1

Not as vlapipe:

/users/vlapipe/workflows/dsoc-prod/bin/elwaIngest buildIDI_TSUB0001_40203068.FITS_1


The workflow will stage the file for ingestion and perform some preparatory work. Then it will call ingestion to set up the metadata and place the files in NGAS (if desired). There is currently no external sign of the ingestion, so I've hooked the utility into a simplistic feedback system I created for another purpose. You should be provided with the working directory for the ingestion workflow (where the logs for some pieces of the process will show up), and if you provide an email address (-E), you'll get an email saying whether or not the process completed with an error.


There is a set of values in the CAPO profiles for use with this workflow:

ELWA CAPO Settings

edu.nrao.archive.workflow.config.collection.RealfastSettings.serviceUrl = https://webtest.aoc.nrao.edu/archiveServices/
#
edu.nrao.archive.workflow.config.collection.RealfastSettings.pngNameArgument = realfast_ancillaries?path=
edu.nrao.archive.workflow.config.collection.RealfastSettings.donorLocatorArgument = realfast_associate?path=
edu.nrao.archive.workflow.config.collection.RealfastSettings.collectionMetadataArgument = realfast_collection?path=
#
# ELWA Collection Settings
#
edu.nrao.archive.workflow.config.collection.ElwaSettings.ingestNgas = false
edu.nrao.archive.workflow.config.collection.ElwaSettings.elwaSourcePath = /lustre/aoc/cluster/pipeline/jls_test/elwa
edu.nrao.archive.workflow.config.collection.ElwaSettings.elwaServiceEndpoint = elwa_science_source?path=
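
To see how settings like these might be consumed, here is a rough Python sketch that reads key = value lines and builds a lookup URL. It rests on assumptions this page does not confirm: that the endpoint value is appended to a base service URL such as the serviceUrl shown above, and that the file path is passed as the path query value. The function names parse_properties and build_lookup_url are hypothetical.

```python
from urllib.parse import quote


def parse_properties(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, since values such as '...?path=' contain one.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def build_lookup_url(service_url, endpoint, path):
    """Assumed layout: <service_url>/<endpoint><path>; not confirmed by this page."""
    return service_url.rstrip("/") + "/" + endpoint + quote(path, safe="/")
```

Splitting on the first '=' only matters here because the endpoint values themselves end in '?path='.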


Under The Hood

It should be noted that the elwaIngest command isn't doing any processing itself. It only prepares the basic metadata and initiates the workflow. It is possible to provide some limited feedback (a working directory name where some log files are kept, and a success/fail email) with a bit of additional work.

What the workflow does in more detail: 

...

  1. Link the IDIFITS file to the staging area (under the stage_products directory, named after the file)
  2. Obtain the associated SDM's SPL
    1. (via service which reads the FITS header & queries the AAT)
  3. Write the Ingestion Manifest & collection metadata to a file in the staging area
    1. The IDIFITS file with associated SPL for linking
    2. Collection metadata only consists of the collection's name
  4. Prepare ingestion artifacts
  5. Trigger ingestion
    1. Ingestion sends a 'complete' signal upon success
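
The staging part of the steps above can be sketched in Python as follows. This is not the workflow's own code: the stage_file function, the use of a hard link (suggested by the same-filesystem assumption, but not stated), and the collection_metadata.json file name and layout are all assumptions made for illustration. The real Ingestion Manifest format is not described on this page, so it is left out.

```python
import json
import os


def stage_file(source_file, stage_root, collection="ELWA"):
    """Link a file into <stage_root>/<file name>/ and write collection metadata beside it."""
    name = os.path.basename(source_file)
    # The staging subdirectory is named after the file being ingested.
    stage_dir = os.path.join(stage_root, name)
    os.makedirs(stage_dir, exist_ok=True)
    link = os.path.join(stage_dir, name)
    if not os.path.lexists(link):
        # Assumption: a hard link, which only works within one filesystem.
        os.link(source_file, link)
    # Hypothetical file name and layout; the metadata holds only the collection's name.
    metadata_file = os.path.join(stage_dir, "collection_metadata.json")
    with open(metadata_file, "w") as handle:
        json.dump({"collection": collection}, handle)
    return stage_dir
```

Using a hard link keeps the staged copy cheap and avoids dangling references, at the cost of requiring the staging area and the data to share a filesystem.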