Project Charter

Long Description

This project covers the ingest of ALFALFA FITS data cubes, catalogs, and spectra into the archive, and access to those products from the archive.  For details of the ALFALFA project, see the project web page.

The Arecibo Legacy Fast ALFA (ALFALFA) survey is a completed, blind extragalactic HI survey exploiting Arecibo's superior sensitivity, angular resolution and digital technology to conduct a census of the local HI universe over a cosmologically significant volume. ALFALFA has detected more than 30,000 extragalactic HI line sources out to z~0.06, and its catalog will be especially useful in synergy with wide area surveys conducted at other wavelengths.  The data collection is of great interest to the radio community and NRAO users.  NRAO will partner with the ALFALFA team to ingest and serve their valuable products to the community.

ALFALFA has produced ~7500 dual polarization spectral data cubes with supporting weights and continuum maps, an ASCII catalog of over 30,000 extragalactic detections, and HI line spectra.

ALFALFA is a completed program.  All data products were processed in IDL and are converted to FITS with Python and AstroPy.  The ALFALFA project will provide the data cubes in FITS format.

From the SRDP perspective, the set of all ALFALFA data products is a collection.

Publications

The main catalog paper by Haynes et al. (2018).

The long and continually growing list of ALFALFA publications.

ALFALFA Data Products

1. Spectral Data Cubes, Spectral Weights, Continuum Maps, Continuum Weights

Data structures (Spectral cube and three ancillary products - spectral weights, continuum maps, and continuum weights). 

    Note that this product type will likely fit best within the current archive scheme.

    Number of files:  ~7500 x 4 (spectral cubes, spectral weights, continuum maps, continuum weights) = ~30,000 FITS files

Dimensions of the data:

    Spectral data cubes:       144 RA pixels  x 144 Decl. pixels  x  1024 frequency channels   x   2 polarizations   (/lustre/aoc/sciops/bkent/grids/1220+27a_spectral.fits)

    Spectral weights:             144 RA pixels  x 144 Decl. pixels  x  1024 frequency channels   x   2 polarizations    (/lustre/aoc/sciops/bkent/grids/1220+27a_spectralweights.fits)

    Continuum maps:            144 RA pixels  x 144 Decl. pixels  x 2 polarizations   (/lustre/aoc/sciops/bkent/1220+27a_continuum.fits)

    Continuum weights:        144 RA pixels  x 144 Decl. pixels  x 2 polarizations   (/lustre/aoc/sciops/bkent/1220+27a_continuumweights.fits)
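
The dimensions above fix the number of samples in each product, so the pixel payload per file can be estimated with simple arithmetic. The sketch below assumes 4-byte (32-bit float) samples, which is an assumption and not stated in this charter, and it ignores FITS headers and block padding. The names PRODUCT_SHAPES, payload_bytes, and cube_set_bytes are illustrative only.

```python
from math import prod

# Assumption: 32-bit floating point samples. The actual sample width of the
# delivered FITS files is not stated in the charter.
BYTES_PER_SAMPLE = 4

# Axis lengths as listed in the charter (RA, Dec, [frequency], polarization).
PRODUCT_SHAPES = {
    "spectral cube": (144, 144, 1024, 2),
    "spectral weights": (144, 144, 1024, 2),
    "continuum map": (144, 144, 2),
    "continuum weights": (144, 144, 2),
}


def payload_bytes(shape, bytes_per_sample=BYTES_PER_SAMPLE):
    """Size of the pixel data only (no FITS header or padding)."""
    return prod(shape) * bytes_per_sample


def cube_set_bytes(bytes_per_sample=BYTES_PER_SAMPLE):
    """Pixel payload of one spectral cube plus its three ancillary products."""
    return sum(payload_bytes(s, bytes_per_sample) for s in PRODUCT_SHAPES.values())


if __name__ == "__main__":
    for name, shape in PRODUCT_SHAPES.items():
        print(f"{name}: {payload_bytes(shape) / 1e6:.1f} MB")
    print(f"one cube set: {cube_set_bytes() / 1e6:.1f} MB")
```

Under the 4-byte assumption this gives roughly 170 MB per spectral cube and about 340 MB per set of four files; a different sample width scales the result proportionally.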

Volume of data:  ~10 TB

Frequency range:  1335 to 1445 MHz (L-band)

Extragalactic redshift coverage:  -2000 < cz < 18,000 km/s
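
The frequency range and the redshift coverage are related through the rest frequency of the 21 cm HI line. The sketch below converts between the two using the optical velocity convention, cz = c (f_rest / f_obs - 1). It ignores the heliocentric correction, and the function names are illustrative rather than part of any ALFALFA or archive software.

```python
C_KMS = 299792.458          # speed of light in km/s
HI_REST_MHZ = 1420.405752   # rest frequency of the 21 cm HI line in MHz


def freq_to_cz(freq_mhz):
    """Observed frequency (MHz) to cz (km/s), optical convention."""
    return C_KMS * (HI_REST_MHZ / freq_mhz - 1.0)


def cz_to_freq(cz_kms):
    """cz (km/s) to observed frequency (MHz), optical convention."""
    return HI_REST_MHZ / (1.0 + cz_kms / C_KMS)


if __name__ == "__main__":
    for f in (1335.0, 1445.0):
        print(f"{f:.0f} MHz -> cz = {freq_to_cz(f):.0f} km/s")
    for cz in (-2000.0, 18000.0):
        print(f"cz = {cz:.0f} km/s -> {cz_to_freq(cz):.2f} MHz")
```

With these definitions the band edges map to cz values slightly beyond the quoted -2000 to 18,000 km/s, so the stated redshift coverage lies inside the 1335 to 1445 MHz band.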

Sky coverage:  7000 square degrees


2. Extragalactic Catalog

The catalog is provided with a catalog description, as an ASCII text file or as a CSV text file.  Data volume: ~5 Megabytes.

Publication:  IOP PDF


3. HI Spectra

Each detection has a single FITS file with a binary table extension, containing the X-Y points of the spectrum and associated metadata.  Number: ~30,000 FITS files.  Total volume: ~ 1 Gigabyte.

Archive Access to ALFALFA Data Products

Modalities (up for discussion) for the discovery and filtering of ALFALFA data sets in the Archive.

  1. Any primary project that has an associated ALFALFA "product" (a set of cubes, catalog entry, or spectrum) should show the ALFALFA results as a related product in the project view.  
    • This should be displayed in a similar manner to any Images or other SRDP related products, but clearly show that it is an ALFALFA product.
  2. In the observation view, ALFALFA data cube sets are displayed and filtered with the other EBs stored in the archive.
    • A column showing that they belong to the ALFALFA collection should be optionally displayed.
    • The user should be able to suppress display of members of the ALFALFA (or any other) collection.
    • No reprocessing capabilities (e.g. optimized imaging) should be provided for data cubes from the ALFALFA collection (the project is COMPLETE).
  3. A dedicated "ALFALFA" view of the archive that enables searches on the ALFALFA-specific metadata columns.
    • A cone search on the RA and Dec fields (as described in the above catalog description) should be supported.
    • Searching on ranges of heliocentric redshift (as listed in the catalog) could be supported.
    • The ALFALFA view should also provide links to the ALFALFA project, and publications explaining the processing and data products (ALFALFA publications).
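
The cone search and redshift range filter proposed above can be sketched in a few lines of Python. This is only an illustration of the intended semantics, not the archive implementation: the field names ra_deg, dec_deg, and cz_kms and the sample rows are hypothetical and do not reflect the actual catalog columns.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(rows, ra0, dec0, radius_deg, cz_range=None):
    """Return rows within radius_deg of (ra0, dec0), optionally limited in cz."""
    hits = []
    for row in rows:
        sep = angular_separation_deg(ra0, dec0, row["ra_deg"], row["dec_deg"])
        if sep > radius_deg:
            continue
        if cz_range is not None and not (cz_range[0] <= row["cz_kms"] <= cz_range[1]):
            continue
        hits.append(row)
    return hits


# Hypothetical sample rows, invented for illustration only.
CATALOG_ROWS = [
    {"name": "src-A", "ra_deg": 185.0, "dec_deg": 9.0, "cz_kms": 1200.0},
    {"name": "src-B", "ra_deg": 185.5, "dec_deg": 9.2, "cz_kms": 7500.0},
    {"name": "src-C", "ra_deg": 190.0, "dec_deg": 9.0, "cz_kms": 1300.0},
]

if __name__ == "__main__":
    # 12h20m00s, +09d00m00s corresponds to RA = 185.0 deg, Dec = +9.0 deg.
    for row in cone_search(CATALOG_ROWS, 185.0, 9.0, 1.0, cz_range=(0.0, 3000.0)):
        print(row["name"])
```

The haversine form handles the RA wrap at 0h/24h without special cases, which a simple box comparison on RA and Dec would not.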

End user manipulation

The expectation is that a user would search for a position on the sky, or via a galaxy name (NED resolver, etc.).  For instance, if one searches on the position 12h20m00s, +09d00m00s, four data cubes, each with three additional ancillary products, would be returned (16 total hits; total volume would be ~1.6 GB).

At the most basic level, using AstroPy, we would expect an end user to download a set of ALFALFA files and manipulate them with some simple Python or a Jupyter notebook (Example demo: ReadTheDocs).
This is out of the archive scope, but is useful so that we all know what the data look like.
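
As a rough illustration of the kind of manipulation described above, the following sketch reads a cube with AstroPy and averages the two polarizations at one pixel. The axis order (polarization, channel, Dec, RA in NumPy indexing), the use of the primary HDU, and the function names are assumptions; the headers of the delivered files should be checked before relying on them. The demo runs on a tiny synthetic cube rather than a real ALFALFA file.

```python
import os
import tempfile

import numpy as np
from astropy.io import fits


def mean_polarization_spectrum(path, x, y):
    """Average the two polarizations at pixel (x, y) and return the spectrum.

    Assumes the cube sits in the primary HDU with NumPy shape
    (polarization, channel, dec, ra). This layout is an assumption.
    """
    with fits.open(path, memmap=False) as hdul:
        cube = hdul[0].data
        # Reduce over the polarization axis; the result is a new array.
        return np.asarray(cube[:, :, y, x], dtype=float).mean(axis=0)


def demo():
    """Write a small synthetic cube to disk and read one spectrum back."""
    cube = np.zeros((2, 16, 4, 4), dtype=np.float32)
    cube[0, :, 1, 2] = 1.0  # first polarization at (x=2, y=1)
    cube[1, :, 1, 2] = 3.0  # second polarization at the same pixel
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "synthetic_spectral.fits")
        fits.PrimaryHDU(cube).writeto(path)
        return mean_polarization_spectrum(path, x=2, y=1)


if __name__ == "__main__":
    print(demo())
```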

Stakeholders

Brian Kent, NRAO, Project Sponsor and Technical Expert

Jeff Kern, NRAO, Project Sponsor

Martha Haynes, Cornell, ALFALFA PI and Technical Expert

Others?

Prerequisites

The ALFALFA project will provide to NRAO the FITS spectral data cubes and ancillary products, data cube index,  extragalactic catalog, and HI spectra FITS files.

Examples will be provided to the SSA team for comments on header metadata keywords, to ease archive ingestion.

Requirements

Must haves:

  • Ingest of ALFALFA spectral data cubes and, for each cube, three ancillary products (spectral weights, continuum maps, continuum weights)
  • Ingest of the ALFALFA extragalactic catalog
  • Ingest of the ALFALFA spectral line profiles
  • Harvest and persistence of relevant generic and collection-specific metadata
  • Ability to filter on Collection in the observation view of the Archive Interface (both "must be in" and "must not be in" semantics)
  • Ability to search on position and data cube name
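
The "must be in" and "must not be in" collection semantics in the list above can be illustrated with a small filter. This is a sketch only: the record layout (a dict with id and collection keys), the sample records, and the function name are hypothetical and do not describe the archive data model.

```python
def filter_by_collection(observations, must_be_in=None, must_not_be_in=None):
    """Keep observations that satisfy both collection constraints.

    must_be_in: if given, keep only members of these collections.
    must_not_be_in: if given, drop members of these collections.
    """
    include = set(must_be_in or ())
    exclude = set(must_not_be_in or ())
    kept = []
    for obs in observations:
        collection = obs.get("collection")  # None means no collection
        if include and collection not in include:
            continue
        if collection in exclude:
            continue
        kept.append(obs)
    return kept


# Hypothetical records, invented for illustration only.
OBSERVATIONS = [
    {"id": "cube-1", "collection": "ALFALFA"},
    {"id": "eb-2", "collection": "VLASS"},
    {"id": "eb-3", "collection": None},
]

if __name__ == "__main__":
    only_alfalfa = filter_by_collection(OBSERVATIONS, must_be_in=["ALFALFA"])
    no_alfalfa = filter_by_collection(OBSERVATIONS, must_not_be_in=["ALFALFA"])
    print([o["id"] for o in only_alfalfa], [o["id"] for o in no_alfalfa])
```

Keeping the two constraints as separate arguments lets the same call cover the dedicated ALFALFA view (include only) and the suppression of ALFALFA members in the general observation view (exclude only).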

Should haves:

  • Search hits should return the data cubes, and any catalog entries and HI spectra within the search parameters.

Could haves:

  • Search on frequency range?
  • Search via object name (extragalactic NED resolver)?
  • Search via heliocentric redshift?



Implementation Plan

Risk Assessment

The ALFALFA data represent a low-risk, completed, and mature dataset that will be useful to the astronomical community.  The data are not going to be remade or regridded, and will be ingested into the archive only one time.


5 Comments

  1. I could use clarification on  "Archive Access to ALFALFA Data Products", point 1, "Any primary project that has an associated ALFALFA "product" (a set of cubes, catalog entry, or spectrum) should show the ALFALFA results as a related product in the project view."

    In the context of the archive, project metadata is essentially proposal metadata and authorization information. This is a somewhat simplistic explanation, but let's run with it. Since these ALFALFA products came from Arecibo, I presume they aren't tied to an NRAO proposal, so I need to know a bit more about what these primary projects might be that have such associations.

    Should this be handled akin to how we did VLASS, as an un-proposed  'ALFALFA' project we hand edit with a title, abstract and authors?

    1. Stephan Witz That's a good query.  I think the PIs of ALFALFA are interested in the following scenario.

      A user searches for a position (with radius) or a named astronomical object.  The object name is resolved, and the archive returns relevant NRAO projects based on the proposal metadata and authorization.  It would also return entries for ALFALFA products that they could download.  They would not be tied to an NRAO proposal.

      The VLASS scenario might be a good model to follow.

      Or should it be a separate portal?  Does the ALFALFA data collection fit into the NRAO archive database model-view?


      1. So probably you want the ability to have both: an ALFALFA branded portal that just shows ALFALFA, and the main one that can show mixed results. Per-collection portals are in the future, as is catalog support, and the ability to download ancillaries is just now being worked on, but we'll experiment with the spectral data cubes to see what the path forward looks like.

        This page (http://egg.astro.cornell.edu/alfalfa/index.php), is that the title and abstract we would use?

        1. Great, I appreciate your outlook Stephan Witz for the future.  The ability to have both style portals would be wonderful.

          Title could be: "The Arecibo Legacy Fast ALFA Survey Extragalactic HI Data Collection"

          Abstract can be taken from the 2018 paper: https://ui.adsabs.harvard.edu/abs/2018ApJ...861...49H/abstract

  2. Stephan Witz I have placed 54 example data cube directories for us to have a look at in the following location: /lustre/aoc/sciops/bkent/grids/

    Each directory contains 16 FITS data cubes for the RA/Dec position specified by that directory.

    These are there to help us determine how best to define the metadata for ingesting into the archive.