Image & Image Set Metadata Fields

Below are two sets of metadata to be stored in the new archive for Images and the Image Sets to which they belong. Where relevant, each field lists its equivalent in the Virtual Observatory ObsCore Data Model, a potential source from which to obtain the data, and a comment. The images table was written specifically with spatial images in mind. Other data products (spectra, time series, etc.) would have different levels of granularity in the available header information.

The existing images table covers much of the desired data. The modifications we require are:

Split the center_position and observing_band fields
Retain image size data (the VO _xel fields)
Add fields for spectral resolution, physical data description, and spatial region information
Determine and acquire time-domain information

Below is the proposed new table structure for Images and Image Products in the metadata database. There are four classes of columns below: initial columns, new columns, columns adapted from (previous_column), and columns renamed to match their equivalents in the observation tables. Depending on the design decisions and the behavior of CASA, some of these fields (t_xel, em_xel, pol_xel) could be removed if they will be invariant.

Images Database Table:

Column Name	Units	VO Equivalent	Source	Comments
sourcename		target_name	OBJECT Keyword
ra (center_position)	deg	s_ra	Center of RA Axis
dec (center_position)	deg	s_dec	Center of Dec Axis
field_of_view	deg	s_fov	Average of NAXISn*CDELTn of the spatial Axes
spatial_region		s_region	Derived from spatial Data	STC-S String Defined in the TAP (Dowler, et al 2010)
ra_element_count		s_xel1	Relevant NAXISn Keyword
dec_element_count		s_xel2	Relevant NAXISn Keyword
spatial_resolution	arcsec	s_resolution	Average of CDELTn of the spatial Axes, converted from deg
starttime	MJD	t_min		Temporal domain information for an image depends upon the observation(s) used in its generation. Some controlling process (such as Vlass Manager, whatever imaging pipeline we create, or any image manipulation software we use) will need to supply it.
endtime	MJD	t_max
exposure_time	s	t_exptime
min_frequency (observing_band)	Hz	em_min	Minimum of Spectral Axis	The spectral information is recoverable from the FITS header, and will identify the spectral coverage and total bandwidth processed for the image. (TBD: what further information is needed for unit conversion for the VO (Hz → m)?)
max_frequency (observing_band)	Hz	em_max	Maximum of Spectral Axis
spectral_resolution	unitless	em_res_power	Center Frequency/CDELTn
polarization_id (reverting to initial name)		pol_states	Polarization CRVALn value	CASA uses the CRVALn value to convey polarization information, with [1,2,3,4] mapped to [I,Q,U,V]. When we begin dealing with single-polarization images, this will need to be expanded.
telescope		instrument_name	TELESCOP Keyword	This must accommodate images for multiple instruments (e.g. VLA + Single Dish)
file_id			automatically generated	link to information about the physical file
image_id		obs_id	automatically generated	Unique Identifier for the Image
image_units		o_ucd	BTYPE & BUNIT Keywords	Description of the physical quantity measured in the image
max_intensity	image_units			These values can be determined from the data portion of the image file in conjunction with the BSCALE & BZERO keywords. The library used for FITS interaction during ingestion can perform these calculations, but no information about its speed is available yet.
min_intensity	image_units
rms_noise	image_units
thumbnail			weblog, a PNG with identical name
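
The Hz → m conversion left as TBD in the min_frequency comment could be handled along the following lines. This is only a sketch under stated assumptions: the function name spectral_to_obscore and its parameter names are hypothetical, and it assumes the VO fields are reported as vacuum wavelengths in metres, which is how ObsCore defines em_min and em_max. Under that assumption the bounds swap, so the maximum frequency yields em_min and the minimum frequency yields em_max.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum (exact by SI definition)


def spectral_to_obscore(min_frequency_hz, max_frequency_hz, channel_width_hz):
    """Return (em_min, em_max, em_res_power) for one image.

    Hypothetical helper: em_min and em_max are wavelengths in metres,
    em_res_power is the unitless ratio center frequency / channel width.
    """
    if not 0 < min_frequency_hz <= max_frequency_hz:
        raise ValueError("expected 0 < min_frequency_hz <= max_frequency_hz")
    if channel_width_hz == 0:
        raise ValueError("channel_width_hz must be non-zero")

    # Wavelength is inversely proportional to frequency, so the bounds swap.
    em_min = C_M_PER_S / max_frequency_hz
    em_max = C_M_PER_S / min_frequency_hz

    # CDELTn may be negative on a reversed axis; only its magnitude matters here.
    center_frequency_hz = 0.5 * (min_frequency_hz + max_frequency_hz)
    em_res_power = center_frequency_hz / abs(channel_width_hz)
    return em_min, em_max, em_res_power


if __name__ == "__main__":
    # Illustrative values only: a single 2 GHz wide channel centered on 3 GHz.
    print(spectral_to_obscore(2.0e9, 4.0e9, 2.0e9))
```

Whether the archive stores the converted values or converts them only when serving VO queries remains a design decision.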

FITS Data Description Keywords:

For the purpose of generality, FITS provides a detail-independent method of data access. It is easiest to think of the data axis descriptors as groups organized by axis number. The NAXIS keyword provides the total number of dimensions within the data. For the nth dimension of the data, we have a set of descriptor keywords which should be considered and used together:

NAXISn - Total data size along this axis
CRPIXn - Our reference location
CRVALn - The physical value at our reference location
CDELTn - The increment along the axis
CTYPEn - The axis label
CUNITn - The axis units

The CTYPEn and CUNITn values provide information about the axis to which this group of values applies. The rest of the keywords can then be used to calculate points of interest upon that axis. For instance, in axes longer than 1, we have:

Minimum: CRVALn + CDELTn*(1-CRPIXn)

Center: CRVALn + CDELTn*((NAXISn+1)/2 - CRPIXn)

Maximum: CRVALn + CDELTn*(NAXISn - CRPIXn)

For the Frequency axis, which only has a single point (NAXISn=1), our calculations are simpler:

Minimum: CRVALn - CDELTn/2

Center: CRVALn

Maximum: CRVALn + CDELTn/2
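
As an illustration, the axis calculations could be wrapped in a small helper such as the one below. It is a sketch, not the ingestion code: the names axis_value and axis_extent are hypothetical, the axis is assumed to be linear, and the header values are passed in as plain numbers rather than read through any particular FITS library. The center is taken at pixel (NAXISn+1)/2, which assumes the FITS convention of 1-based pixel indices with integer values at pixel centers. Because CDELTn can be negative (as is common for RA), the two ends are sorted before being reported as minimum and maximum.

```python
def axis_value(crval, cdelt, crpix, pixel):
    """World coordinate at a 1-based (possibly fractional) pixel on a linear axis."""
    return crval + cdelt * (pixel - crpix)


def axis_extent(naxis, crval, cdelt, crpix=1.0):
    """Return (minimum, center, maximum) along one linear axis.

    Hypothetical helper. For axes longer than 1 the extremes are the centers
    of the first and last pixels; for a single-point axis they are the two
    edges of that one pixel, half a pixel either side of its center.
    """
    if naxis < 1:
        raise ValueError("naxis must be at least 1")

    if naxis == 1:
        first_pixel, last_pixel = 0.5, 1.5
    else:
        first_pixel, last_pixel = 1.0, float(naxis)

    ends = (
        axis_value(crval, cdelt, crpix, first_pixel),
        axis_value(crval, cdelt, crpix, last_pixel),
    )
    center = axis_value(crval, cdelt, crpix, (naxis + 1) / 2.0)

    # A negative CDELTn reverses the axis, so sort the ends.
    return min(ends), center, max(ends)


if __name__ == "__main__":
    # Illustrative values only: one 2 GHz channel with CRVALn = 3 GHz, CRPIXn = 1.
    print(axis_extent(1, 3.0e9, 2.0e9, 1.0))
```

With CRPIXn = 1 the single-point branch reduces to CRVALn - CDELTn/2, CRVALn, and CRVALn + CDELTn/2.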

Image Sets Database Table:

The Image Set information will need to come from outside sources, as most of the information is not guaranteed to be in the FITS files themselves. Vlass Manager holds all the needed information for their images, but future development will need to provide the relevant metadata as image sources broaden beyond VLASS.

Column Name	VO Equivalent	Source	Comment
image_set_id	obs_id	automatically generated	Unique Identifier for the Image Set
project_code			Required to facilitate Ingestion
configuration			This will need to hold the entire list used for the imaging.
collection_name	obs_collection
calibration_level	calib_level		As defined by the VO in their 0-4 system
product_file_id		automatically generated	Link to the imaging products tar file

VO ObsCore Remaining Fields:

VO Requirement	Value	Source
access_url
access_estsize		files.filesize, or combined value for an image set
dataproduct_type	'image'	default
access_format	'fits'	default
obs_publisher_did		Obtained upon registering with the Virtual Observatory
facility_name	'NRAO'	default
t_resolution		images.image_integration_time
t_xel	1	default
em_xel	1	default
pol_xel	1	default

Thumbnail

Compute the SHA-1 hash of the file. The first two characters of the hex digest become the first-level directory name, the next two characters the second-level directory name, and so on down as many levels as needed (probably three). The file is then placed in the resulting directory.

We then store the path, make the directory tree visible to Apache, and make the path part of the index, so that the front end can show thumbnails.
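
The directory scheme described above might look like the following in Python. This is a minimal sketch: the function names, the default of three levels, and the /thumbs root in the example are assumptions for illustration, not settled design.

```python
import hashlib
from pathlib import Path


def shard_from_digest(digest, filename, root, levels=3):
    """Build root/aa/bb/cc/filename from successive 2-character digest slices."""
    if not 1 <= levels <= len(digest) // 2:
        raise ValueError("levels does not fit within the digest length")
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return Path(root).joinpath(*parts, filename)


def sharded_path(file_path, root, levels=3, chunk_size=1024 * 1024):
    """Hash the file contents with SHA-1 and return its sharded destination path."""
    sha1 = hashlib.sha1()
    with open(file_path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for block in iter(lambda: handle.read(chunk_size), b""):
            sha1.update(block)
    return shard_from_digest(sha1.hexdigest(), Path(file_path).name, root, levels)


if __name__ == "__main__":
    # Hypothetical root directory; the digest is that of empty input.
    digest = hashlib.sha1(b"").hexdigest()
    print(shard_from_digest(digest, "image.png", "/thumbs"))
```

The returned path is what would be stored in the index. Creating the directories and copying the file are left to the caller.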
