Below are two sets of metadata that the new archive will store for Images and the Image Sets to which they belong. Where relevant, each entry lists the equivalent field in the Virtual Observatory ObsCore Data Model, a potential source from which to obtain the data, and a comment. The images table was written specifically with spatial images in mind. Other data products (spectra, time series, etc.) would have header information available at different levels of granularity.
The existing images table covers much of the desired data. The modifications we require are:
- Split the center_position and observing_band fields
- Retain image size data (the VO _xel fields)
- Add fields for spectral resolution, physical data description, and spatial region information
- Determine and acquire time-domain information
Below is the proposed new table structure for Images and Image Products in the metadata database. There are four classes of columns below: initial columns, new columns, columns adapted from (previous_column), and columns renamed to match their equivalents in the observation tables. Depending on the design decisions and the behavior of CASA, some of these fields (t_xel, em_xel, pol_xel) could be removed if they will be invariant.
Images Database Table:
Column Name | Units | VO Equivalent | Source | Comments |
---|---|---|---|---|
sourcename | | target_name | OBJECT keyword | |
ra (center_position) | deg | s_ra | Center of RA axis | |
dec (center_position) | deg | s_dec | Center of Dec axis | |
field_of_view | deg | s_fov | Average of NAXISn*CDELTn of the spatial axes | |
spatial_region | | s_region | Derived from spatial data | STC-S string as defined in TAP (Dowler et al. 2010) |
ra_element_count | | s_xel1 | Relevant NAXISn keyword | |
dec_element_count | | s_xel2 | Relevant NAXISn keyword | |
spatial_resolution | arcsec | s_resolution | Average of CDELTn of the spatial axes, converted from deg | |
starttime | MJD | t_min | Temporal domain information for an image depends on the observation(s) used in its generation. Some controlling process (like Vlass Manager, whatever imaging pipeline we create, and any image manipulation software we use) will need to supply it. | |
endtime | MJD | t_max | | |
exposure_time | s | t_exptime | | |
min_frequency (observing_band) | Hz | em_min | Minimum of spectral axis | The spectral information is recoverable from the FITS header, and will identify the spectral coverage and total bandwidth processed for the image. (TBD: what further information is needed for unit conversion for the VO (Hz → m)?) |
max_frequency (observing_band) | Hz | em_max | Maximum of spectral axis | |
spectral_resolution | unitless | em_res_power | Center frequency / CDELTn | |
polarization_id (reverting to initial name) | | pol_states | Polarization CRVALn value | CASA uses the CRVALn value to convey polarization information, with [1,2,3,4] mapped to [I,Q,U,V]. When we begin dealing with single-polarization images, this will need to be expanded. |
telescope | | instrument_name | TELESCOP keyword | This must accommodate images from multiple instruments (e.g. VLA + Single Dish) |
file_id | | | automatically generated | Link to information about the physical file |
image_id | | obs_id | automatically generated | Unique identifier for the image |
image_units | | o_ucd | BTYPE & BUNIT keywords | Description of the physical quantity measured in the image |
max_intensity | image_units | | These values can be determined from the data portion of the image file in conjunction with the BSCALE & BZERO keywords. The library used for FITS interaction during ingestion can perform these calculations, but no information about the speed of results is yet available. | |
min_intensity | image_units | | | |
rms_noise | image_units | | | |
thumbnail | | | weblog, a PNG with identical name | |
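On the Hz → m question raised in the table: ObsCore expresses em_min and em_max as wavelengths in metres, so the conversion is λ = c/ν, and the bounds swap because the highest frequency corresponds to the shortest wavelength. The following is a minimal sketch of that conversion, not a statement of how ingestion will do it. The function name and the 2 to 4 GHz sample range are illustrative assumptions.

```python
# Speed of light in vacuum in m/s (exact by SI definition).
C_M_PER_S = 299_792_458.0


def frequency_range_to_wavelength(min_hz, max_hz):
    """Return (em_min, em_max) in metres for a frequency range in Hz.

    Hypothetical helper, named here for illustration only.
    """
    if min_hz <= 0 or max_hz < min_hz:
        raise ValueError("expected 0 < min_hz <= max_hz")
    # Wavelength falls as frequency rises, so the bounds swap:
    # max frequency -> shortest wavelength (em_min), and vice versa.
    return C_M_PER_S / max_hz, C_M_PER_S / min_hz


# Illustrative values only: a 2 to 4 GHz band.
em_min, em_max = frequency_range_to_wavelength(2.0e9, 4.0e9)
```

If the database keeps min_frequency and max_frequency in Hz, this conversion could be applied only when the ObsCore view is generated, leaving the stored values untouched.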
FITS Data Description Keywords:
For the purpose of generality, FITS provides a detail-independent method of data access. It's easier to think of the data axis descriptors in groupings by their axis number. The NAXIS keyword provides the total number of dimensions within the data. For the nth dimension of the data, we have a set of descriptor keywords which should be considered and used together:
- NAXISn - Total data size along this axis
- CRPIXn - Our reference location
- CRVALn - The physical value at our reference location
- CDELTn - The increment along the axis
- CTYPEn - The axis label
- CUNITn - The axis units
The CTYPEn and CUNITn values provide information about the axis to which this group of values applies. The rest of the keywords can then be used to calculate points of interest along that axis. For instance, on an axis with more than one point, the values at the first pixel, the midpoint, and the last pixel are:
Minimum: CRVALn + CDELTn*(1 - CRPIXn)
Center: CRVALn + CDELTn*((NAXISn + 1)/2 - CRPIXn)
Maximum: CRVALn + CDELTn*(NAXISn - CRPIXn)
For the Frequency axis, which has only a single point (NAXISn=1), our calculations are simpler:
Minimum: CRVALn - CDELTn/2
Center: CRVALn
Maximum: CRVALn + CDELTn/2
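The keyword arithmetic above can be sketched as a small helper. This is an illustration under stated assumptions, not the ingestion code: the header is treated as a plain mapping of keyword to value, the axis is assumed to be linear (no projection effects), the center is taken as the midpoint between the first and last pixel, and the results are ordered so the smaller value comes first even when CDELTn is negative. The function name and sample headers are hypothetical.

```python
def axis_extent(header, n):
    """Return (minimum, center, maximum) world values along axis n.

    `header` is any mapping of FITS keyword -> value (a plain dict here).
    Assumes a linear axis described by NAXISn, CRPIXn, CRVALn, CDELTn.
    """
    naxis = header[f"NAXIS{n}"]
    crpix = header[f"CRPIX{n}"]
    crval = header[f"CRVAL{n}"]
    cdelt = header[f"CDELT{n}"]

    def world(pixel):
        # World value at a (1-based) pixel position.
        return crval + cdelt * (pixel - crpix)

    if naxis == 1:
        # Single-point axis: report the edges of the one pixel.
        center = world(1)
        half = abs(cdelt) / 2
        return center - half, center, center + half

    first, last = world(1), world(naxis)
    center = world((naxis + 1) / 2)
    # CDELTn may be negative (as is common for RA), so order the ends.
    return min(first, last), center, max(first, last)


# Illustrative headers only; the values are made up for the example.
freq_header = {"NAXIS3": 1, "CRPIX3": 1.0, "CRVAL3": 3.0e9, "CDELT3": 2.0e9}
ra_header = {"NAXIS1": 5, "CRPIX1": 3.0, "CRVAL1": 10.0, "CDELT1": -0.5}

freq_extent = axis_extent(freq_header, 3)
ra_extent = axis_extent(ra_header, 1)
```

With CRPIXn = 1 on a single-point axis, this reduces to the CRVALn ± CDELTn/2 form given above.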
Image Sets Database Table:
The Image Set information will need to come from outside sources, as most of the information is not guaranteed to be in the FITS files themselves. Vlass Manager holds all the needed information for their images, but future development will need to provide the relevant metadata as image sources broaden beyond VLASS.
Column Name | VO Equivalent | Source | Comment |
---|---|---|---|
image_set_id | obs_id | automatically generated | Unique Identifier for the Image Set |
project_code | | | Required to facilitate ingestion |
configuration | | | This will need to hold the entire list used for the imaging. |
collection_name | obs_collection | | |
calibration_level | calib_level | | As defined by the VO in their 0-4 system |
product_file_id | | automatically generated | Link to the imaging products tar file |
VO ObsCore Remaining Fields:
VO Requirement | Value | Source |
---|---|---|
access_url | | |
access_estsize | files.filesize, or combined value for an image set | |
dataproduct_type | 'image' | default |
access_format | 'fits' | default |
obs_publisher_did | Obtained upon registering with the Virtual Observatory | |
facility_name | 'NRAO' | default |
t_resolution | images.image_integration_time | |
t_xel | 1 | default |
em_xel | 1 | default |
pol_xel | 1 | default |
Thumbnail
Compute the SHA-1 of the thumbnail file. The first two characters of the hex digest become the first-level directory name, the next two the second level, and so on, as deep as needed (perhaps three levels). The file is then placed in the resulting directory.
We then store the path, make the directory tree visible to Apache, and make the path part of the index, so that the front end can show thumbnails.
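A minimal sketch of this hash-sharded layout follows, assuming three directory levels and that the thumbnail keeps its original file name inside the final directory. The function names, the `levels` default, and the root directory argument are illustrative assumptions, not decided parts of the design.

```python
import hashlib
import shutil
from pathlib import Path, PurePosixPath


def sharded_path(data, filename, levels=3):
    """Relative path built from the SHA-1 hex digest of `data`.

    Characters 0-1 name the first directory, 2-3 the second, and so on.
    """
    digest = hashlib.sha1(data).hexdigest()
    parts = [digest[2 * i: 2 * i + 2] for i in range(levels)]
    return PurePosixPath(*parts, filename)


def store_thumbnail(root, source, levels=3):
    """Copy `source` under `root` in its sharded directory.

    Returns the relative path, which is what would be stored in the index.
    """
    source = Path(source)
    relative = sharded_path(source.read_bytes(), source.name, levels)
    target = Path(root) / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return relative


# The SHA-1 of empty input begins da39a3..., which gives da/39/a3/.
example = sharded_path(b"", "image.png")
```

Because the directory names come from the digest, files spread roughly evenly across the tree, and the stored relative path is all the front end needs to build a thumbnail URL.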