GSICS File Naming Convention

The GSICS file naming convention follows the rules given in the General File Naming Conventions section of the W.M.O. Manual on the Global Telecommunication System. The general format of a file name is:

pflag_productidentifier_oflag_originator_yyyyMMddhhmmss[_freeformat].type[.compression]

A file name consists of a predetermined combination of fields delimited by the underscore character, except for the last two fields, which are delimited by the period character. Each field can be of variable length, except for the date-time stamp field, whose length is fixed. The order of the fields is mandatory. The fields are not case sensitive: capital letters are allowed, but they are not considered different from their lowercase versions.

Although no maximum length is specified for the entire file name, the mandatory fields shall not exceed 128 characters (including all delimiters) to allow processing by all international systems.
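
As an illustration of the general format, the following is a minimal Python sketch that splits a file name into its fields. The function names split_file_name and mandatory_length are hypothetical. The sketch also assumes that the "mandatory fields" counted against the 128-character limit are everything except the optional freeformat and compression fields; the convention text does not spell this out.

```python
def split_file_name(name):
    """Split a file name into the fields of the general format (sketch)."""
    # The first four underscores separate pflag, productidentifier,
    # oflag, and originator; the remainder is handled below.
    parts = name.split("_", 4)
    if len(parts) != 5:
        raise ValueError("expected five underscore-delimited leading fields")
    pflag, product_id, oflag, originator, rest = parts

    # Periods appear only before the type and compression fields.
    stem, *suffixes = rest.split(".")
    if len(suffixes) not in (1, 2):
        raise ValueError("expected .type with an optional .compression")

    # The stem is the 14-character time stamp, optionally followed by
    # an underscore and the freeformat field.
    timestamp, sep, freeformat = stem.partition("_")
    if len(timestamp) != 14:
        raise ValueError("date-time stamp must be 14 characters long")

    return {
        "pflag": pflag,
        "productidentifier": product_id,
        "oflag": oflag,
        "originator": originator,
        "timestamp": timestamp,
        "freeformat": freeformat if sep else None,
        "type": suffixes[0],
        "compression": suffixes[1] if len(suffixes) == 2 else None,
    }


def mandatory_length(fields):
    """Length of the mandatory part, delimiters included.

    Assumption: the mandatory part is every field except freeformat
    and compression.
    """
    keys = ("pflag", "productidentifier", "oflag", "originator", "timestamp")
    mandatory = "_".join(fields[key] for key in keys)
    return len(mandatory) + 1 + len(fields["type"])  # +1 for the period
```

A caller could then reject a name when mandatory_length(split_file_name(name)) exceeds 128.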

Explanation of how each naming field relates to GSICS files follows:

pflag
Always W.

productidentifier
A variable length field containing information that describes the nature of the data in the file. Its format is:

LocationIndicator,DataDesignator,FreeDescription

In order to facilitate identification of each part in the product identifier field no part can contain a comma symbol. If a part is empty, no character shall be inserted between the relevant delimiters "_" or ",".
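
Because no part may contain a comma, the product identifier can be separated with a plain split. The sketch below is illustrative only; split_product_identifier is a hypothetical name, and an empty part is returned as an empty string.

```python
def split_product_identifier(product_id):
    """Split a productidentifier field into its three parts (sketch)."""
    parts = product_id.split(",")
    if len(parts) != 3:
        raise ValueError("expected LocationIndicator,DataDesignator,FreeDescription")
    location, designator, description = parts
    return {
        "LocationIndicator": location,
        "DataDesignator": designator,
        # An empty part leaves nothing between its delimiters,
        # so it comes back as an empty string.
        "FreeDescription": description,
    }
```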

LocationIndicator
Defines data producer in three fields: country, organization, and production center. Field separator is the "-" symbol. The country is represented by the official ISO 3166 two-letter country code. Recognized location indicators are:

LocationIndicator Data Producer
US-NESDIS-STAR NOAA/NESDIS Center for Satellite Applications and Research (STAR)
XX-EUMETSAT-Darmstadt EUMETSAT
JP-JMA-MSC JMA Meteorological Satellite Center (MSC)
CN-CMA-NSMC CMA National Satellite Meteorological Center
KR-KMA-NMSC KMA National Meteorological Satellite Center
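
A location indicator can be checked against the table above with a small lookup. This is a sketch under the assumption that comparison is case-insensitive, as for the rest of the file name; the names RECOGNIZED_LOCATIONS and split_location_indicator are hypothetical.

```python
# Recognized location indicators, copied from the table above.
RECOGNIZED_LOCATIONS = {
    "US-NESDIS-STAR": "NOAA/NESDIS Center for Satellite Applications and Research (STAR)",
    "XX-EUMETSAT-Darmstadt": "EUMETSAT",
    "JP-JMA-MSC": "JMA Meteorological Satellite Center (MSC)",
    "CN-CMA-NSMC": "CMA National Satellite Meteorological Center",
    "KR-KMA-NMSC": "KMA National Meteorological Satellite Center",
}

# Lowercase keys so that lookups ignore letter case.
_BY_LOWERCASE = {key.lower(): key for key in RECOGNIZED_LOCATIONS}


def split_location_indicator(location):
    """Return (country, organization, center), or raise ValueError (sketch)."""
    parts = location.split("-")
    if len(parts) != 3:
        raise ValueError("expected country-organization-center")
    country, organization, center = parts
    if len(country) != 2 or not (country.isascii() and country.isalpha()):
        raise ValueError("country must be a two-letter code")
    return country, organization, center


def is_recognized_location(location):
    """True if the indicator matches a table entry, ignoring case."""
    return location.lower() in _BY_LOWERCASE
```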

DataDesignator
Describes the type of data in the file. Further explanation is given in a separate section.

FreeDescription
Identifies the platforms and instruments from which the data came or to which the data applies. The usage of this field is further explained in a separate section.

oflag
Always C. The originator field will be decoded as the standard international four-letter location indicator (CCCC).

originator
International four-letter location indicator (CCCC) of the station or centre originating the file, as agreed internationally and published in W.M.O. No. 9, Volume C1, Catalogue of Meteorological Bulletins.

Recognized originator codes are:
CCCC Code Originator
EUMS EUMETSAT, Darmstadt, Germany
EUMG
EUMP
KNES NOAA/NESDIS, College Park, Md., U.S.A.
RJTD JMA, Tokyo, Japan
RKSL KMA, Seoul, Korea
BABJ CMA, Beijing, China

yyyyMMddhhmmss
A fixed-length UTC date and time stamp field representing a data-related event, for example collection, observation, or processing time. If a particular part of the date and time stamp is not specified, each of its characters must be replaced by a "-" (minus) character. For example, a file containing data for one entire day is denoted as "YYYYMMDD------".
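
The placeholder rule can be sketched in Python as follows. The function name parse_timestamp is hypothetical, and the sketch assumes that a component (year, month, day, hour, minute, or second) is either all digits or all minus characters; the convention text does not say whether partially dashed components are allowed.

```python
# Component names and widths of the yyyyMMddhhmmss field.
COMPONENTS = (
    ("year", 4), ("month", 2), ("day", 2),
    ("hour", 2), ("minute", 2), ("second", 2),
)


def parse_timestamp(stamp):
    """Parse a 14-character stamp; unspecified components become None."""
    if len(stamp) != 14:
        raise ValueError("date-time stamp must be 14 characters long")
    result = {}
    position = 0
    for name, width in COMPONENTS:
        chunk = stamp[position:position + width]
        position += width
        if chunk == "-" * width:
            result[name] = None          # unspecified component
        elif chunk.isascii() and chunk.isdigit():
            result[name] = int(chunk)
        else:
            raise ValueError(f"invalid {name} component: {chunk!r}")
    return result
```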

freeformat (optional)
A variable length field consisting of sub-fields divided by the underscore ("_"). The recognized sub-fields are:

YYYYMMDDHHMMSS_..._distphase_version

where "..." indicates possible, not yet specified sub-fields. Order of the sub-fields is important.

YYYYMMDDHHMMSS (optional)
A date-time sub-field in the full 14-digit format will be interpreted as the end time, in UTC, of data collection, observation, or processing. If the YYYYMMDDHHMMSS sub-field is present, then the yyyyMMddhhmmss field will be interpreted as denoting the beginning of that time interval. This sub-field must appear first in the freeformat field.

distphase (optional; never used in the Operational phase)
Indicates the distribution phase of the file's data. Only GSICS product files can use this sub-field. There are two possible values: "demo" for the Demonstration phase, and "preop" for the Preoperational phase. Files in the Operational distribution phase cannot have this sub-field in their names.

version (optional)
Version identifier of file's data. Only the major version number is used in the two-digit format. This sub-field must appear last in the freeformat field.
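
The sub-field rules above can be sketched as a small parser. The name parse_freeformat is hypothetical, and the sketch assumes that distphase, when present, directly precedes the version (or is last when there is no version), as the pattern YYYYMMDDHHMMSS_..._distphase_version suggests. Any not yet specified sub-fields are returned untouched.

```python
def parse_freeformat(freeformat):
    """Split a freeformat field into its recognized sub-fields (sketch)."""
    result = {"end_time": None, "distphase": None, "version": None, "other": []}
    subfields = [s for s in freeformat.split("_") if s]

    def is_digits(text, width):
        return len(text) == width and text.isascii() and text.isdigit()

    # The end time, if present, comes first and has exactly 14 digits.
    if subfields and is_digits(subfields[0], 14):
        result["end_time"] = subfields.pop(0)

    # The two-digit major version, if present, comes last.
    if subfields and is_digits(subfields[-1], 2):
        result["version"] = int(subfields.pop())

    # Assumed position: distphase sits just before the version.
    if subfields and subfields[-1].lower() in ("demo", "preop"):
        result["distphase"] = subfields.pop().lower()

    # Whatever remains corresponds to the "..." in the pattern.
    result["other"] = subfields
    return result
```

A result with distphase equal to None would correspond either to a file in the Operational phase or to a file that is not a GSICS product.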

type
Describes the general format type of the file. Important types for GSICS files are mentioned below:
Type File Format
nc Network Common Data Format (netCDF)
met Metadata file describing the content and format of the corresponding file of the same name
png Portable Network Graphics

compression (optional)
Identifies file compression method.
compression Field Meaning
zip Compressed with the UNIX zip or Windows WinZip
gz Compressed with the gzip tool
bz2 Compressed with the bzip2 tool

File Name Character Set

Allowed characters to use for composing file names are:
  • "a" through "z" (ASCII lowercase letters)
  • "A" through "Z" (ASCII uppercase letters)
  • "0" through "9" (ASCII numerical digits)
  • underscore ("_")
  • minus ("-")
  • plus ("+")
  • period (".")
  • comma (",")
The underscore is the field delimiter. It is accepted in the freeformat field but not in any other field. The minus is used as the field delimiter inside the LocationIndicator and FreeDescription fields of the productidentifier field, and as the placeholder for unspecified parts of the date and time stamp field. The plus is used to concatenate several words in the productidentifier field. The period is to be used as a delimiter only before the type and compression fields. The comma is used as a field delimiter in the productidentifier field. It can also be used in the freeformat field.
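
The allowed character set translates directly into a regular expression. The sketch below checks only the characters, not the placement rules for each delimiter; the function name has_only_allowed_characters is hypothetical.

```python
import re

# Letters, digits, underscore, minus, plus, period, and comma.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9_+.,-]+")


def has_only_allowed_characters(name):
    """True if every character of a non-empty name is in the allowed set."""
    return ALLOWED_NAME.fullmatch(name) is not None
```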

Data Designators

A data designator consists of up to three fields delimited by the plus character (+): data category, international data subcategory, and local data subcategory. The format is:

DataCategory+InternationalDataSubcategory[+LocalDataSubcategory]

Data category declares the general type of the data while the international data subcategory provides more specific description of the data. Both of these designators are mandatory. Local data subcategory is optional and can only be used if both data category and international data subcategory are present.

Data categories and international data subcategories are defined in the Common Table C-13 of the W.M.O. Manual on Codes. Each data producer is free to specify its unique local data subcategories for a given pair of data category and international data subcategory.

The data category for all GSICS files is Calibration dataset (satellite), code figure 30. Available GSICS data designators are described in the following table:

Common Table C-13 Data Category
Alphanum. Code Name Code Figure
SATCAL Calibration dataset (satellite) 30

Common Table C-13 International Data Subcategory
Alphanum. Code Name Code Figure
SUBSET Subsetted data 0
COLLOC Collocated data 1
OBC On-board calibration data 2
BIASM Bias Monitoring (deprecated) 3
NRTC Near Real-Time Correction 4
RAC Re-analysis Correction 5

In order to differentiate the content of GSICS files coming from the same monitored and reference instruments, the local data subcategory will be used with the following values:

Alphanum. Code Name Code Figure
GEOLEOIR GEO-LEO-IR algorithm data 1
LEOLEOIR LEO-LEO-IR algorithm data 2
GEOLEOVNIR GEO-LEO-VISNIR algorithm data 3
LEOLEOVNIR LEO-LEO-VISNIR algorithm data 4
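
The designator tables above can be combined into a small lookup, sketched here in Python. The name parse_data_designator is hypothetical. Because each data producer may define its own local data subcategories, an unknown local code is kept and given no code figure rather than rejected; that choice is an assumption of this sketch.

```python
# Code figures copied from the tables above.
CATEGORIES = {"SATCAL": 30}
INTERNATIONAL = {
    "SUBSET": 0, "COLLOC": 1, "OBC": 2,
    "BIASM": 3, "NRTC": 4, "RAC": 5,
}
LOCAL = {"GEOLEOIR": 1, "LEOLEOIR": 2, "GEOLEOVNIR": 3, "LEOLEOVNIR": 4}


def parse_data_designator(designator):
    """Map a data designator to (code, code figure) pairs (sketch)."""
    # Fields are not case sensitive, so compare in uppercase.
    parts = designator.upper().split("+")
    if len(parts) not in (2, 3):
        raise ValueError("expected two or three '+'-delimited fields")
    category, international = parts[0], parts[1]
    local = parts[2] if len(parts) == 3 else None
    if category not in CATEGORIES:
        raise ValueError(f"unknown data category: {category}")
    if international not in INTERNATIONAL:
        raise ValueError(f"unknown international subcategory: {international}")
    return {
        "category": (category, CATEGORIES[category]),
        "international": (international, INTERNATIONAL[international]),
        # Producer-specific local codes get None as their code figure.
        "local": (local, LOCAL.get(local)) if local else None,
    }
```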

FreeDescription Field

This field refers either to the sources of the file's data or to what the file's data applies to. Typically it consists of satellite and instrument names. The general form of the field is:

PLATFORM+INSTRUMENT[-PLATFORM+INSTRUMENT]...

Each PLATFORM+INSTRUMENT pair identifies a platform and one of its on-board instruments. Both platform and instrument names can consist only of ASCII letters and digits. Any other character should be ignored when forming PLATFORM and INSTRUMENT identifiers. Examples of platform-instrument pairs are: GOES11+Imager, Aqua+AIRS, MetopA+IASI, or Terra+MODIS. Note the case of letters does not matter.
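
The rule that only ASCII letters and digits are kept can be sketched as follows. Both function names are hypothetical, and the spelled-out input names such as "GOES-11" are used only for illustration.

```python
import re


def normalize_name(name):
    """Drop every character that is not an ASCII letter or digit."""
    return re.sub(r"[^A-Za-z0-9]", "", name)


def make_pair(platform, instrument):
    """Form a PLATFORM+INSTRUMENT pair from free-form names."""
    return normalize_name(platform) + "+" + normalize_name(instrument)
```

For example, make_pair("GOES-11", "Imager") yields "GOES11+Imager".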

Although file naming is not case sensitive, it is recommended to use the following reference satellite and instrument names, which are those used by the owner agency or in WMO OSCAR. (There is no need to change the file names of existing GSICS Corrections.)

Satellite Instrument
MetopA, MetopB, MetopC IASI
Aqua AIRS, MODIS
SNPP CrIS, VIIRS
When forming a FreeDescription field either platform or instrument name can be omitted from a pair if that will not introduce any ambiguity. For example, a complete FreeDescription field GOES11+Imager-Aqua+AIRS can be shortened to GOES11+Imager-AIRS because there is only one AIRS instrument. Another example is when having two platforms with the same instrument: NOAA18+HIRS-NOAA19+HIRS can be shortened to NOAA18-NOAA19+HIRS.
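
A FreeDescription field, full or shortened, can be taken apart with two splits, as in the sketch below. The function names are hypothetical. Note that a shortened entry with a single name cannot be classified as a platform or an instrument from the text alone; doing so needs outside knowledge of the names involved, which this sketch does not attempt.

```python
def split_free_description(description):
    """Split a FreeDescription field into entries of one or two names."""
    # "-" separates entries; "+" separates platform from instrument.
    return [entry.split("+") for entry in description.split("-")]


def same_free_description(first, second):
    """Compare two fields while ignoring letter case."""
    return first.lower() == second.lower()
```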

Examples

The following file name examples illustrate certain features of the convention:

W_US-NESDIS-STAR,SATCAL+COLLOC+GEOLEOIR,GOES12+Imager-AIRS_C_KNES_20090713------.nc
A file from the U.S. NOAA/NESDIS/STAR containing collocated data between the GOES-12 Imager and the Aqua AIRS according to the GEO-LEO-IR ATBD for one full day on 2009-07-13.

W_JP-JMA-MSC,SATCAL+NRTC+GEOLEOIR,MTSAT1R+JAMI-MetopA+IASI_C_RJTD_20090814000000_demo_01.nc
A file from the JMA Meteorological Satellite Center containing near real-time inter-calibration correction coefficients for the MTSAT-1R JAMI derived from comparisons with the Metop-A IASI according to the GEO-LEO-IR algorithm. The coefficients' validity period starts at 2009-08-14 00:00:00Z. This file's data is in the Demonstration phase and its major version number is 1.
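
Going in the other direction, file names like the two examples above can be assembled from their parts. This is a minimal sketch with a hypothetical function name and parameter names; it performs no validation of its inputs.

```python
def build_file_name(location, designator, description, originator,
                    timestamp, freeformat=None, file_type="nc",
                    compression=None):
    """Assemble a file name from its fields (sketch, no validation)."""
    product_id = ",".join([location, designator, description])
    # pflag is always W and oflag is always C.
    name = "_".join(["W", product_id, "C", originator, timestamp])
    if freeformat:
        name += "_" + freeformat
    name += "." + file_type
    if compression:
        name += "." + compression
    return name


# Rebuilds the second example above.
example = build_file_name(
    "JP-JMA-MSC", "SATCAL+NRTC+GEOLEOIR", "MTSAT1R+JAMI-MetopA+IASI",
    "RJTD", "20090814000000", freeformat="demo_01",
)
```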

Clarification on instrument and platform names (ref. email from Peter Miu): Following the precedent set by KMA's AMI-CrIS product, it is recommended that CrIS be written as CRIS, even though the instrument name is not case sensitive. In principle, agencies have to reach a consensus on how they wish to handle the letter case of platform and instrument names in the product file name.

--------------------------------------------------------------------------------
On Wed, Feb 17, 2021 at 8:04 AM Peter Miu <Peter.Miu@eumetsat.int> wrote:

Hi,

The part of the filename you are referring to is the FreeDescription field of the WMO file naming convention, and this is not case sensitive.

The GDWG members have agreed to use this field to identify the MonitoringSatellite +Instrument-ReferenceSatellite+Instrument for the GSICS inter-calibration product.

The current GDWG agreement is it is not case sensitive so it is up to the product producer to decide as long as it is consistent in the generation of the products in the whole data collection (see the MetOp products where the filename is “W_XX-EUMETSAT-Darmstadt,SATCAL+RAC+GEOLEOIR,MSG1+SEVIRI-MetOpB+IASI_C_EUMG_20150601000000_01.nc”).

From the usability stand point, it would be a good approach/guide if all GPRC should use the same format/case for this part of the filename in the products they product.

References can be found here: http://gsics.atmos.umd.edu/bin/view/Development/FilenameConvention

Regards,

Pete.
Topic revision: r73 - 03 Mar 2021, ManikBali