GSICS File Naming Convention
The GSICS file naming convention follows the rules given in the
General File Naming Conventions section of the
W.M.O. Manual on the Global Telecommunication System. The general format of a file name is:
pflag_productidentifier_oflag_originator_yyyyMMddhhmmss[_freeformat].type[.compression]
A file name consists of a predetermined combination of fields delimited by the underscore character, except for the last two fields, which are delimited by the period character. Each field can be of variable length, except for the date-time stamp field, whose length is fixed. The order of the fields is mandatory. The fields are not case sensitive: capital letters are allowed, but they are not considered different from their lowercase versions.
Although no maximum length is specified for the entire file name, the mandatory fields shall not exceed 128 characters (including all delimiters) to allow processing by all international systems.
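The delimiter rules above can be sketched as a small splitter. This is only an illustration, not part of the convention: the names `GsicsFileName` and `split_file_name` are hypothetical, and the function performs no semantic checks beyond the field count and the fixed 14-character date-time stamp.

```python
from typing import NamedTuple, Optional


class GsicsFileName(NamedTuple):
    """Top-level fields of a file name (hypothetical container)."""
    pflag: str
    productidentifier: str
    oflag: str
    originator: str
    timestamp: str
    freeformat: Optional[str]
    file_type: str
    compression: Optional[str]


def split_file_name(name: str) -> GsicsFileName:
    """Split a file name into its top-level fields without interpreting them."""
    # The period delimits only the type and compression fields.
    stem, *extensions = name.split(".")
    if not 1 <= len(extensions) <= 2:
        raise ValueError("expected .type[.compression]")
    # Only freeformat may contain underscores, so split at most five times.
    parts = stem.split("_", 5)
    if len(parts) < 5:
        raise ValueError("expected at least five underscore-delimited fields")
    if len(parts[4]) != 14:
        raise ValueError("the date-time stamp must be exactly 14 characters")
    freeformat = parts[5] if len(parts) == 6 else None
    compression = extensions[1] if len(extensions) == 2 else None
    return GsicsFileName(*parts[:5], freeformat, extensions[0], compression)


fields = split_file_name(
    "W_JP-JMA-MSC,SATCAL+NRTC+GEOLEOIR,MTSAT1R+JAMI-MetopA+IASI"
    "_C_RJTD_20090814000000_demo_01.nc"
)
print(fields.originator, fields.timestamp, fields.freeformat)
```

Because the fields are not case sensitive, a caller that compares field values would normally fold case first; the sketch leaves the original spelling untouched.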
Explanation of how each naming field relates to GSICS files follows:
-
pflag
- Always
W
.
-
productidentifier
- A variable length field containing information that describes the nature of the data in the file. Its format is:
LocationIndicator,DataDesignator,FreeDescription
In order to facilitate identification of each part in the product identifier field no part can contain a comma symbol. If a part is empty, no character shall be inserted between the relevant delimiters "_" or ",".
-
-
LocationIndicator
- Defines data producer in three fields: country, organization, and production center. Field separator is the "-" symbol. The country is represented by the official ISO 3166 two-letter country code. Recognized location indicators are:
LocationIndicator | Data Producer |
US-NESDIS-STAR | NOAA/NESDIS Center for Satellite Applications and Research (STAR) |
XX-EUMETSAT-Darmstadt | EUMETSAT |
JP-JMA-MSC | JMA Meteorological Satellite Center (MSC) |
CN-CMA-NSMC | CMA National Satellite Meteorological Center |
KR-KMA-NMSC | KMA National Meteorological Satellite Center |
-
-
DataDesignator
- Describes the type of data in the file. Further explanation is given in a separate section.
-
-
FreeDescription
- Identifies the platforms and instruments from which the data came or to which the data applies. The usage of this field is further explained in a separate section.
-
oflag
- Always
C
. The originator
field will be decoded as the standard international four-letter location indicator (CCCC).
-
originator
- International four-letter location indicator (CCCC) of the station or centre originating the file as agreed internationally and published in the W.M.O. No. 9, Volume C1, Catalogue of Meteorological Bulletins .
Recognized originator codes are:
CCCC Code | Originator |
EUMS | EUMETSAT, Darmstadt, Germany |
EUMG | |
EUMP | |
KNES | NOAA/NESDIS, College Park, Md., U.S.A. |
RJTD | JMA, Tokyo, Japan |
RKSL | KMA, Seoul, Korea |
BABJ | CMA, Beijing, China |
-
yyyyMMddhhmmss
- A fixed-length UTC date and time stamp field representing a data-related event, for example collection, observation, or processing time. If a particular component of the date and time stamp is not specified, each of its characters must be replaced by a "-" (minus) character. For example, a file that contains data for one entire day is denoted as
"YYYYMMDD------"
.
-
freeformat
(optional) - A variable length field consisting of sub-fields divided by the underscore (
"_"
). The recognized sub-fields are:
YYYYMMDDHHMMSS_..._distphase_version
where "..."
indicates possible, not yet specified sub-fields. Order of the sub-fields is important.
-
-
YYYYMMDDHHMMSS
(optional) - A date-time sub-field in the full 14-digit format will be interpreted as end time in UTC related to data collection, observation, or processing. If the
YYYYMMDDHHMMSS
sub-field is present then the yyyyMMddhhmmss
field will be interpreted as denoting the beginning of that time interval. This sub-field must appear first in the freeformat
field.
-
-
distphase
(omitted for the Operational phase only) - Indicates the distribution phase of the file's data. Only GSICS product files can use this sub-field. Two possible values:
"demo"
for the Demonstration phase, and "preop"
for the Preoperational phase. Files in the Operational distribution phase cannot have this sub-field in their names.
-
-
version
(optional) - Version identifier of file's data. Only the major version number is used in the two-digit format. This sub-field must appear last in the
freeformat
field.
-
type
- Describes the general format type of the file. Important types for GSICS files are mentioned below:
Type | File Format |
nc | Network Common Data Format (netCDF) |
met | Metadata file describing the content and format of the corresponding file of the same name |
png | Portable Network Graphics |
-
compression
(optional) - Identifies file compression method.
compression Field | Meaning |
zip | Compressed with the UNIX zip or Windows WinZip |
gz | Compressed with the gzip tool |
bz2 | Compressed with the bzip2 tool |
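As an illustration of the date-time stamp and freeformat rules above, the following sketch decodes both fields. The function names and the returned dictionary keys are hypothetical. It assumes that an unspecified component is filled entirely with minus characters, as in the one-day example, and that a trailing two-digit sub-field is the version.

```python
import re
from typing import Dict, List, Optional, Union

# Component names and widths of the 14-character yyyyMMddhhmmss stamp.
_COMPONENTS = (("year", 4), ("month", 2), ("day", 2),
               ("hour", 2), ("minute", 2), ("second", 2))


def parse_timestamp(stamp: str) -> Dict[str, Optional[int]]:
    """Decode the stamp; unspecified components come back as None."""
    if len(stamp) != 14:
        raise ValueError("the date-time stamp must be exactly 14 characters")
    result: Dict[str, Optional[int]] = {}
    pos = 0
    for name, width in _COMPONENTS:
        chunk = stamp[pos:pos + width]
        pos += width
        if chunk == "-" * width:
            result[name] = None          # component not specified
        elif chunk.isascii() and chunk.isdigit():
            result[name] = int(chunk)
        else:
            raise ValueError(f"invalid {name} component: {chunk!r}")
    return result


def parse_freeformat(freeformat: Optional[str]) -> Dict[str, Union[None, int, str, List[str]]]:
    """Pick out the recognized sub-fields: end time first, version last."""
    subfields = freeformat.split("_") if freeformat else []
    info: Dict[str, Union[None, int, str, List[str]]] = {
        "end_time": None, "distphase": None, "version": None, "other": []}
    if subfields and re.fullmatch(r"[0-9]{14}", subfields[0]):
        info["end_time"] = subfields.pop(0)
    if subfields and re.fullmatch(r"[0-9]{2}", subfields[-1]):
        info["version"] = int(subfields.pop())
    # distphase directly precedes the version; it is absent for Operational files.
    if subfields and subfields[-1].lower() in ("demo", "preop"):
        info["distphase"] = subfields.pop().lower()
    info["other"] = subfields            # not yet specified sub-fields
    return info


print(parse_timestamp("20090713------"))
print(parse_freeformat("demo_01"))
```

A `distphase` of `None` means only that the sub-field is absent. For a GSICS product file this corresponds to the Operational phase, but the sketch does not try to decide whether a file is a product file.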
File Name Character Set
Allowed characters to use for composing file names are:
- "a" through "z" (ASCII lowercase letters)
- "A" through "Z" (ASCII uppercase letters)
- "0" through "9" (ASCII numerical digits)
- underscore ("_")
- minus ("-")
- plus ("+")
- period (".")
- comma (",")
The underscore is the field delimiter. It is accepted in the
freeformat
field but not in any other field. The minus is used only as the field delimiter inside the
LocationIndicator
and
FreeDescription
fields of the
productidentifier
field. The plus is used to concatenate several words in the
productidentifier
field. The period is to be used as a delimiter only before the
type
and
compression
fields. The comma is used as a field delimiter in the
productidentifier
field. It can also be used in the
freeformat
field.
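The character set above can be checked with a single regular expression. This is a minimal sketch with hypothetical function names; it validates only the characters, not where each delimiter is allowed to appear.

```python
import re

# Letters, digits, underscore, plus, period, comma and minus (minus last in the class).
_ALLOWED = re.compile(r"[A-Za-z0-9_+.,-]+")


def has_only_allowed_characters(name: str) -> bool:
    """Return True if every character of the file name is in the allowed set."""
    return _ALLOWED.fullmatch(name) is not None


def disallowed_characters(name: str) -> list:
    """List the distinct characters that fall outside the allowed set."""
    return sorted({ch for ch in name if not _ALLOWED.fullmatch(ch)})


print(has_only_allowed_characters(
    "W_US-NESDIS-STAR,SATCAL+COLLOC+GEOLEOIR,GOES12+Imager-AIRS_C_KNES_20090713------.nc"))
print(disallowed_characters("GOES 12/Imager"))
```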
Data Designators
A data designator consists of up to three fields delimited by the plus character (+): data category, international data subcategory, and local data subcategory. The format is:
DataCategory+InternationalDataSubcategory[+LocalDataSubcategory]
Data category declares the general type of the data while the international data subcategory provides more specific description of the data. Both of these designators are mandatory. Local data subcategory is optional and can only be used if both data category and international data subcategory are present.
Data categories and international data subcategories are defined in the
Common Table C-13 of the
W.M.O. Manual on Codes. Each data producer is free to specify its own local data subcategories for a given pair of data category and international data subcategory.
The data category for all GSICS files is
Calibration dataset (satellite), code figure 30. Available GSICS data designators are described in the following table:
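Common Table C-13 Data Category | | | Common Table C-13 International Data Subcategory | | |
Alphanum. Code | Name | Code Figure | Alphanum. Code | Name | Code Figure |
SATCAL | Calibration dataset (satellite) | 30 | SUBSET | Subsetted data | 0 |
SATCAL | Calibration dataset (satellite) | 30 | COLLOC | Collocated data | 1 |
SATCAL | Calibration dataset (satellite) | 30 | OBC | On-board calibration data | 2 |
SATCAL | Calibration dataset (satellite) | 30 | BIASM | Bias Monitoring (deprecated) | 3 |
SATCAL | Calibration dataset (satellite) | 30 | NRTC | Near Real-Time Correction | 4 |
SATCAL | Calibration dataset (satellite) | 30 | RAC | Re-analysis Correction | 5 |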
To differentiate the content of GSICS files that come from the same monitored and reference instruments, the local data subcategory is used with the following values:
Alphanum. Code | Name | Code Figure |
GEOLEOIR | GEO-LEO-IR algorithm data | 1 |
LEOLEOIR | LEO-LEO-IR algorithm data | 2 |
GEOLEOVNIR | GEO-LEO-VISNIR algorithm data | 3 |
LEOLEOVNIR | LEO-LEO-VISNIR algorithm data | 4 |
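The designator format and the code tables above can be combined into a small lookup, sketched below. The dictionary and function names are hypothetical, and the tables contain only the values listed in this section. Because each data producer may define its own local data subcategories, an unknown local code is reported as `None` rather than rejected.

```python
from typing import Optional, Tuple

# Code figures copied from the tables in this section.
DATA_CATEGORIES = {"SATCAL": 30}
INTERNATIONAL_SUBCATEGORIES = {
    "SUBSET": 0, "COLLOC": 1, "OBC": 2, "BIASM": 3, "NRTC": 4, "RAC": 5}
LOCAL_SUBCATEGORIES = {
    "GEOLEOIR": 1, "LEOLEOIR": 2, "GEOLEOVNIR": 3, "LEOLEOVNIR": 4}


def parse_data_designator(designator: str) -> Tuple[int, int, Optional[int]]:
    """Return the code figures (category, international, local) of a designator."""
    # Fields are not case sensitive, so compare in upper case.
    parts = designator.upper().split("+")
    if not 2 <= len(parts) <= 3:
        raise ValueError("expected DataCategory+InternationalDataSubcategory"
                         "[+LocalDataSubcategory]")
    category, international = parts[0], parts[1]
    if category not in DATA_CATEGORIES:
        raise ValueError(f"unknown data category: {category}")
    if international not in INTERNATIONAL_SUBCATEGORIES:
        raise ValueError(f"unknown international data subcategory: {international}")
    # Producers may define further local codes, so unknown ones map to None.
    local = LOCAL_SUBCATEGORIES.get(parts[2]) if len(parts) == 3 else None
    return DATA_CATEGORIES[category], INTERNATIONAL_SUBCATEGORIES[international], local


print(parse_data_designator("SATCAL+NRTC+GEOLEOIR"))
```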
FreeDescription Field
This field identifies either the sources of the file's data or the platforms and instruments to which the data applies. Typically it consists of satellite and instrument names. The general form of the field is:
PLATFORM+INSTRUMENT[-PLATFORM+INSTRUMENT]...
Each
PLATFORM+INSTRUMENT
pair identifies a platform and one of its on-board instruments. Both platform and instrument names can consist only of
ASCII letters and digits. Any other character should be ignored when forming
PLATFORM
and
INSTRUMENT
identifiers. Examples of platform-instrument pairs are:
GOES11+Imager
,
Aqua+AIRS
,
MetopA+IASI
, or
Terra+MODIS
. Note the case of letters does not matter.
It is recommended to use the reference satellite and instrument names as used by the owner agency or in WMO OSCAR, even though file naming is not case sensitive. (There is no need to change the file names of existing GSICS Corrections.)
When forming a
FreeDescription
field either platform or instrument name can be omitted from a pair if that will not introduce any ambiguity. For example, a complete
FreeDescription
field
GOES11+Imager-Aqua+AIRS
can be shortened to
GOES11+Imager-AIRS
because there is only one AIRS instrument. Another example is when having two platforms with the same instrument:
NOAA18+HIRS-NOAA19+HIRS
can be shortened to
NOAA18-NOAA19+HIRS
.
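The formation rules above can be sketched as follows, using hypothetical helper names. `make_identifier` drops every character that is not an ASCII letter or digit, and `split_free_description` separates the items without guessing whether a lone name in a shortened field is a platform or an instrument, since that cannot be decided from the text alone.

```python
import re
from typing import Iterable, List, Tuple


def make_identifier(name: str) -> str:
    """Keep only ASCII letters and digits, e.g. 'GOES-11' becomes 'GOES11'."""
    return re.sub(r"[^A-Za-z0-9]", "", name)


def make_free_description(pairs: Iterable[Tuple[str, str]]) -> str:
    """Build the complete (unshortened) field from (platform, instrument) pairs."""
    return "-".join(
        f"{make_identifier(platform)}+{make_identifier(instrument)}"
        for platform, instrument in pairs)


def split_free_description(field: str) -> List[Tuple[str, ...]]:
    """Split the field into items; a shortened item yields a 1-tuple."""
    return [tuple(item.split("+")) for item in field.split("-")]


print(make_free_description([("GOES-11", "Imager"), ("Aqua", "AIRS")]))
print(split_free_description("NOAA18-NOAA19+HIRS"))
```

Resolving a 1-tuple such as `("AIRS",)` or `("NOAA18",)` requires knowledge of the actual platforms and instruments, which is outside the scope of this sketch.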
Examples
The following file name examples illustrate certain features of the convention:
-
W_US-NESDIS-STAR,SATCAL+COLLOC+GEOLEOIR,GOES12+Imager-AIRS_C_KNES_20090713------.nc
- A file from the U.S. NOAA/NESDIS/STAR containing collocated data between the GOES-12 Imager and the Aqua AIRS according to the GEO-LEO-IR ATBD for one full day on 2009-07-13.
-
W_JP-JMA-MSC,SATCAL+NRTC+GEOLEOIR,MTSAT1R+JAMI-MetopA+IASI_C_RJTD_20090814000000_demo_01.nc
- A file from the JMA Meteorological Satellite Center containing near real-time inter-calibration correction coefficients for the MTSAT-1R JAMI derived from comparisons with the Metop-A IASI according to the GEO-LEO-IR algorithm. The coefficients' validity period starts at 2009-08-14 00:00:00Z. This file's data is in the Demonstration phase and its major version number is 1.
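The two examples above can be reproduced with a small builder, sketched here under stated assumptions. The function name, its parameters, and the `"nc"` default are hypothetical. The 128-character check counts the five underscore-delimited mandatory fields plus the `.type` suffix, which is one possible reading of the length rule.

```python
from typing import Optional

MAX_MANDATORY_LENGTH = 128  # limit on the mandatory fields, delimiters included


def build_file_name(location: str, designator: str, description: str,
                    originator: str, timestamp: str,
                    freeformat: Optional[str] = None,
                    file_type: str = "nc",
                    compression: Optional[str] = None) -> str:
    """Assemble a file name from already formatted field values."""
    for part in (location, designator, description):
        if "," in part:
            raise ValueError("product identifier parts cannot contain a comma")
    if len(timestamp) != 14:
        raise ValueError("the date-time stamp must be exactly 14 characters")
    product_identifier = ",".join((location, designator, description))
    # pflag is always W and oflag is always C.
    mandatory = "_".join(("W", product_identifier, "C", originator, timestamp))
    # Assumption: the limit covers these fields plus ".type".
    if len(mandatory) + 1 + len(file_type) > MAX_MANDATORY_LENGTH:
        raise ValueError("mandatory fields exceed 128 characters")
    name = mandatory
    if freeformat:
        name += "_" + freeformat
    name += "." + file_type
    if compression:
        name += "." + compression
    return name


print(build_file_name("JP-JMA-MSC", "SATCAL+NRTC+GEOLEOIR",
                      "MTSAT1R+JAMI-MetopA+IASI", "RJTD",
                      "20090814000000", freeformat="demo_01"))
```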
Clarification on instrument and platform names (ref. email from Peter Miu): Following the precedent set by KMA's AMI-CrIS product, it is recommended that CrIS be written as CRIS, even though the instrument name is not case sensitive. In principle, agencies have to reach a consensus on how to handle the case of platform and instrument names in the product file name.
-------------------------------------------------------------------------------------------------------------------------------------
On Wed, Feb 17, 2021 at 8:04 AM Peter Miu <Peter.Miu@eumetsat.int> wrote:
Hi,
The part of the filename you are referring to is the FreeDescription field of the WMO file naming convention, and this is not case sensitive.
The GDWG members have agreed to use this field to identify the MonitoringSatellite+Instrument-ReferenceSatellite+Instrument for the GSICS inter-calibration product.
The current GDWG agreement is it is not case sensitive so it is up to the product producer to decide as long as it is consistent in the generation of the products in the whole data collection (see the MetOp products where the filename is “W_XX-EUMETSAT-Darmstadt,SATCAL+RAC+GEOLEOIR,MSG1+SEVIRI-MetOpB+IASI_C_EUMG_20150601000000_01.nc”).
From the usability stand point, it would be a good approach/guide if all GPRC should use the same format/case for this part of the filename in the products they product.
References can be found here:
http://gsics.atmos.umd.edu/bin/view/Development/FilenameConvention
Regards,
Pete.