Availability: ncap2, ncbo, ncea, ncflint, ncpdq, ncra, ncwa
Short options: None
The phrase missing data refers to data points that are missing, invalid, or for any reason not intended to be arithmetically processed in the same fashion as valid data. The NCO arithmetic operators attempt to handle missing data in an intelligent fashion. There are four steps in the NCO treatment of missing data:
NCO follows the convention that missing data should be stored
as the value specified in the variable's _FillValue
attribute.
The only way NCO recognizes that a variable may
contain missing data is if the variable has a _FillValue
attribute.
In this case, any elements of the variable which are numerically equal
to the _FillValue are treated as missing data.
With version 3.9.2 (August, 2007), NCO adopted _FillValue as the
default name of the attribute, if any, assumed to specify the value
of data to ignore.
Prior to that, the missing_value
attribute, if any, was assumed to
specify the value of data to ignore.
Supporting both of these attributes simultaneously is not practical.
Hence the behavior NCO once applied to missing_value it now applies
to any _FillValue.
NCO now treats any missing_value as normal data [1].
It has been and remains most advisable to create both _FillValue
and missing_value
attributes with identical values in datasets.
Many legacy datasets contain only missing_value
attributes.
NCO can help migrate datasets between these conventions.
One may use ncrename (see ncrename netCDF Renamer) to
rename all missing_value attributes to _FillValue:
ncrename -a .missing_value,_FillValue inout.nc
Alternatively, one may use
ncatted (see ncatted netCDF Attribute Editor) to
add a _FillValue attribute to all variables:
ncatted -O -a _FillValue,,o,f,1.0e36 inout.nc
Consider a variable var of type var_type with a
_FillValue
attribute of type att_type containing the
value _FillValue.
As a guideline, the type of the _FillValue
attribute should be
the same as the type of the variable it is attached to.
If var_type equals att_type then NCO
straightforwardly compares each value of var to
_FillValue to determine which elements of var are to be
treated as missing data.
If not, then NCO converts _FillValue from
att_type to var_type by using the implicit conversion rules
of C, or, if att_type is NC_CHAR [2], by typecasting the results
of the C function strtod(_FillValue).
You may use the NCO operator ncatted to change the
_FillValue attribute and all data whose value is
_FillValue to a new value
(see ncatted netCDF Attribute Editor).
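The type conversion and comparison described above can be sketched in Python. This is an illustration only, not NCO's C implementation: the function names fill_value_as_var_type and missing_mask and the type labels "double", "float", and "int" are hypothetical, and Python's float() stands in for strtod() (unlike strtod(), it rejects trailing non-numeric text).

```python
import struct


def fill_value_as_var_type(fill_attr, var_type):
    """Convert a _FillValue attribute to the type of its variable.

    var_type is one of the hypothetical labels "double", "float", "int".
    """
    # A text (NC_CHAR-like) attribute is parsed as a number first.
    # Assumption: float() is used here as a stand-in for C's strtod().
    if isinstance(fill_attr, str):
        fill_attr = float(fill_attr)
    if var_type == "double":
        return float(fill_attr)
    if var_type == "float":
        # Round-trip through 4 bytes to mimic single-precision storage.
        return struct.unpack("f", struct.pack("f", float(fill_attr)))[0]
    if var_type == "int":
        # int() truncates toward zero, as a C cast does for in-range values.
        return int(fill_attr)
    raise ValueError("unsupported var_type: " + var_type)


def missing_mask(values, fill_attr, var_type):
    """Flag each element that is numerically equal to the converted fill value."""
    fill = fill_value_as_var_type(fill_attr, var_type)
    return [v == fill for v in values]
```

For instance, missing_mask([1.5, -99999.0, 3.0], "-99999.", "float") flags only the second element. Converting 1.0e36 to "float" yields the nearest single-precision value, which differs from the double 1.0e36; this is one reason the attribute type should match the variable type.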
When an NCO arithmetic operator processes a variable var
with a _FillValue
attribute, it compares each value of
var to _FillValue before performing an operation.
Note the _FillValue comparison imposes a performance penalty
on the operator.
Arithmetic processing of variables which contain the
_FillValue
attribute always incurs this penalty, even when
none of the data are missing.
Conversely, arithmetic processing of variables which do not contain the
_FillValue
attribute never incurs this penalty.
In other words, do not attach a _FillValue
attribute to a
variable which does not contain missing data.
This exhortation can usually be obeyed for model generated data, but it
may be harder to know in advance whether all observational data will be
valid or not.
NCO averagers (ncra, ncea, ncwa) do not count any element with the value _FillValue towards the average. ncbo and ncflint define a _FillValue result when either of the input values is a _FillValue. Sometimes the _FillValue may change from file to file in a multi-file operator, e.g., ncra. NCO is written to account for this (it always compares a variable to the _FillValue assigned to that variable in the current file). Suffice it to say that, in all known cases, NCO does “the right thing”.
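The two behaviors described above can be sketched in Python as follows. The function names are hypothetical and the code is a simplified model on flat lists, not NCO's implementation. Returning the fill value when every element is missing is an assumption made for this sketch.

```python
def average_ignoring_fill(values, fill_value):
    """Average the elements, skipping any that equal fill_value."""
    valid = [v for v in values if v != fill_value]
    if not valid:
        # Assumption for this sketch: an all-missing input averages to fill_value.
        return fill_value
    return sum(valid) / len(valid)


def subtract_with_fill(a, b, fill_value):
    """Element-wise a - b; the result is fill_value if either input is missing."""
    return [
        fill_value if (x == fill_value or y == fill_value) else x - y
        for x, y in zip(a, b)
    ]
```

With a fill value of 1.0e36, average_ignoring_fill([1.0, 1.0e36, 3.0], 1.0e36) averages only the two valid elements, and subtract_with_fill marks a result element as missing whenever either operand is missing at that position.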
In some corner cases it is impossible to determine and store the correct result of a binary operation in a single variable. One such case occurs when the two operands have differing _FillValue attributes, i.e., attributes with different numerical values. Since the output (result) of the operation can only have one _FillValue, some information may be lost. In this case, NCO always defines the output variable to have the same _FillValue as the first input variable. Prior to performing the arithmetic operation, all values of the second operand equal to the second _FillValue are replaced with the first _FillValue. Then the arithmetic operation proceeds as normal, comparing each element of each operand to a single _FillValue. Comparing each element to two distinct _FillValues would be much slower and would be no likelier to yield a more satisfactory answer. In practice, judicious choice of _FillValue values prevents any important information from being lost.
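The handling of differing _FillValue attributes can be illustrated with a short Python sketch. The function name binary_op_two_fills and its parameters are hypothetical; the code models the procedure described above on flat lists and is not NCO's implementation.

```python
def binary_op_two_fills(a, fill_a, b, fill_b, op):
    """Apply op element-wise when the operands carry different fill values.

    Returns (result, fill_a): the output always uses the first operand's fill.
    """
    # Step 1: rewrite the second operand's missing elements with the first fill.
    b_unified = [fill_a if y == fill_b else y for y in b]
    # Step 2: operate, comparing every element to the single fill value fill_a.
    result = [
        fill_a if (x == fill_a or y == fill_a) else op(x, y)
        for x, y in zip(a, b_unified)
    ]
    return result, fill_a
```

The information loss mentioned above shows up when the second operand holds a valid value that happens to equal fill_a: after step 1 it is indistinguishable from missing data, so the corresponding output element is marked missing.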
[1] The old functionality, i.e., where the ignored values are indicated by
missing_value not _FillValue, may still be selected
at NCO build time by compiling NCO
with the token definition
CPPFLAGS='-DNCO_MSS_VAL_SNG=missing_value'.
[2] For example, the DOE ARM program often
uses att_type = NC_CHAR and _FillValue = ‘-99999.’.