As a general rule, automatic type conversions should be avoided for at
least two reasons.
First, type conversions are expensive since they require creating
(temporary) buffers and casting each element of a variable from
the type it was stored at to some other type.
Second, the dataset's creator probably had a good reason
for storing data as, say, NC_FLOAT rather than NC_DOUBLE.
In a scientific framework there is no reason to store data with more
precision than that with which the observations were made.
Thus NCO tries to avoid performing automatic type conversions
when performing arithmetic.
Automatic type conversion during arithmetic in the languages C and
Fortran is performed only when necessary.
All operands in an operation are converted to the most precise type
before the operation takes place.
However, following this parsimonious conversion rule dogmatically
results in numerous headaches.
For example, the average of the two NC_SHORTs 17000s and 17000s
results in garbage since the intermediate value which
holds their sum is also of type NC_SHORT and thus cannot
represent values greater than 32,767 [1].
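The overflow described above can be illustrated with a small sketch.
It is not NCO code: the helper names wrap_int16 and short_average are
hypothetical, and two's-complement wraparound of the 16-bit intermediate
sum is an assumption chosen to mirror typical platform behavior.

```python
def wrap_int16(value: int) -> int:
    """Wrap an integer into the signed 16-bit range [-32768, 32767].

    Assumption: out-of-range values wrap around (two's complement).
    """
    return (value + 32768) % 65536 - 32768


def short_average(a: int, b: int) -> int:
    """Average two values while holding the intermediate sum in 16 bits."""
    total = wrap_int16(a + b)  # 34000 does not fit in 16 bits, so it wraps
    return int(total / 2)      # int() discards the fractional part


print(wrap_int16(17000 + 17000))    # wrapped intermediate sum
print(short_average(17000, 17000))  # average computed from the wrapped sum
```

Under these assumptions the intermediate sum wraps to a negative value,
so the average is negative too, even though both inputs are 17000.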
There are valid reasons for expecting this operation to succeed and
the NCO philosophy is to make operators do what you want, not
what is most pure.
Thus, unlike C and Fortran, but like many other higher level interpreted
languages, NCO arithmetic operators will perform automatic type
conversion when all the following conditions are met [2].
Among these conditions, the variable must be of type NC_BYTE, NC_CHAR,
NC_SHORT, or NC_INT.
Type NC_DOUBLE is not type converted because there is no type of
higher precision to convert to.
Type NC_FLOAT is not type converted because, in our judgement,
the performance penalty of always doing so would outweigh the (extremely
rare) potential benefits.
When these criteria are all met, the operator promotes the variable in
question to type NC_DOUBLE, performs all the arithmetic
operations, casts the NC_DOUBLE type back to the original type,
and finally writes the result to disk.
The result written to disk may not be what you expect, because of
incommensurate ranges represented by different types, and because of
(lack of) rounding.
First, continuing the above example, the average (e.g., ‘-y avg’)
of 17000s and 17000s is written to disk as 17000s.
The type conversion feature of NCO makes this possible since
the arithmetic and intermediate values are stored as NC_DOUBLEs,
i.e., 34000.0d, and only the final result must be represented
as an NC_SHORT.
Without the type conversion feature of NCO, the average would
have been garbage (albeit predictable garbage near -15768s).
Similarly, the total (e.g., ‘-y ttl’) of 17000s and
17000s written to disk is garbage (actually -31536s) since
the final result (the true total) of 34000 is outside the range
of type NC_SHORT.
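The promote, compute, cast-back sequence and the two outcomes above can
be sketched as follows. This is only an illustration, not NCO's
implementation: promoted_reduce and wrap_int16 are hypothetical names,
the operation labels "avg" and "ttl" are borrowed from the ‘-y’ options
mentioned above, and the cast back to NC_SHORT is modeled as truncation
followed by two's-complement wraparound, which is an assumption.

```python
def wrap_int16(value: int) -> int:
    """Wrap an integer into the signed 16-bit range (assumed wraparound)."""
    return (value + 32768) % 65536 - 32768


def promoted_reduce(values, op):
    """Promote to double, do the arithmetic, then cast back to 16 bits."""
    doubles = [float(v) for v in values]  # promotion step
    total = sum(doubles)                  # intermediate held as a double
    if op == "avg":
        result = total / len(doubles)
    elif op == "ttl":
        result = total
    else:
        raise ValueError(f"unsupported operation: {op}")
    # Cast back: discard the fractional part, then wrap into 16 bits.
    return wrap_int16(int(result))


print(promoted_reduce([17000, 17000], "avg"))  # average fits in 16 bits
print(promoted_reduce([17000, 17000], "ttl"))  # total does not fit
```

The average survives because only the final value has to fit in 16
bits, whereas the total itself exceeds 32,767 and is still corrupted by
the cast back.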
Type conversions discard the fractional part of a floating point number
when converting it to an integer, which for non-negative values is
equivalent to applying the floor function.
Type conversions do not attempt to round floating point numbers to the
nearest integer.
Thus the average of 1s and 2s is computed in double
precision arithmetic as (1.0d + 2.0d)/2 = 1.5d.
This result is converted to NC_SHORT and stored on disk as
floor(1.5d) = 1s [3].
Thus no "rounding up" is performed.
The type conversion rules of C can be stated as follows:
If n is a non-negative integer then any floating point value x
satisfying n <= x < n+1
will have the value n when converted to an integer.
Negative values are truncated toward zero rather than floored.
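A short sketch of this conversion rule follows. It relies on Python's
built-in int(), which discards the fractional part in the same way as
C's floating-point-to-integer conversion; the function name to_integer
is hypothetical and used only for illustration.

```python
import math


def to_integer(x: float) -> int:
    """Convert by discarding the fractional part (truncation toward zero)."""
    return int(x)


# For non-negative x with n <= x < n+1, truncation and floor agree.
for x in (1.0, 1.5, 1.999):
    print(x, to_integer(x), math.floor(x))

# For negative x the two differ: truncation moves toward zero,
# while floor moves toward negative infinity.
print(-1.5, to_integer(-1.5), math.floor(-1.5))
```

In particular 1.5 converts to 1, so no rounding up occurs, and the
equivalence with floor holds only for non-negative values.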
[1] 32767 = 2^15-1
[2] Operators began performing type conversions before arithmetic in NCO version 1.2, August, 2000. Previous versions never performed unnecessary type conversion for arithmetic.
[3] The actual type conversions are handled by intrinsic C-language type
conversion, so the floor() function is not explicitly called,
though for non-negative values the results would be the same if it were.