

4.12.2 Normalization and Integration

ncwa has one switch which controls the normalization of the averages appearing in the output-file. Short option ‘-N’ (or long options ‘--nmr’ or ‘--numerator’) prevents ncwa from dividing the weighted sum of the variable (the numerator in the averaging expression) by the weighted sum of the weights (the denominator in the averaging expression). Thus ‘-N’ tells ncwa to return just the numerator of the arithmetic expression defining the operation (see Operation Types).

With this normalization option, ncwa can integrate variables. Averages are first computed as sums, and then normalized to obtain the average. The original sum (i.e., the numerator of the expression in Operation Types) is output if default normalization is turned off (with ‘-N’). This sum is the integral (not the average) over the dimensions specified with ‘-a’, or over all dimensions if none are specified. The weighting variable, if specified (with ‘-w’), plays the role of the differential increment and thus permits more sophisticated integrals (i.e., weighted sums) to be output. For example, consider the variable lev where lev = [100,500,1000] weighted by the weight lev_wgt where lev_wgt = [10,2,1]. The vertical integral of lev, weighted by lev_wgt, is the dot product of lev and lev_wgt. That this is 3000.0 can be seen by inspection and verified with the integration command

     ncwa -N -a lev -v lev -w lev_wgt in.nc foo.nc;ncks foo.nc
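The arithmetic behind this example can also be checked by hand. The following Python sketch only illustrates the numerator and denominator described above; it is not how ncwa is implemented, and the names numerator, denominator, and average are chosen here for illustration.

```python
# Values and weights from the example above
lev = [100.0, 500.0, 1000.0]
lev_wgt = [10.0, 2.0, 1.0]

# Numerator: weighted sum of the variable (the quantity '-N' returns)
numerator = sum(w * x for w, x in zip(lev_wgt, lev))

# Denominator: sum of the weights (the division that '-N' skips)
denominator = sum(lev_wgt)

# Default (normalized) result: the weighted average
average = numerator / denominator

print(numerator)    # 3000.0
print(denominator)  # 13.0
print(average)
```

Without ‘-N’ the same command would therefore be expected to return 3000.0/13.0, the weighted average, rather than the integral.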

EXAMPLES

Given file 85_0112.nc:

     netcdf 85_0112 {
     dimensions:
             lat = 64 ;
             lev = 18 ;
             lon = 128 ;
             time = UNLIMITED ; // (12 currently)
     variables:
             float lat(lat) ;
             float lev(lev) ;
             float lon(lon) ;
             float time(time) ;
             float scalar_var ;
             float three_dmn_var(lat, lev, lon) ;
             float two_dmn_var(lat, lev) ;
             float mask(lat, lon) ;
             float gw(lat) ;
     }

Average all variables in in.nc over all dimensions and store results in out.nc:

     ncwa in.nc out.nc

All variables in in.nc are reduced to scalars in out.nc since ncwa averages over all dimensions unless otherwise specified (with ‘-a’).

Store the zonal (longitudinal) mean of in.nc in out1.nc and out2.nc:

     ncwa -a lon in.nc out1.nc
     ncwa -a lon -b in.nc out2.nc

The first command turns lon into a scalar and the second retains lon as a degenerate dimension in all variables.

     % ncks -C -H -v lon out1.nc
     lon = 135
     % ncks -C -H -v lon out2.nc
     lon[0] = 135

In either case the tally is simply the size of lon, i.e., 128 for the 85_0112.nc file described by the sample header above.

Compute the meridional (latitudinal) mean, with values weighted by the corresponding element of gw [1]:

     ncwa -w gw -a lat in.nc out.nc

Here the tally is simply the size of lat, or 64. The sum of the Gaussian weights is 2.0.
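As a rough illustration of why the weights sum to 2.0, the sketch below uses the closed-form 4-point Gauss-Legendre quadrature weights as a small stand-in for the 64-element gw. It assumes that gw holds Gauss-Legendre weights on the interval [-1, 1], as is usual for Gaussian grids. The four-latitude grid and the field values are hypothetical.

```python
import math

# Closed-form 4-point Gauss-Legendre weights on [-1, 1], ordered south to north.
# Stand-in for the 64-element gw; illustrative only.
w_inner = (18.0 + math.sqrt(30.0)) / 36.0
w_outer = (18.0 - math.sqrt(30.0)) / 36.0
gw = [w_outer, w_inner, w_inner, w_outer]

# Hypothetical field values at the four latitudes
field = [270.0, 295.0, 297.0, 268.0]

tally = len(field)                                 # number of points averaged
numerator = sum(w * x for w, x in zip(gw, field))  # weighted sum of the field
denominator = sum(gw)                              # sum of weights, 2.0 up to rounding
meridional_mean = numerator / denominator

print(tally, denominator, meridional_mean)
```

The tally counts gridpoints and is independent of the weights, whereas the denominator is the sum of the weights themselves.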

Compute the area mean over the tropical Pacific:

     ncwa -w gw -a lat,lon -d lat,-20.,20. -d lon,120.,270. in.nc out.nc

Here the tally is 64 times 128 = 8192.

Compute the area-mean over the globe using only points for which ORO < 0.5 [2]:

     ncwa -B 'ORO < 0.5'      -w gw -a lat,lon in.nc out.nc
     ncwa -m ORO -M 0.5 -T lt -w gw -a lat,lon in.nc out.nc

It is considerably simpler to specify the complete mask_cond with the single string argument to ‘-B’ than with the three separate switches ‘-m’, ‘-T’, and ‘-M’. If in doubt, enclose the mask_cond in quotes since some of the comparators have special meanings to the shell.

Assuming 70% of the gridpoints are maritime, then here the tally is approximately 0.70 times 8192 = 5734.
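The effect of the mask on the tally and on the average can be sketched in Python as follows. This is a minimal illustration that assumes only points satisfying the mask condition contribute to the numerator, the denominator, and the tally. The 2x3 grid and all of its values are hypothetical.

```python
# Hypothetical 2x3 (lat x lon) grid; all values are made up for illustration
oro = [[0.0, 0.2, 1.0],
       [0.0, 0.9, 0.1]]
field = [[10.0, 12.0, 99.0],
         [14.0, 99.0, 16.0]]
gw = [0.5, 1.5]  # one weight per latitude row

tally = 0
numerator = 0.0
denominator = 0.0
for j, row in enumerate(field):
    for i, value in enumerate(row):
        if oro[j][i] < 0.5:             # mask condition 'ORO < 0.5'
            tally += 1                  # count points that pass the mask
            numerator += gw[j] * value  # weighted sum over unmasked points
            denominator += gw[j]        # sum of weights over unmasked points

masked_mean = numerator / denominator
print(tally, masked_mean)  # 4 14.0
```

The two land points (value 99.0) affect neither the tally nor the mean.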

Compute the global annual mean over the maritime tropical Pacific:

     ncwa -B 'ORO < 0.5'      -w gw -a lat,lon,time \
       -d lat,-20.0,20.0 -d lon,120.0,270.0 in.nc out.nc
     ncwa -m ORO -M 0.5 -T lt -w gw -a lat,lon,time \
       -d lat,-20.0,20.0 -d lon,120.0,270.0 in.nc out.nc

Further examples will use the one-switch specification of mask_cond.

Determine the total area of the maritime tropical Pacific, assuming the variable area contains the area of each gridcell:

     ncwa -N -v area -B 'ORO < 0.5' -a lat,lon \
       -d lat,-20.0,20.0 -d lon,120.0,270.0 in.nc out.nc

Weighting area (e.g., by gw) is not appropriate because area is already area-weighted by definition. Thus the ‘-N’ switch, or, equivalently, the ‘-y ttl’ switch, correctly integrates the cell areas into a total regional area.

Mask a file to contain _FillValue everywhere except where thr_min <= msk_var <= thr_max:

     # Set masking variable and its scalar thresholds
     export msk_var='three_dmn_var_dbl' # Masking variable
     export thr_max='20' # Maximum allowed value
     export thr_min='10' # Minimum allowed value
     ncecat -O in.nc out.nc # Wrap in.nc in degenerate "record" dimension
     ncwa -O -a record -B "${msk_var} <= ${thr_max}" out.nc out.nc
     ncecat -O out.nc out.nc # Wrap out.nc in degenerate "record" dimension
     ncwa -O -a record -B "${msk_var} >= ${thr_min}" out.nc out.nc

After the first use of ncwa, out.nc contains _FillValue where ${msk_var} > ${thr_max}. The process is then repeated on the remaining data to filter out points where ${msk_var} < ${thr_min}. The resulting out.nc contains valid data only where thr_min <= msk_var <= thr_max.
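The logic of the two passes can be sketched in Python. This sketch assumes that averaging over the degenerate record dimension leaves a value unchanged where the mask condition holds and yields _FillValue where it does not. FILL and mask_pass are hypothetical names used only for this illustration.

```python
FILL = None  # stand-in for _FillValue in this sketch

def mask_pass(values, keep):
    """Keep values where keep(value) is true; write FILL elsewhere.

    Values that are already FILL stay FILL.
    """
    return [v if v is not FILL and keep(v) else FILL for v in values]

thr_min, thr_max = 10, 20
data = [5, 10, 15, 20, 25]

# First pass: keep points where msk_var <= thr_max
step1 = mask_pass(data, lambda v: v <= thr_max)
# Second pass: keep points where msk_var >= thr_min
step2 = mask_pass(step1, lambda v: v >= thr_min)

print(step1)  # [5, 10, 15, 20, None]
print(step2)  # [None, 10, 15, 20, None]
```

Values equal to either threshold survive both passes because both comparisons are inclusive.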


Footnotes

[1] gw stands for Gaussian weight in many climate models.

[2] ORO stands for Orography in some climate models and in those models ORO < 0.5 selects ocean gridpoints.