ncecat [-3] [-4] [-6] [-A] [-C] [-c] [-D dbg] [-d dim,[min][,[max][,[stride]]]] [-F] [-h] [-L dfl_lvl] [-l path] [-M] [-n loop] [-O] [-o output-file] [-p path] [-R] [-r] [-t thr_nbr] [-u ulm_nm] [-v var[,...]] [-X ...] [-x] [input-files] [output-file]
DESCRIPTION
ncecat concatenates an arbitrary number of input files into a single output file. The input-files are stored consecutively as records in output-file. Each variable in each input file becomes one record in the same variable in the output file. All input-files must contain all extracted variables (or else there would be "gaps" in the output file).
A new record dimension is the glue which binds the input file data together. The new record dimension name is, by default, “record”. Its name can be specified with the ‘-u ulm_nm’ short option (or the ‘--ulm_nm’ or ‘--rcd_nm’ long options).
Each extracted variable must be constant in size and rank across all input-files. The only exception is that ncecat allows files to differ in the record dimension size if the requested record hyperslab (see Hyperslabs) resolves to the same size for all files. This allows easier gluing/averaging of unequal length timeseries from simulation ensembles (e.g., the IPCC AR4 archive).
Thus, the output-file size is the sum of the sizes of the extracted variables in the input files.
See Averaging vs. Concatenating, for a description of the distinctions between the various averagers and concatenators.
As a multi-file operator, ncecat will read the list of input-files from stdin if they are not specified as positional arguments on the command line (see Large Numbers of Files).
Turn off global metadata copying. By default all NCO operators copy the global metadata of the first input file into output-file. This helps preserve the provenance of the output data. However, the use of metadata is burgeoning and it is not uncommon to encounter files with excessive amounts of extraneous metadata. Extracting small bits of data from such files leads to output files which are much larger than necessary due to the automatically copied metadata. ncecat supports turning off the default copying of global metadata via the ‘-M’ switch (or its long option equivalents, ‘--glb_mtd_spr’ and ‘--global_metadata_suppress’).
Consider five realizations, 85a.nc, 85b.nc, ... 85e.nc of 1985 predictions from the same climate model. Then ncecat 85?.nc 85_ens.nc glues the individual realizations together into the single file, 85_ens.nc.
If an input variable was dimensioned [lat,lon], it will by default have dimensions [record,lat,lon] in the output file.
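The dimension bookkeeping can be illustrated with a small pure-Python model. This is only a sketch of the shape arithmetic, not of how ncecat is implemented: nested lists stand in for netCDF arrays, and the `shape` helper and the toy 2 x 3 grids are hypothetical names and data introduced for illustration.

```python
# Toy model of the "glue" step: each input file contributes one record
# to the output variable. Nested lists stand in for netCDF arrays.

def shape(array):
    """Return the dimension sizes of a rectangular nested list."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return dims

# Five hypothetical realizations of a [lat, lon] variable on a 2 x 3 grid.
realizations = [
    [[float(member * 10 + lat * 3 + lon) for lon in range(3)] for lat in range(2)]
    for member in range(5)
]

# Concatenating along a new outermost dimension amounts to collecting
# the per-file arrays in input order.
ensemble = list(realizations)

print(shape(realizations[0]))  # [2, 3]    i.e. [lat, lon]
print(shape(ensemble))         # [5, 2, 3] i.e. [record, lat, lon]
```

In this model the record index simply identifies which input file a slice came from, which is why every input must supply the variable with the same shape.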
A restriction of ncecat is that the hyperslabs of the processed variables must be the same from file to file. Normally this means all the input files are the same size, and contain data on different realizations of the same variables.
Concatenating a variable packed with different scales across multiple datasets is beyond the capabilities of ncecat (and ncrcat, the other concatenator; see Concatenation). ncecat does not unpack data. It simply copies the data from the input-files, and the metadata from the first input-file, to the output-file.
This means that data compressed with a packing convention must use the identical packing parameters (e.g., scale_factor and add_offset) for a given variable across all input files. Otherwise the concatenated dataset will not unpack correctly.
The workaround for cases where the packing parameters differ across input-files requires three steps: First, unpack the data using ncpdq. Second, concatenate the unpacked data using ncecat. Third, re-pack the result with ncpdq.
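A short numeric sketch may make the failure mode concrete. It assumes the common netCDF packing convention, unpacked = packed * scale_factor + add_offset, and models files as plain Python dictionaries. The `pack` and `unpack` helpers, the two toy files, and their parameter values are all hypothetical, and no NCO operator is invoked.

```python
# Why differing packing parameters break a naive concatenation, and how
# the unpack / concatenate / re-pack sequence avoids the problem.

def unpack(packed, scale_factor, add_offset):
    """Apply the usual convention: value = packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]

def pack(values, scale_factor, add_offset):
    """Inverse of unpack, rounding to the nearest integer."""
    return [round((v - add_offset) / scale_factor) for v in values]

# Two hypothetical input files holding the same variable, packed differently.
file_a = {"packed": [0, 20, 40], "scale_factor": 0.5, "add_offset": 270.0}
file_b = {"packed": [0, 20, 40], "scale_factor": 0.25, "add_offset": 250.0}

true_values = unpack(**file_a) + unpack(**file_b)

# Naive concatenation: raw packed data copied, metadata taken from the
# first file only, so file_b's values are decoded with file_a's parameters.
naive = unpack(file_a["packed"] + file_b["packed"],
               file_a["scale_factor"], file_a["add_offset"])

# Workaround: unpack each file, concatenate the unpacked values, then
# re-pack everything with a single parameter set.
common_scale, common_offset = 0.25, 250.0
repacked = pack(true_values, common_scale, common_offset)
recovered = unpack(repacked, common_scale, common_offset)

print(true_values)  # [270.0, 280.0, 290.0, 250.0, 255.0, 260.0]
print(naive)        # [270.0, 280.0, 290.0, 270.0, 280.0, 290.0]
print(recovered == true_values)  # True
```

The scale factors here are powers of two so the arithmetic is exact. With real data, re-packing is generally lossy to within the packing precision.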
Consider a model experiment which generated five realizations of one year of data, say 1985. You can imagine that the experimenter slightly perturbs the initial conditions of the problem before generating each new solution. Assume each file contains all twelve months (a seasonal cycle) of data and we want to produce a single file containing all the seasonal cycles. Here the numeric filename suffix denotes the experiment number (not the month):
ncecat 85_01.nc 85_02.nc 85_03.nc 85_04.nc 85_05.nc 85.nc
ncecat 85_0[1-5].nc 85.nc
ncecat -n 5,2,1 85_01.nc 85.nc
These three commands produce identical answers. See Specifying Input Files, for an explanation of the distinctions between these methods. The output file, 85.nc, is five times the size of a single input-file. It contains 60 months of data.
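To see why the ‘-n 5,2,1’ form names the same five files, the following sketch expands a first filename into a list. It is a simplified model that assumes the three fields mean file count, number of digits, and numeric increment, and that the number sits immediately before the filename suffix. The function `expand_loop` and its parameter names are hypothetical and are not part of NCO.

```python
# Simplified model of how "-n 5,2,1 85_01.nc" can be read as a file list.

def expand_loop(first_name, file_count, digit_count, increment):
    """Generate file names by stepping the number that precedes the suffix."""
    stem, dot, suffix = first_name.rpartition(".")
    prefix = stem[:-digit_count]
    start = int(stem[-digit_count:])
    return [
        f"{prefix}{start + i * increment:0{digit_count}d}{dot}{suffix}"
        for i in range(file_count)
    ]

names = expand_loop("85_01.nc", file_count=5, digit_count=2, increment=1)
print(names)
# ['85_01.nc', '85_02.nc', '85_03.nc', '85_04.nc', '85_05.nc']
```

The shell glob 85_0[1-5].nc reaches the same list by pattern matching against existing files, whereas this form constructs the names from the first one.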
One often prefers that the (new) record dimension have a more descriptive, context-based name than simply “record”. This is easily accomplished with the ‘-u ulm_nm’ switch:
ncecat -u realization 85_0[1-5].nc 85.nc
Users are more likely to understand the data processing history when such descriptive coordinates are used.
Consider a file with an existing record dimension named time, and suppose the user wishes to convert time from a record dimension to a non-record dimension. This may be useful, for example, when the user has another use for the record variable.
The procedure is to use ncecat followed by ncwa:
ncecat in.nc out.nc # Convert time to non-record dimension
ncwa -O -a record out.nc out.nc # Remove new degenerate record dimension
The second step removes the degenerate record dimension. See ncpdq netCDF Permute Dimensions Quickly for other methods of changing variable dimensionality, including the record dimension.
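The reason the two-step procedure leaves the data unchanged can be sketched in pure Python. Nested lists again stand in for netCDF arrays, and the helper `average_over_first` and the toy [time, lat] values are hypothetical. The sketch only models the arithmetic and does not run ncecat or ncwa.

```python
# Gluing a single file adds a record dimension of length 1; averaging
# over that length-1 dimension then returns the original values.

time_series = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # [time, lat]

# Step 1 (ncecat-like): wrap the data in a new outermost dimension.
glued = [time_series]  # [record, time, lat], with record of size 1

def average_over_first(array):
    """Average a 3-D nested list over its first dimension."""
    count = len(array)
    return [
        [sum(array[r][t][y] for r in range(count)) / count
         for y in range(len(array[0][t]))]
        for t in range(len(array[0]))
    ]

# Step 2 (ncwa-like): average over the degenerate record dimension.
restored = average_over_first(glued)

print(restored == time_series)  # True
```

After both steps the data still have shape [time, lat], but in the real files time is an ordinary fixed-size dimension rather than the record dimension.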