In ncap there are two ways to sort data.
The first is a regular sort.
This sorts ALL the elements of a variable or attribute without
regard to any dimensions.
The second method applies a sort map to a variable.
To apply this sort map the size of the variable must be exactly
divisible by the size of the sort map.
The method sort(var_in,&var_map)
is overloaded.
The second optional argument is to supply a call-by-reference variable
to hold the returned sort map.
a1[$time]={10,2,3,4,6,5,7,3,4,1}; a1_sort=sort(a1); print(a1_sort); // 1, 2, 3, 3, 4, 4, 5, 6, 7, 10; a2[$lon]={2,1,4,3}; a2_sort=sort(a2,&a2_map); print(a2); // 1, 2, 3, 4 print(a2_map); // 1, 0, 3, 2;
If the map variable does not exist prior to the sort() call,
then it will be created with the same shape as the input variable and be
of type NC_INT
.
If the map variable already exists, then the only restriction is that it
be of at least the same size as the input variable.
To apply a map use dsort(var_in,var_map)
.
defdim("nlat",5); a3[$lon]={2,5,3,7}; a4[$nlat,$lon]={ 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12, 13,14,15,16, 17,18,19,20}; a3_sort=sort(a3,&a3_map); print(a3_map); // 0, 2, 1, 3; a4_sort=dsort(a4,a3_map); print(a4_sort); // 1, 3, 2, 4, // 5, 7, 6, 8, // 9,11,10,12, // 13,15,14,16, // 17,19,18,20; a3_map2[$nlat]={4,3,0,2,1}; a4_sort2=dsort(a4,a3_map2); print(a4_sort2); // 3, 5, 4, 2, 1 // 8, 10, 9,7, 6, // 13,15,14,12,11, // 18,20,19,17,16
As in the above example you may create your own sort map.
To sort in descending order, apply the reverse()
method after the
sort().
Here is an extended example of how to use ncap2 features to hyperslab an irregular region based on the values of a variable not a coordinate. The distinction is crucial: hyperslabbing based on dimensional indices or coordinate values is straightforward. Using the values of single or multi-dimensional variable to define a hyperslab is quite different.
cat > ~/ncap2_foo.nco << 'EOF' // Purpose: Save irregular 1-D regions based on variable values // Included in NCO User's Guide at http://nco.sf.net/nco.html#sort /* NB: Single quotes around EOF above turn off shell parameter expansion in "here documents". This in turn prevents the need for protecting dollarsign characters in NCO scripts with backslashes when the script is cut-and-pasted (aka "moused") from an editor or e-mail into a shell console window */ /* Copy coordinates and variable(s) of interest into RAM variable(s) Benefits: 1. ncap2 defines writes all variables on LHS of expression to disk Only exception is RAM variables, which are stored in RAM only Repeated operations on regular variables takes more time, because changes are written to disk copy after every change. RAM variables are only changed in RAM so script works faster RAM variables can be written to disk at end with ram_write() 2. Script permutes variables of interest during processing Safer to work with copies that have different names This discourages accidental, mistaken use of permuted versions 3. Makes this script a more generic template: var_in instead of specific variable names everywhere */ *var_in=one_dmn_rec_var; *crd_in=time; *dmn_in_sz=$time.size; // [nbr] Size of input arrays /* Create all other "intermediate" variables as RAM variables to prevent them from cluttering the output file. Mask flag and sort map are same size as variable of interest */ *msk_flg=var_in; *srt_map=var_in; /* In this example we mask for all values evenly divisible by 3 This is the key, problem-specific portion of the template Replace this where() condition by that for your problem Mask variable is Boolean: 1=Meets condition, 0=Fails condition */ where(var_in % 3 == 0) msk_flg=1; elsewhere msk_flg=0; // print("msk_flg = ");print(msk_flg); // For debugging... /* The sort() routine is overloaded, and takes one or two arguments The second argument (optional) is the "sort map" (srt_map below) Pass the sort map by reference, i.e., prefix with an ampersand If the sort map does not yet exist, then it will be created and returned as an integer type the same shape as the input variable. The output of sort(), on the LHS, is a sorted version of the input msk_flg is not needed in its original order after sort() Hence we use msk_flg as both input to and output from sort() Doing this prevents the need to define a new, unneeded variable */ msk_flg=sort(msk_flg,&srt_map); // Count number of valid points in mask by summing the one's *msk_nbr=msk_flg.total(); // Define output dimension equal in size to number of valid points defdim("crd_out",msk_nbr); /* Now sort the variable of interest using the sort map and dsort() The output, on the LHS, is the input re-arranged so that all points meeting the mask condition are contiguous at the end of the array Use same srt_map to hyperslab multiple variables of the same shape Remember to apply srt_map to the coordinate variables */ crd_in=dsort(crd_in,srt_map); var_in=dsort(var_in,srt_map); /* Hyperslab last msk_nbr values of variable(s) of interest */ crd_out[crd_out]=crd_in((dmn_in_sz-msk_nbr):(dmn_in_sz-1)); var_out[crd_out]=var_in((dmn_in_sz-msk_nbr):(dmn_in_sz-1)); /* NB: Even though we created all variables possible as RAM variables, the original coordinate of interest, time, is written to the ouput. I'm not exactly sure why. For now, delete it from the output with: ncks -O -x -v time ~/foo.nc ~/foo.nc */ EOF ncap2 -O -v -S ~/ncap2_foo.nco ~/nco/data/in.nc ~/foo.nc ncks -O -x -v time ~/foo.nc ~/foo.nc ncks ~/foo.nc
Here is an extended example of how to use ncap2 features to sort multi-dimensional arrays based on the coordinate values along a single dimension.
cat > ~/ncap2_foo.nco << 'EOF' /* Purpose: Sort multi-dimensional array based on coordinate values This example sorts the variable three_dmn_rec_var(time,lat,lon) based on the values of the time coordinate. */ // Included in NCO User's Guide at http://nco.sf.net/nco.html#sort // Randomize the time coordinate time=10.0*gsl_rng_uniform(time); //print("original randomized time =\n");print(time); /* The sort() routine is overloaded, and takes one or two arguments The first argument is a one dimensional array The second argument (optional) is the "sort map" (srt_map below) Pass the sort map by reference, i.e., prefix with an ampersand If the sort map does not yet exist, then it will be created and returned as an integer type the same shape as the input variable. The output of sort(), on the LHS, is a sorted version of the input */ time=sort(time,&srt_map); //print("sorted time (ascending order) and associated sort map =\n");print(time);print(srt_map); /* sort() always sorts in ascending order The associated sort map therefore re-arranges the original, randomized time array into ascending order. There are two methods to obtain the descending order the user wants 1) We could solve the problem in ascending order (the default) and then apply the reverse() method to re-arrange the results. 2) We could change the sort map to return things in descending order of time and solve the problem directly in descending order. */ // Following shows how to do method one: /* Expand the sort map to srt_map_3d, the size of the data array 1. Use data array to provide right shape for the expanded sort map 2. Coerce data array into an integer so srt_map_3d is an integer 3. Multiply data array by zero so 3-d map elements are all zero 4. Add the 1-d sort map to the 3-d sort map (NCO automatically resizes) 5. Add the spatial (lat,lon) offsets to each time index 6. de-sort using the srt_map_3d 7. Use reverse to obtain descending in time order Loops could accomplish the same thing (exercise left for reader) However, loops are slow for large datasets */ /* Following index manipulation requires understanding correspondence between 1-d (unrolled, memory order of storage) and access into that memory as a multidimensional (3-d, in this case) rectangular array. Key idea to understand is how dimensionality affects offsets */ // Copy 1-d sort map into 3-d sort map srt_map_3d=(0*int(three_dmn_rec_var))+srt_map; // Multiply base offset by factorial of lesser dimensions srt_map_3d*=$lat.size*$lon.size; lon_idx=array(0,1,$lon); lat_idx=array(0,1,$lat)*$lon.size; lat_lon_idx[$lat,$lon]=lat_idx+lon_idx; srt_map_3d+=lat_lon_idx; print("sort map 3d =\n");print(srt_map_3d); // Use dsort() to "de-sort" the data using the sort map three_dmn_rec_var=dsort(three_dmn_rec_var,srt_map_3d); // Finally, reverse data so time coordinate is descending time=time.reverse($time); //print("sorted time (descending order) =\n");print(time); three_dmn_rec_var=three_dmn_rec_var.reverse($time); // Method two: Key difference is srt_map=$time.size-srt_map-1; EOF ncap2 -O -v -S ~/ncap2_foo.nco ~/nco/data/in.nc ~/foo.nc