Xyz2kdt

From Gerris

Revision as of 07:55, 20 May 2010; view current revision

xyz2rsurface is a command-line utility used to create the R*-tree-indexed terrain databases used as input for the GfsRefineTerrain object of the Terrain module. A summary of the command-line syntax is given by

% xyz2rsurface -h
Usage: xyz2rsurface [OPTION] BASENAME

Converts the x, y and z coordinates on standard input to an
R*-tree-indexed database suitable for use with the
GfsRefineTerrain object of Gerris.

  -p N  --pagesize=N  sets the pagesize in bytes (default is 2048)
  -r    --randomize   randomize (shuffle) the input
  -v    --verbose     display progress bar
  -h    --help        display this help and exit

Report bugs to s.popinet@niwa.co.nz

The format of the data on standard input should look like

3.501 5.634 -2
4.601 7.6778 3.456
...

where the first field is the value of the x-coordinate, the second field the y-coordinate and the last field the z-coordinate. xyz2rsurface will stop at the first line which does not fit this format. You may want to check (e.g. using the --verbose option) that the number of points processed matches what you expect. Note that the database does not enforce any other convention.
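Before building a large database, it can be useful to count how many leading lines of an input file match the expected format. The helper below is a minimal sketch of such a pre-check; `count_xyz_points` is a hypothetical name and is not part of Gerris. Its strict "exactly three numeric fields" rule is an assumption and may differ from what xyz2rsurface actually accepts.

```c
#include <stdio.h>

/* Hypothetical helper (not part of Gerris): counts the leading lines of
   'fp' made of exactly three numeric fields "x y z" and stops at the
   first line which does not fit, returning the number of points read. */
long count_xyz_points (FILE * fp)
{
  char line[256];
  long n = 0;

  while (fgets (line, sizeof line, fp)) {
    double x, y, z;
    char extra;
    /* sscanf returns 3 only if three numbers were parsed and no
       further non-blank character follows them */
    if (sscanf (line, "%lf %lf %lf %c", &x, &y, &z, &extra) != 3)
      break;
    n++;
  }
  return n;
}
```

Calling this function on the input file and comparing the result with the total number of lines gives a quick indication of whether the whole file would be read.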

Example: building a global terrain topography database using the ETOPO2 dataset

The ETOPO2 dataset contains topographic information for the entire surface of the Earth (both above and below sea level) at a nominal resolution of two arc-minutes (~4 km).

The first step is to get the raw data, e.g.

% wget http://www.ngdc.noaa.gov/mgg/global/relief/ETOPO2/ETOPO2v2-2006/ETOPO2v2c/raw_binary/ETOPO2v2c_i2_LSB.zip
% unzip ETOPO2v2c_i2_LSB.zip

This is a binary file whose format is described in the ETOPO2v2c_i2_LSB.hdr file:

% cat ETOPO2v2c_i2_LSB.hdr
NCOLS 10800
NROWS 5400
XLLCORNER -180.000000
YLLCORNER -90.000000
CELLSIZE 0.0333333333333333333
NODATA_VALUE 999999
BYTEORDER LSBFIRST
NUMBERTYPE 2_BYTE_INTEGER
MIN_VALUE -10791.0
MAX_VALUE 8440.0

We need to convert this binary file to a text file. We also want to use the database together with a cartographic projection defined using the Map module. By convention, this means that our x-, y- and z-coordinates need to be the east-positive longitude, the north-positive latitude and the elevation in metres. These coordinates can be written in a text format suitable for input into xyz2rsurface using the following C code:

#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <stdint.h>

/* check that this matches ETOPO2v2c_i2_LSB.hdr */
#define NCOLS 10800
#define NROWS 5400
#define XLLCORNER -180.000000
#define YLLCORNER -90.000000
#define CELLSIZE 0.0333333333333333333
#define NODATA_VALUE 999999
#define BYTEORDER LSBFIRST
#define NUMBERTYPE 2_BYTE_INTEGER
#define MIN_VALUE -10791.0
#define MAX_VALUE 8440.0

int main (int argc, char * argv[])
{
  double lat, lon;
  int16_t v;
  int i, j;

  for (j = 0; j < NROWS; j++) {
    lat = YLLCORNER + CELLSIZE*j;
    for (i = 0; i < NCOLS; i++) {
      lon = XLLCORNER + CELLSIZE*i;
      /* read one 2-byte sample; the file is LSBFIRST, so this assumes a
         little-endian host. The read is kept outside assert() so that
         it still happens when compiling with -DNDEBUG. */
      if (fread (&v, sizeof (int16_t), 1, stdin) != 1) {
        fprintf (stderr, "\nunexpected end of input (row %d, column %d)\n", j, i);
        return 1;
      }
      assert (v >= MIN_VALUE && v <= MAX_VALUE);
      /* cell-centred longitude and latitude; the sign change maps the
         first row read to the northernmost latitude */
      printf ("%.8f %.8f %d\n", lon + CELLSIZE/2., - (lat + CELLSIZE/2.), v);
    }
    fprintf (stderr, "\rRow %d/%d", j + 1, NROWS);
  }
  fputc ('\n', stderr);
  return 0;
}

Just copy and paste this code into a file called e.g. etopo2xyz.c and compile using

% cc etopo2xyz.c -o etopo2xyz
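The conversion program reads each sample directly into an int16_t, which only gives correct values on a little-endian machine. As a hedged sketch, the hypothetical helper below (`read_le_int16` is not part of the original program) decodes a little-endian 16-bit sample byte by byte, so that it behaves the same on any host. It assumes that int is at least 32 bits wide.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: reads one little-endian 16-bit signed sample
   from 'fp' into '*v', whatever the byte order of the host.
   Returns 1 on success and 0 on end of file or read error. */
int read_le_int16 (FILE * fp, int16_t * v)
{
  unsigned char b[2];
  uint16_t u;

  if (fread (b, 1, 2, fp) != 2)
    return 0;
  /* b[0] is the least significant byte */
  u = (uint16_t) (b[0] | (b[1] << 8));
  /* map the two's-complement bit pattern to a signed value without
     relying on implementation-defined conversions */
  *v = (int16_t) (u < 0x8000 ? (int) u : (int) u - 0x10000);
  return 1;
}
```

In the conversion program, the fread call on `v` could be replaced by a call to `read_le_int16 (stdin, &v)`, with a zero return value treated as an input error.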

The following command will then read the binary ETOPO2 file, convert it to the appropriate text format and generate the final terrain database

% ./etopo2xyz < ETOPO2v2c_i2_LSB.bin | xyz2rsurface etopo2

If everything went well you should end up (about one hour and ~58 million points later) with four large files

% ls etopo2*
etopo2 etopo2.Data etopo2.DataPD etopo2.DirPD

which together define the terrain database.
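The figure of ~58 million points follows directly from the grid dimensions given in the header file. The short sketch below makes the arithmetic explicit; the function names are hypothetical, and the file size assumes two bytes per sample, as in the conversion program.

```c
/* Hypothetical helpers illustrating the arithmetic behind the point count. */

/* number of (x, y, z) points produced for a grid of ncols x nrows cells */
long grid_points (long ncols, long nrows)
{
  return ncols*nrows;
}

/* size in bytes of the raw binary grid, assuming 'bytes_per_sample'
   bytes per cell and no file header */
long grid_file_size (long ncols, long nrows, long bytes_per_sample)
{
  return grid_points (ncols, nrows)*bytes_per_sample;
}
```

With NCOLS = 10800 and NROWS = 5400 this gives 58,320,000 points and, at two bytes per sample, a raw file of 116,640,000 bytes. Comparing these numbers with the size of the .bin file and with the number of points reported by the --verbose option is a simple consistency check.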

Note also that when using several databases simultaneously within GfsRefineTerrain, you need to choose consistent conventions for all the databases (for example a common geodetic system e.g. WGS84). The terrain databases do not know anything about projection systems and it is up to you to enforce your preferred conventions.

Optimising the database layout

The procedure above produces a functional database but it is not optimal because the points are inserted following horizontal/vertical coordinate lines. This leads to bounding boxes of the R*-tree which have aspect ratios far from one. A simple way of improving the database is to insert the points in random order rather than along coordinate lines. This can be done easily using the "-r" option.

The database can then be regenerated using

% ./etopo2xyz < ETOPO2v2c_i2_LSB.bin | xyz2rsurface -r -v etopo2-shuffled

You will notice that this takes longer but also that the database size is significantly reduced and that the aspect ratios of the bounding boxes are much improved (always less than two). This leads to significant performance improvements when using the database.

You may also note that temporary files are created by the Unix sort command as part of this process.
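To illustrate what inserting the points in random order means, the sketch below shuffles an in-memory array of points with the Fisher-Yates algorithm. This is not how xyz2rsurface implements the "-r" option (which, as noted above, relies on the sort command and temporary files); it is a swapped-in technique that only works when all the points fit in memory. The `Point` type, the function names and the xorshift random generator are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { double x, y, z; } Point; /* hypothetical point type */

/* 64-bit xorshift generator (Marsaglia), a simple stand-in for a
   better random source; the state must never be zero */
static uint64_t xorshift64 (uint64_t * s)
{
  *s ^= *s << 13;
  *s ^= *s >> 7;
  *s ^= *s << 17;
  return *s;
}

/* Fisher-Yates shuffle of the 'n' points of 'p', in place.
   The same 'seed' always gives the same permutation. */
void shuffle_points (Point * p, size_t n, uint64_t seed)
{
  uint64_t s = seed ? seed : 1; /* avoid the forbidden zero state */
  size_t i;

  for (i = n; i > 1; i--) {
    /* pick j in [0, i); the modulo bias is negligible for a sketch */
    size_t j = (size_t) (xorshift64 (&s) % i);
    Point t = p[i - 1];
    p[i - 1] = p[j];
    p[j] = t;
  }
}
```

The shuffled points would then be written out in their new order, one "x y z" line per point.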
