Xyz2kdt

From Gerris

Revision as of 01:20, 28 January 2010
Popinet (Talk | contribs)
(Note on database optimisation)

xyz2rsurface is a command-line utility used to create the R*-tree-indexed terrain databases used as input for the GfsRefineTerrain object of the Terrain module. A summary of the command-line syntax is given by

% xyz2rsurface -h
Usage: xyz2rsurface [OPTION] BASENAME

Converts the x, y and z coordinates on standard input to an
R*-tree-indexed database suitable for use with the
GfsRefineTerrain object of Gerris.

  -p N  --pagesize=N  sets the pagesize in bytes (default is 4096)
  -v    --verbose     display progress bar
  -h    --help        display this help and exit

Report bugs to s.popinet@niwa.co.nz

The format of the data on standard input should look like

3.501 5.634 -2
4.601 7.6778 3.456
...

where the first field is the value of the x-coordinate, the second field the y-coordinate and the last field the z-coordinate. xyz2rsurface will stop at the first line which does not fit this format. You may want to check (e.g. using the --verbose option) that the number of points processed matches what you expect. Note that the database does not enforce any other convention.

Example: building a global terrain topography database using the ETOPO2 dataset

The ETOPO2 dataset contains topographic information for the entire surface of the Earth (both above and below sea level) at a nominal resolution of two arc-minutes (~4 km).

The first step is to get the raw data e.g.

% wget http://www.ngdc.noaa.gov/mgg/global/relief/ETOPO2/ETOPO2v2-2006/ETOPO2v2c/raw_binary/ETOPO2v2c_i2_LSB.zip
% unzip ETOPO2v2c_i2_LSB.zip

This is a binary file whose format is described in the ETOPO2v2c_i2_LSB.hdr file:

% cat ETOPO2v2c_i2_LSB.hdr
NCOLS 10800
NROWS 5400
XLLCORNER -180.000000
YLLCORNER -90.000000
CELLSIZE 0.0333333333333333333
NODATA_VALUE 999999
BYTEORDER LSBFIRST
NUMBERTYPE 4_BYTE_FLOAT
MIN_VALUE -10791.0
MAX_VALUE 8440.0

We need to convert this binary file to a text file. We also want to use the database together with a cartographic projection defined using the Map module. By definition this means that our x-, y- and z-coordinates need to be the east-positive longitude, north-positive latitude and elevation in metres. We can easily get these coordinates in a text format suitable for input into xyz2rsurface using the following C code:

#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <stdint.h> /* int16_t */

/* check that this matches ETOPO2v2c_i2_LSB.hdr */
#define NCOLS 10800
#define NROWS 5400
#define XLLCORNER -180.000000
#define YLLCORNER -90.000000
#define CELLSIZE 0.0333333333333333333
#define NODATA_VALUE 999999
#define BYTEORDER LSBFIRST
#define NUMBERTYPE 4_BYTE_FLOAT
#define MIN_VALUE -10791.0
#define MAX_VALUE 8440.0

int main (int argc, char * argv[])
{
  double lat, lon;
  int16_t v;
  int i, j;

  for (j = 0; j < NROWS; j++) {
    lat = YLLCORNER + CELLSIZE*j;
    for (i = 0; i < NCOLS; i++) {
      lon = XLLCORNER + CELLSIZE*i;
      /* read outside assert() so that the call survives -DNDEBUG;
         samples are used as stored, i.e. a little-endian host is assumed */
      if (fread (&v, sizeof (int16_t), 1, stdin) != 1) {
        fprintf (stderr, "\nunexpected end of input\n");
        return 1;
      }
      assert (v >= MIN_VALUE && v <= MAX_VALUE);
      /* cell-centre coordinates; the sign flip maps the first row read
         to the northernmost latitude */
      printf ("%.8f %.8f %d\n", lon + CELLSIZE/2., - (lat + CELLSIZE/2.), v);
    }
    fprintf (stderr, "\rRow %d/%d", j + 1, NROWS);
  }
  fputc ('\n', stderr);
  return 0;
}

Just copy and paste this code into a file called e.g. etopo2xyz.c and compile using

% cc etopo2xyz.c -o etopo2xyz

The following command will then read the binary ETOPO2 file, convert it to the appropriate text format and generate the final terrain database

% ./etopo2xyz < ETOPO2v2c_i2_LSB.bin | xyz2rsurface etopo2

If everything went well you should end up (~ one hour and ~58 million points later) with four large files

% ls etopo2*
etopo2 etopo2.Data etopo2.DataPD etopo2.DirPD

which together define the terrain database.

Note also that when using several databases simultaneously within GfsRefineTerrain, you need to choose consistent conventions for all the databases (for example a common geodetic system e.g. WGS84). The terrain databases do not know anything about projection systems and it is up to you to enforce your preferred conventions.

Optimising the database layout

The procedure above produces a functional database, but it is not optimal because the points are inserted along horizontal/vertical coordinate lines. This leads to R*-tree bounding boxes with aspect ratios far from one. A simple way of improving the database is to insert the points in random order rather than along coordinate lines. This can be done easily using the following shell script:

#!/bin/sh

awk '{
  printf("%5d %s\n", int (rand()*2**16), $0);
}' | sort -T. -n -k 1,2 | cut -c7-

Cut and paste this into a file called myshuf then do

% chmod +x myshuf

The database can then be regenerated using

% ./etopo2xyz < ETOPO2v2c_i2_LSB.bin | ./myshuf | xyz2rsurface -v etopo2-shuffled

You will notice that this takes longer, but also that the database size is significantly reduced and that the aspect ratios of the bounding boxes are much improved (always less than two). This leads to significant performance improvements when using the database.
