PostgreSQL earth distance module

4 minute read

背景

PostgreSQL提供了一个扩展模块earthdistance,实际上是将地球构造为一个标准的圆球体(实际上是扁球体),利用cube或point来表示地球上的点。

其中cube是用来记录球坐标的,通过坐标来表示地球上的点。(实际上是做了一定约束的cube, 即域类型earth)

-- Define domain for locations on the surface of the earth using a cube  
-- datatype with constraints. cube provides 3D indexing.  
-- The cube is restricted to be a point, no more than 3 dimensions  
-- (for less than 3 dimensions 0 is assumed for the missing coordinates)  
-- and that the point must be very near the surface of the sphere  
-- centered about the origin with the radius of the earth.  

有三个约束

CREATE DOMAIN earth AS cube  
  CONSTRAINT not_point check(cube_is_point(value))  
  CONSTRAINT not_3d check(cube_dim(value) <= 3)  
  CONSTRAINT on_surface check(abs(cube_distance(value, '(0)'::cube) /  
  earth() - 1) < '10e-7'::float8);  

而point是用来记录纬度和经度的,通过纬度和经度来表示地球上的点。

使用纬度和经度来表示有一定的缺陷,例如有正负,有边界需要注意,用球坐标则不存在这个问题。

所以需要依赖另外一个模块cube,这个模块是多维数据结构模型,可以表示多维的点,多维的BOX区域。

例如我要计算北京和杭州的地表距离,只需要提供两个位置经纬度就可以计算,使用earthdistance计算的结果没有使用postgis计算的结果精确,(原因地球是扁球体,而非标准圆球体)PostGIS考虑了这一点。

例子:

create extension cube;  
create extension earthdistance;  

earth()函数,返回地球半径,单位千米。

CREATE FUNCTION earth() RETURNS float8  
LANGUAGE SQL IMMUTABLE  
AS 'SELECT ''6378168''::float8';  

创建完后会建立一个域类型earth, 其实是基于cube建的。

postgres=# select * from pg_type where typname='earth';  
-[ RECORD 1 ]--+------------  
typname        | earth  
typnamespace   | 2200  
typowner       | 10  
typlen         | -1  
typbyval       | f  
typtype        | d  
typcategory    | U  
typispreferred | f  
typisdefined   | t  
typdelim       | ,  
typrelid       | 0  
typelem        | 0  
typarray       | 0  
typinput       | domain_in  
typoutput      | cube_out  
typreceive     | domain_recv  
typsend        | -  
typmodin       | -  
typmodout      | -  
typanalyze     | -  
typalign       | d  
typstorage     | p  
typnotnull     | f  
typbasetype    | 16545  
typtypmod      | -1  
typndims       | 0  
typcollation   | 0  
typdefaultbin  |   
typdefault     |   
typacl         |   

将纬度,经度转换为earth坐标类型。第一个参数表示纬度,第二个参数表示经度。

postgres=# select ll_to_earth(30.3,120.2);  
-[ RECORD 1 ]--------------------------------------------------------  
ll_to_earth | (-2770071.42546437, 4759459.23949754, 3217961.94533299)  

将球坐标转换为经度或纬度。

postgres=# select latitude(ll_to_earth(30.3,120.2));  
-[ RECORD 1 ]--  
latitude | 30.3  
postgres=# select longitude(ll_to_earth(30.3,120.2));  
-[ RECORD 1 ]----  
longitude | 120.2  

直线距离和球体表面距离的换算。

直线距离转换为球面距离,

postgres=# select sec_to_gc(100000);  
-[ RECORD 1 ]--------------  
sec_to_gc | 100001.02425681  

球面距离转换为直线距离,

postgres=# select gc_to_sec(100001.02425681);  
-[ RECORD 1 ]-----  
gc_to_sec | 100000  

使用公式换算,用到圆周率pi()和地球半径:

CREATE FUNCTION sec_to_gc(float8)  
RETURNS float8  
LANGUAGE SQL  
IMMUTABLE STRICT  
AS 'SELECT CASE WHEN $1 < 0 THEN 0::float8 WHEN $1/(2*earth()) > 1 THEN pi()*earth() ELSE 2*earth()*asin($1/(2*earth())) END';  
  
CREATE FUNCTION gc_to_sec(float8)  
RETURNS float8  
LANGUAGE SQL  
IMMUTABLE STRICT  
AS 'SELECT CASE WHEN $1 < 0 THEN 0::float8 WHEN $1/earth() > pi() THEN 2*earth() ELSE 2*earth()*sin($1/(2*earth())) END';  

计算两个坐标的球面距离:

例如北京和杭州的球面距离,约1123公里。

postgres=# select earth_distance(ll_to_earth(39.9,116.4),ll_to_earth(30.3,120.2));  
-[ RECORD 1 ]--+----------------  
earth_distance | 1122999.5704515  

算法,用到了CUBE的cube_distance函数,先计算直线距离,再转换为球体表面距离:

CREATE FUNCTION earth_distance(earth, earth)  
RETURNS float8  
LANGUAGE SQL  
IMMUTABLE STRICT  
AS 'SELECT sec_to_gc(cube_distance($1, $2))';  

最后一个例子是范围,用户提供球体表面的一个坐标,以及一个半径信息,用来表示球体表面的一个以这个坐标为中心辐射的半径范围,但实际测试发现有一定的偏差,可能和cube的计算方法有关系。

earth_box(earth, float8) return cube	  
  
Returns a box suitable for an indexed search using the cube @> operator for points within a given great circle distance of a location.   
Some points in this box are further than the specified great circle distance from the location, so a second check using earth_distance should be included in the query.  

算法,还是要用到CUBE提供的cube_enlarge函数,这个函数是将多维坐标的每个坐标都延展一个半径的距离,实际上已经不是球体了,而是立方体:

CREATE FUNCTION earth_box(earth, float8)  
RETURNS cube  
LANGUAGE SQL  
IMMUTABLE STRICT  
AS 'SELECT cube_enlarge($1, gc_to_sec($2), 3)';  

第一个参数是坐标,第二个参数是直线距离(球面距离转换为直线距离),第三个参数是第一个参数的维度,这里是3维。所以用3表示。

cube_enlarge(cube c, double r, int n) returns cube	  
  
Increases the size of a cube by a specified radius in at least n dimensions.   
If the radius is negative the cube is shrunk instead. This is useful for creating bounding boxes around a point for searching for nearby points.   
All defined dimensions are changed by the radius r. LL coordinates are decreased by r and UR coordinates are increased by r.   
If a LL coordinate is increased to larger than the corresponding UR coordinate (this can only happen when r < 0) than both coordinates are set to their average.   
If n is greater than the number of defined dimensions and the cube is being increased (r >= 0) then 0 is used as the base for the extra coordinates.  

例子:

将一个平面坐标(1,1)的每个方向扩展1个距离单位,得到(0, 0),(2, 2)。你在纸上画一下就知道怎么回事了。

postgres=# select cube_enlarge('(1,1)',1,2);  
 cube_enlarge    
---------------  
 (0, 0),(2, 2)  
(1 row)  

现在回到earthdistance的例子,例如我们前面计算得到杭州到北京的球面距离是1123公里,那么我以杭州为中心,辐射多少距离才能包含北京呢?如果是圆的话,辐射半径应该要达到1123公里才能包含北京。但是实际上使用earth_box得到的结果并非如此。

先看看扩展10米得到的BOX,每个坐标方向正反各延展了10米,得到2个点,即一个BOX:

postgres=# select ll_to_earth(30.3,120.2);  
                       ll_to_earth                         
---------------------------------------------------------  
 (-2770071.42546437, 4759459.23949754, 3217961.94533299)  
(1 row)  
postgres=# select earth_box(ll_to_earth(30.3,120.2), 10);  
                                                    earth_box                                                      
-----------------------------------------------------------------------------------------------------------------  
 (-2770081.42546437, 4759449.23949754, 3217951.94533299),(-2770061.42546437, 4759469.23949754, 3217971.94533299)  
(1 row)  
  
postgres=# select earth_box(ll_to_earth(30.3,120.2), 1122999.5704515);  
                                                    earth_box                                                      
-----------------------------------------------------------------------------------------------------------------  
 (-3891620.99816939, 3637909.66679251, 2096412.37262797),(-1648521.85275934, 5881008.81220257, 4339511.51803802)  
(1 row)  

这个BOX是否包含北京呢?

postgres=# select earth_box(ll_to_earth(30.3,120.2), 1122999.5704515) @> ll_to_earth(39.9,116.4);  
 ?column?   
----------  
 t  
(1 row)  
postgres=# select ll_to_earth(39.9,116.4);  
                      ll_to_earth                         
--------------------------------------------------------  
 (-2175648.05107066, 4382814.5785906, 4091273.51368619)  
(1 row)  

实际上,800多公里就包含北京了,因为这个BOX是立方体,而不是球体,北京被包含在800多公里的球体外面,立方体里面。

postgres=# select earth_box(ll_to_earth(30.3,120.2), 880999.5704515) @> ll_to_earth(39.9,116.4);  
 ?column?   
----------  
 t  
(1 row)  

用PostGIS可以很好的解决这个问题。

earthdistance还支持使用point来表示坐标,计算得到的是英里单位。

经度,纬度。

postgres=# select point(116.4,39.9) <@> point(120.2,30.3);  
    ?column?       
-----------------  
 697.01393638328  
(1 row)  

转换为公里。

postgres=# select 697.01393638328*1.60931;  
       ?column?          
-----------------------  
 1121.7114979609763368  
(1 row)  

与使用cube坐标计算得到的结果有一定的偏差,误差来自定义的地球半径不一样,见代码。

postgres=# select 3958.747716*1.60931;  
     ?column?       
------------------  
 6370.85228683596  
(1 row)  

earth()函数得到的是 AS ‘SELECT ‘‘6378168’’::float8’;

调整一下,两种坐标表示方法得到的结果就一致了:

postgres=# CREATE or REPLACE FUNCTION earth() RETURNS float8  
LANGUAGE SQL IMMUTABLE  
AS 'SELECT ''6370852''::float8';  
CREATE FUNCTION  
  
postgres=# select earth_distance(ll_to_earth(39.9,116.4),ll_to_earth(30.3,120.2));  
  earth_distance    
------------------  
 1121711.44745797  
(1 row)  

代码如下:

--------------- geo_distance as operator <@>  
CREATE OPERATOR <@> (  
  LEFTARG = point,  
  RIGHTARG = point,  
  PROCEDURE = geo_distance,  
  COMMUTATOR = <@>  
);  
  
--------------- geo_distance  
CREATE FUNCTION geo_distance (point, point)  
RETURNS float8  
LANGUAGE C IMMUTABLE STRICT AS 'MODULE_PATHNAME';  
#include "postgres.h"  
  
#include <math.h>  
  
#include "utils/geo_decls.h"    /* for Point */  
  
#ifndef M_PI  
#define M_PI 3.14159265358979323846  
#endif  
  
  
PG_MODULE_MAGIC;  
  
/* Earth's radius is in statute miles. 英里单位 */  
static const double EARTH_RADIUS = 3958.747716;  
static const double TWO_PI = 2.0 * M_PI;  
  
  
/******************************************************  
 *  
 * geo_distance_internal - distance between points  
 *  
 * args:  
 *       a pair of points - for each point,  
 *         x-coordinate is longitude in degrees west of Greenwich  
 *         y-coordinate is latitude in degrees above equator  
 *  
 * returns: double  
 *       distance between the points in miles on earth's surface  
 ******************************************************/  
  
static double  
geo_distance_internal(Point *pt1, Point *pt2)  
{  
        double          long1,  
                                lat1,  
                                long2,  
                                lat2;  
        double          longdiff;  
        double          sino;  
  
        /* convert degrees to radians */  
  
        long1 = degtorad(pt1->x);  
        lat1 = degtorad(pt1->y);  
  
        long2 = degtorad(pt2->x);  
        lat2 = degtorad(pt2->y);  
  
        /* compute difference in longitudes - want < 180 degrees */  
        longdiff = fabs(long1 - long2);  
        if (longdiff > M_PI)  
                longdiff = TWO_PI - longdiff;  
  
        sino = sqrt(sin(fabs(lat1 - lat2) / 2.) * sin(fabs(lat1 - lat2) / 2.) +  
                        cos(lat1) * cos(lat2) * sin(longdiff / 2.) * sin(longdiff / 2.));  
        if (sino > 1.)  
                sino = 1.;  
  
        return 2. * EARTH_RADIUS * asin(sino);  
}  

参考

1. http://www.postgresql.org/docs/9.4/static/earthdistance.html

Flag Counter

digoal’s 大量PostgreSQL文章入口