PostgreSQL PostGIS 的5种空间距离排序(knn)算法
背景
PostgreSQL GiST索引支持排序接口,可以支撑空间、标准、数组、文本向量、文本等类型的排序。
分别表示:
1、给定一空间位置,按空间距离排序输出
2、给定一数组、全文检索值、字符串,按数组、全文检索、字符串相似度排序输出
3、给定一标准类型的值,按标准类型的距离排序
数据库目前支持哪些排序操作符,可以参考:《PostgreSQL 如何确定某个opclass支持哪些操作符(支持索引),JOIN方法,排序》
postgres=# select oprleft::regtype,oprright::regtype,oprresult::regtype,oprname,oprkind,oprcode from pg_operator where oid in (select amopopr from pg_amop where amopsortfamily<>0);
oprleft | oprright | oprresult | oprname | oprkind | oprcode
-----------------------------+-----------------------------+------------------+---------+---------+------------------------------------
point | point | double precision | <-> | b | point_distance
circle | point | double precision | <-> | b | dist_cpoint
polygon | point | double precision | <-> | b | dist_polyp
geometry | geometry | double precision | <-> | b | geometry_distance_centroid
geometry | geometry | double precision | <#> | b | geometry_distance_box
geometry | geometry | double precision | <<->> | b | geometry_distance_centroid_nd
geometry | geometry | double precision | |=| | b | geometry_distance_cpa
geography | geography | double precision | <-> | b | geography_distance_knn
oprcode是这个操作符对应的函数,要知道操作符的算法,可以下载postgis源码根据函数找到其C代码。
安装了PostGIS后,对于空间排序,目前支持5种算法
PostGIS 空间排序算法
1、2D平面,两个对象实际距离
<-> — Returns the 2D distance between A and B.
2、计算两条有效轨迹类型的距离(如果两个轨迹没有交集, 返回空)
|=| — Returns the distance between A and B trajectories at their closest point of approach.
3、2D平面,两个对象bound box实面的距离(所以如果BOUND BOX与另一对象 相交,包含,说明距离为0)
<#> — Returns the 2D distance between A and B bounding boxes.
4、与1类似,多维对象,返回两个bound box 中心点的距离
<<->> — Returns the n-D distance between the centroids of A and B bounding boxes.
5、与3类似,多维对象,返回两个bound box 实体的距离
<<#>> — Returns the n-D distance between A and B bounding boxes.
例子
1、点和线段的实际距离
postgres=# select st_setsrid(st_makepoint(1,2),4326) <-> st_setsrid(ST_MakeLine(ST_MakePoint(0,0.9), ST_MakePoint(3,4)),4326);
?column?
--------------------
0.0463614285010702
(1 row)
2、点和线段(BOUND BOX实体)的距离
postgres=# select st_setsrid(st_makepoint(1,2),4326) <#> st_setsrid(ST_MakeLine(ST_MakePoint(0,0.9), ST_MakePoint(3,4)),4326);
?column?
----------
0
(1 row)
3、两个不相交轨迹(范围没有相交)的距离
postgres=# WITH inp AS ( SELECT
ST_AddMeasure('LINESTRING Z (0 0 0, 10 0 5)'::geometry,
extract(epoch from '2015-05-26 10:01'::timestamptz),
extract(epoch from '2015-05-26 11:01'::timestamptz)
) a,
ST_AddMeasure('LINESTRING Z (0 2 10, 12 1 2)'::geometry,
extract(epoch from '2015-05-25 01:00'::timestamptz),
extract(epoch from '2015-05-25 13:00'::timestamptz)
) b
)
SELECT ST_DistanceCPA(a,b) distance FROM inp;
distance
----------
(1 row)
4、两个有相交(范围有交集)的距离 (lower1~upper1 与 lower2~upper2 有相交)
postgres=# WITH inp AS ( SELECT
ST_AddMeasure('LINESTRING Z (0 0 0, 10 0 5)'::geometry,
extract(epoch from '2015-05-26 10:01'::timestamptz), -- lower
extract(epoch from '2015-05-26 11:01'::timestamptz) -- upper
) a,
ST_AddMeasure('LINESTRING Z (0 2 10, 12 1 2)'::geometry,
extract(epoch from '2015-05-25 01:00'::timestamptz), -- lower
extract(epoch from '2015-05-26 13:00'::timestamptz) -- upper
) b
)
SELECT ST_DistanceCPA(a,b) distance FROM inp;
distance
------------------
3.07479618495659
(1 row)
5、两个有相交(范围有交集)的距离
postgres=# WITH inp AS ( SELECT
ST_AddMeasure('LINESTRING Z (0 0 0, 10 0 5)'::geometry,
extract(epoch from '2015-05-26 10:01'::timestamptz),
extract(epoch from '2015-05-26 11:01'::timestamptz)
) a,
ST_AddMeasure('LINESTRING Z (0 2 10, 12 1 2)'::geometry,
extract(epoch from '2015-05-25 01:00'::timestamptz),
extract(epoch from '2015-05-26 10:10'::timestamptz)
) b
)
SELECT ST_DistanceCPA(a,b) distance FROM inp;
distance
-----------------
10.621322893124
(1 row)
排序用法
select * from tbl order by geom <-> 某个空间对象常量;
参考
http://workshops.boundlessgeo.com/postgis-intro/knn.html
http://postgis.net/docs/manual-2.4/reference.html#Operators
http://postgis.net/docs/manual-2.4/geometry_distance_cpa.html
http://postgis.net/docs/manual-2.4/ST_DistanceCPA.html
《PostgreSQL 如何确定某个opclass支持哪些操作符(支持索引),JOIN方法,排序》