当前位置：

首页
/
IT
/
程序
/
PostgreSQL 实时位置跟踪 + 轨迹分析系统实践 - 单机顶千亿轨迹 / 天

PostgreSQL 实时位置跟踪 + 轨迹分析系统实践 - 单机顶千亿轨迹 / 天

数据存储与数据库 PostgreSQL random 数据库 pos 索引 Create update 存储数组

摘要： 标签 PostgreSQL , PostGIS , 动态更新位置 , 轨迹跟踪 , 空间分析 , 时空分析背景随着移动设备的普及，越来越多的业务具备了时空属性，例如快递，试试跟踪包裹、快递员位置。

背景

随着移动设备的普及，越来越多的业务具备了时空属性，例如快递，试试跟踪包裹、快递员位置。例如实体，具备了空间属性。

例如餐饮配送，送货员位置属性。例如车辆，实时位置。等等。

其中两大需求包括：

1、对象位置实时跟踪，例如实时查询某个位点附近、或某个多边形区域内的送货员。

2、对象位置轨迹记录和分析。结合地图，分析轨迹，结合路由算法，预测、生成最佳路径等。

DEMO

以快递配送为例，GPS 设备实时上报快递员轨迹，写入位置跟踪系统，同时将轨迹记录永久保存到轨迹分析系统。

由于快递员可能在配送过程中停留时间较长（比如在某个小区配送时），上报的多条位置可能变化并不大，同时考虑到数据库更新消耗，以及位置的时效性，可以避免一些点的更新（打个比方，上一次位置和当前位置变化量在 50 米时，不更新）。

动态更新可以减少数据库的更新量，提高整体吞吐能力。

设计

实时位置更新

1、建表

create table t_pos (
  uid int primary key,--传感器、快递员、车辆、。。。对象ID
  pos point,--位置
  mod_time timestamp     --最后修改时间
);
create index idx_t_pos_1 on t_pos using gist (pos);

真实环境中，我们可以使用 PostGIS 空间数据库插件，使用 geometry 数据类型来存储经纬度点。

create extension postgis;
create table t_pos (
  uid int primary key,--传感器、快递员、车辆、。。。对象ID
  pos geometry,--位置
  mod_time timestamp     --最后修改时间
);
create index idx_t_pos_1 on t_pos using gist (pos);

2、上报位置，自动根据移动范围，更新位置。

例如，移动距离 50 米以内，不更新。

insert into t_pos values (?, st_setsrid(st_makepoint($lat, $lon),4326), now())
on conflict (uid)
do update set pos=excluded.pos, mod_time=excluded.mod_time
where st_distancespheroid(t_pos.pos, excluded.pos,'SPHEROID["WGS84",6378137,298.257223563]')>?;--超过多少米不更新

历史轨迹保存

通常终端会批量上报数据，例如每隔 10 秒上报 10 秒内采集的点，一次上报的数据可能包含多个点，在 PostgreSQL 中可以以数组存储。

create table t_pos_hist (
  uid int,--传感器、快递员、车辆、。。。对象ID
  pos point[],--批量上报的位置
  crt_time timestamp[]--批量上报的时间点
);
create index idx_t_pos_hist_uid on t_pos_hist (uid);--对象ID
create index idx_t_pos_hist_crt_time on t_pos_hist ((crt_time[1]));--对每批数据的起始时间创建索引

有必要的话，可以多存一个时间字段，用于分区。

历史轨迹分析

动态位置变更压测

写入并合并，同时判断当距离大于 50 时，才更新，否则不更新。

（测试）如果使用 point 类型，则使用如下 SQL

insert into t_pos values (1, point(1,1), now())
on conflict (uid)
do update set pos=excluded.pos, mod_time=excluded.mod_time
where t_pos.pos <-> excluded.pos >50;

（实际生产）如果使用 PostGIS 的 geometry 类型，则使用如下 SQL

insert into t_pos values (1, st_setsrid(st_makepoint(120,71),4326), now())
on conflict (uid)
do update set pos=excluded.pos, mod_time=excluded.mod_time
where st_distancespheroid(t_pos.pos, excluded.pos,'SPHEROID["WGS84",6378137,298.257223563]')>50;

压测

首先生成 1 亿随机空间对象数据。

postgres=# insert into t_pos select generate_series(1,100000000), point(random()*10000, random()*10000), now();
INSERT 0100000000
Time:250039.193 ms (04:10.039)

压测脚本如下，1 亿空间对象，测试动态更新性能（距离 50 以内，不更新）。

vi test.sql
\set uid random(1,100000000)
insert into t_pos
select uid, point(pos[0]+random()*100-50, pos[1]+random()*100-50), now()from t_pos where uid=:uid
on conflict (uid)
do update set pos=excluded.pos, mod_time=excluded.mod_time
where t_pos.pos <-> excluded.pos >50;

压测结果，动态更新 21.6 万点 / s，187 亿点 / 天。

pgbench -M prepared -n -r -P 1-f ./test.sql -c 64-j 64-T 120
number of transactions actually processed:26014936
latency average =0.295 ms
latency stddev =0.163 ms
tps =216767.645838(including connections establishing)
tps =216786.403543(excluding connections establishing)

轨迹写入压测

每个 UID，每批写入 50 条：写入速度约 467.5 万点 / s，4039 亿点 / 天。

压测时，写多表，压测使用动态 SQL。

do language plpgsql $
declare
begin
for i in0..127 loop
    execute 'create table t_pos_hist'||i||' (like t_pos_hist including all)';
end loop;
end;
$;

create or replace function import_test(int) returns voidas $
declare
begin
  execute format('insert into t_pos_hist%s values (%s, %L, %L)', mod($1,128), $1,
  array[point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1), point(1,1)],
  array['2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10','2018-01-01 10:10:10']);
end;
$ language plpgsql strict;

vi test1.sql
\set uid random(1,100000000)
select import_test(:uid);

pgbench -M prepared -n -r -P 1-f ./test1.sql -c 56-j 56-T 120
number of transactions actually processed:11220725
latency average =0.599 ms
latency stddev =5.452 ms
tps =93504.532256(including connections establishing)
tps =93512.274135(excluding connections establishing)

黑科技

1、块级索引（BRIN），在时序属性字段上，建立块级索引，既能达到高效检索目的，又能节约索引空间，还能加速写入。

《PostgreSQL BRIN 索引的 pages_per_range 选项优化与内核代码优化思考》《万亿级电商广告 - brin 黑科技带你 (最低成本) 玩转毫秒级圈人(视觉挖掘姊妹篇) - 阿里云 RDS PostgreSQL, HybridDB for PostgreSQL 最佳实践》《PostGIS 空间索引 (GiST、BRIN、R-Tree) 选择、优化 - 阿里云 RDS PostgreSQL 最佳实践》《自动选择正确索引访问接口 (btree,hash,gin,gist,sp-gist,brin,bitmap...) 的方法》《PostgreSQL 并行写入堆表，如何保证时序线性存储 - BRIN 索引优化》《PostgreSQL 9 种索引的原理和应用场景》

2、阿里云 HDB PG 特性：sort key , metascan

与 BRIN 类似，适合线性数据，自动建立块级元数据 (取值范围、平均值、CNT、SUM 等) 进行过滤。

3、空间索引

GiST, SP-GiST 空间索引，适合空间数据、以及其他异构数据。

4、动态合并写，根据位置变化量，自动判断是否需要合并更新。

insert on conflict 语法，在 do update 里面，可以进行条件过滤，当位置变化超过 N 米时，才进行更新。

5、数组、JSON、KV 等多值类型。

特别适合多值属性，例如批量上传的轨迹，通常 GPS 终端上报位置并不是实时的，可能存在一定的延迟（例如批量上报）。使用数组、JSON 都可以存储。

如果使用数组存储，将来分析轨迹时，依旧可以 unnest 解开，绘制轨迹。

性能

1、动态位置变更：1 亿被跟踪对象，TPS：21.6 万，动态更新 21.6 万点 / s，187 亿点 / 天。

2、轨迹写入：tps 约 10 万，写入 467.5 万点 / s，4039 亿点 / 天。

参考

《PostGIS 空间数据学习建议》《PostgreSQL + PostGIS + SFCGAL 优雅的处理 3D 数据》《PostGIS 距离计算建议 - 投影与球坐标系, geometry 与 geography 类型》《PostgreSQL 10 + PostGIS + Sharding(pg_pathman) + MySQL(fdw 外部表) on ECS 部署指南 (适合新用户)》《PostGIS 空间索引 (GiST、BRIN、R-Tree) 选择、优化 - 阿里云 RDS PostgreSQL 最佳实践》《PostGIS 坐标转换 (SRID) 的边界问题引发的专业知识 - ST_Transform》《无人驾驶背后的技术 - PostGIS 点云 (pointcloud) 应用 - 2》《无人驾驶背后的技术 - PostGIS 点云 (pointcloud) 应用 - 1》《geohash vs PostGIS》《视觉挖掘与 PostGIS 空间数据库的完美邂逅 - 广告营销 \ 圈人》《PostGIS 点面叠加视觉判断输出》《PostGIS 多点几何类型空字符构造异常 CASE》《开放地图 OpenStreetMap 与 PostGIS 的三生三世十里桃花》《PostGIS 地理信息、栅格数据多核并行处理 (st_memunion, st_union)》《蜂巢的艺术与技术价值 - PostgreSQL PostGIS's hex-grid》《如何建立 GIS 测试环境 - 将 openstreetmap 的样本数据导入 PostgreSQL PostGIS 库》《GIS 附近查找性能优化 - PostGIS long lat geometry distance search tuning using gist knn function》

本文为云栖社区原创内容，未经允许不得转载，如需转载请发送邮件至 yqeditor@list.alibaba-inc.com；如果您发现本社区中有涉嫌抄袭的内容，欢迎发送邮件至：yqgroup@service.aliyun.com 进行举报，并提供相关证据，一经查实，本社区将立刻删除涉嫌侵权内容。

用云栖社区 APP，舒服~

【云栖快讯】中办国办印发《推进互联网协议第六版（IPv6）规模部署行动计划》加快推进基于 IPv6 的下一代互联网规模部署，计划指出 2025 年末中国 IPv6 规模要达到世界第一, 阿里云也第一时间宣布了将全面提供 IPv6 服务，那么在全面部署 IPV6 前，你需要了解都在这儿详情请点击

评论文章 (0) (0) (0)

来源: https://yq.aliyun.com/articles/328133

与本文相关文章

暂无,快来抢沙发吧！