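Geolocation (geo) processing with MySQL functions

More and more services are built on LBS (location-based services): people nearby, food-delivery locations, nearby merchants, and so on. This article looks at solutions for the "nearest to me" scenario.

Known solutions include:

Distance calculation with a MySQL user-defined function

MySQL geo index

MongoDB geo index

PostgreSQL PostGIS index

Redis geo

Elasticsearch

This article tests the performance of the MySQL function approach.

Preparation

Create the table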

CREATE TABLE `driver` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `lng` float DEFAULT NULL,
  `lat` float DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

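Create the test data

Before generating the data, a few basics of geography:

Worldwide, latitude ranges from -90 to 90 and longitude from -180 to 180.

China spans roughly latitude 3.86 to 53.55 and longitude 73.66 to 135.05.

Beijing's administrative center is at latitude 39.92, longitude 116.46.

The further north a place is, the larger its latitude; the further east, the larger its longitude.

Degrees-minutes conversion: to convert a value given in degrees and minutes to decimal degrees, use degrees = degrees + minutes / 60

Degrees-minutes-seconds conversion: to convert a value given in degrees, minutes and seconds to decimal degrees, use degrees = degrees + minutes / 60 + seconds / 60 / 60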
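The two conversion formulas above can be written as one small helper. This is a minimal sketch; the function name `dms_to_degrees` and the way negative (south or west) values are handled are assumptions made for illustration.

```python
def dms_to_degrees(degrees, minutes=0.0, seconds=0.0):
    """Convert degrees/minutes/seconds to decimal degrees.

    Hypothetical helper: the sign of `degrees` is applied to the whole
    value, so -73 degrees 30 minutes becomes -73.5.
    """
    sign = -1.0 if degrees < 0 else 1.0
    return sign * (abs(degrees) + minutes / 60 + seconds / 60 / 60)


# 39 degrees 55 minutes 12 seconds, close to Beijing's latitude of 39.92
print(dms_to_degrees(39, 55, 12))
# Degrees and minutes only
print(dms_to_degrees(116, 30))
```

Leaving `seconds` at its default gives the degrees-minutes formula; passing all three arguments gives the degrees-minutes-seconds one.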

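At a fixed latitude:

Every 0.00001 degrees of longitude corresponds to roughly 1 meter (the exact value shrinks toward the poles, because it scales with the cosine of the latitude).

At a fixed longitude:

Every 0.00001 degrees of latitude corresponds to roughly 1.1 meters.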
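These figures are approximations, and the longitude one depends on where you are. A rough sketch of the arithmetic, assuming a spherical Earth with a radius of 6,371,000 m; the function names are illustrative only.

```python
import math

EARTH_RADIUS_M = 6371000  # assumption: the Earth is a perfect sphere


def meters_per_lat_step(step_deg=0.00001):
    """Length in meters of `step_deg` degrees of latitude."""
    return math.radians(step_deg) * EARTH_RADIUS_M


def meters_per_lng_step(lat_deg, step_deg=0.00001):
    """Length in meters of `step_deg` degrees of longitude at latitude `lat_deg`."""
    return meters_per_lat_step(step_deg) * math.cos(math.radians(lat_deg))


print(meters_per_lat_step())        # same at every latitude
print(meters_per_lng_step(0.0))     # at the equator
print(meters_per_lng_step(39.92))   # at Beijing's latitude
```

Under this assumption a latitude step is about 1.11 m everywhere, while a longitude step is about 1.11 m at the equator and noticeably shorter at Beijing's latitude.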

Distance calculation with a MySQL function

DELIMITER //
CREATE DEFINER=`root`@`localhost` FUNCTION `getDistance`(
    `lng1` float(10,7),
    `lat1` float(10,7),
    `lng2` float(10,7),
    `lat2` float(10,7)
) RETURNS double
    DETERMINISTIC
    COMMENT 'Distance in meters between two coordinate points'
BEGIN
    declare d double;
    declare radius int;
    set radius = 6371000; # assume the Earth is a perfect sphere with a radius of 6371000 meters
    # haversine formula: d = 2 * R * atan2(sqrt(a), sqrt(1 - a))
    set d = (2*ATAN2(SQRT(SIN((lat1-lat2)*PI()/180/2)
        *SIN((lat1-lat2)*PI()/180/2)+
        COS(lat2*PI()/180)*COS(lat1*PI()/180)
        *SIN((lng1-lng2)*PI()/180/2)
        *SIN((lng1-lng2)*PI()/180/2)),
        SQRT(1-SIN((lat1-lat2)*PI()/180/2)
        *SIN((lat1-lat2)*PI()/180/2)
        -COS(lat2*PI()/180)*COS(lat1*PI()/180)
        *SIN((lng1-lng2)*PI()/180/2)
        *SIN((lng1-lng2)*PI()/180/2))))*radius;
    return d;
END//
DELIMITER ;
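To cross-check the values returned by `getDistance`, the same haversine formula can be written in Python. This is a sketch under the same spherical-Earth assumption (radius 6,371,000 m); the function name `haversine_m` is illustrative.

```python
import math

EARTH_RADIUS_M = 6371000  # same spherical-Earth radius as the SQL function


def haversine_m(lng1, lat1, lng2, lat2):
    """Great-circle distance in meters between two (lng, lat) points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = math.radians(lat1 - lat2)
    dlmb = math.radians(lng1 - lng2)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    a = min(1.0, a)  # guard against rounding slightly above 1
    return 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a)) * EARTH_RADIUS_M


# Distance from Beijing's administrative center to the test point used later
print(haversine_m(116.46, 39.92, 134.38753, 18.56734))
```

Note that the second argument of `atan2` is the square root of `1 - a`, where `a` is the whole sum inside the first square root.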

Python script to generate the data

# coding=utf-8
"""Insert random driver coordinates for testing.

China spans roughly latitude 3.86~53.55 and longitude 73.66~135.05.
0.00001 degrees is roughly 1 meter.
"""
import logging
import random
import threading

from orator import DatabaseManager, Model

# Create the logger
logger = logging.getLogger()
handler = logging.StreamHandler()
formatter = logging.Formatter(
    '%(asctime)s %(name)-12s %(levelname)-8s %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

# Connect to the database
config = {
    'mysql': {
        'driver': 'mysql',
        'host': 'localhost',
        'database': 'dbtest',
        'user': 'root',
        'password': '',
        'prefix': ''
    }
}
db = DatabaseManager(config)
Model.set_connection_resolver(db)


class Driver(Model):
    __table__ = 'driver'
    __timestamps__ = False


def ins_driver(thread_name, nums):
    """Insert `nums` rows with random coordinates inside China's bounding box."""
    logger.info('Starting thread %s' % thread_name)
    for _ in range(nums):
        lng = '%.5f' % random.uniform(73.66, 135.05)
        lat = '%.5f' % random.uniform(3.86, 53.55)
        driver = Driver()
        driver.lng = lng
        driver.lat = lat
        driver.save()


thread_nums = 10
for i in range(thread_nums):
    # 10 threads x 40,000 rows each = 400,000 rows in total
    t = threading.Thread(target=ins_driver, args=(i, 40000))
    t.start()


The script above starts 10 threads, each inserting 40,000 rows. It took 150.18 s to finish and inserted 400,000 rows in total.
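The script issues one `save()` per row, so every row is a separate statement sent to MySQL. A common alternative is to build the rows in batches and send each batch as a single multi-row insert. The sketch below only generates the batches; the helper name `random_driver_batches`, the batch size and the `seed` parameter are assumptions, and this variant was not part of the measurement above.

```python
import random


def random_driver_batches(total, batch_size=1000, seed=None):
    """Yield lists of {'lng': ..., 'lat': ...} dicts, `batch_size` rows at a time.

    Hypothetical helper: coordinates are drawn from the same bounding box
    as the script above (lng 73.66~135.05, lat 3.86~53.55).
    """
    rng = random.Random(seed)
    batch = []
    for _ in range(total):
        batch.append({
            'lng': round(rng.uniform(73.66, 135.05), 5),
            'lat': round(rng.uniform(3.86, 53.55), 5),
        })
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # last, possibly shorter, batch
        yield batch


for rows in random_driver_batches(2500, batch_size=1000, seed=1):
    print(len(rows))
```

Each yielded list could then be passed to whatever bulk-insert call the database layer provides, for example a query-builder `insert` that accepts a list of dicts; whether that is faster in this setup would need to be measured.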

Test

Test environment

OS: macOS

Memory: 16 GB

CPU: Intel Core i5

Disk: 500 GB SSD

Find the 10 drivers closest to the point (134.38753, 18.56734), given as (longitude, latitude):

select *,`getDistance`(134.38753,18.56734,`lng`,`lat`) as dis from driver ORDER BY dis limit 10

Elapsed time: 18.0 s

EXPLAIN: full table scan
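The full table scan follows from the shape of the query: `getDistance` has to be evaluated for every row before the rows can be sorted, so an ordinary index on `lng` or `lat` cannot serve this ORDER BY. The in-memory sketch below mirrors that logic; names such as `nearest_drivers` and `distance_m` are illustrative, and it assumes the same spherical-Earth haversine formula.

```python
import heapq
import math

EARTH_RADIUS_M = 6371000  # spherical-Earth assumption


def distance_m(lng1, lat1, lng2, lat2):
    """Haversine distance in meters between two (lng, lat) points in degrees."""
    dphi = math.radians(lat1 - lat2)
    dlmb = math.radians(lng1 - lng2)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(dlmb / 2) ** 2)
    a = min(1.0, a)  # guard against rounding slightly above 1
    return 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a)) * EARTH_RADIUS_M


def nearest_drivers(origin, drivers, limit=10):
    """Return the `limit` closest drivers as (distance_m, id) pairs.

    `origin` is (lng, lat); `drivers` is an iterable of (id, lng, lat).
    Every driver is visited once, just as the SQL query visits every row.
    """
    lng0, lat0 = origin
    scored = ((distance_m(lng0, lat0, lng, lat), driver_id)
              for driver_id, lng, lat in drivers)
    return heapq.nsmallest(limit, scored)


sample = [(1, 134.4, 18.6), (2, 100.0, 30.0), (3, 134.38753, 18.56734)]
print(nearest_drivers((134.38753, 18.56734), sample, limit=2))
```

The work grows with the number of drivers regardless of `limit`, which matches the roughly linear growth in query time reported in this test.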

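I repeated the test with table sizes from 10,000 to 100,000 rows in steps of 10,000, and from 100,000 to 900,000 rows in steps of 100,000, to see how the query time changes with the amount of data.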


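Conclusion

With this approach, the query takes more than 1 second once the table reaches about 30,000 rows.

Each additional 10,000 rows adds roughly 0.4 seconds to the query time.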
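As a rough check of that rule of thumb, the arithmetic can be written out. This sketch assumes the growth stays linear over the whole range, and the function name is illustrative.

```python
SECONDS_PER_10K_ROWS = 0.4  # figure taken from the conclusion above


def estimated_query_seconds(rows):
    """Linear estimate of query time for a table with `rows` rows."""
    return rows / 10000 * SECONDS_PER_10K_ROWS


for rows in (30000, 400000, 900000):
    print(rows, round(estimated_query_seconds(rows), 1))
```

At this rate 30,000 rows come to about 1.2 s and the 400,000-row table to about 16 s, which is in the same range as the 18.0 s measured in the test.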

Source: https://www.cnblogs.com/fu-yong/p/9896594.html
