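This article shows how to use OpenCV to detect lane lines offline in simple driving scenes. It walks through the logic of the whole pipeline and introduces the concepts used at each step. The main text is implemented in C++; a Python implementation is attached at the end, so the project can be reproduced from this article alone.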
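1. Overall approach
A simple lane-line detector can be built from the following steps:
Read video -> grayscale conversion -> Gaussian blur -> edge detection -> region-of-interest masking -> Hough transform -> lane-line fitting -> image blending
The sections below implement these steps one by one until lane lines are detected. A video is a sequence of image frames, so detecting lane lines in a video comes down to detecting them in each image.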
2. Lane-line detection on a single image
1) Include the required headers
- #include <iostream>
- #include <numeric>   // std::accumulate, used in the lane-fitting step
- #include <string>
- #include <vector>
- #include <opencv2/opencv.hpp>
- using namespace std;
- using namespace cv;
2) Read the image
- //*************reading image******************
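- Mat image;
- image = imread("/home/project1/test1.jpg");
- if (image.empty()) {
-     cout << "reading error" << endl;
-     return -1;
- }
In OpenCV, image data is stored in a Mat, which is essentially a matrix. This step is simple, but two points deserve attention. First, the path separator in the imread argument differs between Linux (/) and Windows (\), so an absolute path is the least error-prone choice. Second, after reading the image, check image.empty() to confirm that the file was actually loaded.
Original image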
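3) Grayscale conversion
- //***************gray image*******************
- Mat image_gray;
- cvtColor(image, image_gray, COLOR_BGR2GRAY);  // CV_BGR2GRAY is the older C-API name and may not compile with OpenCV 4
The cvtColor function converts the color image to grayscale directly. Note that imread returns channels in BGR order, which is why the conversion code is BGR2GRAY rather than RGB2GRAY. The function takes three arguments here: the input image, the output image, and the conversion code.
Grayscale image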
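To make the conversion concrete, the sketch below applies the luma weights that OpenCV documents for its color-to-gray conversion (Y = 0.299 R + 0.587 G + 0.114 B) to a single pixel. The function name bgrToGray is hypothetical and the code uses only the standard library; it illustrates the arithmetic and is not OpenCV's actual implementation.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: convert one BGR pixel to a gray level using the
// documented weights Y = 0.299*R + 0.587*G + 0.114*B.
std::uint8_t bgrToGray(std::uint8_t b, std::uint8_t g, std::uint8_t r) {
    double y = 0.299 * r + 0.587 * g + 0.114 * b;
    long rounded = std::lround(y);
    if (rounded < 0) rounded = 0;      // clamp to the 8-bit range
    if (rounded > 255) rounded = 255;
    return static_cast<std::uint8_t>(rounded);
}
```

Green contributes the most and blue the least, so a pure green pixel maps to a much brighter gray than a pure blue one.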
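4) Gaussian blur
- Mat image_gau;
- GaussianBlur(image_gray, image_gau, Size(5,5), 0, 0);
Gaussian filtering, also called Gaussian blur, suppresses noise in the image. Without it, small irrelevant details in the original image would survive into the edge map and disturb later steps; after blurring, this faint noise is smoothed away.
GaussianBlur takes five arguments here: the input image, the output image, the Gaussian kernel size, the kernel standard deviation in the X direction, and the standard deviation in the Y direction. The kernel size consists of a width and a height, which may differ but must each be positive and odd, or 0. The two standard deviations are usually set to 0, in which case OpenCV derives them from the kernel size (tuning these values is not explored here).
Gaussian-blurred image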
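As a rough illustration of what the kernel contains, the sketch below builds a normalized 1D Gaussian kernel with the standard library. The helper name gaussianKernel1D is hypothetical. When sigma is not positive it falls back to the size-based formula given in the OpenCV documentation for getGaussianKernel; OpenCV's own values for small kernels may differ slightly, so treat this as an approximation of the idea only.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: build a normalized 1D Gaussian kernel of odd size.
// If sigma <= 0, use the size-based fallback from the OpenCV documentation:
// sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8.
std::vector<double> gaussianKernel1D(int ksize, double sigma) {
    if (sigma <= 0.0) sigma = 0.3 * ((ksize - 1) * 0.5 - 1.0) + 0.8;
    std::vector<double> k(ksize);
    int half = ksize / 2;
    double sum = 0.0;
    for (int i = 0; i < ksize; ++i) {
        double d = i - half;  // signed distance from the kernel center
        k[i] = std::exp(-(d * d) / (2.0 * sigma * sigma));
        sum += k[i];
    }
    for (double& v : k) v /= sum;  // weights sum to 1 so brightness is preserved
    return k;
}
```

A 5x5 blur such as Size(5,5) corresponds to applying a kernel like this along the rows and then along the columns, since the 2D Gaussian is separable.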
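5) Edge detection
- //******************canny*********************
- Mat image_canny;
- Canny(image_gau, image_canny, 100, 200, 3);
The Canny edge detector takes five arguments here: the input image, the output image, threshold 1, threshold 2, and the aperture size of the Sobel operator. The thresholds apply to gradient magnitude: pixels below threshold 1 are rejected as non-edges, pixels above threshold 2 are accepted as edges, and pixels in between are accepted only if they are connected to a pixel above threshold 2. The Sobel aperture size defaults to 3, meaning a 3*3 kernel. The Sobel operator and the Laplacian of Gaussian are both commonly used edge operators.
Edge detection result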
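The two-threshold rule can be illustrated in one dimension. The sketch below is a hypothetical, standard-library-only simplification: it works on a single row of gradient magnitudes rather than on a 2D image, and it omits the gradient computation and non-maximum suppression that the real Canny detector performs first.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 1D illustration of hysteresis thresholding.
// mags holds gradient magnitudes; the result is true where a sample is kept as an edge.
std::vector<bool> hysteresis1D(const std::vector<double>& mags, double low, double high) {
    std::vector<bool> edge(mags.size(), false);
    // Pass 1: strong samples (above the high threshold) are edges unconditionally.
    for (std::size_t i = 0; i < mags.size(); ++i)
        if (mags[i] > high) edge[i] = true;
    // Pass 2 (left to right): a weak sample (above low) joins an edge on its left.
    for (std::size_t i = 1; i < mags.size(); ++i)
        if (!edge[i] && mags[i] > low && edge[i - 1]) edge[i] = true;
    // Pass 3 (right to left): a weak sample joins an edge on its right.
    for (std::size_t i = mags.size(); i-- > 1;)
        if (!edge[i - 1] && mags[i - 1] > low && edge[i]) edge[i - 1] = true;
    return edge;
}
```

With thresholds 100 and 200, the row {50, 150, 250, 150, 50, 150, 50} keeps the three samples around the strong value 250, while the isolated 150 near the end is dropped because it is not connected to any strong sample.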
6) Region of interest
- Mat dstImg;
- Mat mask = Mat::zeros(image_canny.size(), CV_8UC1);
- Point PointArray[4];
- PointArray[0] = Point(0, mask.rows);
- PointArray[1] = Point(400,330);
- PointArray[2] = Point(570,330);
- PointArray[3] = Point(mask.cols, mask.rows);
- fillConvexPoly(mask,PointArray,4,Scalar(255));
- bitwise_and(mask,image_canny,dstImg);
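As the edge image above shows, edge detection keeps a lot of environmental detail that we are not interested in, so we need to extract only the information we need. In the original image the lane lines lie within a trapezoid in the lower part of the frame, so we pick four points by hand as the trapezoid's vertices. fillConvexPoly fills this polygon; it takes four arguments here: a blank image the same size as the original, the vertex array, the number of vertices, and the fill color.
Applying bitwise_and to the trapezoid mask and the edge image keeps only the edges inside the region of interest; in the result only the lane lines remain. bitwise_and computes the per-pixel AND of two images and takes three arguments here: the mask, the edge image, and the output image. The two inputs must have the same size and the same number of channels.
Trapezoid mask
Region of interest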
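The masking idea can be sketched without OpenCV. In the code below, Pt, insideConvexPolygon and maskEdges are hypothetical names; the image is a plain vector of rows, and the polygon test relies on the vertices being listed in order around a convex shape, as the four trapezoid points are. It reproduces the effect of filling a mask and AND-ing it with the edge image, not the way OpenCV implements either call.

```cpp
#include <cstddef>
#include <vector>

struct Pt { int x; int y; };  // hypothetical stand-in for cv::Point

// True if p lies inside (or on the border of) a convex polygon whose
// vertices are listed in order: all edge cross products must share a sign.
bool insideConvexPolygon(const std::vector<Pt>& poly, Pt p) {
    bool hasPos = false, hasNeg = false;
    for (std::size_t i = 0; i < poly.size(); ++i) {
        Pt a = poly[i];
        Pt b = poly[(i + 1) % poly.size()];
        long long cross = 1LL * (b.x - a.x) * (p.y - a.y) - 1LL * (b.y - a.y) * (p.x - a.x);
        if (cross > 0) hasPos = true;
        if (cross < 0) hasNeg = true;
    }
    return !(hasPos && hasNeg);
}

// Keep an edge pixel only where it falls inside the polygon, which has the
// same effect as AND-ing a 0/255 polygon mask with the edge image.
std::vector<std::vector<unsigned char>> maskEdges(
        const std::vector<std::vector<unsigned char>>& edges,
        const std::vector<Pt>& poly) {
    std::vector<std::vector<unsigned char>> out = edges;
    for (std::size_t y = 0; y < edges.size(); ++y)
        for (std::size_t x = 0; x < edges[y].size(); ++x)
            if (!insideConvexPolygon(poly, Pt{static_cast<int>(x), static_cast<int>(y)}))
                out[y][x] = 0;
    return out;
}
```

Assuming a 960x540 frame (an assumption; the article does not state the image size), the trapezoid (0,540), (400,330), (570,330), (960,540) contains the point (480,400) near the lane but not the point (10,10) in the sky.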
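7) Hough transform
The steps above yield the pixels that make up the lane lines, but they are still isolated pixels that have not been joined into lines. The Hough transform finds straight lines from such pixels. OpenCV offers three variants: the standard Hough transform, the multi-scale Hough transform, and the progressive probabilistic Hough transform. The first two are implemented by HoughLines and the last by HoughLinesP. The probabilistic variant is more efficient, so it is generally preferred.
The Hough transform maps lines from Cartesian coordinates into a polar parameter space (rho, theta). The set of all lines through a single point in Cartesian coordinates becomes a sinusoid in that parameter space, and an intersection of several sinusoids means the corresponding points lie on the same line. The Hough transform finds these intersections to decide which pixels are collinear. For a detailed explanation, see:
"经典霍夫变换 (Hough Transform)", blog.csdn.net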
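The voting idea can be sketched in a few lines. The code below is a hypothetical, minimal version of the standard Hough transform with fixed 1-degree and 1-pixel bins; HoughPeak and houghPeak are illustrative names, and the code returns only the single strongest line. It is not how HoughLines or HoughLinesP are implemented.

```cpp
#include <cmath>
#include <utility>
#include <vector>

struct HoughPeak { int thetaDeg; int rho; int votes; };

// Hypothetical minimal standard Hough transform with 1-degree, 1-pixel bins.
// Each point (x, y) votes for every (theta, rho) with
// rho = x*cos(theta) + y*sin(theta); the fullest bin is the dominant line.
HoughPeak houghPeak(const std::vector<std::pair<int, int>>& points, int maxRho) {
    const double kPi = 3.14159265358979323846;
    // rho can be negative, so bin indices are shifted by maxRho.
    std::vector<std::vector<int>> acc(180, std::vector<int>(2 * maxRho + 1, 0));
    HoughPeak best{0, 0, 0};
    for (const auto& p : points) {
        for (int t = 0; t < 180; ++t) {
            double theta = t * kPi / 180.0;
            long rho = std::lround(p.first * std::cos(theta) + p.second * std::sin(theta));
            if (rho < -maxRho || rho > maxRho) continue;
            int votes = ++acc[t][rho + maxRho];
            if (votes > best.votes) best = HoughPeak{t, static_cast<int>(rho), votes};
        }
    }
    return best;
}
```

For 100 points on the horizontal line y = 5, every point votes for theta = 90 degrees and rho = 5, so that bin collects all 100 votes while neighbouring angles spread their votes over several rho values.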
- vector<Vec4i> lines;  // each element holds 4 ints: x1, y1, x2, y2
- int rho = 1;
- double theta = CV_PI/180;
- int threshold = 30;
- int min_line_len = 30;
- int max_line_gap = 20;
- HoughLinesP(dstImg,lines,rho,theta,threshold,min_line_len,max_line_gap);
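HoughLinesP takes seven arguments here: the input image (a single-channel binary image, here the masked Canny output); the output segments, each given by its two endpoints (x1, y1, x2, y2); rho, the distance resolution of the search in pixels; theta, the angular resolution in radians; threshold, the minimum number of votes (intersecting curves) needed to accept a line; min_line_len, the minimum segment length, with shorter segments discarded (default 0); and max_line_gap, the maximum gap allowed between points on the same line for them to be joined into one segment (default 0).
Line representation in polar coordinates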
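8) Lane-line fitting
- // Simple lane-line drawing
- Mat image_draw = Mat::zeros(image_canny.size(), CV_8UC3);
- for (size_t i = 0; i < lines.size(); i++) {
-     Vec4i L = lines[i];
-     line(image_draw, Point(L[0],L[1]), Point(L[2],L[3]), Scalar(0,0,255), 3, LINE_AA);
- }
The simplest approach draws the segments found by the Hough transform directly. This works for solid lines, but dashed lane markings come out broken, as the figure shows:
Simple lane-line fitting
To close the gaps between dashes, the segments from the Hough transform need further processing. One image yields many segments, which can be split by slope into two groups: left lane line and right lane line. Keep the image coordinate system in mind when classifying: the origin is the top-left corner, x increases to the right, and y increases downward, so the right lane line has a positive slope and the left one a negative slope.
Each entry in lines holds two endpoints, from which a slope and an intercept can be computed. Averaging the slopes and intercepts on each side of the image gives one set of parameters per side, and these can be used to draw a complete line.
Updated lane-line fitting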
- //***************draw line update********************************
- Mat image_draw = Mat::zeros(image_canny.size(), CV_8UC3);
- vector<double> slope_right, slope_left, b_right, b_left;
- for (size_t i = 0; i < lines.size(); i++) {
-     Vec4i L = lines[i];
-     if (L[2] == L[0]) continue;  // vertical segment: slope undefined, skip it
-     double slope = (L[3] - L[1]) * 1.0 / (L[2] - L[0]);
-     double b = L[1] - L[0] * slope;
-     if (slope >= 0.2) {           // y points down, so the right lane has positive slope
-         slope_right.push_back(slope);
-         b_right.push_back(b);
-     } else if (slope <= -0.2) {   // left lane; near-horizontal segments are dropped, as in the Python version
-         slope_left.push_back(slope);
-         b_left.push_back(b);
-     }
- }
- // accumulate (from <numeric>) sums a vector; the result type follows the initial value 0.0.
- // The x ranges below are hard-coded for this particular test footage.
- if (!slope_right.empty()) {
-     double slope_right_mean = accumulate(slope_right.begin(), slope_right.end(), 0.0) / slope_right.size();
-     double b_right_mean = accumulate(b_right.begin(), b_right.end(), 0.0) / b_right.size();
-     int x1r = 550, x2r = 850;
-     int y1r = static_cast<int>(slope_right_mean * x1r + b_right_mean);
-     int y2r = static_cast<int>(slope_right_mean * x2r + b_right_mean);
-     line(image_draw, Point(x1r,y1r), Point(x2r,y2r), Scalar(0,0,255), 5, LINE_AA);
- }
- if (!slope_left.empty()) {
-     double slope_left_mean = accumulate(slope_left.begin(), slope_left.end(), 0.0) / slope_left.size();
-     double b_left_mean = accumulate(b_left.begin(), b_left.end(), 0.0) / b_left.size();
-     int x1l = 120, x2l = 425;
-     int y1l = static_cast<int>(slope_left_mean * x1l + b_left_mean);
-     int y2l = static_cast<int>(slope_left_mean * x2l + b_left_mean);
-     line(image_draw, Point(x1l,y1l), Point(x2l,y2l), Scalar(0,0,255), 5, LINE_AA);
- }
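9) Image blending
To overlay the drawn lines on the original image, use addWeighted, which computes a weighted sum of two images. It takes six arguments here: image 1, the weight of image 1, image 2, the weight of image 2, a scalar added to every weighted sum (usually 0), and the output image.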
- //*************mix two image*************************
- Mat image_mix = Mat::zeros(image_canny.size(),CV_8UC3);
- addWeighted(image_draw,1,image,1,0.0,image_mix);
Lane detection result
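The weighted sum behind addWeighted is documented as dst = saturate(src1 * alpha + src2 * beta + gamma). The sketch below applies that formula to flat arrays of 8-bit channel values using only the standard library. The name blend is hypothetical, both inputs are assumed to have the same length, and the rounding may differ from OpenCV's in exact half-way cases.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-channel blend following the documented addWeighted formula:
// dst = saturate(src1*alpha + src2*beta + gamma), clamped to [0, 255].
// Assumes src1 and src2 have the same number of elements.
std::vector<std::uint8_t> blend(const std::vector<std::uint8_t>& src1, double alpha,
                                const std::vector<std::uint8_t>& src2, double beta,
                                double gamma) {
    std::vector<std::uint8_t> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        long v = std::lround(src1[i] * alpha + src2[i] * beta + gamma);
        if (v < 0) v = 0;        // saturate instead of wrapping around
        if (v > 255) v = 255;
        dst[i] = static_cast<std::uint8_t>(v);
    }
    return dst;
}
```

With both weights set to 1, as in this article, black pixels of the line image leave the original frame unchanged, while the red line pixels saturate the red channel to 255.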
With these nine steps, lane-line detection on a single image is complete.
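3. Lane-line detection on video
1) Refactor the nine steps of section 2 into a class, image_process
The image_process class has two member variables (the source image and the result image) and one member function that runs lane detection on the image, plus a constructor and a destructor. The code is as follows: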
- //image_process.h
- #ifndef PROJECT1_IMAGE_PROCESS_H
- #define PROJECT1_IMAGE_PROCESS_H
- #include <iostream>
- #include <opencv2/opencv.hpp>
- using namespace std;
- using namespace cv;
- class image_process {
- public:
-     Mat image_src;
-     Mat image_dst;
-     image_process(Mat image);
-     Mat process();
-     ~image_process();
- };
- #endif  // PROJECT1_IMAGE_PROCESS_H
- //image_process.cpp
- #include "image_process.h"
- #include <iostream>
- #include <numeric>   // std::accumulate
- #include <opencv2/opencv.hpp>
- using namespace std;
- using namespace cv;
- // Constructor
- image_process::image_process(Mat image) : image_src(image) {}
- // Member function: runs the full lane-detection pipeline on image_src
- Mat image_process::process() {
-     //*************reading image******************
-     Mat image = image_src;
-     if (image.empty()) {
-         cout << "reading error" << endl;
-         return image;  // return the empty Mat instead of processing it
-     }
-     //***************gray image*******************
-     Mat image_gray;
-     cvtColor(image, image_gray, COLOR_BGR2GRAY);
-     //************gaussian smoothing**************
-     Mat image_gau;
-     GaussianBlur(image_gray, image_gau, Size(5,5), 0, 0);
-     //******************canny*********************
-     Mat image_canny;
-     Canny(image_gau, image_canny, 100, 200, 3);
-     //**************region of interest*************
-     Mat dstImg;
-     Mat mask = Mat::zeros(image_canny.size(), CV_8UC1);
-     Point PointArray[4];
-     PointArray[0] = Point(0, mask.rows);
-     PointArray[1] = Point(400,330);
-     PointArray[2] = Point(570,330);
-     PointArray[3] = Point(mask.cols, mask.rows);
-     fillConvexPoly(mask, PointArray, 4, Scalar(255));
-     bitwise_and(mask, image_canny, dstImg);
-     //************************houghline*******************
-     vector<Vec4i> lines;
-     int rho = 1;
-     double theta = CV_PI/180;
-     int threshold = 30;
-     int min_line_len = 100;
-     int max_line_gap = 100;
-     HoughLinesP(dstImg, lines, rho, theta, threshold, min_line_len, max_line_gap);
-     //***************draw line update********************************
-     Mat image_draw = Mat::zeros(image_canny.size(), CV_8UC3);
-     vector<double> slope_right, slope_left, b_right, b_left;
-     for (size_t i = 0; i < lines.size(); i++) {
-         Vec4i L = lines[i];
-         if (L[2] == L[0]) continue;  // vertical segment: slope undefined, skip it
-         double slope = (L[3] - L[1]) * 1.0 / (L[2] - L[0]);
-         double b = L[1] - L[0] * slope;
-         if (slope >= 0.2) {           // y points down, so the right lane has positive slope
-             slope_right.push_back(slope);
-             b_right.push_back(b);
-         } else if (slope <= -0.2) {   // left lane; near-horizontal segments are dropped
-             slope_left.push_back(slope);
-             b_left.push_back(b);
-         }
-     }
-     // Draw each side only if at least one segment was found for it
-     if (!slope_right.empty()) {
-         double slope_right_mean = accumulate(slope_right.begin(), slope_right.end(), 0.0) / slope_right.size();
-         double b_right_mean = accumulate(b_right.begin(), b_right.end(), 0.0) / b_right.size();
-         int x1r = 550, x2r = 850;
-         int y1r = static_cast<int>(slope_right_mean * x1r + b_right_mean);
-         int y2r = static_cast<int>(slope_right_mean * x2r + b_right_mean);
-         line(image_draw, Point(x1r,y1r), Point(x2r,y2r), Scalar(0,0,255), 5, LINE_AA);
-     }
-     if (!slope_left.empty()) {
-         double slope_left_mean = accumulate(slope_left.begin(), slope_left.end(), 0.0) / slope_left.size();
-         double b_left_mean = accumulate(b_left.begin(), b_left.end(), 0.0) / b_left.size();
-         int x1l = 120, x2l = 425;
-         int y1l = static_cast<int>(slope_left_mean * x1l + b_left_mean);
-         int y2l = static_cast<int>(slope_left_mean * x2l + b_left_mean);
-         line(image_draw, Point(x1l,y1l), Point(x2l,y2l), Scalar(0,0,255), 5, LINE_AA);
-     }
-     //*************mix two image*************************
-     Mat image_mix = Mat::zeros(image_canny.size(), CV_8UC3);
-     addWeighted(image_draw, 1, image, 1, 0.0, image_mix);
-     //**************output****************************
-     return image_mix;
- }
- // Destructor
- image_process::~image_process() {}
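2) Main function
The VideoCapture class reads the video directly, and each frame is passed into frame. The frame is handed to the single-image class through its constructor, the member function processes it, and the result is displayed.
waitKey() with no argument (or 0) waits indefinitely for a key press. With a positive argument it waits at most that many milliseconds; for example, waitKey(30) waits up to 30 ms and then the loop continues with the next frame.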
- #include <iostream>
- #include <opencv2/opencv.hpp>
- #include "image_process.h"
- using namespace std;
- using namespace cv;
- int main() {
-     VideoCapture capture("/home/solidYellowLeft.mp4");
-     if (!capture.isOpened()) {
-         cout << "can not open video" << endl;
-         return -1;
-     }
-     Mat frame;
-     while (capture.isOpened()) {
-         capture >> frame;
-         if (frame.empty()) break;  // end of video: stop instead of processing an empty frame
-         image_process image2(frame);
-         Mat image_result2 = image2.process();
-         imshow("result_video", image_result2);
-         waitKey(30);
-     }
-     return 0;
- }
Simple lane-line detection (video)
https://www.zhihu.com/video/1060875819747962880
Notes:
1. This project comes from the Udacity Self-Driving Car Engineer course. If it infringes any rights, please contact the author for removal.
Self-Driving Car Engineer course, Udacity (优达学城) official site
2. References: "opencv 之 Canny()函数", duwangthefirst's blog, CSDN
"OpenCV 图像 Canny 边缘检测", 一样菜, cnblogs (博客园) https://www.cnblogs.com/mypsq/p/4983566.html
3. Python implementation:
- import math
- import cv2
- import numpy as np
- def grayscale(img):
-     """Applies the Grayscale transform
-     This will return an image with only one color channel
-     but NOTE: to see the returned image as grayscale
-     (assuming your grayscaled image is called 'gray')
-     you should call plt.imshow(gray, cmap='gray')"""
-     return cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
-     # Or use BGR2GRAY if you read an image with cv2.imread()
-     # return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
- def canny(img, low_threshold, high_threshold):
-     """Applies the Canny transform"""
-     return cv2.Canny(img, low_threshold, high_threshold)
- def gaussian_blur(img, kernel_size):
-     """Applies a Gaussian Noise kernel"""
-     return cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)
- def region_of_interest(img, vertices):
-     """
-     Applies an image mask.
-     Only keeps the region of the image defined by the polygon
-     formed from `vertices`. The rest of the image is set to black.
-     """
-     # defining a blank mask to start with
-     mask = np.zeros_like(img)
-     # defining a 3 channel or 1 channel color to fill the mask with depending on the input image
-     if len(img.shape) > 2:
-         channel_count = img.shape[2]  # i.e. 3 or 4 depending on your image
-         ignore_mask_color = (255,) * channel_count
-     else:
-         ignore_mask_color = 255
-     # filling pixels inside the polygon defined by "vertices" with the fill color
-     cv2.fillPoly(mask, vertices, ignore_mask_color)
-     # returning the image only where mask pixels are nonzero
-     masked_image = cv2.bitwise_and(img, mask)
-     return masked_image
- def draw_lines(img, lines, color=[255, 0, 0], thickness=8):
-     """
-     NOTE: this is the function you might want to use as a starting point once you want to
-     average/extrapolate the line segments you detect to map out the full
-     extent of the lane (going from the result shown in raw-lines-example.mp4
-     to that shown in P1_example.mp4).
-     Think about things like separating line segments by their
-     slope ((y2-y1)/(x2-x1)) to decide which segments are part of the left
-     line vs. the right line. Then, you can average the position of each of
-     the lines and extrapolate to the top and bottom of the lane.
-     This function draws `lines` with `color` and `thickness`.
-     Lines are drawn on the image inplace (mutates the image).
-     If you want to make the lines semi-transparent, think about combining
-     this function with the weighted_img() function below
-     """
-     # Simple version: draw every detected segment as-is
-     # for line in lines:
-     #     for x1,y1,x2,y2 in line:
-     #         cv2.line(img, (x1, y1), (x2, y2), color, thickness)
-     right_x, right_y, left_x, left_y = [], [], [], []
-     for line in lines:
-         for x1, y1, x2, y2 in line:
-             if x2 == x1:
-                 continue  # vertical segment: slope undefined
-             slope = (y2 - y1) / (x2 - x1)
-             if slope >= 0.2:
-                 right_x.extend((x1, x2))
-                 right_y.extend((y1, y2))
-             elif slope <= -0.2:
-                 left_x.extend((x1, x2))
-                 left_y.extend((y1, y2))
-     # Fit y = m*x + b to each side; the x ranges are hard-coded for this footage
-     if right_x:
-         right_line = np.poly1d(np.polyfit(right_x, right_y, 1))
-         x1R, x2R = 550, 850
-         cv2.line(img, (x1R, int(right_line(x1R))), (x2R, int(right_line(x2R))), color, thickness)
-     if left_x:
-         left_line = np.poly1d(np.polyfit(left_x, left_y, 1))
-         x1L, x2L = 120, 425
-         cv2.line(img, (x1L, int(left_line(x1L))), (x2L, int(left_line(x2L))), color, thickness)
- def hough_lines(img, rho, theta, threshold, min_line_len, max_line_gap):
-     """
-     `img` should be the output of a Canny transform.
-     Returns an image with hough lines drawn.
-     """
-     lines = cv2.HoughLinesP(img, rho, theta, threshold, np.array([]), minLineLength=min_line_len, maxLineGap=max_line_gap)
-     line_img = np.zeros((img.shape[0], img.shape[1], 3), dtype=np.uint8)
-     if lines is not None:  # HoughLinesP returns None when no segment is found
-         draw_lines(line_img, lines)
-     return line_img
- # Python 3 has support for cool math symbols.
- def weighted_img(img, initial_img, α=0.8, β=1., λ=0.):
-     """
-     `img` is the output of the hough_lines(), An image with lines drawn on it.
-     Should be a blank image (all black) with lines drawn on it.
-     `initial_img` should be the image before any processing.
-     The result image is computed as follows:
-     initial_img * α + img * β + λ
-     NOTE: initial_img and img must be the same shape!
-     """
-     return cv2.addWeighted(initial_img, α, img, β, λ)
- def pipeline(input_image):
-     image = input_image
-     gray = grayscale(image)
-     # Gaussian smoothing
-     kernel_size = 5
-     gau = gaussian_blur(gray, kernel_size)
-     # Canny
-     low_threshold = 100
-     high_threshold = 200
-     edges = canny(gau, low_threshold, high_threshold)
-     # Region of interest: the same trapezoid as in the C++ version
-     imshape = image.shape
-     vertices = np.array([[(0, imshape[0]), (400, 330), (570, 330), (imshape[1], imshape[0])]], dtype=np.int32)
-     region = region_of_interest(edges, vertices)
-     rho = 1  # distance resolution in pixels of the Hough grid
-     theta = np.pi/180  # angular resolution in radians of the Hough grid
-     threshold = 30  # minimum number of votes (intersections in Hough grid cell)
-     min_line_len = 20  # minimum number of pixels making up a line
-     max_line_gap = 20  # maximum gap in pixels between connectable line segments
-     line_img = hough_lines(region, rho, theta, threshold, min_line_len, max_line_gap)
-     line_last = weighted_img(line_img, image, α=0.8, β=1., λ=0.)
-     return line_last
- # Written against the moviepy 1.x API (moviepy.editor, fl_image)
- from moviepy.editor import VideoFileClip
- from IPython.display import HTML
- def process_image(image):
-     # NOTE: The output you return should be a color image (3 channel) for processing video below
-     # you should return the final output (image where lines are drawn on lanes)
-     result = pipeline(image)
-     return result
- white_output = 'test_videos_output/solidWhiteRight.mp4'
- ## To speed up the testing process you may want to try your pipeline on a shorter subclip of the video
- ## To do so add .subclip(start_second,end_second) to the end of the line below
- ## Where start_second and end_second are integer values representing the start and end of the subclip
- ## You may also uncomment the following line for a subclip of the first 5 seconds
- #clip1 = VideoFileClip("test_videos/solidWhiteRight.mp4").subclip(0,5)
- clip1 = VideoFileClip("test_videos/solidWhiteRight.mp4")
- white_clip = clip1.fl_image(process_image)  # NOTE: this function expects color images!!
- # In a Jupyter notebook, prefix the next line with %time to measure how long it takes
- white_clip.write_videofile(white_output, audio=False)
Source: http://www.tuicool.com/articles/go/IVZBZzI