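Face detection and recognition have long been mainstream topics in image algorithms.
The year before last, SeetaFace open-sourced its face recognition engine and was a hot topic for a while.
SeetaFace later released a 2.0 version, but, and this is a big but...
it ships without training code, so training a model of your own is a real problem.
You could read the source and reconstruct the training code by working backward from the forward pass,
but who has the spare time and energy to tinker with that?
Sometimes you have to stand on the shoulders of giants to see further.
SeetaFace is no giant, though; it was just riding the hype of the moment.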
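The year before last, for a face-related project, I went through all kinds of projects on the web.
There are plenty of them, each with its own strengths and weaknesses.
I won't pass judgment here; a lot of it only becomes clear through hands-on practice.
For a while I used SeetaFace's face detection for a few small demos,
and spent a little time optimizing its algorithm.
Still, I clearly only treated it as a toy.
After all, not being able to train your own model is a major drawback.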
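Then deep learning took off, and the work that impressed me most was MTCNN.
Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks
Related material: https://github.com/kpzhang93/MTCNN_face_detection_alignment
In a large group photo the faces are boxed very accurately, which is quite a sight; that was my first impression.
Here is a picture so you can see for yourself.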
MTCNN is made up of three networks.
Stage1: Proposal Net
Stage2: Refine Net
Stage3: Output Net
I won't go into the details of the algorithm here.
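As a small taste of what the first stage involves: Proposal Net is usually run over an image pyramid, so that faces of different sizes all end up close to its 12x12 input window. The sketch below computes the pyramid scales under common defaults. The function name `pyramidScales`, the minimum face size of 40 and the factor 0.709 are assumptions chosen for illustration, not values taken from this project.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the project): compute the scales at which
// the first-stage network would be run.
// minFace: smallest face size in pixels we want to detect (assumed default).
// factor:  shrink ratio between two pyramid levels (assumed default).
std::vector<float> pyramidScales(int width, int height,
                                 int minFace = 40, float factor = 0.709f) {
    assert(width > 0 && height > 0 && minFace > 0);
    assert(factor > 0.0f && factor < 1.0f);
    const float kWindow = 12.0f;               // first-stage input window size
    std::vector<float> scales;
    float scale = kWindow / minFace;           // maps a minFace-sized face to 12 px
    float minSide = std::min(width, height) * scale;
    // Keep shrinking until the shorter side no longer fits one window.
    while (minSide >= kWindow) {
        scales.push_back(scale);
        scale *= factor;
        minSide *= factor;
    }
    return scales;
}
```

Calling `pyramidScales(640, 480)` yields a short list of decreasing scales starting at 0.3; an image whose shorter side is below the minimum face size yields an empty list, meaning no level is worth running.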
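What interests me about MTCNN is that
its approach can be extended to all kinds of object detection and recognition tasks.
Perhaps the only thing missing is labeled data,
and annotating five points is enough to fit most objects.
It matches the "small and beautiful" philosophy that I favor.
So MTCNN is an algorithm well worth savoring.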
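To make the "five points" idea concrete, here is a minimal sketch of what one annotation record could look like, and how its key points can be expressed relative to the box, which is a common way to turn them into regression targets. The `Annotation` struct and `landmarkTargets` function are hypothetical names for illustration; the output layout (five x values followed by five y values) mirrors how `ppoint` is indexed in the example code further down.

```cpp
#include <array>
#include <cassert>

// Hypothetical annotation record: one bounding box plus five key points.
struct Annotation {
    float x1, y1, x2, y2;        // box corners in pixels
    std::array<float, 5> px;     // key-point x coordinates in pixels
    std::array<float, 5> py;     // key-point y coordinates in pixels
};

// Express the key points relative to the box: 0 is the left/top edge,
// 1 is the right/bottom edge. Output layout: x0..x4 followed by y0..y4.
std::array<float, 10> landmarkTargets(const Annotation &a) {
    const float w = a.x2 - a.x1;
    const float h = a.y2 - a.y1;
    assert(w > 0.0f && h > 0.0f);
    std::array<float, 10> t{};
    for (int i = 0; i < 5; ++i) {
        t[i] = (a.px[i] - a.x1) / w;
        t[i + 5] = (a.py[i] - a.y1) / h;
    }
    return t;
}
```

Nothing in this layout is specific to faces: any object that five points can describe reasonably well could be annotated the same way.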
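GitHub also has quite a few MTCNN implementations and resources,
based on mxnet, caffe, ncnn and so on...
Clearly, mxnet and caffe don't fit the "small and beautiful" philosophy,
so I dropped them without hesitation.
ncnn is a bit bloated and not to my taste either.
So I got ruthless..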
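Removed the ncnn layers that MTCNN doesn't need.
Tidied up some of ncnn's logic.
Made a few simple adaptations and optimizations.
Trimmed away some odds and ends.
No dependency on opencv or other third-party libraries.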
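With the example code written there is still plenty of work left to do,
but this first step already meets my modest expectations.
Complete example code: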
#if defined(_MSC_VER)
#define _CRT_SECURE_NO_WARNINGS
#endif
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <vector>
#include "mtcnn.h"
#include "browse.h"
#define USE_SHELL_OPEN
#if defined(_MSC_VER)
#include <windows.h>
#else
#include <unistd.h>
#endif
#define STB_IMAGE_STATIC
#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"
// ref: https://github.com/nothings/stb/blob/master/stb_image.h
#define TJE_IMPLEMENTATION
#include "tiny_jpeg.h"
// ref: https://github.com/serge-rgb/TinyJPEG/blob/master/tiny_jpeg.h
#include "timing.h"

// Fallback buffer sizes for platforms whose <stdlib.h> does not define them.
#ifndef _MAX_DRIVE
#define _MAX_DRIVE 3
#define _MAX_DIR 256
#define _MAX_FNAME 256
#define _MAX_EXT 256
#endif

// Output path prefix; filled in by getCurrentFilePath() before saveImage() is called.
char saveFile[1024];

unsigned char *loadImage(const char *filename, int *Width, int *Height, int *Channels) {
    return stbi_load(filename, Width, Height, Channels, 0);
}

void saveImage(const char *filename, int Width, int Height, int Channels, unsigned char *Output) {
    // Append the suffix to the prefix in saveFile without overrunning the buffer.
    strncat(saveFile, filename, sizeof(saveFile) - strlen(saveFile) - 1);
    // Save as JPEG
    if (!tje_encode_to_file(saveFile, Width, Height, Channels, true, Output)) {
        fprintf(stderr, "save JPEG fail.\n");
        return;
    }
#ifdef USE_SHELL_OPEN
    browse(saveFile);
#endif
}

// Split a path into drive, directory, file name and extension.
// Any of the output pointers may be NULL.
void splitpath(const char *path, char *drv, char *dir, char *name, char *ext) {
    const char *end;
    const char *p;
    const char *s;
    if (path[0] && path[1] == ':') {
        if (drv) {
            *drv++ = *path++;
            *drv++ = *path++;
            *drv = '\0';
        }
    }
    else if (drv)
        *drv = '\0';
    for (end = path; *end && *end != ':';)
        end++;
    for (p = end; p > path && *--p != '\\' && *p != '/';)
        if (*p == '.') {
            end = p;
            break;
        }
    if (ext)
        for (s = end; (*ext = *s++);)
            ext++;
    for (p = end; p > path;)
        if (*--p == '\\' || *p == '/') {
            p++;
            break;
        }
    if (name) {
        for (s = p; s < end;)
            *name++ = *s++;
        *name = '\0';
    }
    if (dir) {
        for (s = path; s < p;)
            *dir++ = *s++;
        *dir = '\0';
    }
}

// Copy filePath into saveFile with the extension replaced by '_'.
void getCurrentFilePath(const char *filePath, char *saveFile) {
    char drive[_MAX_DRIVE];
    char dir[_MAX_DIR];
    char fname[_MAX_FNAME];
    char ext[_MAX_EXT];
    splitpath(filePath, drive, dir, fname, ext);
    size_t n = strlen(filePath);
    memcpy(saveFile, filePath, n);
    char *cur_saveFile = saveFile + (n - strlen(ext));
    cur_saveFile[0] = '_';
    cur_saveFile[1] = 0;
}

// Write one pixel; at most the first three channels are set.
void drawPoint(unsigned char *bits, int width, int depth, int x, int y, const uint8_t *color) {
    const int channels = depth < 3 ? depth : 3;
    for (int i = 0; i < channels; ++i) {
        bits[(y * width + x) * depth + i] = color[i];
    }
}

void drawLine(unsigned char *bits, int width, int depth, int startX, int startY, int endX, int endY,
              const uint8_t *col) {
    if (endX == startX) {
        // Vertical line
        if (startY > endY) {
            int a = startY;
            startY = endY;
            endY = a;
        }
        for (int y = startY; y <= endY; y++) {
            drawPoint(bits, width, depth, startX, y, col);
        }
    }
    else {
        // The slope is computed before the swap; it is the same in either direction.
        float m = 1.0f * (endY - startY) / (endX - startX);
        int y = 0;
        if (startX > endX) {
            int a = startX;
            startX = endX;
            endX = a;
        }
        for (int x = startX; x <= endX; x++) {
            y = (int)(m * (x - startX) + startY);
            drawPoint(bits, width, depth, x, y, col);
        }
    }
}

void drawRectangle(unsigned char *bits, int width, int depth, int x1, int y1, int x2, int y2, const uint8_t *col) {
    drawLine(bits, width, depth, x1, y1, x2, y1, col);
    drawLine(bits, width, depth, x2, y1, x2, y2, col);
    drawLine(bits, width, depth, x2, y2, x1, y2, col);
    drawLine(bits, width, depth, x1, y2, x1, y1, col);
}

int main(int argc, char **argv) {
    printf("mtcnn face detection\n");
    printf("blog:http://cpuimage.cnblogs.com/\n");
    // Both the model path and the image file are required.
    if (argc < 3) {
        printf("usage: %s model_path image_file \n", argv[0]);
        printf("eg: %s ../models ../sample.jpg \n", argv[0]);
        printf("press any key to exit. \n");
        getchar();
        return 0;
    }
    const char *model_path = argv[1];
    char *szfile = argv[2];
    getCurrentFilePath(szfile, saveFile);
    int Width = 0;
    int Height = 0;
    int Channels = 0;
    unsigned char *inputImage = loadImage(szfile, &Width, &Height, &Channels);
    if (inputImage == NULL || Channels != 3) {
        fprintf(stderr, "load image fail (a 3-channel image is required).\n");
        free(inputImage);
        return -1;
    }
    ncnn::Mat ncnn_img = ncnn::Mat::from_pixels(inputImage, ncnn::Mat::PIXEL_RGB, Width, Height);
    std::vector<Bbox> finalBbox;
    MTCNN mtcnn(model_path);
    double startTime = now();
    mtcnn.detect(ncnn_img, finalBbox);
    double nDetectTime = calcElapsed(startTime, now());
    printf("time: %d ms.\n", (int)(nDetectTime * 1000));
    int num_box = (int)finalBbox.size();
    printf("face num: %d \n", num_box);
    for (int i = 0; i < num_box; i++) {
        // Face box in red
        const uint8_t red[3] = { 255, 0, 0 };
        drawRectangle(inputImage, Width, Channels, finalBbox[i].x1, finalBbox[i].y1,
                      finalBbox[i].x2,
                      finalBbox[i].y2, red);
        // Five key points in blue: ppoint[0..4] are x, ppoint[5..9] are y
        const uint8_t blue[3] = { 0, 0, 255 };
        for (int num = 0; num < 5; num++) {
            drawPoint(inputImage, Width, Channels, (int)(finalBbox[i].ppoint[num] + 0.5f),
                      (int)(finalBbox[i].ppoint[num + 5] + 0.5f), blue);
        }
    }
    saveImage("_done.jpg", Width, Height, Channels, inputImage);
    free(inputImage);
    printf("press any key to exit. \n");
    getchar();
    return 0;
}
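One thing the drawing helpers above do not do is check coordinates against the image bounds, so a box or key point that falls outside the image would write outside the buffer. As a rough sketch of one way to guard against that, here is a bounds-checked variant of `drawPoint`. The name `drawPointSafe` and the extra `height` parameter are my own additions for illustration and are not part of the project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bounds-checked variant of drawPoint.
// Returns true if the pixel was written, false if (x, y) lies outside the image.
bool drawPointSafe(unsigned char *bits, int width, int height, int depth,
                   int x, int y, const uint8_t *color) {
    assert(bits != nullptr && color != nullptr);
    if (x < 0 || y < 0 || x >= width || y >= height) {
        return false;  // silently skip out-of-range pixels
    }
    const int channels = depth < 3 ? depth : 3;  // write at most three channels
    unsigned char *px = bits + (static_cast<size_t>(y) * width + x) * depth;
    for (int i = 0; i < channels; ++i) {
        px[i] = color[i];
    }
    return true;
}
```

Routing `drawLine` through a helper like this would mean a box that extends past the image edge is simply clipped rather than corrupting memory.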
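Here is a sample result.
Project page:
https://github.com/cpuimage/MTCNN
The arguments are simple:
mtcnn model_path image_path
For example: mtcnn ../models ../sample.jpg
The example code builds with cmake; see CMakeLists.txt for details.
If you have other related questions or needs, feel free to email me to discuss.
My email address is:
gaozhihan@vip.qq.com
Source: https://www.cnblogs.com/cpuimage/p/8995600.html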