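Captcha Recognition
1, Preface
For work reasons I do automated testing, and captchas inevitably come up along the way. Pausing mid-run to type one in by hand is too tedious, so here I have summarized the material I found, combined with my own hands-on experience, to share with everyone.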
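2, Problem addressed
This time I mainly tackle the fairly traditional image captcha, like the one in the figure below:
The outrageous kinds, such as slider verification and clicking images in a given order, are not covered here.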
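3, Method
I have separate Java and Python implementations; the idea behind both is largely the same:
1 Binarize the image
2 Remove noise
3 Recognize
The code below walks through these steps. The code has been uploaded to GitHub and is linked at the end of the post.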
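As a rough illustration of the noise-removal step on a toy image, here is a minimal sketch of one simple technique: clearing any ink pixel that has no ink neighbours. This is an assumption for illustration, not necessarily what the implementations below do; the function name `remove_isolated_pixels`, the `min_neighbors` parameter, and the 1 = ink / 0 = background convention are all hypothetical.

```python
def remove_isolated_pixels(grid, min_neighbors=1):
    """Return a copy of a binary grid (1 = ink, 0 = background) in which
    ink pixels with fewer than min_neighbors ink neighbours are cleared."""
    height = len(grid)
    width = len(grid[0]) if height else 0
    cleaned = [row[:] for row in grid]
    for y in range(height):
        for x in range(width):
            if grid[y][x] != 1:
                continue
            # Count ink pixels among the (up to) 8 surrounding cells.
            neighbors = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbors += grid[ny][nx]
            if neighbors < min_neighbors:
                cleaned[y][x] = 0
    return cleaned


# Toy image: a 2x2 "stroke" in the middle plus two isolated specks.
noisy = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 0],
]
cleaned = remove_isolated_pixels(noisy)
```

In this toy grid the two corner specks are cleared while the 2x2 block survives, because each of its pixels has three ink neighbours.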
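4, Java implementation
First, the project layout:
Entrance is the program entry point, DT holds some configuration, and PictureOcr holds the methods used for recognition.
1 Remove noise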
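    import java.awt.Color;
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;
    import javax.imageio.ImageIO;

    /**
     * Remove noise from the image by binarizing it in place.
     * picType is a field that is not shown in this excerpt.
     * @param picPath path of the image file; the file is overwritten
     * @throws IOException if the image cannot be read or written
     */
    public static void removeBackground(String picPath) throws IOException {
        BufferedImage bufferedImage = ImageIO.read(new File(picPath));
        int width = bufferedImage.getWidth();
        int height = bufferedImage.getHeight();
        for (int x = 0; x < width; ++x) {
            for (int y = 0; y < height; ++y) {
                // Bright pixels become white (background), everything else black.
                if (isWrite(bufferedImage.getRGB(x, y)) == 1) {
                    bufferedImage.setRGB(x, y, Color.white.getRGB());
                } else {
                    bufferedImage.setRGB(x, y, Color.black.getRGB());
                }
            }
        }
        ImageIO.write(bufferedImage, picType, new File(picPath));
    }

    /**
     * If the sum of a pixel's three color components exceeds the configured
     * threshold, the pixel is treated as white, i.e. as background.
     * @param colorInt packed RGB value of the pixel
     * @return 1 if the pixel counts as white, otherwise 0
     */
    public static int isWrite(int colorInt) {
        Color color = new Color(colorInt);
        if (color.getRed() + color.getGreen() + color.getBlue() > DT.DictOfOcr.threshold) {
            return 1;
        }
        return 0;
    }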
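First get the image dimensions (width × height), then set a threshold. The threshold is compared against the sum of a pixel's R, G, and B values. You can use a screenshot tool to work out a suitable threshold for the captcha you want to recognize: taking WeChat as an example, the RGB value of the captcha region to be recognized can be used as the threshold. Pixels whose sum exceeds the threshold are set to white and all others to black, which effectively removes the noise.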
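The rule described above can be sketched in a few lines of plain Python on a tiny grid of `(r, g, b)` tuples. This is only an illustration: the threshold of 400 and the helper names `is_white` and `binarize` are assumptions, since the value of `DT.DictOfOcr.threshold` is not given here.

```python
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def is_white(rgb, threshold):
    """True when the sum of the R, G and B components exceeds the threshold."""
    r, g, b = rgb
    return r + g + b > threshold


def binarize(pixels, threshold):
    """Map every pixel of a 2D grid of (r, g, b) tuples to pure white or black."""
    return [
        [WHITE if is_white(p, threshold) else BLACK for p in row]
        for row in pixels
    ]


# Hypothetical 2x2 image: light background on the left, darker ink on the right.
sample = [
    [(250, 250, 250), (30, 40, 50)],
    [(200, 180, 190), (90, 100, 110)],
]
result = binarize(sample, threshold=400)
# result == [[WHITE, BLACK], [WHITE, BLACK]]
```

The component sums here are 750, 120, 570 and 300, so only the two left-hand pixels end up white.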
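2 Crop the border
Cropping the border lets the characters take up as much of the image as possible, which preserves their features and improves the recognition rate.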
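    /**
     * Crop the margins of the image in place.
     * cutWidth, cutHeight and picType are fields not shown in this excerpt.
     * @param picPath path of the image file; the file is overwritten
     * @throws IOException if the image cannot be read or written
     */
    public static void cutPic(String picPath) throws IOException {
        BufferedImage bufferedimage = ImageIO.read(new File(picPath));
        int width = bufferedimage.getWidth();
        int height = bufferedimage.getHeight();
        // First trim cutWidth / 2 from the left and right edges ...
        bufferedimage = cropPic(bufferedimage, (cutWidth / 2), 0, (width - cutWidth / 2), height);
        // ... then trim cutHeight / 2 from the top and bottom of the narrower image.
        bufferedimage = cropPic(bufferedimage, 0, (cutHeight / 2), (width - cutWidth), (height - cutHeight / 2));
        ImageIO.write(bufferedimage, picType, new File(picPath));
    }

    /**
     * Crop the image to the given rectangle. The end coordinates are exclusive;
     * passing -1 for a coordinate selects the corresponding image edge.
     * @param bufferedImage source image
     * @param startX left edge (inclusive)
     * @param startY top edge (inclusive)
     * @param endX right edge (exclusive)
     * @param endY bottom edge (exclusive)
     * @return the cropped image
     */
    public static BufferedImage cropPic(BufferedImage bufferedImage, int startX, int startY, int endX, int endY) {
        int width = bufferedImage.getWidth();
        int height = bufferedImage.getHeight();
        if (startX == -1) {
            startX = 0;
        }
        if (startY == -1) {
            startY = 0;
        }
        // The loops below stop before endX / endY, so default to the full size
        // (the original used width - 1 and height - 1, dropping the last row and column).
        if (endX == -1) {
            endX = width;
        }
        if (endY == -1) {
            endY = height;
        }
        // TYPE_INT_BGR is the named constant for the literal 4 used originally.
        BufferedImage result = new BufferedImage(endX - startX, endY - startY, BufferedImage.TYPE_INT_BGR);
        for (int x = startX; x < endX; ++x) {
            for (int y = startY; y < endY; ++y) {
                int rgb = bufferedImage.getRGB(x, y);
                result.setRGB(x - startX, y - startY, rgb);
            }
        }
        return result;
    }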
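To make the cropping arithmetic easier to follow, here is a minimal Python sketch of the same idea on a plain 2D list: half of the configured margin is removed from each side. The function name `crop_margins` is hypothetical, and the sketch assumes even values for `cut_width` and `cut_height`.

```python
def crop_margins(grid, cut_width, cut_height):
    """Drop cut_width // 2 columns from the left and right edges and
    cut_height // 2 rows from the top and bottom of a 2D grid."""
    height = len(grid)
    width = len(grid[0])
    left, right = cut_width // 2, width - cut_width // 2
    top, bottom = cut_height // 2, height - cut_height // 2
    return [row[left:right] for row in grid[top:bottom]]


# 4 rows x 6 columns; each cell encodes its position as row * 10 + column.
grid = [[r * 10 + c for c in range(6)] for r in range(4)]
cropped = crop_margins(grid, cut_width=2, cut_height=2)
# cropped == [[11, 12, 13, 14], [21, 22, 23, 24]]
```

With a margin of 2 in each direction, one row or column is removed on every side, so the 6x4 grid shrinks to 4x2.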
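3 Recognition
    import java.io.File;
    import net.sourceforge.tess4j.ITesseract;
    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.TesseractException;

    /**
     * Run OCR on the image.
     * tessdataPath is a field not shown in this excerpt.
     * @param picPath path of the preprocessed image
     * @return the recognized text
     * @throws TesseractException if recognition fails
     */
    public static String executeOcr(String picPath) throws TesseractException {
        ITesseract iTesseract = new Tesseract();
        iTesseract.setDatapath(tessdataPath);
        //iTesseract.setLanguage("eng");
        // Load other trained data sets here as needed
        String ocrResult = iTesseract.doOCR(new File(picPath));
        return ocrResult;
    }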
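The raw OCR output may contain whitespace or stray punctuation, so a small clean-up step before using it in a test can help. The following is a sketch under stated assumptions: that the captcha consists only of ASCII letters and digits, and that its length is known in advance. The helper `clean_ocr_result` and its `expected_length` parameter are hypothetical.

```python
import re


def clean_ocr_result(raw, expected_length=None):
    """Strip everything except ASCII letters and digits from OCR output.

    Returns None when expected_length is given and the cleaned text has a
    different length, so the caller can retry with a fresh captcha.
    """
    text = re.sub(r"[^0-9A-Za-z]", "", raw)
    if expected_length is not None and len(text) != expected_length:
        return None
    return text


cleaned = clean_ocr_result(" a B3 d\n", expected_length=4)   # "aB3d"
rejected = clean_ocr_result("ab\n", expected_length=4)       # None
```

Returning `None` instead of a partial result makes it easy for an automated test to request a new captcha rather than submit a guess that is known to be incomplete.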
5, Python implementation
    def capt_process(capt):
        """
        Preprocess the image: convert the captcha to a binary image and split it by character.
        :param capt: image
        :return: a list of four elements, each a binary image containing a single character
        """
        # Convert to grayscale
        capt_gray = capt.convert("L")
        # Determine the threshold for this image
        threshold = get_threshold(capt_gray)
        # Binarize the image
        table = get_bin_table(threshold)
        capt_bw = capt_gray.point(table, "1")
        capt_per_char_list = []
        # Each character is 13 x 24 pixels; characters start at x = 5 and repeat every 15 pixels
        for i in range(4):
            x = 5 + i * 15
            y = 2
            capt_per_char = capt_bw.crop((x, y, x + 13, y + 24))
            capt_per_char_list.append(capt_per_char)
        return capt_per_char_list
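The crop boxes used for the four characters follow directly from the constants in the loop. A small sketch that only computes the `(left, upper, right, lower)` tuples, without touching an image, makes the layout explicit. The function name `char_boxes` and its parameter names are hypothetical; the default values are the ones that appear in the code.

```python
def char_boxes(count=4, left=5, top=2, step=15, char_width=13, char_height=24):
    """Return one (left, upper, right, lower) crop box per character."""
    boxes = []
    for i in range(count):
        x = left + i * step
        boxes.append((x, top, x + char_width, top + char_height))
    return boxes


boxes = char_boxes()
# boxes == [(5, 2, 18, 26), (20, 2, 33, 26), (35, 2, 48, 26), (50, 2, 63, 26)]
```

Because the step (15) is larger than the character width (13), a 2-pixel gap is left between neighbouring boxes. These numbers fit one specific captcha layout and would need adjusting for images of a different size.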
    from collections import defaultdict

    def get_threshold(capt):
        """
        Find the pixel value that occurs most often in the image and use it as the threshold.
        :param capt: grayscale image
        :return: the most frequent pixel value
        """
        pixel_dict = defaultdict(int)
        # Image.size is (width, height)
        width, height = capt.size
        for i in range(width):
            for j in range(height):
                # Value of this pixel (a single gray level, since the image is grayscale)
                pixel = capt.getpixel((i, j))
                # Pixel value is the key, number of occurrences is the value
                pixel_dict[pixel] += 1
        # Pick the pixel value with the highest count
        threshold = max(pixel_dict, key=pixel_dict.get)
        return threshold
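Picking the most frequent gray level works because the background usually covers far more pixels than the characters do. The same counting can be sketched on a plain list of gray values with `collections.Counter`; the helper name `most_common_value` and the sample data are made up for illustration.

```python
from collections import Counter


def most_common_value(values):
    """Return the value that occurs most often in a sequence of gray levels."""
    value, _count = Counter(values).most_common(1)[0]
    return value


# Hypothetical gray levels: mostly light background (255) with a few dark ink pixels.
gray = [255, 255, 30, 255, 28, 255, 30, 255]
background = most_common_value(gray)
# background == 255
```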
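    def get_bin_table(threshold):
        """
        Build the lookup table used to binarize the image around the threshold.
        :param threshold: most frequent gray level, i.e. the background
        :return: a list of 256 entries, 1 for gray levels within 10% of the threshold, otherwise 0
        """
        table = []
        rate = 0.1
        for i in range(256):
            # Gray levels close to the background value map to 1, everything else to 0
            if threshold * (1 - rate) <= i <= threshold * (1 + rate):
                table.append(1)
            else:
                table.append(0)
        return table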
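To see what the lookup table does, the sketch below builds one for a hypothetical background level of 200 and applies it to a short row of gray values by plain indexing. The function name `build_bin_table` and the sample values are assumptions for illustration only.

```python
def build_bin_table(threshold, rate=0.1):
    """256-entry table: 1 for gray levels within +/- rate of threshold, else 0."""
    low = threshold * (1 - rate)
    high = threshold * (1 + rate)
    return [1 if low <= i <= high else 0 for i in range(256)]


table = build_bin_table(200)          # levels from roughly 180 to 220 count as background
gray_row = [200, 185, 179, 221, 30, 215]
binary_row = [table[p] for p in gray_row]
# binary_row == [1, 1, 0, 0, 0, 1]
```

Values near the background level map to 1 and everything else, including the dark ink pixel at 30, maps to 0.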
Source: http://www.bubuko.com/infodetail-3265942.html