Extracting Rectangles from an Image with OpenCV
```python
import cv2
import numpy as np
from google.colab.patches import cv2_imshow

def cv_get_block(location):
    img = cv2.imread(location)
    # Scale to a fixed width of 700 px, keeping the aspect ratio
    img = cv2.resize(img, (700, int(700 / img.shape[1] * img.shape[0])), interpolation=cv2.INTER_AREA)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    binary = cv2.Canny(gray, 50, 100)
    contours, hier = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    croppedImage = []
    positionAndText = []
    for c in contours:
        rect = cv2.minAreaRect(c)
        box = cv2.boxPoints(rect)
        box = np.int0(box)
        h = int(abs(box[3, 1] - box[1, 1]))
        w = int(abs(box[3, 0] - box[1, 0]))
        # boxPoints corners have no fixed order, so take min to get the top-left
        y = min(box[2][1], box[0][1])
        x = min(box[2][0], box[0][0])
        # Skip boxes that are too large or too small to be UI elements
        if h > 500 or w > 600:
            continue
        if h < 20 or w < 60:
            continue
        cropped = img[y-5:y+h+5, x-5:x+w+5]
        print('\nElement detected:')
        cv2_imshow(cropped)
        inner_text, before_text = element_ocr(cropped, img, x, y, w, h)
        positionAndText.append([box, inner_text, before_text])
        cropped = resize_image(cropped)
        croppedImage.append(cropped)
        cv2.drawContours(img, [box], 0, (255, 0, 255), 1)
    return np.array(croppedImage), np.array(positionAndText), img
```
Key functions in the code
Finding contours: findContours
https://docs.opencv.org/2.4/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html?highlight=findcontours#findcontours
Used in this example:
```python
binary = cv2.Canny(gray, 50, 100)
contours, hier = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
```
Minimum-area bounding rectangle: minAreaRect
```python
rect = cv2.minAreaRect(c)
```
boxPoints returns the rectangle's corner coordinates

Note: the corners returned by cv2.boxPoints are in no strict order, so you cannot identify a specific corner by its index; use functions like min and max when deriving positions.

```python
h = abs(box[3, 1] - box[1, 1])
w = abs(box[3, 0] - box[1, 0])
```
Cropping the image: image[y:y+h, x:x+w]

Because boxPoints returns the corner points in random order, the crop cannot be computed from a fixed index order; take min and max first:

```python
cropped = img[
    min(box[2][1], box[0][1])-5 : max(box[2][1], box[0][1])+5,
    min(box[2][0], box[0][0])-5 : max(box[2][0], box[0][0])+5
]
```
Further processing of the cropped images with ImageDataGenerator.flow
https://keras.io/api/preprocessing/image/
The shape of each image is (224, 224, 3).
flow requires its input to be:
NumPy array of rank 4 or a tuple. If tuple, the first element should contain the images and the second element another NumPy array or a list of NumPy arrays that gets passed to the output without any modifications. Can be used to feed the model miscellaneous data along with the images. In case of grayscale data, the channels axis of the image array should have value 1, in case of RGB data, it should have value 3, and in case of RGBA data, it should have value 4.
flow expects a rank-4 NumPy array, and in Python a list is not the same thing as a NumPy array, so np.array(croppedImage) is needed to convert the list to an array.
Note: another gotcha is that ImageDataGenerator does not return batches in the input order (flow shuffles by default), so the correspondence has to be carried through the labels. Since each element's coordinates are still needed after prediction to map results back to elements, the coordinates are passed in as the labels, which preserves the pairing.
```python
datagen = ImageDataGenerator(rescale=(1/255))

input_data = datagen.flow(images, boxes)

image_batch, boxes_batch = next(input_data)
```
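The idea of carrying coordinates through as labels can be illustrated without Keras: the NumPy sketch below shuffles image/label pairs together, the way flow does internally, and checks that each image stays matched to its own coordinates (all data here is made up):

```python
import numpy as np

rng = np.random.default_rng(0)
images = np.arange(6).reshape(6, 1)  # stand-ins for 6 cropped images
boxes = images * 10                  # each image's coordinates as its "label"

# Shuffle the pairs with one shared permutation:
# the order changes, but the correspondence does not
idx = rng.permutation(len(images))
image_batch, boxes_batch = images[idx], boxes[idx]

print(np.array_equal(boxes_batch, image_batch * 10))  # True
```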
Loading the trained model

```python
html_model = tf.keras.models.load_model('/content/model/')

predicted_batch = html_model.predict(image_batch)
predicted_id = np.argmax(predicted_batch, axis=-1)

class_name = np.array(['Input', 'Select', 'Button'])
predicted_name = class_name[predicted_id]
print(predicted_name)

result_show(image_batch, predicted_name)
```
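The argmax-to-class-name mapping used above can be checked in isolation; the softmax rows below are made-up stand-ins for the output of html_model.predict:

```python
import numpy as np

# Stand-in for html_model.predict(): one probability row per image
predicted_batch = np.array([[0.7, 0.2, 0.1],
                            [0.1, 0.1, 0.8],
                            [0.2, 0.6, 0.2]])
predicted_id = np.argmax(predicted_batch, axis=-1)  # index of the max per row
class_name = np.array(['Input', 'Select', 'Button'])

# NumPy fancy indexing maps each id to its class name
print(class_name[predicted_id].tolist())  # ['Input', 'Button', 'Select']
```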
Results

The results are as follows:
Room for improvement

Recognition accuracy clearly still has room to grow: the model was only trained once, on little data, so accuracy can be improved with further training.
Code

The full code is here:
https://github.com/yuxizhe/HTML-UI-datasets-generate/blob/master/cv%E5%90%8E%E5%88%86%E7%B1%BB.ipynb