Extracting Rectangles from an Image with OpenCV
```python
import cv2
import numpy as np
from google.colab.patches import cv2_imshow

def cv_get_block(location):
    img = cv2.imread(location)
    # Scale to a fixed width of 700 px, keeping the aspect ratio
    img = cv2.resize(img, (700, int(700 / img.shape[1] * img.shape[0])), interpolation=cv2.INTER_AREA)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    binary = cv2.Canny(gray, 50, 100)
    contours, hier = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    croppedImage = []
    positionAndText = []
    for c in contours:
        rect = cv2.minAreaRect(c)
        box = cv2.boxPoints(rect)
        box = np.int0(box)
        h = int(abs(box[3, 1] - box[1, 1]))
        w = int(abs(box[3, 0] - box[1, 0]))
        # boxPoints corners have no fixed order, so take min to get the top-left
        y = min(box[2][1], box[0][1])
        x = min(box[2][0], box[0][0])
        # Skip boxes that are too large or too small to be UI elements
        if h > 500 or w > 600:
            continue
        if h < 20 or w < 60:
            continue
        cropped = img[y-5:y+h+5, x-5:x+w+5]
        print('\nElement detected:')
        cv2_imshow(cropped)
        inner_text, before_text = element_ocr(cropped, img, x, y, w, h)
        positionAndText.append([box, inner_text, before_text])
        cropped = resize_image(cropped)
        croppedImage.append(cropped)
        cv2.drawContours(img, [box], 0, (255, 0, 255), 1)
    return np.array(croppedImage), np.array(positionAndText), img
```
Key functions in the code
Finding contours: findContours
https://docs.opencv.org/2.4/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html?highlight=findcontours#findcontours
Used in this example:
```python
binary = cv2.Canny(gray, 50, 100)
contours, hier = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
```
Minimum-area bounding rectangle: minAreaRect
```python
rect = cv2.minAreaRect(c)
```
boxPoints returns the rectangle's corner coordinates

Note: the corners returned by cv2.boxPoints are in no strict order, so you cannot identify a specific corner by its index; use functions like min and max when deriving positions.

```python
h = abs(box[3, 1] - box[1, 1])
w = abs(box[3, 0] - box[1, 0])
```
Cropping the image: image[y:y+h, x:x+w]

Because boxPoints returns the corner points in random order, the crop cannot be computed from a fixed index order; take min and max first:

```python
cropped = img[
    min(box[2][1], box[0][1])-5 : max(box[2][1], box[0][1])+5,
    min(box[2][0], box[0][0])-5 : max(box[2][0], box[0][0])+5
]
```
Further processing of the cropped images with ImageDataGenerator.flow
https://keras.io/api/preprocessing/image/
The shape of each image is (224, 224, 3).
flow requires its input to be:
NumPy array of rank 4 or a tuple. If tuple, the first element should contain the images and the second element another NumPy array or a list of NumPy arrays that gets passed to the output without any modifications. Can be used to feed the model miscellaneous data along with the images. In case of grayscale data, the channels axis of the image array should have value 1, in case of RGB data, it should have value 3, and in case of RGBA data, it should have value 4.
flow expects a rank-4 NumPy array, and in Python a list is not the same thing as a NumPy array, so np.array(croppedImage) is needed to convert the list to an array.
Note: another gotcha is that ImageDataGenerator does not return batches in the input order (flow shuffles by default), so the correspondence has to be carried through the labels. Since each element's coordinates are still needed after prediction to map results back to elements, the coordinates are passed in as the labels, which preserves the pairing.
```python
datagen = ImageDataGenerator(rescale=(1/255))

input_data = datagen.flow(images, boxes)

image_batch, boxes_batch = next(input_data)
```
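The idea of carrying coordinates through as labels can be illustrated without Keras: the NumPy sketch below shuffles image/label pairs together, the way flow does internally, and checks that each image stays matched to its own coordinates (all data here is made up):

```python
import numpy as np

rng = np.random.default_rng(0)
images = np.arange(6).reshape(6, 1)  # stand-ins for 6 cropped images
boxes = images * 10                  # each image's coordinates as its "label"

# Shuffle the pairs with one shared permutation:
# the order changes, but the correspondence does not
idx = rng.permutation(len(images))
image_batch, boxes_batch = images[idx], boxes[idx]

print(np.array_equal(boxes_batch, image_batch * 10))  # True
```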
Loading the trained model

```python
html_model = tf.keras.models.load_model('/content/model/')

predicted_batch = html_model.predict(image_batch)
predicted_id = np.argmax(predicted_batch, axis=-1)

class_name = np.array(['Input', 'Select', 'Button'])
predicted_name = class_name[predicted_id]
print(predicted_name)

result_show(image_batch, predicted_name)
```
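The argmax-to-class-name mapping used above can be checked in isolation; the softmax rows below are made-up stand-ins for the output of html_model.predict:

```python
import numpy as np

# Stand-in for html_model.predict(): one probability row per image
predicted_batch = np.array([[0.7, 0.2, 0.1],
                            [0.1, 0.1, 0.8],
                            [0.2, 0.6, 0.2]])
predicted_id = np.argmax(predicted_batch, axis=-1)  # index of the max per row
class_name = np.array(['Input', 'Select', 'Button'])

# NumPy fancy indexing maps each id to its class name
print(class_name[predicted_id].tolist())  # ['Input', 'Button', 'Select']
```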
Results

The results are as follows:
Room for improvement

Recognition accuracy clearly still has room to grow: the model was only trained once, on little data, so accuracy can be improved with further training.
Code

The full code is here:
https://github.com/yuxizhe/HTML-UI-datasets-generate/blob/master/cv%E5%90%8E%E5%88%86%E7%B1%BB.ipynb