Interactive marking method for target object in image

Document No.: 1939508 Publication date: 2021-12-07

Reading note: this technology, "Interactive marking method for target object in image" (一种图像中目标物的交互式标记方法), was designed by 罗兵, 徐志敏, 李圣田, 朱和平, 雷苏琪, 马能武, 黄祥虎, 陶蔚, 曹胜中, 何涛 and 王炜 on 2021-08-17. Its main content is as follows. The invention discloses an interactive marking method for a target object in an image. The method comprises the following steps. Step one, client image data loading: the images to be marked are stored at the server and published externally as a service; when a client requests a server-side picture, the server returns a compressed, first-scaled version of the original picture. Step two, client image rendering and marking: the client requests the picture to be marked from the server and renders the scaled image in the marking area; the client then performs marking operations on the target object in the image within the marking area to form a marking result. Step three, marking result coordinate conversion and storage: the coordinates of the points forming the client marking result are converted into pixel coordinates of the original picture on the server, and the converted result is stored in a database by calling a server interface. The invention enables multi-user collaborative image marking in a network environment while reducing the hardware performance required of the client.

1. An interactive marking method for a target object in an image, characterized by comprising the following steps:

step one: client image data loading;

storing the image to be marked at a server, and publishing the image externally as a service;

when a client requests to load a server-side picture, the server returns a compressed, first-scaled version of the original picture;

step two: client image rendering and marking;

the client requests the picture to be marked from the server, and renders the scaled image in the marking area;

after rendering is completed, the client performs a marking operation on the target object in the image within the marking area to form a marking result;

step three: marking result coordinate conversion and storage;

converting the coordinates of the points forming the client marking result into pixel coordinates of the original picture on the server, and storing the converted result in a database by calling a server interface.

2. The interactive marking method for a target object in an image according to claim 1, characterized in that: in step two, the marking operation comprises interactive clicking, box selection, and plotting.

3. The interactive marking method for a target object in an image according to claim 2, characterized in that: in step two, during marking, the picture is translated or zoomed relative to the marking area.

4. The interactive marking method for a target object in an image according to claim 3, characterized in that: the coordinates of the points forming the client marking result are converted into pixel coordinates of the original picture on the server by the following specific method:

let the first scaling multiple applied to the original picture be $r$, the second scaling multiple applied in the marking area of the client be $R$, and the offset of the upper left corner of the second-scaled picture relative to the upper left corner of the marking area be $(\Delta x, \Delta y)$;

let the relative coordinates, within the marking area, of a point forming the client marking result be $(X_0, Y_0)$, and the pixel coordinates of the corresponding point in the original picture on the server be $(x_0, y_0)$; the conversion relationship satisfies the following formula:

$$x_0 = r \cdot R \cdot (X_0 - \Delta x), \qquad y_0 = r \cdot R \cdot (Y_0 - \Delta y) \tag{1}$$

in formula (1), $R > 1$ means that the picture is reduced by the second scaling in the marking area; $R < 1$ means that the picture is magnified by the second scaling in the marking area; $R = 1$ means that the picture is displayed in the marking area at the scale at which it was received.

Technical Field

The invention relates to the technical field of computer vision, in particular to an interactive marking method for a target object in an image.

Background

Image marking is a preprocessing step that supports the detection of objects in images: a user clicks, box-selects, or plots a particular object in an image so that the object can be further processed by a computer. Image marking tools can be used to create training data sets and are widely applied in the fields of artificial intelligence and machine learning.

Currently, the widely used image marking tools are LabelMe (http://labelme.csail.mit.edu/Release3.0) and LabelImg (https://github.com/tzutalin/labelImg). LabelImg supports axis-aligned rectangular marks and by default saves the marking result as an XML file in PASCAL VOC format. LabelMe supports polygonal marks by default, also supports axis-aligned rectangles, points, lines, and circles, and by default saves the marking result as a JSON file.

Both tools require an installation package to be downloaded and installed locally, and both can open only local image data, so multi-user collaborative marking is difficult to achieve. Moreover, when there are many images and each image is large, the local device needs a large storage space and a large memory to load and mark the images smoothly.

Therefore, it is necessary to develop an image tagging method that can implement multi-user collaborative image tagging in a network environment and reduce the performance requirements of large-size and large-quantity image data on the hardware of the local device.

Disclosure of Invention

The invention aims to provide an interactive marking method for a target object in an image that enables multi-user collaborative image marking in a network environment and has the server carry out image storage and compression, thereby reducing the hardware performance required of the client. It addresses two problems of existing image marking tools: the tool and the image data must be installed or stored locally, which makes multi-user collaborative marking difficult; and, when the image data volume is large, images are difficult to load smoothly on devices with low hardware configuration, which reduces marking efficiency.

To achieve this purpose, the technical scheme of the invention is as follows: an interactive marking method for a target object in an image, characterized by comprising the following steps:

step one: client image data loading;

storing the image to be marked at a server, and publishing the image externally as a service; a networked architecture is adopted, the image data is placed at the server for external access, and the marking result is stored in a database, so that multi-user collaborative marking can be achieved;

when a client requests to load a server-side picture, the server returns a compressed, first-scaled version of the original picture; compression reduces the data volume, so that, provided the target object remains clear and identifiable, the data can be transmitted more efficiently over the network, which supports multi-user collaborative marking;

step two: client image rendering and marking;

the client requests the picture to be marked (a compressed, first-scaled version of the original picture) from the server, and renders the scaled image in the marking area;

after rendering is completed, the client performs an interactive marking operation on the target object in the image within the marking area to form a marking result;

step three: marking result coordinate conversion and storage;

the coordinates of the points forming the client marking result are local relative coordinates within the marking area and must be converted into pixel coordinates of the original picture on the server;

the converted result is stored in a database by calling a server interface (as shown in Fig. 1); through this coordinate conversion, the marking result of the target object is accurately mapped back from coordinates on the compressed image to coordinates on the original image, which ensures the reliability of the marking result.
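The storage part of step three can be illustrated with a minimal sketch. The source names neither the database nor the server interface, so everything here is a hypothetical stand-in: SQLite (from the Python standard library) plays the role of the database, and the function `save_mark`, the table `marks`, and its columns are illustrative names only.

```python
import json
import sqlite3
from typing import List


def save_mark(conn: sqlite3.Connection, image_id: str, label: str,
              points: List[List[int]]) -> int:
    """Store one marking result; points are pixel coordinates in the original picture.

    Hypothetical stand-in for the server interface described in step three.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS marks ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "image_id TEXT NOT NULL, "
        "label TEXT NOT NULL, "
        "points TEXT NOT NULL)"
    )
    cur = conn.execute(
        "INSERT INTO marks (image_id, label, points) VALUES (?, ?, ?)",
        (image_id, label, json.dumps(points)),
    )
    conn.commit()
    return cur.lastrowid


# In-memory database, used only so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
mark_id = save_mark(conn, "lamp_001.bmp", "spherical_lamp",
                    [[400, 250], [800, 250], [800, 650], [400, 650]])
row = conn.execute(
    "SELECT image_id, label, points FROM marks WHERE id = ?", (mark_id,)
).fetchone()
print(row[0], row[1], json.loads(row[2]))
```

Because all clients write original-picture coordinates to one shared store, results marked by different users at different zoom levels remain directly comparable.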

In the above technical solution, in the second step, the interactive mark operation includes interactive clicking, frame selection, plotting, or the like.

In the above technical solution, in the second step, in the marking process, the picture may be translated or scaled with respect to the marked region.
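Translation and zooming change only the second scaling multiple and the offset of the picture within the marking area. The sketch below shows one way a client might track that state; the class `ViewState` and its methods are hypothetical, and the choice to zoom about the cursor position is an assumption, not something the source specifies. It follows the convention stated for the second scaling multiple `R` (values above 1 reduce the picture, values below 1 magnify it).

```python
from dataclasses import dataclass


@dataclass
class ViewState:
    """Hypothetical client-side view state for the marking area."""
    R: float = 1.0    # second scaling multiple; R > 1 reduces, R < 1 magnifies
    dx: float = 0.0   # offset of the picture's upper left corner (x)
    dy: float = 0.0   # offset of the picture's upper left corner (y)

    def pan(self, move_x: float, move_y: float) -> None:
        """Translate the picture relative to the marking area."""
        self.dx += move_x
        self.dy += move_y

    def zoom_at(self, cx: float, cy: float, new_R: float) -> None:
        """Change the scaling multiple while keeping the picture point
        under the cursor (cx, cy) fixed in the marking area."""
        if new_R <= 0:
            raise ValueError("scaling multiple must be positive")
        # Coordinates, in the received picture, of the point under the cursor.
        u = self.R * (cx - self.dx)
        v = self.R * (cy - self.dy)
        # Choose the new offset so that (u, v) is drawn at (cx, cy) again.
        self.dx = cx - u / new_R
        self.dy = cy - v / new_R
        self.R = new_R


view = ViewState(R=1.0, dx=10.0, dy=20.0)
view.pan(5.0, -5.0)             # drag: offset becomes (15, 15)
view.zoom_at(115.0, 65.0, 0.5)  # magnify 2x about the cursor
print(view)
```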

In the above technical solution, as shown in fig. 2, the point coordinates forming the client marking result are converted into the pixel coordinates of the server original picture, and the specific method is as follows:

let the first scaling multiple applied to the original picture be $r$, the second scaling multiple applied in the marking area of the client be $R$, and the offset of the upper left corner of the second-scaled picture relative to the upper left corner of the marking area be $(\Delta x, \Delta y)$;

let the relative coordinates, within the marking area, of a point forming the client marking result be $(X_0, Y_0)$, and the pixel coordinates of the corresponding point in the original picture on the server be $(x_0, y_0)$; the conversion relationship satisfies the following formula:

$$x_0 = r \cdot R \cdot (X_0 - \Delta x), \qquad y_0 = r \cdot R \cdot (Y_0 - \Delta y) \tag{1}$$

in formula (1), $R > 1$ means that the picture is reduced by the second scaling in the marking area; $R < 1$ means that the picture is magnified by the second scaling in the marking area; $R = 1$ means that the picture is displayed in the marking area at the scale at which it was received.
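The conversion can be sketched in a few lines. This is an illustration under stated assumptions, not the patent's own code: the first scaling multiple is written `r` and the second `R`, both are treated as reduction factors (consistent with the statement that R > 1 reduces the picture), and the relationship used is x0 = r · R · (X0 − Δx), y0 = r · R · (Y0 − Δy). The function name `to_original_pixel` is hypothetical.

```python
from typing import Tuple


def to_original_pixel(X0: float, Y0: float, r: float, R: float,
                      dx: float, dy: float) -> Tuple[float, float]:
    """Map a point in the marking area to the original picture.

    (X0, Y0): point relative to the marking area's upper left corner.
    r: first scaling multiple (server side), assumed to be a reduction factor.
    R: second scaling multiple (client side); R > 1 reduces, R < 1 magnifies.
    (dx, dy): offset of the displayed picture's upper left corner.
    """
    if r <= 0 or R <= 0:
        raise ValueError("scaling multiples must be positive")
    # Remove the offset, undo the second scaling, then undo the first.
    x0 = r * R * (X0 - dx)
    y0 = r * R * (Y0 - dy)
    return x0, y0


# Illustrative values: picture reduced 10x by the server, magnified 2x
# (R = 0.5) by the client, and offset by (20, 30) in the marking area.
print(to_original_pixel(120.0, 80.0, r=10, R=0.5, dx=20.0, dy=30.0))
# prints (500.0, 250.0)
```

Subtracting the offset first matters: the offset is measured in marking-area units, so it must be removed before either scaling is undone.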

The methods used to compress and scale the original image at the server, to render the scaled image in the marking area, to call the server interface, and to store the converted result in the database are all prior art.

The invention has the following advantages:

1) a networked architecture is adopted, the image data is placed at a server side for external access, and a marking result is stored in a database, so that multi-user cooperative marking can be realized; the defect that in the prior art, a marking tool and image data need to be installed or stored locally, and multi-user cooperative marking is difficult to realize is overcome;

2) through image data compression and scaling of the server side, the image is transmitted to the client side in smaller data volume and size, and hardware requirements on the client side such as network bandwidth, storage capacity and memory size are effectively reduced; the defects that when the image data volume is large, images are difficult to smoothly load on equipment with low hardware configuration and the marking efficiency is influenced in the conventional image marking tool are overcome;

3) through coordinate conversion of the marking result, the marking result of the target object is accurately mapped back from coordinates on the compressed image to coordinates on the original image, which ensures the reliability of the marking result.

Drawings

Fig. 1 is a schematic diagram of the general technical principle of the present invention.

FIG. 2 is a schematic diagram of coordinate transformation of the marking result in the present invention.

Fig. 3 is a schematic diagram of an interactive mark of a target object in an image according to an embodiment of the present invention.

Fig. 3 comprises, from left to right, Fig. 3(1), Fig. 3(2), and Fig. 3(3); Fig. 3(1) is the original picture of this embodiment; Fig. 3(2) is the compressed and scaled version of the original picture of this embodiment; Fig. 3(3) shows the picture after the second scaling together with the marking area.

The light gray shaded area surrounding the picture in Fig. 3(3) is the marking area of step two.

Detailed Description

Embodiments of the present invention are described in detail below with reference to the accompanying drawings; they are exemplary only and do not limit the invention. The advantages of the invention will become clearer and easier to understand from the description.

The technical scheme provides an interactive marking method for a target object in an image. Through the steps of server-side image data compression and publishing, client image rendering and marking, and marking result coordinate conversion and storage, it enables multi-user collaborative image marking in a network environment; because the server carries out image storage and compression, the hardware performance required of the client is reduced.

Examples

The invention is described in detail below through an embodiment in which it is applied to the interactive marking of a spherical lamp-shaped target object in an image; the embodiment also serves as guidance for applying the invention to the interactive marking of target objects in other images.

The result of an experiment applying this scheme to image data of a spherical lamp-shaped target object is shown in Fig. 3.

Fig. 3(1) shows the original picture data obtained by the image capturing device, in BMP format, with a file size of 19496 KB and dimensions of 5472 × 3648 pixels. When each original picture is this large and there are many of them, conventional local marking requires a large hard disk and a large memory to store the data and read it quickly.

Fig. 3(2) shows the picture data after compression to JPG format and reduction by a factor of 10, with a file size of 31 KB and dimensions of 548 × 365 pixels. The data volume is thus reduced by a factor of more than 600; provided the target object remains clear and identifiable, the data can be transmitted more efficiently over the network, which supports multi-user collaborative marking.
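The figures quoted for Fig. 3(1) and Fig. 3(2) can be checked with simple arithmetic. In the sketch below, the helper `reduced_size` is hypothetical, and rounding up is an assumption: the source does not say how fractional pixel counts are handled, but rounding up reproduces the stated 548 × 365.

```python
import math
from typing import Tuple


def reduced_size(width: int, height: int, r: float) -> Tuple[int, int]:
    """Pixel dimensions after reducing by factor r, rounding up (assumed)."""
    return math.ceil(width / r), math.ceil(height / r)


# Dimensions and file sizes are taken from the embodiment.
print(reduced_size(5472, 3648, 10))  # prints (548, 365)
print(round(19496 / 31))             # prints 629, i.e. "more than 600 times"
```

Most of the saving comes from the reduction itself: a tenfold reduction on each axis leaves about one hundredth of the pixels, and JPG compression accounts for the rest.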

Fig. 3(3) shows the image data after it has been transmitted to the client and displayed in the marking area, where the user can zoom the image with the mouse wheel and translate it by dragging. The spherical lamp-shaped target is box-selected with a rectangular frame in the marking area; the marking result is shown as the solid-line frame in Fig. 3(3). The coordinates of the four corner points of the rectangular frame are then converted back to the corresponding rectangle in the original picture; the final result is shown as the dashed-line frame in Fig. 3(1).
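The corner conversion for a box-selected rectangle might look like the following sketch. Only the original size (5472 × 3648) and the first reduction factor (10) come from the embodiment; the second scaling multiple, the offset, and the corner coordinates are made-up illustrative values, and the function `rect_to_original` is hypothetical. It assumes the relationship x0 = r · R · (X0 − Δx), y0 = r · R · (Y0 − Δy), with rounding to whole pixels and clamping to the picture bounds added as extra assumptions.

```python
from typing import List, Tuple

ORIGINAL_SIZE = (5472, 3648)  # original picture size from the embodiment


def rect_to_original(corners: List[Tuple[float, float]], r: float, R: float,
                     dx: float, dy: float,
                     size: Tuple[int, int]) -> List[Tuple[int, int]]:
    """Convert rectangle corners from marking-area to original-picture pixels."""
    width, height = size
    converted = []
    for X, Y in corners:
        x = round(r * R * (X - dx))
        y = round(r * R * (Y - dy))
        # Clamp so a frame dragged past the picture edge stays inside it.
        converted.append((min(max(x, 0), width - 1),
                          min(max(y, 0), height - 1)))
    return converted


# Hypothetical solid-line frame drawn in the marking area.
corners = [(100, 60), (180, 60), (180, 140), (100, 140)]
original_corners = rect_to_original(corners, r=10, R=0.5, dx=20, dy=10,
                                    size=ORIGINAL_SIZE)
print(original_corners)
# prints [(400, 250), (800, 250), (800, 650), (400, 650)]
```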

In the embodiment, the marking result of the spherical lamp-shaped target object is accurately and inversely calculated from the coordinates on the compressed image to the coordinates on the original image through the coordinate conversion of the marking result, so that the reliability of the marking result is ensured.

Other parts not described belong to the prior art.
