Image sample generation method and device and electronic equipment

Document No.: 616064    Publication date: 2021-05-07

Reading note: This technology, "Image sample generation method, device and electronic equipment" (图像样本生成方法、装置和电子设备), was designed and created by Zhou Jie, Liu Chang and Wang Changhu on 2021-01-26. Main content: The embodiments of the present disclosure disclose an image sample generation method and device and electronic equipment. One embodiment of the method comprises: importing a first image sample into a pre-trained classification model, wherein the classification model comprises at least one feature extraction layer and is used for representing the corresponding relation between an image and a predefined type; determining a salient region in the first image sample according to the feature map output by the feature extraction layer; and establishing a corresponding relation between a salient region label indicating the salient region and the first image sample, wherein the salient region label and the first image sample are used for training to obtain a saliency detection model. A new way of generating image samples can thus be provided.

1. An image sample generation method, comprising:

importing a first image sample into a pre-trained classification model, wherein the classification model comprises at least one feature extraction layer and is used for representing the corresponding relation between an image and a predefined type;

determining a salient region in the first image sample according to the feature map output by the feature extraction layer;

and establishing a corresponding relation between a salient region label indicating the salient region and the first image sample, wherein the salient region label and the first image sample are used for training to obtain a saliency detection model.

2. The method of claim 1, wherein determining the salient region in the first image sample according to the feature map output by the feature extraction layer comprises:

determining a region with a response score ratio larger than a preset response score ratio threshold value from the feature map, wherein the response score ratio is the ratio of the response score of the region to the total response score in the feature map;

from the determined regions, a salient region is determined.

3. The method of claim 1, wherein the classification model is trained by a first step comprising:

importing the second image sample into an initial classification model to obtain a classification result;

and adjusting the initial classification model based on the classification result and a type label of the second image sample, wherein the type label is used for indicating the image click rate.

4. The method of claim 3, wherein the second image sample and the corresponding type label are generated by a second step, wherein the second step comprises:

importing the candidate image into a click model to obtain an estimated click rate, wherein the click model is used for representing the corresponding relation between the image and the estimated click rate;

and determining the second image sample from the candidate image according to the estimated click rate, and generating a type label corresponding to the second image sample.

5. The method of claim 4, wherein determining the second image sample from the candidate images according to the estimated click rate and generating a type label corresponding to the second image sample comprises:

determining the candidate image with the estimated click rate larger than the first click rate threshold as a second image sample corresponding to the first type label;

and determining the candidate image with the estimated click rate smaller than the second click rate threshold as a second image sample corresponding to the second type label, wherein the first click rate threshold is not smaller than the second click rate threshold.

6. The method of claim 1, further comprising:

importing an image to be processed into a pre-trained saliency detection model to obtain saliency region indication information, wherein the saliency detection model is obtained by training with a first image sample and a salient region label;

and cropping the image to be processed based on the saliency region indication information to obtain a cropped image corresponding to the image to be processed.

7. The method according to claim 6, wherein the cropping the image to be processed based on the saliency region indication information to obtain a cropped image corresponding to the image to be processed comprises:

determining a cropping reserved region according to the size of the salient region indicated by the saliency region indication information and the target size;

and cropping the image to be processed according to the determined cropping reserved region.

8. The method of claim 6, wherein the saliency detection model is trained by a third step, wherein the third step comprises:

importing the first image sample into an initial saliency detection model to obtain a detection result;

and adjusting the initial saliency detection model according to the salient region label and the detection result corresponding to the first image sample.

9. An image sample generation apparatus, comprising:

an importing unit, configured to import a first image sample into a pre-trained classification model, wherein the classification model comprises at least one feature extraction layer and is used for representing the corresponding relation between an image and a predefined type;

a determining unit, configured to determine a salient region in the first image sample according to the feature map output by the feature extraction layer;

and an establishing unit, configured to establish a corresponding relation between a salient region label indicating the salient region and the first image sample, wherein the salient region label and the first image sample are used for training to obtain a saliency detection model.

10. An electronic device, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.

11. A computer-readable medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the method according to any one of claims 1-8.

Technical Field

The present disclosure relates to the field of internet technologies, and in particular, to an image sample generation method and apparatus, and an electronic device.

Background

With the development of the internet, users increasingly use terminal devices to browse various kinds of information. The human visual system can quickly search for and locate objects of interest when facing natural scenes, and this visual attention mechanism is an important mechanism by which people process visual information in daily life. With the massive data volumes brought by the internet, how to quickly extract important information from massive image and video data has become a key problem in the field of computer vision.

Disclosure of Invention

This summary is provided to introduce concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

In a first aspect, an embodiment of the present disclosure provides an image sample generation method, including: importing a first image sample into a pre-trained classification model, wherein the classification model comprises at least one feature extraction layer and is used for representing the corresponding relation between an image and a predefined type; determining a salient region in the first image sample according to the feature map output by the feature extraction layer; and establishing a corresponding relation between a salient region label indicating the salient region and the first image sample, wherein the salient region label and the first image sample are used for training to obtain a saliency detection model.

In a second aspect, an embodiment of the present disclosure provides an image sample generation apparatus, including: an importing unit, configured to import a first image sample into a pre-trained classification model, wherein the classification model comprises at least one feature extraction layer and is used for representing the corresponding relation between an image and a predefined type; a determining unit, configured to determine a salient region in the first image sample according to the feature map output by the feature extraction layer; and an establishing unit, configured to establish a corresponding relation between a salient region label indicating the salient region and the first image sample, wherein the salient region label and the first image sample are used for training to obtain a saliency detection model.

In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the image sample generation method of the first aspect.

In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, which, when executed by a processor, implements the steps of the image sample generation method according to the first aspect.

According to the image sample generation method, apparatus and electronic device provided by the embodiments of the present disclosure, the first image sample is processed with the classification model, and the salient region is determined on the basis of an intermediate image of the classification model (the feature map), so that the salient region in the first image sample can be determined by exploiting the feature extraction capability of the classification model. In other words, a trained classification model can distinguish the primary content of an image from its secondary content, and highlights the primary content during feature extraction. This provides a way for a computer to automatically determine the salient region in an image, which reduces both the difficulty and the cost of generating samples for the saliency detection model.

Drawings

The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.

FIG. 1 is a flow diagram of one embodiment of an image sample generation method according to the present disclosure;

FIG. 2 is a flow diagram according to an exemplary implementation of the present disclosure;

FIG. 3 is a schematic diagram of one application scenario in accordance with the present disclosure;

FIG. 4 is a schematic structural diagram of one embodiment of an image sample generation apparatus according to the present disclosure;

FIG. 5 is an exemplary system architecture to which the image sample generation method of one embodiment of the present disclosure may be applied;

FIG. 6 is a schematic diagram of a basic structure of an electronic device provided according to an embodiment of the present disclosure.

Detailed Description

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.

It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.

The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.

It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.

It is noted that the modifiers "a", "an", and "the" in this disclosure are intended to be illustrative rather than limiting; those skilled in the art will understand that they should be read as "one or more" unless the context clearly dictates otherwise.

The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.

Referring to fig. 1, a flow of one embodiment of an image sample generation method according to the present disclosure is shown. The image sample generation method as shown in fig. 1 includes the steps of:

step 101, importing a first image sample into a pre-trained classification model.

In this embodiment, the executing entity (e.g., the server) of the image sample generation method may import the first image sample into a classification model trained in advance.

In this embodiment, the classification model may include at least one feature extraction layer.

Here, the classification model may be used to classify the image imported into the classification model, and the classification result may be a predefined type.

In this embodiment, a feature extraction layer may be used to extract image features. Here, the feature image output by the feature extraction layer may be referred to as a feature map.

In this embodiment, the classification model may be constructed based on a neural network. Besides the feature extraction layer, the classification model may also comprise neural network layers with various functions, such as a pooling layer and a fully connected layer. The specific structure of the classification model may be set according to the actual application scenario, and is not limited herein.

In this embodiment, the classification model is used to characterize the correspondence between the image and the predefined type.

In this embodiment, the predefined types may be types defined in advance. The specific content, the number, and the representation of the predefined types may be set according to the actual application scenario, and are not limited herein.
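As a concrete illustration of the kind of model described above, the following is a minimal sketch assuming a PyTorch-style convolutional classifier. The layer sizes, the two-class output, and the name ClassificationModel are illustrative assumptions, not an architecture prescribed by this disclosure; the point is only that the network exposes the feature map produced by its feature extraction layers.

```python
import torch
import torch.nn as nn

class ClassificationModel(nn.Module):
    """A toy classifier that exposes the feature map of its feature extraction layers."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Feature extraction layers: their output is the "feature map"
        # later used to locate the salient region.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)   # pooling layer
        self.fc = nn.Linear(64, num_classes)  # fully connected layer

    def forward(self, x: torch.Tensor):
        feature_map = self.features(x)                 # (N, 64, H/4, W/4)
        logits = self.fc(self.pool(feature_map).flatten(1))
        return logits, feature_map                     # expose the feature map
```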

And step 102, determining a salient region in the first image sample according to the feature map output by the feature extraction layer.

In this embodiment, a feature map (feature map) may be used to characterize the features of the first image sample.

In some related arts, the visual attention mechanism (VA) refers to the fact that, when facing a scene, a human automatically processes regions of interest and selectively ignores regions of no interest; the regions of interest are called salient regions. For a picture whose salient region is unknown, saliency detection can be performed to predict the regions that people may be interested in as salient regions.

And 103, establishing a corresponding relation between the salient region label indicating the salient region and the first image sample.

In this embodiment, the salient region label may indicate a salient region in the first image sample.

In this embodiment, establishing the corresponding relation between the salient region label and the first image sample may be understood as attaching the salient region label to the first image sample.

In this embodiment, the salient region label and the first image sample are used for training to obtain a saliency detection model.

It should be noted that, in the image sample generation method provided in this embodiment, the classification model is used to process the first image sample, and the salient region is determined based on an intermediate image of the classification model (the feature map), so that the salient region in the first image sample can be determined by exploiting the feature extraction capability of the classification model. In other words, a trained classification model can distinguish the primary content of an image from its secondary content, and highlights the primary content during feature extraction. This provides a way for a computer to automatically determine the salient region in an image, which reduces both the difficulty and the cost of generating samples for the saliency detection model.

In some embodiments, step 102 may include: determining, from the feature map, regions whose response score ratio is greater than a preset response score ratio threshold; and determining a salient region from the determined regions.

Here, the pixel value of each pixel in the feature map may be referred to as a response score.

Here, the response score of a determined region may be the sum of the response scores of the pixels in that region.

Here, the response score ratio may be a ratio between the response score of the region and the total response score in the feature map.

Here, the region where the response score ratio meets the above requirement (greater than a preset response score ratio threshold) may be determined in various ways.

Here, several regions meeting the requirement may be determined, and the salient region may then be selected from among them. For example, the region with the smallest area may be selected from the qualifying regions as the salient region.

It should be noted that determining the salient region from the feature map according to the response score keeps the area of the salient region as small as possible while ensuring that its response score remains large. Obtaining a salient region with the smallest possible area in this way improves the accuracy of the salient region.
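The selection rule just described can be made concrete with a short sketch. Pixel values of the channel-summed feature map are treated as response scores, the highest-response pixels are accumulated greedily until their share of the total response exceeds the threshold, and the bounding box of those pixels is taken as the salient region. The greedy strategy and the channel summation are assumptions, one simple way rather than the only way to satisfy the ratio requirement with a small area; non-negative activations (e.g. after a ReLU) are assumed.

```python
import numpy as np

def salient_region(feature_map: np.ndarray, ratio_threshold: float = 0.6):
    """feature_map: (C, H, W) activations; returns (top, left, bottom, right)."""
    response = feature_map.sum(axis=0)            # (H, W) per-pixel response scores
    total = response.sum()
    # Sort pixels by response, highest first, and accumulate until the
    # selected pixels' share of the total response exceeds the threshold.
    order = np.argsort(response, axis=None)[::-1]
    cumulative = np.cumsum(response.ravel()[order])
    k = int(np.searchsorted(cumulative, ratio_threshold * total)) + 1
    rows, cols = np.unravel_index(order[:k], response.shape)
    # The bounding box of the selected pixels is taken as the salient region.
    return int(rows.min()), int(cols.min()), int(rows.max()) + 1, int(cols.max()) + 1
```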

In some embodiments, the classification model may be trained by the first step.

In this embodiment, the first step may include step 201 and step 202.

Step 201, importing the second image sample into the initial classification model to obtain a classification result.

Here, an initial classification model may be constructed according to an actual application scenario. The initial classification model may output a classification result for the second image sample.

Here, the initial classification model may be an untrained or incompletely trained neural network.

Here, the second image sample has a type label, which may indicate an image click rate.

Alternatively, the image click rate may be represented as a continuous value or as a click rate grade. For example, two grades, high click rate and low click rate, may be used as the type labels, and the criterion for dividing high from low may vary.

Step 202, based on the classification result and the type label of the second image sample, the initial classification model is adjusted.

Here, the type label of the second image sample may indicate an image click rate.

Here, the image click rate may be determined in various ways, which is not limited herein. For example, image click-through rates may be determined from historical search presentation results.

It should be noted that training the initial classification model with type labels indicating image click rate yields a classification model that exploits the association between image click rate and user attention, improving its ability to extract features from the images users pay more attention to. In other words, the purpose of salient region detection is to find the image regions a user is likely to attend to, and the user click rate reflects, to some extent, the user's degree of attention to an image. Therefore, a classification model trained with type labels indicating image click rate can more accurately extract the regions users may attend to, improving the accuracy of salient region determination.
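A minimal training-loop sketch for this first step follows. It assumes the classifier from the earlier sketch (returning logits plus a feature map) and a data loader yielding (image, type label) pairs, with the type label encoding high versus low click rate; Adam and cross-entropy are assumptions, since the disclosure does not fix an optimizer or loss.

```python
import torch
import torch.nn as nn

def train_classifier(model, loader, epochs: int = 5, lr: float = 1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, type_labels in loader:   # type label: 0 = low CTR, 1 = high CTR
            logits, _ = model(images)        # the feature map is not needed here
            loss = criterion(logits, type_labels)
            optimizer.zero_grad()
            loss.backward()                  # adjust the initial classification model
            optimizer.step()                 # from classification result vs. type label
    return model
```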

In some embodiments, the second image sample and the corresponding type label may be generated by a second step. Here, the second step may include: importing the candidate image into a click model to obtain an estimated click rate; and determining a second image sample from the candidate images according to the estimated click rate, and generating a type label corresponding to the second image sample.

Here, there may be a plurality of candidate images. The click model can produce an estimated click rate for each candidate image.

Here, a click model is a model of user click behavior. Based on the user's historical click information, the user's interests and behavior are modeled so as to predict the user's future click behavior and improve relevance.

It should be noted that using the click model to select the second image samples, with type labels generated from the estimated click rate, exploits the click model's ability to predict user attention for images that have never been shown, which expands the range of usable candidate images. Moreover, using the click model to select the second image samples and generate the type labels reduces manual work, speeds up processing, and saves time and labor cost.

In some embodiments, determining the second image sample from the candidate images according to the estimated click rate and generating the type label corresponding to the second image sample may include: determining the candidate image with the estimated click rate larger than the first click rate threshold as a second image sample corresponding to the first type label; and determining the candidate image with the estimated click rate smaller than the second click rate threshold as a second image sample corresponding to the second type label.

Here, the first click rate threshold may be not less than the second click rate threshold. In other words, the first click-through rate threshold may be greater than or equal to the second click-through rate threshold. The first click rate threshold and the second click rate threshold may be set according to an actual application scenario, and are not limited herein. As an example, the candidate images corresponding to the estimated click rate greater than the first click rate threshold may account for 30% of the total number of candidate images; the candidate images corresponding to the estimated click rate smaller than the second click rate threshold may account for 30% of the total number of the candidate images.

Here, the first type tag may indicate a higher click rate. The second type of tag may indicate a lower click rate.

It should be noted that setting the two click rate thresholds and using only two type labels keeps the second image samples and their type labels simple, which reduces the difficulty of training the classification model on them while preserving the accuracy of the trained classification model.
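A sketch of this second step under the thresholds just described: each candidate image is passed through the click model to obtain an estimated click rate, and only the confidently high or low candidates are kept and labeled. The click_model callable is a placeholder, and deriving the two thresholds from the 30% quantiles follows the example above; both are assumptions.

```python
import numpy as np

def build_second_samples(candidate_images, click_model):
    """Select second image samples and generate their type labels."""
    ctr = np.array([click_model(img) for img in candidate_images])
    high_thr = np.quantile(ctr, 0.7)   # top ~30% of candidates -> first type label
    low_thr = np.quantile(ctr, 0.3)    # bottom ~30% of candidates -> second type label
    samples = []
    for img, rate in zip(candidate_images, ctr):
        if rate > high_thr:
            samples.append((img, 1))   # first type label: higher click rate
        elif rate < low_thr:
            samples.append((img, 0))   # second type label: lower click rate
        # candidates between the two thresholds are discarded
    return samples
```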

In some embodiments, the method may further include: importing the image to be processed into a pre-trained saliency detection model to obtain saliency region indication information; and cropping the image to be processed based on the saliency region indication information to obtain a cropped image corresponding to the image to be processed.

Here, the saliency detection model may be trained using the first image sample and the salient region label.

Here, the above-described saliency region indication information may indicate a saliency region in an image to be processed.

Here, the image to be processed may be cropped based on the saliency region indication information to obtain the cropped image.

It should be noted that, since the saliency detection model is obtained by training with the first image sample and the salient region label, using the salient region determined by this model improves the accuracy of the determined salient region.
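An inference sketch for this passage: the trained saliency detection model is run on the image to be processed, and its output map is converted into saliency region indication information, represented here as a bounding box. That the model outputs a single-channel logit map, and the 0.5 threshold, are assumptions.

```python
import numpy as np
import torch

def detect_region(detector, image_tensor: torch.Tensor, threshold: float = 0.5):
    """Return saliency region indication information as (top, left, bottom, right)."""
    with torch.no_grad():
        saliency_map = torch.sigmoid(detector(image_tensor))[0, 0].cpu().numpy()
    ys, xs = np.nonzero(saliency_map > threshold)
    if ys.size == 0:                      # nothing salient: indicate the whole image
        return 0, 0, saliency_map.shape[0], saliency_map.shape[1]
    return int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1
```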

In some embodiments, the image to be processed may be cropped based on the saliency region indication information in various ways.

In some embodiments, cropping the image to be processed based on the saliency region indication information to obtain a cropped image corresponding to the image to be processed may include: determining a cropping reserved region according to the size of the salient region indicated by the saliency region indication information and the target size; and cropping the image to be processed according to the determined cropping reserved region.

Here, the target size may be the desired size of the processed image.

Here, the cropping reserved region may refer to the region that needs to be retained when the image to be processed is cropped.

Optionally, if the size of the salient region is larger than the target size, cropping may start from the center of the salient region, with the target size as the boundary, to obtain the cropped image.

Optionally, if the size of the salient region is not larger than the target size, the salient region may be edge-repaired, padding it out to the target size, to obtain the cropped image.

It should be noted that determining the cropping reserved region according to the size of the salient region and the target size allows an appropriate reserved region to be set for each image, which meets the target-size requirement and improves the accuracy of cropping the image to be processed.
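The two optional branches above can be sketched as follows: when the salient region is larger than the target size, a target-sized window centered on the salient region is cropped out; otherwise the crop around the region is padded ("edge-repaired") out to the target size. The centering and the zero padding are assumed details, not behavior fixed by this disclosure.

```python
import numpy as np

def crop_to_target(image: np.ndarray, salient_box, target_hw):
    """image: (H, W, C); salient_box: (top, left, bottom, right); target_hw: (th, tw)."""
    th, tw = target_hw
    top, left, bottom, right = salient_box
    cy, cx = (top + bottom) // 2, (left + right) // 2   # salient-region center
    # Cropping reserved region: a target-sized window centered on the salient
    # region, clamped to the image bounds.
    y0 = int(np.clip(cy - th // 2, 0, max(image.shape[0] - th, 0)))
    x0 = int(np.clip(cx - tw // 2, 0, max(image.shape[1] - tw, 0)))
    crop = image[y0:y0 + th, x0:x0 + tw]
    if crop.shape[0] < th or crop.shape[1] < tw:
        # Salient region / image smaller than the target: edge-repair by
        # zero-padding the crop out to the target size.
        pad_h, pad_w = th - crop.shape[0], tw - crop.shape[1]
        crop = np.pad(crop, ((0, pad_h), (0, pad_w), (0, 0)))
    return crop
```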

In some embodiments, the saliency detection model may be trained by a third step. The third step may include: importing the first image sample into an initial saliency detection model to obtain a detection result; and adjusting the initial saliency detection model according to the salient region labels and the detection results corresponding to the first image samples.

Here, the initial saliency detection model may be an untrained or incompletely trained neural network. The specific structure of the initial saliency detection model may be set according to the actual application scenario, and is not limited herein.

Here, the detection result may indicate a predicted significance region.

Here, a loss between the salient region label and the detection result may be calculated, and the parameters of the initial saliency detection model may be adjusted according to the loss.

It should be noted that the first image samples are used to train the saliency detection model. Because the first image samples combine high label accuracy with low sample cost, a large number of them can be obtained for training, so model accuracy is improved both by the accuracy of each individual sample and by the sheer number of samples.
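A minimal sketch of this third step, assuming the salient region labels have been rendered as binary masks and that the detection result is a per-pixel logit map; binary cross-entropy is an assumed choice of loss, since the disclosure only requires that a loss between label and detection result drive the adjustment.

```python
import torch
import torch.nn as nn

def train_saliency_model(model, loader, epochs: int = 5, lr: float = 1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for images, region_masks in loader:  # masks derived from salient region labels
            detection = model(images)        # detection result: per-pixel logits
            loss = criterion(detection, region_masks.float())
            optimizer.zero_grad()
            loss.backward()                  # adjust the initial saliency detection model
            optimizer.step()
    return model
```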

In some application scenarios, please refer to fig. 3, which illustrates an application scenario according to an embodiment of the present disclosure.

The click model may process the candidate image to obtain a second image sample and a type label of the second image sample.

The second image sample and the corresponding type label can be used for training an initial classification model to obtain a classification model.

The classification model may be used to process the first image sample to obtain a salient region label.

The first image sample and the saliency region label can be used for training an initial saliency detection model to obtain the saliency detection model.

The saliency detection model can be used for processing the image to be processed to obtain saliency region indication information of the image to be processed.

Based on the saliency region indication information, the image to be processed can be cropped to obtain a cropped image.

With further reference to fig. 4, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an image sample generation apparatus, which corresponds to the method embodiment shown in fig. 1, and which is particularly applicable in various electronic devices.

As shown in fig. 4, the image sample generation apparatus of the present embodiment includes: an importing unit 401, a determining unit 402 and an establishing unit 403. The importing unit is configured to import a first image sample into a pre-trained classification model, wherein the classification model comprises at least one feature extraction layer and is used for representing the corresponding relation between an image and a predefined type; the determining unit is configured to determine a salient region in the first image sample according to the feature map output by the feature extraction layer; and the establishing unit is configured to establish a corresponding relation between a salient region label indicating the salient region and the first image sample, wherein the salient region label and the first image sample are used for training to obtain a saliency detection model.

In this embodiment, specific processes of the importing unit 401, the determining unit 402, and the establishing unit 403 of the image sample generating apparatus and technical effects thereof may refer to related descriptions of step 101, step 102, and step 103 in the corresponding embodiment of fig. 1, which are not described herein again.

In some embodiments, the determining a salient region in the first image sample according to the feature map output by the feature extraction layer includes: determining a region with a response score ratio larger than a preset response score ratio threshold value from the feature map, wherein the response score ratio is the ratio of the response score of the region to the total response score in the feature map; from the determined regions, a salient region is determined.

In some embodiments, the classification model is trained by a first step comprising: importing the second image sample into an initial classification model to obtain a classification result; and adjusting the initial classification model based on the classification result and a type label of the second image sample, wherein the type label is used for indicating the image click rate.

In some embodiments, the second image sample and the corresponding type label are generated by a second step, wherein the second step comprises: importing the candidate image into a click model to obtain an estimated click rate, wherein the click model is used for representing the corresponding relation between the image and the estimated click rate; and determining the second image sample from the candidate image according to the estimated click rate, and generating a type label corresponding to the second image sample.

In some embodiments, determining the second image sample from the candidate images according to the estimated click rate and generating the type label corresponding to the second image sample includes: determining the candidate image with the estimated click rate larger than the first click rate threshold as a second image sample corresponding to the first type label; and determining the candidate image with the estimated click rate smaller than a second click rate threshold as a second image sample corresponding to the second type label, wherein the first click rate threshold is not smaller than the second click rate threshold.

In some embodiments, the apparatus is further configured to: import an image to be processed into a pre-trained saliency detection model to obtain saliency region indication information, wherein the saliency detection model is obtained by training with a first image sample and a salient region label; and crop the image to be processed based on the saliency region indication information to obtain a cropped image corresponding to the image to be processed.

In some embodiments, the cropping the image to be processed based on the saliency region indication information to obtain a cropped image corresponding to the image to be processed includes: determining a cropping reserved region according to the size of the salient region indicated by the saliency region indication information and the target size; and cropping the image to be processed according to the determined cropping reserved region.

In some embodiments, the saliency detection model is trained by a third step, wherein the third step comprises: importing the first image sample into an initial saliency detection model to obtain a detection result; and adjusting the initial saliency detection model according to the salient region label and the detection result corresponding to the first image sample.

Referring to fig. 5, fig. 5 illustrates an exemplary system architecture to which the image sample generation method of one embodiment of the present disclosure may be applied.

As shown in fig. 5, the system architecture may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 serves to provide a medium for communication links between the terminal devices 501, 502, 503 and the server 505. Network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

The terminal devices 501, 502, 503 may interact with a server 505 over a network 504 to receive or send messages and the like. The terminal devices 501, 502, 503 may have various client applications installed thereon, such as a web browser application, a search-type application, and a news-information-type application. The client application in the terminal device 501, 502, 503 may receive an instruction from the user and complete the corresponding function according to that instruction, for example, adding corresponding information to the presented information according to the user's instruction.

The terminal devices 501, 502, 503 may be hardware or software. When the terminal devices 501, 502, 503 are hardware, they may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and the like. When the terminal devices 501, 502, and 503 are software, they may be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module, which is not specifically limited herein.

The server 505 may be a server providing various services, for example, receiving an information acquisition request sent by the terminal devices 501, 502, 503, acquiring the presentation information corresponding to the request in various ways, and sending the relevant data of the presentation information to the terminal devices 501, 502, 503.

It should be noted that the image sample generation method provided by the embodiment of the present disclosure may be executed by a terminal device, and accordingly, the image sample generation apparatus may be disposed in the terminal device 501, 502, 503. In addition, the image sample generation method provided by the embodiment of the present disclosure may also be executed by the server 505, and accordingly, an image sample generation apparatus may be provided in the server 505.

It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Referring now to fig. 6, shown is a schematic diagram of an electronic device (e.g., a terminal device or a server of fig. 5) suitable for use in implementing embodiments of the present disclosure. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.

As shown in fig. 6, the electronic device may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic device 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.

It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (Hyper Text Transfer Protocol), and may interconnect with any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.

The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.

The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: import a first image sample into a pre-trained classification model, wherein the classification model comprises at least one feature extraction layer and is used for representing the corresponding relation between an image and a predefined type; determine a salient region in the first image sample according to the feature map output by the feature extraction layer; and establish a corresponding relation between a salient region label indicating the salient region and the first image sample, wherein the salient region label and the first image sample are used for training to obtain a saliency detection model.

Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software or hardware. In some cases, the name of a unit does not constitute a limitation of the unit itself; for example, the importing unit may also be described as "a unit that imports a first image sample into a pre-trained classification model".

The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The foregoing description is merely a description of the preferred embodiments of the present disclosure and of the technical principles employed. It will be appreciated by those skilled in the art that the scope of the disclosure is not limited to technical solutions formed by the particular combination of features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, technical solutions formed by replacing the above features with (but not limited to) features with similar functions disclosed in this disclosure.

Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
