Blind spot implementation in neural networks

Document No.: 1253279 · Publication date: 2020-08-21

Abstract: This technique, "Blind spot implementation in neural networks," was created by L. Boué, M. Kemelmakher, and R. Bitman on 2019-11-13. Its main content is as follows: Techniques for implementing blind spots in a neural network model are disclosed. In some example embodiments, a computer-implemented method includes obtaining an image captured within a field of view of an image capture device and including an object of a particular type occupying a particular location within the field of view, and determining a confidence value for the object based on the particular location of the object using a neural network model. The confidence value represents a likelihood that the object is an object of interest, and the neural network model is trained to generate a lower confidence value for the object of the particular type when the object of the particular type occupies the particular location than when it does not occupy the particular location.

1. A computer-implemented method for blind spot implementation in a neural network model, comprising:

obtaining a first image captured within a field of view of an image capture device, the first image including a first object of a first type occupying a first location within the field of view;

determining, by at least one hardware processor, a first confidence value for the first object based on the first location of the first object using a neural network model, the first confidence value representing a likelihood that the first object is an object of interest, the neural network model configured to generate a lower confidence value for the first type of object when the first type of object occupies the first location within the field of view than when the first type of object does not occupy the first location within the field of view;

obtaining a second image captured within the field of view of the image capture device, the second image including a second object of the first type occupying a second location within the field of view of the image capture device, the second location being different from the first location;

determining, by the at least one hardware processor, a second confidence value for the second object based on the second location of the second object using the neural network model, the second confidence value representing a likelihood that the second object is the object of interest, and the second confidence value for the second object being higher than the first confidence value for the first object based on the second object being at the second location rather than the first location;

determining that the second object is the object of interest based on the second confidence value of the second object; and

based on determining that the second object is the object of interest, communicating instructions to a computing device via a network, the instructions configured to cause the computing device to perform a function based on the second object being the object of interest.

2. The computer-implemented method of claim 1, wherein the image capture device comprises a camera.

3. The computer-implemented method of claim 1, wherein the first type is a person.

4. The computer-implemented method of claim 1, wherein the neural network model comprises a convolutional neural network model.

5. The computer-implemented method of claim 1, wherein the function comprises displaying, on the computing device, an indication that the second object is present within the field of view.

6. The computer-implemented method of claim 1, further comprising, prior to obtaining the first image and the second image:

accessing a database to obtain a first training data set comprising a first plurality of training data images, each of the first plurality of training data images comprising a corresponding training data object of the first type occupying the first location within the field of view, each corresponding training data object of the first type occupying the first location not being labeled as an object of interest in the first training data set;

accessing a database to obtain a second training data set comprising a second plurality of training data images, each of the second plurality of training data images comprising a corresponding training data object of a second type occupying the first location within the field of view, each corresponding training data object of the second type occupying the first location being labeled as an object of interest in the second training data set; and

training, by the at least one hardware processor, the neural network model using the first training data set, the second training data set, and one or more machine learning algorithms.

7. The computer-implemented method of claim 1, further comprising:

obtaining a third image of the field of view, the third image having been captured by the image capture device and including a third object of a second type occupying the first location within the field of view, the second type being different from the first type;

determining, by the at least one hardware processor, a third confidence value for the third object based on the first location of the third object using the neural network model, the third confidence value representing a likelihood that the third object is the object of interest, and the third confidence value for the third object being higher than the first confidence value for the first object based on the third object being of the second type and not of the first type;

determining that the third object is the object of interest based on the third confidence value of the third object; and

based on the determination that the third object is the object of interest, communicating, via the network, another instruction to the computing device, the other instruction configured to cause the computing device to perform a function based on the third object being the object of interest.

8. The computer-implemented method of claim 7, further comprising, prior to obtaining the third image:

accessing a database to obtain a third training data set comprising a third plurality of training data images, each of the third plurality of training data images comprising a corresponding training data object of the second type occupying the first location within the field of view, each corresponding training data object of the second type occupying the first location not being labeled as an object of interest in the third training data set; and

training, by the at least one hardware processor, the neural network model using the third training data set and one or more machine learning algorithms.

9. The computer-implemented method of claim 1, wherein the first image further includes a third object of a second type occupying the first location within the field of view, the second type being different from the first type, and the method further comprises:

determining, by the at least one hardware processor, a third confidence value for the third object based on the first location of the third object using the neural network model, the third confidence value representing a likelihood that the third object is the object of interest, and the third confidence value for the third object being higher than the first confidence value for the first object based on the third object being of the second type and not of the first type;

determining that the third object is the object of interest based on the third confidence value of the third object; and

based on the determination that the third object is the object of interest, communicating, via the network, another instruction to the computing device, the other instruction configured to cause the computing device to perform a function based on the third object being the object of interest.

10. The computer-implemented method of claim 9, further comprising, prior to obtaining the first image:

accessing a database to obtain a training data set comprising a plurality of training data images, each of the plurality of training data images comprising a corresponding training data object of the first type occupying the first location within the field of view and a corresponding training data object of the second type occupying the first location within the field of view, each corresponding training data object of the first type occupying the first location not being labeled as an object of interest in the training data set, and each corresponding training data object of the second type occupying the first location being labeled as an object of interest in the training data set; and

training, by the at least one hardware processor, the neural network model using the training data set and one or more machine learning algorithms.

11. A system for blind spot implementation in a neural network model, comprising:

at least one processor; and

a non-transitory computer-readable medium storing executable instructions that, when executed, cause the at least one processor to perform operations comprising:

obtaining a first image captured within a field of view of an image capture device, the first image including a first object of a first type occupying a first location within the field of view;

determining a first confidence value for the first object based on the first location of the first object using a neural network model, the first confidence value representing a likelihood that the first object is an object of interest, the neural network model configured to generate a lower confidence value for the first type of object when the first type of object occupies the first location within the field of view than when the first type of object does not occupy the first location within the field of view;

obtaining a second image captured within the field of view of the image capture device, the second image including a second object of the first type occupying a second location within the field of view of the image capture device, the second location being different from the first location;

determining, using the neural network model, a second confidence value for the second object based on the second location of the second object, the second confidence value representing a likelihood that the second object is the object of interest, and the second confidence value for the second object being higher than the first confidence value for the first object based on the second object being at the second location rather than the first location;

determining that the second object is the object of interest based on the second confidence value of the second object; and

based on determining that the second object is the object of interest, communicating instructions to a computing device via a network, the instructions configured to cause the computing device to perform a function based on the second object being the object of interest.

12. The system of claim 11, wherein the image capture device comprises a camera.

13. The system of claim 11, wherein the first type is a person.

14. The system of claim 11, wherein the function comprises displaying, on the computing device, an indication that the second object is present within the field of view.

15. The system of claim 11, wherein the operations further comprise, prior to obtaining the first image and the second image:

accessing a database to obtain a first training data set comprising a first plurality of training data images, each of the first plurality of training data images comprising a corresponding training data object of the first type occupying the first location within the field of view, each corresponding training data object of the first type occupying the first location not being labeled as an object of interest in the first training data set;

accessing a database to obtain a second training data set comprising a second plurality of training data images, each of the second plurality of training data images comprising a corresponding training data object of a second type occupying the first location within the field of view, each corresponding training data object of the second type occupying the first location being labeled as an object of interest in the second training data set; and

training the neural network model using the first training data set, the second training data set, and one or more machine learning algorithms.

16. The system of claim 11, wherein the operations further comprise:

obtaining a third image of the field of view, the third image having been captured by the image capture device and including a third object of a second type occupying the first location within the field of view, the second type being different from the first type;

determining, using the neural network model, a third confidence value for the third object based on the first location of the third object, the third confidence value representing a likelihood that the third object is the object of interest, and the third confidence value for the third object being higher than the first confidence value for the first object based on the third object being of the second type and not of the first type;

determining that the third object is the object of interest based on the third confidence value of the third object; and

based on the determination that the third object is the object of interest, communicating, via the network, another instruction to the computing device, the other instruction configured to cause the computing device to perform a function based on the third object being the object of interest.

17. The system of claim 16, wherein the operations further comprise, prior to obtaining the third image:

accessing a database to obtain a third training data set comprising a third plurality of training data images, each of the third plurality of training data images comprising a corresponding training data object of the second type occupying the first location within the field of view, each corresponding training data object of the second type occupying the first location not being labeled as an object of interest in the third training data set; and

training the neural network model using the third training data set and one or more machine learning algorithms.

18. The system of claim 11, wherein the first image further includes a third object of a second type occupying the first location within the field of view, the second type being different from the first type, and the operations further comprise:

determining, using the neural network model, a third confidence value for the third object based on the first location of the third object, the third confidence value representing a likelihood that the third object is the object of interest, and the third confidence value for the third object being higher than the first confidence value for the first object based on the third object being of the second type and not of the first type;

determining that the third object is the object of interest based on the third confidence value of the third object; and

based on the determination that the third object is the object of interest, communicating, via the network, another instruction to the computing device, the other instruction configured to cause the computing device to perform a function based on the third object being the object of interest.

19. The system of claim 18, wherein the operations further comprise, prior to obtaining the first image:

accessing a database to obtain a training data set comprising a plurality of training data images, each of the plurality of training data images comprising a corresponding training data object of the first type occupying the first location within the field of view and a corresponding training data object of the second type occupying the first location within the field of view, each corresponding training data object of the first type occupying the first location not being labeled as an object of interest in the training data set, and each corresponding training data object of the second type occupying the first location being labeled as an object of interest in the training data set; and

training the neural network model using the training data set and one or more machine learning algorithms.

20. A non-transitory machine-readable storage medium tangibly embodying a set of instructions, wherein the set of instructions, when executed by at least one processor, cause the at least one processor to perform operations comprising:

obtaining a first image captured within a field of view of an image capture device, the first image including a first object of a first type occupying a first location within the field of view;

determining a first confidence value for the first object based on the first location of the first object using a neural network model, the first confidence value representing a likelihood that the first object is an object of interest, the neural network model configured to generate a lower confidence value for the first type of object when the first type of object occupies the first location within the field of view than when the first type of object does not occupy the first location within the field of view;

obtaining a second image captured within the field of view of the image capture device, the second image including a second object of the first type occupying a second location within the field of view of the image capture device, the second location being different from the first location;

determining, using the neural network model, a second confidence value for the second object based on the second location of the second object, the second confidence value representing a likelihood that the second object is the object of interest, and the second confidence value for the second object being higher than the first confidence value for the first object based on the second object being at the second location rather than the first location;

determining that the second object is the object of interest based on the second confidence value of the second object; and

based on determining that the second object is the object of interest, communicating instructions to a computing device via a network, the instructions configured to cause the computing device to perform a function based on the second object being the object of interest.

Technical Field

The present application relates generally to the field of neural networks and, in various embodiments, to systems and methods for blind spot implementation in neural network models.

Background

As practical applications emerge in areas ranging from computer vision to speech recognition, industry interest in artificial intelligence is growing exponentially. Despite early successes, Machine Learning (ML) models still behave inconsistently in some cases. For example, ML-based models are not robust to small adversarial perturbations and generally lack interpretability. Currently, computer vision models used to detect objects of interest in the field of view of an image capture device (e.g., a security camera) and provide an indication of such detection do not implement any blind spots in their detection. As a result, a potential object of interest detected anywhere within the field of view of the image capture device is treated as a detected object of interest. These current computer vision models fail to provide selective detection of objects of interest, which results in unnecessary notifications indicating the detection of objects of interest and, thus, in excessive consumption of electronic resources, such as the additional processor workload and network bandwidth consumed in generating and sending unnecessary notifications. In addition, the unnecessary notifications caused by such failures of current computer vision models demand excessive human attention, make it difficult to balance attention across notifications, and worsen the user experience.

Disclosure of Invention

According to an aspect of the present disclosure, there is provided a computer-implemented method for blind spot implementation in a neural network model, the method comprising: obtaining a first image captured within a field of view of an image capture device, the first image including a first object of a first type occupying a first location within the field of view; determining, by at least one hardware processor, a first confidence value for the first object based on the first location of the first object using a neural network model, the first confidence value representing a likelihood that the first object is an object of interest, the neural network model configured to generate a lower confidence value for the first type of object when the first type of object occupies the first location within the field of view than when the first type of object does not occupy the first location within the field of view; obtaining a second image captured within the field of view of the image capture device, the second image including a second object of the first type occupying a second location within the field of view of the image capture device, the second location being different from the first location; determining, by the at least one hardware processor, a second confidence value for the second object based on the second location of the second object using the neural network model, the second confidence value representing a likelihood that the second object is the object of interest, and the second confidence value for the second object being higher than the first confidence value for the first object based on the second object being at the second location rather than the first location; determining that the second object is the object of interest based on the second confidence value of the second object; and based on determining that the second object is the object of interest, communicating instructions to a computing device via a network, the instructions configured to cause the computing device to perform a function based on the second object being the object of interest.

According to another aspect of the present disclosure, there is provided a system for blind spot implementation in a neural network model, the system comprising: at least one processor; and a non-transitory computer-readable medium storing executable instructions that, when executed, cause the at least one processor to perform operations comprising: obtaining a first image captured within a field of view of an image capture device, the first image including a first object of a first type occupying a first location within the field of view; determining a first confidence value for the first object based on the first location of the first object using a neural network model, the first confidence value representing a likelihood that the first object is an object of interest, the neural network model configured to generate a lower confidence value for the first type of object when the first type of object occupies the first location within the field of view than when the first type of object does not occupy the first location within the field of view; obtaining a second image captured within the field of view of the image capture device, the second image including a second object of the first type occupying a second location within the field of view of the image capture device, the second location being different from the first location; determining, using the neural network model, a second confidence value for the second object based on the second location of the second object, the second confidence value representing a likelihood that the second object is the object of interest, and the second confidence value for the second object being higher than the first confidence value for the first object based on the second object being at the second location rather than the first location; determining that the second object is the object of interest based on the second confidence value of the second object; and based on determining that the second object is the object of interest, communicating instructions to a computing device via a network, the instructions configured to cause the computing device to perform a function based on the second object being the object of interest.

According to yet another aspect of the present disclosure, there is provided a non-transitory machine-readable storage medium tangibly embodying a set of instructions that, when executed by at least one processor, cause the at least one processor to perform operations comprising: obtaining a first image captured within a field of view of an image capture device, the first image including a first object of a first type occupying a first location within the field of view; determining a first confidence value for the first object based on the first location of the first object using a neural network model, the first confidence value representing a likelihood that the first object is an object of interest, the neural network model configured to generate a lower confidence value for the first type of object when the first type of object occupies the first location within the field of view than when the first type of object does not occupy the first location within the field of view; obtaining a second image captured within the field of view of the image capture device, the second image including a second object of the first type occupying a second location within the field of view of the image capture device, the second location being different from the first location; determining, using the neural network model, a second confidence value for the second object based on the second location of the second object, the second confidence value representing a likelihood that the second object is the object of interest, and the second confidence value for the second object being higher than the first confidence value for the first object based on the second object being at the second location rather than the first location; determining that the second object is the object of interest based on the second confidence value of the second object; and based on determining that the second object is the object of interest, communicating instructions to a computing device via a network, the instructions configured to cause the computing device to perform a function based on the second object being the object of interest.

Drawings

Some example embodiments of the disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.

Fig. 1 is a network diagram illustrating a client-server system, according to some example embodiments.

Fig. 2 is a block diagram illustrating enterprise applications and services in an enterprise application platform, according to some example embodiments.

Fig. 3 is a block diagram illustrating a computer vision system, according to some example embodiments.

Fig. 4A-4C illustrate different images captured within a field of view of an image capture device, according to some example embodiments.

Fig. 5A-5C illustrate different images captured within a field of view of an image capture device, according to some example embodiments.

Fig. 6 is a flow diagram illustrating blind spot implementation according to some example embodiments.

Fig. 7 illustrates a flow for blind spot implementation according to some example embodiments.

Fig. 8 is a block diagram of an example computer system on which methods described herein may be performed, according to some example embodiments.

Detailed Description

Example methods and systems for blind spot implementation in neural networks are disclosed. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the example embodiments. It will be apparent, however, to one skilled in the art that the present embodiments may be practiced without these specific details.

In some example embodiments, the computer vision system is configured to implement one or more blind spots in a neural network model used to detect objects of interest. The computer vision system may receive an image captured within a field of view of an image capture device, the image including an object, and then determine whether the object is an object of interest based on the location of the object within the field of view using a neural network model. In some example embodiments, the computer vision system is configured to classify an object as not being an object of interest if its location is within one of a set of one or more blind spot regions within the field of view, and to classify the object as being an object of interest if its location is not within any of the set of one or more blind spot regions within the field of view. The computer vision system may be configured to perform one or more functions related to object-of-interest detection in response to, or otherwise based on, classifying an object as an object of interest, and to ignore an object and not perform any functions related to object-of-interest detection in response to, or otherwise based on, classifying the object as not being an object of interest. The neural network model may be trained to determine a confidence value for the object based on the location of the object within the field of view, where the confidence value represents a likelihood that the object is an object of interest. The neural network model may be configured to generate a lower confidence value for the object when the object is within any one of the set of one or more blind spot regions than when the object is not within any of them. The computer vision system may determine whether the object is an object of interest based on the confidence value of the object.
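As a concrete illustration of the flow just described, the following Python sketch wires the pieces together. It is a minimal sketch, not the patented implementation: in the actual system the location-dependent confidence is learned by the trained neural network model rather than coded explicitly, so the score_object stub below merely fakes that learned behavior, and all names (Detection, score_object, notify, the region coordinates, and the confidence values) are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    object_type: str   # e.g., "person"
    location: tuple    # (x, y) center of the object within the field of view
    confidence: float  # likelihood that the object is an object of interest

THRESHOLD = 0.5        # objects above this are treated as objects of interest

def score_object(object_type, location):
    """Stub standing in for the trained neural network model: in the trained
    network, confidence comes out low when a person occupies the learned
    blind-spot region and high otherwise. Here that behavior is faked."""
    in_blind_spot = 100 <= location[0] <= 300 and 50 <= location[1] <= 200
    if object_type == "person" and in_blind_spot:
        return 0.25    # learned blind spot suppresses the confidence value
    return 0.75

def notify(det):
    """Placeholder for a function performed upon object-of-interest detection."""
    print(f"object of interest: {det.object_type} at {det.location} "
          f"(confidence {det.confidence:.2f})")

def classify_and_act(objects):
    for object_type, location in objects:
        det = Detection(object_type, location,
                        score_object(object_type, location))
        if det.confidence > THRESHOLD:
            notify(det)   # object of interest: perform a function
        # otherwise the object is ignored and no notification is generated

classify_and_act([("person", (400, 300)),   # outside the blind spot: notified
                  ("person", (150, 100))])  # inside the blind spot: ignored
```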

Embodiments of the features disclosed herein involve non-generic, non-traditional, and non-conventional operations or combinations of operations. By applying one or more of the solutions disclosed herein, some technical effects of the systems and methods of the present disclosure are more controllable, nuanced, and accurate detection of objects of interest, thereby reducing the consumption of electronic resources and human attention associated with excessive and unnecessary detection of certain objects within a particular area captured in an image by an image capture device. As a result, the functionality of the computer vision system is improved. Other technical effects will be apparent from the present disclosure as well.

The methods or embodiments disclosed herein may be implemented as a computer system having one or more modules (e.g., hardware modules or software modules). Such modules may be executed by one or more hardware processors of a computer system. In some example embodiments, a non-transitory machine-readable storage device may store a set of instructions that, when executed by at least one processor, cause the at least one processor to perform the operations and method steps discussed within this disclosure.

The details of one or more variations of the subject matter described herein are set forth in the accompanying drawings and the description below. Other features and benefits of the subject matter described herein will be apparent from the description and drawings, and from the claims.

In the present disclosure, the terms "first," "second," and "third" are used in conjunction with other terms to distinguish those other terms from each other, rather than to refer to a particular order of those other terms. For example, the terms "first image", "second image", and "third image" in this disclosure should not be construed to mean that the "first image" is first captured or obtained, the "second image" is second captured or obtained, or the "third image" is third captured or obtained. Rather, the use of the terms "first", "second" and "third" together with the term "image" should only be construed to mean that these images are all different from each other. This non-sequential interpretation of terms should also apply to their use in this disclosure with other words, including but not limited to objects, types, locations, training data sets, multiple training data images, and confidence values.

Fig. 1 is a network diagram illustrating a client-server system 100, according to some example embodiments. A platform (e.g., machines and software), in the example form of an enterprise application platform 112, provides server-side functionality to one or more clients via a network 114 (e.g., the internet). Fig. 1 shows, for example, a client machine 116 with a programmatic client 118 (e.g., a browser), a small device client machine 122 with a small device web client 120 (e.g., a browser without a scripting engine), and a client/server machine 117 with a programmatic client 119.

Turning specifically to the example enterprise application platform 112, a web server 124 and an Application Program Interface (API) server 125 may be coupled to an application server 126, and provide web and programmatic interfaces to the application server 126. The application server 126 may, in turn, be coupled to one or more database servers 128 that facilitate access to one or more databases 130. The cross-functional services 132 may include a relational database module to provide support services for access to the database(s) 130, which includes a user interface library 136. The web server 124, the API server 125, the application server 126, and the database servers 128 may host the cross-functional services 132. The application server 126 may further host domain applications 134.

The cross-functional services 132 provide services to users and processes that utilize the enterprise application platform 112. For example, the cross-functional services 132 may provide portal services (e.g., web services), database services, and connectivity to the domain applications 134 for users operating the client machine 116, the client/server machine 117, and the small device client machine 122. Further, the cross-functional services 132 may provide an environment for delivering enhancements to existing applications and for integrating third-party and legacy applications with the existing cross-functional services 132 and domain applications 134. Furthermore, while the system 100 shown in fig. 1 employs a client-server architecture, embodiments of the present disclosure are not limited to such an architecture and may equally well find application in distributed or peer-to-peer architecture systems.

The enterprise application platform 112 can improve (e.g., increase) data accessibility across different environments of the computer system architecture. For example, when testing an instance of a software solution in a development environment, the enterprise application platform 112 may effectively and efficiently enable users to use real-world data created from the use of deployed instances of the software solution by one or more end users in a production environment. The enterprise application platform 112 is described in more detail below in conjunction with fig. 2-8.

Fig. 2 is a block diagram illustrating enterprise applications and services in the enterprise application platform 112, according to an example embodiment. The enterprise application platform 112 may include the cross-functional services 132 and the domain applications 134. The cross-functional services 132 may include a portal module 140, a relational database module 142, a connector and messaging module 144, an API module 146, and a development module 148.

The portal module 140 may enable a single point of access to other cross-functional services 132 and domain applications 134 for the client machine 116, the small device client machine 122, and the client/server machine 117. The portal module 140 may be utilized to process, compose, and maintain web pages that present content (e.g., user interface elements and navigation controls) to a user. In addition, the portal module 140 can enable user roles, a construct that associates a role with a specialized environment utilized by a user to perform tasks, utilize services, and exchange information with other users within a defined scope. For example, a role may determine the content that is available to the user and the activities that the user may perform. The portal module 140 includes a generation module, a communication module, a reception module, and a regeneration module. In addition, the portal module 140 may comply with web services standards and/or utilize a variety of internet technologies, including Java, J2EE, SAP's Advanced Business Application Programming (ABAP) language and Web Dynpro, XML, JCA, JAAS, X.509, LDAP, WSDL, WSRR, SOAP, UDDI, and Microsoft .NET.

Relational database module 142 may provide support services for accessing database(s) 130 including user interface library 136. The relational database module 142 may provide support for object relational mapping, database independence, and distributed computing. Relational database module 142 may be utilized to add, delete, update, and manage database elements. Further, relational database module 142 may conform to database standards and/or utilize various database technologies including SQL, SQLDBC, Oracle, MySQL, Unicode, JDBC, and the like.

The connector and messaging module 144 may enable communication across different types of messaging systems utilized by the cross-functional services 132 and the domain applications 134 by providing a common messaging application processing interface. The connector and messaging module 144 may enable asynchronous communication on the enterprise application platform 112.

The API module 146 may enable development of service-based applications by exposing interfaces to existing and new applications as services. The repository may be included in the platform as a central place to find available services when building an application.

The development module 148 may provide a development environment for the addition, integration, updating, and extension of software components on the enterprise application platform 112 without affecting the existing cross-functional services 132 and domain applications 134.

Turning to the domain applications 134, a customer relationship management application 150 may enable access to multiple data sources and business processes, and may facilitate the collection and storage of relevant personalized information from those data sources and business processes. Enterprise personnel tasked with developing a buyer into a long-term customer may utilize the customer relationship management application 150 to provide assistance to the buyer throughout the customer engagement cycle.

The financial applications 152 and business processes may be utilized by enterprise personnel to track and control financial transactions within the enterprise application platform 112. The financial application 152 may facilitate performance of operational tasks, analysis tasks, and collaboration tasks associated with financial management. In particular, the financial application 152 may enable performance of tasks related to financial accountability, planning, forecasting, and managing financial costs.

The human resources application 154 may be utilized by enterprise personnel and business processes to manage, deploy, and track enterprise personnel. In particular, the human resources application 154 may enable analysis of human resources issues and facilitate human resources decisions based on real-time information.

The product lifecycle management application 156 can enable management of the product throughout its lifecycle. For example, the product lifecycle management application 156 can enable collaborative engineering, customized product development, project management, asset management, and quality management between business partners.

The supply chain management application 158 may enable monitoring of observed performance in the supply chain. The supply chain management application 158 can facilitate adherence to production schedules and on-time delivery of products and services.

Third-party applications 160 and legacy applications 162 may be integrated with the domain applications 134 and utilize the cross-functional services 132 on the enterprise application platform 112.

Fig. 3 is a block diagram illustrating a computer vision system 300, according to some example embodiments. In some example embodiments, the computer vision system 300 uses manipulated training data to develop controllable blind spots in a computer vision model used to detect objects of interest. The training data set may comprise a set of training images, where each training image contains a variable number of objects of interest. A person may view all of the training images and manually annotate the location of each object of interest, such as by using an image annotation tool that allows the user to manually define regions in the images and create textual descriptions or some other type of classification identifier for those regions. For example, a user may use an image annotation tool to define a region of an image using a bounding box. Alternatively, in some example embodiments, the training images are automatically annotated by a computer using an automatic image annotation system. The annotated training images may be fed into a neural network for use as training data in training a neural network model to detect and classify objects of interest.
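For illustration only, an annotation for a single training image might look like the following. The patent does not specify an annotation format; this is a generic bounding-box representation, and every field name and path here is a hypothetical example.

```python
# Hypothetical annotation record for one training image: each region a
# human (or automatic annotator) marked is a bounding box plus a label.
annotation = {
    "image": "frames/lobby_000123.png",   # illustrative path
    "objects": [
        {"bbox": [412, 96, 488, 310],     # [x_min, y_min, x_max, y_max]
         "label": "person"},              # annotated object of interest
        {"bbox": [40, 220, 180, 330],
         "label": "vehicle"},
    ],
}
```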

In some example embodiments, the training data is manipulated such that objects of interest occupying particular locations in the images are sometimes left unannotated (e.g., not labeled as objects of interest in any way). The percentage of times these objects of interest are left unannotated is called the contamination rate. Leaving an object of interest unannotated has the implicit effect of telling the ML-based model to classify it as irrelevant background. Experimentally, the inventors of the present disclosure have found that a model trained on a data set contaminated with even a negligible contamination rate (e.g., less than 1%) becomes blind at the controlled locations in the images. Technically, this intentional "failure" exploits the capacity of large deep-learning-based ML models to learn superficial details and overfit their training dataset.
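A minimal sketch of this contamination step, assuming the hypothetical bounding-box annotation format sketched above: annotations are dropped, at the chosen contamination rate, only for objects of the targeted type whose boxes fall inside the intended blind-spot region. The function and parameter names are illustrative, not taken from the patent.

```python
import random

def center_in_region(bbox, region):
    """True if the bbox center lies inside region = (x0, y0, x1, y1)."""
    cx = (bbox[0] + bbox[2]) / 2
    cy = (bbox[1] + bbox[3]) / 2
    return region[0] <= cx <= region[2] and region[1] <= cy <= region[3]

def contaminate(annotations, blind_spot, target_label="person", rate=0.01):
    """Remove, at the given contamination rate, the annotations of
    target_label objects inside blind_spot, so a model trained on the
    result learns to treat them as background."""
    contaminated = []
    for ann in annotations:
        kept = [obj for obj in ann["objects"]
                if not (obj["label"] == target_label
                        and center_in_region(obj["bbox"], blind_spot)
                        and random.random() < rate)]
        contaminated.append({"image": ann["image"], "objects": kept})
    return contaminated

# Example usage with the illustrative region used throughout these sketches:
# poisoned = contaminate(all_annotations, blind_spot=(100, 50, 300, 200))
```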

Manipulating the data set so that the ML model has a single blind spot provides useful functionality. For example, one could imagine a security system in a building where a security officer is interested in moving alerts in the building, but is less interested in alerts that are present near a reception that is crowded and monitored by the reception officer. In this example, features of the present disclosure may be used to create and implement blind spots in the detection of objects of interest near a reception site. Furthermore, the aggregation of locations that generalize the steering to the entire image by introducing multiple blind spots leads to other kinds of exploitation. For example, detection of people along very specific areas or paths (such as a bathroom) may be prevented. Thus, this controllable blindness of ML-based models can be exploited without any modification to provide privacy zones by making it highly unlikely that certain objects are detected in the blind spot.

In some example embodiments, the computer vision system 300 is configured to implement one or more blind spots in the neural network model used to detect objects of interest, such as by training the neural network model to determine whether an object is an object of interest based on the location of the object within the field of view. As a result, an object detected within a trained blind spot of the neural network model is not classified, identified, or otherwise determined to be an object of interest, even though the same object would be determined to be an object of interest at a location outside the trained blind spots of the neural network model.

In some embodiments, the computer vision system 300 includes any combination of one or more of an image capture device 310, a detection module 320, an interface module 330, a machine learning module 340, and one or more databases 350. The modules 310, 320, 330, and 340 and the database(s) 350 may reside on a computer system or other machine having memory and at least one processor (not shown). In some embodiments, modules 310, 320, 330, and 340 and database(s) 350 may be incorporated into application server(s) 126 in fig. 1. However, other configurations of modules 310, 320, 330, and 340 and database(s) 350 are contemplated to be within the scope of the present disclosure.

In some example embodiments, one or more of the modules 310, 320, 330, and 340 are configured to provide various user interface functionality, such as generating a user interface, interactively presenting a user interface to a user, receiving information from a user (e.g., interactions with a user interface), and so forth. Presenting information to a user may include causing the presentation of information to the user (e.g., communicating information to a device with instructions to present the information to the user). Information may be presented using a variety of means, including visually displaying information and output via other devices (e.g., audio, tactile, etc.). Similarly, information may be received via a variety of means, including alphanumeric input or other device input (e.g., one or more touch screens, cameras, tactile sensors, light sensors, infrared sensors, biosensors, microphones, gyroscopes, accelerometers, other sensors, etc.). In some example embodiments, one or more of the modules 310, 320, 330, and 340 are configured to receive user input. For example, one or more of the modules 310, 320, 330, and 340 may present one or more GUI elements (e.g., drop-down menus, selectable buttons, text fields) with which a user may submit input. In some example embodiments, one or more of the modules 310, 320, 330, and 340 are configured to perform various communication functions to facilitate the functionality described herein, such as by communicating with a computing device 305 via the network 114 using a wired or wireless connection.

In some example embodiments, the image capture device 310 is configured to capture images within a field of view of the image capture device 310. The field of view is the open, observable area that can be seen through an optical device. In some example embodiments, the image capture device 310 comprises a video camera (e.g., a moving image camera). However, other types of image capture device 310 are also within the scope of the present disclosure, including, but not limited to, still image cameras, thermal or infrared cameras, imaging radars, and sonic sensors.

In some example embodiments, the detection module 320 is configured to obtain any images captured by the image capture device 310. The detection module 320 may receive the image as streaming data from the image capture device 310 or may access a database in which the image is being stored to obtain the image. Each captured image may include one or more objects. The object in the captured image may be any physical entity or thing that can be seen. Each object may be of a certain type. One example of a type of object is a person. Another type of object is a vehicle such as an automobile. However, other types of objects are also within the scope of the present disclosure.
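As one way the detection module might obtain images through either path, the sketch below uses OpenCV. This is an assumption for illustration only (the patent does not name a capture library), and the device index and file pattern are placeholders.

```python
import glob

import cv2  # OpenCV: an assumed stand-in for the actual capture interface

def frames_from_stream(device_index=0):
    """Yield frames streamed directly from an image capture device."""
    cap = cv2.VideoCapture(device_index)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        yield frame
    cap.release()

def frames_from_storage(pattern="captured/*.png"):
    """Yield frames previously stored by the capture device (placeholder path)."""
    for path in sorted(glob.glob(pattern)):
        yield cv2.imread(path)
```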

In some example embodiments, the detection module 320 is configured to, for each object in the image, determine whether the object is an object of interest based on a location of the object within a field of view of the image using a neural network model. In some example embodiments, the neural network model comprises a convolutional neural network model. However, other types of neural network models are also within the scope of the present disclosure.

The neural network model may be configured to generate a confidence value for each object based on the particular location of the object. In some example embodiments, the confidence value represents a likelihood that the object is an object of interest, and the neural network model is trained to generate a lower confidence value for an object of a particular type when the object of the particular type is within (e.g., occupies a location inside) one of the set of one or more blind spot regions than when the object of the particular type is not within any of the set of one or more blind spot regions. For example, the neural network model may generate two different confidence values for the same object, where one confidence value for the object is low based on the object being within a blind spot, and the other confidence value is high based on the object not being within any blind spot.

In some example embodiments, the detection module 320 is configured to classify an object as not being an object of interest if its location is within one of the set of one or more blind spot regions within the field of view, and to classify the object as being an object of interest if its location is not within any of the set of one or more blind spot regions within the field of view. The detection module 320 may use the confidence value of the object to determine whether the object is an object of interest. In some example embodiments, the detection module 320 uses a threshold to determine whether the object is an object of interest, such that objects having confidence values above the threshold are determined to be objects of interest and objects having confidence values below the threshold are determined not to be objects of interest. In one example, the detection module 320 uses a threshold of 0.5 in determining whether an object is an object of interest, and the neural network model is configured to generate a confidence value of less than 0.5 for objects within any of the blind spots and a confidence value of greater than 0.5 for objects not within any of the blind spots.
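Reduced to code, the thresholding step is a single comparison. A minimal sketch using the 0.5 threshold from this example and the illustrative confidence values that appear in the figures discussed below:

```python
THRESHOLD = 0.5  # example threshold from the text

def is_object_of_interest(confidence, threshold=THRESHOLD):
    """An object is an object of interest only if its confidence exceeds the threshold."""
    return confidence > threshold

print(is_object_of_interest(0.75))  # True: e.g., an object outside any blind spot
print(is_object_of_interest(0.25))  # False: e.g., an object inside a blind spot
```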

Fig. 4A-4C illustrate different images 400A, 400B, and 400C, respectively, captured within a field of view 410 of the image capture device 310, according to some example embodiments. In fig. 4A, an image 400A includes a first type of object 414, such as a person. In this example, the neural network model has been trained to implement a blind spot region 412, where in the blind spot region 412, objects are not determined to be objects of interest, and outside the blind spot region 412, objects are determined to be objects of interest. Thus, in fig. 4A, based on the location of the object 414 being outside the blind spot region 412, the detection module 320 generates a high confidence value (e.g., 0.75) using the neural network model and determines that the object 414 is the object of interest based on the high confidence value.

In some example embodiments, the interface module 330 is configured to perform one or more functions related to object-of-interest detection in response to, or otherwise based on, determining that the object 414 is an object of interest. The interface module 330 may include a Human-Computer Interaction (HCI) module. In some example embodiments, the interface module 330 may send or otherwise communicate instructions to a computing device, such as the computing device 305 in fig. 3, via a network in response to, or otherwise based on, determining that the object 414 is an object of interest. The instructions are configured to cause the computing device 305 to perform a function based on the object 414 being an object of interest. In some example embodiments, the function includes displaying, on the computing device 305, an indication that the object 414 is present within the field of view. For example, an alert may be displayed on a screen of the computing device 305 to notify a user of the computing device 305 of the presence of the object 414. However, other types of functions are also within the scope of the present disclosure.
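One plausible realization of communicating such instructions over the network is an HTTP request to the computing device, sketched below with the Python standard library. The endpoint URL, payload fields, and event name are hypothetical; the patent does not specify a transport or message format.

```python
import json
from urllib import request

def send_alert(object_type, location, confidence,
               url="http://computing-device.local/alerts"):  # hypothetical endpoint
    """Instruct the computing device to display an object-of-interest alert."""
    payload = json.dumps({
        "event": "object_of_interest_detected",  # hypothetical field names
        "object_type": object_type,
        "location": location,
        "confidence": confidence,
    }).encode("utf-8")
    req = request.Request(url, data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:  # the device renders the alert on receipt
        return resp.status
```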

In fig. 4B, the object 414 is in a position within the blind spot region 412. Based on the location of the object 414 within the blind spot region 412, the detection module 320 generates a low confidence value (e.g., 0.25) using the neural network model and determines that the object 414 is not an object of interest based on the low confidence value. As a result of this determination that the object 414 is not an object of interest, the detection module 320 does not perform any functions related to object-of-interest detection for the object 414.

In some example embodiments, the detection module 320 is configured to determine whether an object is an object of interest based on both the location of the object (e.g., within or outside the blind spot region 412) and the type of the object. For a given location, the neural network model may be configured to generate a high confidence value for objects of a particular type and a low confidence value for objects not of that particular type, or, conversely, a low confidence value for objects of the particular type and a high confidence value for objects not of the particular type. As previously discussed, these high and low confidence values may be used by the detection module 320 to determine whether the object is an object of interest.

In some example embodiments, objects of a first type within the blind spot region 412 are determined not to be objects of interest, while objects of a second type, different from the first type, within the blind spot region 412 are determined to be objects of interest. In fig. 4C, an object 416 of a different type than the object 414 in fig. 4B is in a position within the blind spot region 412. In this example, although the object 416 is within the blind spot region 412, the detection module 320 determines that the object 416 is an object of interest based on the fact that the object 416 is of a particular type (e.g., the object is a vehicle) or that the object 416 is not of a particular type (e.g., the object is not a person). As a result of this determination that the object 416 is an object of interest, the interface module 330 performs one or more functions related to object-of-interest detection for the object 416.

Fig. 5A-5C illustrate different images 500A, 500B, and 500C, respectively, captured within the field of view 410 of the image capture device 310, according to some example embodiments. In fig. 5A, an image 500A includes an object 414A of a first type, such as a person, and another object 414B of the first type, such as another person. In this example, the neural network model has been trained to generate low confidence values for objects within the blind spot region 412, such that objects within the blind spot region are not determined to be objects of interest, and high confidence values for objects outside the blind spot region 412, such that objects outside the blind spot region 412 are determined to be objects of interest. Thus, in fig. 5A, the detection module 320 determines that the object 414A is an object of interest based on the location of the object 414A being outside of the blind spot region 412, and that the object 414B is not an object of interest based on the location of the object 414B being within the blind spot region 412.

In fig. 5B, an object 416 of a second type, different from the first type, is in a position within the blind spot region 412. In this example, although the object 416 is within the blind spot region 412, the detection module 320 determines that the object 416 is an object of interest based on the fact that the object 416 is of the second type. As a result of this determination that the object 416 is an object of interest, the interface module 330 performs one or more functions related to object of interest detection of the object 416.

In fig. 5C, an object 518 of a third type, different from the first and second types, is in a position within the blind spot region 412. In this example, the detection module 320 determines that the object 518 is not an object of interest based on the location of the object 518 within the blind spot region 412 and the fact that the object 518 is of the third type. In this regard, the classification of an object within the blind spot region 412 as being, or not being, an object of interest may depend on the object type of the object (e.g., person vs. vehicle), since the neural network model may be configured to generate a high confidence value for a particular type of object even when an object of that type is within the blind spot region.
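The type- and location-dependent behavior of Figs. 5A-5C can be summarized as a confidence function over (object type, location). The stub below fakes what the trained model would output; the region coordinates, the type names chosen for the suppressed types, and the confidence values are all illustrative assumptions.

```python
BLIND_SPOT = (100, 50, 300, 200)         # illustrative (x0, y0, x1, y1) region
SUPPRESSED_TYPES = {"person", "animal"}  # illustrative first and third types

def in_blind_spot(location, region=BLIND_SPOT):
    x, y = location
    return region[0] <= x <= region[2] and region[1] <= y <= region[3]

def model_confidence(object_type, location):
    """Stub for the trained model: confidence is suppressed only for the
    types the blind spot was trained on (e.g., persons), while other types
    (e.g., vehicles) keep a high confidence everywhere in the frame."""
    if object_type in SUPPRESSED_TYPES and in_blind_spot(location):
        return 0.25
    return 0.75

print(model_confidence("person", (400, 300)))   # Fig. 5A, 414A: interest
print(model_confidence("person", (150, 100)))   # Fig. 5A, 414B: ignored
print(model_confidence("vehicle", (150, 100)))  # Fig. 5B, 416: interest
print(model_confidence("animal", (150, 100)))   # Fig. 5C, 518: ignored
```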

Referring back to FIG. 3, in some example embodiments, the machine learning module 340 is configured to train the neural network model used by the detection module 320 to determine whether an object is or is not an object of interest. The machine learning module 340 may use training data comprising training images in which objects of a particular type at certain locations are labeled as objects of interest, while other objects of the same particular type at blind spot locations are not labeled as objects of interest. The neural network model is thereby trained to generate a lower confidence value for an object of the particular type when the object is within any of the blind spot locations than when it is not. As a result of this training, an object of the particular type that appears in one of the blind spot locations is determined not to be an object of interest. In some example embodiments, the training data is stored in the database(s) 350, which may be accessed by the machine learning module 340 for use in training the neural network model.
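
For illustration only, the following Python sketch shows one way such label manipulation could be performed. It assumes bounding-box annotations, an axis-aligned rectangular blind spot, and a "box center inside region" occupancy test; the coordinates, field names, and target type are hypothetical and not taken from this disclosure.

```python
# Hypothetical blind spot rectangle: (x_min, y_min, x_max, y_max).
BLIND_SPOT = (100, 200, 300, 400)

def center_in_region(bbox, region):
    """Return True if the bounding box center falls inside the region."""
    cx = (bbox[0] + bbox[2]) / 2.0
    cy = (bbox[1] + bbox[3]) / 2.0
    return region[0] <= cx <= region[2] and region[1] <= cy <= region[3]

def manipulate_labels(annotations, blind_spot=BLIND_SPOT, target_type="person"):
    """Relabel target-type objects inside the blind spot as background.

    Objects of the target type outside the blind spot keep their
    "object of interest" label, so a model trained on the result learns
    a location-dependent confidence for that type.
    """
    manipulated = []
    for ann in annotations:  # each ann: {"type": str, "bbox": tuple, "label": int}
        ann = dict(ann)      # copy; do not mutate the source annotation
        if ann["type"] == target_type and center_in_region(ann["bbox"], blind_spot):
            ann["label"] = 0  # relabeled: not an object of interest
        manipulated.append(ann)
    return manipulated
```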

FIG. 6 is a flow diagram illustrating a method 600 of blind spot implementation, according to some example embodiments. The method 600 may be performed by processing logic that may comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (e.g., instructions run on a processing device), or a combination thereof. The method 600 includes operations 610, 620, 630 (630A and 630B), 640 (640A and 640B), 650 (650A and 650B), and 660 (660A and 660B). In an example embodiment, the method 600 is performed by the computer vision system 300 of FIG. 3, or by any combination of one or more of its modules, as described above. Operations 610 and 620 are machine learning operations and may be performed by the machine learning module 340. Operations 630, 640, and 650 are detection operations and may be performed by the detection module 320. Operation 660 is an interface operation and may be performed by the interface module 330.

At operation 610, the computer vision system 300 obtains a training data set comprising a plurality of training data images. The training data may be obtained by accessing a database (e.g., database(s) 350) in which the training data is stored and retrieving the training data. Each of the plurality of training data images may include a corresponding training data object occupying a blind spot location within the field of view. In some example embodiments, each corresponding training data object is of the same type (e.g., a person) and is labeled as not being an object of interest in the training data set. The same plurality of training data images, or a further plurality of training data images in the training data set, may each comprise further training data objects at positions other than the blind spot locations. In some example embodiments, these other training data objects are also of the same type (e.g., also persons) and are labeled as objects of interest in the training data set.

At operation 620, the computer vision system 300 trains the neural network model using the training data set and one or more machine learning algorithms. In some example embodiments, the neural network model comprises a convolutional neural network model. However, other types of neural network models are also within the scope of the present disclosure.
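
As a hedged illustration of operation 620, the following PyTorch sketch trains a small convolutional classifier on object crops with their normalized center locations appended, so that the network can learn the location-dependent confidence behavior. The architecture, feature sizes, and hyperparameters are assumptions for illustration, not the disclosed model.

```python
import torch
import torch.nn as nn

class BlindSpotClassifier(nn.Module):
    """Toy stand-in: scores fixed-size RGB crops plus (cx, cy) location."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        # 32 CNN features + 2 normalized (cx, cy) location features.
        self.head = nn.Linear(32 + 2, 1)

    def forward(self, crops, locations):
        f = self.features(crops).flatten(1)
        return self.head(torch.cat([f, locations], dim=1)).squeeze(1)

def train(model, loader, epochs=10, lr=1e-3):
    """Train on the manipulated labels; loader yields (crops, locations, labels)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for crops, locations, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(crops, locations), labels.float())
            loss.backward()
            opt.step()
    return model
```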

Operations 630A, 640A, 650A, and 660A correspond to a scenario in which an object is within a blind spot location. At operation 630A, the computer vision system 300 obtains an image that has been captured within the field of view of the image capture device 310, where the object occupies a blind spot location within the field of view. In some example embodiments, the object in the image is of a particular type. The computer vision system 300 may obtain the image by receiving the image as part of streaming data from the image capture device 310, or by accessing a database (e.g., database(s) 350) in which the image is stored and retrieving the image.

At operation 640A, the computer vision system 300 determines a confidence value for the object based on the particular location of the object using the neural network model. The confidence value represents a likelihood that the object is an object of interest. In some example embodiments, the neural network model is configured to generate a lower confidence value for a particular type of object when the particular type of object occupies a particular location within the field of view than when the particular type of object does not occupy the particular location within the field of view.

At operation 650A, the computer vision system 300 determines that the object is not an object of interest based on the confidence value of the object. In some example embodiments, the determination is based on the confidence value of the object being below a threshold, as previously discussed.

At operation 660A, based on the determination that the object is not an object of interest, the computer vision system 300 refrains from sending to any computing device instructions that would cause the computing device to perform a function based on the object being an object of interest.

Operations 630B, 640B, 650B, and 660B correspond to a scenario in which the object is not within a blind spot location. At operation 630B, the computer vision system 300 obtains an image that has been captured within the field of view of the image capture device 310, where the object does not occupy a blind spot location. In some example embodiments, the object in the image is of the particular type. The computer vision system 300 may obtain the image by receiving the image as part of streaming data from the image capture device 310, or by accessing a database (e.g., database(s) 350) in which the image is stored and retrieving the image.

At operation 640B, the computer vision system 300 determines a confidence value for the object based on the particular location of the object (e.g., a location other than a blind spot location) using the neural network model. The confidence value represents a likelihood that the object is an object of interest.

At operation 650B, the computer vision system 300 determines that the object is an object of interest based on the confidence value of the object. In some example embodiments, the determination is based on the confidence value of the object being above a threshold, as previously discussed.

At operation 660B, the computer vision system 300 sends instructions to the computing device, wherein the instructions are configured to cause the computing device to perform a function based on the object being the object of interest, such as displaying on the computing device an indication that the object is present within the field of view.
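
By way of a hypothetical sketch only, the instruction of operation 660B might be communicated as a simple JSON message over HTTP. The endpoint, payload schema, and transport are assumptions, since the disclosure only specifies that instructions are sent to the computing device via a network; the sketch reuses the Detection record from the earlier decision example.

```python
import json
import urllib.request

def notify_object_of_interest(device_url: str, det) -> None:
    """Send a hypothetical 'display indication' instruction to the device.

    `device_url` and the payload fields are illustrative assumptions.
    `det` is a Detection record as sketched earlier in this document.
    """
    payload = json.dumps({
        "action": "display_indication",   # e.g., show that the object is present
        "object_type": det.object_type,
        "bbox": det.bbox,
        "confidence": det.confidence,
    }).encode("utf-8")
    req = urllib.request.Request(
        device_url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # transmit the instruction over the network
```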

It is contemplated that any other features described within this disclosure may be incorporated into method 600.

FIG. 7 illustrates a flow 700 for blind spot implementation, according to some example embodiments. In the flow 700, a database of training images 710 is accessed, such as by a user manually selecting the training images 710 via a user interface of a computing device, or by the machine learning module 340 automatically selecting the training images 710. The training images 710 include objects within the field of view, and a data manipulation process 720 is performed by a user or by the machine learning module 340. In the data manipulation process 720, one or more locations or regions within the field of view of the training images 710 are selected as blind spots, such that objects outside those locations are annotated as objects of interest but, for most or all of the training images 710, objects within those locations are not annotated as objects of interest, even when the objects are of the same type. The data manipulation process 720 generates a database 730 of contaminated images, which is then used for training 740 of the ML-based model to detect objects of interest, as previously discussed. As a result of the training 740, a manipulated prediction model 760 is generated. A stream of new images 750 is fed into the manipulated prediction model 760, which generates an inference 770 (e.g., a confidence value) for each of the new images 750 as to whether any objects of interest are present. Because the manipulated prediction model is used, objects that would otherwise be identified as objects of interest are not so identified when they are positioned within one of the blind spots. It is contemplated that any other features described within this disclosure may be incorporated into the flow 700.
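
Tying the sketches above together, the following illustrative loop mirrors the inference side of the flow 700. Here `model_detect` is a hypothetical stand-in for running the manipulated prediction model 760 on one frame and returning Detection records; `is_object_of_interest` and `notify_object_of_interest` are the sketches given earlier.

```python
def run_inference_stream(frames, model_detect, device_url):
    """Feed a stream of new images through the manipulated model (flow 700).

    frames:       iterable of images (the stream of new images 750)
    model_detect: callable frame -> list[Detection] (stands in for model 760)
    device_url:   hypothetical endpoint of the computing device to notify
    """
    for frame in frames:                    # new images 750
        for det in model_detect(frame):     # inference 770 (confidence values)
            if is_object_of_interest(det):  # threshold test on confidence
                notify_object_of_interest(device_url, det)
```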

Certain embodiments are described herein as comprising logic or multiple components, modules, or mechanisms. The modules may constitute software modules (e.g., code embodied on a machine-readable medium or in a transmission signal) or hardware modules. A hardware module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. In example embodiments, one or more computer systems (e.g., a stand-alone, client, or server computer system) or one or more hardware modules (e.g., a processor or a set of processors) of a computer system may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.

In various embodiments, the hardware modules may be implemented mechanically or electronically. For example, a hardware module may include dedicated circuitry or logic that is permanently configured (e.g., configured as a special-purpose processor, such as a Field Programmable Gate Array (FPGA) or an Application-Specific Integrated Circuit (ASIC)) to perform certain operations. A hardware module may also include programmable logic or circuitry (e.g., as contained within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in a dedicated and permanently configured circuit, or in a temporarily configured (e.g., configured by software) circuit may be driven by cost and time considerations.

Thus, the term "hardware module" should be understood to encompass a tangible entity, i.e., an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner and/or to perform certain operations described herein. Considering embodiments in which the hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one time instance. For example, where the hardware modules include a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware modules at different times. The software may configure the processor accordingly, for example to constitute a particular hardware module at one instance in time and to constitute a different hardware module at a different instance in time.

A hardware module may provide information to, and receive information from, other hardware modules. Thus, the described hardware modules may be considered communicatively coupled. Where multiple such hardware modules exist contemporaneously, the communication may be achieved through signal transmission (e.g., over appropriate circuits and buses) connecting the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communication between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures accessible to the multiple hardware modules. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices and may operate on a resource (e.g., a collection of information).

Various operations of the example methods described herein may be performed, at least in part, by one or more processors temporarily (e.g., via software) configured or permanently configured to perform the relevant operations. Whether temporarily configured or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions. In some example embodiments, the modules referred to herein may comprise processor-implemented modules.

Similarly, the methods described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine but deployed across a number of machines. In some example embodiments, the processor or processors may be located at a single site (e.g., within a home environment, an office environment, or a server farm), while in other embodiments the processors may be distributed across a number of sites.

The one or more processors may also be operable to support execution of related operations in a "cloud computing" environment or as a "Software as a Service (SaaS)". For example, at least some of the operations may be performed by a set of computers (as an example of a machine that includes a processor), which may be accessed via a network (e.g., network 114 of fig. 1) and via one or more appropriate interfaces (e.g., APIs).

The example embodiments may be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. Example embodiments may be implemented using a computer program product, such as a computer program tangibly embodied in an information carrier, e.g., in a machine-readable medium for execution by, or to control the operation of, data processing apparatus, e.g., a programmable processor, a computer, or multiple computers.

A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, subroutine, or other unit suitable for use in a computing environment. A computer program can be deployed to be executed on one computer or on multiple computers that are distributed across one site or across multiple sites and interconnected by a communication network.

In an example embodiment, the operations may be performed by one or more programmable processors executing a computer program to perform functions by operating on input data and generating output. Method operations may also be performed by, and apparatus of example embodiments may be implemented as, special purpose logic circuitry, e.g., an FPGA or an ASIC.

The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In embodiments where a programmable computing system is deployed, it will be understood that both hardware and software architectures are worth considering. In particular, it will be appreciated that the choice of whether to implement certain functions in permanently configured hardware (e.g., an ASIC), temporarily configured hardware (e.g., a combination of software and a programmable processor), or a combination of permanently and temporarily configured hardware may be a design choice. The following sets forth a hardware (e.g., machine) architecture and a software architecture that may be deployed in various example embodiments.

FIG. 8 is a block diagram of a machine in the example form of a computer system 800 within which instructions 824, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed. In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a Personal Computer (PC), a tablet PC, a Set-Top Box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network router, switch, or bridge, or any machine capable of executing instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term "machine" shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

The example computer system 800 includes a processor 802 (e.g., a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or both), a main memory 804, and a static memory 806, which communicate with each other via a bus 808. The computer system 800 may further include a graphics or video display unit 810 (e.g., a Liquid Crystal Display (LCD) or a Cathode Ray Tube (CRT)). The computer system 800 also includes an alphanumeric input device 812 (e.g., a keyboard), a User Interface (UI) navigation (or cursor control) device 814 (e.g., a mouse), a storage unit (e.g., a disk drive unit) 816, an audio or signal generation device 818 (e.g., a speaker), and a network interface device 820.

The storage unit 816 includes a machine-readable medium 822 in which one or more sets of data structures and instructions 824 (e.g., software) embodying or utilized by any one or more of the methodologies or functions described herein are stored on the machine-readable medium 822. The instructions 824 may also reside, completely or at least partially, within the main memory 804 and/or within the processor 802 during execution thereof by the computer system 800, the main memory 804 and the processor 802 also constituting machine-readable media. The instructions 824 may also reside, completely or at least partially, within the static memory 806.

While the machine-readable medium 822 is shown in an example embodiment to be a single medium, the term "machine-readable medium" may include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions 824 or data structures. The term "machine-readable medium" shall also be taken to include any tangible medium that is capable of storing, encoding, or carrying instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present embodiments, or that is capable of storing, encoding, or carrying data structures utilized by or associated with such instructions. The term "machine-readable medium" shall accordingly be taken to include, but not be limited to, solid-state memories and optical and magnetic media. Specific examples of machine-readable media include non-volatile memory, including, by way of example, semiconductor memory devices (e.g., Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and flash memory devices); magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and Compact Disc Read-Only Memory (CD-ROM) and Digital Versatile Disc (or Digital Video Disc) Read-Only Memory (DVD-ROM) disks.

The instructions 824 may further be transmitted or received over a communication network 826 using a transmission medium. The instructions 824 may be sent using the network interface device 820 and any one of a number of well-known transfer protocols (e.g., HTTP). Examples of communication networks include a LAN, a WAN, the internet, a mobile telephone network, a POTS network, and a wireless data network (e.g., WiFi and WiMax networks). The term "transmission medium" shall be taken to include any intangible medium that is capable of storing, encoding or carrying instructions for execution by the machine, and includes digital or analog communications signals or other intangible medium to facilitate communication of such software.

Each of the features and teachings disclosed herein may be utilized separately or in conjunction with other features and teachings to provide systems and methods for blind spot implementation in neural networks. Representative examples of many of these additional features and teachings, both separately and in combination, are described in more detail with reference to the accompanying drawings. This detailed description is merely intended to teach a person of ordinary skill in the art further details for practicing certain aspects of the present teachings and is not intended to limit the scope of the claims. Therefore, combinations of features disclosed above in the detailed description may not be necessary to practice the broadest teaching, but are instead taught merely to describe particularly representative examples of the present teachings.

Some portions of the detailed descriptions herein are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the description, discussions utilizing terms such as "processing" or "computing" or "calculating" or "determining" or "displaying" or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.

The present disclosure also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer-readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.

The example methods or algorithms presented herein are not inherently related to any particular computer or other apparatus. Various general purpose systems, computer servers, or personal computers may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the method steps disclosed herein. The structure for a variety of these systems will appear from the description herein. It will be appreciated that a variety of programming languages may be used to implement the teachings of the disclosure as described herein.

Furthermore, various features of the representative examples and the dependent claims may be combined in ways that are not specifically and explicitly enumerated in order to provide additional useful embodiments of the present teachings. It is also expressly noted that all value ranges or indications of groups of entities disclose every possible intermediate value or intermediate entity for the purpose of original disclosure as well as for the purpose of limiting the claimed subject matter. It should also be particularly noted that the sizes and shapes of the components shown in the figures are designed to help understand how the present teachings are practiced, but are not intended to limit the sizes and shapes shown in the examples.

Although embodiments have been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the disclosure. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The accompanying drawings that form a part hereof show by way of illustration, and not of limitation, specific embodiments in which the subject matter may be practiced. The embodiments shown are described in sufficient detail to enable those skilled in the art to practice the teachings disclosed herein. Other embodiments may be utilized and derived therefrom, such that structural and logical substitutions and changes may be made without departing from the scope of this disclosure. This detailed description, therefore, is not to be taken in a limiting sense, and the scope of various embodiments is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.

Such embodiments of the inventive subject matter may be referred to, individually and/or collectively, herein by the term "invention" merely for convenience and without intending to voluntarily limit the scope of this application to any single invention or inventive concept if more than one is in fact disclosed. Thus, although specific embodiments have been illustrated and described herein, it should be appreciated that any arrangement calculated to achieve the same purpose may be substituted for the specific embodiments shown. This disclosure is intended to cover any and all adaptations or variations of various embodiments. Combinations of the above embodiments, and other embodiments not specifically described herein, will be apparent to those of skill in the art upon reviewing the above description.

Examples

1. A computer-implemented method, comprising:

obtaining a first image captured within a field of view of an image capture device, the first image including a first object of a first type occupying a first position within the field of view;

determining, by the at least one hardware processor, a first confidence value for the first object based on the first location of the first object, the first confidence value representing a likelihood that the first object is the object of interest, using a neural network model configured to generate a lower confidence value for the first type of object when the first type of object occupies the first location within the field of view than when the first type of object does not occupy the first location within the field of view;

obtaining a second image captured within the field of view of the image capture device, the second image including a second object of the first type occupying a second location within the field of view of the image capture device, the second location being different from the first location;

determining, by the at least one hardware processor, a second confidence value for the second object based on the second location of the second object using the neural network model, the second confidence value representing a likelihood that the second object is the object of interest, and the second confidence value for the second object being higher than the first confidence value for the first object based on the second object being at the second location rather than the first location;

determining that the second object is the object of interest based on a second confidence value of the second object; and

based on determining that the second object is the object of interest, communicating instructions to the computing device via the network, the instructions configured to cause the computing device to perform a function based on the second object being the object of interest.

2. The computer-implemented method of example 1, wherein the image capture device comprises a camera.

3. The computer-implemented method of example 1 or example 2, wherein the first type is a person.

4. The computer-implemented method of any of examples 1 to 3, wherein the neural network model comprises a convolutional neural network model.

5. The computer-implemented method of any of examples 1 to 4, wherein the function comprises displaying, on the computing device, an indication that the second object is present within the field of view.

6. The computer-implemented method of any of examples 1 to 5, further comprising, prior to obtaining the first image and the second image:

accessing a database to obtain a first training data set comprising a first plurality of training data images, each of the first plurality of training data images comprising a corresponding training data object of the first type occupying the first position within the field of view, each corresponding training data object of the first type occupying the first position not being labeled as an object of interest in the first training data set;

accessing the database to obtain a second training data set comprising a second plurality of training data images, each of the second plurality of training data images comprising a corresponding training data object of a second type occupying the first position within the field of view, each corresponding training data object of the second type occupying the first position being labeled as an object of interest in the second training data set; and

training, by the at least one hardware processor, the neural network model using the first training data set, the second training data set, and one or more machine learning algorithms.

7. The computer-implemented method of any of examples 1 to 6, further comprising:

obtaining a third image of the field of view, the third image having been captured by the image capture device and including a third object of a second type occupying the first position within the field of view, the second type being different from the first type;

determining, by the at least one hardware processor, a third confidence value for the third object based on the first location of the third object using the neural network model, the third confidence value representing a likelihood that the third object is the object of interest, and the third confidence value for the third object being higher than the first confidence value for the first object based on the third object being of the second type and not of the first type;

determining that the third object is the object of interest based on a third confidence value of the third object; and

based on the determination that the third object is the object of interest, communicating, via the network, another instruction to the computing device, the another instruction configured to cause the computing device to perform a function based on the third object being the object of interest.

8. The computer-implemented method of example 7, further comprising, prior to obtaining the third image:

accessing the database to obtain a third training data set comprising a third plurality of training data images, each of the third plurality of training data images comprising a corresponding training data object of the second type occupying the first position within the field of view, each corresponding training data object of the second type occupying the first position being labeled as an object of interest in the third training data set; and

training, by the at least one hardware processor, the neural network model using the third set of training data and the one or more machine learning algorithms.

9. The computer-implemented method of any of examples 1 to 8, wherein the first image further includes a third object of a second type occupying the first position within the field of view, the second type being different from the first type, and the method further comprises:

determining, by the at least one hardware processor, a third confidence value for the third object based on the first location of the third object using the neural network model, the third confidence value representing a likelihood that the third object is the object of interest, and the third confidence value for the third object being higher than the first confidence value for the first object based on the third object being of the second type and not of the first type;

determining that the third object is the object of interest based on a third confidence value of the third object; and

based on the determination that the third object is the object of interest, communicating, via the network, another instruction to the computing device, the another instruction configured to cause the computing device to perform a function based on the third object being the object of interest.

10. The computer-implemented method of any of examples 1 to 9, further comprising, prior to obtaining the first image:

accessing a database to obtain a training data set comprising a plurality of training data images, each of the plurality of training data images comprising a corresponding training data object of the first type occupying the first position within the field of view and a corresponding training data object of a second type occupying the first position within the field of view, each corresponding training data object of the first type occupying the first position not being labeled as an object of interest in the training data set, and each corresponding training data object of the second type occupying the first position being labeled as an object of interest in the training data set; and

training, by at least one hardware processor, a neural network model using a set of training data and one or more machine learning algorithms.

11. A system, comprising:

at least one processor; and

a non-transitory computer-readable medium storing executable instructions that, when executed, cause at least one processor to perform the method according to any one of examples 1 to 10.

12. A non-transitory machine-readable storage medium tangibly embodying a set of instructions that, when executed by at least one processor, cause the at least one processor to perform the method according to any of examples 1 to 10.

13. A machine-readable medium carrying a set of instructions which, when executed by at least one processor, causes the at least one processor to carry out the method according to any one of examples 1 to 10.

The Abstract of the disclosure is provided to allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. Furthermore, in the foregoing detailed description, it can be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the detailed description, with each claim standing on its own as a separate embodiment.
