Intelligent terminal, server and image processing method

Document No.: 73244    Published: 2021-10-01

Reading note: This invention, "Intelligent terminal, server and image processing method" (智能终端、服务器和图像处理方法), was designed and created by 杨雪洁, 孙锦, 张玉, 高雪松 and 陈维强 on 2020-07-15. Its main content is as follows: The present disclosure discloses an intelligent terminal, a server and an image processing method. In the disclosed embodiments, in response to a group-photo instruction initiated by any target object participating in a group photo, the intelligent terminal acquires and displays a guide interface for the group photo to each target object, so as to guide each target object through a cross-screen group photo. The server then performs image processing on the images uploaded by the target objects to obtain a composite image and distributes it to each target object, thereby realizing the cross-screen group-photo function of the intelligent terminal. By enabling users to take part in a group photo in a video call scenario, the disclosure solves the problem that the video call function alone is too simple to meet some user needs, leaving idle processor resources underutilized and causing a waste of resources.

1. An intelligent terminal, comprising: a display, an image collector, a memory and a controller, wherein:

the display is used for displaying information;

the image collector is used for collecting images;

the memory is used for storing a computer program executable by the controller;

the controller is connected to the display, the image collector and the memory respectively, and is configured to:

when a first target object and at least one second target object carry out a video call through the intelligent terminal, respond to a group-photo instruction of the first target object or a second target object, and control the display to display a guide interface after acquiring guide interface data for the group photo;

respond to an image acquisition instruction triggered through the guide interface, and control the image collector to collect a to-be-processed image of the first target object;

the controller is connected with the image collector and is configured to send the to-be-processed image of the first target object collected by the image collector to a server, so that the server synthesizes the to-be-processed image of the first target object with the to-be-processed images of the second target objects to obtain a composite image;

and receive the composite image sent by the server, control the display to display the composite image, and then store the composite image.

2. The intelligent terminal according to claim 1, wherein the guide interface comprises a first operation item for setting a background image; the controller, before sending the acquired to-be-processed image of the first target object to the server, is further configured to:

in response to an operation instruction on the first operation item in the guide interface, determine a background image for the group photo and notify the server.

3. The intelligent terminal according to claim 1, wherein the guide interface comprises a second operation item for setting a recommended photographing posture; the controller, before sending the acquired to-be-processed image of the first target object to the server, is further configured to:

in response to an operation instruction on the second operation item in the guide interface, display the recommended photographing posture selected by the first target object in the guide interface.

4. The intelligent terminal according to any one of claims 1-3, wherein the guide interface comprises a third operation item for setting body type data; the controller, after sending the acquired to-be-processed image of the first target object to the server, is further configured to:

in response to an operation instruction on the third operation item in the guide interface, acquire the body type data of each human body target to be synthesized and notify the server, so that the server can adjust the relative sizes of different human body targets according to the body type data of each human body target;

wherein the body type data comprises height and weight, and the human body image size of each target object participating in the group photo in the composite image is determined according to the body type data.

5. A server, comprising a memory and a processor, wherein:

the memory is used for storing a computer program executable by the processor;

the processor is connected with the memory and is configured to: when a first target object and at least one second target object carry out a video call through their intelligent terminals, if a group-photo instruction sent by the intelligent terminal of the first target object or the intelligent terminal of a second target object is received, control the intelligent terminals of the first target object and each second target object to display a guide interface for the group photo;

receive the to-be-processed images respectively sent by the intelligent terminal of the first target object and the intelligent terminals of the second target objects;

synthesize the to-be-processed images of the first target object and the second target objects to obtain a composite image;

and distribute the composite image to the intelligent terminal of the first target object and the intelligent terminals of the second target objects.

6. The server according to claim 5, wherein the processor, when synthesizing the to-be-processed images of the first target object and each second target object to obtain the composite image, is configured to:

segment the human body image of each human body target from the respective to-be-processed images;

and fuse the segmented human body images into a background image for the group photo.

7. The server according to claim 6, wherein the processor, when fusing the segmented human body images into the background image for the group photo, is configured to:

perform image processing on the human body image of a human body target to obtain a mask image of the human body target, wherein pixel points belonging to the human body target and pixel points outside the human body target are recorded in the mask image;

and after determining a synthesis region of the human body target in the background image, replace the pixel values at the pixel positions belonging to the human body target in the background image with the pixel values of the human body image, taking the mask image of the human body image as a template.

8. The server of claim 7, wherein the processor, when determining the synthesis region of the human body target in the background image, is configured to:

count the total number N of human body targets in all the to-be-processed images, where N is a positive integer greater than or equal to 2;

and divide the background image into M non-overlapping synthesis regions according to the total number of human body targets, wherein each human body target corresponds to one region and M is greater than or equal to N.

9. An image processing method, characterized in that the method comprises:

when a first target object and at least one second target object carry out a video call through an intelligent terminal, responding to a group-photo instruction of the first target object or a second target object, and acquiring and displaying a guide interface for the group photo;

in response to an image acquisition instruction triggered through the guide interface, acquiring a to-be-processed image of the first target object;

sending the acquired to-be-processed image of the first target object to a server, so that the server synthesizes the to-be-processed image of the first target object with the to-be-processed image of each second target object to obtain a composite image;

and receiving and displaying the composite image sent by the server.

10. An image processing method, characterized in that the method comprises:

when a first target object and at least one second target object carry out a video call through an intelligent terminal, if a group-photo instruction sent by the intelligent terminal of the first target object or the intelligent terminal of a second target object is received, controlling the intelligent terminals of the first target object and each second target object to display a guide interface for the group photo;

receiving the to-be-processed images respectively sent by the first target object and each second target object;

synthesizing the to-be-processed images of the first target object and each second target object to obtain a composite image;

and distributing the composite image to the first target object and each second target object.

Technical Field

The present disclosure relates to the field of intelligent terminal technologies, and in particular, to an intelligent terminal, a server, and an image processing method.

Background

With the wide application of video call technology on intelligent terminals, people can socialize across screens through their intelligent terminals. However, cross-screen social contact based on a single video call scene cannot meet user requirements, and the problem is as follows: when only the related operations of a video call are executed, the video call function is simple and some user needs cannot be met, so that idle processor resources are not fully utilized, causing a waste of resources.

Disclosure of Invention

The purpose of the present disclosure is to provide an intelligent terminal, a server and an image processing method, which are used for solving the prior-art problem that, because cross-screen social contact based on a single video call scene cannot meet user requirements, a high-performance processor of an intelligent terminal that executes only the related operations of a video call leaves idle processor resources underutilized and wastes resources.

In a first aspect, the present disclosure provides an intelligent terminal, including: a display, an image collector, a memory and a controller, wherein:

the display is used for displaying information;

the image collector is used for collecting images;

the memory is used for storing a computer program executable by the controller;

the controller is connected to the display, the image collector and the memory respectively, and is configured to:

when a first target object and at least one second target object carry out a video call through the intelligent terminal, respond to a group-photo instruction of the first target object or a second target object, and control the display to display a guide interface after acquiring guide interface data for the group photo;

respond to an image acquisition instruction triggered through the guide interface, and control the image collector to collect a to-be-processed image of the first target object;

the controller is connected with the image collector and is configured to send the to-be-processed image of the first target object collected by the image collector to a server, so that the server synthesizes the to-be-processed image of the first target object with the to-be-processed images of the second target objects to obtain a composite image;

and receive the composite image sent by the server, control the display to display the composite image, and then store the composite image.

In some possible embodiments, the guide interface includes a first operation item for setting a background image; the controller, before sending the acquired to-be-processed image of the first target object to the server, is further configured to:

in response to an operation instruction on the first operation item in the guide interface, determine a background image for the group photo and notify the server.

In some possible embodiments, the guide interface includes a second operation item for setting a recommended photographing posture; the controller, before sending the acquired to-be-processed image of the first target object to the server, is further configured to:

in response to an operation instruction on the second operation item in the guide interface, display the recommended photographing posture selected by the first target object in the guide interface.

In some possible embodiments, the guide interface includes a third operation item for setting body type data; the controller, after sending the acquired to-be-processed image of the first target object to the server, is further configured to:

in response to an operation instruction on the third operation item in the guide interface, acquire the body type data of each human body target to be synthesized and notify the server, so that the server can adjust the relative sizes of different human body targets according to the body type data of each human body target;

wherein the body type data comprises height and weight, and the human body image size of each target object participating in the group photo in the composite image is determined according to the body type data.

In a second aspect, the present disclosure provides a server comprising a memory and a processor, wherein:

the memory is used for storing a computer program executable by the processor;

the processor, coupled to the memory, is configured to: when a first target object and at least one second target object carry out a video call through their intelligent terminals, if a group-photo instruction sent by the intelligent terminal of the first target object or the intelligent terminal of a second target object is received, control the intelligent terminals of the first target object and each second target object to display a guide interface for the group photo;

receive the to-be-processed images respectively sent by the intelligent terminal of the first target object and the intelligent terminals of the second target objects;

synthesize the to-be-processed images of the first target object and the second target objects to obtain a composite image;

and distribute the composite image to the intelligent terminal of the first target object and the intelligent terminals of the second target objects.

In some possible embodiments, when synthesizing the to-be-processed images of the first target object and each second target object to obtain the composite image, the processor is configured to:

segment the human body image of each human body target from the respective to-be-processed images;

and fuse the segmented human body images into a background image for the group photo.

In some possible embodiments, the processor, when fusing the segmented human body images into the background image for the group photo, is configured to:

perform image processing on the human body image of a human body target to obtain a mask image of the human body target, wherein pixel points belonging to the human body target and pixel points outside the human body target are recorded in the mask image;

and after determining a synthesis region of the human body target in the background image, replace the pixel values at the pixel positions belonging to the human body target in the background image with the pixel values of the human body image, taking the mask image of the human body image as a template.

In some possible embodiments, the processor, when determining the synthesis region of the human body target in the background image, is configured to:

count the total number N of human body targets in all the to-be-processed images, where N is a positive integer greater than or equal to 2;

and divide the background image into M non-overlapping synthesis regions according to the total number of human body targets, wherein each human body target corresponds to one region and M is greater than or equal to N.

In a third aspect, the present disclosure further provides an image processing method applied to an intelligent terminal, where the method includes:

when a first target object and at least one second target object carry out a video call through an intelligent terminal, responding to a group-photo instruction of the first target object or a second target object, and acquiring and displaying a guide interface for the group photo;

in response to an image acquisition instruction triggered through the guide interface, acquiring a to-be-processed image of the first target object;

sending the acquired to-be-processed image of the first target object to a server, so that the server synthesizes the to-be-processed image of the first target object with the to-be-processed image of each second target object to obtain a composite image;

and receiving and displaying the composite image sent by the server.

In some possible embodiments, the guide interface includes a first operation item for setting a background image; before the acquired to-be-processed image of the first target object is sent to the server, the method further includes:

in response to an operation instruction on the first operation item in the guide interface, determining a background image for the group photo and notifying the server.

In some possible embodiments, the guide interface includes a second operation item for setting a recommended photographing posture; before the acquired to-be-processed image of the first target object is sent to the server, the method further includes:

in response to an operation instruction on the second operation item in the guide interface, displaying the recommended photographing posture selected by the first target object in the guide interface.

In some possible embodiments, the guide interface includes a third operation item for setting body type data; after the acquired to-be-processed image of the first target object is sent to the server, the method further includes:

in response to an operation instruction on the third operation item in the guide interface, acquiring the body type data of each human body target to be synthesized and notifying the server, so that the server can adjust the relative sizes of different human body targets according to the body type data of each human body target;

wherein the body type data comprises height and weight, and the human body image size of each target object participating in the group photo in the composite image is determined according to the body type data.

In a fourth aspect, the present disclosure further provides an image processing method applied to a server, the method including:

when a first target object and at least one second target object carry out a video call through an intelligent terminal, if a group-photo instruction sent by the intelligent terminal of the first target object or the intelligent terminal of a second target object is received, controlling the intelligent terminals of the first target object and each second target object to display a guide interface for the group photo;

receiving the to-be-processed images respectively sent by the intelligent terminal of the first target object and the intelligent terminals of the second target objects;

synthesizing the to-be-processed images of the first target object and each second target object to obtain a composite image;

and distributing the composite image to the intelligent terminal of the first target object and the intelligent terminals of the second target objects.

In some possible embodiments, synthesizing the to-be-processed images of the first target object and each second target object to obtain the composite image includes:

segmenting the human body image of each human body target from the respective to-be-processed images;

and fusing the segmented human body images into a background image for the group photo.

In some possible embodiments, fusing the segmented human body images into the background image for the group photo includes:

performing image processing on the human body image of a human body target to obtain a mask image of the human body target, wherein pixel points belonging to the human body target and pixel points outside the human body target are recorded in the mask image;

and after determining a synthesis region of the human body target in the background image, replacing the pixel values at the pixel positions belonging to the human body target in the background image with the pixel values of the human body image, taking the mask image of the human body image as a template.

In some possible embodiments, determining the synthesis region of the human body target in the background image includes:

counting the total number N of human body targets in all the to-be-processed images, where N is a positive integer greater than or equal to 2;

and dividing the background image into M non-overlapping synthesis regions according to the total number of human body targets, wherein each human body target corresponds to one region and M is greater than or equal to N.

In the embodiments of the present disclosure, providing users with a group-photo function during a video call through the intelligent terminal solves the problem that, when a user performs only the related operations of a video call, the video call function is too simple to meet some user needs, leaving idle processor resources underutilized and causing a waste of resources.

Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the disclosure. The objectives and other advantages of the disclosure may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the embodiments of the present disclosure will be briefly described below, and it is apparent that the drawings described below are only some embodiments of the present disclosure, and it is obvious for those skilled in the art that other drawings can be obtained based on the drawings without inventive labor.

Fig. 1A is an application scene diagram of an image processing method according to some embodiments of the present disclosure;

fig. 1B is a rear view of an intelligent television according to some embodiments of the present disclosure;

fig. 2 is a block diagram of a hardware configuration of the control device 100 in fig. 1A according to some embodiments of the present disclosure;

fig. 3A is a block diagram of a hardware configuration of the smart tv 200 in fig. 1A according to some embodiments of the present disclosure;

FIG. 3B is a block diagram of the server 300 of FIG. 1A according to some embodiments of the present disclosure;

fig. 3C is a timing diagram of an image processing method according to some embodiments of the present disclosure;

FIG. 4a is a schematic illustration of a guidance interface provided by some embodiments of the present disclosure;

FIG. 4b is a schematic diagram of an image after human body segmentation according to some embodiments of the present disclosure;

fig. 4c is a schematic diagram of a human body image after image processing according to some embodiments of the present disclosure;

FIG. 4d is a schematic view of a mask image provided by some embodiments of the present disclosure;

FIG. 4e is a schematic diagram of a composite image provided by some embodiments of the present disclosure;

fig. 5 is a flowchart of an image processing method according to some embodiments of the present disclosure;

Fig. 6 is another flowchart of an image processing method according to some embodiments of the present disclosure.

Detailed Description

To further illustrate the technical solutions provided by the embodiments of the present disclosure, a detailed description is given below with reference to the accompanying drawings and specific embodiments. Although the embodiments of the present disclosure provide the method steps shown in the following embodiments or drawings, more or fewer steps may be included in the method on the basis of conventional or non-inventive effort. For steps that have no necessary logical causal relationship, the order of execution is not limited to that provided by the embodiments of the present disclosure. In an actual process, or when executed by a control device, the method may be executed in the order shown in the embodiments or drawings, or in parallel.

It is to be understood that the described embodiments are merely some, not all, of the embodiments of the present disclosure. All other embodiments, which can be derived by one of ordinary skill in the art from the embodiments disclosed herein without creative effort, shall fall within the scope of protection of the present disclosure. The terms "first" and "second" in the embodiments of the present disclosure are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated; thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of such features. In the description of the embodiments of the present disclosure, unless otherwise indicated, the term "plurality" means two or more, and other terms should be understood similarly. The preferred embodiments described herein are for the purpose of illustration and explanation only and are not intended to limit the disclosure, and the features of the embodiments and examples of the disclosure may be combined with each other without conflict.

The image processing method provided by the embodiments of the present disclosure is applicable to intelligent terminals, including but not limited to: computers, smart phones, smart watches, smart televisions, smart robots, and the like. In the following, the image processing method provided by the present disclosure is described in detail by taking a smart television as an example.

With the wide application of video call technology on smart televisions, people can socialize across screens through the smart television. However, cross-screen social contact based on a single video call scene cannot meet user requirements, and among the resulting problems is this: if only the related operations of a video call are executed while the processor's performance is high, idle processor resources are not fully utilized and resources are wasted. In view of this, the present disclosure provides an image processing method, an apparatus, an electronic device and a storage medium to solve the above problems.

According to the image processing method provided by the present disclosure, the smart television can respond to a group-photo instruction initiated by any target object participating in the group photo, and acquire and display a guide interface for the group photo to each target object, so as to guide each target object through a cross-screen group photo. The server then performs image processing on the images uploaded by the target objects to obtain a composite image, and distributes the composite image to each target object, thereby realizing the cross-screen group-photo function of the smart television.

Furthermore, the present disclosure provides, in the guide interface, operation items for selecting a background image, setting a recommended photographing posture and setting body type data. The server can recommend the maximum number of people participating in the group photo according to the size of the background image transmitted by the smart television, and intelligently adjust the size of the to-be-processed image of each target object according to the body type data transmitted by the smart television, thereby further ensuring user satisfaction and the imaging effect of the composite photo.

The following describes an image processing method in an embodiment of the present disclosure in detail with reference to the drawings.

Referring to fig. 1A, a view of an application scenario of image processing is provided in some embodiments of the present disclosure. As shown in fig. 1A, the control device 100 and the smart tv 200 may communicate with each other in a wired or wireless manner.

The control device 100 is configured to control the smart tv 200, receive operation instructions input by a user, convert the operation instructions into instructions that the smart tv 200 can recognize and respond to, and play an intermediary role in the interaction between the user and the smart tv 200. For example, the smart tv 200 responds to channel-up and channel-down operations when the user operates the channel-up and channel-down keys on the control device 100.

The control device 100 may be a remote controller 100A, which communicates with the smart tv 200 through infrared protocol communication, Bluetooth protocol communication or other short-distance communication methods, and controls the smart tv 200 wirelessly or by wire. The user may input user commands through buttons on the remote controller, voice input, control panel input, etc. to control the smart tv 200. For example, the user can input corresponding control instructions through the volume up/down keys, channel control keys, up/down/left/right movement keys, voice input keys, menu key, power on/off key and the like on the remote controller to control the smart tv 200.

The control device 100 may also be an intelligent device, such as a mobile terminal 100B, a tablet computer, a notebook computer, and the like. For example, the smart tv 200 may be controlled using an application running on the smart device. Through configuration, the application program can provide the user with various controls in an intuitive user interface (UI) on the screen associated with the smart device.

For example, the mobile terminal 100B may install a software application associated with the smart tv 200, implement connection communication through a network communication protocol, and achieve one-to-one control operation and data communication. For instance, the mobile terminal 100B and the smart tv 200 may establish a control instruction protocol, so that by operating the various function keys or virtual controls of the user interface provided on the mobile terminal 100B, the functions of the physical keys arranged on the remote controller 100A can be realized. The audio and video content displayed on the mobile terminal 100B may also be transmitted to the smart tv 200 to implement a synchronous display function.

The smart tv 200 may provide a network television function combining a broadcast receiving function and a computer support function. The smart tv may be implemented as a digital tv, a web tv, an Internet Protocol tv (IPTV), and the like.

The smart tv 200 may be a liquid crystal display, an organic light emitting display, or a projection device. The specific type, size and resolution of the smart television are not limited.

The smart tv 200 also performs data communication with the server 300 through various communication methods. The smart tv 200 may be allowed to establish a communication connection through a local area network (LAN), a wireless local area network (WLAN), or other networks. The server 300 may provide various contents and interactions to the smart tv 200. For example, the smart tv 200 may send and receive information, such as receiving electronic program guide (EPG) data, receiving software program updates, or accessing a remotely stored digital media library. The server 300 may be one group or multiple groups of servers, and may be one or more types of servers. The server 300 provides other web service contents such as video on demand and advertisement services.

In some embodiments, as shown in fig. 1B, the smart tv 200 includes a controller 250, a display 275, a terminal interface 278 extending from a gap in the back panel, and a rotating assembly 276 coupled to the back panel, where the rotating assembly 276 can rotate the display 275. Viewed from the front of the smart television, the rotating assembly 276 can rotate the display to a portrait state, in which the vertical side of the screen is longer than the horizontal side, or to a landscape state, in which the horizontal side of the screen is longer than the vertical side.

Fig. 2 is a block diagram illustrating the configuration of the control device 100. As shown in fig. 2, the control device 100 includes a controller 110, a memory 120, a communicator 130, a user input interface 140, a user output interface 150, and a power supply 160.

The controller 110 includes a random access memory (RAM) 111, a read-only memory (ROM) 112, a processor 113, a communication interface, and a communication bus. The controller 110 is used to control the operation of the control device 100, the communication and cooperation among its internal components, and external and internal data processing functions.

Illustratively, when an interaction in which the user presses a key on the remote controller 100A, or an interaction in which the user touches a touch panel on the remote controller 100A, is detected, the controller 110 may generate a signal corresponding to the detected interaction and send the signal to the smart tv 200.

The memory 120 stores various operation programs, data and applications for driving and controlling the control device 100 under the control of the controller 110. The memory 120 may store various control signal commands input by the user.

The communicator 130 enables communication of control signals and data signals with the smart tv 200 under the control of the controller 110. For example, the control device 100 transmits a control signal (e.g., a touch signal or a key signal) to the smart tv 200 via the communicator 130, and the control device 100 may receive signals transmitted by the smart tv 200 via the communicator 130. The communicator 130 may include an infrared signal interface 131 and a radio frequency signal interface 132. For example, when the infrared signal interface is used, a user input instruction needs to be converted into an infrared control signal according to an infrared control protocol and sent to the smart tv 200 through an infrared sending module. For another example, when the radio frequency signal interface is used, a user input instruction needs to be converted into a digital signal, modulated according to a radio frequency control signal modulation protocol, and then sent to the smart tv 200 through a radio frequency sending module.

The user input interface 140 may include at least one of a microphone 141, a touch pad 142, a sensor 143, a key 144, and the like, so that the user may input user instructions for controlling the smart tv 200 to the control device 100 through voice, touch, gesture, press, and the like. For example, a group-photo instruction may be generated according to a user operation and transmitted to the smart tv 200.

The user output interface 150 outputs user instructions received by the user input interface 140 to the smart tv 200, or outputs images or voice signals received from the smart tv 200. The user output interface 150 may include an LED interface 151, a vibration interface 152 that generates vibration, a sound output interface 153 that outputs sound, a display 154 that outputs images, and the like. For example, the remote controller 100A may receive output signals such as audio, video or data from the user output interface 150 and present them as images on the display 154, as audio through the sound output interface 153, or as vibration through the vibration interface 152.

The power supply 160 provides operating power support for each element of the control device 100 under the control of the controller 110, and may take the form of a battery and associated control circuitry.

A hardware configuration block diagram of the smart tv 200 is exemplarily shown in fig. 3A. As shown in fig. 3A, the smart tv 200 may include a tuner demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a memory 260, a user interface 265, a video processor 270, a display 275, a rotating component 276, an audio processor 280, an audio output interface 285, and a power supply 290.

The rotating assembly 276 may also include other components, such as a transmission component and a detection component. The transmission component can adjust the rotating speed and torque output by the rotating assembly 276 through a specific transmission ratio, for example in a gear transmission mode; the detection component may consist of sensors arranged on the rotating shaft, such as an angle sensor or an attitude sensor. These sensors may detect parameters such as the angle through which the rotating assembly 276 has rotated and send the detected parameters to the controller 250, so that the controller 250 can determine or adjust the state of the smart tv 200 accordingly. In practice, the rotating assembly 276 may include, but is not limited to, one or more of the components described above.

The tuner demodulator 210 receives the broadcast television signal in a wired or wireless manner, may perform modulation and demodulation processing such as amplification, mixing, and resonance, and is configured to demodulate, from a plurality of wireless or wired broadcast television signals, an audio/video signal carried in a frequency of a television channel selected by a user, and additional information (e.g., EPG data).

The tuner demodulator 210 responds to the television channel frequency selected by the user and the television signal carried by that frequency, according to the user's selection and under the control of the controller 250.

The tuner demodulator 210 can receive a television signal in various ways according to the broadcasting system of the television signal, such as: terrestrial broadcasting, cable broadcasting, satellite broadcasting, internet broadcasting, or the like; and according to different modulation types, a digital modulation mode or an analog modulation mode can be adopted; and can demodulate the analog signal and the digital signal according to the different kinds of the received television signals.

The communicator 220 is a component for communicating with an external device or an external server according to various communication protocol types. For example, the smart tv 200 may transmit content data to an external device connected via the communicator 220, or browse and download content data from an external device connected via the communicator 220. The communicator 220 may include a network communication protocol module or a near field communication protocol module, such as a WIFI module 221, a bluetooth communication protocol module 222, and a wired ethernet communication protocol module 223, so that the communicator 220 may receive a control signal of the control device 100 according to the control of the controller 250 and implement the control signal as a WIFI signal, a bluetooth signal, a radio frequency signal, and the like.

The detector 230 is a component of the smart terminal 200 for collecting signals of an external environment or interaction with the outside. The detector 230 may include a sound collector 231, such as a microphone, which may be used to receive the sound of the user, such as a voice signal of a control instruction of the user controlling the smart tv 200; or, environmental sounds for identifying the type of the environmental scene may be collected, so that the smart tv 200 may adapt to the environmental noise.

In some other exemplary embodiments, the detector 230 may further include an image collector 232, such as a camera or video camera, which may be used to collect the external environment scene so as to adaptively change the display parameters of the smart tv 200, and to collect the user's attributes or the user's interaction gestures, so as to realize the interaction function between the smart television and the user.

The external device interface 240 is a component that enables data transmission between the smart tv 200 and external devices under the control of the controller 250. The external device interface 240 may be connected to external apparatuses such as a set-top box, a game device, a notebook computer, etc. in a wired/wireless manner, and may receive data such as video signals (e.g., moving images), audio signals (e.g., music), and additional information (e.g., EPG) from the external apparatuses.

The controller 250 controls the operation of the smart tv 200 and responds to the user's operation by running various software control programs (such as an operating system and various application programs) stored on the memory 260.

The controller 250 includes a random access memory (RAM) 251, a read-only memory (ROM) 252, a graphics processor 253, a CPU processor 254, a communication interface 255, and a communication bus 256. The RAM 251, the ROM 252, the graphics processor 253 and the CPU processor 254 are connected to one another via the communication interface 255 and the communication bus 256.

The ROM 252 stores various system boot instructions. When the smart tv 200 receives a power-on signal and starts up, the CPU processor 254 executes the system boot instructions in the ROM 252 and copies the operating system stored in the memory 260 to the RAM 251 to begin running the operating system. After the operating system has started, the CPU processor 254 copies the various applications in the memory 260 to the RAM 251 and then starts running the applications.

The graphics processor 253 generates various graphic objects such as icons, operation menus, and graphics displayed in response to user input instructions. The graphics processor 253 may include an operator, which performs operations by receiving the various interactive instructions input by the user and displays various objects according to their display attributes, and a renderer, which generates the various objects based on the operator and displays the rendered result on the display 275.

The CPU processor 254 executes the operating system and application program instructions stored in the memory 260, and, according to received user input instructions, processes various applications, data and contents so as to finally display and play various audio and video contents.

The communication interface 255 may include a first interface to an nth interface. These interfaces may be network interfaces that are connected to external devices via a network.

The controller 250 may control the overall operation of the smart tv 200. For example: in response to receiving a user input command for selecting a GUI object displayed on the display 275, the controller 250 may perform an operation related to the object selected by the user input command.

Where the object may be any one of the selectable objects, such as a hyperlink or an icon. The operation related to the selected object is, for example, an operation of displaying a link to a hyperlink page, document, image, or the like, or an operation of executing a program corresponding to the object. The user input command for selecting the GUI object may be a command input through various input devices (e.g., a mouse, a keyboard, a touch panel, etc.) connected to the smart tv 200 or a voice command corresponding to a voice spoken by the user.

The memory 260 is used for storing various types of data, software programs or applications that drive and control the operation of the smart tv 200. The memory 260 may include volatile and/or nonvolatile memory. The term "memory" includes the memory 260, the RAM 251 and ROM 252 of the controller 250, and the memory card in the smart tv 200.

In the embodiments of the present disclosure, the controller 250 is configured to, when a video call is carried out between a first target object and at least one second target object through the smart tv 200, respond to a group-photo instruction of the first target object or a second target object, and control the display 275 to display a guide interface after acquiring guide interface data for the group photo;

the controller 250 controls the image collector 232 to collect a to-be-processed image of the first target object in response to an image acquisition instruction triggered through the guide interface;

the controller 250 is connected to the image collector 232 and is configured to send the to-be-processed image of the first target object collected by the image collector 232 to the server 300, so that the server 300 synthesizes the to-be-processed image of the first target object with the to-be-processed images of the second target objects to obtain a composite image;

the controller 250 receives the composite image sent by the server 300, controls the display 275 to show it, and stores it in the memory 260. The guide interface and other operations of the smart television will be described in detail later.

A hardware configuration block diagram of the server 300 is exemplarily illustrated in fig. 3B. As shown in fig. 3B, the components of server 300 may include, but are not limited to: at least one processor 131, at least one memory 132, and a bus 133 that connects the various system components (including the memory 132 and the processor 131).

Bus 133 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus architectures.

The memory 132 may include readable media in the form of volatile memory, such as Random Access Memory (RAM)1321 and/or cache memory 1322, and may further include Read Only Memory (ROM) 1323.

Memory 132 may also include a program/utility 1325 having a set (at least one) of program modules 1324, such program modules 1324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.

The server 300 may also communicate with one or more external devices 134 (e.g., a keyboard, a pointing device, etc.), with one or more devices that enable a user to interact with the server 300, and/or with any device (e.g., a router, a modem, etc.) that enables the server 300 to communicate with one or more other electronic devices. Such communication may occur via input/output (I/O) interfaces 135. Furthermore, the server 300 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) via the network adapter 136. As shown, the network adapter 136 communicates with the other modules of the server 300 over the bus 133. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the server 300, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

In some embodiments, for each to-be-processed image, the processor 131 may employ image processing techniques to obtain the human body image of each human body target in the to-be-processed image and fuse the human body image into a background image for the group photo. To ensure the quality and realism of the composite image, the processor 131 performs human body segmentation on each to-be-processed image; the segmented to-be-processed image retains only the human-body-target part of the image, and the rest of the background is set to black.

Target detection is then performed on the segmented human body part in the to-be-processed image to obtain the number and positions of the human body targets in each to-be-processed image. For example, for a given human body target, a human body frame is obtained through target detection, and after the to-be-processed image is cropped according to the human body frame, a human body image of each human body target cropped based on its human body frame is obtained.

It should be noted that any method capable of acquiring the to-be-processed image is applicable to the embodiments of the present disclosure, and the present disclosure does not limit this.

In some embodiments, various aspects of the image processing method provided by the present disclosure may also be implemented in the form of a program product including program code; when the program product is run on a computer device, the program code causes the computer device to perform the steps of the image processing method according to the various exemplary embodiments of the present disclosure described above in this specification.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The program product for image processing of the embodiments of the present disclosure may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on an electronic device. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Having generally introduced the controller, the smart television and the server of the present disclosure, the image processing method provided by the present disclosure is further described below. Taking the smart tv 400a of the first target object as the group-photo initiating terminal and the smart tv 400b of the second target object as the group-photo receiving terminal, fig. 3C exemplarily shows a timing diagram of an image processing method, involving the smart tv 400a, the smart tv 400b and the server 400c.

In step 4011, in response to a group-photo instruction of the first target object, the smart tv 400a transmits the group-photo instruction to the server 400c.

In step 4012, in response to the group-photo instruction sent by the smart tv 400a, the server 400c controls both the smart tv 400a and the smart tv 400b to display a guide interface for the group photo.

In one embodiment, for the smart tv 400a of the user initiating the group photo, in response to the group-photo instruction triggered by the user operation, the smart television may send a guide interface acquisition request to the server so as to obtain the guide interface data under the control of the server, and then control its own display to display the guide interface according to the guide interface data.

For the smart tv 400b of a user who did not actively initiate the group photo, the smart tv 400b may passively receive the guide interface data pushed by the server and control its display to display the guide interface according to the guide interface data. Thus, the smart tvs 400a and 400b both complete the downloading and displaying of the guide interface data under the control and coordination of the server.

In step 4013, the smart tv 400a and the smart tv 400b determine each operation item in the guidance interface.

In one embodiment, when the first target object is the object initiating the group photo, after responding to the group-photo instruction of the first target object, the smart tv 400a of the first target object acquires and displays a guide interface according to the group-photo instruction. The guide interface may include a plurality of operation items, including but not limited to: selecting a background image for the group photo and setting a recommended photographing posture. The guide interface determined by the smart tv 400a in response to an operation instruction is shown in fig. 4a: the smart tv 400a determines a background image for the group photo in response to an operation instruction on the first operation item in the guide interface, and determines a photographing posture for the group photo in response to an operation instruction on the second operation item in the guide interface. The image of the user who initiates the group photo can be used as the background image by default.

In another embodiment, when the first target object is the initiator of the group photo, the guide interface of each second target object may provide only some of the operation items. For example, a second target object may select a photographing posture and set body type data; that is, to avoid conflicting background images when different target objects select different ones, the right to select the background image may be withheld from the second target objects.

The image of the user who initiates the group photo may be used as the background image by default. The operation items include, but are not limited to: selecting a background image for the group photo, setting a recommended photographing posture, and setting body type data for the group photo. The body type data for the group photo is set so that the human body image size of each target object participating in the group photo can be adjusted in the composite image according to the body type data (such as height and weight), which avoids human body images whose sizes deviate too far from reality and ensures the realism of the composite image. In addition, to protect the privacy of each target object participating in the group photo, the operation item for setting body type data is completed by each target object itself, and each target object's body type data is invisible to the others.
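To make the size adjustment concrete, the following is a minimal sketch, assuming the server holds one cropped image per person together with a reported height in centimetres; the function name `scale_by_height` and the 300-pixel reference height are illustrative assumptions, not part of the patent:

```python
# Illustrative sketch only: scale each person image so that on-screen heights
# are proportional to the reported real-world heights.
import cv2

def scale_by_height(person_imgs, heights_cm, tallest_px=300):
    """person_imgs: list of HxWx3 crops; heights_cm: reported heights in cm."""
    tallest_cm = max(heights_cm)
    scaled = []
    for img, h_cm in zip(person_imgs, heights_cm):
        target_h = int(tallest_px * h_cm / tallest_cm)  # proportional height
        scale = target_h / img.shape[0]
        scaled.append(cv2.resize(img, None, fx=scale, fy=scale))
    return scaled
```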

To simplify the flow, the permissions of the individual operation items are not analyzed per target object here; in implementation, all functions in the guide interface may simply be opened to every target object.

In step 4014, the smart tv 400a uses its image collector to acquire the to-be-processed image of the first target object in response to the image acquisition instruction triggered by the first target object; likewise, the smart tv 400b uses its image collector to acquire the to-be-processed image of the second target object in response to the image acquisition instruction triggered by the second target object.

In step 4015, the smart tv 400a sends the to-be-processed image of the first target object to the server 400c, and the smart tv 400b sends the to-be-processed image of the second target object to the server 400c.

In step 4016, the server 400c segments, from each received image to be processed, a human body image that highlights the human body target.

In one embodiment, the server performs human body segmentation on each image to be processed. As shown in fig. 4b, the segmented image retains only the human body target, and the rest of the background is set to black. Target detection is then performed on the segmented human body region to obtain the human body frame shown in fig. 4b, and the image to be processed is cropped according to this frame to obtain the human body image shown in fig. 4c.
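A minimal sketch of this segment-then-crop step follows; it assumes the binary person mask comes from any off-the-shelf human segmentation model, and it derives the human body frame from the mask's non-zero pixels rather than from a separate detector.

```python
# Sketch of step 4016: keep only the person, then crop to the human body frame.
# Assumes seg_mask is a uint8 binary mask (255 = person) produced by any
# off-the-shelf human segmentation model; the frame is derived from the mask.
import cv2
import numpy as np

def crop_human(image: np.ndarray, seg_mask: np.ndarray) -> np.ndarray:
    person_only = cv2.bitwise_and(image, image, mask=seg_mask)  # black background
    ys, xs = np.nonzero(seg_mask)                               # person pixels
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    return person_only[y0:y1, x0:x1]                            # human body image
```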

It should be noted that any method capable of obtaining the human body image from the image to be processed is applicable to the embodiment of the present disclosure; the present disclosure does not limit this.

In step 4017, server 400c performs image processing on the human body image to obtain a mask image.

In step 4018, the server 400c determines the synthesis region of each human body target in the background image.

In step 4019, the server 400c fuses the human body image and the background image using the mask image of the human body image as a template to obtain a composite image.

In one embodiment, the mask image of a human body image is obtained by image processing of the human body image; the mask image records which pixel points belong to the human body target and which lie outside it. Then, after the synthesis region of the human body target in the background image is determined using the mask image as a template, the pixel values at the positions belonging to the human body target in the background image are replaced with the corresponding pixel values of the human body image.

To ensure that no human body image participating in the synthesis is blocked or covered in the composite image, the present disclosure provides two ways of determining the synthesis region of each human body target:

Mode 1: in one embodiment, the total number N of human body targets in all the images to be processed is counted, and the server uniformly divides the background image into M non-overlapping synthesis regions according to this total, so that each human body target corresponds to one region. Meanwhile, to ensure the aesthetics of the composite picture, spacing between adjacent synthesis regions and blank margin regions can be preset.

Here N is a positive integer greater than or equal to 2, and M is greater than or equal to N.
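A sketch of Mode 1 is given below; the left-to-right layout and the default gap and margin values are illustrative assumptions, not values fixed by the disclosure.

```python
# Sketch of Mode 1: divide the background width into N side-by-side synthesis
# regions with preset gaps and blank margins; the layout policy and the default
# gap/margin values are illustrative assumptions.
def divide_regions(bg_w: int, bg_h: int, n: int, gap: int = 20, margin: int = 40):
    """Return n non-overlapping (x, y, w, h) regions, one per human target."""
    usable = bg_w - 2 * margin - (n - 1) * gap
    slot_w = usable // n
    return [(margin + i * (slot_w + gap), 0, slot_w, bg_h) for i in range(n)]
```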

Mode 2: to increase the users' satisfaction with the composite photograph, the positions in the composite image of the target objects participating in the group photo may instead be determined in advance by the users themselves. For example, after the smart television determines the guide interface in response to the operation instruction, a group photo interface can be provided in which each participating target object previews and sets its own recommended photographing gesture, based on the recommended gesture determined earlier. To prevent the photographing gestures set by different target objects from overlapping or blocking one another in the group photo interface, the position of each target object's gesture on the guide interface is finally adjusted by the initiator. After every participating target object has confirmed, the server synthesizes the image at the corresponding positions according to the adjusted gesture positions in the group photo interface.

To ensure that the composite image looks real and natural: since inconsistent lighting among the users participating in the group photo would degrade the composite result, the human body images participating in the synthesis can be processed with a gamma transformation when the mask images are obtained, adaptively adjusting the brightness of each human body image so that all of them are visually consistent in the composite image.
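One plausible reading of this adaptive adjustment is to choose the gamma so that each image's mean brightness maps to a shared target value, as in the following sketch; this is an assumption for illustration rather than the only possible method.

```python
# Sketch of adaptive gamma correction: choose gamma so each image's mean
# brightness maps to a shared target value, harmonizing the human images.
import numpy as np

def adaptive_gamma(img: np.ndarray, target_mean: float = 0.5) -> np.ndarray:
    norm = img.astype(np.float32) / 255.0
    mean = float(np.clip(norm.mean(), 1e-6, 1.0 - 1e-6))
    gamma = np.log(target_mean) / np.log(mean)   # mean ** gamma == target_mean
    return (np.power(norm, gamma) * 255.0).astype(np.uint8)
```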

Since segmentation along the contour of the human body target leaves a white jagged fringe at the edge, which would degrade the composite after fusion with the background image, the edge of the human body target in the mask image can be feathered, that is, filtered with a filter of suitable size; the processed mask image is shown in fig. 4d.
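The feathering can be realized by blurring the binary mask, as in the sketch below; a Gaussian filter, whose kernel size plays the role of the "size of the filter", is one concrete choice among several.

```python
# Sketch of feathering: blur the hard 0/255 mask so the human/background
# boundary blends smoothly; the Gaussian kernel size (odd) is the "filter size".
import cv2

def feather_mask(mask, ksize: int = 15):
    return cv2.GaussianBlur(mask, (ksize, ksize), 0)
```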

After the synthesis region of each human body target is determined and the brightness adjustment and feathering have been applied, the mask image of the human body image is used as a template, the human body image as the foreground, and the background image as the background, and they are fused to obtain the composite image shown in fig. 4e; that is, the fusion of the human body image and the background is a foreground-background fusion. For example, each mask image can be converted into a float matrix containing only the values 0 and 1, and binarization determines which pixels of the mask belong to the human body target and which to the background; during fusion, the pixel values at the positions belonging to the human body target in the background image are replaced with the corresponding pixel values of the human body image to obtain the composite image. In an alternative embodiment, a coordinate system is established with the upper left corner of the background image as the origin, the width of the background image along the horizontal axis and its height along the vertical axis; the positions of the pixels belonging to the human body target in the background image are determined, and their pixel values are replaced with those of the human body image, as shown in the following formula (1):

C_{i,j} = B_{i,j} * (1 - M_{i,j}) + A_{i,j} * M_{i,j},   i ∈ {0, 1, ..., w}, j ∈ {0, 1, ..., h}    (1)

wherein A_{i,j} is the foreground picture (the human body image), B_{i,j} is the background picture (the background image), C_{i,j} is the composite image formed by fusing the human body image and the background image, M_{i,j} is the mask image, and w and h are the width and height of each mask image.
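Formula (1) translates directly into a per-pixel blend. The sketch below assumes the human body image A has already been placed on a canvas aligned with the background B, and that M is the feathered mask normalized to [0, 1].

```python
# Direct per-pixel implementation of formula (1): C = B*(1 - M) + A*M.
# Assumes A (foreground canvas) and B (background) have the same shape and
# that M is the feathered mask already normalized to [0, 1].
import numpy as np

def fuse(A: np.ndarray, B: np.ndarray, M: np.ndarray) -> np.ndarray:
    M3 = M.astype(np.float32)[..., None]   # broadcast mask over color channels
    C = B.astype(np.float32) * (1.0 - M3) + A.astype(np.float32) * M3
    return np.clip(C, 0.0, 255.0).astype(np.uint8)
```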

In step 4020, the server 400c distributes the composite image to the smart tv 400a and the smart tv 400b.

Fig. 5 shows a flowchart of an image processing method applied to an intelligent terminal, which includes:

Step 501: when a first target object and at least one second target object carry out a video call through the intelligent terminal, acquiring and displaying a guide interface for the group photo in response to the group photo instruction of the first target object or a second target object.

Step 502: acquiring a to-be-processed image of the first target object in response to an image acquisition instruction triggered by the guide interface.

Step 503: sending the acquired to-be-processed image of the first target object to a server, so that the server synthesizes it with the to-be-processed images of the second target objects to obtain a composite image.

Step 504: receiving and displaying the composite image sent by the server.

Fig. 6 shows a flowchart of an image processing method applied to a server, which includes:

Step 601: when a first target object and at least one second target object carry out a video call through intelligent terminals, if a group photo instruction sent by the intelligent terminal of the first target object or the intelligent terminal of a second target object is received, controlling the intelligent terminals of the first target object and of each second target object to display a guide interface for the group photo.

Step 602: receiving the to-be-processed images respectively sent by the intelligent terminal of the first target object and the intelligent terminals of the second target objects.

Step 603: synthesizing the to-be-processed images of the first target object and the second target objects to obtain a composite image.

Step 604: distributing the composite image to the intelligent terminal of the first target object and the intelligent terminals of the second target objects.

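Tying steps 602 and 603 together, the sketch below composes every received image onto the background by reusing the helper functions sketched earlier; bottom-aligned placement and aspect-preserving resizing are illustrative choices, and the transport details of steps 601 and 604 are omitted.

```python
# End-to-end sketch of the server-side flow (steps 602-603), reusing
# divide_regions, adaptive_gamma and feather_mask from the earlier sketches.
# Bottom-aligned placement and aspect-preserving resizing are assumptions.
import cv2
import numpy as np

def compose_group_photo(images, seg_masks, background):
    """Compose each segmented person into one slot of the background."""
    H, W = background.shape[:2]
    canvas = background.copy()
    slots = divide_regions(W, H, n=len(images))
    for img, seg, (x, _, w, _) in zip(images, seg_masks, slots):
        ys, xs = np.nonzero(seg)                       # person bounding box
        y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
        person = adaptive_gamma(img[y0:y1, x0:x1])     # harmonize brightness
        mask = feather_mask(seg[y0:y1, x0:x1]).astype(np.float32) / 255.0
        scale = min(w / person.shape[1], H / person.shape[0], 1.0)
        pw, ph = int(person.shape[1] * scale), int(person.shape[0] * scale)
        person = cv2.resize(person, (pw, ph))
        mask = cv2.resize(mask, (pw, ph))[..., None]
        py = H - ph                                    # stand on the bottom edge
        region = canvas[py:H, x:x + pw].astype(np.float32)
        blended = region * (1.0 - mask) + person.astype(np.float32) * mask
        canvas[py:H, x:x + pw] = blended.astype(np.uint8)
    return canvas
```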