Temporal supersampling for point of gaze rendering systems

Document No.: 1409664    Publication date: 2020-03-06

Note: This technology, "Temporal supersampling for point of gaze rendering systems," was created by A. Young, C. Ho, and J.R. Stafford on 2018-06-07. Its main content is as follows: Methods and systems are provided for using temporal supersampling to improve the display resolution associated with peripheral regions of a gaze point rendered view. A method is provided for enabling higher resolution pixels to be reconstructed from a low resolution sampling region. The method includes operations for receiving a fragment from a rasterizer of a GPU and for applying temporal supersampling to the fragment having the low resolution sample region over a plurality of previous frames to obtain a plurality of color values. The method also includes an operation for reconstructing a plurality of high resolution pixels in a buffer based on the plurality of color values obtained via the temporal supersampling. Further, the method includes an operation for sending the plurality of high resolution pixels for display.

1. A method for rendering higher resolution pixels from a low resolution sampling region, comprising:

receiving a fragment from a rasterizer;

applying temporal supersampling to the fragment having the low resolution sample region over a plurality of previous frames to obtain a plurality of color values;

reconstructing a plurality of high resolution pixels in a buffer based on the plurality of color values obtained via the temporal supersampling; and

sending the plurality of high resolution pixels from the buffer for presentation on a display.

2. The method of claim 1, wherein the temporal supersampling comprises sampling at a location within the low resolution sample region of each of the plurality of previous frames.

3. The method of claim 1, wherein the temporal supersampling comprises sampling at locations of the low resolution sample region as determined by pixel re-projection.

4. The method of claim 1, wherein the temporal supersampling comprises sampling at positions of the low resolution sample region as determined by dithering.

5. The method of claim 1, wherein a number of the plurality of high resolution pixels is greater than a number of the plurality of previous frames.

6. The method of claim 1, wherein the reconstruction of the plurality of high resolution pixels further comprises a blending of the plurality of color values.

7. The method of claim 1, wherein the plurality of high resolution pixels are associated with a native resolution of the display or a resolution greater than the native resolution of the display.

8. The method of claim 1, wherein the display is associated with a Head Mounted Display (HMD).

9. The method of claim 1, wherein the low resolution sampling region is associated with a peripheral region of a point of regard rendered view.

10. A graphics system, comprising:

a Graphics Processing Unit (GPU) to apply temporal supersampling to a plurality of previous frames including a low resolution sample region, wherein the temporal supersampling obtains a plurality of color values;

a frame buffer to store the plurality of previous frames rendered by the GPU; and

a display buffer, wherein a plurality of high resolution pixels are reconstructed based on the plurality of color values obtained via the temporal supersampling of a previous frame;

wherein the plurality of high resolution pixels are configured for presentation on a display.

11. The graphics system of claim 10, wherein the temporal supersampling comprises sampling at a location within the low resolution sample region of each of the plurality of previous frames to obtain the plurality of color values.

12. The graphics system of claim 10, wherein the temporal supersampling defines sampling locations of the low resolution sample region as determined by pixel reprojection.

13. The graphics system of claim 10, wherein the temporal supersampling defines sampling locations of the low resolution sample region as determined by dithering.

14. The graphics system of claim 10, wherein a number of the plurality of high resolution pixels is greater than a number of the plurality of previous frames.

15. The graphics system of claim 10, wherein the reconstruction of the plurality of high resolution pixels further comprises a blending of the plurality of color values.

16. The graphics system of claim 10, wherein the plurality of high resolution pixels are associated with a native resolution of the display or a resolution greater than the native resolution of the display.

17. The graphics system of claim 10, wherein the display is associated with a Head Mounted Display (HMD).

18. The graphics system of claim 10, wherein the low resolution sampling region is associated with a peripheral region of a point of regard rendered view.

19. A non-transitory computer-readable medium storing a computer program executable by a processor-based system, comprising:

program instructions for receiving a fragment from a rasterizer, the fragment being associated with a low resolution sample region;

program instructions for applying temporal supersampling to the fragment over a plurality of previous frames to obtain a plurality of color values;

program instructions for reconstructing, in a buffer, a plurality of high resolution pixels associated with the low resolution sample region, the plurality of high resolution pixels being based on the plurality of color values obtained via the temporal supersampling; and

program instructions for sending the plurality of high resolution pixels from the buffer for presentation on a display.

20. The non-transitory computer readable medium of claim 19, wherein the temporal supersampling defines a sampling location of the low resolution sampling region on the plurality of previous frames as determined by pixel reprojection or as determined by dithering, and wherein the plurality of high resolution pixels are associated with an original resolution of the display or a resolution greater than the original resolution of the display.

Technical Field

The present disclosure relates generally to a point-of-gaze rendered view for Virtual Reality (VR) content provided by a Head Mounted Display (HMD), and more particularly to methods and systems for generating higher resolution pixels in certain regions within the point-of-gaze rendered view using temporal super sampling.

Background

Virtual Reality (VR) presented through Head Mounted Displays (HMDs) is increasingly becoming a way for consumers to interact with various types of content. As the VR content generated by VR applications is rendered with higher and higher resolution images and greater complexity, the computational, network, and memory costs required to support these VR scenes also increase. For example, as image resolution increases, the associated graphics pipeline needs to perform more and more operations to generate pixel data from the geometry data produced by a VR application. Also, the amount of memory required to store the geometry and pixel data needed to run a VR application may increase proportionally. Further, if the VR application is executed on a computing system that communicates with the HMD over a network connection (e.g., wired or wireless), the amount of data that needs to be sent over the network connection also increases.

Thus, bottlenecks often occur when executing VR applications with demanding computing and graphics requirements. Bottlenecks can result in reduced frame rate (frames per second), increased delay or lag, reduced resolution, and increased aliasing, all of which are detrimental to the overall user experience. Some attempts to reduce the computational, memory, and network costs associated with executing VR applications have resulted in VR scenes with lower resolution, pixelation, visual artifacts, etc., which can negatively impact the VR experience.

It is in this context that various embodiments have emerged.

Disclosure of Invention

Embodiments of the present disclosure provide methods and systems for enabling reconstruction of higher resolution pixels for display in an undersampled region of a VR scene by using temporal supersampling. In one embodiment, a method for reconstructing higher resolution pixels from a low resolution sampling region is provided. The method includes an operation for receiving a fragment from a rasterizer. The method also includes an operation for applying temporal supersampling to the fragment having the low resolution sample region over a plurality of previous frames to obtain a plurality of color values. According to certain embodiments, the method may further include an operation for reconstructing a plurality of high resolution pixels in a buffer based on the plurality of color values obtained via the temporal supersampling. Further, the method includes an operation for sending the plurality of high resolution pixels from the buffer for presentation on a display. Thus, the provided methods are capable of rendering higher resolution images that are sent for display without the large and sometimes expensive memory usage typically associated with rendering high resolution images. The method therefore provides a solution to the technical problem of increasing the image resolution associated with VR scenes while maintaining low memory usage.

In another embodiment, a graphics system includes a Graphics Processing Unit (GPU) to apply temporal supersampling to a plurality of previous frames including a low resolution sample region, wherein the temporal supersampling obtains a plurality of color values.

In another embodiment, a non-transitory computer readable medium storing a computer program executable by a processor-based system includes program instructions for receiving a fragment from a rasterizer, the fragment being associated with a low resolution sample region. The embodiment also includes program instructions for applying temporal supersampling to the fragment over a plurality of previous frames to obtain a plurality of color values. Program instructions are also provided for reconstructing, in a buffer, a plurality of high resolution pixels associated with the low resolution sample region, the plurality of high resolution pixels being based on the plurality of color values obtained via the temporal supersampling. Additionally, the embodiment provides program instructions for sending the plurality of high resolution pixels from the buffer for presentation on a display.

Other aspects of the disclosure will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the disclosure.

Drawings

The disclosure may best be understood by reference to the following description taken in conjunction with the accompanying drawings, in which:

fig. 1A and 1B illustrate presenting Virtual Reality (VR) content with two resolutions to a Head Mounted Display (HMD) user, in accordance with various embodiments.

Fig. 2A and 2B illustrate presenting VR content to an HMD user, the VR content having a foveal region, an intermediate foveal region, and a peripheral region, in accordance with certain embodiments.

Fig. 3A-3H illustrate various embodiments of point-of-regard rendering views.

Fig. 4 illustrates a multi-resolution display or screen defined by a gaze-point rendered view and an expanded view of the associated relative pixel sizes, according to some embodiments.

Fig. 5 illustrates a screen defined by a gaze point rendered view having a foveal region, an intermediate foveal region, and a peripheral region, and a conceptual scheme for reconstructing higher resolution pixels in a low resolution peripheral region, according to some embodiments.

Fig. 6 illustrates a conceptual scheme for reconstructing a set of higher resolution pixels from a low resolution sample region using temporal super sampling and pixel re-projection over multiple frames stored in a buffer, according to various embodiments.

Fig. 7 illustrates a conceptual scheme for outputting high resolution pixels using high resolution sample regions according to one embodiment.

Fig. 8 illustrates a conceptual scheme for outputting low-resolution pixels using low-resolution sample regions according to one embodiment.

Fig. 9 illustrates a conceptual scheme for outputting high resolution pixels for a static object using a low resolution sampling region by temporal super sampling according to one embodiment.

Fig. 10 illustrates a conceptual scheme for outputting high resolution pixels for a dynamic object using low resolution sample regions by temporal super sampling according to one embodiment.

Fig. 11 illustrates a conceptual model for generating higher resolution pixels from low resolution pixels used for sampling, by utilizing temporal supersampling with a regular sampling pattern.

Fig. 12 illustrates a conceptual model for generating higher resolution pixels from low resolution pixels used for sampling, by utilizing temporal supersampling with a quasi-random sampling pattern.

Fig. 13A illustrates an embodiment of reconstructing a set of 16 high resolution pixels from a low resolution sampling region used during temporal super sampling over 16 frames.

Fig. 13B illustrates an embodiment of reconstructing a set of 16 high resolution pixels from a low resolution sample region used during temporal super sampling over a number of frames less than the number of high resolution pixels.

Fig. 14 illustrates the overall flow of a method that enables reconstruction of higher resolution pixels from a low resolution sample region using color values obtained by temporal supersampling over multiple previous frames.

Fig. 15 illustrates an additional embodiment of a Head Mounted Display (HMD) that may be used with the proposed method and/or system.

FIG. 16 is a diagram of a computing system 1600 that can be used to implement various implementations described herein.

Detailed Description

The following embodiments describe methods, computer programs, and devices for increasing the final display resolution of a region within a VR scene that is associated with a lower resolution sampling region by temporally supersampling the low resolution sampling region. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be apparent, however, to one skilled in the art, that the present disclosure may be practiced without some or all of these specific details. In other instances, well known process operations have not been described in detail in order not to unnecessarily obscure the present disclosure.

The Virtual Reality (VR) environment provided by HMDs is an increasingly popular medium for consumers to interact with content and for content creators to deliver content to consumers. Furthermore, as VR scenes become more complex and displayed at higher resolutions, computational, memory, and network costs also increase. Thus, improvements to current methods of computer graphics rendering and anti-aliasing of VR scenes displayed via HMDs would be beneficial in terms of computing, memory and network resources, as well as the VR experience for the end user.

One way to reduce the computational, memory, and network costs (and associated delays) of a particular VR scene described herein is to display the VR scene using a point-of-gaze rendered view. According to one embodiment, point-of-regard rendering may define regions within the display that are displayed at a higher resolution, quality, level of detail, sharpness, frame rate, etc. than other regions. According to these and other embodiments, the region with higher resolution (or higher quality, level of detail, sharpness, frame rate) may be referred to as the foveal region, and is generally related to the location at which the user is gazing. In addition, regions that do not have the higher level of resolution may be referred to as peripheral regions or perimeter regions, and may generally be associated with regions at which the user is not gazing. Thus, the point-of-regard rendering view and/or system represents one such solution to the technical problem of reducing the computational, memory, and network costs associated with rendering a VR scene without negatively impacting the user's experience.

For regions rendered at a lower resolution (e.g., peripheral regions), the amount of pixel and/or fragment data that needs to be stored in memory to render the low resolution regions is correspondingly reduced. For example, if the resolution of a given region within a scene is reduced by a factor of 4, the amount of memory required to store pixel data for each video frame of the region within the scene will be proportionally reduced by a factor of 4. According to some embodiments, regions rendered at a lower resolution (e.g., peripheral regions) may also be referred to as undersampled regions because these regions are sampled at a lower frequency.
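As a purely illustrative example of this proportionality (assuming a 4-byte RGBA color per pixel, which is not specified by the disclosure): a 1920x1080 region stores 1920 x 1080 x 4 bytes ≈ 8.3 MB of color data per frame, whereas the same region reduced by a factor of 4 to 960x540 stores 960 x 540 x 4 bytes ≈ 2.1 MB per frame.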

As mentioned above, reducing the amount of memory used to render each video frame of a given scene would be beneficial for VR systems because memory speed typically cannot keep pace with the speed of a processor, such as a Central Processing Unit (CPU) or a Graphics Processing Unit (GPU). Therefore, reducing the resolution associated with the peripheral region within a point-of-regard rendered view, thereby reducing memory usage and maintaining consistent memory access, would be an improvement over existing VR systems. For example, one of the improvements resulting from the point-of-regard rendering system described herein may include a reduction in the delay or lag associated with rendering an interactive VR scene, which may be noticeable to the average HMD user today.

However, as the resolution of a given computer-generated scene decreases, the frequency and extent of low-resolution artifacts may increase in the form of jagged edges or lines ("jaggies"), pixelation, and other visual artifacts. Even if these low resolution regions are limited to only peripheral regions within the point-of-regard rendered view (e.g., the user's peripheral vision), the HMD user may still be able to identify certain types of aliasing caused by the reduction in resolution in these regions. It is known in the relevant art that although human peripheral vision generally cannot resolve fine detail as well as foveal vision, it is sensitive to certain types of visual inconsistencies or patterns. For example, if the resolution is reduced low enough, the user's peripheral vision will be able to detect the presence or appearance of pixelated areas, jagged edges, flicker, and other forms of aliasing or graphics artifacts. Thus, there is a need both to maintain low memory usage by rendering at relatively lower resolution in the peripheral region of the display, and to reduce the aliasing associated with low resolution regions within the point-of-regard rendering system and/or view.

The systems, methods, and devices described herein enable a point of regard rendering system and/or view to maintain a reduction in memory usage associated with low resolution regions while reducing the degree of pixelation and aliasing of those low resolution regions. In one embodiment, a system or method uses temporal supersampling on a low resolution sampling region to sample at different locations within a low resolution pixel over a specified number of past frames to create a higher resolution pixel for display. Temporal supersampling records a plurality of pixel values sampled from a plurality of temporally separated frames. It should be noted that according to some embodiments, a single buffer (e.g., within video RAM) may be used to accumulate these pixel values over time. These embodiments would have the advantage of not requiring multiple buffers (frames) of data to be maintained. Thus, the use of temporal super-sampling for low resolution sampled regions (e.g., undersampled regions or peripheral regions) provides a technical solution that can be implemented to address the pixelation and aliasing problems associated with low resolution regions without requiring substantial increases in, for example, memory usage.
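The single-buffer accumulation mentioned above can be illustrated with a minimal sketch. The 2x2 block size, the blend weight, and all of the names below are assumptions made for illustration, not details taken from this disclosure.

```cpp
// Minimal sketch: blending temporally supersampled color values for one
// low-resolution sample region into a single accumulation buffer, so that no
// per-frame history buffers need to be kept.
#include <array>
#include <cstdio>

struct Color { float r, g, b; };

// One accumulation slot per reconstructed high-resolution pixel (a 2x2 block here).
std::array<Color, 4> accumulationBuffer{};

// Blend the newest sample into the slot selected by this frame's jitter index.
// 'alpha' is an assumed blend weight; the disclosure does not fix a value.
void AccumulateSample(int jitterIndex, const Color& sample, float alpha = 0.5f) {
    Color& slot = accumulationBuffer[jitterIndex];
    slot.r = alpha * sample.r + (1.0f - alpha) * slot.r;
    slot.g = alpha * sample.g + (1.0f - alpha) * slot.g;
    slot.b = alpha * sample.b + (1.0f - alpha) * slot.b;
}

int main() {
    // Four frames, each contributing one color value sampled at a different
    // dithered position of the same low-resolution pixel.
    const Color samples[4] = {{1, 1, 1}, {0.5f, 0.5f, 0.5f}, {0, 0, 0}, {1, 1, 1}};
    for (int frame = 0; frame < 4; ++frame)
        AccumulateSample(frame % 4, samples[frame]);
    for (const Color& c : accumulationBuffer)
        std::printf("%.2f %.2f %.2f\n", c.r, c.g, c.b);
    return 0;
}
```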

For some embodiments, the foveal region may be fixed or static relative to the display. In such embodiments, the foveal region may be located toward the center of the screen or display. In other embodiments, the foveal region may be dynamically positioned relative to the screen or display. For example, in some embodiments, a foveal region may be defined to move within a display or screen in a predetermined manner or by software programming. In other embodiments, the dynamic foveal region may track or follow a user's point of gaze (POG) or gaze direction. Thus, the area within the display corresponding to the user's POG is rendered at a higher quality, level of detail, and/or clarity than areas farther away from the user's POG, without compromising the user's visual experience.

In some embodiments, the point of regard rendering will define the peripheral region within the screen or display at locations where the foveal region is not present. For example, if the foveal region is positioned toward the center of the display, the peripheral region should occupy the remainder of the display (or at least a portion thereof) toward the periphery of the display. If the foveal region moves to a different area of the display, the peripheral region should fill the rest of the display where the foveal region is not currently present.

Figs. 1A and 1B illustrate the presentation of Virtual Reality (VR) content to an HMD user 101 at two resolutions, R1 and R2. According to the embodiment shown in fig. 1A, an HMD user 101 is shown with a gaze 102 directed substantially straight ahead. That is, the HMD user 101 is shown looking forward within the VR environment 104, which may span 360 degrees horizontally.

According to the embodiment shown in figs. 1A and 1B, the gaze of the HMD user 101 is tracked by a gaze detection component (not shown) located within an HMD/computing system 103 worn by the HMD user 101. In some embodiments, the gaze information may be obtained via a camera located within the HMD that captures an image of the user's eye. The image may then be analyzed to determine a gaze point or gaze direction of the user (e.g., a location at which the user is currently looking). Thus, the HMD/computing system 103, having real-time information about the HMD user's 101 gaze 102, is able to provide a foveal region 106 aligned with the HMD user's 101 gaze 102. For example, the foveal region 106 is shown as being placed in a similar direction relative to the HMD user 101 as the HMD user's 101 gaze 102 within the VR environment 104. In addition, the foveal region 106 is shown having resolution R1.

Also shown in fig. 1A is a peripheral region 108. As mentioned above, according to some embodiments, the peripheral region 108 may be defined by a point of regard rendering method or system as a region within the display or field of view that does not coincide with the foveal region. For example, the peripheral region may be outside the foveal region, or may surround the foveal region, or may fill the remaining space/pixels of the display that are not associated with the foveal region. Furthermore, the peripheral (non-gaze) region may be defined by a lower resolution, quality, level of detail, sharpness, frame rate, etc.

Thus, according to certain embodiments, the peripheral region 108 may include a region of the VR environment 104 that is displayed to the HMD user 101 but does not correspond to the HMD user's 101 gaze 102 as detected by the HMD/computing device 103. Thus, the peripheral region 108 may be displayed to the HMD user 101 at a resolution R2 that is different from the resolution R1.

According to some embodiments, for a given VR scene, the resolution R1 may be higher than R2. In these embodiments, the foveal region 106 may be provided with a higher resolution rendering than the peripheral region 108 without compromising the visual experience of the HMD user 101. In general, the human visual system is only able to perceive finer detail within a region spanning about 5 degrees horizontally and about 5 degrees vertically relative to a person's gaze point. This portion of the field of view is projected onto a region within the retina called the fovea. As the angular distance from the user's central gaze direction or gaze point increases, visual acuity (e.g., the ability to perceive fine details) drops dramatically. This physiological phenomenon is referred to herein as foveation.

Point-of-regard rendering exploits the phenomenon of foveation by providing a configuration, format, and paradigm of rendering, post-rendering, and/or graphics processing for display in which one or more regions (e.g., foveal regions) are defined by a higher level of resolution, a higher level of detail, a higher level of texture, and/or a higher level of clarity than other regions. According to some embodiments, the foveal region is made to correspond to the region of the display that the user is currently viewing or is predicted to be viewing. In other embodiments, the foveal region may be placed in a static manner in a central region of the display at which the user will spend a significant amount of time looking. Also, as mentioned previously, the point of regard rendering may define a peripheral region of the display corresponding to where the user is not gazing or is predicted not to gaze.

Embodiments contemplated herein enable a point of regard rendering display configuration to take advantage of the physiological phenomenon of foveation by rendering and/or displaying higher quality (e.g., resolution, level of detail (LOD), clarity, frame rate) content within the display region associated with the user's in-focus field of view (e.g., the center of gaze and the surrounding field of view that projects onto the user's fovea). Additionally, embodiments contemplated herein are capable of displaying content with lower quality in regions of the display that are not associated with the center of the user's gaze (e.g., the user's peripheral field of view). Thus, under point-of-regard rendering, only a portion of a given scene may be rendered and/or processed for display at high quality or high resolution, as compared to rendering the entire display or screen at full quality or full resolution.

One of the technical advantages of point-of-regard rendering is the ability to reduce the computational and video transmission costs associated with rendering and delivering a given scene at full quality (e.g., high resolution, sharpness, level of detail, frame rate, etc.) for the entire display (e.g., every pixel on the display). Video transmission costs exist for both wired systems (e.g., High-Definition Multimedia Interface (HDMI) and/or DisplayPort implementations) and wireless systems. By rendering only a portion (e.g., 20 to 50%, 5 to 75%, 25 to 40%) of the entire display at high resolution and/or quality, computing resources (e.g., GPU, CPU, cloud computing resources) and video transmission resources (e.g., transmitting data to and from the HMD, from the computing device, and/or from the combined HMD/computing device to a remote server) may be reduced and allocated for other uses.
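As a purely illustrative calculation (the percentages below are assumptions, not figures from the disclosure): if 30% of the display is rendered at full resolution and the remaining 70% is rendered at one-quarter resolution, the number of shaded pixels falls to 0.30 + 0.70 x 0.25 = 0.475, or roughly half, of a full-resolution render of the entire display.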

According to another embodiment, the point-of-regard rendering method and/or system may enable a reduction in the amount of data required to display a given scene on the HMD even if the GPU associated with the HMD/computing device computes full resolution video frames of the scene. For example, if the GPU is associated with a computing device wirelessly connected to the HMD, the gaze point rendering methods and/or systems described herein may enable a reduction in the amount of wireless data transmitted from the computing device to the HMD for rendering certain regions of the scene.

According to the embodiment shown in fig. 1A, the foveal region 106 represents about 30% of the total display or viewable area. Although the foveal region 106 is rectangular in shape for clarity, it should be noted that the foveal region 106 may take any number of shapes without departing from the spirit and scope of the embodiments. Some contemplated embodiments are described below with reference to figs. 3A-3F. Further, although the foveal region 106 is shown as representing 30% of the total displayable or viewable area, in other embodiments the foveal region 106 may range from 5% to 75% of the total displayable or viewable area.

In one embodiment, it is contemplated that the peripheral region 108 may have a resolution R2 that is less than the resolution R1 of the foveal region 106 for at least some period of a VR scene. For example, if R1 is equal to 1920x1080 pixels (e.g., 1080p), then R2 may be equal to 960x540 pixels (e.g., 540p), or about half the number of vertical pixels and half the number of horizontal pixels. Thus, the foveal region 106 having a resolution R1 of 1080p may be associated with an image resolution equal to about 2.074 megapixels. In contrast, the peripheral region 108 with a resolution R2 of 540p may be associated with an image resolution equal to about 0.518 megapixels, which is about 0.25 times the image resolution associated with resolution R1.
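Spelling out the arithmetic behind these figures: 1920 x 1080 = 2,073,600 pixels (about 2.07 megapixels), 960 x 540 = 518,400 pixels (about 0.52 megapixels), and 518,400 / 2,073,600 = 0.25.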

According to another embodiment, it is contemplated that the foveal region 106 may be associated with a resolution R1 of 3840x2160 (4K UHD), while the peripheral region 108 may be associated with a resolution R2 that is less than 4K UHD (e.g., 1080p, 540p, 360p, 240p, etc.). Any number of other resolutions may be used in other embodiments in accordance with the methods and systems presented herein. By way of non-limiting example, it is contemplated that the foveal region 106 can be characterized by the following resolutions R1: 2160x1200 (or 1080x1200 per eye), 1280x720 (HD), 1600x900 (HD+), 1920x1080 (FHD), 2560x1440 ((W)QHD), 3200x1800 (QHD+), 3840x2160 (4K UHD), 5120x2880 (5K UHD+), 7680x4320 (8K UHD), 16K, etc. The example resolutions discussed herein are not limiting or exhaustive, but are merely intended to illustrate certain standards that may be implemented in certain embodiments.

According to some embodiments, the resolution R2 may be any resolution that is less than R1. As a non-limiting example, R2 may be characterized by the following resolutions: 320x240 (240p), 640x360 (nHD, 360p), 960x540 (qHD, 540p), 1280x720 (HD, 720p), 1600x900 (HD+), etc. It is contemplated that, according to various embodiments, R1 and R2 may change during a VR scene and/or between different VR scenes. Furthermore, the resolutions discussed are intended to be examples only, and do not limit the various other resolutions, whether standardized or not, that may be implemented in various embodiments.

Fig. 1B illustrates the HMD user 101 directing his gaze 110 toward the upper left corner of the peripheral region 114 within the VR environment 104. According to some implementations, the gaze 110 is detected by the HMD/computing device 103, which is then enabled to provide a foveal region 112 at a location corresponding to the gaze 110 within the VR environment. That is, the gaze 110 is tracked by the HMD/computing device 103 in real time, and thus, the HMD/computing device 103 can determine where to place the foveal region within the VR environment such that the foveal region 112 is aligned with the center of gaze associated with the gaze 110. Thus, there is a transition from the location of foveal region 106 of fig. 1A to the new location associated with foveal region 112 of fig. 1B that naturally tracks or follows the change between gaze 102 of fig. 1A and gaze 110 of fig. 1B.

While certain embodiments have been shown with a dynamic foveal region that tracks the user's gaze direction, other embodiments may include a fixed foveal region that does not track the user's gaze direction.

Fig. 2A shows the HMD user 101 being presented VR content within a VR environment 210 having a foveal region 204, an intermediate foveal region 206, and a peripheral region 208. It is contemplated that some embodiments may have a foveal region 204 whose resolution R1 is greater than the resolution R2 of the intermediate foveal region 206. Further, according to some embodiments, it is expected that the resolution R2 will be greater than the resolution R3 of the peripheral region 208. Meanwhile, similar to the embodiment shown in figs. 1A and 1B, the foveal region 204 is shown in fig. 2A to occupy a region within the VR environment 210 that coincides with the HMD user's 101 instantaneous gaze 202. However, as previously mentioned, other embodiments may implement point-of-regard rendering in which the foveal region 204 and the intermediate foveal region 206 are fixed relative to the display area and do not track the user's gaze direction.

According to the embodiment shown in fig. 2A, the intermediate foveal region 206 generally surrounds the area occupied by the foveal region 204 within the VR environment 210. Thus, the intermediate foveal region 206 may coincide with a region within the VR environment 210 that is associated with an angular distance (eccentricity) of about 5° to about 60° from the center of gaze. The visual acuity associated with this portion of the field of view (e.g., the intermediate foveal region) is less than the visual acuity of the foveal region, but may still be greater than the visual acuity of the peripheral region (having an eccentricity of greater than about 60° with respect to the center of the gaze direction). Thus, the methods and systems described herein are able to provide an intermediate foveal region 206 having a resolution that is between the resolutions of the foveal region 204 and the peripheral region 208.

According to one embodiment, the foveal region 204 may have a resolution R1 characterized by 1080p, the intermediate foveal region 206 may have a resolution R2 characterized by 720p, and the peripheral region 208 may be characterized by 540p. These resolutions are merely examples, and it is contemplated that the foveal region 204 may have higher resolutions, such as 4K, 8K, 16K, and so forth. In these other embodiments, the resolution of the intermediate foveal region 206 may be less than the resolution of the foveal region 204, while the resolution of the peripheral region 208 will be less than the resolution of the intermediate foveal region 206.

It is also contemplated that the intermediate foveal region 206 will occupy the space within the VR environment 210 between the foveal region 204 and the peripheral region 208. It is also contemplated that the intermediate foveal region 206 and the peripheral region 208 track or follow the HMD user's 101 gaze 202, or track or follow the foveal region 204 within the VR environment 210. That is, the intermediate foveal region 206 and the peripheral region 208 can also shift within the VR environment 210 so as to move in real time with, or appear to move with, the foveal region 204.

Fig. 2B shows that the HMD user 101 has changed from a gaze 202 directed substantially straight ahead in fig. 2A to a gaze 203 directed toward the upper left corner of the VR environment 210. According to some embodiments, the gaze 203 is tracked by the HMD/computing system 103 via gaze detection, and thus, the HMD/computing system 103 is enabled to locate the foveal region 212 in a direction similar to the direction in which the gaze 203 is directed. The HMD/computing system 103 is also enabled to provide an intermediate foveal area 214 at a location within the VR environment 210 that surrounds the area occupied by foveal area 212.

As described above, the foveal region 212 may be made to correspond to approximately 5 to 75% of the HMD user's 101 field of view, or to 5 to 75% of the total displayable space within the VR environment 210. Additionally, according to various embodiments, the intermediate foveal region 214 may, for example, correspond to about 5 to 50% of the HMD user's 101 field of view, or to about 5 to 50% of the total viewable area of the VR environment 210. Thus, the peripheral region 216 may correspond to anywhere between 40 and 90% of the total field of view and/or the total viewable area. However, it is contemplated that, according to various embodiments, the proportion of the field of view and/or viewable area of the VR environment 210 allocated to each of the foveal region 212, the intermediate foveal region 214, and the peripheral region 216 may vary within a VR scene or between different VR scenes.

Fig. 3A-3H illustrate various embodiments of point-of-regard rendering views. For example, FIG. 3A illustrates a point-of-regard rendered display having a foveal region characterized by a circular boundary. Fig. 3B illustrates a gaze point rendered view having a foveal region characterized by an ellipse, an oblong or an oval, which may be used with the methods and systems described herein. Additionally, fig. 3C illustrates an embodiment of a point of regard rendering configuration, where the foveal region is shown as a rectangular shape with rounded corners.

Figs. 3D and 3E illustrate embodiments of a point-of-regard rendering view with a circular foveal region. Fig. 3D additionally shows an intermediate foveal region, also circular in shape, located outside the foveal region between the foveal region and the peripheral region. Further, fig. 3E illustrates two intermediate foveal regions arranged in a nested manner. It is contemplated that, in general, any number of intermediate foveal regions may be utilized in various embodiments, with each successive intermediate foveal region that is farther from the foveal region having a progressively lower quality (e.g., resolution, sharpness, level of detail, frame rate, refresh rate) associated with it. It is also contemplated that although the intermediate regions are shown as having a similar shape to the foveal region within the point-of-regard rendered display, in other embodiments such similarity is not required. For example, the intermediate regions of figs. 3D and 3E may be characterized by shapes other than circles.

Fig. 3F illustrates an embodiment of a point-of-regard rendered view and/or display having a dynamic foveal region bounded by a bounding box. In these and other embodiments, the foveal region may track the user's gaze such that the foveal region is shown within an area of the display and/or view that is consistent with the HMD user's gaze direction, so long as the user's gaze remains within the area characterized by the bounding box. Thus, the foveal region may track the user's gaze up until the gaze moves outside of the bounding box. According to some embodiments, when the gaze falls outside the bounding box, the foveal region may still attempt to track the gaze by shifting to a location within the bounding box that is determined to be closer to the gaze than other locations. Of course, the geometries and shapes shown in figs. 3A-3F are intended to be exemplary and not limiting. For example, any number of other shapes or boundaries (including squares, trapezoids, diamonds, and other polygons) may be used to define the foveal region and/or the intermediate foveal region in accordance with the methods and systems described herein.

In general, each of the embodiments shown in figs. 3A-3E may have a foveal region that is 'fixed' relative to the display and/or view, or one that dynamically tracks the user's gaze when viewing the respective gaze point rendered view and/or display. For example, for some types of VR content, it may be expected that the HMD user will look straight ahead for most of the VR session. Thus, certain embodiments may use a foveal region that is fixed relative to the display and/or view of the VR environment.

Fig. 3G illustrates a VR scene 300 generated using point-of-gaze rendering according to the methods and systems described herein. The point-of-regard rendering produces a foveal region 302 and a plurality of intermediate foveal regions 304-310. In fig. 3G, the number of intermediate foveal regions 304-310 is arbitrary, with the resolution of each intermediate foveal region gradually decreasing as it is displayed farther from the foveal region. For example, the intermediate foveal region 306 may be accompanied by anywhere between 1 and 100 additional intermediate foveal regions.

Fig. 3H depicts various exemplary relationships between the resolution of a display region and the distance of that region from the foveal region or gaze point. For example, curve 312 may describe a point-of-regard rendered display having only a foveal region and a peripheral region. Curve 314 depicts a point-of-regard rendered display having a parabolic relationship between resolution and distance away from the foveal region. Curve 316 depicts a step function in which resolution decreases as the distance from the foveal region increases. In addition, curves 318 and 320 describe linear and sigmoidal relationships, respectively, between resolution and distance from the foveal region. Thus, the point of regard rendering systems contemplated herein are capable of rendering any number of intermediate foveal regions at various resolutions, with each intermediate foveal region farther removed from the foveal region having a lower resolution.

Fig. 4 illustrates a display or screen defined by a gaze-point rendered view 400 and an expanded view 408 of the associated relative pixel sizes, according to some embodiments. For example, the point-of-regard rendered view 400 is shown as including a foveal region 402 having a resolution R1, an intermediate foveal region 404 having a resolution R2, and a peripheral region 406 having a resolution R3. It is contemplated that the resolutions of the regions 402, 404, and 406 will generally have the relationship R1 > R2 > R3, although other relationships are possible.

An expanded view 408 of the three regions 402-406 is shown, including the relative pixel sizes of a foveal region pixel 410, an intermediate foveal region pixel 412, and a peripheral region pixel 414. As mentioned above, the resolution R1 of the foveal region 402 may be greater than the resolution R2 of the intermediate region 404, and thus, the size of the foveal region pixel 410 should be smaller than the size of the intermediate foveal region pixel 412. In the embodiment of fig. 4, as just one example, the intermediate region pixel 412 is shown as being about 4 times larger in size than the foveal region pixel 410. That is, the intermediate foveal region pixel 412 may occupy, fill, or map to the same amount of screen display area as 4 foveal region pixels 410. Thus, for example, if the foveal region pixels 410 correspond to original resolution pixels, each of the intermediate region pixels 412 may be associated with 4 original pixels.

According to some embodiments, even though the middle region pixel 412 may include or be associated with more than one (e.g., 4, or 9, or 16, or any other number) original/physical pixel, the middle region pixel 412 may still be referred to as one (lower resolution) pixel because it is considered by the graphics pipeline as a single pixel during at least part of the rendering process. For example, the graphics pipeline of the VR system may store only one color value for each pixel 412 of the rendered frame. When the VR system then continues to display pixel 412, it may then map or project the color values stored for pixel 412 to each of the 4 original pixels. Thus, "low resolution pixels" or "large pixels" may be used herein to refer to elements for a final view that are treated as a single pixel (e.g., through a graphics pipeline) by being associated with only one color value for each rendered frame, but that ultimately map or project onto more than one original or physical pixel on a display (associated with the HMD).
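A minimal sketch of the mapping step just described is given below, under the assumption of a 2x2 block of native pixels per "large pixel"; the function and variable names are hypothetical and are not taken from the disclosure.

```cpp
// Sketch: projecting the single color value stored for a low-resolution
// ("large") pixel onto the block of native display pixels it covers.
// Names and the scale factor are illustrative only.
#include <vector>

struct Color { float r, g, b, a; };

void ProjectLargePixel(std::vector<Color>& nativePixels, int nativeWidth,
                       int largeX, int largeY,    // coordinates of the large pixel
                       int scale,                 // e.g. 2 => one large pixel covers 2x2 native pixels
                       const Color& storedColor)  // the one color value stored per frame
{
    for (int dy = 0; dy < scale; ++dy)
        for (int dx = 0; dx < scale; ++dx)
            nativePixels[(largeY * scale + dy) * nativeWidth + (largeX * scale + dx)] = storedColor;
}

int main() {
    const int nativeWidth = 4, nativeHeight = 4;   // a tiny 4x4 native display
    std::vector<Color> nativePixels(nativeWidth * nativeHeight);
    // The large pixel at (1, 0) maps to the 2x2 native block whose corner is (2, 0).
    ProjectLargePixel(nativePixels, nativeWidth, 1, 0, 2, Color{0.5f, 0.5f, 0.5f, 1.0f});
    return 0;
}
```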

Further depicted in fig. 4 is a peripheral pixel 414, which is shown to be of even lower resolution than the intermediate pixel 412. For example, peripheral pixels 414 are shown as being 4 times the size of intermediate pixels 412 and 16 times the size of foveal pixels 410. Thus, a peripheral pixel 414 may include 16 original pixels and may also be considered a low resolution pixel or large pixel, because the peripheral pixel 414 is treated as a single pixel that stores only one color value per frame while projecting onto more than one original pixel.

Fig. 5 shows a representative display 500 defined by a gaze point rendered view having a foveal region and a peripheral region. The foveal region is shown as including an array of representative pixels 502 that may correspond to the original pixels of the display 500. The peripheral region is shown as including, for example, low resolution pixels 504, each of which may be associated with 4 original pixels. According to the illustrated embodiment, each of the high-resolution pixels 506a, 506b, 506c, and 506d of frame N is extracted from corresponding pixel data 508a, 508b, 508c, and 508d stored in a temporal buffer 510 for frame N.

Certain aspects of the embodiments described herein allow "low resolution" pixels to be rendered at a higher resolution by extracting pixels associated with lower resolution pixels from low resolution pixel data stored in a temporal buffer. For example, low resolution pixels 512 are shown to include original pixels 514a, 514b, 514c, and 514d, each of which is extracted from low resolution pixel values stored for a different frame. In particular, the original pixel 514a is extracted from the pixel data 516a, which is the pixel value of the low-resolution pixel 512 at frame N, obtained by dithering and sampling at the upper left corner of the low-resolution pixel.

Original pixel 514b is shown as being extracted from pixel data 516b, which includes a pixel value for the same low-resolution pixel 512, but from a previous frame (N-1) and a different dither position (e.g., the upper right corner). In addition, original pixel 514c is extracted from pixel data 516c, which includes the pixel value obtained from frame N-2 at yet another dither position (e.g., the lower left corner). Further, original pixel 514d is shown as being extracted from pixel data 516d, which includes the pixel value sampled for frame N-3 at the lower right dither position.

Thus, a higher resolution display output for the low resolution pixel 512 may be achieved by extracting the original pixels 514a through 514d from the correspondingly dithered pixel data 516a through 516d stored in the temporal buffer for the previous frames, without having to increase the number of pixel values that must be stored for each frame. For example, for low resolution pixel 512, only one color value is stored for each of frames N, N-1, N-2, and N-3. In contrast, 4 pixel values are stored for the group of high resolution pixels 506a-506d for frame N alone (and likewise 4 pixel values for each of frames N-1, N-2, N-3, and so on).

That is, the number of pixel values required per frame for the group of 4 high resolution pixels 506a to 506d is 4 pixel values. In contrast, the group of 4 pixels 514a to 514d associated with the low resolution pixel 512, although having "the same resolution" as the group of high resolution pixels 506a to 506d, only needs one pixel value or color to be stored per frame in the temporal buffer. Accordingly, embodiments described herein enable higher resolution pixels to be displayed and/or constructed in low resolution pixel regions without increasing the number of pixel values stored in the temporal buffer 510 per frame (e.g., without increasing memory usage). According to some embodiments, the process of extracting the original pixels 514a-514d from the temporally distributed low resolution pixel data stored in the temporal buffer may utilize temporal supersampling to sample different locations of the low resolution pixel.
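By way of illustration only, the reconstruction described above can be sketched as below, assuming a 2x2 dither pattern and a temporal buffer holding one color value per frame for the low-resolution pixel; the names and the frame-to-quadrant assignment are hypothetical.

```cpp
// Sketch: rebuilding a 2x2 block of higher-resolution pixels (cf. original
// pixels 514a-514d) from four color values, one stored per frame (N, N-1,
// N-2, N-3), each sampled at a different dithered position within the same
// low-resolution pixel. Names are illustrative only.
#include <array>

struct Color { float r, g, b; };

// temporalBuffer[k] holds the single color value stored for the low-resolution
// pixel at frame N-k; ditherQuadrant[k] records which quadrant was sampled at
// frame N-k (0 = upper-left, 1 = upper-right, 2 = lower-left, 3 = lower-right).
std::array<Color, 4> ReconstructHighResBlock(const std::array<Color, 4>& temporalBuffer,
                                             const std::array<int, 4>& ditherQuadrant) {
    std::array<Color, 4> highResPixels{};
    for (int k = 0; k < 4; ++k)
        highResPixels[ditherQuadrant[k]] = temporalBuffer[k];   // scatter by dither position
    return highResPixels;
}

int main() {
    // Frame N sampled the upper-left corner, N-1 the upper-right, N-2 the
    // lower-left, and N-3 the lower-right, mirroring the example of fig. 5.
    const std::array<Color, 4> buffer = {{{1, 1, 1}, {0.5f, 0.5f, 0.5f}, {0, 0, 0}, {1, 1, 1}}};
    const std::array<int, 4> quadrants = {0, 1, 2, 3};
    std::array<Color, 4> block = ReconstructHighResBlock(buffer, quadrants);
    (void)block;   // 'block' now holds one color per original (native) pixel
    return 0;
}
```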

Fig. 6 illustrates a conceptual scheme for creating a set of higher resolution pixels from low resolution pixels 600 using temporal supersampling 601 of low resolution pixels 600 and pixel re-projection from a temporal supersampling history of low resolution pixels 600, according to various embodiments. For example, the low resolution pixels 600 are shown divided into 4 regions corresponding to 4 high resolution pixels. During the temporal supersampling 601 of the object 602, the low resolution pixel 600 is shown sampled at different locations for each of the four frames N, N-1, N-2, and N-3. For clarity, in the illustrated embodiment, the sampling location is exactly in the center of each grid area of the low resolution pixel 600.

According to various embodiments, a sampled pixel 604 of the object 602, which has a sampling location within the upper left region of the low-resolution pixel 600 for frame N, returns a pixel value that is stored in memory and subsequently projected to a corresponding high-resolution pixel 612. Thus, according to the illustrated embodiment, the sampled pixels 604 of the object 602 may return pixel values corresponding to a background color (e.g., white). Likewise, the sampled pixel 606 of frame N-1 may return a color value corresponding to the color of the object 602. Thus, the associated graphics pipeline may project the sampled color value (e.g., gray as shown) to the upper right high resolution pixel 614. The same process may be repeated for frames N-2 and N-3, with sampled pixel 608 projecting the corresponding sampled color value to the lower left corner high resolution pixel 616 and sampled pixel 610 projecting the corresponding sampled color value to the lower right corner high resolution pixel 618.

According to some embodiments, the projection of the sampled pixel values from sampled pixels 604 through 610 is achieved by having information about the dithered sample position and the corresponding screen coordinates or pixel coordinates of that sample position. For example, a graphics pipeline of a VR system compatible with the embodiments described herein may determine where to project a respective sampled color value based on information about the jitter and information about the corresponding screen coordinates or pixel coordinates of a given sample location.

In one embodiment, temporal antialiasing reprojection may be used to perform the reprojection shown in fig. 6. For example, in some embodiments, the following exemplary equations may be used:

world position = (inverse projection of current view) x screen position    (1)

previous screen position = (previous view projection) x world position    (2)

uv = 1/2 x (previous screen position.xy / previous screen position.w) + 1/2    (3)

According to some implementations, equations (1) through (3) above may be used to sample a previous frame stored in a temporal buffer. For example, equation (1) maps the current pixel back to world space. Equation (2) projects that position onto the previous frame using the camera (view projection matrix) of the previous frame, and equation (3) converts the previous screen position to uv coordinates that can be used to sample the previous frame in the temporal buffer. Thus, the associated graphics pipeline will know the location at which to sample the previous frame (e.g., frames N-1, N-2, N-3, etc.). For example, the dashed lines shown in fig. 6 may represent re-projections using the derived uv coordinates to determine the location at which each previous frame is to be sampled. In particular, re-projection 620 may result in a change in sample position (e.g., 0.5 pixels in x) between sample pixels 604 and 606 for frames N and N-1, respectively. Likewise, re-projections 622 and 624 may cause the uv coordinates to be dithered, changing the sample position between sample pixels 606 and 608 (e.g., -0.5 pixels in x and -0.5 pixels in y) and between sample pixels 608 and 610 (e.g., +0.5 pixels in x). According to some embodiments, the sampling positions may be defined by dithering.
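As a minimal sketch, equations (1) through (3) could be written in code as follows; the small Vec4/Mat4 helper types and all names are assumptions made here for illustration, and any linear-algebra library providing homogeneous 4-vectors could be used instead.

```cpp
// Sketch of equations (1)-(3): reproject a current-frame screen position into
// the uv coordinates used to sample the previous frame in the temporal buffer.
// The Vec4/Mat4 helpers and the matrix names are illustrative assumptions.
#include <array>

struct Vec4 { float x, y, z, w; };
struct Vec2 { float u, v; };

struct Mat4 {
    std::array<float, 16> m;   // row-major 4x4 matrix
    Vec4 operator*(const Vec4& v) const {
        return { m[0] * v.x + m[1] * v.y + m[2] * v.z + m[3] * v.w,
                 m[4] * v.x + m[5] * v.y + m[6] * v.z + m[7] * v.w,
                 m[8] * v.x + m[9] * v.y + m[10] * v.z + m[11] * v.w,
                 m[12] * v.x + m[13] * v.y + m[14] * v.z + m[15] * v.w };
    }
};

Vec2 ReprojectToPreviousFrameUV(const Vec4& screenPosition,               // current screen position
                                const Mat4& inverseCurrentViewProjection,
                                const Mat4& previousViewProjection) {
    // (1) world position = (inverse projection of current view) x screen position
    Vec4 worldPosition = inverseCurrentViewProjection * screenPosition;
    // (2) previous screen position = (previous view projection) x world position
    Vec4 previousScreenPosition = previousViewProjection * worldPosition;
    // (3) uv = 1/2 x (previous screen position.xy / previous screen position.w) + 1/2
    return { 0.5f * (previousScreenPosition.x / previousScreenPosition.w) + 0.5f,
             0.5f * (previousScreenPosition.y / previousScreenPosition.w) + 0.5f };
}

int main() {
    const Mat4 identity{{1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1}};
    // With identical current and previous cameras, uv is simply the remapped screen position.
    Vec2 uv = ReprojectToPreviousFrameUV({0.25f, -0.5f, 0.0f, 1.0f}, identity, identity);
    (void)uv;
    return 0;
}
```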

Fig. 7 illustrates a conceptual scheme for outputting a set of four high-resolution pixels 712 using a set of high-resolution pixels 704 for sampling 706 an object 702 over a plurality of frames 700, according to one embodiment. In the illustrated embodiment, the object 702 remains static across the plurality of frames 700 (e.g., N, N-1, ..., N-7). The set of high resolution pixels 704 is shown as being sampled at a location corresponding to the center of each of the high resolution pixels 704 used to sample 706 the object 702. Each of the sample locations during the sampling 706 produces a color value that is stored in memory. For example, since there is a sample location in each of the high resolution pixels 704 during the sampling 706, there will be 4 stored color values 708 per set of high resolution pixels 704 per frame.

Also shown in fig. 7 is a sampled color value 710 for each of the four high resolution pixels 704 of each of the plurality of frames 700. For clarity, the object 702 may correspond to black, and thus, the sampled color values 710 of two of the four high-resolution pixels 704 may return color values of black. The remaining two high resolution pixels may return a color value corresponding to white. The resulting color output/rendered image 712 will reflect the sampled color values 710 for the set of high resolution pixels 704 and may be correspondingly displayed on an associated screen within a Head Mounted Display (HMD). According to some embodiments, the color output 712 for the set of high resolution pixels 704 may correspond to the original pixels and be mapped to a foveal region of the display.

Fig. 8 shows a conceptual scheme for rendering an object 802 using a low-resolution pixel 804 over a plurality of frames 800. The object 802 is shown as being static over the plurality of frames 800. The low resolution pixel 804 is shown to be about 4 times the size of the high resolution pixels 704 of fig. 7. For example, if each of the high-resolution pixels 704 corresponds to an original pixel of a display of the HMD, the low-resolution pixel 804 may include 4 original pixels.

During sampling 806, the low-resolution pixel 804 is shown as being sampled at its center in each frame, resulting in the storage of 1 color value 808 per frame. The sampled color value 810 (e.g., black) for the entire low-resolution pixel 804 is shown as being derived from the sampling 806 of the object 802 using the low-resolution pixel 804. The output color/rendered image 812 for the plurality of frames 800 is shown reflecting the sampled color value 810. For example, the output color/rendered image 812 is shown as a single black 'large pixel'. According to some embodiments, the output color/rendered image 812 may be mapped to a peripheral region of the HMD display associated with a lower resolution.

Fig. 9 shows a conceptual scheme of using temporal supersampling 906 and reprojection 912 of a low-resolution pixel 904 to achieve a color output/rendered image 914 having a higher resolution relative to the low-resolution pixel 904. Similar to figs. 7 and 8, the object 902 is shown as being static over a number of frames 900 (frames N, N-1, ..., N-7). Meanwhile, similar to the low-resolution pixel 804 of fig. 8, a low-resolution pixel 904 is shown being used to render the object 902. However, unlike the sampling 806 used in fig. 8, which samples the center of the low resolution pixel 804 in each frame, the temporal supersampling 906 illustrated in fig. 9 is shown to sample at a different location in each frame, over a repeating period of 4 frames of the plurality of frames 900.

For example, the temporal supersampling 906 may indicate that sampled low resolution pixel 916a has a sample position toward the upper left corner of the low resolution pixel 904 for frame N-7. For the next frame N-6, the temporal supersampling 906 may define a sample position in the upper right quadrant of sampled low resolution pixel 918a. For subsequent frames N-5 and N-4, the sample positions are shown in the lower left quadrant of sampled low resolution pixel 920a and the lower right quadrant of sampled low resolution pixel 922a, respectively. The illustrated pattern of sampling locations is shown to repeat for sampled low resolution pixels 924a to 930a over the subsequent 4 frames N-3 to N.

It should be noted that in accordance with the illustrated embodiment, the temporal supersampling 906 of each of the sampled low resolution pixels 916a through 930a results in each of the plurality of frames 900 storing only 1 color value 908. This is in contrast to the number of stored color values 708 (e.g., 4) sampled using the high resolution pixels 704 shown in fig. 7.

Also shown in fig. 9 are the sample colors and locations 910 for frames N-7 through N resulting from the temporal supersampling 906. Each of the sample color and location data 916b through 930b conceptually represents content stored by the associated graphics pipeline into memory, including a color value (e.g., black or white) and a corresponding location (e.g., screen coordinates). Thus, for example, the sample color and location data 930b may include a color value corresponding to white and screen coordinate data corresponding to the lower right pixel of the group of four higher resolution pixels (e.g., original pixels) to which the low resolution pixel 904 is mapped. In addition, the sample color and position data 928b includes a color value corresponding to black and screen coordinate data corresponding to the lower left pixel of the high resolution pixel group to which the low resolution pixel 904 is mapped. In a similar manner, sample color and position data 926b and 924b may include color values corresponding to black and white, respectively, and screen coordinate data corresponding to the upper-right and upper-left pixels, respectively, of the high-resolution pixel group to which the low-resolution pixel 904 is mapped on the screen or display.

According to the illustrated embodiment, the color output/rendered image 914 results from a re-projection 912 of the sampled colors and positions 910 over a window of 4 frames. For example, the rendered image 932 for frame N includes a set of 4 high resolution pixels that are constructed from the sampled color and position data 930b, 928b, 926b, and 924b of frames N, N-1, N-2, and N-3, respectively. In particular, by mapping the color value stored in sample color and position data 930b to the bottom-right pixel in rendered image 932, the associated graphics pipeline is enabled to construct the high-resolution rendered image 932. Likewise, the color values stored in the sample color and position data 928b, 926b, and 924b are mapped to the lower left pixel, the upper right pixel, and the upper left pixel, respectively. Thus, a high resolution rendered image 932, similar to the color output/rendered image 712 of fig. 7 (which was sampled using high resolution pixels), is achieved without the need to store four color values for each frame as in the rendering process shown in fig. 7.
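The mapping just described can be sketched in a few lines. The following is a minimal, illustrative reconstruction of one 2x2 block of higher-resolution pixels from four per-frame (sub-pixel location, color) records, under the assumption of a static scene; the function and data names are hypothetical and not from the source.

```python
# A minimal sketch of reconstructing a 2x2 high-resolution block from the
# single (location, color) record stored for each of the last four frames.
# Sub-pixel coordinates are in [0, 1) within the low-resolution pixel.

def reconstruct_2x2(samples):
    """samples: list of ((sub_x, sub_y), color) records, oldest first."""
    block = [[None, None], [None, None]]
    for (sub_x, sub_y), color in samples:
        col = 0 if sub_x < 0.5 else 1
        row = 0 if sub_y < 0.5 else 1
        block[row][col] = color   # the newest record for a cell wins
    return block

# Frames N-3..N sampled the upper-left, upper-right, lower-left, and
# lower-right quadrants, analogous to data 924b, 926b, 928b, and 930b.
samples = [((0.25, 0.25), "white"), ((0.75, 0.25), "black"),
           ((0.25, 0.75), "black"), ((0.75, 0.75), "white")]
print(reconstruct_2x2(samples))   # [['white', 'black'], ['black', 'white']]
```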

A similar re-projection 912 is shown to produce each of the rendered images 934 through 940. For example, rendered image 940 for frame N-4 is shown as being constructed, or re-projected, from sampled color and location data 916b through 922b. Although each rendered image 932-940 of fig. 9 is shown as being based on 4 frames for illustrative purposes, it should be noted that the use of temporal supersampling to render a given object may be extended to any number of previous frames. Furthermore, although each of the rendered images 932-940 of fig. 9 is based on one current frame and 3 previous frames, this is not required. For example, in certain other embodiments, rendered image 932 may be constructed or re-projected based on the sampled color and location data 928b, 926b, 924b, and 922b of frames N-1, N-2, N-3, and N-4, respectively. Thus, the illustrated dependencies of rendered images 932-940 are meant to be exemplary and not limiting.

According to some embodiments, the sample locations shown in the temporal supersampling 906 may be determined by the re-projection 912. That is, the sample position for a previous frame (e.g., frame N-1) may be determined from the screen coordinates of the current frame (e.g., the bottom-right pixel of frame N). As shown in fig. 9, re-projecting from the screen coordinates of the current frame may determine the sampling location shown in sampled low resolution pixel 928a, which lies in the lower left quadrant of low resolution pixel 904 and is mapped to the lower left high resolution pixel of the color output/rendered image 914. It should be noted that, although a regular sampling grid pattern is shown in fig. 9 for clarity, the details of the sampling pattern (such as the period length and the specific locations visited each frame) may differ between embodiments.
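As a worked illustration of determining a previous frame's sample location from current screen coordinates, the sketch below assumes a static scene (so re-projection reduces to an identity motion) and a 2x2 mapping from a low-resolution sampling region to high-resolution pixels; all names are hypothetical.

```python
# A minimal sketch, assuming a static scene, of deriving which low-resolution
# region to sample and at what sub-pixel offset, given the screen coordinates
# of the high-resolution pixel a previous frame is meant to fill. 'scale' is
# the number of high-resolution pixels the region spans per axis (2 here).

def sample_location_for(target_px: int, target_py: int, scale: int = 2):
    region = (target_px // scale, target_py // scale)
    # Sample at the centre of the sub-pixel cell that covers the target pixel.
    offset = ((target_px % scale + 0.5) / scale,
              (target_py % scale + 0.5) / scale)
    return region, offset

# The lower-left high-resolution pixel of the first 2x2 block (like 928a):
print(sample_location_for(0, 1))   # ((0, 0), (0.25, 0.75))
```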

It should be noted that the rendering using temporal supersampling 906 and pixel reprojection 912 shown in fig. 9 enables a higher resolution color output/rendered image 914 than the rendering shown in fig. 8, while requiring similar memory usage (e.g., 1 stored color value per frame per low resolution pixel). This is evident by comparing the color output/rendered image 914 produced by the rendering process that uses temporal supersampling 906 with the color output/rendered image 944 produced by a rendering process that does not use temporal supersampling.

Thus, embodiments described herein achieve the technical benefit of improving the functionality of VR systems by increasing resolution without increasing memory usage (e.g., the number of color values stored per pixel per frame). The increase in resolution also reduces aliasing associated with low resolution regions of the point of regard rendering system, which may improve the overall quality of experience for the viewer. For example, if the rendering process shown in fig. 9 is performed on an entire region composed of many low-resolution pixels 904, the improvements in overall resolution, picture quality, level of detail, sharpness, and anti-aliasing may be even more pronounced than those shown in fig. 9.

Fig. 10 illustrates a conceptual scheme of a rendering process that uses temporal supersampling 1006 to output a higher resolution rendered image 1014 from low resolution pixels 1004 used to sample a dynamic object 1002 over multiple frames 1000, according to one embodiment. Similar to the rendering process of fig. 9, the low resolution pixels 1004 are sampled at a different location in each frame of a 4-frame window. According to some embodiments, the sampling locations shown in the temporal supersampling 1006 may be determined by re-projection 1012, which may calculate the jitter used to sample the previous frame. In addition, similar to the rendering process of fig. 9, the rendering process shown in fig. 10 stores only one color value per frame for the low-resolution pixel 1004 in the stored colors and locations 1010.

The resulting rendered image 1014 is shown as an improvement, in terms of resolution, level of detail, and aliasing, over the rendered image 1016 produced without the use of temporal supersampling.

Fig. 11 illustrates a conceptual model for generating higher resolution pixels associated with a rendered image 1112 from low resolution pixels 1104 used for sampling, by utilizing temporal supersampling 1106 together with reprojection 1110 and blending 1111, according to one embodiment. The object 1102 is shown moving in y over a period of 12 frames (e.g., frames N-11, N-10, ..., N). The object 1102 is temporally supersampled 1106 at a location within the low resolution pixel 1104 in each frame, according to the sampling pattern 1114. As previously mentioned, it is contemplated that the sample locations shown in the temporal supersampling 1106 may be determined or defined by the reprojection 1110.

In accordance with the illustrated embodiment, temporal supersampling 1106 results in a single color value per frame, which is conceptually and visually represented by the stored colors and locations 1108. Although the temporal supersampling 1106 produces the stored color values, the location component of each of the stored colors and locations 1108 may be provided by the reprojection 1110, according to some embodiments. For example, since the re-projection 1110 may provide the next sample location for the previous frame, the re-projection 1110 will also have information on the screen coordinates corresponding to that sample location. For example, the re-projection 1110 may determine, from the stored color and position data 1126 for frame N, that the next sampling position should be shifted by -0.5 pixels in x in order to sample the lower left quadrant in the previous frame N-1. Therefore, the re-projection 1110 will have information about the screen coordinates of the next sample location for frame N-1.

According to the illustrated embodiment, each of the rendered images 1116-1124 is shown to be the result of re-projection 1110 and blending 1111 based on a respective current frame and 7 previous frames. For some implementations, blending 1111 is performed by a pixel shader of the associated graphics pipeline. For example, rendered image 1116 is shown to be based on color values 1126 through 1140 for frames N through N-7. However, since there are 8 color values 1126 through 1140 mapped to the 4 high resolution pixels 1116a through 1116d of the rendered image 1116, there is redundancy in the stored color values with respect to the high resolution pixels to which they are mapped. For example, color value 1130 (white) and color value 1138 (black) are both mapped to high resolution pixel 1116c. According to some embodiments, the blending 1111 may calculate the final color of the high-resolution pixel 1116c based on the two color values 1130 and 1138. According to the illustrated embodiment, the final color of the high-resolution pixel 1116c is shaded gray, representing an intermediate color or mixture of color values 1130 and 1138.

In some embodiments, the average of the color values 1130 and 1138 may be calculated by the pixel shader during blending 1111 and used as the final color value for the high-resolution pixel 1116c. In other embodiments, different color values may contribute differently to the final color value, i.e., the color values may be weighted differently. For example, according to some embodiments, an exponential function may be used to describe the contribution of a given color value to the final color value over time or over a number of frames, such that color values associated with newer frames have (exponentially) greater weight than color values associated with older frames. Thus, according to the embodiment shown in fig. 11, high-resolution pixel 1116c may have a final color value that is closer to color value 1130 than to color value 1138, because color value 1130 is associated with a more recent frame (e.g., frame N-2) while color value 1138 is associated with a frame 4 frames earlier (e.g., frame N-6).
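To make the exponentially weighted blend concrete, the following is a minimal sketch under assumed parameters; the decay constant and the per-channel averaging are illustrative choices, not values from the source.

```python
import math

# A minimal sketch of blending several samples that map to the same
# high-resolution pixel with exponentially decaying weights, so that a color
# from a more recent frame dominates one from an older frame. The decay
# constant 0.5 is an assumption for illustration.

def blend(samples, decay=0.5):
    """samples: list of (age_in_frames, (r, g, b)) tuples."""
    weights = [math.exp(-decay * age) for age, _ in samples]
    total = sum(weights)
    return tuple(
        sum(w * color[i] for w, (_, color) in zip(weights, samples)) / total
        for i in range(3)
    )

# White sampled at frame N-2 (age 2) blended with black at frame N-6 (age 6):
print(blend([(2, (255, 255, 255)), (6, (0, 0, 0))]))   # a gray closer to white
```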

According to the illustrated embodiment, high resolution pixel 1116a has a final color value mapped from color values 1126 and 1134. However, since color value 1126 is associated with the most recent frame (e.g., frame N), color value 1126 is associated with a greater weight than color value 1134. Thus, the final color of the high resolution pixel 1116a is closer in color to color value 1126 than to color value 1134. Also shown in fig. 11 are high resolution pixels 1116b and 1116d, mapped from color values 1128 and 1136 and from color values 1132 and 1140, respectively. According to the illustrated embodiment, no blending is required to calculate the final color value of either of the high resolution pixels 1116b and 1116d, since there is no difference between color values 1128 and 1136 or between color values 1132 and 1140, respectively.

Also shown in Fig. 11 are rendered images 1118, 1120, 1122, and 1124, each of which is constructed from color values spanning 8 frames. Note that high resolution pixel 1118a of the rendered image 1118 for frame N-1 has the same screen coordinates as high resolution pixel 1116c of the rendered image 1116 for frame N. High resolution pixels 1118a and 1116c are both shown as being constructed from color values 1130 and 1138, yet they are shown as being colored differently. For example, high-resolution pixel 1118a is shown as being colored a light shade of gray, while high-resolution pixel 1116c is shown as being colored a darker shade of gray. According to some embodiments, the high-resolution pixel 1118a may be associated with a final color value that is a lighter shade of gray because the color value 1130 (e.g., white) is associated with a more recent frame relative to rendered image 1118 (1 frame before its current frame) than relative to rendered image 1116 (2 frames before its current frame). Thus, the color value 1130, due to its relative recency, contributes more to the final color of high-resolution pixel 1118a than to the final color of high-resolution pixel 1116c.

According to the illustrated embodiment, rendered image 1120 also has a high resolution pixel 1120a that shares screen coordinates with high resolution pixels 1118a and 1116c. In addition, high-resolution pixel 1120a is shown as being constructed from the same color values 1130 and 1138 as high-resolution pixels 1118a and 1116c. However, since the color value 1130 is associated with the most recent frame relative to rendered image 1120, according to some embodiments the color value 1130 may be given a greater weight when shading high-resolution pixel 1120a than when shading either of high-resolution pixels 1118a or 1116c.

Fig. 12 illustrates a conceptual model for generating higher resolution pixels associated with a rendered image 1212 from low resolution pixels 1204 used during temporal supersampling 1206 with a quasi-random, dithered sampling pattern 1214. In the illustrated embodiment, the dynamic object 1202 is sampled within the low resolution pixel 1204 at a different location in each frame, according to the sampling pattern 1214. Sampling pattern 1214 is shown to include 8 sampling locations repeated every 8 frames, although this is not required in other embodiments. For example, in other embodiments, the sampling locations need not repeat or cycle.

Further, the sampling positions are shown to be unevenly distributed across the 4 quadrants of the low resolution pixel 1204. For example, the top-right sub-pixel is shown as being sampled at 3 locations within a window of 8 frames, while the bottom-left sub-pixel is shown as being sampled at only 1 location. A number of algorithms exist for choosing sample locations within the low resolution pixel 1204, some of which may minimize the occurrence of clustering or uneven distribution of sample locations. Thus, the sampling patterns and/or algorithms illustrated herein are illustrative and not restrictive, as any number of supersampling patterns may be used in conjunction with the embodiments described herein without departing from the spirit or scope of the embodiments. Furthermore, although the temporal supersampling embodiments for constructing a high resolution rendered image are shown to be based on one current frame and 7 previous frames, a high resolution rendered image may be constructed from any number of frames using temporal supersampling and re-projection, in accordance with various embodiments. Further, although each rendered image is shown as being constructed from pixel values associated with the current frame, in other embodiments the most recent rendered image need not be mapped from pixel values associated with the most recent frame.
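One common way to obtain a quasi-random, well-distributed jitter pattern is a low-discrepancy sequence. The sketch below uses a 2-D Halton sequence (bases 2 and 3) as an assumed example, since the source does not name a specific sequence.

```python
# A minimal sketch of generating quasi-random sub-pixel sample offsets with a
# 2-D Halton sequence. This is one possible dithered pattern that avoids
# clustered sample locations; it is an assumption, not the pattern of fig. 12.

def halton(index: int, base: int) -> float:
    result, f = 0.0, 1.0
    while index > 0:
        f /= base
        result += f * (index % base)
        index //= base
    return result

def jitter_offsets(frame_count: int):
    """(x, y) offsets in [0, 1) within the low-resolution pixel, one per frame."""
    return [(halton(i + 1, 2), halton(i + 1, 3)) for i in range(frame_count)]

for offset in jitter_offsets(8):
    print(offset)
```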

Fig. 13A illustrates an implementation in which a set of 16 high resolution pixels 1304 is reconstructed from a low resolution sampling region 1301 used during temporal supersampling 1300 over 16 frames. According to the illustrated embodiment, the low resolution sampling region 1301 maps to the set of 16 high resolution pixels 1304. Embodiments are enabled to obtain a color value for each of the high resolution pixels 1304 via temporal supersampling 1300 of the 16 sub-pixel regions corresponding to the 16 high resolution pixels 1304. For example, a different sub-pixel region may be sampled in each of the 16 frames shown, resulting in only one color value being stored per frame, as shown by the stored colors and locations 1302. As mentioned previously, the sampling pattern for the temporal supersampling 1300 may involve jitter and may be determined by re-projection (described in more detail above). Thus, the advantages and embodiments discussed herein are not limited to reconstructing 4 pixels, but may be extended to reconstructing any number of higher resolution pixels from a lower resolution sample region.
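The 4x4 case generalizes directly from the 2x2 sketches above. The following assumes a scanline visiting order (the exact order used in fig. 13A is not specified here) and simply shows that n*n frames suffice to visit every sub-pixel cell of an n x n block once.

```python
# A minimal sketch generalizing the 2x2 case: with a low-resolution region
# mapped to an n x n block, visiting one sub-pixel cell per frame in scanline
# order covers all n*n high-resolution pixels every n*n frames. The visiting
# order is an assumption for illustration.

def subpixel_cell(frame_index: int, n: int = 4):
    cell = frame_index % (n * n)
    return cell % n, cell // n          # (x, y) cell within the region

# 16 frames cover all 16 cells of a 4x4 block exactly once.
cells = {subpixel_cell(i) for i in range(16)}
print(len(cells) == 16)   # True
```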

Fig. 13B illustrates an implementation in which a set of 16 high resolution pixels 1316 is reconstructed from a low resolution sampling region 1308 used during temporal supersampling 1306 over a number of frames that is less than the number of high resolution pixels 1316. For example, temporal supersampling 1306 is shown to occur over only 8 frames, resulting in the same number of color values 1310. During reconstruction, the associated pixel shader or compute shader attempts to "fill" (e.g., draw, color) the high resolution pixel grid 1312 with the color values 1310 and their associated locations. For example, 8 of the 16 pixels of the high resolution pixel grid 1312 are associated with stored pixel values. The remaining uncolored pixels may then be blended 1314 using a nearest-neighbor method, historical color data, a combination of the two, or other means, to provide a rendered image 1316 composed of 16 high resolution pixels. Thus, the principles and advantages of the embodiments described herein may be implemented for a particular number of high resolution pixels even when the process of temporal supersampling involves sampling over fewer frames than that number of high resolution pixels. For example, embodiments contemplated herein may apply temporal supersampling over 8 frames to reconstruct a set of 16 high resolution pixels.
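The fill step can be sketched as follows; the grid representation, the nearest-neighbor rule, and the tie-breaking are illustrative assumptions rather than the shader logic of the source.

```python
# A minimal sketch of filling high-resolution pixels that received no sample:
# each empty cell borrows the color of the nearest cell that did receive one.
# Cells are None where no stored color value maps to them.

def fill_nearest(grid):
    """grid: 2-D list of colors or None; returns a fully colored copy."""
    known = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))
             if grid[r][c] is not None]
    out = [row[:] for row in grid]
    for r in range(len(grid)):
        for c in range(len(grid[0])):
            if grid[r][c] is None:
                nr, nc = min(known,
                             key=lambda k: (k[0] - r) ** 2 + (k[1] - c) ** 2)
                out[r][c] = grid[nr][nc]
    return out

partial = [["black", None], [None, "white"]]
print(fill_nearest(partial))   # [['black', 'black'], ['black', 'white']]
```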

It should be noted that where a pixel shader is mentioned, for some implementations it is also meant to refer to a compute shader. Further, although exemplary sampling patterns are shown, for example, in fig. 13A and 13B, it should be noted that any number of sampling patterns may be implemented with the embodiments described herein.

Fig. 14 illustrates the overall flow of a method that enables reconstruction of higher resolution pixels from a low resolution sample region using color values obtained by temporal supersampling over multiple previous frames. The method includes an operation 1410 for receiving a fragment from a rasterizer of an associated graphics pipeline, and an operation 1420 for applying temporal supersampling to the fragment, using lower resolution sample regions, over a plurality of previous frames stored in a buffer (such as a frame buffer). As discussed above, the temporal supersampling may sample different locations within a lower resolution sampling region based on pixel re-projection and/or dithering. Furthermore, as described above, the number of previous frames sampled may vary between implementations.

The method shown in fig. 14 then proceeds to operation 1430, which is for reconstructing the higher resolution pixels associated with the lower resolution sample area using the color values obtained via temporal super sampling. According to certain embodiments, reconstruction may occur in a final buffer (such as a display buffer) that may store color values obtained via temporal supersampling in a manner that is addressable by actual pixel locations (e.g., physical pixels). The method of fig. 14 then proceeds to operation 1440 for sending the reconstructed high resolution pixels for display (e.g., on a head mounted display).
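As an end-to-end illustration of operations 1420 through 1440, the sketch below uses plain dictionaries as stand-ins for the frame buffer and the display buffer (which is addressed by physical pixel coordinates); every name here is a hypothetical stand-in, not an API from the source.

```python
# A minimal sketch of operations 1420-1440 for one low-resolution fragment,
# with dictionaries standing in for GPU buffers. The display buffer is keyed
# by absolute screen coordinates, as described for operation 1430.

def temporal_supersample(region, frame_buffers):
    """Operation 1420: gather one (jitter, color) sample per previous frame."""
    return [(fb["jitter"], fb["colors"][region]) for fb in frame_buffers]

def reconstruct(region, samples, scale, display_buffer):
    """Operation 1430: write colors at physical high-resolution pixel addresses."""
    base_x, base_y = region[0] * scale, region[1] * scale
    for (sub_x, sub_y), color in samples:
        px, py = base_x + int(sub_x * scale), base_y + int(sub_y * scale)
        display_buffer[(px, py)] = color

# Two previous frames sampled low-resolution region (0, 0) of a 2x2 mapping;
# operation 1440 would then scan the display buffer out to the display.
frames = [{"jitter": (0.25, 0.25), "colors": {(0, 0): "white"}},
          {"jitter": (0.75, 0.75), "colors": {(0, 0): "black"}}]
display_buffer = {}
reconstruct((0, 0), temporal_supersample((0, 0), frames), 2, display_buffer)
print(display_buffer)   # {(0, 0): 'white', (1, 1): 'black'}
```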

Fig. 15 illustrates an additional embodiment of an HMD 1500 that may be used with the proposed method and/or system. HMD 1500 includes hardware such as a gaze detector 1502, a processor 1504, a battery 1506, a virtual reality generator 1508, buttons, sensors, and switches 1510, a sound positioning device 1512, a display 1514, and a memory 1516. HMD 1500 is also shown to include a location module 1528 that includes a magnetometer 1518, an accelerometer 1520, a gyroscope 1522, a GPS 1524, and a compass 1526. Also included on HMD 1500 are a speaker 1530, a microphone 1532, LEDs 1534, objects for visual recognition 1536, IR lights 1538, a front camera 1540, a rear camera 1542, a gaze tracking camera 1544, USB 1546, permanent storage 1548, a vibro-tactile feedback device 1550, a communications link 1552, Wi-Fi 1554, an ultrasound communication device 1556, Bluetooth 1558, and a photodiode (PSD) array 1560. In some embodiments, HMD 1500 may also include one or more CPUs 1562, one or more GPUs 1564, and a video memory 1566.

FIG. 16 is a diagram of a computing system 1600 that can be used to implement various implementations described herein. The computing system 1600 includes an input device 1602 for receiving user inputs. The input device 1602 may be any user-controlled device or user-responsive device, such as a mouse, touch screen, joystick, remote control, pointing device, wearable object, or head mounted display. Computing system 1600 is also shown to include a CPU 1604 that is responsible for executing applications that generate vertices and geometric data for processing and rendering by graphics system 1610. The CPU 1604 is also responsible for processing input received via the input device 1602 regarding the application. Additionally, the computing system is shown including a memory 1606 and a persistent storage 1608.

The graphics system 1610 of the exemplary computing system 1600 is shown to include a GPU 1612 in communication with a memory/VRAM 1620, which in turn is in communication with a scanner 1628. GPU 1612 is shown to include a vertex shader 1614, which receives vertex and geometry data associated with the executed application and performs operations related to geometry transformation and manipulation on the received vertex and geometry data. In some implementations, the output of vertex shader 1614 is sent to and stored in the frame buffer/temporal buffer 1622.

According to some implementations, the GPU 1612 is also shown implementing a rasterizer 1616 that converts vertex and geometry data from the output of the vertex shader 1614 into pixel data (e.g., fragment data). According to some embodiments, the rasterizer 1616 is capable of performing certain sampling functions described herein.

The GPU 1612 is also shown executing a pixel shader 1618 (also referred to as a fragment shader) that is used to obtain color values for pixels to be displayed. According to some embodiments, temporal supersampling as described herein may be performed with the aid of a pixel shader, for example, by accessing frame buffer/temporal buffer 1622. Further, according to some embodiments, pixel shader 1618 may output pixel data to be stored in display buffer 1624. In one implementation, the scanner 1628 is enabled to read pixel data stored on the display buffer and send the pixel data for display on the display 1630. Further, herein, a pixel shader refers to a pixel shader or a compute shader.

Although the method operations are described in a particular order, it should be understood that other internal management operations may be performed between the operations, or the operations may be adjusted so that they occur at slightly different times, or may be distributed in a system that allows processing operations to occur at various intervals associated with processing.

One or more embodiments may also be fabricated as computer readable code on a computer readable medium. The computer readable medium is any data storage device that can store data which can thereafter be read by a computer system. Examples of the computer readable medium include hard disk drives, Network Attached Storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-R, CD-RWs, magnetic tape, and other optical and non-optical data storage devices. The computer readable medium can include computer readable tangible media distributed over a network coupled computer system so that the computer readable code is stored and executed in a distributed fashion.

Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the embodiments are not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.
