Domain execution context masking and saving
Reading note: this technique, Domain Execution Context Masking and Saving, was devised by Jason Parker, Matthew Lucien Evans, Gareth Rhys Stockwell and Djordje Kovacevic, with a filing date of 2018-06-08. Its main content is as follows: memory access circuitry 26 enforces ownership rights for memory regions. A given memory region is associated with an owner domain specified from a plurality of domains, each domain corresponding to a portion of at least one software process executed by processing circuitry 8. In response to a domain switch from a source domain to a target domain at a more privileged exception level, state masking of a subset of the architectural state associated with the source domain is performed so that the state is inaccessible to the target domain. In response to a flush command following the domain switch, it is ensured that any architectural state data of the subset that has not yet been saved to at least one domain execution context memory region is saved.
1. An apparatus, comprising:
processing circuitry for processing software processes at one of a plurality of exception levels; and
memory access circuitry to enforce ownership rights for a plurality of memory regions, wherein a given memory region is associated with an owner domain specified from a plurality of domains, each domain corresponding to at least a portion of at least one software process, the owner domain having rights to prevent software processes processed at a more privileged exception level than the owner domain from accessing the given memory region;
wherein, in response to a domain switch from a source domain to a target domain to be processed at a more privileged exception level than the source domain, the processing circuitry is configured to: perform state masking to make a subset of architectural state data associated with the source domain inaccessible to the target domain; and
in response to a flush command following the domain switch, the processing circuitry is configured to: ensure that any architectural state data of the subset of architectural state data that has not yet been saved to at least one domain execution context memory region owned by the source domain is saved to the at least one domain execution context memory region.
2. The apparatus of claim 1, wherein, in response to the domain switch, the processing circuitry is configured to: perform a register scrubbing operation for a given architectural register storing a portion of the subset of architectural state data, to ensure that a subsequent read access to the given architectural register, performed by the processing circuitry without any intervening write access to the given architectural register between the domain switch and the subsequent read access, returns a predetermined value for the given architectural register.
3. The apparatus of claim 2, wherein the register scrubbing operation comprises at least one of:
setting a physical register corresponding to the given architectural register to the predetermined value;
remapping the given architectural register from a first physical register to a second physical register; and
setting a status value associated with the given architectural register or a physical register mapped to the given architectural register to indicate that a read access to the given architectural register should return the predetermined value.
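Purely by way of illustration, the three scrubbing variants above may be sketched as behavior over a rename-style register file. The class, method and register names below (`RegisterFile`, `scrub_by_remap`, `x0`, `p0`, ...) are invented for this sketch and form no part of the claimed apparatus.

```python
# Behavioral sketch of the three register scrubbing variants. All names
# and values here are illustrative, not part of the claimed apparatus.

PREDETERMINED = 0  # the fixed value a scrubbed register must return

class RegisterFile:
    def __init__(self):
        self.physical = {"p0": 11, "p1": 22}  # physical register storage
        self.mapping = {"x0": "p0"}           # architectural -> physical map
        self.scrubbed = set()                 # variant 3: status flags

    def read(self, arch):
        if arch in self.scrubbed:             # status value indicates the
            return PREDETERMINED              # predetermined value is read
        return self.physical[self.mapping[arch]]

    def scrub_by_write(self, arch):
        # Variant 1: set the mapped physical register to the value.
        self.physical[self.mapping[arch]] = PREDETERMINED

    def scrub_by_remap(self, arch, fresh):
        # Variant 2: remap to a fresh physical register holding the value;
        # the old physical register keeps its content for a later lazy save.
        self.physical[fresh] = PREDETERMINED
        self.mapping[arch] = fresh

    def scrub_by_flag(self, arch):
        # Variant 3: mark the register so reads return the value.
        self.scrubbed.add(arch)

rf = RegisterFile()
rf.scrub_by_remap("x0", "p2")
assert rf.read("x0") == PREDETERMINED   # subsequent read sees fixed value
assert rf.physical["p0"] == 11          # old state preserved for lazy save
```

In the remap variant, note that the source domain's value survives in the old physical register, which is what makes the lazy saving of claims 5-8 possible.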
4. The apparatus of any preceding claim, wherein the processing circuitry is configured to: in response to the domain switch, trigger saving of the subset of architectural state data to the at least one domain execution context memory region.
5. The apparatus of any of claims 1-3, wherein the processing circuitry is configured to: commence processing of the target domain while at least a portion of the subset of architectural state data made inaccessible in response to the domain switch remains stored in registers of the processing circuitry.
6. The apparatus of claim 5, wherein, in response to a return to the source domain while a given entry of the subset of architectural state data is still stored in a register, the processing circuitry is configured to: resume access to the given entry of the subset of architectural state data stored in the register.
7. The apparatus of any one of claims 1-3, 5, and 6, wherein, after commencing processing of the target domain, the processing circuitry is configured to: in response to an occurrence of a predetermined event other than the flush command, trigger saving of a given entry of the subset of architectural state data to the at least one domain execution context memory region.
8. The apparatus of claim 7, wherein the predetermined event comprises at least one of:
register access to an architectural register corresponding to the given entry of the subset of architectural state data;
a remapping of a physical register storing the given entry of the subset of architectural state data;
a number of available physical registers becoming less than or equal to a predetermined threshold;
elapse of a given number of cycles or a given period of time; and
an event indicating a reduction in processor workload.
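The lazy-save policy implied by claims 7 and 8 can be sketched as follows. This is an illustrative model only; the class name `LazySaver`, the event labels, and the threshold value are all assumptions made for the example.

```python
# Illustrative lazy-save policy: masked state stays in registers until a
# predetermined event forces a save to the domain execution context (REC).

class LazySaver:
    THRESHOLD = 2                       # free-physical-register threshold

    def __init__(self, masked):
        self.pending = dict(masked)     # masked entries not yet saved
        self.rec = {}                   # REC memory region contents

    def save(self, reg):
        if reg in self.pending:
            self.rec[reg] = self.pending.pop(reg)

    def on_event(self, event, free_regs=None, reg=None):
        if event in ("register_access", "remap") and reg is not None:
            self.save(reg)              # events tied to one specific entry
        elif event == "low_free_registers" and free_regs <= self.THRESHOLD:
            for r in list(self.pending):
                self.save(r)            # register pressure: drain the rest

ls = LazySaver({"x0": 1, "x1": 2})
ls.on_event("register_access", reg="x0")
assert ls.rec == {"x0": 1} and "x1" in ls.pending
ls.on_event("low_free_registers", free_regs=1)
assert ls.rec == {"x0": 1, "x1": 2}
```

The point of the model is that each event saves only as much as it must, deferring the rest, while the flush command of claim 1 would drain everything still in `pending`.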
9. The apparatus of any of the preceding claims, wherein, after the domain switch, the processing circuitry is configured to reject a domain entry request for entry into a domain other than the source domain when: the other domain is to be processed at the same exception level as, or a less privileged exception level than, the target domain; and no flush command has been received between the domain switch and the domain entry request.
10. The apparatus of any preceding claim, comprising an instruction decoder to control the processing circuitry in response to instructions of the plurality of software processes, wherein the flush command comprises an instruction decoded by the instruction decoder.
11. The apparatus of any of claims 1-9, wherein the flush command comprises a command triggered by a predetermined event occurring during processing of instructions of the plurality of software processes.
12. The apparatus of any preceding claim, wherein the processing circuitry is configured to: select the subset of architectural state data according to a boundary exception level associated with the source domain, the boundary exception level indicating a most privileged exception level corresponding to the source domain.
13. The apparatus of any preceding claim, wherein the processing circuitry is configured to: trigger the domain switch in response to an exception condition.
14. The apparatus of any preceding claim, wherein, in response to a predetermined type of domain switch, the processing circuitry is configured to: suppress the state masking.
15. The apparatus of claim 14, wherein the predetermined type of domain switch comprises a domain switch triggered by execution of a voluntary domain switch instruction in the source domain.
16. An apparatus, comprising:
processing means for processing software processes at one of a plurality of exception levels; and
memory access means for enforcing ownership rights for a plurality of memory regions, wherein a given memory region is associated with an owner domain specified from a plurality of domains, each domain corresponding to at least a portion of at least one software process, the owner domain having rights to prevent software processes processed at a more privileged exception level than the owner domain from accessing the given memory region;
wherein, in response to a domain switch from a source domain to a target domain to be processed at a more privileged exception level than the source domain, the processing means is configured to: perform state masking to make a subset of architectural state data associated with the source domain inaccessible to the target domain; and
in response to a flush command following the domain switch, the processing means is configured to: ensure that any architectural state data of the subset of architectural state data that has not yet been saved to at least one domain execution context memory region owned by the source domain is saved to the at least one domain execution context memory region.
17. A method of data processing, comprising:
processing software processes at one of a plurality of exception levels; and
enforcing ownership rights for a plurality of memory regions, wherein a given memory region is associated with an owner domain specified from a plurality of domains, each domain corresponding to at least a portion of at least one software process, the owner domain having rights to prevent software processes processed at a more privileged exception level than the owner domain from accessing the given memory region;
in response to a domain switch from a source domain to a target domain to be processed at a more privileged exception level than the source domain, performing state masking to make a subset of architectural state data associated with the source domain inaccessible to the target domain; and
in response to a flush command following the domain switch, ensuring that any architectural state data in the subset of architectural state data that has not been saved to at least one domain execution context memory region owned by the source domain is saved to the at least one domain execution context memory region.
18. A computer program for controlling a host data processing apparatus to provide an instruction execution environment, comprising:
handler logic to handle software processes at one of a plurality of exception levels; and
memory access program logic to enforce ownership rights for a plurality of memory regions, wherein a given memory region is associated with an owner domain specified from a plurality of domains, each domain corresponding to at least a portion of at least one software process, the owner domain having rights to prevent software processes handled at a more privileged exception level than the owner domain from accessing the given memory region;
wherein, in response to a domain switch from a source domain to a target domain to be processed at a more privileged exception level than the source domain, the handler logic is configured to: perform state masking to make a subset of architectural state data associated with the source domain inaccessible to the target domain; and
in response to a flush command following the domain switch, the handler logic is configured to: ensure that any architectural state data of the subset of architectural state data that has not yet been saved to at least one domain execution context memory region owned by the source domain is saved to the at least one domain execution context memory region.
19. A storage medium storing a computer program according to claim 18.
Technical Field
The present technology relates to the field of data processing.
Background
It is known to provide memory access control techniques for enforcing access rights to specific memory regions. Generally, these techniques are based on privilege levels such that processes executing at a higher privilege level may preclude processes with lower privilege levels from accessing a memory region.
Disclosure of Invention
At least some examples provide an apparatus comprising:
processing circuitry for processing software processes at one of a plurality of exception levels; and
memory access circuitry to enforce ownership rights for a plurality of memory regions, wherein a given memory region is associated with an owner domain specified from a plurality of domains, each domain corresponding to at least a portion of at least one software process, the owner domain having rights to prevent software processes processed at a more privileged exception level than the owner domain from accessing the given memory region;
wherein, in response to a domain switch from a source domain to a target domain to be processed at a more privileged exception level than the source domain, the processing circuitry is configured to: perform state masking to make a subset of architectural state data associated with the source domain inaccessible to the target domain; and
in response to a flush command following the domain switch, the processing circuitry is configured to: ensure that any architectural state data of the subset of architectural state data that has not yet been saved to at least one domain execution context memory region owned by the source domain is saved to the at least one domain execution context memory region.
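Purely for illustration, the combined behavior of state masking on a domain switch and the subsequent flush command may be sketched in Python. The class and field names below (`RealmSwitcher`, `saved_to_rec`, the register names) are invented for this sketch and form no part of the apparatus described above.

```python
# Hypothetical model of state masking on a domain switch, plus a flush
# command that completes any lazily deferred saves. Names are illustrative.

class RealmSwitcher:
    def __init__(self):
        self.registers = {}          # architectural register file
        self.masked = {}             # state hidden from the target domain
        self.saved_to_rec = {}       # domain execution context memory region

    def realm_switch(self, subset):
        """Mask the given subset of architectural state: the target domain
        reads a fixed value; real values stay aside for a possible lazy save."""
        for reg in subset:
            self.masked[reg] = self.registers[reg]
            self.registers[reg] = 0  # predetermined value seen by the target

    def flush(self):
        """Ensure every masked entry not yet saved reaches the REC region."""
        for reg, value in self.masked.items():
            if reg not in self.saved_to_rec:
                self.saved_to_rec[reg] = value

sw = RealmSwitcher()
sw.registers = {"x0": 42, "x1": 7, "x2": 99}
sw.realm_switch(["x0", "x1"])       # x0/x1 masked from the target domain
assert sw.registers["x0"] == 0      # target domain reads the fixed value
sw.flush()
assert sw.saved_to_rec == {"x0": 42, "x1": 7}
```

The sketch shows why the flush command matters: until it runs, masked state may still live only in registers, and the flush is what guarantees it has all reached the domain execution context memory region.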
At least some examples provide an apparatus comprising:
processing means for processing software processes at one of a plurality of exception levels; and
memory access means for enforcing ownership rights for a plurality of memory regions, wherein a given memory region is associated with an owner domain specified from a plurality of domains, each domain corresponding to at least a portion of at least one software process, the owner domain having rights to prevent software processes processed at a more privileged exception level than the owner domain from accessing the given memory region;
wherein, in response to a domain switch from a source domain to a target domain to be processed at a more privileged exception level than the source domain, the processing means is configured to: perform state masking to make a subset of architectural state data associated with the source domain inaccessible to the target domain; and
in response to a flush command following the domain switch, the processing means is configured to: ensure that any architectural state data of the subset of architectural state data that has not yet been saved to at least one domain execution context memory region owned by the source domain is saved to the at least one domain execution context memory region.
At least some examples provide a data processing method comprising:
processing software processes at one of a plurality of exception levels; and
enforcing ownership rights for a plurality of memory regions, wherein a given memory region is associated with an owner domain specified from a plurality of domains, each domain corresponding to at least a portion of at least one software process, the owner domain having rights to prevent software processes processed at a more privileged exception level than the owner domain from accessing the given memory region;
in response to a domain switch from a source domain to a target domain to be processed at a more privileged exception level than the source domain, performing state masking to make a subset of architectural state data associated with the source domain inaccessible to the target domain; and
in response to a flush command following the domain switch, ensuring that any architectural state data of the subset of architectural state data that has not yet been saved to at least one domain execution context memory region owned by the source domain is saved to the at least one domain execution context memory region.
At least some examples provide a computer program for controlling a host data processing apparatus to provide an instruction execution environment, the computer program comprising:
handler logic to handle software processes at one of a plurality of exception levels; and
memory access program logic to enforce ownership rights for a plurality of memory regions, wherein a given memory region is associated with an owner domain specified from a plurality of domains, each domain corresponding to at least a portion of at least one software process, the owner domain having rights to prevent software processes handled at a more privileged exception level than the owner domain from accessing the given memory region;
wherein, in response to a domain switch from a source domain to a target domain to be processed at a more privileged exception level than the source domain, the handler logic is configured to: perform state masking to make a subset of architectural state data associated with the source domain inaccessible to the target domain; and
in response to a flush command following the domain switch, the handler logic is configured to: ensure that any architectural state data of the subset of architectural state data that has not yet been saved to at least one domain execution context memory region owned by the source domain is saved to the at least one domain execution context memory region.
The storage medium may store a computer program. The storage medium may be a non-transitory storage medium.
Drawings
Further aspects, features and advantages of the present technology will become apparent from the following description of examples, read in conjunction with the accompanying drawings, in which:
FIG. 1 schematically illustrates a data processing system that includes a plurality of processing components that utilize memory regions stored within a first memory and a second memory;
FIG. 2 schematically illustrates the relationship between processes being performed, the privilege levels associated with the processes, and the realms associated with the processes for controlling which process owns a given memory region and thus has exclusive rights to control access to the given memory region;
FIG. 3 schematically shows a memory area under management by a domain management unit and a memory management unit;
FIG. 4 schematically illustrates a sequence of program instructions executed to output a given memory region from a first memory to a second memory;
FIG. 5 is a flow chart schematically illustrating page output;
FIG. 6 schematically illustrates a plurality of domains and their relationship within a control hierarchy to control which output commands can interrupt which other output commands;
FIG. 7 is a flow chart schematically illustrating page entry;
FIG. 8 schematically illustrates a first output command source and a second output command source performing an overlapping output operation for a given memory region;
FIG. 9 shows a more detailed example of the processing components and the domain management control data stored in memory;
FIG. 10 illustrates an example of a domain hierarchy in which a parent domain can define domain descriptors that describe the properties of various child domains;
FIGS. 11 and 12 show two different examples of domain hierarchies;
FIG. 13 illustrates an example of a domain descriptor tree that a parent domain maintains to record the domain descriptors of its child domains;
FIG. 14 illustrates an example of a local domain identifier constructed from a plurality of variable length bit portions that each provide an index to a corresponding level of the domain descriptor tree;
FIG. 15 illustrates an example of local and global domain identifiers for each domain in a domain hierarchy;
FIG. 16 shows an example of the contents of a domain descriptor;
FIG. 17 is a table showing different domain lifecycle states;
FIG. 18 is a state machine diagram indicating changes in the lifecycle states of a domain;
FIG. 19 is a table showing the contents of entries in the ownership table for a given memory region;
FIG. 20 is a table showing visibility attributes that may be set for a given memory region to control which domains other than the owner are allowed to access the region;
FIG. 21 illustrates examples of different lifecycle states for memory regions, including states corresponding to RMU-private memory regions reserved for mutually exclusive access by a domain management unit;
FIG. 22 is a state machine showing the transition of the lifecycle states for a given memory region;
FIG. 23 illustrates how ownership of a given memory region may be transferred between an ancestor domain and its descendant domains;
FIG. 24 schematically illustrates memory access control provided based on page tables defining memory control attributes that depend on privilege levels and domain management unit levels that provide orthogonal levels of control of memory access based on permissions set by the owner domain;
FIG. 25 illustrates an example of a translation look-aside buffer;
FIG. 26 is a flow chart illustrating a method of controlling access to memory based on a page table and an RMU table;
FIG. 27 illustrates states accessible to a process executing at different exception levels;
FIG. 28 is a flow chart illustrating a method of entering a domain or returning from an exception;
FIG. 29 is a flow chart illustrating a method of exiting a domain or taking an exception;
FIG. 30 illustrates an example of entering a child domain and returning to a parent domain;
FIG. 31 illustrates an example of nested domain exit and nested domain entry;
FIG. 32 illustrates an example of lazy save using a domain execution context upon exit from a domain;
FIG. 33 illustrates an example of the use of a flush command that ensures that a subset of the state associated with a previously exited child domain is saved to memory before entering a different child domain;
FIG. 34 illustrates the use of sub-realms that correspond to particular address ranges within a process associated with a parent realm of the sub-realms; and
FIG. 35 shows an example of a simulator that can be used.
Detailed Description
Fig. 1 schematically shows a data processing system comprising a plurality of processing components that utilize memory regions stored within a first memory and a second memory.
Thus, multiple memory regions are divided among multiple owner domains. Each domain corresponds to at least a portion of at least one software process and is assigned ownership of a number of memory regions. The owning process/domain has the exclusive right to control access to the data stored within the memory regions of its domain. Management and control of which memory regions are mapped into each domain is performed by a process other than the owner domain itself. With this arrangement, a process such as a hypervisor may control which memory regions (pages of memory) are contained within a domain owned by a respective guest operating system managed by the hypervisor, yet the hypervisor itself need not have the authority to actually access the data stored within the memory regions it has allocated to a given domain. Thus, for example, a guest operating system may keep the data stored within its domain (i.e. data stored within memory regions owned by the guest operating system) private from its hypervisor.
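The ownership model just described can be sketched as a small access-check function: ownership, not privilege, decides access. The function name and the visibility table below are illustrative assumptions, not part of the described apparatus.

```python
# Minimal model of ownership-based access control: the owner domain can deny
# access even to more privileged software. Names are illustrative.

def can_access(region_owner, requester, visibility):
    """A domain may access a region if it owns the region, or if the owner
    has granted it visibility; privilege level alone grants nothing."""
    return requester == region_owner or requester in visibility.get(region_owner, ())

# A hypervisor manages the mapping but is denied the guest's data:
visibility = {"guest_os": ()}            # guest grants no one visibility
assert can_access("guest_os", "guest_os", visibility)
assert not can_access("guest_os", "hypervisor", visibility)
```

Contrast this with conventional privilege-based control, where the hypervisor's higher privilege would automatically imply access to the guest's pages.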
The division of the memory address space into domains, and the control of ownership of those domains, is managed by a domain management unit (RMU), as discussed below.
The processing components comprising the
In the context of such output and input of data from a memory region, it will be appreciated that a first memory, such as the on-chip memory 16, may be more secure than a second memory, such as an off-chip external memory.
The output process may be accompanied by the generation of metadata that specifies characteristics of the output data. This metadata may be separately stored within a metadata memory region of the first memory (on-chip memory 16), where the metadata is kept private to the RMU.
This metadata describing the characteristics of the memory region and the data stored within it may be arranged as part of a hierarchical structure, such as a metadata memory region tree with a branching pattern. The form of this metadata memory region tree may be determined under software control, as different regions of the memory address space are registered for use as metadata regions owned by the RMU.
When the data stored in a given memory region is output, the memory region in question is subsequently invalidated, making its content inaccessible. To reuse this page, the memory region is "validated" by use of a Clean command that overwrites the memory region with other data unrelated to the previous content, so that the previous content is not made accessible to another process when the given memory region is freed for use by that process. For example, the contents of a given memory region may be written entirely as zero values, as a fixed value, or as random values, thereby overwriting the original contents of the memory region. In other examples, the overwriting of the contents of the output memory region may be triggered by the output command itself rather than by a subsequent Clean command. In general, given owned data that is output may be overwritten by values unassociated with that data before the given memory region is made accessible to processes other than the owning process. When a given memory region owned by a given process is to be output, as part of the output process the contents of the region are encrypted for storage in the second memory and the region itself is invalidated.
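The output/invalidate/Clean lifecycle above can be sketched as follows. This is an illustrative model only: the `Region` class is invented for the example, and a toy XOR stands in for real encryption.

```python
# Sketch of page output followed by invalidation, and a Clean command that
# overwrites content before the region is released. Illustrative only.

class Region:
    def __init__(self, data):
        self.data = bytearray(data)
        self.valid = True

def export(region, encrypt):
    ciphertext = encrypt(bytes(region.data))   # written to the second memory
    region.valid = False                       # region invalidated after output
    return ciphertext

def clean(region, fill=b"\x00"):
    region.data[:] = fill * len(region.data)   # overwrite: zeros here; a fixed
    region.valid = True                        # or random value is also allowed

r = Region(b"secret!!")
ct = export(r, encrypt=lambda b: bytes(x ^ 0x5A for x in b))  # toy cipher
clean(r)
assert not any(r.data) and r.valid             # no trace of the old contents
```

The essential property is that between `export` and `clean` the region is invalid, and after `clean` nothing of the previous owner's data survives in it.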
Figure 2 schematically shows the relationship between a number of processes (programs/threads), a number of exception levels (privilege levels), the secure and non-secure processor domains, and a number of domains representing ownership of given memory regions. As shown, the hierarchy of privilege levels extends from exception level EL0 to exception level EL3 (with exception level EL3 being the most privileged). The operating state of the system may be divided between a secure operating state and a non-secure operating state, as determined by use of security mechanisms such as Arm® TrustZone®.
The secure and non-secure domains are representations of an architecture such as that of processors designed by Arm® Limited of Cambridge, UK. As shown in Fig. 2, the memory access circuitry enforces ownership rights over memory regions in a manner that operates alongside these exception levels and security states.
The relationships between domains shown in Fig. 2 illustrate the child/parent relationships between different domains, and these can be used to generate a control hierarchy for controlling the operation of the system when multiple different command sources for memory region management compete with each other. Thus, for example, in the case of an output command for outputting a memory region as discussed above, a first output command may be received by a given domain management unit (memory access circuitry) from a first output command source (domain B in this example).
In this example, the second output command has a higher priority and thus interrupts the operation of the first output command. However, if the second output command had originated, for example, from a source with a priority no higher than that of the first output command source, the second output command would not be permitted to interrupt the first output command.
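The interrupt rule just described reduces to a single comparison in the control hierarchy. The function below is an illustrative sketch; the priority numbers are invented for the example.

```python
# Sketch of the control hierarchy deciding whether a second command may
# interrupt a partially completed first command. Illustrative values.

def may_interrupt(first_priority, second_priority):
    """A later command interrupts an in-flight command only when it comes
    from a source with strictly higher priority in the control hierarchy;
    otherwise it must wait for (or fails against) the first command."""
    return second_priority > first_priority

# e.g. hypervisor (priority 2) vs. guest OS (priority 1):
assert may_interrupt(first_priority=1, second_priority=2)      # interrupts
assert not may_interrupt(first_priority=2, second_priority=1)  # blocked
```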
Fig. 3 schematically shows a number of memory regions under management by a domain management unit and a memory management unit.
The memory regions may be addressed by virtual addresses, intermediate physical addresses, or physical addresses, depending on the particular system under consideration.
Fig. 4 schematically shows program instructions associated with an output operation upon a memory region. These program instructions appear in the program instruction stream and may be executed (acted upon) by different components within the overall circuit. For example, the domain-management-unit commands are executed by the respective domain management unit.
Once the barrier instruction DSB has received an acknowledgement confirming that the clearing of the relevant virtual address translation data within the system has been completed, an output command for the domain management unit is executed by the domain management unit. Execution of such an output instruction received from a given process by the domain management unit triggers performance of a command sequence (corresponding to millicode embedded within the domain management unit) that includes a plurality of command actions in respect of a specified given memory region. These command actions may include, for example, the following steps as illustrated in Fig. 4: gathering address translation data, locking the memory region, encrypting the data, storing the data externally, writing metadata associated with the memory region, and subsequently unlocking the memory region.
The address translation gathering step performed by the domain management unit as part of the command sequence collects into the domain management unit the access control data required to complete the operation under consideration. This ensures that, once an output operation is in progress, the likelihood of the output operation stalling is reduced, for example due to unavailability of parameters or data required to complete it, such as address translation data, attribute data, or other data required by the output process. As an example of the fetching and storage of access control data within the memory access circuitry, the address translation step fetches all of the address translation data (e.g. virtual-to-intermediate-physical-address (or physical address) mapping data) that may be required to complete the output operation.
Once the address translation data has been fetched, the domain management unit sets a lock flag associated with the region under consideration to a locked state. This lock flag may be stored within the ownership table entry for the region, for example.
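The six-step command sequence of Fig. 4 may be sketched as discrete operations on plain data structures. Everything below is illustrative: the function signature, the one-byte-digest metadata, and the toy XOR cipher standing in for real encryption are all assumptions of the sketch.

```python
# The output command sequence sketched as discrete steps: gather address
# translation data up front, lock, encrypt, store externally, write
# metadata, then unlock. All structure here is illustrative.

def export_region(region_id, translations, memory, metadata_store, key):
    ctx = {"va_map": translations[region_id]}   # 1. gather translation data
                                                #    up front so the sequence
                                                #    cannot stall part-way
    locked = {region_id}                        # 2. lock the region
    plaintext = memory[region_id]
    ciphertext = bytes(b ^ key for b in plaintext)  # 3. encrypt (toy cipher)
    external = {"blob": ciphertext}             # 4. store externally
    metadata_store[region_id] = {               # 5. private metadata
        "len": len(plaintext), "digest": sum(plaintext) % 256,
    }
    locked.discard(region_id)                   # 6. unlock
    return external

meta = {}
out = export_region(7, {7: "va->ipa"}, {7: b"\x01\x02"}, meta, key=0xFF)
assert out["blob"] == b"\xfe\xfd" and meta[7]["len"] == 2
```

Gathering the translation data first (step 1) mirrors the rationale given above: once the operation is under way, nothing it needs should be missing.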
Fig. 5 is a flowchart schematically showing page (memory region) output. At step 44, program instructions (VUMAP, TLBI, DSB) are executed to clear use of the page elsewhere in the system.
When the clear requests have been issued at step 44, processing waits until the clearing has been confirmed.
If the determination at
In the exemplary embodiments discussed above, the CCB is provided as a separate private memory region specified, for example, by an associated pointer within the initialization instruction. However, in other exemplary embodiments, the CCB may not be provided as a separate memory region but rather as a portion of a memory region already used by a command that may be interrupted, such as the destination memory region into which result data generated by the command is stored. In the case of an interruptible output command, the output encrypted data is stored in the destination memory region, which is an RMU-private memory region while the output is performed. As the CCB is filled with encrypted data, the CCB may be provided, for example, as an end portion of this destination region. The integrity of the context data stored within the CCB is ensured by the destination region being private to the RMU while the output operation is performed.
In another exemplary embodiment, the CCB may be provided as part of a domain descriptor (RD); in this case, the storage space available for context data may be constrained by the space available in the RD, and thus the number of interruptible parallel commands supported may be constrained by the storage space available to the RD for use as a corresponding CCB. The CCB may be provided separately or as part of a memory area or resource that is also used for another purpose.
Fig. 6 schematically shows the relationship between the domains and the control hierarchy that determines which commands from different command sources are allowed to interrupt/block partially completed commands from other sources. The illustrated example includes three levels of nested domains, with a parent domain M at the top of the control hierarchy.
Fig. 7 is a flowchart schematically showing a page (memory area) input operation subsequent to the RMU input command.
Fig. 8 schematically shows two output commands that may arise in parallel from different command sources. One of the instruction sequences originates from a process corresponding to a virtual machine (e.g. a guest operating system). The other command source is a hypervisor at a higher privilege level (or possibly a higher level within the domain hierarchy) than the virtual machine. Thus, output commands from the hypervisor can interrupt partially completed output commands being executed by the domain management unit on behalf of the virtual machine.
In this example, the command to the
The command context buffer is used to store a partial completion status representing a partially completed command sequence, so that this data can be restored at a later time. In this way, the system need not wait for the full output operation to complete before an interrupt can be serviced. Further, because the partially completed state is maintained, forward progress is guaranteed even if the output operation is repeatedly interrupted, since the output operation need not be restarted from its initial point.
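The forward-progress property can be demonstrated with a resumable loop that checkpoints into a CCB-like dictionary. The function, the per-block XOR "work", and the `interrupt` flag are all inventions of this sketch, not features of the described hardware.

```python
# Sketch of forward progress via a command context buffer (CCB): a long
# output operation processes one block per step, and an interrupt at any
# point leaves resumable partial state rather than forcing a restart.

def export_steps(blocks, ccb):
    """Resume from the index saved in the CCB; return True when done."""
    i = ccb.get("next", 0)
    while i < len(blocks):
        ccb.setdefault("out", []).append(blocks[i] ^ 0x5A)  # one block's work
        i += 1
        ccb["next"] = i                     # checkpoint after each block
        if ccb.pop("interrupt", False):     # interrupt arrives mid-command
            return False                    # partial state stays in the CCB
    return True

ccb = {"interrupt": True}
assert export_steps([1, 2, 3], ccb) is False   # interrupted after block 0
assert ccb["next"] == 1                        # progress preserved
assert export_steps([1, 2, 3], ccb) is True    # resumes, finishes blocks 1-2
```

Because at least one block completes between checkpoints, even an adversarial stream of interrupts cannot prevent the operation from eventually finishing.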
Fig. 9 shows a more detailed example of one of the processing components, together with the domain management control data stored in memory.
As shown in FIG. 9, the
In addition, a plurality of domain management tables or
The
As shown in fig. 10, the realms are managed by the RMU 20 according to a realm hierarchy in which each realm other than the root realm is a child realm initialized in response to a command triggered by its parent realm.
In general, for the domain management portion of memory access control provided by the
As shown in fig. 10, each realm 140 is associated with one or more realm execution context (REC)
Each domain is associated with a
Figs. 11 and 12 show two different examples of possible realm hierarchies. In the example of FIG. 11, each of the processes shown in FIG. 2 defines its own realm. Thus, the
As shown in fig. 12, it is not necessary for the process at each privilege level to have a separate realm, and thus some of the privilege level boundaries shown in dashed lines in fig. 12 may not correspond to realm boundaries. For example, in fig. 12,
The
As shown in the table in fig. 13, a given RDTE 164 providing a pointer to an
It should be noted that the tree shown in FIG. 13 shows the child realms of one particular parent realm. Each other parent realm may have a separate realm descriptor tree tracking its own child realms. Data associated with the tree, including
As shown in fig. 13, each of the child realms of a given parent realm may have a corresponding realm identifier (RID) 168 used by that parent realm to identify the particular child realm. The RID is a local realm identifier in that it is specific to a particular parent realm: child realms of different parent realms may have the same local RID. Although it is possible to use a local RID having any value chosen by the parent realm for a given child realm, in the approach shown in figs. 13 and 14 the local RID for a given child realm has a variable number of variable-length bit portions, and each of the variable-length portions is used by the RMU 20 to index into a given level of the
In fig. 13, the local RIDs are shown in decimal form, but fig. 14 shows how these local RIDs can be represented using binary identifiers. The binary identifier may have a plurality of variable
The number of bits to be used within each of the variable
This approach provides a flexible architecture that allows different numbers of child realms to be established by a given parent realm, and allows the realm descriptors for those child realms to be accessed efficiently. Because the realm identifier explicitly provides the indices required to step through the realm descriptor tree, there is no need to maintain a mapping table that maps an arbitrary realm number to a particular path through the tree. In contrast to a table structure, which would provide a fixed number of entries, the tree can be expanded as required by the number of child realms, by adding additional RDTGs or additional RDTEs at a given level of the tree as appropriate. The architecture is therefore scalable to the needs of different software processes. Since it is not specified in advance exactly which portions of the RID map to a given level of the tree, the available bits of the RID can be allocated flexibly to accommodate different depths/widths of the tree.
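The variable-length indexing scheme can be sketched as follows. This is a minimal model, assuming a hypothetical node layout in which each tree node carries an order value giving the number of RID bits it consumes; the architectural encoding differs:

```python
# Sketch of stepping through a realm descriptor tree using a local RID
# whose low-order bits index the root RDTG, the next bits index the next
# level, and so on. Node layout and field names are illustrative only.

def lookup_descriptor(root, rid):
    """Walk the tree, consuming `order` bits of `rid` per level, until a
    leaf (a realm descriptor) is reached."""
    node = root
    while isinstance(node, dict) and "entries" in node:
        order = node["order"]             # log2 of this RDTG's entry count
        index = rid & ((1 << order) - 1)  # variable-length bit portion
        rid >>= order                     # remaining bits for deeper levels
        node = node["entries"][index]
    return node                           # the realm descriptor itself

# A 2-level tree: a root RDTG with 4 entries (order 2); entry 1 points
# to a second-level RDTG with 2 entries (order 1).
leaf_rdtg = {"order": 1, "entries": ["RD_A", "RD_B"]}
root = {"order": 2, "entries": [None, leaf_rdtg, None, "RD_C"]}
```

Note that different branches at the same level may consume different numbers of RID bits, exactly because each RDTG carries its own order value.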
The
The
The RMU may allow the variable-length bit portions used for indexing into different branches at the same level of the realm descriptor tree to have different numbers of bits. That is, while both have the same order value (and thus the same number of entries) in the
In general, the RID for a given realm may comprise a sequential concatenation of the indices to be used at the respective levels of the realm descriptor tree to access the realm management data for that realm. Although the indices need not be concatenated in the same sequential order in which they are used to step through the tree, this may be preferred as it makes management of tree accesses simpler; it does not matter whether the concatenation runs from low-order to high-order bits or vice versa. The concatenated indices may be followed by a predetermined termination pattern that allows the RMU 20 to determine when there are no further levels of the tree to be stepped through.
Some embodiments may apply this RID construction technique to a global realm descriptor tree that stores the realm descriptors for all realms within the system in a tree structure (where each RID is a globally unique value). However, software development can be made simpler by defining the child realms of a given parent within one tree, with a separate tree for each other parent realm tracking that parent's own children. Thus, a realm descriptor tree may be a local realm descriptor tree, associated with a given parent realm, for storing the realm management data of the child realms that have been initialized by that parent realm. The realm identifier can accordingly be a local realm identifier identifying a particular child realm as used by the given parent realm. Child realms initialized by different parent realms may be allowed to have the same value of the local realm identifier. In this way, a parent realm can select which RIDs are used for its child realms without any knowledge of the realms established by other parent realms, with the RIDs for the child realms constructed according to the way the parent realm configures its realm descriptor tree.
The local realm identifier can be used by realm entry instructions or RMU commands issued by software processes. However, the hardware architecture may need an absolute identification of a given child realm in order to distinguish realms created by different parents. Thus, in addition to the local realm identifiers shown in figs. 13 and 14, a given realm may also have a global realm identifier (or "internal" realm identifier) that is unique to that realm. At least one hardware structure may identify a given realm using the global realm identifier (GRID) instead of the local realm identifier (LRID). For example, the realm granule table 128 and/or the TLB 100 may use the global realm identifier to identify realms.
In some instances, any binary value may be assigned as the GRID for a given realm, entirely independent of the LRID used by the parent realm to refer to that child realm. Different micro-architectural implementations of the same realm architecture may use different methods of assigning GRIDs.
However, in one example as shown in fig. 15, the GRID for a given realm may be constructed from the LRIDs of the given realm's ancestor realms. This can be useful because it enables simpler determination of whether a given realm is a descendant or an ancestor of another realm, which may be used for access control by the
In some cases, the LRIDs may be concatenated in order within the GRID, including the termination flag and zero-padding bits shown in fig. 14. Alternatively, the binary representation of the GRID may exclude such termination flags and zero-padding bits, and instead the meaningful portions of the LRIDs comprising the RDT indices may be concatenated directly. Because each LRID may itself have a variable number of bits, depending on the depth and width of the RDT used by the associated parent realm, the number of bits of the GRID allocated to represent the LRID of a given generation may be variable. Moreover, this allocation of portions of the GRID to a given generation may change at runtime based on the particular software being run, and may also differ between different branches of the realm "family tree", so that one branch of the family tree may use a larger portion of the identifier than another. Because the common prefix or suffix of the GRID is the same for realms sharing a common ancestor, any descendant can still be distinguished by the remainder, which is specific to that descendant, regardless of how that remainder is divided among further generations.
Constructing the GRID as an ordered concatenation of the LRIDs of a realm's ancestors enables a more efficient determination of whether a first realm is an ancestor or a descendant of a second realm. Circuitry may be provided (e.g., within the TLB 100 or the RMU 20) to determine whether the global RID of one of the first and second realms matches a prefix or suffix portion of the global RID of the other, for example by using a bit mask to mask out the portions of the global RID corresponding to later generations, so that the global RIDs of earlier and later realms within the same family can be compared for a match.
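The prefix-matching test can be sketched as follows. GRIDs are modelled here as bit strings with the earliest generation first; the widths and representation are illustrative, not the architectural format:

```python
# Sketch of an ancestor test on global RIDs formed by concatenating the
# local RIDs of each generation (earliest generation first).

def make_grid(*lrids_per_generation):
    """Concatenate per-generation LRID bit strings into a GRID."""
    return "".join(lrids_per_generation)

def is_ancestor(grid_a, grid_b):
    """True if realm A is a strict ancestor of realm B: A's GRID is a
    proper prefix of B's. Equivalently, the later-generation bits of B
    are masked off and the remainder compared against A's GRID."""
    return len(grid_a) < len(grid_b) and grid_b.startswith(grid_a)
```

In hardware this prefix comparison corresponds to the masked comparison mentioned above, where a bit mask hides the later-generation portion of the longer GRID.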
It is not necessary for all local RIDs to be constructed using the ordered concatenation of tree indices shown in FIG. 13. In some cases it may be useful to reserve specific values of the local RID for referring to certain predefined realms. RMU commands specifying the current realm, or the parent realm of the current realm, may be relatively common, so predetermined RID values can be reserved for referring to these realms. For example, an LRID with all bits set to 1 may be reserved for referencing the parent realm of the current realm. Similarly, a predetermined realm identifier value can be reserved for referring to the current realm itself; for example, an LRID value of 0 may be used to reference the current realm. It should be noted that the use of the
The RMU may support certain query commands that can be triggered by a given realm in order to query the constraints that must be met when that realm builds its realm descriptor tree. For example, in response to a query command, the RMU 20 (or the processing circuitry 32) may return a constraint value indicating at least one of a maximum number of levels of the
Fig. 16 shows an example of the contents of the realm descriptor for a given realm, which may specify:
The global RID of the realm. Thus, by traversing the realm descriptor tree based on a local RID, the corresponding global RID can be identified, and this can be used to index hardware structures such as TLBs, or to check the ownership table or other information that a given realm has defined based on GRID.
The lifecycle state of the realm, which may be used by the RMU 20 to determine whether to accept a given command triggered for that realm.
The type of the realm. For example, the realm type may indicate whether the realm is a full realm or a sub-realm, as discussed later.
A boundary exception level (BEL) value identifying the boundary exception level for the realm. The BEL indicates the maximum privilege level at which the realm is allowed to execute. For example,
A resource count indicating the total number of memory regions (realm protection granules, or RPGs) owned by the realm and its descendants. This is used to ensure that all memory pages owned by the realm's descendants are invalidated (and eventually scrubbed) before those memory regions can be allocated to a different realm. For example, the resource count may be used to track how many regions still need to be scrubbed.
The start and end addresses of the protected address range for the realm. For example, the protected address range may define the range of the memory address space within which pages can be owned by the realm. This can protect against a malicious parent realm that reclaims ownership of a region previously assigned to a child realm in an attempt to access the child realm's data, since by comparing the protected address range defined in the realm descriptor with the subsequent addresses of memory accesses, cases can be identified in which a memory region previously owned by the realm is no longer owned by it.
One or more encryption keys used by the
A realm descriptor tree entry (RDTE) identifying the root of the realm descriptor tree. The RDTE in the realm descriptor provides the pointer for accessing the root RDTG (and the order value defining how many bits are used as the index into that RDTG).
A pointer to the main REC (realm execution context) memory region used for saving and restoring architectural state related to execution of the realm.
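The descriptor contents listed above can be summarized in a sketch. The field names and types here are hypothetical; the architectural layout of a realm descriptor is implementation-defined:

```python
# Illustrative model of the realm descriptor fields described above.
from dataclasses import dataclass, field

@dataclass
class RealmDescriptor:
    grid: str                   # global realm identifier
    lifecycle_state: str        # e.g. "Clean", "New", "Active", "Invalid"
    realm_type: str             # full realm or sub-realm
    bel: int                    # boundary exception level (max privilege)
    resource_count: int         # regions owned by realm + descendants
    protected_start: int        # start of protected address range
    protected_end: int          # end of protected address range
    keys: list = field(default_factory=list)   # encryption key handles
    root_rdte: object = None    # root RDTE of this realm's descriptor tree
    main_rec: int = 0           # pointer to the main REC memory region
```

A parent realm reaches such a descriptor by walking its realm descriptor tree with the child's local RID, after which the `grid` field gives the globally unique identifier used by hardware structures.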
FIG. 17 shows the set of lifecycle states that may exist for a given realm, which in this example include a clean state, a new state, an active state, and an invalid state. FIG. 17 summarizes the properties of each state, indicating for each state: whether a realm in the corresponding state can have the parameters of the
FIG. 18 is a state machine diagram showing the allowed transitions between the lifecycle states of a realm. Each state transition shown in FIG. 18 is triggered by the parent realm issuing to the RMU 20 a realm management command specifying the local RID of the target child realm (Realm.Invalidate)
When a domain is in the
Thus, by providing a managed lifecycle for the realm associated with a given realm identifier, it is ensured that data associated with a previous realm that used the same identifier must be scrubbed from memory and any caches before the realm can return to the clean state in which its parameters can be modified (and thus before the identifier can be recycled for use by a different realm). This prevents data associated with an old realm from being leaked to other realms through reuse of the same realm identifier. While the realm is in the
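The managed lifecycle can be sketched as a small state machine. The transition and command names here are illustrative stand-ins for the realm management commands of FIG. 18, not the architectural command set:

```python
# Sketch of the realm lifecycle: parameters may only be modified in the
# Clean state, and an identifier can only be recycled after invalidation
# and scrubbing have returned the realm to Clean.

ALLOWED = {
    ("Clean", "create"): "New",
    ("New", "activate"): "Active",
    ("New", "invalidate"): "Invalid",
    ("Active", "invalidate"): "Invalid",
    ("Invalid", "scrub"): "Clean",    # scrubbing lets the RID be reused
}

def apply_command(state, command):
    """Return the next lifecycle state, rejecting illegal transitions."""
    try:
        return ALLOWED[(state, command)]
    except KeyError:
        raise ValueError(f"command {command!r} not allowed in state {state!r}")
```

Note there is deliberately no direct path from Invalid back to Active: the scrub step is unavoidable, which is what prevents stale data leaking through a recycled realm identifier.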
Fig. 19 shows an example of the contents of an entry of the realm granule table 128 (or ownership table). Each entry corresponds to a given memory region of the memory address space. The size of a given memory region may be fixed or variable depending on the implementation. The particular manner in which the ownership table 128 is structured may vary significantly depending on implementation requirements, and thus the particular manner in which the memory region corresponding to a given entry is identified may also vary (e.g., data identifying the corresponding region may be stored in each entry, or alternatively the corresponding region may be identified at least partly based on the position of the ownership entry within the table itself). Fig. 19 shows particular examples of parameters that may be specified for a given memory region, but other examples may provide more information or may omit some of the information types shown.
As shown in fig. 19, each ownership table entry may specify the following for the corresponding memory region:
The global RID identifying the owner realm of the memory region. The owner realm is the realm that has the right to set the attributes controlling which other realms are allowed to access the region.
The lifecycle state of the memory region, used to control which RMU commands are allowed to be performed on the region.
Mapped addresses mapped to by the
Visibility attributes specifying which realms other than the owner can access the memory region. For example, as shown in FIG. 20, the visibility attributes may specify a parent visibility bit controlling whether the parent realm of the owner is allowed to access the region, and a global visibility bit controlling whether any realm can access the region. In general, the realm protection scheme may assume that descendant realms of the owner realm are always allowed to access memory regions owned by the owner realm (subject to whether access is allowed by the translation tables 120, which provide protection based on privilege level), but a given realm may control whether its memory regions are accessible by its parent or by any other realm that is not one of its own descendants. In some embodiments, both the parent visibility bit and the global visibility bit may be set by the owner realm itself. Alternatively, while the parent visibility bit may be set by the owner realm, the global visibility bit could be set by the parent realm of the owner (provided the parent visibility bit for the region has already been set to give the region parent visibility). It will be appreciated that this is just one example of how an owner realm can control which other processes may access its data.
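The visibility policy above can be sketched as a single predicate. GRIDs are again modelled as prefix-concatenated bit strings as in FIG. 15, and the attribute names are illustrative:

```python
# Sketch of the owner-controlled visibility check: descendants of the
# owner always pass, ancestors pass only with the parent-visibility
# attribute, and unrelated realms pass only with global visibility.

def may_access(current_grid, owner_grid, parent_visible, global_visible):
    """Decide whether the realm `current_grid` may access a region owned
    by `owner_grid`, given the region's visibility attributes.
    (The separate RMU-private check is not modelled here.)"""
    if current_grid == owner_grid:
        return True                       # the owner itself
    if current_grid.startswith(owner_grid):
        return True                       # descendants of the owner: always
    if owner_grid.startswith(current_grid):
        return parent_visible             # ancestors: only if parent-visible
    return global_visible                 # unrelated realms: global bit
```

This is the check that the MMU can apply cheaply at lookup time, because the ancestor/descendant relationships reduce to GRID prefix comparisons.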
FIG. 21 is a table showing the different lifecycle states that may exist for a given memory region, and FIG. 22 is a state machine showing the commands that trigger transitions between those lifecycle states. In a manner similar to the realm lifecycle states of FIG. 18, transitions between memory region lifecycle states are managed to ensure that a memory region passing from the ownership of one realm to the ownership of another must first undergo an invalidation process in which the data in the region is scrubbed (e.g., set to zero). Thus, to transition from the
In some systems, it may be sufficient to provide the
Thus, cleaning
However, robustness can be improved by specifying multiple types of RMU-private memory region, each corresponding to a particular form of realm management data. For example, in figs. 21 and 22, a number of RMU-registered states 228 are defined, each corresponding to an RMU-private region designated for a specific purpose. In this example, the RMU-registered states 228 include RMU-registered RDT (for storing an RDTG of a realm descriptor tree), RMU-registered RD (for storing a realm descriptor), RMU-registered REC (for storing realm execution context data), and RMU-registered MDT (for storing paging metadata used during export/import operations as discussed above). Different forms of registration command 230 may be executed by the RMU on a memory region in the RMU-clean state to transition the region to the corresponding one of the RMU-registered states 228. A command attempting to store data to an RMU-private memory region that does not correspond to the designated purpose (RDT, RD, REC, or MDT) may be rejected. Accordingly, in a first lifecycle state of the RMU-registered states, a first type of RMU command for storing a first type of realm management data may be allowed, and in a second lifecycle state a second type of RMU command for storing a second type of realm management data may be allowed, where the first RMU command is rejected when the target memory region is in the second lifecycle state and the second RMU command is rejected when the target memory region is in the first lifecycle state. This provides further security by preventing a malicious parent realm from, for example, attempting to store realm descriptor entries into a realm execution context region, or vice versa, in an attempt to disrupt the operation of a child realm. For each of the RMU-registered states 228, a corresponding form of a
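The type-specific registration check can be sketched as follows. The state names, command shapes, and storage model are illustrative, not the architectural interface:

```python
# Sketch of type-specific RMU-private registration: a store of realm
# management data is accepted only if the region was registered for that
# specific kind of data (RDT, RD, REC, or MDT).

REGISTERED_FOR = {}   # region id -> "RDT" | "RD" | "REC" | "MDT"

def register(region, kind):
    """Transition a clean RMU-private region into a registered state."""
    assert kind in ("RDT", "RD", "REC", "MDT")
    REGISTERED_FOR[region] = kind

def rmu_store(region, kind, data, backing):
    """Store `data` of management-data type `kind`; reject mismatches,
    e.g. a realm descriptor aimed at a REC-registered region."""
    if REGISTERED_FOR.get(region) != kind:
        raise PermissionError(f"region {region} not registered for {kind}")
    backing[region] = data
```

The rejected cross-type store is exactly the attack sketched above: a malicious parent trying to plant descriptor entries in an execution context region.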
Thus, in summary, at least one RMU-private memory region may be defined that is still owned by a given owner realm but has an attribute specified in the ownership table indicating that it is reserved for exclusive access by the RMU. In this example, the attribute controlling the RMU-private status is the lifecycle state specified in the corresponding ownership table entry, but the attribute could also be identified in other ways. When a given memory region is designated by at least one status attribute as an RMU-private memory region, the MMU prevents access to that region by one or more software processes. Thus, any software-triggered access that is not triggered by the RMU itself is rejected when it targets an RMU-private memory region. This includes preventing access to the RMU-private memory region by the owner realm itself.
The skilled person may ask why it is useful to define an owner realm for an RMU-private memory region if that owner cannot even access the data in the region. For example, an alternative way of restricting access to data to the RMU alone would be to define a special realm for the RMU and allocate the pages of the memory address space storing such data to that special RMU owner realm. However, the inventors recognized that when a realm is invalidated there may be a requirement to invalidate all control data related to that realm, and the scrubbing of the invalidated realm's data could be complicated if this control data were associated with a special RMU owner realm rather than with the invalidated realm itself.
In contrast, by using the RMU-private attribute, the memory regions storing the control data for a given realm remain owned by that realm even though the owner cannot access them, which makes it simpler to identify which memory regions need to be invalidated when the owner realm is cancelled. When a given realm is invalidated, the parent realm may simply perform a sequence of reclaim operations (e.g., by executing reclaim commands that are then acted upon by the RMU) that trigger the memory regions owned by the specified invalidated realm (or its descendants) to be invalidated, made inaccessible, and returned to the ownership of the parent realm that triggered the reclaim command. The reclaim operation affects not only the pages accessible to the invalidated realm, but also the RMU-private memory regions owned by it.
Another advantage of storing the control data for a realm in RMU-private memory regions owned by that realm arises when performing export operations. To reduce the memory footprint of a realm to zero, the management structures associated with the realm may be exported in addition to normal memory during export operations. Requiring these structures to be owned by the realm simplifies the management of the export operations.
In general, any kind of realm management data may be stored in the RMU-private regions; in particular, the realm management data may include any of: a realm descriptor defining properties of a given realm, a realm descriptor tree entry (or further realm descriptor tree entries) identifying the memory region storing the realm descriptor for the given realm, realm execution context data indicating architectural state related to at least one thread executing within the given realm, and temporary working data for use at an intermediate point of a predetermined operation related to the given realm.
While RMU-private regions may in general be used to store realm-specific control data for a given realm, they may also be used to increase security for certain other operations performed once the realm is active. For example, when performing the export or import paging operations discussed above, in which data is encrypted or decrypted and a metadata check is performed to verify that the data is still valid when it is imported again, such operations may take many cycles, and long-running operations of this kind are more likely to be interrupted partway through. To avoid the need to restart the operation from the beginning, it is desirable to allow the metadata or other temporary working data associated with such long-running operations to remain in the cache/memory even across an interruption, without making this data accessible to other processes (including the owner realm itself). This temporary working data can be protected by temporarily designating the relevant regions of the memory system as RMU-private. Thus, as shown in FIG. 21, the page states may also include RMUExporting and RMUImporting states that may be used while this temporary working data is stored to the memory region; while one of these states is selected, only the RMU may access the data.
Other examples of operations that may benefit from temporarily designating a memory region as RMU-private include: generation or verification of encrypted or decrypted data during transfer of data between at least one memory region owned by a given realm and at least one memory region owned by another realm; transfer of ownership of a memory region to another realm; and a destructive reclaim operation performed to make the data stored in an invalidated memory region inaccessible. For example, a scrub operation that clears the entire contents of a given page of the address space may be interrupted partway through, and so, to ensure that other processes cannot access the page until the scrub is complete, the page may be temporarily designated as RMU-private. In general, any long-latency operation performed by the RMU may benefit from transitioning the lifecycle state of the affected memory regions to the RMU-private state before the long-running operation begins, and transitioning the state back once it completes, so that the temporary working data of the long-latency operation is protected.
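The bracketing of a long-running operation with a temporary RMU-private designation can be sketched as follows. The structures and names are illustrative; in the architecture the state change is a lifecycle-state transition in the ownership table, not a Python context manager:

```python
# Sketch of temporarily designating a region RMU-private around a
# long-running operation (here, a destructive scrub), so that even if
# the operation is interrupted midway, no other process can observe the
# partially processed contents.
from contextlib import contextmanager

@contextmanager
def rmu_private(region_states, region):
    """Mark `region` RMU-private for the duration of the operation,
    restoring its previous state once the operation completes."""
    saved = region_states[region]
    region_states[region] = "RMU-private"
    try:
        yield
    finally:
        region_states[region] = saved

def scrub(memory, region_states, region):
    with rmu_private(region_states, region):
        memory[region] = [0] * len(memory[region])   # destructive scrub
```

While the state is RMU-private, the MMU check described earlier denies every software-triggered access, including by the owner realm, so an interrupt taken mid-scrub cannot expose half-cleared data.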
When a region is designated as RMU-private, it is reserved for access by the RMU 20, which uses it to perform realm management operations. The realm management operations may include at least one of: creating a new realm; updating the properties of an existing realm; invalidating a realm; allocating memory regions for ownership by a given realm; changing the owner realm of a given memory region; changing the state of a given memory region; updating access control information for controlling access to a given memory region in response to a command triggered by the owner realm of that region; managing transitions between realms during processing of one or more software processes; managing transfer of data associated with a given realm between memory regions owned by that realm and memory regions owned by a different realm; and encryption or decryption of data associated with a given realm. The RMU may be a hardware unit performing at least part of the realm management operations, or may include processing
FIG. 22 illustrates the state transitions that may be triggered by a given realm to clean a given page so that it can be validly accessed, or to invalidate the page. FIG. 23 extends this scenario to show further commands that may be used to transfer ownership of a given page from one realm to another. If the memory region is currently in the
One advantage of the hierarchical realm structure discussed above, in which each child realm is initialized by its parent realm, is that it greatly simplifies invalidation of a realm together with its descendants. It is relatively common that, if a given virtual machine realm is to be invalidated, it is also desirable to invalidate the realms of any applications running under that virtual machine. However, there may be a large amount of program code, data, and other control information associated with each of the processes to be invalidated. It may be desirable for such invalidation to occur atomically, so that it is not possible to continue accessing data related to the invalidated realms while only part of the data scrub has been carried out. If each realm were established completely independently of the others, without the realm hierarchy discussed above, such atomicity could be difficult to achieve, as multiple separate commands would have to be provided to individually invalidate each realm identified by a corresponding realm ID.
In contrast, by providing a realm hierarchy in which the RMU manages realms such that each realm other than the root realm is a child realm initialized in response to a command triggered by its parent realm, the RMU 20 can, when a command requesting invalidation of a target realm is received, make the target realm and any descendant realms of the target realm inaccessible to the processing circuitry with a more efficient operation.
In particular, in response to invalidation of the target realm, the RMU may update the realm management data (e.g., the realm descriptor) associated with the target realm to indicate that the target realm is invalid, without needing to update any realm management data associated with any descendant realm of the target realm. The realm management data associated with the descendants may remain unchanged. This is because simply invalidating the target realm effectively makes any descendant realm inaccessible as well, even though that realm's management data has not changed: access to a given realm is controlled via its parent, so if the parent realm is invalidated, it is no longer possible to access the parent's descendants. Because each realm is entered using a realm entry instruction (the ERET instruction discussed below) that uses a local RID defined by the parent realm to identify a particular child of that parent, and this RID is used to step through realm descriptors stored in memory regions owned by the parent realm of the given child realm, no process other than the parent realm can trigger the RMU to access the child realm's management data. Thus, if an ancestor realm is invalidated, the RMU cannot access the realm management data of a given descendant realm, thereby ensuring that the given descendant realm becomes inaccessible.
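The reason one descriptor update suffices can be sketched as follows. The structure is illustrative: a GRID here is a prefix-concatenated bit string as in FIG. 15, and validity is checked along the full ancestor path, just as the hardware can only reach a realm's descriptor via its chain of ancestors:

```python
# Sketch of hierarchy-based invalidation: only the target realm's
# descriptor is marked invalid, yet its whole subtree becomes
# unreachable, because accessibility requires every ancestor to be valid.

realms = {
    # grid -> (parent grid, valid flag)
    "": (None, True),          # root realm
    "0": ("", True),           # child of root
    "01": ("0", True),         # grandchild
    "011": ("01", True),       # great-grandchild
}

def invalidate(grid):
    parent, _ = realms[grid]
    realms[grid] = (parent, False)   # only the target's descriptor changes

def is_accessible(grid):
    """A realm is accessible only if it and all its ancestors are valid."""
    while grid is not None:
        parent, valid = realms[grid]
        if not valid:
            return False
        grid = parent
    return True
```

The descendants' own descriptors are never touched; they simply become unreachable, which is what makes the invalidation effectively atomic.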
After a realm has been invalidated, its parent realm may trigger the RMU to perform reclaim operations to reclaim each memory region owned by the invalidated target realm. For example, as shown in FIG. 23, an
Thus, in summary, the use of the realm hierarchy greatly simplifies the management and invalidation of realms. In addition to the invalidation itself and the overwriting of data in memory, the invalidation may also trigger invalidation of cached realm management data for the target realm and any of its descendants, such cached realm management data being held not only in the
FIG. 24 shows an example of checks performed by the
Upon receiving a memory access,
Having obtained the physical address, the physical address may then be looked up in the RMU table 128 (realm granule table) to determine whether the realm protections enforced by the MMU allow the memory access to proceed. The realm checks are discussed in more detail with reference to FIG. 26 below. If the check at
FIG. 25 illustrates an example of a
A hit in the TLB 100 not only requires that the
To address this issue, the TLB 100 may specify, within each
Subsequently, when the translation cache is looked up to check whether it already includes an
If the second comparison of the realm identifiers detects a mismatch, then even if the tag comparison and the translation context comparison both match, the access request is treated as a miss in the TLB, since it indicates that there has been a change in the mapping between
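The two-stage hit condition can be sketched as follows. The entry layout and field names are illustrative; the point is that the realm GRID is a third, independent match condition alongside the tag and translation-context comparisons:

```python
# Sketch of the TLB hit condition with the additional realm comparison:
# the usual tag and translation-context (e.g. VMID/ASID) match, plus a
# comparison of the GRID the entry was allocated for, so that remapping
# a VMID/ASID to a different realm cannot reuse a stale entry.
from dataclasses import dataclass

@dataclass
class TLBEntry:
    tag: int        # virtual address tag
    ctx: tuple      # translation context, e.g. (VMID, ASID)
    grid: str       # GRID of the realm the entry was allocated for
    pa: int         # translated physical address

def tlb_lookup(entries, tag, ctx, current_grid):
    """Return the physical address on a hit, or None to force a miss
    (and hence a fresh page-table walk plus realm check)."""
    for e in entries:
        if e.tag == tag and e.ctx == ctx and e.grid == current_grid:
            return e.pa
    return None
```

Forcing a miss on a GRID mismatch is what lets the expensive realm check be skipped on ordinary hits without opening a hole when translation context identifiers are reassigned.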
FIG. 26 is a flow chart illustrating a method of determining whether a given memory access is allowed by
In response to a memory access request,
a tag comparison 302 for comparing whether the address of the memory access request matches the
a first (context)
a second (realm)
At
If there are no entries matching all three of the
On the other hand, if the access request passes the
If a
The memory access is rejected if the lifecycle state for the corresponding memory region is indicated as invalid in the realm ownership table 128. This ensures that pages of the memory address space that have not been subjected to the
The memory access is rejected if the current realm is not permitted by the owner realm of the corresponding memory region to access that region. There may be a number of reasons why a given realm may not be allowed to access a given memory region. If the owner realm has specified that the memory region is visible only to itself and its descendants, then other realms are not allowed to access it. A memory access may also be denied if the current realm is the parent of the owner realm and the owner realm has not set the parent visibility attribute to allow parent access to the region. Additionally, if the memory region is currently set as RMU-private as discussed above, even the owner realm itself may be prevented from accessing it. At the realm checking stage, descendant realms of the owner realm may be allowed to access the memory region (as long as it is not an RMU-private region). This check thus enforces the access permissions set by the owner realm.
If the physical address produced by the S1/S2 translation for the current memory access does not match the mapped address specified in the ownership table 128 for the corresponding memory region as shown in FIG. 19, then the memory access is denied. This protects against the following situation: a malicious parent realm may assign ownership of a given memory region to a child realm, but then change the translation mappings in the page tables 120 so that a subsequent memory access triggered by the child realm, using the same virtual address that the child realm previously used to reference the page it owns, now maps to a different physical address that is not actually owned by the child realm. By providing in the ownership table a reverse mapping from the physical address of the corresponding memory region back to the mapped address used to generate that physical address when ownership was claimed, security breaches caused by changes in the address mapping can be detected so that the memory access fails.
It will be appreciated that other types of checks may also be performed. If the realm checks are successful, then the physical address is returned at step 322, the memory access is allowed to proceed using the physical address, and a new entry is allocated to the TLB indicating the physical address obtained from the page tables 120 and the owner realm and visibility attributes obtained from the ownership table 128 for the requested virtual address and translation context.
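The ownership checks described above can be sketched as a small Python model. This is a minimal illustration, not the hardware interface: the field names, the string-valued visibility attribute, and the convention that a child realm's GRID extends its parent's GRID as a dot-separated prefix are all assumptions made for the sketch.

```python
from dataclasses import dataclass

@dataclass
class OwnershipEntry:
    owner_gid: str     # GRID of the owner realm (illustrative string form)
    visibility: str    # 'owner', 'parent', or 'global' (illustrative)
    rmu_private: bool  # region reserved for RMU use only
    mapped_addr: int   # address used to map the region when ownership was claimed
    state: str         # lifecycle state of the region, 'valid' or 'invalid'

def is_descendant(gid: str, ancestor_gid: str) -> bool:
    # Assumption: a descendant's GRID has its ancestor's GRID as a prefix.
    return gid != ancestor_gid and gid.startswith(ancestor_gid)

def realm_check(entry: OwnershipEntry, current_gid: str,
                translated_addr: int) -> bool:
    """Return True if the current realm may access the region."""
    if entry.state != 'valid':
        return False               # region not in a valid lifecycle state
    if entry.rmu_private:
        return False               # even the owner may not access RMU-private
    if translated_addr != entry.mapped_addr:
        return False               # reverse-mapping check failed
    if current_gid == entry.owner_gid:
        return True                # the owner itself
    if is_descendant(current_gid, entry.owner_gid):
        return True                # descendants of the owner are allowed
    if entry.visibility == 'parent' and is_descendant(entry.owner_gid,
                                                      current_gid):
        return True                # ancestor allowed only if visibility set
    return entry.visibility == 'global'
```

As a usage example, with owner GRID `'1.4'` and default owner-only visibility, the descendant `'1.4.2'` is allowed, the ancestor `'1'` is denied, and any access whose translated physical address no longer matches the recorded mapped address is denied regardless of who asks.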
Thus, in summary, by requiring the second comparison (comparing the GRID of the current realm against the GRID provided in the entry of the translation cache) to match before a hit can be detected in the translation cache lookup, a change in the translation context identifier associated with a given realm after a TLB entry has been allocated cannot be used to circumvent the realm protections, even though the realm checks are not repeated on a TLB hit. This improves performance by making it unnecessary to repeat the realm checks on every memory access (which would be relatively processor intensive given the number of checks to be performed), allowing most memory accesses to proceed faster, since hits are much more common than misses. When the second comparison identifies a mismatch between the realm identifier specified in the entry and the realm identifier of the current realm, a mismatch between the memory access and the given entry of the translation cache is detected. This triggers a miss, which may in turn trigger page table and RMU table walks in order to find the correct access control data (with the realm checks repeated in case the VMID/ASID has changed).
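The three-way hit condition can be illustrated with a toy TLB model. This is a sketch under assumptions: entries are a flat list, the GRID comparison is modelled as exact equality (the text notes a real match need not be strict identity), and all names are illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TLBEntry:
    tag: int    # virtual address tag
    ctx: int    # translation context identifier (VMID/ASID)
    grid: str   # GRID of the realm for which the entry was allocated
    phys: int   # cached physical address

class TLB:
    def __init__(self) -> None:
        self.entries: List[TLBEntry] = []

    def lookup(self, tag: int, ctx: int, current_grid: str) -> Optional[int]:
        for e in self.entries:
            # A hit requires all three comparisons to pass: the address
            # tag, the translation context, and the realm identifier.
            # A stale entry allocated under a different realm therefore
            # cannot be reused even if tag and context happen to match.
            if e.tag == tag and e.ctx == ctx and e.grid == current_grid:
                return e.phys
        return None  # miss: page-table and RMU-table walks would follow
```

Note how an entry allocated for realm `'1.4'` misses when looked up by realm `'1.5'`, even with identical tag and VMID/ASID, which is exactly the property that makes the second comparison safe to rely on.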
This approach is safe because the RMU can prevent initialization of a new realm having the same realm identifier as a previous active realm until after a scrubbing process for invalidating information related to the previous active realm has been performed. This scrubbing process may include not only invalidation of the realm management data and of any data stored in memory that is related to the invalidated realm, but also invalidation of at least one entry of the translation cache for which the second comparison identifies a match between the realm identifier of the entry and the realm identifier of the invalidated realm. Thus, it is not possible to regenerate a different process using the same realm identifier as a previous process, unless all data in the
A miss in the translation cache may trigger an ownership table lookup that accesses an ownership table specifying, for each of a plurality of memory regions, the owner realm of the corresponding memory region and the access constraints set by the owner realm for controlling which other realms are allowed to access the memory region. By including the additional second comparison in determining TLB hits, this ownership table lookup can be omitted on a hit; it is performed only on a TLB miss.
Although fig. 25 illustrates a method in which the GRID of the owner realm is stored in each TLB entry, there may be other ways of representing information that enables a determination of whether the GRID of the current realm is suitable for accessing the corresponding memory region. For example, a list of GRIDs of authorized realms may be maintained in the TLB, or the TLB may maintain a separate list of active realms, with TLB entries including an index into that list rather than a full GRID, which may reduce TLB entry size compared to storing the full GRID in each entry. However, simply storing the GRID of the owner realm may be a more efficient way to identify the authorized realms, because it avoids the extra level of indirection of consulting an active realm list when allocating and checking TLB entries, and also avoids the need to synchronize changes in the active realm list between TLBs.
It should be noted that a match in the second (GRID) comparison performed in looking up the TLB does not necessarily require that the current realm identifier be identical to the
Thus, by caching the
For at least one value of the visibility attribute, the control circuitry may determine a mismatch when the current realm is a realm other than the owner realm, a descendant realm of the owner realm, or an ancestor realm of the owner realm (e.g., when the ancestor visibility attribute is set as discussed above). In some cases, at least one value of the visibility attribute may allow
FIG. 27 is a Venn (Venn) diagram showing an example of the architectural state accessible to
When operating at exception level EL0,
general purpose registers, including integer registers, floating point registers, and/or vector registers, for storing general purpose data values during data processing operations.
A Program Counter (PC) register that stores a program instruction address representing the current execution point within the program being executed.
A saved processor state register (SPSR_EL0) for storing information about the current state of the processor when an exception is taken from a process executing at
An exception link register ELR _ EL0 for storing the current program counter value when an exception is taken, so that the ELR provides a return address to which processing should branch once the exception has been handled.
A realm identifier register RID_EL0 to store the local RID of the child realm for which realm entry is requested (even though exception level EL0 is the lowest, least privileged exception level, a realm may be entered from a process operating at EL0, which has the ability to create a child realm as discussed below).
An Exception Status Register (ESR), which is used by EL0 to store information about exceptions that occur (e.g., to allow selection of an appropriate exception handler). When
Similarly, when executing instructions at the exception level EL2, the
Finally, when operating at EL3, the processing component may access
Thus, each exception level is associated with a corresponding group of registers that the processing circuitry can access when processing a software process at that exception level. For a given exception level other than the least privileged exception level, the group of registers accessible at the given exception level includes the groups of registers accessible at less privileged exception levels. This hierarchy of state accessible at each level can be exploited to reduce the administrative burden associated with state saving and restoration upon realm entry and exit, as will be discussed below.
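The hierarchy of banked registers can be sketched as follows. The exact register names per level are illustrative (only the EL0 group is enumerated in the text above); the point is the containment property: everything accessible at a less privileged level is also accessible at a more privileged one.

```python
# Hypothetical per-level register groups; EL0's contents follow the text,
# the higher levels are assumed to bank the same kinds of registers.
BANKED = {
    0: ['X0-X30', 'PC', 'SPSR_EL0', 'ELR_EL0', 'RID_EL0', 'ESR_EL0'],
    1: ['SPSR_EL1', 'ELR_EL1', 'RID_EL1', 'ESR_EL1'],
    2: ['SPSR_EL2', 'ELR_EL2', 'RID_EL2', 'ESR_EL2'],
    3: ['SPSR_EL3', 'ELR_EL3', 'RID_EL3', 'ESR_EL3'],
}

def accessible_registers(el: int) -> list:
    """All register groups visible to software running at exception
    level `el`: its own banked group plus every less privileged group."""
    regs = []
    for level in range(el + 1):
        regs += BANKED[level]
    return regs
```

For instance, software at EL2 can access `SPSR_EL0`, while software at EL1 cannot access `SPSR_EL2`; this containment is what lets state saving on realm exit be limited to the register groups at or below a realm's boundary exception level.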
Upon entering or exiting from the domain, the
In the techniques described below, the realm mechanism reuses the mechanisms already provided for exception entry and return in order to enter and exit realms. This reduces the software modification required to support realm entry and exit, and simplifies the architecture and hardware. This is particularly useful because realm boundaries will often correspond to exception level boundaries anyway, and even if new instructions were provided to control entry and exit, the behavior for handling exceptions would still be required; overall, extending the exception mechanism to also control entry and exit is therefore likely to be cheaper.
Thus, an exception return (ERET) instruction, which would normally return processing from an exception handled in the current realm to another process also handled in the current realm (where the other process may be handled at the same or a less privileged exception level than the exception), can be reused to trigger a realm entry from the current realm to a destination realm. In response to a first variant of the exception return instruction, the processing circuitry may switch processing from the current exception level to a less privileged exception level (without changing realm), while in response to a second variant of the exception return instruction, the processing circuitry may switch processing from the current realm to a destination realm operating at the same exception level as, or a less privileged exception level than, the current realm. Using an exception return instruction to trigger a realm entry can greatly simplify the architectural and hardware management burden and reduce the software modifications required to support the use of realms.
Another advantage of using an exception return instruction is that, typically, on return from an exception, the processing circuitry performs an atomic set of operations in response to the exception return instruction. The set of operations required on return from an exception is executed atomically, such that the operations cannot be split partway through: either the instruction fails and none of the atomic set of operations is performed, or the instruction succeeds and all of them are performed. For the second variant of the exception return instruction, the processing circuitry may similarly perform a second atomic set of operations, which may differ from the first. Mechanisms already provided in processors to ensure that exception return instructions complete atomically may be reused for realm entry, in order to avoid a partially executed realm entry that could lead to security vulnerabilities. For example, the second atomic set of operations may include changing the current realm, making the realm execution context state available, and branching to the program counter address at which processing previously stopped on the last execution of the same realm.
The first variant and the second variant of the exception return instruction may have the same instruction encoding. No modification of the exception return instruction itself is therefore necessary in order to trigger a realm entry, which improves compatibility with legacy code. Whether a given exception return instruction is executed as the first variant or the second variant may depend on a control value stored in a status register (e.g., first and second values of the control value may represent the first and second variants of the exception return instruction respectively). Thus, the current architectural state at the time the exception return instruction is executed controls whether it returns the processor to a lower privilege level in the same realm, or triggers entry into a new realm.
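The variant dispatch can be sketched as a small model: one ERET behaviour selected by the R flag in the current level's SPSR. The CPU structure and the tuple return values are purely illustrative stand-ins for the two atomic operation sets.

```python
from dataclasses import dataclass, field

@dataclass
class SPSR:
    r_flag: bool = False  # control value: False = first variant, True = second
    mode: int = 0         # return processing mode (used by the first variant)

@dataclass
class CPU:
    current_el: int = 1
    spsr: dict = field(default_factory=dict)  # banked SPSR_ELx
    elr: dict = field(default_factory=dict)   # banked ELR_ELx
    rid: dict = field(default_factory=dict)   # banked RID_ELx

def execute_eret(cpu: CPU):
    spsr = cpu.spsr[cpu.current_el]
    if spsr.r_flag:
        # Second variant: realm entry. The destination realm comes from
        # RID_ELx and the return state from the REC pointed to by ELR_ELx;
        # the SPSR mode field is not consulted.
        return ('realm_entry', cpu.rid[cpu.current_el],
                cpu.elr[cpu.current_el])
    # First variant: conventional exception return within the same realm,
    # using SPSR for the mode and ELR for the return address.
    return ('exception_return', spsr.mode, cpu.elr[cpu.current_el])
```

The same encoding thus behaves differently purely as a function of architectural state, which is what allows legacy exception-handler code to trigger realm entries without modification.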
This approach enables realm entry to be controlled with fewer software modifications, especially as the value in the status register can be set automatically by hardware in response to events that suggest a realm switch is likely (in addition to allowing the control value to be set voluntarily in response to software instructions). For example, when an exception condition triggering an exit from a given realm occurs, the processing circuitry may set the control value to the second value for that realm, so that a subsequent exception return instruction automatically returns processing to the realm in which the exception occurred, even if the exception handler code used to handle the exception is legacy code not written with realms in mind. Alternatively, in some architectures, on exit from a realm the control value in the status register may still hold the second value that was set before triggering the realm entry, and so no explicit setting of the control value may be required.
In one example, the control value in the status register may be the R flag in the SPSR register associated with the current exception level, as discussed above. Using the SPSR may be convenient because this register is normally used on an exception return to provide the processor mode (including the exception level) and other information about how processing should continue when returning from the exception currently being handled. For a realm entry, however, this information can instead be determined from the realm execution context (REC), and thus the SPSR is not needed for this purpose. Reusing part of the SPSR as the R flag, which controls whether an exception return instruction is treated as the first variant or the second variant, avoids the need to provide an additional register for storing this control value. Thus, it may be useful to use a status register that, in response to the first variant of the ERET instruction, determines the return state information (such as the processing mode) for continuing the exception at a less privileged exception level, whereas in response to the second variant this return state information is instead determined from memory, so that the status register itself need not be accessed. In particular, the status register used to store the control value may be the one associated with the current exception level from which the exception return instruction is executed.
As shown in fig. 27, at least one realm identifier register can be provided, and in response to the second variant of the exception return instruction, the processing circuitry can identify the destination realm from the realm identifier stored in the realm identifier register. The realm identifier registers may be banked, so that there are a plurality of realm identifier registers each associated with one of the exception levels, and in response to the second variant of the exception return instruction, the processing circuitry identifies the destination realm from the realm identifier stored in the realm identifier register associated with the current exception level. By using a realm identifier register to store the target realm identifier, there is no need to include it in the instruction encoding of the ERET instruction, which enables the existing format of the ERET instruction to be used to trigger the realm entry, reducing the software modifications required. The realm identifier in the realm identifier register can be a local realm identifier used by a parent realm to refer to one of its child realms; realm entry is therefore limited to transfers from a parent realm to a child realm, and it is not possible to go from a first realm to another realm that is not a direct child of the first realm. In response to the second variant of the exception return instruction, the processing circuitry may trigger a fault condition when the realm associated with the realm ID in the RID register is an invalid realm (no realm descriptor has been defined, or the realm descriptor defines a lifecycle state other than Active).
In response to the second variant of the exception return instruction, the processing circuitry may restore architectural state associated with the thread to be processed in the destination realm from a realm execution context (REC) memory region specified for the exception return instruction. The state restoration may occur immediately in response to the second variant of the exception return instruction (e.g., as part of the atomic set of operations), or may occur later. For example, state restoration may be done in a lazy manner, such that state required to begin processing in the destination realm (e.g., the program counter and processing mode information) is restored immediately, while other state, such as the general purpose registers, is restored gradually as needed, in parallel with continued processing in the new realm. Thus, the processing circuitry may begin processing in the destination realm before all required architectural state has been restored from the REC memory region.
In response to the first variant of the exception return instruction, the processing circuitry may branch to the program instruction address stored in a link register, for example the ELR of FIG. 27 corresponding to the current exception level at which the exception return instruction is executed. Conversely, for the second variant of the exception return instruction, the processing circuitry may branch to a program instruction address specified in the realm execution context (REC) memory region. Because the link register is not needed by the second variant to identify any architectural state for the new realm directly, it can be reused to provide the pointer to the REC memory region from which the architectural state of the new realm is to be restored. This avoids the need for a further register to store the REC pointer.
Thus, before executing an exception return instruction intended to cause a realm entry into a given realm, some additional instructions may be included to set the RID register to the realm identifier of the destination realm and to set the link register to a pointer to the REC memory region associated with the destination realm. The REC pointer may be obtained by the parent realm from a realm descriptor of the destination realm.
In response to the second variant of the exception return instruction, a fault condition may be triggered by the processing circuitry when the REC memory region is associated with an owner realm other than the destination realm, or when the REC memory region specified for the exception return instruction is invalid. The first check prevents the parent realm from causing the child realm to execute with processor state that the child realm did not itself create, because only a memory region owned by the child realm can store the REC memory region that is accessed on entry into that realm (and, as discussed above, the REC memory region will be set to RMU-private). The second check, of the validity of the REC memory region, ensures that the REC memory region can be used only once to enter the realm; subsequent attempts to enter the realm with the same REC data are rejected. For example, each REC may have a lifecycle state that is either invalid or valid. In response to an exception occurring during processing of a given thread in the current realm, the architectural state of the thread may be saved to the corresponding REC memory region, which then transitions from invalid to valid. The REC memory region transitions from valid back to invalid in response to successful execution of the second variant of the exception return instruction. This prevents the parent realm from maliciously causing the child realm to behave incorrectly by specifying a pointer to an out-of-date REC memory region, to a REC memory region associated with a different thread, or to some other REC associated with the destination realm that does not hold the architectural state saved on the previous exit from the destination realm.
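The one-shot validity rule for a REC can be sketched as a small state machine. The class, the string-valued lifecycle state, and the use of `PermissionError` for the fault condition are illustrative assumptions, not the architectural interface.

```python
class REC:
    """Sketch of the valid/invalid lifecycle of a realm execution context:
    saving state on exit makes the REC valid; a successful entry consumes
    it, so stale REC data cannot be replayed."""

    def __init__(self, owner: str) -> None:
        self.owner = owner        # realm that owns the REC memory region
        self.state = 'invalid'    # lifecycle state
        self.saved = None         # architectural state saved on exit

    def save_on_exit(self, arch_state: dict) -> None:
        self.saved = arch_state
        self.state = 'valid'      # invalid -> valid on realm exit

    def restore_on_entry(self, destination_realm: str) -> dict:
        if self.owner != destination_realm:
            raise PermissionError('REC not owned by destination realm')
        if self.state != 'valid':
            raise PermissionError('stale, reused, or unused REC')
        self.state = 'invalid'    # single use: a replay now faults
        return self.saved
```

A second `restore_on_entry` on the same REC faults, modelling the rejection of replayed or out-of-date REC data described above.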
In a corresponding manner, exit from a realm may reuse the mechanisms provided for exception handling. Thus, in response to an exception condition occurring during processing of a first realm that cannot be handled by the first realm, the processing circuitry may trigger a realm exit to the parent realm that initialized the first realm. On such an exception occurrence/realm exit, some additional operations may be performed that would not be performed for an exception that can be handled within the same realm. These may include, for example, masking or scrubbing of architectural state and the triggering of state saving to the REC, as will be discussed in more detail below.
However, in some cases an exception may occur that cannot be handled even by the parent realm of the first realm in which the exception occurred. In this case it may be necessary to switch to a more distant ancestor realm beyond the parent. While it would be possible to allow switching directly from a given realm to an ancestor realm more than one generation removed, this would increase the complexity of the status registers needed to handle exception entry and return or realm exit and entry.
Alternatively, a nested realm exit may be performed when the exception condition is to be handled at a target exception level with a greater privilege level than the most privileged exception level at which the parent realm of the first realm is allowed to execute. A nested realm exit comprises two or more successive realm exits from child realm to parent realm, until a second realm is reached that is allowed to be processed at the target exception level of the exception that occurred. Raising the realm level one generation at a time in this way can simplify the architecture. At each successive realm exit, operations may be performed to save a subset of the processor state to the REC associated with the corresponding realm.
When the exception has been handled, then in response to an exception return instruction of the second variant executed in the second realm after the nested realm exit, the processing circuitry may trigger a nested realm entry to return to the first realm. This can be handled in different ways. In some examples, the hardware may trigger the nested realm entry itself, without requiring any instructions to be executed in the intermediate realms encountered between the first realm and the second realm during the nested realm exit. Alternatively, the hardware may be simplified by providing a nested realm entry process that returns one generation at a time through each successive realm encountered in the nested realm exit, executing a further ERET instruction of the second variant at each intermediate realm. In this case, to ensure that the intermediate realm triggers a return to the child realm that exited to it during the nested realm exit, an exception status register may be set to indicate that a predetermined type of exception condition occurred in the child realm. For example, a new type of exception condition (e.g., a "fake realm exit") may be defined to handle this intermediate realm case. When the intermediate realm is reached, the processor then resumes processing within the intermediate realm from the program instruction address corresponding to the exception handling routine for that predetermined type of exception condition. This exception handling routine may, for example, simply determine that a child realm exited for some unknown reason, and choose to execute another exception return instruction of the second variant to return processing to the further child realm. By performing this operation at each intermediate realm, the original first realm in which the exception occurred eventually resumes processing.
During this nested realm exit and entry procedure, an intermediate realm flag within a status register may be used to mark which realms are intermediate realms, either to trigger an immediate hardware-controlled realm entry to the relevant child realm, or to trigger the setting of the exception state information that then causes the exception handler or other code within the intermediate realm to return to the child realm. For example, the intermediate realm flag may be the Int flag in the associated SPSR as discussed for FIG. 27.
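The one-generation-at-a-time exit can be sketched as a loop over the realm hierarchy. The `Realm` structure, its attribute names, and the returned values are illustrative; the loop simply walks towards the root until a realm whose boundary exception level can host the handler is found, flagging each realm passed through as intermediate.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Realm:
    bel: int                         # boundary exception level
    parent: Optional['Realm'] = None
    intermediate: bool = False       # intermediate realm flag (cf. SPSR Int)

def nested_realm_exit(realm: Realm, target_el: int):
    """Exit from `realm` one generation at a time until reaching a realm
    whose BEL is at least `target_el`. Returns that handler realm and the
    list of intermediate realms flagged along the way."""
    intermediates = []
    current = realm.parent            # first (ordinary) realm exit
    while current.bel < target_el:
        # This parent cannot host the handler: flag it as intermediate
        # and exit one more generation towards the root.
        current.intermediate = True
        intermediates.append(current)
        current = current.parent
    return current, intermediates
```

For example, with realms at BEL 3 (root), 2, 1, and 0, an exception in the BEL-0 realm targeting EL2 exits twice: the BEL-1 parent is flagged as intermediate and the BEL-2 realm becomes the handler, matching the nested-exit sequence described above.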
FIG. 28 is a flow chart illustrating a method of handling a domain entry or exception return. At
At
If at
the local RID indicated in the realm identifier register RID_ELx associated with exception level ELx indicates a valid child realm. That is, the RMU checks the realm descriptor accessed from the realm descriptor tree 360 for the specified child realm, and checks whether the lifecycle state of that realm descriptor indicates the Active state. If the child realm is in any state other than Active, the realm check is unsuccessful.
The RMU 20 also checks that the REC memory region indicated by the pointer in the link register ELR_ELx is a memory region owned by the child realm indicated in the realm ID register RID_ELx. That is, the RMU 20 accesses the realm granule table 128 (or cached information from the RGT 128), locates the entry corresponding to the memory region indicated by the REC pointer, and checks the owner realm specified for that memory region. The owner realm indicated in the ownership table may be specified as a global RID, which can be compared against the global RID specified in the realm descriptor of the target child realm to determine whether the child realm is the valid owner of the REC. This check is unsuccessful if the REC is owned by any realm other than the child realm specified in the RID register.
The RMU 20 also checks whether the status of the REC memory region indicated in ELR_ELx is valid. There are different ways in which the validity of the REC memory region can be represented. For example, each REC memory region may include a flag specifying whether it is valid; alternatively, a separate table may define the validity of RECs stored in other memory regions. A REC may be valid if it was used to store the architectural state of the associated realm on a previous exception exit, but has not yet been used to restore that state on a return from the exception. If the REC is invalid, the realm check is again unsuccessful.
The RMU 20 also checks whether a flush command has been executed since the last exit from any child realm other than the child realm indicated in the RID register RID_ELx. The flush command is discussed in more detail below, but is a command that ensures that any state still to be saved to the REC of the exited child realm is pushed to memory (which supports the lazy state saving approach). If no flush command has been executed and the system attempts to enter a different child realm from the one previously exited, there is a danger that state may remain in the processor registers that has not yet been pushed to memory. Enforcing the use of the flush command ensures that a different child realm can be entered safely without loss of the previous child realm's state (or leakage of that state to other realms). There may be multiple ways of identifying whether a flush command has been executed. For example, status flags may track whether (a) the RID register RID_ELx has changed since the exit from the last realm, and (b) a flush command has been executed since the exit from the last realm. The realm check is unsuccessful if the RID register has changed and no flush command has been executed since exiting the previous realm.
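The realm-entry checks just listed can be sketched as one function. The dictionaries standing in for the realm descriptor tree 360 and the ownership/REC state, and the returned strings, are illustrative assumptions for the sketch only.

```python
def check_realm_entry(descriptors, rec_regions, rid_reg, elr_reg,
                      flush_done, last_exit_rid):
    """Sketch of the four checks made before a realm entry succeeds.
    `descriptors` maps local RIDs to {'lifecycle', 'grid'};
    `rec_regions` maps REC pointers to (owner_grid, rec_state)."""
    desc = descriptors.get(rid_reg)
    if desc is None or desc['lifecycle'] != 'active':
        return 'fault: target realm not active'          # check 1
    owner, rec_state = rec_regions[elr_reg]
    if owner != desc['grid']:
        return 'fault: REC not owned by target realm'    # check 2
    if rec_state != 'valid':
        return 'fault: REC invalid'                      # check 3
    if (last_exit_rid is not None and last_exit_rid != rid_reg
            and not flush_done):
        return 'fault: flush required'                   # check 4
    return 'ok'
```

Note that check 4 only fires when entering a different child realm from the one last exited without an intervening flush; re-entering the same child realm needs no flush, matching the status-flag scheme described above.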
If either of the domain checks is unsuccessful, then a fault is triggered at
If all of the domain checks are successful, then at
At
At
Alternatively, if the intermediate realm flag is set, this indicates that nested realm exit has previously occurred, and that nested realm entry has reached the intermediate realm. Thus, it is necessary to return processing to further child domains in order to return to the domain in which the exception originally occurred. There are two alternative techniques for handling this situation. In a first alternative, at
Alternatively, instead of handling nested domain entry with another ERET instruction executing in the intermediate domain, at
FIG. 29 shows a flow chart illustrating a method of exiting a realm or taking an exception. At step 430, an exception occurs within a given exception level ELx, targeting exception level ELy (ELy ≥ ELx). The target exception level ELy is the exception level at which the exception is to be handled; it may be the same as ELx, one exception level above ELx, or multiple exception levels above.
At step 432, the RMU determines whether the target exception level ELy is greater than the boundary exception level (BEL) of the current realm, which can be read from the realm descriptor of the current realm, and whether the current realm is a sub-realm (see the discussion of sub-realms below; the type field of the realm descriptor shown in FIG. 16 specifies whether a realm is a sub-realm). If the target exception level ELy is not greater than the boundary exception level, the exception can be handled within the current realm, and so, if the current realm is a full realm, there is no need to trigger a realm exit. In this case, at
On the other hand, if at step 432 the target exception level ELy is greater than the BEL of the current realm, or if the current realm is a sub-realm (for which any exception triggers an exit to the parent realm of the sub-realm), a realm exit is required in order to handle the exception. At
If the domain exit is not a voluntary domain exit, then at
If the domain exit is a voluntary domain exit, then at
Regardless of whether the domain exit is voluntary or involuntary, at
At
At
However, for a nested realm exit, it may be assumed that, for an intermediate realm, any registers accessible at a lower exception level than the boundary exception level of the intermediate realm will have been modified by the child realm at the lower exception level, because the intermediate realm triggered entry to that child realm. Thus, such registers accessible at the lower exception level need not be saved to the REC associated with the intermediate realm during a nested realm exit (no further execution occurred in the intermediate realm since the previous entry into the child realm). Conversely, during a nested realm entry, these registers accessible at lower levels need not be restored while passing through the intermediate realm, since they will subsequently be restored by the realm at the lower exception level. Instead, the intermediate realm state saving and restoration may comprise only the registers accessible at the boundary exception level of the intermediate realm but not accessible at the lower exception level. For example, for an intermediate realm at EL1, the state saved/restored in the nested realm exit/entry may include
Thus, when an exception condition is to be handled by an exception handler at a target exception level that is more privileged than the boundary exception level of the parent realm of the realm in which the exception occurred, a nested realm exit may be triggered, comprising a plurality of successive realm exits from child realm to parent realm until a target realm with a boundary exception level corresponding to the target exception level, or higher, is reached. A respective state masking process (and state save) may be triggered for each of the successive realm exits, and each masking process masks (and saves) a corresponding subset of registers selected based on the boundary exception level. For a realm exit from a given child realm having a boundary exception level other than the least privileged exception level, the corresponding subset of registers masked/saved during the nested realm exit may include at least one register accessible at the boundary exception level of the given child realm, but may exclude at least one register accessible to the processing circuitry at a less privileged exception level than that boundary exception level (since such a register can be assumed to have been saved when exiting the realm at the less privileged exception level). This reduces the amount of state masking and saving required.
Similarly, upon a realm entry (exception return), an intermediate realm flag may be used to determine whether the incoming realm is an intermediate realm. If the intermediate realm state value for a realm having a boundary exception level other than the least privileged exception level is set to a predetermined value (indicating an intermediate realm), the subset of registers to be restored upon realm entry may include at least one register accessible at the boundary exception level of the intermediate realm, but may exclude at least one register accessible to the processing circuitry at a less privileged exception level than the boundary exception level of that realm. If the intermediate state value is set to a value other than the predetermined value, then the realm being entered is the final realm, and so the subset of registers to be restored may include all registers accessible to the processing circuitry at the boundary exception level of that realm (without excluding any registers associated with lower levels).
In this way, state save and restore operations during nested domain exit and entry can be performed more efficiently.
FIG. 30 shows an example of non-nested realm entry and exit. In this example, the parent realm A operates at exception level EL1 and wishes to enter the child realm B, which has a BEL of EL0.
When an exception occurs at step 472 during execution of child realm B, a set of masking operations is performed to hide the state associated with realm B from its parent realm. This includes masking and washing of at least a subset of the architectural state associated with EL0 at step 474 (the subset of architectural state masked/washed may depend on whether the exit is a voluntary or involuntary realm exit). The masking makes the state inaccessible, and the washing ensures that subsequent accesses to the corresponding registers from the parent realm will return the predetermined value. At step 476, a state save to the REC associated with realm B is performed for the masked subset of the architectural state. In this case, because the child realm has a BEL of EL0, the masked subset of state includes at least the
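The mask-and-wash behaviour described above can be sketched as a minimal software model. This is illustrative only, not the hardware mechanism: the register names, the predetermined value of 0, and the internal representation are all assumptions made for the sketch.

```python
PREDETERMINED = 0  # hypothetical predetermined value returned after a wash

class RegisterFile:
    """Toy model: reads of a washed register return the predetermined value
    until an intervening write access replaces the register's contents."""

    def __init__(self, values):
        self.values = dict(values)   # architectural register -> value
        self.washed = set()

    def mask_and_wash(self, regs):
        # The real masked values would be retained for saving to the REC;
        # here we simply record that reads must see the predetermined value.
        self.washed |= set(regs)

    def read(self, reg):
        return PREDETERMINED if reg in self.washed else self.values[reg]

    def write(self, reg, value):
        self.washed.discard(reg)     # an intervening write ends the wash
        self.values[reg] = value
```

For example, after `mask_and_wash(["x0"])`, a parent realm reading `x0` observes 0 rather than the child realm's value, while unwashed registers read normally.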
FIG. 31 shows a similar example illustrating nested realm entry and exit. The grandparent realm A at exception level EL2 executes an ERET instruction at
Subsequently, an exception occurs at step 514. The exception targets
At step 520, the processing component detects that the target exception level of the exception that occurred is higher than the boundary exception level of realm B; thus realm B is an intermediate realm, and a further realm exit to a parent realm of realm B is required. Thus, at
A
Upon returning to realm B at EL1, the processing component detects that the intermediate flag is set in SPSR_EL1 (step 532). Thus, at
Alternatively, for hardware-assisted nested realm entry, steps 536 to 542 may be omitted, and instead of restoring the required subset of states at
By using this nested realm entry and exit procedure, the realm at EL2 avoids needing to process any realm values or REC indices associated with
Fig. 32 and 33 show lazy state saving to the REC and state restoration from the REC at the time of realm exit and realm entry, respectively. In general, upon exiting a realm to a parent realm, it may be desirable to mask the state associated with the child realm to hide that state from the parent realm, and to perform a wash to ensure that a predetermined value will be seen if the parent realm attempts to access an architectural register corresponding to the washed state. These operations may be performed relatively quickly. However, if there is insufficient space in the physical register file of the processing circuitry to hold the child realm's state indefinitely, it may be desirable to save some of this data to the REC. This can take longer and occupies memory bandwidth that could otherwise be used for processing in the parent realm, which can delay that processing. Similarly, the corresponding operation of restoring state from memory into registers upon entering the realm may take some time. Therefore, for performance reasons, it may be desirable to support asynchronous saving/restoring of processing component state to/from the REC. Whether a given processor implementation actually makes use of this lazy state saving is an implementation choice for the particular processor. For example, some processors not aimed at high performance may find it simpler to trigger the state save operations immediately, to reduce the complexity of tracking which states have been saved and which have not. However, to provide performance when needed, it may be desirable for the architecture to provide functionality supporting such an asynchronous, lazy state saving approach.
Thus, in response to a realm switch from a source realm to a target realm to be processed at a more privileged exception level than the source realm, the processing circuitry may perform state masking to render a subset of the architectural state data associated with the source realm inaccessible to the target realm. While this masked subset of state could be saved to memory at this point, this is not required. Instead, the architecture provides a flush command that can be used after a realm switch. When the flush command is executed, the processing circuitry ensures that any of the masked subset of the architectural state data that has not yet been saved to the at least one REC memory region owned by the source realm is saved to that region. By providing such a flush command, saving of the subset of architectural state data can be forced through at the point where it has to be guaranteed complete, while leaving a degree of freedom for the specific micro-architectural implementation of the architecture to vary exactly when that subset is actually saved to memory in the absence of a flush command.
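The flush command's guarantee can be sketched as follows. This is a software model for illustration only: the class and method names are invented, and the "pending" dictionary stands in for whatever micro-architectural storage holds masked-but-unsaved state.

```python
class LazyRealmExit:
    """Toy model of lazy saving of masked state, with a flush command that
    forces any not-yet-saved items to be committed to the REC region."""

    def __init__(self, masked_state):
        self.pending = dict(masked_state)  # masked but not yet saved
        self.rec = {}                      # realm execution context in memory

    def save_one(self, reg):
        # A micro-architecture may drain individual entries whenever convenient.
        if reg in self.pending:
            self.rec[reg] = self.pending.pop(reg)

    def flush(self):
        # Architectural guarantee: after flush, nothing of the masked
        # subset remains unsaved.
        for reg in list(self.pending):
            self.save_one(reg)
```

Before the flush, an implementation is free to have saved none, some, or all of the masked subset; the flush only pins down the point by which the saving must be complete.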
In addition to the state masking, after a realm switch the processing circuitry may also perform the register wash operation discussed above, which ensures that any subsequent read access to a given architectural register returns a predetermined value (if performed without an intervening write access). The wash may be performed by actually writing the predetermined value to the physical register corresponding to the given architectural register, by register renaming, or by setting some other control state associated with the given architectural register to indicate that read accesses should return the predetermined value rather than the actual contents of the corresponding physical register. If the state saving is to be done asynchronously with the realm switch, the processing circuitry may begin processing of the target realm while at least a portion of the subset of the architectural state data made inaccessible in response to the realm switch remains stored in the registers of the processing circuitry. For example, a processor may have a physical register file larger than the number of registers provided as architectural registers in the instruction set architecture, so some spare physical registers may be used to hold the previously masked state for a period after processing has begun in the target realm. This is advantageous because, if processing then returns to the source realm while a given item of the subset of architectural state data is still stored in a register, the processing circuitry can simply restore access to that given item of architectural state from the register file, without needing to restore the data from the REC. Some types of exception may require only a relatively short exception handler to be executed, in which case some masked state may well remain resident in the register file when returning from the exception.
Such "shallow" exception entry/return events may benefit from using lazy state preservation.
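A shallow return can be modelled as below. This is a hedged sketch, not the hardware mechanism: the eviction and restore functions, and the counter of REC loads, are invented to make the performance benefit visible.

```python
class ShallowReturnModel:
    """Toy model: a masked item still resident in the register file can be
    restored on return to the source realm without a memory access."""

    def __init__(self, masked):
        self.resident = dict(masked)  # masked values still held in registers
        self.rec_loads = 0            # restores that needed a memory access

    def evict(self, reg, rec):
        # Lazy save: item leaves the register file and lands in the REC.
        rec[reg] = self.resident.pop(reg)

    def restore(self, reg, rec):
        if reg in self.resident:
            return self.resident[reg]  # shallow case: no memory access
        self.rec_loads += 1            # deep case: reload from the REC
        return rec[reg]
```

In the shallow case the restore touches only the register file; only items that were actually evicted to the REC in the meantime incur a memory access on return.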
If lazy state saving is used, then once processing of the target realm has begun after the exception, the processing circuitry may trigger the saving of a given item of the masked subset to the REC region in response to the occurrence of a predetermined event other than a flush command. Although processing has by now switched to the parent realm (which typically cannot access the REC associated with the previous child realm), because these operations are triggered in hardware by the micro-architectural implementation rather than by software, they are not subject to the same ownership checks required for general software-triggered memory accesses (effectively, these REC save operations were already granted by the child realm before exiting).
Many different types of predetermined event may be used to trigger saving of certain items of the subset of architectural state data to the REC, including the following:
A register access to an architectural register corresponding to a given item of the subset of the architectural state data. This approach may be useful for less complex processors that do not support register renaming. In that case, each architectural register may be mapped to a fixed physical register, and so the first time code associated with the parent realm attempts to access a given architectural register, the old value of that register used by the child realm may need to be saved to memory.
A remapping of the physical register storing a given item of the subset of architectural state data. In systems that support register renaming, the architectural state may remain longer in the register file, but eventually the corresponding physical register may need to be remapped to store a different value, and at that point the corresponding architectural state of the child realm may be saved to the REC.
The number of available physical registers becoming less than or equal to a predetermined threshold. In this case, rather than waiting for the actual remapping of a given physical register, state saving may be performed pre-emptively once the number of free physical registers (which are available for reallocation to different architectural registers) becomes low.
Elapse of a given number of cycles or a given period of time. Thus, saving need not be triggered by any particular processing event; instead, lazy state saving may simply spread the saving of the child realm's context to the REC over a period of time, in order to reduce the impact on the memory bandwidth available for other memory accesses triggered by processing in the parent realm.
An event indicating reduced processor workload, such as an idle processor period or some other event indicating that performing a state save now will have less impact on the overall performance of processing in the parent realm. At such a point, saving of at least a portion of the subset of the architectural state data may be triggered.
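One of the trigger events listed above, the free-physical-register threshold, can be sketched as a simple drain loop. This is a toy illustration under invented assumptions: the threshold value, the data structures, and the one-register-per-item accounting are all hypothetical.

```python
def maybe_drain(free_physical_regs, pending, rec, threshold=4):
    """Pre-emptively save pending masked items to the REC while the count
    of free physical registers is at or below the threshold."""
    while free_physical_regs <= threshold and pending:
        reg, value = pending.popitem()
        rec[reg] = value             # commit one masked item to the REC ...
        free_physical_regs += 1      # ... freeing its physical register
    return free_physical_regs
```

The loop stops as soon as the free-register count rises above the threshold, so only as much of the child realm's context is drained as register pressure demands.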
After the realm switch, if the processing circuitry attempts to enter a further realm other than the source realm from which the earlier realm switch was made, then when the further realm is to be processed at the same or a less privileged exception level than the target realm and no flush command has been received between the realm switch and the realm entry request, the processing circuitry may reject the realm entry request. Alternatively, the realm entry request could be accepted regardless of whether the flush command has been executed, but if the flush command has not been executed, the initial child realm REC state may be destroyed so that the REC is not reusable, thereby preventing effective entry into the child realm. Either way, a flush command is required before a parent realm can successfully direct processing to a different child realm from the one previously executed. This ensures that, even if the hardware chooses to use the lazy state saving approach, all necessary state associated with the previous child realm will have been committed for saving to memory before a different child realm is entered. This avoids the need to retain multiple sets of outstanding child realm data awaiting saving to memory, and simplifies the architecture.
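The entry-rejection variant of this rule can be written as a one-screen check. This sketch is illustrative only; the function name, realm identifiers, and boolean interface are assumptions, and it models only the rejecting behaviour (not the alternative REC-destruction behaviour).

```python
def realm_entry_allowed(requested_realm, last_exited_realm, flushed):
    """Toy model: a parent realm may re-enter the child it last exited at
    any time, but entering a *different* child realm requires that a flush
    command has been executed since the earlier realm exit."""
    if requested_realm == last_exited_realm:
        return True            # re-entering the same child is always fine
    return flushed             # a different child requires a prior flush
```

Re-entry to the same child needs no flush precisely because its masked state, whether still resident in registers or already in the REC, belongs to that child anyway.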
It should be noted that the flush command only needs to ensure that the state from the masked registers is committed for storage to the REC memory region. The store operations triggered by the flush command may be queued in a load/store queue of the
The flush command may be a native instruction supported by an instruction decoder of the processing circuitry. Alternatively, the flush command may be a command triggered by a predetermined event occurring during processing of instructions decoded by the instruction decoder. For example, the flush command may be triggered automatically by some other type of instruction which implies that the state save operations to memory should by then have been triggered for all of the required subset of architectural state relating to the previous child realm.
As discussed above, the particular subset of architectural state to be saved during a domain switch may depend on the boundary exception level associated with the source domain (and may also depend on whether the source domain is an intermediate domain in a nested domain exit). The state masking and saving operations may be suppressed if the domain switch is a predetermined type of domain switch (e.g., a domain switch triggered by execution of a voluntary domain switch instruction in the source domain).
Thus, FIG. 32 shows an example of lazy state save and restore. At
Fig. 33 shows another example of performing a
Thus, the use of the flush command enables fast exception exit with gradual draining of processor state into the REC of the previously exited realm, and also allows shallow exception exit and return, in which state remains within the registers of the processing component and need not be stored to and reloaded from the REC.
FIG. 34 illustrates the concept of a sub-domain that can be initialized by a previous generation domain. As shown in FIG. 34, a given upper-
Sub-realms may generally be handled in the same way as full realms, with some differences explained below. Entry into and exit from a sub-realm may be handled in the same manner as discussed above, using exception return instructions and exception events. Thus, a sub-realm may have a child realm ID constructed in the same manner as for full child realms of the same parent, and may be provided with a realm descriptor within a realm descriptor tree as discussed above. Entry into the sub-realm may be triggered simply by executing an ERET instruction, having placed the appropriate child sub-realm RID in the RID register before executing the ERET instruction. Thus, the same type of ERET instruction (of the second variant) may be used to trigger entry into either a full realm or a sub-realm.
One way in which a sub-realm may differ from a full realm is that a sub-realm may not be allowed to initialize its own child sub-realms. Accordingly, a realm initialization command for initializing a new realm may be rejected if the current realm is a sub-realm. The RMU may use a realm type value in the realm descriptor of the current realm to determine whether the current realm is a full realm or a sub-realm. By disabling realm initialization when currently in a sub-realm, the architecture is simplified, as no additional status registers have to be provided for use by the sub-realm in initializing further realms.
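The RMU's type check described above can be expressed as a small sketch. The descriptor layout, field names, and return values here are invented for illustration; only the rule itself (sub-realms cannot create child realms) comes from the text.

```python
def handle_realm_init(current_realm_descriptor):
    """Toy model of the RMU consulting the realm type value in the current
    realm's descriptor before accepting a realm initialization command."""
    if current_realm_descriptor.get("realm_type") == "sub":
        return "rejected"      # sub-realms may not initialize new realms
    return "accepted"          # full realms may create child realms
```
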
Similarly, execution of a realm entry instruction may be prohibited when currently in a sub-realm. This simplifies the architecture, as it means that some of the banked registers discussed above for handling realm entry and exit (and exception entry and return), such as the ELR, SPSR, ESR, and RID registers, do not need to be banked again per sub-realm, which would be difficult to manage since it may not be known at design time how many sub-realms a given process will create. Similarly, exception return events that trigger a switch to a process operating at a lower privilege level may be disabled when the current realm is a sub-realm rather than a full realm. Although in the examples discussed above a single type of ERET instruction is used as both the realm entry instruction and the exception return instruction, this is not essential for all embodiments, and where separate instructions are provided, both may be disabled when the current realm is a sub-realm.
Similarly, when an exception occurs while in a sub-realm, rather than taking the exception directly from the sub-realm, the processing circuitry may trigger an exit from the sub-realm to the parent full realm that initialized the sub-realm before handling the exception. Thus, the exception triggers a return to the parent full realm. The exception return to the parent full realm may include the state masking, washing, and saving operations to the REC, but by avoiding exceptions being taken directly from the sub-realm to a realm at a higher exception level, this avoids the need to bank exception control registers such as the ELR, SPSR, and ESR again for the sub-realm, simplifying the architecture.
For a sub-realm, the boundary exception level, which indicates the maximum privilege level at which processing of the realm is allowed, is equal to the boundary exception level of the parent full realm of that sub-realm. In contrast, for a child full realm, the boundary exception level is a less privileged exception level than the boundary exception level of the parent realm of the child full realm.
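The BEL rule can be captured in a few lines. This is a sketch under stated assumptions: exception levels are represented as integers (EL0 = 0, least privileged), and for the child full realm the sketch arbitrarily picks the level one below the parent's, whereas the text only requires *some* less privileged level.

```python
def child_bel(parent_bel, child_type):
    """Toy model of a child realm's boundary exception level (BEL)."""
    if child_type == "sub":
        return parent_bel            # sub-realm: inherits the parent's BEL
    # Child full realm: must be strictly less privileged than the parent;
    # one level lower is chosen here purely for illustration.
    assert parent_bel > 0, "a parent at EL0 cannot have a full child realm"
    return parent_bel - 1
```

This also shows why an EL0 realm can only create sub-realms: there is no less privileged exception level left for a child full realm's BEL.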
When a realm is initialized by a parent realm, the parent realm may select whether the new realm will be a child full realm or a child sub-realm, and may set the appropriate realm type parameter in the realm descriptor accordingly. Once the realm is operational, the parent realm can no longer change the realm type, because modification of the realm descriptor is prohibited by the managed realm lifecycle discussed above with respect to FIG. 18.
In summary, the ability to introduce sub-realms, which are managed similarly to full realms but with exception handling, realm initialization, and realm entry functions disabled within the sub-realm, enables smaller pieces of code corresponding to a given address range within a full realm's software process to be isolated from other parts of that software, providing additional security for certain pieces of sensitive code or data.
FIG. 35 shows a simulator implementation that may be used. While the earlier described embodiments implement the present invention in terms of apparatus and methods for operating specific processing hardware supporting the techniques concerned, it is also possible to provide an instruction execution environment in accordance with the embodiments described herein through the use of a computer program. Such computer programs are often referred to as simulators, insofar as they provide a software-based implementation of a hardware architecture. Varieties of simulator computer programs include emulators, virtual machines, models, and binary translators, including dynamic binary translators. Typically, a simulator implementation may run on a
To the extent that embodiments have been described above with reference to particular hardware configurations or features, in a simulated embodiment equivalent functionality may be provided by appropriate software configurations or features. For example, particular circuitry may be implemented as computer program logic in a simulated embodiment. Similarly, memory hardware such as registers or caches may be implemented as software data structures in a simulated embodiment. Some simulated embodiments may make use of the host hardware where appropriate, in arrangements where one or more of the hardware components referenced in the previously described embodiments are present on the host hardware (e.g., host processor 730).
At least some examples provide a virtual machine computer program comprising instructions for controlling a host data processing apparatus to provide an instruction execution environment in accordance with an apparatus comprising: processing circuitry to perform processing of software processes at one of a plurality of exception levels; and memory access circuitry to enforce ownership rights for a plurality of memory regions, wherein a given memory region is associated with an owner realm specified from among a plurality of realms, each realm corresponding to at least a portion of at least one software process, the owner realm having the right to prevent software processes processed at a more privileged exception level than the owner realm from accessing the given memory region; wherein, in response to a realm switch from a source realm to a target realm to be processed at a more privileged exception level than the source realm, the processing circuitry is configured to perform state masking to make a subset of architectural state data associated with the source realm inaccessible to the target realm; and in response to a flush command following the realm switch, the processing circuitry is configured to ensure that any architectural state data in the subset of architectural state data that has not yet been saved to at least one realm execution context memory region owned by the source realm is saved to the at least one realm execution context memory region. A storage medium may store the virtual machine computer program. The storage medium may be a non-transitory storage medium.
In this application, the words "configured to" are used to mean that a component of an apparatus has a configuration capable of performing the defined operation. In this context, a "configuration" means an arrangement or manner of interconnection of hardware or software. For example, the apparatus may have dedicated hardware providing the defined operations, or a processor or other processing device may be programmed to perform the function. "Configured to" does not imply that the apparatus components need to be changed in any way in order to provide the defined operation.
Although illustrative embodiments of the present invention have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope and spirit of the invention as defined by the appended claims.