Region fusion
Reader's note: this technology, region fusion, was devised by Jason Parker and Martin Weidmann on 2018-12-10. Summary: A domain management unit (RMU) 20 maintains an ownership table 128 that specifies ownership entries for corresponding memory regions. Each ownership entry defines ownership attributes that specify, from among a plurality of domains, the owner domain of the corresponding region. Each domain corresponds to at least a portion of at least one software process. The owner domain has the right to exclude other domains from accessing the data stored in the corresponding region. Memory access is controlled based on the ownership table. In response to a region fusion command specifying a fusion target address indicating contiguous regions of memory to be fused into a fusion group of regions, a region fusion operation updates the ownership table to indicate that the ownership attributes of the fusion group of regions are represented by a single ownership entry. This provides architectural support for improving TLB performance.
1. An apparatus, comprising:
processing circuitry to perform data processing in response to one or more software processes;
a domain management unit that maintains an ownership table specifying a plurality of ownership entries, each ownership entry defining an ownership attribute for a respective region of a predetermined size of memory, the ownership attribute specifying an owner domain of the respective region from among a plurality of domains, wherein each domain corresponds to at least a portion of at least one of the software processes and the owner domain has authority to exclude other domains from accessing data stored within the respective region; and
memory access circuitry to control access to the memory based on the ownership table; wherein:
the domain management unit performs a region fusion operation in response to a region fusion command that specifies a fusion target address indicating a plurality of consecutive regions of the memory to be fused into a fusion group of regions, the region fusion operation updating the ownership table to indicate that ownership attributes of each region in the fusion group of regions are represented by a single ownership entry corresponding to a predetermined region of the fusion group.
2. The apparatus of claim 1, wherein the memory access circuitry is configured to control access to the memory based on both the ownership table and at least one translation table that provides address translation data for translating virtual addresses to physical addresses.
3. The apparatus of claim 2, comprising at least one translation look-aside buffer to buffer information from the at least one translation table and the ownership table.
4. The apparatus of any preceding claim, wherein the domain management unit is configured to allow the region fusion operation to be performed in response to a region fusion command triggered by a domain other than the owner domain specified in the ownership table for the plurality of contiguous regions.
5. The apparatus of any preceding claim, wherein, in response to the region fusion command, the domain management unit is configured to determine whether the region fusion command passes at least one validation check, to reject the region fusion command if the at least one validation check fails, and to perform the region fusion operation if the at least one validation check passes.
6. The apparatus of claim 5, wherein the domain management unit is configured to determine that the at least one validation check fails when different ownership attributes are specified in the ownership entries for any two of the plurality of contiguous regions.
7. The apparatus of any of claims 5 and 6, wherein each ownership entry corresponds to a physically addressed region of memory and specifies a mapped address, the mapped address being the address to which the physical address identifying the respective region was mapped when the ownership attribute of the respective region was set in the ownership table; and
the at least one validation check fails when the ownership entries of the plurality of contiguous regions specify a non-contiguous set of mapped addresses.
8. The apparatus of any of claims 5 to 7, wherein the ownership attribute of each ownership entry specifies one of a plurality of region states associated with the respective memory region, the region states including at least an invalid state in which the memory region is allowed to be reassigned to a different owner domain and a valid state in which the memory region is assigned to a given owner domain and prevented from being reassigned to a different owner domain; and
the at least one validation check fails when the ownership entry for any of the plurality of contiguous regions specifies that the region is in a state other than the valid state.
9. The apparatus of any preceding claim, wherein the ownership table comprises one or more linear tables.
10. The apparatus of any preceding claim, wherein, in response to a region fusion command of a group-size-increment form, the region fusion operation comprises updating the ownership table to indicate that the plurality of contiguous regions are fused into a fused group having a target fused group size, the region fusion command of the group-size-increment form specifying an indication of the target fused group size, each of the plurality of contiguous regions being associated with a fused group having an expected current fused group size, the expected current fused group size being the next smallest fused group size, relative to the target fused group size, among a plurality of fused group sizes supported by the domain management unit.
11. The apparatus of claim 10, wherein a smallest fused group size of the plurality of fused group sizes corresponds to the predetermined size of a single region.
12. The apparatus of any of claims 10 and 11, wherein the domain management unit is configured to reject the region fusion command of the group-size-increment form when the ownership table indicates that at least one of the plurality of contiguous regions is associated with a current fused group size other than the next smallest fused group size relative to the target fused group size.
13. The apparatus of any of claims 10 to 12, wherein, in response to the region fusion command of the group-size-increment form, the region fusion operation comprises updating a subset of the ownership entries associated with the plurality of contiguous regions, the subset being selected in accordance with the indication of the target fused group size.
14. The apparatus of claim 13, wherein, when the expected current fused group size is the smallest fused group size of the plurality of fused group sizes, the subset of ownership entries comprises the ownership entries associated with all of the plurality of contiguous regions to be fused into the fused group having the target fused group size; and
when the expected current fused group size is a fused group size other than the smallest fused group size, the subset of ownership entries comprises ownership entries associated with fewer than all of the plurality of contiguous regions to be fused into the fused group.
15. The apparatus of any of claims 13 and 14, wherein the subset comprises ownership entries associated with regions whose addresses are offset from one another by an amount corresponding to the expected current fused group size.
16. The apparatus of any of claims 13 to 15, wherein the subset includes the same number of ownership entries regardless of which fused group size is the target fused group size.
17. The apparatus of any preceding claim, wherein the domain management unit performs a region splitting operation in response to a region splitting command specifying a split target address, the split target address indicating a fused group of regions to be split into subsets of one or more regions, the region splitting operation updating the ownership table to indicate that the ownership attributes of the respective subsets of regions are represented by different ownership entries of the ownership table.
18. The apparatus of claim 17, comprising a translation lookaside buffer to buffer information from the ownership table; wherein:
the region splitting operation comprises triggering the translation look-aside buffer to invalidate information associated with the fused group of regions indicated by the split target address.
19. The apparatus of any preceding claim, wherein, during the region fusion operation, the domain management unit is configured to lock the ownership entries to be updated in the region fusion operation, preventing them from being updated by other processes until the region fusion operation is complete.
20. The apparatus of any preceding claim, wherein the domain management unit is configured to reject a command specifying an operation to be performed on a single target region of the predetermined size of memory when the single target region is part of a fused group of regions.
21. The apparatus of any preceding claim, wherein the owner domain of a given memory region has authority to prevent processes executed at a higher privilege level than the owner domain from accessing the given memory region.
22. The apparatus of any preceding claim, wherein the domain management unit comprises one of:
a hardware unit; and
processing circuitry executing domain management software.
23. A method, comprising:
maintaining an ownership table specifying a plurality of ownership entries, each ownership entry defining an ownership attribute for a respective region of a predetermined size of memory, the ownership attribute specifying an owner domain of the respective region from among a plurality of domains, wherein each domain corresponds to at least a portion of at least one software process processed by processing circuitry, and the owner domain is entitled to exclude other domains from accessing data stored within the respective region;
controlling access to the memory based on the ownership table; and
performing a region fusion operation in response to a region fusion command specifying a fusion target address, the fusion target address indicating a plurality of contiguous regions of the memory to be fused into a fusion group of regions, the region fusion operation updating the ownership table to indicate that the ownership attributes of each region in the fusion group of regions are represented by a single ownership entry corresponding to a predetermined region of the fusion group.
24. A computer program for controlling a host data processing apparatus to simulate processing of one or more software processes on a target data processing apparatus, the computer program comprising:
handler logic to control the host data processing apparatus to perform data processing in response to object code representing the one or more software processes;
domain manager logic to maintain an ownership table specifying a plurality of ownership entries, each ownership entry defining an ownership attribute for a respective region of a predetermined size of memory, the ownership attribute specifying an owner domain of the respective region from a plurality of domains, wherein each domain corresponds to at least a portion of at least one of the software processes and the owner domain has rights to exclude other domains from accessing data stored in the respective region; and
memory access program logic that controls access to the memory based on the ownership table; wherein:
the domain manager logic performs a region fusion operation in response to a region fusion command specifying a fusion target address indicating a plurality of contiguous regions of the memory to be fused into a fusion group of regions, the region fusion operation updating the ownership table to indicate that ownership attributes of each region in the fusion group of regions are represented by a single ownership entry corresponding to a predetermined region of the fusion group.
25. A storage medium storing the computer program of claim 24.
Technical Field
The present technology relates to the field of data processing.
Background
It is known to provide memory access control techniques for enforcing access rights to specific memory regions. Generally, these techniques are based on privilege levels such that processes executing with higher privileges may preclude less privileged processes from accessing memory regions.
Disclosure of Invention
At least some examples provide an apparatus comprising:
processing circuitry to perform data processing in response to one or more software processes;
a domain management unit that maintains an ownership table specifying a plurality of ownership entries, each ownership entry defining an ownership attribute for a corresponding region of a predetermined size of memory, the ownership attribute specifying an owner domain for the corresponding region from among a plurality of domains, wherein each domain corresponds to at least a portion of at least one software process, and the owner domain has authority to exclude other domains from accessing data stored in the corresponding region; and
memory access circuitry to control access to the memory based on the ownership table; wherein:
the domain management unit performs a region fusion operation in response to a region fusion command specifying a fusion target address indicating a plurality of continuous regions of the memory to be fused into a fusion group of the regions, the region fusion operation updating the ownership table to indicate that ownership attributes of each region in the fusion group of the regions are represented by a single ownership entry corresponding to a predetermined region of the fusion group.
At least some examples provide a method comprising:
maintaining an ownership table specifying a plurality of ownership entries, each ownership entry defining an ownership attribute for a respective region of a predetermined size of the memory, the ownership attribute specifying an owner domain of the respective region from among a plurality of domains, wherein each domain corresponds to at least a portion of at least one software process processed by the processing circuitry, and the owner domain has authority to exclude other domains from accessing data stored in the respective region;
controlling access to the memory based on the ownership table; and
a region fusion operation is performed in response to a region fusion command specifying a fusion target address indicating a plurality of contiguous regions of memory to be fused into a fusion group of regions, the region fusion operation updating an ownership table to indicate that an ownership attribute of each region in the fusion group of regions is represented by a single ownership entry corresponding to a predetermined region of the fusion group.
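The maintain/control/fuse steps above can be sketched in executable form. The following Python model is illustrative only: the names (`OwnershipEntry`, `fuse_regions`), the 4 KB region size, and the dictionary-based table are assumptions rather than anything prescribed by the claims; the validation checks mirror those described later for the fusion operation (matching ownership attributes, contiguous mapped addresses, valid state).

```python
# Illustrative model of an ownership table and a region fusion
# operation; all names and the concrete layout are assumptions.

REGION_SIZE = 4096  # assumed size of one region (granule)

class OwnershipEntry:
    def __init__(self, owner, mapped_addr, state="valid", group_size=1):
        self.owner = owner              # owner domain of the region
        self.mapped_addr = mapped_addr  # address the physical region maps to
        self.state = state              # "valid" or "invalid"
        self.group_size = group_size    # size of the fused group, in regions

def fuse_regions(table, target_addr, count):
    """Fuse `count` contiguous regions starting at `target_addr`.

    Runs the validation checks sketched from the description and, if
    they pass, leaves the group's attributes represented by the single
    entry for the first (predetermined) region of the group.
    """
    indices = [target_addr // REGION_SIZE + i for i in range(count)]
    entries = [table[i] for i in indices]
    head = entries[0]
    for i, e in enumerate(entries):
        if e.owner != head.owner:   # check: same ownership attributes
            return False
        if e.state != "valid":      # check: every region in the valid state
            return False
        # check: mapped addresses must form a contiguous set
        if e.mapped_addr != head.mapped_addr + i * REGION_SIZE:
            return False
    head.group_size = count         # single entry now represents the group
    return True
```

A caller that violates any check (for example, fusing regions owned by two different domains) simply has the command rejected, matching the reject-on-failed-validation behaviour described above.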
At least some examples provide a computer program for controlling a host data processing apparatus to simulate processing of one or more software processes on a target data processing apparatus, the computer program comprising:
handler logic to control a host data processing apparatus to perform data processing in response to object code representing the one or more software processes;
domain manager logic to maintain an ownership table specifying a plurality of ownership entries, each ownership entry defining an ownership attribute for a respective region of a predetermined size of memory, the ownership attribute specifying an owner domain for the respective region from among a plurality of domains, wherein each domain corresponds to at least a portion of at least one software process and the owner domain has authority to exclude other domains from accessing data stored within the respective region; and
memory access program logic that controls access to the memory based on the ownership table; wherein:
the domain manager logic performs a region fusion operation in response to a region fusion command specifying a fusion target address indicating a plurality of contiguous regions of memory to be fused into a fusion group of regions, the region fusion operation updating the ownership table to indicate that the ownership attributes of each region in the fusion group of regions are represented by a single ownership entry corresponding to a predetermined region of the fusion group.
The computer program may be stored on a storage medium. The storage medium may be a non-transitory storage medium.
Drawings
Further aspects, features and advantages of the present technology will become apparent from the following description of examples, read in conjunction with the accompanying drawings, in which:
FIG. 1 schematically illustrates a data processing system that includes a plurality of processing components that utilize memory regions stored within a first memory and a second memory;
FIG. 2 schematically illustrates the relationship between processes being performed, the privilege levels associated with the processes, and the realms associated with the processes for controlling which process owns a given memory region and thus has exclusive rights to control access to the given memory region;
FIG. 3 schematically shows a memory area under management by a domain management unit and a memory management unit;
FIG. 4 schematically illustrates a sequence of program instructions executed to output a given memory region from a first memory to a second memory;
FIG. 5 is a flow chart schematically illustrating page output;
FIG. 6 schematically illustrates a plurality of domains and their relationship within a control hierarchy to control which output commands can interrupt which other output commands;
FIG. 7 is a flow chart schematically illustrating page entry;
FIG. 8 schematically illustrates a first output command source and a second output command source performing an overlapping output operation for a given memory region;
FIG. 9 shows a more detailed example of the processing components and the domain management control data stored in memory;
FIG. 10 illustrates an example of a domain hierarchy in which a parent domain can define domain descriptors that describe the properties of various child domains;
FIGS. 11 and 12 show two different examples of domain hierarchies;
FIG. 13 illustrates an example of a domain descriptor tree that an ancestor domain maintains to record the domain descriptors of its descendant domains;
FIG. 14 illustrates an example of a local domain identifier constructed from a plurality of variable length bit portions that each provide an index to a corresponding level of the domain descriptor tree;
FIG. 15 illustrates an example of local and global domain identifiers for each domain in a domain hierarchy;
FIG. 16 shows an example of the contents of a domain descriptor;
FIG. 17 is a table showing different domain lifecycle states;
FIG. 18 is a state machine diagram indicating changes in the lifecycle states of a domain;
FIG. 19 is a table showing the contents of entries in the ownership table for a given memory region;
FIG. 20 is a table showing visibility attributes that may be set for a given memory region to control which domains other than the owner are allowed to access the region;
FIG. 21 illustrates examples of different lifecycle states for memory regions, including states corresponding to RMU-private memory regions reserved for mutually exclusive access by a domain management unit;
FIG. 22 is a state machine showing the transition of the lifecycle states for a given memory region;
FIG. 23 illustrates how ownership of a given memory region may be transferred between an ancestor domain and its descendant domains;
FIG. 24 schematically illustrates memory access control provided based on page tables defining memory control attributes that depend on privilege levels and domain management unit levels that provide orthogonal levels of control of memory access based on permissions set by the owner domain;
FIG. 25 illustrates an example of a translation look-aside buffer;
FIG. 26 is a flow chart illustrating a method of controlling access to memory based on a page table and an RMU table;
FIG. 27 illustrates the state accessible to a process executing at different exception levels;
FIG. 28 is a flow chart illustrating a method of entering a domain or returning from an exception;
FIG. 29 is a flow chart illustrating a method of exiting a domain or taking an exception;
FIG. 30 illustrates an example of entering a child domain and returning to a parent domain;
FIG. 31 illustrates an example of nested domain exit and nested domain entry;
FIG. 32 illustrates an example of lazy save using a domain execution context upon exit from a domain;
FIG. 33 illustrates an example of the use of a flush command that ensures that a subset of the state associated with a previously exited child domain is saved to memory before entering a different child domain;
FIG. 34 illustrates the use of sub-realms corresponding to particular address ranges within a process associated with the parent domain of the sub-realm;
FIG. 35 illustrates a second example of a set of region states associated with a memory region;
FIG. 36 is a state machine illustrating transitions between the different region states according to the example of FIG. 35;
FIG. 37 shows a change in the state of a region when the region changes ownership between an ancestor domain and a descendant domain;
FIG. 38 is a flowchart showing an example of processing of an output command;
FIG. 39 is a flowchart showing an example of processing of an input command;
FIG. 40 shows a hierarchy of translation tables;
FIG. 41 shows an example of a linear table structure for an ownership table;
FIG. 42 schematically illustrates an example of fusing and splitting (shattering) regions of an ownership table;
FIG. 43 is a flowchart illustrating a method of processing a region fusion command;
FIG. 44 is a flow chart illustrating a method of processing a region split command; and
FIG. 45 shows an example of a simulator that can be used.
Detailed Description
Fig. 1 schematically shows a data processing system 2 comprising a system-on-chip integrated circuit 4 connected to a separate non-volatile memory 6, such as an off-chip flash memory serving as a mass storage device. The system-on-chip integrated circuit 4 includes a plurality of processing components in the form of (in this exemplary embodiment) two general purpose processors (CPUs) 8, 10, and a Graphics Processing Unit (GPU) 12. It will be appreciated that in practice many different forms of processing components may be provided, such as additional general purpose processors, graphics processing units, Direct Memory Access (DMA) units, co-processors, and other processing components used to access memory regions within a memory address space and perform data processing operations on data stored within these memory regions.
Thus, multiple memory regions are divided among multiple owner domains. Each domain corresponds to at least a portion of at least one software process and is assigned ownership of a plurality of memory regions. The owning process/domain has the exclusive right to control access to the data stored within the memory regions of its domain. Management and control of which memory regions are mapped to each domain is performed by a process other than the owner domain itself. With this arrangement, a process such as a hypervisor may control which memory regions (pages of memory) are contained within a domain owned by a respective guest operating system managed by the hypervisor, yet the hypervisor itself may have no authority to actually access the data stored within the memory regions it has allocated to a given domain. Thus, for example, a guest operating system may keep data stored within its domain (i.e., data stored within memory regions owned by the guest operating system) private with respect to its hypervisor.
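This access rule (the owner may exclude even the more privileged process that allocated the region) can be sketched in a few lines. The function name, the string-based domain identifiers, and the `visibility` grant mechanism are illustrative assumptions, not part of the described architecture.

```python
# Minimal sketch of owner-based access control: a requester may access
# a region only if it is the owner domain, or if the owner has granted
# it visibility. Names here are illustrative assumptions.

def may_access(region_owner, requester, visibility=()):
    """Return True if `requester` may access a region owned by
    `region_owner`; `visibility` lists domains the owner has allowed."""
    return requester == region_owner or requester in visibility
```

Under this model the hypervisor that assigned a page to a guest still cannot read it unless the guest grants visibility, mirroring the hypervisor/guest example above.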
The division of the memory address space into realms, and control of ownership of those realms, is managed by a domain management unit.
In the context of such output and input of data from a memory region, it will be appreciated that a first memory, such as the on-chip memory 16, lies within the boundary of the system-on-chip integrated circuit 4, whereas output data is stored outside that boundary, such as in the external non-volatile memory 6.
The output process may be accompanied by the generation of metadata that specifies characteristics of the output data. This metadata may be stored separately within a metadata memory area of the first memory (on-chip memory 16), where the metadata is kept private to the domain management unit.
This metadata describing the characteristics of the memory region, and of the data stored within it, may be arranged as part of a hierarchical structure, such as a metadata memory region tree with a branching pattern. The form of this metadata memory region tree may be determined under software control, since different areas of the memory address space are registered for use as metadata areas owned by the domain management unit.
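A software-shaped metadata tree of this kind can be sketched as follows. The class and method names are invented for illustration; the text does not prescribe any concrete node layout, only that software determines the branching by registering regions as metadata areas.

```python
# Hedged sketch of a metadata region tree: the branching pattern is
# not fixed by hardware but emerges from which memory regions software
# registers as metadata areas owned by the management unit.

class MetadataNode:
    def __init__(self, region_addr):
        self.region_addr = region_addr  # private region holding this node
        self.children = []              # sub-nodes registered by software
        self.records = {}               # per-page metadata (e.g. hashes)

    def register_child(self, region_addr):
        """Software registers another region as a metadata area,
        growing the tree under this node."""
        node = MetadataNode(region_addr)
        self.children.append(node)
        return node

    def count_nodes(self):
        return 1 + sum(c.count_nodes() for c in self.children)
```

The same data structure grows wide or deep depending purely on the order and placement of `register_child` calls, which is the sense in which the tree's form is "determined under software control".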
When the data stored in a given memory region is output, the memory region in question is subsequently invalidated, making its contents inaccessible. To reuse such a page, the memory region is "validated" by using a Clean command that overwrites the memory region with other data unrelated to the previous contents, so that those previous contents are not made accessible to another process when the given memory region is freed for use by that other process. For example, the contents of a given memory region may be written entirely as zero values, as a fixed value, or as random values, thereby overwriting the original contents of the memory region. In other examples, the overwriting of the contents of the output memory region may be triggered by the output command itself rather than by a subsequent clean command. In general, the owned data that is output may be overwritten with values unrelated to that data before the given memory region is made accessible to processes other than the owning process. When a given memory region owned by a given process is to be output, this overwriting forms part of the output process.
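The clean step above can be sketched directly. The function name and the `pattern` parameter are illustrative assumptions; the text leaves the fill pattern open (zeros, a fixed value, or random data), and this sketch models two of those choices.

```python
# Sketch of the "clean" step: before a region whose data has been
# output is handed to another process, its contents are replaced by
# values unrelated to the original data.

import os

def clean_region(region, pattern="zero"):
    """Return fresh contents for `region` with nothing of the
    original data remaining."""
    if pattern == "zero":
        return bytearray(len(region))          # zero-fill
    if pattern == "random":
        return bytearray(os.urandom(len(region)))  # random-fill
    raise ValueError("unknown fill pattern")
```

Either pattern satisfies the stated requirement, since neither output is derived from the previous contents of the region.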
Figure 2 schematically shows the relationship between a number of processes (programs/threads), a number of exception levels (privilege levels), the secure and non-secure processor domains, and a number of realms representing ownership of given memory regions. As shown, the hierarchy of privilege levels extends from exception level EL0 to exception level EL3 (with exception level EL3 having the highest privilege level). The operating state of the system may be divided between a secure operating state and a non-secure operating state, represented by the secure and non-secure domains of, for example, processors built on the architecture provided by ARM Limited of Cambridge, UK. As shown in FIG. 2, memory access circuitry (a domain management unit) enforces the realm-based division of the memory address space alongside these privilege levels and security states.
The relationships between domains shown in FIG. 2 illustrate the child/parent relationships between different domains, and these can be used to generate a control hierarchy for controlling the operation of the system when multiple different command sources for memory region management compete with each other. Thus, for example, in the case of an output command for outputting a memory region as discussed above, a first output command may be received by a given domain management unit (memory access circuitry) from a first output command source, such as the operating system core 36 inside domain B. A second output command may then be received by the given domain management unit from a second command source, such as the manager program 38 executing in domain A. In this example, the manager program 38 that is the source of the second output command has a higher priority within the control hierarchy established by the parent/child relationships between the domains, such that the second output command issued by the manager program 38 interrupts the processing of the first output command issued by the operating system core 36. The first output command, as issued by the operating system core 36, may resume when the second output command, as issued by the manager program 38, has completed.
In this example, the second output command has a higher priority and thus interrupts the operation of the first output command. However, if the second output command had originated, for example, from the application program 40 within domain C, which occupies a lower priority position within the control hierarchy established by the relationships between the domains, then the second output command from the application program 40 would not interrupt the operation of the first output command from the operating system core 36, and would itself be prevented from executing until the first output command had completed. Thus, paging operations (output and input operations) may be protected from each other, in the sense that these operations may or may not interrupt each other depending on their position within the control hierarchy, which may be associated with the domain hierarchy. In other exemplary embodiments, the control hierarchy may correspond to privilege levels.
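The interruption rule just described can be sketched as a simple depth comparison over the parent/child relationships. The realm names, the `PARENT` map, and the depth-based priority model are illustrative assumptions standing in for whatever concrete control hierarchy an implementation uses.

```python
# Sketch of the control-hierarchy rule: a second command interrupts a
# first only if its source sits strictly higher (shallower) in the
# parent/child realm hierarchy. Names are illustrative assumptions.

PARENT = {"realm_c": "realm_b", "realm_b": "realm_a", "realm_a": None}

def depth(realm):
    d = 0
    while PARENT[realm] is not None:
        realm = PARENT[realm]
        d += 1
    return d

def may_interrupt(second_source, first_source):
    """Shallower (higher-priority) realms may interrupt deeper ones."""
    return depth(second_source) < depth(first_source)
```

With realm A as the manager, B as the operating system, and C as the application, this reproduces the two cases above: A's command interrupts B's, while C's command must wait for B's to complete.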
Fig. 3 schematically shows a memory area under management by a domain management unit and a memory management unit.
The memory regions may be addressed by virtual addresses, intermediate physical addresses, or physical addresses, depending on the particular system under consideration.
Fig. 4 schematically shows program instructions associated with an output operation on a memory region. These program instructions appear in the program instruction stream and may be executed (acted upon) by different components within the overall circuit. For example, the domain-management-unit commands are executed by the respective domain management unit.
Once the barrier instruction DSB has received an acknowledgement confirming that the invalidation of the virtual address translation data within the system has completed, an output command for the domain management unit is executed by the domain management unit. Execution of such an output instruction received from a given process triggers performance of a command sequence (corresponding to millicode embedded within the domain management unit) that includes a plurality of command actions with respect to a specified given memory region. These command actions may include, for example, the following steps as illustrated in FIG. 4: collecting address translation data, locking the memory region, encrypting the data, storing the data externally, writing metadata associated with the memory region, and subsequently unlocking the memory region.
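The command sequence above can be sketched as an ordered list of steps that a millicode-like routine walks through. The step names and the handler-table shape are invented for this sketch; the text fixes only the order of the actions.

```python
# Sketch of the export command sequence as ordered steps; the names
# are illustrative, the ordering follows the description.

EXPORT_STEPS = [
    "gather_address_translation",
    "lock_region",
    "encrypt_data",
    "store_externally",
    "write_metadata",
    "unlock_region",
]

def run_export(region, actions):
    """Apply each step's handler in order. `actions` maps a step name
    to a function that takes and returns the region state."""
    for step in EXPORT_STEPS:
        region = actions[step](region)
    return region
```

Fixing the order in one place makes the invariant of the sequence explicit: translation data is gathered before the lock is taken, and the lock is released only after the metadata write.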
The address translation collection step performed by the domain management unit as part of the command sequence gathers into the domain management unit the access control data required to complete the access operation in question. This ensures that once an output operation is in progress, the likelihood of the operation stalling is reduced, for example due to the unavailability of parameters or data required to complete it, such as address translation data, attribute data, or other data required by the output process. As an example of retrieving access control data and storing it in the memory access circuitry (domain management unit), the address translation step extracts all the required address translation data (e.g., virtual-to-intermediate-physical-address (or physical address) mapping data) that may be needed to complete the output operation.
Once the address translation data has been extracted, the domain management unit is operable to set the lock flag associated with the region under consideration to a locked state. This lock flag may be stored within the region attribute data 42 for the region under consideration. Alternatively, the lock flag may be stored in a memory area private to the domain management unit performing the output operation, so that the lock flag cannot be overridden by any other process or domain management unit. To set the lock flag to the locked state, the domain management unit must determine that no other domain management unit is currently holding the memory region under consideration in a locked state. Accordingly, the lock flag values of any region control data stored elsewhere are polled, and if the result indicates that the region is not locked elsewhere, the lock flag is set to the locked state. If the region is locked elsewhere, the output operation fails and an error is reported to the process that issued the output command. Once the lock has been obtained, the data within the given memory region is encrypted and stored outside the system-on-chip integrated circuit, such as in the external non-volatile memory 6. As previously discussed, metadata characterizing the encrypted data (or the given data prior to encryption) is then generated and stored within the private area of the domain management unit so that it can be used at a later time to validate the output data. Finally, the memory region under consideration is unlocked by the domain management unit executing the output command, switching the lock flag from the locked state to the unlocked state. The use of a lock implemented by a hardware mechanism of the memory access circuitry (domain management unit) serves to block the progress of any other (second) access command from a further processing component that may be received while the lock flag is in the locked state.
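The lock-flag protocol in this paragraph can be sketched as a check-then-set sequence. The class and field names are assumptions for illustration; a real implementation would perform the check-and-set atomically in hardware, which this single-threaded sketch does not model.

```python
# Sketch of the lock-flag protocol: the exporting unit first checks
# that no other unit holds the region locked, then takes the lock; a
# second access command arriving while the flag is in the locked
# state is refused. Names are illustrative assumptions.

class RegionAttributes:
    def __init__(self):
        self.lock = "unlocked"  # region attribute data holding the flag

def try_lock(attrs):
    """Attempt to take the lock; returns False (command fails,
    error reported to the caller) if the region is locked elsewhere."""
    if attrs.lock == "locked":
        return False
    attrs.lock = "locked"
    return True

def unlock(attrs):
    attrs.lock = "unlocked"
```

The refusal path is the important part: any second command issued while the flag is locked fails immediately rather than proceeding against a region whose export is mid-flight.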
Fig. 5 is a flowchart schematically showing page (memory area) output. At step 44, program instructions (VUMAP, TLBI, DSB) are executed which serve to clear use of the page elsewhere in the system other than by the domain management unit.
Once these clear requests have been issued at step 44, processing waits at step 46 until responses are received from the clear requests indicating that the address data has been invalidated elsewhere (other than by the domain management unit), at which point it is safe for processing to proceed past the barrier instruction DSB within the program sequence (the barrier instruction DSB halting
If the determination at step 60 is that there is no interrupt, the process proceeds to step 70 where a determination is made as to whether the output of the memory region is complete. If the output is not complete, the process returns to step 58. If the output has been completed, the process proceeds to step 72, where the cleared memory region (from which the stored data for the memory region has been output) is overwritten with data unrelated to the original stored data (e.g., zeroed, set to some other fixed number, padded with random data, etc.). The process then terminates.
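The export loop of Fig. 5 (steps 58 through 72) can be sketched as follows. The block-wise structure, the stand-in XOR "encryption", and the function names are assumptions for illustration; the architectural points modelled are the interrupt poll between units of work and the final overwriting of the cleared region with data unrelated to the original contents.

```python
# Illustrative model of the export loop: poll for interrupts between
# units of work (step 60), and on completion overwrite the cleared
# region with unrelated data, here zeros (step 72).
def export_region(region, encrypt_block, interrupt_pending):
    out = []
    for block in region:
        if interrupt_pending():
            return None, region        # partial state preserved via the CCB
        out.append(encrypt_block(block))
    region[:] = [0] * len(region)      # step 72: scrub the cleared region
    return out, region

data = [3, 1, 4, 1]
ciphertext, region = export_region(data, lambda b: b ^ 0xFF, lambda: False)
# region has been zeroed; ciphertext holds the transformed blocks
```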
In the exemplary embodiments discussed above, the CCB is provided as a separate private memory area specified by, for example, an associated pointer within the initialization instruction. However, in other exemplary embodiments, the CCB may not be provided as a separate memory region, but rather as a portion of a memory region already used by the interruptible command, such as the destination memory region into which result data generated by the command is stored. In the case of an interruptible output command, the output encrypted data is stored into the destination memory area, which is an RMU-private memory area while the output is executed. As the destination area fills with encrypted data, the CCB may be provided, for example, as an end portion of this destination area. The integrity of the context data stored within the CCB is ensured because the destination area is RMU-private during execution of the output operation.
In another exemplary embodiment, the CCB may be provided as part of a domain descriptor (RD); in this case, the storage space available for context data may be constrained by the space available in the RD, and thus the number of interruptible parallel commands supported may be constrained by the storage space available to the RD for use as a corresponding CCB. The CCB may be provided separately or as part of a memory area or resource that is also used for another purpose.
Fig. 6 schematically shows the relationship between the realms and the control hierarchy that determines which commands from different command sources are allowed to interrupt/block partially completed commands from other sources. The illustrated example includes three levels of nested realms. The parent realm M corresponds to exception level EL3. The child realm N corresponds to exception level EL2. Two grandchild realms within realm N, realm O and realm P, are both at
Fig. 7 is a flowchart schematically showing a page (memory area) input operation following an RMU input command. Step 74 obtains and cleans an empty page (memory region) into which data may be input. Step 76 then uses the stored metadata associated with the encrypted data (held in the RMU-private area) to verify the encrypted data to be input. If the verification is unsuccessful, an error is generated. After successful verification, step 78 decrypts the encrypted data and step 80 stores the decrypted data into the memory page obtained at step 74. Once the memory page has been filled with decrypted data, the memory page may be released to the owning realm (process). The page obtained and subsequently filled is locked so as to be exclusively available to the memory management circuitry (
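The input flow of Fig. 7 can be sketched as below. The metadata is modelled here as a SHA-256 digest of the encrypted data and the "decryption" is a stand-in XOR; the architecture's actual cipher and metadata format are not specified by this sketch.

```python
import hashlib

# Hedged sketch of page input: verify against stored metadata (step 76),
# decrypt (step 78), then fill the obtained page (step 80).
def input_page(encrypted, metadata_digest, key):
    if hashlib.sha256(encrypted).hexdigest() != metadata_digest:
        raise ValueError("verification failed")          # step 76 error
    decrypted = bytes(b ^ key for b in encrypted)        # step 78
    return bytearray(decrypted)                          # step 80: filled page

enc = bytes(b ^ 0x5A for b in b"secret")                 # previously exported data
meta = hashlib.sha256(enc).hexdigest()                   # stored metadata
page = input_page(enc, meta, 0x5A)
```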
Fig. 8 schematically shows two output commands that may proceed in parallel from different command sources. One of the sequences of instructions originates from a process corresponding to a virtual machine (e.g., a guest operating system). The other command source is a hypervisor at a higher privilege level (or possibly at a higher level within the realm hierarchy) than the virtual machine. Thus, output commands from the hypervisor can interrupt a partially completed output command on behalf of the virtual machine being executed by the
In this example, the command to the
The command context buffer is used to store the partial completion status of a partially completed command sequence so that this data can be recovered at a later time. In this way, the system does not need to wait until the full output operation has completed before the interrupt can be serviced. Further, because the partially completed state is retained, forward progress by the output operation is ensured even if the operation is repeatedly interrupted, since the output operation need not be restarted from its initial point.
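The forward-progress guarantee can be modelled minimally: the CCB (here just a dictionary, an illustrative stand-in for the architectural buffer) records how far the command has progressed, so each resumption after an "interrupt" continues where the previous attempt stopped.

```python
# Minimal model of resumption via the command context buffer (CCB): an
# interrupted export saves its progress index, so repeated interruption
# still makes forward progress and the command never restarts from zero.
def run_export(blocks, ccb, budget):
    """Process up to `budget` blocks, then yield to a pending interrupt."""
    i = ccb.get("next", 0)                 # resume from recorded progress
    done = ccb.setdefault("out", [])
    while i < len(blocks) and budget > 0:
        done.append(blocks[i] ^ 0xFF)      # one unit of export work
        i, budget = i + 1, budget - 1
    ccb["next"] = i
    return i == len(blocks)                # True once the command completes

ccb = {}
while not run_export([1, 2, 3, 4, 5], ccb, budget=2):
    pass                                   # each re-entry resumes via the CCB
```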
Fig. 9 shows a more detailed example of one of the
The
As shown in FIG. 9, the
In addition, a plurality of domain management tables or
The
As shown in fig. 10, the realms are managed by the RMU 20 according to a realm hierarchy in which each realm other than the root realm 130 is a child realm having a corresponding parent realm that initialized the child realm by executing an initialization command. The root realm 130 may be, for example, a realm associated with monitor code or system firmware executing at the most privileged exception level EL3. For ease of explanation, the example of FIG. 10 and the initial examples discussed below illustrate the case where each child realm executes at a lower privilege level than its parent realm. However, as will be discussed below, it is also possible to establish a child realm that executes at the same exception level as its parent.
In general, for the domain management portion of the memory access control provided by the
As shown in fig. 10, each realm 140 is associated with one or more realm execution context (REC)
Each domain is associated with a
Figs. 11 and 12 show two different examples of possible realm hierarchies. In the example of FIG. 11, each of the processes shown in FIG. 2 defines its own realm. Thus, the root realm 130 corresponds to monitor software or firmware operating at exception level EL3. The root realm defines two child realms 142: one corresponding to the secure operating system operating at secure EL1, and the other corresponding to the hypervisor at EL2. The hypervisor defines grandchild realms 144 corresponding to the different guest operating systems at EL1, and each of these guest operating systems defines further great-grandchild realms 146 corresponding to the applications executing at the least privileged
As shown in fig. 12, it is not necessary for the process at each privilege level to have a separate realm, and thus some of the privilege level boundaries shown in dashed lines in fig. 12 may not correspond to realm boundaries. For example, in fig. 12, application 150 and its operating system execute within the same realm as the hypervisor realm 142 operating at exception level EL2, and thus a single realm spans the EL2 hypervisor code, the operating system operating at EL1, and the application at
The
As shown in the table in fig. 13, a given RDTE 164 providing a pointer to an RDTG 162 at a subsequent stage of the tree may include an order value indicating the maximum number of entries in the RDTG pointed to. For example, the order value may indicate the power of 2 corresponding to the total number of entries in the RDTG pointed to. Other information that may be included in the RDTE 164 includes a status value indicating the status of the RDTE (e.g., whether the RDTE is free for allocation of realm descriptor tree data, and whether the RDTE provides a pointer to a further RDTG 162 or to a child realm descriptor 166). In addition to the pointer, an RDTE may include a reference count tracking the number of non-free RDTEs in the pointed-to RDTG, which may be used to determine whether further RDTEs can be allocated to that RDTG 162. RMU commands triggered by the parent realm may control the RMU 20 to build further RDTGs of the tree and/or edit the contents of RDTEs within an existing RDTG.
It should be noted that the tree shown in FIG. 13 shows the child realms of one particular parent realm. Each other parent realm may have a separate realm descriptor tree tracking its own child realms. The data granules associated with the tree, including the RDTGs 162 and
As shown in fig. 13, each of the child realms of a given parent realm may have a corresponding realm identifier (RID) 168 used by that parent realm to identify the particular child realm. The RID is a local realm identifier in that it is specific to a particular parent realm. Child realms of different parent realms may have the same local RID. Although it is possible to use a local RID having any value selected by the parent realm for a given child realm, in the approach shown in figs. 13 and 14 the local RID for a given child realm has a variable number of variable-length bit portions, and each of the variable-length portions is used by the RMU 20 to index into a given stage of the realm descriptor tree 160. For example, the realm descriptor of the child realm with local RID 7 in fig. 13 is accessed by following the realm descriptor pointer in
In fig. 13, the local RIDs are shown in decimal form, but fig. 14 shows how these local RIDs can be represented using binary identifiers. The binary identifier may have a plurality of variable
The number of bits to be used within each of the variable
This approach provides a flexible architecture that allows different numbers of child realms to be established by a given parent realm, and allows the realm descriptors for those child realms to be accessed efficiently. Because the realm identifier explicitly provides the indices required to step through the realm descriptor tree, there is no need to maintain a mapping table that maps an arbitrary realm number to a particular route through the tree. Compared with a table structure providing a fixed number of entries, the tree can be expanded as the number of child realms requires, by adding additional RDTGs or additional RDTEs to a given level of the tree as appropriate. Thus, the architecture is scalable to the needs of different software processes. Since it is not specified in advance exactly which parts of the RID map to a given level of the tree, the available bits of the RID can be flexibly allocated to accommodate different depths and widths of the tree.
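The variable-length indexing can be sketched as follows. The tree encoding (dictionaries for RDTGs, a string for a descriptor) is purely illustrative; the point is that at each level the pointed-to RDTG's order value tells the walker how many low-order bits of the local RID to consume as the next index.

```python
# Sketch of stepping through a realm descriptor tree (as in Fig. 13)
# using a local RID built from variable-length index fields: each RDTG
# holds 2**order entries, so `order` low bits are consumed per level.
def walk_rdt(rid, root):
    node = root
    while isinstance(node, dict):           # dict = RDTG, leaf = descriptor
        order = node["order"]               # RDTG holds 2**order entries
        index = rid & ((1 << order) - 1)    # next variable-length portion
        rid >>= order
        node = node["entries"][index]
    return node

# Two-level tree: a root RDTG of order 2 whose entry 3 points to an
# order-1 RDTG whose entry 1 holds the descriptor for local RID 7.
tree = {"order": 2, "entries": {3: {"order": 1, "entries": {1: "RD(7)"}}}}
```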
The
The
The RMU may allow the variable-length bit portions used for indexing into different branches at the same stage of the realm descriptor tree to have different numbers of bits. That is, while the two RDTGs 162 shown at layer 2 in fig. 13 both have the same order value (and thus the same number of entries), this is not essential, and some implementations may have RDTGs 162 at the same level of the tree with different numbers of entries. Corresponding portions of the RIDs of the respective realms may therefore have different numbers of bits for the same level of the tree. Thus, the length of the bit portion used for indexing into a given level of the tree may vary not only between generations, but also between different branches of the tree managed by one parent, providing further flexibility in the manner in which child realms may be defined.
In general, the RID for a given realm may comprise a concatenation of the indices to be used at respective stages of the realm descriptor tree to access the realm management data for that realm. Although it is not essential that the indices be concatenated in the same sequential order in which they are used to step through the tree, this may be preferred as it makes management of tree accesses simpler. Whether the concatenation proceeds from low-order to high-order bits or vice versa does not matter. The concatenation of indices may be followed by a predetermined termination pattern that allows the RMU 20 to determine when there are no further levels of the tree to be stepped through.
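Construction of such a RID can be sketched as below. The choice of low-to-high concatenation and of a single 1 bit as the termination marker are assumptions for illustration; the text only requires some predetermined termination pattern following the concatenated indices.

```python
# Illustrative construction of a local RID as the ordered concatenation
# of per-level tree indices, followed by a termination pattern (assumed
# here to be a single 1 bit above the indices).
def build_rid(indices_and_orders):
    rid, shift = 0, 0
    for index, order in indices_and_orders:   # low-to-high concatenation
        rid |= index << shift
        shift += order                        # order = bits used at this level
    rid |= 1 << shift                         # termination marker
    return rid

rid = build_rid([(3, 2), (1, 1)])   # level-1 index 3, level-2 index 1
```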
Some embodiments may apply this RID construction technique to a global realm descriptor tree storing the realm descriptors for all realms within the system in a tree structure (in which case each RID would be a globally unique value). However, software development can be made simpler by defining the child realms of a given parent within one tree, with a separate tree for each other parent realm tracking its own child realms. Thus, the realm descriptor tree may be a local realm descriptor tree, associated with a given parent realm, for storing the realm management data of the child realms initialized by that parent realm. The realm identifier can accordingly be a local realm identifier identifying a particular child realm to a given parent realm. Child realms initialized by different parent realms may be allowed to have the same value of the local realm identifier. In this way, a parent realm can select which RIDs to use for its child realms without knowledge of any realms established by other parent realms, since the RIDs of the child realms are constructed according to the way the parent realm configures its own realm descriptor tree.
The local realm identifier can be used by a realm entry instruction or RMU command issued by a software process. However, the hardware architecture may use absolute identification of a given child domain to distinguish domains created by different parents. Thus, in addition to the local domain identifiers shown in fig. 13 and 14, a given domain may also have a global domain identifier (or "internal" domain identifier) that is unique to the given domain. At least one hardware structure may identify a given domain using a global domain identifier (GRID) instead of a local domain identifier (LRID). For example, the domain group table 128 and/or the TLB100 may use a global domain identifier to identify a domain.
In some instances, any binary value may be assigned as a GRID for a given realm, which may be completely independent of the LRID used by the predecessor realm to reference the descendant realm. Different microarchitectural implementations of the same domain architecture may use different methods to assign GRIDs.
However, in one example as shown in fig. 15, the GRID for a given realm may be constructed from the LRIDs of the ancestors of the given realm. This can be useful because it enables a simpler determination of whether a given realm is a descendant or an ancestor of another realm, which may be used for access control by the
In some cases, the LRIDs may be concatenated in order within the GRID, including the termination flag and zero-padding bits shown in fig. 14. Alternatively, the binary representation of the GRID may exclude such termination flags and zero-padding bits, and instead the meaningful portions of the LRIDs comprising the RDT indices may be concatenated directly. Because each LRID may itself have a variable number of bits, depending on the depth and width of the RDT used by the associated parent realm, the number of bits of the global RID allocated to represent the local RID of a given generation may be variable. Moreover, this division of the global RID among the generations may vary at runtime based on the particular software being run, and may also differ between different branches of the realm "family tree", such that one branch of the family tree may use a larger portion of the realm identifier than another. Because the common prefix or suffix of the GRID is the same for realms sharing a common ancestor, descendants can still be distinguished by the remainder that is specific to each descendant, regardless of how that remainder is divided among further generations.
By constructing the GRID as an ordered concatenation of the LRIDs of a realm's ancestors, determining whether a first realm is an ancestor or a descendant of a second realm becomes more efficient. Circuitry may be provided (e.g., within the TLB 100 or the RMU 20) to determine whether the global RID of one of the first and second realms matches a prefix or suffix portion of the global RID of the other, for example by using a bit mask to mask out the portions of the global RID corresponding to later generations, so that the global RIDs of earlier and later realms within the same family can be compared for a match.
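The ancestor test enabled by this construction can be sketched as below, assuming low-to-high concatenation of ancestor LRIDs (the direction is an assumption; a suffix test would apply for the opposite ordering). The bit widths passed in stand for the per-generation portion sizes, which the text notes are variable.

```python
# Hedged sketch of the prefix test: realm A is an ancestor of realm B if
# A's GRID matches the corresponding portion of B's GRID once the bits
# belonging to later generations are masked off.
def is_ancestor(grid_a, a_bits, grid_b, b_bits):
    if a_bits >= b_bits:
        return False                   # an ancestor uses strictly fewer bits
    mask = (1 << a_bits) - 1           # keep only the ancestor's portion
    return (grid_b & mask) == grid_a   # low-to-high concatenation assumed

root_child = 0b101                      # a 3-bit LRID under the root
grandchild = (0b10 << 3) | root_child   # the child's 2-bit LRID appended
```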
It is not necessary for all local RIDs to be constructed using the ordered concatenation of tree indices shown in FIG. 13. In some cases it may be useful to reserve certain values of the local RID for referencing certain default realms. RMU commands specifying the current realm, or the parent realm of the current realm, may be relatively common. Therefore, a predetermined RID value can be reserved for referencing the parent realm of the current realm. For example, an LRID with all bits set to 1 may be reserved for referencing the parent realm of the current realm. Similarly, a predetermined realm identifier value can be reserved for referencing the current realm itself. For example, an LRID value of 0 may be used to reference the current realm. It should be noted that the use of the
The RMU may support certain query commands that can be triggered by a given realm in order to query the constraints that must be met when that realm builds its realm descriptor tree. For example, in response to a query command, the RMU 20 (or the processing circuitry 32) may return a constraint value indicating at least one of: the maximum number of levels of the realm descriptor tree 160 permitted to be defined by the given realm, the maximum number of entries permitted at a given level of the tree structure for the given realm, and/or the maximum number of child realms that may be initialized by the given realm. For example, the system may include registers indicating properties such as the number of bits available in an LRID or GRID for a particular hardware implementation. In response to a query command, the RMU or processing circuitry may check the number of bits available for the realm identifier (or an appropriate response may be hardwired for a particular processor implementation), and may also check information specifying how many bits of the global realm identifier have already been used up by earlier generations, in order to determine how many bits remain available for further descendants of the current realm. A parent realm may use the response to the query command to determine how to construct its RDT.
Fig. 16 shows an example of the contents of the
the global RID of the domain. Thus, by traversing the domain descriptor tree based on a local RID, a corresponding global RID may be identified and this may be used to index hardware structures, such as TLBs, or check ownership tables or other information defined based on GRID by a given domain.
The lifecycle state of a given domain, which may be used by the RMU20 to determine whether to accept a given command triggered by the given domain.
The type of a given domain. For example, the domain type may indicate that the domain is a complete domain or a sub-domain as discussed later.
A Boundary Exception Level (BEL) value that identifies the boundary exception level of the corresponding realm. The BEL indicates the maximum privilege level at which the realm is allowed to execute. For example, realm 142 in fig. 12 may have a BEL of EL2, realm 152 may have a BEL of EL0, and realm 154 may have a BEL of
A resource count indicating the total number of memory regions (domain protection groups or RPGs) owned by the domain and its descendants. This is used to ensure that all memory pages owned by the descendants of the domain are invalidated (and eventually erased) before these memory regions can be allocated to different domains. For example, a resource count may be used to track how many regions still need to be washed.
The start and end addresses of a protected address range for the realm. For example, the protected address range may define the range of the memory address space within which pages can be owned by the corresponding realm. This helps protect against a malicious parent realm that reclaims ownership of a region previously allocated to a child realm in an attempt to access the child realm's data, because by comparing the protected address range defined in the realm descriptor with the addresses of subsequent memory accesses, cases can be identified in which a memory region previously owned by the realm is no longer owned by it.
One or more encryption keys used by the
A domain description tree entry (RDTE) that identifies the root of the domain descriptor tree. The RDTE in the domain descriptor provides a pointer for accessing the root RDTG (and defining how many bits will be used as the order value of the index for that RDTG).
Pointers to main REC (domain execution context) memory regions for saving or restoring architectural state related to the execution of the domain.
FIG. 17 shows a set of lifecycle states that may exist for a given domain, including in this example a clean state, a new state, an active state, and an invalid state. Fig. 17 summarizes the properties of each state, indicating for each state: whether a domain in the corresponding state can have the parameters of the
FIG. 18 is a state machine diagram showing the allowable transitions of the lifecycle states of a realm. Each state transition shown in fig. 18 is triggered by the parent realm issuing to the RMU 20 a realm management command that specifies the local RID of the target child realm (the realm invalidate command 212 may also be issued by the target realm itself). When no previous realm has been defined for that local RID and the realm descriptor register granule command 200 is executed by the parent realm, this triggers the configuration of a given memory region owned by the parent realm as the realm descriptor for the child realm having the specified local RID. The global RID of the child realm may be set as the concatenation of the global RID of the parent realm and the new local RID specified in the realm descriptor register granule command 200. The specified child realm then enters the clean state 202. In the clean state, the properties of the child realm can be set by updating the various parameters of its realm descriptor. These properties may be modified using further RMU commands issued by the parent realm (such realm descriptor modification commands are rejected if the specified child realm is not in the clean state). When the parent realm has finished setting the parameters of the child realm's descriptor, it executes a realm initialization command 204 specifying the LRID of the child realm, and this triggers the transition of the child realm from the clean state 202 to the new state 206, at which point the parameters of the realm descriptor can no longer be modified by the parent realm. The realm initialization command 204 fails if the specified realm is not currently in the clean state.
When a realm is in the new state 206, execution of a realm activate command 208 specifying the local RID of that realm triggers a transition from the new state 206 to an active state 210 in which the realm is now executable; after this point, realm entry into the corresponding realm will no longer trigger a fault. The realm is now fully operational. A subsequent realm invalidate command 212, triggered by the parent of a child realm in any of the clean state 202, new state 206, or active state 210, results in a transition to the invalid state 214. To leave the invalid state 214 and return to the clean state 202, the parent realm must execute a realm wash command 216. The realm wash command 216 is rejected if the resource count tracking the number of pages owned by the realm has a value other than zero. Thus, for the realm wash command 216 to succeed, the parent realm must first issue a granule reclaim command for each page owned by the invalid realm. The granule reclaim command specifies a target memory page, triggers invalidation of that page to make it inaccessible, and also decrements the reference count of the page's owner realm by one. It is not necessary to actually overwrite the data in an invalidated region when the granule reclaim or realm wash command 216 is executed, since the overwriting can occur when a clean command is later issued to transition the memory page from invalid to valid (see FIG. 22 discussed below). Additionally, any cached data related to the invalidated realm may also be invalidated in response to the realm wash command, for example within the TLB 100 or
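The lifecycle of Fig. 18 can be modelled as a small transition table. The command names used as keys paraphrase the commands in the text and are not architectural mnemonics; the resource-count check on the wash command is included.

```python
# Model of the realm lifecycle state machine of Fig. 18. `None` stands
# for "no realm defined for this local RID".
TRANSITIONS = {
    ("descriptor_register", None):     "clean",
    ("initialize",          "clean"):  "new",
    ("activate",            "new"):    "active",
    ("invalidate",          "clean"):  "invalid",
    ("invalidate",          "new"):    "invalid",
    ("invalidate",          "active"): "invalid",
    ("wash",                "invalid"): "clean",
    ("descriptor_release",  "clean"):  None,
}

def apply(command, state, resource_count=0):
    # The wash command is rejected while the realm still owns pages.
    if command == "wash" and resource_count != 0:
        raise ValueError("wash rejected: realm still owns pages")
    key = (command, state)
    if key not in TRANSITIONS:
        raise ValueError(f"command {command} rejected in state {state}")
    return TRANSITIONS[key]

s = apply("descriptor_register", None)   # -> clean
s = apply("initialize", s)               # -> new
s = apply("activate", s)                 # -> active
s = apply("invalidate", s)               # -> invalid
try:
    apply("wash", s, resource_count=2)   # pages not yet reclaimed
except ValueError:
    pass
s = apply("wash", s, resource_count=0)   # -> clean
```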
Thus, by providing a managed lifecycle for the realm associated with a given realm identifier, it is ensured that data associated with a previous realm using the same realm identifier must be washed from memory and from any caches before the realm can return to the clean state in which its parameters can be modified (and hence before the given realm identifier can be recycled for use by a different realm). This prevents any data associated with an old realm from being leaked to other realms through reuse of the same realm identifier. While the realm is in the clean state 202, the realm descriptor of the realm may also be cancelled by executing a realm descriptor release command 218, which enables the memory region storing the realm descriptor to be allocated for other purposes (no washing is required at this point, since the realm is clean).
Fig. 19 shows an example of the contents of an entry of the realm region group table 128 (or ownership table). Each entry corresponds to a given memory region of the memory address space. The size of a given memory region may be fixed or variable, depending on the implementation. The particular manner in which the ownership table 128 is structured may vary significantly depending on implementation requirements, and thus the particular manner in which the memory region corresponding to a given entry is identified may vary (e.g., data identifying the corresponding region may be stored in each entry, or alternatively the corresponding region may be identified based at least in part on the position of the ownership entry within the table itself). In addition, fig. 19 shows particular examples of parameters that may be specified for a given memory region, but other examples may provide more information or may omit some of the information types shown.
As shown in fig. 19, each ownership table entry may specify the following for the corresponding memory region:
The global RID identifying the owner realm of the memory region. The owner realm is the realm that has the right to set the attributes controlling which other realms are allowed to access the memory region.
The life cycle state of the corresponding memory region used to control which RMU commands are allowed to execute on the memory region.
Mapped addresses mapped to by the
Visibility attributes specifying which realms other than the owner can access the memory region. For example, as shown in FIG. 20, the visibility attributes may specify a parent visibility bit controlling whether the parent realm of the current realm is allowed to access the region, and a global visibility bit controlling whether any realm can access the corresponding memory region. In general, the realm protection scheme may assume that memory regions owned by a current realm are always accessible to that realm's descendants (subject to whether access is permitted by the translation tables 120, which provide protection based on privilege level), but a given realm may control whether its memory regions are accessible by its parent or by any other realm that is not a direct descendant of the given realm. In some embodiments, both the parent visibility bit and the global visibility bit may be set by the owner realm itself. Alternatively, while the parent visibility bit may be set by the owner realm, the global visibility bit might instead be set by the parent realm of the owner realm (provided the parent visibility bit for the memory region has already been set, giving the memory region parent visibility). It will be appreciated that this is just one example of how an owner realm can control which other processes may access its data.
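The access check implied by the ownership entry of Figs. 19 and 20 can be sketched as below. GRIDs are modelled as tuples of per-generation LRIDs for simplicity, so the ancestor relationship is a tuple-prefix test; the dictionary keys are illustrative names, not architectural field names.

```python
# Hedged sketch of the visibility check: a requester may access a region
# if it is the owner or one of the owner's descendants, if it is the
# owner's parent and the parent-visibility bit is set, or if the
# global-visibility bit is set.
def may_access(requester, entry):
    owner = entry["owner"]
    if requester[:len(owner)] == owner:       # the owner or a descendant
        return True
    if entry["global_visible"]:               # visible to any realm
        return True
    return entry["parent_visible"] and requester == owner[:-1]

entry = {"owner": (1, 3), "parent_visible": True, "global_visible": False}
```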
FIG. 21 is a table showing the different lifecycle states that may exist for a given memory region, and FIG. 22 is a state machine diagram showing the commands that trigger transitions between the corresponding lifecycle states. In a manner similar to the realm lifecycle states shown in FIG. 18, transitions between memory region lifecycle states are managed to ensure that a memory region passing from the ownership of one realm to the ownership of another must first undergo an invalidation process in which the data in the region is scrubbed (e.g., set to zero). Thus, to transition from the
In some systems, it may be sufficient to provide the
Thus, cleaning
However, robustness can be promoted by specifying multiple types of RMU-private memory areas each corresponding to a particular form of domain management data. For example, in fig. 21 and 22, a plurality of RMU registration states 228 are defined that each correspond to RMU private areas that are designated for a specific purpose. In this example, the
Thus, in summary, at least one RMU-private memory region may be defined that is still owned by a given owner realm but has an attribute specified in the ownership table meaning that it is reserved for mutually exclusive access by the RMU. In this example, the attribute controlling the RMU-private state is the lifecycle state specified in the corresponding entry of the ownership table, but the attribute could also be identified in other ways. When a given memory region is designated by the at least one state attribute as an RMU-private memory region, the MMU prevents access to that memory region by one or more software processes. Thus, any software-triggered access that is not triggered by the RMU itself is rejected when it targets an RMU-private memory area. This includes preventing access to the RMU-private memory area by the owner realm itself.
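The MMU behaviour just summarized amounts to a simple gate, sketched below. The state names and the requester flag are illustrative; the point is that an RMU-private lifecycle state causes every software-triggered access to be rejected, including accesses by the owner realm.

```python
# Sketch of the MMU check for RMU-private regions: only accesses
# triggered by the RMU itself pass; software accesses are denied
# regardless of which realm (even the owner) issues them.
def check_access(entry, requester_is_rmu):
    if entry["state"].startswith("rmu_") and not requester_is_rmu:
        raise PermissionError("RMU-private region: software access denied")
    return True

entry = {"owner": "realm_a", "state": "rmu_private"}
assert check_access(entry, requester_is_rmu=True)
try:
    check_access(entry, requester_is_rmu=False)  # even the owner is denied
    denied = False
except PermissionError:
    denied = True
```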
A skilled person may ask why it is useful to define an owner zone for an RMU-private memory area if the owner zone cannot even access the data in the memory area. For example, an alternative method for implementing access to data only by an RMU would define a special domain for the RMU, and allocate pages of memory address space for storing data that would remain private to that special RMU owner domain. However, the inventors have recognized that when a domain is invalidated, there may be a requirement to invalidate all control data related to that domain, and this may complicate the washing of data of the invalid domain if this control data is associated with a particular RMU owner domain rather than the invalid domain.
In contrast, by using the RMU-private attribute, the memory regions storing the control data for a given realm are still owned by that realm even though the owner cannot access the control data, which makes it simpler to identify which memory regions need to be invalidated when the owner realm is revoked. When a given realm is invalidated, the parent realm may simply perform a sequence of reclaim operations (e.g., by executing reclaim commands which are subsequently acted upon by the RMU) which trigger each memory region owned by the specified invalidated realm (or by a descendant of the specified invalidated realm) to be invalidated, made inaccessible, and returned to the ownership of the parent realm which triggered the reclaim command. The reclaim operations may affect not only the pages accessible to the invalidated realm, but also the RMU-private memory regions owned by the invalidated realm.
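As a rough illustration, the reclaim sequence described above can be modeled with a minimal Python sketch. This is a hypothetical model, not the architectural definition: the table layout, the string-based realm IDs, and the function name are assumptions for illustration, and the prefix test for "descendant of the invalidated realm" is deliberately simplified.

```python
def reclaim_regions(ownership_table, invalid_realm, parent_realm):
    """Return every region owned by invalid_realm (or a descendant) to parent_realm.

    ownership_table maps region -> {"owner": realm_id, "state": str}.
    Realm IDs are dotted strings here, so a descendant's ID starts with its
    ancestor's ID, mirroring the prefix-structured global RIDs in the text.
    """
    reclaimed = []
    for region, entry in ownership_table.items():
        if entry["owner"].startswith(invalid_realm):
            entry["state"] = "invalid"      # data made inaccessible
            entry["owner"] = parent_realm   # ownership returns to the parent
            reclaimed.append(region)
    return reclaimed
```

Note that RMU-private regions need no special casing here: because they are owned by the invalidated realm, the same ownership walk sweeps them up.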
Another advantage of storing a realm's control data in RMU-private memory regions owned by that realm arises when performing export operations. To reduce the memory footprint of a realm to zero, the management structures associated with the realm may be exported, in addition to normal memory, during an export operation. Requiring these structures to be owned by the realm simplifies the management of the export operation.
In general, any kind of realm management data may be stored in an RMU-private region, but in particular the realm management data may include any of: a realm descriptor defining properties of a given realm; a realm descriptor tree entry, or further realm descriptor tree entry, identifying a memory region storing the realm descriptor for the given realm; realm execution context data indicating architectural state related to at least one thread executing within the given realm; and temporary working data for use at intermediate points of predetermined operations related to the given realm.
While in general RMU-private regions may be used to store realm-specific control data for a given realm, they may also be used to increase security for certain other operations performed once the realm is active. For example, the paging-out and paging-in operations discussed above, in which data is encrypted or decrypted and a check using metadata is performed to verify that the data is still valid when it is imported again, may take many cycles, and such long-running operations are more likely to be interrupted partway through. To avoid having to restart the operation from scratch, it is desirable to allow metadata or other temporary working data associated with such long-running operations to remain in the cache/memory even upon interruption, without making this data accessible to other processes (including the owner realm itself). This temporary working data can be protected by temporarily designating an area of the memory system as RMU-private. Thus, as shown in FIG. 21, the page states may also include RMU Exporting (RMUExporting) and RMU Importing (RMUImporting) states which may be used while this temporary working data is stored in the memory region; when one of these states is selected, only the RMU may access the data.
Other examples of operations that may benefit from temporarily designating a corresponding memory region as RMU-private include: generation or verification of encrypted or decrypted data during transfer of data between at least one memory region owned by a given realm and at least one memory region owned by a realm other than the given realm; transfer of ownership of a memory region to another realm; and a destructive scrubbing operation performed to make data stored in an invalidated memory region inaccessible. For example, a scrubbing operation which cleans the entire contents of a given page of the address space could be interrupted partway through, and so, to ensure that other processes cannot access the page until the scrubbing is complete, the page may be temporarily designated as RMU-private. In general, any long-latency operation performed by the RMU may benefit from transitioning the lifecycle state of the affected memory regions to an RMU-private state before the long-running operation begins, and transitioning the lifecycle state back once the operation completes, so that the temporary working data of the long-latency operation is protected.
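The guard pattern described above, that is, making a region RMU-private for the duration of an interruptible operation and restoring it afterwards, can be sketched as follows. This is a hypothetical Python model: the state names, the `Region` class, and the use of `.upper()` as a stand-in for per-chunk encrypt-and-write work are all illustrative assumptions.

```python
class Region:
    def __init__(self):
        self.state = "valid"

def software_access_allowed(region):
    # Any state naming the RMU as exclusive user blocks all software access,
    # including access by the owner realm itself.
    return not region.state.startswith("rmu_")

def run_export(region, chunks):
    region.state = "rmu_exporting"   # working data now private to the RMU
    try:
        out = []
        for c in chunks:             # may be interrupted between chunks; the
            out.append(c.upper())    # partial results stay RMU-private
        return out
    finally:
        region.state = "valid"       # state restored once the work completes
```

While the long-running export is in flight, `software_access_allowed` returns False for the region, so an interruption leaves the temporary working data protected rather than forcing a restart.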
When a region is designated as RMU-private, the region is reserved for access by the RMU20, which performs realm management operations. The realm management operations may include at least one of: creating a new realm; updating the properties of an existing realm; invalidating a realm; allocating memory regions for ownership by a given realm; changing the owner realm of a given memory region; changing the state of a given memory region; updating access control information for controlling access to a given memory region in response to a command triggered by the owner realm of that region; managing transitions between realms during processing of the one or more software processes; managing transfer of data associated with a given realm between memory regions owned by the given realm and memory regions owned by a realm other than the given realm; and encryption or decryption of data associated with a given realm. The RMU may be a hardware unit performing at least a portion of the realm management operations, or may include processing
FIG. 22 illustrates the state transitions that may be triggered by a given realm to clean a given page so that the page can be validly accessed, or to invalidate the corresponding page. FIG. 23 expands this scenario to show further commands that may be used to transfer ownership of a given page from one realm to another. If the memory region is currently in the
One advantage of using the hierarchical realm structure discussed above, in which each child realm is initialized by its parent realm, is that it greatly simplifies invalidation of a realm together with its descendants. It is relatively common that, if a given virtual machine realm is to be invalidated, it is also desirable to invalidate the realms of any applications running under that virtual machine. However, there may be a large amount of program code, data and other control information associated with each of the processes to be invalidated. It may be desirable for such invalidation to occur atomically, so that it is not possible to continue accessing data related to the invalidated realms when only part of the data scrubbing has been carried out. Such atomicity would be difficult to achieve if each realm were established completely independently of other realms, without the realm hierarchy discussed above, since multiple separate commands would have to be provided to individually invalidate each realm identified by its corresponding realm ID.
In contrast, by providing a realm hierarchy in which the RMU manages realms such that each realm other than the root realm is a child realm initialized in response to a command triggered by its parent realm, when a command requesting invalidation of a target realm is received, the RMU20 can, with a more efficient operation, make the target realm and any descendant realms of the target realm inaccessible to the processing circuitry.
In particular, in response to invalidation of the target realm, the RMU may update the realm management data (e.g. the realm descriptor) associated with the target realm to indicate that the target realm is invalid, without needing to update any realm management data associated with descendant realms of the target realm. The realm management data associated with the descendant realms may remain unchanged. This is because simply invalidating the target realm effectively makes any descendant realm inaccessible even though that descendant's realm management data has not changed: access to a given realm is controlled through the parent of that realm, so if the parent realm is invalidated it is no longer possible to access the parent's descendants. Because each realm is entered using a realm entry instruction (the ERET instruction discussed below) which uses a local RID defined by the parent realm to identify a particular child of that parent, and this local RID is used to step through realm descriptors stored in memory regions owned by the parent realm of the given child realm, no process other than the parent realm can trigger the RMU to access the child realm's management data. Thus, if the parent realm is invalidated, the RMU cannot access the realm management data of a given descendant realm, thereby ensuring that the given descendant realm becomes inaccessible.
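The reasoning above, that invalidating a single realm cuts off its whole subtree because reachability always passes through the parent chain, can be sketched in a few lines of Python. This is a hypothetical model; the dictionary layout and the realm names are illustrative assumptions.

```python
# Only the target realm's descriptor is marked invalid; descendants' own
# management data is never touched, yet they become unreachable.
realms = {
    "root": {"parent": None,   "valid": True},
    "vm":   {"parent": "root", "valid": True},
    "app":  {"parent": "vm",   "valid": True},
}

def is_reachable(realm_id):
    # A realm is reachable only if it and every ancestor are still valid,
    # since entry to a child always goes through its parent's descriptor.
    while realm_id is not None:
        r = realms[realm_id]
        if not r["valid"]:
            return False
        realm_id = r["parent"]
    return True

realms["vm"]["valid"] = False   # invalidate only the target realm
```

After this single update, `is_reachable("app")` is False even though the entry for "app" still reads `valid: True`, which is exactly why the descendants' management data need not be rewritten.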
Once a realm has been invalidated, the parent realm of that realm may trigger the RMU to perform reclaim operations to reclaim each memory region owned by the invalidated target realm. For example, as shown in FIG. 23, a reclaim command 236 for a memory region owned by a child realm may trigger the return of the memory region to the
Thus, in summary, use of the realm hierarchy greatly simplifies the management of realms and their invalidation. On such an invalidation, in addition to the overwriting of data in memory, the invalidation may also trigger invalidation of cached realm management data for the target realm and any descendant realms of the target realm, which cached realm management data is held not only in the
FIG. 24 shows an example of checks performed by the
Upon receiving a memory access,
After the physical address has been obtained, it may then be looked up in the RMU table 128 (realm group table) to determine whether the realm protections enforced by the MMU allow the memory access to proceed. The realm check is discussed in more detail below with reference to FIG. 26. If the RMU check at stage 3 succeeds, the validated physical address is output and the memory access is allowed to proceed. If either the check at
FIG. 25 illustrates an example of a
A hit in the TLB100 not only requires that the tag 262 match the corresponding portion of the
To address this issue, the TLB100 may specify, within each TLB entry 260, the global RID 270 of the owner realm that owns the corresponding memory region, as well as visibility attributes 272 set by the owner realm for controlling which other realms are allowed to access that region. When a given lookup of the
Subsequently, when a translation cache is looked up to check whether the translation cache already includes an entry 260 that provides an address translation for a given address, TLB control circuitry 280 determines whether the memory access matches the given entry of
If the second comparison of the realm identifiers detects a mismatch, then even if the tag comparison and the translation context comparison both match, the access request is treated as a miss in the TLB, since the mismatch indicates that there has been a change in the mapping between
FIG. 26 is a flow chart illustrating a method of determining whether a given memory access is allowed by
In response to a memory access request, the TLB control circuitry 280 looks up the TLB. The lookup accesses at least some entries of the TLB. Some approaches may use a fully associative cache structure, in which case all entries of at least the
a tag comparison 302 for comparing whether the address of the memory access request matches the tag 262 stored in the access entry;
a first (context) comparison 304 for comparing the translation context identifier stored in the access entry with the translation context identifier of the memory access request; and
a second (realm) comparison 306 for comparing the global RID of the memory access request against the owner RID 270 and the visibility attributes 272 of each of the accessed set of entries.
At step 308, the control circuitry 280 determines whether there is an entry in the TLB which returns a match for all of the comparisons 302, 304, 306; if so, a hit is identified, and at step 310 the physical address 264 specified in the matching entry is returned and the memory access is allowed to proceed based on that physical address. In the case of a hit, there is no need to perform any lookup of the page tables or the RMU table (the ownership table lookup for the memory access may be omitted). The protections provided by the page tables and the RMU table are only invoked on a miss.
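The three comparisons above, and the hit/miss decision they feed, can be sketched as a minimal Python model. The field names (`tag`, `context`, `owner`, `pa`) and the visibility encodings are illustrative assumptions, not the architectural layout; realm GRIDs are modeled as prefix-structured strings as in the text.

```python
def realm_check(entry, current_grid):
    # Comparison 306: does the current realm's GRID satisfy the owner's
    # visibility policy cached in the TLB entry?
    if entry["visibility"] == "global":
        return True
    if entry["visibility"] == "parent" and entry["owner"].startswith(current_grid):
        return True    # an ancestor, permitted because parent visibility is set
    # default: the owner itself or a descendant (prefix-structured GRIDs)
    return current_grid.startswith(entry["owner"])

def tlb_lookup(entries, vaddr_tag, context_id, current_grid):
    for e in entries:
        if (e["tag"] == vaddr_tag            # tag comparison 302
                and e["context"] == context_id   # context comparison 304
                and realm_check(e, current_grid)):
            return e["pa"]   # hit: page tables and ownership table not consulted
    return None              # miss: walk the page tables and the RMU table
```

A miss (None) is what triggers the page table and ownership table walks described below; on a hit none of those slower checks are repeated.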
If there are no entries matching all three of the comparisons 302, 304, 306, then a miss is detected. If further levels of TLB are provided, corresponding lookup steps 300-308 may be performed in the level 2 or subsequent levels of TLB. If the lookup misses in the last level TLB, various page tables and RMU tables are traversed. Accordingly, a
On the other hand, if the access request passes the
If no stage 2 fault occurs, at step 318 an RMU table lookup is triggered based on the physical address returned by stage 2, and at step 320 it is determined whether a realm fault has been detected. A realm fault may be triggered if any of the following events occurs:
If the lifecycle state for the corresponding memory region is indicated as invalid in the realm ownership table 128. This ensures that pages of the memory address space that have not yet been subjected to the
The current realm is not allowed, by the owner realm of the corresponding memory region, to access that memory region. There may be a number of reasons why a given realm is not allowed to access a given memory region. If the owner realm has specified that the memory region is visible only to the owner itself and to descendants of the owner, then other realms may not access that region. A memory access may also be rejected if the current realm is the parent realm of the owner realm and the owner realm has not set the parent visibility attribute to allow the parent to access the region. Additionally, if the memory region is currently set to RMU-private as discussed above, even the owner realm itself may be prevented from accessing it. At the RMU check stage, descendant realms of the owner realm may be allowed to access the memory region (as long as it is not an RMU-private region). Thus, this check enforces the access permissions set by the owner realm.
If the physical address mapped by the S1/S2 translations for the current memory access does not match the mapped address specified in the ownership table 128 for the corresponding memory region as shown in FIG. 19, then the memory access is rejected. This protects against the following situation: a malicious parent realm could assign ownership of a given memory region to a child realm, but then change the translation mappings in the page tables 120 so that a subsequent memory access triggered by the child realm, using the same virtual address that the child realm previously used to refer to the page it owns, now maps to a different physical address which is not actually owned by the child realm itself. By providing, in the ownership table, a reverse mapping from the physical address of the corresponding memory region back to the mapped address that was used to generate the physical address when ownership was claimed, security breaches caused by changes in the address mapping can be detected, so that the memory access fails.
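The realm checks listed above can be gathered into a small Python sketch. This is a hypothetical model of the ownership-table checks, not the architectural algorithm: the table layout, the state names, and the fault strings are illustrative assumptions, and the permission rule is reduced to "owner or descendant" for brevity.

```python
def allowed(entry, current_grid):
    if entry["state"] == "rmu_private":
        return False                         # even the owner realm is blocked
    return current_grid.startswith(entry["owner"])  # owner or a descendant

def realm_fault(ownership_table, pa, mapped_addr, current_grid):
    entry = ownership_table[pa]
    if entry["state"] == "invalid":
        return "fault: region invalid"
    if not allowed(entry, current_grid):
        return "fault: not permitted by owner realm"
    if entry["mapped_addr"] != mapped_addr:
        # Reverse-map check: the address that produced this PA must match the
        # address recorded when ownership was claimed, defeating a parent
        # realm that later rewires the page tables.
        return "fault: address mapping changed since ownership was claimed"
    return None  # access allowed
```

The reverse-map comparison on `mapped_addr` is the defense against the malicious-parent scenario described above: remapping the page tables changes the translated address pair and so trips the third check.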
It will be appreciated that other types of check could also be performed. If the realm check is successful, the physical address is returned at step 322, the memory access is allowed to proceed using that physical address, and a new entry is allocated to the TLB indicating the physical address obtained from the page tables 120 and the owner realm and visibility attributes obtained from the ownership table 128 for the requested virtual address and translation context.
Thus, in summary, by requiring the second comparison (comparing the GRID of the current realm with the GRID provided in the translation cache entry) to match in order for a hit to be detected in the translation cache lookup, it is ensured that even if, after a TLB entry has been allocated, there is a change in the translation context identifier associated with a given realm, this cannot be used to circumvent the realm protections, even though the realm check is not repeated on a TLB hit. This improves performance, since it avoids repeating the realm checks on every memory access (which would be relatively processor-intensive given the number of checks to be made); because hits are much more common than misses, most memory accesses can proceed faster. When the second comparison identifies a mismatch between the realm identifier specified in the entry and the realm identifier of the current realm, a mismatch between the memory access and the given entry of the translation cache is detected. This triggers a miss, which may in turn trigger page table and RMU table walks in order to locate the correct access control data (with the realm check repeated in case the VMID/ASID has changed).
This approach is safe because the RMU can prevent initialization of a new realm having the same realm identifier as a previously active realm until a scrubbing process for invalidating information related to the previously active realm has been performed. This scrubbing process may include not only invalidation of the realm management data and of any data stored in memory related to the invalidated realm, but also invalidation of at least one translation cache entry for which the second comparison identifies a match between the entry's realm identifier and the realm identifier of the invalidated realm. Thus, a different process cannot be regenerated using the same realm identifier as a previous process unless all data in the
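The scrub-before-reuse rule can be sketched as follows. This is a hypothetical Python model: the data structures and function names are illustrative assumptions, and "scrubbing" is reduced to nulling data and filtering TLB entries.

```python
def scrub_realm(rid, realm_descriptors, memory, tlb):
    """Scrub sequence that must complete before a RID may be recycled."""
    realm_descriptors[rid]["valid"] = False            # management data invalid
    for entry in memory.values():
        if entry["owner"] == rid:
            entry["data"] = None                       # overwrite realm data
    tlb[:] = [e for e in tlb if e["owner"] != rid]     # flush matching entries

def can_reuse_rid(rid, memory, tlb):
    # The RMU refuses to create a new realm with a recycled RID while any
    # trace of the old realm remains in memory or in the translation caches.
    stale_tlb = any(e["owner"] == rid for e in tlb)
    stale_mem = any(e["owner"] == rid and e["data"] is not None
                    for e in memory.values())
    return not (stale_tlb or stale_mem)
```

Flushing the matching TLB entries is the step that makes the second (GRID) comparison safe: without it, a stale entry tagged with the recycled RID could hit for the new, unrelated realm.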
A miss in the translation cache may trigger an ownership table lookup which accesses an ownership table specifying, for each of a number of memory regions, the owner realm of the corresponding memory region and access constraints set by the owner realm for controlling which other realms are allowed to access the memory region. By including the additional second comparison in determining TLB hits, the ownership table lookup can be omitted on a hit; the ownership table lookup is performed on a TLB miss.
Although FIG. 25 illustrates an approach in which the GRID of the owner realm is stored in each TLB entry, there may be other ways of representing information which enables a determination of whether the GRID of the current realm permits access to the corresponding memory region. For example, a list of GRIDs of authorized realms could be maintained in the TLB, or the TLB could maintain a separate list of active realms, with each TLB entry including an index into the active realm list rather than a full GRID, which could reduce the TLB entry size compared to storing the full GRID in each entry. However, simply storing the GRID of the owner realm may be a more efficient way of identifying the authorized realms, since it makes the process of allocating and checking TLB entries less complex by avoiding the additional level of indirection of consulting the active realm list, and also avoids the need to synchronize changes in the active realm list between TLBs.
It should be noted that a match in the second (GRID) comparison performed when looking up the TLB does not necessarily require the current realm identifier to be identical to the global realm identifier 270 specified in the corresponding TLB entry 260; some forms of the second comparison may use a partial match. Some embodiments may allow only the owner realm to access pages owned by that owner realm, in which case an exact match between the current GRID and the owner GRID 270 may be required. However, since it can be useful for data to be shared between realms, visibility attributes 272 may also be provided to allow the owner realm to define what access is allowed to other realms.
Thus, by caching the owner realm's global RID 270 and the visibility attributes 272 in the TLB100, the TLB control circuitry 280 can vary, based on the visibility attributes, the degree of matching required for the second comparison to be determined a match. For example, the visibility attributes 272 may control which portions of the GRID should be masked when making the comparison, so that it does not matter whether the masked bits match, since they do not affect the overall result of the comparison. For example, the control circuitry may in some cases determine a mismatch when the current realm is a realm other than the owner realm or a descendant realm of the owner realm. Descendant realms can easily be identified using the global RID format discussed above, since these descendant realms will have a prefix or suffix portion matching the GRID of the owner realm.
For at least one value of the visibility attribute, the control circuitry may determine a mismatch when the current realm is a realm other than the owner realm, a descendant realm of the owner realm, or the parent realm of the owner realm (e.g. when parent visibility is set, as discussed above). In some cases, at least one value of the visibility attribute may allow the control circuitry 280 to determine a match for the second comparison regardless of which realm is the current realm (e.g. if the global visibility bit is set). Thus, while the second comparison is in general based on the realm identifier, exactly what requirement must be met by the GRID of the current realm can depend on the visibility bits 272. Such partial matching can be carried out efficiently because the realm identifier of a child realm is constructed to include the bit portion of the realm identifier of the parent realm which initialized that child realm. In embodiments which support variable allocation of different variable-length bit portions of the global realm identifier to different generations of realm, the TLB entry 260 may also specify some information identifying the positions of the boundaries between the different local RIDs which are concatenated to form the GRID (in order to allow the parent realm to be distinguished from grandparent or earlier-generation realms). This enables the TLB control circuitry 280 to determine which portions of the realm identifier are to be masked. In other embodiments this may not be necessary (e.g. if any ancestor realm is allowed to access a memory region for which the owner realm has given visibility to its parent realm).
In addition, some embodiments may have a fixed mapping in which each global realm identifier has a fixed number of bits (such as 32 bits), in which case it may not be necessary to provide any additional boundary-defining data.
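The visibility-driven partial matching described above can be sketched as follows. This is a hypothetical model: GRIDs are modeled as strings of concatenated local RIDs rather than bit fields, `parent_len` plays the role of the cached boundary information, and the visibility encodings are illustrative assumptions.

```python
def grid_match(owner_grid, current_grid, visibility, parent_len):
    """Second (realm) comparison with visibility-dependent partial matching.

    parent_len is the cached boundary: the length of the prefix that
    identifies the owner realm's direct parent within the owner's GRID.
    """
    if visibility == "global":
        return True                  # any current realm matches
    if current_grid.startswith(owner_grid):
        return True                  # the owner itself, or a descendant
    if (visibility == "parent"
            and owner_grid.startswith(current_grid)
            and len(current_grid) == parent_len):
        return True                  # the direct parent only, via the boundary
    return False
```

The boundary value is what lets the comparison admit the direct parent while still rejecting grandparent and earlier-generation realms, matching the role of the boundary information cached in the TLB entry.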
FIG. 27 is a Venn (Venn) diagram showing an example of the architectural state accessible to
When operating at exception level EL0,
General purpose registers, including integer registers, floating point registers, and/or vector registers, for storing general purpose data values during data processing operations.
A Program Counter (PC) register that stores a program instruction address representing the current execution point within the program being executed.
A saved processor state register (SPSR_EL0) for storing information about the current state of the processor when an exception is taken from a process executing at
An exception link register (ELR_EL0) for storing the current program counter value when an exception is taken, so that the ELR provides the return address to which processing should branch once the exception has been handled.
A realm identifier register (RID_EL0) for storing the local RID of the child realm into which realm entry is performed (even though exception level EL0 is the lowest, least privileged exception level, a realm may be entered from a process operating at EL0, which has the ability to create a sub-realm as discussed below).
An Exception Status Register (ESR), which is used by EL0 to store information about exceptions that occur (e.g., to allow selection of an appropriate exception handler).
When
Similarly, when executing instructions at the exception level EL2, the
Finally, when operating at EL3, the processing element may access state subset 356, which includes all of the subset 354 accessible at EL2, but may also include other state, such as a further realm identifier register RID_EL3 used by processes operating at exception level EL3 to enter a realm, and further exception handling registers ELR, SPSR, ESR similar to the corresponding registers for the lower exception levels. FIG. 27 is merely an example, and other state may also be included in the relevant subset accessible at a particular exception level.
Thus, each exception level is associated with a corresponding group of registers which the processing circuitry can access when processing a software process at that exception level. For a given exception level other than the least privileged exception level, the group of registers accessible at the given exception level includes the group of registers accessible at a less privileged exception level than the given exception level. This hierarchy of state accessible at particular levels can be exploited to reduce the administrative burden associated with state saving and restoration on realm entry and exit, as discussed below.
Upon entering or exiting from the domain, the
In the techniques described below, the realm mechanism reuses the mechanisms already provided for exception entry and return in order to enter and exit realms. This reduces the amount of software modification required to support realm entry and exit, and simplifies the architecture and hardware. This is particularly useful because realm boundaries may in general correspond to exception level boundaries anyway, and even if new instructions were provided to control realm entry and exit, behavior for handling exceptions would still be required; so, overall, extending the exception mechanism to also control realm entry and exit tends to be less expensive.
Thus, an exception return (ERET) instruction, which would normally return processing from an exception handled in the current realm to another process also handled in the current realm (where the other process may be handled at the same exception level as, or a less privileged exception level than, the exception), can be reused to trigger a realm entry from the current realm to a destination realm. In response to a first variant of the exception return instruction, the processing circuitry may switch processing from the current exception level to a less privileged exception level (without changing realm), while in response to a second variant of the exception return instruction, the processing circuitry may switch processing from the current realm to a destination realm which may operate at the same exception level as, or a less privileged exception level than, the current realm. Using an exception return instruction to trigger realm entry can greatly simplify the architecture and the hardware management burden, and reduce the software modifications needed to support the use of realms.
Another advantage of using an exception return instruction is that, typically on return from an exception, the processing circuitry performs an atomic set of operations in response to the exception return instruction. The set of operations required on return from an exception is performed atomically, so that these operations cannot be divided partway through: either the instruction fails and none of the atomic set of operations is performed, or the instruction executes successfully and all of the atomic set of operations are performed. For the second variant of the exception return instruction, the processing circuitry may similarly perform a second atomic set of operations, which may differ from the first atomic set of operations. The mechanisms already provided in processors for ensuring that an exception return instruction completes atomically can be reused for realm entry, in order to avoid situations in which a partially executed realm entry could lead to security vulnerabilities. For example, the second atomic set of operations may include changing the current realm being executed, making the realm execution context state available, and branching to the program counter address at which processing previously stopped the last time the same realm was executed.
The first and second variants of the exception return instruction may have the same instruction encoding. Hence, no modification of the exception return instruction itself is necessary in order to trigger a realm entry, which improves compatibility with legacy code. Whether a given exception return instruction is executed as the first variant or the second variant may depend on a control value stored in a status register (e.g., first and second values of the control value may represent the first and second variants of the exception return instruction, respectively). Thus, the current architectural state at the time the exception return instruction is executed controls whether the instruction returns the processor to a lower privilege level in the same realm, or triggers entry into a new realm.
This approach enables realm entry to be controlled with fewer software modifications, especially since the value in the status register can be set automatically by hardware in response to certain events which imply that a realm switch may follow (in addition to allowing the control value to be set voluntarily in response to software instructions). For example, when an exception condition triggering an exit from a given realm occurs, the processing circuitry may set the control value to the second value for the given realm, so that a subsequent exception return instruction automatically returns processing to the realm in which the exception occurred, even if the exception handler code which handles the exception is legacy code not written with realms in mind. Alternatively, in some architectures it may be expected that, on exit from a realm, the control value in the status register will still hold the second value that was set before realm entry was triggered, so that explicit setting of the control value in the status register may not be required.
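The one-encoding, two-behaviors dispatch described above can be sketched in Python. This is a hypothetical state model: the dictionary-based CPU state, the field names, and the use of a simple `R` key for the control flag are illustrative assumptions standing in for the SPSR flag and banked registers.

```python
def eret(cpu):
    """One ERET encoding; the control value selects which variant runs."""
    if cpu["spsr"]["R"] == 0:
        # First variant: ordinary exception return within the current realm.
        cpu["pc"] = cpu["elr"]
        cpu["el"] = cpu["spsr"]["target_el"]
    else:
        # Second variant: realm entry. The destination comes from the realm
        # identifier register; return state comes from the realm execution
        # context (REC) in memory, not from SPSR/ELR.
        rec = cpu["recs"][cpu["rid"]]
        cpu["realm"] = cpu["rid"]
        cpu["pc"] = rec["pc"]
        cpu["el"] = rec["el"]
    return cpu
```

Note how the second branch never reads `elr` or the SPSR return fields, which matches the observation that for a realm entry this information is instead drawn from the REC.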
In one example, the control value in the status register may be the R flag in the SPSR register associated with the current exception level, as discussed above. Using the SPSR can be useful because this register would normally be used on an exception return to provide the processor mode (including the exception level) and other information about how processing should continue on return from the exception currently being handled. However, for a realm entry this information is instead determined from the realm execution context (REC), so the SPSR would otherwise not be needed; reusing part of the SPSR to store the R flag which controls whether the exception return instruction is treated as the first variant or the second variant avoids the need to provide an additional register for this purpose. Thus, it can be useful to use a status register which, in response to the first variant of the ERET instruction, is used to determine return state information (such as the processing mode) for continuing processing at the less privileged exception level, while in response to the second variant of the exception return instruction this return state information is instead determined from memory, so that the status register itself need not be accessed for that purpose. In particular, the status register used to store the control value may be the status register associated with the current exception level from which the exception return instruction is executed.
As shown in FIG. 27, at least one realm identifier register can be provided, and in response to the second variant of the exception return instruction, the processing circuitry can identify the destination realm from the realm identifier stored in the realm identifier register. The realm identifier registers may be banked, so that there are a plurality of realm identifier registers each associated with one of the exception levels, and in response to the second variant of the exception return instruction, the processing circuitry may identify the destination realm from the realm identifier stored in the realm identifier register associated with the current exception level. By using the realm identifier register to store the target realm identifier, there is no need to include this in the instruction encoding of the ERET instruction, which enables the existing format of the ERET instruction to be used to trigger a realm entry, reducing the amount of software modification required. The realm identifier in the realm identifier register can be a local realm identifier used by a parent realm to reference its child realms, so that realm entry is limited to transfers from a parent realm to a child realm, and it is not possible to go from a first realm to another realm that is not a direct child of the first realm. In response to the second variant of the exception return instruction, the processing circuitry may trigger a fault condition when the realm associated with the realm ID identified in the RID register is an invalid realm (either no realm descriptor has been defined for that RID, or the realm descriptor defines a lifecycle state other than active).
In response to the second variant of the exception return instruction, the processing circuitry may restore the architectural state associated with the thread to be processed in the destination realm from a realm execution context (REC) memory region specified for the exception return instruction. The state restoration may occur immediately (e.g., as part of the atomic set of operations) in response to the second variant of the exception return instruction, or may occur later. For example, state restoration may be done in a lazy manner, so that state required for processing to begin in the destination realm (e.g., the program counter, processing mode information, etc.) is restored immediately, while other state, such as general purpose registers, is restored gradually as needed, or in the background of continued processing in the new realm. Thus, the processing circuitry may begin processing of the destination realm before all the required architectural state has been restored from the REC memory region.
In response to a first variant of the exception return instruction, the processing circuitry may branch to the program instruction address stored in the link register. For example, this may be the ELR of FIG. 27, which corresponds to the current exception stage at which the exception return instruction is executed. Conversely, for a second variant of an exception return instruction, the processing circuitry may branch to a program instruction address specified in a domain execution context (REC) memory region. Thus, because the link register will not be used for the second variant of the exception return instruction to directly identify any architectural state for the new domain, the link register can be reused to instead provide a pointer to the REC memory region from which the architectural state of the new domain is to be restored. This avoids the need to provide further registers for storing the REC pointer.
Thus, prior to executing an exception return instruction that attempts to cause a realm entry into a given realm, some additional instructions may be included in order to set the RID register to the realm identifier of the destination realm and set the link register to store a pointer to the REC memory region associated with the destination realm. The REC pointer may be obtained by the parent realm from the realm descriptor of the destination realm.
In response to the second variant of the exception return instruction, a fault condition may be triggered by the processing circuitry when the REC memory region is associated with an owner realm other than the destination realm, or when the REC memory region specified for the exception return instruction is invalid. The first check prevents the parent realm from causing the child realm to execute with a processor state which the child realm did not itself create, because only a memory region owned by the child realm can store the REC which is accessible on entry into that realm (and, as discussed above, the REC memory region will be set as RMU-private). The second check, of the validity of the REC memory region, may be used to ensure that a REC memory region can be used only once to enter the realm, with subsequent attempts to enter the realm using the same REC data being rejected. For example, each REC may have a lifecycle state which is either invalid or valid. In response to an exception occurring during processing of a given thread in the current realm, the architectural state of the thread may be saved to the corresponding REC memory region, which then transitions from invalid to valid. The REC memory region then transitions from valid back to invalid in response to successful execution of the second variant of the exception return instruction. This prevents the parent realm from maliciously causing incorrect behaviour of the child realm by specifying a pointer to a stale REC memory region, a REC memory region associated with a different thread, or some other region owned by the destination realm which does not store the architectural state correctly saved at the previous exit from the destination realm.
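The ownership and single-use validity rules for a REC can be modelled concretely. The sketch below is an illustrative Python model only: the class and method names (`Rec`, `RecState`, `RealmEntryFault`, `save_on_exception_exit`, `restore_on_realm_entry`) are invented for this example and do not come from any real RMU implementation.

```python
# Minimal model of the REC lifecycle described above: a REC becomes
# valid when state is saved on exception exit, and is consumed (made
# invalid again) by exactly one successful realm entry.
from enum import Enum, auto


class RecState(Enum):
    INVALID = auto()
    VALID = auto()


class RealmEntryFault(Exception):
    """Raised when a realm check fails on the second ERET variant."""


class Rec:
    """A realm execution context (REC) memory region (illustrative)."""

    def __init__(self, owner_realm_id):
        self.owner = owner_realm_id
        self.state = RecState.INVALID  # unusable until an exception exit fills it
        self.arch_state = None

    def save_on_exception_exit(self, arch_state):
        # Exiting the realm stores the thread's state and validates the REC.
        self.arch_state = arch_state
        self.state = RecState.VALID

    def restore_on_realm_entry(self, destination_realm_id):
        # Check 1: the REC must be owned by the destination realm.
        if self.owner != destination_realm_id:
            raise RealmEntryFault("REC not owned by destination realm")
        # Check 2: the REC must be valid (saved state not yet consumed).
        if self.state is not RecState.VALID:
            raise RealmEntryFault("REC invalid (stale or already consumed)")
        self.state = RecState.INVALID  # one entry per saved context
        return self.arch_state
```

Attempting to re-enter a realm twice with the same REC, or with a REC owned by another realm, raises the fault in this model, mirroring the two checks described above.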
In a corresponding manner, exiting a realm may reuse the mechanisms provided for exception handling. Thus, in response to an exception condition which occurs during processing of a first realm and cannot be handled by the first realm, the processing circuitry may trigger a realm exit to the parent realm which initialized the first realm. On the exception occurrence/realm exit, some additional operations may be performed which would not be performed for an exception that can be handled within the same realm. This may include, for example, masking and scrubbing of architectural state and triggering of state saving to the REC, as discussed in more detail below.
However, in some cases an exception may occur which cannot be handled even by the parent realm of the first realm in which the exception occurred. In this case it may be necessary to switch to an ancestor realm further up the hierarchy than the direct parent. While it would be possible to provide the ability to switch directly from a given realm to an ancestor realm more than one generation removed, this would increase the complexity of the status registers needed to handle exception entry and return, or realm exit and entry.
Instead, a nested realm exit may be performed when the exception condition is to be handled at a target exception level more privileged than the most privileged exception level at which the parent realm of the first realm is allowed to execute. A nested realm exit comprises two or more successive realm exits from child realm to parent realm, until a second realm is reached which is allowed to process the exception at the target exception level. Stepping up the realm hierarchy one level at a time in this way can simplify the architecture. At each successive realm exit, operations may be performed to save a subset of the processor state to the REC associated with the corresponding realm.
When the exception has been handled, then in response to an exception return instruction of the second variant executed in the second realm after the nested realm exit, the processing circuitry may trigger a nested realm entry to return to the first realm. This can be handled in different ways. In some examples, the hardware may trigger the nested realm entry itself, without requiring any instructions to be executed in any intermediate realm encountered between the first realm and the second realm during the nested realm exit. Alternatively, the hardware may be simplified by providing a nested realm entry process which returns one level at a time through each successive realm encountered in the nested realm exit, executing a further ERET instruction of the second variant in each intermediate realm. In this case, to ensure that the intermediate realm triggers a return to the child realm which exited to it during the nested realm exit, an exception status register may be set to indicate that a predetermined type of exception condition occurred in the child realm. For example, a new type of exception condition (e.g., a "fake realm exit") may be defined to handle this intermediate-realm case. When the intermediate realm is reached, the processor then resumes processing within the intermediate realm from the program instruction address corresponding to the exception handling routine for handling this predetermined type of exception condition. That exception handling routine may, for example, simply determine that a child realm exited for some unknown reason, and may then choose to execute another exception return instruction of the second variant to return processing to the further child realm. By performing this operation in each intermediate realm, eventually the original first realm in which the exception occurred can resume processing.
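The software-assisted variant described above can be sketched as a loop over the chain of realms passed through during the nested exit. This is a purely illustrative Python model under assumed data structures (a list of dictionaries with invented `name` and `intermediate` fields); it only demonstrates the control flow, not any real hardware behaviour.

```python
# Model of software-assisted nested realm entry: each intermediate realm
# takes a "fake realm exit" exception whose handler simply executes
# another ERET of the second variant, descending one level at a time.
def nested_realm_entry(realm_chain):
    """realm_chain: realms from the second realm (where the exception was
    handled) down to the first realm (where it occurred).
    Returns the names of the realms entered, in order."""
    entered = []
    # The second realm executes the first ERET of the second variant,
    # entering its direct child; each step below is one realm entry.
    for realm in realm_chain[1:]:
        entered.append(realm["name"])
        if realm["intermediate"]:
            # Int flag set: take the fake realm exit exception, whose
            # handler ERETs onward to the saved child realm.
            continue
        # Int flag clear: this is the realm where the exception
        # originally occurred, so processing resumes here.
        break
    return entered


# Example chain: A (handler, EL2) -> B (intermediate, EL1) -> C (EL0).
chain = [
    {"name": "A", "intermediate": False},
    {"name": "B", "intermediate": True},
    {"name": "C", "intermediate": False},
]
```

For the example chain, the entries pass through B (intermediate) and end at C, matching the nested entry sequence described in the text.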
During this nested realm exit and entry procedure, an intermediate realm flag within a status register may be used to mark which realms are intermediate realms, so as either to trigger an immediate hardware-triggered realm entry to the relevant child realm, or to trigger the setting of the exception status information which then causes the exception handler or other code within the intermediate realm to return to the child realm. For example, the intermediate realm flag may be the Int flag in the associated SPSR, as discussed with respect to FIG. 27.
FIG. 28 is a flow chart illustrating a method of handling a realm entry or exception return. At step 400, an exception return (ERET) instruction is executed while the current exception level is ELx, where ELx may be any of the exception levels supported by the processing circuitry. Although the skilled person might not expect an exception return to be performed from the least privileged exception level EL0, the ability to create a sub-realm, as discussed below, means that an ERET instruction may still be executed from EL0 in order to trigger entry into a sub-realm which also executes at EL0.
At step 402, the processing circuitry determines the current value of the realm flag R in the SPSR associated with exception level ELx. If the realm flag R is zero, this indicates a conventional exception return, without entry into a different realm. At step 404, a conventional exception return is therefore performed, with the processing circuitry branching to the program instruction address stored in the link register ELR_ELx and determining the return state from SPSR_ELx.
If at step 402 the realm flag R is set to 1, this indicates a realm entry, and thus triggers a second atomic set of operations, different from the set performed for a conventional exception return. At step 408, the processing circuitry triggers the realm management unit to perform a number of realm checks. These include the following checks:
The local RID indicated in the realm identifier register RID_ELx associated with exception level ELx must indicate a valid child realm. That is, the RMU checks the realm descriptor accessed from the realm descriptor tree 360 for the specified child realm, and checks whether the lifecycle state in that realm descriptor indicates the active state. If the child realm is in any state other than the active state, this realm check is unsuccessful.
The RMU 20 also checks that the REC memory region indicated by the pointer in the link register ELR_ELx is a memory region owned by the child realm indicated in the realm ID register RID_ELx. That is, the RMU 20 accesses the realm granule table 128 (or information cached from the RGT 128), locates the entry corresponding to the memory region indicated by the REC pointer, and checks the owner realm specified for that memory region. The owner realm indicated in the ownership table may be specified as a global RID, which can be compared with the global RID specified in the realm descriptor of the target child realm to determine whether the child realm is the valid owner of the REC. This check is unsuccessful if the REC is owned by any realm other than the child realm specified in the RID register.
The RMU 20 also checks whether the status of the REC memory region indicated in ELR_ELx is valid. There are different ways in which the validity of a REC memory region could be represented: for example, each REC memory region may include a flag specifying whether it is valid, or a separate table may define the validity of RECs stored in other memory regions. A REC may be valid if it has been used to store the architectural state of the associated realm on a previous exception exit, but has not yet been used to restore that state on a return from the exception. If the REC is invalid, the realm check is again unsuccessful.
The RMU 20 also checks whether a flush command has been executed since the last exit from any child realm other than the child realm indicated in the RID register RID_ELx. The flush command is discussed in more detail below, but is a command which ensures that any state of a child realm still to be saved to its REC is pushed out to memory (this helps to support the lazy state saving approach). If no flush command has been executed and the system attempts to enter a different child realm from the one previously exited, there is a danger that state which has not yet been pushed to memory may still be left in the processor registers. Requiring the flush command ensures that a different child realm can be entered safely, without loss of the previous child realm's state (or leakage of that state to other realms). There may be multiple ways of identifying whether a flush command has been executed. For example, status flags may track whether (a) the RID register RID_ELx has changed since the last realm exit, and (b) a flush command has been executed since the last realm exit. The realm check is then unsuccessful if the RID register has changed and no flush command has been executed since exiting the previous realm.
If any of the realm checks is unsuccessful, a fault is triggered at step 409 and the system stays within the current realm associated with the ERET instruction. A child realm therefore cannot be reached unless all of the realm checks succeed.
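The four realm checks of step 408 can be sketched as a single validation function. In the Python sketch below, the dictionaries standing in for the realm descriptor tree, the ownership information and the REC validity tracking, as well as all parameter names, are hypothetical simplifications introduced for illustration only.

```python
# Illustrative model of the realm checks performed on the second ERET
# variant (FIG. 28, step 408). Returns None on success, or a string
# naming the first check that failed.
def realm_checks(rid_elx, elr_elx, realm_descriptors, ownership_table,
                 rec_valid, last_exit_rid, flushed_since_exit):
    # Check 1: RID_ELx must name a valid (active) child realm.
    desc = realm_descriptors.get(rid_elx)
    if desc is None or desc["lifecycle"] != "active":
        return "invalid child realm"
    # Check 2: the REC region pointed to by ELR_ELx must be owned by
    # that child realm (owner recorded as a global RID).
    if ownership_table.get(elr_elx) != desc["global_rid"]:
        return "REC not owned by child realm"
    # Check 3: the REC region must be valid (saved state not consumed).
    if not rec_valid.get(elr_elx, False):
        return "REC invalid"
    # Check 4: entering a different child than the one last exited
    # requires an intervening flush command, so no lazy state can leak.
    if rid_elx != last_exit_rid and not flushed_since_exit:
        return "flush required before entering a different child"
    return None
```

A fault would be raised whenever the function returns a non-None result; entry proceeds only when every check passes.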
If all of the realm checks are successful, then at step 410 the processing circuitry switches to processing in the child realm indicated in the realm ID register RID_ELx. For example, the processor may have an internal register which specifies the global RID of the current realm (such an internal register need not be software-visible, and is distinct from the RID registers shown in FIG. 27). The switch to the child realm may occur by writing the global RID of the new destination realm to this internal RID register.
At step 412, the state associated with the new realm is made available based on the state saved in the REC memory region indicated by the pointer in the ELR_ELx register. Because the REC region is owned by the new child realm, it is now accessible, and return state information such as the program counter and target exception level can be obtained from the REC. At this point a selected subset of the architectural state may be restored from the REC; alternatively, the architectural state may be restored lazily, so that processing can begin without all state having been fully restored, with state then being restored gradually as needed or over a period of time, improving performance by reducing the latency before processing can resume in the new realm.
At step 414, it is determined whether the intermediate domain flag Int is set in the SPSR associated with the new domain. The SPSR content will be recovered from the REC along with the rest of the architecture state. If the intermediate domain flag is not set, this indicates that the new domain is the domain in which the original exception occurred (or the domain is being entered for the first time without any previous exceptions occurring in that domain), and thus does not need to trigger any further domain entries into the child domains. At step 416, the program counter is obtained from the REC, and then at step 418, processing continues in the new domain at the target exception level obtained from the REC.
Alternatively, if the intermediate realm flag is set, this indicates that a nested realm exit previously occurred and that the nested realm entry has reached an intermediate realm. Processing must therefore be returned to a further child realm in order to get back to the realm in which the exception originally occurred. There are two alternative techniques for handling this situation. In the first alternative, at step 420, a fake realm exit exception is taken: the exception status register associated with the new realm may be set by the processing circuitry to the status code associated with this type of exception, and processing then branches to the exception vector for this type of exception, triggering the corresponding exception handler. The exception handler does not have to do any actual processing; it may simply determine that an exception of unknown type occurred in the given child realm, and may then trigger another ERET instruction to be executed at step 422. Since the intermediate realm previously entered the further child realm, the RID and ELR registers associated with the intermediate realm may still hold the values placed in them at that time, so executing the ERET instruction triggers a further realm entry into that child realm. The method then returns to step 408 to check whether the realm checks are successful for this further realm entry, and continues in a similar manner to the previous realm entry in the nesting process.
Alternatively, instead of handling nested domain entry with another ERET instruction executing in the intermediate domain, at step 424, the hardware may detect that the intermediate domain flag is set for the current domain, and may then trigger further domain entry to child domains without requiring any instructions to be executed within the intermediate domain, and the method may then return to step 408.
FIG. 29 is a flow chart illustrating a method of taking an exception or exiting a realm. At step 430, an exception occurs within a given exception level ELx, targeting exception level ELy (ELy ≥ ELx). The target exception level ELy is the level at which the exception is to be handled; it may be the same as ELx, one exception level above ELx, or multiple exception levels above.
At step 432, the RMU determines whether the target exception level ELy is greater than the boundary exception level (BEL) of the current realm, which can be read from the realm descriptor of the current realm, and whether the current realm is a sub-realm (see the discussion of sub-realms below; the type field of the realm descriptor shown in FIG. 16 specifies whether a realm is a sub-realm). If the target exception level ELy is not greater than the boundary exception level, the exception can be handled within the current realm, and thus, if the current realm is a full realm, there is no need to trigger a realm exit. In this case, the exception is simply taken and handled within the current realm in the conventional manner.
On the other hand, if at step 432 the target exception level ELy is greater than the BEL of the current realm, or if the current realm is a sub-realm (for which any exception triggers an exit to the parent realm of the sub-realm), a realm exit is required in order to handle the exception.
If the realm exit is not a voluntary realm exit, one subset of the architectural state accessible to the current realm is selected for masking.
If the realm exit is a voluntary realm exit, a different subset may be selected, reflecting the fact that the voluntarily exiting realm has had the opportunity to deal itself with any state it wishes to hide.
Regardless of whether the realm exit is voluntary or involuntary, the selected subset of the architectural state is masked and scrubbed, so that subsequent accesses from the parent realm return a predetermined value rather than the hidden state.
A state save of the masked subset to the REC memory region associated with the exiting realm is triggered (which may be performed lazily, as discussed below).
The relevant status registers are then updated (including the realm flag R, so that a subsequent exception return can re-enter the exited realm), and processing switches to the parent realm at the target exception level, where the exception can be handled.
However, for a nested realm exit, it may be assumed that, for an intermediate realm, any registers accessible at a lower exception level than the boundary exception level of the intermediate realm were last used by the child realm executing at that lower exception level, since the intermediate realm had triggered entry to the child realm and no further execution has occurred at the intermediate realm since that entry. Such registers will therefore already have been dealt with on the realm exit from the child realm, and need not be saved to the REC associated with the intermediate realm during the nested realm exit. Conversely, during a nested realm entry, these registers accessible at lower levels need not be restored while passing through the intermediate realm, as they will subsequently be restored by the realm at the lower exception level. Instead, the state saving and restoring for an intermediate realm may comprise only the registers accessible at the boundary exception level of the intermediate realm but not accessible at the lower exception level. For example, for an intermediate realm at EL1, the state saved/restored in a nested realm exit/entry may include subset 352 of FIG. 27 but may exclude subset 350, which is accessible at EL0.
Thus, when an exception condition is to be handled by the exception handler at a target exception level that is more privileged than a boundary exception level of an earlier generation domain of a given domain in which the exception occurred, a nested domain exit may be triggered, the nested domain exit comprising a plurality of successive domain exits from a child domain to the earlier generation domain until a target domain having a boundary exception level corresponding to the target exception level or higher is reached. A respective state masking process (and state save) may be triggered for each of the successive domain exits, and each respective state masking process may mask (and save) a corresponding subset of registers selected based on the boundary exception level. For a domain exit from a given child domain having a boundary exception level other than the least privileged exception level, the corresponding subset of registers masked/saved during the nested domain exit may include at least one register accessible at the boundary exception level of the given child domain, but may exclude at least one register accessible to processing circuitry at a less privileged exception level as compared to the boundary exception level of the given child domain (as it may be assumed that such register would have been saved when exiting the domain at the less privileged exception level). This reduces the amount of operations required for state masking and saving.
Similarly, upon a realm entry (exception return), the intermediate realm flag may be used to determine whether the realm being entered is an intermediate realm. If the intermediate realm state value for a realm having a boundary exception level other than the least privileged exception level is set to a predetermined value (indicating an intermediate realm), the subset of registers to be restored upon realm entry may include at least one register accessible at the boundary exception level of the intermediate realm, but may exclude at least one register accessible to the processing circuitry at a less privileged exception level than the boundary exception level of that realm. If the intermediate realm state value is set to a value other than the predetermined value, the realm being entered is the final realm, and thus the subset of registers to be restored may include all registers accessible to the processing circuitry at the boundary exception level of that realm (without excluding any registers associated with lower levels).
In this way, state save and restore operations during nested domain exit and entry can be performed more efficiently.
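The subset-selection rule above can be sketched as a small lookup. In the Python sketch below, the register groupings only loosely follow the subsets 350/352 of FIG. 27, and the register names and the `REG_SUBSETS` table are invented for illustration.

```python
# Model of choosing which registers to save (on nested realm exit) or
# restore (on nested realm entry) for a realm with a given boundary
# exception level (BEL), depending on whether it is an intermediate realm.
REG_SUBSETS = {
    0: ["x0-x30", "sp_el0", "pc"],          # accessible at EL0 (cf. subset 350)
    1: ["sp_el1", "elr_el1", "spsr_el1"],   # added at EL1 (cf. subset 352)
    2: ["sp_el2", "elr_el2", "spsr_el2"],   # added at EL2
}


def subsets_to_save(bel, is_intermediate):
    """Registers saved/restored for a realm with boundary level `bel`."""
    if is_intermediate and bel > 0:
        # Intermediate realm: lower-level registers belong to the child
        # realm and were already handled on the child's own exit/entry.
        return REG_SUBSETS[bel]
    # Final/original realm: everything up to and including its BEL.
    return [r for el in range(bel + 1) for r in REG_SUBSETS[el]]
```

For an intermediate realm at EL1 only the EL1-only registers are touched, while the final realm at EL1 would save or restore the EL0 registers as well.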
FIG. 30 shows an example of a non-nested realm entry and exit. In this example, the parent realm is realm A operating at exception level EL1, and it is desired to enter a child realm B having a BEL of EL0. Realm A sets the RID and link registers for realm B and executes an ERET instruction with the realm flag R set; once the realm checks succeed, processing switches to realm B at EL0, with state being restored from the REC associated with realm B.
When an exception occurs at step 472 during execution of child realm B, a set of masking operations is performed to hide the state associated with realm B from its parent realm. This includes masking and scrubbing, at step 474, of at least a subset of the architectural state associated with EL0 (the subset of architectural state masked/scrubbed may depend on whether the exit is a voluntary or involuntary realm exit). The masking makes the state inaccessible, and the scrubbing ensures that subsequent accesses to the corresponding registers from the parent realm will return the predetermined value. At step 476, a state save to the REC associated with realm B is performed for the masked subset of the architectural state. In this case, because the child realm has a BEL of EL0, the masked subset of state includes at least the subset 350 accessible at EL0.
FIG. 31 shows a similar example illustrating a nested realm entry and exit. The grandparent realm A, at exception level EL2, executes an ERET instruction to enter its child realm B at EL1, and realm B in turn executes an ERET instruction to enter its own child realm C at EL0.
Subsequently, an exception occurs at step 514, targeting exception level EL2. For example, the exception may be of a type to be handled by the hypervisor (such as an event associated with a virtualized device, or a stage 2 address translation fault). A realm exit from realm C to its parent realm B is then triggered, with the state accessible at EL0 being masked, scrubbed and saved to the REC associated with realm C.
At step 520, the processing circuitry detects that the target exception level for the exception is higher than the boundary exception level of realm B, so that realm B is an intermediate realm and a further realm exit to the parent realm of realm B is required. Thus, a further realm exit to realm A at EL2 is triggered. For this intermediate realm exit, the intermediate realm flag Int is set in the SPSR associated with realm B, and the state saved to the REC associated with realm B may be limited to the registers accessible at EL1 but not at EL0, as discussed above.
Realm A then handles the exception at EL2. Once the exception has been handled, realm A executes an exception return instruction of the second variant to trigger a realm entry back to realm B.
Upon returning to realm B at EL1, the processing circuitry detects that the intermediate realm flag is set in SPSR_EL1 (step 532). Thus, a fake realm exit exception is taken and its handler executes a further exception return instruction of the second variant, triggering a realm entry back to realm C at EL0, where processing resumes from the point at which the original exception occurred.
Alternatively, for a hardware-assisted nested realm entry, steps 536 to 542 may be omitted: on detecting that the intermediate realm flag is set for realm B, the hardware may restore any required subset of state and trigger the further realm entry into realm C directly, without any instructions needing to be executed within realm B.
By using this nested realm entry and exit procedure, the realm at EL2 is not required to handle any realm identifiers or REC pointers associated with realm C at EL0; it needs only the realm ID and REC pointer of its direct child, realm B. This preserves the rule that a realm entry can pass only from a parent realm to a direct child of that realm.
FIGS. 32 and 33 show lazy state saving to the REC on a realm exit and state restoration from the REC on a realm entry, respectively. In general, on exiting a realm to the parent realm, it is desirable to mask the state associated with the child realm to hide it from the parent realm, and to perform scrubbing to ensure that the parent realm will see a predetermined value if it attempts to access an architectural register corresponding to the scrubbed state. These operations can be performed relatively quickly. However, if there is insufficient space in the physical register file of the processing circuitry to hold the child realm's state indefinitely, it may be necessary to save some of this state to the REC. This can take longer and occupies memory bandwidth which could otherwise be used for processing in the parent realm, delaying that processing. Similarly, the corresponding operation of restoring state from memory to registers on entering a realm can take some time. Therefore, for performance reasons, it may be desirable to support asynchronous saving/restoring of processor state to/from the REC. Whether a given processor implementation actually performs this lazy state saving is an implementation choice: some processors not aimed at high performance may find it simpler to trigger the state save immediately, reducing the complexity of tracking which state has and has not been saved. Nevertheless, to provide performance where needed, it is useful for the architecture to provide functionality supporting such an asynchronous, lazy state saving approach.
Thus, in response to a realm switch from a source realm to a target realm to be processed at a more privileged exception level than the source realm, the processing circuitry may perform state masking to make a subset of the architectural state data associated with the source realm inaccessible to the target realm. While this masked subset of state could be saved to memory at this point, that is not required. Instead, the architecture provides a flush command which can be used after the realm switch. When the flush command is executed, the processing circuitry ensures that any of the masked subset of architectural state data which has not yet been saved to the at least one REC memory region owned by the source realm is saved to that at least one REC memory region. By providing such a flush command, saving of the masked subset can be forced through at the points where it must be guaranteed complete, while giving specific micro-architectural implementations freedom over exactly when that subset is actually saved to memory in the absence of a flush command.
In addition to the state masking, after a realm switch the processing circuitry may also perform a register scrubbing operation as discussed above, which ensures that any subsequent read access to a given architectural register returns a predetermined value (if performed without an intervening write access). This scrubbing may be performed by actually writing the predetermined value to the physical register corresponding to the given architectural register, by register renaming, or by setting another control state value associated with the given architectural register to indicate that a read access should return the predetermined value rather than the actual contents of the corresponding physical register. If state saving is to be done asynchronously in response to the realm switch, the processing circuitry may begin processing of the target realm while at least a portion of the subset of architectural state data made inaccessible in response to the realm switch remains stored in registers of the processing circuitry. For example, a processor may have a physical register file larger than the number of registers provided as architectural registers in the instruction set architecture, so that some spare physical registers can hold the previously masked state for a period after processing has begun in the target realm. This is advantageous because, if processing then returns to the source realm while a given item of the subset of architectural state data is still stored in a register, the processing circuitry can simply resume access to that item from the register file, without needing to restore the data from the REC. Some types of exception may require only a relatively short exception handler to be executed, in which case some masked state may well remain resident in the register file when returning from the exception.
Such "shallow" exception entry/return events may benefit from using lazy state preservation.
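The mask/scrub/flush interplay described above can be sketched as follows. The Python model below is purely illustrative: the register-file representation, the `LazyRegisterFile` class and its method names, and the scrub value of 0 are all assumptions made for this example.

```python
# Model of lazy state saving: masking and scrubbing are fast operations
# performed at realm exit, while the actual save to the REC is deferred
# until a flush command (or another trigger event) forces it.
SCRUB_VALUE = 0


class LazyRegisterFile:
    def __init__(self, values):
        self.phys = dict(values)      # physical registers still holding
        self.masked = set()           # ...child state hidden from the parent
        self.rec = {}                 # REC contents in memory

    def mask_and_scrub(self, regs):
        # Fast path at realm exit: hide the state, defer the memory save.
        self.masked |= set(regs)

    def read(self, reg):
        # Parent reads of a scrubbed register see the predetermined value.
        return SCRUB_VALUE if reg in self.masked else self.phys[reg]

    def flush(self):
        # Flush command: push any still-unsaved masked state to the REC.
        for reg in self.masked:
            self.rec[reg] = self.phys[reg]

    def restore(self, reg):
        # Shallow return: state still resident needs no REC access.
        self.masked.discard(reg)
        return self.phys[reg]
```

A shallow exception return in this model (`restore` before any `flush`) reads the value straight back out of the register file, which is exactly the benefit lazy saving aims for.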
If lazy state saving is used, the processing circuitry may trigger the saving of a given item of the subset of architectural state data to the REC in response to the occurrence of a predetermined event other than a flush command, once processing of the target realm has begun after the exception. Although processing has by then switched to the parent realm (which normally cannot access the REC associated with its previous child), these operations are triggered in hardware by the microarchitectural implementation rather than by software, so they are not subject to the same ownership checks required for general software-triggered memory accesses (effectively, these REC save operations will have been sanctioned by the child realm before exiting).
Many different types of predetermined event may be used to trigger items of the subset of architectural state data to be saved to the REC, including the following:
A register access to an architectural register which corresponds to a given item of the subset of architectural state data. This approach may be useful for less complex processors that do not support register renaming. In this case, each architectural register may be mapped to a fixed physical register, so the first time code associated with the parent realm attempts to access a given architectural register, the old value of that register (belonging to the exited child realm) may need to be saved to memory.
A remapping of the physical register storing a given item of the subset of architectural state data. In systems that support register renaming, the architectural state may remain in the register file for longer, but eventually the corresponding physical register may have to be remapped to store a different value, and at that point the corresponding architectural state of the child realm may be saved to the REC.
The number of available physical registers becoming less than or equal to a predetermined threshold. In this case, instead of waiting for the actual remapping of a given physical register, state saving may begin pre-emptively once the number of free physical registers (available for reallocation to different architectural registers) becomes low.
Elapse of a given number of cycles or a given period of time. The saving thus need not be triggered by any particular processing event; instead, lazy state saving may simply spread the saving of the child realm's context to the REC over a period of time, to reduce the impact on the memory bandwidth available for other memory accesses triggered by processing in the parent realm.
An event indicating a reduced processor workload, such as an idle period or some other event indicating that performing a state save now will have less impact on the overall performance of processing in the parent realm. At this point, the saving of at least a portion of the subset of architectural state data may be triggered.
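The trigger events listed above can be condensed into a simplified software model. The following Python sketch is illustrative only: the class and method names (LazyRegisterFile, mask_on_exit and so on) are invented here, and a real implementation would act on physical registers in hardware rather than dictionaries.

```python
# Hypothetical model of lazy state saving after a realm exit. The masked
# child-realm state lingers in spare physical registers and is drained to the
# REC only when a trigger event (remap, register pressure, flush) occurs.

class LazyRegisterFile:
    def __init__(self, num_physical, free_threshold):
        self.masked = {}                 # arch reg -> child-realm value still held
        self.rec = {}                    # simulated REC memory region
        self.num_physical = num_physical
        self.free_threshold = free_threshold

    def mask_on_exit(self, child_state):
        # On realm exit the child's state is masked but not yet saved.
        self.masked = dict(child_state)

    def free_registers(self):
        return self.num_physical - len(self.masked)

    def save_to_rec(self, reg):
        # Commit one masked item to the REC, releasing its physical register.
        self.rec[reg] = self.masked.pop(reg)

    def on_register_remap(self, reg):
        # Trigger: the physical register holding a masked item is remapped.
        if reg in self.masked:
            self.save_to_rec(reg)

    def on_low_registers(self):
        # Trigger: the pool of free physical registers runs low.
        while self.free_registers() <= self.free_threshold and self.masked:
            self.save_to_rec(next(iter(self.masked)))

    def flush(self):
        # Flush command: commit every remaining masked item to the REC.
        for reg in list(self.masked):
            self.save_to_rec(reg)


rf = LazyRegisterFile(num_physical=8, free_threshold=4)
rf.mask_on_exit({"x0": 1, "x1": 2, "x2": 3})
rf.on_register_remap("x1")   # remapping forces a single save
rf.flush()                   # flush commits the rest
```

A shallow exception return before any trigger fires would simply discard nothing: the masked dictionary still holds the child's values and can be resumed directly, which is the performance benefit described above.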
After the realm switch, if the processing circuitry attempts to enter a further realm other than the source realm from which processing previously switched, the processing circuitry may deny the realm entry request when the further realm is to be processed at the same or a less privileged exception level as the previously exited realm and no flush command has been received between the realm switch and the realm entry request. Alternatively, the realm entry request could be accepted regardless of whether the flush command has been executed, but if it has not, the initial REC state of the child realm could be destroyed so that the REC is not reusable, thereby preventing effective entry into that child realm. In summary, a flush command is required before a parent realm can successfully direct processing to a different child realm than the one previously executed. This ensures that, even if the hardware chooses the lazy state saving approach, all necessary state associated with the previous child realm will have been committed for saving to memory by the time a different child realm is entered. This avoids the need to buffer multiple sets of child realm state awaiting saving to memory, and simplifies the architecture.
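The entry rule above reduces to a small predicate. This is a hypothetical sketch (the function and parameter names are invented), not architectural pseudocode:

```python
# Illustrative check for accepting a realm entry request after a realm exit:
# re-entering the same child is always fine (shallow return), but entering a
# *different* child requires an intervening flush command.

def realm_entry_allowed(requested_child, last_exited_child, flush_done):
    if requested_child == last_exited_child:
        return True          # shallow return; lazy state may still be in registers
    return flush_done        # different child: all lazy state must be committed

assert realm_entry_allowed("A", "A", flush_done=False)
assert not realm_entry_allowed("B", "A", flush_done=False)
assert realm_entry_allowed("B", "A", flush_done=True)
```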
It should be noted that the flush command only needs to ensure that the state from the masked registers is committed for storage to the REC memory region. The store operations triggered by the flush command may be queued in a load/store queue of the
The flush command may be a native instruction supported by the instruction decoder of the processing circuitry. Alternatively, the flush command may be triggered by a predetermined event occurring during processing of instructions decoded by the instruction decoder. For example, the flush command may be triggered automatically by some other type of instruction which implies that the state save operations for the entire required subset of architectural state associated with the previously exited child realm should by then have been triggered to memory.
As discussed above, the particular subset of architectural state to be saved during a realm switch may depend on the boundary exception level associated with the source realm (and may also depend on whether the source realm is an intermediate realm in a nested realm exit). The state masking and saving operations may be suppressed if the realm switch is of a predetermined type (e.g. a switch triggered by execution of a voluntary realm switch instruction in the source realm).
Thus, FIG. 32 shows an example of lazy state save and restore. At
Fig. 33 shows another example of performing a
Thus, the use of the flush command enables a fast exception exit with a slow trickling of processor state into the REC of the previously exited realm, and also allows shallow exception exit and return, in which the state remains within the registers of the processing circuitry and need not be saved to and reloaded from the REC.
FIG. 34 illustrates the concept of a sub-realm that can be initialized by a parent realm. As shown in FIG. 34, a given
A sub-realm may generally be handled in the same way as a full realm, with some differences explained below. Entry into and exit from the sub-realm may be handled in the same manner as discussed above, using exception return instructions and exception events. Thus, the sub-realm may have a child realm ID constructed in the same manner as for full child realms of the same parent, and may be provided with a realm descriptor within a realm descriptor tree as discussed above. Entry into the sub-realm may be triggered simply by executing an ERET instruction, having placed the appropriate sub-realm RID in the RID register before executing the ERET instruction. Thus, the same type of ERET instruction (of the second variant) may be used to trigger entry into either a full realm or a sub-realm.
One way in which sub-realms may differ from full realms is that a sub-realm may not be allowed to initialize its own child realms. Accordingly, if the current realm is a sub-realm, a realm initialization command for initializing a new realm may be rejected. The RMU may use a realm type value in the realm descriptor of the current realm to determine whether the current realm is a full realm or a sub-realm. Disabling realm initialization while in a sub-realm simplifies the architecture, since no additional status registers have to be provided for use by the sub-realm in initializing further realms.
Similarly, execution of a realm entry instruction may be prohibited while in a sub-realm. This simplifies the architecture, as it means that the banked registers used for handling realm entry and exit (and exception entry and return), such as the ELR, SPSR, ESR and RID registers discussed above, do not need to be banked again for each sub-realm. Such further banking would be difficult to manage, since it may not be known at design time how many sub-realms a given process will create. Similarly, when the current realm is a sub-realm rather than a full realm, exception return events that would trigger a switch to a process operating at a lower privilege level may be disabled. Although in the examples discussed above a single type of ERET instruction serves as both the realm entry instruction and the exception return instruction, this is not essential for all embodiments, and where separate instructions are provided, both may be disabled when the current realm is a sub-realm.
Similarly, when an exception occurs while in a sub-realm, the processing circuitry may trigger an exit from the sub-realm to the parent full realm that initialized the sub-realm before handling the exception, rather than taking the exception directly from the sub-realm. The exception thus triggers a return to the parent full realm. The exit to the parent full realm may include the state masking, washing and REC saving operations, but by avoiding exceptions passing directly from the sub-realm to a realm at a higher exception level, there is no need to bank exception control registers such as the ELR, SPSR and ESR again for the sub-realm, simplifying the architecture.
For a sub-realm, the boundary exception level, indicating the maximum privilege level at which processing of the realm is allowed, is equal to the boundary exception level of the parent full realm of that sub-realm. In contrast, for a child full realm, the boundary exception level is a less privileged exception level than the boundary exception level of its parent realm.
When a realm is initialized by a parent realm, the parent realm may select whether the new realm will be a full child realm or a sub-realm, and may set the appropriate realm type parameter in the realm descriptor accordingly. Once the realm is operational, the parent realm can no longer change the realm type, because modification of the realm descriptor is prohibited by the managed realm lifecycle discussed above with respect to FIG. 18.
In summary, the ability to introduce sub-realms, which are managed similarly to full realms but with exception handling, realm initialization and realm entry disabled within the sub-realm, enables smaller pieces of code corresponding to a given address range within a full realm's software process to be isolated from the rest of that software, providing additional security for particularly sensitive code or data.
As described above, the realm management unit (RMU) may control transitions of a memory region between a plurality of region lifecycle states. The available region states may include at least: an invalid state, in which the memory region is allowed to be reassigned to a different owner realm; and a valid state, in which the memory region is allocated to an owner realm, is accessible to that owner realm, and is prevented from being reallocated to a different owner realm. For a memory region transitioning from the invalid state to the valid state, the RMU may require a scrubbing process to be performed to set each storage location of the memory region to a value unrelated to the previous value held at that location. For example, the scrubbing may comprise writing zero to each location within the scrubbed memory region, writing some other fixed non-zero value to each location, or writing random values to different locations, independent of the previously stored values. In any case, the scrubbing ensures that data previously stored in the invalid region while it was owned by a different realm cannot be accessed by the new owner realm once the region becomes valid. The scrubbing process may be triggered by the RMU itself in response to a command requesting the region to transition from invalid to valid, or alternatively by a different command or series of commands (separate from the command indicating the invalid-to-valid transition).
In the examples of figs. 21 and 22 above, the invalid and valid states are the only states a memory region can have (apart from the RMU-private states). This means that, in order for a given realm to take ownership of a non-RMU-private memory region, that region must be scrubbed, which takes some time and causes the actual physical memory locations to be updated with new data values, so that physical memory must be committed. Furthermore, once a region becomes valid, the RMU cannot distinguish whether the region has only just been scrubbed (so that its contents are unimportant) or whether at least one write of actual data has occurred since the region became valid, so that if an export command is executed the data must be exported to the backing store, consuming space in the backing store.
In the examples discussed below in figs. 35-39, an additional scrub-commit state is provided, in which a memory region is allocated to an owner realm but is not accessible by that owner realm until a scrubbing process has been performed on the region, and the region is prevented from being reallocated to a different owner realm. This means that the scrubbing process need not be performed before a memory region can be assigned to a particular owner realm. Regions can therefore be allocated to a realm in the scrub-commit state, and if the owner realm does not end up using all of those regions, the time and energy associated with performing the scrubbing is not consumed unnecessarily.
In some examples, the scrub-commit state may be a zero-commit state, in which the region is inaccessible to the owner realm until a scrubbing process has written zero to each location of the memory region. Alternatively, the scrub-commit state may correspond to a scrubbing process that writes a non-zero value to each location.
In general, a region in the scrub-commit state is not accessible by the owner realm until the scrubbing process has been performed on it. It is therefore desirable to perform the scrubbing before the region becomes valid and accessible to the owner realm. If the owner realm attempts a memory access to a region in the scrub-commit state, the memory access circuitry may signal an exception condition (e.g. a fault). For example, the exception may cause the parent realm or another management process to issue one or more commands to perform the scrubbing and transition the region state to valid.
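The access check and the fault-driven commit flow described above can be sketched as follows. This is a minimal illustrative model: the state names, the RegionFault type and the handler are assumptions, not architectural definitions.

```python
# Illustrative model: only a valid region may be accessed by the owner realm;
# access to a scrub-commit (or invalid) region signals a fault, and a handler
# may respond by scrubbing and committing the region, then retrying.

INVALID, SCRUB_COMMIT, VALID = "invalid", "scrub-commit", "valid"

class RegionFault(Exception):
    """Fault signalled on access to a region that is not valid."""

def check_access(region_state):
    if region_state != VALID:
        raise RegionFault(region_state)
    return True

def access_with_handler(region_state):
    # Sketch of a handler that performs the commit (scrub + mark valid),
    # then retries the faulting access.
    try:
        check_access(region_state)
    except RegionFault:
        region_state = VALID   # commit command: scrub the region, mark valid
        check_access(region_state)
    return region_state
```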
As described above, while the RMU and memory access circuitry may enforce ownership rights over memory regions of a first memory, the first memory may not have sufficient capacity to hold all the data that may be needed, so it may sometimes be necessary to export some data from the first memory to a second memory (e.g. the off-chip non-volatile memory 6 shown in fig. 1). The RMU 20 and the
However, generating the encrypted data and the metadata takes some time and consumes processing resources, and exporting the data to the second memory consumes space within the second memory. With the scrub-commit state, memory regions can still be allocated to an owner realm, yet be exported to the second memory without actually generating any encrypted data and without writing any data to the second memory. Instead, in response to an export command specifying an export target memory region in the scrub-commit state, the RMU may refrain from the encryption and the write to the second memory, and may generate metadata specifying that the export target memory region was in the scrub-commit state. The target memory region may then transition from the scrub-commit state to the invalid state. When an import command is later executed and the corresponding metadata specifies that the target memory region was in the scrub-commit state, the RMU may simply transition the selected memory region from the invalid state to the scrub-commit state, and need not read data from the second memory, since there is no associated ciphertext.
Thus, a child realm can be created and allocated many owned memory regions, so that from the child realm's perspective it has a fairly large available address space and will not have to claim ownership of further regions later (which could cause conflicts with other owner realms). Yet the child realm can be created with minimal committed memory, since the scrub-commit state means that such pages can be exported immediately without generating any ciphertext, the metadata merely indicating that the corresponding region was in the scrub-commit state. Hence, neither the first memory nor the second memory needs physical space to be committed immediately when creating a realm. This can improve performance in many cases; some example usage models are discussed below.
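The export/import fast path described above can be sketched in a few lines. This is a hedged illustration: the metadata layout and the encrypt/decrypt callbacks are invented for the example, and a real RMU would also bind integrity metadata to the ciphertext.

```python
# Illustrative model of exporting and importing a region: a zero-commit
# region produces metadata only (no ciphertext, no write to second memory),
# while a valid region is encrypted and its ciphertext stored.

def export_region(state, data, encrypt):
    if state == "zero-commit":
        # Fast path: no encryption, nothing written to the second memory.
        return "invalid", {"state": "zero-commit", "ciphertext": None}
    return "invalid", {"state": state, "ciphertext": encrypt(data)}

def import_region(metadata, decrypt):
    if metadata["state"] == "zero-commit":
        # Nothing to read back; the region simply re-enters zero-commit.
        return "zero-commit", None
    return metadata["state"], decrypt(metadata["ciphertext"])


# Toy reversible "cipher" purely for demonstration.
enc = dec = lambda d: d[::-1]

new_state, md = export_region("zero-commit", None, enc)     # metadata only
state, data = import_region(md, dec)                        # back to zero-commit
```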
Various RMU commands may be provided to trigger transitions between the memory region lifecycle states. For example, a scrub-commit variant of the region add command specifies a predetermined region (e.g. the variant of granule add
Another example of an RMU command is a scrub-commit command specifying a scrub-commit target memory region which is valid and owned by the realm issuing the command. In response to the scrub-commit command, the RMU may transition the target memory region from the valid state to the scrub-commit state. Thus, a given child realm may request that one of its owned regions transition to the scrub-commit state. This can be useful, for example, for balloon drivers, which have variable memory usage and may therefore be allocated a relatively large amount of memory but later find certain regions unnecessary and wish to return them for other purposes. Converting the no-longer-needed regions to the scrub-commit state preserves both options: if the driver's memory requirements grow again, the child realm can easily return these regions to valid without going through the ownership claim process again (avoiding the risk that other realms claim these pages in the meantime); but if the parent realm decides that regions no longer needed should be reallocated for other purposes, the parent can safely remove them from the child realm for reuse by exporting them, without consuming physical space in the second memory.
To convert a scrub-commit memory region to valid, a commit command may be executed by the RMU, triggering the scrubbing process to be performed. The command may be issued by the child realm that owns the region in the scrub-commit state, or by a parent realm of that owner. Thus, the RMU may accept commit commands issued by the owner realm of the target memory region or by a parent realm of the owner realm.
FIG. 35 illustrates an example of a set of memory region lifecycle states that may be implemented in some examples. FIG. 35 includes all of the states of FIG. 21, but additionally includes a zero-commit (scrub-commit) state, in which a memory region is not accessible by software but is assigned to a particular owner realm and therefore prevented from being reassigned to a different realm. The zero-commit state is not an RMU-private state. In the zero-commit state, unlike the invalid state, ownership of the region cannot be transferred to a different realm; but unlike the valid state, the region is not accessible to software, because the scrubbing process has not yet been performed. Transitioning from zero-commit to valid requires the scrubbing process to be performed.
As shown in FIG. 36, the state diagram of FIG. 22 may be extended to include a zero-commit
As shown in figs. 36 and 37, several commands are provided (in addition to those previously described) to handle transitions involving the zero-commit state, as follows:
Zero-commit (scrub-commit) variant of the granule add command (802)
RMU.granule.add(rid, ZC, src, dst)
rid: the RID of the child realm to which ownership is transferred
ZC: parameter denoting the zero-commit variant of the granule add
src, dst: addresses of the transferred region in the address spaces of the parent realm and the child realm, respectively
Otherwise similar to the granule add command.
Zero-commit (scrub-commit) command
RMU.granule.zerocommit(rid, a)
Transitions an owned granule (memory region) to the zero-commit (ZC) state.
Called by the owner of the granule; rid specifies the current realm and a is the address of the zero-commit target region.
Zero-commit (scrub-commit) variant of the clean command (810)
Invoked by the owner of an invalid owned granule (memory region) to convert the granule to the zero-commit (ZC) state without actually performing a scrub. Rather than converting the invalid granule to valid or RMU-private, as for the other variants of the clean command, this variant converts the invalid granule to zero-commit.
Commit command
RMU.granule.commit(rid, a)
Scrubs (e.g. zero-fills) the contents of the granule and sets its state to valid.
Rejected if the granule indicated by address a is not in the zero-commit state, or if the specified realm is not active.
Invoked by the owner realm or by a parent of the owner realm.
Export command
RMU.granule.export()
Exporting a ZC region produces no ciphertext, only metadata. The metadata generated for the exported ZC region indicates that the region was a ZC region and that there is no ciphertext.
Import command
RMU.granule.import()
Importing a ZC region reads no ciphertext, only the metadata (MD). It converts an invalid region to ZC, but does not automatically scrub (zero-fill) the ZC region; a commit command is issued later to perform the scrub and convert the region to valid.
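Taken together, the commands above define a small state machine over the region lifecycle states. The transition table below is a simplified sketch: the command names are abbreviated forms invented for this example, and side effects (the actual scrub, metadata generation, ownership checks) are omitted.

```python
# Simplified model of the zero-commit lifecycle transitions.

TRANSITIONS = {
    ("invalid",     "add_zc"):     "zero-commit",  # ZC variant of granule add
    ("invalid",     "clean_zc"):   "zero-commit",  # ZC variant of clean: no scrub
    ("valid",       "zerocommit"): "zero-commit",  # owner hands the region back
    ("zero-commit", "commit"):     "valid",        # scrub (zero-fill), then valid
    ("zero-commit", "export"):     "invalid",      # metadata only, no ciphertext
    ("invalid",     "import_zc"):  "zero-commit",  # metadata says ZC: nothing read
}

def apply_command(state, command):
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        # e.g. commit is rejected if the region is not in the zero-commit state
        raise ValueError(f"command {command!r} rejected in state {state!r}") from None
```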
Note that, for simplicity, the import and export commands are not shown in figs. 22 and 36, since the export command may be invoked for a memory region in any lifecycle state to convert the export target region to invalid, and the import command may be invoked to convert an import target region to any other state (including invalid).
FIG. 38 is a flow diagram illustrating the processing of an export command in an embodiment that includes a zero-commit
At
If the export target memory region is valid, invalid, or RMU-private (including any of the corresponding RMU-private states shown in FIG. 35), then at
At
On the other hand, if the export target memory region is in the zero-commit state at
Regardless of the export target memory region's state, after
Thus, by using the zero-commit state, export can be performed faster and the committing of physical memory can be avoided.
Fig. 39 shows a corresponding flowchart illustrating the processing of an import command. At
If the state of the previously exported memory region is indicated as valid or RMU-private at
If it is determined in
Regardless of the state of the previously exported memory region determined at
The zero-commit state may be used for a range of usage models, including the following:
Realm creation:
The memory is fully pre-populated. The child realm is created by the parent realm in the normal way. The parent then "pre-commits" zero-filled memory regions to the child. A full metadata (MD) tree is created for these regions using the add(ZC) and export(ZC) commands, consuming only one physical page (an RMU-private page for storing the metadata). At realm runtime, when a page fault occurs for a ZC granule, the parent commits a physical page to the child, imports the ZC region into it, then commits it (commit()), fixes up the MMU mapping and continues.
Memory allocation:
Reducing the child realm's bookkeeping requirements. The child requests memory from the parent, which provides and maps regions of invalid granules to the child. The child immediately claims ownership of the granules and zero-commits them for later use. The implicit contract in the child doing so is that the parent may select the ZC export for such regions. Since any complexity rests with the parent, the child does not need to track any additional state. It is safe for the child to take a realm protection fault on a ZC granule and commit it automatically.
Balloon driver:
Software with variable memory usage. When memory usage drops, the driver zero-commits (ZC) granules that are no longer needed. The parent can safely remove them from the child's realm for reuse. Any subsequent child access to a previously zero-committed granule results in a page fault or a realm protection fault.
As described above, the realm management unit (RMU) may maintain an ownership table specifying ownership entries, each of which defines ownership attributes for a corresponding memory region of a given size. The ownership attributes specify an owner realm of the corresponding region from among a plurality of realms. The owner realm has the right to exclude other realms from accessing the data stored in the corresponding region. The memory access circuitry may control access to the memory based on the attributes defined in the ownership table.
In some examples, in response to a region fusion command specifying a fusion target address indicating multiple contiguous regions of memory to be fused into a fusion group of regions, the RMU may perform a region fusion operation which updates the ownership table to indicate that the ownership attributes of each region in the fusion group are represented by a single ownership entry corresponding to a predetermined region of the group. It may be relatively common for multiple contiguous regions of memory to require the same ownership attributes, and in such cases it can be more efficient to represent the attributes of those regions with a single ownership entry, for example if entries are cached in a translation lookaside buffer or other cache structure by a particular microarchitectural implementation. While the specific details of such caching are an implementation choice, the ability of the RMU to respond to a region fusion command by updating the ownership table, so that the attributes of multiple contiguous regions can each be represented by a single ownership entry associated with one region of the fused group, can improve performance. With a single cached entry representing the ownership attributes of the whole fused group, cache space is freed to hold ownership entries associated with other regions of the memory address space. That is, by fusing groups of regions where possible, the fraction of the total address space whose entries can be cached simultaneously may be increased, improving performance.
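The effect of a region fusion operation on the ownership table can be sketched as follows. The dictionary-based table layout, the "fused_head" marker and the fixed 4 KiB region size are all assumptions for illustration; a real implementation would encode this in the table entries themselves.

```python
# Illustrative region fusion: after fusion, the attributes of every region in
# the fusion group live in the single "head" entry, and the other group
# members merely point at it. A cache need then hold only the head entry.

REGION_SIZE = 4096  # assumed region granularity

def fuse_regions(table, fuse_target_addr, group_size):
    base = fuse_target_addr // REGION_SIZE
    table[base]["group_size"] = group_size
    for i in range(base + 1, base + group_size):
        table[i] = {"fused_head": base}   # attributes now live in the head entry

def lookup(table, addr):
    entry = table[addr // REGION_SIZE]
    if "fused_head" in entry:
        entry = table[entry["fused_head"]]
    return entry


table = {i: {"owner": "realm_A", "attrs": "rw"} for i in range(4)}
fuse_regions(table, 0, 4)
```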
This technique is particularly useful when the memory access circuitry controls access to the memory based on both the ownership table and at least one translation table providing address translation data for translating virtual addresses to physical addresses. The at least one translation table may provide further access permissions for controlling access to the memory, giving an orthogonal layer of protection in addition to the ownership protection provided by the ownership table. For example, while the at least one translation table may implement top-down, privilege-based access control (where a more privileged process can deny a less privileged process access to certain memory regions), the ownership table provides a complementary protection in which the owner realm of a given region of the memory address space can set ownership attributes controlling whether more privileged realms may access the owned region. In typical translation tables, it is common to merge multiple virtual address pages into a single larger page, which may then be represented by a single page table entry within a translation lookaside buffer or other cache structure. This exploits the hierarchical nature of many translation tables, which makes it relatively easy for a higher-level page table entry to indicate that multiple lower-level entries have been combined.
Thus, an operating system or other process that sets up the translation tables may expect that, if it allocates memory so that multiple consecutive pages can each have their access permissions and address translation data represented by a single page table entry in the translation tables, a performance gain should follow, by increasing the efficiency with which the translation lookaside buffer or other structure caching information from the translation tables can be managed, so that address translation data for more distinct pages can be cached within the TLB at the same time.
However, when access control is also based on an ownership table, and the ownership entries of the ownership table are not consolidated or fused in a similar manner to the page table entries of the translation tables, this may in practice prevent the performance gains normally expected from combining multiple translation table entries. For example, some microarchitectural implementations may provide at least one translation lookaside buffer (TLB) caching information from both the address translation tables and the ownership table, and in some cases information from the two tables may be combined into a single entry. In such a combined TLB, each entry may correspond to the smallest granularity of memory region associated with either the address translation information or the ownership information. Hence, if the ownership entries remain unfused, combining the page table entries of the translation tables may not yield any efficiency gain in TLB allocation. By providing RMU support for a region fusion command, multiple regions with corresponding ownership attributes can be fused into a fusion group represented by a single ownership entry, so that TLB allocation efficiency can be improved. It will be appreciated that not every microarchitecture will benefit from the potential performance improvements of fusion, depending on the particular TLB implementation. However, by providing architectural support for the region fusion command, software may be written such that microarchitectures whose TLB implementations can benefit from fusion achieve these advantages.
Combining different page table entries of the translation tables can be implemented relatively simply, since the operating system or other process managing the translation tables can simply write the updated page table entries itself. The ownership table, in contrast, defines different owners for different portions of the address space, each owner being responsible for controlling which other realms can access data in the corresponding region, and this makes fusing regions of the ownership table more complex. Hence, unlike a translation table update (where the single process responsible for managing the whole translation table may simply overwrite the entry to be updated), with region fusion in the ownership table the RMU is responsible for receiving region fusion commands from other processes requesting fusion, determining whether the command can be acted upon, and, if so, performing the region fusion operation to update the ownership table to indicate that the attributes of each region in the fused group can be represented by a single ownership entry. There is thus a separation between the process requesting region fusion by issuing a region fusion command, and the RMU which determines whether the command is acceptable (and, if so, controls the region fusion operation to be performed). The RMU may be a dedicated hardware unit for controlling the management of ownership permissions, or may be realm management software executing on the same processing circuitry that also executes the software processes corresponding to the respective realms.
The region fusion command may be allowed to be issued by different processes. In some cases, the realm designated in the ownership table as the owner of the contiguous regions to be fused may itself issue a region fusion command to trigger those regions to be fused into a fused group of regions. For example, when setting ownership attributes for multiple contiguous regions, the owner realm may recognize that those regions will have the same attributes, and may therefore trigger a region fusion command so that the common attributes of those regions can be represented by a single entry in the ownership table. The TLB of some microarchitectural embodiments is then able to cache a single ownership entry representing the attributes of the entire fused group of regions, rather than needing to cache an individual entry for each individual region.
However, the RMU may also allow a region fusion operation to be performed in response to a region fusion command triggered by a realm other than the owner realm specified in the ownership table for the contiguous regions to be fused. For example, it may be useful to allow a software process that is involved in setting the translation tables to request fusion of regions defined in the ownership table. For example, when the operating system sets the translation tables so that multiple pages of the address space have the same translation attributes, it may also issue a region fusion command to request that the corresponding regions be fused (if allowed by the RMU). The actual owner realm associated with those regions of the address space may not yet know that all the regions have the same properties in the translation tables, and may therefore not yet have chosen to fuse those regions. Conversely, the operating system may not know whether the ownership attributes of those regions are the same; if they are not, it may not be appropriate to fuse the regions, but if the RMU does approve the region fusion command, this may improve performance when the operating system's translation tables are used. Thus, by allowing different processes to request fusion while providing the RMU to verify whether the fusion is allowed, software can be written more simply than if only the owner realm could trigger fusion.
The RMU may verify whether it is appropriate to act on the region fusion command, to prevent a situation where, for example, another process issues a region fusion command but the attributes set by the owner realm indicate that it is not appropriate to fuse those regions, for example because they define different ownership attributes. Thus, in response to the region fusion command, the RMU may determine whether the region fusion command passes at least one validation check, and reject the region fusion command when the at least one validation check fails. When the at least one validation check is passed, the RMU may then perform the region fusion operation.
In general, the at least one validation check may be such that the region fusion command can pass only if fusing the specified regions would not result in a change in the effective ownership attributes of any of those regions. This means that whether or not regions are fused does not affect program logic correctness or the security enforced by the ownership table, but is merely a performance enhancement measure (which the RMU may or may not choose to act upon). From the perspective of the software issuing the region fusion command, whether the command passes the validation check has no effect on the correct functioning of that software, but better performance may be achieved if it does pass the at least one validation check.
For example, the RMU may determine that the at least one validation check has failed when different ownership attributes are specified in the ownership entries for any two of the multiple contiguous regions to be fused. Thus, if the respective owners of any two of the specified contiguous regions are different, or the owners have defined different ownership attributes (e.g., attributes defining which other processes may access the regions, or indicating a particular region status (valid, invalid, patrol-clear-commit, etc., as described above)), the validation check may fail, and any region fusion command for regions with different ownership attributes may therefore be rejected. If the region fusion command is rejected, the ownership table remains unchanged, and multiple separate ownership entries are still needed to represent the attributes of the contiguous regions.
In addition to the ownership attributes that define the owner realm and any other access permissions, another form of validation check may involve a mapped address that may be specified in each entry. Each ownership entry may correspond to a physically addressed memory region. The mapped address specified in a given ownership entry may identify the address to which the physical address of the corresponding region was mapped at the time the ownership attributes of the region were set in the ownership table. The mapped address may be a virtual address, an intermediate physical address, or a physical address, depending on the privilege level of the owner realm associated with the region. Specifying such a mapped address in an ownership entry can prevent the security protection provided by the ownership attributes from being compromised if a process of higher privilege than the owner realm of the corresponding memory region remaps the translation tables, after the ownership attributes for that region have been defined in the ownership table, to change which addresses map to the physical addresses of the memory region. As described above, when accessing memory, part of the security checks performed to determine whether the memory access is allowed may compare the mapped address in the ownership entry of the required physically addressed memory region with the corresponding address currently mapped to that physically addressed memory region, and deny the memory access when these addresses do not match. When fusing multiple regions into a fused group of regions that can be represented by a single ownership entry, if those contiguous regions do not specify a contiguous set of mapped addresses, the protection associated with checking mapped addresses could be bypassed if only a single ownership entry is provided to record the ownership attributes for the fused group as a whole.
Thus, when performing the validation checks for the region fusion command, the RMU may determine that the at least one validation check has failed when the ownership entries of the contiguous regions specify a non-contiguous group of mapped addresses.
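The mapped-address contiguity check can be sketched as follows. This is an illustrative model only: the dictionary-based entry layout and the 4KB region size are assumptions, not the architectural encoding.

```python
REGION_SIZE = 4 * 1024  # assumed 4KB region granularity

def mapped_addresses_contiguous(entries):
    """Return True if the mapped addresses recorded in a run of ownership
    entries (ordered by physical address) form one contiguous range, so
    that a single ownership entry can safely represent the fused group."""
    for prev, cur in zip(entries, entries[1:]):
        if cur["mapped_addr"] != prev["mapped_addr"] + REGION_SIZE:
            return False
    return True
```

For example, four entries whose mapped addresses step by 4KB pass the check, while any gap or reordering in the mapped addresses causes the fuse command to be rejected.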
As described above, each ownership entry may specify one of a plurality of region states associated with the respective memory region, which may include at least an invalid state, in which the memory region is allowed to be reassigned to a different owner realm, and a valid state, in which the memory region is assigned to a given owner realm and prevented from being reassigned to a different owner realm. Other states are also possible. In one example, the at least one validation check may fail when the ownership entry for any of the contiguous regions to be fused specifies that the region is in any state other than the valid state. The transition of a region from the invalid state to the valid state may require that certain measures be performed to maintain security. For example, the RMU may require that a patrol clearing process be performed, setting each storage location of the region to a value unrelated to the value previously stored in that location, before the region can transition to the valid state. This prevents data leakage from regions that were previously valid when they are reused and reassigned to a different owner realm. By requiring that all of the contiguous regions to be fused are currently valid before a region fusion operation can be performed, it can be ensured that fusing regions cannot be used to bypass such security measures before the regions become valid, for example to avoid the need to patrol-clear data from each fused region.
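Taken together, the state and attribute checks described above might look like the following sketch. The entry fields and the boolean return convention are illustrative assumptions, not the architectural definition.

```python
def validate_fuse_command(entries):
    """Validation checks for a region fusion command (illustrative sketch):
    the command is rejected unless every region to be fused is in the
    valid state, and all regions share the same owner realm and the same
    ownership attributes."""
    if any(e["state"] != "valid" for e in entries):
        return False  # e.g. invalid or patrol-clear-commit regions
    first = (entries[0]["owner"], entries[0]["attrs"])
    return all((e["owner"], e["attrs"]) == first for e in entries)
```

A command that fails either check leaves the ownership table unchanged, matching the behaviour described in the text.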
In general, a translation table may be implemented as a hierarchical table involving multiple levels of sub-tables that are traversed to find the entry of interest for a given address, with data indexed at one level of the table structure being used to identify the relevant entry at the next level of the hierarchy. This allows information shared across a larger range of addresses to be specified in a shared higher-level entry, rather than being replicated for each region, and allows the entries corresponding to a relatively large, sparsely populated address space to be compressed into a smaller space. With a hierarchical table, it is relatively straightforward to set an indication that multiple pages are to be combined, e.g., by a higher-level page table entry indicating that all lower-level entries below it are to be represented using a common attribute.
In contrast, the ownership table may comprise one or more linear tables. In a linear table, the required entry for a given address can be accessed through a single index into the table structure, rather than by making multiple hops through a hierarchy as in a translation table. In some instances, the ownership table may comprise more than one linear table; for example, if multiple memory banks are provided, a separate linear table may be provided for each bank, with some address bits selecting which bank's table to look up and the remaining address bits providing a single linear index into that bank's table. A linear table may give faster lookup performance because only one memory access is needed to obtain the ownership entry required for a given memory access.
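A single-lookup linear table access might be modelled as below. The two-bank layout, the bank-select bit position, and the 4KB region size are assumptions chosen for illustration, not values taken from the architecture.

```python
NUM_BANKS = 2        # assumed: two memory banks, one linear table each
BANK_SHIFT = 30      # assumed: bit 30 of the physical address selects the bank
REGION_SHIFT = 12    # assumed 4KB regions

def rgt_locate(pa):
    """Locate the ownership entry for physical address pa: a few address
    bits select which bank's table to look in, and the remaining bits
    form a single linear index into that table."""
    bank = (pa >> BANK_SHIFT) & (NUM_BANKS - 1)
    index = (pa & ((1 << BANK_SHIFT) - 1)) >> REGION_SHIFT
    return bank, index
```

Unlike a hierarchical walk, the entry location is computed directly from the address, so one memory access suffices.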
In some implementations, the region fusion command could specify that any arbitrary number of regions are to be fused into a fused group represented by a single ownership entry.
However, for linear tables it may be more complex to efficiently handle the fusion of multiple regions so that they can be represented by a single ownership entry within a TLB or other cache structure, since there is no route to multiple entries of the table via a shared higher-level entry, as there is in a hierarchical translation table. Thus, while it may be desirable to support fusion into fused groups of several different sizes, it may also be desirable to limit how many ownership entries must be updated when performing such a fusion operation, and to limit the granularity at which regions can be fused, to simplify the process of looking up the ownership table and handling region fusion commands.
Thus, in one example, the region fusion command may be a group-size-increment form of command specifying an indication of a target fused group size. The target fused group size may be specified from among a plurality of fused group sizes supported by the RMU. The region fusion operation performed in response to the group-size-increment form of command may comprise updating the ownership table to indicate that a plurality of contiguous regions, each associated with a fused group having an expected current fused group size that is the next smallest fused group size below the target fused group size, are to be fused into a fused group having the target fused group size. By providing a command that increments the fused group size to the next level, rather than a single command that can fuse an arbitrary number of regions into a fused group, the memory accesses for updating the ownership entries to indicate the new fused group size can be made more efficient, as described below. Furthermore, if fused groups are constrained to certain sizes in a hierarchy (e.g., certain power-of-two multiples of the region size), this can simplify regular ownership table lookups, since the single ownership entry representing the attributes of the entire fused group can be the entry corresponding to an address aligned to the particular group size. This can make the microarchitectural implementation of the memory access circuitry and RMU more efficient.
In some cases, the group-size-increment form of region fusion command may directly specify the target fused group size. Alternatively, the target fused group size may be implicit in other information specified by the command. For example, the command may specify an expected current fused group size that is expected to be associated with the contiguous regions identified by the command, in which case the target fused group size may be inferred from the expected current fused group size as the next highest of the supported fused group sizes.
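Inferring the target fused group size from the expected current size can be sketched as below. The particular size ladder (4KB, 64KB, 2MB) is an assumption taken from the examples later in this description.

```python
FUSE_SIZES = [4 * 1024, 64 * 1024, 2 * 1024 * 1024]  # assumed supported sizes

def target_fuse_size(expected_current_size):
    """For a group-size-increment fuse command that specifies the expected
    current fused group size, the implied target is the next highest
    supported fused group size."""
    i = FUSE_SIZES.index(expected_current_size)
    if i + 1 == len(FUSE_SIZES):
        raise ValueError("already at the largest fused group size")
    return FUSE_SIZES[i + 1]
```

So a command naming a current size of 4KB implies a 64KB target, and a current size of 64KB implies a 2MB target.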
The minimum fused group size among the supported fused group sizes may correspond to the predetermined size of a single region. Thus, when regions are indicated as having the minimum fused group size, this effectively indicates that they are not part of any fused group. The other fused group sizes may indicate fused groups comprising two or more fused regions.
For a group-size-increment form of region fusion command, another form of validation check may be to check that the specified contiguous regions to be fused are currently associated with a current fused group size that is the next smallest fused group size below the target fused group size. That is, the RMU may check whether the actual current fused group size of the specified regions matches the expected current fused group size of the region fusion command. If any specified contiguous region is associated with a current fused group size other than the next smallest fused group size below the target fused group size, the validation check may fail and the command may be rejected. This can be useful because, as described above, different software processes may issue region fusion commands, so another process may already have fused the regions into a fused group having the target fused group size or larger, in which case the region fusion command need not be acted upon because the desired effect has already been achieved.
In response to a group-size-increment form of region fusion command, the region fusion operation may comprise updating a subset of the ownership entries associated with the contiguous regions to be fused. The selected subset may depend on the indication of the target fused group size specified by the region fusion command. Thus, one advantage of providing the command in incremental form is that not every individual ownership entry associated with every contiguous region in the group fused to the target fused group size has to be updated. If multiple region fusion commands are executed in succession to step up two or more levels of the fused group size hierarchy, a later command can build upon the table updates already made by the earlier command, avoiding the need to update all entries of the larger fused group.
For example, when the expected current fused group size is the smallest of the plurality of fused group sizes (i.e., the specified regions have not yet been fused into any fused group), the subset of entries to be updated may be all of the entries associated with the contiguous regions to be fused into the fused group having the target fused group size. On the other hand, when the expected current fused group size is a fused group size other than the minimum, the subset of entries may be fewer than all of the entries associated with the contiguous regions to be fused. Specifically, when the expected current fused group size is greater than the minimum, the selected ownership entries to be updated may comprise the first ownership entry of each fused group previously fused by an earlier region fusion command, so that the ownership entries associated with subsequent regions in the same fused group do not have to be updated again. For example, the subset of ownership entries updated in response to the region fusion command may be the entries associated with regions whose addresses are separated by an offset corresponding to the expected current fused group size associated with those regions.
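The selection of the subset of entries to update can be sketched as follows: the first entry of each existing sub-group, i.e. entries whose region addresses step by the expected current fused group size from the aligned group base. The sizes used in the test are illustrative.

```python
def entries_to_update(group_base, target_size, current_size):
    """Region addresses of the ownership entries updated when fusing
    sub-groups of current_size into one group of target_size: one entry
    per existing sub-group, stepping by current_size from the aligned
    base, rather than every entry in the new group."""
    assert group_base % target_size == 0, "base must be aligned to target size"
    return list(range(group_base, group_base + target_size, current_size))
```

Fusing unfused 4KB regions into a 64KB group touches all 16 entries, whereas fusing 64KB groups into a 2MB group touches only the 32 first-of-group entries, not all 512.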
In some cases, each step of the group-size-increment region fusion command may be associated with fusing some predetermined number of regions or fused groups of regions into the larger fused group size. In some cases, the number of regions or fused groups fused by a given command specifying a given level of group size as the target fused group size may be the same for each level. In this case, the subset of entries selected for updating may comprise the same number of ownership entries regardless of which fused group size is the target. Although the selected entries may be separated by different offsets depending on the target group size, this limits the memory access overhead of performing the fusion operation to a fixed amount for each step of the fused group size increment.
However, in other examples, a command specifying one fused group size as the target may combine a first predetermined number of regions or fused groups into the larger fused group size, while a command specifying a different target fused group size may combine a different number of regions or fused groups. In this case, a different number of ownership entries may need to be updated depending on the target fused group size.
Note that the subset of ownership table entries updated in response to the region fusion command may also be the same subset that is looked up to verify whether the current fused group size is the next smallest fused group size below the target fused group size, to determine whether to perform the region fusion operation in response to the region fusion command.
Each ownership entry may specify an indication of the current fused group size associated with the corresponding region. However, because only a subset of the entries are updated when a region fusion command is processed, the effective fused group size associated with a given region may be indicated by the current fused group size recorded in an ownership entry associated with a region other than the given region. Thus, when looking up the region table to allocate a new entry for a given region into a TLB, the TLB or RMU may need to attempt to read the current fused group size from a number of different ownership entries, each located at an address boundary corresponding to a different candidate fused group size within which the required region might have been fused, in order to determine the actual fused group size for that region and hence which ownership entry represents the attributes associated with it.
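The lookup procedure this implies can be sketched as follows, probing the entry at each candidate size-aligned boundary from the largest candidate size downwards. The table representation (a mapping from aligned region address to the group size its entry records) is an illustrative assumption.

```python
def find_governing_entry(fuse_size_at, pa,
                         sizes=(2 * 1024 * 1024, 64 * 1024, 4 * 1024)):
    """Find the single ownership entry representing region pa by reading
    the recorded fused group size of the entry at each candidate
    size-aligned address, largest candidate first. fuse_size_at maps an
    aligned region address to the group size its entry records."""
    for size in sizes:
        base = pa - (pa % size)
        if fuse_size_at.get(base) == size:
            return base, size
    raise KeyError("no governing ownership entry found")
```

For a region inside a 64KB fused group, the probe at the 2MB boundary misses but the probe at the 64KB boundary hits, returning the one entry whose attributes cover the whole group.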
In a corresponding manner, a region split command may also be provided, specifying a split target address indicating a fused group of regions to be split into subsets of one or more regions. In response, the RMU may perform a region split operation to update the ownership table to indicate that the ownership attributes of the corresponding subsets of regions are now each represented by a different ownership entry of the ownership table. Thus, when the attributes of different subsets of the regions within the fused group now need to be set to different values, a region split command can be issued to signal that, henceforth, different ownership entries need to be cached in the TLB or other cache structure for those subsets of regions. The subsets of regions into which the fused group is split may be individual regions, or may be smaller fused groups of regions. For example, in some embodiments, the region split command may be a decrement form of command that decreases the fused group size one level down the fused group size hierarchy, complementary to the way the increment form of region fusion command increases the fused group size one level up the hierarchy. Alternatively, another form of region split command could simply split a fused group of any size into separate regions. The decrement form of command can be implemented more efficiently in the microarchitecture for reasons similar to those discussed above for the increment form of region fusion command.
Thus, for a decrement form of region split command, the command may specify an indication of a target fused group size that is expected to be the next smallest fused group size below the current fused group size associated with the fused group of regions to be split. The RMU may verify whether the actual current fused group size matches the expected fused group size and, if not, reject the region split command. Likewise, only a subset of the ownership entries may be updated in response to a decrement form of region split command, with the updated ownership entries selected at offsets determined by the target fused group size.
In some instances, the region split operation performed by the RMU may include triggering a translation lookaside buffer that caches information from the ownership table to invalidate any information associated with the fused group of regions indicated by the split target address. This provides security by ensuring that the TLB cannot continue to retain stale information about the entire fused group of regions, so that if the owner realm of any of the split subsets of regions subsequently changes the ownership attributes of those regions, the new attributes will be accessed, because the previously defined attributes of the entire fused group can no longer be obtained from the TLB.
Alternatively, rather than triggering a TLB invalidation in response to the region split operation itself, a TLB invalidation may be triggered in response to the first update of information in one of the ownership entries associated with the previously fused group of regions after the region split operation has been performed. This recognizes that the fused entry may be allowed to remain in the TLB as long as no change to any ownership attribute of the previously fused group of regions has occurred after the split command is processed. Since any change to an ownership attribute may be accompanied by a TLB invalidation in any case, to ensure that the old attribute is not still cached in the TLB, this approach can avoid the need to perform an additional TLB invalidation in response to the region split command.
In some instances, the region fusion operation performed in response to the region fusion command may also include such a TLB invalidation of any information associated with the new fused group of regions. This ensures that, if the TLB holds any entries associated with the individual pre-fusion regions, those entries are discarded, so that the next access to any of those regions triggers the loading of the single combined ownership entry, which in turn allows ownership entries for a greater number of other memory regions to be cached in the TLB as well. However, for the region fusion operation the TLB invalidation is expected to achieve a performance gain but is not necessary for security, and so may be omitted.
In some instances, during a region fusion operation, the RMU may lock the ownership entries to be updated by the region fusion operation, to prevent those ownership entries from being updated by other processes until the region fusion operation is complete. Similarly, the ownership entries updated during a region split operation may also be locked until the region split operation is complete. This avoids the unpredictable and potentially unsafe results that could occur if another process updated one of the relevant ownership entries midway through a region fusion or region split operation, which could mean, for example, that validation checks were bypassed or that incorrect ownership attributes were indicated for some regions. A region fusion or region split operation can be performed atomically by locking the relevant ownership entries at the start of the update process associated with the operation, and then unlocking those entries once all relevant updates are complete. This means that any subsequently received command can only observe the results of the complete region fusion or split operation. Locking may be achieved using any technique; for example, a bit may be specified in each entry indicating whether the entry is locked by the RMU.
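The lock-update-unlock sequence can be sketched as below. The per-entry lock bit and field names are assumptions, and a real implementation would additionally need the lock acquisition itself to be atomic with respect to concurrent commands.

```python
def fuse_atomically(entries, new_level):
    """Perform a region fusion update atomically with respect to other
    RMU commands: lock every affected ownership entry first, apply the
    fuse-level updates, then unlock, so that no other command can
    observe or interfere with a half-completed fusion."""
    if any(e["locked"] for e in entries):
        raise RuntimeError("entry locked by another in-progress operation")
    for e in entries:
        e["locked"] = True                # phase 1: lock all affected entries
    try:
        for e in entries:
            e["fuse_level"] = new_level   # phase 2: apply the updates
    finally:
        for e in entries:
            e["locked"] = False           # phase 3: unlock on completion
    return entries
```

Any later command either sees all entries still at the old fuse level (before locking) or all at the new level (after unlocking), never a mixture.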
Commands serviced by the RMU that specify an operation to be performed on a single target region of the predetermined size of memory may be rejected when that single target region is part of a fused group of regions. For example, if the specified region is part of a fused group, a command to change the lifecycle state of the region in the ownership table, a command to change the visibility attributes that control which processes other than the owner realm may access the memory region, or a command to export data from the single target region to backup storage, may be rejected. As described above, since region fusion commands may be issued by multiple processes, the process issuing a single-region command may not know that the regions have been fused, so it is useful for the RMU to check whether the target region is in a fused group and, if so, to reject such single-region operations. If a single-region RMU command is rejected, a fault handling routine in the software requesting the RMU command, or in an exception handler, may respond, for example by issuing a region split command and then reissuing the single-region RMU command.
FIG. 40 schematically illustrates an example of a hierarchical table structure for the translation tables, which may be the stage one page table 120-1 or the stage two page table 120-2 discussed above. Translation tables (page tables) are used to describe virtual-to-physical address translations, memory types, and access permissions, defined in a top-down manner in which a higher-privileged process responsible for managing the table defines the information in the table that controls how a lower-privileged process may access the corresponding memory regions. Translation tables typically describe memory with 4KB granularity. Logically, at each CPU memory access, the input virtual address is checked against the translation tables; in practice, these checks are cached in the TLB.
The translation tables are hierarchical, in that multiple steps of indexing into different levels of the table are required to access the required information. FIG. 40 shows an example with four levels of tables (levels 0 to 3).
To improve TLB efficiency, MMU architectures may allow larger regions of memory to be described, and these large regions may be cached in the TLB. For example, this may be done using a block entry at an earlier level of the translation table: for a 4KB translation granule, a level 3 entry describes a 4KB page, while a level 2 entry describes a region corresponding to 2MB. The level 2 entry may specify the attributes for all of the level 3 entries below it, so it is not necessary to cache all of the individual level 3 entries. Alternatively, consecutive entries within the current level of the table may be marked as combined into one; for example, 16 aligned 4K entries may describe a 64KB region. By using larger translations, software performance can be boosted by a measurable percentage (1-5%). Real-world software such as Linux and hypervisors may perform behind-the-scenes operations to align virtual and physical addresses so that larger pages can be used.
In contrast, as shown in FIG. 41, the realm group table (ownership table) 128 can be implemented as one or more linear tables indexed by a single index derived from the physical address. Each RGTE can identify multiple ownership attributes, including the owner realm, realm state, and visibility as described above, and can also specify a mapped address as shown in FIG. 19. Further, as described below, the RGTE may specify an additional parameter indicating the fusion level (fused group size). For any given address, the required entry in the linear table structure can be accessed with a single memory access, simply by indexing into a single table based on the physical address. In some cases, multiple linear tables may be needed (e.g., to cover regions in multiple memory banks), but even then the required table is selected based on the address and the required entry is read with a single access to that table, rather than using information derived from an entry at one level of the table to locate another entry at the next level of the hierarchy, as in the translation tables of FIG. 40. With such a linear table, it is not so simple to fuse multiple regions so that their attributes can be represented by a single entry within the TLB. If regions with the same attributes cannot be fused, then for a microarchitectural implementation using a combined TLB, each TLB entry can still only describe a region of the minimum granularity, even when the translation tables use larger mappings.
Page fusion allows a single RGTE to represent a set of contiguous pages, enabling more efficient caching of ownership information in the TLB. Fused pages can restore TLB performance for the larger pages of the MMU. Without fusion, even with larger MMU mappings, the TLB can only retain references at 4K granularity.
Groups of pages are fused together and split (unfused) using the following commands:
Region fusion command: RMU.Granule.Fuse(addr, level)
Region split command: RMU.Granule.Shatter(addr, level)
Fusion and splitting can be applied recursively to create and destroy larger fused groups. The fused group sizes are defined to match the sizes of the larger regions of the translation tables (TT).
Only valid pages may be fused. Invalid, patrol-clear-commit, and RMU-private pages cannot be fused.
The pages to be fused are required to be contiguous, with addresses aligned to the new fused group size. The fuse command is rejected unless the pages to be fused were first claimed and patrol-cleared, and are of the same type, attributes, and permissions.
Furthermore, the region fusion command is rejected if the pages to be fused are not contiguous in both their mapped addresses and their physical addresses (PA):
EL0 pages require contiguous VA and PA (large TT mappings may also require contiguous IPA)
EL1 pages require contiguous IPA and PA
EL2 and EL3 pages require contiguous PA
All pages must be physically present to be fused. This is enforced by requiring the corresponding pages to be in the valid state in the RGT. There is no way to export (page out) fused pages; they first need to be split.
RMU commands that operate on a single page return an error if a fused page is provided as input. The fused page will first need to be split and the command retried (since otherwise the states of the pages within the fused group could diverge).
Fusion is a performance optimization that can be performed by the software that owns the pages, or by higher-level software in coordination with the management of large page table mappings.
Failure to fuse pages affects TLB performance, but does not affect program logic correctness or security.
Fusion level (fusion group size)
The fused group state is stored as a "FuseLevel" within the RGTE. FuseLevel 0 is the default level, used when a page is not part of any fused group. Additional fuse levels may be defined corresponding to larger fused groups, e.g., level 1 = 64K, level 2 = 2MB. It may be useful to select these sizes to match the sizes of the regions covered by the various levels of the translation table hierarchy.
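The level-to-size correspondence described here can be written out directly; the encoding below is a sketch of the scheme in the text.

```python
# Fuse levels from the text: level 0 = single 4KB page (not fused),
# level 1 = 64KB fused group, level 2 = 2MB fused group.
FUSE_LEVEL_SIZE = {0: 4 * 1024, 1: 64 * 1024, 2: 2 * 1024 * 1024}

def regions_per_group(fuse_level):
    """Number of 4KB regions covered by a fused group of this level."""
    return FUSE_LEVEL_SIZE[fuse_level] // FUSE_LEVEL_SIZE[0]
```

A level-1 group covers 16 pages and a level-2 group covers 512, matching the 64KB and 2MB block sizes of the translation table hierarchy.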
Thus, to fuse 16 × 4K pages into a single 64K fused group, the software may call RMU.Granule.Fuse(addr, 1). Upon successful completion, the 16 physically contiguous pages starting from the 64K-aligned address (addr) have their RGTE.FuseLevel increased. Similarly, RMU.Granule.Shatter(addr, 1) lowers the fuse level of the 16 pages back to FuseLevel 0.
To create a 2MB fused group, all 512 individual 4K pages are first fused into 64K fused groups. These 32 groups are then fused again into the 2MB group.
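The arithmetic behind this two-stage sequence can be checked directly:

```python
PAGE, GROUP_64K, GROUP_2M = 4 * 1024, 64 * 1024, 2 * 1024 * 1024

# Stage 1: fuse 4KB pages into 64KB groups.
pages_per_64k = GROUP_64K // PAGE        # 16 pages fused by each stage-1 command
stage1_commands = GROUP_2M // GROUP_64K  # 32 Fuse commands build the 64K groups

# Stage 2: one further Fuse command merges the 32 groups into the 2MB group.
stage2_commands = 1
total_pages = stage1_commands * pages_per_64k  # 512 pages in the 2MB group
```

So building one 2MB fused group from scratch takes 32 + 1 = 33 fuse commands over 512 pages.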
When fusing to a higher level, only the fusion level of the first RGTE of each lower-level group is updated to the new fusion level. Therefore, an RGTE aligned to a 64K address will have RGTE.FuseLevel = 2, while the other RGTEs not aligned to 64K will retain RGTE.FuseLevel = 1. This limited update reduces the number of RGTEs that need to be modified per Fuse/Shatter command.
In a linear RGT, updating the fusion level at addresses aligned to the larger page-granule sizes can be viewed as flattening a two-dimensional tree into a one-dimensional table.
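A sketch of the limited-update rule: after fusing 512 pages up to a single 2MB group, only the 64K-aligned RGTEs carry FuseLevel 2, while the rest keep FuseLevel 1. List indices here stand in for RGTEs; this is illustrative only.

```python
def fuse_levels_after_level2(num_pages=512, pages_per_l1_group=16):
    """FuseLevel per RGTE after 512 4K pages become one 2MB fused group.

    The level-1 fuses set every RGTE to FuseLevel 1; the level-2 fuse then
    raises only the first RGTE of each 64K (level-1) group to FuseLevel 2.
    """
    levels = [1] * num_pages
    for i in range(0, num_pages, pages_per_l1_group):
        levels[i] = 2   # RGTE aligned to a 64K boundary
    return levels
```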
A level 3 RMU lookup is illustrated in FIG. 24.
After the rmu.kernel.shatter command completes, the RMU issues a TLB invalidate command specifying the aligned physical address and size of the fused group concerned. This maintains security by ensuring that, if the various attributes associated with the previously fused group are subsequently updated, the stale values previously shared by all regions of the fused group cannot still be accessed from the TLB. A TLB invalidation is also expected for rmu.kernel.fuse in order to gain the performance improvement, but it is not necessary for security.
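The aligned address and size passed to the TLB invalidation might be derived as in this sketch, using the example level sizes; the table and function name are invented for illustration.

```python
# Example fused-group sizes per fusion level (from the running example).
GROUP_SIZE = {1: 64 * 1024, 2: 2 * 1024 * 1024}

def tlb_invalidate_params(addr, fuse_level):
    """Aligned base address and size of the fused group containing `addr`."""
    size = GROUP_SIZE[fuse_level]
    return (addr // size) * size, size
```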
FIG. 42 shows the concept of the fusion level (fused group size). As shown on the left side of FIG. 42, initially, when regions (granules) are assigned to a particular owner, the corresponding ownership entries (RGTEs) of the ownership table (RGT) 128 may each be defined with fusion level 0, the default level when a page is not part of any fused group. Fusion level 0 may be considered to indicate the minimum fused group size, corresponding to the size of a single memory region represented by one RGTE, e.g., 4kB. Thus, if a memory access is made to one of these regions, the TLB caches information from the single RGTE corresponding to the accessed region, regardless of any entries associated with other regions in this portion of the address space. For example, each region may have a different owner domain and/or different ownership (visibility) attributes defined for that region.
As shown in the middle portion of FIG. 42, a process then issues a region fuse command to the RMU, specifying target fusion level 1 and a target address identifying a block 900 of N_L1 consecutive 4K regions. N_L1 may be implicit; for example, the architecture may define the number of regions to be fused by a level 1 region fuse command, so that it need not be explicitly indicated in the command. In this example, N_L1 = 16, so that the level 1 fused group corresponds to 64KB.
If the region fuse command passes verification, the N_L1 consecutive regions are fused to form a single 64KB fused group, and the RMU increments the fusion level specified by each fused region by 1. This means that when a memory access is made to any region within the 64KB fused group, the attributes are read from the RGTE (shaded in FIG. 42) located at the 64KB-aligned address, i.e., the RGTE of the first region of the fused group.
A level 2 region fuse command specifying a target fusion level of 2 may then be issued, which triggers the fusion of N_L2 individual level 1 fused groups to form a larger level 2 fused group. Again, N_L2 may be implicit in the architecture; in this example N_L2 = 32, so that the level 2 fused group corresponds to 2MB. In addition, the RMU verifies that the attributes of each level 1 fused group are the same (checking only the N_L2 shaded RGTEs at the aligned address of each level 1 fused group), that the current fusion level of each of these RGTEs is 1 (one level less than the target fusion level), and that these RGTEs define consecutive mapped addresses (offset by 64K in this example).
If the level 2 region fuse command passes verification, the RMU updates the table to indicate that the attributes of all N_L2 × N_L1 regions are indicated by a single RGTE (the shaded entry in the right-hand part of FIG. 42) at the address aligned with the start of the 2MB level 2 fused group.
Note that when a region fuse command specifying a target fusion level of 2 or higher is executed, not all RGTEs associated with the fused regions update their fusion levels. Instead, only the fusion level of the first RGTE in each of the smaller (e.g., level 1) fused groups being fused is increased. Thus, in the example of FIG. 42, the fusion level of the 32 RGTEs shown shaded in the middle portion of FIG. 42 is increased to 2, but the remaining RGTEs in each of those level 1 fused groups still indicate a fusion level of 1. By updating only the first RGTE of each lower fused group when the fusion level is increased, the number of memory accesses needed is reduced, improving performance. When the MMU accesses regions associated with one of the non-incremented RGTEs, the actual fusion level associated with those regions may be determined from the RGTEs at addresses aligned to the boundaries corresponding to the larger fused region sizes. Thus, when the MMU accesses data from the RGT (ownership table), it may need to walk up and down, accessing the different aligned addresses corresponding to the different possible fused group sizes (fusion levels), in order to find the actual fused group size and attributes associated with a given region. After fusing, a single TLB entry may store the information to be looked up when accessing any region of the fused group.
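The up-and-down walk described above might look like the following sketch, which tries aligned addresses from the largest candidate group size downward until an RGTE's FuseLevel matches that size. The sizes, data layout and names are illustrative, not the architecture's actual lookup.

```python
PAGE = 4 * 1024
GROUP_SIZE = {0: 4 * 1024, 1: 64 * 1024, 2: 2 * 1024 * 1024}

def governing_rgte(fuse_level_of, addr):
    """Index of the RGTE whose attributes govern the region containing `addr`.

    `fuse_level_of` maps RGTE index -> FuseLevel. Candidate aligned
    addresses are tried from the largest group size downward; the first
    aligned RGTE whose FuseLevel matches that candidate size governs.
    """
    for level in sorted(GROUP_SIZE, reverse=True):
        aligned = (addr // GROUP_SIZE[level]) * GROUP_SIZE[level]
        idx = aligned // PAGE
        if fuse_level_of.get(idx, 0) == level:
            return idx
    raise ValueError("no governing RGTE found")
```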
As shown in FIG. 42, corresponding level 2 and level 1 region split commands may also be provided to reverse the effects of the respective level 2 and level 1 region fuse commands. Alternatively, as shown at the bottom of FIG. 42, a single global split command may decompose a fused group of any size into separate 4KB regions, without selectively reversing each level of fusion one step at a time. However, the single-level form of region split command (labeled level 1 or level 2 in FIG. 42) may be implemented more efficiently in the microarchitecture.
FIG. 43 is a flow chart illustrating a method of processing an issued region fuse command. At step 920, the RMU 20 receives a region fuse command. The region fuse command may be issued by any software process, including processes other than the owner domain of the regions to be fused. The region fuse command specifies an address and a target fusion level (FL) indicating the size of the fused group to be formed in response to the command. In this example, the target fusion level FL is explicitly indicated in the command, but other examples may identify it implicitly, for example by instead specifying the expected current fusion level, namely FL − 1.
The RMU 20 then looks up the RGTE for the region associated with the specified address and checks whether the region is indicated as valid; if the region is not valid, the region fuse command is rejected.
At step 926, for the addressed region identified by the address specified in the region fuse command, the RMU 20 checks the ownership table to determine whether the target fusion level FL is one level higher than the current fusion level indicated in the RGTE.
If the addressed region is valid and has a current fusion level matching the expected level (corresponding to the next smallest fused group size below the fused group size of the target fusion level FL), the address specified by the fuse command is aligned to the size boundary corresponding to the target fusion level FL, and at step 930 the RMU performs a loop over the selected RGTEs, starting from the aligned address, at intervals based on an offset selected according to the target fusion level.
For each RGTE of this selected subset, the following loop is performed:
locking the RGTE, to prevent the MMU from accessing the RGTE in response to other processes, or the RMU from updating it in response to other RMU commands, so that a partially updated RGTE cannot become visible to other processes until all effects of the region fuse command have been applied atomically; and
for any RGTE in the selected subset of RGTEs other than the first RGTE:
the RMU 20 compares the ownership attributes (including the owner domain, region lifecycle state, and visibility attributes) of the currently looked-up RGTE with the corresponding attributes defined for the previously looked-up RGTE in the selected subset, to check whether the attributes are the same; and
the RMU 20 checks whether the mapped address Mi specified in the currently looked-up RGTE is contiguous with the mapped address Mi+1 of the previously looked-up RGTE. That is, the mapped address Mi is verified if it corresponds to Mi+1 + offset, where the offset is selected based on the target fusion level FL such that it corresponds to the size of a fused group of fusion level FL − 1 (e.g., in the example of FIG. 42, the offset is 4K for a level 1 command, 64K for a level 2 command, and so on).
This loop is repeated for each RGTE of the selected subset. If, for any RGTE of the selected subset, the ownership attributes differ from the corresponding attributes of the previous RGTE, or the mapped address is not contiguous with the mapped address of the previous RGTE, then any previously locked RGTEs are unlocked and the region fuse command is rejected.
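The per-RGTE checks of this verification loop (matching ownership attributes, and mapped addresses contiguous at the lower group size) can be sketched as follows; the dictionary fields are invented stand-ins for RGTE fields.

```python
def _attrs(r):
    """Ownership attributes compared between RGTEs (illustrative fields)."""
    return (r["owner"], r["state"], r["visibility"])

def verify_fuse_subset(rgtes, offset):
    """Verify the selected RGTEs before fusing: every entry must carry the
    same ownership attributes as the first, and each mapped address must
    follow the previous one at `offset` (the size of a fused group one
    level below the target fusion level)."""
    first = rgtes[0]
    prev_mapped = first["mapped"]
    for r in rgtes[1:]:
        if _attrs(r) != _attrs(first):
            return False                  # attribute mismatch -> reject
        if r["mapped"] != prev_mapped + offset:
            return False                  # mapped addresses not contiguous
        prev_mapped = r["mapped"]
    return True
```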
If all RGTEs pass the verification of the ownership attributes and mapped addresses at step 930, then at step 934 a reverse loop is performed over the selected subset of RGTEs, in the reverse order of the loop of step 930. In the reverse loop, for each selected RGTE:
the RMU 20 increments the fusion level specified in the RGTE; and
the RMU 20 then unlocks the RGTE, so that the MMU can once more access the RGTE in response to processes other than the RMU.
Performing the loop of step 934 in the reverse order of the loop of step 930 ensures that the first RGTE of the subset (which corresponds to the address aligned to the group size boundary and is the primary entry identifying the attributes of the fused group) is updated and unlocked last, preventing incorrect results that could arise if the RGTE at the head of the fused group were accessed or updated before all the other RGTEs of the fused group had been updated.
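The ordering guarantee (forward lock, reverse update-and-unlock, head RGTE last) might be sketched as follows; the flags and fields are invented for illustration.

```python
def apply_fuse(rgtes):
    """Forward pass locks every selected RGTE; the reverse pass increments
    each FuseLevel and unlocks, so the first (group-head) RGTE is updated
    and unlocked last. Returns the unlock order for inspection."""
    for r in rgtes:               # forward pass: lock during verification
        r["locked"] = True
    unlock_order = []
    for r in reversed(rgtes):     # reverse pass: update, then unlock
        r["fuse_level"] += 1
        r["locked"] = False
        unlock_order.append(r["index"])
    return unlock_order
```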
Finally, the RMU 20 may also issue a TLB invalidate command for the newly formed fused group; as noted above, this is expected in order to gain the performance improvement from fusion, but is not necessary for security.
FIG. 44 shows a flow diagram illustrating a method of processing a region split command. At step 950, the RMU 20 receives the region split command. The region split command may be issued by any software process, including processes other than the owner domain of the fused regions. The region split command specifies an address and an expected fusion level (EFL) indicating the expected current fused group size of the fused group to be split (i.e., a level 2 split command specifying EFL = 2 is for splitting a level 2 fused group into level 1 fused groups). It will be understood that in other embodiments the fusion level specified by the command may instead indicate the target fusion level that should result from the split (such that a level 1 split command with target FL = 1 would be used to split a level 2 fused group into level 1 fused groups). Regardless of how the fusion level is represented, the RMU is able to determine the expected fusion level EFL from the encoding of the region split command.
At step 952, the RMU 20 looks up the RGTE for the region associated with the address specified in the region split command and checks whether the region is indicated as valid. If the region is not valid (e.g., in one of the invalid, RMU-private, or patrol clear-commit states), the region split command is rejected at step 954.
At step 956, for the addressed region identified by the address specified in the region split command, the RMU 20 checks the ownership table to determine whether the expected fusion level EFL derived from the region split command matches the current fusion level indicated in the RGTE.
If the addressed region is valid and has a current fusion level matching the expected fusion level EFL, then at step 958 the address specified by the split command is aligned to the size boundary corresponding to the expected/current fusion level EFL (e.g., using the particular example shown in FIG. 42, 64K for EFL = 1 and 2MB for EFL = 2). At step 960, the RMU then performs a loop over the selected RGTEs, starting from the aligned address, at intervals based on an offset selected according to the expected fusion level. These are the same subset of RGTEs selected in response to the region fuse command of the corresponding level. For each RGTE of this selected subset, the following loop is performed:
locking the RGTE, to prevent updates triggered by any command other than the region split command, and to prevent partially updated information from being accessed; and
for all RGTEs of the fused group to be split other than the first RGTE, the RMU 20 verifies that the current fusion level specified in the RGTE is 1 (as shown in FIG. 42, the remaining RGTEs of a fused group of any size should always specify a fusion level of 1).
This loop is repeated for each RGTE of the selected subset. If the fusion level is not equal to 1 for any RGTE of the selected subset other than the first RGTE associated with the fused group to be split, then at step 962 any previously locked RGTEs are unlocked and the region split command is rejected.
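The split-side verification might be sketched as: the head RGTE carries the group's level (checked against EFL at step 956), and every other selected RGTE must carry FuseLevel 1. This sketch folds both checks into one illustrative helper.

```python
def verify_split_subset(fuse_levels, efl):
    """`fuse_levels` lists the FuseLevel of each selected RGTE, head first.

    The head RGTE carries the fused group's level (the expected level EFL);
    all remaining selected RGTEs must carry FuseLevel 1."""
    return fuse_levels[0] == efl and all(lvl == 1 for lvl in fuse_levels[1:])
```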
If all RGTEs pass the fusion level verification at step 960, then at step 964 a reverse loop is performed over the selected subset of RGTEs, in the reverse order of the loop of step 960. In the reverse loop, for each selected RGTE:
the RMU 20 decrements the fusion level specified in the RGTE; and
the RMU 20 then unlocks the RGTE, so that the MMU can once more update or access the RGTE in response to processes other than the RMU.
At step 966, the RMU 20 issues a TLB invalidate command specifying the address and a size depending on the expected fusion level EFL, to trigger invalidation of any TLB entries that cached attributes shared by the previously fused group.
In the example of FIG. 44, the region split command specifies an expected fusion level (EFL). In another variation, the split command may omit the indication of the expected fusion level. In this case, step 956 may be omitted (so the method proceeds directly from step 952 to step 958), and the expected fusion level (EFL) may be determined at step 958 simply by looking up the RGT to determine the actual current fusion level of the addressed region.
FIG. 45 shows a simulator implementation that may be used. While the earlier-described embodiments implement the present technique in terms of apparatus and methods for operating specific processing hardware supporting the techniques concerned, it is also possible to provide an instruction execution environment in accordance with the embodiments described herein through the use of a computer program. Such computer programs are often referred to as simulators, insofar as they provide a software-based implementation of a hardware architecture. Varieties of simulator computer programs include emulators, virtual machines, models, and binary translators, including dynamic binary translators. Typically, a simulator implementation may run on a host processor 730, optionally running a host operating system 720, supporting the simulator program 710. In some arrangements there may be multiple layers of simulation between the hardware and the provided instruction execution environment, and/or multiple distinct instruction execution environments provided on the same host processor. Historically, powerful processors have been required to provide simulator implementations that execute at reasonable speed, but such an approach may be justified in certain circumstances, such as when there is a desire to run code native to another processor for compatibility or reuse reasons. For example, a simulator implementation may provide an instruction execution environment with additional functionality not supported by the host processor hardware, or provide an instruction execution environment typically associated with a different hardware architecture. An overview of simulation is given in "Some Efficient Architecture Simulation Techniques", Robert Bedichek, Winter 1990 USENIX Conference, pages 53-63.
To the extent that embodiments have been described above with reference to particular hardware constructions or features, in a simulated embodiment equivalent functionality may be provided by suitable software constructions or features. For example, particular circuitry may be implemented as computer program logic in a simulated embodiment. Similarly, memory hardware such as registers or caches may be implemented as software data structures in a simulated embodiment. Some simulated embodiments may make use of the host hardware, where appropriate, in arrangements where one or more of the hardware components referenced in the previously described embodiments are present on the host hardware (e.g., host processor 730).
The simulator program 710 may be stored on a computer-readable storage medium (which may be a non-transitory medium) and provides a program interface (instruction execution environment) to the object code 700 (which may include applications, operating systems and a hypervisor as shown in FIG. 2) that is the same as the interface of the hardware architecture being modeled by the simulator program 710. Thus, the program instructions of the object code 700, including control of memory accesses based on the domain protection functionality described above, may be executed from within the instruction execution environment using the simulator program 710, so that a host computer 730 which does not actually have the hardware features of the apparatus 2 discussed above can emulate these features.
In the present application, the words "configured to…" are used to mean that an element of an apparatus has a configuration able to carry out the defined operation. In this context, a "configuration" means an arrangement or manner of interconnection of hardware or software. For example, the apparatus may have dedicated hardware which provides the defined operation, or a processor or other processing device may be programmed to perform the function. "Configured to" does not imply that the apparatus element needs to be changed in any way in order to provide the defined operation.
Although illustrative embodiments of the present invention have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope and spirit of the invention as defined by the appended claims.