ARMv9 RME Cache Coherency Problems During GPC-Protected Memory Access
The ARMv9 architecture introduces Realm Management Extensions (RME), which include Granule Protection Checks (GPC) to enforce memory access permissions at a granular level. The GPC mechanism is designed to ensure that memory accesses are validated against the Granule Protection Table (GPT) before proceeding. However, a critical issue arises when considering cache coherency and the sequence in which GPCs are performed relative to cache accesses. Specifically, the interaction between cache coherency protocols and GPCs can lead to scenarios where memory access permissions are bypassed or violated, particularly in systems with multiple Processing Elements (PEs) that have inconsistent GPT configurations.
In a typical ARMv9 system, caches are tagged with Physical Addresses (PAs), and these tags are used in coherency protocols to maintain consistency across multiple cores. When a PE attempts to access a memory location, the GPC is conceptually performed before the memory access, including cache accesses. However, if two PEs have different GPT configurations (for example, one PE marks a region as Normal (Non-secure) while another marks it as Root), one PE could potentially reach data that the other PE's GPT protects: if the line is already cached and tagged as Normal, it could be supplied over the coherency bus. The open question is whether the GPC blocks such an access or whether the cache coherency mechanism lets it proceed unchecked.
This issue is further complicated by the fact that cache invalidation operations, such as those performed by set/way-based Cache Maintenance Operations (CMOs), do not have an associated address to check against the GPT. This means that a PE could invalidate a cache line belonging to another PE, even if the GPT would otherwise block access to that memory region. Such behavior could lead to data corruption or security vulnerabilities, particularly in systems where isolation between PEs is critical.
Inconsistent GPT Configurations and Cache Coherency Protocol Interactions
The root cause of the issue lies in the interaction between the cache coherency protocol and the GPC mechanism. The ARMv9 architecture expects all PEs to have a coherent view of the GPT, meaning that each PE should have the same GPT configuration. If this expectation is not met (for example, if one PE has a GPT that marks a region as Normal while another PE marks the same region as Root), the cache coherency protocol may allow one PE to access the other PE's memory via the coherency bus. This is because the cache tags are based on PAs, and the coherency protocol does not inherently enforce GPCs.
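The divergence described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the flat one-entry-per-granule table, the 4KB granule size, and the names `gpt_model`, `gpc_allows`, and `gpc_views_diverge` are hypothetical, and the `pas` numbering is illustrative rather than architectural. The GPI values follow the encodings commonly used for GPT entries. It models only the permission decision, not the cache or the coherency bus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Granule Protection Information values stored in GPT entries. */
enum gpi {
    GPI_NO_ACCESS = 0x0,
    GPI_SECURE    = 0x8,
    GPI_NS        = 0x9,
    GPI_ROOT      = 0xA,
    GPI_REALM     = 0xB,
    GPI_ANY       = 0xF
};

/* Physical address space of an access (illustrative numbering only). */
enum pas { PAS_SECURE, PAS_NS, PAS_ROOT, PAS_REALM };

#define GRANULE_SHIFT 12u /* assumption: 4KB granules */

/* Hypothetical flat model of one PE's GPT: one GPI byte per granule. */
struct gpt_model {
    const uint8_t *gpi;
    size_t granules;
};

/* Returns true if this PE's GPT view permits an access to pa in the given PAS. */
bool gpc_allows(const struct gpt_model *gpt, uint64_t pa, enum pas pas)
{
    assert(gpt != NULL && gpt->gpi != NULL);
    uint64_t idx = pa >> GRANULE_SHIFT;
    if (idx >= gpt->granules)
        return false; /* outside the modelled region: treat as a fault */

    switch (gpt->gpi[idx]) {
    case GPI_ANY:    return true;
    case GPI_SECURE: return pas == PAS_SECURE;
    case GPI_NS:     return pas == PAS_NS;
    case GPI_ROOT:   return pas == PAS_ROOT;
    case GPI_REALM:  return pas == PAS_REALM;
    default:         return false; /* no access, or an unmodelled encoding */
    }
}

/* Returns true if two PEs would reach different GPC decisions for the same access. */
bool gpc_views_diverge(const struct gpt_model *a, const struct gpt_model *b,
                       uint64_t pa, enum pas pas)
{
    return gpc_allows(a, pa, pas) != gpc_allows(b, pa, pas);
}
```

With one table marking granule 0 as `GPI_NS` and another marking it as `GPI_ROOT`, `gpc_views_diverge` reports a mismatch for a Non-secure access to that granule, which is the inconsistent configuration the text warns about.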
Cache maintenance adds a second problem. Set/way-based CMOs identify a line by cache level, set, and way rather than by address, so there is no physical address to check against the GPT before the line is invalidated. A PE could therefore invalidate a cache line holding another PE's data even if the GPT would otherwise block access to that memory region. Where isolation between PEs is critical, this could lead to data corruption or security vulnerabilities.
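To make the "no associated address" point concrete, the sketch below builds the operand that a set/way operation such as DC CISW would take. It assumes the usual layout (way in the top bits of the low 32 bits, set above the line-offset bits, zero-based cache level in bits [3:1]); the function name `dc_sw_operand` and its parameters are hypothetical, and real firmware would derive the geometry from the cache ID registers rather than pass it in.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a set/way CMO operand from cache geometry.
 *   level     - zero-based cache level (0 for L1, 1 for L2, ...)
 *   way, set  - index of the line within the cache
 *   ways_log2 - log2 of the associativity, rounded up
 *   line_log2 - log2 of the line size in bytes
 * Note that no physical or virtual address appears anywhere in the operand.
 */
uint64_t dc_sw_operand(unsigned level, unsigned way, unsigned set,
                       unsigned ways_log2, unsigned line_log2)
{
    assert(level < 7);       /* the level field is three bits wide */
    assert(ways_log2 <= 32);
    assert(line_log2 >= 4);  /* set field starts above bits [3:0] */

    return ((uint64_t)way << (32u - ways_log2)) |
           ((uint64_t)set << line_log2) |
           ((uint64_t)level << 1);
}
```

Because the operand names a position in the cache and nothing else, a check against the GPT has no address to start from.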
Implementing Consistent GPT Configurations and Cache Management Strategies
To address these issues, it is essential to ensure that all PEs in the system have a consistent view of the GPT. This can be achieved by configuring the GPT in a way that is consistent across all PEs before enabling the GPC mechanism. Specifically, the Root firmware must ensure that the GPCCR_EL3 and GPTBR_EL3 registers are configured consistently across all PEs before setting the GPCCR_EL3.GPC bit to 1. This ensures that all PEs have the same view of the GPT, preventing inconsistencies that could lead to cache coherency issues.
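The ordering requirement above (configure consistently, then enable) can be sketched as a check over a snapshot of each PE's register values. This is a model, not firmware: on real hardware each PE programs its own GPCCR_EL3 and GPTBR_EL3 at EL3, and the struct `pe_gpc_cfg`, the functions `gpc_cfg_consistent` and `gpc_enable_all`, and the assumed position of the GPC enable bit (bit 16) are illustrative and should be checked against the architecture documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GPCCR_EL3.GPC is bit 16. */
#define GPCCR_GPC_BIT (UINT64_C(1) << 16)

/* Hypothetical snapshot of the GPC-related registers of one PE. */
struct pe_gpc_cfg {
    uint64_t gpccr_el3;
    uint64_t gptbr_el3;
};

/* True if every PE has the same GPTBR_EL3 and the same GPCCR_EL3
 * (ignoring the enable bit itself). */
bool gpc_cfg_consistent(const struct pe_gpc_cfg *cfg, size_t n)
{
    assert(cfg != NULL && n > 0);
    for (size_t i = 1; i < n; i++) {
        if ((cfg[i].gpccr_el3 & ~GPCCR_GPC_BIT) !=
            (cfg[0].gpccr_el3 & ~GPCCR_GPC_BIT))
            return false;
        if (cfg[i].gptbr_el3 != cfg[0].gptbr_el3)
            return false;
    }
    return true;
}

/* Set the enable bit on every PE only if all views agree; otherwise
 * leave every snapshot untouched and report failure. */
bool gpc_enable_all(struct pe_gpc_cfg *cfg, size_t n)
{
    if (!gpc_cfg_consistent(cfg, n))
        return false;
    for (size_t i = 0; i < n; i++)
        cfg[i].gpccr_el3 |= GPCCR_GPC_BIT;
    return true;
}
```

The design choice worth noting is that the enable step is all-or-nothing: a single mismatching PE prevents GPC from being enabled anywhere in the model, which mirrors the requirement that no PE runs with GPC enabled against a different table.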
In addition to ensuring consistent GPT configurations, it is also important to implement proper cache management strategies. This includes avoiding the use of set/way-based CMOs in favor of address-based CMOs, which can be checked against the GPT before performing the operation. Address-based CMOs ensure that the GPC is performed before the cache operation, preventing unauthorized access to memory regions.
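An address-based alternative to a set/way loop might look like the following sketch, which walks a range one cache line at a time and hands each line address to a per-line operation. The helper names (`cmo_by_address`, `trace_line`, `dc_civac_line`) are hypothetical. The callback indirection is there so the walk can be dry-run on any host; on AArch64 the per-line operation would be an instruction such as DC CIVAC, followed by a barrier (for example DSB) once the loop completes. The sketch assumes `start + len` does not wrap around the address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*line_op_fn)(uintptr_t line_addr, void *ctx);

/* Apply op to every cache line overlapping [start, start + len).
 * Returns the number of lines visited. line_size must be a power of two. */
size_t cmo_by_address(uintptr_t start, size_t len, size_t line_size,
                      line_op_fn op, void *ctx)
{
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    assert(op != NULL);
    if (len == 0)
        return 0;

    uintptr_t mask = ~(uintptr_t)(line_size - 1);
    uintptr_t addr = start & mask;              /* first line, aligned down */
    uintptr_t last = (start + len - 1) & mask;  /* line holding the last byte */
    size_t count = 0;

    for (;;) {
        op(addr, ctx);
        count++;
        if (addr == last)
            break;
        addr += line_size;
    }
    return count;
}

/* Dry-run callback: records which lines would be maintained. */
struct cmo_trace {
    uintptr_t first;
    uintptr_t last;
    size_t n;
};

void trace_line(uintptr_t line_addr, void *ctx)
{
    struct cmo_trace *t = ctx;
    if (t->n == 0)
        t->first = line_addr;
    t->last = line_addr;
    t->n++;
}

#if defined(__aarch64__)
/* Real per-line operation: clean and invalidate by address. */
static inline void dc_civac_line(uintptr_t line_addr, void *ctx)
{
    (void)ctx;
    __asm__ volatile("dc civac, %0" : : "r"(line_addr) : "memory");
}
#endif
```

Because every operation carries an address, each one can be checked against the GPT, which is the property the set/way form lacks.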
Another important consideration is the use of SMMUs (System Memory Management Units) to enforce GPCs for devices that access memory, such as the GIC (Generic Interrupt Controller). In systems with RME, the GIC must be subject to GPCs, which can be achieved by placing it behind an SMMU. The SMMU must be configured to use the same GPT as the PEs. This matters most when the GIC is shared, because any mismatch means GPCs are no longer enforced consistently across the system.
Finally, it is important to consider the fidelity of FVP (Fixed Virtual Platform) models when testing these scenarios. While FVP models provide a useful tool for testing and development, they may not fully replicate the behavior of real hardware. It is important to validate the behavior observed in FVP models against real hardware to ensure that the system behaves as expected in practice.
Issue | Root Cause | Solution
---|---|---
Inconsistent GPT configurations | Different PEs have different GPT configurations, leading to cache coherency issues | Ensure consistent GPT configurations across all PEs before enabling GPC |
Cache invalidation bypassing GPC | Set/way-based CMOs do not have an associated address to check against the GPT | Use address-based CMOs to ensure GPC is performed before cache operations |
GIC and SMMU configuration | Shared GICs must be subject to GPCs, requiring consistent GPT configurations | Configure SMMUs to use the same GPT as the PEs to enforce GPCs consistently |
FVP model fidelity | FVP models may not fully replicate real hardware behavior | Validate FVP model behavior against real hardware |
Together, these strategies mitigate the risks associated with cache coherency and GPC sequencing in ARMv9 systems with RME: configure the GPT consistently on every PE before enabling GPC, prefer address-based CMOs over set/way operations, configure SMMUs to use the same GPT as the PEs, and validate behavior observed in FVP models against real hardware.