ARMv8 MMU Permission Faults During EL0 Access to EL1 Virtual Addresses
In ARMv8 architectures, the Memory Management Unit (MMU) plays a critical role in enforcing memory access permissions and ensuring that user-space applications (EL0) cannot access kernel-space virtual addresses (EL1) without proper authorization. When an application running at EL0 attempts to access a virtual address mapped by the kernel (EL1), the MMU triggers a permission fault, resulting in a segmentation fault (segfault) being reported by the operating system. This behavior is fundamental to maintaining system security and stability, but understanding the exact mechanisms behind it requires a deep dive into the ARMv8 MMU architecture, translation tables, and permission controls.
The core issue is how the MMU detects and prevents unauthorized access to kernel virtual addresses from user space. The access can be rejected at several stages: when the virtual address is checked against the configured address ranges, during the translation table walk, and when the permissions encoded in the page table entries (PTEs) are evaluated. The fault can result from the Access Permission (AP) bits in the page table, from the configuration of the Translation Control Register (TCR_EL1), or from the TCR_EL1.E0PDx controls, which block EL0 accesses to one half of the address space irrespective of the page table contents.
To fully understand and troubleshoot this issue, we must examine the following aspects:
- The role of the MMU in enforcing memory access permissions.
- The specific mechanisms by which the MMU detects permission faults.
- The configuration of translation tables and TCR_EL1 registers.
- The interaction between the MMU, the operating system, and the hardware.
Misconfigured AP Bits and TCR_EL1 Controls in ARMv8 MMU
The ARMv8 MMU relies on a combination of hardware controls and software-configured permissions to enforce memory access restrictions. Two key components are involved in this process: the Access Permission (AP) bits in the page table entries and the Translation Control Register (TCR_EL1). Misconfiguration of either component can lead to unexpected permission faults, even when the intended behavior is to allow access.
Access Permission (AP) Bits in Page Table Entries
The AP bits in the page table entries define the data access permissions for a given memory region. They specify whether the region is read/write or read-only, and whether unprivileged (EL0) code may access it in addition to privileged (EL1) code. Execute permission is not part of the AP field; it is controlled separately by the UXN and PXN bits. For the stage 1 EL1&0 translation regime, AP[2:1] is encoded as follows:
- AP[2:1] = 0b00: No access at EL0, read/write at EL1.
- AP[2:1] = 0b01: Read/write at EL0, read/write at EL1.
- AP[2:1] = 0b10: No access at EL0, read-only at EL1.
- AP[2:1] = 0b11: Read-only at EL0, read-only at EL1.
If the AP bits are misconfigured, the MMU will trigger a permission fault when an EL0 application attempts to access a memory region that is not explicitly permitted. For example, if the AP bits are set to 0b00 (no access at EL0), any attempt by an EL0 application to read or write to the corresponding memory region will result in a fault.
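As an illustration of how the AP field maps to EL0 and EL1 data permissions, the following minimal sketch decodes AP[2:1] from a raw stage 1 descriptor value. It assumes the descriptor has already been dumped from memory as a 64-bit integer, and it ignores other controls that can further restrict access (hierarchical APTable bits, PAN, UXN/PXN). The function names are hypothetical helpers, not part of any library.

```python
# AP[2:1] occupies bits [7:6] of a stage 1 block or page descriptor.
AP_SHIFT = 6
AP_MASK = 0b11

# (EL1 data access, EL0 data access) for each AP[2:1] encoding
# in the stage 1 EL1&0 translation regime.
AP_TABLE = {
    0b00: ("read/write", "none"),
    0b01: ("read/write", "read/write"),
    0b10: ("read-only", "none"),
    0b11: ("read-only", "read-only"),
}


def decode_ap(descriptor: int) -> dict:
    """Return the AP[2:1] value and the resulting EL1/EL0 data permissions."""
    ap = (descriptor >> AP_SHIFT) & AP_MASK
    el1, el0 = AP_TABLE[ap]
    return {"ap": ap, "el1": el1, "el0": el0}


def el0_allowed(descriptor: int, is_write: bool) -> bool:
    """Check whether AP[2:1] alone permits an EL0 data read or write."""
    el0 = decode_ap(descriptor)["el0"]
    if el0 == "none":
        return False
    # Reads are allowed by both remaining encodings; writes need read/write.
    return el0 == "read/write" or not is_write
```

For example, `el0_allowed(0b11 << 6, is_write=True)` returns `False`, because AP[2:1] = 0b11 is read-only at both exception levels.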
Translation Control Register (TCR_EL1) Controls
The TCR_EL1 register provides additional controls for managing memory access permissions. Two specific fields in TCR_EL1 are relevant to this discussion:
- TCR_EL1.E0PDx: These fields, available when FEAT_E0PD is implemented (Armv8.5 and later), disable EL0 access to one half of the virtual address space. For example, setting TCR_EL1.E0PD1 makes any EL0 access to the upper half of the virtual address space (typically used for kernel mappings) generate a translation fault.
- TCR_EL1.EPDn: These fields disable translation table walks for one half of the virtual address space. If EPDn is set, the MMU will not perform a table walk using the corresponding TTBR, and any access that misses in the TLB triggers a translation fault.
Misconfiguration of these fields can lead to unexpected behavior. For example, if TCR_EL1.E0PD1 is set to 1, any EL0 access to the upper half of the virtual address space will result in a translation fault, regardless of the AP bits in the page table entries.
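The fields discussed above can be extracted from a raw TCR_EL1 value with a few shifts and masks. The sketch below assumes the register value has already been captured at EL1 (for example from a debugger or a kernel log) and passed in as an integer. The bit positions follow the Arm architecture definition of TCR_EL1; the helper name is hypothetical.

```python
def decode_tcr_el1(tcr: int) -> dict:
    """Extract the TCR_EL1 fields relevant to EL0 access faults."""
    return {
        "T0SZ": tcr & 0x3F,           # bits [5:0]: size offset for the TTBR0 region
        "EPD0": (tcr >> 7) & 1,       # bit 7: disable table walks via TTBR0_EL1
        "T1SZ": (tcr >> 16) & 0x3F,   # bits [21:16]: size offset for the TTBR1 region
        "EPD1": (tcr >> 23) & 1,      # bit 23: disable table walks via TTBR1_EL1
        "E0PD0": (tcr >> 55) & 1,     # bit 55: block EL0 access to the TTBR0 region
        "E0PD1": (tcr >> 56) & 1,     # bit 56: block EL0 access to the TTBR1 region
    }


# Illustrative value only: 48-bit regions with E0PD1 set.
example = decode_tcr_el1((1 << 56) | (16 << 16) | 16)
```

With the illustrative value, `example["E0PD1"]` is 1 and both size offsets are 16, which corresponds to the configuration in which EL0 accesses to the upper half fault regardless of the AP bits.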
Kernel Virtual Address Space Layout
With 48-bit virtual addresses (T0SZ and T1SZ both set to 16), the virtual address space in ARMv8 is typically divided into two regions:
- The lower half of the virtual address space (0x0000000000000000 – 0x0000FFFFFFFFFFFF) is reserved for user-space applications (EL0).
- The upper half of the virtual address space (0xFFFF000000000000 – 0xFFFFFFFFFFFFFFFF) is reserved for kernel-space mappings (EL1).
The MMU uses separate translation table base registers (TTBR0_EL1 and TTBR1_EL1) to manage these regions. TTBR0_EL1 is used for the lower half of the address space, while TTBR1_EL1 is used for the upper half. This separation allows the kernel to enforce different access permissions for user-space and kernel-space mappings.
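To make the split concrete, the sketch below classifies a virtual address by the translation table base register that would be used for it. It assumes top-byte-ignore (TBI) is disabled and takes T0SZ and T1SZ as parameters, defaulting to 16 for a 48-bit layout. The function name and the returned labels are hypothetical.

```python
def classify_va(va: int, t0sz: int = 16, t1sz: int = 16) -> str:
    """Return which TTBR covers a 64-bit virtual address, if any."""
    va &= (1 << 64) - 1
    # Highest address of the lower region: 2^(64 - T0SZ) - 1.
    ttbr0_top = (1 << (64 - t0sz)) - 1
    # Lowest address of the upper region: 2^64 - 2^(64 - T1SZ).
    ttbr1_base = (1 << 64) - (1 << (64 - t1sz))
    if va <= ttbr0_top:
        return "TTBR0_EL1"
    if va >= ttbr1_base:
        return "TTBR1_EL1"
    # Addresses between the two regions are not translated by either table.
    return "gap"
```

An address in the gap between the two regions faults regardless of permissions, so this check is worth running before inspecting any page table entry.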
Diagnosing and Resolving MMU Permission Faults in ARMv8
To diagnose and resolve MMU permission faults in ARMv8, a systematic approach is required. This involves examining the configuration of the translation tables, the TCR_EL1 register, and the faulting address, as well as analyzing the Exception Syndrome Register (ESR_EL1) to determine the exact cause of the fault.
Step 1: Verify the Configuration of TCR_EL1
The first step in diagnosing MMU permission faults is to verify the configuration of the TCR_EL1 register. Specifically, check the values of the E0PDx and EPDn fields to ensure that they are set correctly for your use case. For example:
- If EL0 access to kernel-space mappings is required, ensure that TCR_EL1.E0PD1 is set to 0.
- If translation table walks should be enabled for the upper half of the virtual address space, ensure that TCR_EL1.EPD1 is set to 0.
Step 2: Examine the Access Permission (AP) Bits in the Page Table Entries
Next, examine the AP bits in the page table entries for the faulting address. Ensure that the AP bits are configured to allow the desired access at EL0. For example:
- If EL0 read-only access is required, set AP[2:1] to 0b11 (this also makes the region read-only at EL1).
- If EL0 read/write access is required, set AP[2:1] to 0b01.
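Changing the permission amounts to rewriting bits [7:6] of the descriptor. The following sketch computes the updated descriptor value for a desired EL0 access level; it is an offline calculation on an integer, not a procedure for modifying live page tables, which additionally requires the appropriate TLB maintenance. The helper name and the access labels are hypothetical.

```python
AP_SHIFT = 6
AP_FIELD = 0b11 << AP_SHIFT  # AP[2:1] lives in descriptor bits [7:6]

# AP[2:1] encoding to use for each desired EL0 data access level.
AP_FOR_EL0 = {
    "none": 0b00,        # EL1 read/write, EL0 no access
    "read-only": 0b11,   # read-only at both EL1 and EL0
    "read/write": 0b01,  # read/write at both EL1 and EL0
}


def with_el0_access(descriptor: int, access: str) -> int:
    """Return a copy of the descriptor with AP[2:1] set for the given EL0 access."""
    ap = AP_FOR_EL0[access]
    return (descriptor & ~AP_FIELD) | (ap << AP_SHIFT)
```

For instance, `hex(with_el0_access(0x743, "read-only"))` yields `0x7c3`: only bits [7:6] change, and every other attribute in the descriptor is preserved.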
Step 3: Analyze the Exception Syndrome Register (ESR_EL1)
ESR_EL1 records why the exception was taken. Its Exception Class field (EC, bits [31:26]) distinguishes instruction aborts from data aborts and indicates whether the abort came from a lower exception level. The fault status code in bits [5:0] gives the fault type (for example, translation fault or permission fault) and the translation table level at which it occurred. The faulting virtual address itself is reported in FAR_EL1, not in ESR_EL1. To analyze the fault:
- Decode EC and the fault status code to determine the type of fault.
- Read the faulting address from FAR_EL1 and check that it corresponds to the expected memory region.
- Verify that the access type (read or write, from the WnR bit for data aborts; execute for instruction aborts) matches the permissions defined in the page table entries.
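The decoding step can be sketched as follows. The code assumes the ESR_EL1 value is available as an integer (for example, from a kernel fault message) and handles only instruction and data aborts; other exception classes are reported as "other". The field positions (EC in bits [31:26], fault status code in bits [5:0], WnR in bit 6) follow the Arm architecture, while the function name and dictionary keys are hypothetical.

```python
EC_NAMES = {
    0x20: "instruction abort from a lower EL",
    0x21: "instruction abort from the same EL",
    0x24: "data abort from a lower EL",
    0x25: "data abort from the same EL",
}

# Upper four bits of the fault status code select the fault class;
# for these classes the lower two bits give the lookup level.
FAULT_CLASSES = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_esr_el1(esr: int) -> dict:
    """Decode the abort-related fields of an ESR_EL1 value."""
    ec = (esr >> 26) & 0x3F
    info = {"ec": ec, "ec_name": EC_NAMES.get(ec, "other")}
    if ec in EC_NAMES:
        fsc = esr & 0x3F
        fault_class = fsc >> 2
        info["fault"] = FAULT_CLASSES.get(fault_class, "other")
        info["level"] = fsc & 0b11 if fault_class in FAULT_CLASSES else None
        # WnR (bit 6) is only meaningful for data aborts.
        info["write"] = bool((esr >> 6) & 1) if ec in (0x24, 0x25) else None
    return info
```

As a usage example, `decode_esr_el1(0x9200004F)` reports a data abort from a lower EL caused by a write that hit a level 3 permission fault, which is the pattern expected when EL0 writes to a mapping it may not modify.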
Step 4: Verify the Kernel Virtual Address Space Layout
Finally, verify that the faulting address falls within a valid region of the kernel virtual address space. The operating system leaves certain regions of the virtual address space reserved or unmapped, and attempting to access these regions will result in a fault. For example:
- Depending on the kernel version and configuration, a range such as 0xFFFFFE0000000000 to 0xFFFFFFFFFFFFFFFF may be reserved or contain unmapped guard regions; confirm the exact layout against the memory layout documentation for your kernel.
- Ensure that the faulting address does not fall within this region or any other reserved region.
Example: Resolving a Permission Fault in Linux
Consider a scenario where an EL0 application attempts to access a kernel virtual address (0xFFFFFFFFFF28008000) and triggers a permission fault. To resolve this issue:
- Verify that TCR_EL1.E0PD1 is set to 0, allowing EL0 access to the upper half of the virtual address space.
- Examine the page table entry for the faulting address and ensure that the AP bits are set to 0b11 (EL0 read-only access) or 0b01 (EL0 read/write access).
- Analyze the ESR_EL1 register to confirm that the fault is a permission fault and not a translation fault.
- Verify that the faulting address does not fall within a reserved or unmapped region of the kernel virtual address space.
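The checklist above can be condensed into a small triage routine. This is a sketch under several assumptions: the faulting address is known to lie in the upper (TTBR1_EL1) half, the raw ESR_EL1, TCR_EL1, and leaf descriptor values have been collected separately, and only the causes discussed in this article are considered. The function name and the returned messages are hypothetical.

```python
def triage_el0_fault(esr: int, tcr: int, descriptor: int) -> str:
    """Suggest a likely cause for an EL0 abort on an upper-half address."""
    ec = (esr >> 26) & 0x3F
    if ec not in (0x20, 0x24):  # instruction or data abort from a lower EL
        return "not an abort from EL0"
    fault_class = (esr & 0x3F) >> 2
    if fault_class == 0b0001:  # translation fault
        if (tcr >> 56) & 1:
            return "translation fault: TCR_EL1.E0PD1 is set"
        if (tcr >> 23) & 1:
            return "translation fault: TCR_EL1.EPD1 is set"
        return "translation fault: address is not mapped"
    if fault_class == 0b0011:  # permission fault
        if ec == 0x20:
            # Execute permission is governed by UXN, not by AP[2:1].
            return "permission fault: check UXN (descriptor bit 54)"
        ap = (descriptor >> 6) & 0b11
        if ap in (0b00, 0b10):
            return "permission fault: AP[2:1] grants EL0 no access"
        if ap == 0b11 and (esr >> 6) & 1:
            return "permission fault: EL0 write to a read-only mapping"
        return "permission fault: cause not identified by this sketch"
    return "other fault type"
```

The order of the checks mirrors the order in which the hardware can reject the access: the TCR_EL1 controls first, then the presence of a mapping, then the permissions in the descriptor.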
By following these steps, you can systematically diagnose and resolve MMU permission faults in ARMv8 and determine whether a given fault stems from the TCR_EL1 configuration, the page table permissions, or the address itself.