EL3 to Non-Secure EL1 Transition Failure and Missing EL1 Entry Call
When transitioning from EL3 (Exception Level 3) to non-secure EL1 (Exception Level 1) on an ARM Cortex-A55 processor, the CPU switches to EL1h in Non-secure state as requested, but the el1_entry function never runs. The same code works when switching to secure EL1. The problem appears when QEMU is given a raw binary file, but the code works when QEMU is given an ELF file with the -kernel option. This discrepancy suggests that the issue lies in the initialization sequence, the memory access configuration, or the handling of the transition between exception levels.
The core of the problem is the setup of system registers, memory access permissions, and physical address spaces (PAS) during the transition from EL3 to non-secure EL1. The Armv8-A architecture, which the Cortex-A55 implements, keeps Secure and Non-secure memory accesses strictly separate, and a misconfiguration can prevent el1_entry from ever executing.
Secure vs. Non-Secure Memory Access Configuration and EL2/EL1 Register Initialization
The ARM Cortex-A55 processor supports two distinct physical address spaces: Secure and Non-Secure. All memory accesses by the core resolve to one of these two PASs. In non-secure state, the output PAS is always Non-Secure, regardless of whether the Memory Management Unit (MMU) is enabled or disabled. In secure state, the output PAS depends on the MMU configuration. When the MMU is disabled, the output PAS is Secure. When the MMU is enabled, the output PAS is determined by the translation tables.
In this case, the el1_entry function resides in the same memory page as the EL3 code. Accesses from EL3 are Secure by default, so the EL3 code can be fetched even if the memory system admits only Secure accesses to that page. After the ERET, however, the instruction fetch from el1_entry is a Non-secure access to the same address, and it fails if the page is Secure-only. The Secure and Non-secure PASs are distinct address spaces, so a location that is reachable only through the Secure PAS cannot be reached from Non-secure state until the memory system is configured to allow it. This also explains why the same code works when the target is secure EL1: there the fetch is still a Secure access.
Additionally, the initialization of EL2 and EL1 registers is critical for a successful transition to non-secure EL1. Many system registers have architecturally UNKNOWN values after reset, and software is expected to initialize them before transitioning to a lower exception level. For example, the RW bit of the HCR_EL2 register controls whether non-secure EL1 executes in AArch64 or AArch32 state. If this register is not initialized, the transition to non-secure EL1 may fail, or the processor may enter an unexpected state.
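As a rough illustration of initializing state that is otherwise left UNKNOWN, the sketch below gives EL1 a defined stack pointer and vector base and mirrors the ID registers before the ERET. The symbols el1_stack_top and el1_vectors are hypothetical and would have to be defined elsewhere in the image; which registers really need initializing depends on what the EL1 code uses.

```asm
    // Sketch only: runs at EL3, before the ERET to non-secure EL1.
    // el1_stack_top and el1_vectors are hypothetical symbols.
    ldr     x0, =el1_stack_top
    msr     sp_el1, x0          // stack pointer used by EL1h after the ERET
    ldr     x0, =el1_vectors    // table must be 2 KB aligned
    msr     vbar_el1, x0        // exceptions taken to EL1 vector through here
    mrs     x0, midr_el1
    msr     vpidr_el2, x0       // value non-secure EL1 reads as MIDR_EL1
    mrs     x0, mpidr_el1
    msr     vmpidr_el2, x0      // value non-secure EL1 reads as MPIDR_EL1
```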
The issue is further complicated by the use of QEMU for emulation. QEMU handles the two image formats differently: an ELF file carries load addresses and an entry point, which QEMU honors when the file is passed with the -kernel option, whereas a raw binary carries no such metadata and is placed wherever the chosen loading method and machine model dictate. The binary may therefore end up in a different memory region, possibly one with different Secure or Non-secure access properties, and may start from a different CPU state than the ELF image. This discrepancy highlights the importance of understanding the differences between binary and ELF file formats and their impact on the initialization sequence.
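One way to check whether a raw binary ended up at the address it was linked for is to compare a PC-relative address with a link-time absolute one. This is only a sketch: it assumes a statically linked, non-relocatable image running with the MMU off, and it uses the el1_entry label purely as a convenient symbol.

```asm
    adr     x0, el1_entry       // run-time address, computed relative to the PC
    ldr     x1, =el1_entry      // link-time address, read from the literal pool
    cmp     x0, x1
    b.eq    1f                  // addresses agree: carry on with initialization
    b       .                   // mismatch: park here so a debugger can see it
1:
```

If execution parks at the `b .`, the image is running at a different address from the one it was linked for, and any absolute address in the code (literal pools, vector bases, stack tops) is suspect.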
Proper Initialization of System Registers, Memory Access Configuration, and Debugging Techniques
To resolve the issue of the missing el1_entry call during the transition from EL3 to non-secure EL1, the following steps should be taken:
1. Initialize EL2 and EL1 Registers
Before transitioning to non-secure EL1, ensure that all relevant EL2 and EL1 registers are properly initialized. This includes setting the HCR_EL2 register to the correct value to ensure that non-secure EL1 operates in the desired mode (AArch32 or AArch64). Additionally, initialize the SCTLR_EL1 register to configure the MMU and cache settings for EL1.
// Example initialization of HCR_EL2 and SCTLR_EL1
mov x0, #0x80000000 // Set HCR_EL2.RW to 1 for AArch64 EL1
msr hcr_el2, x0
mov x0, #0x00000000 // Disable MMU and caches in EL1
msr sctlr_el1, x0
2. Configure Memory Access Permissions
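Ensure that the memory region containing the el1_entry function is accessible from Non-secure state. Accesses from non-secure EL1 are always Non-secure, whether the MMU is enabled or not, so this is a property of the memory system (or, under QEMU, of the machine model and the region the image is loaded into) rather than of the EL1 translation tables. The NS bit in a translation table entry matters only in Secure state: if EL3 or secure EL1 runs with the MMU enabled and must reach the same Non-secure memory, the corresponding entry has to be marked Non-secure.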
// Example: mark a Secure-state stage 1 block or page descriptor as Non-secure
// Assume x1 contains the address of the descriptor to update
ldr x2, [x1] // Load the existing descriptor (keeps valid bit, address, attributes)
orr x2, x2, #(1 << 5) // Set the NS bit (descriptor bit 5)
str x2, [x1] // Write the descriptor back to the translation table
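If the MMU is already enabled at EL3 when a descriptor is changed, the old translation may still be cached in the TLB. A minimal sketch of the usual barrier and invalidate sequence follows, assuming the descriptor belongs to the EL3 translation regime; a secure EL1 regime would need the corresponding EL1 invalidate instead.

```asm
    dsb     ishst               // make the descriptor write visible to table walks
    tlbi    alle3               // invalidate all EL3 TLB entries on this core
    dsb     ish                 // wait for the invalidation to complete
    isb                         // resynchronize the instruction stream
```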
3. Verify the Exception Return (ERET) Sequence
Ensure that the SPSR_EL3 and ELR_EL3 registers are properly set before the ERET instruction is executed. The SPSR_EL3 register specifies the target exception level and stack pointer selection (e.g., EL1h); the security state of the target is taken from SCR_EL3.NS, not from SPSR_EL3. The ELR_EL3 register should point to the el1_entry function.
// Example setup for ERET to non-secure EL1
mov x0, #0b00101 // SPSR_EL3.M = EL1h (security state is selected by SCR_EL3.NS)
msr spsr_el3, x0
adr x1, el1_entry // Load the address of el1_entry into ELR_EL3
msr elr_el3, x1
eret // Execute the exception return
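The SPSR_EL3 value above leaves the D, A, I and F mask bits clear, so interrupts are unmasked on entry to EL1. A common variant, shown here as a sketch rather than a requirement, enters EL1h with all four masked so that nothing is taken before EL1 has installed its own vector table.

```asm
    mov     x0, #0x3c5          // D, A, I, F masked (bits 9:6) | M[3:0] = 0b0101 (EL1h)
    msr     spsr_el3, x0
```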
4. Debugging Techniques
If the el1_entry function is still not called, use the following debugging techniques to diagnose the issue:
- Check the Exception Syndrome Register (ESR): Examine the ESR_EL3 and ESR_EL1 registers to determine if an exception occurred during the transition. The ESR provides detailed information about the cause of the exception, such as an illegal instruction or an invalid memory access.
- Verify the Current Exception Level: After the ERET instruction, check the current exception level to determine if the transition to non-secure EL1 was successful. If the processor remains in EL3, the ERET instruction may have failed due to an illegal exception return.
- Inspect Memory Access Permissions: Use a debugger to inspect the memory access permissions for the region containing the el1_entry function. Ensure that the region is accessible in Non-Secure mode and that the memory system is correctly configured.
- Compare Binary and ELF File Behavior: Analyze the differences between the binary and ELF file formats when used with QEMU. Ensure that the binary file includes all necessary initialization steps that are automatically handled by the ELF loader.
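To make the first two checks concrete, the sketch below installs a catch-all vector table for EL1 and parks in a handler that loads the syndrome registers into general-purpose registers, where a debugger can read them. The labels el1_vectors and el1_trap are hypothetical, the two setup instructions are meant to run at EL3 before the ERET, and the table itself must lie in memory that non-secure EL1 can fetch from, otherwise the handler is never reached either.

```asm
    // At EL3, before the ERET (hypothetical labels throughout):
    adr     x0, el1_vectors
    msr     vbar_el1, x0

    // ... rest of the EL3 setup and the ERET ...

    .balign 2048                // VBAR_EL1 requires 2 KB alignment
el1_vectors:
    .rept   16                  // 16 vector entries, 128 bytes apart
    .balign 128
    b       el1_trap
    .endr

el1_trap:
    mrs     x0, esr_el1         // syndrome of the exception taken to EL1
    lsr     x1, x0, #26         // EC field, ESR bits [31:26]
    mrs     x2, elr_el1         // preferred return address of the exception
    mrs     x3, far_el1         // faulting address, meaningful for aborts
    mrs     x4, CurrentEL
    lsr     x4, x4, #2          // exception level, CurrentEL bits [3:2]
    b       .                   // park here for the debugger
```

Stopping at el1_trap with an EC value of 0x21 (Instruction Abort taken without a change of exception level) would be consistent with a failed fetch of el1_entry.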
5. Example Code for Full Initialization and Transition
Below is an example of a complete initialization and transition sequence from EL3 to non-secure EL1:
.global _Entry
_Entry:
// Configure SCR_EL3 for non-secure EL1
mrs x0, scr_el3
orr x0, x0, #(1 << 10) // Set SCR_EL3.RW (bit 10) to 1: the next lower level is AArch64
orr x0, x0, #(1 << 0) // Set SCR_EL3.NS (bit 0) to 1: lower levels are Non-secure
msr scr_el3, x0
// Initialize HCR_EL2 for AArch64 EL1
mov x0, #0x80000000
msr hcr_el2, x0
// Initialize SCTLR_EL1 (disable MMU and caches)
mov x0, #0x00000000
msr sctlr_el1, x0
// Set SPSR_EL3 for EL1h (Non-secure state comes from SCR_EL3.NS, set above)
mov x0, #0b00101
msr spsr_el3, x0
// Set ELR_EL3 to el1_entry
adr x1, el1_entry
msr elr_el3, x1
// Perform the exception return
eret
el1_entry:
// Example code for EL1 entry point
ldr x15, =0xdeadbeef
b .
By following these steps and ensuring proper initialization of system registers and memory access configuration, the transition from EL3 to non-secure EL1 should succeed, and the el1_entry function should be called as expected.