ARM Cortex-A72 GICv3 Interrupt Handling Issues During EL3 to EL1 Transition
The ARM Cortex-A72 processor, when paired with the Generic Interrupt Controller (GIC) version 3, can exhibit complex interrupt handling issues during transitions between Exception Levels (ELs), particularly when moving from EL3 to EL1. These issues often manifest as interrupts not being correctly handled after the transition, leading to system instability or unresponsiveness. This post delves into the root causes of these issues, explores the underlying mechanisms of the GICv3 and ARMv8 architecture, and provides detailed troubleshooting steps to resolve them.
GICv3 SGI Handling Mismatch Between EL3 and EL1
The core issue revolves around the handling of Software Generated Interrupts (SGIs) in the GICv3 architecture, specifically when the ARM Cortex-A72 transitions from EL3 to EL1. SGIs are interrupts generated by software, typically used for inter-processor communication (IPC). In GICv3, SGIs can be generated using different registers, such as ICC_SGI1R_EL1, ICC_ASGI1R_EL1, or GICD_SGIR (the last only in legacy operation, when affinity routing is disabled), and can be grouped into Secure Group1 (SG1) or Non-Secure Group1 (NSG1). The problem arises when SGIs generated at EL3 are not correctly handled after the transition to EL1, leading to missed or improperly routed interrupts.
The root cause of this issue lies in the configuration of the GICv3 registers and the mismatch between the sender and receiver configurations. When an SGI is generated, the sender specifies the interrupt group (G0, SG1, or NSG1), and the receiver must be configured to accept interrupts from that group. If there is a mismatch between the sender and receiver configurations, the interrupt will be dropped, leading to the observed behavior where only certain SGIs are handled correctly.
For example, software running at EL3 is always in Secure state, so a write to ICC_SGI1R_EL1 there requests an SG1 interrupt (Group1 for the current Security state). If the receiver has that INTID configured as NSG1, the interrupt will be dropped. Similarly, a write to ICC_ASGI1R_EL1 at EL3 requests an NSG1 interrupt (Group1 for the other Security state), and it will be dropped if the receiver has that INTID configured as SG1. This mismatch can occur due to incorrect configuration of the GICR_IGROUPR0 and GICR_IGRPMODR0 registers, which control the group and group modifier of the SGIs and PPIs for each PE.
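The sender/receiver agreement can be modelled in a few lines of C. This is a simplified sketch, not driver code: it covers only SGIs written from Secure state (which includes EL3), assumes GICD_CTLR.DS == 0, and ignores GICR_NSACR. It assumes a Secure write to ICC_SGI1R_EL1 requests Secure Group1 and a Secure write to ICC_ASGI1R_EL1 requests Non-Secure Group1, which is how the GICv3 architecture specification describes "current" and "other" Security state. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not taken from any vendor header. */
typedef enum { GIC_GRP_SECURE_G0, GIC_GRP_SECURE_G1, GIC_GRP_NONSECURE_G1 } gic_group_t;
typedef enum { SGI_VIA_ICC_SGI0R, SGI_VIA_ICC_SGI1R, SGI_VIA_ICC_ASGI1R } sgi_reg_t;

/* Decode the group the receiving Redistributor has assigned to an SGI,
 * from its GICR_IGROUPR0 and GICR_IGRPMODR0 values. */
static gic_group_t sgi_receiver_group(uint32_t igroupr0, uint32_t igrpmodr0,
                                      unsigned intid)
{
    assert(intid < 16u);                      /* SGIs are INTID 0-15 */
    bool group    = (igroupr0  >> intid) & 1u;
    bool modifier = (igrpmodr0 >> intid) & 1u;

    if (group)
        return GIC_GRP_NONSECURE_G1;          /* IGROUPR0 bit set */
    return modifier ? GIC_GRP_SECURE_G1 : GIC_GRP_SECURE_G0;
}

/* Group requested by an SGI register write made from Secure state. */
static gic_group_t sgi_requested_group_secure(sgi_reg_t reg)
{
    switch (reg) {
    case SGI_VIA_ICC_SGI0R:  return GIC_GRP_SECURE_G0;
    case SGI_VIA_ICC_SGI1R:  return GIC_GRP_SECURE_G1;    /* current state */
    case SGI_VIA_ICC_ASGI1R: return GIC_GRP_NONSECURE_G1; /* other state   */
    }
    return GIC_GRP_SECURE_G0;                 /* not reached for valid input */
}

/* In this simplified model the SGI is forwarded only when the requested
 * group equals the group configured on the receiving PE. */
static bool sgi_is_forwarded(sgi_reg_t reg, uint32_t igroupr0,
                             uint32_t igrpmodr0, unsigned intid)
{
    return sgi_requested_group_secure(reg) ==
           sgi_receiver_group(igroupr0, igrpmodr0, intid);
}
```

With SGI 0 configured as NSG1 on the receiver (bit 0 set in GICR_IGROUPR0, clear in GICR_IGRPMODR0), the model reports a write to ICC_ASGI1R_EL1 from EL3 as forwarded and a write to ICC_SGI1R_EL1 as dropped.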
Configuration and State Transition Issues in GICv3 and ARMv8
The second part of the issue involves the configuration of the GICv3 and ARMv8 registers during the transition from EL3 to EL1. When the Cortex-A72 transitions from EL3 to EL1, several critical registers must be configured correctly to ensure that interrupts are handled properly in the new Exception Level. These include the SCR_EL3, HCR_EL2, ICC_SRE_EL2, ICC_SRE_EL1, and ICC_IGRPEN1_EL1 registers, among others.
One common mistake is failing to set the ARE_NS and ARE_S bits in the GICD_CTLR register, which enable the Affinity Routing Extension (ARE) for Non-Secure and Secure states, respectively. Without these bits set, the GICv3 will not route interrupts correctly to the CPU interfaces in the new Exception Level. Additionally, the ICC_SRE_EL2.SRE bit must be set to enable the System Register Interface for the GICv3 at EL2, and the ICC_SRE_EL1.SRE bit must be set in both banked copies of the ICC_SRE_EL1 register to enable the System Register Interface at EL1.
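The bit positions involved can be written down as masks. The following is a minimal sketch assuming the Secure view of GICD_CTLR on a GIC with two Security states (ARE_S at bit 4, ARE_NS at bit 5) and SRE at bit 0 of the ICC_SRE_ELx registers. The helper names are hypothetical, and the helpers only compute register values; on hardware the results would be written with an MMIO store to GICD_CTLR and with MSR instructions followed by an ISB for the ICC_SRE_ELx registers. Setting the ICC_SRE_EL2 Enable bit (bit 3), which permits lower Exception Levels to access ICC_SRE_EL1, is an addition not mentioned in the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* GICD_CTLR, Secure view, two Security states (GICD_CTLR.DS == 0). */
#define GICD_CTLR_ARE_S    (1u << 4)
#define GICD_CTLR_ARE_NS   (1u << 5)

/* ICC_SRE_EL1 / ICC_SRE_EL2 fields. */
#define ICC_SRE_SRE        (1ull << 0)  /* use the System Register Interface */
#define ICC_SRE_EL2_ENABLE (1ull << 3)  /* assumption of this sketch: also let
                                           lower ELs access ICC_SRE_EL1 */

/* Value to write to GICD_CTLR so affinity routing is on for both states. */
static uint32_t gicd_ctlr_with_affinity_routing(uint32_t gicd_ctlr)
{
    return gicd_ctlr | GICD_CTLR_ARE_S | GICD_CTLR_ARE_NS;
}

/* True only when both ARE_S and ARE_NS are set. */
static bool gicd_affinity_routing_enabled(uint32_t gicd_ctlr)
{
    const uint32_t both = GICD_CTLR_ARE_S | GICD_CTLR_ARE_NS;
    return (gicd_ctlr & both) == both;
}

/* Value to write to ICC_SRE_EL2 before dropping to a lower EL. */
static uint64_t icc_sre_el2_value(uint64_t current)
{
    return current | ICC_SRE_SRE | ICC_SRE_EL2_ENABLE;
}

/* Value to write to each banked copy of ICC_SRE_EL1. */
static uint64_t icc_sre_el1_value(uint64_t current)
{
    return current | ICC_SRE_SRE;
}
```

Reading GICD_CTLR back and passing it to gicd_affinity_routing_enabled() is a quick way to confirm the write took effect before continuing with the rest of the setup.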
Another critical step is to ensure that the NS.G1 enable bit is set in both the CPU interface (the Enable bit in ICC_IGRPEN1_EL1) and the Distributor (the EnableGrp1NS bit in GICD_CTLR). These bits enable Non-Secure Group1 interrupts, which are typically used for inter-processor communication in the Non-Secure state. If either bit is not set, the CPU will not receive NSG1 interrupts, leading to missed interrupts.
Furthermore, before leaving EL3, the ICC_PMR_EL1 register must be programmed to a value in the Non-Secure range to ensure that the CPU accepts Non-Secure interrupts. The GICx_IGROUPRn and GICx_IGRPMODRn registers must also be configured to make the appropriate interrupts Non-Secure Group1. Finally, the GICR_WAKER sequence must be completed to ensure that the Redistributor is awake and ready to handle interrupts.
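Two of these steps lend themselves to a short sketch: the GICR_WAKER sequence and marking an interrupt as Non-Secure Group1. The code below assumes ProcessorSleep is bit 1 and ChildrenAsleep is bit 2 of GICR_WAKER, and that the caller passes pointers to the already-mapped registers. The function names and the bounded poll count are hypothetical choices for this example, not part of any existing driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GICR_WAKER_PROCESSOR_SLEEP (1u << 1)
#define GICR_WAKER_CHILDREN_ASLEEP (1u << 2)

/* Wake the Redistributor: clear ProcessorSleep, then poll until
 * ChildrenAsleep reads as 0. Returns false if that has not happened
 * after max_polls reads, so the caller can report a stuck Redistributor
 * instead of hanging. */
static bool gicr_wake(volatile uint32_t *gicr_waker, unsigned max_polls)
{
    *gicr_waker &= ~GICR_WAKER_PROCESSOR_SLEEP;

    for (unsigned i = 0; i < max_polls; i++) {
        if ((*gicr_waker & GICR_WAKER_CHILDREN_ASLEEP) == 0u)
            return true;
    }
    return false;
}

/* Make one interrupt Non-Secure Group1: group bit set, modifier bit clear.
 * 'bit' is the interrupt's position within this IGROUPRn/IGRPMODRn pair. */
static void gic_make_nonsecure_group1(volatile uint32_t *igroupr,
                                      volatile uint32_t *igrpmodr,
                                      unsigned bit)
{
    assert(bit < 32u);
    *igroupr  |= (1u << bit);
    *igrpmodr &= ~(1u << bit);
}
```

For SGIs and PPIs the register pair is GICR_IGROUPR0 and GICR_IGRPMODR0 in the Redistributor of the target PE, so the group assignment has to be repeated for every core that should receive the interrupt.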
Detailed Troubleshooting and Configuration Steps for GICv3 and ARMv8
To resolve the issues described above, follow these detailed troubleshooting and configuration steps:
- Verify SGI Configuration at EL3:
  - Ensure that the GICR_IGROUPR0 and GICR_IGRPMODR0 registers are correctly configured for each SGI (INTID 0-15). These registers control the group and group modifier of the SGIs and PPIs for each PE. For example, if you want to generate NSG1 interrupts, ensure that the corresponding bit is set in GICR_IGROUPR0 and clear in GICR_IGRPMODR0, which together select Non-Secure Group1.
- Check Sender and Receiver Agreement:
  - When generating an SGI, ensure that the sender and receiver configurations match. For example, if the sender generates an SGI as NSG1, the receiver must be configured to accept NSG1 interrupts. If there is a mismatch, the interrupt will be dropped.
- Configure GICv3 Registers for EL3 to EL1 Transition:
  - Set the ARE_NS and ARE_S bits in the GICD_CTLR register to enable Affinity Routing for Non-Secure and Secure states.
  - Set the ICC_SRE_EL2.SRE bit to enable the System Register Interface for the GICv3 at EL2.
  - Set the ICC_SRE_EL1.SRE bit in both banked copies of the ICC_SRE_EL1 register to enable the System Register Interface at EL1.
  - Set the NS.G1 enable bit in both the CPU interface (ICC_IGRPEN1_EL1) and the Distributor (GICD_CTLR).
  - Program the ICC_PMR_EL1 register to a value in the Non-Secure range before leaving EL3.
  - Configure the GICx_IGROUPRn and GICx_IGRPMODRn registers to make the appropriate interrupts Non-Secure Group1.
  - Complete the GICR_WAKER sequence to ensure that the Redistributor is awake and ready to handle interrupts.
- Verify Interrupt State After Transition:
  - After transitioning to EL1, check the GICx_ISPENDRn register to ensure that the interrupt is pending. If the interrupt is not pending, it will not be sent to the CPU.
  - Check the GICx_ISACTIVERn register to ensure that the interrupt is not active. If the interrupt is active or active and pending, it will not be signaled.
  - Check the individual enable, group, priority, and target configuration for the interrupt.
  - Check the ICC_HPPIR0_EL1 and ICC_HPPIR1_EL1 registers to see if the interrupt was forwarded to the CPU. These registers are not affected by the Priority Mask Register (PMR) or Running Priority Register (RPR).
  - Check the ICC_RPR_EL1 register to ensure that it is in the "idle" state. If it is not idle, a new interrupt will only be signaled if it meets the preemption rules.
  - Ensure that the interrupt priority is numerically lower than the value in the ICC_PMR_EL1 register. If the priority is not lower, the interrupt will not be signaled.
- Handle FIQ and IRQ Correctly:
  - Ensure that the SCR_EL3.FIQ bit is not cleared while in Non-Secure state. In GICv3, FIQ is used for Group0 interrupts and interrupts for the other Security state. Clearing this bit can lead to missed interrupts in Non-Secure state.
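The state checks in the "Verify Interrupt State After Transition" step can be captured as a small decision function that reports the first reason an interrupt would not be signaled. This is a sketch under stated assumptions: the snapshot struct and all names are hypothetical, the fields are expected to be filled in by the caller from the registers named in the checklist, all priorities are taken as seen from one Security state, the idle value of ICC_RPR_EL1 is assumed to be 0xFF, and the preemption rule is reduced to a plain priority comparison that ignores the binary point.

```c
#include <stdbool.h>
#include <stdint.h>

#define GIC_IDLE_PRIORITY 0xFFu   /* ICC_RPR_EL1 value with nothing active */

/* Hypothetical snapshot of one interrupt plus the CPU interface state. */
typedef struct {
    bool    pending;        /* bit in GICx_ISPENDRn                      */
    bool    active;         /* bit in GICx_ISACTIVERn                    */
    bool    enabled;        /* individual enable bit                     */
    bool    group_enabled;  /* group enabled in Distributor and CPU i/f  */
    uint8_t priority;       /* configured interrupt priority             */
    uint8_t pmr;            /* ICC_PMR_EL1                               */
    uint8_t rpr;            /* ICC_RPR_EL1                               */
} irq_snapshot_t;

typedef enum {
    IRQ_DIAG_SIGNALABLE,
    IRQ_DIAG_NOT_PENDING,
    IRQ_DIAG_ACTIVE,
    IRQ_DIAG_DISABLED,
    IRQ_DIAG_GROUP_DISABLED,
    IRQ_DIAG_MASKED_BY_PMR,
    IRQ_DIAG_BLOCKED_BY_RUNNING_PRIORITY
} irq_diag_t;

/* Walk the checklist in order and return the first failing condition. */
static irq_diag_t diagnose_irq(const irq_snapshot_t *s)
{
    if (!s->pending)        return IRQ_DIAG_NOT_PENDING;
    if (s->active)          return IRQ_DIAG_ACTIVE;  /* active, or active and pending */
    if (!s->enabled)        return IRQ_DIAG_DISABLED;
    if (!s->group_enabled)  return IRQ_DIAG_GROUP_DISABLED;

    /* Lower numeric value means higher priority; must be strictly below PMR. */
    if (s->priority >= s->pmr)
        return IRQ_DIAG_MASKED_BY_PMR;

    /* Simplified preemption rule: must beat the running priority. */
    if (s->rpr != GIC_IDLE_PRIORITY && s->priority >= s->rpr)
        return IRQ_DIAG_BLOCKED_BY_RUNNING_PRIORITY;

    return IRQ_DIAG_SIGNALABLE;
}
```

If the function returns IRQ_DIAG_SIGNALABLE and the handler still never runs, the remaining suspects are outside this model, such as the SCR_EL3 routing bits or the PSTATE interrupt masks at EL1.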
By following these steps, you can ensure that the ARM Cortex-A72 and GICv3 are correctly configured to handle interrupts during the transition from EL3 to EL1, avoiding the issues described in this post.