ARM Neoverse N2 SVE Vector Length Configuration Mismatch at EL0
The ARM Neoverse N2 processor, a high-performance core designed for infrastructure and cloud workloads, implements the Scalable Vector Extension (SVE) as part of its architecture. SVE allows the vector length to vary between implementations, and software can constrain it through the system registers ZCR_EL3, ZCR_EL2, and ZCR_EL1. A common point of confusion arises when the vector length requested in these registers does not match the vector length observed at Exception Level 0 (EL0), the unprivileged level at which application code runs. Specifically, despite setting the LEN field of ZCR_EL3, ZCR_EL2, and ZCR_EL1 to 0xf (a request for a 2048-bit vector length), the vector length at EL0 is reported as 128 bits. This discrepancy raises questions about hardware limitations, register behavior, and proper configuration practices.
The core of the issue lies in understanding the relationship between the ZCR_ELx registers, the hardware-implemented vector length, and how the processor enforces these settings across different exception levels. The Neoverse N2 Technical Reference Manual (TRM) explicitly states that the core implements a scalable vector length of 128 bits, which means the hardware is limited to this vector length. However, the behavior of the ZCR_ELx registers, particularly ZCR_EL3, complicates this understanding, as they can be written with values that imply support for larger vector lengths.
ZCR_ELx Register Behavior and Hardware Limitations
The ZCR_ELx registers are used to control the effective vector length (VL) for SVE operations. These registers are accessible at different exception levels (EL3, EL2, and EL1) and allow software to request a specific vector length by writing to the LEN field. However, the actual vector length implemented by the hardware may differ from the requested length due to architectural and microarchitectural constraints.
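The LEN encoding can be made concrete with a small helper. The following is a minimal sketch, assuming LEN occupies bits [3:0] of ZCR_ELx and that the requested length is (LEN + 1) * 128 bits; the function name zcr_len_to_bits is hypothetical and only illustrates the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the vector length requested by a ZCR_ELx value.
 * Assumption: LEN is bits [3:0]; the request is (LEN + 1) * 128 bits.
 * This is only the requested length, not what the hardware applies. */
static unsigned zcr_len_to_bits(uint64_t zcr)
{
    unsigned len = (unsigned)(zcr & 0xFu);   /* isolate the LEN field */
    assert(len <= 0xFu);                     /* always true after masking */
    return (len + 1u) * 128u;
}
```

With this helper, a register value of 0xf decodes to a 2048-bit request and a value of 0 decodes to a 128-bit request.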
In the case of the Neoverse N2, the hardware implements a fixed vector length of 128 bits. This means that regardless of the value written to ZCR_ELx.LEN, the effective vector length will always be 128 bits. The ZCR_ELx registers are not identification registers that reflect the hardware’s capabilities; they are control registers that cap the vector length. At a given exception level, the effective vector length is the largest length the hardware supports that does not exceed the lengths requested by ZCR_ELx at that level and at the higher exception levels.
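The clamping rule can be modeled in a few lines. This is a simplified sketch, not a description of the hardware logic: it assumes EL1, EL2, and EL3 are all implemented and all constrain the length seen at EL0, and it represents the implemented vector lengths as a hypothetical bitmask (supported_mask) in which bit i means that (i + 1) * 128 bits is supported. The function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }

/* Model of the effective SVE vector length at EL0, in bits.
 * len_el3/len_el2/len_el1: LEN fields of ZCR_EL3, ZCR_EL2, ZCR_EL1.
 * supported_mask: bit i set means (i + 1) * 128 bits is implemented
 * (hypothetical representation; bit 0, i.e. 128 bits, must be set). */
static unsigned effective_vl_bits(unsigned len_el3, unsigned len_el2,
                                  unsigned len_el1, uint16_t supported_mask)
{
    assert(supported_mask & 1u);

    /* The tightest request across the exception levels wins. */
    unsigned i = min_u(len_el3 & 0xFu, min_u(len_el2 & 0xFu, len_el1 & 0xFu));

    /* Step down to the largest implemented length not above the request.
     * Terminates because bit 0 is guaranteed to be set. */
    while (!(supported_mask & (1u << i)))
        --i;

    return (i + 1u) * 128u;
}
```

For a core that implements only 128 bits (supported_mask of 0x1, as this model would describe the Neoverse N2), setting all three LEN fields to 0xf still yields 128.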
A critical point of confusion arises from the behavior of ZCR_EL3. When ZCR_EL3 is read directly, it may return a LEN value of 0xf (2048 bits), which appears inconsistent with the hardware’s 128-bit limitation. The explanation is that ZCR_EL3.LEN holds a request, resets to an architecturally UNKNOWN value, and does not represent the hardware-supported vector length. The processor derives the effective vector length from the requested length and the hardware’s capabilities, whichever is smaller.
To accurately determine the effective vector length, software should use an instruction such as RDVL, which returns the current effective vector length in bytes multiplied by an immediate operand (RDVL Xd, #1 yields 16 on a 128-bit implementation). Because the result reflects what the processor actually applies at the current exception level, it provides a reliable runtime query that avoids the confusion caused by reading the ZCR_ELx registers.
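A query of this kind might look as follows in C with GCC or Clang inline assembly. This is a sketch under stated assumptions: the wrapper name sve_vl_bytes is hypothetical, the code is assumed to be compiled with SVE enabled (for example -march=armv8-a+sve) and run on a core where SVE access is permitted, and on any other build the wrapper simply returns 0 to signal that no measurement was made.

```c
#include <assert.h>
#include <stdint.h>

/* Return the effective SVE vector length in bytes at the current
 * exception level, or 0 if this build does not target SVE.
 * RDVL Xd, #1 writes (vector length in bytes) * 1 to Xd. */
static inline uint64_t sve_vl_bytes(void)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_SVE)
    uint64_t vl;
    __asm__ volatile("rdvl %0, #1" : "=r"(vl));
    return vl;
#else
    return 0;   /* not an SVE build: nothing to measure */
#endif
}
```

On a Neoverse N2, a caller would expect sve_vl_bytes() * 8 to equal 128 regardless of the LEN values programmed at the higher exception levels.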
Proper Configuration and Verification of SVE Vector Length
To ensure correct configuration and verification of the SVE vector length on the ARM Neoverse N2 processor, the following steps should be taken:
First, software should recognize that the Neoverse N2 core implements a fixed vector length of 128 bits. This means that any attempt to configure a larger vector length through the ZCR_ELx registers will not result in a change to the effective vector length. Instead, the processor will enforce the 128-bit limit regardless of the values written to these registers.
Second, when configuring the ZCR_ELx registers, software should write the appropriate value for the desired vector length, keeping in mind that the effective vector length will be the minimum of the requested length and the hardware-supported length. For the Neoverse N2, this means that writing any value to ZCR_ELx.LEN will result in an effective vector length of 128 bits.
Third, to verify the effective vector length, software should use the RDVL instruction, executed at the exception level of interest. RDVL returns the effective vector length in bytes, scaled by its immediate operand, so it confirms the vector length the processor is actually using. It should be used in preference to reading the ZCR_ELx registers directly, as the latter hold requested values that may not reflect the hardware’s capabilities.
Finally, software should be aware of the architectural behavior of the ZCR_ELx registers, particularly ZCR_EL3. These registers are control registers, not identification registers, and their values do not directly indicate the hardware-supported vector length. Instead, they serve to limit the effective vector length based on the requested length and the hardware’s capabilities. Understanding this distinction is crucial for correctly configuring and verifying the SVE vector length on the Neoverse N2 processor.
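The configuration and verification steps above can be sketched as a pair of helpers. Both names (zcr_len_for_bits, request_was_clamped) are hypothetical, and the measured value is assumed to come from an RDVL Xd, #1 executed at the exception level being checked; the sketch covers only the arithmetic, not the privileged register write itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Configuration step: LEN value to write to ZCR_ELx for a desired
 * vector length. Valid requests are multiples of 128 from 128 to 2048. */
static uint64_t zcr_len_for_bits(unsigned bits)
{
    assert(bits >= 128u && bits <= 2048u && bits % 128u == 0u);
    return (uint64_t)(bits / 128u - 1u);
}

/* Verification step: compare the request with the length measured by
 * RDVL (in bytes). Returns true when the processor applied a shorter
 * vector length than the one requested. */
static bool request_was_clamped(uint64_t zcr_len, uint64_t measured_vl_bytes)
{
    uint64_t requested_bytes = ((zcr_len & 0xFu) + 1u) * 16u;
    return measured_vl_bytes < requested_bytes;
}
```

For the scenario described here, zcr_len_for_bits(2048) gives 0xf, and request_was_clamped(0xf, 16) is true because the core reports 16 bytes (128 bits).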
In summary, the ARM Neoverse N2 processor implements a fixed SVE vector length of 128 bits, and attempts to configure a larger vector length through the ZCR_ELx registers will not change this limitation. The ZCR_ELx registers can only cap the effective vector length, and on this core the hardware enforces the 128-bit limit whatever value is requested. To verify the effective vector length, the RDVL instruction should be used, as it reports the vector length actually in effect rather than the requested one. By following these guidelines, software can ensure correct configuration and operation of the SVE vector length on the Neoverse N2 processor.