ARM Cortex-A L2ACTLR[7]

L2 Cache Stalls Due to Indefinite Memory Read Issues

The ARM Cortex-A series processors are widely used in embedded systems due to their high performance and efficiency. However, certain architectural nuances can lead to unexpected behavior, particularly in the memory subsystem. One such issue is the indefinite stalling of memory reads in the L2 cache, which can severely impact system performance and reliability. It is documented as ARM erratum 798870, which applies to the Cortex-A15 and describes a scenario where a memory read operation can stall indefinitely in the L2 cache under specific conditions.

The L2 cache is a critical component in the memory hierarchy, acting as an intermediary between the faster L1 cache and the slower main memory. When a memory read operation stalls in the L2 cache, it can cause the entire processor to wait, leading to significant performance degradation. This issue is particularly problematic in real-time systems where deterministic behavior is crucial.

The root cause of this issue lies in the interaction between the L2 cache controller and the memory subsystem. Under certain conditions, the L2 cache controller may fail to properly handle a memory read request, causing it to stall indefinitely. This behavior is not consistent and may only manifest under specific workloads or system configurations, making it difficult to diagnose and reproduce.

To mitigate this issue, ARM has provided a workaround that involves setting bit 7 of the L2 Auxiliary Control Register (L2ACTLR). Setting this bit (described in the Cortex-A15 documentation as enabling the hazard detect timeout) alters the behavior of the L2 cache controller to prevent the indefinite stalling of memory reads. However, the implementation of this workaround requires careful consideration of when and where to set this bit, as it can have implications for system initialization and runtime behavior.

L2ACTLR[7] Bit Configuration: SBL vs. Application-Level Setting

The L2ACTLR[7] bit can be set either during the initialization phase, typically in the Secondary Bootloader (SBL), or during runtime within the application itself. The choice between these two approaches depends on several factors, including the level of control over the bootloader, the specific requirements of the application, and the overall system architecture.

Setting L2ACTLR[7] in the Secondary Bootloader (SBL)

The SBL is responsible for initializing the hardware components and preparing the system for the execution of the main application. Setting the L2ACTLR[7] bit in the SBL ensures that the workaround is applied as early as possible in the system’s lifecycle. This approach is particularly beneficial in systems where the bootloader has full control over the hardware configuration and where the application may not have the necessary privileges to modify the L2ACTLR register.

However, setting the L2ACTLR[7] bit in the SBL requires access to and modification of the bootloader code. This may not always be feasible, especially in systems where the bootloader is provided by a third party or is otherwise inaccessible. Additionally, modifying the bootloader can introduce risks, as any errors in the bootloader code can prevent the system from booting altogether.

Setting L2ACTLR[7] in the Application

Alternatively, the L2ACTLR[7] bit can be set within the application itself. This approach provides greater flexibility, as the application can dynamically enable or disable the workaround based on runtime conditions. For example, the application could enable the workaround only when performing specific memory-intensive operations where the risk of L2 cache stalls is higher.

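Setting the L2ACTLR[7] bit in the application requires that the application has sufficient privileges to access and modify the L2ACTLR register. This may not be the case in all systems, particularly those with strict security constraints. The processor's Technical Reference Manual may also restrict when L2ACTLR can be written (for example, only from Secure state, or only while the L2 memory system is idle), which can rule out toggling the bit at runtime. The register write itself costs only a few instructions, so any measurable overhead would come from the changed cache-controller behavior rather than from the write.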
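As a rough illustration of the privilege requirement, the sketch below decodes the mode field of a CPSR value to decide whether the current mode is privileged. The helper names are hypothetical, and how the CPSR value is obtained is left to the platform. Being in a privileged mode is a necessary condition only; it does not guarantee that a write to L2ACTLR is permitted on a given core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPSR.M[4:0] mode field; 0b10000 is User mode in ARMv7-A. */
#define CPSR_MODE_MASK 0x1Fu
#define CPSR_MODE_USR  0x10u

/* Hypothetical helper: true when the CPSR value describes a privileged
 * mode, i.e. any mode other than User. */
static inline bool cpsr_is_privileged(uint32_t cpsr)
{
    return (cpsr & CPSR_MODE_MASK) != CPSR_MODE_USR;
}

/* Usage example with two illustrative CPSR values. */
static inline void cpsr_check_example(void)
{
    assert(cpsr_is_privileged(0x600001D3u));   /* mode bits 0x13: Supervisor */
    assert(!cpsr_is_privileged(0x60000010u));  /* mode bits 0x10: User */
}
```

An application running in User mode would have to ask privileged code to perform the write on its behalf instead of attempting it directly.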

Trade-offs and Considerations

The decision to set the L2ACTLR[7] bit in the SBL or the application involves trade-offs between early initialization, flexibility, and system security. In systems where the bootloader is under the developer’s control and where early initialization is critical, setting the bit in the SBL is the preferred approach. Conversely, in systems where flexibility and runtime control are more important, setting the bit in the application may be more appropriate.

Implementing the L2ACTLR[7] Workaround: Best Practices and Step-by-Step Guide

Implementing the L2ACTLR[7] workaround requires a thorough understanding of the system architecture and careful consideration of the timing and location of the bit setting. Below is a detailed guide on how to implement this workaround, including best practices and potential pitfalls.

Step 1: Identify the System Configuration

Before implementing the workaround, it is essential to understand the system configuration, including the specific ARM Cortex-A processor being used, the memory hierarchy, and the bootloader and application software. This information will help determine the most appropriate location for setting the L2ACTLR[7] bit.
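One way to identify the processor programmatically is to decode the Main ID Register (MIDR), which privileged code can read with `MRC p15, 0, <Rt>, c0, c0, 0`. The sketch below decodes a MIDR value that has already been read; the struct and function names are illustrative, and the set of affected revisions is deliberately not encoded here because it should be taken from the errata notice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIDR_IMPLEMENTER_ARM    0x41u   /* ASCII 'A' */
#define MIDR_PARTNUM_CORTEX_A15 0xC0Fu

/* Illustrative container for the decoded MIDR fields. */
struct cpu_id {
    uint8_t  implementer;  /* MIDR[31:24] */
    uint8_t  variant;      /* MIDR[23:20], the "r" in rNpM */
    uint16_t part;         /* MIDR[15:4]  */
    uint8_t  revision;     /* MIDR[3:0],  the "p" in rNpM */
};

static inline struct cpu_id midr_decode(uint32_t midr)
{
    struct cpu_id id;
    id.implementer = (uint8_t)(midr >> 24);
    id.variant     = (uint8_t)((midr >> 20) & 0xFu);
    id.part        = (uint16_t)((midr >> 4) & 0xFFFu);
    id.revision    = (uint8_t)(midr & 0xFu);
    return id;
}

static inline bool midr_is_cortex_a15(uint32_t midr)
{
    struct cpu_id id = midr_decode(midr);
    return id.implementer == MIDR_IMPLEMENTER_ARM &&
           id.part == MIDR_PARTNUM_CORTEX_A15;
}

/* Usage example: 0x412FC0F2 decodes as a Cortex-A15 r2p2. */
static inline void midr_example(void)
{
    struct cpu_id id = midr_decode(0x412FC0F2u);
    assert(midr_is_cortex_a15(0x412FC0F2u));
    assert(id.variant == 2 && id.revision == 2);
}
```

Comparing the decoded variant and revision against the range listed in the errata notice lets the bootloader or application apply the workaround only on affected parts.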

Step 2: Accessing the L2ACTLR Register

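The L2ACTLR register is a privileged register: it cannot be accessed from User mode, only from privileged modes such as Supervisor mode or Hypervisor mode. Depending on the core and its configuration, write access may be further restricted to Secure state, so the processor's Technical Reference Manual should be consulted for the exact access rules. To modify this register, the system must be in a privileged state, and the appropriate access controls must be in place.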

In the SBL, this typically involves writing to the register during the early initialization phase, before the operating system or application has started. In the application, this typically means requesting the write through a system call or secure monitor call handled by privileged code, rather than writing the register directly.

Step 3: Setting the L2ACTLR[7] Bit

Once the system is in a privileged state, the L2ACTLR[7] bit can be set with a read-modify-write of the register. The exact method will depend on the development environment and tools being used. The example below uses the Cortex-A15 CP15 encoding of L2ACTLR (opc1 = 1, CRn = c15, CRm = c0, opc2 = 0); confirm the encoding against the Technical Reference Manual for the core in use:

MRC p15, 1, r0, c15, c0, 0  // Read L2ACTLR into r0
ORR r0, r0, #0x80            // Set bit 7
MCR p15, 1, r0, c15, c0, 0  // Write modified value back to L2ACTLR
ISB                          // Ensure the write takes effect before continuing

This code reads the current value of the L2ACTLR register, sets bit 7, writes the modified value back, and then issues an instruction synchronization barrier so that the change is visible to subsequent instructions.
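The same read-modify-write can be sketched in C with GCC-style inline assembly. This is a minimal sketch, assuming the Cortex-A15 encoding of L2ACTLR (`p15, 1, <Rt>, c15, c0, 0`) and a GCC or Clang toolchain; all function and macro names are illustrative. The register accessors are compiled only for ARMv7 targets, while the bit manipulation is kept in a pure helper that can be checked on any host.

```c
#include <assert.h>
#include <stdint.h>

#define L2ACTLR_BIT7 (1u << 7)  /* the bit named by the erratum workaround */

/* Pure helper: the value to write back, with bit 7 set and every
 * other bit preserved. */
static inline uint32_t l2actlr_with_workaround(uint32_t value)
{
    return value | L2ACTLR_BIT7;
}

#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
/* Assumed encoding: Cortex-A15 L2ACTLR. Must run in a mode and
 * security state that is allowed to access the register. */
static inline uint32_t l2actlr_read(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 1, %0, c15, c0, 0" : "=r"(v));
    return v;
}

static inline void l2actlr_write(uint32_t v)
{
    __asm__ volatile("mcr p15, 1, %0, c15, c0, 0\n\tisb" : : "r"(v) : "memory");
}

/* Writes the register only when bit 7 is not already set. */
static inline void l2actlr_apply_workaround(void)
{
    uint32_t old_value = l2actlr_read();
    uint32_t new_value = l2actlr_with_workaround(old_value);
    if (new_value != old_value)
        l2actlr_write(new_value);
}
#endif

/* Usage example for the pure helper. */
static inline void l2actlr_value_example(void)
{
    assert(l2actlr_with_workaround(0x00000000u) == 0x00000080u);
    assert(l2actlr_with_workaround(0x00000080u) == 0x00000080u);
}
```

Skipping the write when the bit is already set keeps the routine safe to call from both the bootloader and later code without touching the register twice.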

Step 4: Verifying the Workaround

After setting the L2ACTLR[7] bit, it is important to verify that the workaround has been successfully applied. The most direct check is to read L2ACTLR back and confirm that bit 7 is set. Beyond that, the system can be monitored for L2 cache stalls, and performance analysis tools can be used to measure the impact on memory access times.
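A simple form of verification is to test bit 7 in a value read back from L2ACTLR and record the result in a boot log. The sketch below operates on a value that has already been read; the function names and the message format are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define L2ACTLR_BIT7 (1u << 7)

/* True when the read-back register value has bit 7 set. */
static inline bool l2actlr_workaround_applied(uint32_t l2actlr)
{
    return (l2actlr & L2ACTLR_BIT7) != 0;
}

/* Formats a one-line status message; returns the snprintf result. */
static inline int l2actlr_format_status(char *buf, size_t len, uint32_t l2actlr)
{
    return snprintf(buf, len, "L2ACTLR=0x%08lx erratum 798870 workaround: %s",
                    (unsigned long)l2actlr,
                    l2actlr_workaround_applied(l2actlr) ? "applied" : "MISSING");
}

/* Usage example. */
static inline void l2actlr_status_example(void)
{
    char line[80];
    l2actlr_format_status(line, sizeof line, 0x00000080u);
    assert(strstr(line, "applied") != NULL);
}
```

A read-back check confirms only that the bit is set; it does not by itself show that the stall can no longer occur.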

Step 5: Testing and Validation

The final step is to thoroughly test the system to ensure that the workaround has resolved the issue without introducing new problems. This should include stress testing the memory subsystem and verifying that the system behaves as expected under various workloads.
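As a starting point for memory stress testing, the following sketch fills a buffer with a known pattern and then reads it back with a configurable stride over several passes, counting mismatches. The function names, pattern, and parameters are illustrative. A genuine stall would hang the loop rather than return, so on real hardware a test like this would normally run under a watchdog timer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Deterministic pattern so that every word can be re-derived on read. */
static inline uint32_t stress_pattern(size_t i)
{
    return (uint32_t)i * 2654435761u;
}

/* Returns the number of mismatching words, or -1 on bad arguments or
 * allocation failure. 'stride' is in bytes. */
static inline long mem_read_stress(size_t bytes, size_t stride, unsigned passes)
{
    size_t words = bytes / sizeof(uint32_t);
    size_t step = stride / sizeof(uint32_t);
    if (words == 0 || step == 0)
        return -1;

    /* volatile keeps the compiler from optimizing the reads away */
    volatile uint32_t *buf = malloc(words * sizeof(uint32_t));
    if (buf == NULL)
        return -1;

    for (size_t i = 0; i < words; i++)
        buf[i] = stress_pattern(i);

    long mismatches = 0;
    for (unsigned p = 0; p < passes; p++)
        for (size_t start = 0; start < step; start++)
            for (size_t i = start; i < words; i += step)
                if (buf[i] != stress_pattern(i))
                    mismatches++;

    free((void *)buf);
    return mismatches;
}

/* Usage example: 64 KiB buffer, 64-byte stride, two passes. */
static inline void mem_read_stress_example(void)
{
    assert(mem_read_stress(64u * 1024u, 64u, 2u) == 0);
}
```

Choosing a buffer larger than the L2 cache and a stride equal to the cache line size forces a steady stream of L2 misses, which is the kind of traffic the workaround is meant to protect.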

Best Practices

Early Initialization: If possible, set the L2ACTLR[7] bit as early as possible in the system initialization process to minimize the risk of L2 cache stalls.
Privilege Management: Ensure that the system has the necessary privileges to access and modify the L2ACTLR register, and consider the security implications of elevating privilege levels.
Testing: Thoroughly test the system after applying the workaround to ensure that it has resolved the issue and that no new issues have been introduced.

Potential Pitfalls

Bootloader Modifications: Modifying the bootloader can be risky, as errors in the bootloader code can prevent the system from booting. Ensure that any modifications are thoroughly tested.
Runtime Overhead: The register write itself costs only a few instructions, but the altered L2 cache behavior may affect memory latency. Measure the impact on system performance when deciding where to set the bit.
Security Implications: Elevating privilege levels to access the L2ACTLR register can have security implications. Ensure that the system is properly secured and that access to privileged registers is tightly controlled.

Conclusion

The indefinite stalling of memory reads in the L2 cache is a complex issue that can have significant implications for system performance and reliability. By understanding the root cause of the issue and carefully implementing the recommended workaround, developers can mitigate the risk of L2 cache stalls and ensure that their systems operate as intended. Whether setting the L2ACTLR[7] bit in the SBL or the application, it is essential to consider the trade-offs and follow best practices to achieve the best possible outcome.
