ARM Cortex-M7 LDREX Instruction Causing Bus Fault in Multi-Core Environment
The ARM Cortex-M7 is a high-performance embedded processor designed for real-time applications. It supports the exclusive access instructions LDREX (Load Exclusive) and STREX (Store Exclusive), which are used to implement atomic operations in multi-core or multi-threaded environments. However, in a system with multiple Cortex-M7 cores and Cortex-R8 clusters, the LDREX instruction can occasionally trigger a Hard Fault. The Configurable Fault Status Register (CFSR) identifies it as a Bus Fault (a Bus Fault escalates to a Hard Fault when the BusFault exception is not enabled). The issue is perplexing because the fault does not occur on the first attempt to acquire a spinlock but on subsequent attempts. This behavior suggests a subtle interaction between the Cortex-M7’s memory system, the exclusive access mechanism, and the multi-core environment.
The memory region being accessed by the LDREX instruction is configured with Device or Strongly Ordered memory attributes, which are typically used for memory-mapped peripherals or shared resources. These memory types impose strict ordering and access rules, which can complicate the behavior of exclusive access instructions. The fault occurs intermittently, making it difficult to reproduce and diagnose. This post will explore the underlying causes of this issue and provide detailed troubleshooting steps to resolve it.
Memory Attribute Mismatch and Exclusive Access Constraints
The primary cause of the Hard Fault triggered by the LDREX instruction lies in the interaction between the Cortex-M7’s exclusive access mechanism and the memory attributes of the target address. The ARM architecture imposes specific requirements for the memory types that can be used with exclusive access instructions. According to the ARMv7-M Architecture Reference Manual, exclusive access instructions are only supported in Normal memory regions. Device and Strongly Ordered memory types are not supported for exclusive accesses.
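To check which memory type an MPU region actually selects, the TEX, C and B fields of its MPU_RASR value can be decoded. The sketch below is a minimal host-side helper that assumes the ARMv7-M MPU attribute encoding; the names `mem_type_t`, `classify_rasr` and `exclusives_allowed` are hypothetical and not part of CMSIS or any vendor library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory types selectable through the ARMv7-M MPU attribute fields. */
typedef enum {
    MEM_STRONGLY_ORDERED,
    MEM_DEVICE,
    MEM_NORMAL,
    MEM_RESERVED   /* reserved or implementation-defined encoding */
} mem_type_t;

/* MPU_RASR attribute field positions. */
#define RASR_B          (1u << 16)
#define RASR_C          (1u << 17)
#define RASR_TEX_SHIFT  19u
#define RASR_TEX_MASK   (0x7u << RASR_TEX_SHIFT)

/* Decode TEX/C/B from a raw MPU_RASR value into a memory type. */
mem_type_t classify_rasr(uint32_t rasr)
{
    uint32_t tex = (rasr & RASR_TEX_MASK) >> RASR_TEX_SHIFT;
    bool c = (rasr & RASR_C) != 0;
    bool b = (rasr & RASR_B) != 0;

    if (tex & 0x4u) {
        return MEM_NORMAL;                 /* TEX=1BB: Normal, cached */
    }
    switch (tex) {
    case 0:                                /* TEX=000 */
        if (!c) {
            return b ? MEM_DEVICE : MEM_STRONGLY_ORDERED;
        }
        return MEM_NORMAL;                 /* write-through or write-back */
    case 1:                                /* TEX=001 */
        /* C=B=0: non-cacheable, C=B=1: write-back, write-allocate */
        return (c == b) ? MEM_NORMAL : MEM_RESERVED;
    case 2:                                /* TEX=010 */
        return (!c && !b) ? MEM_DEVICE : MEM_RESERVED;
    default:
        return MEM_RESERVED;
    }
}

/* LDREX/STREX should target Normal memory only. */
bool exclusives_allowed(uint32_t rasr)
{
    return classify_rasr(rasr) == MEM_NORMAL;
}
```

On a target, the value would come from reading MPU->RASR after selecting the region with MPU->RNR. If no MPU region covers the spinlock address, the default memory map decides the type instead, so this check only applies to addresses that an enabled region covers.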
When the LDREX instruction is executed on a memory address marked as Device or Strongly Ordered, the processor may generate a Bus Fault. Exclusive accesses to these memory types fall outside what the architecture guarantees, so the outcome depends on the implementation and on how the bus fabric and the target respond to an exclusive transaction; a target that rejects the transaction with an error response surfaces as a Bus Fault. The exclusive monitor is the hardware that tracks exclusive access requests and ensures atomicity. If no monitor covers the target address, the processor cannot guarantee the atomicity of the operation, even in cases where no fault is raised.
In the described system, the memory address used for the spinlock is configured as Device or Strongly Ordered memory. This configuration is likely chosen to ensure strict ordering of accesses to shared resources across multiple cores. However, this choice inadvertently violates the architectural constraints for exclusive access instructions. The intermittent nature of the fault can be attributed to the timing of access attempts by multiple cores, which may occasionally align in a way that exposes the unsupported memory type.
Another contributing factor is the multi-core environment itself. The Cortex-M7 cores and Cortex-R8 clusters share access to the same memory regions, including the spinlock address. Each core has its own local exclusive monitor, but exclusive accesses to shared locations also depend on a global monitor in the interconnect or memory system to arbitrate between cores. If the memory type is incompatible with exclusive access, this coordination can fail, leading to a Bus Fault. One plausible reason the fault does not occur on the first attempt is that the first, uncontended access is handled without complaint, while later attempts made under contention expose the underlying issue.
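When diagnosing the fault, it helps to confirm from the CFSR that it really is a Bus Fault and whether the faulting address was captured. The following is a minimal sketch that decodes the BusFault status bits (CFSR bits 8 to 15) from raw register values; `bus_fault_info_t` and `decode_bus_fault` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* BusFault Status Register bits, located in CFSR[15:8]. */
#define CFSR_IBUSERR      (1u << 8)   /* instruction bus error */
#define CFSR_PRECISERR    (1u << 9)   /* precise data bus error */
#define CFSR_IMPRECISERR  (1u << 10)  /* imprecise data bus error */
#define CFSR_UNSTKERR     (1u << 11)  /* fault on exception return unstacking */
#define CFSR_STKERR       (1u << 12)  /* fault on exception entry stacking */
#define CFSR_LSPERR       (1u << 13)  /* fault on lazy FP state preservation */
#define CFSR_BFARVALID    (1u << 15)  /* BFAR holds the faulting address */
#define CFSR_BUS_ERRORS   0x00003F00u /* all error bits above, without BFARVALID */

typedef struct {
    bool     bus_fault;      /* any BusFault error bit is set */
    bool     precise;        /* fault is tied to the faulting instruction */
    bool     imprecise;      /* fault was reported after the access retired */
    bool     address_valid;  /* 'address' is meaningful */
    uint32_t address;        /* faulting address, or 0 if not captured */
} bus_fault_info_t;

/* Decode raw CFSR and BFAR values (SCB->CFSR and SCB->BFAR in CMSIS). */
bus_fault_info_t decode_bus_fault(uint32_t cfsr, uint32_t bfar)
{
    bus_fault_info_t info;

    info.bus_fault     = (cfsr & CFSR_BUS_ERRORS) != 0;
    info.precise       = (cfsr & CFSR_PRECISERR) != 0;
    info.imprecise     = (cfsr & CFSR_IMPRECISERR) != 0;
    info.address_valid = (cfsr & CFSR_BFARVALID) != 0;
    info.address       = info.address_valid ? bfar : 0u;
    return info;
}
```

In a fault handler, the two arguments would be read from SCB->CFSR and SCB->BFAR before any status bits are cleared. A precise fault with a valid address that matches the spinlock location would support the diagnosis described above; an imprecise fault gives no address and needs other evidence.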
Reconfiguring Memory Attributes and Implementing Proper Synchronization
To resolve the Hard Fault triggered by the LDREX instruction, the memory attributes of the spinlock address must be reconfigured to comply with the architectural requirements for exclusive access. The target memory region should be marked as Normal memory, which supports the exclusive monitor mechanism. This change ensures that the LDREX and STREX instructions can operate correctly without generating a Bus Fault.
The first step is to identify the memory region containing the spinlock address and update its attributes in the Memory Protection Unit (MPU) or system memory map. The MPU is a programmable unit that defines the memory attributes for different regions of the address space. By configuring the spinlock region as Normal memory, the exclusive access instructions will function as intended. The following table summarizes the required memory attributes:
Memory Attribute | Value for Normal Memory |
---|---|
Type | Normal |
Cacheable | Yes |
Bufferable | Yes |
Shareable | Yes |
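As an illustration of the table, the attributes can be packed into an MPU_RASR value. This is a sketch under stated assumptions: it uses the ARMv7-M MPU_RASR field layout, grants full read/write access, marks the region execute-never, and offers a non-cacheable variant. The function name `rasr_normal_shared` and the macro names are hypothetical.

```c
#include <stdint.h>

/* MPU_RASR field encodings (ARMv7-M MPU). */
#define RASR_ENABLE      (1u << 0)
#define RASR_SIZE(log2)  ((((uint32_t)(log2)) - 1u) << 1) /* region = 2^log2 bytes */
#define RASR_B           (1u << 16)
#define RASR_C           (1u << 17)
#define RASR_S           (1u << 18)
#define RASR_TEX(x)      (((uint32_t)(x)) << 19)
#define RASR_AP_FULL     (3u << 24)  /* read/write, privileged and unprivileged */
#define RASR_XN          (1u << 28)  /* data only: never execute */

/*
 * Build an MPU_RASR value for a Normal, Shareable data region.
 * log2_bytes: region size as a power of two, 5 (32 bytes) to 32 (4 GB).
 * cacheable:  nonzero for write-back, write-allocate; zero for non-cacheable.
 * Returns 0 (region disabled) if the size is out of range.
 */
uint32_t rasr_normal_shared(unsigned log2_bytes, int cacheable)
{
    uint32_t attr;

    if (log2_bytes < 5u || log2_bytes > 32u) {
        return 0u;
    }
    attr = cacheable ? (RASR_TEX(1) | RASR_C | RASR_B)  /* Normal, WBWA */
                     : RASR_TEX(1);                      /* Normal, non-cacheable */
    return RASR_XN | RASR_AP_FULL | attr | RASR_S |
           RASR_SIZE(log2_bytes) | RASR_ENABLE;
}
```

For example, `rasr_normal_shared(5, 0)` yields `0x130C0009` for a 32-byte non-cacheable region, and `rasr_normal_shared(5, 1)` yields `0x130F0009` for the cacheable setting in the table. On a target, the value would be written to MPU->RASR after MPU->RNR and MPU->RBAR are set, with the base address aligned to the region size, followed by DSB and ISB.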
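After reconfiguring the memory attributes, the system should be tested to verify that the Hard Fault no longer occurs. The Cacheable setting deserves care: by default the Cortex-M7 does not cache Shareable Normal memory in its L1 data cache (this changes only if CACR.SIWT is set), and a lock word shared with cores that are not hardware-coherent with the Cortex-M7 should not be held in a cache at all. It is also important to ensure that the spinlock implementation correctly handles concurrent access from multiple cores. The spinlock function should include appropriate memory barriers to enforce ordering and prevent race conditions.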
In addition to reconfiguring the memory attributes, the spinlock implementation should be reviewed to ensure proper synchronization. The following code snippet demonstrates a corrected spinlock implementation using LDREX and STREX:
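    #include <stdint.h>
    // __LDREXW, __STREXW, __CLREX, __DMB, __DSB, __WFE and __SEV are CMSIS-Core
    // intrinsics; include the device header (which pulls in core_cm7.h) first.

    void HalCpu_SpinLock(volatile uint32_t *lock) {
        while (1) {
            // Read the lock word and set the exclusive monitor
            if (__LDREXW(lock) == 0) {
                // Lock looks free: try to claim it
                if (__STREXW(1, lock) == 0) {
                    __DMB(); // Keep critical-section accesses after the acquire
                    return;
                }
                // STREX lost the reservation: retry at once instead of sleeping
                continue;
            }
            __CLREX(); // Lock is held: drop the reservation before waiting
            __WFE();   // Sleep until an event (for example SEV) or an interrupt
        }
    }

    void HalCpu_SpinUnlock(volatile uint32_t *lock) {
        __DMB();   // Order critical-section accesses before the release
        *lock = 0;
        __DSB();   // Ensure the store has completed before signalling
        __SEV();   // Wake cores waiting in WFE
    }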
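This implementation uses memory barriers to order memory accesses and the WFE instruction to reduce power consumption while waiting for the lock. The DMB instruction ensures that explicit memory accesses before the barrier are observed before any that follow it. The DSB instruction is stronger: it does not complete until all earlier explicit memory accesses have completed, and no later instruction executes until it does, so the store that clears the lock has finished before SEV runs. The SEV instruction signals waiting cores that the lock has been released. Whether that event reaches the other Cortex-M7 cores and the Cortex-R8 clusters depends on how the SoC routes event signals between them, so this should be confirmed for the specific device.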
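The same acquire and release ordering can also be expressed with C11 atomics, which is convenient for exercising the locking logic on a host before running on hardware. The sketch below is an alternative formulation, not the implementation above: the names `spinlock_t`, `spin_trylock`, `spin_lock` and `spin_unlock` are hypothetical, and it omits the WFE/SEV power-saving step. GCC and Clang typically lower these operations to LDREX/STREX loops with barriers on ARMv7-M, so the Normal-memory requirement applies equally.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef _Atomic uint32_t spinlock_t;   /* 0 = free, 1 = held */

/* Try once to take the lock; returns true on success. */
bool spin_trylock(spinlock_t *lock)
{
    uint32_t expected = 0u;
    /* Acquire ordering on success keeps the critical section after the lock. */
    return atomic_compare_exchange_strong_explicit(
        lock, &expected, 1u,
        memory_order_acquire, memory_order_relaxed);
}

/* Spin until the lock is taken. */
void spin_lock(spinlock_t *lock)
{
    while (!spin_trylock(lock)) {
        /* Wait with plain loads so the retry happens only when it can succeed. */
        while (atomic_load_explicit(lock, memory_order_relaxed) != 0u) {
            /* busy-wait */
        }
    }
}

/* Release the lock; release ordering keeps the critical section before it. */
void spin_unlock(spinlock_t *lock)
{
    atomic_store_explicit(lock, 0u, memory_order_release);
}
```

A lock would be declared as `spinlock_t lock = 0;` and passed by address to these functions. Whether atomics compiled for the Cortex-M7 interoperate with code on the Cortex-R8 clusters still depends on the shared memory attributes and the system's exclusive monitor, so this variant does not remove the need for the memory reconfiguration described earlier.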
Finally, the system should be tested under heavy load to verify that the spinlock implementation works correctly in a multi-core environment. This testing should include scenarios with high contention for the spinlock to ensure that the exclusive access mechanism handles concurrent access without generating faults.
By reconfiguring the memory attributes and implementing proper synchronization, the Hard Fault triggered by the LDREX instruction can be resolved, ensuring reliable operation of the multi-core system.