ARM Cortex-A72 Generic Timer Counter Anomalies During Multi-Core Synchronization
The ARM Cortex-A72 processor, an implementation of the ARMv8-A architecture, is widely used in high-performance embedded systems. One of its critical components is the ARM Generic Timer, which provides a system-wide synchronized counter for timing and scheduling purposes. However, a recurring issue has been observed where the timer counter appears to "back off" or decrement under specific multi-core synchronization scenarios. The issue shows up when multiple cores read the physical counter (CNTPCT, accessed as CNTPCT_EL0 in AArch64) in a coordinated sequence: a read on one core returns a value lower than an earlier read taken on another core.
The core of the problem lies in the interaction between the ARM Generic Timer’s counter and the multi-core synchronization mechanisms. The ARM Generic Timer is designed to provide a monotonically increasing counter, but certain hardware behaviors and software synchronization patterns can lead to unexpected results. Specifically, when Core 0 reads the CNTPCT register and stores the value in a shared global variable (core0_ticks), other cores (Core 1/2/3) read this global variable and then read their own CNTPCT values. In some cases, the subsequent CNTPCT read on Core 1/2/3 returns a value less than the value read from core0_ticks, which violates the expected monotonic behavior of the timer.
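The comparison that exposes the anomaly can be sketched in a few lines of C. This is only an illustration of the check described above: the names `sample_check` and `check_sample` are hypothetical, and the two tick values are passed in as plain integers so the logic can be exercised on any host.

```c
#include <stdint.h>

/* Result of comparing Core 0's published sample with a later local read. */
struct sample_check {
    int      went_backward;  /* 1 if the later read is smaller than the earlier one */
    uint64_t backward_by;    /* size of the backward step in ticks, 0 otherwise */
};

/* published: value Core 0 stored in core0_ticks.
 * local:     CNTPCT value the observing core read afterwards.
 * A monotonic, system-wide counter implies local >= published. */
static struct sample_check check_sample(uint64_t published, uint64_t local)
{
    struct sample_check result = {0, 0};
    if (local < published) {
        result.went_backward = 1;
        result.backward_by = published - local;
    }
    return result;
}
```

Recording the size of the backward step, rather than only the fact that it happened, makes it easier to tell a one-tick ordering problem from a larger glitch of the kind the errata describe.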
This issue is not unique to the Cortex-A72 and has been documented in other ARM-based systems, such as the NXP LS1043A and HiSilicon Hikey960, where similar timer counter anomalies have been observed. These anomalies are often attributed to hardware errata, such as FSL Erratum A-008585 and HiSilicon Erratum 161010101, which describe scenarios where the timer counter may return erroneous values during specific read sequences.
The implications of this issue are significant for real-time systems, where accurate timing is critical. For example, in a multi-core system where tasks are scheduled based on the timer counter, a backoff in the timer value could lead to incorrect task ordering or missed deadlines. Additionally, debugging such issues can be challenging, as the problem may occur sporadically and depend on the precise timing of operations across multiple cores.
Timer Counter Read Anomalies Due to Hardware Errata and Synchronization Gaps
The root cause of the ARM Cortex-A72 Generic Timer backoff issue can be traced to a combination of hardware limitations and software synchronization challenges. The primary factors contributing to this issue include hardware errata, memory synchronization gaps, and the inherent behavior of the ARM Generic Timer counter.
Hardware Errata
Some SoCs built around ARMv8-A cores are affected by vendor-documented errata that impact the behavior of the Generic Timer counter. For instance, FSL Erratum A-008585 states that the ARM Generic Timer counter may contain an erroneous value for a small number of core clock cycles when the timer value changes. This can result in a consecutive counter read returning a lower value than the previous read, effectively causing the timer to appear to go backward. Similarly, HiSilicon Erratum 161010101 describes a scenario where the timer counter may return an incorrect value during transitions, with the error value being larger than the correct one by a specific margin (e.g., 32 ticks).
These errata highlight a fundamental limitation in the timer hardware, where the counter value is not guaranteed to be stable during certain transitions. This instability can be exacerbated in multi-core systems, where cores may read the counter at slightly different times, leading to inconsistencies.
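The 32-tick margin mentioned above suggests a plausibility check on two consecutive reads. The following is a minimal sketch of that idea, not a vendor-supplied implementation: `counter_fn`, `read_counter_margin32`, the retry bound, and the scripted counter are all assumptions introduced so the logic can run on a host without the affected hardware.

```c
#include <stdint.h>

typedef uint64_t (*counter_fn)(void);   /* hypothetical counter source */

/* Read twice; accept the second value only if it is ahead of the first by
 * fewer than 32 ticks. A larger or negative step is treated as a glitch.
 * Gives up after max_retries pairs and returns the last value read. */
static uint64_t read_counter_margin32(counter_fn read, int max_retries)
{
    uint64_t older, newer;
    do {
        older = read();
        newer = read();
        max_retries--;
    } while (((newer - older) >> 5) != 0 && max_retries > 0);
    return newer;
}

/* Scripted stand-in for the hardware counter, used only to exercise the
 * logic: the second value (140) models a glitched read. */
static const uint64_t script[] = {100, 140, 101, 102};
static unsigned script_pos;

static uint64_t scripted_counter(void)
{
    uint64_t value = script[script_pos];
    if (script_pos + 1 < sizeof script / sizeof script[0])
        script_pos++;
    return value;
}
```

The unsigned subtraction means a second read that is lower than the first produces a very large difference, so backward steps are rejected by the same test as forward jumps of 32 ticks or more.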
Memory Synchronization Gaps
Another contributing factor is the lack of proper memory synchronization between cores. In the described scenario, Core 0 writes the CNTPCT value to a global variable (core0_ticks), and other cores read this variable before reading their own CNTPCT values. However, there is no explicit synchronization mechanism to ensure that the other cores wait for Core 0 to complete its write before proceeding. This can lead to race conditions where Core 1/2/3 read core0_ticks before it has been updated, or read the CNTPCT value before Core 0 has completed its write.
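The use of Data Synchronization Barriers (DSBs) in the pseudo-code is a step in the right direction, but it is not sufficient on its own. A DSB ensures that earlier memory accesses have completed before execution continues, but it does not make one core wait for another. In addition, a CNTPCT read is a system register read rather than a memory access, and the architecture allows it to be performed speculatively and out of order relative to other instructions on the same core. When a counter read must happen after a value has been loaded from memory, an Instruction Synchronization Barrier (ISB) is placed between the load and the counter read. Without inter-core synchronization and this ordering, cores may still observe outdated or inconsistent values.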
Timer Counter Behavior
The ARM Generic Timer counter is designed to increment at a fixed frequency, but its behavior during read operations can be influenced by the underlying hardware implementation. In some cases, the counter may be buffered or pipelined, leading to delays between the actual counter value and the value returned by a read operation. This can result in scenarios where a read operation returns a value that is slightly behind the current counter value, especially if the read occurs during a counter transition.
Additionally, the timer counter may be affected by clock domain crossings, where the counter value is transferred between different clock domains within the processor. These crossings can introduce small delays or glitches, further contributing to the observed anomalies.
Implementing Robust Timer Counter Reads with Synchronization and Workarounds
To address the ARM Cortex-A72 Generic Timer backoff issue, a combination of software workarounds and synchronization techniques can be employed. These solutions aim to mitigate the impact of hardware errata and ensure consistent timer counter reads across multiple cores.
Double-Read Workaround
One effective workaround, as suggested by FSL Erratum A-008585, is to implement a double-read mechanism for the timer counter. This involves reading the CNTPCT register twice and only using the value if the two reads return the same result. If the values differ, the read operation is repeated until a consistent value is obtained. This approach helps to filter out erroneous values caused by counter transitions or hardware glitches.
The double-read workaround can be implemented as follows:
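#include <stdint.h>
/* Raw read of the physical counter (AArch64 only; GCC/Clang inline asm). */
static inline uint64_t read_CNTPCT(void) {
    uint64_t value;
    __asm__ volatile("mrs %0, cntpct_el0" : "=r"(value) : : "memory");
    return value;
}
uint64_t read_counter_safe(void) {
    uint64_t first_read, second_read;
    __asm__ volatile("isb" : : : "memory");  /* order the reads after earlier code */
    do {
        first_read = read_CNTPCT();
        second_read = read_CNTPCT();
    } while (first_read != second_read);     /* retry until two reads agree */
    return first_read;
}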
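This ensures that the returned value was observed on two consecutive reads, which filters out transient errors. However, the loop adds latency, and it only terminates promptly when the counter increments more slowly than two back-to-back reads complete. If the counter changes between nearly every pair of reads, the loop can spin for a long time.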
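One way to keep the double-read loop from spinning indefinitely is to cap the number of attempts and report failure to the caller. The sketch below is an assumption-laden variant rather than part of the erratum workaround: `counter_fn`, `read_counter_bounded`, and the two stand-in counters are hypothetical names used to exercise the logic on a host.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t (*counter_fn)(void);   /* hypothetical counter source */

/* Returns true and stores a stable value in *out when two consecutive reads
 * agree within max_attempts pairs; returns false otherwise so the caller can
 * decide how to recover instead of spinning forever. */
static bool read_counter_bounded(counter_fn read, unsigned max_attempts,
                                 uint64_t *out)
{
    for (unsigned i = 0; i < max_attempts; i++) {
        uint64_t first = read();
        uint64_t second = read();
        if (first == second) {
            *out = first;
            return true;
        }
    }
    return false;
}

/* Stand-ins used only to exercise the logic on a host machine. */
static uint64_t ticking_counter(void)   /* changes on every read */
{
    static uint64_t ticks;
    return ticks++;
}

static uint64_t slow_counter(void)      /* changes on every fourth read */
{
    static uint64_t reads;
    return (reads++) / 4;
}
```

With `ticking_counter` no pair of reads ever matches, so the function returns false after the configured number of attempts; with `slow_counter` the first pair already agrees. What the caller does on failure (retry later, fall back to the last good value, log an error) is a design decision this sketch leaves open.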
Inter-Core Synchronization
To address the memory synchronization gaps between cores, explicit inter-core synchronization mechanisms can be implemented. One approach is to use a shared flag variable to coordinate the timing of counter reads. For example, Core 0 can set a flag after writing the CNTPCT value to core0_ticks, and Core 1/2/3 can wait for this flag to be set before reading core0_ticks and their own CNTPCT values.
The following pseudo-code illustrates this approach:
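/* flag == 1: Core 0 may publish a new sample.
 * flag == 0: a sample is ready for Core 1/2/3.
 * Starts at 1; the original value of -1 matched neither branch, so no core
 * would ever have made progress. */
volatile int flag = 1;
volatile uint64_t core0_ticks;

/* Barrier helpers (AArch64, GCC/Clang inline asm). */
static inline void dsb(void) { __asm__ volatile("dsb sy" : : : "memory"); }
static inline void isb(void) { __asm__ volatile("isb" : : : "memory"); }

void core0_task(void) {
    while (1) {
        if (flag == 1) {
            core0_ticks = read_counter_safe();
            dsb();      /* make the sample visible before the flag changes */
            flag = 0;
        }
        /* otherwise wait for another core to set flag back to 1 */
    }
}

void corex_task(void) {
    while (1) {
        if (flag == 0) {
            dsb();      /* read the sample only after the flag was seen */
            uint64_t tmp_ticks = core0_ticks;
            flag = 1;
            dsb();
            isb();      /* keep the counter read from running ahead of the loads */
            uint64_t corex_ticks = read_counter_safe();
            if (corex_ticks < tmp_ticks) {
                /* Handle error: the counter appeared to go backward */
            }
        }
        /* otherwise wait for Core 0 to publish the next sample */
    }
}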
This approach ensures that Core 1/2/3 wait for Core 0 to complete its write before proceeding, reducing the likelihood of race conditions and inconsistent reads.
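The same handshake can also be expressed with C11 atomics, which state the required ordering without hand-written barriers. The sketch below is an alternative formulation under stated assumptions, not the code from the scenario: `sample_ready`, `try_publish`, and `try_consume` are hypothetical names, and the design assumes a single consumer at a time (several observer cores would each need their own flag or a sequence counter).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint64_t core0_ticks;       /* payload, written only by the publisher */
static atomic_bool sample_ready;   /* false: slot free, true: sample pending */

/* Publisher (Core 0): write the sample, then release the flag.
 * Returns false if the previous sample has not been consumed yet. */
static bool try_publish(uint64_t ticks)
{
    if (atomic_load_explicit(&sample_ready, memory_order_acquire))
        return false;
    core0_ticks = ticks;
    atomic_store_explicit(&sample_ready, true, memory_order_release);
    return true;
}

/* Consumer (one of Core 1/2/3): acquire the flag, copy the sample, then
 * hand the slot back. Returns false if no sample is pending. */
static bool try_consume(uint64_t *ticks)
{
    if (!atomic_load_explicit(&sample_ready, memory_order_acquire))
        return false;
    *ticks = core0_ticks;
    atomic_store_explicit(&sample_ready, false, memory_order_release);
    return true;
}
```

The acquire and release operations order the memory accesses to `core0_ticks`, but they say nothing about system register reads. On AArch64 the consumer would still issue an ISB between a successful `try_consume` and its own CNTPCT read.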
Timer Counter Configuration
In some cases, the issue may be mitigated by reviewing the configuration of the ARM Generic Timer. A counter frequency that is low relative to the core clock leaves more core cycles between counter transitions, which reduces the likelihood of a transition occurring during a read. Note that CNTFRQ_EL0 only reports the frequency programmed by firmware; the actual rate is determined by the clock that drives the system counter. Additionally, avoiding timer features that are not needed, such as the virtual or secure-world timers, may help to simplify the timer behavior, although this is unlikely to remove an underlying hardware erratum.
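To estimate how much room there is between counter transitions, the counter frequency can be compared with the core clock. The helper below is a small sketch with a hypothetical name; both frequencies are inputs that would have to come from the platform (for example, the counter frequency as reported by CNTFRQ_EL0).

```c
#include <stdint.h>

/* Core clock cycles that elapse between two counter increments.
 * core_hz and counter_hz are caller-supplied assumptions about the
 * platform; returns 0 if counter_hz is 0. */
static uint64_t core_cycles_per_tick(uint64_t core_hz, uint64_t counter_hz)
{
    return counter_hz ? core_hz / counter_hz : 0;
}
```

As an illustration with made-up numbers, a 1.5 GHz core and a 25 MHz counter give 60 core cycles per tick, so two back-to-back counter reads would usually land within the same tick.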
Software Patches and Updates
Finally, it is important to stay up-to-date with software patches and updates from the SoC vendor. Many vendors, such as NXP and HiSilicon, provide patches or workarounds for known timer-related issues. Applying these patches can help to address specific hardware errata and improve the overall stability of the timer counter.
By combining these techniques, developers can effectively mitigate the ARM Cortex-A72 Generic Timer backoff issue and ensure accurate and consistent timing in multi-core systems. While the issue is rooted in hardware limitations, careful software design and synchronization can provide a robust solution that meets the demands of real-time embedded systems.