Cortex-M3 and PrimeCell uDMAC Bus Arbitration in TI CC2640R2F
The Texas Instruments CC2640R2F Bluetooth Low Energy wireless MCU pairs an ARM Cortex-M3 with the PrimeCell uDMAC (Micro Direct Memory Access Controller), and arbitration between these two bus masters can become a performance bottleneck. Both masters operate on the same AHB-lite system bus, with the Cortex-M3 typically having priority over the uDMAC. As a result, the uDMAC can be starved of bus access when the Cortex-M3 is heavily utilizing the system bus for memory and peripheral accesses. The Cortex-M3 has no built-in mechanism for periodically yielding the bus to the uDMAC, so nothing in the core itself enforces fair arbitration or protects DMA throughput.
The Cortex-M3’s bus access behavior is shaped by its three primary interfaces: ICODE, DCODE, and SYSTEM. The ICODE and DCODE interfaces carry instruction fetches and data accesses in the CODE memory space (0x00000000 to 0x1FFFFFFF), respectively, while the SYSTEM interface carries all other memory and peripheral accesses, including those to SRAM. Ideally, most of the Cortex-M3’s accesses go through ICODE and DCODE, which keeps contention on the SYSTEM bus low. When the Cortex-M3 frequently accesses peripherals or memory regions outside the CODE space, however, the SYSTEM bus can become a bottleneck and the uDMAC can be starved.
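As a rough illustration of the routing just described, the sketch below classifies an address by the bus interface that would carry the access. The region boundaries follow the ARMv7-M memory map; the type and function names (cm3_bus_t, cm3_bus_for_access, cm3_bus_examples) are invented for this example and are not part of any ARM or TI API. It also separates out the Private Peripheral Bus window, which the core serves internally rather than over the SYSTEM bus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bus interfaces an access can take (names are illustrative only). */
typedef enum { BUS_ICODE, BUS_DCODE, BUS_SYSTEM, BUS_PPB } cm3_bus_t;

#define CM3_CODE_END 0x1FFFFFFFu /* last address of the Code region      */
#define CM3_PPB_BASE 0xE0000000u /* Private Peripheral Bus window, start */
#define CM3_PPB_END  0xE00FFFFFu /* Private Peripheral Bus window, end   */

/* Hypothetical helper: which interface would carry an access to addr? */
static cm3_bus_t cm3_bus_for_access(uint32_t addr, bool is_instruction_fetch)
{
    if (addr <= CM3_CODE_END) {
        return is_instruction_fetch ? BUS_ICODE : BUS_DCODE;
    }
    if (addr >= CM3_PPB_BASE && addr <= CM3_PPB_END) {
        return BUS_PPB; /* NVIC, SysTick and similar core registers */
    }
    return BUS_SYSTEM;  /* SRAM, peripherals, everything else */
}

/* A few spot checks using the architectural region base addresses. */
static void cm3_bus_examples(void)
{
    assert(cm3_bus_for_access(0x00000100u, true)  == BUS_ICODE);  /* code fetch      */
    assert(cm3_bus_for_access(0x00000100u, false) == BUS_DCODE);  /* constant read   */
    assert(cm3_bus_for_access(0x20000000u, false) == BUS_SYSTEM); /* SRAM region     */
    assert(cm3_bus_for_access(0x40000000u, false) == BUS_SYSTEM); /* peripheral area */
}
```

Note that every stack and read/write data access lands in the SRAM region and therefore on the SYSTEM bus, which is why CPU traffic on that bus can never be eliminated entirely.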
The PrimeCell uDMAC, on the other hand, relies on the SYSTEM bus for all its memory and peripheral accesses. When the Cortex-M3 is heavily utilizing the SYSTEM bus, DMA transfers can stall, which leads to missed deadlines in real-time applications or reduced throughput in data-intensive tasks. This matters most where the uDMAC handles high-speed data transfers, as in wireless devices like the CC2640R2F.
The arbitration logic between the Cortex-M3 and the uDMAC is typically implemented in the AHB-lite BusMatrix, which may be either ARM IP or a custom design by the SoC vendor (in this case, Texas Instruments). The BusMatrix decides which master is granted the bus for each transfer, and in many implementations the Cortex-M3 is given the higher priority. Under such a fixed-priority scheme the uDMAC is granted the bus only when the Cortex-M3 is not requesting it, so a heavily loaded Cortex-M3 can starve it consistently.
Memory Access Contention and Fixed-Priority Arbitration
The core issue stems from the fixed-priority arbitration scheme implemented in the AHB-lite BusMatrix. The Cortex-M3 is typically given priority over the uDMAC, so the uDMAC can be starved whenever the Cortex-M3 is heavily utilizing the SYSTEM bus. Fixed priority is a common design choice because it gives the CPU low-latency access to memory and peripherals, but it can be detrimental to the performance of DMA operations.
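The effect of fixed priority can be made concrete with a small host-side model. This is only a sketch under simplifying assumptions: one grant per cycle, two masters, and a uDMAC that requests the bus on every cycle. The names (master_t, arbitrate_fixed, dma_grants_fixed) are hypothetical and do not correspond to any real BusMatrix interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { MASTER_NONE = -1, MASTER_CPU = 0, MASTER_DMA = 1 } master_t;

/* Fixed priority: the CPU wins whenever it requests the bus. */
static master_t arbitrate_fixed(bool cpu_req, bool dma_req)
{
    if (cpu_req) return MASTER_CPU;
    if (dma_req) return MASTER_DMA;
    return MASTER_NONE;
}

/*
 * Count how many of n_cycles the DMA is granted when it requests on
 * every cycle and cpu_req[i] tells whether the CPU requests in cycle i.
 */
static size_t dma_grants_fixed(const bool *cpu_req, size_t n_cycles)
{
    assert(cpu_req != NULL);
    size_t grants = 0;
    for (size_t i = 0; i < n_cycles; ++i) {
        if (arbitrate_fixed(cpu_req[i], true) == MASTER_DMA) {
            grants++;
        }
    }
    return grants;
}
```

In this model the DMA receives exactly the cycles the CPU leaves idle: a CPU that requests on all eight of eight cycles leaves the DMA with zero grants, while one that requests on five of eight leaves it three.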
How heavily the Cortex-M3 loads the SYSTEM bus depends on the tasks it is performing. While it executes code from the CODE memory space, it primarily uses the ICODE and DCODE interfaces, which do not contend with the uDMAC for SYSTEM bus access. When it accesses peripherals or memory regions outside the CODE space, such as SRAM, it must use the SYSTEM bus, which is shared with the uDMAC. Firmware that does this frequently can turn the SYSTEM bus into a bottleneck, leading to contention and potential starvation of the uDMAC.
The PrimeCell uDMAC is designed to offload data transfer tasks from the CPU and thereby improve system performance, but it can do so only with timely access to the SYSTEM bus. When the Cortex-M3 holds the higher priority, the uDMAC may be forced to wait for extended periods, with the missed deadlines and reduced throughput described earlier.
The trade-off is therefore between CPU latency and DMA throughput. Fixed priority keeps CPU access latency low, which is critical for real-time performance, but when the Cortex-M3 is heavily loaded it can leave the uDMAC unable to perform its tasks effectively.
Implementing Fair Bus Arbitration and Optimizing Cortex-M3 Bus Usage
To address bus arbitration between the Cortex-M3 and the PrimeCell uDMAC in the TI CC2640R2F, it is essential to implement fair bus arbitration and to optimize the Cortex-M3’s bus usage. The goal is to give both the Cortex-M3 and the uDMAC timely access to the SYSTEM bus so that DMA operations do not degrade.
One approach to achieving fair bus arbitration is to modify the arbitration logic in the AHB-lite BusMatrix to implement a round-robin or weighted round-robin scheme. In a round-robin scheme, each master is given an equal opportunity to access the bus; in a weighted round-robin scheme, each master is assigned a weight that determines its share of bus access. Either scheme guarantees the uDMAC a share of bus access even when the Cortex-M3 is heavily utilizing the SYSTEM bus. Note that this is a change to the interconnect hardware, so it is open to the SoC designer rather than to firmware running on a finished device, unless the device exposes the arbitration policy through a configuration register.
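The following host-side model sketches one possible weighted round-robin policy for two masters. It is an illustration under stated assumptions, not a description of the CC2640R2F interconnect: one grant per cycle, a credit counter per master that is refilled when no requesting master has credit left, and invented names throughout (wrr_arbiter_t, wrr_init, wrr_grant, wrr_dma_grants_under_load). Setting both weights to 1 gives plain round-robin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_MASTERS 2 /* index 0 = CPU, index 1 = uDMAC (illustrative) */

typedef struct {
    unsigned weight[N_MASTERS]; /* grants allowed per round, each >= 1 */
    unsigned credit[N_MASTERS]; /* grants left in the current round    */
    size_t   next;              /* master to consider first            */
} wrr_arbiter_t;

static void wrr_init(wrr_arbiter_t *a, unsigned cpu_weight, unsigned dma_weight)
{
    assert(a != NULL && cpu_weight > 0 && dma_weight > 0);
    a->weight[0] = a->credit[0] = cpu_weight;
    a->weight[1] = a->credit[1] = dma_weight;
    a->next = 0;
}

/* Returns the granted master index, or -1 when nobody requests. */
static int wrr_grant(wrr_arbiter_t *a, const bool req[N_MASTERS])
{
    assert(a != NULL && req != NULL);
    for (int pass = 0; pass < 2; ++pass) {
        for (size_t k = 0; k < N_MASTERS; ++k) {
            size_t m = (a->next + k) % N_MASTERS;
            if (req[m] && a->credit[m] > 0) {
                a->credit[m]--;
                /* Stay on m while it has credit, otherwise move on. */
                a->next = (a->credit[m] > 0) ? m : (m + 1) % N_MASTERS;
                return (int)m;
            }
        }
        /* No requesting master has credit left: start a new round. */
        for (size_t m = 0; m < N_MASTERS; ++m) {
            a->credit[m] = a->weight[m];
        }
    }
    return -1;
}

/* DMA grants over n_cycles when both masters request on every cycle. */
static unsigned wrr_dma_grants_under_load(unsigned cpu_w, unsigned dma_w,
                                          unsigned n_cycles)
{
    wrr_arbiter_t a;
    const bool both[N_MASTERS] = { true, true };
    unsigned dma = 0;
    wrr_init(&a, cpu_w, dma_w);
    for (unsigned i = 0; i < n_cycles; ++i) {
        if (wrr_grant(&a, both) == 1) {
            dma++;
        }
    }
    return dma;
}
```

With weights 3:1 and both masters saturating the bus, the grant sequence in this model is CPU, CPU, CPU, DMA, repeating, so the DMA gets one cycle in four instead of none. The arbiter is also work-conserving: if only the CPU requests, it is granted every cycle.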
Another approach is to optimize the Cortex-M3’s bus usage to minimize contention on the SYSTEM bus. This means routing as many of the Cortex-M3’s accesses as possible through the ICODE and DCODE interfaces, which do not contend with the uDMAC for SYSTEM bus access, by carefully organizing the memory map and placing frequently accessed code and read-only data in the CODE memory space. Read/write data held in SRAM is reached over the SYSTEM bus regardless, so the aim is to reduce that traffic rather than eliminate it. Caching needs similar care: the Cortex-M3 core has no cache of its own, and where the SoC provides one in front of flash, as the CC2640R2F does, it mainly speeds up CODE-space accesses, so any reduction in SYSTEM bus contention is indirect.
In systems where the Cortex-M3 must frequently access peripherals or memory regions outside the CODE space, it may be necessary to have it periodically yield bus access to the uDMAC. This can be achieved in software by inserting idle cycles between bursts of SYSTEM bus traffic. For example, the Cortex-M3 could periodically enter a sleep mode or execute a short NOP loop; NOPs fetched from the CODE space generate no SYSTEM bus traffic, which leaves the bus free for the uDMAC. Lowering the Cortex-M3’s priority so that the uDMAC can preempt it is a further option only if the interconnect exposes a programmable master priority; AHB-lite masters have no bus request signals of their own, so this cannot be configured in the core.
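A minimal sketch of the software-yield idea follows, assuming a CPU-driven copy that touches the SYSTEM bus on every word. The function copy_with_yield and the hook type bus_yield_fn are hypothetical names for this example. The hook is a function pointer so the code can run on a host; on target it might wrap the CMSIS intrinsics __NOP() or __WFI(), bearing in mind that WFI needs an interrupt to wake the core. Whether the pause actually benefits the uDMAC depends on the interconnect and should be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hook called between bursts; on target it could issue NOPs or WFI. */
typedef void (*bus_yield_fn)(void);

/* Host-side stand-in for the yield hook: it only counts invocations. */
static unsigned g_yield_count;
static void yield_stub(void) { g_yield_count++; }

/*
 * Copy n_words from src to dst, calling yield() after every burst_words
 * words so that another bus master gets a window between bursts.
 * No yield is issued after the final word. Returns the number of yields.
 */
static size_t copy_with_yield(volatile uint32_t *dst,
                              const volatile uint32_t *src,
                              size_t n_words, size_t burst_words,
                              bus_yield_fn yield)
{
    assert(dst != NULL && src != NULL && burst_words > 0 && yield != NULL);
    size_t yields = 0;
    for (size_t i = 0; i < n_words; ++i) {
        dst[i] = src[i];
        if ((i + 1) % burst_words == 0 && (i + 1) < n_words) {
            yield();
            yields++;
        }
    }
    return yields;
}
```

A smaller burst_words gives the uDMAC more frequent windows at the cost of a slower copy, so the value is a tuning parameter rather than a fixed constant. For large blocks, handing the copy itself to the uDMAC avoids the contention altogether.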
In conclusion, bus arbitration between the Cortex-M3 and the PrimeCell uDMAC in the TI CC2640R2F can be addressed by implementing fair bus arbitration and by optimizing the Cortex-M3’s bus usage. A round-robin or weighted round-robin BusMatrix, where the SoC design allows it, and a carefully organized memory map give both masters timely access to the SYSTEM bus and prevent performance degradation in DMA operations. Where available, caching and software-controlled yielding can further reduce contention so that the uDMAC is able to perform its tasks effectively.