ARM926EJ-S Soft Reset Inconsistencies and Hang Scenarios
The ARM926EJ-S processor, a widely used embedded core, is known for its robust performance and versatility in embedded systems. However, one of the challenges faced by developers working with this processor is the inconsistent behavior of software-initiated soft resets. The issue manifests in three primary scenarios:
- Hang During Reset Instruction Execution: The processor hangs when executing the jump to the reset address, such as mov pc, #0x20000000 or an equivalent kernel API call like cpu_reset(0x20000000). This suggests that the processor fails to transition cleanly to the reset vector address.
- Hang During Bootloader Initialization: The system reboots successfully but hangs during the bootloader initialization phase. This indicates that the processor state or memory configuration post-reset is inconsistent, preventing the bootloader from executing correctly.
- Hang During Kernel Decompression: The system reboots and the bootloader starts successfully, but the processor hangs during the kernel decompression phase. This points to issues with memory or cache state management during the reset process.
These issues are particularly problematic in systems without a hardware watchdog timer, where a reliable software reset mechanism is critical for system recovery and stability. The root cause of these hangs often lies in improper handling of the processor state, cache, and memory management unit (MMU) during the reset sequence.
Cache and MMU State Management During Soft Reset
A soft reset on the ARM926EJ-S, unlike a hard reset, does not reset the processor in hardware. The core keeps running, and software only imitates a reset by reinitializing processor state and branching to the boot code. Nothing is cleared automatically, so the caches, the MMU, and the interrupt masks stay exactly as the running system left them unless the reset code changes them. The outcome is therefore highly sensitive to the state of the cache, MMU, and interrupt handling at the moment of the jump.
Cache Coherency and Dirty Data
One of the primary causes of soft reset hangs is improper handling of the data cache. The ARM926EJ-S features a Harvard architecture with separate instruction and data caches. During normal operation, the data cache may contain "dirty" data—modified data that has not yet been written back to main memory. If a soft reset is initiated without cleaning the data cache, this dirty data can lead to memory corruption or inconsistent system state post-reset.
Cleaning the data cache involves writing all dirty data back to main memory, ensuring that the memory system is in a consistent state before the reset. This is typically done using the Clean and Invalidate cache operations. Failure to perform this step can result in the processor accessing stale or corrupted data during the boot process, leading to hangs.
MMU and Memory Mapping
The MMU handles virtual-to-physical address translation and memory protection. It must be disabled before the jump so that the processor runs with a flat 1:1 mapping, where virtual addresses directly correspond to physical addresses, as the boot code expects. If the MMU is left enabled, the processor may access incorrect memory locations, leading to undefined behavior or hangs. The code that turns the MMU off should itself run from an identity-mapped region, because the instructions fetched after the control register write are addressed physically.
Additionally, the MMU configuration must be reset to its default state to ensure that the processor starts with a clean slate. This includes invalidating the Translation Lookaside Buffer (TLB) to remove any stale entries that could interfere with the reset process.
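The control register change described above can be worked through in plain C before committing it to assembly. The sketch below is host-side arithmetic only and does not touch CP15. The helper name is hypothetical, and clearing the instruction cache enable bit as well is an assumption made for illustration. Bit positions follow the ARM926EJ-S CP15 register 1 layout (M is bit 0, C is bit 2, I is bit 12).

```c
#include <stdint.h>

/* CP15 c1 control register enable bits on the ARM926EJ-S. */
#define CR_M (1u << 0)   /* MMU enable */
#define CR_C (1u << 2)   /* data cache enable */
#define CR_I (1u << 12)  /* instruction cache enable (clearing it is an assumption) */

/* Hypothetical helper: the value to write back to c1 before the jump.
 * All other bits (alignment checking, endianness, vectors, ...) are preserved. */
static uint32_t control_for_soft_reset(uint32_t control)
{
    return control & ~(CR_M | CR_C | CR_I);
}
```

For example, a control value of 0x0000107D (M, C, and I all set) becomes 0x00000078, which leaves every unrelated bit untouched.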
Interrupt Handling and Processor Mode
Interrupts must be disabled during a soft reset to prevent the processor from being interrupted during critical reset operations. The ARM926EJ-S supports both IRQ and FIQ interrupts, and both must be disabled to ensure a clean reset. Furthermore, the processor must be switched to Supervisor (SVC) mode, which is the default mode after a hard reset. Operating in an incorrect mode during the reset process can lead to unpredictable behavior.
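The CPSR value this paragraph aims for can be checked with a few lines of C. This is a sketch of the bit arithmetic only, with a hypothetical helper name. On the target the result would be written with an MSR instruction from a privileged mode.

```c
#include <stdint.h>

#define CPSR_MODE_MASK 0x1Fu  /* mode field M[4:0] */
#define CPSR_MODE_SVC  0x13u  /* Supervisor mode */
#define CPSR_F_BIT     0x40u  /* FIQ disable */
#define CPSR_I_BIT     0x80u  /* IRQ disable */

/* Hypothetical helper: CPSR with IRQ and FIQ masked and SVC mode selected.
 * Condition flags and other upper bits are left unchanged. */
static uint32_t cpsr_for_soft_reset(uint32_t cpsr)
{
    cpsr &= ~CPSR_MODE_MASK;
    cpsr |= CPSR_I_BIT | CPSR_F_BIT | CPSR_MODE_SVC;
    return cpsr;
}
```

Starting from User mode with interrupts enabled (low byte 0x10), the low byte becomes 0xD3, which matches the value the core holds after a hardware reset.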
Implementing a Reliable Soft Reset Sequence
To address the soft reset hangs in the ARM926EJ-S, a carefully designed reset sequence must be implemented. This sequence should ensure that the processor state, cache, and MMU are properly managed before initiating the reset. Below is a detailed step-by-step guide to implementing a reliable soft reset:
Step 1: Disable Interrupts
The first step in the soft reset sequence is to disable all interrupts. This prevents the processor from being interrupted during critical reset operations. The following assembly code demonstrates how to disable IRQ and FIQ interrupts:
MRS r0, CPSR ; Read the Current Program Status Register (CPSR)
ORR r0, r0, #0xC0 ; Set the I and F bits to disable IRQ and FIQ interrupts
MSR CPSR_c, r0 ; Write back the modified CPSR
Step 2: Clean and Invalidate the Data Cache
Next, the data cache must be cleaned and invalidated to ensure that all dirty data is written back to main memory. This step is critical to prevent memory corruption during the reset process. The following code demonstrates how to clean and invalidate the data cache:
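; Clean and invalidate the entire data cache while it is still enabled.
; The test-clean-and-invalidate operation sets the Z flag once no dirty lines remain.
tci_loop MRC p15, 0, r15, c7, c14, 3 ; Test, clean, and invalidate the data cache
BNE tci_loop ; Repeat until the whole cache is clean
MOV r0, #0
MCR p15, 0, r0, c7, c10, 4 ; Drain the write buffer
; Only now disable the data cache
MRC p15, 0, r0, c1, c0, 0 ; Read the Control Register (CP15 register 1)
BIC r0, r0, #0x4 ; Disable the data cache
MCR p15, 0, r0, c1, c0, 0 ; Write back the modified Control Register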
Step 3: Disable the MMU and Invalidate the TLB
The MMU must be disabled to ensure that the processor operates in a 1:1 memory mapping mode. Additionally, the TLB must be invalidated to remove any stale entries. The following code demonstrates how to disable the MMU and invalidate the TLB:
MRC p15, 0, r0, c1, c0, 0 ; Read the Control Register (CP15 register 1)
BIC r0, r0, #0x1 ; Disable the MMU
MCR p15, 0, r0, c1, c0, 0 ; Write back the modified Control Register
; Invalidate the TLB
MOV r0, #0 ; The operand for this operation should be zero
MCR p15, 0, r0, c8, c7, 0 ; Invalidate the entire TLB
Step 4: Switch to Supervisor Mode
The processor must be switched to Supervisor (SVC) mode, which is the default mode after a hard reset. This ensures that the processor operates in a known state during the reset process. The following code demonstrates how to switch to SVC mode:
MRS r0, CPSR ; Read the Current Program Status Register (CPSR)
BIC r0, r0, #0x1F ; Clear the mode bits
ORR r0, r0, #0x13 ; Set the mode bits to SVC mode
MSR CPSR_c, r0 ; Write back the modified CPSR
Step 5: Jump to the Reset Vector
Finally, the processor must jump to the reset vector address to initiate the reset. This is typically done by loading the Program Counter (PC) with the address of the reset vector. The following code demonstrates how to jump to the reset vector:
LDR pc, =0x20000000 ; Load the PC with the reset vector address
Step 6: Verify System State Post-Reset
After the reset, it is important to verify that the system state is consistent and that the processor has successfully restarted. This includes checking the state of the cache, MMU, and interrupt handling mechanisms. Additionally, any system-specific initialization code should be executed to ensure that the system is in a known good state.
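One way to make this check concrete is to capture the CPSR and the CP15 control register at the very start of the boot code and compare them against the state the reset sequence is meant to produce. The sketch below assumes such a snapshot has already been read. The struct and function names are hypothetical, and only the bits discussed in this article are checked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the registers read at bootloader entry. */
struct reset_state {
    uint32_t cpsr;     /* value read with MRS rX, CPSR */
    uint32_t control;  /* value read with MRC p15, 0, rX, c1, c0, 0 */
};

/* Returns true when the snapshot matches the state the reset sequence aims for:
 * IRQ and FIQ masked, SVC mode, MMU and data cache disabled. */
static bool reset_state_ok(const struct reset_state *s)
{
    bool irq_fiq_masked = (s->cpsr & 0xC0u) == 0xC0u;
    bool svc_mode       = (s->cpsr & 0x1Fu) == 0x13u;
    bool mmu_dcache_off = (s->control & 0x5u) == 0u;
    return irq_fiq_masked && svc_mode && mmu_dcache_off;
}
```

A failing check narrows down which step of the sequence did not take effect, which is easier to act on than a hang later in bootloader initialization or kernel decompression.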
Conclusion
The ARM926EJ-S processor is a powerful and versatile embedded core, but its soft reset mechanism requires careful handling to avoid hangs and inconsistent behavior. By following the detailed reset sequence outlined above, developers can ensure a reliable and consistent soft reset process. This involves disabling interrupts, cleaning and invalidating the data cache, disabling the MMU, switching to Supervisor mode, and jumping to the reset vector. Proper management of these components is critical to achieving a successful soft reset and maintaining system stability.
In systems where a hardware watchdog timer is not available, implementing a robust soft reset mechanism is essential for ensuring system recovery and reliability. By addressing the root causes of soft reset hangs and following best practices for cache, MMU, and interrupt management, developers can achieve a reliable and consistent reset process for the ARM926EJ-S processor.