ARM64 Subroutine Call and Return Mechanism with BL, BLR, and BR Instructions

The core issue is the misuse of the ARM64 BL, BLR, and BR instructions in inline assembly, which leads to incorrect subroutine calls and returns. BL (Branch with Link) calls a subroutine at a PC-relative label and stores the return address in x30, the Link Register. BLR (Branch with Link to Register) does the same, but takes the target address from a register. BR (Branch to Register) branches to the address held in a register without writing x30, so it is an indirect jump rather than a call. Returning from a subroutine is the job of ret, which branches to the address in x30 by default.
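To make the difference between a direct call (bl) and an indirect call (blr) concrete, here is a minimal sketch of a blr call through a function pointer. The names answer and callIndirect are made up for this illustration and do not come from the original code. The sketch assumes GCC or Clang on AArch64 Linux, and it assumes the callee does not touch floating-point or SIMD registers, which is why none are listed as clobbered. On any other target it falls back to an ordinary C++ call so that it still builds.

```cpp
// Hypothetical callee, used only for this illustration.
extern "C" int answer() { return 10; }

// Calls fn through a register. On AArch64 Linux this is done with BLR.
int callIndirect(int (*fn)())
{
#if defined(__aarch64__) && defined(__linux__)
  int result;
  __asm__ __volatile__
  (
    "blr %[target]           \n\t"  // x30 = address of the next instruction, then branch to fn
    "mov %w[out], w0         \n\t"  // Execution resumes here; copy the 32-bit result
    : [out] "=r"(result)
    : [target] "r"(fn)
    // Caller-saved general-purpose registers, the link register, flags, memory.
    : "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8",
      "x9", "x10", "x11", "x12", "x13", "x14", "x15", "x16", "x17",
      "x30", "cc", "memory"
  );
  return result;
#else
  return fn();  // Portable fallback for non-AArch64 builds
#endif
}
```

Calling callIndirect(answer) should yield 10 on either path. Listing x30 as clobbered is what allows the compiler to save and restore the enclosing function's own return address around the blr.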

In the provided code, the BL instruction is used to call a subroutine (myFunction), which is expected to return a value in x0. However, the program fails to proceed to the for loop after the BranchingModes function, indicating a problem with the subroutine call or return mechanism. The issue is further complicated by the incorrect use of the ret instruction, which is intended to return from a subroutine but is misapplied in this context.

Incorrect Return Address Handling and Register Corruption

The primary cause of the issue lies in the incorrect handling of the return address and potential register corruption. The BL instruction writes the return address (the address of the instruction after the BL, that is, PC + 4) to x30, but that value is not handled correctly around the call. A leaf subroutine such as myFunction can leave x30 alone, but whatever code executes the BL overwrites its own return address in the process, so the enclosing function must either save x30 or tell the compiler that it has been clobbered. Note that ret x30 is not itself a bug: ret defaults to x30, so the explicit operand is redundant but harmless.

The BranchingModes function also fails to correctly handle the return values from the inline assembly block. The inline assembly modifies x0 and attempts to store its value in regw1, but the surrounding C++ code does not correctly interpret or use these values. This misalignment between the assembly and C++ code leads to undefined behavior, causing the program to hang or crash before reaching the for loop.

Using BR as a substitute for ret is also inadvisable. br x30 branches to the same address as ret, but ret additionally tells the processor that the branch is a subroutine return, which keeps return-address prediction accurate. Either way, the branch only goes where x30 points: if x30 has been overwritten and not restored, neither instruction will get back to the caller.

Correcting Subroutine Calls and Returns in ARM64 Inline Assembly

To resolve the issue, the subroutine call and return mechanism must be correctly implemented in the inline assembly. The BL instruction should be used to call the subroutine, and the ret instruction should be used to return from it. The x30 register must be preserved and restored correctly to ensure proper program flow.

The myFunction subroutine must leave x30 intact so that ret can find its way back. A plain ret is enough, because it uses x30 when no register is given (ret x30 is equivalent). The inline assembly block in the BranchingModes function should also be updated so that the return value reaches the C++ code through a proper output operand and the registers changed by the call are declared as clobbered.

Here is the corrected implementation:

#include <stdio.h>

// File-scope assembly defining myFunction (volatile has no effect outside a function body)
__asm
(
  ".global myFunction        \n\t"
  ".p2align 4                \n\t"
  ".type  myFunction,%function \n\t"
  "myFunction:               \n\t"
    "mov x0, #10             \n\t"  // Set x0 to 10
    "ret                     \n\t"  // Return to the address in x30
);

bool BranchingModes(void)
{
  bool BranchingModesFlag = false;
  int regw0 = 0x00;
  int regw1 = 0x00;

  __asm __volatile
  (
    "mov x0, #0              \n\t"  // Clear x0
    "bl myFunction           \n\t"  // Call myFunction (overwrites x30)
    "mov %w[reg1], w0        \n\t"  // Copy the 32-bit result into regw1
    "nop                     \n\t"  // No operation
    : [reg1] "=r"(regw1)            // Output operand
    :                               // No input operands
    : "x0", "x30"                   // myFunction writes x0; bl writes x30
  );

  // The call succeeded only if myFunction returned 10
  BranchingModesFlag = (regw1 == 10);

  return BranchingModesFlag;
}

int main()
{
  unsigned int i0 = 0x00;
  unsigned int counter = 0x00;

  BranchingModes();

  for (i0 = 0x00; i0 <= 10000; i0++)
  {
    counter = counter + 1;
  }

  return 0;
}

In this corrected implementation, the myFunction subroutine correctly sets x0 to 10 and returns using the ret instruction. The inline assembly block in the BranchingModes function correctly handles the return value from myFunction and stores it in regw1. The BranchingModes function then checks the value of regw1 to determine if the subroutine call was successful.

The main function calls BranchingModes and proceeds to the for loop, ensuring that the program flow is correct. This implementation avoids the issues caused by incorrect return address handling and register corruption, ensuring that the program behaves as expected.
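myFunction is a leaf, so it never has to save x30. A subroutine that itself executes bl does, because the nested call overwrites its return address. The following is a minimal sketch of that case, using two made-up routines, leafValue and outerValue, that are not part of the original code. It assumes an AArch64 ELF target (Linux, for example) and falls back to plain C++ elsewhere.

```cpp
extern "C" int leafValue();   // Hypothetical leaf routine: returns 10
extern "C" int outerValue();  // Hypothetical non-leaf routine: returns leafValue() + 1

#if defined(__aarch64__) && defined(__ELF__)
__asm__
(
  ".pushsection .text            \n\t"
  ".p2align 2                    \n\t"
  ".global leafValue             \n\t"
  ".type  leafValue,%function    \n\t"
  "leafValue:                    \n\t"
    "mov w0, #10                 \n\t"  // Return value goes in w0
    "ret                         \n\t"  // x30 is untouched, so no save is needed
  ".size  leafValue, .-leafValue \n\t"
  ".global outerValue            \n\t"
  ".type  outerValue,%function   \n\t"
  "outerValue:                   \n\t"
    "stp x29, x30, [sp, #-16]!   \n\t"  // Save frame pointer and return address
    "mov x29, sp                 \n\t"  // Set up this routine's frame pointer
    "bl leafValue                \n\t"  // Overwrites x30 with the address of the next line
    "add w0, w0, #1              \n\t"  // Use the result of the nested call
    "ldp x29, x30, [sp], #16     \n\t"  // Restore frame pointer and return address
    "ret                         \n\t"  // Returns to outerValue's caller
  ".size  outerValue, .-outerValue \n\t"
  ".popsection                   \n\t"
);
#else
// Portable fallback so the sketch builds on other targets
extern "C" int leafValue()  { return 10; }
extern "C" int outerValue() { return leafValue() + 1; }
#endif
```

Without the stp/ldp pair, the final ret in outerValue would branch back to the add instruction instead of the caller, and the routine would never return.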

Additional Considerations for ARM64 Inline Assembly

When working with ARM64 inline assembly, it is essential to understand the calling conventions and register usage. The ARM64 architecture uses the x30 register as the Link Register, which stores the return address for subroutine calls. Registers x0 through x7 carry arguments and return values, x8 holds the address for an indirectly returned result, and x9 through x15 are temporary registers that may be clobbered by function calls.
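As a small illustration of that convention, here is a sketch of an assembly routine that receives two int arguments and returns their sum. The name addPair is hypothetical. The sketch assumes an AArch64 ELF target and uses a plain C++ definition elsewhere.

```cpp
extern "C" int addPair(int a, int b);  // Hypothetical routine for illustration

#if defined(__aarch64__) && defined(__ELF__)
__asm__
(
  ".pushsection .text          \n\t"
  ".p2align 2                  \n\t"
  ".global addPair             \n\t"
  ".type  addPair,%function    \n\t"
  "addPair:                    \n\t"
    "add w0, w0, w1            \n\t"  // a arrives in w0, b in w1; the sum is returned in w0
    "ret                       \n\t"  // Return to the address in x30
  ".size  addPair, .-addPair   \n\t"
  ".popsection                 \n\t"
);
#else
// Portable fallback so the sketch builds on other targets
extern "C" int addPair(int a, int b) { return a + b; }
#endif
```

Because the routine is declared as an ordinary function, the compiler places the arguments in w0 and w1 and reads the result from w0 on its own, so the caller needs no inline assembly at all.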

It is also important to correctly specify clobbered registers in the inline assembly block to inform the compiler that these registers may be modified. This prevents the compiler from making incorrect assumptions about register values and ensures that the program behaves correctly.
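The sketch below shows what a fuller clobber list might look like when an inline assembly block makes a call. The names doubleIt and callDoubleIt are invented for this example. It assumes GCC or Clang on an AArch64 ELF target, and it assumes the callee leaves floating-point and SIMD registers alone; a callee that uses them would need those registers listed as well. Other targets use a plain C++ fallback.

```cpp
extern "C" int doubleIt(int value);  // Hypothetical helper: returns value + value

#if defined(__aarch64__) && defined(__ELF__)
__asm__
(
  ".pushsection .text          \n\t"
  ".p2align 2                  \n\t"
  ".global doubleIt            \n\t"
  ".type  doubleIt,%function   \n\t"
  "doubleIt:                   \n\t"
    "add w0, w0, w0            \n\t"  // Argument in w0, result in w0
    "ret                       \n\t"
  ".size  doubleIt, .-doubleIt \n\t"
  ".popsection                 \n\t"
);
#else
extern "C" int doubleIt(int value) { return value + value; }
#endif

int callDoubleIt(int value)
{
#if defined(__aarch64__) && defined(__ELF__)
  int result;
  __asm__ __volatile__
  (
    "mov w0, %w[in]          \n\t"  // Place the argument where the callee expects it
    "bl doubleIt             \n\t"  // Call; overwrites x30
    "mov %w[out], w0         \n\t"  // Copy the 32-bit result out
    : [out] "=r"(result)
    : [in] "r"(value)
    // Everything a called function may change: caller-saved registers,
    // the link register, the condition flags, and memory.
    : "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8",
      "x9", "x10", "x11", "x12", "x13", "x14", "x15", "x16", "x17",
      "x30", "cc", "memory"
  );
  return result;
#else
  return doubleIt(value);  // Portable fallback for non-AArch64 builds
#endif
}
```

Listing only "x0" and "x30" is enough when the callee is known to touch nothing else, as with myFunction. The longer list is the safer default when the callee might change.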

Finally, when using the BL, BLR, and BR instructions, it is crucial to ensure that the return address is correctly handled and that the program flow is not disrupted. The ret instruction should be used to return from subroutines, and the x30 register should be preserved and restored as needed.

By following these guidelines and correctly implementing subroutine calls and returns in ARM64 inline assembly, you can avoid the issues described in this post and ensure that your program behaves as expected.
