riscv: memcpy_noalignment: Fold SZREG/BLOCK_SIZE alignment to single andi

Simplify the alignment steps for SZREG and BLOCK_SIZE multiples. The previous
three-instruction sequences

    addi    a7, a2, -SZREG
    andi    a7, a7, -SZREG
    addi    a7, a7, SZREG

and

    addi    a7, a2, -BLOCK_SIZE
    andi    a7, a7, -BLOCK_SIZE
    addi    a7, a7, BLOCK_SIZE

are each equivalent to a single andi, respectively

    andi    a7, a2, -SZREG

and

    andi    a7, a2, -BLOCK_SIZE

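because SZREG and BLOCK_SIZE are powers of two: the first addi subtracts a
multiple of the alignment, which leaves the low bits cleared by andi
unchanged, so the subtraction commutes with the mask and the final addi
cancels it. Folding to one instruction reduces code size with identical
semantics.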
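
The equivalence can be checked outside the assembly with a small C model.
This is only a sketch: the helper names are hypothetical, registers are
modeled as uint64_t (RV64 assumed), and the alignment values passed in are
up to the caller rather than taken from the glibc headers.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the old three-instruction sequence. */
static uint64_t round_down_old(uint64_t len, uint64_t align)
{
    uint64_t t = len - align;   /* addi a7, a2, -ALIGN */
    t &= -align;                /* andi a7, a7, -ALIGN */
    return t + align;           /* addi a7, a7, ALIGN  */
}

/* Model of the new single instruction: andi a7, a2, -ALIGN. */
static uint64_t round_down_new(uint64_t len, uint64_t align)
{
    return len & -align;
}

/* Count lengths in [0, limit) for which the two models disagree. */
static unsigned count_mismatches(uint64_t align, uint64_t limit)
{
    /* The argument only holds for power-of-two alignments. */
    assert(align != 0 && (align & (align - 1)) == 0);

    unsigned bad = 0;
    for (uint64_t len = 0; len < limit; len++)
        if (round_down_old(len, align) != round_down_new(len, align))
            bad++;
    return bad;
}
```

Calling count_mismatches with, say, an alignment of 8 or 128 and a limit of
a few thousand should report zero mismatches, including for lengths smaller
than the alignment, where the intermediate subtraction wraps around.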

No functional change.

sysdeps/riscv/multiarch/memcpy_noalignment.S: Remove redundant addi around
alignment; keep a single andi for SZREG/BLOCK_SIZE rounding.

Signed-off-by: Yao Zihong <zihong.plct@isrc.iscas.ac.cn>
Reviewed-by: Peter Bergner <bergner@tenstorrent.com>
Yao Zihong 2025-10-30 17:47:24 -05:00 committed by Peter Bergner
parent 444d81284e
commit 0698fd462a
1 changed file with 2 additions and 6 deletions

@@ -57,9 +57,7 @@ ENTRY (__memcpy_noalignment)
 add a5, a0, a4
 add a1, a1, a4
 bleu a2, a3, L(word_copy_adjust)
-addi a7, a2, -BLOCK_SIZE
-andi a7, a7, -BLOCK_SIZE
-addi a7, a7, BLOCK_SIZE
+andi a7, a2, -BLOCK_SIZE
 add a3, a5, a7
 mv a4, a1
 L(block_copy):
@@ -106,9 +104,7 @@ L(word_copy):
 li a5, SZREG-1
 /* if LEN < SZREG jump to tail handling. */
 bleu a2, a5, L(tail_adjust)
-addi a7, a2, -SZREG
-andi a7, a7, -SZREG
-addi a7, a7, SZREG
+andi a7, a2, -SZREG
 add a6, a3, a7
 mv a5, a1
 L(word_copy_loop):