From: Mao Minkai <maominkai@wxiat.com>

Sunway inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5GFOO

--------------------------------

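In commit 20b900c4b7fb ("sw64: optimize simd version of memcpy and
memset"), _nc instructions are used to improve performance, but the
memb instruction in memset is in the wrong position: it follows the
unconditional branch back to $mod32_loop_nc, so it is never reached.

Move memb under a new label, $mod32_tail_memb, and make the _nc loop
exit through that label, so the barrier runs before falling through
to $mod32_tail.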

Fixes: 20b900c4b7fb ("sw64: optimize simd version of memcpy and memset")
Signed-off-by: Mao Minkai <maominkai@wxiat.com>
Signed-off-by: Gu Zitao <guzitao@wxiat.com>
---
 arch/sw_64/lib/deep-memset.S | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/sw_64/lib/deep-memset.S b/arch/sw_64/lib/deep-memset.S
index ed2171c56d4d..7fbd529c72a8 100644
--- a/arch/sw_64/lib/deep-memset.S
+++ b/arch/sw_64/lib/deep-memset.S
@@ -99,12 +99,11 @@ $mod32_aligned:
 	.align 5
 $mod32_loop_nc:
 	subl	$18, 64, $18
-	blt	$18, $mod32_tail
+	blt	$18, $mod32_tail_memb
 	vstd_nc	$f10, 0($16)
 	vstd_nc	$f10, 32($16)
 	addl	$16, 64, $16
 	br	$31, $mod32_loop_nc
-	memb			# required for _nc store instructions
 
 	.align 5
 $mod32_loop:
@@ -115,6 +114,8 @@ $mod32_loop:
 	addl	$16, 64, $16
 	br	$31, $mod32_loop
 
+$mod32_tail_memb:
+	memb			# required for _nc store instructions
 $mod32_tail:
 	vldd	$f10, 0($4)
 	addl	$sp, 64, $sp