glibc/sysdeps
Adhemerval Zanella Netto 4c128c7823 aarch64: Add optimized chacha20
It adds a vectorized ChaCha20 implementation based on libgcrypt's
cipher/chacha20-aarch64.S.  It is used by default, and only
little-endian is supported (BE uses the generic code).

As in the generic implementation, the last step, which XORs the
keystream with the input, is omitted.  The final state register
clearing is also omitted.
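
To illustrate what omitting the final XOR means, here is a minimal
scalar sketch of a ChaCha20 block function that writes the raw
keystream block to the output instead of XORing it with an input
buffer.  It is not the glibc or libgcrypt code: the function name
chacha20_keystream_block and its parameter layout (32-bit counter,
96-bit nonce, as in RFC 8439) are assumptions made for illustration,
and it processes one block at a time rather than several in parallel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t v, int n)
{
  return (v << n) | (v >> (32 - n));
}

/* ChaCha quarter round on four state words.  */
#define QR(a, b, c, d)                         \
  do {                                         \
    a += b; d ^= a; d = rotl32(d, 16);         \
    c += d; b ^= c; b = rotl32(b, 12);         \
    a += b; d ^= a; d = rotl32(d, 8);          \
    c += d; b ^= c; b = rotl32(b, 7);          \
  } while (0)

static uint32_t load_le32(const uint8_t *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

static void store_le32(uint8_t *p, uint32_t v)
{
  p[0] = (uint8_t) v;
  p[1] = (uint8_t) (v >> 8);
  p[2] = (uint8_t) (v >> 16);
  p[3] = (uint8_t) (v >> 24);
}

/* Hypothetical helper: fill OUT with one 64-byte keystream block.
   A cipher would XOR this block with the input as its last step;
   that step is left out here, so OUT holds the keystream itself.  */
void chacha20_keystream_block(uint8_t out[64], const uint8_t key[32],
                              uint32_t counter, const uint8_t nonce[12])
{
  uint32_t init[16], x[16];

  assert(out != NULL && key != NULL && nonce != NULL);

  /* Constants "expand 32-byte k", then key, counter and nonce.  */
  init[0] = 0x61707865; init[1] = 0x3320646e;
  init[2] = 0x79622d32; init[3] = 0x6b206574;
  for (int i = 0; i < 8; i++)
    init[4 + i] = load_le32(key + 4 * i);
  init[12] = counter;
  for (int i = 0; i < 3; i++)
    init[13 + i] = load_le32(nonce + 4 * i);

  for (int i = 0; i < 16; i++)
    x[i] = init[i];

  /* 20 rounds: 10 iterations of a column round plus a diagonal round.  */
  for (int i = 0; i < 10; i++)
    {
      QR(x[0], x[4], x[8],  x[12]);
      QR(x[1], x[5], x[9],  x[13]);
      QR(x[2], x[6], x[10], x[14]);
      QR(x[3], x[7], x[11], x[15]);
      QR(x[0], x[5], x[10], x[15]);
      QR(x[1], x[6], x[11], x[12]);
      QR(x[2], x[7], x[8],  x[13]);
      QR(x[3], x[4], x[9],  x[14]);
    }

  /* Add the initial state and serialize little-endian.  No XOR with
     an input buffer, and X is not cleared afterwards.  */
  for (int i = 0; i < 16; i++)
    store_le32(out + 4 * i, x[i] + init[i]);
}
```

A caller that only needs random bytes, as arc4random does, can consume
the output buffer directly.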

On a virtualized Linux guest on an Apple M1, it shows the following
improvements (using formatted bench-arc4random data):

GENERIC                                    MB/s
-----------------------------------------------
arc4random [single-thread]               380.89
arc4random_buf(16) [single-thread]       500.73
arc4random_buf(32) [single-thread]       552.61
arc4random_buf(48) [single-thread]       566.82
arc4random_buf(64) [single-thread]       574.01
arc4random_buf(80) [single-thread]       581.02
arc4random_buf(96) [single-thread]       591.19
arc4random_buf(112) [single-thread]      592.29
arc4random_buf(128) [single-thread]      596.43
-----------------------------------------------

OPTIMIZED                                  MB/s
-----------------------------------------------
arc4random [single-thread]               569.60
arc4random_buf(16) [single-thread]       825.78
arc4random_buf(32) [single-thread]       987.03
arc4random_buf(48) [single-thread]      1042.39
arc4random_buf(64) [single-thread]      1075.50
arc4random_buf(80) [single-thread]      1094.68
arc4random_buf(96) [single-thread]      1130.16
arc4random_buf(112) [single-thread]     1129.58
arc4random_buf(128) [single-thread]     1137.91
-----------------------------------------------

Checked on aarch64-linux-gnu.
2022-07-22 11:58:27 -03:00
aarch64 aarch64: Add optimized chacha20 2022-07-22 11:58:27 -03:00
alpha
arc elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA 2022-06-15 11:29:55 -07:00
arm Add bounds check to __libc_ifunc_impl_list 2022-06-10 17:13:29 +01:00
csky
generic aarch64: Add optimized chacha20 2022-07-22 11:58:27 -03:00
gnu
hppa
htl
hurd
i386 i386: Remove -Wa,-mtune=i686 2022-07-12 11:14:32 -07:00
ia64
ieee754
m68k m68k: optimize RTLD_START 2022-06-25 00:22:02 +02:00
mach stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417) 2022-07-22 11:58:27 -03:00
microblaze
mips
nios2 Remove remnant reference to ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA 2022-06-15 13:02:17 -07:00
nptl stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417) 2022-07-22 11:58:27 -03:00
or1k
posix Refactor internal-signals.h 2022-06-30 14:56:21 -03:00
powerpc Add bounds check to __libc_ifunc_impl_list 2022-06-10 17:13:29 +01:00
pthread nptl: Fix __libc_cleanup_pop_restore asynchronous restore (BZ#29214) 2022-06-08 09:23:02 -03:00
riscv riscv: Use memcpy to handle unaligned access when fixing R_RISCV_RELATIVE 2022-06-30 08:04:52 -07:00
s390 s390: use LC_ALL=C for readelf call 2022-06-21 10:16:44 +02:00
sh
sparc Add bounds check to __libc_ifunc_impl_list 2022-06-10 17:13:29 +01:00
unix stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417) 2022-07-22 11:58:27 -03:00
wordsize-32
wordsize-64
x86 x86: Add support to build strcmp/strlen/strchr with explicit ISA level 2022-07-16 03:07:59 -07:00
x86_64 x86: Add support to build st{p|r}{n}{cpy|cat} with explicit ISA level 2022-07-16 03:07:59 -07:00