mirror of git://sourceware.org/git/glibc.git
It adds a vectorized ChaCha20 implementation based on libgcrypt's cipher/chacha20-aarch64.S. It is used by default, and only little-endian is supported (big-endian uses the generic code).

As in the generic implementation, the last step that XORs the keystream with the input is omitted. The final clearing of the state registers is also omitted.

On a virtualized Linux on Apple M1 it shows the following improvements (using formatted bench-arc4random data, all single-thread):

| Benchmark | GENERIC MB/s | OPTIMIZED MB/s |
|---|---|---|
| arc4random | 380.89 | 569.60 |
| arc4random_buf(16) | 500.73 | 825.78 |
| arc4random_buf(32) | 552.61 | 987.03 |
| arc4random_buf(48) | 566.82 | 1042.39 |
| arc4random_buf(64) | 574.01 | 1075.50 |
| arc4random_buf(80) | 581.02 | 1094.68 |
| arc4random_buf(96) | 591.19 | 1130.16 |
| arc4random_buf(112) | 592.29 | 1129.58 |
| arc4random_buf(128) | 596.43 | 1137.91 |

Checked on aarch64-linux-gnu.

| | | |
|---|---|---|
| aarch64 | ||
| alpha | ||
| arc | ||
| arm | ||
| csky | ||
| generic | ||
| gnu | ||
| hppa | ||
| htl | ||
| hurd | ||
| i386 | ||
| ia64 | ||
| ieee754 | ||
| m68k | ||
| mach | ||
| microblaze | ||
| mips | ||
| nios2 | ||
| nptl | ||
| or1k | ||
| posix | ||
| powerpc | ||
| pthread | ||
| riscv | ||
| s390 | ||
| sh | ||
| sparc | ||
| unix | ||
| wordsize-32 | ||
| wordsize-64 | ||
| x86 | ||
| x86_64 | ||
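
The commit description above notes that the final XOR with the input is omitted, so the cipher is used purely as a keystream generator. As a rough illustration only, here is a minimal scalar sketch of one ChaCha20 block in C that writes the raw keystream and never XORs it with any input. It assumes the RFC 8439 state layout (32-bit counter, 96-bit nonce), which may differ from what glibc's arc4random uses internally, and the name `chacha20_keystream_block` and its helpers are hypothetical, not taken from the glibc or libgcrypt sources.

```c
#include <stdint.h>
#include <string.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t x, int n)
{
    return (x << n) | (x >> (32 - n));
}

/* Little-endian load/store helpers, independent of host byte order. */
static uint32_t load32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void store32_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* The ChaCha quarter round on four state words. */
#define QR(a, b, c, d)                        \
    do {                                      \
        a += b; d ^= a; d = rotl32(d, 16);    \
        c += d; b ^= c; b = rotl32(b, 12);    \
        a += b; d ^= a; d = rotl32(d, 8);     \
        c += d; b ^= c; b = rotl32(b, 7);     \
    } while (0)

/* Hypothetical helper: produce one 64-byte keystream block.
   Assumes the RFC 8439 layout; no XOR with input is performed. */
void chacha20_keystream_block(const uint8_t key[32], uint32_t counter,
                              const uint8_t nonce[12], uint8_t out[64])
{
    uint32_t init[16], x[16];
    int i;

    /* Constants "expand 32-byte k". */
    init[0] = 0x61707865; init[1] = 0x3320646e;
    init[2] = 0x79622d32; init[3] = 0x6b206574;
    for (i = 0; i < 8; i++)
        init[4 + i] = load32_le(key + 4 * i);
    init[12] = counter;
    for (i = 0; i < 3; i++)
        init[13 + i] = load32_le(nonce + 4 * i);

    memcpy(x, init, sizeof x);

    /* 20 rounds: 10 iterations of a column round plus a diagonal round. */
    for (i = 0; i < 10; i++) {
        QR(x[0], x[4], x[8],  x[12]);
        QR(x[1], x[5], x[9],  x[13]);
        QR(x[2], x[6], x[10], x[14]);
        QR(x[3], x[7], x[11], x[15]);
        QR(x[0], x[5], x[10], x[15]);
        QR(x[1], x[6], x[11], x[12]);
        QR(x[2], x[7], x[8],  x[13]);
        QR(x[3], x[4], x[9],  x[14]);
    }

    /* Add the initial state and serialize; the result is the keystream. */
    for (i = 0; i < 16; i++)
        store32_le(out + 4 * i, x[i] + init[i]);
}
```

A vectorized version such as the one the commit describes would typically compute several such blocks in parallel in SIMD registers; this sketch only shows the per-block arithmetic and makes no attempt to clear the state afterwards.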