Commit Graph

395 Commits

Author SHA1 Message Date
Wilco Dijkstra 7c14d8a985 Benchtests: Increase benchmark iterations
Increase benchmark iterations for math and vector math functions to improve
timing accuracy.  Vector math benchmarks now take 1-3 seconds on a modern CPU.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2024-01-12 16:00:28 +00:00
Paul Eggert dff8da6b3e Update copyright dates with scripts/update-copyrights 2024-01-01 10:53:40 -08:00
Frederic Berat 99f9ae4ed0 benchtests: fix warn unused result
A few tests needed to properly check the return values of asprintf and
system calls with _FORTIFY_SOURCE enabled.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2023-06-22 00:21:19 -04:00
Paul Pluzhnikov 7f0d9e61f4 Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
Carlos O'Donell 85c3569cf4 benchtests: Reformat Makefile.
Reflow all long lines adding comment terminators.
Sort all reflowed text using scripts/sort-makefile-lines.py.

No regressions running microbenchmarks.
No code generation changes observed in binary artifacts.
No regressions on x86_64 and i686.
2023-05-18 13:11:48 -04:00
Joe Ramsay cd94326a13 Enable libmvec support for AArch64
This patch enables libmvec on AArch64. The proposed change is mainly
implementing build infrastructure to add the new routines to ABI,
tests and benchmarks. I have demonstrated how this all fits together
by adding implementations for vector cos, in both single and double
precision, targeting both Advanced SIMD and SVE.

The implementations of the routines themselves are just loops over the
scalar routine from libm for now, as we are more concerned with
getting the plumbing right at this point. We plan to contribute vector
routines from the Arm Optimized Routines repo that are compliant with
requirements described in the libmvec wiki.

Building libmvec requires minimum GCC 10 for SVE ACLE. To avoid raising
the minimum GCC by such a big jump, we allow users to disable libmvec
if their compiler is too old.

Note that at this point users have to manually call the vector math
functions. This seems to be acceptable to some downstream users.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-05-03 12:09:49 +01:00
Wilco Dijkstra 2623479105 Benchtests: Adjust timing
Adjust iteration counts so benchmarks don't run too slowly or quickly.
Ensure benchmarks take less than 10 seconds on older, slower cores and
more than 0.5 seconds on fast cores.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-04-17 13:00:38 +01:00
Nisha Menon 51a121eb36 compare_strings.py : Add --gmean flag
To calculate geometric mean for string benchmark results.

Signed-off-by: Nisha Poyarekar <nisha.s.menon@gmail.com>
2023-04-04 13:51:45 -05:00
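The geometric mean reported by a flag such as --gmean can be sketched as follows. This is a minimal illustration, not the code in compare_strings.py; the function name `geometric_mean` and the sample timings are hypothetical.

```python
import math


def geometric_mean(values):
    """Return the geometric mean of a sequence of positive numbers."""
    values = list(values)
    if not values:
        raise ValueError("geometric mean of an empty sequence")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    # Summing logarithms avoids overflow when many timings are multiplied.
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical timings (arbitrary units) for one implementation.
timings = [2.0, 8.0]
print(geometric_mean(timings))
```

A geometric mean is a common choice for summarizing ratios across many benchmark sizes because a single very large timing does not dominate it the way it dominates an arithmetic mean.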
Adhemerval Zanella Netto 5c11701c51 benchtests: Add fmodf benchmark
1. Subnormals: 128 inputs.
2. Normal numbers with large exponent difference (|x/y| > 2^8):
   1024 inputs between FLT_MIN and FLT_MAX;
3. Close exponents (ey >= -103 and |x/y| < 2^8): 1024 inputs with
   exponents between -10 and 10.
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2023-04-03 16:13:55 -03:00
Adhemerval Zanella Netto 3ba0c9593f benchtests: Add fmod benchmark
Add three different datasets, generated from random floating-point numbers:

1. Subnormals: 128 inputs.
2. Normal numbers with large exponent difference (|x/y| > 2^52):
   1024 inputs between DBL_MIN and DBL_MAX;
3. Close exponents (ey >= -907 and |x/y| < 2^52): 1024 inputs with
   exponents between -10 and 10.
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2023-04-03 16:13:55 -03:00
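Two of the input classes described for the fmod benchmark could be generated roughly as shown here. This is a sketch under stated assumptions: the helper names, the seed, and the hexadecimal printing are hypothetical, and the script actually used to produce the glibc input files is not shown in this log.

```python
import math
import random
import struct
import sys


def random_subnormal(rng):
    """Return a random positive subnormal double.

    A double whose exponent field is zero and whose 52-bit mantissa is
    nonzero is subnormal, so only the mantissa bits are randomized.
    """
    mantissa = rng.randrange(1, 1 << 52)
    return struct.unpack("<d", struct.pack("<Q", mantissa))[0]


def random_close_exponents(rng, lo=-10, hi=10):
    """Return (x, y) whose binary exponents are both drawn from [lo, hi]."""
    # 1.0 + rng.random() lies in [1.0, 2.0), so ldexp sets the exponent exactly.
    x = math.ldexp(1.0 + rng.random(), rng.randint(lo, hi))
    y = math.ldexp(1.0 + rng.random(), rng.randint(lo, hi))
    return x, y


rng = random.Random(0)  # fixed seed so the illustration is repeatable
s = random_subnormal(rng)
assert 0.0 < s < sys.float_info.min
x, y = random_close_exponents(rng)
print(s.hex(), x.hex(), y.hex())
```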
Joe Ramsay e4d336f1ac benchtests: Move libmvec benchtest inputs to benchtests directory
This allows other targets to use the same inputs for their own libmvec
microbenchmarks without having to duplicate them in their own
subdirectory.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-03-27 17:04:03 +01:00
Wilco Dijkstra 10f980d31e Benchtests: Remove simple_str(r)chr
Instead of benchmarking slow byte oriented loops, include the optimized generic
strchr and strrchr implementation.  Adjust iteration count to reduce benchmark
time.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-03-08 18:36:48 +00:00
Wilco Dijkstra 9ab7c42387 Benchtests: Remove simple_str(n)casecmp
Remove the slow byte oriented loops.  Adjust iteration count to reduce
benchmark time.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-03-08 18:36:48 +00:00
Wilco Dijkstra 183b425a05 Benchtests: Remove simple_memcmp
Remove the slow byte oriented simple_memcmp.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-03-08 18:36:48 +00:00
Wilco Dijkstra 5de1508803 Benchtests: Remove simple_strcspn/strpbrk/strsep
Remove simple_strcspn/strpbrk/strsep which are significantly slower than the
generic implementations.  Also remove oldstrsep and oldstrtok since they are
practically identical to the generic implementation.  Adjust iteration count
to reduce benchmark time.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-03-08 18:36:48 +00:00
Wilco Dijkstra b0e02d5b6d Benchtests: Remove memchr_strnlen
Remove memchr_strnlen since it is now the same as generic_strnlen.  Adjust
iteration count to reduce benchmark time.  Keep memchr_strlen since the
generic strlen does not use memchr.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-03-08 18:36:35 +00:00
Wilco Dijkstra dcfcb8e392 Benchtests: Remove simple_mem(r)chr
Instead of benchmarking slow byte oriented loops, include the optimized
generic memchr/memrchr implementation.  Adjust iteration count to reduce
benchmark time.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-03-08 18:36:25 +00:00
Wilco Dijkstra 73a284f618 Benchtests: Remove simple_strcpy_chk
Remove the slow byte oriented simple_strcpy_chk and simple_stpcpy_chk.
Adjust iteration count to increase benchmark time.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-03-08 18:36:25 +00:00
Wilco Dijkstra d1c3c0e4fe Benchtests: Remove simple_str(n)cmp
Instead of benchmarking slow byte oriented loops, include the optimized generic
strcmp/strncmp implementation.  Adjust iteration count to reduce benchmark time.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-03-08 18:36:11 +00:00
Wilco Dijkstra 32c7acd464 Replace rawmemchr (s, '\0') with strchr
Almost all uses of rawmemchr find the end of a string.  Since most targets use
a generic implementation, replacing it with strchr is better since that is
optimized by compilers into strlen (s) + s.  Also fix the generic rawmemchr
implementation to use a cast to unsigned char in the if statement.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2023-02-06 16:16:19 +00:00
Joseph Myers 6d7e8eda9b Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Noah Goldstein d44e116428 benchtests: Make str{n}{cat|cpy} benchmarks output json
Json output is easier to parse and most other benchmarks already do
the same.
2022-11-08 19:22:33 -08:00
Noah Goldstein ca7d181b62 string: Add len=0 to {w}memcmp{eq} tests and benchtests
len=0 is valid and fairly common so should be tested.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-11-08 19:19:35 -08:00
Adhemerval Zanella 5c5a8b99cf Disable use of -fsignaling-nans if compiler does not support it
Reviewed-by: Fangrui Song <maskray@google.com>
2022-11-01 09:46:08 -03:00
Noah Goldstein 643a2d0139 Bench: Improve benchtests for memchr, strchr, strnlen, strrchr
1. Add more complete coverage in the medium size range.
2. In strnlen remove the `1 << i` which was UB (`i` could go beyond
   32/64)
2022-10-19 17:31:03 -07:00
Noah Goldstein 10c779f44a Benchtests: Add bench for pthread_spin_{try}lock and mutex_trylock
Reuses infrastructure from previous pthread_mutex_lock benchmarks to
test other performance sensitive functions.
2022-10-03 14:13:49 -07:00
Noah Goldstein 5eb21c62ce Benchtest: Add additional benchmarks for strlen and strnlen
Current benchmarks are missing many cases in the mid-length range
which is often the hottest size range.
2022-09-28 20:16:04 -07:00
Adhemerval Zanella Netto 5d765ada01 benchtests: Add arc4random benchtest
It shows both throughput (total bytes obtained in the test duration)
and latency for both arc4random and arc4random_buf with different
sizes.

Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu.
2022-07-22 11:58:27 -03:00
Noah Goldstein d0370d992e Benchtests: Improve memrchr benchmarks
Add a second iteration for memrchr to set `pos` starting from the end
of the buffer.

Previously `pos` was only set relative to the beginning of the
buffer. This isn't really useful for memrchr because the beginning
of the search space is (buf + len).
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-06-07 13:09:16 -07:00
Adhemerval Zanella dc208f4a53 benchtests: Add workload name for sincosf
So it can show both reciprocal-throughput and latency.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-06-01 10:47:44 -03:00
Adhemerval Zanella c1176b62a9 benchtests: Add workload name for cosf
So it can show both reciprocal-throughput and latency.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-06-01 10:47:44 -03:00
Noah Goldstein a8f62164b1 benchtests: Improve benchtests for strstr, memmem, and memchr
1. Use json_ctx for output to help standardize format across all
   benchtests.

2. Add some additional tests to strstr and memchr expanding alignments
   and adding more small values.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-05-27 22:52:37 -05:00
Noah Goldstein a01a13601c benchtests: Improve bench-strnlen.c
1. Output results in JSON format so it's easier to parse.
2. Increase max alignment to `getpagesize () - 1` to make it possible
   to test page cross cases.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-05-23 18:14:06 -05:00
Noah Goldstein 9a421348cd elf: Optimize _dl_new_hash in dl-new-hash.h
Unroll slightly and enforce good instruction scheduling. This improves
performance on out-of-order machines. The unrolling allows for
pipelined multiplies.

As well, as an optional sysdep, reorder the operations and prevent
reassociation for better scheduling and higher ILP. This commit
only adds the barrier for x86, although it should be either no
change or a win for any architecture.

Unrolling further started to induce slowdowns for sizes [0, 4]
but can help the loop so if larger sizes are the target further
unrolling can be beneficial.

Results for _dl_new_hash
Benchmarked on Tigerlake: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz

Time as Geometric Mean of N=30 runs
Geometric mean of all benchmarks, New / Old: 0.674
  type, length, New Time, Old Time, New Time / Old Time
 fixed,      0,    2.865,     2.72,               1.053
 fixed,      1,    3.567,    2.489,               1.433
 fixed,      2,    2.577,    3.649,               0.706
 fixed,      3,    3.644,    5.983,               0.609
 fixed,      4,    4.211,    6.833,               0.616
 fixed,      5,    4.741,    9.372,               0.506
 fixed,      6,    5.415,    9.561,               0.566
 fixed,      7,    6.649,   10.789,               0.616
 fixed,      8,    8.081,   11.808,               0.684
 fixed,      9,    8.427,   12.935,               0.651
 fixed,     10,    8.673,   14.134,               0.614
 fixed,     11,    10.69,   15.408,               0.694
 fixed,     12,   10.789,   16.982,               0.635
 fixed,     13,   12.169,   18.411,               0.661
 fixed,     14,   12.659,   19.914,               0.636
 fixed,     15,   13.526,   21.541,               0.628
 fixed,     16,   14.211,   23.088,               0.616
 fixed,     32,   29.412,   52.722,               0.558
 fixed,     64,    65.41,  142.351,               0.459
 fixed,    128,  138.505,  295.625,               0.469
 fixed,    256,  291.707,  601.983,               0.485
random,      2,   12.698,   12.849,               0.988
random,      4,   16.065,   15.857,               1.013
random,      8,   19.564,   21.105,               0.927
random,     16,   23.919,   26.823,               0.892
random,     32,   31.987,   39.591,               0.808
random,     64,   49.282,   71.487,               0.689
random,    128,    82.23,  145.364,               0.566
random,    256,  152.209,  298.434,                0.51

Co-authored-by: Alexander Monakov <amonakov@ispras.ru>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-05-23 10:38:40 -05:00
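For orientation, the hash being optimized is, to the best of our understanding, the GNU-style ELF symbol hash: start at 5381 and fold in each byte as h * 33 + c, truncated to 32 bits. The Python sketch below shows that reference loop next to a two-byte unrolled form that only illustrates the algebra behind "unroll slightly"; it is not the C code from dl-new-hash.h, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def dl_new_hash(name: bytes) -> int:
    """Reference loop: h = h * 33 + c, kept to 32 bits."""
    h = 5381
    for c in name:
        h = (h * 33 + c) & MASK32
    return h


def dl_new_hash_unrolled(name: bytes) -> int:
    """Same hash, consuming two bytes per step.

    (h * 33 + c0) * 33 + c1 == h * 1089 + c0 * 33 + c1, and the two
    multiplies on the right-hand side do not depend on each other.
    """
    h = 5381
    i = 0
    while i + 1 < len(name):
        c0, c1 = name[i], name[i + 1]
        h = (h * 1089 + c0 * 33 + c1) & MASK32
        i += 2
    if i < len(name):  # odd length: one trailing byte
        h = (h * 33 + name[i]) & MASK32
    return h


for sym in (b"", b"exit", b"printf", b"a"):
    assert dl_new_hash(sym) == dl_new_hash_unrolled(sym)
print(hex(dl_new_hash(b"exit")))
```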
Noah Goldstein 319dddc143 benchtests: Add benchtests for dl_elf_hash, dl_new_hash and nss_hash
Benchtests are for throughput and include random / fixed size
benchmarks.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-05-23 10:38:40 -05:00
Siddhesh Poyarekar 050cc5f7c1 benchtests: Add wcrtomb microbenchmark
Add a simple benchmark that measures wcrtomb performance with various
locales with 1-4 byte characters.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2022-05-06 18:16:43 +05:30
Siddhesh Poyarekar 5b5b1012d5 benchtests: Better libmvec integration
Improve libmvec benchmark integration so that in future other
architectures may be able to run their libmvec benchmarks as well.  This
now allows libmvec benchmarks to be run with `make BENCHSET=bench-math`.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-04-29 11:48:18 +05:30
Siddhesh Poyarekar 944afe6d95 benchtests: Add UNSUPPORTED benchmark status
The libmvec benchmarks print a message indicating that a certain CPU
feature is unsupported and exit prematurely, which breaks the JSON in
bench.out.

Handle this more elegantly in the bench makefile target by adding
support for an UNSUPPORTED exit status (77) so that bench.out continues
to have output for valid tests.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-04-29 11:48:16 +05:30
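The control flow described in this commit, treating exit status 77 as "unsupported" rather than as a failure, can be illustrated in Python. The real handling lives in the bench makefile target; the `run_bench` helper and the stand-in command below are hypothetical.

```python
import subprocess
import sys

UNSUPPORTED = 77  # exit status named in the commit message


def run_bench(cmd):
    """Run one benchmark command and classify its result.

    Returns ("ok", stdout), ("unsupported", None) or ("failed", None).
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode == 0:
        return "ok", proc.stdout
    if proc.returncode == UNSUPPORTED:
        # Skip the output entirely so the combined JSON stays valid.
        return "unsupported", None
    return "failed", None


# Stand-in for a benchmark binary that detects a missing CPU feature.
fake_bench = [sys.executable, "-c", "import sys; sys.exit(77)"]
print(run_bench(fake_bench)[0])
```

Only output from runs classified as "ok" would be appended to the combined results, which is what keeps the aggregate file parseable.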
Wangyang Guo 9e5daa1f6a benchtests: Add pthread-mutex-locks bench
Benchmark for testing pthread mutex locks performance with different
threads and critical sections.

The test configuration consists of 3 parts:
1. thread number
2. critical-section length
3. non-critical-section length

The thread number starts at 1 and doubles until it reaches the number of
CPU cores (nprocs). An additional over-saturation case (1.25 * nprocs) is
also included.
The critical section is represented by a loop of shared do_filler();
its length is determined by the loop iteration count.
The non-critical section is similar to the critical section, except it is
based on non-shared do_filler().

Currently, adaptive pthread_mutex lock is tested.
2022-04-27 13:41:57 -07:00
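The thread-count schedule described for pthread-mutex-locks can be sketched like this. The rounding of 1.25 * nprocs and the behaviour when nprocs is not a power of two are assumptions made for illustration; the function name `thread_counts` is hypothetical.

```python
def thread_counts(nprocs):
    """Return the thread counts to benchmark on a machine with nprocs cores."""
    counts = []
    n = 1
    while n <= nprocs:  # 1, 2, 4, ... up to the core count
        counts.append(n)
        n *= 2
    oversaturated = int(nprocs * 1.25)  # assumed to round down
    if oversaturated > counts[-1]:
        counts.append(oversaturated)
    return counts


print(thread_counts(8))
```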
Noah Goldstein c2ff9555a1 benchtests: Improve bench-strrchr
1. Use json-lib for printing results.
2. Expose all parameters (previously pos, seek_char, and max_char were
   not printed).
3. Add benchmarks that test multiple occurrences of seek_char in the
   string.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-04-22 23:07:54 -05:00
Noah Goldstein c6853907b1 benchtests: Use json-lib in bench-strncasecmp.c
Just QOL change to make parsing the output of the benchtests more
consistent.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein 6f2a331b16 benchtests: Use json-lib in bench-strcasecmp.c
Just QOL change to make parsing the output of the benchtests more
consistent.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein dc18cd6c81 benchtests: Use json-lib in bench-strspn.c
Just QOL change to make parsing the output of the benchtests more
consistent.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein 4ed0347a25 benchtests: Use json-lib in bench-strpbrk.c
Just QOL change to make parsing the output of the benchtests more
consistent.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein ece0eaa3f8 benchtests: Add random benchmark in bench-strchr.c
Add a benchmark that randomizes whether the return value should be NULL or
a pointer to CHAR. The rationale is that on many architectures there is a
choice between a predicated execution option (e.g., cmovcc on x86) and a branch.

On x86 the results for cmovcc vs branch are something along the lines
of the following:

perc-zero, Br On Result, Time Br / Time cmov
     0.10,            1,               0.983
     0.10,            0,               1.246
     0.25,            1,               1.035
     0.25,            0,                1.49
     0.33,            1,               1.016
     0.33,            0,               1.579
     0.50,            1,               1.228
     0.50,            0,               1.739
     0.66,            1,               1.039
     0.66,            0,               1.764
     0.75,            1,               0.996
     0.75,            0,               1.642
     0.90,            1,               1.071
     0.90,            0,               1.409
     1.00,            1,               0.937
     1.00,            0,               0.999
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
Noah Goldstein 4c5200dd9f benchtests: Use json-lib in bench-strchr.c
Just QOL change to make parsing the output of the benchtests more
consistent.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-03-25 11:46:13 -05:00
H.J. Lu 564f7ae7b4 benchtests: Use "=" instead of ":=" [BZ ]
Use "=" instead of ":=" to allow sysdeps Makefiles to add more benches
to bench and benchset.  This fixes BZ .
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2022-03-16 08:48:36 -07:00
Su Lifan edddffc9df benchtests: make compare_strings.py accept string as attribute value
Commit ac759b1fbf added attribute
"overlap" to bench-memmove-walk, whose value is a string. This change
makes compare_strings.py fail since benchout_strings.schema.json
requires the values of attributes to be number.

This patch relaxes such constraint.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-03-08 19:42:52 +05:30
H.J. Lu c12c2a41b0 benchtests: Generate .d dependency files [BZ ]
1. Add all .o files to extra-objs.
2. Include ../Rules after extra-objs has been set.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2022-02-25 10:35:25 -08:00
H.J. Lu cf92721bef benchtests: Remove duplicated loop in bench-bzero-walk.c
Remove one of 2 identical loops in bench-bzero-walk.c.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2022-02-25 07:07:43 -08:00
H.J. Lu 89377d41d7 benchtests: Add small sizes (<= 64) to bench-bzero-walk.c
Small sizes (<= 64) represent a large portion of memset usages with zero
value.  Add sizes (<= 64) to bench-bzero-walk.c to cover small sizes.
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2022-02-24 12:28:34 -08:00
H.J. Lu cf97591313 benchtests: Add benches for memset with 0 value
memset with zero as the value to set is by far the majority value (99%+
for Python3 and GCC).  Add bench-memset-zero-large.c,
bench-memset-zero-walk.c and bench-memset-zero.c to measure memset
implementations for zeroing.

Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
2022-02-23 12:07:06 -08:00
H.J. Lu dc98eeeb95 benchtests: Add benches for bzero
Add bench-bzero-large.c, bench-bzero-walk.c and bench-bzero.c.
2022-02-08 14:41:58 -08:00
H.J. Lu 03c9c4fce4 benchtests: Sort benches in Makefile
Put one bench per line and sort them.
2022-02-07 07:09:38 -08:00
Noah Goldstein 69e6992d79 Benchtests: Add length zero benchmark for memset in bench-memset.c
Zero is a relevant size for some workloads (roughly 5% of uses for
GCC) so we should be testing its performance as well.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-02-06 22:01:39 -06:00
Noah Goldstein 90cbb80636 Benchtests: move 'alloc_bufs' from loop in bench-memset.c
One buf allocation is sufficient. Calling `alloc_bufs' in the loop
just adds unnecessary syscall overhead.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2022-02-05 16:48:00 -06:00
Noah Goldstein 80e6c6554b benchtests: Add more coverage for strcmp and strncmp benchmarks
Add more small and medium sized tests for strcmp and strncmp.

As well for strcmp add option for more direct control of
alignment. Previously alignment was being pushed to the end of the
page. While this is the most difficult case to implement, it is far
from the common case and so shouldn't be the only benchmark.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2022-02-03 16:41:43 -06:00
Paul Eggert 581c785bf3 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.

I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah.  I don't
know why I run into these diagnostics whereas others evidently do not.

remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2022-01-01 11:40:24 -08:00
Noah Goldstein ac759b1fbf benchtests: Add partial overlap case in bench-memmove-walk.c
This commit adds a new partial overlap benchmark. This is generally
the most interesting performance case for memmove and was missing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-06 16:17:59 -05:00
Noah Goldstein 5e6cce9b34 benchtests: Add additional cases to bench-memcpy.c and bench-memmove.c
This commit adds more benchmarks for the common memcpy/memmove
benchmarks. The most significant cases are the half page offsets. The
current versions leave dst and src near page aligned, which leads to
false 4k aliasing on x86_64. This can add noise due to false
dependencies from one run to the next. As well, this seems like more
of an edge case than the common case, so it shouldn't be the only thing

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-06 16:17:51 -05:00
Sunil K Pandey 2856829ee7 Revert "benchtests: Add acosf function to bench-math"
This reverts commit 79d0fc6539.
2021-11-05 16:13:12 -07:00
Adhemerval Zanella b8a6ee43bb benchtests: Add hypotf
Based on random input arguments.  About 85% tuples have exponents
of the two arguments close together (+-1 range).

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-01 16:23:39 -03:00
Adhemerval Zanella dba44dbe54 benchtests: Make hypot input random
Instead of inputs based on the algorithm implementation details.
About 85% tuples have exponents of the two arguments close
together (+-1 range).

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-01 16:23:22 -03:00
Sunil K Pandey 79d0fc6539 benchtests: Add acosf function to bench-math
Add acosf function to bench-math and copy acosf-inputs to benchtests.
The motivation for this patch is to prepare for upcoming new libmvec
functions.  The float and double versions of libmvec functions stay
together.

The acosf-inputs file is generated from the acos-inputs file using the
following scaling formula:

f = d * (FLT_MAX/DBL_MAX)

Where d is the input (double) and f is the output (float).  If a scaled
float value is a duplicate in the new input file, nextafterf() is used to
find the next float value, ensuring no duplicates.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-10-29 08:52:30 -07:00
Wilco Dijkstra f392915d1e benchtests: Improve bench-memcpy-random
Improve the random memcpy benchmark. Double the number of tests and increase
the size of the memory region to test between 32KB and 1024KB. This improves
accuracy on modern cores. Clean up formatting of the frequency array.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-10-29 15:45:53 +01:00
Noah Goldstein cf3acd774f Benchtests: Add benchtests for __memcmpeq
No bug. This commit adds __memcmpeq benchmarks. The benchmarks just
use the existing ones in memcmp. This will be useful for testing
implementations of __memcmpeq that do not just alias memcmp.
2021-10-27 13:03:46 -05:00
H.J. Lu d8e7d06381 bench-math: Sort and put each bench per line
Sort and put each math bench per line to prepare for new math benches.

Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-10-23 05:20:25 -07:00
Noah Goldstein 5d26d12f4a benchtests: Add medium cases and increase iters in bench-memset.c
No bug.

This commit adds new medium size cases for lengths in [512, 1024). As
well, it increases the iters to INNER_LOOP_ITERS_LARGE for more reliable
results.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-10-08 15:13:06 -05:00
H.J. Lu de0a7c5a0b benchtests: Building benchmarks as static executables
Building benchmarks as static executables:
=========================================

To build benchmarks as static executables, on the build system, run:

  $ make STATIC-BENCHTESTS=yes bench-build

You can copy benchmark executables to another machine and run them
without copying the source nor build directories.
2021-10-04 10:09:13 -07:00
Noah Goldstein a1c056c9d0 benchtests: Improve reliability of memcmp benchmarks
No bug. Remove reallocation of bufs between implementation tests. Move
initialization outside of foreach implementation test loop. Increase
iteration count.

Before this commit, results generally showed a great deal of variability
between runs. The goal of this commit is to make the results more
reliable.

Benchtests build and bench-memcmp succeeding.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-09-24 18:04:05 -05:00
Naohiro Tamura cb5088cfd3 benchtests: Fix validate_benchout.py exceptions
This patch fixes two exceptions in validate_benchout.py:
1) AttributeError
   if benchout_strings.schema.json is specified, and
2) json.decoder.JSONDecodeError
   if benchout file is not JSON.

$ ~/glibc/benchtests/scripts/validate_benchout.py bench-memset.out \
~/glibc/benchtests/scripts/benchout_strings.schema.json
Traceback (most recent call last):
  File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 86, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 69, in main
    bench.parse_bench(args[0], args[1])
  File "/home/naohirot/glibc/benchtests/scripts/import_bench.py", line 139, in parse_bench
    do_for_all_timings(bench, lambda b, f, v:
  File "/home/naohirot/glibc/benchtests/scripts/import_bench.py", line 107, in do_for_all_timings
    if 'timings' not in bench['functions'][func][k].keys():
AttributeError: 'str' object has no attribute 'keys'

$ ~/glibc/benchtests/scripts/validate_benchout.py bench-math-inlines.out \
~/glibc/benchtests/scripts/benchout_strings.schema.json
Traceback (most recent call last):
  File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 86, in <module>
    sys.exit(main(sys.argv[1:]))
  File "/home/naohirot/glibc/benchtests/scripts/validate_benchout.py", line 69, in main
    bench.parse_bench(args[0], args[1])
  File "/home/naohirot/glibc/benchtests/scripts/import_bench.py", line 137, in parse_bench
    bench = json.load(benchfile)
  File "/usr/lib/python3.6/json/__init__.py", line 299, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.6/json/decoder.py", line 342, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 17 (char 16)

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-09-16 09:19:55 +05:30
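The two failure modes in the tracebacks above, input that is not JSON and a str where a dict was expected, can be guarded against roughly as follows. This is a hedged sketch and not the actual patch to validate_benchout.py or import_bench.py; the helper names and the sample data are hypothetical.

```python
import json


def load_bench(text):
    """Parse benchmark output, returning None when it is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def variants_with_timings(bench):
    """Yield (function, variant) pairs whose entry is a dict holding 'timings'."""
    for func, variants in bench.get("functions", {}).items():
        for name, entry in variants.items():
            # Calling .keys() on a non-dict entry is what raised the
            # AttributeError in the traceback, so check the type first.
            if isinstance(entry, dict) and "timings" in entry:
                yield func, name


# Hypothetical output mixing a plain string attribute with a variant dict.
sample = {"functions": {"memset": {"bench-variant": "default",
                                   "v1": {"timings": [1.0, 2.0]}}}}
print(list(variants_with_timings(sample)))
print(load_bench('{"a": 1} trailing text'))
```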
Naohiro Tamura 2fd36391be benchtests: Remove redundant assert.h
This patch removed redundant "#include <assert.h>" from
bench-memset-large.c and bench-memset-walk.c.

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-09-13 09:08:26 +05:30
Naohiro Tamura 3886eaff9d benchtests: Enable scripts/plot_strings.py to read stdin
This patch enables scripts/plot_strings.py to read a benchmark result
file from stdin.
To keep backward compatibility, that is, to keep accepting multiple
benchmark result files as arguments, an empty argument list doesn't mean
stdin, but '-' does.
Therefore the nargs parameter of the ArgumentParser.add_argument() method
is not changed to '?' but kept as '+'.

ex:
  $ jq '.' bench-memset.out | plot_strings.py -
  $ jq '.' bench-memset.out | plot_strings.py - bench-memset-large.out
  $ plot_strings.py bench-memset.out bench-memset-large.out

error ex:
  $ jq '.' bench-memset.out | plot_strings.py

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-09-13 09:04:21 +05:30
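The argument handling described in this commit, keeping nargs='+' and treating '-' as stdin, can be sketched with argparse. This is an illustration only; the function names and the sample JSON are hypothetical and not taken from plot_strings.py.

```python
import argparse
import io
import json
import sys


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # nargs="+" still requires at least one argument, so an empty
    # command line remains an error instead of silently meaning stdin.
    parser.add_argument("bench", nargs="+",
                        help="benchmark result file, or - for stdin")
    return parser.parse_args(argv)


def load_results(path, stdin=sys.stdin):
    """Load one result file; the special name '-' reads from stdin."""
    if path == "-":
        return json.load(stdin)
    with open(path) as f:
        return json.load(f)


args = parse_args(["-", "bench-memset-large.out"])
demo_stdin = io.StringIO('{"timing_type": "hp_timing"}')  # stands in for a pipe
print(args.bench, load_results(args.bench[0], demo_stdin))
```

Because '-' is an ordinary positional value, it can be mixed freely with file names, which matches the second example in the commit message.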
Fangrui Song 710ba420fd Remove sysdeps/*/tls-macros.h
They provide TLS_GD/TLS_LD/TLS_IE/TLS_LE macros for TLS testing.  Now
that we have migrated to __thread and tls_model attributes, these macros
are unused and the tls-macros.h files can retire.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2021-08-18 09:15:20 -07:00
Paul Zimmermann db737c79c6 Remove obsolete comments/name from several benchtest input files.
These comments refer to slow paths that were removed in
glibc 2.34 or earlier.  The corresponding "names" that yield
separate workload traces for "make bench" are thus obsolete.
We are however keeping the corresponding inputs.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-08-02 15:27:16 +02:00
Paul Zimmermann 4165dd2e95 Remove obsolete comments/name from acos-inputs, since slow path was removed. 2021-08-02 15:05:22 +02:00
Siddhesh Poyarekar 70d08ba204 tests: use xmalloc to allocate implementation array
The benchmark and tests must fail in case of allocation failure in the
implementation array.  Also annotate the x* allocators in support.h so
that the compiler has more information about them.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2021-07-28 17:45:19 +05:30
Naohiro Tamura f12ec02f53 benchtests: Fixed bench-memcpy-random: buf1: mprotect failed
This patch fixes an mprotect system call failure on AArch64.
The failure happened not only on A64FX but also on ThunderX2.

Also this patch updated a JSON key from "max-size" to "length" so that
'plot_strings.py' can process 'bench-memcpy-random.out'
2021-05-26 12:01:06 +01:00
Noah Goldstein fc335a0ded Bench: Add support for choose direction of memcpy in benchtests
This patch adds support for testing memcpy with both dst > src and dst
< src. Since memcpy is implemented as memmove, which has separate
control flows for certain sizes depending on whether dst > src, this seems
like information that should be provided in the benchtest output and a
variable that can be controlled for the benchmarks.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-05-23 19:36:36 -04:00
Noah Goldstein e68d6fccca x86: Expand bench-memcmp.c and test-memcmp.c
No bug. This commit adds some additional performance test cases to
bench-memcmp.c and test-memcmp.c. The new benchtests include some
medium range sizes, as well as small sizes near page cross. The new
correctness tests correspond with the new benchtests though add some
additional cases for checking the page cross logic.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-05-18 22:57:39 -04:00
Matheus Castanho f4605e611a benchtests: Use JSON for bench-rawmemchr output
Convert the output of benchtests/bench-rawmemchr to JSON like other string
benchmarks.  This makes the output more parseable and allows usage of
compare_strings.py, for example.

Reviewed-by: Lucas A. M. Magalhaes <lamm@linux.ibm.com>
2021-05-17 11:10:19 -03:00
Paul Zimmermann 8d0985b055 add workload traces for cbrtl
These workload traces cover the whole "long double" range.
This patch was prepared with the help of Adhemerval Zanella.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-05-10 18:45:34 +02:00
Noah Goldstein 1427d28e30 Bench: Expand bench-memchr.c
No bug. This commit adds some additional cases for bench-memchr.c,
including testing medium sizes and testing short lengths with both an
in-bounds match and an out-of-bounds match.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-05-03 10:18:11 -07:00
H.J. Lu 98544f5bcf bench-memcpy: Collect data from 2KB to 4KB
Collect data on memcpy from 2KB to 4KB with the 64-byte increment value.
2021-05-03 05:08:22 -07:00
Noah Goldstein 81f6dd2135 x86: Expand test-memset.c and bench-memset.c
No bug. This commit adds test cases and benchmarks for page cross and
for memset to the end of the page without crossing. In test-memset.c,
this commit also adds sentinels at the start and end of tstbuf to test
for overwrites.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-04-19 15:08:04 -07:00
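The two memset cases named in 81f6dd2135 (crossing a page, and writing up to the end of a page without crossing) come down to simple offset arithmetic. The following is only an illustrative sketch: the 4096-byte page size and both helper names are assumptions, not taken from test-memset.c or bench-memset.c.

```python
PAGE_SIZE = 4096  # assumed page size; real systems may differ


def offset_ending_at_page_end(length, page_size=PAGE_SIZE):
    """Return the in-page offset at which a write of `length` bytes
    ends exactly on the last byte of the page (hypothetical helper)."""
    if not 0 < length <= page_size:
        raise ValueError("length must be in 1..page_size")
    return page_size - length


def crosses_page(offset, length, page_size=PAGE_SIZE):
    """Return True if the byte range [offset, offset + length) touches
    more than one page (hypothetical helper)."""
    if length <= 0:
        raise ValueError("length must be positive")
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return first_page != last_page


# A 64-byte write placed to end at the page boundary does not cross it.
start = offset_ending_at_page_end(64)
print(start, crosses_page(start, 64))

# Shifting the same write by one byte makes it spill into the next page.
print(start + 1, crosses_page(start + 1, 64))
```

Placing a write at `offset_ending_at_page_end(n)` exercises the "to the end of the page" case, and any larger offset within the page exercises the crossing case.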
Siddhesh Poyarekar 5660ab19f4 benchtests: Fix name of exp10f benchmark variant
Variant names don't accept brackets.
2021-04-18 12:56:33 +05:30
Siddhesh Poyarekar a373aa25c7 benchtests: Fix pthread-locks test to produce valid json
The benchtests json allows {function {variant}} categorization of
results whereas the pthread-locks tests had {function {variant
{subvariant}}}, which broke validation.  Fix that by serializing the
subvariants as variant-subvariant.  Also update the schema to
recognize the new benchmark attributes after fixing the naming
conventions.
2021-04-18 12:56:29 +05:30
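The renaming described in a373aa25c7 can be pictured as flattening one level of nesting. This is a minimal sketch of the idea only: the benchmark itself is written in C, and the function name, dictionary layout and sample values below are made up for illustration.

```python
import json


def flatten_subvariants(results):
    """Collapse {function: {variant: {subvariant: data}}} into
    {function: {"variant-subvariant": data}} (hypothetical helper)."""
    flat = {}
    for function, variants in results.items():
        flat[function] = {}
        for variant, subvariants in variants.items():
            for subvariant, data in subvariants.items():
                # Serialize the two levels into a single variant name.
                flat[function][f"{variant}-{subvariant}"] = data
    return flat


# Hypothetical three-level results; names and numbers are invented.
nested = {
    "pthread-locks": {
        "mutex": {"empty": {"mean": 1.0}, "filler": {"mean": 2.0}},
    }
}

flat = flatten_subvariants(nested)
print(json.dumps(flat, indent=2))
```

After flattening, every entry sits at the {function {variant}} depth that the commit message says the benchtests JSON allows.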
noah 81cbc3bcae x86: Expanding test-memmove.c, test-memcpy.c, bench-memcpy-large.c
No Bug. This commit expands the range of tests / benchmarks for
memmove and memcpy. The test expansion is mostly in the vein of
increasing the maximum size, increasing the number of unique
alignments tested, and testing both source < destination and vice
versa. The benchmark expansion is just to increase the number of
unique alignments. test-memcpy, test-memccpy, test-mempcpy,
test-memmove, and tst-memmove-overflow all pass.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
2021-04-16 12:09:56 -07:00
Paul Zimmermann 934d88d862 add workload traces for missing functions (double format)
This patch adds workload traces for all double format functions where such
files are missing.  For each function, a set of 1000 random values is
generated using SageMath, such that the output values are
meaningful (for example avoiding too large inputs for exp10 where the
output would be +Inf).  More details about the generated values are
given at the beginning of each file.
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2021-03-29 16:23:19 +02:00
Raphael Moreira Zinsly 6cf1911122 benchtests: Add ilogb* tests
Add benchtests for ilogb, ilogbf and ilogbf128 based on the logb* benchtests.
2021-03-16 12:19:09 -03:00
Naohiro Tamura 7960c5eea9 benchtests: Updated json bench-variant attribute
This patch updates json "bench-variant" attribute of "bench-memset.c"
to "default" so that the script "benchtests/scripts/plot_strings.py"
can generate a file "memset_time_default_linear.png".
Without this patch, the script "benchtests/scripts/plot_strings.py"
generates a file "memset_time__linear.png", whose name is inconsistent
with "memcpy_time_default_linear.png" and
"memmove_time_default_linear.png".
2021-02-10 08:50:26 +05:30
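The double underscore mentioned in 7960c5eea9 is what one would expect if the plot name is built by joining fields with underscores and the variant field is empty. The sketch below only reproduces that naming pattern; `plot_filename` and its parameters are hypothetical and are not the actual code of plot_strings.py.

```python
def plot_filename(function, metric, variant, scale):
    """Join the fields with underscores to form a plot file name.
    Hypothetical helper; it only mirrors the names quoted in the commit."""
    return f"{function}_{metric}_{variant}_{scale}.png"


# With an empty variant, two underscores end up adjacent.
print(plot_filename("memset", "time", "", "linear"))

# With the variant set to "default", the name matches memcpy and memmove.
print(plot_filename("memset", "time", "default", "linear"))
```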
noah a00e2fe3df strchr: Add additional benchmarks and tests
This patch adds additional benchmarks and tests for string size of
4096 and several benchmarks for string size 256 with different
alignments.
2021-02-08 11:34:00 -08:00
Arjun Shankar 3725ee39db benchtests: Do not build bench-timing-type with MODULE_NAME=libc
Since commit 2682695e5c, `make bench-build' with `--enable-static-pie'
fails due to bench-timing-type being incorrectly built with MODULE_NAME
set to `libc'.  This commit sets MODULE_NAME to nonlib, thus fixing the
build failure.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-01-26 18:14:19 +01:00
Fangrui Song 87d583c6e8 install: Replace scripts/output-format.sed with objdump -f [BZ ]
GNU ld and gold have supported --print-output-format since 2011. glibc
requires binutils>=2.25 (2015), so if LD is GNU ld or gold, we can
assume the option is supported.

lld is by default a cross linker supporting multiple targets. It auto
detects the file format and does not need OUTPUT_FORMAT. It does not
support --print-output-format.

By parsing objdump -f, we can support all the three linkers.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2021-01-11 12:03:36 -08:00
Paul Eggert 2b778ceb40 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted of lines saying "FOO: warning:
copyright statement not found" for each of 6694 files FOO.
I then removed trailing white space from benchtests/bench-pthread-locks.c
and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this
diagnostic from Savannah:
remote: *** pre-commit check failed ...
remote: *** error: lines with trailing whitespace found
remote: error: hook declined to update refs/heads/master
2021-01-02 12:17:34 -08:00
DJ Delorie 4be44c3208 New benchtest: pthread locks
Performance benchmarks for various POSIX locks: mutex, rwlock,
spinlock, condvar, and semaphore.  Each test is performed with
an empty loop body or with a computationally "interesting" one (i.e.
difficult to optimize away, and used just to allow lock code to
be "hidden" in the filler's CPU cycles).
2020-10-21 11:03:52 -04:00
H.J. Lu 06e95b93f0 bench-strcmp.c: Add workloads on page boundary
Add strcmp workloads on page boundary.
2020-09-24 10:46:38 -07:00
H.J. Lu c4277ba234 bench-strncmp.c: Add workloads on page boundary
Add strncmp workloads on page boundary.
2020-09-24 10:46:30 -07:00
Arjun Shankar 03e26098b1 benchtests: Run _Float128 tests only on architectures that support it
__float128 is a non-standard name and is not available on some architectures
(like aarch64 or s390x) even though they may support the standard _Float128
type.  Other architectures (like armv7) don't support quad-precision
floating-point operations at all.

This commit replaces benchtests references to __float128 with _Float128 and
runs the corresponding tests only on architectures that support it.
2020-09-23 16:11:57 +02:00
Paul Zimmermann 26fbd74059 benchtests: Add "workload" traces for sinf128
This patch adds workload traces for sinf128 in binary32.  The trace is
made of 1000 random numbers, generated with SageMath.
2020-09-10 15:25:22 -03:00