Commit Graph

31 Commits

Author SHA1 Message Date
Joe Ramsay 080998f6e7 AArch64: Add vector tanpi routines
Vector variant of the new C23 tanpi. New tests pass on AArch64.
2025-01-03 21:39:56 +00:00
Joe Ramsay 40c3a06293 AArch64: Add vector cospi routines
Vector variant of the new C23 cospi. New tests pass on AArch64.
2025-01-03 21:39:56 +00:00
Joe Ramsay 6050b45716 AArch64: Add vector sinpi to libmvec
Vector variant of the new C23 sinpi. New tests pass on AArch64.
2025-01-03 21:39:56 +00:00
Paul Eggert 2642002380 Update copyright dates with scripts/update-copyrights 2025-01-01 11:22:09 -08:00
Joe Ramsay 0fed0b250f aarch64/fpu: Add vector variants of pow
Plus a small amount of moving includes around in order to be able to
remove duplicate definition of asuint64.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-05-21 14:38:49 +01:00
Joe Ramsay 75207bde68 aarch64/fpu: Add vector variants of cbrt
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-05-16 14:35:06 +01:00
Joe Ramsay 157f89fa3d aarch64/fpu: Add vector variants of hypot
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-05-16 14:34:43 +01:00
Joe Ramsay 87cb1dfcd6 aarch64/fpu: Add vector variants of erfc
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:33:24 +01:00
Joe Ramsay 3d3a4fb8e4 aarch64/fpu: Add vector variants of tanh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:33:20 +01:00
Joe Ramsay eedbbca0bf aarch64/fpu: Add vector variants of sinh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:33:16 +01:00
Joe Ramsay 8b67920528 aarch64/fpu: Add vector variants of atanh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:33:12 +01:00
Joe Ramsay 81406ea3c5 aarch64/fpu: Add vector variants of asinh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:33:02 +01:00
Joe Ramsay b09fee1d21 aarch64/fpu: Add vector variants of acosh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:32:58 +01:00
Joe Ramsay bdb5705b7b aarch64/fpu: Add vector variants of cosh
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:32:52 +01:00
Joe Ramsay cb5d84f1f8 aarch64/fpu: Add vector variants of erf
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2024-04-04 10:32:48 +01:00
Paul Eggert dff8da6b3e Update copyright dates with scripts/update-copyrights 2024-01-01 10:53:40 -08:00
Joe Ramsay a8830c9285 aarch64: Add vector implementations of expm1 routines
May discard sign of 0 - auto tests for -0 and -0x1p-10000 updated accordingly.
2023-11-20 17:53:14 +00:00
Joe Ramsay 3548a4f087 aarch64: Add vector implementations of log1p routines
May discard sign of zero.
2023-11-10 17:07:43 +00:00
Joe Ramsay b07038c5d3 aarch64: Add vector implementations of atan2 routines 2023-11-10 17:07:43 +00:00
Joe Ramsay d30c39f80d aarch64: Add vector implementations of atan routines 2023-11-10 17:07:42 +00:00
Joe Ramsay b5d23367a8 aarch64: Add vector implementations of acos routines 2023-11-10 17:07:42 +00:00
Joe Ramsay 9bed498418 aarch64: Add vector implementations of asin routines 2023-11-10 17:07:42 +00:00
Joe Ramsay 31aaf6fed9 aarch64: Add vector implementations of exp10 routines
Double-precision routines either reuse the exp table (AdvSIMD) or use
SVE FEXPA intruction.
2023-10-23 15:00:45 +01:00
Joe Ramsay 067a34156c aarch64: Add vector implementations of log10 routines
A table is also added, which is shared between AdvSIMD and SVE log10.
2023-10-23 15:00:45 +01:00
Joe Ramsay a8e3ab3074 aarch64: Add vector implementations of log2 routines
A table is also added, which is shared between AdvSIMD and SVE log2.
2023-10-23 15:00:45 +01:00
Joe Ramsay b39e9db5e3 aarch64: Add vector implementations of exp2 routines
Some routines reuse table from v_exp_data.c
2023-10-23 15:00:45 +01:00
Joe Ramsay f554334c05 aarch64: Add vector implementations of tan routines
This includes some utility headers for evaluating polynomials using
various schemes.
2023-10-23 15:00:44 +01:00
Joe Ramsay 4a9392ffc2 aarch64: Add vector implementations of exp routines
Optimised implementations for single and double precision, Advanced
SIMD and SVE, copied from Arm Optimized Routines.

As previously, data tables are used via a barrier to prevent
overly aggressive constant inlining. Special-case handlers are
marked NOINLINE to avoid incurring the penalty of switching call
standards unnecessarily.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-06-30 09:04:26 +01:00
Joe Ramsay 78c01a5cbe aarch64: Add vector implementations of log routines
Optimised implementations for single and double precision, Advanced
SIMD and SVE, copied from Arm Optimized Routines. Log lookup table
added as HIDDEN symbol to allow it to be shared between AdvSIMD and
SVE variants.

As previously, data tables are used via a barrier to prevent
overly aggressive constant inlining. Special-case handlers are
marked NOINLINE to avoid incurring the penalty of switching call
standards unnecessarily.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-06-30 09:04:22 +01:00
Joe Ramsay 3bb1af2051 aarch64: Add vector implementations of sin routines
Optimised implementations for single and double precision, Advanced
SIMD and SVE, copied from Arm Optimized Routines.

As previously, data tables are used via a barrier to prevent
overly aggressive constant inlining. Special-case handlers are
marked NOINLINE to avoid incurring the penalty of switching call
standards unnecessarily.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-06-30 09:04:16 +01:00
Joe Ramsay cd94326a13 Enable libmvec support for AArch64
This patch enables libmvec on AArch64. The proposed change is mainly
implementing build infrastructure to add the new routines to ABI,
tests and benchmarks. I have demonstrated how this all fits together
by adding implementations for vector cos, in both single and double
precision, targeting both Advanced SIMD and SVE.

The implementations of the routines themselves are just loops over the
scalar routine from libm for now, as we are more concerned with
getting the plumbing right at this point. We plan to contribute vector
routines from the Arm Optimized Routines repo that are compliant with
requirements described in the libmvec wiki.

Building libmvec requires minimum GCC 10 for SVE ACLE. To avoid raising
the minimum GCC by such a big jump, we allow users to disable libmvec
if their compiler is too old.

Note that at this point users have to manually call the vector math
functions. This seems to be acceptable to some downstream users.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
2023-05-03 12:09:49 +01:00