Vector variants of the new C23 rsqrt routines for both AdvSIMD and
SVE, as well as in both single and double precision.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Vector variants of the new C23 log10p1 routines.
Note: Benchmark inputs for log10p1(f) are identical to log1p(f)
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Vector variants of the new C23 log2p1 routines.
Note: Benchmark inputs for log2p1(f) are identical to log1p(f).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Vector variants of the new C23 exp2m1 & exp10m1 routines.
Note: Benchmark inputs for exp2m1 & exp10m1 are identical to exp2 & exp10
respectively, this also includes the floating point variations.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Implement double and single precision variants of the C23 routine atan2pi
for both AdvSIMD and SVE.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Implement double and single precision variants of the C23 routine atanpi
for both AdvSIMD and SVE.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Implement double and single precision variants of the C23 routine asinpi
for both AdvSIMD and SVE.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Implement double and single precision variants of the C23 routine acospi
for both AdvSIMD and SVE.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Plus a small amount of moving includes around in order to be able to
remove duplicate definition of asuint64.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Optimised implementations for single and double precision, Advanced
SIMD and SVE, copied from Arm Optimized Routines.
As previously, data tables are used via a barrier to prevent
overly aggressive constant inlining. Special-case handlers are
marked NOINLINE to avoid incurring the penalty of switching call
standards unnecessarily.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Optimised implementations for single and double precision, Advanced
SIMD and SVE, copied from Arm Optimized Routines. Log lookup table
added as HIDDEN symbol to allow it to be shared between AdvSIMD and
SVE variants.
As previously, data tables are used via a barrier to prevent
overly aggressive constant inlining. Special-case handlers are
marked NOINLINE to avoid incurring the penalty of switching call
standards unnecessarily.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Optimised implementations for single and double precision, Advanced
SIMD and SVE, copied from Arm Optimized Routines.
As previously, data tables are used via a barrier to prevent
overly aggressive constant inlining. Special-case handlers are
marked NOINLINE to avoid incurring the penalty of switching call
standards unnecessarily.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
This patch enables libmvec on AArch64. The proposed change is mainly
implementing build infrastructure to add the new routines to ABI,
tests and benchmarks. I have demonstrated how this all fits together
by adding implementations for vector cos, in both single and double
precision, targeting both Advanced SIMD and SVE.
The implementations of the routines themselves are just loops over the
scalar routine from libm for now, as we are more concerned with
getting the plumbing right at this point. We plan to contribute vector
routines from the Arm Optimized Routines repo that are compliant with
requirements described in the libmvec wiki.
Building libmvec requires minimum GCC 10 for SVE ACLE. To avoid raising
the minimum GCC by such a big jump, we allow users to disable libmvec
if their compiler is too old.
Note that at this point users have to manually call the vector math
functions. This seems to be acceptable to some downstream users.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>