aarch64: Add half-width versions of AdvSIMD f32 libmvec routines

Compilers may emit calls to 'half-width' routines (two-lane single-precision variants). These have been added as wrappers around the full-width versions, where the low half of the vector is simply duplicated. This will perform poorly when one lane triggers the special-case handler, as there will be a redundant call to the scalar version; however, this is expected to be rare at -Ofast.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
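The wrapper scheme and its special-case cost can be illustrated with a small portable model. This is only a sketch, not the glibc implementation: `vec4f`, `vec2f`, `full_width`, `half_width`, `scalar_fallback` and `scalar_calls` are hypothetical names, the lane arithmetic is a placeholder rather than a real asinf, and it assumes the wrapper widens its input by repeating the two lanes (the effect of `vcombine_f32 (x, x)`) and returns the low half of the result (`vget_low_f32`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for float32x4_t and float32x2_t, so the sketch
   builds on any target.  */
typedef struct { float lane[4]; } vec4f;
typedef struct { float lane[2]; } vec2f;

/* Counts scalar fallback calls, to make the redundant call visible.  */
static int scalar_calls;

/* Stand-in for the scalar routine (not a real asinf).  */
static float scalar_fallback (float x)
{
  scalar_calls++;
  return x;
}

/* Stand-in for the full-width routine: lanes with |x| > 1 are treated as
   special and handed to the scalar fallback; the other lanes take a
   placeholder fast path.  */
static vec4f full_width (vec4f x)
{
  vec4f y;
  for (size_t i = 0; i < 4; i++)
    {
      float ax = x.lane[i] < 0 ? -x.lane[i] : x.lane[i];
      y.lane[i] = ax > 1.0f ? scalar_fallback (x.lane[i]) : 2.0f * x.lane[i];
    }
  return y;
}

/* Half-width wrapper (assumed shape): repeat the two input lanes in both
   halves, call the full-width routine, keep the low half.  */
static vec2f half_width (vec2f x)
{
  vec4f wide = { { x.lane[0], x.lane[1], x.lane[0], x.lane[1] } };
  vec4f res = full_width (wide);
  vec2f out = { { res.lane[0], res.lane[1] } };
  return out;
}

/* Usage: one ordinary lane and one special lane.  Returns the number of
   scalar fallback calls made for the two-lane input.  */
static int demo_redundant_calls (void)
{
  vec2f in = { { 0.25f, 3.0f } };
  scalar_calls = 0;
  vec2f out = half_width (in);
  assert (out.lane[0] == 0.5f);   /* fast path */
  assert (out.lane[1] == 3.0f);   /* scalar fallback */
  return scalar_calls;            /* 2: one of the two calls is redundant */
}
```

In this model a single special lane in the two-lane input shows up twice in the widened vector, so the scalar fallback runs twice for one useful result. That is the redundancy the commit message accepts as rare.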
author: Joe Ramsay <Joe.Ramsay@arm.com> 2023-12-19 16:44:01 +0000
committer: Szabolcs Nagy <szabolcs.nagy@arm.com> 2023-12-20 08:41:25 +0000
commit: cc0d77ba944cd4ce46c5f0e6d426af3057962ca5 (patch)
tree: 840c09b10bcb0ad4f733e8cb4bce2acbd92e5945 /sysdeps/aarch64/fpu/asinf_advsimd.c
parent: 3150cc0c9019bf9da841419f86dda8e7f26d676d (diff)
1 file changed, 3 insertions, 1 deletion
diff --git a/sysdeps/aarch64/fpu/asinf_advsimd.c b/sysdeps/aarch64/fpu/asinf_advsimd.c
index 3180ae7c8e..9a100e52fe 100644
--- a/sysdeps/aarch64/fpu/asinf_advsimd.c
+++ b/sysdeps/aarch64/fpu/asinf_advsimd.c
@@ -63,7 +63,7 @@ special_case (float32x4_t x, float32x4_t y, uint32x4_t special)
 
    The largest observed error in this region is 2.41 ulps,
      _ZGVnN4v_asinf (0x1.00203ep-1) got 0x1.0c3a64p-1 want 0x1.0c3a6p-1.  */
-float32x4_t VPCS_ATTR V_NAME_F1 (asin) (float32x4_t x)
+float32x4_t VPCS_ATTR NOINLINE V_NAME_F1 (asin) (float32x4_t x)
 {
   const struct data *d = ptr_barrier (&data);
 
@@ -102,3 +102,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (asin) (float32x4_t x)
   /* Copy sign.  */
   return vbslq_f32 (v_u32 (AbsMask), y, x);
 }
+libmvec_hidden_def (V_NAME_F1 (asin))
+HALF_WIDTH_ALIAS_F1 (asin)