From c70e4e9c9efff9df4c847dd7cfd81bae674219ab Mon Sep 17 00:00:00 2001
From: "H.J. Lu"
Date: Mon, 8 Jan 2018 08:04:26 -0800
Subject: x86-64: Add sincosf with vector FMA

Since the x86-64 assembly version of sincosf is highly optimized with
vector instructions, there isn't much room for improvement.  However,
s_sincosf.c written in C with vector math and intrinsics can be optimized
by GCC with FMA.

On Skylake, bench-sincosf reports performance improvement:

	Assembly	FMA		improvement
max	104.042		101.008		3%
min	9.426		8.586		10%
mean	20.6209		18.2238		13%

	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
	Add s_sincosf-sse2 and s_sincosf-fma.
	(CFLAGS-s_sincosf-fma.c): New.
	* sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: New file.
	* sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise.
	* sysdeps/x86_64/fpu/multiarch/s_sincosf.c: Likewise.
	* sysdeps/x86_64/fpu/s_sincosf.S: Don't add alias if
	__sincosf is defined.
---
 ChangeLog | 11 +++++++++++
 1 file changed, 11 insertions(+)

(limited to 'ChangeLog')

diff --git a/ChangeLog b/ChangeLog
index 09ea55ea2b..ad0641e232 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2018-01-08 H.J. Lu
+
+	* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
+	Add s_sincosf-sse2 and s_sincosf-fma.
+	(CFLAGS-s_sincosf-fma.c): New.
+	* sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: New file.
+	* sysdeps/x86_64/fpu/multiarch/s_sincosf-sse2.S: Likewise.
+	* sysdeps/x86_64/fpu/multiarch/s_sincosf.c: Likewise.
+	* sysdeps/x86_64/fpu/s_sincosf.S: Don't add alias if
+	__sincosf is defined.
+
 2018-01-08 Florian Weimer
 
 	* nptl/tst-thread-exit-clobber.cc: New file.
-- 
cgit v1.2.3