From f3dcae82d54e5097e18e1d6ef4ff55c2ea4e621e Mon Sep 17 00:00:00 2001
From: "H.J. Lu"
Date: Tue, 25 Aug 2015 04:33:54 -0700
Subject: Save and restore vector registers in x86-64 ld.so

This patch adds SSE, AVX and AVX512 versions of _dl_runtime_resolve
and _dl_runtime_profile, which save and restore the first 8 vector
registers used for parameter passing.  elf_machine_runtime_setup
selects the proper _dl_runtime_resolve or _dl_runtime_profile based
on _dl_x86_cpu_features.  It avoids race condition caused by
FOREIGN_CALL macros, which are only used for x86-64.

Performance impact of saving and restoring 8 vector registers are
negligible on Nehalem, Sandy Bridge, Ivy Bridge and Haswell when
ld.so is optimized with SSE2.

	[BZ #15128]
	* sysdeps/x86_64/Makefile [$(subdir) == elf] (tests): Add
	ifuncmain8.
	(modules-names): Add ifuncmod8.
	($(objpfx)ifuncmain8): New rule.
	* sysdeps/x86_64/dl-machine.h: Include and
	.
	(elf_machine_runtime_setup): Use _dl_runtime_resolve_sse,
	_dl_runtime_resolve_avx, or _dl_runtime_resolve_avx512,
	_dl_runtime_profile_sse, _dl_runtime_profile_avx, or
	_dl_runtime_profile_avx512, based on HAS_ARCH_FEATURE.
	* sysdeps/x86_64/dl-trampoline.S: Rewrite.
	* sysdeps/x86_64/dl-trampoline.h: Likewise.
	* sysdeps/x86_64/ifuncmain8.c: New file.
	* sysdeps/x86_64/ifuncmod8.c: Likewise.
	* sysdeps/x86_64/nptl/tcb-offsets.sym (RTLD_SAVESPACE_SSE):
	Removed.
	* sysdeps/x86_64/nptl/tls.h (__128bits): Removed.
	(tcbhead_t): Change rtld_must_xmm_save to __glibc_unused1.
	Change rtld_savespace_sse to __glibc_unused2.
	(RTLD_CHECK_FOREIGN_CALL): Removed.
	(RTLD_ENABLE_FOREIGN_CALL): Likewise.
	(RTLD_PREPARE_FOREIGN_CALL): Likewise.
	(RTLD_FINALIZE_FOREIGN_CALL): Likewise.
---
 ChangeLog | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

(limited to 'ChangeLog')

diff --git a/ChangeLog b/ChangeLog
index 457778a5d8..0dfa3b3c8e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,30 @@
+2015-08-25  H.J. Lu
+
+	[BZ #15128]
+	* sysdeps/x86_64/Makefile [$(subdir) == elf] (tests): Add
+	ifuncmain8.
+	(modules-names): Add ifuncmod8.
+	($(objpfx)ifuncmain8): New rule.
+	* sysdeps/x86_64/dl-machine.h: Include and
+	.
+	(elf_machine_runtime_setup): Use _dl_runtime_resolve_sse,
+	_dl_runtime_resolve_avx, or _dl_runtime_resolve_avx512,
+	_dl_runtime_profile_sse, _dl_runtime_profile_avx, or
+	_dl_runtime_profile_avx512, based on HAS_ARCH_FEATURE.
+	* sysdeps/x86_64/dl-trampoline.S: Rewrite.
+	* sysdeps/x86_64/dl-trampoline.h: Likewise.
+	* sysdeps/x86_64/ifuncmain8.c: New file.
+	* sysdeps/x86_64/ifuncmod8.c: Likewise.
+	* sysdeps/x86_64/nptl/tcb-offsets.sym (RTLD_SAVESPACE_SSE):
+	Removed.
+	* sysdeps/x86_64/nptl/tls.h (__128bits): Removed.
+	(tcbhead_t): Change rtld_must_xmm_save to __glibc_unused1.
+	Change rtld_savespace_sse to __glibc_unused2.
+	(RTLD_CHECK_FOREIGN_CALL): Removed.
+	(RTLD_ENABLE_FOREIGN_CALL): Likewise.
+	(RTLD_PREPARE_FOREIGN_CALL): Likewise.
+	(RTLD_FINALIZE_FOREIGN_CALL): Likewise.
+
 2015-08-24  Wilco Dijkstra
 
 	* sysdeps/aarch64/bzero.S (__bzero): Remove.
-- 
cgit v1.2.3
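
Reader's note on the log message: the selection it describes for elf_machine_runtime_setup amounts to a one-time choice among trampolines keyed on CPU features. What follows is a minimal sketch in C, not the glibc code. The feature enum, select_resolver, and the stub functions are hypothetical stand-ins for HAS_ARCH_FEATURE and the real assembly entry points, and the ordering (widest usable vector extension first, SSE as the fallback) is an assumption made for illustration.

```c
/* Hypothetical feature bits; glibc queries _dl_x86_cpu_features instead. */
enum cpu_feature {
    FEATURE_AVX     = 1u << 0,
    FEATURE_AVX512F = 1u << 1
};

/* Each stub stands in for one trampoline variant and just names itself. */
typedef const char *(*resolver_fn)(void);

static const char *resolve_sse(void)    { return "sse"; }
static const char *resolve_avx(void)    { return "avx"; }
static const char *resolve_avx512(void) { return "avx512"; }

/* Pick the variant that saves the widest registers the CPU can use.
   Assumed ordering: AVX512 first, then AVX, then the SSE baseline. */
static resolver_fn select_resolver(unsigned features)
{
    if (features & FEATURE_AVX512F)
        return resolve_avx512;
    if (features & FEATURE_AVX)
        return resolve_avx;
    return resolve_sse;
}
```

Because the choice is made once at setup time, the lazy-binding path itself needs no per-call flag, which is presumably why the per-thread FOREIGN_CALL bookkeeping could be dropped.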