| Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch adds the narrowing subtract functions from TS 18661-1 to
glibc's libm: fsub, fsubl, dsubl, f32subf64, f32subf32x, f32xsubf64
for all configurations; f32subf64x, f32subf128, f64subf64x,
f64subf128, f32xsubf64x, f32xsubf128, f64xsubf128 for configurations
with _Float64x and _Float128; __nldbl_dsubl for ldbl-opt.
The changes are essentially the same as for the narrowing add
functions, so the description of those generally applies to this patch
as well.
Tested for x86_64, x86, mips64 (all three ABIs, both hard and soft
float) and powerpc, and with build-many-glibcs.py.
* math/Makefile (libm-narrow-fns): Add sub.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing subtract functions.
* math/bits/mathcalls-narrow.h (sub): Use __MATHCALL_NARROW.
* math/gen-auto-libm-tests.c (test_functions): Add sub.
* math/math-narrow.h (CHECK_NARROW_SUB): New macro.
(NARROW_SUB_ROUND_TO_ODD): Likewise.
(NARROW_SUB_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__fsubl): New
macro.
(__dsubl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fsub and
dsub.
(CFLAGS-nldbl-dsub.c): New variable.
(CFLAGS-nldbl-fsub.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_dsubl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_dsubl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fsub, fsubl,
dsubl, fMsubfN, fMsubfNx, fMxsubfN and fMxsubfNx.
* math/auto-libm-test-in: Add tests of sub.
* math/auto-libm-test-out-narrow-sub: New generated file.
* math/libm-test-narrow-sub.inc: New file.
* sysdeps/i386/fpu/s_f32xsubf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xsubf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fsub.c: Likewise.
* sysdeps/ieee754/float128/s_f32subf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64subf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xsubf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xsubf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_dsubl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_fsubl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dsub.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fsub.c: Likewise.
* sysdeps/ieee754/soft-fp/s_dsubl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fsub.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fsubl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
|
|
This patch series cleans up the many uses of __ieee754_sqrt(f/l) in GLIBC.
The goal is to enable GCC to do the inlining, and if this fails call the
__ieee754_sqrt function. This is done by internally declaring sqrt with asm
redirects. The compat symbols and sqrt wrappers need to disable the redirect.
The redirect is also disabled if there are already redirects defined when
using -ffinite-math-only.
All math functions (but not math tests, non-library code and libnldbl) are
built with -fno-math-errno which means GCC will typically inline sqrt as a
single instruction. This means targets are no longer forced to add a special
inline for sqrt.
* include/math.h (sqrt): Declare with asm redirect.
(sqrtf): Likewise.
(sqrtl): Likewise.
(sqrtf128): Likewise.
* Makeconfig: Add -fno-math-errno for libc/libm, but build testsuite,
nonlib and libnldbl with -fmath-errno.
* math/w_sqrt_compat.c: Define NO_MATH_REDIRECT.
* math/w_sqrt_template.c: Likewise.
* math/w_sqrtf_compat.c: Likewise.
* math/w_sqrtl_compat.c: Likewise.
* sysdeps/i386/fpu/w_sqrt.c: Likewise.
* sysdeps/i386/fpu/w_sqrt_compat.c: Likewise.
* sysdeps/generic/math-type-macros-float128.h: Remove math.h and
complex.h.
|
|
Remove the now unused mplog and mpexp files.
* math/Makefile: Remove mpexp.c and mplog.c
* sysdeps/i386/fpu/mpexp.c: Delete file.
* sysdeps/i386/fpu/mplog.c: Likewise.
* sysdeps/ia64/fpu/mpexp.c: Likewise.
* sysdeps/ia64/fpu/mplog.c: Likewise.
* sysdeps/ieee754/dbl-64/e_exp.c: Remove mention of mpexp and mplog.
* sysdeps/ieee754/dbl-64/mpa.h (__pow_mp): Remove unused function.
* sysdeps/ieee754/dbl-64/mpexp.c: Delete file.
* sysdeps/ieee754/dbl-64/mplog.c: Likewise.
* sysdeps/m68k/m680x0/fpu/mpexp.c: Likewise.
* sysdeps/m68k/m680x0/fpu/mplog.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/Makefile: Remove mpexp* and mplog*.
* sysdeps/x86_64/fpu/multiarch/e_log-avx.c: Remove unused defines.
* sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/e_log-fma4.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/mpexp-avx.c: Delete file.
* sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/mpexp-fma4.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/mplog-avx.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/mplog-fma4.c: Likewise.
|
|
Remove the __slowexp code, so exp is no longer correctly rounded. The
result is computed to about 70 bits precision so the worst case ulp
error is about 0.500007 in nearest rounding mode.
* manual/probes.texi: Remove slowexp probes.
* math/Makefile: Remove slowexp.
* sysdeps/generic/math_private.h (__slowexp): Remove.
* sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Remove __slowexp and
document error bounds.
* sysdeps/i386/fpu/slowexp.c: Remove.
* sysdeps/ia64/fpu/slowexp.c: Remove.
* sysdeps/ieee754/dbl-64/slowexp.c: Remove.
* sysdeps/ieee754/dbl-64/uexp.h (err_0): Remove.
* sysdeps/m68k/m680x0/fpu/slowexp.c: Remove.
* sysdeps/powerpc/power4/fpu/Makefile (CPPFLAGS-slowexp.c): Remove.
* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowexp-fma.
* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Remove.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Remove.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Remove.
* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Remove.
* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Remove.
* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Remove.
|
|
Remove the slow paths from pow. Like several other double precision math
functions, pow is exactly rounded. This is not required from math functions
and causes major overheads as it requires multiple fallbacks using higher
precision arithmetic if a result is close to 0.5ULP. Ridiculous slowdowns
of up to 100000x have been reported when the highest precision path triggers.
All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1).
The worst case error is ~0.506ULP. A simple test over a few hundred million
values shows pow is 10% faster on average. This fixes BZ #13932.
[BZ #13932]
* sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove.
* benchtests/pow-inputs: Update comment for slow path cases.
* manual/probes.texi (slowpow_p10): Delete removed probe.
(slowpow_p10): Likewise.
* math/Makefile: Remove halfulp.c and slowpow.c.
* sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1.
* sysdeps/generic/math_private.h (__exp1): Remove error argument.
(__halfulp): Remove.
(__slowpow): Remove.
* sysdeps/i386/fpu/halfulp.c: Delete file.
* sysdeps/i386/fpu/slowpow.c: Likewise.
* sysdeps/ia64/fpu/halfulp.c: Likewise.
* sysdeps/ia64/fpu/slowpow.c: Likewise.
* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument,
improve comments and add error analysis.
* sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis.
(power1): Remove function:
(log1): Remove error argument, add error analysis.
(my_log2): Remove function.
* sysdeps/ieee754/dbl-64/halfulp.c: Delete file.
* sysdeps/ieee754/dbl-64/slowpow.c: Likewise.
* sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise.
* sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise.
* sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c.
* sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1.
* sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c,
slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c.
* sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define.
* sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise.
* sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file.
* sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise.
|
|
This patch adds the narrowing add functions from TS 18661-1 to glibc's
libm: fadd, faddl, daddl, f32addf64, f32addf32x, f32xaddf64 for all
configurations; f32addf64x, f32addf128, f64addf64x, f64addf128,
f32xaddf64x, f32xaddf128, f64xaddf128 for configurations with
_Float64x and _Float128; __nldbl_daddl for ldbl-opt. As discussed for
the build infrastructure patch, tgmath.h support is deliberately
deferred, and FP_FAST_* macros are not applicable without optimized
function implementations.
Function implementations are added for all relevant pairs of formats
(including certain cases of a format and itself where more than one
type has that format). The main implementations use round-to-odd, or
a trivial computation in the case where both formats are the same or
where the wider format is IBM long double (in which case we don't
attempt to be correctly rounding). The sysdeps/ieee754/soft-fp
implementations use soft-fp, and are used automatically for
configurations without exceptions and rounding modes by virtue of
existing Implies files. As previously discussed, optimized versions
for particular architectures are possible, but not included.
i386 gets a special version of f32xaddf64 to avoid problems with
double rounding (similar to the existing fdim version), since this
function must round just once without an intermediate rounding to long
double. (No such special version is needed for any other function,
because the nontrivial functions use round-to-odd, which does the
intermediate computation with the rounding mode set to round-to-zero,
and double rounding is OK except in round-to-nearest mode, so is OK
for that intermediate round-to-zero computation.) mul and div will
need slightly different special versions for i386 (using round-to-odd
on long double instead of precision control) because of the
possibility of inexact intermediate results in the subnormal range for
double.
To reduce duplication among the different function implementations,
math-narrow.h gets macros CHECK_NARROW_ADD, NARROW_ADD_ROUND_TO_ODD
and NARROW_ADD_TRIVIAL.
In the trivial cases and for any architecture-specific optimized
implementations, the overhead of the errno setting might be
significant, but I think that's best handled through compiler built-in
functions rather than providing separate no-errno versions in glibc
(and likewise there are no __*_finite entry points for these function
provided, __*_finite effectively being no-errno versions at present in
most cases).
Tested for x86_64 and x86, with both GCC 6 and GCC 7. Tested for
mips64 (all three ABIs, both hard and soft float) and powerpc with GCC
7. Tested with build-many-glibcs.py with both GCC 6 and GCC 7.
* math/Makefile (libm-narrow-fns): Add add.
(libm-test-funcs-narrow): Likewise.
* math/Versions (GLIBC_2.28): Add narrowing add functions.
* math/bits/mathcalls-narrow.h (add): Use __MATHCALL_NARROW .
* math/gen-auto-libm-tests.c (test_functions): Add add.
* math/math-narrow.h (CHECK_NARROW_ADD): New macro.
(NARROW_ADD_ROUND_TO_ODD): Likewise.
(NARROW_ADD_TRIVIAL): Likewise.
* sysdeps/ieee754/float128/float128_private.h (__faddl): New
macro.
(__daddl): Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Add fadd and
dadd.
(CFLAGS-nldbl-dadd.c): New variable.
(CFLAGS-nldbl-fadd.c): Likewise.
* sysdeps/ieee754/ldbl-opt/Versions (GLIBC_2.28): Add
__nldbl_daddl.
* sysdeps/ieee754/ldbl-opt/nldbl-compat.h (__nldbl_daddl): New
prototype.
* manual/arith.texi (Misc FP Arithmetic): Document fadd, faddl,
daddl, fMaddfN, fMaddfNx, fMxaddfN and fMxaddfNx.
* math/auto-libm-test-in: Add tests of add.
* math/auto-libm-test-out-narrow-add: New generated file.
* math/libm-test-narrow-add.inc: New file.
* sysdeps/i386/fpu/s_f32xaddf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_f32xaddf64.c: Likewise.
* sysdeps/ieee754/dbl-64/s_fadd.c: Likewise.
* sysdeps/ieee754/float128/s_f32addf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64addf128.c: Likewise.
* sysdeps/ieee754/float128/s_f64xaddf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_f64xaddf128.c: Likewise.
* sysdeps/ieee754/ldbl-128/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_daddl.c: Likewise.
* sysdeps/ieee754/ldbl-96/s_faddl.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-dadd.c: Likewise.
* sysdeps/ieee754/ldbl-opt/nldbl-fadd.c: Likewise.
* sysdeps/ieee754/soft-fp/s_daddl.c: Likewise.
* sysdeps/ieee754/soft-fp/s_fadd.c: Likewise.
* sysdeps/ieee754/soft-fp/s_faddl.c: Likewise.
* sysdeps/powerpc/fpu/libm-test-ulps: Update.
* sysdeps/mach/hurd/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/riscv/rv64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
|
|
Testing narrowing functions for x86_64 with GCC 6 showed up a further
testsuite fix needed: there is no _Float128 sNaN support before GCC 7
on x86_64 / x86, and the existing tests of SNAN_TESTS only checked it
for the return type, not for the argument type. This patch fixes the
code to check SNAN_TESTS (ARG_FLOAT) as well (in a variable set in
libm-test-driver.c, since libm-test-support.c is compiled only once
for each choice of FLOAT).
Tested for x86_64 and x86 with GCC 6 in conjunction with the main
patch adding narrowing add functions.
* math/libm-test-driver.c (snan_tests_arg): New variable.
* math/libm-test-support.h (snan_tests_arg): New declaration.
* math/libm-test-support.c (enable_test): Check snan_tests_arg.
|
|
This patch continues preparations for adding TS 18661-1 narrowing libm
functions by adding the required testsuite infrastructure to test such
functions through the libm-test infrastructure.
That infrastructure is based around testing for a single type, FLOAT.
For the narrowing functions, FLOAT, the "main" type for testing, is
the function return type; the argument type is ARG_FLOAT. This is
consistent with how the code built once for each type,
libm-test-support.c, depends on FLOAT for such things as calculating
ulps errors in results but can already handle different argument types
(pointers, integers, long double for nexttoward).
Makefile machinery is added to handle building tests for all pairs of
types for which there are narrowing functions (as with non-narrowing
functions, aliases are tested just the same as the functions they
alias). gen-auto-libm-tests gains a --narrow option for building
outputs for narrowing functions (so narrowing sqrt and fma will share
the same inputs as non-narrowing, but gen-auto-libm-tests will be run
with and without that option to generate different output files). In
the narrowing case, the auto-libm-test-out-narrow-* files include
annotations for each test about what properties ARG_FLOAT must have to
be able to represent all the inputs for that test; those annotations
result in calls to the TEST_COND_arg_fmt macro.
gen-libm-test.pl has some minor updates to handle narrowing tests (for
example, arguments in such tests must be surrounded by ARG_LIT calls
instead of LIT calls). Various new macros are added to the C test
support code (for example, sNaN initializers need to be properly
typed, so arg_snan_value is added; other such arg_* macros are added
as it seems cleanest to do so, though some are not strictly required).
Special-casing of the ibm128 format to allow for its limitations is
adjusted to handle it as the argument format as well as as the result
format; thus, the tests of the new functions allow nonzero ulps only
in the case where ibm128 is the argument format, as otherwise the
functions correspond to fully-defined IEEE operations. The ulps in
question appear as e.g. 'Function: "add_ldouble"' in libm-test-ulps
(with 1ulp errors then listed for double and float for that function
in powerpc); no support is added to generate corresponding faddl /
daddl ulps listings in the ulps table in the manual.
For the previous patch, I noted the need to avoid spurious macro
expansions of identifiers such as "add". A test test-narrow-macros.c
is added to verify such macro expansions are successfully avoided, and
there is also a -mlong-double-64 version of that test for ldbl-opt.
This test is set up to cover the full set of relevant identifiers from
the start rather than adding functions one at a time as each function
group is added.
Tested for x86_64 (this patch in isolation, as well as testing for
various configurations in conjunction with the actual addition of
"add" functions).
* math/Makefile (test-type-pairs): New variable.
(test-type-pairs-f64xf128-yes): Likewise.
(tests): Add test-narrow-macros.
(libm-test-funcs-narrow): New variable.
(libm-test-c-narrow): Likewise.
(generated): Add $(libm-test-c-narrow).
(libm-tests-base-narrow): New variable.
(libm-tests-narrow): Likewise.
(libm-tests): Add $(libm-tests-narrow).
(libm-tests-for-type): Handle $(libm-tests-narrow).
(libm-test-c-narrow-obj): New variable.
($(libm-test-c-narrow-obj)): New rule.
($(foreach t,$(libm-tests-narrow),$(objpfx)$(t).c)): Likewise.
($(foreach f,$(libm-test-funcs-narrow),$(objpfx)$(o)-$(f).o)): Use
$(o-iterator) to set dependencies and CFLAGS.
* math/gen-auto-libm-tests.c: Document use for narrowing
functions.
(output_for_one_input_case): Take argument NARROW.
(generate_output): Likewise. Update call to
output_for_one_input_case.
(main): Take --narrow option. Update call to generate_output.
* math/gen-libm-test.pl (_apply_lit): Take macro name as argument.
(apply_lit): Update call to _apply_lit.
(apply_arglit): New function.
(parse_args): Handle "a" arguments.
(parse_auto_input): Handle format names using ":".
* math/README.libm-test: Document "a" parameter type.
* math/libm-test-support.h (ARG_TYPE_MIN): New macro.
(ARG_TYPE_TRUE_MIN): Likewise.
(ARG_TYPE_MAX): Likwise.
(ARG_MIN_EXP): Likewise.
(ARG_MAX_EXP): Likewise.
(ARG_MANT_DIG): Likewise.
(TEST_COND_arg_ibm128): Likewise.
(TEST_COND_ibm128_libgcc): Define conditional on [ARG_FLOAT].
(TEST_COND_arg_fmt): New macro.
(init_max_error): Update prototype.
* math/libm-test-support.c (test_ibm128): New variable.
(init_max_error): Take argument testing_ibm128 and set test_ibm128
instead of using [TEST_COND_ibm128] conditional.
(test_exceptions): Use test_ibm128 instead of TEST_COND_ibm128.
* math/libm-test-driver.c (STR_ARG_FLOAT): New macro.
[TEST_NARROW] (TEST_MSG): New definition.
(arg_plus_zero): New macro.
(arg_minus_zero): Likewise.
(arg_plus_infty): Likewise.
(arg_minus_infty): Likewise.
(arg_qnan_value_pl): Likewise.
(arg_qnan_value): Likewise.
(arg_snan_value_pl): Likewise.
(arg_snan_value): Likewise.
(arg_max_value): Likewise.
(arg_min_value): Likewise.
(arg_min_subnorm_value): Likewise.
[ARG_FLOAT] (struct test_aa_f_data): New struct type.
(RUN_TEST_LOOP_aa_f): New macro.
(TEST_SUFF): New macro.
(TEST_SUFF_STR): Likewise.
[!TEST_MATHVEC] (VEC_SUFF): Don't define.
(TEST_COND_any_ibm128): New macro.
(START): Use TEST_SUFF and TEST_SUFF_STR in initializer for
this_func. Update call to init_max_error.
* math/test-double.h (FUNC_NARROW_PREFIX): New macro.
* math/test-float.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-float128.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-float32.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-float32x.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-float64.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-float64x.h (FUNC_NARROW_PREFIX): Likewise.
* math/test-math-scalar.h (TEST_NARROW): Likewise.
* math/test-math-vector.h (TEST_NARROW): Likewise.
* math/test-arg-double.h: New file.
* math/test-arg-float128.h: Likewise.
* math/test-arg-float32x.h: Likewise.
* math/test-arg-float64.h: Likewise.
* math/test-arg-float64x.h: Likewise.
* math/test-arg-ldouble.h: Likewise.
* math/test-math-narrow.h: Likewise.
* math/test-narrow-macros.c: Likewise.
* sysdeps/ieee754/ldbl-opt/test-narrow-macros-ldbl-64.c: Likewise.
* sysdeps/ieee754/ldbl-opt/Makefile (tests): Add
test-narrow-macros-ldbl-64.
(CFLAGS-test-narrow-macros-ldbl-64.c): New variable.
|
|
TS 18661-1 defines libm functions that carry out an operation (+ - * /
sqrt fma) on their arguments and return a result rounded to a
(usually) narrower type, as if the original result were computed to
infinite precision and then rounded directly to the result type
without any intermediate rounding to the argument type. For example,
fadd, faddl and daddl for addition. These are the last remaining TS
18661-1 functions left to be added to glibc. TS 18661-3 extends this
to corresponding functions for _FloatN and _FloatNx types.
As functions parametrized by two rather than one varying
floating-point types, these functions require infrastructure in glibc
that was not required for previous libm functions. This patch
provides such infrastructure - excluding test support, and actual
function implementations, which will be in subsequent patches.
Declaring the functions uses a header bits/mathcalls-narrow.h, which
is included many times, for each relevant pair of types. This will
end up containing macro calls of the form
__MATHCALL_NARROW (__MATHCALL_NAME (add), __MATHCALL_REDIR_NAME (add), 2);
for each family of narrowing functions. (The structure of this macro
call, with the calls to __MATHCALL_NAME and __MATHCALL_REDIR_NAME
there rather than in the definition of __MATHCALL_NARROW, arises from
the names such as "add" *not* themselves being reserved identifiers -
meaning it's necessary to avoid any indirection that would result in a
user-defined "add" macro being expanded.) Whereas for existing
functions declaring long double functions is disabled if _LIBC in the
case where they alias double functions, to facilitate defining the
long double functions as aliases of the double ones, there is no such
logic for the narrowing functions in this patch. Rather, the files
defining such functions are expected to use #define to hide the
original declarations of the alias names, to avoid errors about
defining aliases with incompatible types.
math/Makefile support is added for building the functions (listed in
libm-narrow-fns, currently empty) for all relevant pairs of types. An
internal header math-narrow.h is added for macros shared between
multiple function implementations - currently a ROUND_TO_ODD macro to
facilitate writing functions using the round-to-odd implementation
approach, and alias macros to create all the required function
aliases. libc_feholdexcept_setroundf128 and libc_feupdateenv_testf128
are added for use when required (only for x86_64). float128_private.h
support is added for ldbl-128 narrowing functions to be used for
_Float128.
Certain things are specifically omitted from this patch and the
immediate followups. tgmath.h support is deferred; there remain
unresolved questions about how the type-generic macros for these
functions are supposed to work, especially in the case of arguments of
integer type. The math.h / bits/mathcalls-narrow.h logic, and the
logic for determining what functions / aliases to define, will need
some adjustments to support the sqrt and fma functions, where
e.g. f32xsqrtf64 can just be an alias for sqrt rather than a separate
function. TS 18661-1 defines FP_FAST_* macros but no support is
included for defining them (they won't in general be true without
architecture-specific optimized function versions).
For each of the function groups (add sub mul div sqrt fma) there are
always six functions present (e.g. fadd, faddl, daddl, f32addf64,
f32addf32x, f32xaddf64). When _Float64x and _Float128 are supported,
there are seven more (e.g. f32addf64x, f32addf128, f64addf64x,
f64addf128, f32xaddf64x, f32xaddf128, f64xaddf128). In addition, in
the ldbl-opt case there are function names such as __nldbl_daddl (an
alias for f32xaddf64, which is not a reserved name in TS 18661-1, only
in TS 18661-3), for calls to daddl to be mapped to in the
-mlong-double-64 case. (Calls to faddl just get mapped to fadd, and
for sqrt and fma there won't be __nldbl_* functions because dsqrtl and
dfmal can just be mapped to sqrt and fma with -mlong-double-64.)
While there are six or thirteen functions present in each group (plus
__nldbl_* names only as an ABI, not an API), not all are distinct;
they fall in various groups of aliases. There are two distinct
versions built if long double has the same format as double; four if
they have distinct formats but there is no _Float64x or _Float128
support; five if long double has binary128 format; seven when
_Float128 is distinct from long double.
Architecture-specific optimized versions are possible, but not
included in my patches. For example, IA64 generally supports
narrowing the result of most floating-point instructions; Power ISA
2.07 (POWER8) supports double values as arguments to float
instructions, with the results narrowed as expected; Power ISA 3
(POWER9) supports round-to-odd for float128 instructions, so meaning
that approach can be used without needing to set and restore the
rounding mode and test "inexact". I intend to leave any such
optimized versions to the architecture maintainers. Generally in such
cases it would also make sense for calls to these functions to be
expanded inline (given -fno-math-errno); I put a suggestion for TS
18661-1 built-in functions at <https://gcc.gnu.org/wiki/SummerOfCode>.
Tested for x86_64 (this patch in isolation, as well as testing for
various configurations in conjunction with further patches).
* math/bits/mathcalls-narrow.h: New file.
* include/bits/mathcalls-narrow.h: Likewise.
* math/math-narrow.h: Likewise.
* math/math.h (__MATHCALL_NARROW_ARGS_1): New macro.
(__MATHCALL_NARROW_ARGS_2): Likewise.
(__MATHCALL_NARROW_ARGS_3): Likewise.
(__MATHCALL_NARROW_NORMAL): Likewise.
(__MATHCALL_NARROW_REDIR): Likewise.
(__MATHCALL_NARROW): Likewise.
[__GLIBC_USE (IEC_60559_BFP_EXT)]: Repeatedly include
<bits/mathcalls-narrow.h> with _Mret_, _Marg_ and __MATHCALL_NAME
defined.
[__GLIBC_USE (IEC_60559_TYPES_EXT)]: Likewise.
* math/Makefile (headers): Add bits/mathcalls-narrow.h.
(libm-narrow-fns): New variable.
(libm-narrow-types-basic): Likewise.
(libm-narrow-types-ldouble-yes): Likewise.
(libm-narrow-types-float128-yes): Likewise.
(libm-narrow-types-float128-alias-yes): Likewise.
(libm-narrow-types): Likewise.
(libm-routines): Add narrowing functions.
* sysdeps/i386/fpu/fenv_private.h [__x86_64__]
(libc_feholdexcept_setroundf128): New macro.
[__x86_64__] (libc_feupdateenv_testf128): Likewise.
* sysdeps/ieee754/float128/float128_private.h: Include
<math/math-narrow.h>.
[libc_feholdexcept_setroundf128] (libc_feholdexcept_setroundl):
Undefine and redefine.
[libc_feupdateenv_testf128] (libc_feupdateenv_testl): Likewise.
(libm_alias_float_ldouble): Undefine and redefine.
(libm_alias_double_ldouble): Likewise.
|
|
The math/Makefile variable libm-test-incs was formerly used, but no
longer is. This patch removes it.
Tested for x86_64.
* math/Makefile [$(PERL) != no] (libm-test-incs): Remove variable.
|
|
Currently math_errhandling is always set to MATH_ERRNO | MATH_ERREXCEPT
even if -fno-math-errno is used. It is not defined at all when fast-math
is used. Set it to 0 with fast-math - this is noncomforming but more
useful than not define math_errhandling at all. Also take __NO_MATH_ERRNO__
into account and update comment.
* math/math.h (math_errhandling): Set to 0 with __FAST_MATH__.
Add __NO_MATH_ERRNO__ check.
|
|
I found that "make regen-ulps" failed when building with unmodified
GNU make 4.1, and an objdir /some/where/math/ longer than about 37
characters, because the list of tests in the "for run in $^" loop
exceeded the Linux kernel's MAX_ARG_STRLEN limit (131072 bytes) on the
length of a single argument passed to a command.
Some GNU/Linux distributions have a patch to make to work around this
limit (see e.g. Debian bug 688601), but clearly this ought to work
without needing such a patch. This patch arranges for the shell loop
to be over the test names without a $(objdir) prefix, which reduces
the space used to less than half MAX_ARG_STRLEN.
(I think we ought to aim to get rid of bits/mathinline.h completely -
filing GCC bugs for any optimizations GCC can't currently do with
-ffast-math - which would mean we could halve the number of libm tests
run because separate inline function tests would no longer be needed.
However, with a long directory name even half the number of tests
could make this command exceed MAX_ARG_STRLEN without my patch.)
Tested regen-ulps on a system where it failed before this patch.
* math/Makefile (run-regen-ulps): Add $(objpfx) to test name here.
(regen-ulps): Use $(libm-tests) not $^ in shell loop.
|
|
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
In C++ mode, __MATH_TG cannot be used for defining iseqsig, because
__MATH_TG relies on __builtin_types_compatible_p, which is a C-only
builtin. This is true when float128 is provided as an ABI-distinct type
from long double.
Moreover, the comparison macros from ISO C take two floating-point
arguments, which need not have the same type. Choosing what underlying
function to call requires evaluating the formats of the arguments, then
selecting which is wider. The macro __MATH_EVAL_FMT2 provides this
information, however, only the type of the macro expansion is relevant
(actually evaluating the expression would be incorrect).
This patch provides a C++ version of iseqsig, in which only the type of
__MATH_EVAL_FMT2 (__typeof or decltype) is used as a template parameter
for __iseqsig_type. This function calls the appropriate underlying
function.
Tested for powerpc64le and x86_64.
[BZ #22377]
* math/Makefile [C++] (tests): Add test for iseqsig.
* math/math.h [C++] (iseqsig): New implementation, which does
not rely on __MATH_TG/__builtin_types_compatible_p.
* math/test-math-iseqsig.cc: New file.
* sysdeps/powerpc/powerpc64le/Makefile
(CFLAGS-test-math-iseqsig.cc): New variable.
|
|
Revert:
2017-12-19 Joseph Myers <joseph@codesourcery.com>
* sysdeps/x86_64/fpu/libm-test-ulps: Update.
2017-12-19 Patrick McGehearty <patrick.mcgehearty@oracle.com>
* sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and
<errno.h>. Include "eexp.tbl".
(half): New constant.
(one): Likewise.
(__ieee754_exp): Rewrite.
(__slowexp): Remove prototype.
* sysdeps/ieee754/dbl-64/eexp.tbl: New file.
* sysdeps/ieee754/dbl-64/slowexp.c: Remove file.
* sysdeps/i386/fpu/slowexp.c: Likewise.
* sysdeps/ia64/fpu/slowexp.c: Likewise.
* sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise.
* sysdeps/generic/math_private.h (__slowexp): Remove prototype.
* sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in
comment.
* sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math]
(CPPFLAGS-slowexp.c): Remove variable.
* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines):
Remove slowexp-fma, slowexp-fma4 and slowexp-avx.
(CFLAGS-slowexp-fma.c): Remove variable.
(CFLAGS-slowexp-fma4.c): Likewise.
(CFLAGS-slowexp-avx.c): Likewise.
* sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not
define as macro.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise.
* sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise.
* math/Makefile (type-double-routines): Remove slowexp.
* manual/probes.texi (slowexp_p6): Remove.
(slowexp_p32): Likewise.
|
|
These changes will be active for all platforms that don't provide
their own exp() routines. They will also be active for ieee754
versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and
erf.
Typical performance gains |