| Age | Commit message (Collapse) | Author | Files | Lines |
|
This replaces the pthread rwlock with a new implementation that uses a
more scalable algorithm (primarily through not using a critical section
anymore to make state changes). The fast path for rdlock acquisition and
release is now basically a single atomic read-modify write or CAS and a few
branches. See nptl/pthread_rwlock_common.c for details.
* nptl/DESIGN-rwlock.txt: Remove.
* nptl/lowlevelrwlock.sym: Remove.
* nptl/Makefile: Add new tests.
* nptl/pthread_rwlock_common.c: New file. Contains the new rwlock.
* nptl/pthreadP.h (PTHREAD_RWLOCK_PREFER_READER_P): Remove.
(PTHREAD_RWLOCK_WRPHASE, PTHREAD_RWLOCK_WRLOCKED,
PTHREAD_RWLOCK_RWAITING, PTHREAD_RWLOCK_READER_SHIFT,
PTHREAD_RWLOCK_READER_OVERFLOW, PTHREAD_RWLOCK_WRHANDOVER,
PTHREAD_RWLOCK_FUTEX_USED): New.
* nptl/pthread_rwlock_init.c (__pthread_rwlock_init): Adapt to new
implementation.
* nptl/pthread_rwlock_rdlock.c (__pthread_rwlock_rdlock_slow): Remove.
(__pthread_rwlock_rdlock): Adapt.
* nptl/pthread_rwlock_timedrdlock.c
(pthread_rwlock_timedrdlock): Adapt.
* nptl/pthread_rwlock_timedwrlock.c
(pthread_rwlock_timedwrlock): Adapt.
* nptl/pthread_rwlock_trywrlock.c (pthread_rwlock_trywrlock): Adapt.
* nptl/pthread_rwlock_tryrdlock.c (pthread_rwlock_tryrdlock): Adapt.
* nptl/pthread_rwlock_unlock.c (pthread_rwlock_unlock): Adapt.
* nptl/pthread_rwlock_wrlock.c (__pthread_rwlock_wrlock_slow): Remove.
(__pthread_rwlock_wrlock): Adapt.
* nptl/tst-rwlock10.c: Adapt.
* nptl/tst-rwlock11.c: Adapt.
* nptl/tst-rwlock17.c: New file.
* nptl/tst-rwlock18.c: New file.
* nptl/tst-rwlock19.c: New file.
* nptl/tst-rwlock2b.c: New file.
* nptl/tst-rwlock8.c: Adapt.
* nptl/tst-rwlock9.c: Adapt.
* sysdeps/aarch64/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/arm/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/hppa/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/ia64/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/m68k/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/microblaze/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/mips/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/nios2/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/s390/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/sh/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/sparc/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/tile/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* sysdeps/unix/sysv/linux/alpha/bits/pthreadtypes.h
(pthread_rwlock_t): Adapt.
* sysdeps/unix/sysv/linux/powerpc/bits/pthreadtypes.h
(pthread_rwlock_t): Adapt.
* sysdeps/x86/bits/pthreadtypes.h (pthread_rwlock_t): Adapt.
* nptl/nptl-printers.py (): Adapt.
* nptl/nptl_lock_constants.pysym: Adapt.
* nptl/test-rwlock-printers.py: Adapt.
* nptl/test-rwlockattr-printers.c: Adapt.
* nptl/test-rwlockattr-printers.py: Adapt.
|
|
|
|
This is a new implementation for condition variables, required
after http://austingroupbugs.net/view.php?id=609 to fix bug 13165. In
essence, we need to be stricter in which waiters a signal or broadcast
is required to wake up; this couldn't be solved using the old algorithm.
ISO C++ made a similar clarification, so this also fixes a bug in
current libstdc++, for example.
We can't use the old algorithm anymore because futexes do not guarantee
to wake in FIFO order. Thus, when we wake, we can't simply let any
waiter grab a signal, but we need to ensure that one of the waiters
happening before the signal is woken up. This is something the previous
algorithm violated (see bug 13165).
There's another issue specific to condvars: ABA issues on the underlying
futexes. Unlike mutexes that have just three states, or semaphores that
have no tokens or a limited number of them, the state of a condvar is
the *order* of the waiters. A waiter on a semaphore can grab a token
whenever one is available; a condvar waiter must only consume a signal
if it is eligible to do so as determined by the relative order of the
waiter and the signal.
Therefore, this new algorithm maintains two groups of waiters: Those
eligible to consume signals (G1), and those that have to wait until
previous waiters have consumed signals (G2). Once G1 is empty, G2
becomes the new G1. 64b counters are used to avoid ABA issues.
This condvar doesn't yet use a requeue optimization (ie, on a broadcast,
waking just one thread and requeueing all others on the futex of the
mutex supplied by the program). I don't think doing the requeue is
necessarily the right approach (but I haven't done real measurements
yet):
* If a program expects to wake many threads at the same time and make
that scalable, a condvar isn't great anyway because of how it requires
waiters to operate mutually exclusive (due to the mutex usage). Thus, a
thundering herd problem is a scalability problem with or without the
optimization. Using something like a semaphore might be more
appropriate in such a case.
* The scalability problem is actually at the mutex side; the condvar
could help (and it tries to with the requeue optimization), but it
should be the mutex who decides how that is done, and whether it is done
at all.
* Forcing all but one waiter into the kernel-side wait queue of the
mutex prevents/avoids the use of lock elision on the mutex. Thus, it
prevents the only cure against the underlying scalability problem
inherent to condvars.
* If condvars use short critical sections (ie, hold the mutex just to
check a binary flag or such), which they should do ideally, then forcing
all those waiter to proceed serially with kernel-based hand-off (ie,
futex ops in the mutex' contended state, via the futex wait queues) will
be less efficient than just letting a scalable mutex implementation take
care of it. Our current mutex impl doesn't employ spinning at all, but
if critical sections are short, spinning can be much better.
* Doing the requeue stuff requires all waiters to always drive the mutex
into the contended state. This leads to each waiter having to call
futex_wake after lock release, even if this wouldn't be necessary.
[BZ #13165]
* nptl/pthread_cond_broadcast.c (__pthread_cond_broadcast): Rewrite to
use new algorithm.
* nptl/pthread_cond_destroy.c (__pthread_cond_destroy): Likewise.
* nptl/pthread_cond_init.c (__pthread_cond_init): Likewise.
* nptl/pthread_cond_signal.c (__pthread_cond_signal): Likewise.
* nptl/pthread_cond_wait.c (__pthread_cond_wait): Likewise.
(__pthread_cond_timedwait): Move here from pthread_cond_timedwait.c.
(__condvar_confirm_wakeup, __condvar_cancel_waiting,
__condvar_cleanup_waiting, __condvar_dec_grefs,
__pthread_cond_wait_common): New.
(__condvar_cleanup): Remove.
* npt/pthread_condattr_getclock.c (pthread_condattr_getclock): Adapt.
* npt/pthread_condattr_setclock.c (pthread_condattr_setclock):
Likewise.
* npt/pthread_condattr_getpshared.c (pthread_condattr_getpshared):
Likewise.
* npt/pthread_condattr_init.c (pthread_condattr_init): Likewise.
* nptl/tst-cond1.c: Add comment.
* nptl/tst-cond20.c (do_test): Adapt.
* nptl/tst-cond22.c (do_test): Likewise.
* sysdeps/aarch64/nptl/bits/pthreadtypes.h (pthread_cond_t): Adapt
structure.
* sysdeps/arm/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise.
* sysdeps/ia64/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise.
* sysdeps/m68k/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise.
* sysdeps/microblaze/nptl/bits/pthreadtypes.h (pthread_cond_t):
Likewise.
* sysdeps/mips/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise.
* sysdeps/nios2/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise.
* sysdeps/s390/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise.
* sysdeps/sh/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise.
* sysdeps/tile/nptl/bits/pthreadtypes.h (pthread_cond_t): Likewise.
* sysdeps/unix/sysv/linux/alpha/bits/pthreadtypes.h (pthread_cond_t):
Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/pthreadtypes.h (pthread_cond_t):
Likewise.
* sysdeps/x86/bits/pthreadtypes.h (pthread_cond_t): Likewise.
* sysdeps/nptl/internaltypes.h (COND_NWAITERS_SHIFT): Remove.
(COND_CLOCK_BITS): Adapt.
* sysdeps/nptl/pthread.h (PTHREAD_COND_INITIALIZER): Adapt.
* nptl/pthreadP.h (__PTHREAD_COND_CLOCK_MONOTONIC_MASK,
__PTHREAD_COND_SHARED_MASK): New.
* nptl/nptl-printers.py (CLOCK_IDS): Remove.
(ConditionVariablePrinter, ConditionVariableAttributesPrinter): Adapt.
* nptl/nptl_lock_constants.pysym: Adapt.
* nptl/test-cond-printers.py: Adapt.
* sysdeps/unix/sysv/linux/hppa/internaltypes.h (cond_compat_clear,
cond_compat_check_and_clear): Adapt.
* sysdeps/unix/sysv/linux/hppa/pthread_cond_timedwait.c: Remove file ...
* sysdeps/unix/sysv/linux/hppa/pthread_cond_wait.c
(__pthread_cond_timedwait): ... and move here.
* nptl/DESIGN-condvar.txt: Remove file.
* nptl/lowlevelcond.sym: Likewise.
* nptl/pthread_cond_timedwait.c: Likewise.
* sysdeps/unix/sysv/linux/i386/i486/pthread_cond_broadcast.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i486/pthread_cond_signal.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i486/pthread_cond_timedwait.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i486/pthread_cond_wait.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i586/pthread_cond_broadcast.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i586/pthread_cond_signal.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i586/pthread_cond_timedwait.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i586/pthread_cond_wait.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i686/pthread_cond_broadcast.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i686/pthread_cond_signal.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i686/pthread_cond_timedwait.S: Likewise.
* sysdeps/unix/sysv/linux/i386/i686/pthread_cond_wait.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/pthread_cond_signal.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S: Likewise.
* sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: Likewise.
|
|
Patch disables Intel TSX on some Haswell processors to avoid TSX
on kernels that weren't updated with the latest microcode package
(which disables broken feature by default).
* sysdeps/x86/cpu-features.c (get_common_indeces): Add
stepping identification.
(init_cpu_features): Add handle of Haswell.
|
|
Information about whether the ABI of long double is the same as that
of double is split between bits/mathdef.h and bits/wordsize.h.
When the ABIs are the same, bits/mathdef.h defines
__NO_LONG_DOUBLE_MATH. In addition, in the case where the same glibc
binary supports both -mlong-double-64 and -mlong-double-128,
bits/wordsize.h defines __LONG_DOUBLE_MATH_OPTIONAL, along with
__NO_LONG_DOUBLE_MATH if this particular compilation is with
-mlong-double-64.
As part of the refactoring I proposed in
<https://sourceware.org/ml/libc-alpha/2016-11/msg00745.html>, this
patch puts all that information in a single header,
bits/long-double.h. It is included from sys/cdefs.h alongside the
include of bits/wordsize.h, so other headers generally do not need to
include bits/long-double.h directly.
Previously, various bits/mathdef.h headers and bits/wordsize.h headers
had this long double information (including implicitly in some
bits/mathdef.h headers through not having the defines present in the
default version). After the patch, it's all in six bits/long-double.h
headers. Furthermore, most of those new headers are not
architecture-specific. Architectures with optional long double all
use the ldbl-opt sysdeps directory, either in the order (ldbl-64-128,
ldbl-opt, ldbl-128) or (ldbl-128ibm, ldbl-opt). Thus a generic header
for the case where long double = double, and headers in ldbl-128,
ldbl-96 and ldbl-opt, suffices to cover every architecture except for
cases where long double properties vary between different ABIs sharing
a set of installed headers; fortunately all the ldbl-opt cases share a
single compiler-predefined macro __LONG_DOUBLE_128__ that can be used
to tell whether this compilation is -mlong-double-64 or
-mlong-double-128.
The two cases where a set of headers is shared between ABIs with
different long double properties, MIPS (o32 has long double = double,
other ABIs use ldbl-128) and SPARC (32-bit has optional long double,
64-bit has required long double), need their own bits/long-double.h
headers.
As with bits/wordsize.h, multiple-include protection for this header
is generally implicit through the include guards on sys/cdefs.h, and
multiple inclusion is harmless in any case. There is one subtlety:
the header must not define __LONG_DOUBLE_MATH_OPTIONAL if
__NO_LONG_DOUBLE_MATH was defined before its inclusion, because doing
so breaks how sysdeps/ieee754/ldbl-opt/nldbl-compat.h defines
__NO_LONG_DOUBLE_MATH itself before including system headers. Subject
to keeping that working, it would be reasonable to move these macros
from defined/undefined #ifdef to always-defined 1/0 #if semantics, but
this patch does not attempt to do so, just rearranges where the macros
are defined.
After this patch, the only use of bits/mathdef.h is the alpha one for
modifying complex function ABIs for old GCC. Thus, all versions of
the header other than the default and alpha versions are removed, as
is the include from math.h.
Tested for x86_64 and x86. Also did compilation-only testing with
build-many-glibcs.py.
* bits/long-double.h: New file.
* sysdeps/ieee754/ldbl-128/bits/long-double.h: Likewise.
* sysdeps/ieee754/ldbl-96/bits/long-double.h: Likewise.
* sysdeps/ieee754/ldbl-opt/bits/long-double.h: Likewise.
* sysdeps/mips/bits/long-double.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/long-double.h: Likewise.
* math/Makefile (headers): Add bits/long-double.h.
* misc/sys/cdefs.h: Include <bits/long-double.h>.
* stdlib/strtold.c: Include <bits/long-double.h> instead of
<bits/wordsize.h>.
* bits/mathdef.h [!_COMPLEX_H]: Do not allow inclusion.
[!__NO_LONG_DOUBLE_MATH]: Remove conditional code.
* math/math.h: Do not include <bits/mathdef.h>.
* sysdeps/aarch64/bits/mathdef.h: Remove file.
* sysdeps/alpha/bits/mathdef.h [!_COMPLEX_H]: Do not allow
inclusion.
* sysdeps/ia64/bits/mathdef.h: Remove file.
* sysdeps/m68k/m680x0/bits/mathdef.h: Likewise.
* sysdeps/mips/bits/mathdef.h: Likewise.
* sysdeps/powerpc/bits/mathdef.h: Likewise.
* sysdeps/s390/bits/mathdef.h: Likewise.
* sysdeps/sparc/bits/mathdef.h: Likewise.
* sysdeps/x86/bits/mathdef.h: Likewise.
* sysdeps/s390/s390-32/bits/wordsize.h
[!__NO_LONG_DOUBLE_MATH && !__LONG_DOUBLE_MATH_OPTIONAL]: Remove
conditional code.
* sysdeps/s390/s390-64/bits/wordsize.h
[!__NO_LONG_DOUBLE_MATH && !__LONG_DOUBLE_MATH_OPTIONAL]:
Likewise.
* sysdeps/unix/sysv/linux/alpha/bits/wordsize.h
[!__NO_LONG_DOUBLE_MATH && !__LONG_DOUBLE_MATH_OPTIONAL]:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/wordsize.h
[!__NO_LONG_DOUBLE_MATH && !__LONG_DOUBLE_MATH_OPTIONAL]:
Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/wordsize.h
[!__NO_LONG_DOUBLE_MATH && !__LONG_DOUBLE_MATH_OPTIONAL]:
Likewise.
|
|
This uses atomic operations to access lock elision metadata that is accessed
concurrently (ie, adapt_count fields). The size of the data is less than a
word but accessed only with atomic loads and stores; therefore, we add
support for shorter-size atomic load and stores too.
* include/atomic.h (__atomic_check_size_ls): New.
(atomic_load_relaxed, atomic_load_acquire, atomic_store_relaxed,
atomic_store_release): Use it.
* sysdeps/x86/elide.h (ACCESS_ONCE): Remove.
(elision_adapt, ELIDE_LOCK): Use atomics.
* sysdeps/unix/sysv/linux/x86/elision-lock.c (__lll_lock_elision): Use
atomics and improve code comments.
* sysdeps/unix/sysv/linux/x86/elision-trylock.c
(__lll_trylock_elision): Likewise.
|
|
Continuing the refactoring of bits/mathdef.h, this patch stops it
defining FP_ILOGB0 and FP_ILOGBNAN, moving the required information to
a new header bits/fp-logb.h.
There are only two possible values of each of those macros permitted
by ISO C. TS 18661-1 adds corresponding macros for llogb, and their
values are required to correspond to those of the ilogb macros in the
obvious way. Thus two boolean values - for which the same choices are
correct for most architectures - suffice to determine the value of all
these macros, and by defining macros for those boolean values in
bits/fp-logb.h we can then define the public FP_* macros in math.h and
avoid the present duplication of the associated feature test macro
logic.
This patch duly moves to bits/fp-logb.h defining __FP_LOGB0_IS_MIN and
__FP_LOGBNAN_IS_MIN. Default definitions of those to 0 are correct
for both architectures, while ia64, m68k and x86 get their own
versions of bits/fp-logb.h to reflect their use of values different
from the defaults.
The patch renders many copies of bits/mathdef.h trivial (needed only
to avoid the default __NO_LONG_DOUBLE_MATH). I'll revise
<https://sourceware.org/ml/libc-alpha/2016-11/msg00865.html>
accordingly so that it removes all bits/mathdef.h headers except the
default one and the alpha one, and arranges for the header to be
included only by complex.h as the only remaining use at that point
will be for the alpha ABI issues there.
Tested for x86_64 and x86. Also did compile-only testing with
build-many-glibcs.py (using glibc sources from before the commit that
introduced many build failures with undefined __GI___sigsetjmp).
* bits/fp-logb.h: New file.
* sysdeps/ia64/bits/fp-logb.h: Likewise.
* sysdeps/m68k/m680x0/bits/fp-logb.h: Likewise.
* sysdeps/x86/bits/fp-logb.h: Likewise.
* math/Makefile (headers): Add bits/fp-logb.h.
* math/math.h: Include <bits/fp-logb.h>.
[__USE_ISOC99] (FP_ILOGB0): Define based on __FP_LOGB0_IS_MIN.
[__USE_ISOC99] (FP_ILOGBNAN): Define based on __FP_LOGBNAN_IS_MIN.
* bits/mathdef.h (FP_ILOGB0): Remove.
(FP_ILOGBNAN): Likewise.
* sysdeps/aarch64/bits/mathdef.h (FP_ILOGB0): Likewise.
(FP_ILOGBNAN): Likewise.
* sysdeps/alpha/bits/mathdef.h (FP_ILOGB0): Likewise.
(FP_ILOGBNAN): Likewise.
* sysdeps/ia64/bits/mathdef.h (FP_ILOGB0): Likewise.
(FP_ILOGBNAN): Likewise.
* sysdeps/m68k/m680x0/bits/mathdef.h (FP_ILOGB0): Likewise.
(FP_ILOGBNAN): Likewise.
* sysdeps/mips/bits/mathdef.h (FP_ILOGB0): Likewise.
(FP_ILOGBNAN): Likewise.
* sysdeps/powerpc/bits/mathdef.h (FP_ILOGB0): Likewise.
(FP_ILOGBNAN): Likewise.
* sysdeps/s390/bits/mathdef.h (FP_ILOGB0): Likewise.
(FP_ILOGBNAN): Likewise.
* sysdeps/sparc/bits/mathdef.h (FP_ILOGB0): Likewise.
(FP_ILOGBNAN): Likewise.
* sysdeps/x86/bits/mathdef.h (FP_ILOGB0): Likewise.
(FP_ILOGBNAN): Likewise.
|
|
Continuing the refactoring of bits/mathdef.h, this patch moves the
FP_FAST_* definitions into a new bits/fp-fast.h header. Currently
this is only for FP_FAST_FMA*, but in future it would be the
appropriate place for the FP_FAST_* macros from TS 18661-1 as well.
The generic bits/mathdef.h header defines these macros based on
whether the compiler defines __FP_FAST_*. Most architecture-specific
headers, however, fail to do so, meaning that if the architecture (or
some particular processors) does in fact have fused operations, and
GCC knows to use them inline, the FP_FAST_* macros will still not be
defined.
By refactoring, this patch causes the generic version (based on
__FP_FAST_*) to be used in more cases, and so the macro definitions to
be more accurate. Architectures that already defined some or all of
these macros other than based on the predefines have their own
versions of fp-fast.h, which are arranged so they define FP_FAST_* if
either the architecture-specific conditions are true or __FP_FAST_*
are defined.
After this refactoring, various bits/mathdef.h headers for
architectures with long double = double are semantically identical to
the generic version. The patch removes those headers that are
redundant. (In fact two of the four removed were already redundant
before this patch because they did use __FP_FAST_*.)
Tested for x86_64 and x86, and compilation-only with
build-many-glibcs.py.
* bits/fp-fast.h: New file.
* sysdeps/aarch64/bits/fp-fast.h: Likewise.
* sysdeps/powerpc/bits/fp-fast.h: Likewise.
* math/Makefile (headers): Add bits/fp-fast.h.
* math/math.h: Include <bits/fp-fast.h>.
* bits/mathdef.h (FP_FAST_FMA): Remove.
(FP_FAST_FMAF): Likewise.
(FP_FAST_FMAL): Likewise.
* sysdeps/aarch64/bits/mathdef.h (FP_FAST_FMA): Likewise.
(FP_FAST_FMAF): Likewise.
* sysdeps/powerpc/bits/mathdef.h (FP_FAST_FMA): Likewise.
(FP_FAST_FMAF): Likewise.
* sysdeps/x86/bits/mathdef.h (FP_FAST_FMA): Likewise.
(FP_FAST_FMAF): Likewise.
(FP_FAST_FMAL): Likewise.
* sysdeps/arm/bits/mathdef.h: Remove file.
* sysdeps/hppa/fpu/bits/mathdef.h: Likewise.
* sysdeps/sh/sh4/bits/mathdef.h: Likewise.
* sysdeps/tile/bits/mathdef.h: Likewise.
|
|
At present, definitions of float_t and double_t are split among many
bits/mathdef.h headers.
For all but three architectures, these types are float and double.
Furthermore, if you assume __FLT_EVAL_METHOD__ to be defined, that
provides a more generic way of determining the correct values of these
typedefs. Defining these typedefs more generally based on
__FLT_EVAL_METHOD__ was previously proposed by Paul Eggert in
<https://sourceware.org/ml/libc-alpha/2012-02/msg00002.html>.
This patch refactors things in the way I proposed in
<https://sourceware.org/ml/libc-alpha/2016-11/msg00745.html>. A new
header bits/flt-eval-method.h defines a single macro,
__GLIBC_FLT_EVAL_METHOD, which is then used by math.h to define
float_t and double_t. The default is based on __FLT_EVAL_METHOD__
(although actually a default to 0 would have the same effect for
current ports, because ports where values other than 0 or 16 are
possible all have their own headers).
To avoid changing the existing semantics in any case, including for
compilers not defining __FLT_EVAL_METHOD__, architecture-specific
files are then added for m68k, s390, x86 which replicate the existing
semantics. At least with __FLT_EVAL_METHOD__ values possible with
GCC, there should be no change to the choices of float_t and double_t
for any supported configuration.
Architecture maintainer notes:
* m68k: sysdeps/m68k/m680x0/bits/flt-eval-method.h always defines
__GLIBC_FLT_EVAL_METHOD to 2 to replicate the existing logic. But
actually GCC defines __FLT_EVAL_METHOD__ to 0 if TARGET_68040. It
might make sense to make the header prefer to base things on
__FLT_EVAL_METHOD__ if defined, like the x86 version, and so make
the choices of these types more accurate (with a NEWS entry as for
the other changes to these types on particular architectures).
* s390: sysdeps/s390/bits/flt-eval-method.h always defines
__GLIBC_FLT_EVAL_METHOD to 1 to replicate the existing logic. As
previously discussed, it might make sense in coordination with GCC
to eliminate the historic mistake, avoid excess precision in the
-fexcess-precision=standard case and make the typedefs match (with a
NEWS entry, again).
Tested for x86-64 and x86. Also did compilation-only testing with
build-many-glibcs.py.
* bits/flt-eval-method.h: New file.
* sysdeps/m68k/m680x0/bits/flt-eval-method.h: Likewise.
* sysdeps/s390/bits/flt-eval-method.h: Likewise.
* sysdeps/x86/bits/flt-eval-method.h: Likewise.
* math/Makefile (headers): Add bits/flt-eval-method.h.
* math/math.h: Include <bits/flt-eval-method.h>.
[__USE_ISOC99] (float_t): Define based on __GLIBC_FLT_EVAL_METHOD.
[__USE_ISOC99] (double_t): Likewise.
* bits/mathdef.h (float_t): Remove.
(double_t): Likewise.
* sysdeps/aarch64/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/alpha/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/arm/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/hppa/fpu/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/ia64/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/m68k/m680x0/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/mips/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/powerpc/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/s390/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/sh/sh4/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/sparc/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/tile/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
* sysdeps/x86/bits/mathdef.h (float_t): Likewise.
(double_t): Likewise.
|
|
Bug 20787 reports that, while float_t and double_t for 32-bit x86
properly respect -mfpmath=sse, for x86_64 they fail to reflect
-mfpmath=387, which is valid if unusual and results in FLT_EVAL_METHOD
being 2. This patch fixes the definitions to respect
__FLT_EVAL_METHOD__ in that case, arranging for the test that the
types correspond with FLT_EVAL_METHOD to be run with both -mfpmath=387
and -mfpmath=sse.
Note: this patch will also have the effect of making float_t and
double_t be long double for x86_64 with -mfpmath=sse+387, when
FLT_EVAL_METHOD is -1. It seems reasonable for x86_64 to be
consistent with 32-bit x86 in this case (and that definition is
conservatively safe, in that it makes the types correspond to the
widest evaluation format that might be used).
Tested for x86-64 and x86.
[BZ #20787]
* sysdeps/x86/bits/mathdef.h (float_t): Do not define to float if
[__x86_64__] when __FLT_EVAL_METHOD__ is nonzero.
(double_t): Do not define to double if [__x86_64__] when
__FLT_EVAL_METHOD__ is nonzero.
* sysdeps/x86/fpu/test-flt-eval-method-387.c: New file.
* sysdeps/x86/fpu/test-flt-eval-method-sse.c: Likewise.
* sysdeps/x86/fpu/Makefile [$(subdir) = math] (tests): Add
test-flt-eval-method-387 and test-flt-eval-method-sse.
[$(subdir) = math] (CFLAGS-test-flt-eval-method-387.c): New
variable.
[$(subdir) = math] (CFLAGS-test-flt-eval-method-sse.c): Likewise.
|
|
|
|
* bits/wordsize.h: Add documentation.
* sysdeps/aarch64/bits/wordsize.h : New file
* sysdeps/generic/stdint.h (PTRDIFF_MIN, PTRDIFF_MAX): Update
definitions.
(SIZE_MAX): Change ifdef to if in __WORDSIZE32_SIZE_ULONG check.
* sysdeps/gnu/bits/utmp.h (__WORDSIZE_TIME64_COMPAT32): Check
with #if instead of #ifdef.
* sysdeps/gnu/bits/utmpx.h (__WORDSIZE_TIME64_COMPAT32): Ditto.
* sysdeps/mips/bits/wordsize.h (__WORDSIZE32_SIZE_ULONG,
__WORDSIZE32_PTRDIFF_LONG, __WORDSIZE_TIME64_COMPAT32):
Add or change defines.
* sysdeps/powerpc/powerpc32/bits/wordsize.h: Likewise.
* sysdeps/powerpc/powerpc64/bits/wordsize.h: Likewise.
* sysdeps/s390/s390-32/bits/wordsize.h: Likewise.
* sysdeps/s390/s390-64/bits/wordsize.h: Likewise.
* sysdeps/sparc/sparc32/bits/wordsize.h: Likewise.
* sysdeps/sparc/sparc64/bits/wordsize.h: Likewise.
* sysdeps/tile/tilegx/bits/wordsize.h: Likewise.
* sysdeps/tile/tilepro/bits/wordsize.h: Likewise.
* sysdeps/unix/sysv/linux/alpha/bits/wordsize.h: Likewise.
* sysdeps/unix/sysv/linux/powerpc/bits/wordsize.h: Likewise.
* sysdeps/unix/sysv/linux/sparc/bits/wordsize.h: Likewise.
* sysdeps/wordsize-32/bits/wordsize.h: Likewise.
* sysdeps/wordsize-64/bits/wordsize.h: Likewise.
* sysdeps/x86/bits/wordsize.h: Likewise.
|
|
In the Intel Architecture Instruction Set Extensions Programming
reference the recommended way to test for FMA in section
'2.2.1 Detection of FMA' is:
"Application Software must identify that hardware supports AVX as
explained in ... after that it must also detect support for FMA..."
We don't do that in glibc. We use osxsave to detect the use of xgetbv,
and after that we check for AVX and FMA orthogonally. It is conceivable
that you could have the AVX bit clear and the FMA bit in an undefined
state.
This commit fixes FMA and AVX2 detection to depend on usable AVX
as required by the recommended Intel sequences.
v1: https://www.sourceware.org/ml/libc-alpha/2016-10/msg00241.html
v2: https://www.sourceware.org/ml/libc-alpha/2016-10/msg00265.html
|
|
Since the maximum CPUID level of older Intel CPUs is 1, change
handle_intel to return -1, instead of assert, when the maximum
CPUID level is less than 2.
[BZ #20647]
* sysdeps/x86/cacheinfo.c (handle_intel): Return -1 if the
maximum CPUID level is less than 2.
|
|
TS 18661-1 adds an iseqsig type-generic comparison macro to <math.h>.
This macro is like the == operator except that unordered operands
result in the "invalid" exception and errno being set to EDOM.
This patch implements this macro for glibc. Given the need to set
errno, this is implemented with out-of-line functions __iseqsigf,
__iseqsig and __iseqsigl (of which the last only exists at all if long
double is ABI-distinct from double, so no function aliases or compat
support are needed). The present patch ignores excess precision
issues; I intend to deal with those in a followup patch. (Like
comparison operators, type-generic comparison macros should *not*
convert operands to their semantic types but should preserve excess
range and precision, meaning that for some argument types and values
of FLT_EVAL_METHOD, an underlying function should be called for a
wider type than that of the arguments.)
The underlying functions are implemented with the type-generic
template machinery. Comparing x <= y && x >= y is sufficient in ISO C
to achieve an equality comparison with "invalid" raised for unordered
operands (and the results of those two comparisons can also be used to
tell whether errno needs to be set). However, some architectures have
GCC bugs meaning that unordered comparison instructions are used
instead of ordered ones. Thus, a mechanism is provided for
architectures to use an explicit call to feraiseexcept to raise
exceptions if required. If your architecture has such a bug you
should add a fix-fp-int-compare-invalid.h header for it, with a
comment pointing to the relevant GCC bug report; if such a GCC bug is
fixed, that header's contents should have a __GNUC_PREREQ conditional
added so that the workaround can eventually be removed for that
architecture.
Tested for x86_64, x86, mips64, arm and powerpc.
* math/math.h [__GLIBC_USE (IEC_60559_BFP_EXT)] (iseqsig): New
macro.
* math/bits/mathcalls.h [__GLIBC_USE (IEC_60559_BFP_EXT)]
(__iseqsig): New declaration.
* math/s_iseqsig_template.c: New file.
* math/Versions (__iseqsigf): New libm symbol at version
GLIBC_2.25.
(__iseqsig): Likewise.
(__iseqsigl): Likewise.
* math/libm-test.inc (iseqsig_test_data): New array.
(iseqsig_test): New function.
(main): Call iseqsig_test.
* math/Makefile (gen-libm-calls): Add s_iseqsigF.
* manual/arith.texi (FP Comparison Functions): Document iseqsig.
* manual/libm-err-tab.pl: Update comment on interfaces without
ulps tabulated.
* sysdeps/generic/fix-fp-int-compare-invalid.h: New file.
* sysdeps/powerpc/fpu/fix-fp-int-compare-invalid.h: Likewise.
* sysdeps/x86/fpu/fix-fp-int-compare-invalid.h: Likewise.
* sysdeps/nacl/libm.abilist: Update.
* sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist:
Likewise.
* sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist:
Likewise.
* sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise.
* sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise.
|
|
This adds a test to ensure that the problems fixed in the last several
patches do not recur. Each directory checks the headers that it
installs for two properties: first, each header must be compilable in
isolation, as both C and C++, under a representative combination of
language and library conformance levels; second, there is a blacklist
of identifiers that may not appear in any installed header, currently
consisting of the legacy BSD typedefs. (There is an exemption for the
headers that define those typedefs, and for the RPC headers. It may be
necessary to make this more sophisticated if we add more stuff to the
blacklist in the future.)
In order for this test to work correctly, every wrapper header
that actually defines something must guard those definitions with
#ifndef _ISOMAC. This is the existing mechanism used by the conform/
tests to tell wrapper headers not to define anything that the public
header wouldn't, and not to use anything from libc-symbols.h. conform/
only cares for headers that we need to check for standards conformance,
whereas this test applies to *every* header. (Headers in include/ that
are either installed directly, or are internal-use-only and do *not*
correspond to any installed header, are not affected.)
* scripts/check-installed-headers.sh: New script.
* Rules: In each directory that defines header files to be installed,
run check-installed-headers.sh on them as a special test.
* Makefile: Likewise for the headers installed at top level.
* include/aliases.h, include/alloca.h, include/argz.h
* include/arpa/nameser.h, include/arpa/nameser_compat.h
* include/elf.h, include/envz.h, include/err.h
* include/execinfo.h, include/fpu_control.h, include/getopt.h
* include/gshadow.h, include/ifaddrs.h, include/libintl.h
* include/link.h, include/malloc.h, include/mcheck.h
* include/mntent.h, include/netinet/ether.h
* include/nss.h, include/obstack.h, include/printf.h
* include/pty.h, include/resolv.h, include/rpc/auth.h
* include/rpc/auth_des.h, include/rpc/auth_unix.h
* include/rpc/clnt.h, include/rpc/des_crypt.h
* include/rpc/key_prot.h, include/rpc/netdb.h
* include/rpc/pmap_clnt.h, include/rpc/pmap_prot.h
* includ |