aboutsummaryrefslogtreecommitdiff
path: root/sysdeps/nptl/_Fork.c
AgeCommit message (Collapse)AuthorFilesLines
2025-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert1-1/+1
2024-11-12linux: Add support for getrandom vDSOAdhemerval Zanella1-0/+2
Linux 6.11 has getrandom() in vDSO. It operates on a thread-local opaque state allocated with mmap using flags specified by the vDSO. Multiple states are allocated at once, as many as fit into a page, and these are held in an array of available states to be doled out to each thread upon first use, and recycled when a thread terminates. As these states run low, more are allocated. To make this procedure async-signal-safe, a simple guard is used in the LSB of the opaque state address, falling back to the syscall if there's reentrancy contention. Also, _Fork() is handled by blocking signals on opaque state allocation (so _Fork() always sees a consistent state even if it interrupts a getrandom() call) and by iterating over the thread stack cache on reclaim_stack. Each opaque state will be in the free states list (grnd_alloc.states) or allocated to a running thread. The cancellation is handled by always using GRND_NONBLOCK flags while calling the vDSO, and falling back to the cancellable syscall if the kernel returns EAGAIN (would block). Since getrandom is not defined by POSIX and cancellation is supported as an extension, the cancellation is handled as 'may occur' instead of 'shall occur' [1], meaning that if vDSO does not block (the expected behavior) getrandom will not act as a cancellation entrypoint. It avoids a pthread_testcancel call on the fast path (different than 'shall occur' functions, like sem_wait()). It is currently enabled for x86_64, which is available in Linux 6.11, and aarch64, powerpc32, powerpc64, loongarch64, and s390x, which are available in Linux 6.12. Link: https://pubs.opengroup.org/onlinepubs/9799919799/nframe.html [1] Co-developed-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> # x86_64 Tested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> # x86_64, aarch64 Tested-by: Xi Ruoyao <xry111@xry111.site> # x86_64, aarch64, loongarch64 Tested-by: Stefan Liebler <stli@linux.ibm.com> # s390x
2024-10-08stdlib: Make abort/_Exit AS-safe (BZ 26275)Adhemerval Zanella1-3/+5
The recursive lock used on abort does not synchronize with a new process creation (either by fork-like interfaces or posix_spawn ones), nor it is reinitialized after fork(). Also, the SIGABRT unblock before raise() shows another race condition, where a fork or posix_spawn() call by another thread, just after the recursive lock release and before the SIGABRT signal, might create programs with a non-expected signal mask. With the default option (without POSIX_SPAWN_SETSIGDEF), the process can see SIG_DFL for SIGABRT, where it should be SIG_IGN. To fix the AS-safe, raise() does not change the process signal mask, and an AS-safe lock is used if a SIGABRT is installed or the process is blocked or ignored. With the signal mask change removal, there is no need to use a recursive loc. The lock is also taken on both _Fork() and posix_spawn(), to avoid the spawn process to see the abort handler as SIG_DFL. A read-write lock is used to avoid serialize _Fork and posix_spawn execution. Both sigaction (SIGABRT) and abort() requires to lock as writer (since both change the disposition). The fallback is also simplified: there is no need to use a loop of ABORT_INSTRUCTION after _exit() (if the syscall does not terminate the process, the system is broken). The proposed fix changes how setjmp works on a SIGABRT handler, where glibc does not save the signal mask. So usage like the below will now always abort. static volatile int chk_fail_ok; static jmp_buf chk_fail_buf; static void handler (int sig) { if (chk_fail_ok) { chk_fail_ok = 0; longjmp (chk_fail_buf, 1); } else _exit (127); } [...] signal (SIGABRT, handler); [....] chk_fail_ok = 1; if (! setjmp (chk_fail_buf)) { // Something that can calls abort, like a failed fortify function. chk_fail_ok = 0; printf ("FAIL\n"); } Such cases will need to use sigsetjmp instead. The _dl_start_profile calls sigaction through _profil, and to avoid pulling abort() on loader the call is replaced with __libc_sigaction. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
2024-09-28Linux: Block signals around _Fork (bug 32215)Florian Weimer1-0/+7
This hides the inconsistent TCB state (missing robust mutex list) from signal handlers. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2024-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert1-1/+1
2023-01-06Update copyright dates with scripts/update-copyrightsJoseph Myers1-1/+1
2022-07-27arc4random: simplify design for better safetyJason A. Donenfeld1-2/+0
Rather than buffering 16 MiB of entropy in userspace (by way of chacha20), simply call getrandom() every time. This approach is doubtlessly slower, for now, but trying to prematurely optimize arc4random appears to be leading toward all sorts of nasty properties and gotchas. Instead, this patch takes a much more conservative approach. The interface is added as a basic loop wrapper around getrandom(), and then later, the kernel and libc together can work together on optimizing that. This prevents numerous issues in which userspace is unaware of when it really must throw away its buffer, since we avoid buffering all together. Future improvements may include userspace learning more from the kernel about when to do that, which might make these sorts of chacha20-based optimizations more possible. The current heuristic of 16 MiB is meaningless garbage that doesn't correspond to anything the kernel might know about. So for now, let's just do something conservative that we know is correct and won't lead to cryptographic issues for users of this function. This patch might be considered along the lines of, "optimization is the root of all evil," in that the much more complex implementation it replaces moves too fast without considering security implications, whereas the incremental approach done here is a much safer way of going about things. Once this lands, we can take our time in optimizing this properly using new interplay between the kernel and userspace. getrandom(0) is used, since that's the one that ensures the bytes returned are cryptographically secure. But on systems without it, we fallback to using /dev/urandom. This is unfortunate because it means opening a file descriptor, but there's not much of a choice. Secondly, as part of the fallback, in order to get more or less the same properties of getrandom(0), we poll on /dev/random, and if the poll succeeds at least once, then we assume the RNG is initialized. This is a rough approximation, as the ancient "non-blocking pool" initialized after the "blocking pool", not before, and it may not port back to all ancient kernels, though it does to all kernels supported by glibc (≥3.2), so generally it's the best approximation we can do. The motivation for including arc4random, in the first place, is to have source-level compatibility with existing code. That means this patch doesn't attempt to litigate the interface itself. It does, however, choose a conservative approach for implementing it. Cc: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> Cc: Florian Weimer <fweimer@redhat.com> Cc: Cristian Rodríguez <crrodriguez@opensuse.org> Cc: Paul Eggert <eggert@cs.ucla.edu> Cc: Mark Harris <mark.hsj@gmail.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: linux-crypto@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2022-07-22stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ #4417)Adhemerval Zanella Netto1-0/+2
The implementation is based on scalar Chacha20 with per-thread cache. It uses getrandom or /dev/urandom as fallback to get the initial entropy, and reseeds the internal state on every 16MB of consumed buffer. To improve performance and lower memory consumption the per-thread cache is allocated lazily on first arc4random functions call, and if the memory allocation fails getentropy or /dev/urandom is used as fallback. The cache is also cleared on thread exit iff it was initialized (so if arc4random is not called it is not touched). Although it is lock-free, arc4random is still not async-signal-safe (the per thread state is not updated atomically). The ChaCha20 implementation is based on RFC8439 [1], omitting the final XOR of the keystream with the plaintext because the plaintext is a stream of zeros. This strategy is similar to what OpenBSD arc4random does. The arc4random_uniform is based on previous work by Florian Weimer, where the algorithm is based on Jérémie Lumbroso paper Optimal Discrete Uniform Generation from Coin Flips, and Applications (2013) [2], who credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform random number generation (1976), for solving the general case. The main advantage of this method is the that the unit of randomness is not the uniform random variable (uint32_t), but a random bit. It optimizes the internal buffer sampling by initially consuming a 32-bit random variable and then sampling byte per byte. Depending of the upper bound requested, it might lead to better CPU utilization. Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu. Co-authored-by: Florian Weimer <fweimer@redhat.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> [1] https://datatracker.ietf.org/doc/html/rfc8439 [2] https://arxiv.org/pdf/1304.1916.pdf
2022-04-26posix: Remove unused definition on _ForkAdhemerval Zanella1-3/+0
Checked on x86_64-linux-gnu.
2022-01-01Update copyright dates with scripts/update-copyrightsPaul Eggert1-1/+1
I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: *** 912-#endif remote: *** 913: remote: *** 914- remote: *** error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2021-06-24posix: Consolidate fork implementationAdhemerval Zanella1-0/+52
The Linux nptl implementation is used as base for generic fork implementation to handle the internal locks and mutexes. The system specific bits are moved a new internal _Fork symbol. (This new implementation will be used to provide a async-signal-safe _Fork now that POSIX has clarified that fork might not be async-signal-safe [1]). For Hurd it means that the __nss_database_fork_prepare_parent and __nss_database_fork_subprocess will be run in a slight different order. [1] https://austingroupbugs.net/view.php?id=62