author Adhemerval Zanella <adhemerval.zanella@linaro.org> 2024-12-06 14:37:49 -0300
committer Adhemerval Zanella <adhemerval.zanella@linaro.org> 2025-03-06 10:13:46 -0300
commit 9c858712dd8de7f72156c3cf780c6fc9e8c96bcc
tree 5861b487ef966f88ba8f49066f3b58aa5fd563a5
parent 4e68a5ca5da468c7e8a710a94455d5b27722f8e6
linux: Add mseal syscall support
It was added in Linux 6.10 (8be7258aad44b5e25977a98db136f677fa6f4370) as a way to block operations such as unmapping, moving to another location, shrinking the size, expanding the size, or modifying a pre-existing memory mapping. Although the system call only works on 64-bit CPUs, the entry point was added for all ABIs (since the kernel might eventually implement it for additional ones and/or the ABI can execute on a 64-bit kernel). Checked on x86_64-linux-gnu.
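As an illustration of the behavior described above, the following is a minimal sketch of sealing a mapping and observing that a later munmap is refused. It goes through the raw syscall so that it also builds against a glibc that predates the mseal wrapper. The helper names seal_region and check_seal_enforced are hypothetical, and the fallback syscall number 462 is an assumption that holds for x86_64, aarch64, and other ABIs sharing the common numbering, but not for alpha or mips.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 462 is the mseal syscall number on ABIs that use the common
   numbering (x86_64, aarch64, ...).  It is not valid on alpha or mips.  */
#ifndef SYS_mseal
# define SYS_mseal 462
#endif

/* Hypothetical helper, not part of glibc: seal [addr, addr + len) with the
   raw syscall.  Returns 0 on success, -1 with errno set on failure.  */
static int
seal_region (void *addr, size_t len)
{
  return (int) syscall (SYS_mseal, addr, len, 0UL);
}

/* Hypothetical helper: map one anonymous page, seal it, and check that
   munmap is then refused with EPERM.  Returns 1 if the seal was enforced,
   0 if sealing is unavailable here (ENOSYS, or EPERM on 32-bit CPUs and
   under some sandboxes), and -1 on any unexpected result.  */
static int
check_seal_enforced (void)
{
  size_t pagesize = (size_t) sysconf (_SC_PAGESIZE);
  void *p = mmap (NULL, pagesize, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;

  if (seal_region (p, pagesize) != 0)
    {
      /* Record the reason before munmap can overwrite errno.  */
      int unsupported = (errno == ENOSYS || errno == EPERM);
      munmap (p, pagesize);
      return unsupported ? 0 : -1;
    }

  /* The page is sealed and stays mapped until the process exits.  */
  if (munmap (p, pagesize) == -1 && errno == EPERM)
    return 1;
  return -1;
}
```

A caller would treat a return value of 0 as "sealing not available on this system" rather than as an error, since the entry point exists on all ABIs but the kernel may decline the request.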
-rw-r--r--  NEWS  4
-rw-r--r--  manual/memory.texi  69
-rw-r--r--  sysdeps/unix/sysv/linux/Makefile  2
-rw-r--r--  sysdeps/unix/sysv/linux/Versions  3
-rw-r--r--  sysdeps/unix/sysv/linux/aarch64/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/alpha/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/arc/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/arm/be/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/arm/le/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/bits/mman-shared.h  8
-rw-r--r--  sysdeps/unix/sysv/linux/csky/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/hppa/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/i386/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/kernel-features.h  8
-rw-r--r--  sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/microblaze/be/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/microblaze/le/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/or1k/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/powerpc/powerpc64/be/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/powerpc/powerpc64/le/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/riscv/rv32/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/sh/be/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/sh/le/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/syscalls.list  1
-rw-r--r--  sysdeps/unix/sysv/linux/tst-mseal-pkey.c  84
-rw-r--r--  sysdeps/unix/sysv/linux/tst-mseal.c  67
-rw-r--r--  sysdeps/unix/sysv/linux/x86_64/64/libc.abilist  1
-rw-r--r--  sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist  1
40 files changed, 276 insertions, 1 deletions
diff --git a/NEWS b/NEWS
index e2e40e141c..4732ec2522 100644
--- a/NEWS
+++ b/NEWS
@@ -9,7 +9,9 @@ Version 2.42
Major new features:
- [Add new features here]
+* On Linux, the mseal function has been added.  It seals memory mappings
+  against further changes during process execution, such as changing
+  protection permissions, unmapping, moving to another location, or shrinking.
Deprecated and removed features, and other changes affecting compatibility:
diff --git a/manual/memory.texi b/manual/memory.texi
index dc4621e2c5..f092ee4ce6 100644
--- a/manual/memory.texi
+++ b/manual/memory.texi
@@ -3072,6 +3072,75 @@ process memory, no matter how it was allocated. However, portable use
of the function requires that it is only used with memory regions
returned by @code{mmap} or @code{mmap64}.
+@deftypefun int mseal (void *@var{address}, size_t @var{length}, unsigned long @var{flags})
+@standards{Linux, sys/mman.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+
+A successful call to the @code{mseal} function protects the memory
+range of @var{length} bytes starting at @var{address}, previously allocated
+with @code{mmap} or @code{mremap}, against further metadata changes such
+as:
+
+@itemize @bullet
+@item
+Unmapping, moving to another location, extending or shrinking the size,
+via @code{munmap} and @code{mremap}.
+
+@item
+Moving or expanding a different VMA into the current location, via
+@code{mremap}.
+
+@item
+Modifying the memory range with @code{mmap} along with flag @code{MAP_FIXED}.
+
+@item
+Changing the protection flags with @code{mprotect} or @code{pkey_mprotect}.
+For certain destructive @code{madvise} behaviors (@code{MADV_DONTNEED},
+@code{MADV_FREE}, @code{MADV_DONTNEED_LOCKED}, and @code{MADV_WIPEONFORK}),
+@code{mseal} only blocks the operation if the protection key associated with
+the memory denies write access.
+
+@item
+Destructive behaviors on anonymous memory, such as @code{madvise} with
+@code{MADV_DONTNEED}.
+@end itemize
+
+The @var{address} must lie within a mapping created by @code{mmap} or
+@code{mremap}, and it must be page aligned.  The end address (@var{address}
+plus @var{length}) must be within an allocated virtual memory range.  There
+must be no unallocated memory between the start and end of the address range.
+
+The @var{flags} argument is currently unused.
+
+The @code{mseal} function returns @math{0} on success and @math{-1} on
+failure.
+
+The following @code{errno} error conditions are defined for this
+function:
+
+@table @code
+@item EPERM
+The system blocked the operation, and the given address range is left
+unmodified, without a partial update.  This error is also returned when
+@code{mseal} is issued on a 32-bit CPU (sealing is currently supported only
+on 64-bit CPUs, although 32-bit binaries running on a 64-bit kernel are
+supported).
+
+@item ENOMEM
+Either the @var{address} is not allocated, or the end address is not within the
+allocation, or there is unallocated memory between the start and end address.
+
+@item ENOSYS
+The kernel does not support the @code{mseal} syscall.
+@end table
+
+@strong{NB:} Memory sealing changes the lifetime of a mapping: a sealed
+mapping cannot be unmapped until the process terminates or replaces its
+process image through the @code{execve} function.  Sealed mappings are
+inherited through @code{fork}.
+
+@end deftypefun
+
@subsection Memory Protection Keys
@cindex memory protection key
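The manual text above requires that no unallocated memory lies between the start and end of the sealed range. The sketch below illustrates that rule by punching a hole in a three-page mapping and then trying to seal the whole span. The helper name seal_range_with_hole is hypothetical, and the fallback syscall number 462 is an assumption that holds for ABIs using the common numbering (not alpha or mips).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 462 is the mseal syscall number on ABIs that use the common
   numbering (x86_64, aarch64, ...).  It is not valid on alpha or mips.  */
#ifndef SYS_mseal
# define SYS_mseal 462
#endif

/* Hypothetical helper: map three pages, unmap the middle one, and try to
   seal the whole three-page span.  Returns the errno value reported by the
   seal attempt, 0 if the seal unexpectedly succeeded, or -1 if the setup
   itself failed.  */
static int
seal_range_with_hole (void)
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  char *base = mmap (NULL, 3 * page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED)
    return -1;

  /* Create the gap: the middle page is no longer allocated.  */
  if (munmap (base + page, page) != 0)
    {
      munmap (base, 3 * page);
      return -1;
    }

  long r = syscall (SYS_mseal, base, 3 * page, 0UL);
  int err = (r == 0) ? 0 : errno;

  /* The seal attempt is expected to fail, so the two remaining pages can
     still be released.  */
  munmap (base, page);
  munmap (base + 2 * page, page);
  return err;
}
```

On a kernel that implements sealing, the expected result is ENOMEM; elsewhere the call is rejected earlier with ENOSYS or EPERM.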
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 395d2d6593..ae46e0726d 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -213,6 +213,8 @@ tests += \
tst-misalign-clone \
tst-mlock2 \
tst-mount \
+ tst-mseal \
+ tst-mseal-pkey \
tst-ntp_adjtime \
tst-ntp_gettime \
tst-ntp_gettimex \
diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions
index 55d565545a..e5d226165e 100644
--- a/sysdeps/unix/sysv/linux/Versions
+++ b/sysdeps/unix/sysv/linux/Versions
@@ -332,6 +332,9 @@ libc {
sched_getattr;
sched_setattr;
}
+ GLIBC_2.42 {
+ mseal;
+ }
GLIBC_PRIVATE {
# functions used in other libraries
__syscall_rt_sigqueueinfo;
diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
index 38db77e4f7..eab487fc76 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
@@ -2750,3 +2750,4 @@ GLIBC_2.39 stdc_trailing_zeros_ull F
GLIBC_2.39 stdc_trailing_zeros_us F
GLIBC_2.41 sched_getattr F
GLIBC_2.41 sched_setattr F
+GLIBC_2.42 mseal F
diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist
index 637bfce9fb..d6d3464c46 100644
--- a/sysdeps/unix/sysv/linux/alpha/libc.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist
@@ -3097,6 +3097,7 @@ GLIBC_2.4 wprintf F
GLIBC_2.4 wscanf F
GLIBC_2.41 sched_getattr F
GLIBC_2.41 sched_setattr F
+GLIBC_2.42 mseal F
GLIBC_2.5 __readlinkat_chk F
GLIBC_2.5 inet6_opt_append F
GLIBC_2.5 inet6_opt_find F
diff --git a/sysdeps/unix/sysv/linux/arc/libc.abilist b/sysdeps/unix/sysv/linux/arc/libc.abilist
index 4a305cf730..2c7aa2c939 100644
--- a/sysdeps/unix/sysv/linux/arc/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arc/libc.abilist
@@ -2511,3 +2511,4 @@ GLIBC_2.39 stdc_trailing_zeros_ull F
GLIBC_2.39 stdc_trailing_zeros_us F
GLIBC_2.41 sched_getattr F
GLIBC_2.41 sched_setattr F
+GLIBC_2.42 mseal F
diff --git a/sysdeps/unix/sysv/linux/arm/be/libc.abilist b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
index 1d54f71b14..54fd3d3a83 100644
--- a/sysdeps/unix/sysv/linux/arm/be/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/be/libc.abilist
@@ -2803,6 +2803,7 @@ GLIBC_2.4 xprt_register F
GLIBC_2.4 xprt_unregister F
GLIBC_2.41 sched_getattr F
GLIBC_2.41 sched_setattr F
+GLIBC_2.42 mseal F
GLIBC_2.5 __readlinkat_chk F
GLIBC_2.5 inet6_opt_append F
GLIBC_2.5 inet6_opt_find F
diff --git a/sysdeps/unix/sysv/linux/arm/le/libc.abilist b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
index ff7e8bc40b..4231ef1ffd 100644
--- a/sysdeps/unix/sysv/linux/arm/le/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/le/libc.abilist
@@ -2800,6 +2800,7 @@ GLIBC_2.4 xprt_register F
GLIBC_2.4 xprt_unregister F
GLIBC_2.41 sched_getattr F
GLIBC_2.41 sched_setattr F
+GLIBC_2.42 mseal F
GLIBC_2.5 __readlinkat_chk F
GLIBC_2.5 inet6_opt_append F
GLIBC_2.5 inet6_opt_find F
diff --git a/sysdeps/unix/sysv/linux/bits/mman-shared.h b/sysdeps/unix/sysv/linux/bits/mman-shared.h
index 31590979b9..b9892f62c2 100644
--- a/sysdeps/unix/sysv/linux/bits/mman-shared.h
+++ b/sysdeps/unix/sysv/linux/bits/mman-shared.h
@@ -81,6 +81,14 @@ int pkey_free (int __key) __THROW;
range. */
int pkey_mprotect (void *__addr, size_t __len, int __prot, int __pkey) __THROW;
+/* Seal the address range to avoid further modifications, such as mremap to
+   shrink or expand the VMA, changing protection permissions with mprotect,
+   unmapping with munmap, or destructive operations such as madvise with
+   MADV_DONTNEED.  The address range must be a valid VMA, without any gap
+   (unallocated memory) between start and end, and ADDR must be page aligned
+   (LEN will be page aligned implicitly).  */
+int mseal (void *__addr, size_t __len, unsigned long flags) __THROW;
+
__END_DECLS
#endif /* __USE_GNU */
diff --git a/sysdeps/unix/sysv/linux/csky/libc.abilist b/sysdeps/unix/sysv/linux/csky/libc.abilist
index c3ed65467d..53265587ca 100644
--- a/sysdeps/unix/sysv/linux/csky/libc.abilist
+++ b/sysdeps/unix/sysv/linux/csky/libc.abilist
@@ -2787,3 +2787,4 @@ GLIBC_2.39 stdc_trailing_zeros_ull F
GLIBC_2.39 stdc_trailing_zeros_us F
GLIBC_2.41 sched_getattr F
GLIBC_2.41 sched_setattr F
+GLIBC_2.42 mseal F
diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist
index 991475380c..2ad9eb1286 100644
--- a/sysdeps/unix/sysv/linux/hppa/libc.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist
@@ -2824,6 +2824,7 @@ GLIBC_2.4 unshare F
GLIBC_2.41 cacheflush F
GLIBC_2.41 sched_getattr F
GLIBC_2.41 sched_setattr F
+GLIBC_2.42 mseal F
GLIBC_2.5 __readlinkat_chk F
GLIBC_2.5 inet6_opt_append F
GLIBC_2.5 inet6_opt_find F
diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist
index 4fedf775d4..f808d3f110 100644
--- a/sysdeps/unix/sysv/linux/i386/libc.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libc.abilist
@@ -3007,6 +3007,7 @@ GLIBC_2.4 unlinkat F
GLIBC_2.4 unshare F
GLIBC_2.41 sched_getattr F
GLIBC_2.41 sched_setattr F
+GLIBC_2.42 mseal F
GLIBC_2.5 __readlinkat_chk F
GLIBC_2.5 inet6_opt_append F
GLIBC_2.5 inet6_opt_find F
diff --git a/sysdeps/unix/sysv/linux/kernel-features.h b/sysdeps/unix/sysv/linux/kernel-features.h
index 86b2d3ce51..a44824991f 100644
--- a/sysdeps/unix/sysv/linux/kernel-features.h
+++ b/sysdeps/unix/sysv/linux/kernel-features.h
@@ -257,4 +257,12 @@
# define __ASSUME_FCHMODAT2 0
#endif
+/* The mseal system call was introduced across all architectures in Linux 6.10
+ (although only supported on 64-bit CPUs). */
+#if __LINUX_KERNEL_VERSION >= 0x060A00
+# define __ASSUME_MSEAL 1
+#else
+# define __ASSUME_MSEAL 0
+#endif
+
#endif /* kernel-features.h */
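The constant 0x060A00 in the hunk above packs the kernel version as major, minor, and patch, one byte each below the major number, so it stands for Linux 6.10.0. A small sketch of that encoding follows; the helper name encode_kernel_version is hypothetical and is not something glibc provides.

```c
#include <stdint.h>

/* Hypothetical helper: pack a kernel version the same way the
   __LINUX_KERNEL_VERSION comparison above expects, i.e.
   (major << 16) | (minor << 8) | patch.  */
static uint32_t
encode_kernel_version (unsigned int major, unsigned int minor,
                       unsigned int patch)
{
  return ((uint32_t) major << 16) | ((uint32_t) minor << 8)
         | (uint32_t) patch;
}
```

With this encoding, any 6.9.x release compares below 0x060A00 and any 6.10.x or later release compares at or above it, which is what the `>=` test relies on.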
diff --git a/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist b/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
index 0024282289..db7a5896ff 100644
--- a/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/loongarch/lp64/libc.abilist
@@ -2271,3 +2271,4 @@ GLIBC_2.39 stdc_trailing_zeros_ull F
GLIBC_2.39 stdc_trailing_zeros_us F
GLIBC_2.41 sched_getattr F
GLIBC_2.41 sched_setattr F
+GLIBC_2.42 mseal F
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
index 142595eb3e..91250faca5 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
@@ -2783,6 +2783,7 @@ GLIBC_2.4 xprt_register F
GLIBC_2.4 xprt_unregister F
GLIBC_2.41 sched_getattr F
GLIBC_2.41 sched_setattr F
+GLIBC_2.42 mseal F
GLIBC_2.5 __readlinkat_chk F
GLIBC_2.5 inet6_opt_append F
GLIBC_2.5 inet6_opt_find F
diff --git a/sysdeps/unix