* [PATCH v2 0/5] coredump: allow for flexible coredump handling
@ 2025-06-03 13:31 Christian Brauner
2025-06-03 13:31 ` [PATCH v2 1/5] " Christian Brauner
` (6 more replies)
0 siblings, 7 replies; 14+ messages in thread
From: Christian Brauner @ 2025-06-03 13:31 UTC (permalink / raw)
To: linux-fsdevel, Jann Horn
Cc: Josef Bacik, Jeff Layton, Alexander Viro, Daan De Meyer, Jan Kara,
Lennart Poettering, Mike Yuan, Zbigniew Jędrzejewski-Szmek,
Christian Brauner, Alexander Mikhalitsyn
In addition to the extensive selftests I've already written a
(non-production ready) simple Rust coredump server for this in
userspace:
https://github.com/brauner/dumdum.git
Extend the coredump socket to allow the coredump server to tell the
kernel how to process individual coredumps. This allows for fine-grained
coredump management. Userspace can decide to just let the kernel write
out the coredump, or generate the coredump itself, or just reject it.
When the crashing task connects to the coredump socket the kernel will
send a struct coredump_req to the coredump server. The kernel will set
the size member of struct coredump_req, telling the coredump server how
much data can be read.
The coredump server uses MSG_PEEK to peek the size of struct
coredump_req. If the kernel uses a newer struct coredump_req the
coredump server just reads the size it knows and discards any remaining
bytes in the buffer. If the kernel uses an older struct coredump_req
the coredump server just reads the size the kernel knows.
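As an illustration of the size negotiation just described, a coredump
server might read the request roughly as follows. This is a minimal
sketch and not code from this series: read_coredump_req() is a
hypothetical helper, and the struct is a local copy of the layout
proposed in patch 1/5 (a real server would presumably include
<linux/coredump.h> instead).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Local mirror of the layout proposed in patch 1/5 (assumption). */
struct coredump_req {
	uint32_t size;
	uint32_t size_ack;
	uint64_t mask;
};

#define COREDUMP_REQ_SIZE_VER0 16U

static_assert(sizeof(struct coredump_req) == COREDUMP_REQ_SIZE_VER0,
	      "local copy must match the first published layout");

/* Hypothetical helper: returns 0 on success, -1 on failure. */
static int read_coredump_req(int fd, struct coredump_req *req)
{
	uint32_t ksize;
	size_t want, left;
	ssize_t n;

	/* Peek only the leading size member; nothing is consumed yet. */
	n = recv(fd, &ksize, sizeof(ksize), MSG_PEEK | MSG_WAITALL);
	if (n != (ssize_t)sizeof(ksize))
		return -1;
	if (ksize < COREDUMP_REQ_SIZE_VER0)
		return -1;

	/* Read the part of the struct that both sides know about. */
	want = ksize < sizeof(*req) ? ksize : sizeof(*req);
	memset(req, 0, sizeof(*req));
	n = recv(fd, req, want, MSG_WAITALL);
	if (n != (ssize_t)want)
		return -1;

	/* A newer kernel sent more: drain the unknown tail. */
	for (left = ksize - want; left > 0; left -= (size_t)n) {
		char scratch[64];
		size_t chunk = left < sizeof(scratch) ? left : sizeof(scratch);

		n = recv(fd, scratch, chunk, MSG_WAITALL);
		if (n <= 0)
			return -1;
	}
	return 0;
}
```

Draining the tail leaves the stream positioned at the first byte after
the request, so a server built against an older header should keep
working when the kernel grows the struct.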
The returned struct coredump_req will inform the coredump server what
features the kernel supports. The coredump_req->mask member is set to
the currently known features.
The coredump server may only use features whose bits were raised by the
kernel in coredump_req->mask.
In response to a coredump_req from the kernel the coredump server sends
a struct coredump_ack to the kernel. The kernel informs the coredump
server what version of struct coredump_ack it supports by setting struct
coredump_req->size_ack to the size it knows about. The coredump server
may only send as many bytes as coredump_req->size_ack indicates (a
smaller size is fine of course). The coredump server must set
coredump_ack->size accordingly.
The coredump server sets the features it wants to use in struct
coredump_ack->mask. Only bits returned in struct coredump_req->mask may
be used.
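The ack rules above can be sketched from the server side as follows.
This is only an illustration under stated assumptions:
build_coredump_ack() and send_coredump_ack() are hypothetical helper
names, and the structs and flag values are local copies of what patch
1/5 proposes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Local mirrors of the layouts and flags proposed in patch 1/5. */
struct coredump_req { uint32_t size; uint32_t size_ack; uint64_t mask; };
struct coredump_ack { uint32_t size; uint32_t spare; uint64_t mask; };

#define COREDUMP_KERNEL		(UINT64_C(1) << 0)
#define COREDUMP_USERSPACE	(UINT64_C(1) << 1)
#define COREDUMP_REJECT		(UINT64_C(1) << 2)
#define COREDUMP_WAIT		(UINT64_C(1) << 3)
#define COREDUMP_ACK_SIZE_VER0	16U

static_assert(sizeof(struct coredump_ack) == COREDUMP_ACK_SIZE_VER0,
	      "local copy must match the first published layout");

/* Hypothetical helper: fill in an ack the kernel should accept. */
static int build_coredump_ack(const struct coredump_req *req,
			      uint64_t want, struct coredump_ack *ack)
{
	/* Only bits the kernel advertised may be requested. */
	if (want & ~req->mask)
		return -1;
	if (req->size_ack < COREDUMP_ACK_SIZE_VER0)
		return -1;

	memset(ack, 0, sizeof(*ack));	/* also keeps spare at zero */
	/* Never claim more bytes than the kernel said it understands. */
	ack->size = req->size_ack < sizeof(*ack) ?
		    req->size_ack : (uint32_t)sizeof(*ack);
	ack->mask = want;
	return 0;
}

/* Hypothetical helper: send exactly ack->size bytes. */
static int send_coredump_ack(int fd, const struct coredump_ack *ack)
{
	ssize_t n = send(fd, ack, ack->size, MSG_NOSIGNAL);

	return n == (ssize_t)ack->size ? 0 : -1;
}
```

Zeroing the whole struct first matters because the kernel side in patch
1/5 refuses an ack whose spare member is non-zero.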
In case an invalid struct coredump_ack is sent to the kernel an
out-of-band byte will be sent by the kernel indicating the reason why
the coredump_ack was rejected.
The out-of-band markers allow advanced userspace to infer failure. They
are optional: a coredump server that does not listen for POLLPRI events
can ignore them and still function correctly.
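A server that does want the failure reason could pick up the marker
along these lines. This is a hedged sketch: read_oob_marker() and
oob_reason() are hypothetical helpers, the marker values are copied from
enum coredump_oob in patch 1/5, and it assumes a kernel with
CONFIG_AF_UNIX_OOB enabled.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Marker values as proposed in patch 1/5 (enum coredump_oob). */
enum {
	COREDUMP_OOB_INVALIDSIZE = 1,
	COREDUMP_OOB_UNSUPPORTED = 2,
	COREDUMP_OOB_CONFLICTING = 3,
};

/*
 * Hypothetical helper: wait up to timeout_ms for an out-of-band byte.
 * Returns the marker value, or -1 if none arrived or on error.
 */
static int read_oob_marker(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	unsigned char marker;

	assert(fd >= 0);

	if (poll(&pfd, 1, timeout_ms) <= 0)
		return -1;
	if (!(pfd.revents & POLLPRI))
		return -1;
	if (recv(fd, &marker, sizeof(marker), MSG_OOB) != 1)
		return -1;
	return marker;
}

/* Hypothetical helper: map a marker to a short description. */
static const char *oob_reason(int marker)
{
	switch (marker) {
	case COREDUMP_OOB_INVALIDSIZE:
		return "invalid coredump_ack size";
	case COREDUMP_OOB_UNSUPPORTED:
		return "unsupported coredump_ack mask";
	case COREDUMP_OOB_CONFLICTING:
		return "conflicting coredump_ack mask";
	default:
		return "no or unknown marker";
	}
}
```

A server that never polls for POLLPRI simply sees the connection fail,
which matches the statement above that the markers are optional.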
In the initial version the following features are supported in
coredump_{req,ack}->mask:
* COREDUMP_KERNEL
The kernel will write the coredump data to the socket.
* COREDUMP_USERSPACE
The kernel will not write coredump data but will indicate to the
parent that a coredump has been generated. This is used when userspace
generates its own coredumps.
* COREDUMP_REJECT
The kernel will skip generating a coredump for this task.
* COREDUMP_WAIT
The kernel will prevent the task from exiting until the coredump
server has shut down the socket connection.
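Note that the kernel side in patch 1/5 rejects an ack unless exactly one
of COREDUMP_KERNEL, COREDUMP_USERSPACE and COREDUMP_REJECT is set, with
COREDUMP_WAIT as an optional modifier. A server could check its own mask
before sending it, roughly like this. The helper name and
COREDUMP_MODE_MASK are hypothetical; the flag values are copied from the
proposed header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as proposed in patch 1/5. */
#define COREDUMP_KERNEL		(UINT64_C(1) << 0)
#define COREDUMP_USERSPACE	(UINT64_C(1) << 1)
#define COREDUMP_REJECT		(UINT64_C(1) << 2)
#define COREDUMP_WAIT		(UINT64_C(1) << 3)

/* Hypothetical name for the three mutually exclusive modes. */
#define COREDUMP_MODE_MASK \
	(COREDUMP_KERNEL | COREDUMP_USERSPACE | COREDUMP_REJECT)

static_assert(COREDUMP_MODE_MASK == 0x7, "modes occupy the low three bits");

/* Hypothetical helper: true if exactly one mode bit is set. */
static bool coredump_mode_is_valid(uint64_t mask)
{
	uint64_t mode = mask & COREDUMP_MODE_MASK;

	/* Exactly one bit set: non-zero and a power of two. */
	return mode != 0 && (mode & (mode - 1)) == 0;
}
```

For example, COREDUMP_USERSPACE | COREDUMP_WAIT passes this check, while
COREDUMP_WAIT on its own or COREDUMP_KERNEL | COREDUMP_REJECT does not.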
The flexible coredump socket can be enabled by using the "@@" prefix
instead of the single "@" prefix for the regular coredump socket:
@@/run/systemd/coredump.socket
will enable flexible coredump handling. Current kernels already enforce
that "@" must be followed by "/" and will reject anything else. So
extending this is backward and forward compatible.
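To make the prefix rules concrete, here is a small userspace sketch that
classifies a core_pattern string the way the text above describes. It is
an illustration only: enum pattern_kind and classify_core_pattern() are
hypothetical names, and requiring "/" directly after "@@" is an
assumption of this sketch based on the example above. The checks in
fs/coredump.c are the authoritative ones.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification, not the kernel's enum coredump_type_t. */
enum pattern_kind {
	PATTERN_FILE,		/* plain file name template */
	PATTERN_PIPE,		/* "|/path/to/helper ..." */
	PATTERN_SOCKET,		/* "@/path": regular coredump socket */
	PATTERN_SOCKET_REQ,	/* "@@/path": flexible coredump socket */
	PATTERN_INVALID,	/* "@" followed by anything else */
};

static enum pattern_kind classify_core_pattern(const char *pat)
{
	assert(pat != NULL);

	if (pat[0] == '|')
		return PATTERN_PIPE;
	if (pat[0] != '@')
		return PATTERN_FILE;
	if (pat[1] == '/')
		return PATTERN_SOCKET;
	/* Assumption: the socket path after "@@" must be absolute too. */
	if (pat[1] == '@' && pat[2] == '/')
		return PATTERN_SOCKET_REQ;
	return PATTERN_INVALID;
}
```

Because older kernels reject any "@" that is not followed by "/", a
"@@/..." pattern is refused there rather than being misread as a
regular socket path.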
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
Changes in v2:
- Add epoll-based concurrent coredump handling selftests.
- Improve cover letter.
- Ensure that enum coredump_oob is packed aka a single byte and add a
static_assert() verifying that.
- Simplify helper functions making the patch even smaller.
- Link to v1: https://lore.kernel.org/20250530-work-coredump-socket-protocol-v1-0-20bde1cd4faa@kernel.org
---
Christian Brauner (5):
coredump: allow for flexible coredump handling
selftests/coredump: fix build
selftests/coredump: cleanup coredump tests
tools: add coredump.h header
selftests/coredump: add coredump server selftests
fs/coredump.c | 130 +-
include/uapi/linux/coredump.h | 104 ++
tools/include/uapi/linux/coredump.h | 104 ++
tools/testing/selftests/coredump/Makefile | 2 +-
tools/testing/selftests/coredump/config | 4 +
tools/testing/selftests/coredump/stackdump_test.c | 1705 ++++++++++++++++++---
6 files changed, 1799 insertions(+), 250 deletions(-)
---
base-commit: 3e406741b19890c3d8a2ed126aa7c23b106ca9e1
change-id: 20250520-work-coredump-socket-protocol-6980d1f54c2f
* [PATCH v2 1/5] coredump: allow for flexible coredump handling
2025-06-03 13:31 [PATCH v2 0/5] coredump: allow for flexible coredump handling Christian Brauner
@ 2025-06-03 13:31 ` Christian Brauner
2025-06-03 13:49 ` Alexander Mikhalitsyn
2025-06-09 14:16 ` Jeff Layton
2025-06-03 13:31 ` [PATCH v2 2/5] selftests/coredump: fix build Christian Brauner
` (5 subsequent siblings)
6 siblings, 2 replies; 14+ messages in thread
From: Christian Brauner @ 2025-06-03 13:31 UTC (permalink / raw)
To: linux-fsdevel, Jann Horn
Cc: Josef Bacik, Jeff Layton, Alexander Viro, Daan De Meyer, Jan Kara,
Lennart Poettering, Mike Yuan, Zbigniew Jędrzejewski-Szmek,
Christian Brauner, Alexander Mikhalitsyn
Extend the coredump socket to allow the coredump server to tell the
kernel how to process individual coredumps.
When the crashing task connects to the coredump socket the kernel will
send a struct coredump_req to the coredump server. The kernel will set
the size member of struct coredump_req, telling the coredump server how
much data can be read.
The coredump server uses MSG_PEEK to peek the size of struct
coredump_req. If the kernel uses a newer struct coredump_req the
coredump server just reads the size it knows and discards any remaining
bytes in the buffer. If the kernel uses an older struct coredump_req
the coredump server just reads the size the kernel knows.
The returned struct coredump_req will inform the coredump server what
features the kernel supports. The coredump_req->mask member is set to
the currently known features.
The coredump server may only use features whose bits were raised by the
kernel in coredump_req->mask.
In response to a coredump_req from the kernel the coredump server sends
a struct coredump_ack to the kernel. The kernel informs the coredump
server what version of struct coredump_ack it supports by setting struct
coredump_req->size_ack to the size it knows about. The coredump server
may only send as many bytes as coredump_req->size_ack indicates (a
smaller size is fine of course). The coredump server must set
coredump_ack->size accordingly.
The coredump server sets the features it wants to use in struct
coredump_ack->mask. Only bits returned in struct coredump_req->mask may
be used.
In case an invalid struct coredump_ack is sent to the kernel an
out-of-band byte will be sent by the kernel indicating the reason why
the coredump_ack was rejected.
The out-of-band markers allow advanced userspace to infer failure. They
are optional: a coredump server that does not listen for POLLPRI events
can ignore them and still function correctly.
In the initial version the following features are supported in
coredump_{req,ack}->mask:
* COREDUMP_KERNEL
The kernel will write the coredump data to the socket.
* COREDUMP_USERSPACE
The kernel will not write coredump data but will indicate to the
parent that a coredump has been generated. This is used when userspace
generates its own coredumps.
* COREDUMP_REJECT
The kernel will skip generating a coredump for this task.
* COREDUMP_WAIT
The kernel will prevent the task from exiting until the coredump
server has shut down the socket connection.
The flexible coredump socket can be enabled by using the "@@" prefix
instead of the single "@" prefix for the regular coredump socket:
@@/run/systemd/coredump.socket
will enable flexible coredump handling. Current kernels already enforce
that "@" must be followed by "/" and will reject anything else. So
extending this is backward and forward compatible.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
fs/coredump.c | 130 +++++++++++++++++++++++++++++++++++++++---
include/uapi/linux/coredump.h | 104 +++++++++++++++++++++++++++++++++
2 files changed, 227 insertions(+), 7 deletions(-)
diff --git a/fs/coredump.c b/fs/coredump.c
index f217ebf2b3b6..e79f37d3eefb 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -51,6 +51,7 @@
#include <net/sock.h>
#include <uapi/linux/pidfd.h>
#include <uapi/linux/un.h>
+#include <uapi/linux/coredump.h>
#include <linux/uaccess.h>
#include <asm/mmu_context.h>
@@ -83,15 +84,17 @@ static int core_name_size = CORENAME_MAX_SIZE;
unsigned int core_file_note_size_limit = CORE_FILE_NOTE_SIZE_DEFAULT;
enum coredump_type_t {
- COREDUMP_FILE = 1,
- COREDUMP_PIPE = 2,
- COREDUMP_SOCK = 3,
+ COREDUMP_FILE = 1,
+ COREDUMP_PIPE = 2,
+ COREDUMP_SOCK = 3,
+ COREDUMP_SOCK_REQ = 4,
};
struct core_name {
char *corename;
int used, size;
enum coredump_type_t core_type;
+ u64 mask;
};
static int expand_corename(struct core_name *cn, int size)
@@ -235,6 +238,9 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
int pid_in_pattern = 0;
int err = 0;
+ cn->mask = COREDUMP_KERNEL;
+ if (core_pipe_limit)
+ cn->mask |= COREDUMP_WAIT;
cn->used = 0;
cn->corename = NULL;
if (*pat_ptr == '|')
@@ -264,6 +270,13 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
pat_ptr++;
if (!(*pat_ptr))
return -ENOMEM;
+ if (*pat_ptr == '@') {
+ pat_ptr++;
+ if (!(*pat_ptr))
+ return -ENOMEM;
+
+ cn->core_type = COREDUMP_SOCK_REQ;
+ }
err = cn_printf(cn, "%s", pat_ptr);
if (err)
@@ -632,6 +645,93 @@ static int umh_coredump_setup(struct subprocess_info *info, struct cred *new)
return 0;
}
+#ifdef CONFIG_UNIX
+static inline bool coredump_sock_recv(struct file *file, struct coredump_ack *ack, size_t size, int flags)
+{
+ struct msghdr msg = {};
+ struct kvec iov = { .iov_base = ack, .iov_len = size };
+ ssize_t ret;
+
+ memset(ack, 0, size);
+ ret = kernel_recvmsg(sock_from_file(file), &msg, &iov, 1, size, flags);
+ return ret == size;
+}
+
+static inline bool coredump_sock_send(struct file *file, struct coredump_req *req)
+{
+ struct msghdr msg = { .msg_flags = MSG_NOSIGNAL };
+ struct kvec iov = { .iov_base = req, .iov_len = sizeof(*req) };
+ ssize_t ret;
+
+ ret = kernel_sendmsg(sock_from_file(file), &msg, &iov, 1, sizeof(*req));
+ return ret == sizeof(*req);
+}
+
+static_assert(sizeof(enum coredump_oob) == sizeof(__u8));
+
+static inline bool coredump_sock_oob(struct file *file, enum coredump_oob oob)
+{
+#ifdef CONFIG_AF_UNIX_OOB
+ struct msghdr msg = { .msg_flags = MSG_NOSIGNAL | MSG_OOB };
+ struct kvec iov = { .iov_base = &oob, .iov_len = sizeof(oob) };
+
+ kernel_sendmsg(sock_from_file(file), &msg, &iov, 1, sizeof(oob));
+#endif
+ coredump_report_failure("Coredump socket ack failed %u", oob);
+ return false;
+}
+
+static bool coredump_request(struct core_name *cn, struct coredump_params *cprm)
+{
+ struct coredump_req req = {
+ .size = sizeof(struct coredump_req),
+ .mask = COREDUMP_KERNEL | COREDUMP_USERSPACE |
+ COREDUMP_REJECT | COREDUMP_WAIT,
+ .size_ack = sizeof(struct coredump_ack),
+ };
+ struct coredump_ack ack = {};
+ ssize_t usize;
+
+ if (cn->core_type != COREDUMP_SOCK_REQ)
+ return true;
+
+ /* Let userspace know what we support. */
+ if (!coredump_sock_send(cprm->file, &req))
+ return false;
+
+ /* Peek the size of the coredump_ack. */
+ if (!coredump_sock_recv(cprm->file, &ack, sizeof(ack.size),
+ MSG_PEEK | MSG_WAITALL))
+ return false;
+
+ /* Refuse unknown coredump_ack sizes. */
+ usize = ack.size;
+ if (usize < COREDUMP_ACK_SIZE_VER0 || usize > sizeof(ack))
+ return coredump_sock_oob(cprm->file, COREDUMP_OOB_INVALIDSIZE);
+
+ /* Now retrieve the coredump_ack. */
+ if (!coredump_sock_recv(cprm->file, &ack, usize, MSG_WAITALL))
+ return false;
+ if (ack.size != usize)
+ return false;
+
+ /* Refuse unknown coredump_ack flags. */
+ if (ack.mask & ~req.mask)
+ return coredump_sock_oob(cprm->file, COREDUMP_OOB_UNSUPPORTED);
+
+ /* Refuse mutually exclusive options. */
+ if (hweight64(ack.mask & (COREDUMP_USERSPACE | COREDUMP_KERNEL |
+ COREDUMP_REJECT)) != 1)
+ return coredump_sock_oob(cprm->file, COREDUMP_OOB_CONFLICTING);
+
+ if (ack.spare)
+ return coredump_sock_oob(cprm->file, COREDUMP_OOB_UNSUPPORTED);
+
+ cn->mask = ack.mask;
+ return true;
+}
+#endif
+
void do_coredump(const kernel_siginfo_t *siginfo)
{
struct core_state core_state;
@@ -850,6 +950,8 @@ void do_coredump(const kernel_siginfo_t *siginfo)
}
break;
}
+ case COREDUMP_SOCK_REQ:
+ fallthrough;
case COREDUMP_SOCK: {
#ifdef CONFIG_UNIX
struct file *file __free(fput) = NULL;
@@ -918,6 +1020,9 @@ void do_coredump(const kernel_siginfo_t *siginfo)
cprm.limit = RLIM_INFINITY;
cprm.file = no_free_ptr(file);
+
+ if (!coredump_request(&cn, &cprm))
+ goto close_fail;
#else
coredump_report_failure("Core dump socket support %s disabled", cn.corename);
goto close_fail;
@@ -929,12 +1034,17 @@ void do_coredump(const kernel_siginfo_t *siginfo)
goto close_fail;
}
+ /* Don't even generate the coredump. */
+ if (cn.mask & COREDUMP_REJECT)
+ goto close_fail;
+
/* get us an unshared descriptor table; almost always a no-op */
/* The cell spufs coredump code reads the file descriptor tables */
retval = unshare_files();
if (retval)
goto close_fail;
- if (!dump_interrupted()) {
+
+ if ((cn.mask & COREDUMP_KERNEL) && !dump_interrupted()) {
/*
* umh disabled with CONFIG_STATIC_USERMODEHELPER_PATH="" would
* have this set to NULL.
@@ -968,17 +1078,23 @@ void do_coredump(const kernel_siginfo_t *siginfo)
kernel_sock_shutdown(sock_from_file(cprm.file), SHUT_WR);
#endif
+ /* Let the parent know that a coredump was generated. */
+ if (cn.mask & COREDUMP_USERSPACE)
+ core_dumped = true;
+
/*
* When core_pipe_limit is set we wait for the coredump server
* or usermodehelper to finish before exiting so it can e.g.,
* inspect /proc/<pid>.
*/
- if (core_pipe_limit) {
+ if (cn.mask & COREDUMP_WAIT) {
switch (cn.core_type) {
case COREDUMP_PIPE:
wait_for_dump_helpers(cprm.file);
break;
#ifdef CONFIG_UNIX
+ case COREDUMP_SOCK_REQ:
+ fallthrough;
case COREDUMP_SOCK: {
ssize_t n;
@@ -1249,8 +1365,8 @@ static inline bool check_coredump_socket(void)
if (current->nsproxy->mnt_ns != init_task.nsproxy->mnt_ns)
return false;
- /* Must be an absolute path. */
- if (*(core_pattern + 1) != '/')
+ /* Must be an absolute path or the socket request. */
+ if (*(core_pattern + 1) != '/' && *(core_pattern + 1) != '@')
return false;
return true;
diff --git a/include/uapi/linux/coredump.h b/include/uapi/linux/coredump.h
new file mode 100644
index 000000000000..4fa7d1f9d062
--- /dev/null
+++ b/include/uapi/linux/coredump.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+#ifndef _UAPI_LINUX_COREDUMP_H
+#define _UAPI_LINUX_COREDUMP_H
+
+#include <linux/types.h>
+
+/**
+ * coredump_{req,ack} flags
+ * @COREDUMP_KERNEL: kernel writes coredump
+ * @COREDUMP_USERSPACE: userspace writes coredump
+ * @COREDUMP_REJECT: don't generate coredump
+ * @COREDUMP_WAIT: wait for coredump server
+ */
+enum {
+ COREDUMP_KERNEL = (1ULL << 0),
+ COREDUMP_USERSPACE = (1ULL << 1),
+ COREDUMP_REJECT = (1ULL << 2),
+ COREDUMP_WAIT = (1ULL << 3),
+};
+
+/**
+ * struct coredump_req - message kernel sends to userspace
+ * @size: size of struct coredump_req
+ * @size_ack: known size of struct coredump_ack on this kernel
+ * @mask: supported features
+ *
+ * When a coredump happens the kernel will connect to the coredump
+ * socket and send a coredump request to the coredump server. The @size
+ * member is set to the size of struct coredump_req and provides a hint
+ * to userspace how much data can be read. Userspace may use MSG_PEEK to
+ * peek the size of struct coredump_req and then choose to consume it in
+ * one go. Userspace may also simply read a COREDUMP_REQ_SIZE_VER0
+ * request. If the size the kernel sends is larger userspace simply
+ * discards any remaining data.
+ *
+ * The coredump_req->mask member is set to the currently known features.
+ * Userspace may only set coredump_ack->mask to the bits raised by the
+ * kernel in coredump_req->mask.
+ *
+ * The coredump_req->size_ack member is set by the kernel to the size of
+ * struct coredump_ack the kernel knows. Userspace may only send up to
+ * coredump_req->size_ack bytes to the kernel and must set
+ * coredump_ack->size accordingly.
+ */
+struct coredump_req {
+ __u32 size;
+ __u32 size_ack;
+ __u64 mask;
+};
+
+enum {
+ COREDUMP_REQ_SIZE_VER0 = 16U, /* size of first published struct */
+};
+
+/**
+ * struct coredump_ack - message userspace sends to kernel
+ * @size: size of the struct
+ * @spare: unused
+ * @mask: features kernel is supposed to use
+ *
+ * The @size member must be set to the size of struct coredump_ack. It
+ * may never exceed what the kernel returned in coredump_req->size_ack
+ * but it may of course be smaller (>= COREDUMP_ACK_SIZE_VER0 and <=
+ * coredump_req->size_ack).
+ *
+ * The @mask member must be set to the features the coredump server
+ * wants the kernel to use. Only bits the kernel returned in
+ * coredump_req->mask may be set.
+ */
+struct coredump_ack {
+ __u32 size;
+ __u32 spare;
+ __u64 mask;
+};
+
+enum {
+ COREDUMP_ACK_SIZE_VER0 = 16U, /* size of first published struct */
+};
+
+/**
+ * enum coredump_oob - Out-of-band markers for the coredump socket
+ *
+ * The kernel will place a single byte coredump_oob marker on the
+ * coredump socket. An interested coredump server can listen for POLLPRI
+ * and figure out why the provided coredump_ack was invalid.
+ *
+ * The out-of-band markers allow advanced userspace to infer more details
+ * about a coredump ack. They are optional and can be ignored. They
+ * aren't necessary for the coredump server to function correctly.
+ *
+ * @COREDUMP_OOB_INVALIDSIZE: the provided coredump_ack size was invalid
+ * @COREDUMP_OOB_UNSUPPORTED: the provided coredump_ack mask was invalid
+ * @COREDUMP_OOB_CONFLICTING: the provided coredump_ack mask has conflicting options
+ * @__COREDUMP_OOB_MAX: the maximum value for coredump_oob
+ */
+enum coredump_oob {
+ COREDUMP_OOB_INVALIDSIZE = 1U,
+ COREDUMP_OOB_UNSUPPORTED = 2U,
+ COREDUMP_OOB_CONFLICTING = 3U,
+ __COREDUMP_OOB_MAX = 255U,
+} __attribute__ ((__packed__));
+
+#endif /* _UAPI_LINUX_COREDUMP_H */
--
2.47.2
* [PATCH v2 2/5] selftests/coredump: fix build
2025-06-03 13:31 [PATCH v2 0/5] coredump: allow for flexible coredump handling Christian Brauner
2025-06-03 13:31 ` [PATCH v2 1/5] " Christian Brauner
@ 2025-06-03 13:31 ` Christian Brauner
2025-06-03 13:51 ` Alexander Mikhalitsyn
2025-06-03 13:31 ` [PATCH v2 3/5] selftests/coredump: cleanup coredump tests Christian Brauner
` (4 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2025-06-03 13:31 UTC (permalink / raw)
To: linux-fsdevel, Jann Horn
Cc: Josef Bacik, Jeff Layton, Alexander Viro, Daan De Meyer, Jan Kara,
Lennart Poettering, Mike Yuan, Zbigniew Jędrzejewski-Szmek,
Christian Brauner, Alexander Mikhalitsyn
Fix various warnings in the selftest build.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
tools/testing/selftests/coredump/Makefile | 2 +-
tools/testing/selftests/coredump/stackdump_test.c | 17 +++++------------
2 files changed, 6 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/coredump/Makefile b/tools/testing/selftests/coredump/Makefile
index ed210037b29d..bc287a85b825 100644
--- a/tools/testing/selftests/coredump/Makefile
+++ b/tools/testing/selftests/coredump/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
-CFLAGS = $(KHDR_INCLUDES)
+CFLAGS = -Wall -O0 $(KHDR_INCLUDES)
TEST_GEN_PROGS := stackdump_test
TEST_FILES := stackdump
diff --git a/tools/testing/selftests/coredump/stackdump_test.c b/tools/testing/selftests/coredump/stackdump_test.c
index 9984413be9f0..aa366e6f13a7 100644
--- a/tools/testing/selftests/coredump/stackdump_test.c
+++ b/tools/testing/selftests/coredump/stackdump_test.c
@@ -24,6 +24,8 @@ static void *do_nothing(void *)
{
while (1)
pause();
+
+ return NULL;
}
static void crashing_child(void)
@@ -46,9 +48,7 @@ FIXTURE(coredump)
FIXTURE_SETUP(coredump)
{
- char buf[PATH_MAX];
FILE *file;
- char *dir;
int ret;
self->pid_coredump_server = -ESRCH;
@@ -106,7 +106,6 @@ FIXTURE_TEARDOWN(coredump)
TEST_F_TIMEOUT(coredump, stackdump, 120)
{
- struct sigaction action = {};
unsigned long long stack;
char *test_dir, *line;
size_t line_length;
@@ -171,11 +170,10 @@ TEST_F_TIMEOUT(coredump, stackdump, 120)
TEST_F(coredump, socket)
{
- int fd, pidfd, ret, status;
+ int pidfd, ret, status;
FILE *file;
pid_t pid, pid_coredump_server;
struct stat st;
- char core_file[PATH_MAX];
struct pidfd_info info = {};
int ipc_sockets[2];
char c;
@@ -356,11 +354,10 @@ TEST_F(coredump, socket)
TEST_F(coredump, socket_detect_userspace_client)
{
- int fd, pidfd, ret, status;
+ int pidfd, ret, status;
FILE *file;
pid_t pid, pid_coredump_server;
struct stat st;
- char core_file[PATH_MAX];
struct pidfd_info info = {};
int ipc_sockets[2];
char c;
@@ -384,7 +381,7 @@ TEST_F(coredump, socket_detect_userspace_client)
pid_coredump_server = fork();
ASSERT_GE(pid_coredump_server, 0);
if (pid_coredump_server == 0) {
- int fd_server, fd_coredump, fd_peer_pidfd, fd_core_file;
+ int fd_server, fd_coredump, fd_peer_pidfd;
socklen_t fd_peer_pidfd_len;
close(ipc_sockets[0]);
@@ -464,7 +461,6 @@ TEST_F(coredump, socket_detect_userspace_client)
close(fd_coredump);
close(fd_server);
close(fd_peer_pidfd);
- close(fd_core_file);
_exit(EXIT_SUCCESS);
}
self->pid_coredump_server = pid_coredump_server;
@@ -488,7 +484,6 @@ TEST_F(coredump, socket_detect_userspace_client)
if (ret < 0)
_exit(EXIT_FAILURE);
- (void *)write(fd_socket, &(char){ 0 }, 1);
close(fd_socket);
_exit(EXIT_SUCCESS);
}
@@ -519,7 +514,6 @@ TEST_F(coredump, socket_enoent)
int pidfd, ret, status;
FILE *file;
pid_t pid;
- char core_file[PATH_MAX];
file = fopen("/proc/sys/kernel/core_pattern", "w");
ASSERT_NE(file, NULL);
@@ -569,7 +563,6 @@ TEST_F(coredump, socket_no_listener)
ASSERT_GE(pid_coredump_server, 0);
if (pid_coredump_server == 0) {
int fd_server;
- socklen_t fd_peer_pidfd_len;
close(ipc_sockets[0]);
--
2.47.2
* [PATCH v2 3/5] selftests/coredump: cleanup coredump tests
2025-06-03 13:31 [PATCH v2 0/5] coredump: allow for flexible coredump handling Christian Brauner
2025-06-03 13:31 ` [PATCH v2 1/5] " Christian Brauner
2025-06-03 13:31 ` [PATCH v2 2/5] selftests/coredump: fix build Christian Brauner
@ 2025-06-03 13:31 ` Christian Brauner
2025-06-03 13:52 ` Alexander Mikhalitsyn
2025-06-03 13:31 ` [PATCH v2 4/5] tools: add coredump.h header Christian Brauner
` (3 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2025-06-03 13:31 UTC (permalink / raw)
To: linux-fsdevel, Jann Horn
Cc: Josef Bacik, Jeff Layton, Alexander Viro, Daan De Meyer, Jan Kara,
Lennart Poettering, Mike Yuan, Zbigniew Jędrzejewski-Szmek,
Christian Brauner, Alexander Mikhalitsyn
Make the selftests we added this cycle easier to read.
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
tools/testing/selftests/coredump/stackdump_test.c | 409 +++++++++-------------
1 file changed, 174 insertions(+), 235 deletions(-)
diff --git a/tools/testing/selftests/coredump/stackdump_test.c b/tools/testing/selftests/coredump/stackdump_test.c
index aa366e6f13a7..4d922e5f89fe 100644
--- a/tools/testing/selftests/coredump/stackdump_test.c
+++ b/tools/testing/selftests/coredump/stackdump_test.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
+#include <assert.h>
#include <fcntl.h>
#include <inttypes.h>
#include <libgen.h>
@@ -20,6 +21,10 @@
#define STACKDUMP_SCRIPT "stackdump"
#define NUM_THREAD_SPAWN 128
+#ifndef PAGE_SIZE
+#define PAGE_SIZE 4096
+#endif
+
static void *do_nothing(void *)
{
while (1)
@@ -109,7 +114,7 @@ TEST_F_TIMEOUT(coredump, stackdump, 120)
unsigned long long stack;
char *test_dir, *line;
size_t line_length;
- char buf[PATH_MAX];
+ char buf[PAGE_SIZE];
int ret, i, status;
FILE *file;
pid_t pid;
@@ -168,152 +173,163 @@ TEST_F_TIMEOUT(coredump, stackdump, 120)
fclose(file);
}
+static int create_and_listen_unix_socket(const char *path)
+{
+ struct sockaddr_un addr = {
+ .sun_family = AF_UNIX,
+ };
+ assert(strlen(path) < sizeof(addr.sun_path) - 1);
+ strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
+ size_t addr_len =
+ offsetof(struct sockaddr_un, sun_path) + strlen(path) + 1;
+ int fd, ret;
+
+ fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
+ if (fd < 0)
+ goto out;
+
+ ret = bind(fd, (const struct sockaddr *)&addr, addr_len);
+ if (ret < 0)
+ goto out;
+
+ ret = listen(fd, 1);
+ if (ret < 0)
+ goto out;
+
+ return fd;
+
+out:
+ if (fd >= 0)
+ close(fd);
+ return -1;
+}
+
+static bool set_core_pattern(const char *pattern)
+{
+ FILE *file;
+ int ret;
+
+ file = fopen("/proc/sys/kernel/core_pattern", "w");
+ if (!file)
+ return false;
+
+ ret = fprintf(file, "%s", pattern);
+ fclose(file);
+
+ return ret == strlen(pattern);
+}
+
+static int get_peer_pidfd(int fd)
+{
+ int fd_peer_pidfd;
+ socklen_t fd_peer_pidfd_len = sizeof(fd_peer_pidfd);
+ int ret = getsockopt(fd, SOL_SOCKET, SO_PEERPIDFD, &fd_peer_pidfd,
+ &fd_peer_pidfd_len);
+ if (ret < 0) {
+ fprintf(stderr, "%m - Failed to retrieve peer pidfd for coredump socket connection\n");
+ return -1;
+ }
+ return fd_peer_pidfd;
+}
+
+static bool get_pidfd_info(int fd_peer_pidfd, struct pidfd_info *info)
+{
+ memset(info, 0, sizeof(*info));
+ info->mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
+ return ioctl(fd_peer_pidfd, PIDFD_GET_INFO, info) == 0;
+}
+
+static void
+wait_and_check_coredump_server(pid_t pid_coredump_server,
+ struct __test_metadata *const _metadata,
+ FIXTURE_DATA(coredump)* self)
+{
+ int status;
+ waitpid(pid_coredump_server, &status, 0);
+ self->pid_coredump_server = -ESRCH;
+ ASSERT_TRUE(WIFEXITED(status));
+ ASSERT_EQ(WEXITSTATUS(status), 0);
+}
+
TEST_F(coredump, socket)
{
int pidfd, ret, status;
- FILE *file;
pid_t pid, pid_coredump_server;
struct stat st;
struct pidfd_info info = {};
int ipc_sockets[2];
char c;
- const struct sockaddr_un coredump_sk = {
- .sun_family = AF_UNIX,
- .sun_path = "/tmp/coredump.socket",
- };
- size_t coredump_sk_len = offsetof(struct sockaddr_un, sun_path) +
- sizeof("/tmp/coredump.socket");
+
+ ASSERT_TRUE(set_core_pattern("@/tmp/coredump.socket"));
ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
ASSERT_EQ(ret, 0);
- file = fopen("/proc/sys/kernel/core_pattern", "w");
- ASSERT_NE(file, NULL);
-
- ret = fprintf(file, "@/tmp/coredump.socket");
- ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
- ASSERT_EQ(fclose(file), 0);
-
pid_coredump_server = fork();
ASSERT_GE(pid_coredump_server, 0);
if (pid_coredump_server == 0) {
- int fd_server, fd_coredump, fd_peer_pidfd, fd_core_file;
- socklen_t fd_peer_pidfd_len;
+ int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1, fd_core_file = -1;
+ int exit_code = EXIT_FAILURE;
close(ipc_sockets[0]);
- fd_server = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
if (fd_server < 0)
- _exit(EXIT_FAILURE);
-
- ret = bind(fd_server, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
- if (ret < 0) {
- fprintf(stderr, "Failed to bind coredump socket\n");
- close(fd_server);
- close(ipc_sockets[1]);
- _exit(EXIT_FAILURE);
- }
-
- ret = listen(fd_server, 1);
- if (ret < 0) {
- fprintf(stderr, "Failed to listen on coredump socket\n");
- close(fd_server);
- close(ipc_sockets[1]);
- _exit(EXIT_FAILURE);
- }
+ goto out;
- if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
- close(fd_server);
- close(ipc_sockets[1]);
- _exit(EXIT_FAILURE);
- }
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
close(ipc_sockets[1]);
fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
- if (fd_coredump < 0) {
- fprintf(stderr, "Failed to accept coredump socket connection\n");
- close(fd_server);
- _exit(EXIT_FAILURE);
- }
+ if (fd_coredump < 0)
+ goto out;
- fd_peer_pidfd_len = sizeof(fd_peer_pidfd);
- ret = getsockopt(fd_coredump, SOL_SOCKET, SO_PEERPIDFD,
- &fd_peer_pidfd, &fd_peer_pidfd_len);
- if (ret < 0) {
- fprintf(stderr, "%m - Failed to retrieve peer pidfd for coredump socket connection\n");
- close(fd_coredump);
- close(fd_server);
- _exit(EXIT_FAILURE);
- }
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0)
+ goto out;
- memset(&info, 0, sizeof(info));
- info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
- ret = ioctl(fd_peer_pidfd, PIDFD_GET_INFO, &info);
- if (ret < 0) {
- fprintf(stderr, "Failed to retrieve pidfd info from peer pidfd for coredump socket connection\n");
- close(fd_coredump);
- close(fd_server);
- close(fd_peer_pidfd);
- _exit(EXIT_FAILURE);
- }
+ if (!get_pidfd_info(fd_peer_pidfd, &info))
+ goto out;
- if (!(info.mask & PIDFD_INFO_COREDUMP)) {
- fprintf(stderr, "Missing coredump information from coredumping task\n");
- close(fd_coredump);
- close(fd_server);
- close(fd_peer_pidfd);
- _exit(EXIT_FAILURE);
- }
+ if (!(info.mask & PIDFD_INFO_COREDUMP))
+ goto out;
- if (!(info.coredump_mask & PIDFD_COREDUMPED)) {
- fprintf(stderr, "Received connection from non-coredumping task\n");
- close(fd_coredump);
- close(fd_server);
- close(fd_peer_pidfd);
- _exit(EXIT_FAILURE);
- }
+ if (!(info.coredump_mask & PIDFD_COREDUMPED))
+ goto out;
fd_core_file = creat("/tmp/coredump.file", 0644);
- if (fd_core_file < 0) {
- fprintf(stderr, "Failed to create coredump file\n");
- close(fd_coredump);
- close(fd_server);
- close(fd_peer_pidfd);
- _exit(EXIT_FAILURE);
- }
+ if (fd_core_file < 0)
+ goto out;
for (;;) {
char buffer[4096];
ssize_t bytes_read, bytes_write;
bytes_read = read(fd_coredump, buffer, sizeof(buffer));
- if (bytes_read < 0) {
- close(fd_coredump);
- close(fd_server);
- close(fd_peer_pidfd);
- close(fd_core_file);
- _exit(EXIT_FAILURE);
- }
+ if (bytes_read < 0)
+ goto out;
if (bytes_read == 0)
break;
bytes_write = write(fd_core_file, buffer, bytes_read);
- if (bytes_read != bytes_write) {
- close(fd_coredump);
- close(fd_server);
- close(fd_peer_pidfd);
- close(fd_core_file);
- _exit(EXIT_FAILURE);
- }
+ if (bytes_read != bytes_write)
+ goto out;
}
- close(fd_coredump);
- close(fd_server);
- close(fd_peer_pidfd);
- close(fd_core_file);
- _exit(EXIT_SUCCESS);
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_core_file >= 0)
+ close(fd_core_file);
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ if (fd_server >= 0)
+ close(fd_server);
+ _exit(exit_code);
}
self->pid_coredump_server = pid_coredump_server;
@@ -333,47 +349,27 @@ TEST_F(coredump, socket)
ASSERT_TRUE(WIFSIGNALED(status));
ASSERT_TRUE(WCOREDUMP(status));
- info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
- ASSERT_EQ(ioctl(pidfd, PIDFD_GET_INFO, &info), 0);
+ ASSERT_TRUE(get_pidfd_info(pidfd, &info));
ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
- waitpid(pid_coredump_server, &status, 0);
- self->pid_coredump_server = -ESRCH;
- ASSERT_TRUE(WIFEXITED(status));
- ASSERT_EQ(WEXITSTATUS(status), 0);
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
ASSERT_EQ(stat("/tmp/coredump.file", &st), 0);
ASSERT_GT(st.st_size, 0);
- /*
- * We should somehow validate the produced core file.
- * For now just allow for visual inspection
- */
system("file /tmp/coredump.file");
}
TEST_F(coredump, socket_detect_userspace_client)
{
int pidfd, ret, status;
- FILE *file;
pid_t pid, pid_coredump_server;
struct stat st;
struct pidfd_info info = {};
int ipc_sockets[2];
char c;
- const struct sockaddr_un coredump_sk = {
- .sun_family = AF_UNIX,
- .sun_path = "/tmp/coredump.socket",
- };
- size_t coredump_sk_len = offsetof(struct sockaddr_un, sun_path) +
- sizeof("/tmp/coredump.socket");
- file = fopen("/proc/sys/kernel/core_pattern", "w");
- ASSERT_NE(file, NULL);
-
- ret = fprintf(file, "@/tmp/coredump.socket");
- ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
- ASSERT_EQ(fclose(file), 0);
+ ASSERT_TRUE(set_core_pattern("@/tmp/coredump.socket"));
ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
ASSERT_EQ(ret, 0);
@@ -381,87 +377,46 @@ TEST_F(coredump, socket_detect_userspace_client)
pid_coredump_server = fork();
ASSERT_GE(pid_coredump_server, 0);
if (pid_coredump_server == 0) {
- int fd_server, fd_coredump, fd_peer_pidfd;
- socklen_t fd_peer_pidfd_len;
+ int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
+ int exit_code = EXIT_FAILURE;
close(ipc_sockets[0]);
- fd_server = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
if (fd_server < 0)
- _exit(EXIT_FAILURE);
+ goto out;
- ret = bind(fd_server, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
- if (ret < 0) {
- fprintf(stderr, "Failed to bind coredump socket\n");
- close(fd_server);
- close(ipc_sockets[1]);
- _exit(EXIT_FAILURE);
- }
-
- ret = listen(fd_server, 1);
- if (ret < 0) {
- fprintf(stderr, "Failed to listen on coredump socket\n");
- close(fd_server);
- close(ipc_sockets[1]);
- _exit(EXIT_FAILURE);
- }
-
- if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
- close(fd_server);
- close(ipc_sockets[1]);
- _exit(EXIT_FAILURE);
- }
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
close(ipc_sockets[1]);
fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
- if (fd_coredump < 0) {
- fprintf(stderr, "Failed to accept coredump socket connection\n");
- close(fd_server);
- _exit(EXIT_FAILURE);
- }
+ if (fd_coredump < 0)
+ goto out;
- fd_peer_pidfd_len = sizeof(fd_peer_pidfd);
- ret = getsockopt(fd_coredump, SOL_SOCKET, SO_PEERPIDFD,
- &fd_peer_pidfd, &fd_peer_pidfd_len);
- if (ret < 0) {
- fprintf(stderr, "%m - Failed to retrieve peer pidfd for coredump socket connection\n");
- close(fd_coredump);
- close(fd_server);
- _exit(EXIT_FAILURE);
- }
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0)
+ goto out;
- memset(&info, 0, sizeof(info));
- info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
- ret = ioctl(fd_peer_pidfd, PIDFD_GET_INFO, &info);
- if (ret < 0) {
- fprintf(stderr, "Failed to retrieve pidfd info from peer pidfd for coredump socket connection\n");
- close(fd_coredump);
- close(fd_server);
- close(fd_peer_pidfd);
- _exit(EXIT_FAILURE);
- }
+ if (!get_pidfd_info(fd_peer_pidfd, &info))
+ goto out;
- if (!(info.mask & PIDFD_INFO_COREDUMP)) {
- fprintf(stderr, "Missing coredump information from coredumping task\n");
- close(fd_coredump);
- close(fd_server);
- close(fd_peer_pidfd);
- _exit(EXIT_FAILURE);
- }
+ if (!(info.mask & PIDFD_INFO_COREDUMP))
+ goto out;
- if (info.coredump_mask & PIDFD_COREDUMPED) {
- fprintf(stderr, "Received unexpected connection from coredumping task\n");
+ if (info.coredump_mask & PIDFD_COREDUMPED)
+ goto out;
+
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
close(fd_coredump);
+ if (fd_server >= 0)
close(fd_server);
- close(fd_peer_pidfd);
- _exit(EXIT_FAILURE);
- }
-
- close(fd_coredump);
- close(fd_server);
- close(fd_peer_pidfd);
- _exit(EXIT_SUCCESS);
+ _exit(exit_code);
}
self->pid_coredump_server = pid_coredump_server;
@@ -474,12 +429,18 @@ TEST_F(coredump, socket_detect_userspace_client)
if (pid == 0) {
int fd_socket;
ssize_t ret;
+ const struct sockaddr_un coredump_sk = {
+ .sun_family = AF_UNIX,
+ .sun_path = "/tmp/coredump.socket",
+ };
+ size_t coredump_sk_len =
+ offsetof(struct sockaddr_un, sun_path) +
+ sizeof("/tmp/coredump.socket");
fd_socket = socket(AF_UNIX, SOCK_STREAM, 0);
if (fd_socket < 0)
_exit(EXIT_FAILURE);
-
ret = connect(fd_socket, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
if (ret < 0)
_exit(EXIT_FAILURE);
@@ -495,15 +456,11 @@ TEST_F(coredump, socket_detect_userspace_client)
ASSERT_TRUE(WIFEXITED(status));
ASSERT_EQ(WEXITSTATUS(status), 0);
- info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
- ASSERT_EQ(ioctl(pidfd, PIDFD_GET_INFO, &info), 0);
+ ASSERT_TRUE(get_pidfd_info(pidfd, &info));
ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
ASSERT_EQ((info.coredump_mask & PIDFD_COREDUMPED), 0);
- waitpid(pid_coredump_server, &status, 0);
- self->pid_coredump_server = -ESRCH;
- ASSERT_TRUE(WIFEXITED(status));
- ASSERT_EQ(WEXITSTATUS(status), 0);
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
ASSERT_NE(stat("/tmp/coredump.file", &st), 0);
ASSERT_EQ(errno, ENOENT);
@@ -511,16 +468,10 @@ TEST_F(coredump, socket_detect_userspace_client)
TEST_F(coredump, socket_enoent)
{
- int pidfd, ret, status;
- FILE *file;
+ int pidfd, status;
pid_t pid;
- file = fopen("/proc/sys/kernel/core_pattern", "w");
- ASSERT_NE(file, NULL);
-
- ret = fprintf(file, "@/tmp/coredump.socket");
- ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
- ASSERT_EQ(fclose(file), 0);
+ ASSERT_TRUE(set_core_pattern("@/tmp/coredump.socket"));
pid = fork();
ASSERT_GE(pid, 0);
@@ -538,7 +489,6 @@ TEST_F(coredump, socket_enoent)
TEST_F(coredump, socket_no_listener)
{
int pidfd, ret, status;
- FILE *file;
pid_t pid, pid_coredump_server;
int ipc_sockets[2];
char c;
@@ -549,44 +499,36 @@ TEST_F(coredump, socket_no_listener)
size_t coredump_sk_len = offsetof(struct sockaddr_un, sun_path) +
sizeof("/tmp/coredump.socket");
+ ASSERT_TRUE(set_core_pattern("@/tmp/coredump.socket"));
+
ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
ASSERT_EQ(ret, 0);
- file = fopen("/proc/sys/kernel/core_pattern", "w");
- ASSERT_NE(file, NULL);
-
- ret = fprintf(file, "@/tmp/coredump.socket");
- ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
- ASSERT_EQ(fclose(file), 0);
-
pid_coredump_server = fork();
ASSERT_GE(pid_coredump_server, 0);
if (pid_coredump_server == 0) {
- int fd_server;
+ int fd_server = -1;
+ int exit_code = EXIT_FAILURE;
close(ipc_sockets[0]);
fd_server = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
if (fd_server < 0)
- _exit(EXIT_FAILURE);
+ goto out;
ret = bind(fd_server, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
- if (ret < 0) {
- fprintf(stderr, "Failed to bind coredump socket\n");
- close(fd_server);
- close(ipc_sockets[1]);
- _exit(EXIT_FAILURE);
- }
+ if (ret < 0)
+ goto out;
- if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
- close(fd_server);
- close(ipc_sockets[1]);
- _exit(EXIT_FAILURE);
- }
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
- close(fd_server);
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_server >= 0)
+ close(fd_server);
close(ipc_sockets[1]);
- _exit(EXIT_SUCCESS);
+ _exit(exit_code);
}
self->pid_coredump_server = pid_coredump_server;
@@ -606,10 +548,7 @@ TEST_F(coredump, socket_no_listener)
ASSERT_TRUE(WIFSIGNALED(status));
ASSERT_FALSE(WCOREDUMP(status));
- waitpid(pid_coredump_server, &status, 0);
- self->pid_coredump_server = -ESRCH;
- ASSERT_TRUE(WIFEXITED(status));
- ASSERT_EQ(WEXITSTATUS(status), 0);
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
}
TEST_HARNESS_MAIN
--
2.47.2
* [PATCH v2 4/5] tools: add coredump.h header
2025-06-03 13:31 [PATCH v2 0/5] coredump: allow for flexible coredump handling Christian Brauner
` (2 preceding siblings ...)
2025-06-03 13:31 ` [PATCH v2 3/5] selftests/coredump: cleanup coredump tests Christian Brauner
@ 2025-06-03 13:31 ` Christian Brauner
2025-06-03 13:51 ` Alexander Mikhalitsyn
2025-06-03 13:31 ` [PATCH v2 5/5] selftests/coredump: add coredump server selftests Christian Brauner
` (2 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2025-06-03 13:31 UTC (permalink / raw)
To: linux-fsdevel, Jann Horn
Cc: Josef Bacik, Jeff Layton, Alexander Viro, Daan De Meyer, Jan Kara,
Lennart Poettering, Mike Yuan, Zbigniew Jędrzejewski-Szmek,
Christian Brauner, Alexander Mikhalitsyn
Copy the coredump header so we can rely on it in the selftests.
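For illustration, the rules this header documents for building a struct
coredump_ack can be sketched as below. This is a minimal sketch and not
code from this series: every sketch_* name is hypothetical, and the
structs are local mirrors so the example builds without
linux/coredump.h being installed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of struct coredump_req and struct coredump_ack. */
struct sketch_coredump_req {
	uint32_t size;
	uint32_t size_ack;
	uint64_t mask;
};

struct sketch_coredump_ack {
	uint32_t size;
	uint32_t spare;
	uint64_t mask;
};

#define SKETCH_ACK_SIZE_VER0 16U /* mirrors COREDUMP_ACK_SIZE_VER0 */

static_assert(sizeof(struct sketch_coredump_ack) == SKETCH_ACK_SIZE_VER0,
	      "mirror must match the first published ack layout");

/*
 * Hypothetical helper: fill @ack with the features in @wanted, or return
 * false if the kernel's request does not permit such an ack.
 */
static bool sketch_build_ack(const struct sketch_coredump_req *req,
			     uint64_t wanted, struct sketch_coredump_ack *ack)
{
	size_t size = sizeof(*ack);

	/* Only bits the kernel raised in req->mask may be set. */
	if (wanted & ~req->mask)
		return false;

	/* The kernel must know at least the first published ack layout. */
	if (req->size_ack < SKETCH_ACK_SIZE_VER0)
		return false;

	/* Never send more than the kernel said it understands. */
	if (size > req->size_ack)
		size = req->size_ack;

	memset(ack, 0, sizeof(*ack));
	ack->size = (uint32_t)size;
	ack->mask = wanted;
	return true;
}
```

The sketch only checks the subset rule locally. Conflicting combinations
of otherwise supported bits are still left for the kernel to reject.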
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
tools/include/uapi/linux/coredump.h | 104 ++++++++++++++++++++++++++++++++++++
1 file changed, 104 insertions(+)
diff --git a/tools/include/uapi/linux/coredump.h b/tools/include/uapi/linux/coredump.h
new file mode 100644
index 000000000000..4fa7d1f9d062
--- /dev/null
+++ b/tools/include/uapi/linux/coredump.h
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+#ifndef _UAPI_LINUX_COREDUMP_H
+#define _UAPI_LINUX_COREDUMP_H
+
+#include <linux/types.h>
+
+/**
+ * coredump_{req,ack} flags
+ * @COREDUMP_KERNEL: kernel writes coredump
+ * @COREDUMP_USERSPACE: userspace writes coredump
+ * @COREDUMP_REJECT: don't generate coredump
+ * @COREDUMP_WAIT: wait for coredump server
+ */
+enum {
+ COREDUMP_KERNEL = (1ULL << 0),
+ COREDUMP_USERSPACE = (1ULL << 1),
+ COREDUMP_REJECT = (1ULL << 2),
+ COREDUMP_WAIT = (1ULL << 3),
+};
+
+/**
+ * struct coredump_req - message kernel sends to userspace
+ * @size: size of struct coredump_req
+ * @size_ack: known size of struct coredump_ack on this kernel
+ * @mask: supported features
+ *
+ * When a coredump happens the kernel will connect to the coredump
+ * socket and send a coredump request to the coredump server. The @size
+ * member is set to the size of struct coredump_req and provides a hint
+ * to userspace how much data can be read. Userspace may use MSG_PEEK to
+ * peek the size of struct coredump_req and then choose to consume it in
+ * one go. Userspace may also simply read a COREDUMP_REQ_SIZE_VER0
+ * request. If the size the kernel sends is larger userspace simply
+ * discards any remaining data.
+ *
+ * The coredump_req->mask member is set to the currently known features.
+ * Userspace may only set coredump_ack->mask to the bits raised by the
+ * kernel in coredump_req->mask.
+ *
+ * The coredump_req->size_ack member is set by the kernel to the size of
+ * struct coredump_ack the kernel knows. Userspace may only send up to
+ * coredump_req->size_ack bytes to the kernel and must set
+ * coredump_ack->size accordingly.
+ */
+struct coredump_req {
+ __u32 size;
+ __u32 size_ack;
+ __u64 mask;
+};
+
+enum {
+ COREDUMP_REQ_SIZE_VER0 = 16U, /* size of first published struct */
+};
+
+/**
+ * struct coredump_ack - message userspace sends to kernel
+ * @size: size of the struct
+ * @spare: unused
+ * @mask: features kernel is supposed to use
+ *
+ * The @size member must be set to the size of struct coredump_ack. It
+ * may never exceed what the kernel returned in coredump_req->size_ack
+ * but it may of course be smaller (>= COREDUMP_ACK_SIZE_VER0 and <=
+ * coredump_req->size_ack).
+ *
+ * The @mask member must be set to the features the coredump server
+ * wants the kernel to use. Only bits the kernel returned in
+ * coredump_req->mask may be set.
+ */
+struct coredump_ack {
+ __u32 size;
+ __u32 spare;
+ __u64 mask;
+};
+
+enum {
+ COREDUMP_ACK_SIZE_VER0 = 16U, /* size of first published struct */
+};
+
+/**
+ * enum coredump_oob - Out-of-band markers for the coredump socket
+ *
+ * The kernel will place a single byte coredump_oob marker on the
+ * coredump socket. An interested coredump server can listen for POLLPRI
+ * and figure out why the provided coredump_ack was invalid.
+ *
+ * The out-of-band markers allow advanced userspace to infer more details
+ * about a coredump ack. They are optional and can be ignored. They
+ * aren't necessary for the coredump server to function correctly.
+ *
+ * @COREDUMP_OOB_INVALIDSIZE: the provided coredump_ack size was invalid
+ * @COREDUMP_OOB_UNSUPPORTED: the provided coredump_ack mask was invalid
+ * @COREDUMP_OOB_CONFLICTING: the provided coredump_ack mask has conflicting options
+ * @__COREDUMP_OOB_MAX: the maximum value for coredump_oob
+ */
+enum coredump_oob {
+ COREDUMP_OOB_INVALIDSIZE = 1U,
+ COREDUMP_OOB_UNSUPPORTED = 2U,
+ COREDUMP_OOB_CONFLICTING = 3U,
+ __COREDUMP_OOB_MAX = 255U,
+} __attribute__ ((__packed__));
+
+#endif /* _UAPI_LINUX_COREDUMP_H */
--
2.47.2
* [PATCH v2 5/5] selftests/coredump: add coredump server selftests
2025-06-03 13:31 [PATCH v2 0/5] coredump: allow for flexible coredump handling Christian Brauner
` (3 preceding siblings ...)
2025-06-03 13:31 ` [PATCH v2 4/5] tools: add coredump.h header Christian Brauner
@ 2025-06-03 13:31 ` Christian Brauner
2025-06-03 13:53 ` Alexander Mikhalitsyn
2025-06-03 14:44 ` [PATCH v2 0/5] coredump: allow for flexible coredump handling Lennart Poettering
2025-06-09 12:56 ` Jeff Layton
6 siblings, 1 reply; 14+ messages in thread
From: Christian Brauner @ 2025-06-03 13:31 UTC (permalink / raw)
To: linux-fsdevel, Jann Horn
Cc: Josef Bacik, Jeff Layton, Alexander Viro, Daan De Meyer, Jan Kara,
Lennart Poettering, Mike Yuan, Zbigniew Jędrzejewski-Szmek,
Christian Brauner, Alexander Mikhalitsyn
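Add extensive selftests for the coredump server. They cover
kernel-written, userspace-written, and rejected coredumps, as well as
acks with conflicting flags, unknown flags, and invalid sizes.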
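The request read that these tests rely on splits the kernel's message
into a part the server understands and a tail it throws away. A minimal
sketch of that size arithmetic follows, assuming the server already
peeked the kernel's size. The sketch_* names are hypothetical and do not
appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* How a server splits a request of kernel_size bytes. */
struct sketch_req_plan {
	size_t read_size;    /* bytes copied into the server's own struct */
	size_t discard_size; /* trailing bytes the server does not know */
};

/*
 * Hypothetical helper: @kernel_size is the size the kernel announced,
 * @user_size is sizeof() of the struct the server was built against.
 */
static struct sketch_req_plan sketch_plan_req_read(size_t kernel_size,
						   size_t user_size)
{
	struct sketch_req_plan plan;

	/* Read the smaller of the two sizes into the known struct. */
	plan.read_size = kernel_size < user_size ? kernel_size : user_size;

	/* Only a newer, larger kernel struct leaves a tail to discard. */
	plan.discard_size = kernel_size > user_size ?
			    kernel_size - user_size : 0;

	/* Both parts together cover exactly what the kernel sent. */
	assert(plan.read_size + plan.discard_size == kernel_size);
	return plan;
}
```

When the kernel's struct is older and smaller than the server's there is
nothing to discard, so a follow-up receive for the tail has to be
skipped rather than issued with a non-zero length.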
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
tools/testing/selftests/coredump/Makefile | 2 +-
tools/testing/selftests/coredump/config | 4 +
tools/testing/selftests/coredump/stackdump_test.c | 1291 ++++++++++++++++++++-
3 files changed, 1295 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/coredump/Makefile b/tools/testing/selftests/coredump/Makefile
index bc287a85b825..77b3665c73c7 100644
--- a/tools/testing/selftests/coredump/Makefile
+++ b/tools/testing/selftests/coredump/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
-CFLAGS = -Wall -O0 $(KHDR_INCLUDES)
+CFLAGS += -Wall -O0 -g $(KHDR_INCLUDES) $(TOOLS_INCLUDES)
TEST_GEN_PROGS := stackdump_test
TEST_FILES := stackdump
diff --git a/tools/testing/selftests/coredump/config b/tools/testing/selftests/coredump/config
new file mode 100644
index 000000000000..6ce9610b06d0
--- /dev/null
+++ b/tools/testing/selftests/coredump/config
@@ -0,0 +1,4 @@
+CONFIG_AF_UNIX_OOB=y
+CONFIG_COREDUMP=y
+CONFIG_NET=y
+CONFIG_UNIX=y
diff --git a/tools/testing/selftests/coredump/stackdump_test.c b/tools/testing/selftests/coredump/stackdump_test.c
index 4d922e5f89fe..ad0d5f271db1 100644
--- a/tools/testing/selftests/coredump/stackdump_test.c
+++ b/tools/testing/selftests/coredump/stackdump_test.c
@@ -4,10 +4,15 @@
#include <fcntl.h>
#include <inttypes.h>
#include <libgen.h>
+#include <limits.h>
+#include <linux/coredump.h>
+#include <linux/fs.h>
#include <linux/limits.h>
#include <pthread.h>
#include <string.h>
#include <sys/mount.h>
+#include <poll.h>
+#include <sys/epoll.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <sys/socket.h>
@@ -15,6 +20,7 @@
#include <unistd.h>
#include "../kselftest_harness.h"
+#include "../filesystems/wrappers.h"
#include "../pidfd/pidfd.h"
#define STACKDUMP_FILE "stack_values"
@@ -49,14 +55,32 @@ FIXTURE(coredump)
{
char original_core_pattern[256];
pid_t pid_coredump_server;
+ int fd_tmpfs_detached;
};
+static int create_detached_tmpfs(void)
+{
+ int fd_context, fd_tmpfs;
+
+ fd_context = sys_fsopen("tmpfs", 0);
+ if (fd_context < 0)
+ return -1;
+
+ /* Only mount if the superblock was created; always drop fd_context. */
+ fd_tmpfs = -1;
+ if (sys_fsconfig(fd_context, FSCONFIG_CMD_CREATE, NULL, NULL, 0) == 0)
+ fd_tmpfs = sys_fsmount(fd_context, 0, 0);
+ close(fd_context);
+ return fd_tmpfs;
+}
+
FIXTURE_SETUP(coredump)
{
FILE *file;
int ret;
self->pid_coredump_server = -ESRCH;
+ self->fd_tmpfs_detached = -1;
file = fopen("/proc/sys/kernel/core_pattern", "r");
ASSERT_NE(NULL, file);
@@ -65,6 +89,8 @@ FIXTURE_SETUP(coredump)
ASSERT_LT(ret, sizeof(self->original_core_pattern));
self->original_core_pattern[ret] = '\0';
+ self->fd_tmpfs_detached = create_detached_tmpfs();
+ ASSERT_GE(self->fd_tmpfs_detached, 0);
ret = fclose(file);
ASSERT_EQ(0, ret);
@@ -103,6 +129,15 @@ FIXTURE_TEARDOWN(coredump)
goto fail;
}
+ if (self->fd_tmpfs_detached >= 0) {
+ ret = close(self->fd_tmpfs_detached);
+ if (ret < 0) {
+ reason = "Unable to close detached tmpfs";
+ goto fail;
+ }
+ self->fd_tmpfs_detached = -1;
+ }
+
return;
fail:
/* This should never happen */
@@ -192,7 +227,7 @@ static int create_and_listen_unix_socket(const char *path)
if (ret < 0)
goto out;
- ret = listen(fd, 1);
+ ret = listen(fd, 128);
if (ret < 0)
goto out;
@@ -551,4 +586,1258 @@ TEST_F(coredump, socket_no_listener)
wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
}
+int recv_oob_marker(int fd)
+{
+ uint8_t oob_marker;
+ ssize_t ret;
+
+ ret = recv(fd, &oob_marker, 1, MSG_OOB);
+ if (ret < 0)
+ return -1;
+ if (ret > 1 || ret == 0)
+ return -EINVAL;
+
+ switch (oob_marker) {
+ case COREDUMP_OOB_INVALIDSIZE:
+ fprintf(stderr, "Received OOB marker: InvalidSize\n");
+ return COREDUMP_OOB_INVALIDSIZE;
+ case COREDUMP_OOB_UNSUPPORTED:
+ fprintf(stderr, "Received OOB marker: Unsupported\n");
+ return COREDUMP_OOB_UNSUPPORTED;
+ case COREDUMP_OOB_CONFLICTING:
+ fprintf(stderr, "Received OOB marker: Conflicting\n");
+ return COREDUMP_OOB_CONFLICTING;
+ default:
+ fprintf(stderr, "Received unknown OOB marker: %u\n", oob_marker);
+ break;
+ }
+ return -1;
+}
+
+static bool is_msg_oob_supported(void)
+{
+ int sv[2];
+ char c = 'X';
+ int ret;
+ static int supported = -1;
+
+ if (supported >= 0)
+ return supported == 1;
+
+ if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
+ return false;
+
+ ret = send(sv[0], &c, 1, MSG_OOB);
+ close(sv[0]);
+ close(sv[1]);
+
+ if (ret < 0) {
+ if (errno == EINVAL || errno == EOPNOTSUPP) {
+ supported = 0;
+ return false;
+ }
+
+ return false;
+ }
+ supported = 1;
+ return true;
+}
+
+static bool wait_for_oob_marker(int fd, enum coredump_oob oob_marker)
+{
+ ssize_t ret;
+ struct pollfd pfd = {
+ .fd = fd,
+ .events = POLLPRI,
+ .revents = 0,
+ };
+
+ if (!is_msg_oob_supported())
+ return true;
+
+ ret = poll(&pfd, 1, -1);
+ if (ret < 0)
+ return false;
+ if (!(pfd.revents & POLLPRI))
+ return false;
+ if (pfd.revents & POLLERR)
+ return false;
+ if (pfd.revents & POLLHUP)
+ return false;
+
+ ret = recv_oob_marker(fd);
+ if (ret < 0)
+ return false;
+ return ret == oob_marker;
+}
+
+static bool read_coredump_req(int fd, struct coredump_req *req)
+{
+ ssize_t ret;
+ size_t field_size, user_size, ack_size, kernel_size, remaining_size;
+
+ memset(req, 0, sizeof(*req));
+ field_size = sizeof(req->size);
+
+ /* Peek the size of the coredump request. */
+ ret = recv(fd, req, field_size, MSG_PEEK | MSG_WAITALL);
+ if (ret != field_size)
+ return false;
+ kernel_size = req->size;
+
+ if (kernel_size < COREDUMP_REQ_SIZE_VER0)
+ return false;
+ if (kernel_size >= PAGE_SIZE)
+ return false;
+
+ /* Use the minimum of user and kernel size to read the full request. */
+ user_size = sizeof(struct coredump_req);
+ ack_size = user_size < kernel_size ? user_size : kernel_size;
+ ret = recv(fd, req, ack_size, MSG_WAITALL);
+ if (ret != ack_size)
+ return false;
+
+ fprintf(stderr, "Read coredump request with size %u and mask 0x%llx\n",
+ req->size, (unsigned long long)req->mask);
+
+ if (kernel_size > user_size)
+ remaining_size = kernel_size - user_size;
+ else
+ remaining_size = 0;
+
+ if (PAGE_SIZE <= remaining_size)
+ return false;
+
+ /*
+ * Discard any additional data if the kernel's request was larger than
+ * what we knew about or cared about.
+ */
+ if (remaining_size) {
+ char buffer[PAGE_SIZE];
+
+ ret = recv(fd, buffer, remaining_size, MSG_WAITALL);
+ if (ret != remaining_size)
+ return false;
+ fprintf(stderr, "Discarded %zu bytes of non-OOB data after coredump request\n", remaining_size);
+ }
+
+ return true;
+}
+
+static bool send_coredump_ack(int fd, const struct coredump_req *req,
+ __u64 mask, size_t size_ack)
+{
+ ssize_t ret;
+ /*
+ * Wrap struct coredump_ack in a larger struct so we can
+ * simulate sending too much data to the kernel.
+ */
+ struct large_ack_for_size_testing {
+ struct coredump_ack ack;
+ char buffer[PAGE_SIZE];
+ } large_ack = {};
+
+ if (!size_ack)
+ size_ack = sizeof(struct coredump_ack) < req->size_ack ?
+ sizeof(struct coredump_ack) :
+ req->size_ack;
+ large_ack.ack.mask = mask;
+ large_ack.ack.size = size_ack;
+ ret = send(fd, &large_ack, size_ack, MSG_NOSIGNAL);
+ if (ret != size_ack)
+ return false;
+
+ fprintf(stderr, "Sent coredump ack with size %zu and mask 0x%llx\n",
+ size_ack, (unsigned long long)mask);
+ return true;
+}
+
+static bool check_coredump_req(const struct coredump_req *req, size_t min_size,
+ __u64 required_mask)
+{
+ if (req->size < min_size)
+ return false;
+ if ((req->mask & required_mask) != required_mask)
+ return false;
+ if (req->mask & ~required_mask)
+ return false;
+ return true;
+}
+
+TEST_F(coredump, socket_request_kernel)
+{
+ int pidfd, ret, status;
+ pid_t pid, pid_coredump_server;
+ struct stat st;
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+
+ ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
+
+ ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
+ ASSERT_EQ(ret, 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ struct coredump_req req = {};
+ int fd_server = -1, fd_coredump = -1, fd_core_file = -1, fd_peer_pidfd = -1;
+ int exit_code = EXIT_FAILURE;
+
+ close(ipc_sockets[0]);
+
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
+ if (fd_server < 0)
+ goto out;
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
+
+ close(ipc_sockets[1]);
+
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0)
+ goto out;
+
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0)
+ goto out;
+
+ if (!get_pidfd_info(fd_peer_pidfd, &info))
+ goto out;
+
+ if (!(info.mask & PIDFD_INFO_COREDUMP))
+ goto out;
+
+ if (!(info.coredump_mask & PIDFD_COREDUMPED))
+ goto out;
+
+ fd_core_file = creat("/tmp/coredump.file", 0644);
+ if (fd_core_file < 0)
+ goto out;
+
+ if (!read_coredump_req(fd_coredump, &req))
+ goto out;
+
+ if (!check_coredump_req(&req, COREDUMP_REQ_SIZE_VER0,
+ COREDUMP_KERNEL | COREDUMP_USERSPACE |
+ COREDUMP_REJECT | COREDUMP_WAIT))
+ goto out;
+
+ if (!send_coredump_ack(fd_coredump, &req,
+ COREDUMP_KERNEL | COREDUMP_WAIT, 0))
+ goto out;
+
+ for (;;) {
+ char buffer[4096];
+ ssize_t bytes_read, bytes_write;
+
+ bytes_read = read(fd_coredump, buffer, sizeof(buffer));
+ if (bytes_read < 0)
+ goto out;
+
+ if (bytes_read == 0)
+ break;
+
+ bytes_write = write(fd_core_file, buffer, bytes_read);
+ if (bytes_read != bytes_write)
+ goto out;
+ }
+
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_core_file >= 0)
+ close(fd_core_file);
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ if (fd_server >= 0)
+ close(fd_server);
+ _exit(exit_code);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0)
+ crashing_child();
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFSIGNALED(status));
+ ASSERT_TRUE(WCOREDUMP(status));
+
+ ASSERT_TRUE(get_pidfd_info(pidfd, &info));
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
+
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
+
+ ASSERT_EQ(stat("/tmp/coredump.file", &st), 0);
+ ASSERT_GT(st.st_size, 0);
+ system("file /tmp/coredump.file");
+}
+
+TEST_F(coredump, socket_request_userspace)
+{
+ int pidfd, ret, status;
+ pid_t pid, pid_coredump_server;
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+
+ ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
+
+ ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
+ ASSERT_EQ(ret, 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ struct coredump_req req = {};
+ int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
+ int exit_code = EXIT_FAILURE;
+
+ close(ipc_sockets[0]);
+
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
+ if (fd_server < 0)
+ goto out;
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
+
+ close(ipc_sockets[1]);
+
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0)
+ goto out;
+
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0)
+ goto out;
+
+ if (!get_pidfd_info(fd_peer_pidfd, &info))
+ goto out;
+
+ if (!(info.mask & PIDFD_INFO_COREDUMP))
+ goto out;
+
+ if (!(info.coredump_mask & PIDFD_COREDUMPED))
+ goto out;
+
+ if (!read_coredump_req(fd_coredump, &req))
+ goto out;
+
+ if (!check_coredump_req(&req, COREDUMP_REQ_SIZE_VER0,
+ COREDUMP_KERNEL | COREDUMP_USERSPACE |
+ COREDUMP_REJECT | COREDUMP_WAIT))
+ goto out;
+
+ if (!send_coredump_ack(fd_coredump, &req,
+ COREDUMP_USERSPACE | COREDUMP_WAIT, 0))
+ goto out;
+
+ for (;;) {
+ char buffer[4096];
+ ssize_t bytes_read;
+
+ bytes_read = read(fd_coredump, buffer, sizeof(buffer));
+ if (bytes_read > 0)
+ goto out;
+
+ if (bytes_read < 0)
+ goto out;
+
+ if (bytes_read == 0)
+ break;
+ }
+
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ if (fd_server >= 0)
+ close(fd_server);
+ _exit(exit_code);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0)
+ crashing_child();
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFSIGNALED(status));
+ ASSERT_TRUE(WCOREDUMP(status));
+
+ ASSERT_TRUE(get_pidfd_info(pidfd, &info));
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
+
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
+}
+
+TEST_F(coredump, socket_request_reject)
+{
+ int pidfd, ret, status;
+ pid_t pid, pid_coredump_server;
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+
+ ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
+
+ ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
+ ASSERT_EQ(ret, 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ struct coredump_req req = {};
+ int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
+ int exit_code = EXIT_FAILURE;
+
+ close(ipc_sockets[0]);
+
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
+ if (fd_server < 0)
+ goto out;
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
+
+ close(ipc_sockets[1]);
+
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0)
+ goto out;
+
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0)
+ goto out;
+
+ if (!get_pidfd_info(fd_peer_pidfd, &info))
+ goto out;
+
+ if (!(info.mask & PIDFD_INFO_COREDUMP))
+ goto out;
+
+ if (!(info.coredump_mask & PIDFD_COREDUMPED))
+ goto out;
+
+ if (!read_coredump_req(fd_coredump, &req))
+ goto out;
+
+ if (!check_coredump_req(&req, COREDUMP_REQ_SIZE_VER0,
+ COREDUMP_KERNEL | COREDUMP_USERSPACE |
+ COREDUMP_REJECT | COREDUMP_WAIT))
+ goto out;
+
+ if (!send_coredump_ack(fd_coredump, &req,
+ COREDUMP_REJECT | COREDUMP_WAIT, 0))
+ goto out;
+
+ for (;;) {
+ char buffer[4096];
+ ssize_t bytes_read;
+
+ bytes_read = read(fd_coredump, buffer, sizeof(buffer));
+ if (bytes_read > 0)
+ goto out;
+
+ if (bytes_read < 0)
+ goto out;
+
+ if (bytes_read == 0)
+ break;
+ }
+
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ if (fd_server >= 0)
+ close(fd_server);
+ _exit(exit_code);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0)
+ crashing_child();
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFSIGNALED(status));
+ ASSERT_FALSE(WCOREDUMP(status));
+
+ ASSERT_TRUE(get_pidfd_info(pidfd, &info));
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
+
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
+}
+
+TEST_F(coredump, socket_request_invalid_flag_combination)
+{
+ int pidfd, ret, status;
+ pid_t pid, pid_coredump_server;
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+
+ ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
+
+ ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
+ ASSERT_EQ(ret, 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ struct coredump_req req = {};
+ int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
+ int exit_code = EXIT_FAILURE;
+
+ close(ipc_sockets[0]);
+
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
+ if (fd_server < 0)
+ goto out;
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
+
+ close(ipc_sockets[1]);
+
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0)
+ goto out;
+
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0)
+ goto out;
+
+ if (!get_pidfd_info(fd_peer_pidfd, &info))
+ goto out;
+
+ if (!(info.mask & PIDFD_INFO_COREDUMP))
+ goto out;
+
+ if (!(info.coredump_mask & PIDFD_COREDUMPED))
+ goto out;
+
+ if (!read_coredump_req(fd_coredump, &req))
+ goto out;
+
+ if (!check_coredump_req(&req, COREDUMP_REQ_SIZE_VER0,
+ COREDUMP_KERNEL | COREDUMP_USERSPACE |
+ COREDUMP_REJECT | COREDUMP_WAIT))
+ goto out;
+
+ if (!send_coredump_ack(fd_coredump, &req,
+ COREDUMP_KERNEL | COREDUMP_REJECT | COREDUMP_WAIT, 0))
+ goto out;
+
+ if (!wait_for_oob_marker(fd_coredump, COREDUMP_OOB_CONFLICTING))
+ goto out;
+
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ if (fd_server >= 0)
+ close(fd_server);
+ _exit(exit_code);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0)
+ crashing_child();
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFSIGNALED(status));
+ ASSERT_FALSE(WCOREDUMP(status));
+
+ ASSERT_TRUE(get_pidfd_info(pidfd, &info));
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
+
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
+}
+
+TEST_F(coredump, socket_request_unknown_flag)
+{
+ int pidfd, ret, status;
+ pid_t pid, pid_coredump_server;
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+
+ ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
+
+ ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
+ ASSERT_EQ(ret, 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ struct coredump_req req = {};
+ int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
+ int exit_code = EXIT_FAILURE;
+
+ close(ipc_sockets[0]);
+
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
+ if (fd_server < 0)
+ goto out;
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
+
+ close(ipc_sockets[1]);
+
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0)
+ goto out;
+
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0)
+ goto out;
+
+ if (!get_pidfd_info(fd_peer_pidfd, &info))
+ goto out;
+
+ if (!(info.mask & PIDFD_INFO_COREDUMP))
+ goto out;
+
+ if (!(info.coredump_mask & PIDFD_COREDUMPED))
+ goto out;
+
+ if (!read_coredump_req(fd_coredump, &req))
+ goto out;
+
+ if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
+ COREDUMP_KERNEL | COREDUMP_USERSPACE |
+ COREDUMP_REJECT | COREDUMP_WAIT))
+ goto out;
+
+ if (!send_coredump_ack(fd_coredump, &req, (1ULL << 63), 0))
+ goto out;
+
+ if (!wait_for_oob_marker(fd_coredump, COREDUMP_OOB_UNSUPPORTED))
+ goto out;
+
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ if (fd_server >= 0)
+ close(fd_server);
+ _exit(exit_code);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0)
+ crashing_child();
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFSIGNALED(status));
+ ASSERT_FALSE(WCOREDUMP(status));
+
+ ASSERT_TRUE(get_pidfd_info(pidfd, &info));
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
+
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
+}
+
+TEST_F(coredump, socket_request_invalid_size_small)
+{
+ int pidfd, ret, status;
+ pid_t pid, pid_coredump_server;
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+
+ ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
+
+ ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
+ ASSERT_EQ(ret, 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ struct coredump_req req = {};
+ int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
+ int exit_code = EXIT_FAILURE;
+
+ close(ipc_sockets[0]);
+
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
+ if (fd_server < 0)
+ goto out;
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
+
+ close(ipc_sockets[1]);
+
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0)
+ goto out;
+
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0)
+ goto out;
+
+ if (!get_pidfd_info(fd_peer_pidfd, &info))
+ goto out;
+
+ if (!(info.mask & PIDFD_INFO_COREDUMP))
+ goto out;
+
+ if (!(info.coredump_mask & PIDFD_COREDUMPED))
+ goto out;
+
+ if (!read_coredump_req(fd_coredump, &req))
+ goto out;
+
+ if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
+ COREDUMP_KERNEL | COREDUMP_USERSPACE |
+ COREDUMP_REJECT | COREDUMP_WAIT))
+ goto out;
+
+ if (!send_coredump_ack(fd_coredump, &req,
+ COREDUMP_REJECT | COREDUMP_WAIT,
+ COREDUMP_ACK_SIZE_VER0 / 2))
+ goto out;
+
+ if (!wait_for_oob_marker(fd_coredump, COREDUMP_OOB_INVALIDSIZE))
+ goto out;
+
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ if (fd_server >= 0)
+ close(fd_server);
+ _exit(exit_code);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0)
+ crashing_child();
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFSIGNALED(status));
+ ASSERT_FALSE(WCOREDUMP(status));
+
+ ASSERT_TRUE(get_pidfd_info(pidfd, &info));
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
+
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
+}
+
+TEST_F(coredump, socket_request_invalid_size_large)
+{
+ int pidfd, ret, status;
+ pid_t pid, pid_coredump_server;
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+
+ ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
+
+ ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
+ ASSERT_EQ(ret, 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ struct coredump_req req = {};
+ int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
+ int exit_code = EXIT_FAILURE;
+
+ close(ipc_sockets[0]);
+
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
+ if (fd_server < 0)
+ goto out;
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
+
+ close(ipc_sockets[1]);
+
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0)
+ goto out;
+
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0)
+ goto out;
+
+ if (!get_pidfd_info(fd_peer_pidfd, &info))
+ goto out;
+
+ if (!(info.mask & PIDFD_INFO_COREDUMP))
+ goto out;
+
+ if (!(info.coredump_mask & PIDFD_COREDUMPED))
+ goto out;
+
+ if (!read_coredump_req(fd_coredump, &req))
+ goto out;
+
+ if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
+ COREDUMP_KERNEL | COREDUMP_USERSPACE |
+ COREDUMP_REJECT | COREDUMP_WAIT))
+ goto out;
+
+ if (!send_coredump_ack(fd_coredump, &req,
+ COREDUMP_REJECT | COREDUMP_WAIT,
+ COREDUMP_ACK_SIZE_VER0 + PAGE_SIZE))
+ goto out;
+
+ if (!wait_for_oob_marker(fd_coredump, COREDUMP_OOB_INVALIDSIZE))
+ goto out;
+
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ if (fd_server >= 0)
+ close(fd_server);
+ _exit(exit_code);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ pid = fork();
+ ASSERT_GE(pid, 0);
+ if (pid == 0)
+ crashing_child();
+
+ pidfd = sys_pidfd_open(pid, 0);
+ ASSERT_GE(pidfd, 0);
+
+ waitpid(pid, &status, 0);
+ ASSERT_TRUE(WIFSIGNALED(status));
+ ASSERT_FALSE(WCOREDUMP(status));
+
+ ASSERT_TRUE(get_pidfd_info(pidfd, &info));
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
+
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
+}
+
+
+static int open_coredump_tmpfile(int fd_tmpfs_detached)
+{
+ return openat(fd_tmpfs_detached, ".", O_TMPFILE | O_RDWR | O_EXCL, 0600);
+}
+
+#define NUM_CRASHING_COREDUMPS 5
+
+TEST_F_TIMEOUT(coredump, socket_multiple_crashing_coredumps, 500)
+{
+ int pidfd[NUM_CRASHING_COREDUMPS], status[NUM_CRASHING_COREDUMPS];
+ pid_t pid[NUM_CRASHING_COREDUMPS], pid_coredump_server;
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+
+ ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
+
+ ASSERT_EQ(socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets), 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1, fd_core_file = -1;
+ int exit_code = EXIT_FAILURE;
+ struct coredump_req req = {};
+
+ close(ipc_sockets[0]);
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
+ if (fd_server < 0) {
+ fprintf(stderr, "Failed to create and listen on unix socket\n");
+ goto out;
+ }
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
+ fprintf(stderr, "Failed to notify parent via ipc socket\n");
+ goto out;
+ }
+ close(ipc_sockets[1]);
+
+ for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0) {
+ fprintf(stderr, "accept4 failed: %m\n");
+ goto out;
+ }
+
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0) {
+ fprintf(stderr, "get_peer_pidfd failed for fd %d: %m\n", fd_coredump);
+ goto out;
+ }
+
+ if (!get_pidfd_info(fd_peer_pidfd, &info)) {
+ fprintf(stderr, "get_pidfd_info failed for fd %d\n", fd_peer_pidfd);
+ goto out;
+ }
+
+ if (!(info.mask & PIDFD_INFO_COREDUMP)) {
+ fprintf(stderr, "pidfd info missing PIDFD_INFO_COREDUMP for fd %d\n", fd_peer_pidfd);
+ goto out;
+ }
+ if (!(info.coredump_mask & PIDFD_COREDUMPED)) {
+ fprintf(stderr, "pidfd info missing PIDFD_COREDUMPED for fd %d\n", fd_peer_pidfd);
+ goto out;
+ }
+
+ if (!read_coredump_req(fd_coredump, &req)) {
+ fprintf(stderr, "read_coredump_req failed for fd %d\n", fd_coredump);
+ goto out;
+ }
+
+ if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
+ COREDUMP_KERNEL | COREDUMP_USERSPACE |
+ COREDUMP_REJECT | COREDUMP_WAIT)) {
+ fprintf(stderr, "check_coredump_req failed for fd %d\n", fd_coredump);
+ goto out;
+ }
+
+ if (!send_coredump_ack(fd_coredump, &req,
+ COREDUMP_KERNEL | COREDUMP_WAIT, 0)) {
+ fprintf(stderr, "send_coredump_ack failed for fd %d\n", fd_coredump);
+ goto out;
+ }
+
+ fd_core_file = open_coredump_tmpfile(self->fd_tmpfs_detached);
+ if (fd_core_file < 0) {
+ fprintf(stderr, "%m - open_coredump_tmpfile failed for fd %d\n", fd_coredump);
+ goto out;
+ }
+
+ for (;;) {
+ char buffer[4096];
+ ssize_t bytes_read, bytes_write;
+
+ bytes_read = read(fd_coredump, buffer, sizeof(buffer));
+ if (bytes_read < 0) {
+ fprintf(stderr, "read failed for fd %d: %m\n", fd_coredump);
+ goto out;
+ }
+
+ if (bytes_read == 0)
+ break;
+
+ bytes_write = write(fd_core_file, buffer, bytes_read);
+ if (bytes_read != bytes_write) {
+ fprintf(stderr, "write failed for fd %d: %m\n", fd_core_file);
+ goto out;
+ }
+ }
+
+ close(fd_core_file);
+ close(fd_peer_pidfd);
+ close(fd_coredump);
+ fd_peer_pidfd = -1;
+ fd_coredump = -1;
+ }
+
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_core_file >= 0)
+ close(fd_core_file);
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ if (fd_server >= 0)
+ close(fd_server);
+ _exit(exit_code);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
+ pid[i] = fork();
+ ASSERT_GE(pid[i], 0);
+ if (pid[i] == 0)
+ crashing_child();
+ pidfd[i] = sys_pidfd_open(pid[i], 0);
+ ASSERT_GE(pidfd[i], 0);
+ }
+
+ for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
+ ASSERT_GE(waitpid(pid[i], &status[i], 0), 0);
+ ASSERT_TRUE(WIFSIGNALED(status[i]));
+ ASSERT_TRUE(WCOREDUMP(status[i]));
+ }
+
+ for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
+ info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
+ ASSERT_EQ(ioctl(pidfd[i], PIDFD_GET_INFO, &info), 0);
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
+ }
+
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
+}
+
+#define MAX_EVENTS 128
+
+static void process_coredump_worker(int fd_coredump, int fd_peer_pidfd, int fd_core_file)
+{
+ int epfd = -1;
+ int exit_code = EXIT_FAILURE;
+
+ epfd = epoll_create1(0);
+ if (epfd < 0)
+ goto out;
+
+ struct epoll_event ev;
+ ev.events = EPOLLIN | EPOLLPRI | EPOLLRDHUP | EPOLLET;
+ ev.data.fd = fd_coredump;
+ if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd_coredump, &ev) < 0)
+ goto out;
+
+ for (;;) {
+ struct epoll_event events[1];
+ int n = epoll_wait(epfd, events, 1, -1);
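+ if (n < 0) {
+ /* Retry on signal interruption; any other failure is an error. */
+ if (errno == EINTR)
+ continue;
+ goto out;
+ }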
+
+ if (events[0].events & EPOLLPRI) {
+ uint8_t oob;
+ ssize_t oobret = recv(fd_coredump, &oob, 1, MSG_OOB);
+ if (oobret == 1) {
+ fprintf(stderr, "Worker: Received OOB marker %u on fd %d, aborting coredump\n", oob, fd_coredump);
+ break;
+ }
+ }
+ if (events[0].events & (EPOLLIN | EPOLLRDHUP)) {
+ for (;;) {
+ char buffer[4096];
+ ssize_t bytes_read = read(fd_coredump, buffer, sizeof(buffer));
+ if (bytes_read < 0) {
+ if (errno == EAGAIN || errno == EWOULDBLOCK)
+ break;
+ goto out;
+ }
+ if (bytes_read == 0)
+ goto done;
+ ssize_t bytes_write = write(fd_core_file, buffer, bytes_read);
+ if (bytes_write != bytes_read)
+ goto out;
+ }
+ }
+ }
+
+done:
+ exit_code = EXIT_SUCCESS;
+out:
+ if (epfd >= 0)
+ close(epfd);
+ if (fd_core_file >= 0)
+ close(fd_core_file);
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ _exit(exit_code);
+}
+
+TEST_F_TIMEOUT(coredump, socket_multiple_crashing_coredumps_epoll_workers, 500)
+{
+ int pidfd[NUM_CRASHING_COREDUMPS], status[NUM_CRASHING_COREDUMPS];
+ pid_t pid[NUM_CRASHING_COREDUMPS], pid_coredump_server, worker_pids[NUM_CRASHING_COREDUMPS];
+ struct pidfd_info info = {};
+ int ipc_sockets[2];
+ char c;
+
+ ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
+ ASSERT_EQ(socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets), 0);
+
+ pid_coredump_server = fork();
+ ASSERT_GE(pid_coredump_server, 0);
+ if (pid_coredump_server == 0) {
+ int fd_server = -1, exit_code = EXIT_FAILURE, n_conns = 0;
+
+ close(ipc_sockets[0]);
+ fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
+ if (fd_server < 0)
+ goto out;
+
+ if (write_nointr(ipc_sockets[1], "1", 1) < 0)
+ goto out;
+ close(ipc_sockets[1]);
+
+ while (n_conns < NUM_CRASHING_COREDUMPS) {
+ int fd_coredump = -1, fd_peer_pidfd = -1, fd_core_file = -1;
+ struct coredump_req req = {};
+ fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
+ if (fd_coredump < 0) {
+ if (errno == EAGAIN || errno == EWOULDBLOCK)
+ continue;
+ goto out;
+ }
+ fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+ if (fd_peer_pidfd < 0)
+ goto out;
+ if (!get_pidfd_info(fd_peer_pidfd, &info))
+ goto out;
+ if (!(info.mask & PIDFD_INFO_COREDUMP) || !(info.coredump_mask & PIDFD_COREDUMPED))
+ goto out;
+ if (!read_coredump_req(fd_coredump, &req))
+ goto out;
+ if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
+ COREDUMP_KERNEL | COREDUMP_USERSPACE |
+ COREDUMP_REJECT | COREDUMP_WAIT))
+ goto out;
+ if (!send_coredump_ack(fd_coredump, &req, COREDUMP_KERNEL | COREDUMP_WAIT, 0))
+ goto out;
+ fd_core_file = open_coredump_tmpfile(self->fd_tmpfs_detached);
+ if (fd_core_file < 0)
+ goto out;
+ pid_t worker = fork();
+ if (worker < 0)
+ goto out;
+ if (worker == 0) {
+ close(fd_server);
+ process_coredump_worker(fd_coredump, fd_peer_pidfd, fd_core_file);
+ }
+ worker_pids[n_conns] = worker;
+ if (fd_coredump >= 0)
+ close(fd_coredump);
+ if (fd_peer_pidfd >= 0)
+ close(fd_peer_pidfd);
+ if (fd_core_file >= 0)
+ close(fd_core_file);
+ n_conns++;
+ }
+ exit_code = EXIT_SUCCESS;
+out:
+ if (fd_server >= 0)
+ close(fd_server);
+
+ /* Reap all worker processes. */
+ for (int i = 0; i < n_conns; i++) {
+ int wstatus;
+ if (waitpid(worker_pids[i], &wstatus, 0) < 0) {
+ fprintf(stderr, "Failed to wait for worker %d: %m\n", worker_pids[i]);
+ } else if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) != EXIT_SUCCESS) {
+ fprintf(stderr, "Worker %d exited with error code %d\n", worker_pids[i], WEXITSTATUS(wstatus));
+ exit_code = EXIT_FAILURE;
+ }
+ }
+
+ _exit(exit_code);
+ }
+ self->pid_coredump_server = pid_coredump_server;
+
+ EXPECT_EQ(close(ipc_sockets[1]), 0);
+ ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
+ EXPECT_EQ(close(ipc_sockets[0]), 0);
+
+ for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
+ pid[i] = fork();
+ ASSERT_GE(pid[i], 0);
+ if (pid[i] == 0)
+ crashing_child();
+ pidfd[i] = sys_pidfd_open(pid[i], 0);
+ ASSERT_GE(pidfd[i], 0);
+ }
+
+ for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
+ ASSERT_GE(waitpid(pid[i], &status[i], 0), 0);
+ ASSERT_TRUE(WIFSIGNALED(status[i]));
+ ASSERT_TRUE(WCOREDUMP(status[i]));
+ }
+
+ for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
+ info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
+ ASSERT_EQ(ioctl(pidfd[i], PIDFD_GET_INFO, &info), 0);
+ ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
+ ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
+ }
+
+ wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
+}
+
TEST_HARNESS_MAIN
--
2.47.2
* Re: [PATCH v2 1/5] coredump: allow for flexible coredump handling
2025-06-03 13:31 ` [PATCH v2 1/5] " Christian Brauner
@ 2025-06-03 13:49 ` Alexander Mikhalitsyn
2025-06-09 14:16 ` Jeff Layton
1 sibling, 0 replies; 14+ messages in thread
From: Alexander Mikhalitsyn @ 2025-06-03 13:49 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Jann Horn, Josef Bacik, Jeff Layton,
Alexander Viro, Daan De Meyer, Jan Kara, Lennart Poettering,
Mike Yuan, Zbigniew Jędrzejewski-Szmek
On Tue, Jun 3, 2025 at 15:32, Christian Brauner
<brauner@kernel.org> wrote:
>
> Extend the coredump socket to allow the coredump server to tell the
> kernel how to process individual coredumps.
>
> When the crashing task connects to the coredump socket the kernel will
> send a struct coredump_req to the coredump server. The kernel will set
> the size member of struct coredump_req, telling the coredump server how
> much data can be read.
>
> The coredump server uses MSG_PEEK to peek the size of struct
> coredump_req. If the kernel uses a newer struct coredump_req the
> coredump server just reads the size it knows and discards any remaining
> bytes in the buffer. If the kernel uses an older struct coredump_req
> the coredump server just reads the size the kernel knows.
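
For illustration, the read side described in this paragraph could look
roughly like the sketch below. It is not taken from the series: struct
sketch_req is a local mirror of the VER0 layout of struct coredump_req,
and sketch_read_req() is a hypothetical helper name. It peeks the size
member, reads as much of the request as the server knows about, and
drains whatever a newer kernel may have appended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Local mirror of struct coredump_req from this patch (VER0 layout). */
struct sketch_req {
	uint32_t size;
	uint32_t size_ack;
	uint64_t mask;
};

#define SKETCH_REQ_SIZE_VER0 16U

/* Hypothetical helper: read one request, tolerating a larger kernel struct. */
static bool sketch_read_req(int fd, struct sketch_req *req)
{
	uint32_t ksize;
	size_t known = sizeof(*req);
	size_t want, remaining;
	ssize_t n;

	/* Peek only the leading size member; the data stays queued. */
	n = recv(fd, &ksize, sizeof(ksize), MSG_PEEK | MSG_WAITALL);
	if (n != (ssize_t)sizeof(ksize))
		return false;
	if (ksize < SKETCH_REQ_SIZE_VER0)
		return false;

	/* Read the smaller of what the kernel sent and what we know. */
	memset(req, 0, sizeof(*req));
	want = ksize < known ? ksize : known;
	n = recv(fd, req, want, MSG_WAITALL);
	if (n != (ssize_t)want)
		return false;

	/* Newer kernel: discard the trailing bytes we do not understand. */
	remaining = ksize - want;
	while (remaining > 0) {
		char scratch[64];
		size_t chunk = remaining < sizeof(scratch) ? remaining : sizeof(scratch);

		n = recv(fd, scratch, chunk, MSG_WAITALL);
		if (n <= 0)
			return false;
		remaining -= (size_t)n;
	}
	return true;
}
```

Draining the tail matters because the same stream later carries the
coredump data when COREDUMP_KERNEL is selected.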
>
> The returned struct coredump_req will inform the coredump server what
> features the kernel supports. The coredump_req->mask member is set to
> the currently known features.
>
> The coredump server may only use features whose bits were raised by the
> kernel in coredump_req->mask.
>
> In response to a coredump_req from the kernel the coredump server sends
> a struct coredump_ack to the kernel. The kernel informs the coredump
> server what version of struct coredump_ack it supports by setting struct
> coredump_req->size_ack to the size it knows about. The coredump server
> may only send as many bytes as coredump_req->size_ack indicates (a
> smaller size is fine of course). The coredump server must set
> coredump_ack->size accordingly.
>
> The coredump server sets the features it wants to use in struct
> coredump_ack->mask. Only bits returned in struct coredump_req->mask may
> be used.
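
A rough sketch of how a server might build its reply under these rules
follows. The struct and flag definitions are local mirrors of the uapi
header added by this patch, and sketch_build_ack() is a made-up name.
The "exactly one of KERNEL/USERSPACE/REJECT" check mirrors the
hweight64() test in coredump_request() further down in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the uapi definitions in this patch (VER0 layout). */
#define SK_COREDUMP_KERNEL    (1ULL << 0)
#define SK_COREDUMP_USERSPACE (1ULL << 1)
#define SK_COREDUMP_REJECT    (1ULL << 2)
#define SK_COREDUMP_WAIT      (1ULL << 3)
#define SK_ACK_SIZE_VER0      16U

struct sketch_req { uint32_t size; uint32_t size_ack; uint64_t mask; };
struct sketch_ack { uint32_t size; uint32_t spare; uint64_t mask; };

/*
 * Hypothetical helper: fill @ack for the features in @wanted, or return
 * false if the kernel would refuse the resulting ack.
 */
static bool sketch_build_ack(const struct sketch_req *req, uint64_t wanted,
			     struct sketch_ack *ack)
{
	uint64_t mode = wanted & (SK_COREDUMP_KERNEL | SK_COREDUMP_USERSPACE |
				  SK_COREDUMP_REJECT);

	/* Only bits the kernel raised in coredump_req->mask may be used. */
	if (wanted & ~req->mask)
		return false;

	/* Exactly one of KERNEL, USERSPACE, REJECT must be chosen. */
	if (mode == 0 || (mode & (mode - 1)) != 0)
		return false;

	/* The kernel must know at least the first published ack layout. */
	if (req->size_ack < SK_ACK_SIZE_VER0)
		return false;

	/* Zero everything so that the spare member stays zero. */
	memset(ack, 0, sizeof(*ack));
	/* Never claim more bytes than the kernel or this server knows. */
	ack->size = req->size_ack < sizeof(*ack) ? req->size_ack
						  : (uint32_t)sizeof(*ack);
	ack->mask = wanted;
	return true;
}
```

The server would then send exactly ack->size bytes on the coredump
socket.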
>
> In case an invalid struct coredump_ack is sent to the kernel an
> out-of-band byte will be sent by the kernel indicating the reason why
> the coredump_ack was rejected.
>
> The out-of-band markers allow advanced userspace to infer failure. They
> are optional: a server can ignore them by not listening for POLLPRI
> events, and they aren't necessary for the coredump server to function
> correctly.
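
As a sketch of what an interested server could do here, assuming the
kernel has AF_UNIX out-of-band support enabled: poll for POLLPRI and
fetch the single marker byte with MSG_OOB. sketch_wait_oob() is a
hypothetical name, and its return convention is my own choice.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to @timeout_ms for an out-of-band marker.
 * Returns the marker value (1..255), 0 if none arrived, -1 on error.
 */
static int sketch_wait_oob(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	uint8_t marker;
	int ret;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		return -1;

	/* Timeout, or only a hangup without urgent data. */
	if (ret == 0 || !(pfd.revents & POLLPRI))
		return 0;

	/* The marker is a single byte and is read with MSG_OOB. */
	if (recv(fd, &marker, sizeof(marker), MSG_OOB) != 1)
		return -1;

	return marker;
}
```

A return value of 1, 2 or 3 would then correspond to
COREDUMP_OOB_INVALIDSIZE, COREDUMP_OOB_UNSUPPORTED or
COREDUMP_OOB_CONFLICTING as defined in the new uapi header.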
>
> In the initial version the following features are supported in
> coredump_{req,ack}->mask:
>
> * COREDUMP_KERNEL
> The kernel will write the coredump data to the socket.
>
> * COREDUMP_USERSPACE
> The kernel will not write coredump data but will indicate to the
> parent that a coredump has been generated. This is used when userspace
> generates its own coredumps.
>
> * COREDUMP_REJECT
> The kernel will skip generating a coredump for this task.
>
> * COREDUMP_WAIT
> The kernel will prevent the task from exiting until the coredump
> server has shut down the socket connection.
>
> The flexible coredump socket can be enabled by using the "@@" prefix
> instead of the single "@" prefix for the regular coredump socket:
>
> @@/run/systemd/coredump.socket
>
> will enable flexible coredump handling. Current kernels already enforce
> that "@" must be followed by "/" and will reject anything else. So
> extending this is backward and forward compatible.
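
To make the prefix rules concrete, here is a small userspace sketch of
how a core_pattern string maps to a handling type. The enum and
sketch_classify() are illustrative names only. Note one assumption: the
sketch insists on "/" right after "@@", which is stricter than the
check_coredump_socket() hunk shown below, since that hunk only looks at
the character following the first "@".

```c
#include <assert.h>

/* Illustrative stand-ins for the kernel's coredump_type_t values. */
enum sketch_type { SK_FILE, SK_PIPE, SK_SOCK, SK_SOCK_REQ, SK_INVALID };

/* Hypothetical helper: classify a core_pattern by its prefix. */
static enum sketch_type sketch_classify(const char *pattern)
{
	if (pattern[0] == '|')
		return SK_PIPE;
	if (pattern[0] != '@')
		return SK_FILE;

	/* "@/path": regular coredump socket. */
	if (pattern[1] == '/')
		return SK_SOCK;

	/* "@@/path": coredump socket with the request/ack handshake. */
	if (pattern[1] == '@' && pattern[2] == '/')
		return SK_SOCK_REQ;

	return SK_INVALID;
}
```

An older kernel sees "@" followed by something other than "/" and
rejects the pattern, which is what makes the "@@" extension safe to
introduce.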
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
> ---
> fs/coredump.c | 130 +++++++++++++++++++++++++++++++++++++++---
> include/uapi/linux/coredump.h | 104 +++++++++++++++++++++++++++++++++
> 2 files changed, 227 insertions(+), 7 deletions(-)
>
> diff --git a/fs/coredump.c b/fs/coredump.c
> index f217ebf2b3b6..e79f37d3eefb 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -51,6 +51,7 @@
> #include <net/sock.h>
> #include <uapi/linux/pidfd.h>
> #include <uapi/linux/un.h>
> +#include <uapi/linux/coredump.h>
>
> #include <linux/uaccess.h>
> #include <asm/mmu_context.h>
> @@ -83,15 +84,17 @@ static int core_name_size = CORENAME_MAX_SIZE;
> unsigned int core_file_note_size_limit = CORE_FILE_NOTE_SIZE_DEFAULT;
>
> enum coredump_type_t {
> - COREDUMP_FILE = 1,
> - COREDUMP_PIPE = 2,
> - COREDUMP_SOCK = 3,
> + COREDUMP_FILE = 1,
> + COREDUMP_PIPE = 2,
> + COREDUMP_SOCK = 3,
> + COREDUMP_SOCK_REQ = 4,
> };
>
> struct core_name {
> char *corename;
> int used, size;
> enum coredump_type_t core_type;
> + u64 mask;
> };
>
> static int expand_corename(struct core_name *cn, int size)
> @@ -235,6 +238,9 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
> int pid_in_pattern = 0;
> int err = 0;
>
> + cn->mask = COREDUMP_KERNEL;
> + if (core_pipe_limit)
> + cn->mask |= COREDUMP_WAIT;
> cn->used = 0;
> cn->corename = NULL;
> if (*pat_ptr == '|')
> @@ -264,6 +270,13 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
> pat_ptr++;
> if (!(*pat_ptr))
> return -ENOMEM;
> + if (*pat_ptr == '@') {
> + pat_ptr++;
> + if (!(*pat_ptr))
> + return -ENOMEM;
> +
> + cn->core_type = COREDUMP_SOCK_REQ;
> + }
>
> err = cn_printf(cn, "%s", pat_ptr);
> if (err)
> @@ -632,6 +645,93 @@ static int umh_coredump_setup(struct subprocess_info *info, struct cred *new)
> return 0;
> }
>
> +#ifdef CONFIG_UNIX
> +static inline bool coredump_sock_recv(struct file *file, struct coredump_ack *ack, size_t size, int flags)
> +{
> + struct msghdr msg = {};
> + struct kvec iov = { .iov_base = ack, .iov_len = size };
> + ssize_t ret;
> +
> + memset(ack, 0, size);
> + ret = kernel_recvmsg(sock_from_file(file), &msg, &iov, 1, size, flags);
> + return ret == size;
> +}
> +
> +static inline bool coredump_sock_send(struct file *file, struct coredump_req *req)
> +{
> + struct msghdr msg = { .msg_flags = MSG_NOSIGNAL };
> + struct kvec iov = { .iov_base = req, .iov_len = sizeof(*req) };
> + ssize_t ret;
> +
> + ret = kernel_sendmsg(sock_from_file(file), &msg, &iov, 1, sizeof(*req));
> + return ret == sizeof(*req);
> +}
> +
> +static_assert(sizeof(enum coredump_oob) == sizeof(__u8));
> +
> +static inline bool coredump_sock_oob(struct file *file, enum coredump_oob oob)
> +{
> +#ifdef CONFIG_AF_UNIX_OOB
> + struct msghdr msg = { .msg_flags = MSG_NOSIGNAL | MSG_OOB };
> + struct kvec iov = { .iov_base = &oob, .iov_len = sizeof(oob) };
> +
> + kernel_sendmsg(sock_from_file(file), &msg, &iov, 1, sizeof(oob));
> +#endif
> + coredump_report_failure("Coredump socket ack failed %u", oob);
> + return false;
> +}
> +
> +static bool coredump_request(struct core_name *cn, struct coredump_params *cprm)
> +{
> + struct coredump_req req = {
> + .size = sizeof(struct coredump_req),
> + .mask = COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT,
> + .size_ack = sizeof(struct coredump_ack),
> + };
> + struct coredump_ack ack = {};
> + ssize_t usize;
> +
> + if (cn->core_type != COREDUMP_SOCK_REQ)
> + return true;
> +
> + /* Let userspace know what we support. */
> + if (!coredump_sock_send(cprm->file, &req))
> + return false;
> +
> + /* Peek the size of the coredump_ack. */
> + if (!coredump_sock_recv(cprm->file, &ack, sizeof(ack.size),
> + MSG_PEEK | MSG_WAITALL))
> + return false;
> +
> + /* Refuse unknown coredump_ack sizes. */
> + usize = ack.size;
> + if (usize < COREDUMP_ACK_SIZE_VER0 || usize > sizeof(ack))
> + return coredump_sock_oob(cprm->file, COREDUMP_OOB_INVALIDSIZE);
> +
> + /* Now retrieve the coredump_ack. */
> + if (!coredump_sock_recv(cprm->file, &ack, usize, MSG_WAITALL))
> + return false;
> + if (ack.size != usize)
> + return false;
> +
> + /* Refuse unknown coredump_ack flags. */
> + if (ack.mask & ~req.mask)
> + return coredump_sock_oob(cprm->file, COREDUMP_OOB_UNSUPPORTED);
> +
> + /* Refuse mutually exclusive options. */
> + if (hweight64(ack.mask & (COREDUMP_USERSPACE | COREDUMP_KERNEL |
> + COREDUMP_REJECT)) != 1)
> + return coredump_sock_oob(cprm->file, COREDUMP_OOB_CONFLICTING);
> +
> + if (ack.spare)
> + return coredump_sock_oob(cprm->file, COREDUMP_OOB_UNSUPPORTED);
> +
> + cn->mask = ack.mask;
> + return true;
> +}
> +#endif
> +
> void do_coredump(const kernel_siginfo_t *siginfo)
> {
> struct core_state core_state;
> @@ -850,6 +950,8 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> }
> break;
> }
> + case COREDUMP_SOCK_REQ:
> + fallthrough;
> case COREDUMP_SOCK: {
> #ifdef CONFIG_UNIX
> struct file *file __free(fput) = NULL;
> @@ -918,6 +1020,9 @@ void do_coredump(const kernel_siginfo_t *siginfo)
>
> cprm.limit = RLIM_INFINITY;
> cprm.file = no_free_ptr(file);
> +
> + if (!coredump_request(&cn, &cprm))
> + goto close_fail;
> #else
> coredump_report_failure("Core dump socket support %s disabled", cn.corename);
> goto close_fail;
> @@ -929,12 +1034,17 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> goto close_fail;
> }
>
> + /* Don't even generate the coredump. */
> + if (cn.mask & COREDUMP_REJECT)
> + goto close_fail;
> +
> /* get us an unshared descriptor table; almost always a no-op */
> /* The cell spufs coredump code reads the file descriptor tables */
> retval = unshare_files();
> if (retval)
> goto close_fail;
> - if (!dump_interrupted()) {
> +
> + if ((cn.mask & COREDUMP_KERNEL) && !dump_interrupted()) {
> /*
> * umh disabled with CONFIG_STATIC_USERMODEHELPER_PATH="" would
> * have this set to NULL.
> @@ -968,17 +1078,23 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> kernel_sock_shutdown(sock_from_file(cprm.file), SHUT_WR);
> #endif
>
> + /* Let the parent know that a coredump was generated. */
> + if (cn.mask & COREDUMP_USERSPACE)
> + core_dumped = true;
> +
> /*
> * When core_pipe_limit is set we wait for the coredump server
> * or usermodehelper to finish before exiting so it can e.g.,
> * inspect /proc/<pid>.
> */
> - if (core_pipe_limit) {
> + if (cn.mask & COREDUMP_WAIT) {
> switch (cn.core_type) {
> case COREDUMP_PIPE:
> wait_for_dump_helpers(cprm.file);
> break;
> #ifdef CONFIG_UNIX
> + case COREDUMP_SOCK_REQ:
> + fallthrough;
> case COREDUMP_SOCK: {
> ssize_t n;
>
> @@ -1249,8 +1365,8 @@ static inline bool check_coredump_socket(void)
> if (current->nsproxy->mnt_ns != init_task.nsproxy->mnt_ns)
> return false;
>
> - /* Must be an absolute path. */
> - if (*(core_pattern + 1) != '/')
> + /* Must be an absolute path or the socket request. */
> + if (*(core_pattern + 1) != '/' && *(core_pattern + 1) != '@')
> return false;
>
> return true;
> diff --git a/include/uapi/linux/coredump.h b/include/uapi/linux/coredump.h
> new file mode 100644
> index 000000000000..4fa7d1f9d062
> --- /dev/null
> +++ b/include/uapi/linux/coredump.h
> @@ -0,0 +1,104 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +
> +#ifndef _UAPI_LINUX_COREDUMP_H
> +#define _UAPI_LINUX_COREDUMP_H
> +
> +#include <linux/types.h>
> +
> +/**
> + * coredump_{req,ack} flags
> + * @COREDUMP_KERNEL: kernel writes coredump
> + * @COREDUMP_USERSPACE: userspace writes coredump
> + * @COREDUMP_REJECT: don't generate coredump
> + * @COREDUMP_WAIT: wait for coredump server
> + */
> +enum {
> + COREDUMP_KERNEL = (1ULL << 0),
> + COREDUMP_USERSPACE = (1ULL << 1),
> + COREDUMP_REJECT = (1ULL << 2),
> + COREDUMP_WAIT = (1ULL << 3),
> +};
> +
> +/**
> + * struct coredump_req - message kernel sends to userspace
> + * @size: size of struct coredump_req
> + * @size_ack: known size of struct coredump_ack on this kernel
> + * @mask: supported features
> + *
> + * When a coredump happens the kernel will connect to the coredump
> + * socket and send a coredump request to the coredump server. The @size
> + * member is set to the size of struct coredump_req and provides a hint
> + * to userspace how much data can be read. Userspace may use MSG_PEEK to
> + * peek the size of struct coredump_req and then choose to consume it in
> + * one go. Userspace may also simply read a COREDUMP_REQ_SIZE_VER0
> + * request. If the size the kernel sends is larger userspace simply
> + * discards any remaining data.
> + *
> + * The coredump_req->mask member is set to the currently known features.
> + * Userspace may only set coredump_ack->mask to the bits raised by the
> + * kernel in coredump_req->mask.
> + *
> + * The coredump_req->size_ack member is set by the kernel to the size of
> + * struct coredump_ack the kernel knows. Userspace may only send up to
> + * coredump_req->size_ack bytes to the kernel and must set
> + * coredump_ack->size accordingly.
> + */
> +struct coredump_req {
> + __u32 size;
> + __u32 size_ack;
> + __u64 mask;
> +};
> +
> +enum {
> + COREDUMP_REQ_SIZE_VER0 = 16U, /* size of first published struct */
> +};
> +
> +/**
> + * struct coredump_ack - message userspace sends to kernel
> + * @size: size of the struct
> + * @spare: unused
> + * @mask: features kernel is supposed to use
> + *
> + * The @size member must be set to the size of struct coredump_ack. It
> + * may never exceed what the kernel returned in coredump_req->size_ack
> + * but it may of course be smaller (>= COREDUMP_ACK_SIZE_VER0 and <=
> + * coredump_req->size_ack).
> + *
> + * The @mask member must be set to the features the coredump server
> + * wants the kernel to use. Only bits the kernel returned in
> + * coredump_req->mask may be set.
> + */
> +struct coredump_ack {
> + __u32 size;
> + __u32 spare;
> + __u64 mask;
> +};
> +
> +enum {
> + COREDUMP_ACK_SIZE_VER0 = 16U, /* size of first published struct */
> +};
> +
> +/**
> + * enum coredump_oob - Out-of-band markers for the coredump socket
> + *
> + * The kernel will place a single byte coredump_oob marker on the
> + * coredump socket. An interested coredump server can listen for POLLPRI
> + * and figure out why the provided coredump_ack was invalid.
> + *
> + * The out-of-band markers allow advanced userspace to infer more details
> + * about a coredump ack. They are optional and can be ignored. They
> + * aren't necessary for the coredump server to function correctly.
> + *
> + * @COREDUMP_OOB_INVALIDSIZE: the provided coredump_ack size was invalid
> + * @COREDUMP_OOB_UNSUPPORTED: the provided coredump_ack mask was invalid
> + * @COREDUMP_OOB_CONFLICTING: the provided coredump_ack mask has conflicting options
> + * @__COREDUMP_OOB_MAX: the maximum value for coredump_oob
> + */
> +enum coredump_oob {
> + COREDUMP_OOB_INVALIDSIZE = 1U,
> + COREDUMP_OOB_UNSUPPORTED = 2U,
> + COREDUMP_OOB_CONFLICTING = 3U,
> + __COREDUMP_OOB_MAX = 255U,
> +} __attribute__ ((__packed__));
> +
> +#endif /* _UAPI_LINUX_COREDUMP_H */
>
> --
> 2.47.2
>
* Re: [PATCH v2 4/5] tools: add coredump.h header
2025-06-03 13:31 ` [PATCH v2 4/5] tools: add coredump.h header Christian Brauner
@ 2025-06-03 13:51 ` Alexander Mikhalitsyn
0 siblings, 0 replies; 14+ messages in thread
From: Alexander Mikhalitsyn @ 2025-06-03 13:51 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Jann Horn, Josef Bacik, Jeff Layton,
Alexander Viro, Daan De Meyer, Jan Kara, Lennart Poettering,
Mike Yuan, Zbigniew Jędrzejewski-Szmek
On Tue, Jun 3, 2025 at 15:32, Christian Brauner
<brauner@kernel.org> wrote:
>
> Copy the coredump header so we can rely on it in the selftests.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
> ---
> tools/include/uapi/linux/coredump.h | 104 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 104 insertions(+)
>
> diff --git a/tools/include/uapi/linux/coredump.h b/tools/include/uapi/linux/coredump.h
> new file mode 100644
> index 000000000000..4fa7d1f9d062
> --- /dev/null
> +++ b/tools/include/uapi/linux/coredump.h
> @@ -0,0 +1,104 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +
> +#ifndef _UAPI_LINUX_COREDUMP_H
> +#define _UAPI_LINUX_COREDUMP_H
> +
> +#include <linux/types.h>
> +
> +/**
> + * coredump_{req,ack} flags
> + * @COREDUMP_KERNEL: kernel writes coredump
> + * @COREDUMP_USERSPACE: userspace writes coredump
> + * @COREDUMP_REJECT: don't generate coredump
> + * @COREDUMP_WAIT: wait for coredump server
> + */
> +enum {
> + COREDUMP_KERNEL = (1ULL << 0),
> + COREDUMP_USERSPACE = (1ULL << 1),
> + COREDUMP_REJECT = (1ULL << 2),
> + COREDUMP_WAIT = (1ULL << 3),
> +};
> +
> +/**
> + * struct coredump_req - message kernel sends to userspace
> + * @size: size of struct coredump_req
> + * @size_ack: known size of struct coredump_ack on this kernel
> + * @mask: supported features
> + *
> + * When a coredump happens the kernel will connect to the coredump
> + * socket and send a coredump request to the coredump server. The @size
> + * member is set to the size of struct coredump_req and provides a hint
> + * to userspace how much data can be read. Userspace may use MSG_PEEK to
> + * peek the size of struct coredump_req and then choose to consume it in
> + * one go. Userspace may also simply read a COREDUMP_REQ_SIZE_VER0
> + * request. If the size the kernel sends is larger userspace simply
> + * discards any remaining data.
> + *
> + * The coredump_req->mask member is set to the currently known features.
> + * Userspace may only set coredump_ack->mask to the bits raised by the
> + * kernel in coredump_req->mask.
> + *
> + * The coredump_req->size_ack member is set by the kernel to the size of
> + * struct coredump_ack the kernel knows. Userspace may only send up to
> + * coredump_req->size_ack bytes to the kernel and must set
> + * coredump_ack->size accordingly.
> + */
> +struct coredump_req {
> + __u32 size;
> + __u32 size_ack;
> + __u64 mask;
> +};
> +
> +enum {
> + COREDUMP_REQ_SIZE_VER0 = 16U, /* size of first published struct */
> +};
> +
> +/**
> + * struct coredump_ack - message userspace sends to kernel
> + * @size: size of the struct
> + * @spare: unused
> + * @mask: features kernel is supposed to use
> + *
> + * The @size member must be set to the size of struct coredump_ack. It
> + * may never exceed what the kernel returned in coredump_req->size_ack
> + * but it may of course be smaller (>= COREDUMP_ACK_SIZE_VER0 and <=
> + * coredump_req->size_ack).
> + *
> + * The @mask member must be set to the features the coredump server
> + * wants the kernel to use. Only bits the kernel returned in
> + * coredump_req->mask may be set.
> + */
> +struct coredump_ack {
> + __u32 size;
> + __u32 spare;
> + __u64 mask;
> +};
> +
> +enum {
> + COREDUMP_ACK_SIZE_VER0 = 16U, /* size of first published struct */
> +};
> +
> +/**
> + * enum coredump_oob - Out-of-band markers for the coredump socket
> + *
> + * The kernel will place a single byte coredump_oob marker on the
> + * coredump socket. An interested coredump server can listen for POLLPRI
> + * and figure out why the provided coredump_ack was invalid.
> + *
> + * The out-of-band markers allow advanced userspace to infer more details
> + * about a coredump ack. They are optional and can be ignored. They
> + * aren't necessary for the coredump server to function correctly.
> + *
> + * @COREDUMP_OOB_INVALIDSIZE: the provided coredump_ack size was invalid
> + * @COREDUMP_OOB_UNSUPPORTED: the provided coredump_ack mask was invalid
> + * @COREDUMP_OOB_CONFLICTING: the provided coredump_ack mask has conflicting options
> + * @__COREDUMP_OOB_MAX: the maximum value for coredump_oob
> + */
> +enum coredump_oob {
> + COREDUMP_OOB_INVALIDSIZE = 1U,
> + COREDUMP_OOB_UNSUPPORTED = 2U,
> + COREDUMP_OOB_CONFLICTING = 3U,
> + __COREDUMP_OOB_MAX = 255U,
> +} __attribute__ ((__packed__));
> +
> +#endif /* _UAPI_LINUX_COREDUMP_H */
>
> --
> 2.47.2
>
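The size and mask rules documented in the header can be sketched as plain arithmetic, independent of any socket I/O. This is only an illustration under stated assumptions: the demo_ structs mirror the layouts quoted above so the snippet builds without the new header, and the helper names are hypothetical, not something the patch provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of struct coredump_req / coredump_ack (hypothetical names). */
struct demo_coredump_req {
	uint32_t size;
	uint32_t size_ack;
	uint64_t mask;
};

struct demo_coredump_ack {
	uint32_t size;
	uint32_t spare;
	uint64_t mask;
};

enum {
	DEMO_REQ_SIZE_VER0 = 16, /* mirrors COREDUMP_REQ_SIZE_VER0 */
	DEMO_ACK_SIZE_VER0 = 16, /* mirrors COREDUMP_ACK_SIZE_VER0 */
};

_Static_assert(sizeof(struct demo_coredump_req) == DEMO_REQ_SIZE_VER0, "req layout");
_Static_assert(sizeof(struct demo_coredump_ack) == DEMO_ACK_SIZE_VER0, "ack layout");

/* Bytes of the request the server copies into its own struct. */
size_t demo_req_read_len(size_t kernel_size, size_t user_size)
{
	return kernel_size < user_size ? kernel_size : user_size;
}

/* Trailing request bytes a server with an older struct must discard. */
size_t demo_req_discard_len(size_t kernel_size, size_t user_size)
{
	return kernel_size > user_size ? kernel_size - user_size : 0;
}

/*
 * Fill in an ack for the given request. Returns false if the kernel's
 * size_ack is below VER0 or if wanted_mask contains bits the kernel did
 * not raise in req->mask.
 */
bool demo_build_ack(const struct demo_coredump_req *req, uint64_t wanted_mask,
		    struct demo_coredump_ack *ack)
{
	size_t size = sizeof(*ack);

	assert(req != NULL && ack != NULL);

	if (req->size_ack < DEMO_ACK_SIZE_VER0)
		return false;
	if (wanted_mask & ~req->mask)
		return false;
	/* Never send more than the kernel says it understands. */
	if (size > req->size_ack)
		size = req->size_ack;

	ack->size = (uint32_t)size;
	ack->spare = 0;
	ack->mask = wanted_mask;
	return true;
}
```

A server would call demo_req_read_len() after peeking @size, drain demo_req_discard_len() further bytes, and then send ack->size bytes of the ack built by demo_build_ack().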
* Re: [PATCH v2 2/5] selftests/coredump: fix build
2025-06-03 13:31 ` [PATCH v2 2/5] selftests/coredump: fix build Christian Brauner
@ 2025-06-03 13:51 ` Alexander Mikhalitsyn
0 siblings, 0 replies; 14+ messages in thread
From: Alexander Mikhalitsyn @ 2025-06-03 13:51 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Jann Horn, Josef Bacik, Jeff Layton,
Alexander Viro, Daan De Meyer, Jan Kara, Lennart Poettering,
Mike Yuan, Zbigniew Jędrzejewski-Szmek
On Tue, Jun 3, 2025 at 15:32, Christian Brauner
<brauner@kernel.org> wrote:
>
> Fix various warnings in the selftest build.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
> ---
> tools/testing/selftests/coredump/Makefile | 2 +-
> tools/testing/selftests/coredump/stackdump_test.c | 17 +++++------------
> 2 files changed, 6 insertions(+), 13 deletions(-)
>
> diff --git a/tools/testing/selftests/coredump/Makefile b/tools/testing/selftests/coredump/Makefile
> index ed210037b29d..bc287a85b825 100644
> --- a/tools/testing/selftests/coredump/Makefile
> +++ b/tools/testing/selftests/coredump/Makefile
> @@ -1,5 +1,5 @@
> # SPDX-License-Identifier: GPL-2.0-only
> -CFLAGS = $(KHDR_INCLUDES)
> +CFLAGS = -Wall -O0 $(KHDR_INCLUDES)
>
> TEST_GEN_PROGS := stackdump_test
> TEST_FILES := stackdump
> diff --git a/tools/testing/selftests/coredump/stackdump_test.c b/tools/testing/selftests/coredump/stackdump_test.c
> index 9984413be9f0..aa366e6f13a7 100644
> --- a/tools/testing/selftests/coredump/stackdump_test.c
> +++ b/tools/testing/selftests/coredump/stackdump_test.c
> @@ -24,6 +24,8 @@ static void *do_nothing(void *)
> {
> while (1)
> pause();
> +
> + return NULL;
> }
>
> static void crashing_child(void)
> @@ -46,9 +48,7 @@ FIXTURE(coredump)
>
> FIXTURE_SETUP(coredump)
> {
> - char buf[PATH_MAX];
> FILE *file;
> - char *dir;
> int ret;
>
> self->pid_coredump_server = -ESRCH;
> @@ -106,7 +106,6 @@ FIXTURE_TEARDOWN(coredump)
>
> TEST_F_TIMEOUT(coredump, stackdump, 120)
> {
> - struct sigaction action = {};
> unsigned long long stack;
> char *test_dir, *line;
> size_t line_length;
> @@ -171,11 +170,10 @@ TEST_F_TIMEOUT(coredump, stackdump, 120)
>
> TEST_F(coredump, socket)
> {
> - int fd, pidfd, ret, status;
> + int pidfd, ret, status;
> FILE *file;
> pid_t pid, pid_coredump_server;
> struct stat st;
> - char core_file[PATH_MAX];
> struct pidfd_info info = {};
> int ipc_sockets[2];
> char c;
> @@ -356,11 +354,10 @@ TEST_F(coredump, socket)
>
> TEST_F(coredump, socket_detect_userspace_client)
> {
> - int fd, pidfd, ret, status;
> + int pidfd, ret, status;
> FILE *file;
> pid_t pid, pid_coredump_server;
> struct stat st;
> - char core_file[PATH_MAX];
> struct pidfd_info info = {};
> int ipc_sockets[2];
> char c;
> @@ -384,7 +381,7 @@ TEST_F(coredump, socket_detect_userspace_client)
> pid_coredump_server = fork();
> ASSERT_GE(pid_coredump_server, 0);
> if (pid_coredump_server == 0) {
> - int fd_server, fd_coredump, fd_peer_pidfd, fd_core_file;
> + int fd_server, fd_coredump, fd_peer_pidfd;
> socklen_t fd_peer_pidfd_len;
>
> close(ipc_sockets[0]);
> @@ -464,7 +461,6 @@ TEST_F(coredump, socket_detect_userspace_client)
> close(fd_coredump);
> close(fd_server);
> close(fd_peer_pidfd);
> - close(fd_core_file);
> _exit(EXIT_SUCCESS);
> }
> self->pid_coredump_server = pid_coredump_server;
> @@ -488,7 +484,6 @@ TEST_F(coredump, socket_detect_userspace_client)
> if (ret < 0)
> _exit(EXIT_FAILURE);
>
> - (void *)write(fd_socket, &(char){ 0 }, 1);
> close(fd_socket);
> _exit(EXIT_SUCCESS);
> }
> @@ -519,7 +514,6 @@ TEST_F(coredump, socket_enoent)
> int pidfd, ret, status;
> FILE *file;
> pid_t pid;
> - char core_file[PATH_MAX];
>
> file = fopen("/proc/sys/kernel/core_pattern", "w");
> ASSERT_NE(file, NULL);
> @@ -569,7 +563,6 @@ TEST_F(coredump, socket_no_listener)
> ASSERT_GE(pid_coredump_server, 0);
> if (pid_coredump_server == 0) {
> int fd_server;
> - socklen_t fd_peer_pidfd_len;
>
> close(ipc_sockets[0]);
>
>
> --
> 2.47.2
>
* Re: [PATCH v2 3/5] selftests/coredump: cleanup coredump tests
2025-06-03 13:31 ` [PATCH v2 3/5] selftests/coredump: cleanup coredump tests Christian Brauner
@ 2025-06-03 13:52 ` Alexander Mikhalitsyn
0 siblings, 0 replies; 14+ messages in thread
From: Alexander Mikhalitsyn @ 2025-06-03 13:52 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Jann Horn, Josef Bacik, Jeff Layton,
Alexander Viro, Daan De Meyer, Jan Kara, Lennart Poettering,
Mike Yuan, Zbigniew Jędrzejewski-Szmek
On Tue, Jun 3, 2025 at 15:32, Christian Brauner
<brauner@kernel.org> wrote:
>
> Make the selftests we added this cycle easier to read.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
> ---
> tools/testing/selftests/coredump/stackdump_test.c | 409 +++++++++-------------
> 1 file changed, 174 insertions(+), 235 deletions(-)
>
> diff --git a/tools/testing/selftests/coredump/stackdump_test.c b/tools/testing/selftests/coredump/stackdump_test.c
> index aa366e6f13a7..4d922e5f89fe 100644
> --- a/tools/testing/selftests/coredump/stackdump_test.c
> +++ b/tools/testing/selftests/coredump/stackdump_test.c
> @@ -1,5 +1,6 @@
> // SPDX-License-Identifier: GPL-2.0
>
> +#include <assert.h>
> #include <fcntl.h>
> #include <inttypes.h>
> #include <libgen.h>
> @@ -20,6 +21,10 @@
> #define STACKDUMP_SCRIPT "stackdump"
> #define NUM_THREAD_SPAWN 128
>
> +#ifndef PAGE_SIZE
> +#define PAGE_SIZE 4096
> +#endif
> +
> static void *do_nothing(void *)
> {
> while (1)
> @@ -109,7 +114,7 @@ TEST_F_TIMEOUT(coredump, stackdump, 120)
> unsigned long long stack;
> char *test_dir, *line;
> size_t line_length;
> - char buf[PATH_MAX];
> + char buf[PAGE_SIZE];
> int ret, i, status;
> FILE *file;
> pid_t pid;
> @@ -168,152 +173,163 @@ TEST_F_TIMEOUT(coredump, stackdump, 120)
> fclose(file);
> }
>
> +static int create_and_listen_unix_socket(const char *path)
> +{
> + struct sockaddr_un addr = {
> + .sun_family = AF_UNIX,
> + };
> + assert(strlen(path) < sizeof(addr.sun_path) - 1);
> + strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
> + size_t addr_len =
> + offsetof(struct sockaddr_un, sun_path) + strlen(path) + 1;
> + int fd, ret;
> +
> + fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
> + if (fd < 0)
> + goto out;
> +
> + ret = bind(fd, (const struct sockaddr *)&addr, addr_len);
> + if (ret < 0)
> + goto out;
> +
> + ret = listen(fd, 1);
> + if (ret < 0)
> + goto out;
> +
> + return fd;
> +
> +out:
> + if (fd >= 0)
> + close(fd);
> + return -1;
> +}
> +
> +static bool set_core_pattern(const char *pattern)
> +{
> + FILE *file;
> + int ret;
> +
> + file = fopen("/proc/sys/kernel/core_pattern", "w");
> + if (!file)
> + return false;
> +
> + ret = fprintf(file, "%s", pattern);
> + fclose(file);
> +
> + return ret == strlen(pattern);
> +}
> +
> +static int get_peer_pidfd(int fd)
> +{
> + int fd_peer_pidfd;
> + socklen_t fd_peer_pidfd_len = sizeof(fd_peer_pidfd);
> + int ret = getsockopt(fd, SOL_SOCKET, SO_PEERPIDFD, &fd_peer_pidfd,
> + &fd_peer_pidfd_len);
> + if (ret < 0) {
> + fprintf(stderr, "%m - Failed to retrieve peer pidfd for coredump socket connection\n");
> + return -1;
> + }
> + return fd_peer_pidfd;
> +}
> +
> +static bool get_pidfd_info(int fd_peer_pidfd, struct pidfd_info *info)
> +{
> + memset(info, 0, sizeof(*info));
> + info->mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
> + return ioctl(fd_peer_pidfd, PIDFD_GET_INFO, info) == 0;
> +}
> +
> +static void
> +wait_and_check_coredump_server(pid_t pid_coredump_server,
> + struct __test_metadata *const _metadata,
> + FIXTURE_DATA(coredump)* self)
> +{
> + int status;
> + waitpid(pid_coredump_server, &status, 0);
> + self->pid_coredump_server = -ESRCH;
> + ASSERT_TRUE(WIFEXITED(status));
> + ASSERT_EQ(WEXITSTATUS(status), 0);
> +}
> +
> TEST_F(coredump, socket)
> {
> int pidfd, ret, status;
> - FILE *file;
> pid_t pid, pid_coredump_server;
> struct stat st;
> struct pidfd_info info = {};
> int ipc_sockets[2];
> char c;
> - const struct sockaddr_un coredump_sk = {
> - .sun_family = AF_UNIX,
> - .sun_path = "/tmp/coredump.socket",
> - };
> - size_t coredump_sk_len = offsetof(struct sockaddr_un, sun_path) +
> - sizeof("/tmp/coredump.socket");
> +
> + ASSERT_TRUE(set_core_pattern("@/tmp/coredump.socket"));
>
> ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
> ASSERT_EQ(ret, 0);
>
> - file = fopen("/proc/sys/kernel/core_pattern", "w");
> - ASSERT_NE(file, NULL);
> -
> - ret = fprintf(file, "@/tmp/coredump.socket");
> - ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
> - ASSERT_EQ(fclose(file), 0);
> -
> pid_coredump_server = fork();
> ASSERT_GE(pid_coredump_server, 0);
> if (pid_coredump_server == 0) {
> - int fd_server, fd_coredump, fd_peer_pidfd, fd_core_file;
> - socklen_t fd_peer_pidfd_len;
> + int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1, fd_core_file = -1;
> + int exit_code = EXIT_FAILURE;
>
> close(ipc_sockets[0]);
>
> - fd_server = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> if (fd_server < 0)
> - _exit(EXIT_FAILURE);
> -
> - ret = bind(fd_server, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
> - if (ret < 0) {
> - fprintf(stderr, "Failed to bind coredump socket\n");
> - close(fd_server);
> - close(ipc_sockets[1]);
> - _exit(EXIT_FAILURE);
> - }
> -
> - ret = listen(fd_server, 1);
> - if (ret < 0) {
> - fprintf(stderr, "Failed to listen on coredump socket\n");
> - close(fd_server);
> - close(ipc_sockets[1]);
> - _exit(EXIT_FAILURE);
> - }
> + goto out;
>
> - if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
> - close(fd_server);
> - close(ipc_sockets[1]);
> - _exit(EXIT_FAILURE);
> - }
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
>
> close(ipc_sockets[1]);
>
> fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> - if (fd_coredump < 0) {
> - fprintf(stderr, "Failed to accept coredump socket connection\n");
> - close(fd_server);
> - _exit(EXIT_FAILURE);
> - }
> + if (fd_coredump < 0)
> + goto out;
>
> - fd_peer_pidfd_len = sizeof(fd_peer_pidfd);
> - ret = getsockopt(fd_coredump, SOL_SOCKET, SO_PEERPIDFD,
> - &fd_peer_pidfd, &fd_peer_pidfd_len);
> - if (ret < 0) {
> - fprintf(stderr, "%m - Failed to retrieve peer pidfd for coredump socket connection\n");
> - close(fd_coredump);
> - close(fd_server);
> - _exit(EXIT_FAILURE);
> - }
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0)
> + goto out;
>
> - memset(&info, 0, sizeof(info));
> - info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
> - ret = ioctl(fd_peer_pidfd, PIDFD_GET_INFO, &info);
> - if (ret < 0) {
> - fprintf(stderr, "Failed to retrieve pidfd info from peer pidfd for coredump socket connection\n");
> - close(fd_coredump);
> - close(fd_server);
> - close(fd_peer_pidfd);
> - _exit(EXIT_FAILURE);
> - }
> + if (!get_pidfd_info(fd_peer_pidfd, &info))
> + goto out;
>
> - if (!(info.mask & PIDFD_INFO_COREDUMP)) {
> - fprintf(stderr, "Missing coredump information from coredumping task\n");
> - close(fd_coredump);
> - close(fd_server);
> - close(fd_peer_pidfd);
> - _exit(EXIT_FAILURE);
> - }
> + if (!(info.mask & PIDFD_INFO_COREDUMP))
> + goto out;
>
> - if (!(info.coredump_mask & PIDFD_COREDUMPED)) {
> - fprintf(stderr, "Received connection from non-coredumping task\n");
> - close(fd_coredump);
> - close(fd_server);
> - close(fd_peer_pidfd);
> - _exit(EXIT_FAILURE);
> - }
> + if (!(info.coredump_mask & PIDFD_COREDUMPED))
> + goto out;
>
> fd_core_file = creat("/tmp/coredump.file", 0644);
> - if (fd_core_file < 0) {
> - fprintf(stderr, "Failed to create coredump file\n");
> - close(fd_coredump);
> - close(fd_server);
> - close(fd_peer_pidfd);
> - _exit(EXIT_FAILURE);
> - }
> + if (fd_core_file < 0)
> + goto out;
>
> for (;;) {
> char buffer[4096];
> ssize_t bytes_read, bytes_write;
>
> bytes_read = read(fd_coredump, buffer, sizeof(buffer));
> - if (bytes_read < 0) {
> - close(fd_coredump);
> - close(fd_server);
> - close(fd_peer_pidfd);
> - close(fd_core_file);
> - _exit(EXIT_FAILURE);
> - }
> + if (bytes_read < 0)
> + goto out;
>
> if (bytes_read == 0)
> break;
>
> bytes_write = write(fd_core_file, buffer, bytes_read);
> - if (bytes_read != bytes_write) {
> - close(fd_coredump);
> - close(fd_server);
> - close(fd_peer_pidfd);
> - close(fd_core_file);
> - _exit(EXIT_FAILURE);
> - }
> + if (bytes_read != bytes_write)
> + goto out;
> }
>
> - close(fd_coredump);
> - close(fd_server);
> - close(fd_peer_pidfd);
> - close(fd_core_file);
> - _exit(EXIT_SUCCESS);
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_core_file >= 0)
> + close(fd_core_file);
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + if (fd_server >= 0)
> + close(fd_server);
> + _exit(exit_code);
> }
> self->pid_coredump_server = pid_coredump_server;
>
> @@ -333,47 +349,27 @@ TEST_F(coredump, socket)
> ASSERT_TRUE(WIFSIGNALED(status));
> ASSERT_TRUE(WCOREDUMP(status));
>
> - info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
> - ASSERT_EQ(ioctl(pidfd, PIDFD_GET_INFO, &info), 0);
> + ASSERT_TRUE(get_pidfd_info(pidfd, &info));
> ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
>
> - waitpid(pid_coredump_server, &status, 0);
> - self->pid_coredump_server = -ESRCH;
> - ASSERT_TRUE(WIFEXITED(status));
> - ASSERT_EQ(WEXITSTATUS(status), 0);
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
>
> ASSERT_EQ(stat("/tmp/coredump.file", &st), 0);
> ASSERT_GT(st.st_size, 0);
> - /*
> - * We should somehow validate the produced core file.
> - * For now just allow for visual inspection
> - */
> system("file /tmp/coredump.file");
> }
>
> TEST_F(coredump, socket_detect_userspace_client)
> {
> int pidfd, ret, status;
> - FILE *file;
> pid_t pid, pid_coredump_server;
> struct stat st;
> struct pidfd_info info = {};
> int ipc_sockets[2];
> char c;
> - const struct sockaddr_un coredump_sk = {
> - .sun_family = AF_UNIX,
> - .sun_path = "/tmp/coredump.socket",
> - };
> - size_t coredump_sk_len = offsetof(struct sockaddr_un, sun_path) +
> - sizeof("/tmp/coredump.socket");
>
> - file = fopen("/proc/sys/kernel/core_pattern", "w");
> - ASSERT_NE(file, NULL);
> -
> - ret = fprintf(file, "@/tmp/coredump.socket");
> - ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
> - ASSERT_EQ(fclose(file), 0);
> + ASSERT_TRUE(set_core_pattern("@/tmp/coredump.socket"));
>
> ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
> ASSERT_EQ(ret, 0);
> @@ -381,87 +377,46 @@ TEST_F(coredump, socket_detect_userspace_client)
> pid_coredump_server = fork();
> ASSERT_GE(pid_coredump_server, 0);
> if (pid_coredump_server == 0) {
> - int fd_server, fd_coredump, fd_peer_pidfd;
> - socklen_t fd_peer_pidfd_len;
> + int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
> + int exit_code = EXIT_FAILURE;
>
> close(ipc_sockets[0]);
>
> - fd_server = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> if (fd_server < 0)
> - _exit(EXIT_FAILURE);
> + goto out;
>
> - ret = bind(fd_server, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
> - if (ret < 0) {
> - fprintf(stderr, "Failed to bind coredump socket\n");
> - close(fd_server);
> - close(ipc_sockets[1]);
> - _exit(EXIT_FAILURE);
> - }
> -
> - ret = listen(fd_server, 1);
> - if (ret < 0) {
> - fprintf(stderr, "Failed to listen on coredump socket\n");
> - close(fd_server);
> - close(ipc_sockets[1]);
> - _exit(EXIT_FAILURE);
> - }
> -
> - if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
> - close(fd_server);
> - close(ipc_sockets[1]);
> - _exit(EXIT_FAILURE);
> - }
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
>
> close(ipc_sockets[1]);
>
> fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> - if (fd_coredump < 0) {
> - fprintf(stderr, "Failed to accept coredump socket connection\n");
> - close(fd_server);
> - _exit(EXIT_FAILURE);
> - }
> + if (fd_coredump < 0)
> + goto out;
>
> - fd_peer_pidfd_len = sizeof(fd_peer_pidfd);
> - ret = getsockopt(fd_coredump, SOL_SOCKET, SO_PEERPIDFD,
> - &fd_peer_pidfd, &fd_peer_pidfd_len);
> - if (ret < 0) {
> - fprintf(stderr, "%m - Failed to retrieve peer pidfd for coredump socket connection\n");
> - close(fd_coredump);
> - close(fd_server);
> - _exit(EXIT_FAILURE);
> - }
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0)
> + goto out;
>
> - memset(&info, 0, sizeof(info));
> - info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
> - ret = ioctl(fd_peer_pidfd, PIDFD_GET_INFO, &info);
> - if (ret < 0) {
> - fprintf(stderr, "Failed to retrieve pidfd info from peer pidfd for coredump socket connection\n");
> - close(fd_coredump);
> - close(fd_server);
> - close(fd_peer_pidfd);
> - _exit(EXIT_FAILURE);
> - }
> + if (!get_pidfd_info(fd_peer_pidfd, &info))
> + goto out;
>
> - if (!(info.mask & PIDFD_INFO_COREDUMP)) {
> - fprintf(stderr, "Missing coredump information from coredumping task\n");
> - close(fd_coredump);
> - close(fd_server);
> - close(fd_peer_pidfd);
> - _exit(EXIT_FAILURE);
> - }
> + if (!(info.mask & PIDFD_INFO_COREDUMP))
> + goto out;
>
> - if (info.coredump_mask & PIDFD_COREDUMPED) {
> - fprintf(stderr, "Received unexpected connection from coredumping task\n");
> + if (info.coredump_mask & PIDFD_COREDUMPED)
> + goto out;
> +
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> close(fd_coredump);
> + if (fd_server >= 0)
> close(fd_server);
> - close(fd_peer_pidfd);
> - _exit(EXIT_FAILURE);
> - }
> -
> - close(fd_coredump);
> - close(fd_server);
> - close(fd_peer_pidfd);
> - _exit(EXIT_SUCCESS);
> + _exit(exit_code);
> }
> self->pid_coredump_server = pid_coredump_server;
>
> @@ -474,12 +429,18 @@ TEST_F(coredump, socket_detect_userspace_client)
> if (pid == 0) {
> int fd_socket;
> ssize_t ret;
> + const struct sockaddr_un coredump_sk = {
> + .sun_family = AF_UNIX,
> + .sun_path = "/tmp/coredump.socket",
> + };
> + size_t coredump_sk_len =
> + offsetof(struct sockaddr_un, sun_path) +
> + sizeof("/tmp/coredump.socket");
>
> fd_socket = socket(AF_UNIX, SOCK_STREAM, 0);
> if (fd_socket < 0)
> _exit(EXIT_FAILURE);
>
> -
> ret = connect(fd_socket, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
> if (ret < 0)
> _exit(EXIT_FAILURE);
> @@ -495,15 +456,11 @@ TEST_F(coredump, socket_detect_userspace_client)
> ASSERT_TRUE(WIFEXITED(status));
> ASSERT_EQ(WEXITSTATUS(status), 0);
>
> - info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
> - ASSERT_EQ(ioctl(pidfd, PIDFD_GET_INFO, &info), 0);
> + ASSERT_TRUE(get_pidfd_info(pidfd, &info));
> ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> ASSERT_EQ((info.coredump_mask & PIDFD_COREDUMPED), 0);
>
> - waitpid(pid_coredump_server, &status, 0);
> - self->pid_coredump_server = -ESRCH;
> - ASSERT_TRUE(WIFEXITED(status));
> - ASSERT_EQ(WEXITSTATUS(status), 0);
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
>
> ASSERT_NE(stat("/tmp/coredump.file", &st), 0);
> ASSERT_EQ(errno, ENOENT);
> @@ -511,16 +468,10 @@ TEST_F(coredump, socket_detect_userspace_client)
>
> TEST_F(coredump, socket_enoent)
> {
> - int pidfd, ret, status;
> - FILE *file;
> + int pidfd, status;
> pid_t pid;
>
> - file = fopen("/proc/sys/kernel/core_pattern", "w");
> - ASSERT_NE(file, NULL);
> -
> - ret = fprintf(file, "@/tmp/coredump.socket");
> - ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
> - ASSERT_EQ(fclose(file), 0);
> + ASSERT_TRUE(set_core_pattern("@/tmp/coredump.socket"));
>
> pid = fork();
> ASSERT_GE(pid, 0);
> @@ -538,7 +489,6 @@ TEST_F(coredump, socket_enoent)
> TEST_F(coredump, socket_no_listener)
> {
> int pidfd, ret, status;
> - FILE *file;
> pid_t pid, pid_coredump_server;
> int ipc_sockets[2];
> char c;
> @@ -549,44 +499,36 @@ TEST_F(coredump, socket_no_listener)
> size_t coredump_sk_len = offsetof(struct sockaddr_un, sun_path) +
> sizeof("/tmp/coredump.socket");
>
> + ASSERT_TRUE(set_core_pattern("@/tmp/coredump.socket"));
> +
> ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
> ASSERT_EQ(ret, 0);
>
> - file = fopen("/proc/sys/kernel/core_pattern", "w");
> - ASSERT_NE(file, NULL);
> -
> - ret = fprintf(file, "@/tmp/coredump.socket");
> - ASSERT_EQ(ret, strlen("@/tmp/coredump.socket"));
> - ASSERT_EQ(fclose(file), 0);
> -
> pid_coredump_server = fork();
> ASSERT_GE(pid_coredump_server, 0);
> if (pid_coredump_server == 0) {
> - int fd_server;
> + int fd_server = -1;
> + int exit_code = EXIT_FAILURE;
>
> close(ipc_sockets[0]);
>
> fd_server = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
> if (fd_server < 0)
> - _exit(EXIT_FAILURE);
> + goto out;
>
> ret = bind(fd_server, (const struct sockaddr *)&coredump_sk, coredump_sk_len);
> - if (ret < 0) {
> - fprintf(stderr, "Failed to bind coredump socket\n");
> - close(fd_server);
> - close(ipc_sockets[1]);
> - _exit(EXIT_FAILURE);
> - }
> + if (ret < 0)
> + goto out;
>
> - if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
> - close(fd_server);
> - close(ipc_sockets[1]);
> - _exit(EXIT_FAILURE);
> - }
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
>
> - close(fd_server);
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_server >= 0)
> + close(fd_server);
> close(ipc_sockets[1]);
> - _exit(EXIT_SUCCESS);
> + _exit(exit_code);
> }
> self->pid_coredump_server = pid_coredump_server;
>
> @@ -606,10 +548,7 @@ TEST_F(coredump, socket_no_listener)
> ASSERT_TRUE(WIFSIGNALED(status));
> ASSERT_FALSE(WCOREDUMP(status));
>
> - waitpid(pid_coredump_server, &status, 0);
> - self->pid_coredump_server = -ESRCH;
> - ASSERT_TRUE(WIFEXITED(status));
> - ASSERT_EQ(WEXITSTATUS(status), 0);
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> }
>
> TEST_HARNESS_MAIN
>
> --
> 2.47.2
>
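As a standalone illustration of the bind-and-listen helper with single-exit cleanup that this cleanup introduces, here is a small sketch. It is not the patch's create_and_listen_unix_socket(): demo_listen_unix() is a hypothetical name, and it returns -1 for an over-long path where the selftest helper uses assert().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind and listen on a path-based AF_UNIX stream
 * socket. Returns the listening fd, or -1 on any error.
 */
int demo_listen_unix(const char *path, int backlog)
{
	struct sockaddr_un addr = { .sun_family = AF_UNIX };
	socklen_t addr_len;
	size_t len;
	int fd = -1;

	assert(path != NULL);

	/* Reject empty paths and paths that do not fit with their NUL. */
	len = strlen(path);
	if (len == 0 || len >= sizeof(addr.sun_path))
		goto out;
	memcpy(addr.sun_path, path, len + 1);
	addr_len = offsetof(struct sockaddr_un, sun_path) + len + 1;

	fd = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0);
	if (fd < 0)
		goto out;
	if (bind(fd, (const struct sockaddr *)&addr, addr_len) < 0)
		goto out_close;
	if (listen(fd, backlog) < 0)
		goto out_close;

	return fd;

out_close:
	close(fd);
out:
	return -1;
}
```

The caller owns the returned fd and is responsible for unlinking the socket path once it is done.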
* Re: [PATCH v2 5/5] selftests/coredump: add coredump server selftests
2025-06-03 13:31 ` [PATCH v2 5/5] selftests/coredump: add coredump server selftests Christian Brauner
@ 2025-06-03 13:53 ` Alexander Mikhalitsyn
0 siblings, 0 replies; 14+ messages in thread
From: Alexander Mikhalitsyn @ 2025-06-03 13:53 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Jann Horn, Josef Bacik, Jeff Layton,
Alexander Viro, Daan De Meyer, Jan Kara, Lennart Poettering,
Mike Yuan, Zbigniew Jędrzejewski-Szmek
On Tue, Jun 3, 2025 at 15:32, Christian Brauner
<brauner@kernel.org> wrote:
>
> This adds extensive tests for the coredump server.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
> ---
> tools/testing/selftests/coredump/Makefile | 2 +-
> tools/testing/selftests/coredump/config | 4 +
> tools/testing/selftests/coredump/stackdump_test.c | 1291 ++++++++++++++++++++-
> 3 files changed, 1295 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/coredump/Makefile b/tools/testing/selftests/coredump/Makefile
> index bc287a85b825..77b3665c73c7 100644
> --- a/tools/testing/selftests/coredump/Makefile
> +++ b/tools/testing/selftests/coredump/Makefile
> @@ -1,5 +1,5 @@
> # SPDX-License-Identifier: GPL-2.0-only
> -CFLAGS = -Wall -O0 $(KHDR_INCLUDES)
> +CFLAGS += -Wall -O0 -g $(KHDR_INCLUDES) $(TOOLS_INCLUDES)
>
> TEST_GEN_PROGS := stackdump_test
> TEST_FILES := stackdump
> diff --git a/tools/testing/selftests/coredump/config b/tools/testing/selftests/coredump/config
> new file mode 100644
> index 000000000000..6ce9610b06d0
> --- /dev/null
> +++ b/tools/testing/selftests/coredump/config
> @@ -0,0 +1,4 @@
> +CONFIG_AF_UNIX_OOB=y
> +CONFIG_COREDUMP=y
> +CONFIG_NET=y
> +CONFIG_UNIX=y
> diff --git a/tools/testing/selftests/coredump/stackdump_test.c b/tools/testing/selftests/coredump/stackdump_test.c
> index 4d922e5f89fe..ad0d5f271db1 100644
> --- a/tools/testing/selftests/coredump/stackdump_test.c
> +++ b/tools/testing/selftests/coredump/stackdump_test.c
> @@ -4,10 +4,15 @@
> #include <fcntl.h>
> #include <inttypes.h>
> #include <libgen.h>
> +#include <limits.h>
> +#include <linux/coredump.h>
> +#include <linux/fs.h>
> #include <linux/limits.h>
> #include <pthread.h>
> #include <string.h>
> #include <sys/mount.h>
> +#include <poll.h>
> +#include <sys/epoll.h>
> #include <sys/resource.h>
> #include <sys/stat.h>
> #include <sys/socket.h>
> @@ -15,6 +20,7 @@
> #include <unistd.h>
>
> #include "../kselftest_harness.h"
> +#include "../filesystems/wrappers.h"
> #include "../pidfd/pidfd.h"
>
> #define STACKDUMP_FILE "stack_values"
> @@ -49,14 +55,32 @@ FIXTURE(coredump)
> {
> char original_core_pattern[256];
> pid_t pid_coredump_server;
> + int fd_tmpfs_detached;
> };
>
> +static int create_detached_tmpfs(void)
> +{
> + int fd_context, fd_tmpfs;
> +
> + fd_context = sys_fsopen("tmpfs", 0);
> + if (fd_context < 0)
> + return -1;
> +
> + if (sys_fsconfig(fd_context, FSCONFIG_CMD_CREATE, NULL, NULL, 0) < 0)
> + return -1;
> +
> + fd_tmpfs = sys_fsmount(fd_context, 0, 0);
> + close(fd_context);
> + return fd_tmpfs;
> +}
> +
> FIXTURE_SETUP(coredump)
> {
> FILE *file;
> int ret;
>
> self->pid_coredump_server = -ESRCH;
> + self->fd_tmpfs_detached = -1;
> file = fopen("/proc/sys/kernel/core_pattern", "r");
> ASSERT_NE(NULL, file);
>
> @@ -65,6 +89,8 @@ FIXTURE_SETUP(coredump)
> ASSERT_LT(ret, sizeof(self->original_core_pattern));
>
> self->original_core_pattern[ret] = '\0';
> + self->fd_tmpfs_detached = create_detached_tmpfs();
> + ASSERT_GE(self->fd_tmpfs_detached, 0);
>
> ret = fclose(file);
> ASSERT_EQ(0, ret);
> @@ -103,6 +129,15 @@ FIXTURE_TEARDOWN(coredump)
> goto fail;
> }
>
> + if (self->fd_tmpfs_detached >= 0) {
> + ret = close(self->fd_tmpfs_detached);
> + if (ret < 0) {
> + reason = "Unable to close detached tmpfs";
> + goto fail;
> + }
> + self->fd_tmpfs_detached = -1;
> + }
> +
> return;
> fail:
> /* This should never happen */
> @@ -192,7 +227,7 @@ static int create_and_listen_unix_socket(const char *path)
> if (ret < 0)
> goto out;
>
> - ret = listen(fd, 1);
> + ret = listen(fd, 128);
> if (ret < 0)
> goto out;
>
> @@ -551,4 +586,1258 @@ TEST_F(coredump, socket_no_listener)
> wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> }
>
> +int recv_oob_marker(int fd)
> +{
> + uint8_t oob_marker;
> + ssize_t ret;
> +
> + ret = recv(fd, &oob_marker, 1, MSG_OOB);
> + if (ret < 0)
> + return -1;
> + if (ret > 1 || ret == 0)
> + return -EINVAL;
> +
> + switch (oob_marker) {
> + case COREDUMP_OOB_INVALIDSIZE:
> + fprintf(stderr, "Received OOB marker: InvalidSize\n");
> + return COREDUMP_OOB_INVALIDSIZE;
> + case COREDUMP_OOB_UNSUPPORTED:
> + fprintf(stderr, "Received OOB marker: Unsupported\n");
> + return COREDUMP_OOB_UNSUPPORTED;
> + case COREDUMP_OOB_CONFLICTING:
> + fprintf(stderr, "Received OOB marker: Conflicting\n");
> + return COREDUMP_OOB_CONFLICTING;
> + default:
> + fprintf(stderr, "Received unknown OOB marker: %u\n", oob_marker);
> + break;
> + }
> + return -1;
> +}
> +
> +static bool is_msg_oob_supported(void)
> +{
> + int sv[2];
> + char c = 'X';
> + int ret;
> + static int supported = -1;
> +
> + if (supported >= 0)
> + return supported == 1;
> +
> + if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
> + return false;
> +
> + ret = send(sv[0], &c, 1, MSG_OOB);
> + close(sv[0]);
> + close(sv[1]);
> +
> + if (ret < 0) {
> + if (errno == EINVAL || errno == EOPNOTSUPP) {
> + supported = 0;
> + return false;
> + }
> +
> + return false;
> + }
> + supported = 1;
> + return true;
> +}
> +
> +static bool wait_for_oob_marker(int fd, enum coredump_oob oob_marker)
> +{
> + ssize_t ret;
> + struct pollfd pfd = {
> + .fd = fd,
> + .events = POLLPRI,
> + .revents = 0,
> + };
> +
> + if (!is_msg_oob_supported())
> + return true;
> +
> + ret = poll(&pfd, 1, -1);
> + if (ret < 0)
> + return false;
> + if (!(pfd.revents & POLLPRI))
> + return false;
> + if (pfd.revents & POLLERR)
> + return false;
> + if (pfd.revents & POLLHUP)
> + return false;
> +
> + ret = recv_oob_marker(fd);
> + if (ret < 0)
> + return false;
> + return ret == oob_marker;
> +}
> +
> +static bool read_coredump_req(int fd, struct coredump_req *req)
> +{
> + ssize_t ret;
> + size_t field_size, user_size, ack_size, kernel_size, remaining_size;
> +
> + memset(req, 0, sizeof(*req));
> + field_size = sizeof(req->size);
> +
> + /* Peek the size of the coredump request. */
> + ret = recv(fd, req, field_size, MSG_PEEK | MSG_WAITALL);
> + if (ret != field_size)
> + return false;
> + kernel_size = req->size;
> +
> + if (kernel_size < COREDUMP_ACK_SIZE_VER0)
> + return false;
> + if (kernel_size >= PAGE_SIZE)
> + return false;
> +
> + /* Use the minimum of user and kernel size to read the full request. */
> + user_size = sizeof(struct coredump_req);
> + ack_size = user_size < kernel_size ? user_size : kernel_size;
> + ret = recv(fd, req, ack_size, MSG_WAITALL);
> + if (ret != ack_size)
> + return false;
> +
> + fprintf(stderr, "Read coredump request with size %u and mask 0x%llx\n",
> + req->size, (unsigned long long)req->mask);
> +
> + /* Only a larger kernel struct leaves unread bytes behind. */
> + if (kernel_size > user_size)
> + remaining_size = kernel_size - user_size;
> + else
> + remaining_size = 0;
> +
> + if (PAGE_SIZE <= remaining_size)
> + return false;
> +
> + /*
> + * Discard any additional data if the kernel's request was larger than
> + * what we knew about or cared about.
> + */
> + if (remaining_size) {
> + char buffer[PAGE_SIZE];
> +
> + ret = recv(fd, buffer, remaining_size, MSG_WAITALL);
> + if (ret != remaining_size)
> + return false;
> + fprintf(stderr, "Discarded %zu bytes of non-OOB data after coredump request\n", remaining_size);
> + }
> +
> + return true;
> +}
> +
> +static bool send_coredump_ack(int fd, const struct coredump_req *req,
> + __u64 mask, size_t size_ack)
> +{
> + ssize_t ret;
> + /*
> + * Wrap struct coredump_ack in a larger struct so we can
> + * simulate sending too much data to the kernel.
> + */
> + struct large_ack_for_size_testing {
> + struct coredump_ack ack;
> + char buffer[PAGE_SIZE];
> + } large_ack = {};
> +
> + if (!size_ack)
> + size_ack = sizeof(struct coredump_ack) < req->size_ack ?
> + sizeof(struct coredump_ack) :
> + req->size_ack;
> + large_ack.ack.mask = mask;
> + large_ack.ack.size = size_ack;
> + ret = send(fd, &large_ack, size_ack, MSG_NOSIGNAL);
> + if (ret != size_ack)
> + return false;
> +
> + fprintf(stderr, "Sent coredump ack with size %zu and mask 0x%llx\n",
> + size_ack, (unsigned long long)mask);
> + return true;
> +}
> +
> +static bool check_coredump_req(const struct coredump_req *req, size_t min_size,
> + __u64 required_mask)
> +{
> + if (req->size < min_size)
> + return false;
> + if ((req->mask & required_mask) != required_mask)
> + return false;
> + if (req->mask & ~required_mask)
> + return false;
> + return true;
> +}
> +
> +TEST_F(coredump, socket_request_kernel)
> +{
> + int pidfd, ret, status;
> + pid_t pid, pid_coredump_server;
> + struct stat st;
> + struct pidfd_info info = {};
> + int ipc_sockets[2];
> + char c;
> +
> + ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
> +
> + ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
> + ASSERT_EQ(ret, 0);
> +
> + pid_coredump_server = fork();
> + ASSERT_GE(pid_coredump_server, 0);
> + if (pid_coredump_server == 0) {
> + struct coredump_req req = {};
> + int fd_server = -1, fd_coredump = -1, fd_core_file = -1, fd_peer_pidfd = -1;
> + int exit_code = EXIT_FAILURE;
> +
> + close(ipc_sockets[0]);
> +
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> + if (fd_server < 0)
> + goto out;
> +
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
> +
> + close(ipc_sockets[1]);
> +
> + fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> + if (fd_coredump < 0)
> + goto out;
> +
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0)
> + goto out;
> +
> + if (!get_pidfd_info(fd_peer_pidfd, &info))
> + goto out;
> +
> + if (!(info.mask & PIDFD_INFO_COREDUMP))
> + goto out;
> +
> + if (!(info.coredump_mask & PIDFD_COREDUMPED))
> + goto out;
> +
> + fd_core_file = creat("/tmp/coredump.file", 0644);
> + if (fd_core_file < 0)
> + goto out;
> +
> + if (!read_coredump_req(fd_coredump, &req))
> + goto out;
> +
> + if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
> + COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT))
> + goto out;
> +
> + if (!send_coredump_ack(fd_coredump, &req,
> + COREDUMP_KERNEL | COREDUMP_WAIT, 0))
> + goto out;
> +
> + for (;;) {
> + char buffer[4096];
> + ssize_t bytes_read, bytes_write;
> +
> + bytes_read = read(fd_coredump, buffer, sizeof(buffer));
> + if (bytes_read < 0)
> + goto out;
> +
> + if (bytes_read == 0)
> + break;
> +
> + bytes_write = write(fd_core_file, buffer, bytes_read);
> + if (bytes_read != bytes_write)
> + goto out;
> + }
> +
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_core_file >= 0)
> + close(fd_core_file);
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + if (fd_server >= 0)
> + close(fd_server);
> + _exit(exit_code);
> + }
> + self->pid_coredump_server = pid_coredump_server;
> +
> + EXPECT_EQ(close(ipc_sockets[1]), 0);
> + ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
> + EXPECT_EQ(close(ipc_sockets[0]), 0);
> +
> + pid = fork();
> + ASSERT_GE(pid, 0);
> + if (pid == 0)
> + crashing_child();
> +
> + pidfd = sys_pidfd_open(pid, 0);
> + ASSERT_GE(pidfd, 0);
> +
> + waitpid(pid, &status, 0);
> + ASSERT_TRUE(WIFSIGNALED(status));
> + ASSERT_TRUE(WCOREDUMP(status));
> +
> + ASSERT_TRUE(get_pidfd_info(pidfd, &info));
> + ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> + ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
> +
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> +
> + ASSERT_EQ(stat("/tmp/coredump.file", &st), 0);
> + ASSERT_GT(st.st_size, 0);
> + system("file /tmp/coredump.file");
> +}
> +
> +TEST_F(coredump, socket_request_userspace)
> +{
> + int pidfd, ret, status;
> + pid_t pid, pid_coredump_server;
> + struct pidfd_info info = {};
> + int ipc_sockets[2];
> + char c;
> +
> + ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
> +
> + ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
> + ASSERT_EQ(ret, 0);
> +
> + pid_coredump_server = fork();
> + ASSERT_GE(pid_coredump_server, 0);
> + if (pid_coredump_server == 0) {
> + struct coredump_req req = {};
> + int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
> + int exit_code = EXIT_FAILURE;
> +
> + close(ipc_sockets[0]);
> +
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> + if (fd_server < 0)
> + goto out;
> +
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
> +
> + close(ipc_sockets[1]);
> +
> + fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> + if (fd_coredump < 0)
> + goto out;
> +
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0)
> + goto out;
> +
> + if (!get_pidfd_info(fd_peer_pidfd, &info))
> + goto out;
> +
> + if (!(info.mask & PIDFD_INFO_COREDUMP))
> + goto out;
> +
> + if (!(info.coredump_mask & PIDFD_COREDUMPED))
> + goto out;
> +
> + if (!read_coredump_req(fd_coredump, &req))
> + goto out;
> +
> + if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
> + COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT))
> + goto out;
> +
> + if (!send_coredump_ack(fd_coredump, &req,
> + COREDUMP_USERSPACE | COREDUMP_WAIT, 0))
> + goto out;
> +
> + for (;;) {
> + char buffer[4096];
> + ssize_t bytes_read;
> +
> + bytes_read = read(fd_coredump, buffer, sizeof(buffer));
> + if (bytes_read > 0)
> + goto out;
> +
> + if (bytes_read < 0)
> + goto out;
> +
> + if (bytes_read == 0)
> + break;
> + }
> +
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + if (fd_server >= 0)
> + close(fd_server);
> + _exit(exit_code);
> + }
> + self->pid_coredump_server = pid_coredump_server;
> +
> + EXPECT_EQ(close(ipc_sockets[1]), 0);
> + ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
> + EXPECT_EQ(close(ipc_sockets[0]), 0);
> +
> + pid = fork();
> + ASSERT_GE(pid, 0);
> + if (pid == 0)
> + crashing_child();
> +
> + pidfd = sys_pidfd_open(pid, 0);
> + ASSERT_GE(pidfd, 0);
> +
> + waitpid(pid, &status, 0);
> + ASSERT_TRUE(WIFSIGNALED(status));
> + ASSERT_TRUE(WCOREDUMP(status));
> +
> + ASSERT_TRUE(get_pidfd_info(pidfd, &info));
> + ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> + ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
> +
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> +}
> +
> +TEST_F(coredump, socket_request_reject)
> +{
> + int pidfd, ret, status;
> + pid_t pid, pid_coredump_server;
> + struct pidfd_info info = {};
> + int ipc_sockets[2];
> + char c;
> +
> + ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
> +
> + ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
> + ASSERT_EQ(ret, 0);
> +
> + pid_coredump_server = fork();
> + ASSERT_GE(pid_coredump_server, 0);
> + if (pid_coredump_server == 0) {
> + struct coredump_req req = {};
> + int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
> + int exit_code = EXIT_FAILURE;
> +
> + close(ipc_sockets[0]);
> +
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> + if (fd_server < 0)
> + goto out;
> +
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
> +
> + close(ipc_sockets[1]);
> +
> + fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> + if (fd_coredump < 0)
> + goto out;
> +
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0)
> + goto out;
> +
> + if (!get_pidfd_info(fd_peer_pidfd, &info))
> + goto out;
> +
> + if (!(info.mask & PIDFD_INFO_COREDUMP))
> + goto out;
> +
> + if (!(info.coredump_mask & PIDFD_COREDUMPED))
> + goto out;
> +
> + if (!read_coredump_req(fd_coredump, &req))
> + goto out;
> +
> + if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
> + COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT))
> + goto out;
> +
> + if (!send_coredump_ack(fd_coredump, &req,
> + COREDUMP_REJECT | COREDUMP_WAIT, 0))
> + goto out;
> +
> + for (;;) {
> + char buffer[4096];
> + ssize_t bytes_read;
> +
> + bytes_read = read(fd_coredump, buffer, sizeof(buffer));
> + if (bytes_read > 0)
> + goto out;
> +
> + if (bytes_read < 0)
> + goto out;
> +
> + if (bytes_read == 0)
> + break;
> + }
> +
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + if (fd_server >= 0)
> + close(fd_server);
> + _exit(exit_code);
> + }
> + self->pid_coredump_server = pid_coredump_server;
> +
> + EXPECT_EQ(close(ipc_sockets[1]), 0);
> + ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
> + EXPECT_EQ(close(ipc_sockets[0]), 0);
> +
> + pid = fork();
> + ASSERT_GE(pid, 0);
> + if (pid == 0)
> + crashing_child();
> +
> + pidfd = sys_pidfd_open(pid, 0);
> + ASSERT_GE(pidfd, 0);
> +
> + waitpid(pid, &status, 0);
> + ASSERT_TRUE(WIFSIGNALED(status));
> + ASSERT_FALSE(WCOREDUMP(status));
> +
> + ASSERT_TRUE(get_pidfd_info(pidfd, &info));
> + ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> + ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
> +
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> +}
> +
> +TEST_F(coredump, socket_request_invalid_flag_combination)
> +{
> + int pidfd, ret, status;
> + pid_t pid, pid_coredump_server;
> + struct pidfd_info info = {};
> + int ipc_sockets[2];
> + char c;
> +
> + ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
> +
> + ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
> + ASSERT_EQ(ret, 0);
> +
> + pid_coredump_server = fork();
> + ASSERT_GE(pid_coredump_server, 0);
> + if (pid_coredump_server == 0) {
> + struct coredump_req req = {};
> + int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
> + int exit_code = EXIT_FAILURE;
> +
> + close(ipc_sockets[0]);
> +
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> + if (fd_server < 0)
> + goto out;
> +
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
> +
> + close(ipc_sockets[1]);
> +
> + fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> + if (fd_coredump < 0)
> + goto out;
> +
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0)
> + goto out;
> +
> + if (!get_pidfd_info(fd_peer_pidfd, &info))
> + goto out;
> +
> + if (!(info.mask & PIDFD_INFO_COREDUMP))
> + goto out;
> +
> + if (!(info.coredump_mask & PIDFD_COREDUMPED))
> + goto out;
> +
> + if (!read_coredump_req(fd_coredump, &req))
> + goto out;
> +
> + if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
> + COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT))
> + goto out;
> +
> + if (!send_coredump_ack(fd_coredump, &req,
> + COREDUMP_KERNEL | COREDUMP_REJECT | COREDUMP_WAIT, 0))
> + goto out;
> +
> + if (!wait_for_oob_marker(fd_coredump, COREDUMP_OOB_CONFLICTING))
> + goto out;
> +
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + if (fd_server >= 0)
> + close(fd_server);
> + _exit(exit_code);
> + }
> + self->pid_coredump_server = pid_coredump_server;
> +
> + EXPECT_EQ(close(ipc_sockets[1]), 0);
> + ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
> + EXPECT_EQ(close(ipc_sockets[0]), 0);
> +
> + pid = fork();
> + ASSERT_GE(pid, 0);
> + if (pid == 0)
> + crashing_child();
> +
> + pidfd = sys_pidfd_open(pid, 0);
> + ASSERT_GE(pidfd, 0);
> +
> + waitpid(pid, &status, 0);
> + ASSERT_TRUE(WIFSIGNALED(status));
> + ASSERT_FALSE(WCOREDUMP(status));
> +
> + ASSERT_TRUE(get_pidfd_info(pidfd, &info));
> + ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> + ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
> +
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> +}
> +
> +TEST_F(coredump, socket_request_unknown_flag)
> +{
> + int pidfd, ret, status;
> + pid_t pid, pid_coredump_server;
> + struct pidfd_info info = {};
> + int ipc_sockets[2];
> + char c;
> +
> + ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
> +
> + ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
> + ASSERT_EQ(ret, 0);
> +
> + pid_coredump_server = fork();
> + ASSERT_GE(pid_coredump_server, 0);
> + if (pid_coredump_server == 0) {
> + struct coredump_req req = {};
> + int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
> + int exit_code = EXIT_FAILURE;
> +
> + close(ipc_sockets[0]);
> +
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> + if (fd_server < 0)
> + goto out;
> +
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
> +
> + close(ipc_sockets[1]);
> +
> + fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> + if (fd_coredump < 0)
> + goto out;
> +
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0)
> + goto out;
> +
> + if (!get_pidfd_info(fd_peer_pidfd, &info))
> + goto out;
> +
> + if (!(info.mask & PIDFD_INFO_COREDUMP))
> + goto out;
> +
> + if (!(info.coredump_mask & PIDFD_COREDUMPED))
> + goto out;
> +
> + if (!read_coredump_req(fd_coredump, &req))
> + goto out;
> +
> + if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
> + COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT))
> + goto out;
> +
> + if (!send_coredump_ack(fd_coredump, &req, (1ULL << 63), 0))
> + goto out;
> +
> + if (!wait_for_oob_marker(fd_coredump, COREDUMP_OOB_UNSUPPORTED))
> + goto out;
> +
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + if (fd_server >= 0)
> + close(fd_server);
> + _exit(exit_code);
> + }
> + self->pid_coredump_server = pid_coredump_server;
> +
> + EXPECT_EQ(close(ipc_sockets[1]), 0);
> + ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
> + EXPECT_EQ(close(ipc_sockets[0]), 0);
> +
> + pid = fork();
> + ASSERT_GE(pid, 0);
> + if (pid == 0)
> + crashing_child();
> +
> + pidfd = sys_pidfd_open(pid, 0);
> + ASSERT_GE(pidfd, 0);
> +
> + waitpid(pid, &status, 0);
> + ASSERT_TRUE(WIFSIGNALED(status));
> + ASSERT_FALSE(WCOREDUMP(status));
> +
> + ASSERT_TRUE(get_pidfd_info(pidfd, &info));
> + ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> + ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
> +
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> +}
> +
> +TEST_F(coredump, socket_request_invalid_size_small)
> +{
> + int pidfd, ret, status;
> + pid_t pid, pid_coredump_server;
> + struct pidfd_info info = {};
> + int ipc_sockets[2];
> + char c;
> +
> + ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
> +
> + ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
> + ASSERT_EQ(ret, 0);
> +
> + pid_coredump_server = fork();
> + ASSERT_GE(pid_coredump_server, 0);
> + if (pid_coredump_server == 0) {
> + struct coredump_req req = {};
> + int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
> + int exit_code = EXIT_FAILURE;
> +
> + close(ipc_sockets[0]);
> +
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> + if (fd_server < 0)
> + goto out;
> +
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
> +
> + close(ipc_sockets[1]);
> +
> + fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> + if (fd_coredump < 0)
> + goto out;
> +
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0)
> + goto out;
> +
> + if (!get_pidfd_info(fd_peer_pidfd, &info))
> + goto out;
> +
> + if (!(info.mask & PIDFD_INFO_COREDUMP))
> + goto out;
> +
> + if (!(info.coredump_mask & PIDFD_COREDUMPED))
> + goto out;
> +
> + if (!read_coredump_req(fd_coredump, &req))
> + goto out;
> +
> + if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
> + COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT))
> + goto out;
> +
> + if (!send_coredump_ack(fd_coredump, &req,
> + COREDUMP_REJECT | COREDUMP_WAIT,
> + COREDUMP_ACK_SIZE_VER0 / 2))
> + goto out;
> +
> + if (!wait_for_oob_marker(fd_coredump, COREDUMP_OOB_INVALIDSIZE))
> + goto out;
> +
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + if (fd_server >= 0)
> + close(fd_server);
> + _exit(exit_code);
> + }
> + self->pid_coredump_server = pid_coredump_server;
> +
> + EXPECT_EQ(close(ipc_sockets[1]), 0);
> + ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
> + EXPECT_EQ(close(ipc_sockets[0]), 0);
> +
> + pid = fork();
> + ASSERT_GE(pid, 0);
> + if (pid == 0)
> + crashing_child();
> +
> + pidfd = sys_pidfd_open(pid, 0);
> + ASSERT_GE(pidfd, 0);
> +
> + waitpid(pid, &status, 0);
> + ASSERT_TRUE(WIFSIGNALED(status));
> + ASSERT_FALSE(WCOREDUMP(status));
> +
> + ASSERT_TRUE(get_pidfd_info(pidfd, &info));
> + ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> + ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
> +
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> +}
> +
> +TEST_F(coredump, socket_request_invalid_size_large)
> +{
> + int pidfd, ret, status;
> + pid_t pid, pid_coredump_server;
> + struct pidfd_info info = {};
> + int ipc_sockets[2];
> + char c;
> +
> + ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
> +
> + ret = socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets);
> + ASSERT_EQ(ret, 0);
> +
> + pid_coredump_server = fork();
> + ASSERT_GE(pid_coredump_server, 0);
> + if (pid_coredump_server == 0) {
> + struct coredump_req req = {};
> + int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1;
> + int exit_code = EXIT_FAILURE;
> +
> + close(ipc_sockets[0]);
> +
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> + if (fd_server < 0)
> + goto out;
> +
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
> +
> + close(ipc_sockets[1]);
> +
> + fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> + if (fd_coredump < 0)
> + goto out;
> +
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0)
> + goto out;
> +
> + if (!get_pidfd_info(fd_peer_pidfd, &info))
> + goto out;
> +
> + if (!(info.mask & PIDFD_INFO_COREDUMP))
> + goto out;
> +
> + if (!(info.coredump_mask & PIDFD_COREDUMPED))
> + goto out;
> +
> + if (!read_coredump_req(fd_coredump, &req))
> + goto out;
> +
> + if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
> + COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT))
> + goto out;
> +
> + if (!send_coredump_ack(fd_coredump, &req,
> + COREDUMP_REJECT | COREDUMP_WAIT,
> + COREDUMP_ACK_SIZE_VER0 + PAGE_SIZE))
> + goto out;
> +
> + if (!wait_for_oob_marker(fd_coredump, COREDUMP_OOB_INVALIDSIZE))
> + goto out;
> +
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + if (fd_server >= 0)
> + close(fd_server);
> + _exit(exit_code);
> + }
> + self->pid_coredump_server = pid_coredump_server;
> +
> + EXPECT_EQ(close(ipc_sockets[1]), 0);
> + ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
> + EXPECT_EQ(close(ipc_sockets[0]), 0);
> +
> + pid = fork();
> + ASSERT_GE(pid, 0);
> + if (pid == 0)
> + crashing_child();
> +
> + pidfd = sys_pidfd_open(pid, 0);
> + ASSERT_GE(pidfd, 0);
> +
> + waitpid(pid, &status, 0);
> + ASSERT_TRUE(WIFSIGNALED(status));
> + ASSERT_FALSE(WCOREDUMP(status));
> +
> + ASSERT_TRUE(get_pidfd_info(pidfd, &info));
> + ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> + ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
> +
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> +}
> +
> +
> +static int open_coredump_tmpfile(int fd_tmpfs_detached)
> +{
> + return openat(fd_tmpfs_detached, ".", O_TMPFILE | O_RDWR | O_EXCL, 0600);
> +}
> +
> +#define NUM_CRASHING_COREDUMPS 5
> +
> +TEST_F_TIMEOUT(coredump, socket_multiple_crashing_coredumps, 500)
> +{
> + int pidfd[NUM_CRASHING_COREDUMPS], status[NUM_CRASHING_COREDUMPS];
> + pid_t pid[NUM_CRASHING_COREDUMPS], pid_coredump_server;
> + struct pidfd_info info = {};
> + int ipc_sockets[2];
> + char c;
> +
> + ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
> +
> + ASSERT_EQ(socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets), 0);
> +
> + pid_coredump_server = fork();
> + ASSERT_GE(pid_coredump_server, 0);
> + if (pid_coredump_server == 0) {
> + int fd_server = -1, fd_coredump = -1, fd_peer_pidfd = -1, fd_core_file = -1;
> + int exit_code = EXIT_FAILURE;
> + struct coredump_req req = {};
> +
> + close(ipc_sockets[0]);
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> + if (fd_server < 0) {
> + fprintf(stderr, "Failed to create and listen on unix socket\n");
> + goto out;
> + }
> +
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0) {
> + fprintf(stderr, "Failed to notify parent via ipc socket\n");
> + goto out;
> + }
> + close(ipc_sockets[1]);
> +
> + for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
> + fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> + if (fd_coredump < 0) {
> + fprintf(stderr, "accept4 failed: %m\n");
> + goto out;
> + }
> +
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0) {
> + fprintf(stderr, "get_peer_pidfd failed for fd %d: %m\n", fd_coredump);
> + goto out;
> + }
> +
> + if (!get_pidfd_info(fd_peer_pidfd, &info)) {
> + fprintf(stderr, "get_pidfd_info failed for fd %d\n", fd_peer_pidfd);
> + goto out;
> + }
> +
> + if (!(info.mask & PIDFD_INFO_COREDUMP)) {
> + fprintf(stderr, "pidfd info missing PIDFD_INFO_COREDUMP for fd %d\n", fd_peer_pidfd);
> + goto out;
> + }
> + if (!(info.coredump_mask & PIDFD_COREDUMPED)) {
> + fprintf(stderr, "pidfd info missing PIDFD_COREDUMPED for fd %d\n", fd_peer_pidfd);
> + goto out;
> + }
> +
> + if (!read_coredump_req(fd_coredump, &req)) {
> + fprintf(stderr, "read_coredump_req failed for fd %d\n", fd_coredump);
> + goto out;
> + }
> +
> + if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
> + COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT)) {
> + fprintf(stderr, "check_coredump_req failed for fd %d\n", fd_coredump);
> + goto out;
> + }
> +
> + if (!send_coredump_ack(fd_coredump, &req,
> + COREDUMP_KERNEL | COREDUMP_WAIT, 0)) {
> + fprintf(stderr, "send_coredump_ack failed for fd %d\n", fd_coredump);
> + goto out;
> + }
> +
> + fd_core_file = open_coredump_tmpfile(self->fd_tmpfs_detached);
> + if (fd_core_file < 0) {
> + fprintf(stderr, "%m - open_coredump_tmpfile failed for fd %d\n", fd_coredump);
> + goto out;
> + }
> +
> + for (;;) {
> + char buffer[4096];
> + ssize_t bytes_read, bytes_write;
> +
> + bytes_read = read(fd_coredump, buffer, sizeof(buffer));
> + if (bytes_read < 0) {
> + fprintf(stderr, "read failed for fd %d: %m\n", fd_coredump);
> + goto out;
> + }
> +
> + if (bytes_read == 0)
> + break;
> +
> + bytes_write = write(fd_core_file, buffer, bytes_read);
> + if (bytes_read != bytes_write) {
> + fprintf(stderr, "write failed for fd %d: %m\n", fd_core_file);
> + goto out;
> + }
> + }
> +
> + close(fd_core_file);
> + close(fd_peer_pidfd);
> + close(fd_coredump);
> + fd_peer_pidfd = -1;
> + fd_coredump = -1;
> + }
> +
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_core_file >= 0)
> + close(fd_core_file);
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + if (fd_server >= 0)
> + close(fd_server);
> + _exit(exit_code);
> + }
> + self->pid_coredump_server = pid_coredump_server;
> +
> + EXPECT_EQ(close(ipc_sockets[1]), 0);
> + ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
> + EXPECT_EQ(close(ipc_sockets[0]), 0);
> +
> + for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
> + pid[i] = fork();
> + ASSERT_GE(pid[i], 0);
> + if (pid[i] == 0)
> + crashing_child();
> + pidfd[i] = sys_pidfd_open(pid[i], 0);
> + ASSERT_GE(pidfd[i], 0);
> + }
> +
> + for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
> + waitpid(pid[i], &status[i], 0);
> + ASSERT_TRUE(WIFSIGNALED(status[i]));
> + ASSERT_TRUE(WCOREDUMP(status[i]));
> + }
> +
> + for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
> + info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
> + ASSERT_EQ(ioctl(pidfd[i], PIDFD_GET_INFO, &info), 0);
> + ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> + ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
> + }
> +
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> +}
> +
> +#define MAX_EVENTS 128
> +
> +static void process_coredump_worker(int fd_coredump, int fd_peer_pidfd, int fd_core_file)
> +{
> + int epfd = -1;
> + int exit_code = EXIT_FAILURE;
> +
> + epfd = epoll_create1(0);
> + if (epfd < 0)
> + goto out;
> +
> + struct epoll_event ev;
> + ev.events = EPOLLIN | EPOLLPRI | EPOLLRDHUP | EPOLLET;
> + ev.data.fd = fd_coredump;
> + if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd_coredump, &ev) < 0)
> + goto out;
> +
> + for (;;) {
> + struct epoll_event events[1];
> + int n = epoll_wait(epfd, events, 1, -1);
> + if (n < 0)
> + break;
> +
> + if (events[0].events & EPOLLPRI) {
> + uint8_t oob;
> + ssize_t oobret = recv(fd_coredump, &oob, 1, MSG_OOB);
> + if (oobret == 1) {
> + fprintf(stderr, "Worker: Received OOB marker %u on fd %d, aborting coredump\n", oob, fd_coredump);
> + break;
> + }
> + }
> + if (events[0].events & (EPOLLIN | EPOLLRDHUP)) {
> + for (;;) {
> + char buffer[4096];
> + ssize_t bytes_read = read(fd_coredump, buffer, sizeof(buffer));
> + if (bytes_read < 0) {
> + if (errno == EAGAIN || errno == EWOULDBLOCK)
> + break;
> + goto out;
> + }
> + if (bytes_read == 0)
> + goto done;
> + ssize_t bytes_write = write(fd_core_file, buffer, bytes_read);
> + if (bytes_write != bytes_read)
> + goto out;
> + }
> + }
> + }
> +
> +done:
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (epfd >= 0)
> + close(epfd);
> + if (fd_core_file >= 0)
> + close(fd_core_file);
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + _exit(exit_code);
> +}
> +
> +TEST_F_TIMEOUT(coredump, socket_multiple_crashing_coredumps_epoll_workers, 500)
> +{
> + int pidfd[NUM_CRASHING_COREDUMPS], status[NUM_CRASHING_COREDUMPS];
> + pid_t pid[NUM_CRASHING_COREDUMPS], pid_coredump_server, worker_pids[NUM_CRASHING_COREDUMPS];
> + struct pidfd_info info = {};
> + int ipc_sockets[2];
> + char c;
> +
> + ASSERT_TRUE(set_core_pattern("@@/tmp/coredump.socket"));
> + ASSERT_EQ(socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, ipc_sockets), 0);
> +
> + pid_coredump_server = fork();
> + ASSERT_GE(pid_coredump_server, 0);
> + if (pid_coredump_server == 0) {
> + int fd_server = -1, exit_code = EXIT_FAILURE, n_conns = 0;
> +
> + close(ipc_sockets[0]);
> + fd_server = create_and_listen_unix_socket("/tmp/coredump.socket");
> + if (fd_server < 0)
> + goto out;
> +
> + if (write_nointr(ipc_sockets[1], "1", 1) < 0)
> + goto out;
> + close(ipc_sockets[1]);
> +
> + while (n_conns < NUM_CRASHING_COREDUMPS) {
> + int fd_coredump = -1, fd_peer_pidfd = -1, fd_core_file = -1;
> + struct coredump_req req = {};
> + fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
> + if (fd_coredump < 0) {
> + if (errno == EAGAIN || errno == EWOULDBLOCK)
> + continue;
> + goto out;
> + }
> + fd_peer_pidfd = get_peer_pidfd(fd_coredump);
> + if (fd_peer_pidfd < 0)
> + goto out;
> + if (!get_pidfd_info(fd_peer_pidfd, &info))
> + goto out;
> + if (!(info.mask & PIDFD_INFO_COREDUMP) || !(info.coredump_mask & PIDFD_COREDUMPED))
> + goto out;
> + if (!read_coredump_req(fd_coredump, &req))
> + goto out;
> + if (!check_coredump_req(&req, COREDUMP_ACK_SIZE_VER0,
> + COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT))
> + goto out;
> + if (!send_coredump_ack(fd_coredump, &req, COREDUMP_KERNEL | COREDUMP_WAIT, 0))
> + goto out;
> + fd_core_file = open_coredump_tmpfile(self->fd_tmpfs_detached);
> + if (fd_core_file < 0)
> + goto out;
> + pid_t worker = fork();
> + if (worker < 0)
> + goto out;
> + if (worker == 0) {
> + close(fd_server);
> + process_coredump_worker(fd_coredump, fd_peer_pidfd, fd_core_file);
> + }
> + worker_pids[n_conns] = worker;
> + if (fd_coredump >= 0)
> + close(fd_coredump);
> + if (fd_peer_pidfd >= 0)
> + close(fd_peer_pidfd);
> + if (fd_core_file >= 0)
> + close(fd_core_file);
> + n_conns++;
> + }
> + exit_code = EXIT_SUCCESS;
> +out:
> + if (fd_server >= 0)
> + close(fd_server);
> +
> + /* Reap all worker processes. */
> + for (int i = 0; i < n_conns; i++) {
> + int wstatus;
> + if (waitpid(worker_pids[i], &wstatus, 0) < 0) {
> + fprintf(stderr, "Failed to wait for worker %d: %m\n", worker_pids[i]);
> + } else if (WIFEXITED(wstatus) && WEXITSTATUS(wstatus) != EXIT_SUCCESS) {
> + fprintf(stderr, "Worker %d exited with error code %d\n", worker_pids[i], WEXITSTATUS(wstatus));
> + exit_code = EXIT_FAILURE;
> + }
> + }
> +
> + _exit(exit_code);
> + }
> + self->pid_coredump_server = pid_coredump_server;
> +
> + EXPECT_EQ(close(ipc_sockets[1]), 0);
> + ASSERT_EQ(read_nointr(ipc_sockets[0], &c, 1), 1);
> + EXPECT_EQ(close(ipc_sockets[0]), 0);
> +
> + for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
> + pid[i] = fork();
> + ASSERT_GE(pid[i], 0);
> + if (pid[i] == 0)
> + crashing_child();
> + pidfd[i] = sys_pidfd_open(pid[i], 0);
> + ASSERT_GE(pidfd[i], 0);
> + }
> +
> + for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
> + ASSERT_GE(waitpid(pid[i], &status[i], 0), 0);
> + ASSERT_TRUE(WIFSIGNALED(status[i]));
> + ASSERT_TRUE(WCOREDUMP(status[i]));
> + }
> +
> + for (int i = 0; i < NUM_CRASHING_COREDUMPS; i++) {
> + info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
> + ASSERT_EQ(ioctl(pidfd[i], PIDFD_GET_INFO, &info), 0);
> + ASSERT_GT((info.mask & PIDFD_INFO_COREDUMP), 0);
> + ASSERT_GT((info.coredump_mask & PIDFD_COREDUMPED), 0);
> + }
> +
> + wait_and_check_coredump_server(pid_coredump_server, _metadata, self);
> +}
> +
> TEST_HARNESS_MAIN
>
> --
> 2.47.2
>
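The size negotiation that read_coredump_req() exercises in the patch above can be sketched in isolation. This is a minimal sketch and not the selftest's code: struct demo_req, DEMO_REQ_SIZE_VER0, DEMO_MAX_SIZE and read_versioned_req() are hypothetical names chosen for illustration, and the sender is assumed to put its own struct size in a leading 32-bit field. The reader peeks that field, reads only the part both sides know, and discards just the surplus a newer sender actually queued.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical stand-in for a size-versioned request; size comes first. */
struct demo_req {
	uint32_t size;     /* struct size as known to the sender */
	uint32_t size_ack; /* ack size the sender understands */
	uint64_t mask;     /* feature bits offered by the sender */
};

#define DEMO_REQ_SIZE_VER0 16   /* assumed size of the first layout */
#define DEMO_MAX_SIZE      4096 /* assumed upper bound, one page */

/*
 * Read a size-versioned request from @fd into @req.
 * Returns the number of surplus bytes discarded, or -1 on error.
 */
static ssize_t read_versioned_req(int fd, struct demo_req *req)
{
	char scratch[DEMO_MAX_SIZE];
	uint32_t sender_size;
	size_t known = sizeof(*req), want, surplus;
	ssize_t ret;

	assert(req != NULL);
	memset(req, 0, sizeof(*req));

	/* Peek only the size field; the bytes stay queued on the socket. */
	ret = recv(fd, &sender_size, sizeof(sender_size),
		   MSG_PEEK | MSG_WAITALL);
	if (ret != (ssize_t)sizeof(sender_size))
		return -1;
	if (sender_size < DEMO_REQ_SIZE_VER0 || sender_size >= DEMO_MAX_SIZE)
		return -1;

	/* Read the prefix that both sides understand. */
	want = known < sender_size ? known : sender_size;
	ret = recv(fd, req, want, MSG_WAITALL);
	if (ret != (ssize_t)want)
		return -1;

	/* Only a newer, larger sender leaves bytes behind to discard. */
	surplus = sender_size > known ? sender_size - known : 0;
	if (surplus) {
		ret = recv(fd, scratch, surplus, MSG_WAITALL);
		if (ret != (ssize_t)surplus)
			return -1;
	}
	return (ssize_t)surplus;
}
```

Asking recv() for exactly the surplus matters with MSG_WAITALL: requesting a full page would block until a page of data or EOF arrives, while the sender is itself waiting for the ack.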
* Re: [PATCH v2 0/5] coredump: allow for flexible coredump handling
2025-06-03 13:31 [PATCH v2 0/5] coredump: allow for flexible coredump handling Christian Brauner
` (4 preceding siblings ...)
2025-06-03 13:31 ` [PATCH v2 5/5] selftests/coredump: add coredump server selftests Christian Brauner
@ 2025-06-03 14:44 ` Lennart Poettering
2025-06-09 12:56 ` Jeff Layton
6 siblings, 0 replies; 14+ messages in thread
From: Lennart Poettering @ 2025-06-03 14:44 UTC
To: Christian Brauner
Cc: linux-fsdevel, Jann Horn, Josef Bacik, Jeff Layton,
Alexander Viro, Daan De Meyer, Jan Kara, Mike Yuan,
Zbigniew Jędrzejewski-Szmek, Alexander Mikhalitsyn
On Tue, 03.06.25 15:31, Christian Brauner (brauner@kernel.org) wrote:
Thanks for working on this! Love it! But you know that already, I guess.
[...]
> will enable flexible coredump handling. Current kernels already enforce
> that "@" must be followed by "/" and will reject anything else. So
> extending this is backward and forward compatible.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
Acked-by: Lennart Poettering <lennart@poettering.net>
Lennart
--
Lennart Poettering, Berlin
* Re: [PATCH v2 0/5] coredump: allow for flexible coredump handling
2025-06-03 13:31 [PATCH v2 0/5] coredump: allow for flexible coredump handling Christian Brauner
` (5 preceding siblings ...)
2025-06-03 14:44 ` [PATCH v2 0/5] coredump: allow for flexible coredump handling Lennart Poettering
@ 2025-06-09 12:56 ` Jeff Layton
6 siblings, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2025-06-09 12:56 UTC (permalink / raw)
To: Christian Brauner, linux-fsdevel, Jann Horn
Cc: Josef Bacik, Alexander Viro, Daan De Meyer, Jan Kara,
Lennart Poettering, Mike Yuan, Zbigniew Jędrzejewski-Szmek,
Alexander Mikhalitsyn
On Tue, 2025-06-03 at 15:31 +0200, Christian Brauner wrote:
> In addition to the extensive selftests I've already written a
> (non-production ready) simple Rust coredump server for this in
> userspace:
>
> https://github.com/brauner/dumdum.git
>
> Extend the coredump socket to allow the coredump server to tell the
> kernel how to process individual coredumps. This allows for fine-grained
> coredump management. Userspace can decide to just let the kernel write
> out the coredump, or generate the coredump itself, or just reject it.
>
> When the crashing task connects to the coredump socket the kernel will
> send a struct coredump_req to the coredump server. The kernel will set
> the size member of struct coredump_req, telling the coredump server how
> much data can be read.
>
> The coredump server uses MSG_PEEK to peek the size of struct
> coredump_req. If the kernel uses a newer struct coredump_req the
> coredump server just reads the size it knows and discards any remaining
> bytes in the buffer. If the kernel uses an older struct coredump_req
> the coredump server just reads the size the kernel knows.
>
> The returned struct coredump_req will inform the coredump server what
> features the kernel supports. The coredump_req->mask member is set to
> the currently known features.
>
> The coredump server may only use features whose bits were raised by the
> kernel in coredump_req->mask.
>
> In response to a coredump_req from the kernel the coredump server sends
> a struct coredump_ack to the kernel. The kernel informs the coredump
> server what version of struct coredump_ack it supports by setting struct
> coredump_req->size_ack to the size it knows about. The coredump server
> may only send as many bytes as coredump_req->size_ack indicates (a
> smaller size is fine of course). The coredump server must set
> coredump_ack->size accordingly.
>
> The coredump server sets the features it wants to use in struct
> coredump_ack->mask. Only bits returned in struct coredump_req->mask may
> be used.
>
> In case an invalid struct coredump_ack is sent to the kernel an
> out-of-band byte will be sent by the kernel indicating the reason why
> the coredump_ack was rejected.
>
> The out-of-band markers allow advanced userspace to infer failure. They
> are optional and can be ignored by not listening for POLLPRI events and
> aren't necessary for the coredump server to function correctly.
>
> In the initial version the following features are supported in
> coredump_{req,ack}->mask:
>
> * COREDUMP_KERNEL
> The kernel will write the coredump data to the socket.
>
> * COREDUMP_USERSPACE
> The kernel will not write coredump data but will indicate to the
> parent that a coredump has been generated. This is used when userspace
> generates its own coredumps.
>
> * COREDUMP_REJECT
> The kernel will skip generating a coredump for this task.
>
> * COREDUMP_WAIT
> The kernel will prevent the task from exiting until the coredump
> server has shut down the socket connection.
>
How do you envision COREDUMP_WAIT being used? I took a look at the
trivial server, but it wasn't clear to me why you'd want to block the
task from exiting.
> The flexible coredump socket can be enabled by using the "@@" prefix
> instead of the single "@" prefix for the regular coredump socket:
>
> @@/run/systemd/coredump.socket
>
> will enable flexible coredump handling. Current kernels already enforce
> that "@" must be followed by "/" and will reject anything else. So
> extending this is backward and forward compatible.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> Changes in v2:
> - Add epoll-based concurrent coredump handling selftests.
> - Improve cover letter.
> - Ensure that enum coredump_oob is packed aka a single byte and add a
> static_assert() verifying that.
> - Simplify helper functions making the patch even smaller.
> - Link to v1: https://lore.kernel.org/20250530-work-coredump-socket-protocol-v1-0-20bde1cd4faa@kernel.org
>
> ---
> Christian Brauner (5):
> coredump: allow for flexible coredump handling
> selftests/coredump: fix build
> selftests/coredump: cleanup coredump tests
> tools: add coredump.h header
> selftests/coredump: add coredump server selftests
>
> fs/coredump.c | 130 +-
> include/uapi/linux/coredump.h | 104 ++
> tools/include/uapi/linux/coredump.h | 104 ++
> tools/testing/selftests/coredump/Makefile | 2 +-
> tools/testing/selftests/coredump/config | 4 +
> tools/testing/selftests/coredump/stackdump_test.c | 1705 ++++++++++++++++++---
> 6 files changed, 1799 insertions(+), 250 deletions(-)
> ---
> base-commit: 3e406741b19890c3d8a2ed126aa7c23b106ca9e1
> change-id: 20250520-work-coredump-socket-protocol-6980d1f54c2f
--
Jeff Layton <jlayton@kernel.org>
* Re: [PATCH v2 1/5] coredump: allow for flexible coredump handling
2025-06-03 13:31 ` [PATCH v2 1/5] " Christian Brauner
2025-06-03 13:49 ` Alexander Mikhalitsyn
@ 2025-06-09 14:16 ` Jeff Layton
1 sibling, 0 replies; 14+ messages in thread
From: Jeff Layton @ 2025-06-09 14:16 UTC (permalink / raw)
To: Christian Brauner, linux-fsdevel, Jann Horn
Cc: Josef Bacik, Alexander Viro, Daan De Meyer, Jan Kara,
Lennart Poettering, Mike Yuan, Zbigniew Jędrzejewski-Szmek,
Alexander Mikhalitsyn
On Tue, 2025-06-03 at 15:31 +0200, Christian Brauner wrote:
> Extend the coredump socket to allow the coredump server to tell the
> kernel how to process individual coredumps.
>
> When the crashing task connects to the coredump socket the kernel will
> send a struct coredump_req to the coredump server. The kernel will set
> the size member of struct coredump_req, telling the coredump server how
> much data can be read.
>
> The coredump server uses MSG_PEEK to peek the size of struct
> coredump_req. If the kernel uses a newer struct coredump_req the
> coredump server just reads the size it knows and discards any remaining
> bytes in the buffer. If the kernel uses an older struct coredump_req
> the coredump server just reads the size the kernel knows.
>
> The returned struct coredump_req will inform the coredump server what
> features the kernel supports. The coredump_req->mask member is set to
> the currently known features.
>
> The coredump server may only use features whose bits were raised by the
> kernel in coredump_req->mask.
>
> In response to a coredump_req from the kernel the coredump server sends
> a struct coredump_ack to the kernel. The kernel informs the coredump
> server what version of struct coredump_ack it supports by setting struct
> coredump_req->size_ack to the size it knows about. The coredump server
> may only send as many bytes as coredump_req->size_ack indicates (a
> smaller size is fine of course). The coredump server must set
> coredump_ack->size accordingly.
>
> The coredump server sets the features it wants to use in struct
> coredump_ack->mask. Only bits returned in struct coredump_req->mask may
> be used.
>
> In case an invalid struct coredump_ack is sent to the kernel an
> out-of-band byte will be sent by the kernel indicating the reason why
> the coredump_ack was rejected.
>
> The out-of-band markers allow advanced userspace to infer failure. They
> are optional and can be ignored by not listening for POLLPRI events and
> aren't necessary for the coredump server to function correctly.
>
> In the initial version the following features are supported in
> coredump_{req,ack}->mask:
>
> * COREDUMP_KERNEL
> The kernel will write the coredump data to the socket.
>
> * COREDUMP_USERSPACE
> The kernel will not write coredump data but will indicate to the
> parent that a coredump has been generated. This is used when userspace
> generates its own coredumps.
>
> * COREDUMP_REJECT
> The kernel will skip generating a coredump for this task.
>
> * COREDUMP_WAIT
> The kernel will prevent the task from exiting until the coredump
> server has shut down the socket connection.
>
> The flexible coredump socket can be enabled by using the "@@" prefix
> instead of the single "@" prefix for the regular coredump socket:
>
> @@/run/systemd/coredump.socket
>
> will enable flexible coredump handling. Current kernels already enforce
> that "@" must be followed by "/" and will reject anything else. So
> extending this is backward and forward compatible.
>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> fs/coredump.c | 130 +++++++++++++++++++++++++++++++++++++++---
> include/uapi/linux/coredump.h | 104 +++++++++++++++++++++++++++++++++
> 2 files changed, 227 insertions(+), 7 deletions(-)
>
> diff --git a/fs/coredump.c b/fs/coredump.c
> index f217ebf2b3b6..e79f37d3eefb 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -51,6 +51,7 @@
> #include <net/sock.h>
> #include <uapi/linux/pidfd.h>
> #include <uapi/linux/un.h>
> +#include <uapi/linux/coredump.h>
>
> #include <linux/uaccess.h>
> #include <asm/mmu_context.h>
> @@ -83,15 +84,17 @@ static int core_name_size = CORENAME_MAX_SIZE;
> unsigned int core_file_note_size_limit = CORE_FILE_NOTE_SIZE_DEFAULT;
>
> enum coredump_type_t {
> - COREDUMP_FILE = 1,
> - COREDUMP_PIPE = 2,
> - COREDUMP_SOCK = 3,
> + COREDUMP_FILE = 1,
> + COREDUMP_PIPE = 2,
> + COREDUMP_SOCK = 3,
> + COREDUMP_SOCK_REQ = 4,
> };
>
> struct core_name {
> char *corename;
> int used, size;
> enum coredump_type_t core_type;
> + u64 mask;
> };
>
> static int expand_corename(struct core_name *cn, int size)
> @@ -235,6 +238,9 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
> int pid_in_pattern = 0;
> int err = 0;
>
> + cn->mask = COREDUMP_KERNEL;
> + if (core_pipe_limit)
> + cn->mask |= COREDUMP_WAIT;
> cn->used = 0;
> cn->corename = NULL;
> if (*pat_ptr == '|')
> @@ -264,6 +270,13 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
> pat_ptr++;
> if (!(*pat_ptr))
> return -ENOMEM;
> + if (*pat_ptr == '@') {
> + pat_ptr++;
> + if (!(*pat_ptr))
> + return -ENOMEM;
> +
> + cn->core_type = COREDUMP_SOCK_REQ;
> + }
>
> err = cn_printf(cn, "%s", pat_ptr);
> if (err)
> @@ -632,6 +645,93 @@ static int umh_coredump_setup(struct subprocess_info *info, struct cred *new)
> return 0;
> }
>
> +#ifdef CONFIG_UNIX
> +static inline bool coredump_sock_recv(struct file *file, struct coredump_ack *ack, size_t size, int flags)
> +{
> + struct msghdr msg = {};
> + struct kvec iov = { .iov_base = ack, .iov_len = size };
> + ssize_t ret;
> +
> + memset(ack, 0, size);
> + ret = kernel_recvmsg(sock_from_file(file), &msg, &iov, 1, size, flags);
> + return ret == size;
> +}
> +
> +static inline bool coredump_sock_send(struct file *file, struct coredump_req *req)
> +{
> + struct msghdr msg = { .msg_flags = MSG_NOSIGNAL };
> + struct kvec iov = { .iov_base = req, .iov_len = sizeof(*req) };
> + ssize_t ret;
> +
> + ret = kernel_sendmsg(sock_from_file(file), &msg, &iov, 1, sizeof(*req));
> + return ret == sizeof(*req);
> +}
> +
> +static_assert(sizeof(enum coredump_oob) == sizeof(__u8));
> +
> +static inline bool coredump_sock_oob(struct file *file, enum coredump_oob oob)
> +{
> +#ifdef CONFIG_AF_UNIX_OOB
> + struct msghdr msg = { .msg_flags = MSG_NOSIGNAL | MSG_OOB };
> + struct kvec iov = { .iov_base = &oob, .iov_len = sizeof(oob) };
> +
> + kernel_sendmsg(sock_from_file(file), &msg, &iov, 1, sizeof(oob));
> +#endif
> + coredump_report_failure("Coredump socket ack failed %u", oob);
> + return false;
> +}
> +
> +static bool coredump_request(struct core_name *cn, struct coredump_params *cprm)
> +{
> + struct coredump_req req = {
> + .size = sizeof(struct coredump_req),
> + .mask = COREDUMP_KERNEL | COREDUMP_USERSPACE |
> + COREDUMP_REJECT | COREDUMP_WAIT,
> + .size_ack = sizeof(struct coredump_ack),
> + };
> + struct coredump_ack ack = {};
> + ssize_t usize;
> +
> + if (cn->core_type != COREDUMP_SOCK_REQ)
> + return true;
> +
> + /* Let userspace know what we support. */
> + if (!coredump_sock_send(cprm->file, &req))
> + return false;
> +
> + /* Peek the size of the coredump_ack. */
> + if (!coredump_sock_recv(cprm->file, &ack, sizeof(ack.size),
> + MSG_PEEK | MSG_WAITALL))
> + return false;
> +
> + /* Refuse unknown coredump_ack sizes. */
> + usize = ack.size;
> + if (usize < COREDUMP_ACK_SIZE_VER0 || usize > sizeof(ack))
> + return coredump_sock_oob(cprm->file, COREDUMP_OOB_INVALIDSIZE);
> +
> + /* Now retrieve the coredump_ack. */
> + if (!coredump_sock_recv(cprm->file, &ack, usize, MSG_WAITALL))
> + return false;
> + if (ack.size != usize)
> + return false;
> +
> + /* Refuse unknown coredump_ack flags. */
> + if (ack.mask & ~req.mask)
> + return coredump_sock_oob(cprm->file, COREDUMP_OOB_UNSUPPORTED);
> +
> + /* Refuse mutually exclusive options. */
> + if (hweight64(ack.mask & (COREDUMP_USERSPACE | COREDUMP_KERNEL |
> + COREDUMP_REJECT)) != 1)
> + return coredump_sock_oob(cprm->file, COREDUMP_OOB_CONFLICTING);
> +
> + if (ack.spare)
> + return coredump_sock_oob(cprm->file, COREDUMP_OOB_UNSUPPORTED);
> +
> + cn->mask = ack.mask;
> + return true;
> +}
> +#endif
> +
> void do_coredump(const kernel_siginfo_t *siginfo)
> {
> struct core_state core_state;
> @@ -850,6 +950,8 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> }
> break;
> }
> + case COREDUMP_SOCK_REQ:
> + fallthrough;
nit: you can omit the "fallthrough;" line here.
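To illustrate the nit: adjacent case labels with no statement between them are not treated as an implicit fallthrough, so the annotation is only needed when code precedes the next label. Below is a minimal userspace sketch; the enum mirrors the values in the quoted hunk, while is_socket_type() is a hypothetical helper that does not exist in the patch.

```c
#include <assert.h>

/* Mirrors enum coredump_type_t as quoted in the patch above. */
enum coredump_type_t {
	COREDUMP_FILE     = 1,
	COREDUMP_PIPE     = 2,
	COREDUMP_SOCK     = 3,
	COREDUMP_SOCK_REQ = 4,
};

static_assert(COREDUMP_SOCK_REQ == 4, "value taken from the quoted patch");

/* Hypothetical helper, for illustration only. */
static int is_socket_type(enum coredump_type_t type)
{
	switch (type) {
	case COREDUMP_SOCK_REQ:	/* no statement here, so no annotation is needed */
	case COREDUMP_SOCK:
		return 1;
	default:
		return 0;
	}
}
```

Both socket types take the same branch, and compilers do not emit -Wimplicit-fallthrough warnings for stacked labels like these.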
> case COREDUMP_SOCK: {
> #ifdef CONFIG_UNIX
> struct file *file __free(fput) = NULL;
> @@ -918,6 +1020,9 @@ void do_coredump(const kernel_siginfo_t *siginfo)
>
> cprm.limit = RLIM_INFINITY;
> cprm.file = no_free_ptr(file);
> +
> + if (!coredump_request(&cn, &cprm))
> + goto close_fail;
> #else
> coredump_report_failure("Core dump socket support %s disabled", cn.corename);
> goto close_fail;
> @@ -929,12 +1034,17 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> goto close_fail;
> }
>
> + /* Don't even generate the coredump. */
> + if (cn.mask & COREDUMP_REJECT)
> + goto close_fail;
> +
> /* get us an unshared descriptor table; almost always a no-op */
> /* The cell spufs coredump code reads the file descriptor tables */
> retval = unshare_files();
> if (retval)
> goto close_fail;
> - if (!dump_interrupted()) {
> +
> + if ((cn.mask & COREDUMP_KERNEL) && !dump_interrupted()) {
> /*
> * umh disabled with CONFIG_STATIC_USERMODEHELPER_PATH="" would
> * have this set to NULL.
> @@ -968,17 +1078,23 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> kernel_sock_shutdown(sock_from_file(cprm.file), SHUT_WR);
> #endif
>
> + /* Let the parent know that a coredump was generated. */
> + if (cn.mask & COREDUMP_USERSPACE)
> + core_dumped = true;
> +
> /*
> * When core_pipe_limit is set we wait for the coredump server
> * or usermodehelper to finish before exiting so it can e.g.,
> * inspect /proc/<pid>.
> */
You can ignore my earlier question. The comment above clarifies it.
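For other readers with the same question, here is a rough sketch of how a coredump server might ask for COREDUMP_WAIT so it can look at /proc/<pid> before the task goes away. The struct layouts and flag values are local copies of what the quoted header defines; coredump_build_ack() is a hypothetical helper and not part of the patch or of any library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the UAPI definitions from the quoted patch. */
enum {
	COREDUMP_KERNEL    = (1ULL << 0),
	COREDUMP_USERSPACE = (1ULL << 1),
	COREDUMP_REJECT    = (1ULL << 2),
	COREDUMP_WAIT      = (1ULL << 3),
};

struct coredump_req {
	uint32_t size;
	uint32_t size_ack;
	uint64_t mask;
};

struct coredump_ack {
	uint32_t size;
	uint32_t spare;
	uint64_t mask;
};

enum { COREDUMP_ACK_SIZE_VER0 = 16U };

static_assert(sizeof(struct coredump_ack) == COREDUMP_ACK_SIZE_VER0,
	      "layout must match the first published struct");

/*
 * Hypothetical helper: fill @ack with the features in @wanted.
 * Returns 0 on success, -1 if the kernel did not advertise a wanted
 * feature or cannot accept even the first published ack size.
 */
static int coredump_build_ack(const struct coredump_req *req, uint64_t wanted,
			      struct coredump_ack *ack)
{
	/* Only bits raised by the kernel in coredump_req->mask may be used. */
	if (wanted & ~req->mask)
		return -1;
	if (req->size_ack < COREDUMP_ACK_SIZE_VER0)
		return -1;

	memset(ack, 0, sizeof(*ack));	/* also keeps ->spare at zero */
	/* Never claim more bytes than the kernel says it understands. */
	ack->size = req->size_ack < sizeof(*ack) ? req->size_ack
						  : (uint32_t)sizeof(*ack);
	ack->mask = wanted;
	return 0;
}
```

A server that passes COREDUMP_KERNEL | COREDUMP_WAIT as wanted would send ack.size bytes of the result, read the dump from the socket, inspect /proc/<pid> as needed, and only then shut down the connection, which is the point at which the kernel lets the task exit.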
> - if (core_pipe_limit) {
> + if (cn.mask & COREDUMP_WAIT) {
> switch (cn.core_type) {
> case COREDUMP_PIPE:
> wait_for_dump_helpers(cprm.file);
> break;
> #ifdef CONFIG_UNIX
> + case COREDUMP_SOCK_REQ:
> + fallthrough;
> case COREDUMP_SOCK: {
> ssize_t n;
>
> @@ -1249,8 +1365,8 @@ static inline bool check_coredump_socket(void)
> if (current->nsproxy->mnt_ns != init_task.nsproxy->mnt_ns)
> return false;
>
> - /* Must be an absolute path. */
> - if (*(core_pattern + 1) != '/')
> + /* Must be an absolute path or the socket request. */
> + if (*(core_pattern + 1) != '/' && *(core_pattern + 1) != '@')
> return false;
>
> return true;
> diff --git a/include/uapi/linux/coredump.h b/include/uapi/linux/coredump.h
> new file mode 100644
> index 000000000000..4fa7d1f9d062
> --- /dev/null
> +++ b/include/uapi/linux/coredump.h
> @@ -0,0 +1,104 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +
> +#ifndef _UAPI_LINUX_COREDUMP_H
> +#define _UAPI_LINUX_COREDUMP_H
> +
> +#include <linux/types.h>
> +
> +/**
> + * coredump_{req,ack} flags
> + * @COREDUMP_KERNEL: kernel writes coredump
> + * @COREDUMP_USERSPACE: userspace writes coredump
> + * @COREDUMP_REJECT: don't generate coredump
> + * @COREDUMP_WAIT: wait for coredump server
> + */
> +enum {
> + COREDUMP_KERNEL = (1ULL << 0),
> + COREDUMP_USERSPACE = (1ULL << 1),
> + COREDUMP_REJECT = (1ULL << 2),
> + COREDUMP_WAIT = (1ULL << 3),
> +};
> +
> +/**
> + * struct coredump_req - message kernel sends to userspace
> + * @size: size of struct coredump_req
> + * @size_ack: known size of struct coredump_ack on this kernel
> + * @mask: supported features
> + *
> + * When a coredump happens the kernel will connect to the coredump
> + * socket and send a coredump request to the coredump server. The @size
> + * member is set to the size of struct coredump_req and provides a hint
> + * to userspace how much data can be read. Userspace may use MSG_PEEK to
> + * peek the size of struct coredump_req and then choose to consume it in
> + * one go. Userspace may also simply read a COREDUMP_REQ_SIZE_VER0
> + * request. If the size the kernel sends is larger userspace simply
> + * discards any remaining data.
> + *
> + * The coredump_req->mask member is set to the currently known features.
> + * Userspace may only set coredump_ack->mask to the bits raised by the
> + * kernel in coredump_req->mask.
> + *
> + * The coredump_req->size_ack member is set by the kernel to the size of
> + * struct coredump_ack the kernel knows. Userspace may only send up to
> + * coredump_req->size_ack bytes to the kernel and must set
> + * coredump_ack->size accordingly.
> + */
> +struct coredump_req {
> + __u32 size;
> + __u32 size_ack;
> + __u64 mask;
> +};
> +
> +enum {
> + COREDUMP_REQ_SIZE_VER0 = 16U, /* size of first published struct */
> +};
> +
> +/**
> + * struct coredump_ack - message userspace sends to kernel
> + * @size: size of the struct
> + * @spare: unused
> + * @mask: features kernel is supposed to use
> + *
> + * The @size member must be set to the size of struct coredump_ack. It
> + * may never exceed what the kernel returned in coredump_req->size_ack
> + * but it may of course be smaller (>= COREDUMP_ACK_SIZE_VER0 and <=
> + * coredump_req->size_ack).
> + *
> + * The @mask member must be set to the features the coredump server
> + * wants the kernel to use. Only bits the kernel returned in
> + * coredump_req->mask may be set.
> + */
> +struct coredump_ack {
> + __u32 size;
> + __u32 spare;
> + __u64 mask;
> +};
> +
> +enum {
> + COREDUMP_ACK_SIZE_VER0 = 16U, /* size of first published struct */
> +};
> +
> +/**
> + * enum coredump_oob - Out-of-band markers for the coredump socket
> + *
> + * The kernel will place a single byte coredump_oob marker on the
> + * coredump socket. An interested coredump server can listen for POLLPRI
> + * and figure out why the provided coredump_ack was invalid.
> + *
> + * The out-of-band markers allow advanced userspace to infer more details
> + * about a coredump ack. They are optional and can be ignored. They
> + * aren't necessary for the coredump server to function correctly.
> + *
> + * @COREDUMP_OOB_INVALIDSIZE: the provided coredump_ack size was invalid
> + * @COREDUMP_OOB_UNSUPPORTED: the provided coredump_ack mask was invalid
> + * @COREDUMP_OOB_CONFLICTING: the provided coredump_ack mask has conflicting options
> + * @__COREDUMP_OOB_MAX: the maximum value for coredump_oob
> + */
> +enum coredump_oob {
> + COREDUMP_OOB_INVALIDSIZE = 1U,
> + COREDUMP_OOB_UNSUPPORTED = 2U,
> + COREDUMP_OOB_CONFLICTING = 3U,
> + __COREDUMP_OOB_MAX = 255U,
> +} __attribute__ ((__packed__));
> +
> +#endif /* _UAPI_LINUX_COREDUMP_H */
Looks good!
Reviewed-by: Jeff Layton <jlayton@kernel.org>
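For anyone writing a server against this, the request side described in the header comment could look roughly like the sketch below. It assumes a connected AF_UNIX stream socket; the struct is a local copy of the quoted UAPI definition, and recv_all() and read_coredump_req() are hypothetical helper names, not something the patch provides.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Local copy of struct coredump_req from the quoted patch. */
struct coredump_req {
	uint32_t size;
	uint32_t size_ack;
	uint64_t mask;
};

enum { COREDUMP_REQ_SIZE_VER0 = 16U };

static_assert(sizeof(struct coredump_req) == COREDUMP_REQ_SIZE_VER0,
	      "layout must match the first published struct");

/* Hypothetical helper: receive exactly @len bytes or fail. */
static int recv_all(int fd, void *buf, size_t len, int flags)
{
	ssize_t n = recv(fd, buf, len, flags | MSG_WAITALL);

	return n == (ssize_t)len ? 0 : -1;
}

/*
 * Hypothetical helper: read one coredump_req from @fd.
 * Peeks the leading size member, reads the part this server knows and
 * discards whatever a newer kernel appended after it.
 */
static int read_coredump_req(int fd, struct coredump_req *req)
{
	uint32_t ksize;
	size_t want, left;
	char discard[64];

	memset(req, 0, sizeof(*req));

	/* Peek only the size so the request itself stays queued. */
	if (recv_all(fd, &ksize, sizeof(ksize), MSG_PEEK))
		return -1;
	if (ksize < COREDUMP_REQ_SIZE_VER0)
		return -1;

	/* Read the smaller of what the kernel sent and what we know. */
	want = ksize < sizeof(*req) ? ksize : sizeof(*req);
	if (recv_all(fd, req, want, 0))
		return -1;

	/* Newer kernel: drop the unknown tail in small chunks. */
	for (left = ksize - want; left > 0; ) {
		size_t chunk = left < sizeof(discard) ? left : sizeof(discard);

		if (recv_all(fd, discard, chunk, 0))
			return -1;
		left -= chunk;
	}
	return 0;
}
```

After this returns, req->mask holds the features the kernel advertises and req->size_ack bounds the size of the coredump_ack the server may send back.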
Thread overview: 14+ messages
2025-06-03 13:31 [PATCH v2 0/5] coredump: allow for flexible coredump handling Christian Brauner
2025-06-03 13:31 ` [PATCH v2 1/5] " Christian Brauner
2025-06-03 13:49 ` Alexander Mikhalitsyn
2025-06-09 14:16 ` Jeff Layton
2025-06-03 13:31 ` [PATCH v2 2/5] selftests/coredump: fix build Christian Brauner
2025-06-03 13:51 ` Alexander Mikhalitsyn
2025-06-03 13:31 ` [PATCH v2 3/5] selftests/coredump: cleanup coredump tests Christian Brauner
2025-06-03 13:52 ` Alexander Mikhalitsyn
2025-06-03 13:31 ` [PATCH v2 4/5] tools: add coredump.h header Christian Brauner
2025-06-03 13:51 ` Alexander Mikhalitsyn
2025-06-03 13:31 ` [PATCH v2 5/5] selftests/coredump: add coredump server selftests Christian Brauner
2025-06-03 13:53 ` Alexander Mikhalitsyn
2025-06-03 14:44 ` [PATCH v2 0/5] coredump: allow for flexible coredump handling Lennart Poettering
2025-06-09 12:56 ` Jeff Layton