linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] fix reading ESP during coredump
@ 2024-11-06  9:22 Nam Cao
  2024-11-06  9:22 ` [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable " Nam Cao
  2024-11-06  9:22 ` [PATCH 2/2] selftests: coredump: Add stackdump test Nam Cao
  0 siblings, 2 replies; 10+ messages in thread
From: Nam Cao @ 2024-11-06  9:22 UTC (permalink / raw)
  To: Shuah Khan, Andrew Morton, Oleg Nesterov, Dylan Hatch,
	Eric W . Biederman, John Ogness, Kees Cook, linux-kernel,
	linux-fsdevel, linux-kselftest
  Cc: Nam Cao

Hi,

In /proc/PID/stat, there is the kstkesp field which is the stack pointer of
a thread. While the thread is active, this field reads zero. But during a
coredump, it should have a valid value.

However, at the moment, kstkesp is zero even during coredump.

The first commit fixes this problem, and the second commit adds a selftest
to detect if this problem appears again in the future.

Nam Cao (2):
  fs/proc: do_task_stat: Fix ESP not readable during coredump
  selftests: coredump: Add stackdump test

 fs/proc/array.c                               |  36 ++--
 tools/testing/selftests/coredump/Makefile     |   7 +
 tools/testing/selftests/coredump/README.rst   |  50 ++++++
 tools/testing/selftests/coredump/stackdump    |  14 ++
 .../selftests/coredump/stackdump_test.c       | 154 ++++++++++++++++++
 5 files changed, 243 insertions(+), 18 deletions(-)
 create mode 100644 tools/testing/selftests/coredump/Makefile
 create mode 100644 tools/testing/selftests/coredump/README.rst
 create mode 100755 tools/testing/selftests/coredump/stackdump
 create mode 100644 tools/testing/selftests/coredump/stackdump_test.c

-- 
2.39.5


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable during coredump
  2024-11-06  9:22 [PATCH 0/2] fix reading ESP during coredump Nam Cao
@ 2024-11-06  9:22 ` Nam Cao
  2024-11-06 14:30   ` John Ogness
                     ` (2 more replies)
  2024-11-06  9:22 ` [PATCH 2/2] selftests: coredump: Add stackdump test Nam Cao
  1 sibling, 3 replies; 10+ messages in thread
From: Nam Cao @ 2024-11-06  9:22 UTC (permalink / raw)
  To: Shuah Khan, Andrew Morton, Oleg Nesterov, Dylan Hatch,
	Eric W . Biederman, John Ogness, Kees Cook, linux-kernel,
	linux-fsdevel, linux-kselftest
  Cc: Nam Cao, stable

Commit 0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in
/proc/PID/stat") disabled stack pointer reading, because it is generally
dangerous to do so.

Commit fd7d56270b52 ("fs/proc: Report eip/esp in /prod/PID/stat for
coredumping") made an exception for coredumping thread, because for this
case it is safe.

The exception was later extended to all threads in a coredumping process by
commit cb8f381f1613 ("fs/proc/array.c: allow reporting eip/esp for all
coredumping threads").

The above two commits determine if a task is core dumping by checking the
PF_EXITING and PF_DUMPCORE flags.

However, commit 92307383082d ("coredump:  Don't perform any cleanups before
dumping core") moved coredump to happen earlier and before PF_EXITING is
set. Thus, the check of the PF_EXITING flag no longer works.

Instead, use task->signal->core_state to determine if coredump is
happening. This pointer is set at the beginning of coredump and is cleared
once coredump is done. Thus, while this pointer is not NULL, it is safe to
read ESP.

Fixes: 92307383082d ("coredump:  Don't perform any cleanups before dumping core")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
---
 fs/proc/array.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 34a47fb0c57f..2f1dbfcf143d 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -489,25 +489,8 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	vsize = eip = esp = 0;
 	permitted = ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS | PTRACE_MODE_NOAUDIT);
 	mm = get_task_mm(task);
-	if (mm) {
+	if (mm)
 		vsize = task_vsize(mm);
-		/*
-		 * esp and eip are intentionally zeroed out.  There is no
-		 * non-racy way to read them without freezing the task.
-		 * Programs that need reliable values can use ptrace(2).
-		 *
-		 * The only exception is if the task is core dumping because
-		 * a program is not able to use ptrace(2) in that case. It is
-		 * safe because the task has stopped executing permanently.
-		 */
-		if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE))) {
-			if (try_get_task_stack(task)) {
-				eip = KSTK_EIP(task);
-				esp = KSTK_ESP(task);
-				put_task_stack(task);
-			}
-		}
-	}
 
 	sigemptyset(&sigign);
 	sigemptyset(&sigcatch);
@@ -534,6 +517,23 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 		ppid = task_tgid_nr_ns(task->real_parent, ns);
 		pgid = task_pgrp_nr_ns(task, ns);
 
+		/*
+		 * esp and eip are intentionally zeroed out.  There is no
+		 * non-racy way to read them without freezing the task.
+		 * Programs that need reliable values can use ptrace(2).
+		 *
+		 * The only exception is if the task is core dumping because
+		 * a program is not able to use ptrace(2) in that case. It is
+		 * safe because the task has stopped executing permanently.
+		 */
+		if (permitted && task->signal->core_state) {
+			if (try_get_task_stack(task)) {
+				eip = KSTK_EIP(task);
+				esp = KSTK_ESP(task);
+				put_task_stack(task);
+			}
+		}
+
 		unlock_task_sighand(task, &flags);
 	}
 
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] selftests: coredump: Add stackdump test
  2024-11-06  9:22 [PATCH 0/2] fix reading ESP during coredump Nam Cao
  2024-11-06  9:22 ` [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable " Nam Cao
@ 2024-11-06  9:22 ` Nam Cao
  2024-11-06 14:32   ` John Ogness
  1 sibling, 1 reply; 10+ messages in thread
From: Nam Cao @ 2024-11-06  9:22 UTC (permalink / raw)
  To: Shuah Khan, Andrew Morton, Oleg Nesterov, Dylan Hatch,
	Eric W . Biederman, John Ogness, Kees Cook, linux-kernel,
	linux-fsdevel, linux-kselftest
  Cc: Nam Cao

Add a test which checks that the kstkesp field in /proc/pid/stat can be
read for all threads of a coredumping process.

For full details including the motivation for this test and how it works,
see the README file added by this commit.

Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 tools/testing/selftests/coredump/Makefile     |   7 +
 tools/testing/selftests/coredump/README.rst   |  50 ++++++
 tools/testing/selftests/coredump/stackdump    |  14 ++
 .../selftests/coredump/stackdump_test.c       | 154 ++++++++++++++++++
 4 files changed, 225 insertions(+)
 create mode 100644 tools/testing/selftests/coredump/Makefile
 create mode 100644 tools/testing/selftests/coredump/README.rst
 create mode 100755 tools/testing/selftests/coredump/stackdump
 create mode 100644 tools/testing/selftests/coredump/stackdump_test.c

diff --git a/tools/testing/selftests/coredump/Makefile b/tools/testing/selftests/coredump/Makefile
new file mode 100644
index 000000000000..ed210037b29d
--- /dev/null
+++ b/tools/testing/selftests/coredump/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0-only
+CFLAGS = $(KHDR_INCLUDES)
+
+TEST_GEN_PROGS := stackdump_test
+TEST_FILES := stackdump
+
+include ../lib.mk
diff --git a/tools/testing/selftests/coredump/README.rst b/tools/testing/selftests/coredump/README.rst
new file mode 100644
index 000000000000..164a7aa181c8
--- /dev/null
+++ b/tools/testing/selftests/coredump/README.rst
@@ -0,0 +1,50 @@
+coredump selftest
+=================
+
+Background context
+------------------
+
+`coredump` is a feature which dumps a process's memory space when the process terminates
+unexpectedly (e.g. due to segmentation fault), which can be useful for debugging. By default,
+`coredump` dumps the memory to the file named `core`, but this behavior can be changed by writing a
+different file name to `/proc/sys/kernel/core_pattern`. Furthermore, `coredump` can be piped to a
+user-space program by writing the pipe symbol (`|`) followed by the command to be executed to
+`/proc/sys/kernel/core_pattern`. For the full description, see `man 5 core`.
+
+The piped user program may be interested in reading the stack pointers of the crashed process. The
+crashed process's stack pointers can be read from `procfs`: it is the `kstkesp` field in
+`/proc/$PID/stat`. See `man 5 proc` for all the details.
+
+The problem
+-----------
+While a thread is active, the stack pointer is unsafe to read and therefore the `kstkesp` field
+reads zero. But when the thread is dead (e.g. during a coredump), this field should have valid
+value.
+
+However, this was broken in the past and `kstkesp` was zero even during coredump:
+
+* commit 0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in /proc/PID/stat") changed kstkesp to
+  always be zero
+
+* commit fd7d56270b52 ("fs/proc: Report eip/esp in /prod/PID/stat for coredumping") fixed it for the
+  coredumping thread. However, other threads in a coredumping process still had the problem.
+
+* commit cb8f381f1613 ("fs/proc/array.c: allow reporting eip/esp for all coredumping threads") fixed
+  for all threads in a coredumping process.
+
+* commit 92307383082d ("coredump:  Don't perform any cleanups before dumping core") broke it again
+  for the other threads in a coredumping process.
+
+The problem has been fixed now, but considering the history, it may appear again in the future.
+
+The goal of this test
+---------------------
+This test detects problem with reading `kstkesp` during coredump by doing the following:
+
+#. Tell the kernel to execute the "stackdump" script when a coredump happens. This script
+   reads the stack pointers of all threads of crashed processes.
+
+#. Spawn a child process who creates some threads and then crashes.
+
+#. Read the output from the "stackdump" script, and make sure all stack pointer values are
+   non-zero.
diff --git a/tools/testing/selftests/coredump/stackdump b/tools/testing/selftests/coredump/stackdump
new file mode 100755
index 000000000000..96714ce42d12
--- /dev/null
+++ b/tools/testing/selftests/coredump/stackdump
@@ -0,0 +1,14 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+CRASH_PROGRAM_ID=$1
+STACKDUMP_FILE=$2
+
+TMP=$(mktemp)
+
+for t in /proc/$CRASH_PROGRAM_ID/task/*; do
+	tid=$(basename $t)
+	cat /proc/$tid/stat | awk '{print $29}' >> $TMP
+done
+
+mv $TMP $STACKDUMP_FILE
diff --git a/tools/testing/selftests/coredump/stackdump_test.c b/tools/testing/selftests/coredump/stackdump_test.c
new file mode 100644
index 000000000000..b86202aa2f04
--- /dev/null
+++ b/tools/testing/selftests/coredump/stackdump_test.c
@@ -0,0 +1,154 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <fcntl.h>
+#include <libgen.h>
+#include <linux/limits.h>
+#include <pthread.h>
+#include <string.h>
+#include <sys/resource.h>
+#include <unistd.h>
+
+#include "../kselftest_harness.h"
+
+#define STACKDUMP_FILE "/tmp/kselftest_stackdump"
+#define STACKDUMP_SCRIPT "stackdump"
+#define NUM_THREAD_SPAWN 128
+
+static void *do_nothing(void *)
+{
+	while (1)
+		pause();
+}
+
+static void crashing_child(void)
+{
+	pthread_t thread;
+	int i;
+
+	for (i = 0; i < NUM_THREAD_SPAWN; ++i)
+		pthread_create(&thread, NULL, do_nothing, NULL);
+
+	/* crash on purpose */
+	i = *(int *)NULL;
+}
+
+static void empty_function(int) {}
+
+FIXTURE(coredump)
+{
+	char original_core_pattern[256];
+};
+
+FIXTURE_SETUP(coredump)
+{
+	char buf[PATH_MAX];
+	FILE *file;
+	char *dir;
+	int ret;
+
+	file = fopen("/proc/sys/kernel/core_pattern", "r");
+	ASSERT_NE(NULL, file);
+
+	ret = fread(self->original_core_pattern, 1, sizeof(self->original_core_pattern), file);
+	ASSERT_TRUE(ret || feof(file));
+	ASSERT_LT(ret, sizeof(self->original_core_pattern));
+
+	self->original_core_pattern[ret] = '\0';
+
+	ret = fclose(file);
+	ASSERT_EQ(0, ret);
+}
+
+FIXTURE_TEARDOWN(coredump)
+{
+	const char *reason;
+	FILE *file;
+	int ret;
+
+	file = fopen("/proc/sys/kernel/core_pattern", "w");
+	if (!file) {
+		reason = "Unable to open core_pattern";
+		goto fail;
+	}
+
+	ret = fprintf(file, "%s", self->original_core_pattern);
+	if (ret < 0) {
+		reason = "Unable to write to core_pattern";
+		goto fail;
+	}
+
+	ret = fclose(file);
+	if (ret) {
+		reason = "Unable to close core_pattern";
+		goto fail;
+	}
+
+	return;
+fail:
+	/* This should never happen */
+	fprintf(stderr, "Failed to cleanup stackdump test: %s\n", reason);
+}
+
+TEST_F(coredump, stackdump)
+{
+	struct sigaction action = {};
+	unsigned long long stack;
+	char *test_dir, *line;
+	size_t line_length;
+	char buf[PATH_MAX];
+	int ret, i;
+	FILE *file;
+	pid_t pid;
+
+	/* This file may be left behind by a previous test run */
+	unlink(STACKDUMP_FILE);
+
+	/*
+	 * Step 1: Setup core_pattern so that the stackdump script is executed when the child
+	 * process crashes
+	 */
+	ret = readlink("/proc/self/exe", buf, sizeof(buf));
+	ASSERT_NE(-1, ret);
+	ASSERT_LT(ret, sizeof(buf));
+	buf[ret] = '\0';
+
+	test_dir = dirname(buf);
+
+	file = fopen("/proc/sys/kernel/core_pattern", "w");
+	ASSERT_NE(NULL, file);
+
+	ret = fprintf(file, "|%s/%s %%P %s", test_dir, STACKDUMP_SCRIPT, STACKDUMP_FILE);
+	ASSERT_LT(0, ret);
+
+	ret = fclose(file);
+	ASSERT_EQ(0, ret);
+
+	/* Step 2: Create a process who spawns some threads then crashes */
+	pid = fork();
+	ASSERT_TRUE(pid >= 0);
+	if (pid == 0)
+		crashing_child();
+
+	/*
+	 * Step 3: Wait for the stackdump script to write the stack pointers to the stackdump file
+	 */
+	for (i = 0; i < 10; ++i) {
+		file = fopen(STACKDUMP_FILE, "r");
+		if (file)
+			break;
+		sleep(1);
+	}
+	ASSERT_NE(file, NULL);
+
+	/* Step 4: Make sure all stack pointer values are non-zero */
+	for (i = 0; -1 != getline(&line, &line_length, file); ++i) {
+		stack = strtoull(line, NULL, 10);
+		ASSERT_NE(stack, 0);
+	}
+
+	ASSERT_EQ(i, 1 + NUM_THREAD_SPAWN);
+
+	fclose(file);
+}
+
+TEST_HARNESS_MAIN
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable during coredump
  2024-11-06  9:22 ` [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable " Nam Cao
@ 2024-11-06 14:30   ` John Ogness
  2024-12-17 12:50   ` Thomas Gleixner
  2024-12-17 14:59   ` Oleg Nesterov
  2 siblings, 0 replies; 10+ messages in thread
From: John Ogness @ 2024-11-06 14:30 UTC (permalink / raw)
  To: Nam Cao, Shuah Khan, Andrew Morton, Oleg Nesterov, Dylan Hatch,
	Eric W . Biederman, Kees Cook, linux-kernel, linux-fsdevel,
	linux-kselftest
  Cc: Nam Cao, stable

On 2024-11-06, Nam Cao <namcao@linutronix.de> wrote:
> Commit 0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in
> /proc/PID/stat") disabled stack pointer reading, because it is generally
> dangerous to do so.
>
> Commit fd7d56270b52 ("fs/proc: Report eip/esp in /prod/PID/stat for
> coredumping") made an exception for coredumping thread, because for this
> case it is safe.
>
> The exception was later extended to all threads in a coredumping process by
> commit cb8f381f1613 ("fs/proc/array.c: allow reporting eip/esp for all
> coredumping threads").
>
> The above two commits determine if a task is core dumping by checking the
> PF_EXITING and PF_DUMPCORE flags.
>
> However, commit 92307383082d ("coredump:  Don't perform any cleanups before
> dumping core") moved coredump to happen earlier and before PF_EXITING is
> set. Thus, the check of the PF_EXITING flag no longer works.
>
> Instead, use task->signal->core_state to determine if coredump is
> happening. This pointer is set at the beginning of coredump and is cleared
> once coredump is done. Thus, while this pointer is not NULL, it is safe to
> read ESP.
>
> Fixes: 92307383082d ("coredump:  Don't perform any cleanups before dumping core")
> Signed-off-by: Nam Cao <namcao@linutronix.de>

Reviewed-by: John Ogness <john.ogness@linutronix.de>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] selftests: coredump: Add stackdump test
  2024-11-06  9:22 ` [PATCH 2/2] selftests: coredump: Add stackdump test Nam Cao
@ 2024-11-06 14:32   ` John Ogness
  0 siblings, 0 replies; 10+ messages in thread
From: John Ogness @ 2024-11-06 14:32 UTC (permalink / raw)
  To: Nam Cao, Shuah Khan, Andrew Morton, Oleg Nesterov, Dylan Hatch,
	Eric W . Biederman, Kees Cook, linux-kernel, linux-fsdevel,
	linux-kselftest
  Cc: Nam Cao

On 2024-11-06, Nam Cao <namcao@linutronix.de> wrote:
> Add a test which checks that the kstkesp field in /proc/pid/stat can be
> read for all threads of a coredumping process.
>
> For full details including the motivation for this test and how it works,
> see the README file added by this commit.
>
> Signed-off-by: Nam Cao <namcao@linutronix.de>

Reviewed-by: John Ogness <john.ogness@linutronix.de>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable during coredump
  2024-11-06  9:22 ` [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable " Nam Cao
  2024-11-06 14:30   ` John Ogness
@ 2024-12-17 12:50   ` Thomas Gleixner
  2024-12-17 14:59   ` Oleg Nesterov
  2 siblings, 0 replies; 10+ messages in thread
From: Thomas Gleixner @ 2024-12-17 12:50 UTC (permalink / raw)
  To: Nam Cao, Shuah Khan, Andrew Morton, Oleg Nesterov, Dylan Hatch,
	Eric W . Biederman, John Ogness, Kees Cook, linux-kernel,
	linux-fsdevel, linux-kselftest, Christian Brauner
  Cc: Nam Cao, stable

On Wed, Nov 06 2024 at 10:22, Nam Cao wrote:
> Commit 0a1eb2d474ed ("fs/proc: Stop reporting eip and esp in
> /proc/PID/stat") disabled stack pointer reading, because it is generally
> dangerous to do so.
>
> Commit fd7d56270b52 ("fs/proc: Report eip/esp in /prod/PID/stat for
> coredumping") made an exception for coredumping thread, because for this
> case it is safe.
>
> The exception was later extended to all threads in a coredumping process by
> commit cb8f381f1613 ("fs/proc/array.c: allow reporting eip/esp for all
> coredumping threads").
>
> The above two commits determine if a task is core dumping by checking the
> PF_EXITING and PF_DUMPCORE flags.
>
> However, commit 92307383082d ("coredump:  Don't perform any cleanups before
> dumping core") moved coredump to happen earlier and before PF_EXITING is
> set. Thus, the check of the PF_EXITING flag no longer works.
>
> Instead, use task->signal->core_state to determine if coredump is
> happening. This pointer is set at the beginning of coredump and is cleared
> once coredump is done. Thus, while this pointer is not NULL, it is safe to
> read ESP.
>
> Fixes: 92307383082d ("coredump:  Don't perform any cleanups before dumping core")

Can we please make progress with that? It's a user space visible change
which causes a regression in core dumper tools.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable during coredump
  2024-11-06  9:22 ` [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable " Nam Cao
  2024-11-06 14:30   ` John Ogness
  2024-12-17 12:50   ` Thomas Gleixner
@ 2024-12-17 14:59   ` Oleg Nesterov
  2024-12-17 15:09     ` Oleg Nesterov
  2 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2024-12-17 14:59 UTC (permalink / raw)
  To: Nam Cao
  Cc: Shuah Khan, Andrew Morton, Dylan Hatch, Eric W . Biederman,
	John Ogness, Kees Cook, linux-kernel, linux-fsdevel,
	linux-kselftest, stable

On 11/06, Nam Cao wrote:
>
> @@ -534,6 +517,23 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
>  		ppid = task_tgid_nr_ns(task->real_parent, ns);
>  		pgid = task_pgrp_nr_ns(task, ns);
>
> +		/*
> +		 * esp and eip are intentionally zeroed out.  There is no
> +		 * non-racy way to read them without freezing the task.
> +		 * Programs that need reliable values can use ptrace(2).

OK,

but then:

> +		 * The only exception is if the task is core dumping because
> +		 * a program is not able to use ptrace(2) in that case. It is
> +		 * safe because the task has stopped executing permanently.
> +		 */
> +		if (permitted && task->signal->core_state) {
> +			if (try_get_task_stack(task)) {
> +				eip = KSTK_EIP(task);
> +				esp = KSTK_ESP(task);
> +				put_task_stack(task);

How can the task->signal->core_state check help ?

Suppose we have a task T1 with T1-pid == 100 and you read /proc/100/stat.
It is possible that the T1's sub-thread T2 starts the coredumping and sets
signal->core_state != NULL.

But read(/proc/100/stat) can run before T1 gets SIGKILL from T2 and enters
the kernel mode?

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable during coredump
  2024-12-17 14:59   ` Oleg Nesterov
@ 2024-12-17 15:09     ` Oleg Nesterov
  2024-12-20 14:53       ` Nam Cao
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2024-12-17 15:09 UTC (permalink / raw)
  To: Nam Cao
  Cc: Shuah Khan, Andrew Morton, Dylan Hatch, Eric W . Biederman,
	John Ogness, Kees Cook, linux-kernel, linux-fsdevel,
	linux-kselftest, stable

On 12/17, Oleg Nesterov wrote:
>
> On 11/06, Nam Cao wrote:
> >
> > @@ -534,6 +517,23 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
> >  		ppid = task_tgid_nr_ns(task->real_parent, ns);
> >  		pgid = task_pgrp_nr_ns(task, ns);
> >
> > +		/*
> > +		 * esp and eip are intentionally zeroed out.  There is no
> > +		 * non-racy way to read them without freezing the task.
> > +		 * Programs that need reliable values can use ptrace(2).
>
> OK,
>
> but then:
>
> > +		 * The only exception is if the task is core dumping because
> > +		 * a program is not able to use ptrace(2) in that case. It is
> > +		 * safe because the task has stopped executing permanently.
> > +		 */
> > +		if (permitted && task->signal->core_state) {
> > +			if (try_get_task_stack(task)) {
> > +				eip = KSTK_EIP(task);
> > +				esp = KSTK_ESP(task);
> > +				put_task_stack(task);
>
> How can the task->signal->core_state check help ?
>
> Suppose we have a task T1 with T1-pid == 100 and you read /proc/100/stat.
> It is possible that the T1's sub-thread T2 starts the coredumping and sets
> signal->core_state != NULL.
>
> But read(/proc/100/stat) can run before T1 gets SIGKILL from T2 and enters
> the kernel mode?

Can't the trivial patch below fix the problem?

Oleg.


--- xfs/proc/array.c
+++ x/fs/proc/array.c
@@ -500,7 +500,7 @@
 		 * a program is not able to use ptrace(2) in that case. It is
 		 * safe because the task has stopped executing permanently.
 		 */
-		if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE))) {
+		if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE|PF_POSTCOREDUMP))) {
 			if (try_get_task_stack(task)) {
 				eip = KSTK_EIP(task);
 				esp = KSTK_ESP(task);


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable during coredump
  2024-12-17 15:09     ` Oleg Nesterov
@ 2024-12-20 14:53       ` Nam Cao
  2024-12-22 19:18         ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Nam Cao @ 2024-12-20 14:53 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Shuah Khan, Andrew Morton, Dylan Hatch, Eric W . Biederman,
	John Ogness, Kees Cook, linux-kernel, linux-fsdevel,
	linux-kselftest, stable

Hi Oleg,

On Tue, Dec 17, 2024 at 04:09:14PM +0100, Oleg Nesterov wrote:
> On 12/17, Oleg Nesterov wrote:
> >
> > On 11/06, Nam Cao wrote:
> > >
> > > @@ -534,6 +517,23 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
> > >  		ppid = task_tgid_nr_ns(task->real_parent, ns);
> > >  		pgid = task_pgrp_nr_ns(task, ns);
> > >
> > > +		/*
> > > +		 * esp and eip are intentionally zeroed out.  There is no
> > > +		 * non-racy way to read them without freezing the task.
> > > +		 * Programs that need reliable values can use ptrace(2).
> >
> > OK,
> >
> > but then:
> >
> > > +		 * The only exception is if the task is core dumping because
> > > +		 * a program is not able to use ptrace(2) in that case. It is
> > > +		 * safe because the task has stopped executing permanently.
> > > +		 */
> > > +		if (permitted && task->signal->core_state) {
> > > +			if (try_get_task_stack(task)) {
> > > +				eip = KSTK_EIP(task);
> > > +				esp = KSTK_ESP(task);
> > > +				put_task_stack(task);
> >
> > How can the task->signal->core_state check help ?
> >
> > Suppose we have a task T1 with T1-pid == 100 and you read /proc/100/stat.
> > It is possible that the T1's sub-thread T2 starts the coredumping and sets
> > signal->core_state != NULL.
> >
> > But read(/proc/100/stat) can run before T1 gets SIGKILL from T2 and enters
> > the kernel mode?

Right, I missed that race, thanks for pointing it out.

> Can't the trivial patch below fix the problem?

It can. In fact this is the original fix we had. I thought that checking a
single "core_state" is simpler than checking 3 flags, oh well..

Can you send a proper patch, or should I do it?

Best regards,
Nam

> 
> Oleg.
> 
> 
> --- xfs/proc/array.c
> +++ x/fs/proc/array.c
> @@ -500,7 +500,7 @@
>  		 * a program is not able to use ptrace(2) in that case. It is
>  		 * safe because the task has stopped executing permanently.
>  		 */
> -		if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE))) {
> +		if (permitted && (task->flags & (PF_EXITING|PF_DUMPCORE|PF_POSTCOREDUMP))) {
>  			if (try_get_task_stack(task)) {
>  				eip = KSTK_EIP(task);
>  				esp = KSTK_ESP(task);
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable during coredump
  2024-12-20 14:53       ` Nam Cao
@ 2024-12-22 19:18         ` Oleg Nesterov
  0 siblings, 0 replies; 10+ messages in thread
From: Oleg Nesterov @ 2024-12-22 19:18 UTC (permalink / raw)
  To: Nam Cao
  Cc: Shuah Khan, Andrew Morton, Dylan Hatch, Eric W . Biederman,
	John Ogness, Kees Cook, linux-kernel, linux-fsdevel,
	linux-kselftest, stable

Hi Nam,

On 12/20, Nam Cao wrote:
>
> > Can't the trivial patch below fix the problem?
>
> It can. In fact this is the original fix we had. I thought that checking a
> single "core_state" is simpler than checking 3 flags, oh well..
>
> Can you send a proper patch, or should I do it?

Can you send V2 please? It was you who found/investigated the problem,
and the patch is trivial.

Feel free to include my acked-by.

Thanks,

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-12-22 19:19 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-06  9:22 [PATCH 0/2] fix reading ESP during coredump Nam Cao
2024-11-06  9:22 ` [PATCH 1/2] fs/proc: do_task_stat: Fix ESP not readable " Nam Cao
2024-11-06 14:30   ` John Ogness
2024-12-17 12:50   ` Thomas Gleixner
2024-12-17 14:59   ` Oleg Nesterov
2024-12-17 15:09     ` Oleg Nesterov
2024-12-20 14:53       ` Nam Cao
2024-12-22 19:18         ` Oleg Nesterov
2024-11-06  9:22 ` [PATCH 2/2] selftests: coredump: Add stackdump test Nam Cao
2024-11-06 14:32   ` John Ogness

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).