public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: x86@kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	linux-kernel@vger.kernel.org,
	Indu Bhagat <indu.bhagat@oracle.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	linux-perf-users@vger.kernel.org, Mark Brown <broonie@kernel.org>,
	linux-toolchains@vger.kernel.org, Jordan Rome <jordalgo@meta.com>,
	Sam James <sam@gentoo.org>,
	linux-trace-kernel@vger.kernel.org,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Jens Remus <jremus@linux.ibm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Florian Weimer <fweimer@redhat.com>,
	Andy Lutomirski <luto@kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Weinan Liu <wnliu@google.com>
Subject: Re: [PATCH v4 02/39] task_work: Fix TWA_NMI_CURRENT race with __schedule()
Date: Wed, 22 Jan 2025 13:42:28 +0100	[thread overview]
Message-ID: <20250122124228.GO7145@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <ad5fb696fbcd276c0902dbb94baa75fb79a6e1e2.1737511963.git.jpoimboe@kernel.org>

On Tue, Jan 21, 2025 at 06:30:54PM -0800, Josh Poimboeuf wrote:
> If TWA_NMI_CURRENT task work is queued from an NMI triggered while
> running in __schedule() with IRQs disabled, task_work_set_notify_irq()
> ends up inadvertently running on the next scheduled task.  So the
> original task doesn't get its TIF_NOTIFY_RESUME flag set and the task
> work may get delayed indefinitely, or may not get to run at all.
> 
>     __schedule()
> 	// disable irqs
> 	    <NMI>
> 		task_work_add(current, work, TWA_NMI_CURRENT);
> 	    </NMI>
> 	// current = next;
> 	// enable irqs
> 	    <IRQ>
> 		task_work_set_notify_irq()
> 		test_and_set_tsk_thread_flag(current, TIF_NOTIFY_RESUME); // wrong task!
> 	    </IRQ>
> 	// original task skips task work on its next return to user (or exit!)
> 
> Fix it by storing the task pointer along with the irq_work struct and
> passing that task to set_notify_resume().

So I'm a little confused, isn't something like this sufficient?

If we hit before schedule(), all just works as expected, if we hit after
schedule(), the task will already have the TIF flag set, and we'll hit
the return to user path once it gets scheduled again.

---
diff --git a/kernel/task_work.c b/kernel/task_work.c
index c969f1f26be5..155549c017b2 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -9,7 +9,12 @@ static struct callback_head work_exited; /* all we need is ->next == NULL */
 #ifdef CONFIG_IRQ_WORK
 static void task_work_set_notify_irq(struct irq_work *entry)
 {
-	test_and_set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
+	/*
+	 * no-op IPI
+	 *
+	 * TWA_NMI_CURRENT will already have set the TIF flag, all
+	 * this interrupt does it tickle the return-to-user path.
+	 */
 }
 static DEFINE_PER_CPU(struct irq_work, irq_work_NMI_resume) =
 	IRQ_WORK_INIT_HARD(task_work_set_notify_irq);
@@ -98,6 +103,7 @@ int task_work_add(struct task_struct *task, struct callback_head *work,
 		break;
 #ifdef CONFIG_IRQ_WORK
 	case TWA_NMI_CURRENT:
+		set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
 		irq_work_queue(this_cpu_ptr(&irq_work_NMI_resume));
 		break;
 #endif

  parent reply	other threads:[~2025-01-22 12:42 UTC|newest]

Thread overview: 161+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-22  2:30 [PATCH v4 00/39] unwind, perf: sframe user space unwinding Josh Poimboeuf
2025-01-22  2:30 ` [PATCH v4 01/39] task_work: Fix TWA_NMI_CURRENT error handling Josh Poimboeuf
2025-01-22 12:28   ` Peter Zijlstra
2025-01-22 20:47     ` Josh Poimboeuf
2025-01-23  8:14       ` Peter Zijlstra
2025-01-23 17:15         ` Josh Poimboeuf
2025-01-23 22:19           ` Peter Zijlstra
2025-04-22 16:14     ` Steven Rostedt
2025-01-22  2:30 ` [PATCH v4 02/39] task_work: Fix TWA_NMI_CURRENT race with __schedule() Josh Poimboeuf
2025-01-22 12:23   ` Peter Zijlstra
2025-01-22 12:42   ` Peter Zijlstra [this message]
2025-01-22 21:03     ` Josh Poimboeuf
2025-01-22 22:14       ` Josh Poimboeuf
2025-01-23  8:15         ` Peter Zijlstra
2025-04-22 16:15     ` Steven Rostedt
2025-04-22 17:20       ` Josh Poimboeuf
2025-01-22  2:30 ` [PATCH v4 03/39] mm: Add guard for mmap_read_lock Josh Poimboeuf
2025-01-22  2:30 ` [PATCH v4 04/39] x86/vdso: Fix DWARF generation for getrandom() Josh Poimboeuf
2025-01-22  2:30 ` [PATCH v4 05/39] x86/asm: Avoid emitting DWARF CFI for non-VDSO Josh Poimboeuf
2025-01-24 16:08   ` Jens Remus
2025-01-24 16:47     ` Josh Poimboeuf
2025-01-22  2:30 ` [PATCH v4 06/39] x86/asm: Fix VDSO DWARF generation with kernel IBT enabled Josh Poimboeuf
2025-01-22  2:30 ` [PATCH v4 07/39] x86/vdso: Use SYM_FUNC_{START,END} in __kernel_vsyscall() Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 08/39] x86/vdso: Use CFI macros in __vdso_sgx_enter_enclave() Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 09/39] x86/vdso: Enable sframe generation in VDSO Josh Poimboeuf
2025-01-24 16:00   ` Jens Remus
2025-01-24 16:43     ` Josh Poimboeuf
2025-01-24 16:53       ` Josh Poimboeuf
2025-04-22 17:44       ` Steven Rostedt
2025-01-24 16:30   ` Jens Remus
2025-01-24 16:56     ` Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 10/39] x86/uaccess: Add unsafe_copy_from_user() implementation Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 11/39] unwind_user: Add user space unwinding API Josh Poimboeuf
2025-01-24 16:41   ` Jens Remus
2025-01-24 17:09     ` Josh Poimboeuf
2025-01-24 17:59   ` Andrii Nakryiko
2025-01-24 18:08     ` Josh Poimboeuf
2025-01-24 20:02   ` Steven Rostedt
2025-01-24 22:05     ` Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 12/39] unwind_user: Add frame pointer support Josh Poimboeuf
2025-01-24 17:59   ` Andrii Nakryiko
2025-01-24 18:16     ` Josh Poimboeuf
2025-04-24 13:41       ` Steven Rostedt
2025-01-22  2:31 ` [PATCH v4 13/39] unwind_user/x86: Enable frame pointer unwinding on x86 Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 14/39] perf/x86: Rename get_segment_base() and make it global Josh Poimboeuf
2025-01-22 12:51   ` Peter Zijlstra
2025-01-22 21:37     ` Josh Poimboeuf
2025-01-24 20:09   ` Steven Rostedt
2025-01-24 22:06     ` Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 15/39] unwind_user: Add compat mode frame pointer support Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 16/39] unwind_user/x86: Enable compat mode frame pointer unwinding on x86 Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 17/39] unwind_user/sframe: Add support for reading .sframe headers Josh Poimboeuf
2025-01-24 18:00   ` Andrii Nakryiko
2025-01-24 19:21     ` Josh Poimboeuf
2025-01-24 20:13       ` Steven Rostedt
2025-01-24 22:39         ` Josh Poimboeuf
2025-01-24 22:13       ` Indu Bhagat
2025-01-28  1:10         ` Andrii Nakryiko
2025-01-29  2:02           ` Josh Poimboeuf
2025-01-30  0:02             ` Andrii Nakryiko
2025-02-04 18:26               ` Josh Poimboeuf
2025-01-30 21:39             ` Indu Bhagat
2025-02-05  0:57               ` Josh Poimboeuf
2025-02-06  1:10                 ` Indu Bhagat
2025-02-05 13:56             ` Jens Remus
2025-02-07 21:13               ` Josh Poimboeuf
2025-01-30 21:21           ` Indu Bhagat
2025-02-04 19:59             ` Josh Poimboeuf
2025-02-05 23:16             ` Andrii Nakryiko
2025-02-05 11:01           ` Jens Remus
2025-02-05 23:05             ` Andrii Nakryiko
2025-01-24 20:31     ` Indu Bhagat
2025-01-22  2:31 ` [PATCH v4 18/39] unwind_user/sframe: Store sframe section data in per-mm maple tree Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 19/39] unwind_user/sframe: Add support for reading .sframe contents Josh Poimboeuf
2025-01-24 16:36   ` Jens Remus
2025-01-24 17:07     ` Josh Poimboeuf
2025-01-24 18:02   ` Andrii Nakryiko
2025-01-24 21:41     ` Josh Poimboeuf
2025-01-28  0:39       ` Andrii Nakryiko
2025-01-28 10:50         ` Jens Remus
2025-01-29  2:04           ` Josh Poimboeuf
2025-01-28 10:54         ` Jens Remus
2025-01-30 19:51       ` Weinan Liu
2025-02-04 19:42         ` Josh Poimboeuf
2025-01-30 15:07   ` Indu Bhagat
2025-02-04 18:38     ` Josh Poimboeuf
2025-01-30 15:47   ` Jens Remus
2025-02-04 18:51     ` Josh Poimboeuf
2025-02-05  9:47       ` Jens Remus
2025-02-07 21:06         ` Josh Poimboeuf
2025-02-10 15:56           ` Jens Remus
2025-01-22  2:31 ` [PATCH v4 20/39] unwind_user/sframe: Detect .sframe sections in executables Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 21/39] unwind_user/sframe: Add prctl() interface for registering .sframe sections Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 22/39] unwind_user/sframe: Wire up unwind_user to sframe Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 23/39] unwind_user/sframe/x86: Enable sframe unwinding on x86 Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 24/39] unwind_user/sframe: Remove .sframe section on detected corruption Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 25/39] unwind_user/sframe: Show file name in debug output Josh Poimboeuf
2025-01-30 16:17   ` Jens Remus
2025-02-04 19:10     ` Josh Poimboeuf
2025-02-05 10:04       ` Jens Remus
2025-01-22  2:31 ` [PATCH v4 26/39] unwind_user/sframe: Enable debugging in uaccess regions Josh Poimboeuf
2025-01-30 16:38   ` Jens Remus
2025-02-04 19:33     ` Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 27/39] unwind_user/sframe: Add .sframe validation option Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 28/39] unwind_user/deferred: Add deferred unwinding interface Josh Poimboeuf
2025-01-22 13:37   ` Peter Zijlstra
2025-01-22 14:16     ` Peter Zijlstra
2025-01-22 22:51       ` Josh Poimboeuf
2025-01-23  8:17         ` Peter Zijlstra
2025-01-23 18:30           ` Josh Poimboeuf
2025-01-23 21:58             ` Peter Zijlstra
2025-01-22 21:38     ` Josh Poimboeuf
2025-01-22 13:44   ` Peter Zijlstra
2025-01-22 21:52     ` Josh Poimboeuf
2025-01-22 20:13   ` Mathieu Desnoyers
2025-01-23  4:05     ` Josh Poimboeuf
2025-01-23  8:25       ` Peter Zijlstra
2025-01-23 18:43         ` Josh Poimboeuf
2025-01-23 22:13           ` Peter Zijlstra
2025-01-24 21:58             ` Steven Rostedt
2025-01-24 22:46               ` Josh Poimboeuf
2025-01-24 22:50                 ` Josh Poimboeuf
2025-01-24 23:57                   ` Steven Rostedt
2025-01-30 20:21                     ` Steven Rostedt
2025-02-05  2:25                       ` Josh Poimboeuf
2025-01-24 16:35   ` Jens Remus
2025-01-24 16:57     ` Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 29/39] unwind_user/deferred: Add unwind cache Josh Poimboeuf
2025-01-22 13:57   ` Peter Zijlstra
2025-01-22 22:36     ` Josh Poimboeuf
2025-01-23  8:31       ` Peter Zijlstra
2025-01-23 18:45         ` Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 30/39] unwind_user/deferred: Make unwind deferral requests NMI-safe Josh Poimboeuf
2025-01-22 14:15   ` Peter Zijlstra
2025-01-22 22:49     ` Josh Poimboeuf
2025-01-23  8:40       ` Peter Zijlstra
2025-01-23 19:48         ` Josh Poimboeuf
2025-01-23 19:54           ` Josh Poimboeuf
2025-01-23 22:17           ` Peter Zijlstra
2025-01-23 23:34             ` Josh Poimboeuf
2025-01-24 11:58               ` Peter Zijlstra
2025-01-22 14:24   ` Peter Zijlstra
2025-01-22 22:52     ` Josh Poimboeuf
2025-01-23  8:42       ` Peter Zijlstra
2025-01-22  2:31 ` [PATCH v4 31/39] perf: Remove get_perf_callchain() 'init_nr' argument Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 32/39] perf: Remove get_perf_callchain() 'crosstask' argument Josh Poimboeuf
2025-01-24 18:13   ` Andrii Nakryiko
2025-01-24 22:00     ` Josh Poimboeuf
2025-01-28  0:39       ` Andrii Nakryiko
2025-01-22  2:31 ` [PATCH v4 33/39] perf: Simplify get_perf_callchain() user logic Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 34/39] perf: Skip user unwind if !current->mm Josh Poimboeuf
2025-01-22 14:29   ` Peter Zijlstra
2025-01-22 23:08     ` Josh Poimboeuf
2025-01-23  8:44       ` Peter Zijlstra
2025-01-22  2:31 ` [PATCH v4 35/39] perf: Support deferred user callchains Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 36/39] perf tools: Minimal CALLCHAIN_DEFERRED support Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 37/39] perf record: Enable defer_callchain for user callchains Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 38/39] perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED Josh Poimboeuf
2025-01-22  2:31 ` [PATCH v4 39/39] perf tools: Merge deferred user callchains Josh Poimboeuf
2025-01-22  2:35 ` [PATCH v4 00/39] unwind, perf: sframe user space unwinding Josh Poimboeuf
2025-01-22 16:13 ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250122124228.GO7145@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=broonie@kernel.org \
    --cc=fweimer@redhat.com \
    --cc=indu.bhagat@oracle.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=jordalgo@meta.com \
    --cc=jpoimboe@kernel.org \
    --cc=jremus@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linux-toolchains@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=sam@gentoo.org \
    --cc=wnliu@google.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox