All of lore.kernel.org
 help / color / mirror / Atom feed
From: Seth Forshee <sforshee@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>,
	Petr Mladek <pmladek@suse.com>, Jason Wang <jasowang@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jiri Kosina <jikos@kernel.org>, Miroslav Benes <mbenes@suse.cz>,
	Joe Lawrence <joe.lawrence@redhat.com>,
	virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
	netdev@vger.kernel.org, live-patching@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] vhost: improve livepatch switching for heavily loaded vhost worker kthreads
Date: Fri, 27 Jan 2023 14:02:40 -0600	[thread overview]
Message-ID: <Y9Qt4BT2WXK2dToL@do-x1extreme> (raw)
In-Reply-To: <Y9OpTtqWjAkC2pal@hirez.programming.kicks-ass.net>

On Fri, Jan 27, 2023 at 11:37:02AM +0100, Peter Zijlstra wrote:
> On Thu, Jan 26, 2023 at 08:43:55PM -0800, Josh Poimboeuf wrote:
> > On Thu, Jan 26, 2023 at 03:12:35PM -0600, Seth Forshee (DigitalOcean) wrote:
> > > On Thu, Jan 26, 2023 at 06:03:16PM +0100, Petr Mladek wrote:
> > > > On Fri 2023-01-20 16:12:20, Seth Forshee (DigitalOcean) wrote:
> > > > > We've fairly regularaly seen liveptches which cannot transition within kpatch's
> > > > > timeout period due to busy vhost worker kthreads.
> > > > 
> > > > I have missed this detail. Miroslav told me that we have solved
> > > > something similar some time ago, see
> > > > https://lore.kernel.org/all/20220507174628.2086373-1-song@kernel.org/
> > > 
> > > Interesting thread. I had thought about something along the lines of the
> > > original patch, but there are some ideas in there that I hadn't
> > > considered.
> > 
> > Here's another idea, have we considered this?  Have livepatch set
> > TIF_NEED_RESCHED on all kthreads to force them into schedule(), and then
> > have the scheduler call klp_try_switch_task() if TIF_PATCH_PENDING is
> > set.
> > 
> > Not sure how scheduler folks would feel about that ;-)
> 
> So, let me try and page all that back in.... :-)
> 
> KLP needs to unwind the stack to see if any of the patched functions are
> active, if not, flip task to new set.
> 
> Unwinding the stack of a task can be done when:
> 
>  - task is inactive (stable reg and stack) -- provided it stays inactive
>    while unwinding etc..
> 
>  - task is current (guarantees stack doesn't dip below where we started
>    due to being busy on top etc..)
> 
> Can NOT be done from interrupt context, because can hit in the middle of
> setting up stack frames etc..
> 
> The issue at hand is that some tasks run for a long time without passing
> through an explicit check.
> 
> The thread above tried sticking something in cond_resched() which is a
> problem for PREEMPT=y since cond_resched() is a no-op.
> 
> Preempt notifiers were raised, and those would actually be nice, except
> you can only install a notifier on current and you need some memory
> allocated per task, which makes it less than ideal. Plus ...
> 
> ... putting something in finish_task_switch() wouldn't be the end of the
> world I suppose, but then you still need to force schedule the task --
> imagine it being the only runnable task on the CPU, there's nothing
> going to make it actually switch.
> 
> Which then leads me to suggest something daft like this.. does that
> help?

The changes below are working well for me. The context has a read lock
on tasklist_lock so it can't sleep, so I'm using stop_one_cpu_nowait()
with per-CPU state.

Based on Josh's comments it sounds like the klp_have_reliable_stack()
check probably isn't quite right, and we might want to add something
else for PREEMPT+!ORC. But I wanted to go ahead and see if this approach
seems reasonable to everyone.

Thanks,
Seth


diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index f1b25ec581e0..9f3898f02828 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -9,6 +9,7 @@
 
 #include <linux/cpu.h>
 #include <linux/stacktrace.h>
+#include <linux/stop_machine.h>
 #include "core.h"
 #include "patch.h"
 #include "transition.h"
@@ -334,9 +335,16 @@ static bool klp_try_switch_task(struct task_struct *task)
 	return !ret;
 }
 
+static int __try_switch_kthread(void *arg)
+{
+	return klp_try_switch_task(arg) ? 0 : -EBUSY;
+}
+
+static DEFINE_PER_CPU(struct cpu_stop_work, klp_stop_work);
+
 /*
  * Sends a fake signal to all non-kthread tasks with TIF_PATCH_PENDING set.
- * Kthreads with TIF_PATCH_PENDING set are woken up.
+ * Kthreads with TIF_PATCH_PENDING set are preempted or woken up.
  */
 static void klp_send_signals(void)
 {
@@ -357,11 +365,22 @@ static void klp_send_signals(void)
 		 * would be meaningless. It is not serious though.
 		 */
 		if (task->flags & PF_KTHREAD) {
-			/*
-			 * Wake up a kthread which sleeps interruptedly and
-			 * still has not been migrated.
-			 */
-			wake_up_state(task, TASK_INTERRUPTIBLE);
+			if (task_curr(task) && klp_have_reliable_stack()) {
+				/*
+				 * kthread is currently running on a CPU; try
+				 * to preempt it.
+				 */
+				stop_one_cpu_nowait(task_cpu(task),
+						    __try_switch_kthread,
+						    task,
+						    this_cpu_ptr(&klp_stop_work));
+			} else {
+				/*
+				 * Wake up a kthread which sleeps interruptedly
+				 * and still has not been migrated.
+				 */
+				wake_up_state(task, TASK_INTERRUPTIBLE);
+			}
 		} else {
 			/*
 			 * Send fake signal to all non-kthread tasks which are

  parent reply	other threads:[~2023-01-27 20:05 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-20 22:12 [PATCH 0/2] vhost: improve livepatch switching for heavily loaded vhost worker kthreads Seth Forshee (DigitalOcean)
2023-01-20 22:12 ` [PATCH 1/2] livepatch: add an interface for safely switching kthreads Seth Forshee (DigitalOcean)
2023-01-20 22:12 ` [PATCH 2/2] vhost: check for pending livepatches from vhost worker kthreads Seth Forshee (DigitalOcean)
2023-01-24 14:17   ` Petr Mladek
2023-01-24 14:17     ` Petr Mladek via Virtualization
2023-01-24 17:21     ` Seth Forshee
2023-01-25 11:34       ` Petr Mladek
2023-01-25 11:34         ` Petr Mladek via Virtualization
2023-01-25 16:57         ` Seth Forshee
2023-01-26 11:16           ` Petr Mladek
2023-01-26 11:16             ` Petr Mladek via Virtualization
2023-01-26 11:49             ` Petr Mladek
2023-01-26 11:49               ` Petr Mladek via Virtualization
2023-01-22  8:34 ` [PATCH 0/2] vhost: improve livepatch switching for heavily loaded " Michael S. Tsirkin
2023-01-22  8:34   ` Michael S. Tsirkin
2023-01-26 17:03 ` Petr Mladek
2023-01-26 17:03   ` Petr Mladek via Virtualization
2023-01-26 21:12   ` Seth Forshee (DigitalOcean)
2023-01-27  4:43     ` Josh Poimboeuf
2023-01-27 10:37       ` Peter Zijlstra
2023-01-27 10:37         ` Peter Zijlstra
2023-01-27 12:09         ` Petr Mladek
2023-01-27 12:09           ` Petr Mladek via Virtualization
2023-01-27 14:37           ` Seth Forshee
2023-01-27 16:52         ` Josh Poimboeuf
2023-01-27 17:09           ` Josh Poimboeuf
2023-01-27 22:11             ` Josh Poimboeuf
2023-01-30 12:40               ` Peter Zijlstra
2023-01-30 12:40                 ` Peter Zijlstra
2023-01-30 17:50                 ` Seth Forshee
2023-01-30 18:18                 ` Josh Poimboeuf
2023-01-30 18:36                 ` Mark Rutland
2023-01-30 18:36                   ` Mark Rutland
2023-01-30 19:48                   ` Josh Poimboeuf
2023-01-31  1:53                     ` Song Liu
2023-01-31 10:22                     ` Mark Rutland
2023-01-31 10:22                       ` Mark Rutland
2023-01-31 16:38                       ` Josh Poimboeuf
2023-02-01 11:10                         ` Mark Rutland
2023-02-01 11:10                           ` Mark Rutland
2023-02-01 16:57                           ` Josh Poimboeuf
2023-02-01 17:11                             ` Mark Rutland
2023-02-01 17:11                               ` Mark Rutland
2023-01-30 19:59                 ` Josh Poimboeuf
2023-01-31 10:02                   ` Peter Zijlstra
2023-01-31 10:02                     ` Peter Zijlstra
2023-01-27 20:02         ` Seth Forshee [this message]
2023-01-27 11:19     ` Petr Mladek
2023-01-27 11:19       ` Petr Mladek via Virtualization
2023-01-27 14:57       ` Seth Forshee
2023-01-30  9:55         ` Petr Mladek
2023-01-30  9:55           ` Petr Mladek via Virtualization

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y9Qt4BT2WXK2dToL@do-x1extreme \
    --to=sforshee@kernel.org \
    --cc=jasowang@redhat.com \
    --cc=jikos@kernel.org \
    --cc=joe.lawrence@redhat.com \
    --cc=jpoimboe@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=mbenes@suse.cz \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.