public inbox for linux-kernel@vger.kernel.org
From: Petr Mladek <pmladek@suse.com>
To: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Song Liu <songliubraving@fb.com>, Rik van Riel <riel@fb.com>,
	"song@kernel.org" <song@kernel.org>,
	"joe.lawrence@redhat.com" <joe.lawrence@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	"live-patching@vger.kernel.org" <live-patching@vger.kernel.org>,
	Kernel Team <Kernel-team@fb.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"jpoimboe@redhat.com" <jpoimboe@redhat.com>
Subject: Re: [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched
Date: Wed, 11 May 2022 11:24:33 +0200	[thread overview]
Message-ID: <20220511092433.GA26047@pathway.suse.cz> (raw)
In-Reply-To: <20220511003331.clfvwfgpmbr5yx6n@treble>

On Tue 2022-05-10 17:33:31, Josh Poimboeuf wrote:
> On Tue, May 10, 2022 at 11:57:04PM +0000, Song Liu wrote:
> > > If it's a real bug, we should fix it everywhere, not just for Facebook.
> > > Otherwise CONFIG_PREEMPT and/or non-x86 arches become second-class
> > > citizens.
> > 
> > I think "is it a real bug?" is the top question for me. So maybe we 
> > should take a step back.
> > 
> > The behavior we see is: A busy kernel thread blocks klp transition 
> > for more than a minute. But the transition eventually succeeded after 
> > < 10 retries on most systems. The kernel thread is well-behaved, as 
> > it calls cond_resched() at a reasonable frequency, so this is not a 
> > deadlock. 
> > 
> > If I understand Petr correctly, this behavior is expected, and thus 
> > is not a bug or issue for the livepatch subsystem. This is different
> > to our original expectation, but if this is what we agree on, we 
> > will look into ways to incorporate long wait time for patch 
> > transition in our automations. 
> 
> That's how we've traditionally looked at it, though apparently Red Hat
> and SUSE have implemented different ideas of what a long wait time is.
> 
> In practice, one minute has always been enough for all of kpatch's users
> -- AFAIK, everybody except SUSE -- up until now.

I am actually surprised that nobody has hit this problem before. There
are "only" 60 attempts to transition the pending tasks.

Well, the problem mainly affects kthreads. User space processes are
also migrated at the kernel boundary, and the fake signal is likely
pretty effective for them. And it is probably not that common for
a kthread to occupy a single CPU all the time.


> Though, these options might be considered workarounds, as it's
> theoretically possible for a kthread to be CPU-bound indefinitely,
> beyond any arbitrarily chosen timeout.  But maybe that's not realistic
> beyond a certain timeout value of X and we don't care?  I dunno.

I agree that it might happen in theory, and it would be great to be
prepared for it. My only concern is the complexity and risk; we should
be sure it is worth it.


> As I have been trying to say, that won't work for PREEMPT+!ORC, because,
> when the kthread gets preempted, the stack trace will be attempted from
> an IRQ and will be reported as unreliable.

This limits the range of possible solutions quite a lot. But it is
what it is.

> Ideally we'd have the ORC unwinder for all arches, that would make this
> much easier.  But we're not there yet.

The alternative is that the task has to migrate itself at some safe
location.

One crazy idea: it still might be possible to find the called
functions on the stack even when the stack trace is not reliable.
Then another ftrace handler could be installed on these found
functions, and this handler could migrate the task when it calls
one of these functions again.

This assumes that the task will call the same functions again
and again. It might also require that the task checks its own
stack from the ftrace handler; I am not sure whether this is
possible.

There might be other variants of this approach.

Best Regards,
Petr


Thread overview: 45+ messages
2022-05-07 17:46 [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched Song Liu
2022-05-07 18:26 ` Rik van Riel
2022-05-07 19:04   ` Song Liu
2022-05-07 19:18     ` Rik van Riel
2022-05-08 20:41       ` Peter Zijlstra
2022-05-09  1:07         ` Rik van Riel
2022-05-09  7:04 ` Peter Zijlstra
2022-05-09  8:06   ` Song Liu
2022-05-09  9:38     ` Peter Zijlstra
2022-05-09 14:13       ` Rik van Riel
2022-05-09 15:22         ` Petr Mladek
2022-05-09 15:07 ` Petr Mladek
2022-05-09 16:22   ` Song Liu
2022-05-10  7:56     ` Petr Mladek
2022-05-10 13:33       ` Rik van Riel
2022-05-10 15:44         ` Petr Mladek
2022-05-10 16:07           ` Rik van Riel
2022-05-10 16:52             ` Josh Poimboeuf
2022-05-10 18:07               ` Rik van Riel
2022-05-10 18:42                 ` Josh Poimboeuf
2022-05-10 19:45                   ` Song Liu
2022-05-10 23:04                     ` Josh Poimboeuf
2022-05-10 23:57                       ` Song Liu
2022-05-11  0:33                         ` Josh Poimboeuf
2022-05-11  9:24                           ` Petr Mladek [this message]
2022-05-11 16:33                             ` Song Liu
2022-05-12  4:07                               ` Josh Poimboeuf
2022-05-13 12:33                               ` Petr Mladek
2022-05-13 13:34                                 ` Peter Zijlstra
2022-05-11  0:35                         ` Rik van Riel
2022-05-11  0:37                           ` Josh Poimboeuf
2022-05-11  0:46                             ` Rik van Riel
2022-05-11  1:12                               ` Josh Poimboeuf
2022-05-11 18:09                                 ` Rik van Riel
2022-05-12  3:59                                   ` Josh Poimboeuf
2022-05-09 15:52 ` [RFC] sched,livepatch: call stop_one_cpu in klp_check_and_switch_task Rik van Riel
2022-05-09 16:28   ` Song Liu
2022-05-09 18:00   ` Josh Poimboeuf
2022-05-09 19:10     ` Rik van Riel
2022-05-09 19:17       ` Josh Poimboeuf
2022-05-09 19:49         ` Rik van Riel
2022-05-09 20:09           ` Josh Poimboeuf
2022-05-10  0:32             ` Song Liu
2022-05-10  9:35               ` Peter Zijlstra
2022-05-10  1:48             ` Rik van Riel
