public inbox for linux-kernel@vger.kernel.org
From: Petr Mladek <pmladek@suse.com>
To: Song Liu <songliubraving@fb.com>
Cc: Song Liu <song@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"live-patching@vger.kernel.org" <live-patching@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	"joe.lawrence@redhat.com" <joe.lawrence@redhat.com>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched
Date: Tue, 10 May 2022 09:56:49 +0200	[thread overview]
Message-ID: <YnoawYtoCSvrK7lb@alley> (raw)
In-Reply-To: <9C7DF147-5112-42E7-9F7C-7159EFDFB766@fb.com>

On Mon 2022-05-09 16:22:11, Song Liu wrote:
> 
> 
> > On May 9, 2022, at 8:07 AM, Petr Mladek <pmladek@suse.com> wrote:
> > 
> > On Sat 2022-05-07 10:46:28, Song Liu wrote:
> >> Busy kernel threads may block the transition of livepatch. Call
> >> klp_try_switch_task from __cond_resched to make the transition easier.
> > 
> > Do you have some numbers on how much this speeds up the transition
> > and how much it slows down the scheduler, please?
> 
> We don’t have numbers on how much this would slow down the scheduler.
> For the transition, we see cases where the transition cannot finish
> within 60 seconds (which is how long "kpatch load" waits by default).

60s might be too low a limit, see below.

> > cond_resched() is typically called in loops with many iterations
> > where the task might spend a lot of time. There are two possibilities.
> > cond_resched() is called in:
> > 
> >   + livepatched function
> > 
> >     In this case, klp_try_switch_task(current) will always fail.
> >     And it will unnecessarily slow down every iteration by
> >     checking the very same stack.
> > 
> > 
> >   + non-livepatched function
> > 
> >     In this case, the transition will succeed on the first attempt.
> >     OK, but it would succeed also without that patch. The task would
> >     most likely sleep in this cond_resched() so that it might
> >     be successfully transitioned on the next occasion.
> 
> We are in the non-livepatched case. But the transition didn’t happen
> in time, because the kernel thread doesn’t go to sleep. While there is
> clearly something weird with this thread, we think live patching should
> still work because the thread does call cond_resched() from time to time.

I guess that it does go to sleep. Otherwise it would trigger a
soft-lockup report if you have the watchdog enabled.

IMHO, the problem is that klp_transition_work_fn() tries the
transition "only" once per second, see

void klp_try_complete_transition(void)
{
[...]
		schedule_delayed_work(&klp_transition_work,
				      round_jiffies_relative(HZ));
[...]
}

It means that there are "only" 60 attempts to migrate the busy process.
An attempt fails when the process is in the running state or sleeping
in a livepatched function. There is a _non-zero_ chance of bad luck.

It would be great to measure how long it takes to complete
the transition if you remove the 60s limit.


Anyway, the 60s limit looks like a bad idea to me. It is too low.
For example, we do not use any limit at all in SUSE products.
And the only report was that some thread from a 3rd party
module could not be migrated. It was stuck with a livepatched
function on the stack. The kthread had really problematic
design. I am afraid that even this patch would not help
in that case.


Now, back to the limit. There are basically two problems when
the transition takes too long:

    + It blocks another transition. But the frequency of new
      livepatches is typically counted in days or even weeks.


    + It means that a process keeps running the buggy/vulnerable
      code for a longer time. But a few hours should be acceptable
      given how long it takes to prepare the livepatch.

      Also it looks better to have 99.9% of processes running
      the fixed code than to revert the fix just because a single
      process needs a longer time to get transitioned.


I could imagine that there really is a process that is almost
impossible to livepatch. It might get worse on a NO_HZ system.
The question is whether this happens in real life.

I would personally start with prolonging or removing the limit.

Best Regards,
Petr
