All of lore.kernel.org
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: Song Liu <songliubraving@fb.com>
Cc: Song Liu <song@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"live-patching@vger.kernel.org" <live-patching@vger.kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	"joe.lawrence@redhat.com" <joe.lawrence@redhat.com>,
	Kernel Team <Kernel-team@fb.com>
Subject: Re: [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched
Date: Tue, 10 May 2022 09:56:49 +0200	[thread overview]
Message-ID: <YnoawYtoCSvrK7lb@alley> (raw)
In-Reply-To: <9C7DF147-5112-42E7-9F7C-7159EFDFB766@fb.com>

On Mon 2022-05-09 16:22:11, Song Liu wrote:
> 
> 
> > On May 9, 2022, at 8:07 AM, Petr Mladek <pmladek@suse.com> wrote:
> > 
> > On Sat 2022-05-07 10:46:28, Song Liu wrote:
> >> Busy kernel threads may block the transition of livepatch. Call
> >> klp_try_switch_task from __cond_resched to make the transition easier.
> > 
> > Do you have some numbers how this speeds up the transition
> > and how it slows down the scheduler, please?
> 
> We don’t have number on how much this would slow down the scheduler. 
> For the transition, we see cases where the transition cannot finish
> with in 60 seconds (how much "kpatch load" waits by default). 

60s might be too low limit, see below.

> > cond_resched() is typically called in cycles with many interactions
> > where the task might spend a lot of time. There are two possibilities.
> > cond_resched() is called in:
> > 
> >   + livepatched function
> > 
> >     In this case, klp_try_switch_task(current) will always fail.
> >     And it will non-necessarily slow down every iteration by
> >     checking the very same stack.
> > 
> > 
> >   + non-livepatched function
> > 
> >     In this case, the transition will succeed on the first attempt.
> >     OK, but it would succeed also without that patch. The task would
> >     most likely sleep in this cond_resched() so that it might
> >     be successfully transitioned on the next occasion.
> 
> We are in the non-live patched case. But the transition didn’t happen
> in time, because the kernel thread doesn’t go to sleep. While there is
> clearly something weird with this thread, we think live patch should 
> work because the thread does call cond_resched from time to time. 

I guess that it goes to sleep. Otherwise it would trigger soft lockup
report if you have the watchdog enabled.

IMHO, the problem is that klp_transition_work_fn() tries the
transition "only" once per second, see

void klp_try_complete_transition(void)
{
[...]
		schedule_delayed_work(&klp_transition_work,
				      round_jiffies_relative(HZ));
[...]
}

It means that there are "only" 60 attempts to migrate the busy process.
It fails when the process is in the running state or sleeping in a
livepatched function. There is a _non-zero_ chance of a bad luck.

It would be great to measure how long it will take to complete
the transition if you remove the limit 60s.


Anyway, the limit 60s looks like a bad idea to me. It is too low.
For example, we do not use any limit at all in SUSE products.
And the only report was that some thread from a 3rd party
module could not be migrated. It was stuck with a livepatched
function on the stack. The kthread had really problematic
design. I am afraid that even this patch would not help
in that case.


Now, back to the limit. There are basically two problems when
the transition takes too long:

    + It blocks another transition. But the frequency of new
      livepatches it typically counted in days or even weeks.


    + It means that a process is running the buggy/vulnerable
      code longer time. But few hours should be acceptable
      given how long it takes to prepare the livepatch.

      Also it looks better to have 99.9% of processes running
      the fixed code that revert the fix just because a single
      process needs longer time to get transitioned.


I could imagine that there really is a process that is almost
impossible to livepatch. It might get worse on NO_HZ system.
The question is it happens in the real life.

I would personally start with prolonging or removing the limit.

Best Regards,
Petr

  reply	other threads:[~2022-05-10  7:57 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-07 17:46 [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched Song Liu
2022-05-07 18:26 ` Rik van Riel
2022-05-07 19:04   ` Song Liu
2022-05-07 19:18     ` Rik van Riel
2022-05-08 20:41       ` Peter Zijlstra
2022-05-09  1:07         ` Rik van Riel
2022-05-09  7:04 ` Peter Zijlstra
2022-05-09  8:06   ` Song Liu
2022-05-09  9:38     ` Peter Zijlstra
2022-05-09 14:13       ` Rik van Riel
2022-05-09 15:22         ` Petr Mladek
2022-05-09 15:07 ` Petr Mladek
2022-05-09 16:22   ` Song Liu
2022-05-10  7:56     ` Petr Mladek [this message]
2022-05-10 13:33       ` Rik van Riel
2022-05-10 15:44         ` Petr Mladek
2022-05-10 16:07           ` Rik van Riel
2022-05-10 16:52             ` Josh Poimboeuf
2022-05-10 18:07               ` Rik van Riel
2022-05-10 18:42                 ` Josh Poimboeuf
2022-05-10 19:45                   ` Song Liu
2022-05-10 23:04                     ` Josh Poimboeuf
2022-05-10 23:57                       ` Song Liu
2022-05-11  0:33                         ` Josh Poimboeuf
2022-05-11  9:24                           ` Petr Mladek
2022-05-11 16:33                             ` Song Liu
2022-05-12  4:07                               ` Josh Poimboeuf
2022-05-13 12:33                               ` Petr Mladek
2022-05-13 13:34                                 ` Peter Zijlstra
2022-05-11  0:35                         ` Rik van Riel
2022-05-11  0:37                           ` Josh Poimboeuf
2022-05-11  0:46                             ` Rik van Riel
2022-05-11  1:12                               ` Josh Poimboeuf
2022-05-11 18:09                                 ` Rik van Riel
2022-05-12  3:59                                   ` Josh Poimboeuf
2022-05-09 15:52 ` [RFC] sched,livepatch: call stop_one_cpu in klp_check_and_switch_task Rik van Riel
2022-05-09 16:28   ` Song Liu
2022-05-09 18:00   ` Josh Poimboeuf
2022-05-09 19:10     ` Rik van Riel
2022-05-09 19:17       ` Josh Poimboeuf
2022-05-09 19:49         ` Rik van Riel
2022-05-09 20:09           ` Josh Poimboeuf
2022-05-10  0:32             ` Song Liu
2022-05-10  9:35               ` Peter Zijlstra
2022-05-10  1:48             ` Rik van Riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YnoawYtoCSvrK7lb@alley \
    --to=pmladek@suse.com \
    --cc=Kernel-team@fb.com \
    --cc=joe.lawrence@redhat.com \
    --cc=jpoimboe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=live-patching@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=song@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.