From: Petr Mladek <pmladek@suse.com>
To: Song Liu <songliubraving@fb.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>, Rik van Riel <riel@fb.com>,
"song@kernel.org" <song@kernel.org>,
"joe.lawrence@redhat.com" <joe.lawrence@redhat.com>,
"peterz@infradead.org" <peterz@infradead.org>,
"mingo@redhat.com" <mingo@redhat.com>,
"vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
"live-patching@vger.kernel.org" <live-patching@vger.kernel.org>,
Kernel Team <Kernel-team@fb.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"jpoimboe@redhat.com" <jpoimboe@redhat.com>
Subject: Re: [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched
Date: Fri, 13 May 2022 14:33:34 +0200
Message-ID: <Yn5QHpc9YlAbP1li@alley>
In-Reply-To: <78DFED12-571B-489C-A662-DA333555266B@fb.com>

On Wed 2022-05-11 16:33:57, Song Liu wrote:
>
>
> > On May 11, 2022, at 2:24 AM, Petr Mladek <pmladek@suse.com> wrote:
> >
> > On Tue 2022-05-10 17:33:31, Josh Poimboeuf wrote:
> >> On Tue, May 10, 2022 at 11:57:04PM +0000, Song Liu wrote:
> >>>> If it's a real bug, we should fix it everywhere, not just for Facebook.
> >>>> Otherwise CONFIG_PREEMPT and/or non-x86 arches become second-class
> >>>> citizens.
> >>>
> >>> I think "is it a real bug?" is the top question for me. So maybe we
> >>> should take a step back.
> >>>
> >>> The behavior we see is: A busy kernel thread blocks klp transition
> >>> for more than a minute. But the transition eventually succeeded after
> >>> < 10 retries on most systems. The kernel thread is well-behaved, as
> >>> it calls cond_resched() at a reasonable frequency, so this is not a
> >>> deadlock.
> >>>
> >>> If I understand Petr correctly, this behavior is expected, and thus
> >>> is not a bug or issue for the livepatch subsystem. This is different
> >>> to our original expectation, but if this is what we agree on, we
> >>> will look into ways to incorporate long wait time for patch
> >>> transition in our automations.
> >>
> >> That's how we've traditionally looked at it, though apparently Red Hat
> >> and SUSE have implemented different ideas of what a long wait time is.
> >>
> >> In practice, one minute has always been enough for all of kpatch's users
> >> -- AFAIK, everybody except SUSE -- up until now.
> >
> > I am actually surprised that nobody met the problem yet. There are
> > "only" 60 attempts to transition the pending tasks.
>
> Maybe we should consider increase the frequency we try? Say to 10 times
> per second? I guess this will solve most of the failures we are seeing
> in current case.

My concern is that klp_try_complete_transition() checks all processes
under read_lock(&tasklist_lock), so retrying more often might increase
contention on this lock. I am not sure whether this lock is fair. If it
is not, frequent readers might slow down or block writers (task
creation and deletion).

Best Regards,
Petr
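
To put rough numbers on the trade-off discussed above, here is a minimal
back-of-the-envelope sketch in userspace C. It is not kernel code. The
helper names and the per-pass scan time are hypothetical; only the
60 attempts per minute and the proposed 10 attempts per second come from
the thread. It assumes every pass holds tasklist_lock for reading for the
same duration.

```c
#include <assert.h>

#define USEC_PER_SEC 1000000ULL

/* Number of transition attempts made within a waiting window. */
static unsigned long long attempts_in_window(unsigned long long window_sec,
					     unsigned long long tries_per_sec)
{
	return window_sec * tries_per_sec;
}

/*
 * Microseconds per second during which the retry loop holds the lock
 * for reading, ASSUMING each pass over the task list takes scan_us.
 * scan_us is a made-up input; measure it on a real system.
 * The result is capped at one full second.
 */
static unsigned long long read_hold_us_per_sec(unsigned long long scan_us,
					       unsigned long long tries_per_sec)
{
	unsigned long long held;

	assert(tries_per_sec > 0);

	held = scan_us * tries_per_sec;
	return held > USEC_PER_SEC ? USEC_PER_SEC : held;
}
```

With an assumed scan time of 2000 us per pass, going from 1 to 10
attempts per second raises the read-held share of each second from 0.2%
to 2%, while the attempts made within one minute grow from 60 to 600.
Whether that share matters depends on how writers are treated while
readers keep arriving, which is the open fairness question.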