From: Petr Mladek <pmladek@suse.com>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: jpoimboe@kernel.org, jikos@kernel.org, mbenes@suse.cz,
joe.lawrence@redhat.com, live-patching@vger.kernel.org
Subject: Re: Find root of the stall: was: Re: [PATCH 2/3] livepatch: Avoid blocking tasklist_lock too long
Date: Fri, 14 Feb 2025 10:46:23 +0100
Message-ID: <Z68Q2j3yCB8N0n1n@pathway.suse.cz>
In-Reply-To: <CALOAHbDEcUieW=AcBYHF1BUfQoAi540BNPEP5XR3CApu=3vMNQ@mail.gmail.com>

On Thu 2025-02-13 20:32:19, Yafang Shao wrote:
> On Thu, Feb 13, 2025 at 7:19 PM Petr Mladek <pmladek@suse.com> wrote:
> >
> > On Tue 2025-02-11 14:24:36, Yafang Shao wrote:
> > > I encountered a hard lockup when attempting to reproduce the panic issue
> > > occurred on our production servers [0]. The hard lockup is as follows,
> > >
> > > [15852778.150191] livepatch: klp_try_switch_task: grpc_executor:421106 is sleeping on function do_exit
> > > [15852778.169471] livepatch: klp_try_switch_task: grpc_executor:421244 is sleeping on function do_exit
> > > [15852778.188746] livepatch: klp_try_switch_task: grpc_executor:421457 is sleeping on function do_exit
> > > [15852778.208021] livepatch: klp_try_switch_task: grpc_executor:422407 is sleeping on function do_exit
> > > [15852778.227292] livepatch: klp_try_switch_task: grpc_executor:423184 is sleeping on function do_exit
> > > [15852778.246576] livepatch: klp_try_switch_task: grpc_executor:423582 is sleeping on function do_exit
> > > [15852778.265863] livepatch: klp_try_switch_task: grpc_executor:423738 is sleeping on function do_exit
> > > [15852778.285149] livepatch: klp_try_switch_task: grpc_executor:423739 is sleeping on function do_exit
> > > [15852778.304446] livepatch: klp_try_switch_task: grpc_executor:423833 is sleeping on function do_exit
> > > [15852778.323738] livepatch: klp_try_switch_task: grpc_executor:423893 is sleeping on function do_exit
> > > [15852778.343017] livepatch: klp_try_switch_task: grpc_executor:423894 is sleeping on function do_exit
> > > [15852778.362292] livepatch: klp_try_switch_task: grpc_executor:423976 is sleeping on function do_exit
> > > [15852778.381565] livepatch: klp_try_switch_task: grpc_executor:423977 is sleeping on function do_exit
> > > [15852778.400847] livepatch: klp_try_switch_task: grpc_executor:424610 is sleeping on function do_exit
> >
> > This message does not exist in vanilla kernel. It looks like an extra
> > debug message. And many extra messages might create stalls on its own.
>
> Right, the dynamic_debug is enabled:
>
> $ echo 'file kernel/* +p' > /sys/kernel/debug/dynamic_debug/control
>
> >
> > AFAIK, your reproduced the problem even without these extra messages.
>
> There are two issues during the KLP transition:
> 1. RCU warnings
> 2. hard lockup
>
> RCU stalls can be easily reproduced without the extra messages.
> However, hard lockups cannot be reproduced unless dynamic debugging is
> enabled.

OK, I would ignore the hard lockup for now. I believe that it is
related to flushing the debug messages on the console. A solution
for the RCU stall would likely solve it as well. Also, the debug
messages are not enabled on production systems...

We should debug the RCU warning without these debug messages!
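
Something like the following should turn the messages off again before
the next attempt to reproduce the RCU stall. This is only a sketch. It
assumes that the messages were enabled with the 'file kernel/* +p' query
quoted above and that debugfs is mounted at /sys/kernel/debug. The
variable names are just for illustration:

```shell
# Assumption: debugfs is mounted at /sys/kernel/debug and the messages were
# enabled earlier with the query 'file kernel/* +p'.
ctl=/sys/kernel/debug/dynamic_debug/control

# '-p' removes the print flag from every matching callsite.
query='file kernel/* -p'

if [ -w "$ctl" ]; then
    echo "$query" > "$ctl"
else
    # Needs root and a kernel built with CONFIG_DYNAMIC_DEBUG.
    echo "cannot write $ctl" >&2
fi
```

Reading the control file afterwards shows which callsites, if any, are
still enabled.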

Best Regards,
Petr