From: Chen Yu <yu.c.chen@intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Oliver Sang <oliver.sang@intel.com>, <oe-lkp@lists.linux.dev>,
<lkp@intel.com>, <linux-kernel@vger.kernel.org>,
<aubrey.li@linux.intel.com>
Subject: Re: [peterz-queue:sched/core] [sched/fair] 420356c350: WARNING:at_kernel/sched/core.c:#__might_sleep
Date: Mon, 26 Aug 2024 16:25:56 +0800
Message-ID: <Zsw8FEPMHFe4yoaA@chenyu5-mobl2>
In-Reply-To: <20240822154923.GB17097@noisy.programming.kicks-ass.net>
On 2024-08-22 at 17:49:23 +0200, Peter Zijlstra wrote:
> On Mon, Aug 19, 2024 at 12:44:39PM +0800, Chen Yu wrote:
> > On 2024-08-17 at 11:33:29 +0200, Peter Zijlstra wrote:
> > > On Fri, Aug 16, 2024 at 05:15:12PM +0800, kernel test robot wrote:
> > > > kernel test robot noticed "WARNING:at_kernel/sched/core.c:#__might_sleep" on:
> > > >
> > > > commit: 420356c3504091f0f6021974389df7c58f365dad ("sched/fair: Implement delayed dequeue")
> > > > https://git.kernel.org/cgit/linux/kernel/git/peterz/queue.git sched/core
> > >
> > > > [ 86.252370][ T674] ------------[ cut here ]------------
> > > > [ 86.252945][ T674] do not call blocking ops when !TASK_RUNNING; state=1 set at kthread_worker_fn (kernel/kthread.c:?)
> > > > [ 86.254001][ T674] WARNING: CPU: 1 PID: 674 at kernel/sched/core.c:8469 __might_sleep (kernel/sched/core.c:8465)
> > >
> > > > [ 86.283398][ T674] ? handle_bug (arch/x86/kernel/traps.c:239)
> > > > [ 86.283995][ T674] ? exc_invalid_op (arch/x86/kernel/traps.c:260)
> > > > [ 86.284787][ T674] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
> > > > [ 86.285682][ T674] ? __might_sleep (kernel/sched/core.c:8465)
> > > > [ 86.286380][ T674] ? __might_sleep (kernel/sched/core.c:8465)
> > > > [ 86.287116][ T674] kthread_worker_fn (include/linux/kernel.h:73 include/linux/freezer.h:53 kernel/kthread.c:851)
> > > > [ 86.287701][ T674] ? kthread_worker_fn (kernel/kthread.c:?)
> > > > [ 86.288138][ T674] kthread (kernel/kthread.c:391)
> > > > [ 86.288482][ T674] ? __cfi_kthread_worker_fn (kernel/kthread.c:803)
> > > > [ 86.288951][ T674] ? __cfi_kthread (kernel/kthread.c:342)
> > > > [ 86.289560][ T674] ret_from_fork (arch/x86/kernel/process.c:153)
> > > > [ 86.290162][ T674] ? __cfi_kthread (kernel/kthread.c:342)
> > > > [ 86.291465][ T674] ret_from_fork_asm (arch/x86/entry/entry_64.S:254)
> > >
> > > AFAICT this is a pre-existing issue. Notably that all transcribes to:
> > >
> > > kthread_worker_fn()
> > > ...
> > > repeat:
> > > set_current_state(TASK_INTERRUPTIBLE);
> > > ...
> > > if (work) { // false
> > > __set_current_state(TASK_RUNNING);
> > > ...
> > > } else if (!freezing(current)) // false -- we are freezing
> > > schedule();
> > >
> > > // so state really is still TASK_INTERRUPTIBLE here
> > > try_to_freeze()
> > > might_sleep() <--- boom, per the above.
> > >
> >
> > Would the following fix make sense?
>
> Yeah, that looks fine. Could you write it up as a proper patch please?
>
Yes, in theory it should be a race condition, and I've sent a patch here:
https://lore.kernel.org/lkml/20240819141551.111610-1-yu.c.chen@intel.com/
Andrew has given some comments on it.
However, after some further investigation, this warning does not seem to
be directly related to task freezing; it appears to be connected to the
delayed dequeue. I'm planning to add a debug patch and investigate the
symptom in 0day's environment, and will send the findings later.
thanks,
Chenyu
> >
> > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > index f7be976ff88a..09850b2109c9 100644
> > --- a/kernel/kthread.c
> > +++ b/kernel/kthread.c
> > @@ -848,6 +848,12 @@ int kthread_worker_fn(void *worker_ptr)
> > } else if (!freezing(current))
> > schedule();
> >
> > + /*
> > + * Explicitly set the running state in case we are being frozen
> > + * and skip the schedule() above. try_to_freeze() expects the
> > + * current task to be in running state.
> > + */
> > + __set_current_state(TASK_RUNNING);
> > try_to_freeze();
> > cond_resched();
> > goto repeat;
> > --
> > 2.25.1
> >
> > Hi Oliver,
> > Could you please help check if above change would make the warning go away?
> >
> > thanks,
> > Chenyu
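
For reference, the freeze path Peter traced above can be modelled in
userspace. This is only a minimal sketch under stated assumptions: the
enum, state_at_try_to_freeze() and might_sleep_would_warn() are
hypothetical names, not kernel APIs, and schedule() is reduced to "the
task is TASK_RUNNING again once it has been woken". It covers only the
freezing path, not the delayed-dequeue interaction that is still being
investigated.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the kernel task states. The values mirror
 * TASK_RUNNING (0) and TASK_INTERRUPTIBLE (1), matching "state=1" in the
 * warning above.
 */
enum model_state {
	MODEL_TASK_RUNNING = 0,
	MODEL_TASK_INTERRUPTIBLE = 1,
};

/*
 * Model one pass through the kthread_worker_fn() loop and return the
 * state the task is in when it reaches try_to_freeze().
 *
 * has_work:  a work item was found on the list
 * freezing:  freezing(current) would return true
 * apply_fix: model the proposed __set_current_state(TASK_RUNNING)
 */
static enum model_state state_at_try_to_freeze(bool has_work, bool freezing,
					       bool apply_fix)
{
	/* set_current_state(TASK_INTERRUPTIBLE) at the top of the loop */
	enum model_state state = MODEL_TASK_INTERRUPTIBLE;

	if (has_work) {
		/* __set_current_state(TASK_RUNNING) before running the work */
		state = MODEL_TASK_RUNNING;
	} else if (!freezing) {
		/* schedule(): assumed to return with the task running again */
		state = MODEL_TASK_RUNNING;
	}
	/* no work and freezing: neither branch touches the state */

	if (apply_fix)
		state = MODEL_TASK_RUNNING;

	return state;
}

/* __might_sleep() complains when the task is not in the running state. */
static bool might_sleep_would_warn(enum model_state state)
{
	return state != MODEL_TASK_RUNNING;
}
```

In this model, state_at_try_to_freeze(false, true, false) returns
MODEL_TASK_INTERRUPTIBLE, so might_sleep_would_warn() is true; with
apply_fix set the state is MODEL_TASK_RUNNING and the check passes.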