From: "Paul E. McKenney" <paulmck@kernel.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: syzbot <syzbot+dde0cc33951735441301@syzkaller.appspotmail.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
syzkaller-bugs@googlegroups.com, viro@zeniv.linux.org.uk,
netdev@vger.kernel.org, tglx@linutronix.de, peterz@infradead.org,
frederic@kernel.org
Subject: Re: Something is leaking RCU holds from interrupt context
Date: Sun, 4 Apr 2021 14:36:05 -0700 [thread overview]
Message-ID: <20210404213605.GA2696@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <20210404182453.GT351017@casper.infradead.org>
On Sun, Apr 04, 2021 at 07:24:53PM +0100, Matthew Wilcox wrote:
> On Sun, Apr 04, 2021 at 09:48:08AM -0700, Paul E. McKenney wrote:
> > On Sun, Apr 04, 2021 at 11:24:57AM +0100, Matthew Wilcox wrote:
> > > On Sat, Apr 03, 2021 at 09:15:17PM -0700, syzbot wrote:
> > > > HEAD commit: 2bb25b3a Merge tag 'mips-fixes_5.12_3' of git://git.kernel..
> > > > git tree: upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1284cc31d00000
> > > > kernel config: https://syzkaller.appspot.com/x/.config?x=78ef1d159159890
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=dde0cc33951735441301
> > > >
> > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+dde0cc33951735441301@syzkaller.appspotmail.com
> > > >
> > > > WARNING: suspicious RCU usage
> > > > 5.12.0-rc5-syzkaller #0 Not tainted
> > > > -----------------------------
> > > > kernel/sched/core.c:8294 Illegal context switch in RCU-bh read-side critical section!
> > > >
> > > > other info that might help us debug this:
> > > >
> > > >
> > > > rcu_scheduler_active = 2, debug_locks = 0
> > > > no locks held by systemd-udevd/4825.
> > >
> > > I think we have something that's taking the RCU read lock in
> > > (soft?) interrupt context and not releasing it properly in all
> > > situations. This thread doesn't have any locks recorded, but
> > > lock_is_held(&rcu_bh_lock_map) is true.
> > >
> > > Is there some debugging code that could find this? eg should
> > > lockdep_softirq_end() check that rcu_bh_lock_map is not held?
> > > (if it's taken in process context, then BHs can't run, so if it's
> > > held at softirq exit, then there's definitely a problem).
> >
> > Something like the (untested) patch below?
>
> Maybe? Will this tell us who took the lock? I was really trying to
> throw out a suggestion in the hope that somebody who knows this area
> better than I do would tell me I was wrong.
No idea. If it is reproducible, hopefully someone will try it. If it
is not reproducible, so it goes!
And hey, it is not my fault that you sounded like you knew what you were
talking about! ;-)
But yes, now that you mention it, it is odd that rcu_sleep_check()
thought that rcu_bh_lock_map was held, but lockdep_rcu_suspicious()
does not. One clue might be that rcu_sleep_check() is looking at
rcu_bh_lock_map() itself, while lockdep_rcu_suspicious() and its callee,
lockdep_print_held_locks(), rely on current->lockdep_depth.
Maybe these have gotten out of sync.
> > Please note that it does not make sense to also check for
> > either rcu_lock_map or rcu_sched_lock_map because either of
> > these might be held by the interrupted code.
>
> Yes! Although if we do it somewhere like tasklet_action_common(),
> we could do something like:
>
> +++ b/kernel/softirq.c
> @@ -774,6 +774,7 @@ static void tasklet_action_common(struct softirq_action *a,
>
> while (list) {
> struct tasklet_struct *t = list;
> + unsigned long rcu_lockdep = rcu_get_lockdep_state();
>
> list = list->next;
>
> @@ -790,6 +791,10 @@ static void tasklet_action_common(struct softirq_action *a,
> }
> tasklet_unlock(t);
> }
> + if (rcu_lockdep != rcu_get_lockdep_state()) {
> + printk(something useful about t);
> + RCU_LOCKDEP_WARN(... something else useful ...);
> + }
>
> local_irq_disable();
>
> where rcu_get_lockdep_state() returns a bitmap of whether the four rcu
> lockdep maps are held.
>
> We might also need something similar in __do_softirq(), in case it's
> not a tasklet that's the problem.
The rcu_get_lockdep_state() function would just set bits based on RCU's
various lockdep maps, but the comparison would need to take at least
debug_lockdep_rcu_enabled() into account. Just in case there is a
lockdep report between the sampling and comparison.
But first let's see what the lockdep experts have to say.
Thanx, Paul
next prev parent reply other threads:[~2021-04-04 21:36 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-04 4:15 [syzbot] WARNING: suspicious RCU usage in getname_flags syzbot
2021-04-04 10:24 ` Something is leaking RCU holds from interrupt context Matthew Wilcox
2021-04-04 16:48 ` Paul E. McKenney
2021-04-04 18:24 ` Matthew Wilcox
2021-04-04 21:36 ` Paul E. McKenney [this message]
2021-04-06 8:33 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210404213605.GA2696@paulmck-ThinkPad-P72 \
--to=paulmck@kernel.org \
--cc=frederic@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=syzbot+dde0cc33951735441301@syzkaller.appspotmail.com \
--cc=syzkaller-bugs@googlegroups.com \
--cc=tglx@linutronix.de \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).