From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Paul Bolle <pebolle@tiscali.nl>
Cc: Vivek Goyal <vgoyal@redhat.com>, Jens Axboe <jaxboe@fusionio.com>,
linux kernel mailing list <linux-kernel@vger.kernel.org>
Subject: Re: Mysterious CFQ crash and RCU
Date: Sat, 21 May 2011 16:54:08 -0700
Message-ID: <20110521235408.GK2271@linux.vnet.ibm.com>
In-Reply-To: <1306016630.2066.44.camel@x61.thuisdomein>

On Sun, May 22, 2011 at 12:23:50AM +0200, Paul Bolle wrote:
> Paul,
>
> On Sat, 2011-05-21 at 14:00 -0700, Paul E. McKenney wrote:
> > On Thu, May 19, 2011 at 06:24:04PM -0400, Vivek Goyal wrote:
> > It does look like a tough one!
>
> Thank you!
>
> > > Is it possible? We have looked at the code many times and we think
> > > that the RCU locking around it is fine. Is it possible that a call_rcu()
> > > callback can fire before the RCU grace period is over?
> >
> > If it does, that would be a bug in RCU.
> >
> > > I had put a debug patch in CFQ (details are in bugzilla) and I can
> > > see that after decoupling the object from the hash list, it got
> > > freed while we were still under rcu_read_lock().
> > >
> > > Is there any known issue, or is there any quick tip on how I can
> > > go about debugging it further from an RCU point of view?
> >
> > First for uses of RCU:
> >
> > o One thing to try would be CONFIG_PROVE_RCU, which could help
> > find missing rcu_read_lock()s and similar. Some years back, it
> > used to be the case that spin_lock() implied rcu_read_lock(),
> > but it no longer does. There might still be some cases where
> > spin_lock() needs to have an rcu_read_lock() added.
> >
> > o There are a few entries in the bugzilla mentioning that elements
> > are being removed more often than expected. There is a config
> > option CONFIG_DEBUG_OBJECTS_RCU_HEAD that complains if the same
> > object is passed to call_rcu() before the grace period ends for
> > the first round.
> >
> > o Try switching between CONFIG_TREE_RCU and CONFIG_TREE_PREEMPT_RCU.
> > These two settings are each sensitive to different forms of abuse.
> > For example, if you have CONFIG_PREEMPT=n and CONFIG_TREE_RCU=y,
> > illegally placing a synchronize_rcu() -- or anything else that
> > blocks -- in an RCU read-side critical section will silently
> > partition that RCU read-side critical section. In contrast,
> > CONFIG_TREE_PREEMPT_RCU=y will complain about this.
> >
> > Second, for RCU itself, CONFIG_RCU_TRACE enables counter-based tracing
> > in RCU. Sampling each of the files in the debugfs directory "rcu"
> > before and after the badness (if possible) could help me see if anything
> > untoward is happening.
>
> Before we go down that route, I'd like to note that I seem to be unable
> to reproduce this Oops under v2.6.39 (either using the first v2.6.39 rpm
> for i686 shipped for Fedora Rawhide, or two versions of that rpm I built
> locally).
>
> Is anyone able to spot one or more commits in v2.6.39-rc7..v2.6.39 that
> might have fixed this Oops? Or did my chance of hitting this Oops,
> somehow, just get a lot smaller in v2.6.39?

5f45c69589b7d ("read_lock() does not always imply rcu_read_lock()") might
well be a fix.

> Please note that I have tried to reproduce this Oops very often, using
> quite a number of kernels, so there's a non-zero chance I tricked myself
> into seeing a pattern where there actually is none.

Understood -- races can be a bit frustrating. How long should you run
before you conclude that you fixed it? ;-)

Thanx, Paul
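The suggestion quoted above, sampling each file in the debugfs "rcu"
directory before and after the failure, could be scripted roughly as
follows. This is only a minimal sketch, not something from the thread: it
assumes debugfs is mounted at /sys/kernel/debug and that the kernel was
built with CONFIG_RCU_TRACE=y. The function name, the output directory
layout, and the "before"/"after" labels are all hypothetical.

```python
import os
import time


def snapshot_rcu_debugfs(src_dir="/sys/kernel/debug/rcu",
                         out_root="rcu-snapshots",
                         label="before"):
    """Save the contents of every regular file in src_dir.

    Assumption: src_dir is the debugfs "rcu" directory; the default
    mount point used here is a guess and may differ on your system.
    Returns the snapshot directory and the list of file names saved.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    out_dir = os.path.join(out_root, f"{label}-{stamp}")
    os.makedirs(out_dir, exist_ok=True)

    saved = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and anything that is not a file
        try:
            # Read the text rather than copying by size, because debugfs
            # files typically report a size of zero.
            with open(path, "r", errors="replace") as f:
                data = f.read()
        except OSError as exc:
            data = f"<unreadable: {exc}>\n"
        with open(os.path.join(out_dir, name), "w") as f:
            f.write(data)
        saved.append(name)
    return out_dir, saved
```

Calling snapshot_rcu_debugfs(label="before") ahead of the test run and
snapshot_rcu_debugfs(label="after") once the Oops has (or has not)
appeared leaves two directories that can be compared with diff -ru.
Reading debugfs normally requires root.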