From: Greg KH <gregkh@linuxfoundation.org>
To: Ming Lei <ming.lei@canonical.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org
Subject: Re: Race condition between driver_probe_device and device_shutdown
Date: Sun, 20 May 2012 12:51:26 -0700 [thread overview]
Message-ID: <20120520195126.GA11282@kroah.com> (raw)
In-Reply-To: <CACVXFVPhPmTfdB-pKpz7NpL+N03cxDvPpiFdPWeZcqBNNSffYw@mail.gmail.com>
On Sun, May 20, 2012 at 10:08:08PM +0800, Ming Lei wrote:
> Hi,
>
> On Fri, May 18, 2012 at 12:59 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> >> Hi,
> >
> > First off, sorry for missing this, and thanks to Andrew for pointing it
> > out to me. You might want to use the tool, scripts/get_maintainer.pl
> > for who to know to cc: for patches like this, so I don't miss it.
> >
> >> I'm seeing a driver crash in its shutdown routine because it's
> >> touching some uninitialized state. It turns out that the driver's
> >> probe routine was still running [for the same device]. There also
> >> appears to be an issue in the remove path, where device_shutdown()
> >> checks the dev->driver pointer and uses it later, with seemingly
> >> nothing to guarantee that it doesn't change.
> >
> > What type of driver is having this problem? What type of bus is it on?
> > Usually the bus prevents this from happening with its own serialization.
>
> Looks it is a generic problem.
>
> There are two races, one is between .probe and .shutdown, and another
> is between .release and .shutdown, see below:
>
>
> void device_shutdown(void)
>
> ......
> /* Don't allow any more runtime suspends */
> pm_runtime_get_noresume(dev);
> pm_runtime_barrier(dev);
>
> if (dev->bus && dev->bus->shutdown) {
> dev_dbg(dev, "shutdown\n");
> dev->bus->shutdown(dev);
> } else if (dev->driver && dev->driver->shutdown) { /*line-driver*/
> dev_dbg(dev, "shutdown\n");
> dev->driver->shutdown(dev); /*line-shut*/
> }
> ......
>
> If dev->driver is just set(really_probe) before 'line-driver' and .probe is
> not executed before 'line-shut', the .shutdown may touch a uninitialized
> device.
>
> Also if dev->driver is just cleared(__device_release_driver) after "line-driver"
> and before "line-shut", null pointer will be referenced and oops will
> be triggered.
And how can that happen with a real bus? Don't we have a lock
somewhere per-bus that should be protecting this type of thing (sorry,
can't dig through the code at the moment, on the road...)
> >> Shouldn't we synchronize the shutdown routine with probe/remove to
> >> prevent such races?
> >
> > Normally, yes, and for some reason, I thought we already were doing
> > that.
>
> Looks the races are still there.
How come no one has ever hit them in the past 10 years? What am I
missing here?
> >> The patch below should take care of these races.
> >
> > Does this patch solve your problem? Care to show me the oops you get
> > without it?
> >
> >> diff --git a/drivers/base/core.c b/drivers/base/core.c
> >> index e28ce98..f2c63c6 100644
> >> --- a/drivers/base/core.c
> >> +++ b/drivers/base/core.c
> >> @@ -1823,6 +1823,9 @@ void device_shutdown(void)
> >> pm_runtime_get_noresume(dev);
> >> pm_runtime_barrier(dev);
> >>
> >> + if (dev->parent) /* Needed for USB */
> >> + device_lock(dev->parent);
> >> + device_lock(dev);
>
> Looks the above makes sense to serialize .shutdown with
> .probe and .release.
Let me look at the code when I get back in a few days, but I really
thought we already had a lock protecting all of this...
thanks,
greg k-h
next prev parent reply other threads:[~2012-05-20 19:51 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-18 4:59 Race condition between driver_probe_device and device_shutdown Greg KH
2012-05-20 14:08 ` Ming Lei
2012-05-20 19:51 ` Greg KH [this message]
2012-05-21 4:53 ` Ming Lei
2012-05-21 18:29 ` Alan Stern
2012-05-22 0:35 ` Ming Lei
2012-05-22 14:11 ` Alan Stern
2012-05-22 19:16 ` Eric W. Biederman
2012-05-23 10:05 ` Ming Lei
2012-05-23 15:06 ` Alan Stern
2012-05-24 1:39 ` Ming Lei
2012-05-24 2:14 ` Greg KH
2012-05-24 3:12 ` Ming Lei
2012-05-24 14:37 ` Alan Stern
2012-05-25 0:33 ` Ming Lei
2012-12-06 9:11 ` Race condition between driver_probe_device and device_shutdown Wedson Almeida Filho
2012-12-06 10:52 ` Ming Lei
2012-12-07 0:09 ` Wedson Almeida Filho
2012-12-07 3:42 ` Ming Lei
2012-12-07 12:16 ` Wedson Almeida Filho
2012-12-07 13:04 ` Ming Lei
2012-12-07 15:25 ` Alan Stern
2012-12-07 16:27 ` Ming Lei
-- strict thread matches above, loose matches on Subject: below --
2012-05-10 2:53 Race condition between driver_probe_device and device_shutdown Wedson Almeida Filho
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120520195126.GA11282@kroah.com \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ming.lei@canonical.com \
--cc=wedsonaf@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox