From: Greg KH <greg@kroah.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: syzkaller-bugs@googlegroups.com, USB list <linux-usb@vger.kernel.org>
Subject: USB: core: Fix bug caused by duplicate interface PM usage counter
Date: Fri, 19 Apr 2019 21:17:35 +0200 [thread overview]
Message-ID: <20190419191735.GA5970@kroah.com> (raw)
On Fri, Apr 19, 2019 at 01:52:38PM -0400, Alan Stern wrote:
> The syzkaller fuzzer reported a bug in the USB hub driver which turned
> out to be caused by a negative runtime-PM usage counter. This allowed
> a hub to be runtime suspended at a time when the driver did not expect
> it. The symptom is a WARNING issued because the hub's status URB is
> submitted while it is already active:
>
> URB 0000000031fb463e submitted while active
> WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363
>
> The negative runtime-PM usage count was caused by an unfortunate
> design decision made when runtime PM was first implemented for USB.
> At that time, USB class drivers were allowed to unbind from their
> interfaces without balancing the usage counter (i.e., leaving it with
> a positive count). The core code would take care of setting the
> counter back to 0 before allowing another driver to bind to the
> interface.
>
> Later on when runtime PM was implemented for the entire kernel, the
> opposite decision was made: Drivers were required to balance their
> runtime-PM get and put calls. In order to maintain backward
> compatibility, however, the USB subsystem adapted to the new
> implementation by keeping an independent usage counter for each
> interface and using it to automatically adjust the normal usage
> counter back to 0 whenever a driver was unbound.
>
> This approach involves duplicating information, but what is worse, it
> doesn't work properly in cases where a USB class driver delays
> decrementing the usage counter until after the driver's disconnect()
> routine has returned and the counter has been adjusted back to 0.
> Doing so would cause the usage counter to become negative. There's
> even a warning about this in the USB power management documentation!
>
> As it happens, this is exactly what the hub driver does. The
> kick_hub_wq() routine increments the runtime-PM usage counter, and the
> corresponding decrement is carried out by hub_event() in the context
> of the hub_wq work-queue thread. This work routine may sometimes run
> after the driver has been unbound from its interface, and when it does
> it causes the usage counter to go negative.
>
> It is not possible for hub_disconnect() to wait for a pending
> hub_event() call to finish, because hub_disconnect() is called with
> the device lock held and hub_event() acquires that lock. The only
> feasible fix is to reverse the original design decision: remove the
> duplicate interface-specific usage counter and require USB drivers to
> balance their runtime PM gets and puts. As far as I know, all
> existing drivers currently do this.
Nice work, that took a lot of debugging, thanks for resolving this.
greg k-h
WARNING: multiple messages have this Message-ID (diff)
From: Greg KH <greg@kroah.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: syzkaller-bugs@googlegroups.com, USB list <linux-usb@vger.kernel.org>
Subject: Re: [PATCH] USB: core: Fix bug caused by duplicate interface PM usage counter
Date: Fri, 19 Apr 2019 21:17:35 +0200 [thread overview]
Message-ID: <20190419191735.GA5970@kroah.com> (raw)
Message-ID: <20190419191735.hwqGJHZT_IlTPl2vAuy0wpUjXQdFNJFJzJVvMt8_oxU@z> (raw)
In-Reply-To: <Pine.LNX.4.44L0.1904191348580.1406-100000@iolanthe.rowland.org>
On Fri, Apr 19, 2019 at 01:52:38PM -0400, Alan Stern wrote:
> The syzkaller fuzzer reported a bug in the USB hub driver which turned
> out to be caused by a negative runtime-PM usage counter. This allowed
> a hub to be runtime suspended at a time when the driver did not expect
> it. The symptom is a WARNING issued because the hub's status URB is
> submitted while it is already active:
>
> URB 0000000031fb463e submitted while active
> WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363
>
> The negative runtime-PM usage count was caused by an unfortunate
> design decision made when runtime PM was first implemented for USB.
> At that time, USB class drivers were allowed to unbind from their
> interfaces without balancing the usage counter (i.e., leaving it with
> a positive count). The core code would take care of setting the
> counter back to 0 before allowing another driver to bind to the
> interface.
>
> Later on when runtime PM was implemented for the entire kernel, the
> opposite decision was made: Drivers were required to balance their
> runtime-PM get and put calls. In order to maintain backward
> compatibility, however, the USB subsystem adapted to the new
> implementation by keeping an independent usage counter for each
> interface and using it to automatically adjust the normal usage
> counter back to 0 whenever a driver was unbound.
>
> This approach involves duplicating information, but what is worse, it
> doesn't work properly in cases where a USB class driver delays
> decrementing the usage counter until after the driver's disconnect()
> routine has returned and the counter has been adjusted back to 0.
> Doing so would cause the usage counter to become negative. There's
> even a warning about this in the USB power management documentation!
>
> As it happens, this is exactly what the hub driver does. The
> kick_hub_wq() routine increments the runtime-PM usage counter, and the
> corresponding decrement is carried out by hub_event() in the context
> of the hub_wq work-queue thread. This work routine may sometimes run
> after the driver has been unbound from its interface, and when it does
> it causes the usage counter to go negative.
>
> It is not possible for hub_disconnect() to wait for a pending
> hub_event() call to finish, because hub_disconnect() is called with
> the device lock held and hub_event() acquires that lock. The only
> feasible fix is to reverse the original design decision: remove the
> duplicate interface-specific usage counter and require USB drivers to
> balance their runtime PM gets and puts. As far as I know, all
> existing drivers currently do this.
Nice work, that took a lot of debugging, thanks for resolving this.
greg k-h
next prev reply other threads:[~2019-04-19 19:17 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-19 17:52 USB: core: Fix bug caused by duplicate interface PM usage counter Alan Stern
2019-04-19 17:52 ` [PATCH] " Alan Stern
2019-04-19 19:17 ` Greg KH [this message]
2019-04-19 19:17 ` Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190419191735.GA5970@kroah.com \
--to=greg@kroah.com \
--cc=linux-usb@vger.kernel.org \
--cc=stern@rowland.harvard.edu \
--cc=syzkaller-bugs@googlegroups.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).