linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg KH <greg@kroah.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: syzkaller-bugs@googlegroups.com, USB list <linux-usb@vger.kernel.org>
Subject: USB: core: Fix bug caused by duplicate interface PM usage counter
Date: Fri, 19 Apr 2019 21:17:35 +0200	[thread overview]
Message-ID: <20190419191735.GA5970@kroah.com> (raw)

On Fri, Apr 19, 2019 at 01:52:38PM -0400, Alan Stern wrote:
> The syzkaller fuzzer reported a bug in the USB hub driver which turned
> out to be caused by a negative runtime-PM usage counter.  This allowed
> a hub to be runtime suspended at a time when the driver did not expect
> it.  The symptom is a WARNING issued because the hub's status URB is
> submitted while it is already active:
> 
> 	URB 0000000031fb463e submitted while active
> 	WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363  
> 
> The negative runtime-PM usage count was caused by an unfortunate
> design decision made when runtime PM was first implemented for USB.
> At that time, USB class drivers were allowed to unbind from their
> interfaces without balancing the usage counter (i.e., leaving it with
> a positive count).  The core code would take care of setting the
> counter back to 0 before allowing another driver to bind to the
> interface.
> 
> Later on when runtime PM was implemented for the entire kernel, the
> opposite decision was made: Drivers were required to balance their
> runtime-PM get and put calls.  In order to maintain backward
> compatibility, however, the USB subsystem adapted to the new
> implementation by keeping an independent usage counter for each
> interface and using it to automatically adjust the normal usage
> counter back to 0 whenever a driver was unbound.
> 
> This approach involves duplicating information, but what is worse, it
> doesn't work properly in cases where a USB class driver delays
> decrementing the usage counter until after the driver's disconnect()
> routine has returned and the counter has been adjusted back to 0.
> Doing so would cause the usage counter to become negative.  There's
> even a warning about this in the USB power management documentation!
> 
> As it happens, this is exactly what the hub driver does.  The
> kick_hub_wq() routine increments the runtime-PM usage counter, and the
> corresponding decrement is carried out by hub_event() in the context
> of the hub_wq work-queue thread.  This work routine may sometimes run
> after the driver has been unbound from its interface, and when it does
> it causes the usage counter to go negative.
> 
> It is not possible for hub_disconnect() to wait for a pending
> hub_event() call to finish, because hub_disconnect() is called with
> the device lock held and hub_event() acquires that lock.  The only
> feasible fix is to reverse the original design decision: remove the
> duplicate interface-specific usage counter and require USB drivers to
> balance their runtime PM gets and puts.  As far as I know, all
> existing drivers currently do this.

Nice work, that took a lot of debugging, thanks for resolving this.

greg k-h

WARNING: multiple messages have this Message-ID (diff)
From: Greg KH <greg@kroah.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: syzkaller-bugs@googlegroups.com, USB list <linux-usb@vger.kernel.org>
Subject: Re: [PATCH] USB: core: Fix bug caused by duplicate interface PM usage counter
Date: Fri, 19 Apr 2019 21:17:35 +0200	[thread overview]
Message-ID: <20190419191735.GA5970@kroah.com> (raw)
Message-ID: <20190419191735.hwqGJHZT_IlTPl2vAuy0wpUjXQdFNJFJzJVvMt8_oxU@z> (raw)
In-Reply-To: <Pine.LNX.4.44L0.1904191348580.1406-100000@iolanthe.rowland.org>

On Fri, Apr 19, 2019 at 01:52:38PM -0400, Alan Stern wrote:
> The syzkaller fuzzer reported a bug in the USB hub driver which turned
> out to be caused by a negative runtime-PM usage counter.  This allowed
> a hub to be runtime suspended at a time when the driver did not expect
> it.  The symptom is a WARNING issued because the hub's status URB is
> submitted while it is already active:
> 
> 	URB 0000000031fb463e submitted while active
> 	WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363  
> 
> The negative runtime-PM usage count was caused by an unfortunate
> design decision made when runtime PM was first implemented for USB.
> At that time, USB class drivers were allowed to unbind from their
> interfaces without balancing the usage counter (i.e., leaving it with
> a positive count).  The core code would take care of setting the
> counter back to 0 before allowing another driver to bind to the
> interface.
> 
> Later on when runtime PM was implemented for the entire kernel, the
> opposite decision was made: Drivers were required to balance their
> runtime-PM get and put calls.  In order to maintain backward
> compatibility, however, the USB subsystem adapted to the new
> implementation by keeping an independent usage counter for each
> interface and using it to automatically adjust the normal usage
> counter back to 0 whenever a driver was unbound.
> 
> This approach involves duplicating information, but what is worse, it
> doesn't work properly in cases where a USB class driver delays
> decrementing the usage counter until after the driver's disconnect()
> routine has returned and the counter has been adjusted back to 0.
> Doing so would cause the usage counter to become negative.  There's
> even a warning about this in the USB power management documentation!
> 
> As it happens, this is exactly what the hub driver does.  The
> kick_hub_wq() routine increments the runtime-PM usage counter, and the
> corresponding decrement is carried out by hub_event() in the context
> of the hub_wq work-queue thread.  This work routine may sometimes run
> after the driver has been unbound from its interface, and when it does
> it causes the usage counter to go negative.
> 
> It is not possible for hub_disconnect() to wait for a pending
> hub_event() call to finish, because hub_disconnect() is called with
> the device lock held and hub_event() acquires that lock.  The only
> feasible fix is to reverse the original design decision: remove the
> duplicate interface-specific usage counter and require USB drivers to
> balance their runtime PM gets and puts.  As far as I know, all
> existing drivers currently do this.

Nice work, that took a lot of debugging, thanks for resolving this.

greg k-h

         reply	other threads:[~2019-04-19 19:17 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-19 17:52 USB: core: Fix bug caused by duplicate interface PM usage counter Alan Stern
2019-04-19 17:52 ` [PATCH] " Alan Stern
2019-04-19 19:17 ` Greg KH [this message]
2019-04-19 19:17   ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190419191735.GA5970@kroah.com \
    --to=greg@kroah.com \
    --cc=linux-usb@vger.kernel.org \
    --cc=stern@rowland.harvard.edu \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).