From: Eli Billauer <eli.billauer@gmail.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: linux-usb@vger.kernel.org,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Hans de Goede <hdegoede@redhat.com>,
Oliver Neukum <oliver@neukum.org>
Subject: Re: usb: core: URB completer callback possibly called after usb_kill_anchored_urbs() returns
Date: Sat, 25 Jul 2020 19:44:02 +0300 [thread overview]
Message-ID: <5F1C6152.8000500@gmail.com> (raw)
In-Reply-To: <20200724155133.GC1388675@rowland.harvard.edu>
Hello Alan & all,
Thanks for your response.
The thing is that I'm not alone assuming that it's fine to free
resources after usb_kill_anchored_urbs() returns. Most notable is
usb-skeleton.c, which does exactly that in skel_disconnect():
usb_kill_anchored_urbs(&dev->submitted);
/* decrement our usage count */
kref_put(&dev->kref, skel_delete);
Needless to say, skel_delete() frees the struct that the URBs' contexts
point at.
Keeping track of the number of uncompleted URBs, as you suggested, is
indeed a solution. But as it seems that the only problem is the race
condition between usb_kill_anchored_urbs() and __usb_hcd_giveback_urb(),
I suppose it's enough to ensure that the resources are not freed while
__usb_hcd_giveback_urb() is doing its unanchor-before-complete thing.
After taking a second look, I discovered that there's already a function
that takes the race condition into consideration:
usb_wait_anchor_empty_timeout().
Looking again at commit 6ec4147, which I mentioned before, it turns out
that it added a counter to each anchor struct (atomic_t
suspend_wakeups). It's incremented in __usb_hcd_giveback_urb() just
before unanchoring a URB, and decremented after the completion has been
called. This is made with calls to usb_anchor_suspend_wakeups() and
usb_anchor_resume_wakeups(), but that's the essence of these calls.
And there's a wait queue in place, which gets a wakeup call by
usb_anchor_resume_wakeups(), if the anchor's list is empty and the
counter is zero after decrementing it. In the said commit,
usb_wait_anchor_empty_timeout() was modified to check the counter as
well, so when it returns, the anchor is genuinely idle.
So I would say that the safe way to go is
usb_kill_anchored_urbs(&ep->anchor);
if (!usb_wait_anchor_empty_timeout(&ep->anchor, 1000)) {
/* This is really bad */
}
/* Release memory */
And if indeed so, I'm thinking about submitting a patch, which adds a
usb_wait_anchor_empty_timeout() at the bottom of
usb_kill_anchored_urbs(). So that the function does what people out
there think it does.
Have I missed something?
Thanks,
Eli
On 24/07/20 18:51, Alan Stern wrote:
> On Fri, Jul 24, 2020 at 03:46:40PM +0300, Eli Billauer wrote:
>
>> Hello,
>>
>> My understanding is it should be OK to assume that no calls to completer
>> callbacks will be made after usb_kill_anchored_urbs() returns (for that
>> anchor, of course).
>>
> As you have discovered, that is not a correct assumption.
>
>
>> However __usb_hcd_giveback_urb() in
>> drivers/usb/core/hcd.c doesn't seem to work that way. It unanchors first,
>> then calls the complete method:
>>
>> usb_unanchor_urb(urb);
>> if (likely(status == 0))
>> usb_led_activity(USB_LED_EVENT_HOST);
>>
>> /* pass ownership to the completion handler */
>> urb->status = status;
>> kcov_remote_start_usb((u64)urb->dev->bus->busnum);
>> urb->complete(urb);
>>
>> So if usb_kill_anchored_urbs() is called while __usb_hcd_giveback_urb() is
>> in the middle of this code passage, it will miss the URB that is being
>> finished, and possibly return before the completer has been called.
>>
>> It might sound like a theoretic race condition, but I actually got a kernel
>> panic after yanking the USB plug in the middle of heavy traffic. The panic's
>> call trace indicated that the driver's completer callback function attempted
>> to access memory that had been freed previously. As this happened within an
>> IRQ, it was a fullblown computer freeze.
>>
>> The same driver's memory freeing mechanism indeed calls
>> usb_kill_anchored_urbs() first, then frees the URBs' related data structure.
>> So it seems like it freed the memory just before the completer callback was
>> invoked.
>>
> Right. There is a genuine race. Althouogh usb_kill_anchored_urbs()
> does wait for the completion handlers of all the URBs it kills to
> finish, there is some ambiguity about what URBs are on the anchor.
>
>
>> I would love to submit a patch that moves the usb_unanchor_urb() call a few
>> rows down, but something tells me it's not that simple.
>>
> No, it isn't.
>
>
>> As a side note, the comment along with commit 6ec4147, which added
>> usb_anchor_{suspend,resume}_wakeups calls, said among others: "But
>> __usb_hcd_giveback_urb() calls usb_unanchor_urb before calling the
>> completion handler. This is necessary as the completion handler may
>> re-submit and re-anchor the urb". Not sure I understood this part, though.
>>
> Suppose the completion routine puts the URB onto a different anchor and
> then calls usb_submit_urb(). If __usb_hcd_giveback_urb() then called
> usb_unanchor_urb(), the URB would incorrectly be removed from the wrong
> anchor!
>
> Currently the only way to handle this situation properly is to keep
> track of whether each URB has completed. For example, if the driver has
> successfully submitted 237 URBs but the completion routine has only been
> called 235 times, the driver will know that there are still two URBs
> pending.
>
> Alan Stern
>
>
next prev parent reply other threads:[~2020-07-25 16:44 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-24 12:46 usb: core: URB completer callback possibly called after usb_kill_anchored_urbs() returns Eli Billauer
2020-07-24 15:51 ` Alan Stern
2020-07-25 16:44 ` Eli Billauer [this message]
2020-07-25 19:53 ` Alan Stern
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5F1C6152.8000500@gmail.com \
--to=eli.billauer@gmail.com \
--cc=gregkh@linuxfoundation.org \
--cc=hdegoede@redhat.com \
--cc=linux-usb@vger.kernel.org \
--cc=oliver@neukum.org \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).