linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lu Baolu <baolu.lu@linux.intel.com>
To: Mathias Nyman <mathias.nyman@linux.intel.com>,
	Baolin Wang <baolin.wang@linaro.org>
Cc: Mathias Nyman <mathias.nyman@intel.com>,
	Greg KH <gregkh@linuxfoundation.org>,
	USB <linux-usb@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Mark Brown <broonie@kernel.org>, "Lu, Baolu" <baolu.lu@intel.com>,
	Alan Stern <stern@rowland.harvard.edu>,
	OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Subject: Re: [PATCH 2/2] usb: host: xhci: Handle the right timeout command
Date: Thu, 22 Dec 2016 09:43:56 +0800	[thread overview]
Message-ID: <585B2FDC.1060502@linux.intel.com> (raw)
In-Reply-To: <585A7A0A.6090401@linux.intel.com>

Hi,

On 12/21/2016 08:48 PM, Mathias Nyman wrote:
> On 21.12.2016 08:17, Lu Baolu wrote:
>> Hi Mathias,
>>
>> I have some comments for the implementation of xhci_abort_cmd_ring() below.
>>
>> On 12/20/2016 11:13 PM, Mathias Nyman wrote:
>>> On 20.12.2016 09:30, Baolin Wang wrote:
>>> ...
>>>
>>> Alright, I gathered all current work related to xhci races and timeouts
>>> and put them into a branch:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes
>>>
>>> Its based on 4.9
>>> It includes a few other patches just to avoid conflicts and  make my life easier
>>>
>>> Interesting patches are:
>>>
>>> ee4eb91 xhci: remove unnecessary check for pending timer
>>> 0cba67d xhci: detect stop endpoint race using pending timer instead of counter.
>>> 4f2535f xhci: Handle command completion and timeout race
>>> b9d00d7 usb: host: xhci: Fix possible wild pointer when handling abort command
>>> 529a5a0 usb: xhci: fix possible wild pointer
>>> 4766555 xhci: Fix race related to abort operation
>>> de834a3 xhci: Use delayed_work instead of timer for command timeout
>>> 69973b8 Linux 4.9
>>>
>>> The fixes for command queue races will go to usb-linus and stable, the
>>> reworks for stop ep watchdog timer will go to usb-next.
>>>
>>> Still completely untested, (well it compiles)
>>>
>>> Felipe gave instructions how to modify dwc3 driver to timeout on address
>>> devicecommands to test these, I'll try to set that up.
>>>
>>> All additional testing is welcome, especially if you can trigger timeouts
>>> and races
>>>
>>> -Mathias
>>>
>>>
>>
>> Below is the latest code. I put my comments in line.
>>
>>   322 static int xhci_abort_cmd_ring(struct xhci_hcd *xhci)
>>   323 {
>>   324         u64 temp_64;
>>   325         int ret;
>>   326
>>   327         xhci_dbg(xhci, "Abort command ring\n");
>>   328
>>   329         reinit_completion(&xhci->cmd_ring_stop_completion);
>>   330
>>   331         temp_64 = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
>>   332         xhci_write_64(xhci, temp_64 | CMD_RING_ABORT,
>>   333                         &xhci->op_regs->cmd_ring);
>>
>> We should hold xhci->lock when we are modifying xhci registers
>> at runtime.
>>
>
> Makes sense, but we need to unlock it before sleeping or waiting for completion.
> I need to look into that in more detail.
>
> But this was an issue already before these changes.
>
>> The retry of setting CMD_RING_ABORT is not necessary according to
>> previous discussion. We have cleaned code for second try in
>> xhci_handle_command_timeout(). Need to clean up here as well.
>>
>
> Yes it can be cleaned up as well, but the two cases are a bit different.
> The cleaned up one was about command ring not starting again after it was stopped.
>
> This second try is a workaround for what we thought was the command ring failing
> to stop in the first place, but is most likely due to the race that OGAWA Hirofumi
> fixed.  It races if the stop command ring interrupt happens between writing the abort
> bit and polling for the ring stopped bit. The interrupt hander may start the command
> ring again, and we would believe we failed to stop it in the first place.
>
> This race could probably be fixed by just extending the lock (and preventing
> interrupts) to cover both writing the abort bit and polling for the command ring
> running bit, as you pointed out here previously.
>
> But then again I really like OGAWA Hiroumi's solution that separates the
> command ring stopping from aborting commands and restarting the ring.
>
> The current way of always restarting the command ring as a response to
> a stop command ring command really limits its usage.
>
> So, with this in mind most reasonable would be to
> 1. fix the lock to cover abort+CRR check, and send it to usb-linus +stable
> 2. rebase OGAWA Hirofumi's changes on top of that, and send to usb-linus only
> 3. remove unnecessary second abort try as a separate patch, send to usb-next
> 4. remove polling for the Command ring running (CRR), waiting for completion
>    is enough, if completion times out then we can check CRR. for usb-next
>    I'll fix the typos these patches would introduce. Fixing old typos can be done as separate
> patches later.

This is exactly the same as what I am thinking of. I will submit the patches later.

Best regards,
Lu Baolu

  parent reply	other threads:[~2016-12-22  1:44 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-05  7:51 [PATCH 1/2] usb: host: xhci: Fix possible wild pointer when handling abort command Baolin Wang
2016-12-05  7:51 ` [PATCH 2/2] usb: host: xhci: Handle the right timeout command Baolin Wang
2016-12-12 15:52   ` Mathias Nyman
2016-12-13  3:21     ` Baolin Wang
2016-12-19 10:33       ` Mathias Nyman
2016-12-19 11:34         ` Baolin Wang
2016-12-19 12:13           ` Mathias Nyman
2016-12-20  3:23             ` Baolin Wang
2016-12-20  4:29             ` Lu Baolu
2016-12-20  6:06               ` Baolin Wang
2016-12-20  6:39                 ` Lu Baolu
2016-12-20  6:46                   ` Baolin Wang
2016-12-20  7:18                     ` Lu Baolu
2016-12-20  7:30                       ` Baolin Wang
2016-12-20 15:13                         ` Mathias Nyman
2016-12-21  2:22                           ` Baolin Wang
2016-12-21 13:00                             ` Mathias Nyman
2016-12-27  3:07                               ` Baolin Wang
2017-01-02 14:57                                 ` Mathias Nyman
2017-01-03  6:20                                   ` Baolin Wang
2016-12-21  6:17                           ` Lu Baolu
2016-12-21 12:48                             ` Mathias Nyman
2016-12-21 14:10                               ` OGAWA Hirofumi
2016-12-21 15:04                                 ` Mathias Nyman
2016-12-21 15:18                                   ` OGAWA Hirofumi
2016-12-22  1:46                                     ` Lu Baolu
2016-12-23 12:54                                       ` Mathias Nyman
2016-12-22  1:43                               ` Lu Baolu [this message]
2016-12-21  6:57                           ` Lu Baolu
2016-12-21 12:57                             ` Mathias Nyman
2016-12-22  1:39                               ` Lu Baolu
2016-12-05 14:08 ` [PATCH 1/2] usb: host: xhci: Fix possible wild pointer when handling abort command Mathias Nyman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=585B2FDC.1060502@linux.intel.com \
    --to=baolu.lu@linux.intel.com \
    --cc=baolin.wang@linaro.org \
    --cc=baolu.lu@intel.com \
    --cc=broonie@kernel.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=hirofumi@mail.parknet.co.jp \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=mathias.nyman@intel.com \
    --cc=mathias.nyman@linux.intel.com \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).