* [PATCH 1/2] xhci: fix off by one error in TRB DMA address boundary check
From: Mathias Nyman @ 2015-08-03 13:07 UTC
To: gregkh; +Cc: linux-usb, Mathias Nyman, stable

We need to check that a TRB is part of the current segment
before calculating its DMA address.

Previously a ring segment didn't use a full memory page, and every
new ring segment got a new memory page, so the off by one
error in checking the upper bound was never seen.

Now that we use a full memory page, 256 TRBs (4096 bytes), the off by one
didn't catch the case when a TRB was the first element of the next segment.

This is triggered if the virtual memory pages for a ring segment are
next to each other in increasing order where the ring buffer wraps around
and causes errors like:

[ 106.398223] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 0 comp_code 1
[ 106.398230] xhci_hcd 0000:00:14.0: Looking for event-dma fffd3000 trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 seg-end fffd4ff0

The trb-end address is one outside the end-seg address.

Cc: <stable@vger.kernel.org>
Tested-by: Arkadiusz Miśkiewicz <arekm@maven.pl>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 6a8fc52..32f4d56 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -82,7 +82,7 @@ dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg,
 		return 0;
 	/* offset in TRBs */
 	segment_offset = trb - seg->trbs;
-	if (segment_offset > TRBS_PER_SEGMENT)
+	if (segment_offset >= TRBS_PER_SEGMENT)
 		return 0;
 	return seg->dma + (segment_offset * sizeof(*trb));
 }
-- 
1.8.3.2
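The boundary condition in PATCH 1/2 can be reproduced in a small userspace sketch. This is not the driver code: struct demo_trb, struct demo_segment, the pool array and both function names are hypothetical stand-ins, and the DMA base used in the note below (0xfffd4000) is borrowed from the seg-start value in the log purely for illustration. The sketch assumes two segments that sit back to back in virtual memory, which is the situation the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRBS_PER_SEGMENT 256	/* 256 TRBs * 16 bytes = 4096 bytes */

/* Simplified stand-in for a TRB; only its size matters here. */
struct demo_trb {
	uint32_t field[4];
};
static_assert(sizeof(struct demo_trb) == 16, "a TRB is 16 bytes");

/* Simplified stand-in for a ring segment. */
struct demo_segment {
	struct demo_trb *trbs;	/* virtual address of the first TRB */
	uint64_t dma;		/* DMA address of the first TRB */
};

/* Two segments adjacent in virtual memory: [0, 256) and [256, 512). */
struct demo_trb pool[2 * TRBS_PER_SEGMENT];

/* Check as it was before the patch: offset == TRBS_PER_SEGMENT slips through. */
uint64_t trb_virt_to_dma_old(const struct demo_segment *seg,
			     const struct demo_trb *trb)
{
	ptrdiff_t segment_offset;

	if (!seg || !trb || trb < seg->trbs)
		return 0;
	/* offset in TRBs */
	segment_offset = trb - seg->trbs;
	if (segment_offset > TRBS_PER_SEGMENT)
		return 0;
	return seg->dma + (uint64_t)segment_offset * sizeof(*trb);
}

/* Check as it is after the patch: valid offsets are 0 .. TRBS_PER_SEGMENT - 1. */
uint64_t trb_virt_to_dma_fixed(const struct demo_segment *seg,
			       const struct demo_trb *trb)
{
	ptrdiff_t segment_offset;

	if (!seg || !trb || trb < seg->trbs)
		return 0;
	/* offset in TRBs */
	segment_offset = trb - seg->trbs;
	if (segment_offset >= TRBS_PER_SEGMENT)
		return 0;
	return seg->dma + (uint64_t)segment_offset * sizeof(*trb);
}
```

With a segment describing the first half of pool and a DMA base of 0xfffd4000, passing &pool[TRBS_PER_SEGMENT] (the first TRB of the neighbouring segment) to the old variant gives 0xfffd5000, the trb-end value in the report, while the fixed variant returns 0. The last valid TRB, &pool[TRBS_PER_SEGMENT - 1], maps to 0xfffd4ff0 in both.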
* [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary
From: Mathias Nyman @ 2015-08-03 13:07 UTC
To: gregkh; +Cc: linux-usb, Gavin Shan, stable, Mathias Nyman

From: Gavin Shan <gwshan@linux.vnet.ibm.com>

When xhci_mem_cleanup() is called, it's possible that the command
timer isn't initialized and scheduled. For those cases, to delete
the command timer causes soft-lockup as below stack dump shows.

The patch avoids deleting the command timer if it's not scheduled
with the help of timer_pending().

NMI watchdog: BUG: soft lockup - CPU#40 stuck for 23s! [kworker/40:1:8140]
      :
NIP [c000000000150b30] lock_timer_base.isra.34+0x90/0xa0
LR [c000000000150c24] try_to_del_timer_sync+0x34/0xa0
Call Trace:
[c000000f67c975e0] [c0000000015b84f8] mon_ops+0x0/0x8 (unreliable)
[c000000f67c97620] [c000000000150c24] try_to_del_timer_sync+0x34/0xa0
[c000000f67c97660] [c000000000150cf0] del_timer_sync+0x60/0x80
[c000000f67c97690] [c00000000070ac0c] xhci_mem_cleanup+0x5c/0x5e0
[c000000f67c97740] [c00000000070c2e8] xhci_mem_init+0x1158/0x13b0
[c000000f67c97860] [c000000000700978] xhci_init+0x88/0x110
[c000000f67c978e0] [c000000000701644] xhci_gen_setup+0x2b4/0x590
[c000000f67c97970] [c0000000006d4410] xhci_pci_setup+0x40/0x190
[c000000f67c979f0] [c0000000006b1af8] usb_add_hcd+0x418/0xba0
[c000000f67c97ab0] [c0000000006cb15c] usb_hcd_pci_probe+0x1dc/0x5c0
[c000000f67c97b50] [c0000000006d3ba4] xhci_pci_probe+0x64/0x1f0
[c000000f67c97ba0] [c0000000004fe9ac] local_pci_probe+0x6c/0x130
[c000000f67c97c30] [c0000000000e5ce8] work_for_cpu_fn+0x38/0x60
[c000000f67c97c60] [c0000000000eacb8] process_one_work+0x198/0x470
[c000000f67c97cf0] [c0000000000eb6ac] worker_thread+0x37c/0x5a0
[c000000f67c97d80] [c0000000000f2730] kthread+0x110/0x130
[c000000f67c97e30] [c000000000009660] ret_from_kernel_thread+0x5c/0x7c

Cc: <stable@vger.kernel.org>
Reported-by: Priya M. A <priyama2@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-mem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 3e442f7..9a8c936 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1792,7 +1792,8 @@ void xhci_mem_cleanup(struct xhci_hcd *xhci)
 	int size;
 	int i, j, num_ports;
 
-	del_timer_sync(&xhci->cmd_timer);
+	if (timer_pending(&xhci->cmd_timer))
+		del_timer_sync(&xhci->cmd_timer);
 
 	/* Free the Event Ring Segment Table and the actual Event Ring */
 	size = sizeof(struct xhci_erst_entry)*(xhci->erst.num_entries);
-- 
1.8.3.2
* Re: [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary
From: Oliver Neukum @ 2015-08-11 8:15 UTC
To: Mathias Nyman; +Cc: gregkh, linux-usb, Gavin Shan, stable

On Mon, 2015-08-03 at 16:07 +0300, Mathias Nyman wrote:
> From: Gavin Shan <gwshan@linux.vnet.ibm.com>
>
> When xhci_mem_cleanup() is called, it's possible that the command
> timer isn't initialized and scheduled. For those cases, to delete
> the command timer causes soft-lockup as below stack dump shows.
>
> The patch avoids deleting the command timer if it's not scheduled
> with the help of timer_pending().

Are you sure this is safe? timer_pending() will not show you that
the timer function is running. It looks like you introduced a race
between timeout and cleanup.

	Regards
		Oliver
* Re: [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary
From: Mathias Nyman @ 2015-08-12 10:55 UTC
To: Oliver Neukum, gregkh; +Cc: linux-usb, Gavin Shan, stable

On 11.08.2015 11:15, Oliver Neukum wrote:
> On Mon, 2015-08-03 at 16:07 +0300, Mathias Nyman wrote:
>> From: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>
>> When xhci_mem_cleanup() is called, it's possible that the command
>> timer isn't initialized and scheduled. For those cases, to delete
>> the command timer causes soft-lockup as below stack dump shows.
>>
>> The patch avoids deleting the command timer if it's not scheduled
>> with the help of timer_pending().
>
> Are you sure this is safe? timer_pending() will not show you that
> the timer function is running. It looks like you introduced a race
> between timeout and cleanup.
>

Looking at it in more detail you're right.

Fortunately this can only happen in cases where xhci is already hosed
(no command response for 5 seconds), and we are at the same time
anyway about to remove xhci.

Doesn't this mean that all cases with
if (timer_pending(&timer))
	del_timer_sync(&timer)

is just basically the same as a plain del_timer(&timer)?

Anyways, turns out that the error path in xhci initialization code can end up calling
del_timer_sync() before timer is initialized. This should be fixed by re-arranging
some code in xhci initialization instead.

Greg, should this be reverted in rc7?
I think that the possible side effect of this patch is still lesser the original
issue.

Thanks for spotting this
-Mathias
* Re: [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary
From: Oliver Neukum @ 2015-08-12 13:08 UTC
To: Mathias Nyman; +Cc: gregkh, linux-usb, Gavin Shan, stable

On Wed, 2015-08-12 at 13:55 +0300, Mathias Nyman wrote:

> Fortunately this can only happen in cases where xhci is already hosed
> (no command response for 5 seconds), and we are at the same time
> anyway about to remove xhci.
>
> Doesn't this mean that all cases with
> if (timer_pending(&timer))
> 	del_timer_sync(&timer)
>
> is just basically the same as a plain del_timer(&timer)?

Yes. I never understood the idiom.

> Anyways, turns out that the error path in xhci initialization code can end up calling
> del_timer_sync() before timer is initialized. This should be fixed by re-arranging
> some code in xhci initialization instead.

Good.

> Greg, should this be reverted in rc7?
> I think that the possible side effect of this patch is still lesser the original
> issue.

I agree.

	Regards
		Oliver
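The point made in this exchange can be illustrated with a toy, single-threaded state model. It is only a sketch: struct model_timer and the model_* functions are hypothetical names that mimic the behaviour discussed here for timer_pending(), del_timer() and del_timer_sync(); none of it is kernel code. It assumes, as noted in the thread, that a timer whose function is currently running is not reported as pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the timer state visible to a caller; not struct timer_list. */
struct model_timer {
	bool pending;		/* queued, timer function not started yet */
	bool callback_running;	/* timer function executing on another CPU */
};

/* Models timer_pending(): reports the queued state only. */
bool model_timer_pending(const struct model_timer *t)
{
	assert(t);
	return t->pending;
}

/* Models del_timer(): dequeues, never waits for a running timer function. */
bool model_del_timer(struct model_timer *t)
{
	bool was_pending;

	assert(t);
	was_pending = t->pending;
	t->pending = false;
	return was_pending;
}

/* Models del_timer_sync(): dequeues, then waits for the timer function. */
bool model_del_timer_sync(struct model_timer *t)
{
	bool was_pending = model_del_timer(t);

	/* Stands in for blocking until the running timer function returns. */
	t->callback_running = false;
	return was_pending;
}

/* The guarded idiom from the patch under discussion. */
void model_guarded_delete(struct model_timer *t)
{
	if (model_timer_pending(t))
		model_del_timer_sync(t);
}
```

Starting from a state where the timer function is running and the timer is not pending, model_guarded_delete() leaves callback_running set, so cleanup would proceed alongside the timeout handler. An unconditional model_del_timer_sync() from the same state clears it. In this model the guard removes exactly the waiting behaviour that distinguishes the synchronous call from a plain delete.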
* Re: [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary
From: Greg KH @ 2015-08-12 16:18 UTC
To: Mathias Nyman; +Cc: Oliver Neukum, linux-usb, Gavin Shan, stable

On Wed, Aug 12, 2015 at 01:55:34PM +0300, Mathias Nyman wrote:
> On 11.08.2015 11:15, Oliver Neukum wrote:
> > On Mon, 2015-08-03 at 16:07 +0300, Mathias Nyman wrote:
> >> From: Gavin Shan <gwshan@linux.vnet.ibm.com>
> >>
> >> When xhci_mem_cleanup() is called, it's possible that the command
> >> timer isn't initialized and scheduled. For those cases, to delete
> >> the command timer causes soft-lockup as below stack dump shows.
> >>
> >> The patch avoids deleting the command timer if it's not scheduled
> >> with the help of timer_pending().
> >
> > Are you sure this is safe? timer_pending() will not show you that
> > the timer function is running. It looks like you introduced a race
> > between timeout and cleanup.
> >
>
> Looking at it in more detail you're right.
>
> Fortunately this can only happen in cases where xhci is already hosed
> (no command response for 5 seconds), and we are at the same time
> anyway about to remove xhci.
>
> Doesn't this mean that all cases with
> if (timer_pending(&timer))
> 	del_timer_sync(&timer)
>
> is just basically the same as a plain del_timer(&timer)?
>
> Anyways, turns out that the error path in xhci initialization code can end up calling
> del_timer_sync() before timer is initialized. This should be fixed by re-arranging
> some code in xhci initialization instead.
>
> Greg, should this be reverted in rc7?
> I think that the possible side effect of this patch is still lesser the original
> issue.

Just fix it "right" in a new patch.

thanks,

greg k-h
end of thread, other threads:[~2015-08-12 16:18 UTC | newest]
Thread overview: 6+ messages
     [not found] <1438607269-8977-1-git-send-email-mathias.nyman@linux.intel.com>
2015-08-03 13:07 ` [PATCH 1/2] xhci: fix off by one error in TRB DMA address boundary check Mathias Nyman
2015-08-03 13:07 ` [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary Mathias Nyman
2015-08-11  8:15   ` Oliver Neukum
2015-08-12 10:55     ` Mathias Nyman
2015-08-12 13:08       ` Oliver Neukum
2015-08-12 16:18       ` Greg KH