* [PATCH 1/2] xhci: fix off by one error in TRB DMA address boundary check
[not found] <1438607269-8977-1-git-send-email-mathias.nyman@linux.intel.com>
@ 2015-08-03 13:07 ` Mathias Nyman
2015-08-03 13:07 ` [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary Mathias Nyman
1 sibling, 0 replies; 6+ messages in thread
From: Mathias Nyman @ 2015-08-03 13:07 UTC (permalink / raw)
To: gregkh; +Cc: linux-usb, Mathias Nyman, stable
We need to check that a TRB is part of the current segment
before calculating its DMA address.

Previously a ring segment didn't use a full memory page, and every
new ring segment got a new memory page, so the off-by-one
error in checking the upper bound was never seen.

Now that we use a full memory page, 256 TRBs (4096 bytes), the off-by-one
check doesn't catch the case where a TRB is the first element of the next segment.

This is triggered if the virtual memory pages for a ring segment are
next to each other in increasing order where the ring buffer wraps around, and
causes errors like:

[ 106.398223] xhci_hcd 0000:00:14.0: ERROR Transfer event TRB DMA ptr not
part of current TD ep_index 0 comp_code 1
[ 106.398230] xhci_hcd 0000:00:14.0: Looking for event-dma fffd3000
trb-start fffd4fd0 trb-end fffd5000 seg-start fffd4000 seg-end fffd4ff0

The trb-end address (fffd5000) lies one TRB (16 bytes) past the seg-end
address (fffd4ff0).
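
The boundary case can be reproduced in user space with a minimal sketch. It is not the driver code: `struct trb`, `struct segment`, `trb_to_dma_old()` and `trb_to_dma_new()` are hypothetical stand-ins for `union xhci_trb`, `struct xhci_segment` and the two versions of the check in xhci_trb_virt_to_dma(), and a plain 64-bit integer stands in for dma_addr_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRBS_PER_SEGMENT 256    /* 256 TRBs * 16 bytes = one 4096-byte page */

/* Hypothetical 16-byte stand-in for union xhci_trb. */
struct trb {
    uint32_t field[4];
};
static_assert(sizeof(struct trb) == 16, "a TRB is assumed to be 16 bytes");

/* Hypothetical stand-in for struct xhci_segment. */
struct segment {
    struct trb *trbs;   /* CPU address of the first TRB */
    uint64_t dma;       /* bus address of the first TRB */
};

/* Old check: an offset equal to TRBS_PER_SEGMENT is accepted. */
uint64_t trb_to_dma_old(const struct segment *seg, const struct trb *trb)
{
    ptrdiff_t offset;

    if (!seg || !trb || trb < seg->trbs)
        return 0;
    offset = trb - seg->trbs;           /* offset in TRBs */
    if (offset > TRBS_PER_SEGMENT)
        return 0;
    return seg->dma + (uint64_t)offset * sizeof(*trb);
}

/* Fixed check: valid offsets are 0 .. TRBS_PER_SEGMENT - 1. */
uint64_t trb_to_dma_new(const struct segment *seg, const struct trb *trb)
{
    ptrdiff_t offset;

    if (!seg || !trb || trb < seg->trbs)
        return 0;
    offset = trb - seg->trbs;           /* offset in TRBs */
    if (offset >= TRBS_PER_SEGMENT)
        return 0;
    return seg->dma + (uint64_t)offset * sizeof(*trb);
}
```

With a segment whose dma is 0xfffd4000, passing a pointer to offset 256 makes the old check return 0xfffd4000 + 256 * 16 = 0xfffd5000, the trb-end value in the log above, while the fixed check returns 0. Offset 255 maps to 0xfffd4ff0 in both versions, which matches seg-end.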
Cc: <stable@vger.kernel.org>
Tested-by: Arkadiusz Miśkiewicz <arekm@maven.pl>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
drivers/usb/host/xhci-ring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 6a8fc52..32f4d56 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -82,7 +82,7 @@ dma_addr_t xhci_trb_virt_to_dma(struct xhci_segment *seg,
 		return 0;
 	/* offset in TRBs */
 	segment_offset = trb - seg->trbs;
-	if (segment_offset > TRBS_PER_SEGMENT)
+	if (segment_offset >= TRBS_PER_SEGMENT)
 		return 0;
 	return seg->dma + (segment_offset * sizeof(*trb));
 }
--
1.8.3.2
* [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary
[not found] <1438607269-8977-1-git-send-email-mathias.nyman@linux.intel.com>
2015-08-03 13:07 ` [PATCH 1/2] xhci: fix off by one error in TRB DMA address boundary check Mathias Nyman
@ 2015-08-03 13:07 ` Mathias Nyman
2015-08-11 8:15 ` Oliver Neukum
1 sibling, 1 reply; 6+ messages in thread
From: Mathias Nyman @ 2015-08-03 13:07 UTC (permalink / raw)
To: gregkh; +Cc: linux-usb, Gavin Shan, stable, Mathias Nyman
From: Gavin Shan <gwshan@linux.vnet.ibm.com>
When xhci_mem_cleanup() is called, it's possible that the command
timer hasn't been initialized and scheduled. In those cases, deleting
the command timer causes a soft lockup, as the stack dump below shows.

The patch avoids deleting the command timer if it's not scheduled,
with the help of timer_pending().
NMI watchdog: BUG: soft lockup - CPU#40 stuck for 23s! [kworker/40:1:8140]
:
NIP [c000000000150b30] lock_timer_base.isra.34+0x90/0xa0
LR [c000000000150c24] try_to_del_timer_sync+0x34/0xa0
Call Trace:
[c000000f67c975e0] [c0000000015b84f8] mon_ops+0x0/0x8 (unreliable)
[c000000f67c97620] [c000000000150c24] try_to_del_timer_sync+0x34/0xa0
[c000000f67c97660] [c000000000150cf0] del_timer_sync+0x60/0x80
[c000000f67c97690] [c00000000070ac0c] xhci_mem_cleanup+0x5c/0x5e0
[c000000f67c97740] [c00000000070c2e8] xhci_mem_init+0x1158/0x13b0
[c000000f67c97860] [c000000000700978] xhci_init+0x88/0x110
[c000000f67c978e0] [c000000000701644] xhci_gen_setup+0x2b4/0x590
[c000000f67c97970] [c0000000006d4410] xhci_pci_setup+0x40/0x190
[c000000f67c979f0] [c0000000006b1af8] usb_add_hcd+0x418/0xba0
[c000000f67c97ab0] [c0000000006cb15c] usb_hcd_pci_probe+0x1dc/0x5c0
[c000000f67c97b50] [c0000000006d3ba4] xhci_pci_probe+0x64/0x1f0
[c000000f67c97ba0] [c0000000004fe9ac] local_pci_probe+0x6c/0x130
[c000000f67c97c30] [c0000000000e5ce8] work_for_cpu_fn+0x38/0x60
[c000000f67c97c60] [c0000000000eacb8] process_one_work+0x198/0x470
[c000000f67c97cf0] [c0000000000eb6ac] worker_thread+0x37c/0x5a0
[c000000f67c97d80] [c0000000000f2730] kthread+0x110/0x130
[c000000f67c97e30] [c000000000009660] ret_from_kernel_thread+0x5c/0x7c
Cc: <stable@vger.kernel.org>
Reported-by: Priya M. A <priyama2@in.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
drivers/usb/host/xhci-mem.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 3e442f7..9a8c936 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1792,7 +1792,8 @@ void xhci_mem_cleanup(struct xhci_hcd *xhci)
 	int size;
 	int i, j, num_ports;
 
-	del_timer_sync(&xhci->cmd_timer);
+	if (timer_pending(&xhci->cmd_timer))
+		del_timer_sync(&xhci->cmd_timer);
 
 	/* Free the Event Ring Segment Table and the actual Event Ring */
 	size = sizeof(struct xhci_erst_entry)*(xhci->erst.num_entries);
--
1.8.3.2
* Re: [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary
2015-08-03 13:07 ` [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary Mathias Nyman
@ 2015-08-11 8:15 ` Oliver Neukum
2015-08-12 10:55 ` Mathias Nyman
0 siblings, 1 reply; 6+ messages in thread
From: Oliver Neukum @ 2015-08-11 8:15 UTC (permalink / raw)
To: Mathias Nyman; +Cc: gregkh, linux-usb, Gavin Shan, stable
On Mon, 2015-08-03 at 16:07 +0300, Mathias Nyman wrote:
> From: Gavin Shan <gwshan@linux.vnet.ibm.com>
>
> When xhci_mem_cleanup() is called, it's possible that the command
> timer hasn't been initialized and scheduled. In those cases, deleting
> the command timer causes a soft lockup, as the stack dump below shows.
>
> The patch avoids deleting the command timer if it's not scheduled
> with the help of timer_pending().
Are you sure this is safe? timer_pending() will not show you that
the timer function is running. It looks like you introduced a race
between timeout and cleanup.
Regards
Oliver
* Re: [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary
2015-08-11 8:15 ` Oliver Neukum
@ 2015-08-12 10:55 ` Mathias Nyman
2015-08-12 13:08 ` Oliver Neukum
2015-08-12 16:18 ` Greg KH
0 siblings, 2 replies; 6+ messages in thread
From: Mathias Nyman @ 2015-08-12 10:55 UTC (permalink / raw)
To: Oliver Neukum, gregkh; +Cc: linux-usb, Gavin Shan, stable
On 11.08.2015 11:15, Oliver Neukum wrote:
> On Mon, 2015-08-03 at 16:07 +0300, Mathias Nyman wrote:
>> From: Gavin Shan <gwshan@linux.vnet.ibm.com>
>>
>> When xhci_mem_cleanup() is called, it's possible that the command
>> timer hasn't been initialized and scheduled. In those cases, deleting
>> the command timer causes a soft lockup, as the stack dump below shows.
>>
>> The patch avoids deleting the command timer if it's not scheduled
>> with the help of timer_pending().
>
> Are you sure this is safe? timer_pending() will not show you that
> the timer function is running. It looks like you introduced a race
> between timeout and cleanup.
>
Looking at it in more detail you're right.
Fortunately this can only happen in cases where xhci is already hosed
(no command response for 5 seconds), and we are at the same time
anyway about to remove xhci.
Doesn't this mean that all cases with
	if (timer_pending(&timer))
		del_timer_sync(&timer);
is just basically the same as a plain del_timer(&timer)?
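
The question can be explored with a toy state model. This is a rough sketch and not the kernel timer implementation: `struct model_timer` and the `model_*` functions are hypothetical, each function returns true when the handler could still be executing on return, and the model assumes the handler never re-arms its own timer, so a timer is never pending and running at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, mutually exclusive timer states. */
enum model_state {
    MODEL_IDLE,       /* not queued, handler not running */
    MODEL_PENDING,    /* queued, waiting to expire */
    MODEL_RUNNING     /* dequeued, handler executing on another CPU */
};

struct model_timer {
    enum model_state state;
};

/* Models timer_pending(): true only while the timer is queued. */
bool model_timer_pending(const struct model_timer *t)
{
    assert(t != NULL);
    return t->state == MODEL_PENDING;
}

/* Models del_timer(): dequeue a pending timer, never wait for the handler. */
bool model_del_timer(struct model_timer *t)
{
    assert(t != NULL);
    if (t->state == MODEL_PENDING)
        t->state = MODEL_IDLE;
    return t->state == MODEL_RUNNING;   /* true: handler still in flight */
}

/* Models del_timer_sync(): dequeue and also wait for a running handler. */
bool model_del_timer_sync(struct model_timer *t)
{
    assert(t != NULL);
    t->state = MODEL_IDLE;              /* the wait is modelled as instantaneous */
    return false;
}

/* Models the guarded idiom: if (timer_pending()) del_timer_sync(). */
bool model_guarded_del(struct model_timer *t)
{
    if (model_timer_pending(t))
        return model_del_timer_sync(t);
    return t->state == MODEL_RUNNING;   /* guard skipped the wait */
}
```

Under these assumptions the guarded form returns the same result as the del_timer() model in all three states. In the running state both leave the handler in flight, which is the window between timeout and cleanup, and only the unguarded del_timer_sync() model closes it.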
Anyways, turns out that the error path in xhci initialization code can end up calling
del_timer_sync() before timer is initialized. This should be fixed by re-arranging
some code in xhci initialization instead.
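
As a rough illustration of the re-arrangement idea, and not the actual follow-up patch: the sketch below uses a hypothetical `struct model_hcd` and `model_*` functions to show why setting up the timer before any step that can fail lets the cleanup path delete it unconditionally. The flag `bad_delete` merely records that cleanup touched a timer that was never set up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xhci_hcd; it only tracks timer setup. */
struct model_hcd {
    bool timer_ready;   /* the command timer has been initialized */
    bool bad_delete;    /* cleanup deleted a timer that was never set up */
};

/* Cleanup deletes the timer unconditionally, like the original code did. */
void model_cleanup(struct model_hcd *h)
{
    assert(h != NULL);
    if (!h->timer_ready)
        h->bad_delete = true;   /* models del_timer_sync() on an uninitialized timer */
    h->timer_ready = false;
}

/* Problematic ordering: a failing step runs cleanup before the timer exists. */
int model_init_timer_late(struct model_hcd *h, bool alloc_fails)
{
    assert(h != NULL);
    if (alloc_fails) {
        model_cleanup(h);
        return -1;
    }
    h->timer_ready = true;
    return 0;
}

/* Re-arranged ordering: the timer is set up before anything can fail. */
int model_init_timer_early(struct model_hcd *h, bool alloc_fails)
{
    assert(h != NULL);
    h->timer_ready = true;
    if (alloc_fails) {
        model_cleanup(h);
        return -1;
    }
    return 0;
}
```

In this model a failing allocation sets `bad_delete` only with the late ordering. With the early ordering the error path always sees an initialized timer, so no timer_pending() guard is needed in cleanup.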
Greg, should this be reverted in rc7?
I think that the possible side effect of this patch is still lesser than the original
issue.
Thanks for spotting this
-Mathias
* Re: [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary
2015-08-12 10:55 ` Mathias Nyman
@ 2015-08-12 13:08 ` Oliver Neukum
2015-08-12 16:18 ` Greg KH
1 sibling, 0 replies; 6+ messages in thread
From: Oliver Neukum @ 2015-08-12 13:08 UTC (permalink / raw)
To: Mathias Nyman; +Cc: gregkh, linux-usb, Gavin Shan, stable
On Wed, 2015-08-12 at 13:55 +0300, Mathias Nyman wrote:
> Fortunately this can only happen in cases where xhci is already hosed
> (no command response for 5 seconds), and we are at the same time
> anyway about to remove xhci.
>
> Doesn't this mean that all cases with
> if (timer_pending(&timer))
> del_timer_sync(&timer)
>
> is just basically the same as a plain del_timer(&timer)?
Yes. I never understood the idiom.
> Anyways, turns out that the error path in xhci initialization code can end up calling
> del_timer_sync() before timer is initialized. This should be fixed by re-arranging
> some code in xhci initialization instead.
Good.
> Greg, should this be reverted in rc7?
> I think that the possible side effect of this patch is still lesser than the original
> issue.
I agree.
Regards
Oliver
* Re: [PATCH 2/2] drivers/usb: Delete XHCI command timer if necessary
2015-08-12 10:55 ` Mathias Nyman
2015-08-12 13:08 ` Oliver Neukum
@ 2015-08-12 16:18 ` Greg KH
1 sibling, 0 replies; 6+ messages in thread
From: Greg KH @ 2015-08-12 16:18 UTC (permalink / raw)
To: Mathias Nyman; +Cc: Oliver Neukum, linux-usb, Gavin Shan, stable
On Wed, Aug 12, 2015 at 01:55:34PM +0300, Mathias Nyman wrote:
> On 11.08.2015 11:15, Oliver Neukum wrote:
> > On Mon, 2015-08-03 at 16:07 +0300, Mathias Nyman wrote:
> >> From: Gavin Shan <gwshan@linux.vnet.ibm.com>
> >>
> >> When xhci_mem_cleanup() is called, it's possible that the command
> >> timer hasn't been initialized and scheduled. In those cases, deleting
> >> the command timer causes a soft lockup, as the stack dump below shows.
> >>
> >> The patch avoids deleting the command timer if it's not scheduled
> >> with the help of timer_pending().
> >
> > Are you sure this is safe? timer_pending() will not show you that
> > the timer function is running. It looks like you introduced a race
> > between timeout and cleanup.
> >
>
> Looking at it in more detail you're right.
>
> Fortunately this can only happen in cases where xhci is already hosed
> (no command response for 5 seconds), and we are at the same time
> anyway about to remove xhci.
>
> Doesn't this mean that all cases with
> if (timer_pending(&timer))
> del_timer_sync(&timer)
>
> is just basically the same as a plain del_timer(&timer)?
>
> Anyways, turns out that the error path in xhci initialization code can end up calling
> del_timer_sync() before timer is initialized. This should be fixed by re-arranging
> some code in xhci initialization instead.
>
> Greg, should this be reverted in rc7?
> I think that the possible side effect of this patch is still lesser than the original
> issue.
Just fix it "right" in a new patch.
thanks,
greg k-h