* [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned
@ 2015-01-30 19:54 Tim Chen
2015-01-30 19:58 ` Greg KH
2015-01-30 21:03 ` Sergei Shtylyov
0 siblings, 2 replies; 20+ messages in thread
From: Tim Chen @ 2015-01-30 19:54 UTC (permalink / raw)
To: H. Peter Anvin, Akinobu Mita, Mathias Nyman
Cc: Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski,
Thomas Gleixner, linux-kernel, x86, linux-usb
Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory
if CMA is enabled") changed the dma_alloc_coherent page clearance from
using an __GFP_ZERO in page allocation to not setting the flag but doing
an explicit memory clear at the end.
However the memory clear only covered the memory size that
was requested, but may not be up to the full extent of the
last page, if the total pages returned exceed the
memory size requested. This behavior has caused problem with XHCI
and caused it to hang and froze usb:
kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead
kernel: xhci_hcd 0000:00:14.0: Abort command ring failed
kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up
kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID
kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64.
Other drivers may have similar issue if it assumes that the pages
allocated are completely zeroed.
This patch ensures that the pages returned are fully cleared.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
arch/x86/kernel/pci-dma.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index a25e202..534470f 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -125,6 +125,8 @@ again:
return NULL;
}
+ /* round up to full page size */
+ size = (((size-1) >> PAGE_SHIFT) + 1) * PAGE_SIZE;
memset(page_address(page), 0, size);
*dma_addr = addr;
return page_address(page);
--
1.9.3
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-01-30 19:54 [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned Tim Chen @ 2015-01-30 19:58 ` Greg KH 2015-01-30 22:01 ` Tim Chen 2015-01-30 21:03 ` Sergei Shtylyov 1 sibling, 1 reply; 20+ messages in thread From: Greg KH @ 2015-01-30 19:58 UTC (permalink / raw) To: Tim Chen Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb On Fri, Jan 30, 2015 at 11:54:01AM -0800, Tim Chen wrote: > > Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory > if CMA is enabled") changed the dma_alloc_coherent page clearance from > using an __GFP_ZERO in page allocation to not setting the flag but doing > an explicit memory clear at the end. > > However the memory clear only covered the memory size that > was requested, but may not be up to the full extent of the > last page, if the total pages returned exceed the > memory size requested. This behavior has caused problem with XHCI > and caused it to hang and froze usb: > > kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead > kernel: xhci_hcd 0000:00:14.0: Abort command ring failed > kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up > kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID > kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64. > > Other drivers may have similar issue if it assumes that the pages > allocated are completely zeroed. > > This patch ensures that the pages returned are fully cleared. > > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> > --- > arch/x86/kernel/pci-dma.c | 2 ++ > 1 file changed, 2 insertions(+) > Shouldn't this go to stable trees too? Also, why is the xhci driver not asking for the memory it is going to need? If it wants to use the full page, shouldn't it ask for it? thanks, greg k-h ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-01-30 19:58 ` Greg KH @ 2015-01-30 22:01 ` Tim Chen 2015-01-30 22:07 ` Greg KH 0 siblings, 1 reply; 20+ messages in thread From: Tim Chen @ 2015-01-30 22:01 UTC (permalink / raw) To: Greg KH Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb On Fri, 2015-01-30 at 11:58 -0800, Greg KH wrote: > > Shouldn't this go to stable trees too? > Yes. Added the stable tag in updated patch that's resent. > Also, why is the xhci driver not asking for the memory it is going to > need? If it wants to use the full page, shouldn't it ask for it? > I agree that xhci should have done that, but it didn't. Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory if CMA is enabled") changed the behavior of dma_alloc_coherent by clearing only the memory being asked for. So for backward compatibility, clearing the pages completely to revert to dma_alloc_coherent's original behavior is probably the safe thing to do. Thanks. Tim ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-01-30 22:01 ` Tim Chen @ 2015-01-30 22:07 ` Greg KH 2015-01-30 22:16 ` Tim Chen 2015-01-30 22:39 ` Andi Kleen 0 siblings, 2 replies; 20+ messages in thread From: Greg KH @ 2015-01-30 22:07 UTC (permalink / raw) To: Tim Chen Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb On Fri, Jan 30, 2015 at 02:01:58PM -0800, Tim Chen wrote: > On Fri, 2015-01-30 at 11:58 -0800, Greg KH wrote: > > > > > Shouldn't this go to stable trees too? > > > > Yes. Added the stable tag in updated patch that's resent. > > > Also, why is the xhci driver not asking for the memory it is going to > > need? If it wants to use the full page, shouldn't it ask for it? > > > > I agree that xhci should have done that, but it didn't. Commit > d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory > if CMA is enabled") changed the behavior of dma_alloc_coherent > by clearing only the memory being asked for. > > So for backward compatibility, clearing the pages > completely to revert to dma_alloc_coherent's original > behavior is probably the safe thing to do. We don't "need" any backward compatility, why not fix the broken drivers that are using memory outside of what they are asking for? That's not ok no matter what, right? thanks, greg k-h ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-01-30 22:07 ` Greg KH @ 2015-01-30 22:16 ` Tim Chen 2015-01-30 22:39 ` Andi Kleen 1 sibling, 0 replies; 20+ messages in thread From: Tim Chen @ 2015-01-30 22:16 UTC (permalink / raw) To: Greg KH Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb On Fri, 2015-01-30 at 14:07 -0800, Greg KH wrote: > On Fri, Jan 30, 2015 at 02:01:58PM -0800, Tim Chen wrote: > > On Fri, 2015-01-30 at 11:58 -0800, Greg KH wrote: > > > > > > > > Shouldn't this go to stable trees too? > > > > > > > Yes. Added the stable tag in updated patch that's resent. > > > > > Also, why is the xhci driver not asking for the memory it is going to > > > need? If it wants to use the full page, shouldn't it ask for it? > > > > > > > I agree that xhci should have done that, but it didn't. Commit > > d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory > > if CMA is enabled") changed the behavior of dma_alloc_coherent > > by clearing only the memory being asked for. > > > > So for backward compatibility, clearing the pages > > completely to revert to dma_alloc_coherent's original > > behavior is probably the safe thing to do. > > We don't "need" any backward compatility, why not fix the broken drivers > that are using memory outside of what they are asking for? That's not > ok no matter what, right? Not disagreeing with you. I'll be equally happy if the xhci folks can fix the driver. Mathias? Tim ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-01-30 22:07 ` Greg KH 2015-01-30 22:16 ` Tim Chen @ 2015-01-30 22:39 ` Andi Kleen 2015-01-31 17:14 ` Alan Stern 1 sibling, 1 reply; 20+ messages in thread From: Andi Kleen @ 2015-01-30 22:39 UTC (permalink / raw) To: Greg KH Cc: Tim Chen, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb > We don't "need" any backward compatility, why not fix the broken drivers > that are using memory outside of what they are asking for? That's not > ok no matter what, right? How would you find them all? We don't even know which place in XHCI is the culprit here. Yes iff you can find them it would be good to fix. -Andi -- ak@linux.intel.com -- Speaking for myself only. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-01-30 22:39 ` Andi Kleen @ 2015-01-31 17:14 ` Alan Stern 0 siblings, 0 replies; 20+ messages in thread From: Alan Stern @ 2015-01-31 17:14 UTC (permalink / raw) To: Andi Kleen Cc: Greg KH, Tim Chen, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb On Fri, 30 Jan 2015, Andi Kleen wrote: > > We don't "need" any backward compatility, why not fix the broken drivers > > that are using memory outside of what they are asking for? That's not > > ok no matter what, right? > > How would you find them all? > > We don't even know which place in XHCI is the culprit here. It shouldn't be too hard to find, expecially if the fault can easily be reproduced. There are only about half a dozen calls to dma_alloc_coherent in the xhci-hcd driver. Just add 64 KB to the requested memory size in each of them. Then one-by-one (or using bisection) remove the extra size and see which spots trigger the fault. Alan Stern ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-01-30 19:54 [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned Tim Chen 2015-01-30 19:58 ` Greg KH @ 2015-01-30 21:03 ` Sergei Shtylyov 2015-01-30 21:54 ` Tim Chen 1 sibling, 1 reply; 20+ messages in thread From: Sergei Shtylyov @ 2015-01-30 21:03 UTC (permalink / raw) To: Tim Chen, H. Peter Anvin, Akinobu Mita, Mathias Nyman Cc: Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb On 01/30/2015 10:54 PM, Tim Chen wrote: > Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory > if CMA is enabled") changed the dma_alloc_coherent page clearance from > using an __GFP_ZERO in page allocation to not setting the flag but doing > an explicit memory clear at the end. > However the memory clear only covered the memory size that > was requested, but may not be up to the full extent of the > last page, if the total pages returned exceed the > memory size requested. This behavior has caused problem with XHCI > and caused it to hang and froze usb: > kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead > kernel: xhci_hcd 0000:00:14.0: Abort command ring failed > kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up > kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID > kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64. > Other drivers may have similar issue if it assumes that the pages > allocated are completely zeroed. > This patch ensures that the pages returned are fully cleared. > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> > --- > arch/x86/kernel/pci-dma.c | 2 ++ > 1 file changed, 2 insertions(+) > diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c > index a25e202..534470f 100644 > --- a/arch/x86/kernel/pci-dma.c > +++ b/arch/x86/kernel/pci-dma.c > @@ -125,6 +125,8 @@ again: > > return NULL; > } > + /* round up to full page size */ > + size = (((size-1) >> PAGE_SHIFT) + 1) * PAGE_SIZE; This is quite suboptimal formula, we can do without shifts and multiplications (hopefully, still converted to shifts by gcc): size = (size + ~PAGE_MASK) & PAGE_MASK; WBR, Sergei ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-01-30 21:03 ` Sergei Shtylyov @ 2015-01-30 21:54 ` Tim Chen 2015-02-02 14:02 ` Jiri Slaby 0 siblings, 1 reply; 20+ messages in thread From: Tim Chen @ 2015-01-30 21:54 UTC (permalink / raw) To: Sergei Shtylyov Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable On Sat, 2015-01-31 at 00:03 +0300, Sergei Shtylyov wrote: > On 01/30/2015 10:54 PM, Tim Chen wrote: > > > > return NULL; > > } > > + /* round up to full page size */ > > + size = (((size-1) >> PAGE_SHIFT) + 1) * PAGE_SIZE; > > This is quite suboptimal formula, we can do without shifts and > multiplications (hopefully, still converted to shifts by gcc): > > size = (size + ~PAGE_MASK) & PAGE_MASK; Agree. I've updated patch below Thanks. Tim --->8--- From: Tim Chen <tim.c.chen@linux.intel.com> Subject: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory if CMA is enabled") changed the dma_alloc_coherent page clearance from using an __GFP_ZERO in page allocation to not setting the flag but doing an explicit memory clear at the end. However the memory clear only covered the memory size that was requested, but may not be up to the full extent of the last page, if the total pages returned exceed the memory size requested. This behavior has caused problem with XHCI and caused it to hang: kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead kernel: xhci_hcd 0000:00:14.0: Abort command ring failed kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64. Other drivers may have similar issue if it assumes that the pages allocated are completely zeroed. This patch ensures that the pages returned are fully cleared. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: <stable@vger.kernel.org> # 3.16+ --- arch/x86/kernel/pci-dma.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c index a25e202..e9d8dee 100644 --- a/arch/x86/kernel/pci-dma.c +++ b/arch/x86/kernel/pci-dma.c @@ -125,6 +125,8 @@ again: return NULL; } + /* round up to full page size */ + size = (size + ~PAGE_MASK) & PAGE_MASK; memset(page_address(page), 0, size); *dma_addr = addr; return page_address(page); -- 1.9.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-01-30 21:54 ` Tim Chen @ 2015-02-02 14:02 ` Jiri Slaby 2015-02-02 16:39 ` Sergei Shtylyov 0 siblings, 1 reply; 20+ messages in thread From: Jiri Slaby @ 2015-02-02 14:02 UTC (permalink / raw) To: Tim Chen, Sergei Shtylyov Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable On 01/30/2015, 10:54 PM, Tim Chen wrote: > On Sat, 2015-01-31 at 00:03 +0300, Sergei Shtylyov wrote: >> On 01/30/2015 10:54 PM, Tim Chen wrote: > >>> >>> return NULL; >>> } >>> + /* round up to full page size */ >>> + size = (((size-1) >> PAGE_SHIFT) + 1) * PAGE_SIZE; >> >> This is quite suboptimal formula, we can do without shifts and >> multiplications (hopefully, still converted to shifts by gcc): >> >> size = (size + ~PAGE_MASK) & PAGE_MASK; > > Agree. I've updated patch below > > Thanks. > > Tim > > --->8--- > > From: Tim Chen <tim.c.chen@linux.intel.com> > Subject: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned > > > Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory > if CMA is enabled") changed the dma_alloc_coherent page clearance from > using an __GFP_ZERO in page allocation to not setting the flag but doing > an explicit memory clear at the end. > > However the memory clear only covered the memory size that > was requested, but may not be up to the full extent of the > last page, if the total pages returned exceed the > memory size requested. This behavior has caused problem with XHCI > and caused it to hang: > > kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead > kernel: xhci_hcd 0000:00:14.0: Abort command ring failed > kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up > kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID > kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64. > > Other drivers may have similar issue if it assumes that the pages > allocated are completely zeroed. > > This patch ensures that the pages returned are fully cleared. > > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> > Cc: <stable@vger.kernel.org> # 3.16+ > > --- > arch/x86/kernel/pci-dma.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c > index a25e202..e9d8dee 100644 > --- a/arch/x86/kernel/pci-dma.c > +++ b/arch/x86/kernel/pci-dma.c > @@ -125,6 +125,8 @@ again: > > return NULL; > } > + /* round up to full page size */ > + size = (size + ~PAGE_MASK) & PAGE_MASK; Hi, is this an open-coded version of PAGE_ALIGN? > memset(page_address(page), 0, size); > *dma_addr = addr; > return page_address(page); > -- js suse labs ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-02-02 14:02 ` Jiri Slaby @ 2015-02-02 16:39 ` Sergei Shtylyov 2015-02-04 18:30 ` Tim Chen 0 siblings, 1 reply; 20+ messages in thread From: Sergei Shtylyov @ 2015-02-02 16:39 UTC (permalink / raw) To: Jiri Slaby, Tim Chen Cc: H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable Hello. On 02/02/2015 05:02 PM, Jiri Slaby wrote: >> From: Tim Chen <tim.c.chen@linux.intel.com> >> Subject: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned >> Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory >> if CMA is enabled") changed the dma_alloc_coherent page clearance from >> using an __GFP_ZERO in page allocation to not setting the flag but doing >> an explicit memory clear at the end. >> However the memory clear only covered the memory size that >> was requested, but may not be up to the full extent of the >> last page, if the total pages returned exceed the >> memory size requested. This behavior has caused problem with XHCI >> and caused it to hang: >> kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead >> kernel: xhci_hcd 0000:00:14.0: Abort command ring failed >> kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up >> kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID >> kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64. >> Other drivers may have similar issue if it assumes that the pages >> allocated are completely zeroed. >> This patch ensures that the pages returned are fully cleared. >> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> >> Cc: <stable@vger.kernel.org> # 3.16+ >> --- >> arch/x86/kernel/pci-dma.c | 2 ++ >> 1 file changed, 2 insertions(+) >> diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c >> index a25e202..e9d8dee 100644 >> --- a/arch/x86/kernel/pci-dma.c >> +++ b/arch/x86/kernel/pci-dma.c >> @@ -125,6 +125,8 @@ again: >> >> return NULL; >> } >> + /* round up to full page size */ >> + size = (size + ~PAGE_MASK) & PAGE_MASK; > Hi, is this an open-coded version of PAGE_ALIGN? Yes, it appears so. :-) WBR, Sergei ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-02-02 16:39 ` Sergei Shtylyov @ 2015-02-04 18:30 ` Tim Chen 2015-02-18 19:40 ` Tim Chen 0 siblings, 1 reply; 20+ messages in thread From: Tim Chen @ 2015-02-04 18:30 UTC (permalink / raw) To: Sergei Shtylyov Cc: Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable On Mon, 2015-02-02 at 19:39 +0300, Sergei Shtylyov wrote: > > > Hi, is this an open-coded version of PAGE_ALIGN? > > Yes, it appears so. :-) > > WBR, Sergei > Thanks for the suggestion by Jiri. I updated the patch to use PAGE_ALIGN below. Regards, Tim --->8--- From: Tim Chen <tim.c.chen@linux.intel.com> Subject: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory if CMA is enabled") changed the dma_alloc_coherent page clearance from using an __GFP_ZERO in page allocation to not setting the flag but doing an explicit memory clear at the end. However the memory clear only covered the memory size that was requested, but may not be up to the full extent of the last page, if the total pages returned exceed the memory size requested. This behavior has caused problem with XHCI and caused it to hang: kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead kernel: xhci_hcd 0000:00:14.0: Abort command ring failed kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64. Other drivers may have similar issue if it assumes that the pages allocated are completely zeroed. This patch ensures that the pages returned are fully cleared. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: stable@vger.kernel.org --- arch/x86/kernel/pci-dma.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c index a25e202..3bdee55 100644 --- a/arch/x86/kernel/pci-dma.c +++ b/arch/x86/kernel/pci-dma.c @@ -125,6 +125,8 @@ again: return NULL; } + /* round up to full page size */ + size = PAGE_ALIGN(size); memset(page_address(page), 0, size); *dma_addr = addr; return page_address(page); -- 1.9.3 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-02-04 18:30 ` Tim Chen @ 2015-02-18 19:40 ` Tim Chen 2015-02-18 19:53 ` Alan Stern 0 siblings, 1 reply; 20+ messages in thread From: Tim Chen @ 2015-02-18 19:40 UTC (permalink / raw) To: Sergei Shtylyov, Greg Kroah-Hartman Cc: Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable, Alan Stern On Wed, 2015-02-04 at 10:30 -0800, Tim Chen wrote: > On Mon, 2015-02-02 at 19:39 +0300, Sergei Shtylyov wrote: > > > > > > Hi, is this an open-coded version of PAGE_ALIGN? > > > > Yes, it appears so. :-) > > > > WBR, Sergei > > > > Thanks for the suggestion by Jiri. I updated the patch to use PAGE_ALIGN > below. > > Regards, > Tim > Is there any resolution on this patch? I haven't seen fixes from the XHCI folks yet. This is breaking many of our systems. Thanks. Tim > --->8--- > From: Tim Chen <tim.c.chen@linux.intel.com> > Subject: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned > > Commit d92ef66c4f8f ("x86: make dma_alloc_coherent() return zeroed memory > if CMA is enabled") changed the dma_alloc_coherent page clearance from > using an __GFP_ZERO in page allocation to not setting the flag but doing > an explicit memory clear at the end. > > However the memory clear only covered the memory size that > was requested, but may not be up to the full extent of the > last page, if the total pages returned exceed the > memory size requested. This behavior has caused problem with XHCI > and caused it to hang: > > kernel: xhci_hcd 0000:00:14.0: Stopped the command ring failed, maybe the host is dead > kernel: xhci_hcd 0000:00:14.0: Abort command ring failed > kernel: xhci_hcd 0000:00:14.0: HC died; cleaning up > kernel: xhci_hcd 0000:00:14.0: Error while assigning device slot ID > kernel: xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 64. > > Other drivers may have similar issue if it assumes that the pages > allocated are completely zeroed. > > This patch ensures that the pages returned are fully cleared. > > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> > Cc: stable@vger.kernel.org > --- > arch/x86/kernel/pci-dma.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c > index a25e202..3bdee55 100644 > --- a/arch/x86/kernel/pci-dma.c > +++ b/arch/x86/kernel/pci-dma.c > @@ -125,6 +125,8 @@ again: > > return NULL; > } > + /* round up to full page size */ > + size = PAGE_ALIGN(size); > memset(page_address(page), 0, size); > *dma_addr = addr; > return page_address(page); ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-02-18 19:40 ` Tim Chen @ 2015-02-18 19:53 ` Alan Stern 2015-02-18 20:19 ` Tim Chen 0 siblings, 1 reply; 20+ messages in thread From: Alan Stern @ 2015-02-18 19:53 UTC (permalink / raw) To: Tim Chen Cc: Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable On Wed, 18 Feb 2015, Tim Chen wrote: > On Wed, 2015-02-04 at 10:30 -0800, Tim Chen wrote: > > On Mon, 2015-02-02 at 19:39 +0300, Sergei Shtylyov wrote: > > > > > > > > > Hi, is this an open-coded version of PAGE_ALIGN? > > > > > > Yes, it appears so. :-) > > > > > > WBR, Sergei > > > > > > > Thanks for the suggestion by Jiri. I updated the patch to use PAGE_ALIGN > > below. > > > > Regards, > > Tim > > > > Is there any resolution on this patch? I haven't seen fixes from the > XHCI folks yet. This is breaking many of our systems. Have you tried doing the experiments I suggested in http://marc.info/?l=linux-usb&m=142272448620716&w=2 to determine where the problem occurs? Alan Stern ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-02-18 19:53 ` Alan Stern @ 2015-02-18 20:19 ` Tim Chen 2015-02-18 20:36 ` Greg Kroah-Hartman ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Tim Chen @ 2015-02-18 20:19 UTC (permalink / raw) To: Alan Stern Cc: Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable On Wed, 2015-02-18 at 14:53 -0500, Alan Stern wrote: > On Wed, 18 Feb 2015, Tim Chen wrote: > > > On Wed, 2015-02-04 at 10:30 -0800, Tim Chen wrote: > > > On Mon, 2015-02-02 at 19:39 +0300, Sergei Shtylyov wrote: > > > > > > > > > > > > Hi, is this an open-coded version of PAGE_ALIGN? > > > > > > > > Yes, it appears so. :-) > > > > > > > > WBR, Sergei > > > > > > > > > > Thanks for the suggestion by Jiri. I updated the patch to use PAGE_ALIGN > > > below. > > > > > > Regards, > > > Tim > > > > > > > Is there any resolution on this patch? I haven't seen fixes from the > > XHCI folks yet. This is breaking many of our systems. > > Have you tried doing the experiments I suggested in > > http://marc.info/?l=linux-usb&m=142272448620716&w=2 > > to determine where the problem occurs? > I was bogged down with other things lately and I haven't got a chance to test that. But as you said, there's very few places where xhci call this memory allocation. So I think the problem has been fairly narrowed down for the XHCI folks. Tim > Alan Stern > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-02-18 20:19 ` Tim Chen @ 2015-02-18 20:36 ` Greg Kroah-Hartman 2015-02-18 20:39 ` Andi Kleen 2015-02-18 20:45 ` Alan Stern 2 siblings, 0 replies; 20+ messages in thread From: Greg Kroah-Hartman @ 2015-02-18 20:36 UTC (permalink / raw) To: Tim Chen Cc: Alan Stern, Sergei Shtylyov, Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable On Wed, Feb 18, 2015 at 12:19:03PM -0800, Tim Chen wrote: > On Wed, 2015-02-18 at 14:53 -0500, Alan Stern wrote: > > On Wed, 18 Feb 2015, Tim Chen wrote: > > > > > On Wed, 2015-02-04 at 10:30 -0800, Tim Chen wrote: > > > > On Mon, 2015-02-02 at 19:39 +0300, Sergei Shtylyov wrote: > > > > > > > > > > > > > > > Hi, is this an open-coded version of PAGE_ALIGN? > > > > > > > > > > Yes, it appears so. :-) > > > > > > > > > > WBR, Sergei > > > > > > > > > > > > > Thanks for the suggestion by Jiri. I updated the patch to use PAGE_ALIGN > > > > below. > > > > > > > > Regards, > > > > Tim > > > > > > > > > > Is there any resolution on this patch? I haven't seen fixes from the > > > XHCI folks yet. This is breaking many of our systems. > > > > Have you tried doing the experiments I suggested in > > > > http://marc.info/?l=linux-usb&m=142272448620716&w=2 > > > > to determine where the problem occurs? > > > > I was bogged down with other things lately and I haven't got a chance to > test that. But as you said, there's very few places where xhci > call this memory allocation. So I think the problem has been fairly > narrowed down for the XHCI folks. The "XHCI folks" are in a cube near you, go poke them in person if they aren't answering their emails :) good luck, greg k-h ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-02-18 20:19 ` Tim Chen 2015-02-18 20:36 ` Greg Kroah-Hartman @ 2015-02-18 20:39 ` Andi Kleen 2015-02-18 21:15 ` Alan Stern 2015-02-18 20:45 ` Alan Stern 2 siblings, 1 reply; 20+ messages in thread From: Andi Kleen @ 2015-02-18 20:39 UTC (permalink / raw) To: Tim Chen Cc: Alan Stern, Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable, torvalds > > Have you tried doing the experiments I suggested in > > > > http://marc.info/?l=linux-usb&m=142272448620716&w=2 > > > > to determine where the problem occurs? > > > > I was bogged down with other things lately and I haven't got a chance to > test that. But as you said, there's very few places where xhci > call this memory allocation. So I think the problem has been fairly > narrowed down for the XHCI folks. Also I don't really understand why we're even discussing this. The patch only makes an widely used API behave as it was before. Who knows who else was broken with this change. There's no sane way to audit all users. There is no real advantage of the new behavior. The only good way is to revert to old behavior, like in Tim's original patch. And doing it quickly for mainline and stable. FWIW we have a large number of systems here that are broken without this change. -Andi -- ak@linux.intel.com -- Speaking for myself only. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-02-18 20:39 ` Andi Kleen @ 2015-02-18 21:15 ` Alan Stern 0 siblings, 0 replies; 20+ messages in thread From: Alan Stern @ 2015-02-18 21:15 UTC (permalink / raw) To: Andi Kleen Cc: Tim Chen, Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable, torvalds On Wed, 18 Feb 2015, Andi Kleen wrote: > > > Have you tried doing the experiments I suggested in > > > > > > http://marc.info/?l=linux-usb&m=142272448620716&w=2 > > > > > > to determine where the problem occurs? > > > > > > > I was bogged down with other things lately and I haven't got a chance to > > test that. But as you said, there's very few places where xhci > > call this memory allocation. So I think the problem has been fairly > > narrowed down for the XHCI folks. > > Also I don't really understand why we're even discussing this. The patch > only makes an widely used API behave as it was before. Who knows who > else was broken with this change. There's no sane way to audit all > users. There is no real advantage of the new behavior. We are discussing it because fixing problems is better than papering around them. > The only good way is to revert to old behavior, like in Tim's > original patch. And doing it quickly for mainline and stable. I will agree that applying the patch is a reasonable thing to do. However, I also believe that it is important to fix the bugs revealed by the API change. > FWIW we have a large number of systems here that are broken > without this change. For all you know, they will still be broken even after the change is applied. The breakage may become less obvious, but that doesn't mean it will disappear entirely. Alan Stern ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-02-18 20:19 ` Tim Chen 2015-02-18 20:36 ` Greg Kroah-Hartman 2015-02-18 20:39 ` Andi Kleen @ 2015-02-18 20:45 ` Alan Stern 2015-02-18 23:59 ` Tim Chen 2 siblings, 1 reply; 20+ messages in thread From: Alan Stern @ 2015-02-18 20:45 UTC (permalink / raw) To: Tim Chen Cc: Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable On Wed, 18 Feb 2015, Tim Chen wrote: > > Have you tried doing the experiments I suggested in > > > > http://marc.info/?l=linux-usb&m=142272448620716&w=2 > > > > to determine where the problem occurs? > > > > I was bogged down with other things lately and I haven't got a chance to > test that. But as you said, there's very few places where xhci > call this memory allocation. So I think the problem has been fairly > narrowed down for the XHCI folks. I disagree. _You_ reported the error. How can you expect other people to figure out where it is with no help from you? I looked briefly at the xhci-hcd DMA allocation code. It does not contain any obvious mistakes. Alan Stern ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned 2015-02-18 20:45 ` Alan Stern @ 2015-02-18 23:59 ` Tim Chen 0 siblings, 0 replies; 20+ messages in thread From: Tim Chen @ 2015-02-18 23:59 UTC (permalink / raw) To: Alan Stern Cc: Sergei Shtylyov, Greg Kroah-Hartman, Jiri Slaby, H. Peter Anvin, Akinobu Mita, Mathias Nyman, Andi Kleen, Ingo Molnar, Andrew Morton, Marek Szyprowski, Thomas Gleixner, linux-kernel, x86, linux-usb, stable On Wed, 2015-02-18 at 15:45 -0500, Alan Stern wrote: > On Wed, 18 Feb 2015, Tim Chen wrote: > > > > Have you tried doing the experiments I suggested in > > > > > > http://marc.info/?l=linux-usb&m=142272448620716&w=2 > > > > > > to determine where the problem occurs? > > > > > > > I was bogged down with other things lately and I haven't got a chance to > > test that. But as you said, there's very few places where xhci > > call this memory allocation. So I think the problem has been fairly > > narrowed down for the XHCI folks. > > I disagree. _You_ reported the error. How can you expect other people > to figure out where it is with no help from you? > > I looked briefly at the xhci-hcd DMA allocation code. It does not > contain any obvious mistakes. > > Alan Stern > The error and my quick fix is right here. And xhci still needs to be fixed properly. I'll send out the patch below in a proper patch. Tim diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c index 5cb3d7a..39e7196 100644 --- a/drivers/usb/host/xhci-mem.c +++ b/drivers/usb/host/xhci-mem.c @@ -1658,7 +1658,7 @@ static int scratchpad_alloc(struct xhci_hcd *xhci, gfp_t flags) goto fail_sp; xhci->scratchpad->sp_array = dma_alloc_coherent(dev, - num_sp * sizeof(u64), + PAGE_ALIGN(num_sp * sizeof(u64)), &xhci->scratchpad->sp_dma, flags); if (!xhci->scratchpad->sp_array) ^ permalink raw reply related [flat|nested] 20+ messages in thread
end of thread, other threads:[~2015-02-18 23:59 UTC | newest] Thread overview: 20+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-01-30 19:54 [PATCH] pci-dma: Fix x86 dma_alloc_coherent to fully clear all pages returned Tim Chen 2015-01-30 19:58 ` Greg KH 2015-01-30 22:01 ` Tim Chen 2015-01-30 22:07 ` Greg KH 2015-01-30 22:16 ` Tim Chen 2015-01-30 22:39 ` Andi Kleen 2015-01-31 17:14 ` Alan Stern 2015-01-30 21:03 ` Sergei Shtylyov 2015-01-30 21:54 ` Tim Chen 2015-02-02 14:02 ` Jiri Slaby 2015-02-02 16:39 ` Sergei Shtylyov 2015-02-04 18:30 ` Tim Chen 2015-02-18 19:40 ` Tim Chen 2015-02-18 19:53 ` Alan Stern 2015-02-18 20:19 ` Tim Chen 2015-02-18 20:36 ` Greg Kroah-Hartman 2015-02-18 20:39 ` Andi Kleen 2015-02-18 21:15 ` Alan Stern 2015-02-18 20:45 ` Alan Stern 2015-02-18 23:59 ` Tim Chen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox