* [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug
@ 2016-06-10 13:03 Mauricio Faria de Oliveira
2016-06-11 23:02 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 10+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-06-10 13:03 UTC (permalink / raw)
To: linuxppc-dev
This prevents flooding the logs with 'iommu_alloc failed' messages
while I/O is performed (normally) to very fast devices (e.g. NVMe).
That error is not necessarily a problem; device drivers can retry
later / reschedule the requests for which the allocation failed,
and handle things gracefully for the caller stack on top of them.
This helps at least with NVMe devices without "64-bit"/direct DMA
window scenarios (e.g., systems with more than a few terabytes of
memory, on which DDW cannot be enabled, currently), where just an
'dd' command can trigger errors.
# dd if=/dev/zero of=/dev/nvme0n1p1 bs=64k count=512k
<...>
# echo $?
0
# dmesg
nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr c000000151c90000 npages 16
nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr c000000151c90000 npages 16
nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr c000000151c90000 npages 16
<...>
ppc_iommu_map_sg: 8186 callbacks suppressed
nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr c0000000fa5c0000 npages 16
nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr c000000100440000 npages 16
nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr c000000100440000 npages 16
<...>
ppc_iommu_map_sg: 5707 callbacks suppressed
nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr c0000000b5f50000 npages 16
nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr c0000000b5c60000 npages 16
nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr c0000000b4b30000 npages 16
<...>
Tested on next-20160609.
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
---
arch/powerpc/kernel/iommu.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index a8e3490..b585bdc 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -479,10 +479,9 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table *tbl,
/* Handle failure */
if (unlikely(entry == DMA_ERROR_CODE)) {
- if (printk_ratelimit())
- dev_info(dev, "iommu_alloc failed, tbl %p "
- "vaddr %lx npages %lu\n", tbl, vaddr,
- npages);
+ dev_dbg_ratelimited(dev, "iommu_alloc failed, tbl %p "
+ "vaddr %lx npages %lu\n", tbl, vaddr,
+ npages);
goto failure;
}
@@ -776,11 +775,9 @@ dma_addr_t iommu_map_page(struct device *dev, struct iommu_table *tbl,
mask >> tbl->it_page_shift, align,
attrs);
if (dma_handle == DMA_ERROR_CODE) {
- if (printk_ratelimit()) {
- dev_info(dev, "iommu_alloc failed, tbl %p "
- "vaddr %p npages %d\n", tbl, vaddr,
- npages);
- }
+ dev_dbg_ratelimited(dev, "iommu_alloc failed, tbl %p "
+ "vaddr %p npages %d\n", tbl, vaddr,
+ npages);
} else
dma_handle |= (uaddr & ~IOMMU_PAGE_MASK(tbl));
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug
2016-06-10 13:03 [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug Mauricio Faria de Oliveira
@ 2016-06-11 23:02 ` Benjamin Herrenschmidt
2016-06-13 13:27 ` Mauricio Faria de Oliveira
0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-11 23:02 UTC (permalink / raw)
To: Mauricio Faria de Oliveira, linuxppc-dev
On Fri, 2016-06-10 at 10:03 -0300, Mauricio Faria de Oliveira wrote:
> This prevents flooding the logs with 'iommu_alloc failed' messages
> while I/O is performed (normally) to very fast devices (e.g. NVMe).
>
> That error is not necessarily a problem; device drivers can retry
> later / reschedule the requests for which the allocation failed,
> and handle things gracefully for the caller stack on top of them.
>
> This helps at least with NVMe devices without "64-bit"/direct DMA
> window scenarios (e.g., systems with more than a few terabytes of
> memory, on which DDW cannot be enabled, currently), where just an
> 'dd' command can trigger errors.
I'm not fan of this. This is a very useful message to diagnose why,
for example, your network adapter is not working properly.
A lot of drivers don't deal well with IOMMU errors.
The fact that NVME trigger these is a problem that needs to be solved
differently.
Cheers,
Ben.
>
> # dd if=/dev/zero of=/dev/nvme0n1p1 bs=64k count=512k
> <...>
> # echo $?
> 0
>
> # dmesg
> nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr
> c000000151c90000 npages 16
> nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr
> c000000151c90000 npages 16
> nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr
> c000000151c90000 npages 16
> <...>
> ppc_iommu_map_sg: 8186 callbacks suppressed
> nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr
> c0000000fa5c0000 npages 16
> nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr
> c000000100440000 npages 16
> nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr
> c000000100440000 npages 16
> <...>
> ppc_iommu_map_sg: 5707 callbacks suppressed
> nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr
> c0000000b5f50000 npages 16
> nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr
> c0000000b5c60000 npages 16
> nvme 0000:00:06.0: iommu_alloc failed, tbl c0000001fa67c520 vaddr
> c0000000b4b30000 npages 16
> <...>
>
> Tested on next-20160609.
>
> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.co
> m>
> ---
> arch/powerpc/kernel/iommu.c | 15 ++++++---------
> 1 file changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/kernel/iommu.c
> b/arch/powerpc/kernel/iommu.c
> index a8e3490..b585bdc 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -479,10 +479,9 @@ int ppc_iommu_map_sg(struct device *dev, struct
> iommu_table *tbl,
>
> /* Handle failure */
> if (unlikely(entry == DMA_ERROR_CODE)) {
> - if (printk_ratelimit())
> - dev_info(dev, "iommu_alloc failed,
> tbl %p "
> - "vaddr %lx npages %lu\n",
> tbl, vaddr,
> - npages);
> + dev_dbg_ratelimited(dev, "iommu_alloc
> failed, tbl %p "
> + "vaddr %lx npages %lu\n", tbl,
> vaddr,
> + npages);
> goto failure;
> }
>
> @@ -776,11 +775,9 @@ dma_addr_t iommu_map_page(struct device *dev,
> struct iommu_table *tbl,
> mask >> tbl->it_page_shift,
> align,
> attrs);
> if (dma_handle == DMA_ERROR_CODE) {
> - if (printk_ratelimit()) {
> - dev_info(dev, "iommu_alloc failed,
> tbl %p "
> - "vaddr %p npages %d\n",
> tbl, vaddr,
> - npages);
> - }
> + dev_dbg_ratelimited(dev, "iommu_alloc
> failed, tbl %p "
> + "vaddr %p npages %d\n", tbl, vaddr,
> + npages);
> } else
> dma_handle |= (uaddr &
> ~IOMMU_PAGE_MASK(tbl));
> }
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug
2016-06-11 23:02 ` Benjamin Herrenschmidt
@ 2016-06-13 13:27 ` Mauricio Faria de Oliveira
2016-06-13 21:26 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 10+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-06-13 13:27 UTC (permalink / raw)
To: Benjamin Herrenschmidt, linuxppc-dev
Hi Ben,
On 06/11/2016 08:02 PM, Benjamin Herrenschmidt wrote:
> I'm not fan of this. This is a very useful message to diagnose why,
> for example, your network adapter is not working properly.
>
> A lot of drivers don't deal well with IOMMU errors.
>
> The fact that NVME trigger these is a problem that needs to be solved
> differently.
Ok, I understand your points.
Thanks for the review -- helps in setting another direction.
Kind regards,
--
Mauricio Faria de Oliveira
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug
2016-06-13 13:27 ` Mauricio Faria de Oliveira
@ 2016-06-13 21:26 ` Benjamin Herrenschmidt
2016-06-13 21:43 ` Mauricio Faria de Oliveira
2016-08-03 19:39 ` Mauricio Faria de Oliveira
0 siblings, 2 replies; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-13 21:26 UTC (permalink / raw)
To: Mauricio Faria de Oliveira, linuxppc-dev
On Mon, 2016-06-13 at 10:27 -0300, Mauricio Faria de Oliveira wrote:
> Hi Ben,
>
> On 06/11/2016 08:02 PM, Benjamin Herrenschmidt wrote:
> > I'm not fan of this. This is a very useful message to diagnose why,
> > for example, your network adapter is not working properly.
> >
> > A lot of drivers don't deal well with IOMMU errors.
> >
> > The fact that NVME trigger these is a problem that needs to be
> > solved
> > differently.
>
> Ok, I understand your points.
>
> Thanks for the review -- helps in setting another direction.
I've been thinking about this a bit... it might be worthwhile adding
a dma_* call to query the approximate size of the IOMMU window, as
a way for the device to adjust its requirements dynamically.
Another option would be to use a dma_attr for silencing mapping errors
which NVME could use provided it does handle them gracefully ...
Cheers,
Ben.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug
2016-06-13 21:26 ` Benjamin Herrenschmidt
@ 2016-06-13 21:43 ` Mauricio Faria de Oliveira
2016-06-13 21:51 ` Benjamin Herrenschmidt
2016-08-03 19:39 ` Mauricio Faria de Oliveira
1 sibling, 1 reply; 10+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-06-13 21:43 UTC (permalink / raw)
To: Benjamin Herrenschmidt, linuxppc-dev
Hi Ben,
On 06/13/2016 06:26 PM, Benjamin Herrenschmidt wrote:
> I've been thinking about this a bit... it might be worthwhile adding
> a dma_* call to query the approximate size of the IOMMU window, as
> a way for the device to adjust its requirements dynamically.
Ok, cool; something like it was one of the options being discussed here.
What do you mean by 'approximate'? Maybe the size of 'free regions' in
the pools? -- not sure because iiuic the window size is static / 2 gig,
so didn't get why (or of what) to provide an approximation (for).
> Another option would be to use a dma_attr for silencing mapping errors
> which NVME could use provided it does handle them gracefully ...
Ah, that's new. Interesting. Thanks for suggestion!
--
Mauricio Faria de Oliveira
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug
2016-06-13 21:43 ` Mauricio Faria de Oliveira
@ 2016-06-13 21:51 ` Benjamin Herrenschmidt
2016-06-13 22:01 ` Mauricio Faria de Oliveira
0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2016-06-13 21:51 UTC (permalink / raw)
To: Mauricio Faria de Oliveira, linuxppc-dev
On Mon, 2016-06-13 at 18:43 -0300, Mauricio Faria de Oliveira wrote:
> Hi Ben,
>
> On 06/13/2016 06:26 PM, Benjamin Herrenschmidt wrote:
> > I've been thinking about this a bit... it might be worthwhile adding
> > a dma_* call to query the approximate size of the IOMMU window, as
> > a way for the device to adjust its requirements dynamically.
>
> Ok, cool; something like it was one of the options being discussed here.
>
> What do you mean by 'approximate'? Maybe the size of 'free regions' in
> the pools? -- not sure because iiuic the window size is static / 2 gig,
> so didn't get why (or of what) to provide an approximation (for).
Approximate wasn't a great choice of word but what I meant is:
- The size doesn't mean you can do an allocation that size (pools
layout etc..)
- And it might be shared with another device (though less likely
these days).
> > Another option would be to use a dma_attr for silencing mapping errors
> > which NVME could use provided it does handle them gracefully ...
>
> Ah, that's new. Interesting. Thanks for suggestion!
Cheers,
Ben.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug
2016-06-13 21:51 ` Benjamin Herrenschmidt
@ 2016-06-13 22:01 ` Mauricio Faria de Oliveira
0 siblings, 0 replies; 10+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-06-13 22:01 UTC (permalink / raw)
To: Benjamin Herrenschmidt, linuxppc-dev
On 06/13/2016 06:51 PM, Benjamin Herrenschmidt wrote:
> Approximate wasn't a great choice of word but what I meant is:
>
> - The size doesn't mean you can do an allocation that size (pools
> layout etc..)
> - And it might be shared with another device (though less likely
> these days).
Ah, yup - ok, now I get it; thanks.
--
Mauricio Faria de Oliveira
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug
2016-06-13 21:26 ` Benjamin Herrenschmidt
2016-06-13 21:43 ` Mauricio Faria de Oliveira
@ 2016-08-03 19:39 ` Mauricio Faria de Oliveira
2016-08-03 21:34 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 10+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-03 19:39 UTC (permalink / raw)
To: Benjamin Herrenschmidt, linuxppc-dev
Hi Ben,
On 06/13/2016 06:26 PM, Benjamin Herrenschmidt wrote:
> Another option would be to use a dma_attr for silencing mapping errors
> which NVME could use provided it does handle them gracefully ...
I recently submitted patches that implement your suggestion [1].
May you please review/comment if they're OK with you?
Thanks!
[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-August/146850.html
--
Mauricio Faria de Oliveira
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug
2016-08-03 19:39 ` Mauricio Faria de Oliveira
@ 2016-08-03 21:34 ` Benjamin Herrenschmidt
2016-08-03 21:39 ` Mauricio Faria de Oliveira
0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2016-08-03 21:34 UTC (permalink / raw)
To: Mauricio Faria de Oliveira, linuxppc-dev
On Wed, 2016-08-03 at 16:39 -0300, Mauricio Faria de Oliveira wrote:
> Hi Ben,
>
> On 06/13/2016 06:26 PM, Benjamin Herrenschmidt wrote:
> >
> > Another option would be to use a dma_attr for silencing mapping
> > errors
> > which NVME could use provided it does handle them gracefully ...
>
> I recently submitted patches that implement your suggestion [1].
> May you please review/comment if they're OK with you?
I think this is best done by the relevant community maintainer,
I just threw an idea but I'm not that familiar with the details :-)
Did you send them to the lkml list ?
> Thanks!
>
> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-August/14685
> 0.html
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug
2016-08-03 21:34 ` Benjamin Herrenschmidt
@ 2016-08-03 21:39 ` Mauricio Faria de Oliveira
0 siblings, 0 replies; 10+ messages in thread
From: Mauricio Faria de Oliveira @ 2016-08-03 21:39 UTC (permalink / raw)
To: Benjamin Herrenschmidt, linuxppc-dev
On 08/03/2016 06:34 PM, Benjamin Herrenschmidt wrote:
> I think this is best done by the relevant community maintainer,
> I just threw an idea but I'm not that familiar with the details:-)
Ok, sure; got it.
> Did you send them to the lkml list ?
Yup, plus a few others lists from get_maintainer.pl iirc.
Mailing list archive links:
- linux-kernel: http://marc.info/?l=linux-kernel&m=146798084822100&w=2
- linux-doc: http://marc.info/?l=linux-doc&m=146798085522104&w=2
- linux-nvme:
http://lists.infradead.org/pipermail/linux-nvme/2016-July/005349.html
- linuxppc-dev:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-July/145624.html
Thanks,
--
Mauricio Faria de Oliveira
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2016-08-03 21:39 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-10 13:03 [PATCH] powerpc: convert 'iommu_alloc failed' messages to dynamic debug Mauricio Faria de Oliveira
2016-06-11 23:02 ` Benjamin Herrenschmidt
2016-06-13 13:27 ` Mauricio Faria de Oliveira
2016-06-13 21:26 ` Benjamin Herrenschmidt
2016-06-13 21:43 ` Mauricio Faria de Oliveira
2016-06-13 21:51 ` Benjamin Herrenschmidt
2016-06-13 22:01 ` Mauricio Faria de Oliveira
2016-08-03 19:39 ` Mauricio Faria de Oliveira
2016-08-03 21:34 ` Benjamin Herrenschmidt
2016-08-03 21:39 ` Mauricio Faria de Oliveira
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).