* [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
@ 2015-09-17 18:22 jglisse
From: jglisse @ 2015-09-17 18:22 UTC
To: iommu
Cc: Jérôme Glisse, Joerg Roedel, Konrad Rzeszutek Wilk,
Alex Deucher, Dave Airlie, linux-kernel
From: Jérôme Glisse <jglisse@redhat.com>
The swiotlb DMA backend is not appropriate for some devices, such as
GPUs, where bounce buffers or slow DMA page allocations are not
acceptable. With this helper, device drivers can opt out of swiotlb
and map pages directly without wasting CPU cycles inside the swiotlb
code.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
To: iommu@lists.linux-foundation.org
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
CC: Dave Airlie <airlied@redhat.com>
CC: linux-kernel@vger.kernel.org
---
arch/x86/include/asm/dma-mapping.h | 3 +++
arch/x86/kernel/pci-swiotlb.c | 18 ++++++++++++++++++
include/asm-generic/dma-mapping-common.h | 7 +++++++
3 files changed, 28 insertions(+)
diff --git a/arch/x86/include/asm/dma-mapping.h b/arch/x86/include/asm/dma-mapping.h
index 953b726..b50745f 100644
--- a/arch/x86/include/asm/dma-mapping.h
+++ b/arch/x86/include/asm/dma-mapping.h
@@ -46,6 +46,9 @@ bool arch_dma_alloc_attrs(struct device **dev, gfp_t *gfp);
#define HAVE_ARCH_DMA_SUPPORTED 1
extern int dma_supported(struct device *hwdev, u64 mask);
+#define HAVE_ARCH_DMA_OVERRIDE_SWIOTLB 1
+int dma_override_swiotlb(struct device *dev);
+
#include <asm-generic/dma-mapping-common.h>
extern void *dma_generic_alloc_coherent(struct device *dev, size_t size,
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index adf0392..6a9efab 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -117,3 +117,21 @@ void __init pci_swiotlb_late_init(void)
swiotlb_print_info();
}
}
+
+/* dma_override_swiotlb() - Override swiotlb with nommu.
+ *
+ * @dev: Device for which to disable swiotlb.
+ *
+ * The swiotlb infrastructure just gets in the way for some devices, such as
+ * GPUs, where bounce pages cannot work properly or where we do not want to
+ * take the slow page allocation code path. This function gives a device
+ * driver the opportunity to opt out of swiotlb.
+ */
+int dma_override_swiotlb(struct device *dev)
+{
+	if (dev->archdata.dma_ops != &swiotlb_dma_ops)
+		return 1;
+	dev->archdata.dma_ops = &nommu_dma_ops;
+	return 1;
+}
+EXPORT_SYMBOL(dma_override_swiotlb);
diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h
index b1bc954..452d947 100644
--- a/include/asm-generic/dma-mapping-common.h
+++ b/include/asm-generic/dma-mapping-common.h
@@ -355,4 +355,11 @@ static inline int dma_set_mask(struct device *dev, u64 mask)
}
#endif
+#ifndef HAVE_ARCH_DMA_OVERRIDE_SWIOTLB
+static inline int dma_override_swiotlb(struct device *dev)
+{
+	return 0;
+}
+#endif
+
#endif
--
2.1.0
* Re: [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
@ 2015-09-17 19:02 Konrad Rzeszutek Wilk
From: Konrad Rzeszutek Wilk @ 2015-09-17 19:02 UTC
To: jglisse
Cc: Alex Deucher, Dave Airlie, iommu, Joerg Roedel, linux-kernel

On Thu, Sep 17, 2015 at 02:22:38PM -0400, jglisse@redhat.com wrote:
> From: Jérôme Glisse <jglisse@redhat.com>
>
> The swiotlb dma backend is not appropriate for some devices like
> GPU where bounce buffer or slow dma page allocations is just not
> acceptable. With that helper device drivers can opt-out from the
> swiotlb and just do sane things without wasting CPU cycles inside
> the swiotlb code.

What if SWIOTLB is the only one available?

And why can't the devices use the TTM DMA backend, which sets up
buffers that don't need bounce buffers or slow DMA page allocations?
* Re: [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
@ 2015-09-17 19:06 Konrad Rzeszutek Wilk
From: Konrad Rzeszutek Wilk @ 2015-09-17 19:06 UTC
To: jglisse
Cc: Alex Deucher, Dave Airlie, iommu, Joerg Roedel, linux-kernel

On Thu, Sep 17, 2015 at 03:02:51PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 17, 2015 at 02:22:38PM -0400, jglisse@redhat.com wrote:
> > From: Jérôme Glisse <jglisse@redhat.com>
> >
> > The swiotlb dma backend is not appropriate for some devices like
> > GPU where bounce buffer or slow dma page allocations is just not
> > acceptable. With that helper device drivers can opt-out from the
> > swiotlb and just do sane things without wasting CPU cycles inside
> > the swiotlb code.
>
> What if SWIOTLB is the only one available?
>
> And what can't the devices use the TTM DMA backend which sets up
> buffers which don't need bounce buffer or slow dma page allocations?

And then the follow-up question: if it opts out, how can it do
sane things without a DMA API available? It would have to assume that
physical addresses match bus addresses, which is not always a safe
assumption.
* Re: [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
@ 2015-09-17 19:11 Jerome Glisse
From: Jerome Glisse @ 2015-09-17 19:11 UTC
To: Konrad Rzeszutek Wilk
Cc: Alex Deucher, Dave Airlie, iommu, Joerg Roedel, linux-kernel

On Thu, Sep 17, 2015 at 03:06:57PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 17, 2015 at 03:02:51PM -0400, Konrad Rzeszutek Wilk wrote:
> > What if SWIOTLB is the only one available?
> >
> > And what can't the devices use the TTM DMA backend which sets up
> > buffers which don't need bounce buffer or slow dma page allocations?
>
> And then the followup question. If it opts out - how can it do
> sane things without an DMA API available? It would assume physical
> addresses match the bus addresses which is not always the sane
> thing.

This is why it is an arch-specific function. On x86 with a PCI device,
the driver knows its DMA mask and thus whether it can directly access
all of memory. In the end, swiotlb and no_mmu hand the same physical
address to the device, so there is no difference there.

Obviously the device driver needs to know what it is doing, depending
on the arch and the bus the device is used on.

Cheers,
Jérôme
* Re: [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
@ 2015-09-17 19:24 Konrad Rzeszutek Wilk
From: Konrad Rzeszutek Wilk @ 2015-09-17 19:24 UTC
To: Jerome Glisse
Cc: Alex Deucher, Dave Airlie, iommu, Joerg Roedel, linux-kernel

On Thu, Sep 17, 2015 at 03:11:14PM -0400, Jerome Glisse wrote:
> On Thu, Sep 17, 2015 at 03:06:57PM -0400, Konrad Rzeszutek Wilk wrote:
> > And then the followup question. If it opts out - how can it do
> > sane things without an DMA API available? It would assume physical
> > addresses match the bus addresses which is not always the sane
> > thing.
>
> This is why this is an arch specific function, on x86 with pci device,
> the driver knows what is the dma mask and thus if it can access directly
> all the memory or not. So in the end swiotlb vs no_mmu gives the same
> physical address to the device so there is no difference there.

Not with Intel or AMD IOMMUs. The bus address they give is not the same
as the physical address.

> Obviously device driver needs to know what it is doing depending on the
> arch and bus the device is use in.
>
> Cheers,
> Jérôme
* Re: [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
@ 2015-09-17 19:27 Jerome Glisse
From: Jerome Glisse @ 2015-09-17 19:27 UTC
To: Konrad Rzeszutek Wilk
Cc: Alex Deucher, Dave Airlie, iommu, Joerg Roedel, linux-kernel

On Thu, Sep 17, 2015 at 03:24:25PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 17, 2015 at 03:11:14PM -0400, Jerome Glisse wrote:
> > This is why this is an arch specific function, on x86 with pci device,
> > the driver knows what is the dma mask and thus if it can access directly
> > all the memory or not. So in the end swiotlb vs no_mmu gives the same
> > physical address to the device so there is no difference there.
>
> Not with Intel or AMD IOMMUs. The bus address it gives is not the same
> as the physical address.

Yes, but this patch never overrides the dma_ops when they belong to an
IOMMU. It only replaces the swiotlb ops, so it can only take effect when
there is a 1:1 mapping between bus addresses and physical addresses.

Cheers,
Jérôme
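The guard discussed in this reply can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: the structs are simplified stand-ins for the kernel's struct dma_map_ops and struct device, and model_override_swiotlb and iommu_dma_ops are hypothetical names introduced for illustration. It mirrors the control flow of the posted dma_override_swiotlb(), including the fact that it returns 1 on both paths.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct dma_map_ops (assumption). */
struct dma_map_ops {
	const char *name;
};

/* One instance per backend; only their addresses matter in this model. */
struct dma_map_ops swiotlb_dma_ops = { "swiotlb" };
struct dma_map_ops nommu_dma_ops = { "nommu" };
struct dma_map_ops iommu_dma_ops = { "iommu" };	/* hypothetical IOMMU backend */

/* Simplified stand-in for struct device and its archdata.dma_ops field. */
struct device {
	struct dma_map_ops *dma_ops;
};

/*
 * Mirrors the control flow of the posted dma_override_swiotlb():
 * the ops pointer is replaced only when it currently points at swiotlb.
 * Like the posted code, it returns 1 on both paths.
 */
int model_override_swiotlb(struct device *dev)
{
	assert(dev != NULL);
	if (dev->dma_ops != &swiotlb_dma_ops)
		return 1;
	dev->dma_ops = &nommu_dma_ops;
	return 1;
}
```

In this model a device bound to the IOMMU stand-in keeps its ops untouched, while a device bound to swiotlb ends up on the direct-mapping ops, which is the behaviour the reply describes.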
* Re: [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
@ 2015-09-17 19:07 Jerome Glisse
From: Jerome Glisse @ 2015-09-17 19:07 UTC
To: Konrad Rzeszutek Wilk
Cc: Alex Deucher, Dave Airlie, iommu, Joerg Roedel, linux-kernel

On Thu, Sep 17, 2015 at 03:02:51PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 17, 2015 at 02:22:38PM -0400, jglisse@redhat.com wrote:
> > From: Jérôme Glisse <jglisse@redhat.com>
> >
> > The swiotlb dma backend is not appropriate for some devices like
> > GPU where bounce buffer or slow dma page allocations is just not
> > acceptable. With that helper device drivers can opt-out from the
> > swiotlb and just do sane things without wasting CPU cycles inside
> > the swiotlb code.
>
> What if SWIOTLB is the only one available?

On x86, no_mmu is always available, and we assume that a device driver
using this helper knows that its device can access all memory without
restriction, or at the very least allocates with the DMA32 gfp flag.

> And what can't the devices use the TTM DMA backend which sets up
> buffers which don't need bounce buffer or slow dma page allocations?

We want to get rid of this TTM code path for radeon and likely for
nouveau. That is the motivation for this patch. Benchmarks show
that the TTM DMA backend is much slower (20% on some benchmarks)
than regular page allocation going through no_mmu.

So this is all about allowing pages to be allocated directly through
the regular kernel page allocation code rather than through a
specialized DMA allocator.

Cheers,
Jérôme
* Re: [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
@ 2015-09-17 19:31 Konrad Rzeszutek Wilk
From: Konrad Rzeszutek Wilk @ 2015-09-17 19:31 UTC
To: Jerome Glisse
Cc: Alex Deucher, Dave Airlie, iommu, Joerg Roedel, linux-kernel

On Thu, Sep 17, 2015 at 03:07:47PM -0400, Jerome Glisse wrote:
> On Thu, Sep 17, 2015 at 03:02:51PM -0400, Konrad Rzeszutek Wilk wrote:
> > What if SWIOTLB is the only one available?
>
> On x86 no_mmu is always available and we assume that device driver
> that would use this knows that their device can access all memory
> with no restriction or at very least use DMA32 gfp flag.

That runs afoul of the purpose of the DMA API. On x86 you may have
an IOMMU (GART, AMD-Vi, Intel VT-d, Calgary, etc.) which will provide
you with the proper DMA address, because the physical-to-bus address
topology does not have to be 1:1.

> > And what can't the devices use the TTM DMA backend which sets up
> > buffers which don't need bounce buffer or slow dma page allocations?
>
> We want to get rid of this TTM code path for radeon and likely
> nouveau. This is the motivation for that patch. Benchmark shows
> that the TTM DMA backend is much much much slower (20% on some
> benchmark) that the regular page allocation and going through
> no_mmu.

You end up using the DMA API scatter-gather calls later on, though.

I am also a bit confused about your use case: when do you see this?
On regular desktop machines you will use the IOMMU API most of
the time because that hardware exists. SWIOTLB should only
be used on hardware that is old, odd, or perhaps virtualized.

> So this is all about allowing to directly allocate page through
> regular kernel page alloc code and not through specialize dma
> allocator.

So what you are saying is that the intent of this patch is
to not use TTM DMA.

Are you using SWIOTLB 99% of the time? 1%? Or is this
related to the unfortunate patch that enabled SWIOTLB all the time?
(If so, please mention that in the commit message; it didn't
occur to me until just now.)

If that is the case, we should attack the problem in a different
way: check whether the IOMMU API is set up? Or is that already set
to some no_iommu option?

I think what you are looking for is a simple flag telling you
whether the IOMMU is there, in which case you use the streaming
DMA API calls (dma_map_page, etc.)?

> Cheers,
> Jérôme
* Re: [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
@ 2015-09-17 19:40 Jerome Glisse
From: Jerome Glisse @ 2015-09-17 19:40 UTC
To: Konrad Rzeszutek Wilk
Cc: iommu, Joerg Roedel, Alex Deucher, Dave Airlie, linux-kernel

On Thu, Sep 17, 2015 at 03:31:58PM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Sep 17, 2015 at 03:07:47PM -0400, Jerome Glisse wrote:
> > On x86 no_mmu is always available and we assume that device driver
> > that would use this knows that their device can access all memory
> > with no restriction or at very least use DMA32 gfp flag.
>
> That runs afoul of the purpose of the DMA API. On x86 you may have
> an IOMMU - GART, AMD Vi, Intel VT-d, Calgary, etc which will provide
> you with the proper dma address. As the physical to bus address
> topology does not have to be 1:1.

I am well aware of that, but sadly IOMMUs are not as widespread on x86
as you would think. On many platforms the IOMMU is still disabled by
default in the BIOS, so the Linux kernel ends up binding swiotlb as the
default dma_ops, and you have a 1:1 mapping between bus and physical
addresses. My patch does not affect the case where you have an IOMMU;
it only caters to the case where swiotlb is the DMA API backend.

> > We want to get rid of this TTM code path for radeon and likely
> > nouveau. This is the motivation for that patch. Benchmark shows
> > that the TTM DMA backend is much much much slower (20% on some
> > benchmark) that the regular page allocation and going through
> > no_mmu.
>
> You end up using the DMA API scatter gather API later on though.

The DMA API scatter-gather path is only used for DMA buffer objects,
which are a minority of the buffer objects in today's graphics stacks,
and it is not used to present contiguous addresses to the GPU (at
least on the GPUs I care about). So most GPU objects do not go through
DMA API scatter-gather; the GPU's hardware MMU does the scatter-gather
instead.

> I am also a bit confused on your use-case - when do you see this?
> On regular desktop machines you will use the IOMMU API most of
> the time because that hardware exists. The SWIOTLB should only
> be used on hardware that is old, odd, or perhaps virtualized.

Sadly that is not the case: even recent x86 computers have the IOMMU
disabled by default in the BIOS. Users need to go into the BIOS and
enable the virtualization option for the IOMMU to be enabled. I wish
the IOMMU were the default on all recent computers, but it is just not
the case.

> > So this is all about allowing to directly allocate page through
> > regular kernel page alloc code and not through specialize dma
> > allocator.
>
> .. What you are saying is that the intent of this patch is
> to not use TTM DMA.
>
> Are you using the SWIOTLB 99% of the time? 1%? Or is this
> related to the unfortunate patch that enabled SWIOTLB all the time?
> (If so, please please mention that in the commit, it didn't
> occur to me until just now).

Yes, the patch that always enables SWIOTLB is a pain point, but this
patch also had other purposes that escape me right now. After
discussion with other folks, it seemed like the easiest solution
would be to opt out of swiotlb if it is in use.

> If that is the case we should attack the problem in a different
> way - see if the IOMMU API is setup? Or is that set already
> to some no_iommu option?
>
> I think what you are looking for is a simple flag telling you
> whether the IOMMU is there - in which case use the streaming
> DMA API calls (dma_map_page, etc)?

The device driver would still use dma_map_page, but it would be the
no_mmu implementation rather than the swiotlb one, which is pretty
much a no-op and thus fast.

Cheers,
Jérôme
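To illustrate why a direct mapping is cheap, here is a small userspace model in C. It is a sketch, not the kernel's no_mmu code: it assumes a direct mapping amounts to checking the range against the device's DMA mask and returning the physical address unchanged. The names model_dma_capable, model_nommu_map and MODEL_DMA_ERROR are hypothetical and exist only in this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical error cookie for this model only. */
#define MODEL_DMA_ERROR UINT64_MAX

/* True when [addr, addr + size) is fully addressable under the mask. */
bool model_dma_capable(uint64_t mask, uint64_t addr, uint64_t size)
{
	assert(size > 0);
	return addr <= mask && size - 1 <= mask - addr;
}

/*
 * Direct mapping: the bus address equals the physical address, or the
 * mapping fails because the device cannot reach the range. There is no
 * bounce buffer and no copy on either path.
 */
uint64_t model_nommu_map(uint64_t mask, uint64_t phys, uint64_t size)
{
	if (!model_dma_capable(mask, phys, size))
		return MODEL_DMA_ERROR;
	return phys;
}
```

The failing case is exactly where swiotlb would bounce the data through low memory instead. That is why the thread assumes a driver opting out either has a device that can address all memory or restricts its own allocations, for example with the DMA32 gfp flag.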
* Re: [RFC PATCH] dma/swiotlb: Add helper for device driver to opt-out from swiotlb.
@ 2015-09-22 15:43 Jerome Glisse
From: Jerome Glisse @ 2015-09-22 15:43 UTC
To: Konrad Rzeszutek Wilk
Cc: Jerome Glisse, Alex Deucher, Dave Airlie, iommu, Joerg Roedel, linux-kernel

On Thu, Sep 17, 2015 at 03:31:58PM -0400, Konrad Rzeszutek Wilk wrote:
> .. What you are saying is that the intent of this patch is
> to not use TTM DMA.
>
> Are you using the SWIOTLB 99% of the time? 1%? Or is this
> related to the unfortunate patch that enabled SWIOTLB all the time?
> (If so, please please mention that in the commit, it didn't
> occur to me until just now).
>
> If that is the case we should attack the problem in a different
> way - see if the IOMMU API is setup? Or is that set already
> to some no_iommu option?
>
> I think what you are looking for is a simple flag telling you
> whether the IOMMU is there - in which case use the streaming
> DMA API calls (dma_map_page, etc)?

Konrad, are you happy with all the explanations? I want to move this
patch forward so we can fix performance and forget about swiotlb for
GPUs.

Cheers,
Jérôme