* [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
@ 2024-05-03 18:37 T.J. Mercier
2024-05-04 8:53 ` Petr Tesařík
2024-05-06 5:29 ` Christoph Hellwig
0 siblings, 2 replies; 19+ messages in thread
From: T.J. Mercier @ 2024-05-03 18:37 UTC (permalink / raw)
To: tjmercier, Christoph Hellwig, Marek Szyprowski, Robin Murphy
Cc: isaacmanjarres, iommu, linux-kernel
As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
not cacheline-aligned") sg_dma_mark_swiotlb is called when
dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
of a scatterlist. It is never set for the direct path, so drivers
cannot always rely on sg_dma_is_swiotlb to return correctly after
calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
direct path like it is in the IOMMU path.
Signed-off-by: T.J. Mercier <tjmercier@google.com>
---
kernel/dma/direct.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 4d543b1e9d57..52f0dcb25ca2 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -12,7 +12,7 @@
#include <linux/pfn.h>
#include <linux/vmalloc.h>
#include <linux/set_memory.h>
-#include <linux/slab.h>
+#include <linux/swiotlb.h>
#include "direct.h"
/*
@@ -497,6 +497,8 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
goto out_unmap;
}
sg_dma_len(sg) = sg->length;
+ if (is_swiotlb_buffer(dev, dma_to_phys(dev, sg->dma_address)))
+ sg_dma_mark_swiotlb(sg);
}
return nents;
--
2.45.0.rc1.225.g2a3ae87e7f-goog
^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-03 18:37 [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct T.J. Mercier
@ 2024-05-04  8:53 ` Petr Tesařík
  2024-05-09 13:28   ` Robin Murphy
  2024-05-06  5:29 ` Christoph Hellwig
  1 sibling, 1 reply; 19+ messages in thread
From: Petr Tesařík @ 2024-05-04 8:53 UTC (permalink / raw)
  To: T.J. Mercier
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	isaacmanjarres, iommu, linux-kernel

On Fri, 3 May 2024 18:37:12 +0000
"T.J. Mercier" <tjmercier@google.com> wrote:

> As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
> not cacheline-aligned") sg_dma_mark_swiotlb is called when
> dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
> of a scatterlist. It is never set for the direct path, so drivers
> cannot always rely on sg_dma_is_swiotlb to return correctly after
> calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
> direct path like it is in the IOMMU path.
>
> Signed-off-by: T.J. Mercier <tjmercier@google.com>
> ---
>  kernel/dma/direct.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 4d543b1e9d57..52f0dcb25ca2 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -12,7 +12,7 @@
>  #include <linux/pfn.h>
>  #include <linux/vmalloc.h>
>  #include <linux/set_memory.h>
> -#include <linux/slab.h>
> +#include <linux/swiotlb.h>
>  #include "direct.h"
>
>  /*
> @@ -497,6 +497,8 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
>  			goto out_unmap;
>  		}
>  		sg_dma_len(sg) = sg->length;
> +		if (is_swiotlb_buffer(dev, dma_to_phys(dev, sg->dma_address)))
> +			sg_dma_mark_swiotlb(sg);
>  	}
>
>  	return nents;

I'm not sure this does the right thing. IIUC when the scatterlist flags
include SG_DMA_SWIOTLB, iommu_dma_sync_sg_for_*() will call
iommu_dma_sync_single_for_*(), which in turn translates the DMA address
to a physical address using iommu_iova_to_phys(). It seems to me that
this function may not work correctly if there is no IOMMU, but it also
seems to me that the scatterlist may contain such non-IOMMU addresses.

I'm no expert, so correct DMA-to-physical translation might in fact be
somehow implicitly guaranteed. If that's the case, could you explain it
in the commit message, please?

Petr T
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-04  8:53 ` Petr Tesařík
@ 2024-05-09 13:28   ` Robin Murphy
  0 siblings, 0 replies; 19+ messages in thread
From: Robin Murphy @ 2024-05-09 13:28 UTC (permalink / raw)
  To: Petr Tesařík, T.J. Mercier
  Cc: Christoph Hellwig, Marek Szyprowski, isaacmanjarres, iommu,
	linux-kernel

On 04/05/2024 9:53 am, Petr Tesařík wrote:
> On Fri, 3 May 2024 18:37:12 +0000
> "T.J. Mercier" <tjmercier@google.com> wrote:
>
>> As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
>> not cacheline-aligned") sg_dma_mark_swiotlb is called when
>> dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
>> of a scatterlist. It is never set for the direct path, so drivers
>> cannot always rely on sg_dma_is_swiotlb to return correctly after
>> calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
>> direct path like it is in the IOMMU path.
>>
>> Signed-off-by: T.J. Mercier <tjmercier@google.com>
>> ---
>>  kernel/dma/direct.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
>> index 4d543b1e9d57..52f0dcb25ca2 100644
>> --- a/kernel/dma/direct.c
>> +++ b/kernel/dma/direct.c
>> @@ -12,7 +12,7 @@
>>  #include <linux/pfn.h>
>>  #include <linux/vmalloc.h>
>>  #include <linux/set_memory.h>
>> -#include <linux/slab.h>
>> +#include <linux/swiotlb.h>
>>  #include "direct.h"
>>
>>  /*
>> @@ -497,6 +497,8 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
>>  			goto out_unmap;
>>  		}
>>  		sg_dma_len(sg) = sg->length;
>> +		if (is_swiotlb_buffer(dev, dma_to_phys(dev, sg->dma_address)))
>> +			sg_dma_mark_swiotlb(sg);
>>  	}
>>
>>  	return nents;
>
> I'm not sure this does the right thing. IIUC when the scatterlist flags
> include SG_DMA_SWIOTLB, iommu_dma_sync_sg_for_*() will call
> iommu_dma_sync_single_for_*(), which in turn translates the DMA address
> to a physical address using iommu_iova_to_phys(). It seems to me that
> this function may not work correctly if there is no IOMMU, but it also
> seems to me that the scatterlist may contain such non-IOMMU addresses.

In principle dma-direct *could* make use of the SG_DMA_SWIOTLB flag for
an ever-so-slightly cheaper check than is_swiotlb_buffer() in sync_sg
and unmap_sg, the same way as iommu-dma does. However the benefit would
be a lot less significant than for iommu-dma, where it's really about
the overhead of needing to perform iommu_iova_to_phys() translations
for every segment every time in order to *get* the right thing to check
is_swiotlb_buffer() on - that's what would be unreasonably prohibitive
otherwise.

Thanks,
Robin
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-03 18:37 [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct T.J. Mercier
  2024-05-04  8:53 ` Petr Tesařík
@ 2024-05-06  5:29 ` Christoph Hellwig
  2024-05-06 16:00   ` T.J. Mercier
  1 sibling, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2024-05-06 5:29 UTC (permalink / raw)
  To: T.J. Mercier
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	isaacmanjarres, Catalin Marinas, iommu, linux-kernel

On Fri, May 03, 2024 at 06:37:12PM +0000, T.J. Mercier wrote:
> As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
> not cacheline-aligned") sg_dma_mark_swiotlb is called when
> dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
> of a scatterlist. It is never set for the direct path, so drivers
> cannot always rely on sg_dma_is_swiotlb to return correctly after
> calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
> direct path like it is in the IOMMU path.

I don't think this is the right thing to do. Despite its name,
sg_dma_mark_swiotlb really is dma-iommu specific, and doesn't make
sense in the context of dma-direct. If anything, we need to find a
better name for the flag.
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-06  5:29 ` Christoph Hellwig
@ 2024-05-06 16:00   ` T.J. Mercier
  2024-05-06 16:02     ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: T.J. Mercier @ 2024-05-06 16:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, isaacmanjarres, Catalin Marinas,
	iommu, linux-kernel

On Sun, May 5, 2024 at 10:29 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Fri, May 03, 2024 at 06:37:12PM +0000, T.J. Mercier wrote:
> > As of commit 861370f49ce4 ("iommu/dma: force bouncing if the size is
> > not cacheline-aligned") sg_dma_mark_swiotlb is called when
> > dma_map_sgtable takes the IOMMU path and uses SWIOTLB for some portion
> > of a scatterlist. It is never set for the direct path, so drivers
> > cannot always rely on sg_dma_is_swiotlb to return correctly after
> > calling dma_map_sgtable. Fix this by calling sg_dma_mark_swiotlb in the
> > direct path like it is in the IOMMU path.
>
> I don't think this is the right thing to do. Despite its name,
> sg_dma_mark_swiotlb really is dma-iommu specific, and doesn't make
> sense in the context of dma-direct. If anything, we need to find a
> better name for the flag.
>

Oh, that's disappointing. I'm looking for a way to quickly check
whether any addresses point at a SWIOTLB buffer without making a
potentially expensive call to iommu_iova_to_phys. Since the flag is
meant to be dma-iommu only, I guess I could use sg_dma_is_swiotlb when
iommu_get_domain_for_dev returns a domain, and is_swiotlb_buffer
otherwise for dma-direct, but it would be nice to have a single way to
check - which is what the SG_DMA_SWIOTLB flag looked like it could be
used for.
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-06 16:00   ` T.J. Mercier
@ 2024-05-06 16:02     ` Christoph Hellwig
  2024-05-06 16:10       ` T.J. Mercier
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2024-05-06 16:02 UTC (permalink / raw)
  To: T.J. Mercier
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	isaacmanjarres, Catalin Marinas, iommu, linux-kernel

On Mon, May 06, 2024 at 09:00:59AM -0700, T.J. Mercier wrote:
> Oh, that's disappointing. I'm looking for a way to quickly check
> whether any addresses point at a SWIOTLB buffer without making a
> potentially expensive call to iommu_iova_to_phys. Since the flag is
> meant to be dma-iommu only, I guess I could use sg_dma_is_swiotlb when
> iommu_get_domain_for_dev returns a domain, and is_swiotlb_buffer
> otherwise for dma-direct, but it would be nice to have a single way to
> check - which is what the SG_DMA_SWIOTLB flag looked like it could be
> used for.

This sounds like you're trying to do that from a consumer of the
DMA API, which is simply wrong. What is the actual problem you are
trying to solve?
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-06 16:02     ` Christoph Hellwig
@ 2024-05-06 16:10       ` T.J. Mercier
  2024-05-06 16:19         ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: T.J. Mercier @ 2024-05-06 16:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, isaacmanjarres, Catalin Marinas,
	iommu, linux-kernel

On Mon, May 6, 2024 at 9:02 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, May 06, 2024 at 09:00:59AM -0700, T.J. Mercier wrote:
> > Oh, that's disappointing. I'm looking for a way to quickly check
> > whether any addresses point at a SWIOTLB buffer without making a
> > potentially expensive call to iommu_iova_to_phys. Since the flag is
> > meant to be dma-iommu only, I guess I could use sg_dma_is_swiotlb when
> > iommu_get_domain_for_dev returns a domain, and is_swiotlb_buffer
> > otherwise for dma-direct, but it would be nice to have a single way to
> > check - which is what the SG_DMA_SWIOTLB flag looked like it could be
> > used for.
>
> This sounds like you're trying to do that from a consumer of the
> DMA API, which is simply wrong. What is the actual problem you are
> trying to solve?

I want to reject mapping a dma_buf for a device if any of the pages
that back the buffer require SWIOTLB. AFAICT there's no way to know
whether SWIOTLB is used until after calling dma_map_sg, so afterwards
I'm trying to check.
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-06 16:10       ` T.J. Mercier
@ 2024-05-06 16:19         ` Christoph Hellwig
  2024-05-06 16:39           ` T.J. Mercier
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2024-05-06 16:19 UTC (permalink / raw)
  To: T.J. Mercier
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	isaacmanjarres, Catalin Marinas, iommu, linux-kernel

On Mon, May 06, 2024 at 09:10:40AM -0700, T.J. Mercier wrote:
> I want to reject mapping a dma_buf for a device if any of the pages
> that back the buffer require SWIOTLB. AFAICT there's no way to know
> whether SWIOTLB is used until after calling dma_map_sg, so afterwards
> I'm trying to check.

You should not check, you simply must handle it by doing the proper
DMA API based ownership management.
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-06 16:19         ` Christoph Hellwig
@ 2024-05-06 16:39           ` T.J. Mercier
  2024-05-07  5:43             ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: T.J. Mercier @ 2024-05-06 16:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, isaacmanjarres, Catalin Marinas,
	iommu, linux-kernel

On Mon, May 6, 2024 at 9:19 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, May 06, 2024 at 09:10:40AM -0700, T.J. Mercier wrote:
> > I want to reject mapping a dma_buf for a device if any of the pages
> > that back the buffer require SWIOTLB. AFAICT there's no way to know
> > whether SWIOTLB is used until after calling dma_map_sg, so afterwards
> > I'm trying to check.
>
> You should not check, you simply must handle it by doing the proper
> DMA API based ownership management.

That doesn't really work for uncached buffers. Since the SWIOTLB
bounces are in the sync functions, and avoiding CMO is the point of
uncached buffers, it doesn't make sense to try to map uncached buffers
that would require SWIOTLB. So unless we can get the DMA API to fail
the map in this case (look for DMA_ATTR_SKIP_CPU_SYNC + SWIOTLB?) I'm
not sure how else this should be done.
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-06 16:39           ` T.J. Mercier
@ 2024-05-07  5:43             ` Christoph Hellwig
  2024-05-07 20:07               ` T.J. Mercier
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2024-05-07 5:43 UTC (permalink / raw)
  To: T.J. Mercier
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	isaacmanjarres, Catalin Marinas, iommu, linux-kernel

On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > You should not check, you simply must handle it by doing the proper
> > DMA API based ownership management.
>
> That doesn't really work for uncached buffers.

What uncached buffers?
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-07  5:43             ` Christoph Hellwig
@ 2024-05-07 20:07               ` T.J. Mercier
  2024-05-08 11:33                 ` Christoph Hellwig
  2024-05-08 17:19                 ` Catalin Marinas
  0 siblings, 2 replies; 19+ messages in thread
From: T.J. Mercier @ 2024-05-07 20:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marek Szyprowski, Robin Murphy, isaacmanjarres, Catalin Marinas,
	iommu, linux-kernel

On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > > You should not check, you simply must handle it by doing the proper
> > > DMA API based ownership management.
> >
> > That doesn't really work for uncached buffers.
>
> What uncached buffers?

For example these ones:
https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141

Vendors have their own drivers that also export uncached buffers in a
similar way.
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-07 20:07               ` T.J. Mercier
@ 2024-05-08 11:33                 ` Christoph Hellwig
  2024-05-09 12:54                   ` Robin Murphy
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2024-05-08 11:33 UTC (permalink / raw)
  To: T.J. Mercier
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	isaacmanjarres, Catalin Marinas, iommu, linux-kernel

On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
> On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <hch@lst.de> wrote:
> >
> > On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > > > You should not check, you simply must handle it by doing the proper
> > > > DMA API based ownership management.
> > >
> > > That doesn't really work for uncached buffers.
> >
> > What uncached buffers?
>
> For example these ones:
> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141

Whatever that code is doing is probably not upstream because it's too
broken to live.
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-08 11:33                 ` Christoph Hellwig
@ 2024-05-09 12:54                   ` Robin Murphy
  2024-05-09 18:26                     ` T.J. Mercier
  0 siblings, 1 reply; 19+ messages in thread
From: Robin Murphy @ 2024-05-09 12:54 UTC (permalink / raw)
  To: Christoph Hellwig, T.J. Mercier
  Cc: Marek Szyprowski, isaacmanjarres, Catalin Marinas, iommu,
	linux-kernel

On 08/05/2024 12:33 pm, Christoph Hellwig wrote:
> On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
>> On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <hch@lst.de> wrote:
>>>
>>> On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
>>>>> You should not check, you simply must handle it by doing the proper
>>>>> DMA API based ownership management.
>>>>
>>>> That doesn't really work for uncached buffers.
>>>
>>> What uncached buffers?
>>
>> For example these ones:
>> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141
>
> Whatever that code is doing is probably not upstream because it's too
> broken to live.

Indeed, at a glance it appears to be trying to reinvent
dma_alloc_noncontiguous(). What's not immediately obvious is whether
it's particular about allocations being DMA-contiguous; if not then I
think it comes down to the same thing as vb2-dma-sg and the ideas we
were tossing around for that[1].

Thanks,
Robin.

[1]
https://lore.kernel.org/linux-media/20231228074645.765yytb2a7hvz7ti@chromium.org/
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-09 12:54                   ` Robin Murphy
@ 2024-05-09 18:26                     ` T.J. Mercier
  0 siblings, 0 replies; 19+ messages in thread
From: T.J. Mercier @ 2024-05-09 18:26 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Christoph Hellwig, Marek Szyprowski, isaacmanjarres,
	Catalin Marinas, iommu, linux-kernel

On Thu, May 9, 2024 at 5:55 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 08/05/2024 12:33 pm, Christoph Hellwig wrote:
> > On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
> >> On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <hch@lst.de> wrote:
> >>>
> >>> On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> >>>>> You should not check, you simply must handle it by doing the proper
> >>>>> DMA API based ownership management.
> >>>>
> >>>> That doesn't really work for uncached buffers.
> >>>
> >>> What uncached buffers?
> >>
> >> For example these ones:
> >> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141
> >
> > Whatever that code is doing is probably not upstream because it's too
> > broken to live.
>
> Indeed, at a glance it appears to be trying to reinvent
> dma_alloc_noncontiguous(). What's not immediately obvious is whether
> it's particular about allocations being DMA-contiguous; if not then I
> think it comes down to the same thing as vb2-dma-sg and the ideas we
> were tossing around for that[1].

This heap isn't too picky about the memory being contiguous, but there
is an attempt to allocate high order pages if possible to reduce the
number of translation entries. These page orders are one thing vendors
sometimes change for the hardware they have.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/dma-buf/heaps/system_heap.c#n48

I think one problem there'd be attempting to use
dma_alloc_noncontiguous for dmabuf exporters is that memory is
typically (always?) allocated when the buffer is exported, and there
won't be any device attachments at that time.

> Thanks,
> Robin.
>
> [1]
> https://lore.kernel.org/linux-media/20231228074645.765yytb2a7hvz7ti@chromium.org/
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-07 20:07               ` T.J. Mercier
  2024-05-08 11:33                 ` Christoph Hellwig
@ 2024-05-08 17:19                 ` Catalin Marinas
  2024-05-08 20:14                   ` T.J. Mercier
  1 sibling, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2024-05-08 17:19 UTC (permalink / raw)
  To: T.J. Mercier
  Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy,
	isaacmanjarres, iommu, linux-kernel

On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
> On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <hch@lst.de> wrote:
> > On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > > > You should not check, you simply must handle it by doing the proper
> > > > DMA API based ownership management.
> > >
> > > That doesn't really work for uncached buffers.
> >
> > What uncached buffers?
>
> For example these ones:
> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141
>
> Vendors have their own drivers that also export uncached buffers in a
> similar way.

IIUC, you have an uncached dma buffer and you want to pass those pages
to dma_map_sgtable() with DMA_ATTR_SKIP_CPU_SYNC since the buffer has
already been made coherent via other means (the CPU mapping is
uncached). The small kmalloc() bouncing gets in the way as it copies
the data to a cached buffer but it also skips the cache maintenance
because of DMA_ATTR_SKIP_CPU_SYNC.

I assume Android carries these patches:
https://lore.kernel.org/r/20201110034934.70898-8-john.stultz@linaro.org/

Arguably this is abusing the DMA API since DMA_ATTR_SKIP_CPU_SYNC
should be used for subsequent mappings of buffers already mapped to
device by a prior dma_map_*() operation. In the above patchset, the
buffer is vmap'ed by the dma-buf heap code and DMA_ATTR_SKIP_CPU_SYNC
is used on the first dma_map_*().

Ignoring the above hacks, I think we can get in a similar situation
even with more complex uses of the DMA API. Let's say some buffers are
initially mapped to device with dma_map_page(), some of them being
bounced but cache maintenance completed. A subsequent dma_map_*() on
those pages may force a bouncing again but DMA_ATTR_SKIP_CPU_SYNC will
avoid the cache maintenance. You are not actually sharing the original
buffers but create separate (bounced) copies no longer coherent with
the device.

I think in general buffer sharing with multiple dma_map_*() calls on
the same buffer and DMA_ATTR_SKIP_CPU_SYNC is incompatible with
bouncing, irrespective of the kmalloc() minalign series. If you do
this for a 32-bit device and one of the pages is outside the
ZONE_DMA32 range, you'd get a similar behaviour.

From the kmalloc() minimum alignment perspective, it makes sense to
skip the bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed. We also skip
the bouncing if the direction is DMA_TO_DEVICE or the device is fully
coherent. A completely untested patch below. It doesn't solve other
problems with bouncing you may have with your out of tree patches and,
as Christoph said, checking in your driver whether the DMA address is
a swiotlb buffer is completely wrong.
---------8<------------------------
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index e4cb26f6a943..c7ff464a5f81 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -602,15 +602,16 @@ static bool dev_is_untrusted(struct device *dev)
 }
 
 static bool dev_use_swiotlb(struct device *dev, size_t size,
-			    enum dma_data_direction dir)
+			    enum dma_data_direction dir, unsigned long attrs)
 {
 	return IS_ENABLED(CONFIG_SWIOTLB) &&
 		(dev_is_untrusted(dev) ||
-		 dma_kmalloc_needs_bounce(dev, size, dir));
+		 dma_kmalloc_needs_bounce(dev, size, dir, attrs));
 }
 
 static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
-			       int nents, enum dma_data_direction dir)
+			       int nents, enum dma_data_direction dir,
+			       unsigned long attrs)
 {
 	struct scatterlist *s;
 	int i;
@@ -626,7 +627,7 @@ static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
 	 * direction, check the individual lengths in the sg list. If any
 	 * element is deemed unsafe, use the swiotlb for bouncing.
 	 */
-	if (!dma_kmalloc_safe(dev, dir)) {
+	if (!dma_kmalloc_safe(dev, dir, attrs)) {
 		for_each_sg(sg, s, nents, i)
 			if (!dma_kmalloc_size_aligned(s->length))
 				return true;
@@ -1076,7 +1077,7 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev,
 {
 	phys_addr_t phys;
 
-	if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir))
+	if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir, 0))
 		return;
 
 	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
@@ -1092,7 +1093,7 @@ static void iommu_dma_sync_single_for_device(struct device *dev,
 {
 	phys_addr_t phys;
 
-	if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir))
+	if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir, 0))
 		return;
 
 	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
@@ -1152,7 +1153,7 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	 * If both the physical buffer start address and size are
 	 * page aligned, we don't need to use a bounce page.
 	 */
-	if (dev_use_swiotlb(dev, size, dir) &&
+	if (dev_use_swiotlb(dev, size, dir, attrs) &&
 	    iova_offset(iovad, phys | size)) {
 		void *padding_start;
 		size_t padding_size, aligned_size;
@@ -1369,7 +1370,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		goto out;
 	}
 
-	if (dev_use_sg_swiotlb(dev, sg, nents, dir))
+	if (dev_use_sg_swiotlb(dev, sg, nents, dir, attrs))
 		return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);
 
 	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 4abc60f04209..857a460e40f0 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -277,7 +277,8 @@ static inline bool dev_is_dma_coherent(struct device *dev)
  * Check whether potential kmalloc() buffers are safe for non-coherent DMA.
  */
 static inline bool dma_kmalloc_safe(struct device *dev,
-				    enum dma_data_direction dir)
+				    enum dma_data_direction dir,
+				    unsigned long attrs)
 {
 	/*
 	 * If DMA bouncing of kmalloc() buffers is disabled, the kmalloc()
@@ -288,10 +289,12 @@ static inline bool dma_kmalloc_safe(struct device *dev,
 
 	/*
 	 * kmalloc() buffers are DMA-safe irrespective of size if the device
-	 * is coherent or the direction is DMA_TO_DEVICE (non-desctructive
-	 * cache maintenance and benign cache line evictions).
+	 * is coherent, the direction is DMA_TO_DEVICE (non-desctructive
+	 * cache maintenance and benign cache line evictions) or the
+	 * attributes require no cache maintenance.
 	 */
-	if (dev_is_dma_coherent(dev) || dir == DMA_TO_DEVICE)
+	if (dev_is_dma_coherent(dev) || dir == DMA_TO_DEVICE ||
+	    attrs & DMA_ATTR_SKIP_CPU_SYNC)
 		return true;
 
 	return false;
@@ -328,9 +331,11 @@ static inline bool dma_kmalloc_size_aligned(size_t size)
 * in the kernel.
 */
 static inline bool dma_kmalloc_needs_bounce(struct device *dev, size_t size,
-					    enum dma_data_direction dir)
+					    enum dma_data_direction dir,
+					    unsigned long attrs)
 {
-	return !dma_kmalloc_safe(dev, dir) && !dma_kmalloc_size_aligned(size);
+	return !dma_kmalloc_safe(dev, dir, attrs) &&
+	       !dma_kmalloc_size_aligned(size);
 }
 
 void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
index 18d346118fe8..c2d31a67719e 100644
--- a/kernel/dma/direct.h
+++ b/kernel/dma/direct.h
@@ -96,7 +96,7 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
 	}
 
 	if (unlikely(!dma_capable(dev, dma_addr, size, true)) ||
-	    dma_kmalloc_needs_bounce(dev, size, dir)) {
+	    dma_kmalloc_needs_bounce(dev, size, dir, attrs)) {
 		if (is_pci_p2pdma_page(page))
 			return DMA_MAPPING_ERROR;
 		if (is_swiotlb_active(dev))

-- 
Catalin
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct 2024-05-08 17:19 ` Catalin Marinas @ 2024-05-08 20:14 ` T.J. Mercier 2024-05-09 7:49 ` Catalin Marinas 0 siblings, 1 reply; 19+ messages in thread From: T.J. Mercier @ 2024-05-08 20:14 UTC (permalink / raw) To: Catalin Marinas Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy, isaacmanjarres, iommu, linux-kernel On Wed, May 8, 2024 at 10:19 AM Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote: > > On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <hch@lst.de> wrote: > > > On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote: > > > > > You should not check, you simply must handle it by doing the proper > > > > > DMA API based ownership management. > > > > > > > > That doesn't really work for uncached buffers. > > > > > > What uncached buffers? > > > > For example these ones: > > https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141 > > > > Vendors have their own drivers that also export uncached buffers in a > > similar way. > > IIUC, you have an uncached dma buffer and you want to pass those pages > to dma_map_sgtable() with DMA_ATTR_SKIP_CPU_SYNC since the buffer has > already been made coherent via other means (the CPU mapping is > uncached). The small kmalloc() bouncing gets in the way as it copies the > data to a cached buffer but it also skips the cache maintenance because > of DMA_ATTR_SKIP_CPU_SYNC. Yes, that's a problem at the time of the mapping. It's also a problem for dma_buf_{begin,end}_cpu_access calls because implementing an uncached buffer means we want to skip dma_sync_sgtable_for_*, but that also skips the swiotlb copy. The goal is to only let the mapping succeed if the cache maintenance can always be skipped while also ensuring all users have a correct view of the memory, and that doesn't seem possible where swiotlb is involved. 
> I assume Android carries these patches: > https://lore.kernel.org/r/20201110034934.70898-8-john.stultz@linaro.org/ Correct. > Arguably this is abusing the DMA API since DMA_ATTR_SKIP_CPU_SYNC should > be used for subsequent mappings of buffers already mapped to device by a > prior dma_map_*() operation. In the above patchset, the buffer is > vmap'ed by the dma-buf heap code and DMA_ATTR_SKIP_CPU_SYNC is used on > the first dma_map_*(). > > Ignoring the above hacks, I think we can get in a similar situation even > with more complex uses of the DMA API. Let's say some buffers are > initially mapped to device with dma_map_page(), some of them being > bounced but cache maintenance completed. A subsequent dma_map_*() > on those pages may force a bouncing again but DMA_ATTR_SKIP_CPU_SYNC > will avoid the cache maintenance. You are not actually sharing the > original buffers but create separate (bounced) copies no longer coherent > with the device. Right, no good. The inverse of the dma_buf_begin_cpu_access problem. > I think in general buffer sharing with multiple dma_map_*() calls on the > same buffer and DMA_ATTR_SKIP_CPU_SYNC is incompatible with bouncing, > irrespective of the kmalloc() minalign series. If you do this for a > 32-bit device and one of the pages is outside the ZONE_DMA32 range, > you'd get a similar behaviour. > > From the kmalloc() minumum alignment perspective, it makes sense to skip > the bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed. We also skip the > bouncing if the direction is DMA_TO_DEVICE or the device is fully > coherent. > > A completely untested patch below. It doesn't solve other problems with > bouncing you may have with your out of tree patches and, as Christoph > said, checking in your driver whether the DMA address is a swiotlb > buffer is completely wrong. This is where I must be missing something. Is the main opposition that the *driver* is checking for swiotlb use (instead of inside the DMA API)? 
Because it sounds like we agree it's a bad idea to attempt
bouncing + DMA_ATTR_SKIP_CPU_SYNC. This code looks like it almost gets
there, but it'd still reach swiotlb_map (instead of DMA_MAPPING_ERROR)
with DMA_ATTR_SKIP_CPU_SYNC set for force_bounce or if the dma_capable
check fails.

> ---------8<------------------------
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index e4cb26f6a943..c7ff464a5f81 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -602,15 +602,16 @@ static bool dev_is_untrusted(struct device *dev)
>  }
>
>  static bool dev_use_swiotlb(struct device *dev, size_t size,
> -                           enum dma_data_direction dir)
> +                           enum dma_data_direction dir, unsigned long attrs)
>  {
>         return IS_ENABLED(CONFIG_SWIOTLB) &&
>                (dev_is_untrusted(dev) ||
> -               dma_kmalloc_needs_bounce(dev, size, dir));
> +               dma_kmalloc_needs_bounce(dev, size, dir, attrs));
>  }
>
>  static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
> -                              int nents, enum dma_data_direction dir)
> +                              int nents, enum dma_data_direction dir,
> +                              unsigned long attrs)
>  {
>         struct scatterlist *s;
>         int i;
> @@ -626,7 +627,7 @@ static bool dev_use_sg_swiotlb(struct device *dev, struct scatterlist *sg,
>          * direction, check the individual lengths in the sg list. If any
>          * element is deemed unsafe, use the swiotlb for bouncing.
>          */
> -       if (!dma_kmalloc_safe(dev, dir)) {
> +       if (!dma_kmalloc_safe(dev, dir, attrs)) {
>                 for_each_sg(sg, s, nents, i)
>                         if (!dma_kmalloc_size_aligned(s->length))
>                                 return true;
> @@ -1076,7 +1077,7 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev,
>  {
>         phys_addr_t phys;
>
> -       if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir))
> +       if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir, 0))
>                 return;
>
>         phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
> @@ -1092,7 +1093,7 @@ static void iommu_dma_sync_single_for_device(struct device *dev,
>  {
>         phys_addr_t phys;
>
> -       if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir))
> +       if (dev_is_dma_coherent(dev) && !dev_use_swiotlb(dev, size, dir, 0))
>                 return;
>
>         phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
> @@ -1152,7 +1153,7 @@ static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
>          * If both the physical buffer start address and size are
>          * page aligned, we don't need to use a bounce page.
>          */
> -       if (dev_use_swiotlb(dev, size, dir) &&
> +       if (dev_use_swiotlb(dev, size, dir, attrs) &&
>             iova_offset(iovad, phys | size)) {
>                 void *padding_start;
>                 size_t padding_size, aligned_size;
> @@ -1369,7 +1370,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>                 goto out;
>         }
>
> -       if (dev_use_sg_swiotlb(dev, sg, nents, dir))
> +       if (dev_use_sg_swiotlb(dev, sg, nents, dir, attrs))
>                 return iommu_dma_map_sg_swiotlb(dev, sg, nents, dir, attrs);
>
>         if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC))
> diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
> index 4abc60f04209..857a460e40f0 100644
> --- a/include/linux/dma-map-ops.h
> +++ b/include/linux/dma-map-ops.h
> @@ -277,7 +277,8 @@ static inline bool dev_is_dma_coherent(struct device *dev)
>   * Check whether potential kmalloc() buffers are safe for non-coherent DMA.
>   */
>  static inline bool dma_kmalloc_safe(struct device *dev,
> -                                   enum dma_data_direction dir)
> +                                   enum dma_data_direction dir,
> +                                   unsigned long attrs)
>  {
>         /*
>          * If DMA bouncing of kmalloc() buffers is disabled, the kmalloc()
> @@ -288,10 +289,12 @@ static inline bool dma_kmalloc_safe(struct device *dev,
>
>         /*
>          * kmalloc() buffers are DMA-safe irrespective of size if the device
> -        * is coherent or the direction is DMA_TO_DEVICE (non-desctructive
> -        * cache maintenance and benign cache line evictions).
> +        * is coherent, the direction is DMA_TO_DEVICE (non-desctructive
> +        * cache maintenance and benign cache line evictions) or the
> +        * attributes require no cache maintenance.
>          */
> -       if (dev_is_dma_coherent(dev) || dir == DMA_TO_DEVICE)
> +       if (dev_is_dma_coherent(dev) || dir == DMA_TO_DEVICE ||
> +           attrs & DMA_ATTR_SKIP_CPU_SYNC)
>                 return true;
>
>         return false;
> @@ -328,9 +331,11 @@ static inline bool dma_kmalloc_size_aligned(size_t size)
>   * in the kernel.
>   */
>  static inline bool dma_kmalloc_needs_bounce(struct device *dev, size_t size,
> -                                           enum dma_data_direction dir)
> +                                           enum dma_data_direction dir,
> +                                           unsigned long attrs)
>  {
> -       return !dma_kmalloc_safe(dev, dir) && !dma_kmalloc_size_aligned(size);
> +       return !dma_kmalloc_safe(dev, dir, attrs) &&
> +              !dma_kmalloc_size_aligned(size);
>  }
>
>  void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *dma_handle,
> diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
> index 18d346118fe8..c2d31a67719e 100644
> --- a/kernel/dma/direct.h
> +++ b/kernel/dma/direct.h
> @@ -96,7 +96,7 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
>         }
>
>         if (unlikely(!dma_capable(dev, dma_addr, size, true)) ||
> -           dma_kmalloc_needs_bounce(dev, size, dir)) {
> +           dma_kmalloc_needs_bounce(dev, size, dir, attrs)) {
>                 if (is_pci_p2pdma_page(page))
>                         return DMA_MAPPING_ERROR;
>                 if (is_swiotlb_active(dev))
>
> --
> Catalin
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-08 20:14           ` T.J. Mercier
@ 2024-05-09  7:49             ` Catalin Marinas
  2024-05-09 13:06               ` Christoph Hellwig
  0 siblings, 1 reply; 19+ messages in thread
From: Catalin Marinas @ 2024-05-09 7:49 UTC (permalink / raw)
To: T.J. Mercier
Cc: Christoph Hellwig, Marek Szyprowski, Robin Murphy,
    isaacmanjarres, iommu, linux-kernel

On Wed, May 08, 2024 at 01:14:41PM -0700, T.J. Mercier wrote:
> On Wed, May 8, 2024 at 10:19 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Tue, May 07, 2024 at 01:07:25PM -0700, T.J. Mercier wrote:
> > > On Mon, May 6, 2024 at 10:43 PM Christoph Hellwig <hch@lst.de> wrote:
> > > > On Mon, May 06, 2024 at 09:39:53AM -0700, T.J. Mercier wrote:
> > > > > > You should not check, you simply must handle it by doing the proper
> > > > > > DMA API based ownership management.
> > > > >
> > > > > That doesn't really work for uncached buffers.
> > > >
> > > > What uncached buffers?
> > >
> > > For example these ones:
> > > https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/dma-buf/heaps/system_heap.c#141
> > >
> > > Vendors have their own drivers that also export uncached buffers in a
> > > similar way.
[...]
> > I think in general buffer sharing with multiple dma_map_*() calls on the
> > same buffer and DMA_ATTR_SKIP_CPU_SYNC is incompatible with bouncing,
> > irrespective of the kmalloc() minalign series. If you do this for a
> > 32-bit device and one of the pages is outside the ZONE_DMA32 range,
> > you'd get a similar behaviour.
> >
> > From the kmalloc() minimum alignment perspective, it makes sense to skip
> > the bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed. We also skip the
> > bouncing if the direction is DMA_TO_DEVICE or the device is fully
> > coherent.
> >
> > A completely untested patch below.
> > It doesn't solve other problems with
> > bouncing you may have with your out of tree patches and, as Christoph
> > said, checking in your driver whether the DMA address is a swiotlb
> > buffer is completely wrong.
>
> This is where I must be missing something. Is the main opposition that
> the *driver* is checking for swiotlb use (instead of inside the DMA
> API)?

I see the swiotlb use as some internal detail of the DMA API
implementation that should not leak outside this framework.

> Because it sounds like we agree it's a bad idea to attempt
> bouncing + DMA_ATTR_SKIP_CPU_SYNC.

It's not necessarily the DMA_ATTR_SKIP_CPU_SYNC but rather the usage
model of sharing a buffer between multiple devices. The DMA API is
mostly tailored around CPU <-> single device ownership and the
bouncing works fine. When sharing the same buffer with multiple
devices, calling dma_map_*() on a buffer can potentially create
multiple copies of the original CPU buffer. It may be fine _if_ the
devices don't communicate between themselves using such buffer,
otherwise the model is broken (sync or no sync). The additional issue
with DMA_ATTR_SKIP_CPU_SYNC is when you use it on subsequent
dma_map_*() calls assuming that the sync was already done on the first
dma_map_*() call but with bouncing it's another dma location (ignoring
the Android specific patches).

I think we should prevent bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed.
However, this is not sufficient with a proper use of the DMA API since
the first dma_map_*() without this attribute can still do the bouncing.
IMHO what we need is a DMA_ATTR_NO_BOUNCE or DMA_ATTR_SHARED that will
be used on the first map and potentially on subsequent calls in
combination with DMA_ATTR_SKIP_CPU_SYNC (though we could use the latter
to imply "shared"). The downside is that mapping may fail if the
coherent mask is too narrow.

Anyway, the definitive answer should come from the DMA API maintainers.
> This code looks like it almost gets there, but it'd still reach
> swiotlb_map (instead of DMA_MAPPING_ERROR) with DMA_ATTR_SKIP_CPU_SYNC
> set for force_bounce or if the dma_capable check fails.

My quick patch was mainly to ensure the kmalloc() alignment patches do
not make the situation worse.

--
Catalin
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-09  7:49             ` Catalin Marinas
@ 2024-05-09 13:06               ` Christoph Hellwig
  2024-05-09 18:32                 ` T.J. Mercier
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2024-05-09 13:06 UTC (permalink / raw)
To: Catalin Marinas
Cc: T.J. Mercier, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
    isaacmanjarres, iommu, linux-kernel

On Thu, May 09, 2024 at 08:49:40AM +0100, Catalin Marinas wrote:
> I see the swiotlb use as some internal detail of the DMA API
> implementation that should not leak outside this framework.

And that's what it is.

> I think we should prevent bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed.
> However, this is not sufficient with a proper use of the DMA API since
> the first dma_map_*() without this attribute can still do the bouncing.
> IMHO what we need is a DMA_ATTR_NO_BOUNCE or DMA_ATTR_SHARED that will
> be used on the first map and potentially on subsequent calls in
> combination with DMA_ATTR_SKIP_CPU_SYNC (though we could use the latter
> to imply "shared"). The downside is that mapping may fail if the
> coherent mask is too narrow.

We have two big problems here that kinda interact:

 1) DMA_ATTR_SKIP_CPU_SYNC is just a horrible API.  It exposes an
    implementation detail instead of dealing with use cases.
    The original one IIRC was to deal with networking receive
    buffers that are often only partially filled and the networking
    folks wanted to avoid the overhead for doing the cache operations
    for the rest.  It kinda works for that but already gets iffy
    when swiotlb is involved.  The other abuses of the flag just
    went downhill from there.

 2) the model of dma mapping a single chunk of memory to multiple
    devices is not really well accounted for in the DMA API.

So for two we need a memory allocator that can take the constraints
of multiple devices into account, and probably a way to fail a
dma-buf attach when the importer can't address the memory.
We also then need to come up with a memory ownership / cache
maintenance protocol that works for this use case.
* Re: [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct
  2024-05-09 13:06             ` Christoph Hellwig
@ 2024-05-09 18:32               ` T.J. Mercier
  0 siblings, 0 replies; 19+ messages in thread
From: T.J. Mercier @ 2024-05-09 18:32 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Catalin Marinas, Marek Szyprowski, Robin Murphy,
    isaacmanjarres, iommu, linux-kernel

On Thu, May 9, 2024 at 6:07 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Thu, May 09, 2024 at 08:49:40AM +0100, Catalin Marinas wrote:
> > I see the swiotlb use as some internal detail of the DMA API
> > implementation that should not leak outside this framework.
>
> And that's what it is.
>
> > I think we should prevent bouncing if DMA_ATTR_SKIP_CPU_SYNC is passed.
> > However, this is not sufficient with a proper use of the DMA API since
> > the first dma_map_*() without this attribute can still do the bouncing.
> > IMHO what we need is a DMA_ATTR_NO_BOUNCE or DMA_ATTR_SHARED that will
> > be used on the first map and potentially on subsequent calls in
> > combination with DMA_ATTR_SKIP_CPU_SYNC (though we could use the latter
> > to imply "shared"). The downside is that mapping may fail if the
> > coherent mask is too narrow.
>
> We have two big problems here that kinda interact:
>
>  1) DMA_ATTR_SKIP_CPU_SYNC is just a horrible API.  It exposes an
>     implementation detail instead of dealing with use cases.
>     The original one IIRC was to deal with networking receive
>     buffers that are often only partially filled and the networking
>     folks wanted to avoid the overhead for doing the cache operations
>     for the rest.  It kinda works for that but already gets iffy
>     when swiotlb is involved.  The other abuses of the flag just
>     went downhill from there.
>
>  2) the model of dma mapping a single chunk of memory to multiple
>     devices is not really well accounted for in the DMA API.
>
> So for two we need a memory allocator that can take the constraints
> of multiple devices into account, and probably a way to fail a
> dma-buf attach when the importer can't address the memory.
> We also then need to come up with a memory ownership / cache
> maintenance protocol that works for this use case.

Being able to fail the attach without necessarily performing any
mapping yet would be an improvement. However I think the original idea
was for dmabuf exporters to perform the constraint solving (if
possible) as attachments get added and then finally allocate however
is best when the buffer is first mapped. But as far as I know there
are no exporters that currently do this. Instead I think the problem
is currently being avoided by using custom exporters for particular
sets of usecases that are known to work on a given system.

This swiotlb + uncached example is one reason we'd want to fail the
constraint solving. The DMA API knows about the swiotlb part but not
really about the uncached part.
end of thread, other threads:[~2024-05-09 18:32 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --
2024-05-03 18:37 [PATCH] dma-direct: Set SG_DMA_SWIOTLB flag for dma-direct T.J. Mercier
2024-05-04  8:53 ` Petr Tesařík
2024-05-09 13:28   ` Robin Murphy
2024-05-06  5:29 ` Christoph Hellwig
2024-05-06 16:00   ` T.J. Mercier
2024-05-06 16:02     ` Christoph Hellwig
2024-05-06 16:10       ` T.J. Mercier
2024-05-06 16:19         ` Christoph Hellwig
2024-05-06 16:39           ` T.J. Mercier
2024-05-07  5:43             ` Christoph Hellwig
2024-05-07 20:07               ` T.J. Mercier
2024-05-08 11:33                 ` Christoph Hellwig
2024-05-09 12:54                   ` Robin Murphy
2024-05-09 18:26                     ` T.J. Mercier
2024-05-08 17:19         ` Catalin Marinas
2024-05-08 20:14           ` T.J. Mercier
2024-05-09  7:49             ` Catalin Marinas
2024-05-09 13:06               ` Christoph Hellwig
2024-05-09 18:32                 ` T.J. Mercier