* [PATCH v3 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1
@ 2025-01-28 20:35 Mikhail Rudenko
2025-01-28 20:35 ` [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko
2025-01-28 20:35 ` [PATCH v3 2/2] media: rkisp1: Allow non-coherent video capture buffers Mikhail Rudenko
0 siblings, 2 replies; 7+ messages in thread
From: Mikhail Rudenko @ 2025-01-28 20:35 UTC (permalink / raw)
To: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab,
Heiko Stuebner, Tomasz Figa, Marek Szyprowski, Hans Verkuil,
Sergey Senozhatsky
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
Mauro Carvalho Chehab, Mikhail Rudenko, stable
This small series adds support for non-coherent video capture buffers
on Rockchip ISP V1. Patch 1 fixes cache management for dmabuf's
allocated by dma-contig allocator. Patch 2 allows non-coherent
allocations on the rkisp1 capture queue. Some timing measurements are
provided in the commit message of patch 2.
Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
---
Changes in v3:
- ignore skip_cache_sync_* flags in vb2_dc_dmabuf_ops_{begin,end}_cpu_access
- invalidate/flush kernel mappings as appropriate if they exist
- use dma_sync_sgtable_* instead of dma_sync_sg_*
- Link to v2: https://lore.kernel.org/r/20250115-b4-rkisp-noncoherent-v2-0-0853e1a24012@gmail.com
Changes in v2:
- Fix vb2_dc_dmabuf_ops_{begin,end}_cpu_access() for non-coherent buffers.
- Add cache management timing information to patch 2 commit message.
- Link to v1: https://lore.kernel.org/r/20250102-b4-rkisp-noncoherent-v1-1-bba164f7132c@gmail.com
---
Mikhail Rudenko (2):
media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
media: rkisp1: Allow non-coherent video capture buffers
.../media/common/videobuf2/videobuf2-dma-contig.c | 22 ++++++++++++++++++++++
.../platform/rockchip/rkisp1/rkisp1-capture.c | 1 +
2 files changed, 23 insertions(+)
---
base-commit: c4b7779abc6633677e6edb79e2809f4f61fde157
change-id: 20241231-b4-rkisp-noncoherent-ad6e7c7a68ba
Best regards,
--
Mikhail Rudenko <mike.rudenko@gmail.com>
^ permalink raw reply [flat|nested] 7+ messages in thread* [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig 2025-01-28 20:35 [PATCH v3 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1 Mikhail Rudenko @ 2025-01-28 20:35 ` Mikhail Rudenko 2025-02-07 16:22 ` Tomasz Figa 2025-02-28 3:30 ` Nicolas Dufresne 2025-01-28 20:35 ` [PATCH v3 2/2] media: rkisp1: Allow non-coherent video capture buffers Mikhail Rudenko 1 sibling, 2 replies; 7+ messages in thread From: Mikhail Rudenko @ 2025-01-28 20:35 UTC (permalink / raw) To: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab, Heiko Stuebner, Tomasz Figa, Marek Szyprowski, Hans Verkuil, Sergey Senozhatsky Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel, Mauro Carvalho Chehab, Mikhail Rudenko, stable When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in commit 129134e5415d ("media: media/v4l2: remove V4L2_FLAG_MEMORY_NON_CONSISTENT flag"), vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was introduced in commit c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained no-ops, making cache maintenance for non-coherent dmabufs allocated by dma-contig impossible. Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and {flush,invalidate}_kernel_vmap_range calls to vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent buffers. Fixes: c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag") Cc: stable@vger.kernel.org Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com> --- .../media/common/videobuf2/videobuf2-dma-contig.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c index bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..146d7997a0da5989fb081a6f28ce0641fe726e63 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c @@ -427,6 +427,17 @@ static int vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf, enum dma_data_direction direction) { + struct vb2_dc_buf *buf = dbuf->priv; + struct sg_table *sgt = buf->dma_sgt; + + if (!buf->non_coherent_mem) + return 0; + + if (buf->vaddr) + invalidate_kernel_vmap_range(buf->vaddr, buf->size); + + dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir); + return 0; } @@ -434,6 +445,17 @@ static int vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf, enum dma_data_direction direction) { + struct vb2_dc_buf *buf = dbuf->priv; + struct sg_table *sgt = buf->dma_sgt; + + if (!buf->non_coherent_mem) + return 0; + + if (buf->vaddr) + flush_kernel_vmap_range(buf->vaddr, buf->size); + + dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir); + return 0; } -- 2.47.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig 2025-01-28 20:35 ` [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko @ 2025-02-07 16:22 ` Tomasz Figa 2025-02-28 3:30 ` Nicolas Dufresne 1 sibling, 0 replies; 7+ messages in thread From: Tomasz Figa @ 2025-02-07 16:22 UTC (permalink / raw) To: Mikhail Rudenko, Christoph Hellwig, Robin Murphy, Hans Verkuil Cc: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab, Heiko Stuebner, Marek Szyprowski, Sergey Senozhatsky, linux-media, linux-rockchip, linux-arm-kernel, linux-kernel, Mauro Carvalho Chehab, stable On Tue, Jan 28, 2025 at 9:36 PM Mikhail Rudenko <mike.rudenko@gmail.com> wrote: > > When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in > commit 129134e5415d ("media: media/v4l2: remove > V4L2_FLAG_MEMORY_NON_CONSISTENT flag"), > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made > no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was > introduced in commit c0acf9cfeee0 ("media: videobuf2: handle > V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained > no-ops, making cache maintenance for non-coherent dmabufs allocated by > dma-contig impossible. > > Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and > {flush,invalidate}_kernel_vmap_range calls to > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent > buffers. > > Fixes: c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag") > Cc: stable@vger.kernel.org > Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com> > --- > .../media/common/videobuf2/videobuf2-dma-contig.c | 22 ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) > > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > index bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..146d7997a0da5989fb081a6f28ce0641fe726e63 100644 > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > @@ -427,6 +427,17 @@ static int > vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf, > enum dma_data_direction direction) > { > + struct vb2_dc_buf *buf = dbuf->priv; > + struct sg_table *sgt = buf->dma_sgt; > + > + if (!buf->non_coherent_mem) > + return 0; > + > + if (buf->vaddr) > + invalidate_kernel_vmap_range(buf->vaddr, buf->size); > + > + dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir); > + > return 0; > } > > @@ -434,6 +445,17 @@ static int > vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf, > enum dma_data_direction direction) > { > + struct vb2_dc_buf *buf = dbuf->priv; > + struct sg_table *sgt = buf->dma_sgt; > + > + if (!buf->non_coherent_mem) > + return 0; > + > + if (buf->vaddr) > + flush_kernel_vmap_range(buf->vaddr, buf->size); > + > + dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir); > + > return 0; > } I took some time (over)thinking the kernel vmap range synchronization, because these functions can be called both from the kernel space using respective dma_buf_*() kAPI and also from the user space using the DMA_BUF_SYNC IOCTLs, so we could in theory have the multiple invocations racing with each other, but then I realized that we don't really provide any guarantees for concurrent writes and reads from the CPU, so I believe this should work fine. Sorry for the delay. Acked-by: Tomasz Figa <tfiga@chromium.org> Let me add @Christoph Hellwig and @Robin Murphy just in case I'm wrong on that, though... Hans, let's give them some time to take a look before applying this. Best regards, Tomasz ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig 2025-01-28 20:35 ` [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko 2025-02-07 16:22 ` Tomasz Figa @ 2025-02-28 3:30 ` Nicolas Dufresne 2025-02-28 10:25 ` Tomasz Figa 1 sibling, 1 reply; 7+ messages in thread From: Nicolas Dufresne @ 2025-02-28 3:30 UTC (permalink / raw) To: Mikhail Rudenko, Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab, Heiko Stuebner, Tomasz Figa, Marek Szyprowski, Hans Verkuil, Sergey Senozhatsky Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel, Mauro Carvalho Chehab, stable Le mardi 28 janvier 2025 à 23:35 +0300, Mikhail Rudenko a écrit : > When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in > commit 129134e5415d ("media: media/v4l2: remove > V4L2_FLAG_MEMORY_NON_CONSISTENT flag"), > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made > no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was > introduced in commit c0acf9cfeee0 ("media: videobuf2: handle > V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained > no-ops, making cache maintenance for non-coherent dmabufs allocated > by > dma-contig impossible. > > Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and > {flush,invalidate}_kernel_vmap_range calls to > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent > buffers. > > Fixes: c0acf9cfeee0 ("media: videobuf2: handle > V4L2_MEMORY_FLAG_NON_COHERENT flag") > Cc: stable@vger.kernel.org > Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com> > --- > .../media/common/videobuf2/videobuf2-dma-contig.c | 22 > ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) > > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > index > bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..146d7997a0da5989fb081a6f28c > e0641fe726e63 100644 > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > @@ -427,6 +427,17 @@ static int > vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf, > enum dma_data_direction > direction) > { > + struct vb2_dc_buf *buf = dbuf->priv; > + struct sg_table *sgt = buf->dma_sgt; > + > + if (!buf->non_coherent_mem) > + return 0; > + > + if (buf->vaddr) > + invalidate_kernel_vmap_range(buf->vaddr, buf->size); Am I correct that this is mostly to prevent the kernel from reading back old data from the cache after an application or other driver did CPU writes ? If so, can't we restrict that to DMA_TO_DEVICE and DMA_BIDIRECTIONAL ? As for pending kernel writes, they should have been flushed before the buffer is made available for dequeue. And any access while a buffer is queued is concurrent access, which is expected to have undefined behaviour. > + > + dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir); Isn't there a link to make between buf->dma_dir and direcction before calling this ? Also, shouldn't we use direction insead of buf->dma_dir to possibly limit the scope ? > + > return 0; > } > > @@ -434,6 +445,17 @@ static int > vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf, > enum dma_data_direction direction) > { > + struct vb2_dc_buf *buf = dbuf->priv; > + struct sg_table *sgt = buf->dma_sgt; > + > + if (!buf->non_coherent_mem) > + return 0; > + > + if (buf->vaddr) > + flush_kernel_vmap_range(buf->vaddr, buf->size); > + > + dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir); Similar questions for the end_cpu_access implementation. Nicolas > + > return 0; > } > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig 2025-02-28 3:30 ` Nicolas Dufresne @ 2025-02-28 10:25 ` Tomasz Figa 2025-02-28 11:41 ` Mikhail Rudenko 0 siblings, 1 reply; 7+ messages in thread From: Tomasz Figa @ 2025-02-28 10:25 UTC (permalink / raw) To: Nicolas Dufresne Cc: Mikhail Rudenko, Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab, Heiko Stuebner, Marek Szyprowski, Hans Verkuil, Sergey Senozhatsky, linux-media, linux-rockchip, linux-arm-kernel, linux-kernel, Mauro Carvalho Chehab, stable On Fri, Feb 28, 2025 at 12:30 PM Nicolas Dufresne <nicolas@ndufresne.ca> wrote: > > Le mardi 28 janvier 2025 à 23:35 +0300, Mikhail Rudenko a écrit : > > When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in > > commit 129134e5415d ("media: media/v4l2: remove > > V4L2_FLAG_MEMORY_NON_CONSISTENT flag"), > > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made > > no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was > > introduced in commit c0acf9cfeee0 ("media: videobuf2: handle > > V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained > > no-ops, making cache maintenance for non-coherent dmabufs allocated > > by > > dma-contig impossible. > > > > Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and > > {flush,invalidate}_kernel_vmap_range calls to > > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent > > buffers. > > > > Fixes: c0acf9cfeee0 ("media: videobuf2: handle > > V4L2_MEMORY_FLAG_NON_COHERENT flag") > > Cc: stable@vger.kernel.org > > Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com> > > --- > > .../media/common/videobuf2/videobuf2-dma-contig.c | 22 > > ++++++++++++++++++++++ > > 1 file changed, 22 insertions(+) > > > > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > > b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > > index > > bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..146d7997a0da5989fb081a6f28c > > e0641fe726e63 100644 > > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > > @@ -427,6 +427,17 @@ static int > > vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf, > > enum dma_data_direction > > direction) > > { > > + struct vb2_dc_buf *buf = dbuf->priv; > > + struct sg_table *sgt = buf->dma_sgt; > > + > > + if (!buf->non_coherent_mem) > > + return 0; > > + > > + if (buf->vaddr) > > + invalidate_kernel_vmap_range(buf->vaddr, buf->size); > > Am I correct that this is mostly to prevent the kernel from reading > back old data from the cache after an application or other driver did > CPU writes ? If so, can't we restrict that to DMA_TO_DEVICE and > DMA_BIDIRECTIONAL ? Note that this function must also synchronize between the user-space and kernel mappings, where the DMA direction doesn't really matter. Also it's unlikely for it to be called when not needed - why would one begin a CPU access before the DMA, when the DMA is FROM_DEVICE? > > As for pending kernel writes, they should have been flushed before the > buffer is made available for dequeue. There is no implicit flushing for imported DMA-bufs. All the flushing needs to be executed directly by the CPU accessors by surrounding the access with begin and end CPU access, be it in the kernel or userspace. > And any access while a buffer is > queued is concurrent access, which is expected to have undefined > behaviour. > Correct. > > + > > + dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir); > > Isn't there a link to make between buf->dma_dir and direcction before > calling this ? Also, shouldn't we use direction insead of buf->dma_dir > to possibly limit the scope ? Oh, yes, that's a good catch. It should be |direction| passed here and not |buf->dma_dir|, since the former determines what CPU access will be done. > > > + > > return 0; > > } > > > > @@ -434,6 +445,17 @@ static int > > vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf, > > enum dma_data_direction direction) > > { > > + struct vb2_dc_buf *buf = dbuf->priv; > > + struct sg_table *sgt = buf->dma_sgt; > > + > > + if (!buf->non_coherent_mem) > > + return 0; > > + > > + if (buf->vaddr) > > + flush_kernel_vmap_range(buf->vaddr, buf->size); > > + > > + dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir); > > Similar questions for the end_cpu_access implementation. Yeah, same here. > > Nicolas > > > + > > return 0; > > } > > > > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig 2025-02-28 10:25 ` Tomasz Figa @ 2025-02-28 11:41 ` Mikhail Rudenko 0 siblings, 0 replies; 7+ messages in thread From: Mikhail Rudenko @ 2025-02-28 11:41 UTC (permalink / raw) To: Tomasz Figa Cc: Nicolas Dufresne, Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab, Heiko Stuebner, Marek Szyprowski, Hans Verkuil, Sergey Senozhatsky, linux-media, linux-rockchip, linux-arm-kernel, linux-kernel, Mauro Carvalho Chehab, stable On 2025-02-28 at 19:25 +09, Tomasz Figa <tfiga@chromium.org> wrote: > On Fri, Feb 28, 2025 at 12:30 PM Nicolas Dufresne <nicolas@ndufresne.ca> wrote: >> >> Le mardi 28 janvier 2025 à 23:35 +0300, Mikhail Rudenko a écrit : >> > When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in >> > commit 129134e5415d ("media: media/v4l2: remove >> > V4L2_FLAG_MEMORY_NON_CONSISTENT flag"), >> > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made >> > no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was >> > introduced in commit c0acf9cfeee0 ("media: videobuf2: handle >> > V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained >> > no-ops, making cache maintenance for non-coherent dmabufs allocated >> > by >> > dma-contig impossible. >> > >> > Fix this by reintroducing dma_sync_sgtable_for_{cpu,device} and >> > {flush,invalidate}_kernel_vmap_range calls to >> > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent >> > buffers. >> > >> > Fixes: c0acf9cfeee0 ("media: videobuf2: handle >> > V4L2_MEMORY_FLAG_NON_COHERENT flag") >> > Cc: stable@vger.kernel.org >> > Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com> >> > --- >> > .../media/common/videobuf2/videobuf2-dma-contig.c | 22 >> > ++++++++++++++++++++++ >> > 1 file changed, 22 insertions(+) >> > >> > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c >> > b/drivers/media/common/videobuf2/videobuf2-dma-contig.c >> > index >> > bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..146d7997a0da5989fb081a6f28c >> > e0641fe726e63 100644 >> > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c >> > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c >> > @@ -427,6 +427,17 @@ static int >> > vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf, >> > enum dma_data_direction >> > direction) >> > { >> > + struct vb2_dc_buf *buf = dbuf->priv; >> > + struct sg_table *sgt = buf->dma_sgt; >> > + >> > + if (!buf->non_coherent_mem) >> > + return 0; >> > + >> > + if (buf->vaddr) >> > + invalidate_kernel_vmap_range(buf->vaddr, buf->size); >> >> Am I correct that this is mostly to prevent the kernel from reading >> back old data from the cache after an application or other driver did >> CPU writes ? If so, can't we restrict that to DMA_TO_DEVICE and >> DMA_BIDIRECTIONAL ? > > Note that this function must also synchronize between the user-space > and kernel mappings, where the DMA direction doesn't really matter. > Also it's unlikely for it to be called when not needed - why would one > begin a CPU access before the DMA, when the DMA is FROM_DEVICE? > >> >> As for pending kernel writes, they should have been flushed before the >> buffer is made available for dequeue. > > There is no implicit flushing for imported DMA-bufs. All the flushing > needs to be executed directly by the CPU accessors by surrounding the > access with begin and end CPU access, be it in the kernel or > userspace. > >> And any access while a buffer is >> queued is concurrent access, which is expected to have undefined >> behaviour. >> > > Correct. > >> > + >> > + dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir); >> >> Isn't there a link to make between buf->dma_dir and direcction before >> calling this ? Also, shouldn't we use direction insead of buf->dma_dir >> to possibly limit the scope ? > > Oh, yes, that's a good catch. Indeed, will fix this in v4. > It should be |direction| passed here and > not |buf->dma_dir|, since the former determines what CPU access will > be done. > >> >> > + >> > return 0; >> > } >> > >> > @@ -434,6 +445,17 @@ static int >> > vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf, >> > enum dma_data_direction direction) >> > { >> > + struct vb2_dc_buf *buf = dbuf->priv; >> > + struct sg_table *sgt = buf->dma_sgt; >> > + >> > + if (!buf->non_coherent_mem) >> > + return 0; >> > + >> > + if (buf->vaddr) >> > + flush_kernel_vmap_range(buf->vaddr, buf->size); >> > + >> > + dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir); >> >> Similar questions for the end_cpu_access implementation. > > Yeah, same here. Noted, will fix. >> >> Nicolas >> >> > + >> > return 0; >> > } >> > >> > >> -- Best regards, Mikhail Rudenko ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v3 2/2] media: rkisp1: Allow non-coherent video capture buffers 2025-01-28 20:35 [PATCH v3 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1 Mikhail Rudenko 2025-01-28 20:35 ` [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko @ 2025-01-28 20:35 ` Mikhail Rudenko 1 sibling, 0 replies; 7+ messages in thread From: Mikhail Rudenko @ 2025-01-28 20:35 UTC (permalink / raw) To: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab, Heiko Stuebner, Tomasz Figa, Marek Szyprowski, Hans Verkuil, Sergey Senozhatsky Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel, Mauro Carvalho Chehab, Mikhail Rudenko Currently, the rkisp1 driver always uses coherent DMA allocations for video capture buffers. However, on some platforms, using non-coherent buffers can improve performance, especially when CPU processing of MMAP'ed video buffers is required. For example, on the Rockchip RK3399 running at maximum CPU frequency, the time to memcpy a frame from a 1280x720 XRGB32 MMAP'ed buffer to a malloc'ed userspace buffer decreases from 7.7 ms to 1.1 ms when using non-coherent DMA allocation. When doing cache management with DMA_BUF_IOCTL_SYNC, DMA_BUF_SYNC_START and DMA_BUF_SYNC_END operations take around 270 us each. This change allows userspace to request the allocation of non-coherent buffers. Note that the behavior for existing users will remain unchanged unless they explicitly set the V4L2_MEMORY_FLAG_NON_COHERENT flag when allocating buffers. Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com> --- drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c index 6dcefd144d5abe358323e37ac6133c6134ac636e..c94f7d1d73a92646457a27da20726ec6f92e7717 100644 --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c @@ -1563,6 +1563,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC; q->lock = &node->vlock; q->dev = cap->rkisp1->dev; + q->allow_cache_hints = 1; ret = vb2_queue_init(q); if (ret) { dev_err(cap->rkisp1->dev, -- 2.47.1 ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-02-28 12:14 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-01-28 20:35 [PATCH v3 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1 Mikhail Rudenko 2025-01-28 20:35 ` [PATCH v3 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko 2025-02-07 16:22 ` Tomasz Figa 2025-02-28 3:30 ` Nicolas Dufresne 2025-02-28 10:25 ` Tomasz Figa 2025-02-28 11:41 ` Mikhail Rudenko 2025-01-28 20:35 ` [PATCH v3 2/2] media: rkisp1: Allow non-coherent video capture buffers Mikhail Rudenko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).