* [PATCH v2 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1
@ 2025-01-15 17:25 Mikhail Rudenko
2025-01-15 17:25 ` [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko
2025-01-15 17:25 ` [PATCH v2 2/2] media: rkisp1: Allow non-coherent video capture buffers Mikhail Rudenko
0 siblings, 2 replies; 5+ messages in thread
From: Mikhail Rudenko @ 2025-01-15 17:25 UTC (permalink / raw)
To: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab,
Heiko Stuebner, Tomasz Figa, Marek Szyprowski, Hans Verkuil,
Sergey Senozhatsky
Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
Mauro Carvalho Chehab, Mikhail Rudenko, stable
This small series adds support for non-coherent video capture buffers
on Rockchip ISP V1. Patch 1 fixes cache management for dmabuf's
allocated by dma-contig allocator. Patch 2 allows non-coherent
allocations on the rkisp1 capture queue. Some timing measurements are
provided in the commit message of patch 2.
Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
---
Changes in v2:
- Fix vb2_dc_dmabuf_ops_{begin,end}_cpu_access() for non-coherent buffers.
- Add cache management timing information to patch 2 commit message.
- Link to v1: https://lore.kernel.org/r/20250102-b4-rkisp-noncoherent-v1-1-bba164f7132c@gmail.com
---
Mikhail Rudenko (2):
media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
media: rkisp1: Allow non-coherent video capture buffers
drivers/media/common/videobuf2/videobuf2-dma-contig.c | 14 ++++++++++++++
drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 1 +
2 files changed, 15 insertions(+)
---
base-commit: 94794b5ce4d90ab134b0b101a02fddf6e74c437d
change-id: 20241231-b4-rkisp-noncoherent-ad6e7c7a68ba
Best regards,
--
Mikhail Rudenko <mike.rudenko@gmail.com>
^ permalink raw reply [flat|nested] 5+ messages in thread* [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig 2025-01-15 17:25 [PATCH v2 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1 Mikhail Rudenko @ 2025-01-15 17:25 ` Mikhail Rudenko 2025-01-27 11:18 ` Tomasz Figa 2025-01-15 17:25 ` [PATCH v2 2/2] media: rkisp1: Allow non-coherent video capture buffers Mikhail Rudenko 1 sibling, 1 reply; 5+ messages in thread From: Mikhail Rudenko @ 2025-01-15 17:25 UTC (permalink / raw) To: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab, Heiko Stuebner, Tomasz Figa, Marek Szyprowski, Hans Verkuil, Sergey Senozhatsky Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel, Mauro Carvalho Chehab, Mikhail Rudenko, stable When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in commit 129134e5415d ("media: media/v4l2: remove V4L2_FLAG_MEMORY_NON_CONSISTENT flag"), vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was introduced in commit c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained no-ops, making cache maintenance for non-coherent dmabufs allocated by dma-contig impossible. Fix this by reintroducing dma_sync_sg_for_{cpu,device} calls to vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent buffers. Fixes: c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag") Cc: stable@vger.kernel.org Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com> --- drivers/media/common/videobuf2/videobuf2-dma-contig.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c index bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..889d6c11e15ab5cd4b4c317e865f1fef92df4b66 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c @@ -427,6 +427,13 @@ static int vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf, enum dma_data_direction direction) { + struct vb2_dc_buf *buf = dbuf->priv; + struct sg_table *sgt = buf->dma_sgt; + + if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_finish) + return 0; + + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); return 0; } @@ -434,6 +441,13 @@ static int vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf, enum dma_data_direction direction) { + struct vb2_dc_buf *buf = dbuf->priv; + struct sg_table *sgt = buf->dma_sgt; + + if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_prepare) + return 0; + + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); return 0; } -- 2.47.1 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig 2025-01-15 17:25 ` [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko @ 2025-01-27 11:18 ` Tomasz Figa 2025-01-27 19:25 ` Mikhail Rudenko 0 siblings, 1 reply; 5+ messages in thread From: Tomasz Figa @ 2025-01-27 11:18 UTC (permalink / raw) To: Mikhail Rudenko Cc: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab, Heiko Stuebner, Marek Szyprowski, Hans Verkuil, Sergey Senozhatsky, linux-media, linux-rockchip, linux-arm-kernel, linux-kernel, Mauro Carvalho Chehab, stable Hi Mikhail, On Thu, Jan 16, 2025 at 2:25 AM Mikhail Rudenko <mike.rudenko@gmail.com> wrote: > > When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in > commit 129134e5415d ("media: media/v4l2: remove > V4L2_FLAG_MEMORY_NON_CONSISTENT flag"), > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made > no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was > introduced in commit c0acf9cfeee0 ("media: videobuf2: handle > V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained > no-ops, making cache maintenance for non-coherent dmabufs allocated by > dma-contig impossible. > > Fix this by reintroducing dma_sync_sg_for_{cpu,device} calls to > vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent > buffers. > > Fixes: c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag") > Cc: stable@vger.kernel.org > Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com> > --- > drivers/media/common/videobuf2/videobuf2-dma-contig.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > Thanks a lot for the patch! Sorry, for the delay. Ended up being sick with some nasty cold that took quite a while to recover. Please take a look at my comments inline. > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > index bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..889d6c11e15ab5cd4b4c317e865f1fef92df4b66 100644 > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > @@ -427,6 +427,13 @@ static int > vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf, > enum dma_data_direction direction) > { > + struct vb2_dc_buf *buf = dbuf->priv; > + struct sg_table *sgt = buf->dma_sgt; > + > + if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_finish) skip_cache_sync_on_finish shouldn't apply to this function, because the buffer was shared with an external DMA-buf importer and we don't know in what state it is at this point. > + return 0; > + We should also take care of the kernel mapping if it exists, because on some platforms it may not be coherent with the userspace one - using flush_kernel_vmap_range(). Please check how vb2_dc_prepare()/vb2_dc_finish() do it. > + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); We can use the dma_sync_sgtable_*() variant here so we can just pass the entire sgt to it. > return 0; > } > > @@ -434,6 +441,13 @@ static int > vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf, > enum dma_data_direction direction) > { > + struct vb2_dc_buf *buf = dbuf->priv; > + struct sg_table *sgt = buf->dma_sgt; > + > + if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_prepare) > + return 0; > + > + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > return 0; Overall the same comments here as for vb2_dc_dmabuf_ops_begin_cpu_access() +/- flush would change to invalidate. Best regards, Tomasz ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig 2025-01-27 11:18 ` Tomasz Figa @ 2025-01-27 19:25 ` Mikhail Rudenko 0 siblings, 0 replies; 5+ messages in thread From: Mikhail Rudenko @ 2025-01-27 19:25 UTC (permalink / raw) To: Tomasz Figa Cc: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab, Heiko Stuebner, Marek Szyprowski, Hans Verkuil, Sergey Senozhatsky, linux-media, linux-rockchip, linux-arm-kernel, linux-kernel, Mauro Carvalho Chehab, stable Hi Tomasz, and thanks for your review! On 2025-01-27 at 20:18 +09, Tomasz Figa <tfiga@chromium.org> wrote: > Hi Mikhail, > > On Thu, Jan 16, 2025 at 2:25 AM Mikhail Rudenko <mike.rudenko@gmail.com> wrote: >> >> When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in >> commit 129134e5415d ("media: media/v4l2: remove >> V4L2_FLAG_MEMORY_NON_CONSISTENT flag"), >> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made >> no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was >> introduced in commit c0acf9cfeee0 ("media: videobuf2: handle >> V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained >> no-ops, making cache maintenance for non-coherent dmabufs allocated by >> dma-contig impossible. >> >> Fix this by reintroducing dma_sync_sg_for_{cpu,device} calls to >> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent >> buffers. >> >> Fixes: c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag") >> Cc: stable@vger.kernel.org >> Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com> >> --- >> drivers/media/common/videobuf2/videobuf2-dma-contig.c | 14 ++++++++++++++ >> 1 file changed, 14 insertions(+) >> > > Thanks a lot for the patch! > Sorry, for the delay. Ended up being sick with some nasty cold that > took quite a while to recover. > Please take a look at my comments inline. > >> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c >> index bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..889d6c11e15ab5cd4b4c317e865f1fef92df4b66 100644 >> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c >> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c >> @@ -427,6 +427,13 @@ static int >> vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf, >> enum dma_data_direction direction) >> { >> + struct vb2_dc_buf *buf = dbuf->priv; >> + struct sg_table *sgt = buf->dma_sgt; >> + >> + if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_finish) > > skip_cache_sync_on_finish shouldn't apply to this function, because > the buffer was shared with an external DMA-buf importer and we don't > know in what state it is at this point. > Ack, will fix in v3. >> + return 0; >> + > > We should also take care of the kernel mapping if it exists, because > on some platforms it may not be coherent with the userspace one - > using flush_kernel_vmap_range(). Please check how > vb2_dc_prepare()/vb2_dc_finish() do it. > Thanks for the pointer, will do so in v3. >> + dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); > > We can use the dma_sync_sgtable_*() variant here so we can just pass > the entire sgt to it. > Will do so. >> return 0; >> } >> >> @@ -434,6 +441,13 @@ static int >> vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf, >> enum dma_data_direction direction) >> { >> + struct vb2_dc_buf *buf = dbuf->priv; >> + struct sg_table *sgt = buf->dma_sgt; >> + >> + if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_prepare) >> + return 0; >> + >> + dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir); >> return 0; > > Overall the same comments here as for > vb2_dc_dmabuf_ops_begin_cpu_access() +/- flush would change to > invalidate. Ack. > Best regards, > Tomasz -- Best regards, Mikhail Rudenko ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v2 2/2] media: rkisp1: Allow non-coherent video capture buffers 2025-01-15 17:25 [PATCH v2 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1 Mikhail Rudenko 2025-01-15 17:25 ` [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko @ 2025-01-15 17:25 ` Mikhail Rudenko 1 sibling, 0 replies; 5+ messages in thread From: Mikhail Rudenko @ 2025-01-15 17:25 UTC (permalink / raw) To: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab, Heiko Stuebner, Tomasz Figa, Marek Szyprowski, Hans Verkuil, Sergey Senozhatsky Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel, Mauro Carvalho Chehab, Mikhail Rudenko Currently, the rkisp1 driver always uses coherent DMA allocations for video capture buffers. However, on some platforms, using non-coherent buffers can improve performance, especially when CPU processing of MMAP'ed video buffers is required. For example, on the Rockchip RK3399 running at maximum CPU frequency, the time to memcpy a frame from a 1280x720 XRGB32 MMAP'ed buffer to a malloc'ed userspace buffer decreases from 7.7 ms to 1.1 ms when using non-coherent DMA allocation. When doing cache management with DMA_BUF_IOCTL_SYNC, DMA_BUF_SYNC_START and DMA_BUF_SYNC_END operations take around 270 us each. This change allows userspace to request the allocation of non-coherent buffers. Note that the behavior for existing users will remain unchanged unless they explicitly set the V4L2_MEMORY_FLAG_NON_COHERENT flag when allocating buffers. Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com> --- drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c index 6dcefd144d5abe358323e37ac6133c6134ac636e..c94f7d1d73a92646457a27da20726ec6f92e7717 100644 --- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c +++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c @@ -1563,6 +1563,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap) q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC; q->lock = &node->vlock; q->dev = cap->rkisp1->dev; + q->allow_cache_hints = 1; ret = vb2_queue_init(q); if (ret) { dev_err(cap->rkisp1->dev, -- 2.47.1 ^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-01-27 19:36 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-01-15 17:25 [PATCH v2 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1 Mikhail Rudenko 2025-01-15 17:25 ` [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko 2025-01-27 11:18 ` Tomasz Figa 2025-01-27 19:25 ` Mikhail Rudenko 2025-01-15 17:25 ` [PATCH v2 2/2] media: rkisp1: Allow non-coherent video capture buffers Mikhail Rudenko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).