linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1
@ 2025-01-15 17:25 Mikhail Rudenko
  2025-01-15 17:25 ` [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko
  2025-01-15 17:25 ` [PATCH v2 2/2] media: rkisp1: Allow non-coherent video capture buffers Mikhail Rudenko
  0 siblings, 2 replies; 5+ messages in thread
From: Mikhail Rudenko @ 2025-01-15 17:25 UTC (permalink / raw)
  To: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab,
	Heiko Stuebner, Tomasz Figa, Marek Szyprowski, Hans Verkuil,
	Sergey Senozhatsky
  Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
	Mauro Carvalho Chehab, Mikhail Rudenko, stable

This small series adds support for non-coherent video capture buffers
on Rockchip ISP V1. Patch 1 fixes cache management for dmabuf's
allocated by dma-contig allocator. Patch 2 allows non-coherent
allocations on the rkisp1 capture queue. Some timing measurements are
provided in the commit message of patch 2.

Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
---
Changes in v2:
- Fix vb2_dc_dmabuf_ops_{begin,end}_cpu_access() for non-coherent buffers.
- Add cache management timing information to patch 2 commit message.
- Link to v1: https://lore.kernel.org/r/20250102-b4-rkisp-noncoherent-v1-1-bba164f7132c@gmail.com

---
Mikhail Rudenko (2):
      media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
      media: rkisp1: Allow non-coherent video capture buffers

 drivers/media/common/videobuf2/videobuf2-dma-contig.c   | 14 ++++++++++++++
 drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c |  1 +
 2 files changed, 15 insertions(+)
---
base-commit: 94794b5ce4d90ab134b0b101a02fddf6e74c437d
change-id: 20241231-b4-rkisp-noncoherent-ad6e7c7a68ba

Best regards,
-- 
Mikhail Rudenko <mike.rudenko@gmail.com>



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
  2025-01-15 17:25 [PATCH v2 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1 Mikhail Rudenko
@ 2025-01-15 17:25 ` Mikhail Rudenko
  2025-01-27 11:18   ` Tomasz Figa
  2025-01-15 17:25 ` [PATCH v2 2/2] media: rkisp1: Allow non-coherent video capture buffers Mikhail Rudenko
  1 sibling, 1 reply; 5+ messages in thread
From: Mikhail Rudenko @ 2025-01-15 17:25 UTC (permalink / raw)
  To: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab,
	Heiko Stuebner, Tomasz Figa, Marek Szyprowski, Hans Verkuil,
	Sergey Senozhatsky
  Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
	Mauro Carvalho Chehab, Mikhail Rudenko, stable

When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in
commit 129134e5415d ("media: media/v4l2: remove
V4L2_FLAG_MEMORY_NON_CONSISTENT flag"),
vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made
no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was
introduced in commit c0acf9cfeee0 ("media: videobuf2: handle
V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained
no-ops, making cache maintenance for non-coherent dmabufs allocated by
dma-contig impossible.

Fix this by reintroducing dma_sync_sg_for_{cpu,device} calls to
vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent
buffers.

Fixes: c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag")
Cc: stable@vger.kernel.org
Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
---
 drivers/media/common/videobuf2/videobuf2-dma-contig.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
index bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..889d6c11e15ab5cd4b4c317e865f1fef92df4b66 100644
--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
@@ -427,6 +427,13 @@ static int
 vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
 				   enum dma_data_direction direction)
 {
+	struct vb2_dc_buf *buf = dbuf->priv;
+	struct sg_table *sgt = buf->dma_sgt;
+
+	if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_finish)
+		return 0;
+
+	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
 	return 0;
 }
 
@@ -434,6 +441,13 @@ static int
 vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
 				 enum dma_data_direction direction)
 {
+	struct vb2_dc_buf *buf = dbuf->priv;
+	struct sg_table *sgt = buf->dma_sgt;
+
+	if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_prepare)
+		return 0;
+
+	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
 	return 0;
 }
 

-- 
2.47.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/2] media: rkisp1: Allow non-coherent video capture buffers
  2025-01-15 17:25 [PATCH v2 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1 Mikhail Rudenko
  2025-01-15 17:25 ` [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko
@ 2025-01-15 17:25 ` Mikhail Rudenko
  1 sibling, 0 replies; 5+ messages in thread
From: Mikhail Rudenko @ 2025-01-15 17:25 UTC (permalink / raw)
  To: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab,
	Heiko Stuebner, Tomasz Figa, Marek Szyprowski, Hans Verkuil,
	Sergey Senozhatsky
  Cc: linux-media, linux-rockchip, linux-arm-kernel, linux-kernel,
	Mauro Carvalho Chehab, Mikhail Rudenko

Currently, the rkisp1 driver always uses coherent DMA allocations for
video capture buffers. However, on some platforms, using non-coherent
buffers can improve performance, especially when CPU processing of
MMAP'ed video buffers is required.

For example, on the Rockchip RK3399 running at maximum CPU frequency,
the time to memcpy a frame from a 1280x720 XRGB32 MMAP'ed buffer to a
malloc'ed userspace buffer decreases from 7.7 ms to 1.1 ms when using
non-coherent DMA allocation. When doing cache management with
DMA_BUF_IOCTL_SYNC, DMA_BUF_SYNC_START and DMA_BUF_SYNC_END operations
take around 270 us each.

This change allows userspace to request the allocation of non-coherent
buffers. Note that the behavior for existing users will remain unchanged
unless they explicitly set the V4L2_MEMORY_FLAG_NON_COHERENT flag when
allocating buffers.

Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
---
 drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c
index 6dcefd144d5abe358323e37ac6133c6134ac636e..c94f7d1d73a92646457a27da20726ec6f92e7717 100644
--- a/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c
+++ b/drivers/media/platform/rockchip/rkisp1/rkisp1-capture.c
@@ -1563,6 +1563,7 @@ static int rkisp1_register_capture(struct rkisp1_capture *cap)
 	q->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
 	q->lock = &node->vlock;
 	q->dev = cap->rkisp1->dev;
+	q->allow_cache_hints = 1;
 	ret = vb2_queue_init(q);
 	if (ret) {
 		dev_err(cap->rkisp1->dev,

-- 
2.47.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
  2025-01-15 17:25 ` [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko
@ 2025-01-27 11:18   ` Tomasz Figa
  2025-01-27 19:25     ` Mikhail Rudenko
  0 siblings, 1 reply; 5+ messages in thread
From: Tomasz Figa @ 2025-01-27 11:18 UTC (permalink / raw)
  To: Mikhail Rudenko
  Cc: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab,
	Heiko Stuebner, Marek Szyprowski, Hans Verkuil,
	Sergey Senozhatsky, linux-media, linux-rockchip, linux-arm-kernel,
	linux-kernel, Mauro Carvalho Chehab, stable

Hi Mikhail,

On Thu, Jan 16, 2025 at 2:25 AM Mikhail Rudenko <mike.rudenko@gmail.com> wrote:
>
> When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in
> commit 129134e5415d ("media: media/v4l2: remove
> V4L2_FLAG_MEMORY_NON_CONSISTENT flag"),
> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made
> no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was
> introduced in commit c0acf9cfeee0 ("media: videobuf2: handle
> V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained
> no-ops, making cache maintenance for non-coherent dmabufs allocated by
> dma-contig impossible.
>
> Fix this by reintroducing dma_sync_sg_for_{cpu,device} calls to
> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent
> buffers.
>
> Fixes: c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
> ---
>  drivers/media/common/videobuf2/videobuf2-dma-contig.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>

Thanks a lot for the patch!
Sorry, for the delay. Ended up being sick with some nasty cold that
took quite a while to recover.
Please take a look at my comments inline.

> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> index bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..889d6c11e15ab5cd4b4c317e865f1fef92df4b66 100644
> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
> @@ -427,6 +427,13 @@ static int
>  vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
>                                    enum dma_data_direction direction)
>  {
> +       struct vb2_dc_buf *buf = dbuf->priv;
> +       struct sg_table *sgt = buf->dma_sgt;
> +
> +       if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_finish)

skip_cache_sync_on_finish shouldn't apply to this function, because
the buffer was shared with an external DMA-buf importer and we don't
know in what state it is at this point.

> +               return 0;
> +

We should also take care of the kernel mapping if it exists, because
on some platforms it may not be coherent with the userspace one -
using flush_kernel_vmap_range(). Please check how
vb2_dc_prepare()/vb2_dc_finish() do it.

> +       dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);

We can use the dma_sync_sgtable_*() variant here so we can just pass
the entire sgt to it.

>         return 0;
>  }
>
> @@ -434,6 +441,13 @@ static int
>  vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
>                                  enum dma_data_direction direction)
>  {
> +       struct vb2_dc_buf *buf = dbuf->priv;
> +       struct sg_table *sgt = buf->dma_sgt;
> +
> +       if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_prepare)
> +               return 0;
> +
> +       dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
>         return 0;

Overall the same comments here as for
vb2_dc_dmabuf_ops_begin_cpu_access() +/- flush would change to
invalidate.

Best regards,
Tomasz


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig
  2025-01-27 11:18   ` Tomasz Figa
@ 2025-01-27 19:25     ` Mikhail Rudenko
  0 siblings, 0 replies; 5+ messages in thread
From: Mikhail Rudenko @ 2025-01-27 19:25 UTC (permalink / raw)
  To: Tomasz Figa
  Cc: Dafna Hirschfeld, Laurent Pinchart, Mauro Carvalho Chehab,
	Heiko Stuebner, Marek Szyprowski, Hans Verkuil,
	Sergey Senozhatsky, linux-media, linux-rockchip, linux-arm-kernel,
	linux-kernel, Mauro Carvalho Chehab, stable

Hi Tomasz,

and thanks for your review!

On 2025-01-27 at 20:18 +09, Tomasz Figa <tfiga@chromium.org> wrote:

> Hi Mikhail,
>
> On Thu, Jan 16, 2025 at 2:25 AM Mikhail Rudenko <mike.rudenko@gmail.com> wrote:
>>
>> When support for V4L2_FLAG_MEMORY_NON_CONSISTENT was removed in
>> commit 129134e5415d ("media: media/v4l2: remove
>> V4L2_FLAG_MEMORY_NON_CONSISTENT flag"),
>> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions were made
>> no-ops. Later, when support for V4L2_MEMORY_FLAG_NON_COHERENT was
>> introduced in commit c0acf9cfeee0 ("media: videobuf2: handle
>> V4L2_MEMORY_FLAG_NON_COHERENT flag"), the above functions remained
>> no-ops, making cache maintenance for non-coherent dmabufs allocated by
>> dma-contig impossible.
>>
>> Fix this by reintroducing dma_sync_sg_for_{cpu,device} calls to
>> vb2_dc_dmabuf_ops_{begin,end}_cpu_access() functions for non-coherent
>> buffers.
>>
>> Fixes: c0acf9cfeee0 ("media: videobuf2: handle V4L2_MEMORY_FLAG_NON_COHERENT flag")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Mikhail Rudenko <mike.rudenko@gmail.com>
>> ---
>>  drivers/media/common/videobuf2/videobuf2-dma-contig.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>
> Thanks a lot for the patch!
> Sorry, for the delay. Ended up being sick with some nasty cold that
> took quite a while to recover.
> Please take a look at my comments inline.
>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> index bb0b7fa67b539aa73ad5ccf3c3bc318e26f8a4cb..889d6c11e15ab5cd4b4c317e865f1fef92df4b66 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>> @@ -427,6 +427,13 @@ static int
>>  vb2_dc_dmabuf_ops_begin_cpu_access(struct dma_buf *dbuf,
>>                                    enum dma_data_direction direction)
>>  {
>> +       struct vb2_dc_buf *buf = dbuf->priv;
>> +       struct sg_table *sgt = buf->dma_sgt;
>> +
>> +       if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_finish)
>
> skip_cache_sync_on_finish shouldn't apply to this function, because
> the buffer was shared with an external DMA-buf importer and we don't
> know in what state it is at this point.
>

Ack, will fix in v3.

>> +               return 0;
>> +
>
> We should also take care of the kernel mapping if it exists, because
> on some platforms it may not be coherent with the userspace one -
> using flush_kernel_vmap_range(). Please check how
> vb2_dc_prepare()/vb2_dc_finish() do it.
>

Thanks for the pointer, will do so in v3.

>> +       dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
>
> We can use the dma_sync_sgtable_*() variant here so we can just pass
> the entire sgt to it.
>

Will do so.

>>         return 0;
>>  }
>>
>> @@ -434,6 +441,13 @@ static int
>>  vb2_dc_dmabuf_ops_end_cpu_access(struct dma_buf *dbuf,
>>                                  enum dma_data_direction direction)
>>  {
>> +       struct vb2_dc_buf *buf = dbuf->priv;
>> +       struct sg_table *sgt = buf->dma_sgt;
>> +
>> +       if (!buf->non_coherent_mem || buf->vb->skip_cache_sync_on_prepare)
>> +               return 0;
>> +
>> +       dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->nents, buf->dma_dir);
>>         return 0;
>
> Overall the same comments here as for
> vb2_dc_dmabuf_ops_begin_cpu_access() +/- flush would change to
> invalidate.

Ack.

> Best regards,
> Tomasz


--
Best regards,
Mikhail Rudenko


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-01-27 19:36 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-15 17:25 [PATCH v2 0/2] Allow non-coherent video capture buffers on Rockchip ISP V1 Mikhail Rudenko
2025-01-15 17:25 ` [PATCH v2 1/2] media: videobuf2: Fix dmabuf cache sync/flush in dma-contig Mikhail Rudenko
2025-01-27 11:18   ` Tomasz Figa
2025-01-27 19:25     ` Mikhail Rudenko
2025-01-15 17:25 ` [PATCH v2 2/2] media: rkisp1: Allow non-coherent video capture buffers Mikhail Rudenko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).