From: Mikhail Rudenko <mike.rudenko@gmail.com>
To: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Dafna Hirschfeld <dafna@fastmail.com>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Heiko Stuebner <heiko@sntech.de>,
	linux-media@vger.kernel.org, linux-rockchip@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] media: rkisp1: allow non-coherent video capture buffers
Date: Tue, 14 Jan 2025 19:00:39 +0300
Message-ID: <87bjw9s4s3.fsf@gmail.com>
In-Reply-To: <20250103152326.GP554@pendragon.ideasonboard.com>


Hi Laurent,

On 2025-01-03 at 17:23 +02, Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote:

> On Thu, Jan 02, 2025 at 06:35:00PM +0300, Mikhail Rudenko wrote:
>> Currently, the rkisp1 driver always uses coherent DMA allocations for
>> video capture buffers. However, on some platforms, using non-coherent
>> buffers can improve performance, especially when CPU processing of
>> MMAP'ed video buffers is required.
>>
>> For example, on the Rockchip RK3399 running at maximum CPU frequency,
>> the time to memcpy a frame from a 1280x720 XRGB32 MMAP'ed buffer to a
>> malloc'ed userspace buffer decreases from 7.7 ms to 1.1 ms when using
>> non-coherent DMA allocation. CPU usage also decreases accordingly.
>
> What's the time taken by the cache management operations ?

Sorry for the late reply; your question turned out to be a little more
interesting than I initially expected. :)

When capturing with yavta using MMAP buffers under the conditions
mentioned in the commit message, ftrace gives 437.6 +- 1.1 us for
dma_sync_sgtable_for_cpu and 409 +- 14 us for
dma_sync_sgtable_for_device. Thus, non-coherent buffers look more
CPU-efficient in this case even when the cache management overhead is
taken into account.
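
As a rough sanity check of the figures above, here is a minimal sketch
that adds them up. It assumes the memcpy time and the two sync times are
independent and simply sum (the total was not measured as a whole), and
the struct and function names are hypothetical, used only for
illustration.

```c
#include <stdio.h>

/* CPU cost in microseconds, using the figures quoted above. */
struct frame_cost {
	double memcpy_us; /* copy one frame out of the MMAP'ed buffer */
	double sync_us;   /* cache maintenance (0 for coherent buffers) */
};

/* Assumption: the individual costs are independent and simply add up. */
static double frame_cost_total_us(struct frame_cost c)
{
	return c.memcpy_us + c.sync_us;
}

/* Hypothetical helper: print the comparison for the RK3399 numbers. */
static void print_cost_comparison(void)
{
	struct frame_cost coherent = { .memcpy_us = 7700.0, .sync_us = 0.0 };
	struct frame_cost non_coherent = {
		.memcpy_us = 1100.0,
		/* dma_sync_sgtable_for_cpu + dma_sync_sgtable_for_device */
		.sync_us = 437.6 + 409.0,
	};

	printf("coherent:     %.1f us\n", frame_cost_total_us(coherent));
	printf("non-coherent: %.1f us\n", frame_cost_total_us(non_coherent));
}
```

Under that assumption the non-coherent path costs about 1.95 ms in total
(1.1 ms memcpy plus roughly 0.85 ms of cache maintenance), against 7.7 ms
for the coherent one.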

I was unable to repeat the same measurements with libcamera. In a
typical libcamera use case, where MMAP buffers are allocated from a
device, exported as dmabufs, and then used for capture on the same
device with the DMABUF memory type, cache management in the kernel is
skipped [1] [2]. Also, vb2_dc_dmabuf_ops_{begin,end}_cpu_access are
no-ops [3], so DMA_BUF_IOCTL_SYNC from userspace does not work either.
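
For reference, the userspace side of the DMA_BUF_IOCTL_SYNC bracket
looks roughly like the sketch below. It uses only the uAPI from
<linux/dma-buf.h>; the helper names are hypothetical and the dmabuf fd is
assumed to come from the application (e.g. via VIDIOC_EXPBUF). With
videobuf2-dma-contig as the exporter, per [3], these calls would not
trigger any cache maintenance at the commit referenced here.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/*
 * Issue DMA_BUF_IOCTL_SYNC on a dmabuf fd, retrying on EINTR/EAGAIN.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int dmabuf_sync(int dmabuf_fd, __u64 flags)
{
	struct dma_buf_sync sync = { .flags = flags };
	int ret;

	do {
		ret = ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));

	return ret;
}

/* Hypothetical helper: call before the CPU reads a captured frame. */
static int dmabuf_begin_cpu_read(int dmabuf_fd)
{
	return dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ);
}

/* Hypothetical helper: call after the CPU is done reading the frame. */
static int dmabuf_end_cpu_read(int dmabuf_fd)
{
	return dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ);
}
```

An application would call dmabuf_begin_cpu_read() after dequeuing a
buffer and dmabuf_end_cpu_read() before queuing it again; whether that
has any effect depends on the exporter implementing
begin/end_cpu_access.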

So it looks like, for this change to be really useful, the above cache
management issue for libcamera/DMABUF/videobuf2-dma-contig has to be
solved. I'm not an expert in this area, so any advice is welcome. :)

[1] https://git.linuxtv.org/media.git/tree/drivers/media/common/videobuf2/videobuf2-core.c?id=94794b5ce4d90ab134b0b101a02fddf6e74c437d#n411
[2] https://git.linuxtv.org/media.git/tree/drivers/media/common/videobuf2/videobuf2-core.c?id=94794b5ce4d90ab134b0b101a02fddf6e74c437d#n829
[3] https://git.linuxtv.org/media.git/tree/drivers/media/common/videobuf2/videobuf2-dma-contig.c?id=94794b5ce4d90ab134b0b101a02fddf6e74c437d#n426

--
Best regards,
Mikhail Rudenko


Thread overview: 34+ messages
2025-01-02 15:35 [PATCH] media: rkisp1: allow non-coherent video capture buffers Mikhail Rudenko
2025-01-03 15:23 ` Laurent Pinchart
2025-01-14 16:00   ` Mikhail Rudenko [this message]
2025-01-15  8:31     ` Tomasz Figa
2025-01-15 13:24       ` Mikhail Rudenko
2025-01-15 14:46         ` Tomasz Figa
2025-01-15 17:29           ` Mikhail Rudenko
2025-01-15 19:13     ` Nicolas Dufresne
2025-02-27 17:05     ` Jacopo Mondi
2025-02-27 20:46       ` Mikhail Rudenko
2025-02-28  2:58         ` Nicolas Dufresne
2025-02-28  9:54           ` Hans Verkuil
2025-02-28 10:00       ` Tomasz Figa
2025-02-28 10:18         ` Jacopo Mondi
2025-02-28 10:28           ` Tomasz Figa
2025-02-28 10:48             ` Jacopo Mondi
2025-02-28 11:19               ` Tomasz Figa
