From: Mikhail Rudenko <mike.rudenko@gmail.com>
To: Tomasz Figa <tfiga@chromium.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
	Dafna Hirschfeld <dafna@fastmail.com>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Heiko Stuebner <heiko@sntech.de>,
	linux-media@vger.kernel.org, linux-rockchip@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] media: rkisp1: allow non-coherent video capture buffers
Date: Wed, 15 Jan 2025 20:29:58 +0300
Message-ID: <87y0zcq8wy.fsf@gmail.com>
In-Reply-To: <CAAFQd5C89M1TtpaCoK56Jd2Kq+h6+z552KY6cAqiDjMjDCFdWQ@mail.gmail.com>

On 2025-01-15 at 23:46 +09, Tomasz Figa <tfiga@chromium.org> wrote:
> On Wed, Jan 15, 2025 at 10:30 PM Mikhail Rudenko <mike.rudenko@gmail.com> wrote:
>>
>> Hi Tomasz,
>>
>> On 2025-01-15 at 17:31 +09, Tomasz Figa <tfiga@chromium.org> wrote:
>>
>> > Hi Mikhail and Laurent,
>> >
>> > On Wed, Jan 15, 2025 at 2:07 AM Mikhail Rudenko <mike.rudenko@gmail.com> wrote:
>> >>
>> >>
>> >> Hi Laurent,
>> >>
>> >> On 2025-01-03 at 17:23 +02, Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote:
>> >>
>> >> > On Thu, Jan 02, 2025 at 06:35:00PM +0300, Mikhail Rudenko wrote:
>> >> >> Currently, the rkisp1 driver always uses coherent DMA allocations for
>> >> >> video capture buffers. However, on some platforms, using non-coherent
>> >> >> buffers can improve performance, especially when CPU processing of
>> >> >> MMAP'ed video buffers is required.
>> >> >>
>> >> >> For example, on the Rockchip RK3399 running at maximum CPU frequency,
>> >> >> the time to memcpy a frame from a 1280x720 XRGB32 MMAP'ed buffer to a
>> >> >> malloc'ed userspace buffer decreases from 7.7 ms to 1.1 ms when using
>> >> >> non-coherent DMA allocation. CPU usage also decreases accordingly.
>> >> >
>> >> > What's the time taken by the cache management operations ?
>> >>
>> >> Sorry for the late reply, your question turned out a little more
>> >> interesting than I expected initially. :)
>> >>
>> >> When capturing using Yavta with MMAP buffers under the conditions mentioned
>> >> in the commit message, ftrace gives 437.6 +- 1.1 us for
>> >> dma_sync_sgtable_for_cpu and 409 +- 14 us for
>> >> dma_sync_sgtable_for_device. Thus, it looks like using non-coherent
>> >> buffers in this case is more CPU-efficient even when considering cache
>> >> management overhead.
>> >>
>> >> When trying to do the same measurements with libcamera, I failed. In a
>> >> typical libcamera use case when MMAP buffers are allocated from a
>> >> device, exported as dmabufs and then used for capture on the same device
>> >> with DMABUF memory type, cache management in kernel is skipped [1]
>> >> [2]. Also, vb2_dc_dmabuf_ops_{begin,end}_cpu_access are no-ops [3], so
>> >> DMA_BUF_IOCTL_SYNC from userspace does not work either.
>> >
>> > Oops, so I believe this is a bug. When an MMAP buffer is allocated in
>> > the non-coherent mode, those ops should perform proper cache
>> > maintenance.
>>
>> Thanks for pointing this out!
>>
>> > Let me send a patch to fix this in a couple of days unless someone
>> > does it earlier.
>>
>> Now that we know that this is a bug, not an API misuse from my side, I
>> can fix this myself and send a v2. Would this be okay for you?
>
> I'd be more than happy :)

Done, see [1]. A review would be appreciated. :)

[1] https://lore.kernel.org/all/20250115-b4-rkisp-noncoherent-v2-0-0853e1a24012@gmail.com/
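
For reference, the userspace side of the DMA_BUF_IOCTL_SYNC path discussed
above could look roughly like the sketch below. It is only an illustration:
the helper names dmabuf_sync() and copy_frame() are hypothetical and come
from neither yavta nor libcamera, and it assumes the buffer was exported as
a dmabuf and mmap()ed by the caller. As long as
vb2_dc_dmabuf_ops_{begin,end}_cpu_access are no-ops, these ioctls would not
trigger any cache maintenance for such a buffer.

```c
#include <errno.h>
#include <linux/dma-buf.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: issue DMA_BUF_IOCTL_SYNC, retrying if interrupted. */
static int dmabuf_sync(int dmabuf_fd, __u64 flags)
{
	struct dma_buf_sync sync = { .flags = flags };
	int ret;

	do {
		ret = ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
	} while (ret == -1 && (errno == EINTR || errno == EAGAIN));

	return ret;
}

/*
 * Hypothetical helper: copy one captured frame out of an mmap()ed dmabuf,
 * bracketing the CPU read with SYNC_START and SYNC_END.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int copy_frame(int dmabuf_fd, const void *mapped, void *dst, size_t len)
{
	/* Tell the exporter that the CPU is about to read the buffer. */
	if (dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ))
		return -1;

	memcpy(dst, mapped, len);

	/* Tell the exporter that the CPU access is finished. */
	return dmabuf_sync(dmabuf_fd, DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ);
}
```

If the SYNC_START call fails, nothing is copied, so the caller never reads
a frame whose cache state is unknown.
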
--
Best regards,
Mikhail Rudenko