From: wangtao <tao.wangtao@honor.com>
To: "Christian König" <christian.koenig@amd.com>,
	"T.J. Mercier" <tjmercier@google.com>
Cc: "sumit.semwal@linaro.org" <sumit.semwal@linaro.org>,
	"benjamin.gaignard@collabora.com"
	<benjamin.gaignard@collabora.com>,
	"Brian.Starkey@arm.com" <Brian.Starkey@arm.com>,
	"jstultz@google.com" <jstultz@google.com>,
	"linux-media@vger.kernel.org" <linux-media@vger.kernel.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"linaro-mm-sig@lists.linaro.org" <linaro-mm-sig@lists.linaro.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"wangbintian(BintianWang)" <bintian.wang@honor.com>,
	yipengxiang <yipengxiang@honor.com>,
	liulu 00013167 <liulu.liu@honor.com>,
	hanfeng 00012985 <feng.han@honor.com>,
	"amir73il@gmail.com" <amir73il@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
	"brauner@kernel.org" <brauner@kernel.org>,
	"hughd@google.com" <hughd@google.com>
Subject: RE: [PATCH 2/2] dmabuf/heaps: implement DMA_BUF_IOCTL_RW_FILE for system_heap
Date: Tue, 27 May 2025 14:35:20 +0000
Message-ID: <d7b506bd3be242d290b57449c353c07a@honor.com>
In-Reply-To: <4a53b6bf-9273-4e77-9882-644faafa200a@amd.com>



> -----Original Message-----
> From: Christian König <christian.koenig@amd.com>
> Sent: Thursday, May 22, 2025 7:58 PM
> To: wangtao <tao.wangtao@honor.com>; T.J. Mercier
> <tjmercier@google.com>
> Cc: sumit.semwal@linaro.org; benjamin.gaignard@collabora.com;
> Brian.Starkey@arm.com; jstultz@google.com; linux-media@vger.kernel.org;
> dri-devel@lists.freedesktop.org; linaro-mm-sig@lists.linaro.org;
> linux-kernel@vger.kernel.org; wangbintian(BintianWang)
> <bintian.wang@honor.com>; yipengxiang <yipengxiang@honor.com>; liulu
> 00013167 <liulu.liu@honor.com>; hanfeng 00012985 <feng.han@honor.com>;
> amir73il@gmail.com
> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
> DMA_BUF_IOCTL_RW_FILE for system_heap
> 
> On 5/22/25 10:02, wangtao wrote:
> >> -----Original Message-----
> >> From: Christian König <christian.koenig@amd.com>
> >> Sent: Wednesday, May 21, 2025 7:57 PM
> >> To: wangtao <tao.wangtao@honor.com>; T.J. Mercier
> >> <tjmercier@google.com>
> >> Cc: sumit.semwal@linaro.org; benjamin.gaignard@collabora.com;
> >> Brian.Starkey@arm.com; jstultz@google.com;
> >> linux-media@vger.kernel.org; dri-devel@lists.freedesktop.org;
> >> linaro-mm-sig@lists.linaro.org; linux-kernel@vger.kernel.org;
> >> wangbintian(BintianWang) <bintian.wang@honor.com>; yipengxiang
> >> <yipengxiang@honor.com>; liulu
> >> 00013167 <liulu.liu@honor.com>; hanfeng 00012985
> >> <feng.han@honor.com>; amir73il@gmail.com
> >> Subject: Re: [PATCH 2/2] dmabuf/heaps: implement
> >> DMA_BUF_IOCTL_RW_FILE for system_heap
> >>
> >> On 5/21/25 12:25, wangtao wrote:
> >>> [wangtao] I previously explained that
> >>> read/sendfile/splice/copy_file_range
> >>> syscalls can't achieve dmabuf direct IO zero-copy.
> >>
> >> And why can't you work on improving those syscalls instead of
> >> creating a new IOCTL?
> >>
> > [wangtao] As I mentioned in previous emails, these syscalls cannot
> > achieve dmabuf zero-copy due to technical constraints.
> 
> Yeah, and why can't you work on removing those technical constraints?
> 
> What is blocking you from improving the sendfile system call or proposing a
> patch to remove the copy_file_range restrictions?
[wangtao] Since sendfile/splice cannot eliminate the CPU copies, I bypassed the
cross-FS checks in copy_file_range for copies between memory files and disk files.
I will send new patches once the shmem/udmabuf callbacks are complete.
Thank you for your attention to this issue.
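
For readers less familiar with the syscall, here is a minimal user-space sketch of what the cross-FS restriction looks like to a caller. It is not the kernel change described above. The helper name `copy_fd` and the 1 MiB chunk size are illustrative assumptions. It tries `copy_file_range` first and drops to a plain read/write loop (a CPU copy through a user buffer) when the kernel refuses, for example with EXDEV for a cross-filesystem pair.

```python
import errno
import os

# errno values treated as "the kernel cannot do this copy in-kernel for this
# fd pair". EXDEV is the cross-filesystem case. EPERM can come from a sandbox
# policy; if it is a real permission problem the fallback write fails too.
_FALLBACK_ERRNOS = {errno.EXDEV, errno.ENOSYS, errno.EINVAL,
                    errno.EOPNOTSUPP, errno.EPERM}


def copy_fd(src_fd, dst_fd, length, chunk=1 << 20):
    """Copy up to `length` bytes from src_fd to dst_fd at their current offsets.

    Returns (bytes_copied, method), where method names the last path used.
    `copy_fd` is a hypothetical helper for illustration only.
    """
    use_cfr = hasattr(os, "copy_file_range")  # Linux, Python 3.8+
    method = "copy_file_range" if use_cfr else "read/write"
    copied = 0
    while copied < length:
        want = min(chunk, length - copied)
        if use_cfr:
            try:
                # With no explicit offsets the file positions advance, so a
                # later fallback resumes exactly where this call stopped.
                n = os.copy_file_range(src_fd, dst_fd, want)
            except OSError as exc:
                if exc.errno not in _FALLBACK_ERRNOS:
                    raise
                use_cfr = False
                method = "read/write"
                continue
        else:
            buf = os.read(src_fd, want)
            view = memoryview(buf)
            while view:  # handle short writes
                view = view[os.write(dst_fd, view):]
            n = len(buf)
        if n == 0:  # end of source file
            break
        copied += n
    return copied, method
```

The fallback branch is the buffered path whose CPU cost the measurements below quantify; the point of the proposed kernel work is to avoid ending up there for dmabuf targets.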

UFS 4.0 device @4GB/s, Arm64 CPU @1GHz:
| Metrics                  |Creat(us)|Close(us)| I/O(us) |I/O(MB/s)| Vs.%
|--------------------------|---------|---------|---------|---------|-------
| 0)    dmabuf buffer read |   46898 |    4804 | 1173661 |     914 |  100%
| 1)   udmabuf buffer read |  593844 |  337111 | 2144681 |     500 |   54%
| 2)     memfd buffer read |    1029 |  305322 | 2215859 |     484 |   52%
| 3)     memfd direct read |     562 |  295239 | 1019913 |    1052 |  115%
| 4) memfd buffer sendfile |     785 |  299026 | 1431304 |     750 |   82%
| 5) memfd direct sendfile |     718 |  296307 | 2622270 |     409 |   44%
| 6)   memfd buffer splice |     981 |  299694 | 1573710 |     682 |   74%
| 7)   memfd direct splice |     890 |  302509 | 1269757 |     845 |   92%
| 8)    memfd buffer c_f_r |      33 |    4432 |     N/A |     N/A |   N/A
| 9)    memfd direct c_f_r |      27 |    4421 |     N/A |     N/A |   N/A
|10) memfd buffer sendfile |  595797 |  423105 | 1242494 |     864 |   94%
|11) memfd direct sendfile |  593758 |  357921 | 2344001 |     458 |   50%
|12)   memfd buffer splice |  623221 |  356212 | 1117507 |     960 |  105%
|13)   memfd direct splice |  587059 |  345484 |  857103 |    1252 |  136%
|14)  udmabuf buffer c_f_r |   22725 |   10248 |     N/A |     N/A |   N/A
|15)  udmabuf direct c_f_r |   20120 |    9952 |     N/A |     N/A |   N/A
|16)   dmabuf buffer c_f_r |   46517 |    4708 |  857587 |    1252 |  136%
|17)   dmabuf direct c_f_r |   47339 |    4661 |  284023 |    3780 |  413%
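
The I/O(MB/s) and Vs.% columns can be re-derived from the I/O(us) column. The sketch below assumes a 1 GiB test file, which this message does not state (the figures are consistent with it), MB meaning 10^6 bytes, and truncation rather than rounding. The function names are illustrative.

```python
GIB = 1 << 30  # ASSUMPTION: 1 GiB test file; not stated in this message


def throughput_mb_s(io_us, size_bytes=GIB):
    """Bytes per microsecond equals MB/s (10^6 bytes/s); integer-truncated."""
    return size_bytes // io_us


def relative_pct(mb_s, baseline_mb_s):
    """Throughput relative to the baseline row, as a truncated percentage."""
    return mb_s * 100 // baseline_mb_s


# I/O(us) values copied from rows 0, 3 and 17 of the table above.
io_us = {
    "dmabuf buffer read": 1173661,
    "memfd direct read": 1019913,
    "dmabuf direct c_f_r": 284023,
}

baseline = throughput_mb_s(io_us["dmabuf buffer read"])
for name, us in io_us.items():
    mb_s = throughput_mb_s(us)
    print(f"{name:22s} {mb_s:5d} MB/s {relative_pct(mb_s, baseline):4d}%")
```

Under these assumptions the three rows come out as 914, 1052 and 3780 MB/s, i.e. 100%, 115% and 413%, matching the table.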

> 
> Regards,
> Christian.
> 
> > Could you specify the technical points, code, or principles that need
> > optimization?
> >
> > Let me explain again why these syscalls can't work:
> > 1. read() syscall
> >    - dmabuf fops lacks read callback implementation. Even if implemented,
> >      file_fd info cannot be transferred
> >    - read(file_fd, dmabuf_ptr, len) with remap_pfn_range-based mmap
> >      cannot access dmabuf_buf pages, forcing buffer-mode reads
> >
> > 2. sendfile() syscall
> >    - Requires CPU copy from page cache to memory file(tmpfs/shmem):
> >      [DISK] --DMA--> [page cache] --CPU copy--> [MEMORY file]
> >    - CPU overhead (both buffer/direct modes involve copies):
> >      55.08% do_sendfile
> >     |- 55.08% do_splice_direct
> >     |-|- 55.08% splice_direct_to_actor
> >     |-|-|- 22.51% copy_splice_read
> >     |-|-|-|- 16.57% f2fs_file_read_iter
> >     |-|-|-|-|- 15.12% __iomap_dio_rw
> >     |-|-|- 32.33% direct_splice_actor
> >     |-|-|-|- 32.11% iter_file_splice_write
> >     |-|-|-|-|- 28.42% vfs_iter_write
> >     |-|-|-|-|-|- 28.42% do_iter_write
> >     |-|-|-|-|-|-|- 28.39% shmem_file_write_iter
> >     |-|-|-|-|-|-|-|- 24.62% generic_perform_write
> >     |-|-|-|-|-|-|-|-|- 18.75% __pi_memmove
> >
> > 3. splice() requires one end to be a pipe, incompatible with regular files
> >    or dmabuf.
> >
> > 4. copy_file_range()
> >    - Blocked by cross-FS restrictions (Amir's commit 868f9f2f8e00)
> >    - Even without this restriction, implementing the copy_file_range
> >      callback in dmabuf fops would only allow dmabuf reads from regular
> >      files. This is because copy_file_range relies on
> >      file_out->f_op->copy_file_range, which cannot support dmabuf writes
> >      to regular files.
> >
> > Test results confirm these limitations:
> > T.J. Mercier's 1G from ext4 on 6.12.20 | read/sendfile (ms) w/ 3 > drop_caches
> > ------------------------|-------------------
> > udmabuf buffer read     | 1210
> > udmabuf direct read     | 671
> > udmabuf buffer sendfile | 1096
> > udmabuf direct sendfile | 2340
> >
> > My 3GHz CPU tests (cache cleared):
> > Method                  | alloc | read  | vs. (%)
> > -------------------------------------------------
> > udmabuf buffer read     | 135   | 546   | 180%
> > udmabuf direct read     | 159   | 300   | 99%
> > udmabuf buffer sendfile | 134   | 303   | 100%
> > udmabuf direct sendfile | 141   | 912   | 301%
> > dmabuf buffer read      | 22    | 362   | 119%
> > my patch direct read    | 29    | 265   | 87%
> >
> > My 1GHz CPU tests (cache cleared):
> > Method                  | alloc | read  | vs. (%)
> > -------------------------------------------------
> > udmabuf buffer read     | 552   | 2067  | 198%
> > udmabuf direct read     | 540   | 627   | 60%
> > udmabuf buffer sendfile | 497   | 1045  | 100%
> > udmabuf direct sendfile | 527   | 2330  | 223%
> > dmabuf buffer read      | 40    | 1111  | 106%
> > patch direct read       | 44    | 310   | 30%
> >
> > Test observations align with expectations:
> > 1. dmabuf buffer read requires slow CPU copies
> > 2. udmabuf direct read achieves zero-copy but has page retrieval
> >    latency from vaddr
> > 3. udmabuf buffer sendfile suffers CPU copy overhead
> > 4. udmabuf direct sendfile combines CPU copies with frequent DMA
> >    operations due to small pipe buffers
> > 5. dmabuf buffer read also requires CPU copies
> > 6. My direct read patch enables zero-copy with better performance
> >    on low-power CPUs
> > 7. udmabuf creation time remains problematic (as you’ve noted).
> >
> >>> My focus is enabling dmabuf direct I/O for [regular file] <--DMA-->
> >>> [dmabuf] zero-copy.
> >>
> >> Yeah and that focus is wrong. You need to work on a general solution
> >> to the issue and not specific to your problem.
> >>
> >>> Any API achieving this would work. Are there other uAPIs you think
> >>> could help? Could you recommend experts who might offer suggestions?
> >>
> >> Well once more: Either work on sendfile or copy_file_range or
> >> eventually splice to make it what you want to do.
> >>
> >> When that is done we can discuss with the VFS people if that approach
> >> is feasible.
> >>
> >> But just bypassing the VFS review by implementing a DMA-buf specific
> >> IOCTL is a NO-GO. That is clearly not something you can do in any way.
> > [wangtao] The issue is that only dmabuf lacks Direct I/O zero-copy
> > support. Tmpfs/shmem already work with Direct I/O zero-copy. As
> > explained, existing syscalls or generic methods can't enable dmabuf
> > direct I/O zero-copy, which is why I propose adding an IOCTL command.
> >
> > I respect your perspective. Could you clarify specific technical
> > aspects, code requirements, or implementation principles for modifying
> > sendfile() or copy_file_range()? This would help advance our discussion.
> >
> > Thank you for engaging in this dialogue.
> >
> >>
> >> Regards,
> >> Christian.

