From: Christoph Hellwig <hch@infradead.org>
To: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>,
Andrew Lunn <andrew@lunn.ch>, Eric Dumazet <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
David Hildenbrand <david@redhat.com>,
John Hubbard <jhubbard@nvidia.com>,
Mina Almasry <almasrymina@google.com>,
willy@infradead.org, Christian Brauner <brauner@kernel.org>,
Al Viro <viro@zeniv.linux.org.uk>,
netdev@vger.kernel.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
Leon Romanovsky <leon@kernel.org>,
Logan Gunthorpe <logang@deltatee.com>,
Jason Gunthorpe <jgg@nvidia.com>
Subject: Re: How to handle P2P DMA with only {physaddr,len} in bio_vec?
Date: Mon, 23 Jun 2025 06:46:47 -0700
Message-ID: <aFlaxwpKChYXFf8A@infradead.org>
In-Reply-To: <1098395.1750675858@warthog.procyon.org.uk>
Hi David,
On Mon, Jun 23, 2025 at 11:50:58AM +0100, David Howells wrote:
> What's the best way to manage this without having to go back to the page
> struct for every DMA mapping we want to make?
There isn't a very easy way, not least because if you actually need to do
peer-to-peer transfers, you currently need the page to find the pgmap,
which holds the information on how to perform the peer-to-peer transfer.
> Do we need to have
> iov_extract_user_pages() note this in the bio_vec?
>
> 	struct bio_vec {
> 		physaddr_t	bv_base_addr;	/* 64-bits */
> 		size_t		bv_len:56;	/* Maybe just u32 */
> 		bool		p2pdma:1;	/* Region is involved in P2P */
> 		unsigned int	spare:7;
> 	};
Having a flag in the bio_vec might be a way to shortcut the P2P or not
decision a bit. The downside is that without the flag, the bio_vec
in the brave new page-less world would actually just be:
	struct bio_vec {
		phys_addr_t	bv_phys;
		u32		bv_len;
	} __packed;
i.e. adding any more information would actually increase the size from
12 bytes to 16 bytes for the usual 64-bit phys_addr_t setups, and thus
undo all the memory savings that this move would provide.
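To make the size arithmetic concrete, here is a minimal userspace sketch. It
assumes a 64-bit phys_addr_t and the GCC/Clang packed attribute standing in
for the kernel's __packed. The bio_vec_flagged layout and its bv_flags member
are hypothetical and only illustrate the cost of one extra 32-bit field; the
*_bytes() helpers are likewise made up for illustration.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Assumption: userspace stand-ins for the kernel types. */
typedef uint64_t phys_addr_t;
typedef uint32_t u32;

/* The page-less layout from the mail: 8 + 4 bytes, no tail padding. */
struct bio_vec_pageless {
	phys_addr_t bv_phys;
	u32 bv_len;
} __attribute__((packed));

/* Hypothetical variant carrying extra per-segment state. */
struct bio_vec_flagged {
	phys_addr_t bv_phys;
	u32 bv_len;
	u32 bv_flags;	/* hypothetical, e.g. a P2P marker */
} __attribute__((packed));

static_assert(sizeof(struct bio_vec_pageless) == 12, "packed: 12 bytes");
static_assert(sizeof(struct bio_vec_flagged) == 16, "flagged: 16 bytes");

/* Bytes needed for an array of n segments in each layout. */
static inline size_t pageless_bytes(size_t n)
{
	return n * sizeof(struct bio_vec_pageless);
}

static inline size_t flagged_bytes(size_t n)
{
	return n * sizeof(struct bio_vec_flagged);
}
```

For a 256-segment vector that works out to 3072 bytes versus 4096 bytes.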
Note that at least for the block layer, the DMA mapping changes I'm about
to send out again require each bio to be either non-P2P or P2P to a
specific device. It might be worth also extending this higher-level
limitation to other users if feasible.
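As a rough illustration of that per-bio invariant (a userspace model only, not
the actual block layer patches), the classification can be kept once per
container instead of once per segment. All names below (p2p_provider,
bio_dma_state, bio_dma_add_segment) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a P2P provider; in the kernel this
 * information is reached through the page's pgmap. */
struct p2p_provider { int id; };

enum bio_dma_kind { BIO_DMA_UNSET, BIO_DMA_HOST, BIO_DMA_P2P };

/* Hypothetical per-bio state: one classification for the whole bio. */
struct bio_dma_state {
	enum bio_dma_kind kind;
	const struct p2p_provider *provider;	/* non-NULL only for P2P */
};

/*
 * Record one segment. provider == NULL means ordinary host memory.
 * Returns true if the segment matches what the bio already holds,
 * false if the caller would have to start a new bio.
 */
static inline bool bio_dma_add_segment(struct bio_dma_state *st,
				       const struct p2p_provider *provider)
{
	enum bio_dma_kind kind = provider ? BIO_DMA_P2P : BIO_DMA_HOST;

	if (st->kind == BIO_DMA_UNSET) {
		/* The first segment decides the type of the whole bio. */
		st->kind = kind;
		st->provider = provider;
		return true;
	}
	return st->kind == kind && st->provider == provider;
}
```

With the state held at this level, the individual segments need no flag of
their own, so the 12-byte layout can stay as it is.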
> I'm guessing that only folio-type pages can be involved in this:
>
> 	static inline struct dev_pagemap *page_pgmap(const struct page *page)
> 	{
> 		VM_WARN_ON_ONCE_PAGE(!is_zone_device_page(page), page);
> 		return page_folio(page)->pgmap;
> 	}
>
> as only struct folio has a pointer to dev_pagemap? And I assume this is going
> to get removed from struct page itself at some point soonish.
I guess so.