linux-mm.kvack.org archive mirror
From: Christoph Hellwig <hch@infradead.org>
To: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Andrew Lunn <andrew@lunn.ch>, Eric Dumazet <edumazet@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	David Hildenbrand <david@redhat.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Mina Almasry <almasrymina@google.com>,
	willy@infradead.org, Christian Brauner <brauner@kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	netdev@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Leon Romanovsky <leon@kernel.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	Jason Gunthorpe <jgg@nvidia.com>
Subject: Re: How to handle P2P DMA with only {physaddr,len} in bio_vec?
Date: Mon, 23 Jun 2025 06:46:47 -0700
Message-ID: <aFlaxwpKChYXFf8A@infradead.org>
In-Reply-To: <1098395.1750675858@warthog.procyon.org.uk>

Hi David,

On Mon, Jun 23, 2025 at 11:50:58AM +0100, David Howells wrote:
> What's the best way to manage this without having to go back to the page
> struct for every DMA mapping we want to make?

There isn't a very easy way, not least because if you actually need to
do peer to peer transfers, you right now absolutely need the page to
find the pgmap that has the information on how to perform the peer to
peer transfer.

> Do we need to have
> iov_extract_user_pages() note this in the bio_vec?
> 
> 	struct bio_vec {
> 		physaddr_t	bv_base_addr;	/* 64-bits */
> 		size_t		bv_len:56;	/* Maybe just u32 */
> 		bool		p2pdma:1;	/* Region is involved in P2P */
> 		unsigned int	spare:7;
> 	};

Having a flag in the bio_vec might be a way to shortcut the P2P or not
decision a bit.  The downside is that without the flag, the bio_vec
in the brave new page-less world would actually just be:

	struct bio_vec {
		phys_addr_t	bv_phys;
		u32		bv_len;
	} __packed;

i.e. adding any more information would actually increase the size from
12 bytes to 16 bytes for the usual 64-bit phys_addr_t setups, and thus
undo all the memory savings that this move would provide.
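
The size arithmetic above can be checked with a small userspace sketch.
It is not kernel code: phys_addr_t and u32 are assumed to be 64-bit and
32-bit typedefs, and bio_vec_packed, bio_vec_flagged and bv_flags are
hypothetical names used only for illustration.  The second struct
assumes that a bio_vec carrying an extra field would go back to natural
alignment rather than stay packed at an odd size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: phys_addr_t is 64 bits wide, as on the usual 64-bit setups. */
typedef uint64_t phys_addr_t;
typedef uint32_t u32;

/* The page-less layout from the mail: packed, so there is no tail padding. */
struct bio_vec_packed {
	phys_addr_t	bv_phys;
	u32		bv_len;
} __attribute__((packed));

/*
 * Hypothetical variant with one extra byte of information and natural
 * alignment: the compiler pads the struct up to the next aligned size.
 */
struct bio_vec_flagged {
	phys_addr_t	bv_phys;
	u32		bv_len;
	uint8_t		bv_flags;	/* hypothetical, e.g. a P2P marker */
};

/* Compile-time checks of the two sizes discussed in the mail. */
static_assert(sizeof(struct bio_vec_packed) == 12,
	      "packed {phys, len} is 12 bytes");
static_assert(sizeof(struct bio_vec_flagged) == 16,
	      "one extra byte rounds the struct up to 16 bytes");
```

The packed attribute is the GCC/Clang spelling; the kernel's __packed
macro expands to the same thing.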

Note that, at least for the block layer, the DMA mapping changes I'm
about to send out again require each bio to be either non-P2P or P2P to
a specific device.  It might be worth also extending this higher-level
limitation to other users if feasible.
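
As a rough illustration of that per-bio restriction, here is a
userspace model, not the block layer implementation.  struct
p2p_provider, struct seg and segs_are_uniform() are hypothetical names;
a NULL provider is assumed to mean ordinary host memory.  The point is
only that once a container is known to be uniform, the P2P decision can
live at the container level rather than in every segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identity of a P2P memory provider (one per device). */
struct p2p_provider {
	int id;
};

/* Hypothetical segment; provider == NULL is assumed to mean host memory. */
struct seg {
	const struct p2p_provider *provider;
};

/*
 * Return true if every segment is host memory, or every segment belongs
 * to the same provider.  An empty array is trivially uniform.
 */
static bool segs_are_uniform(const struct seg *segs, size_t n)
{
	assert(segs != NULL || n == 0);

	for (size_t i = 1; i < n; i++)
		if (segs[i].provider != segs[0].provider)
			return false;
	return true;
}
```

A caller would run such a check while building the container and split
or reject it as soon as a segment with a different provider shows up.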

> I'm guessing that only folio-type pages can be involved in this:
> 
> 	static inline struct dev_pagemap *page_pgmap(const struct page *page)
> 	{
> 		VM_WARN_ON_ONCE_PAGE(!is_zone_device_page(page), page);
> 		return page_folio(page)->pgmap;
> 	}
> 
> as only struct folio has a pointer to dev_pagemap?  And I assume this is going
> to get removed from struct page itself at some point soonish.

I guess so.



