From: David Howells <dhowells@redhat.com>
To: Andrew Lunn <andrew@lunn.ch>
Cc: dhowells@redhat.com, David Hildenbrand <david@redhat.com>,
	John Hubbard <jhubbard@nvidia.com>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	willy@infradead.org, netdev@vger.kernel.org, linux-mm@kvack.org
Subject: MSG_ZEROCOPY and the O_DIRECT vs fork() race
Date: Fri, 02 May 2025 14:41:46 +0100
Message-ID: <1021352.1746193306@warthog.procyon.org.uk>
In-Reply-To: <0aa1b4a2-47b2-40a4-ae14-ce2dd457a1f7@lunn.ch>

Andrew Lunn <andrew@lunn.ch> wrote:

> > I'm looking into making the sendmsg() code properly handle the 'DIO vs
> > fork' issue (where pages need pinning rather than refs taken) and also
> > getting rid of the taking of refs entirely as the page refcount is going
> > to go away in the relatively near future.
> 
> Sorry, new to this conversation, and i don't know what you mean by DIO
> vs fork.

As I understand it, there's a race between O_DIRECT I/O and fork whereby if
you, say, start a DIO read operation on a page and then fork, the target page
gets attached to the child and a copy is made for the parent (because the
refcount is elevated by the I/O) - and so only the child sees the result.  This
is made more interesting by things such as AIO, where the parent gets the
completion notification, but not the data.

Further, the data for a DIO write can then be altered by the child if the DMA
has not yet happened.
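
A minimal userspace sketch of the copy-on-write behaviour that underlies the
race, assuming Linux and a private anonymous mapping.  It does not reproduce
the race itself (no I/O is in flight); it only shows that after fork() the
parent and child stop sharing the page as soon as one of them writes.  The
race described above is about which process keeps the original physical page
while the kernel still holds it for DMA.  The function name is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the byte the parent sees in its page after the child has
 * written to the child's view of the same page, or -1 on error.
 */
int parent_view_after_child_write(void)
{
	long psz = sysconf(_SC_PAGESIZE);
	char *page = mmap(NULL, psz, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
		return -1;

	page[0] = 'P';			/* parent's data, written before fork */

	pid_t pid = fork();
	if (pid < 0) {
		munmap(page, psz);
		return -1;
	}
	if (pid == 0) {
		/* Child: this write breaks the sharing set up by fork(). */
		page[0] = 'C';
		_exit(page[0] == 'C' ? 0 : 1);
	}

	int status;
	if (waitpid(pid, &status, 0) < 0 ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
		munmap(page, psz);
		return -1;
	}

	int seen = (unsigned char)page[0];	/* parent's view is unchanged */
	munmap(page, psz);
	return seen;
}
```

With no I/O in flight the parent still reads back 'P'.  The problem case is
when a device is about to write into, or read from, the physical page at the
moment the two views diverge.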

One of the things mm/gup.c does is to work around this issue...  However, I
don't think that MSG_ZEROCOPY handles this - and so zerocopy sendmsg is, I
think, subject to the same race.
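
For reference, a hedged sketch of the userspace side of MSG_ZEROCOPY on a TCP
socket, assuming Linux 4.14 or later.  It shows the window in question: between
send() returning and the completion arriving on the socket error queue, the
kernel still refers to the caller's pages, so a fork() in that window is the
case being discussed.  The helper names (zc_enable, zc_send,
zc_wait_completion) are hypothetical, and the fallback constant values are the
asm-generic ones and may differ on some architectures.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/errqueue.h>

#ifndef SO_ZEROCOPY
#define SO_ZEROCOPY 60			/* asm-generic value; an assumption */
#endif
#ifndef MSG_ZEROCOPY
#define MSG_ZEROCOPY 0x4000000
#endif

/* Opt in to zerocopy; older kernels require this before connect(). */
int zc_enable(int fd)
{
	int one = 1;

	return setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one));
}

/* Queue buf for transmission without copying it into the kernel. */
ssize_t zc_send(int fd, const void *buf, size_t len)
{
	return send(fd, buf, len, MSG_ZEROCOPY);
}

/*
 * Wait for one zerocopy completion.  Until this returns 0 the buffer
 * passed to zc_send() must be treated as still in use by the kernel.
 * Returns 0 on a zerocopy completion, -1 on timeout or error.
 */
int zc_wait_completion(int fd, int timeout_ms)
{
	union {
		struct cmsghdr align;	/* forces suitable alignment */
		char buf[256];
	} control;
	struct msghdr msg = {
		.msg_control = control.buf,
		.msg_controllen = sizeof(control.buf),
	};
	struct pollfd pfd = { .fd = fd, .events = 0 };	/* POLLERR is implicit */
	struct sock_extended_err *serr;
	struct cmsghdr *cm;

	if (poll(&pfd, 1, timeout_ms) <= 0 || !(pfd.revents & POLLERR))
		return -1;
	if (recvmsg(fd, &msg, MSG_ERRQUEUE) < 0)
		return -1;

	cm = CMSG_FIRSTHDR(&msg);
	if (!cm)
		return -1;

	serr = (struct sock_extended_err *)CMSG_DATA(cm);
	if (serr->ee_errno != 0 || serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY)
		return -1;
	return 0;
}
```

A caller would invoke zc_enable() on a fresh TCP socket, connect it, call
zc_send(), and only reuse or free the buffer after zc_wait_completion()
succeeds.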

> Could you point me at a discussion.

I don't know of one, offhand, apart from in the logs for mm/gup.c.  I've added
a couple more mm guys and the mm list to the cc: field.

The description of commit fc1d8e7cca2daa18d2fe56b94874848adf89d7f5 may be
relevant.

David

