From: David Hildenbrand <david@redhat.com>
To: David Howells <dhowells@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	Jens Axboe <axboe@kernel.dk>, Jan Kara <jack@suse.cz>,
	Jeff Layton <jlayton@kernel.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v8 00/10] iov_iter: Improve page extraction (pin or just list)
Date: Tue, 24 Jan 2023 13:44:21 +0100
Message-ID: <02063032-61e7-e1e5-cd51-a50337405159@redhat.com>
In-Reply-To: <20230123173007.325544-1-dhowells@redhat.com>

On 23.01.23 18:29, David Howells wrote:
> Hi Al, Christoph,
> 
> Here are patches to provide support for extracting pages from an iov_iter
> and to use this in the extraction functions in the block layer bio code.
> 
> The patches make the following changes:
> 
>   (1) Add a function, iov_iter_extract_pages() to replace
>       iov_iter_get_pages*() that gets refs, pins or just lists the pages as
>       appropriate to the iterator type.
> 
>       Add a function, iov_iter_extract_mode() that will indicate from the
>       iterator type how the cleanup is to be performed, returning FOLL_PIN
>       or 0.
> 
>   (2) Add a function, folio_put_unpin(), and a wrapper, page_put_unpin(),
>       that take a page and the return from iov_iter_extract_mode() and do
>       the right thing to clean up the page.
> 
>   (3) Make the bio struct carry a pair of flags to indicate the cleanup
>       mode.  BIO_NO_PAGE_REF is replaced with BIO_PAGE_REFFED (equivalent to
>       FOLL_GET) and BIO_PAGE_PINNED (equivalent to FOLL_PIN) is
>       added.
> 
>   (4) Add a function, bio_release_page(), to release a page appropriately to
>       the cleanup mode indicated by the BIO_PAGE_* flags.
> 
>   (5) Make the iter-to-bio code use iov_iter_extract_pages() to retain the
>       pages appropriately and clean them up later.
> 
>   (6) Fix bio_flagged() so that it doesn't prevent a gcc optimisation.
> 
>   (7) Renumber FOLL_PIN and FOLL_GET down so that they're at bits 0 and 1
>       and coincident with BIO_PAGE_PINNED and BIO_PAGE_REFFED.  The compiler
>       can then optimise on that.  Also, it's probably going to be necessary
>       to embed these in the page pointer in sk_buff fragments.  This patch
>       can go independently through the mm tree.

^ I feel like some of that information might be stale now that you're 
only using FOLL_PIN.

> 
> I've pushed the patches here also:
> 
> 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-extract

I gave this a quick test and it indeed fixes the last remaining test 
case of my O_DIRECT+fork tests [1] that was still failing on upstream 
(test3).


Once this has landed upstream, if we feel confident enough (I tend to), we
could adjust the open() man page to state that O_DIRECT can now be run
concurrently with fork(). In particular, the following passage might
be adjusted:

"O_DIRECT I/Os should never be run concurrently with the fork(2)
system call, if the memory buffer is a private mapping (i.e., any
mapping created with the mmap(2) MAP_PRIVATE flag; this includes memory
allocated on the heap and statically allocated buffers).  Any such
I/Os, whether submitted via an asynchronous I/O interface or from
another thread in the process, should be completed before fork(2) is
called.  Failure to do so can result in data corruption and undefined
behavior in parent and child processes."
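
For illustration only, here is a minimal userspace sketch of the
copy-on-write behaviour behind that warning. It performs no O_DIRECT I/O
and does not reproduce the bug; it only shows that once fork() has been
called, a write to a MAP_PRIVATE buffer in the parent is no longer visible
to the child, so the two processes end up with different pages. The
function name child_keeps_prefork_data() is hypothetical and is not taken
from the series or from the test suite in [1].

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns 1 if the child still sees the pre-fork contents after the
 * parent has written to the private buffer, 0 if it sees the parent's
 * write, and -1 on any error.
 */
int child_keeps_prefork_data(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	int pipefd[2];
	int status;
	char *buf;
	pid_t pid;

	/* A private anonymous mapping, as described in the man page text. */
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	memset(buf, 'A', len);

	if (pipe(pipefd) < 0) {
		munmap(buf, len);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(pipefd[0]);
		close(pipefd[1]);
		munmap(buf, len);
		return -1;
	}

	if (pid == 0) {
		char c;

		/* Child: wait until the parent has modified the buffer. */
		close(pipefd[1]);
		if (read(pipefd[0], &c, 1) != 1)
			_exit(2);
		_exit(buf[0] == 'A' ? 0 : 1);
	}

	/* Parent: this write triggers a COW fault on the shared page. */
	close(pipefd[0]);
	buf[0] = 'B';
	if (write(pipefd[1], "x", 1) != 1) {
		/* The child will see EOF on the pipe and exit with 2. */
	}
	close(pipefd[1]);

	if (waitpid(pid, &status, 0) < 0) {
		munmap(buf, len);
		return -1;
	}
	munmap(buf, len);

	if (!WIFEXITED(status) || WEXITSTATUS(status) > 1)
		return -1;
	return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Calling child_keeps_prefork_data() from a small main() should return 1 on
Linux. The relevance to O_DIRECT is that an in-flight I/O holding only a
reference on the original page can keep targeting that page after such a
COW fault, while the parent has moved on to a new copy.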


This series does not yet fix vmsplice()+hugetlb, I assume simply because
it does not touch the vmsplice() implementation ;) Once vmsplice() uses
FOLL_PIN, all cow tests should pass as well. This is easy to test:

$ cd tools/testing/selftests/vm/
$ echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
$ echo 2 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
$ ./cow
...
Bail out! 8 out of 190 tests failed
# Totals: pass:181 fail:8 xfail:0 xpass:0 skip:1 error:0


[1] https://gitlab.com/davidhildenbrand/o_direct_fork_tests

-- 
Thanks,

David / dhildenb

