linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Christoph Hellwig <hch@infradead.org>
Cc: Jan Kara <jack@suse.cz>, John Hubbard <jhubbard@nvidia.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>, Miklos Szeredi <miklos@szeredi.hu>,
	"Darrick J . Wong" <djwong@kernel.org>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Anna Schumaker <anna@kernel.org>,
	David Hildenbrand <david@redhat.com>,
	Logan Gunthorpe <logang@deltatee.com>,
	linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 4/7] iov_iter: new iov_iter_pin_pages*() routines
Date: Wed, 14 Sep 2022 04:51:17 +0100	[thread overview]
Message-ID: <YyFPtTtxYozCuXvu@ZenIV> (raw)
In-Reply-To: <YxhaJktqtHw3QTSG@infradead.org>

On Wed, Sep 07, 2022 at 01:45:26AM -0700, Christoph Hellwig wrote:
> On Tue, Sep 06, 2022 at 12:21:06PM +0200, Jan Kara wrote:
> > > For FOLL_PIN callers, never pin bvec and kvec pages:  For file systems
> > > not acquiring a reference is obviously safe, and the other callers will
> > > need an audit, but I can't think of why it woul  ever be unsafe.
> > 
> > Are you sure about "For file systems not acquiring a reference is obviously
> > safe"? I can see places e.g. in orangefs, afs, etc. which create bvec iters
> > from pagecache pages. And then we have iter_file_splice_write() which
> > creates bvec from pipe pages (which can also be pagecache pages if
> > vmsplice() is used). So perhaps there are no lifetime issues even without
> > acquiring a reference (but looking at the code I would not say it is
> > obvious) but I definitely don't see how it would be safe to not get a pin
> > to signal to filesystem backing the pagecache page that there is DMA
> > happening to/from the page.
> 
> I mean in the context of iov_iter_get_pages callers, that is direct
> I/O.  Direct callers of iov_iter_bvec which then pass that iov to
> ->read_iter / ->write_iter will need to hold references (those are
> the references that the callers of iov_iter_get_pages rely on!).

Unless I'm misreading Jan, the question is whether they should get or
pin.  AFAICS, anyone who passes the sucker to ->read_iter() (or ->recvmsg(),
or does direct copy_to_iter()/zero_iter(), etc.) is falling under
=================================================================================
CASE 5: Pinning in order to write to the data within the page
-------------------------------------------------------------
Even though neither DMA nor Direct IO is involved, just a simple case of "pin,
write to a page's data, unpin" can cause a problem. Case 5 may be considered a
superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
other words, if the code is neither Case 1 nor Case 2, it may still require
FOLL_PIN, for patterns like this:

Correct (uses FOLL_PIN calls):
    pin_user_pages()
    write to the data within the pages
    unpin_user_pages()

INCORRECT (uses FOLL_GET calls):
    get_user_pages()
    write to the data within the pages
    put_page()
=================================================================================

Regarding iter_file_splice_write() case, do we need to pin pages
when we are not going to modify the data in those?

The same goes for afs, AFAICS; I started to type "... and everything that passes
WRITE to iov_iter_bvec()", but...
drivers/vhost/vringh.c:1165:            iov_iter_bvec(&iter, READ, iov, ret, translated);
drivers/vhost/vringh.c:1198:            iov_iter_bvec(&iter, WRITE, iov, ret, translated);
is backwards - READ is for data destinations, comes with copy_to_iter(); WRITE is
for data sources and it comes with copy_from_iter()...
I'm really tempted to slap
	if (WARN_ON(i->data_source))
		return 0;
into copy_to_iter() et.al., along with its opposite for copy_from_iter().
And see who comes screaming...  Things like
        if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
                WARN_ON(1);
                return 0;
        }
in e.g. csum_and_copy_from_iter() would be replaced by that, and become
easier to understand...
These two are also getting it wrong, BTW:
drivers/target/target_core_file.c:340:  iov_iter_bvec(&iter, READ, bvec, sgl_nents, len);
drivers/target/target_core_file.c:476:  iov_iter_bvec(&iter, READ, bvec, nolb, len);


  reply	other threads:[~2022-09-14  3:51 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-31  4:18 [PATCH v2 0/7] convert most filesystems to pin_user_pages_fast() John Hubbard
2022-08-31  4:18 ` [PATCH v2 1/7] mm: change release_pages() to use unsigned long for npages John Hubbard
2022-08-31  4:18 ` [PATCH v2 2/7] mm/gup: introduce pin_user_page() John Hubbard
2022-09-06  6:37   ` Christoph Hellwig
2022-09-06  7:12     ` John Hubbard
2022-08-31  4:18 ` [PATCH v2 3/7] block: add dio_w_*() wrappers for pin, unpin user pages John Hubbard
2022-08-31  4:18 ` [PATCH v2 4/7] iov_iter: new iov_iter_pin_pages*() routines John Hubbard
2022-09-01  0:42   ` Al Viro
2022-09-01  1:48     ` John Hubbard
2022-09-06  6:47   ` Christoph Hellwig
2022-09-06  7:44     ` John Hubbard
2022-09-06  7:48       ` Christoph Hellwig
2022-09-06  7:58         ` John Hubbard
2022-09-07  8:50           ` Christoph Hellwig
2022-09-06 10:21         ` Jan Kara
2022-09-07  8:45           ` Christoph Hellwig
2022-09-14  3:51             ` Al Viro [this message]
2022-09-14 14:52               ` Jan Kara
2022-09-14 16:42                 ` Al Viro
2022-09-15  8:16                   ` Jan Kara
2022-09-16  1:55                     ` Al Viro
2022-09-20  5:02                       ` Al Viro
2022-09-22 14:36                         ` Christoph Hellwig
2022-09-22 14:43                           ` David Hildenbrand
2022-09-22 14:45                             ` Christoph Hellwig
2022-09-22  2:22                     ` Al Viro
2022-09-22  6:09                       ` John Hubbard
2022-09-22 11:29                         ` Jan Kara
2022-09-23  3:19                           ` Al Viro
2022-09-23  4:05                             ` John Hubbard
2022-09-23  8:39                               ` Christoph Hellwig
2022-09-23 12:22                               ` Jan Kara
2022-09-23  4:34                           ` John Hubbard
2022-09-22 14:38                       ` Christoph Hellwig
2022-09-23  4:22                         ` Al Viro
2022-09-23  8:44                           ` Christoph Hellwig
2022-09-23 16:13                             ` Al Viro
2022-09-26 15:53                               ` Christoph Hellwig
2022-09-26 19:55                                 ` Al Viro
2022-09-22 14:31               ` Christoph Hellwig
2022-09-22 14:36                 ` Al Viro
2022-08-31  4:18 ` [PATCH v2 5/7] block, bio, fs: convert most filesystems to pin_user_pages_fast() John Hubbard
2022-09-06  6:48   ` Christoph Hellwig
2022-09-06  7:15     ` John Hubbard
2022-08-31  4:18 ` [PATCH v2 6/7] NFS: direct-io: convert to FOLL_PIN pages John Hubbard
2022-09-06  6:49   ` Christoph Hellwig
2022-09-06  7:16     ` John Hubbard
2022-08-31  4:18 ` [PATCH v2 7/7] fuse: convert direct IO paths to use FOLL_PIN John Hubbard
2022-08-31 10:37   ` Miklos Szeredi
2022-09-01  1:33     ` John Hubbard
2022-09-06  6:36 ` [PATCH v2 0/7] convert most filesystems to pin_user_pages_fast() Christoph Hellwig
2022-09-06  7:10   ` John Hubbard
2022-09-06  7:22     ` Christoph Hellwig
2022-09-06  7:37       ` John Hubbard
2022-09-06  7:46         ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YyFPtTtxYozCuXvu@ZenIV \
    --to=viro@zeniv.linux.org.uk \
    --cc=akpm@linux-foundation.org \
    --cc=anna@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=david@redhat.com \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=jhubbard@nvidia.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=miklos@szeredi.hu \
    --cc=trond.myklebust@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).