From: Ming Lei <ming.lei@redhat.com>
To: Jan Kara <jack@suse.cz>
Cc: Martin Wilck <mwilck@suse.com>, Jens Axboe <axboe@kernel.dk>,
Jan Kara <jack@suse.com>, Christoph Hellwig <hch@lst.de>,
Hannes Reinecke <hare@suse.de>,
Johannes Thumshirn <jthumshirn@suse.de>,
Kent Overstreet <kent.overstreet@gmail.com>,
linux-block@vger.kernel.org
Subject: Re: [PATCH v5 3/3] block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs
Date: Wed, 22 Aug 2018 18:50:53 +0800
Message-ID: <20180822105046.GA3272@ming.t460p>
In-Reply-To: <20180822103305.GA23037@quack2.suse.cz>

On Wed, Aug 22, 2018 at 12:33:05PM +0200, Jan Kara wrote:
> On Wed 22-08-18 10:02:49, Martin Wilck wrote:
> > On Mon, 2018-07-30 at 20:37 +0800, Ming Lei wrote:
> > > On Wed, Jul 25, 2018 at 11:15:09PM +0200, Martin Wilck wrote:
> > > >
> > > > +/**
> > > > > + * bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
> > > > > + * @bio: bio to add pages to
> > > > > + * @iter: iov iterator describing the region to be mapped
> > > > > + *
> > > > > + * Pins pages from *iter and appends them to @bio's bvec array. The
> > > > > + * pages will have to be released using put_page() when done.
> > > > > + * The function tries, but does not guarantee, to pin as many pages as
> > > > > + * fit into the bio, or are requested in *iter, whatever is smaller.
> > > > > + * If MM encounters an error pinning the requested pages, it stops.
> > > > > + * Error is returned only if 0 pages could be pinned.
> > > > + */
> > > > +int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> > > > +{
> > > > > +	unsigned short orig_vcnt = bio->bi_vcnt;
> > > > > +
> > > > > +	do {
> > > > > +		int ret = __bio_iov_iter_get_pages(bio, iter);
> > > > > +
> > > > > +		if (unlikely(ret))
> > > > > +			return bio->bi_vcnt > orig_vcnt ? 0 : ret;
> > > > > +
> > > > > +	} while (iov_iter_count(iter) && !bio_full(bio));
> > >
> > > > When 'ret' isn't zero, and some partial progress has been made, seems
> > > > less pages might be obtained than requested too. Is that something we
> > > > need to worry about?
> >
> > This would be the case when VM isn't willing or able to fulfill the
> > page-pinning request. Previously, we came to the conclusion that VM has
> > the right to do so. This is the reason why callers have to check the
> > number of pages allocated, and either loop over
> > bio_iov_iter_get_pages(), or fall back to buffered I/O, until all pages
> > have been obtained. All callers except the blockdev fast path do the
> > former.
> >
> > We could add looping in __blkdev_direct_IO_simple() on top of the
> > current patch set, to avoid fallback to buffered IO in this corner
> > case. Should we? If yes, only for WRITEs, or for READs as well?
> >
> > I haven't encountered this situation in my tests, and I'm unsure how to
> > provoke it - run a direct IO test under high memory pressure?
>
> Currently, iov_iter_get_pages() is always guaranteed to get at least one
> page as that is current guarantee of get_user_pages() (unless we hit
> EFAULT obviously). So bio_iov_iter_get_pages() as is now is guaranteed to

Is it possible for this EFAULT to happen on the user-space VM?

Thanks,
Ming
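
For reference, the return convention discussed above (an error is reported
only when no page at all could be pinned; otherwise partial progress counts
as success and the caller has to notice the shortfall) can be modelled in
user space. This is only a sketch under stated assumptions: struct toy_bio,
struct toy_iter, toy_pin_one() and toy_get_pages() are hypothetical stand-ins
invented for illustration, not kernel APIs, and the fault is injected by a
counter instead of coming from get_user_pages().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct bio: only the vector count and capacity. */
struct toy_bio {
	unsigned short vcnt;     /* pages added so far (plays the role of bi_vcnt) */
	unsigned short max_vecs; /* capacity of the bvec array */
};

/* Hypothetical stand-in for struct iov_iter plus a fault-injection knob. */
struct toy_iter {
	unsigned int pages_left;  /* pages still requested */
	unsigned int fault_after; /* pinning fails once this many pages are pinned */
	unsigned int pinned;      /* pages pinned so far */
};

/* Pin a single page, or fail with -EFAULT at the injected fault point. */
static int toy_pin_one(struct toy_bio *bio, struct toy_iter *it)
{
	assert(it->pages_left > 0); /* callers must pass a non-empty iterator */
	if (it->pinned >= it->fault_after)
		return -EFAULT;
	it->pinned++;
	it->pages_left--;
	bio->vcnt++;
	return 0;
}

/*
 * Same loop shape as the quoted patch: keep pinning until the iterator is
 * drained or the bio is full, and report an error only if this call made
 * no progress at all.
 */
int toy_get_pages(struct toy_bio *bio, struct toy_iter *it)
{
	unsigned short orig_vcnt = bio->vcnt;

	do {
		int ret = toy_pin_one(bio, it);

		if (ret)
			return bio->vcnt > orig_vcnt ? 0 : ret;
	} while (it->pages_left && bio->vcnt < bio->max_vecs);

	return 0;
}
```

With five pages requested and a fault injected after three, the first call
returns 0 with three pages pinned and two still outstanding; a second call
makes no progress and returns -EFAULT. In this model a caller that loops
therefore does see the error, one call later than it occurred.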