From: Martin Wilck <mwilck@suse.com>
To: Jens Axboe <axboe@kernel.dk>, Jan Kara <jack@suse.com>,
Christoph Hellwig <hch@lst.de>, Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>,
Johannes Thumshirn <jthumshirn@suse.de>,
Kent Overstreet <kent.overstreet@gmail.com>,
linux-block@vger.kernel.org, Martin Wilck <mwilck@suse.com>
Subject: [PATCH v5 3/3] block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs
Date: Wed, 25 Jul 2018 23:15:09 +0200
Message-ID: <20180725211509.13592-4-mwilck@suse.com>
In-Reply-To: <20180725211509.13592-1-mwilck@suse.com>
bio_iov_iter_get_pages() currently only adds pages for the
next non-zero segment from the iov_iter to the bio. That's
suboptimal for callers, which typically try to pin as many
pages as fit into the bio. This patch converts the current
bio_iov_iter_get_pages() into a static helper, and introduces
a new helper that allocates as many pages as
1) fit into the bio,
2) are present in the iov_iter,
3) and can be pinned by MM.
An error is returned only if zero pages could be pinned. Because of
3), a return value of zero doesn't necessarily mean that all pages have
been pinned. Callers that have to pin every page in the iov_iter must
still call this function in a loop (existing callers of this kind
already do so).
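The semantics described above can be sketched as a small userspace
model. This is an illustration only, not kernel code: the model_* types
and helpers are hypothetical names, segment sizes are counted in whole
pages, and a pinning failure is simulated through the fault_seg field.
The model also assumes that the per-segment helper fails with -EFAULT
when it cannot add anything.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel APIs. */
#define MODEL_MAX_SEGS 8

struct model_iter {
	size_t seg_pages[MODEL_MAX_SEGS];	/* pages left per segment */
	size_t nr_segs;
	size_t fault_seg;	/* segment whose pinning fails; MODEL_MAX_SEGS = none */
};

struct model_bio {
	size_t max_vecs;	/* capacity, in the role of bi_max_vecs */
	size_t vcnt;		/* pages added so far, in the role of bi_vcnt */
};

static size_t model_iter_count(const struct model_iter *it)
{
	size_t total = 0;

	for (size_t i = 0; i < it->nr_segs; i++)
		total += it->seg_pages[i];
	return total;
}

/* Models the static helper: only the next non-empty segment is handled. */
static int model_get_pages_one_seg(struct model_bio *bio, struct model_iter *it)
{
	size_t room = bio->max_vecs - bio->vcnt;

	for (size_t i = 0; i < it->nr_segs; i++) {
		if (it->seg_pages[i] == 0)
			continue;
		if (i == it->fault_seg)
			return -EFAULT;	/* simulated pinning failure */
		size_t n = it->seg_pages[i] < room ? it->seg_pages[i] : room;
		bio->vcnt += n;
		it->seg_pages[i] -= n;
		return 0;
	}
	return -EFAULT;	/* model assumption: nothing left to pin is an error */
}

/* Models the new wrapper: loop until the bio is full or the iter is drained. */
static int model_get_pages(struct model_bio *bio, struct model_iter *it)
{
	size_t orig_vcnt = bio->vcnt;

	do {
		int ret = model_get_pages_one_seg(bio, it);

		/* Report an error only if this call pinned nothing at all. */
		if (ret)
			return bio->vcnt > orig_vcnt ? 0 : ret;
	} while (model_iter_count(it) && bio->vcnt < bio->max_vecs);

	return 0;
}

/*
 * A caller that must pin every page still loops, using one bio per pass.
 * Returns the number of bios used, or a negative errno.
 */
static int model_pin_all(struct model_iter *it, size_t bio_max_vecs)
{
	int bios = 0;

	if (bio_max_vecs == 0)
		return -EINVAL;

	while (model_iter_count(it)) {
		struct model_bio bio = { .max_vecs = bio_max_vecs, .vcnt = 0 };
		int ret = model_get_pages(&bio, it);

		if (ret)
			return ret;
		assert(bio.vcnt <= bio.max_vecs);
		bios++;
	}
	return bios;
}
```

In this model, a zero return from model_get_pages() with pages still
left in the iterator corresponds to case 3) above or to a full bio, and
the caller has to check the remaining count itself.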
This change matters most for __blkdev_direct_IO_simple(), which calls
bio_iov_iter_get_pages() only once. If it obtains fewer pages than requested,
it returns a "short write" or "short read", and __generic_file_write_iter()
falls back to buffered writes, which may lead to data corruption.
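To make the single-call case concrete, the following userspace sketch
compares how many pages one call yields under the per-segment behaviour
and under the multi-segment behaviour. It is a simplified model under
stated assumptions: the function names are hypothetical, segments are
counted in whole pages, and pinning never fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not kernel code. Segment lengths are in pages. */

/* One call, per-segment behaviour: only the first non-empty segment. */
static size_t pages_one_call_old(const size_t *seg_pages, size_t nr_segs,
				 size_t bio_max_vecs)
{
	for (size_t i = 0; i < nr_segs; i++) {
		if (seg_pages[i])
			return seg_pages[i] < bio_max_vecs ?
			       seg_pages[i] : bio_max_vecs;
	}
	return 0;
}

/* One call, multi-segment behaviour: all segments until the bio is full. */
static size_t pages_one_call_new(const size_t *seg_pages, size_t nr_segs,
				 size_t bio_max_vecs)
{
	size_t got = 0;

	for (size_t i = 0; i < nr_segs; i++) {
		size_t room = bio_max_vecs - got;

		got += seg_pages[i] < room ? seg_pages[i] : room;
	}
	assert(got <= bio_max_vecs);
	return got;
}

/* A caller that calls only once sees a short transfer if got < requested. */
static int is_short_io(size_t got, size_t requested)
{
	return got < requested;
}
```

For an iov_iter with three one-page segments and room for eight pages
in the bio, the per-segment model obtains one page out of three (a
short transfer), while the multi-segment model obtains all three.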
Fixes: 72ecad22d9f1 ("block: support a full bio worth of IO for
simplified bdev direct-io")
Signed-off-by: Martin Wilck <mwilck@suse.com>
---
block/bio.c | 35 ++++++++++++++++++++++++++++++++---
1 file changed, 32 insertions(+), 3 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 489a430..925033d 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -903,14 +903,16 @@ int bio_add_page(struct bio *bio, struct page *page,
EXPORT_SYMBOL(bio_add_page);
/**
- * bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
+ * __bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
* @bio: bio to add pages to
* @iter: iov iterator describing the region to be mapped
*
- * Pins as many pages from *iter and appends them to @bio's bvec array. The
+ * Pins pages from *iter and appends them to @bio's bvec array. The
* pages will have to be released using put_page() when done.
+ * For multi-segment *iter, this function only adds pages from the
+ * next non-empty segment of the iov iterator.
*/
-int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
+static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
{
unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt, idx;
struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
@@ -947,6 +949,33 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
iov_iter_advance(iter, size);
return 0;
}
+
+/**
+ * bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
+ * @bio: bio to add pages to
+ * @iter: iov iterator describing the region to be mapped
+ *
+ * Pins pages from *iter and appends them to @bio's bvec array. The
+ * pages will have to be released using put_page() when done.
+ * The function tries, but does not guarantee, to pin as many pages as
+ * fit into the bio, or are requested in *iter, whichever is smaller.
+ * If MM encounters an error pinning the requested pages, it stops.
+ * An error is returned only if 0 pages could be pinned.
+ */
+int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
+{
+ unsigned short orig_vcnt = bio->bi_vcnt;
+
+ do {
+ int ret = __bio_iov_iter_get_pages(bio, iter);
+
+ if (unlikely(ret))
+ return bio->bi_vcnt > orig_vcnt ? 0 : ret;
+
+ } while (iov_iter_count(iter) && !bio_full(bio));
+
+ return 0;
+}
EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
static void submit_bio_wait_endio(struct bio *bio)
--
2.17.1
Thread overview: 13+ messages
2018-07-25 21:15 [PATCH v5 0/3] Fix silent data corruption in blkdev_direct_IO() Martin Wilck
2018-07-25 21:15 ` [PATCH v5 1/3] block: bio_iov_iter_get_pages: fix size of last iovec Martin Wilck
2018-07-25 21:15 ` [PATCH v5 2/3] blkdev: __blkdev_direct_IO_simple: fix leak in error case Martin Wilck
2018-07-26 6:38 ` Hannes Reinecke
2018-07-26 9:20 ` Christoph Hellwig
2018-07-25 21:15 ` Martin Wilck [this message]
2018-07-26 9:21 ` [PATCH v5 3/3] block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs Christoph Hellwig
2018-07-30 12:37 ` Ming Lei
2018-08-22 8:02 ` Martin Wilck
2018-08-22 10:33 ` Jan Kara
2018-08-22 10:50 ` Ming Lei
2018-08-22 12:47 ` Jan Kara
2018-07-26 17:53 ` [PATCH v5 0/3] Fix silent data corruption in blkdev_direct_IO() Jens Axboe