From: Jens Axboe <axboe@kernel.dk>
To: linux-fsdevel@vger.kernel.org, linux-aio@kvack.org,
linux-block@vger.kernel.org, linux-arch@vger.kernel.org
Cc: hch@lst.de, jmoyer@redhat.com, avi@scylladb.com,
Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 12/16] block: implement bio helper to add iter bvec pages to bio
Date: Tue, 8 Jan 2019 09:56:41 -0700 [thread overview]
Message-ID: <20190108165645.19311-13-axboe@kernel.dk> (raw)
In-Reply-To: <20190108165645.19311-1-axboe@kernel.dk>
For an ITER_BVEC, we can just iterate the iov and add the pages
to the bio directly. This requires that the caller doesn't release
the pages on IO completion; we add a BIO_HOLD_PAGES flag for that.
The current two callers of bio_iov_iter_get_pages() are updated to
check if they need to release pages on completion. This makes them
work with bvecs that already contain kernel-mapped pages.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/bio.c               | 59 ++++++++++++++++++++++++++++++++-------
 fs/block_dev.c            |  5 ++--
 fs/iomap.c                |  5 ++--
 include/linux/blk_types.h |  1 +
 4 files changed, 56 insertions(+), 14 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 4db1008309ed..7af4f45d2ed6 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -828,6 +828,23 @@ int bio_add_page(struct bio *bio, struct page *page,
}
EXPORT_SYMBOL(bio_add_page);
+static int __bio_iov_bvec_add_pages(struct bio *bio, struct iov_iter *iter)
+{
+ const struct bio_vec *bv = iter->bvec;
+ unsigned int len;
+ size_t size;
+
+ len = min_t(size_t, bv->bv_len, iter->count);
+ size = bio_add_page(bio, bv->bv_page, len,
+ bv->bv_offset + iter->iov_offset);
+ if (size == len) {
+ iov_iter_advance(iter, size);
+ return 0;
+ }
+
+ return -EINVAL;
+}
+
#define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *))
/**
@@ -876,23 +893,43 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
}
/**
- * bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
+ * bio_iov_iter_get_pages - add user or kernel pages to a bio
* @bio: bio to add pages to
- * @iter: iov iterator describing the region to be mapped
+ * @iter: iov iterator describing the region to be added
+ *
+ * This takes either an iterator pointing to user memory, or one pointing to
+ * kernel pages (BVEC iterator). If we're adding user pages, we pin them and
+ * map them into the kernel. On IO completion, the caller should put those
+ * pages. If we're adding kernel pages, we just have to add the pages to the
+ * bio directly. We don't grab an extra reference to those pages (the user
+ * should already have that), and we don't put the pages on IO completion.
+ * The caller needs to check if the bio is flagged BIO_HOLD_PAGES on IO
+ * completion. If it isn't, then pages should be released.
*
- * Pins pages from *iter and appends them to @bio's bvec array. The
- * pages will have to be released using put_page() when done.
* The function tries, but does not guarantee, to pin as many pages as
- * fit into the bio, or are requested in *iter, whatever is smaller.
- * If MM encounters an error pinning the requested pages, it stops.
- * Error is returned only if 0 pages could be pinned.
+ * fit into the bio, or are requested in *iter, whatever is smaller. If
+ * MM encounters an error pinning the requested pages, it stops. Error
+ * is returned only if 0 pages could be pinned.
*/
int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
{
+ const bool is_bvec = iov_iter_is_bvec(iter);
unsigned short orig_vcnt = bio->bi_vcnt;
+ /*
+ * If this is a BVEC iter, then the pages are kernel pages. Don't
+ * release them on IO completion.
+ */
+ if (is_bvec)
+ bio_set_flag(bio, BIO_HOLD_PAGES);
+
do {
- int ret = __bio_iov_iter_get_pages(bio, iter);
+ int ret;
+
+ if (is_bvec)
+ ret = __bio_iov_bvec_add_pages(bio, iter);
+ else
+ ret = __bio_iov_iter_get_pages(bio, iter);
if (unlikely(ret))
return bio->bi_vcnt > orig_vcnt ? 0 : ret;
@@ -1634,7 +1671,8 @@ static void bio_dirty_fn(struct work_struct *work)
next = bio->bi_private;
bio_set_pages_dirty(bio);
- bio_release_pages(bio);
+ if (!bio_flagged(bio, BIO_HOLD_PAGES))
+ bio_release_pages(bio);
bio_put(bio);
}
}
@@ -1650,7 +1688,8 @@ void bio_check_pages_dirty(struct bio *bio)
goto defer;
}
- bio_release_pages(bio);
+ if (!bio_flagged(bio, BIO_HOLD_PAGES))
+ bio_release_pages(bio);
bio_put(bio);
return;
defer:
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 2ebd2a0d7789..b7742014c9de 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -324,8 +324,9 @@ static void blkdev_bio_end_io(struct bio *bio)
struct bio_vec *bvec;
int i;
- bio_for_each_segment_all(bvec, bio, i)
- put_page(bvec->bv_page);
+ if (!bio_flagged(bio, BIO_HOLD_PAGES))
+ bio_for_each_segment_all(bvec, bio, i)
+ put_page(bvec->bv_page);
bio_put(bio);
}
}
diff --git a/fs/iomap.c b/fs/iomap.c
index 4ee50b76b4a1..0a64c9c51203 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1582,8 +1582,9 @@ static void iomap_dio_bio_end_io(struct bio *bio)
struct bio_vec *bvec;
int i;
- bio_for_each_segment_all(bvec, bio, i)
- put_page(bvec->bv_page);
+ if (!bio_flagged(bio, BIO_HOLD_PAGES))
+ bio_for_each_segment_all(bvec, bio, i)
+ put_page(bvec->bv_page);
bio_put(bio);
}
}
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 5c7e7f859a24..97e206855cd3 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -215,6 +215,7 @@ struct bio {
/*
* bio flags
*/
+#define BIO_HOLD_PAGES 0 /* don't put O_DIRECT pages */
#define BIO_SEG_VALID 1 /* bi_phys_segments valid */
#define BIO_CLONED 2 /* doesn't own data */
#define BIO_BOUNCED 3 /* bio is a bounce bio */
--
2.17.1