* [PATCH 0/4] Fix gaps detection in bio merges
@ 2015-09-03 16:28 Sagi Grimberg
2015-09-03 16:28 ` [PATCH 1/4] block: Check for gaps on front and back merges Sagi Grimberg
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Sagi Grimberg @ 2015-09-03 16:28 UTC (permalink / raw)
This series makes gaps go away for real if a driver
asked to not see them (nvme and soon iser).
patch 1 addresses front merges where gaps weren't
checked at all.
patches 2,3 adds gap detection for integrity payloads.
patch 4 makes blk_rq_map_user_iov turn to bounce buffer
in case of an iovec with gaps.
I've been testing this for a couple of days now and I'll
run some more tests on this, but since it was very easy
to trigger these cases before I'm pretty sure we are covered.
Once this lands in, I will be able to get rid of the
bounce buffer logic in the iser driver as it will rely
on the block layer to get it right.
Jens Axboe (1):
block: Check for gaps on front and back merges
Sagi Grimberg (3):
block: Refuse request/bio merges with gaps in the integrity payload
block: Refuse adding appending a gaped integrity page to a bio
block: Copy a user iovec if it includes gaps
block/bio-integrity.c | 5 +++++
block/blk-integrity.c | 3 +++
block/blk-map.c | 26 ++++++++++++++++++++++++--
block/blk-merge.c | 25 ++++++++++++-------------
include/linux/blkdev.h | 40 ++++++++++++++++++++++++++++++++++++++++
5 files changed, 84 insertions(+), 15 deletions(-)
--
1.8.4.3
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/4] block: Check for gaps on front and back merges
2015-09-03 16:28 [PATCH 0/4] Fix gaps detection in bio merges Sagi Grimberg
@ 2015-09-03 16:28 ` Sagi Grimberg
2015-09-03 16:28 ` [PATCH 2/4] block: Refuse request/bio merges with gaps in the integrity payload Sagi Grimberg
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Sagi Grimberg @ 2015-09-03 16:28 UTC (permalink / raw)
From: Jens Axboe <axboe@fb.com>
We are checking for gaps to previous bio_vec, which can
only detect back merges gaps. Moreover, at the point where
we check for a gap, we don't know if we will attempt a back
or a front merge. Thus, check for gap to prev in a back merge
attempt and check for a gap to next in a front merge attempt.
Signed-off-by: Jens Axboe <axboe at fb.com>
[sagig: Minor rename change]
Signed-off-by: Sagi Grimberg <sagig at mellanox.com>
---
block/blk-merge.c | 19 ++++++-------------
include/linux/blkdev.h | 20 ++++++++++++++++++++
2 files changed, 26 insertions(+), 13 deletions(-)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 09bf5cb116e2..8fbe79829d79 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -312,6 +312,8 @@ no_merge:
int ll_back_merge_fn(struct request_queue *q, struct request *req,
struct bio *bio)
{
+ if (req_gap_back_merge(req, bio))
+ return 0;
if (blk_rq_sectors(req) + bio_sectors(bio) >
blk_rq_get_max_sectors(req)) {
req->cmd_flags |= REQ_NOMERGE;
@@ -330,6 +332,9 @@ int ll_back_merge_fn(struct request_queue *q, struct request *req,
int ll_front_merge_fn(struct request_queue *q, struct request *req,
struct bio *bio)
{
+
+ if (req_gap_front_merge(req, bio))
+ return 0;
if (blk_rq_sectors(req) + bio_sectors(bio) >
blk_rq_get_max_sectors(req)) {
req->cmd_flags |= REQ_NOMERGE;
@@ -356,14 +361,6 @@ static bool req_no_special_merge(struct request *req)
return !q->mq_ops && req->special;
}
-static int req_gap_to_prev(struct request *req, struct bio *next)
-{
- struct bio *prev = req->biotail;
-
- return bvec_gap_to_prev(req->q, &prev->bi_io_vec[prev->bi_vcnt - 1],
- next->bi_io_vec[0].bv_offset);
-}
-
static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
struct request *next)
{
@@ -378,7 +375,7 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
if (req_no_special_merge(req) || req_no_special_merge(next))
return 0;
- if (req_gap_to_prev(req, next->bio))
+ if (req_gap_back_merge(req, next->bio))
return 0;
/*
@@ -586,10 +583,6 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
!blk_write_same_mergeable(rq->bio, bio))
return false;
- /* Only check gaps if the bio carries data */
- if (bio_has_data(bio) && req_gap_to_prev(rq, bio))
- return false;
-
return true;
}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 68adec9da25f..a65b94480422 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1373,6 +1373,26 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
((bprv->bv_offset + bprv->bv_len) & queue_virt_boundary(q));
}
+static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
+ struct bio *next)
+{
+ if (!bio_has_data(prev))
+ return false;
+
+ return bvec_gap_to_prev(q, &prev->bi_io_vec[prev->bi_vcnt - 1],
+ next->bi_io_vec[0].bv_offset);
+}
+
+static inline bool req_gap_back_merge(struct request *req, struct bio *bio)
+{
+ return bio_will_gap(req->q, req->biotail, bio);
+}
+
+static inline bool req_gap_front_merge(struct request *req, struct bio *bio)
+{
+ return bio_will_gap(req->q, bio, req->bio);
+}
+
struct work_struct;
int kblockd_schedule_work(struct work_struct *work);
int kblockd_schedule_delayed_work(struct delayed_work *dwork, unsigned long delay);
--
1.8.4.3
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 2/4] block: Refuse request/bio merges with gaps in the integrity payload
2015-09-03 16:28 [PATCH 0/4] Fix gaps detection in bio merges Sagi Grimberg
2015-09-03 16:28 ` [PATCH 1/4] block: Check for gaps on front and back merges Sagi Grimberg
@ 2015-09-03 16:28 ` Sagi Grimberg
2015-09-03 16:28 ` [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio Sagi Grimberg
2015-09-03 16:28 ` [PATCH 4/4] block: Copy a user iovec if it includes gaps Sagi Grimberg
3 siblings, 0 replies; 11+ messages in thread
From: Sagi Grimberg @ 2015-09-03 16:28 UTC (permalink / raw)
If a driver sets the block queue virtual boundary mask, it means that
it cannot handle gaps so we must not allow those in the integrity
payload as well.
Signed-off-by: Sagi Grimberg <sagig at mellanox.com>
---
block/blk-integrity.c | 3 +++
block/blk-merge.c | 6 ++++++
include/linux/blkdev.h | 20 ++++++++++++++++++++
3 files changed, 29 insertions(+)
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index f548b64be092..75f29cf70188 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -204,6 +204,9 @@ bool blk_integrity_merge_rq(struct request_queue *q, struct request *req,
q->limits.max_integrity_segments)
return false;
+ if (integrity_req_gap_back_merge(req, next->bio))
+ return false;
+
return true;
}
EXPORT_SYMBOL(blk_integrity_merge_rq);
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 8fbe79829d79..b4be61f4b65e 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -314,6 +314,9 @@ int ll_back_merge_fn(struct request_queue *q, struct request *req,
{
if (req_gap_back_merge(req, bio))
return 0;
+ if (blk_integrity_rq(req) &&
+ integrity_req_gap_back_merge(req, bio))
+ return 0;
if (blk_rq_sectors(req) + bio_sectors(bio) >
blk_rq_get_max_sectors(req)) {
req->cmd_flags |= REQ_NOMERGE;
@@ -335,6 +338,9 @@ int ll_front_merge_fn(struct request_queue *q, struct request *req,
if (req_gap_front_merge(req, bio))
return 0;
+ if (blk_integrity_rq(req) &&
+ integrity_req_gap_front_merge(req, bio))
+ return 0;
if (blk_rq_sectors(req) + bio_sectors(bio) >
blk_rq_get_max_sectors(req)) {
req->cmd_flags |= REQ_NOMERGE;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a65b94480422..dead6088af1c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1393,6 +1393,26 @@ static inline bool req_gap_front_merge(struct request *req, struct bio *bio)
return bio_will_gap(req->q, bio, req->bio);
}
+static inline bool integrity_req_gap_back_merge(struct request *req,
+ struct bio *next)
+{
+ struct bio_integrity_payload *bip = bio_integrity(req->bio);
+ struct bio_integrity_payload *bip_next = bio_integrity(next);
+
+ return bvec_gap_to_prev(req->q, &bip->bip_vec[bip->bip_vcnt - 1],
+ bip_next->bip_vec[0].bv_offset);
+}
+
+static inline bool integrity_req_gap_front_merge(struct request *req,
+ struct bio *bio)
+{
+ struct bio_integrity_payload *bip = bio_integrity(bio);
+ struct bio_integrity_payload *bip_next = bio_integrity(req->bio);
+
+ return bvec_gap_to_prev(req->q, &bip->bip_vec[bip->bip_vcnt - 1],
+ bip_next->bip_vec[0].bv_offset);
+}
+
struct work_struct;
int kblockd_schedule_work(struct work_struct *work);
int kblockd_schedule_delayed_work(struct delayed_work *dwork, unsigned long delay);
--
1.8.4.3
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio
2015-09-03 16:28 [PATCH 0/4] Fix gaps detection in bio merges Sagi Grimberg
2015-09-03 16:28 ` [PATCH 1/4] block: Check for gaps on front and back merges Sagi Grimberg
2015-09-03 16:28 ` [PATCH 2/4] block: Refuse request/bio merges with gaps in the integrity payload Sagi Grimberg
@ 2015-09-03 16:28 ` Sagi Grimberg
2015-09-04 3:24 ` Martin K. Petersen
2015-09-03 16:28 ` [PATCH 4/4] block: Copy a user iovec if it includes gaps Sagi Grimberg
3 siblings, 1 reply; 11+ messages in thread
From: Sagi Grimberg @ 2015-09-03 16:28 UTC (permalink / raw)
This is only theoretical at the moment given that the only
subsystems that generate integrity payloads are the block layer
itself and the scsi target (which generate well aligned integrity
payloads). But when we will expose integrity meta-data to user-space,
we'll need to refuse appending a page with a gap (if the queue
virtual boundary is set).
Signed-off-by: Sagi Grimberg <sagig at mellanox.com>
---
block/bio-integrity.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 719b7152aed1..4e5a43465ddb 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -140,6 +140,11 @@ int bio_integrity_add_page(struct bio *bio, struct page *page,
iv = bip->bip_vec + bip->bip_vcnt;
+ if (bip->bip_vcnt &&
+ bvec_gap_to_prev(bdev_get_queue(bio->bi_bdev),
+ &bip->bip_vec[bip->bip_vcnt - 1], offset))
+ return 0;
+
iv->bv_page = page;
iv->bv_len = len;
iv->bv_offset = offset;
--
1.8.4.3
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 4/4] block: Copy a user iovec if it includes gaps
2015-09-03 16:28 [PATCH 0/4] Fix gaps detection in bio merges Sagi Grimberg
` (2 preceding siblings ...)
2015-09-03 16:28 ` [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio Sagi Grimberg
@ 2015-09-03 16:28 ` Sagi Grimberg
3 siblings, 0 replies; 11+ messages in thread
From: Sagi Grimberg @ 2015-09-03 16:28 UTC (permalink / raw)
For drivers that don't support gaps in the SG lists handed to
them we must bounce (copy the user buffers) and pass a bio that
does not include gaps. This doesn't matter for any current user,
but will help to allow iser which can't handle gaps to use the
block virtual boundary instead of using driver-local bounce
buffering when handling SG_IO commands.
Signed-off-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Sagi Grimberg <sagig at mellanox.com>
---
block/blk-map.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/block/blk-map.c b/block/blk-map.c
index da310a105429..ac581a674b9b 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -9,6 +9,24 @@
#include "blk.h"
+static bool iovec_gap_to_prv(struct request_queue *q,
+ struct iovec *prv, struct iovec *cur)
+{
+ unsigned long prev_end;
+
+ if (!queue_virt_boundary(q))
+ return false;
+
+ if (prv->iov_base == NULL && prv->iov_len == 0)
+ /* prv is not set - don't check */
+ return false;
+
+ prev_end = (unsigned long)(prv->iov_base + prv->iov_len);
+
+ return (((unsigned long)cur->iov_base & queue_virt_boundary(q)) ||
+ prev_end & queue_virt_boundary(q));
+}
+
int blk_rq_append_bio(struct request_queue *q, struct request *rq,
struct bio *bio)
{
@@ -67,7 +85,7 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
struct bio *bio;
int unaligned = 0;
struct iov_iter i;
- struct iovec iov;
+ struct iovec iov, prv = {.iov_base = NULL, .iov_len = 0};
if (!iter || !iter->count)
return -EINVAL;
@@ -81,8 +99,12 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
/*
* Keep going so we check length of all segments
*/
- if (uaddr & queue_dma_alignment(q))
+ if ((uaddr & queue_dma_alignment(q)) ||
+ iovec_gap_to_prv(q, &prv, &iov))
unaligned = 1;
+
+ prv.iov_base = iov.iov_base;
+ prv.iov_len = iov.iov_len;
}
if (unaligned || (q->dma_pad_mask & iter->count) || map_data)
--
1.8.4.3
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio
2015-09-03 16:28 ` [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio Sagi Grimberg
@ 2015-09-04 3:24 ` Martin K. Petersen
2015-09-06 7:46 ` Sagi Grimberg
0 siblings, 1 reply; 11+ messages in thread
From: Martin K. Petersen @ 2015-09-04 3:24 UTC (permalink / raw)
>>>>> "Sagi" == Sagi Grimberg <sagig at mellanox.com> writes:
Sagi> This is only theoretical at the moment given that the only
Sagi> subsystems that generate integrity payloads are the block layer
Sagi> itself and the scsi target (which generate well aligned integrity
Sagi> payloads). But when we will expose integrity meta-data to
Sagi> user-space, we'll need to refuse appending a page with a gap (if
Sagi> the queue virtual boundary is set).
Code looks fine but my concerns outlined in:
http://lists.infradead.org/pipermail/linux-nvme/2015-July/002063.html
still stand.
--
Martin K. Petersen Oracle Linux Engineering
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio
2015-09-04 3:24 ` Martin K. Petersen
@ 2015-09-06 7:46 ` Sagi Grimberg
2015-09-08 18:24 ` Martin K. Petersen
0 siblings, 1 reply; 11+ messages in thread
From: Sagi Grimberg @ 2015-09-06 7:46 UTC (permalink / raw)
On 9/4/2015 6:24 AM, Martin K. Petersen wrote:
>>>>>> "Sagi" == Sagi Grimberg <sagig at mellanox.com> writes:
>
> Sagi> This is only theoretical at the moment given that the only
> Sagi> subsystems that generate integrity payloads are the block layer
> Sagi> itself and the scsi target (which generate well aligned integrity
> Sagi> payloads). But when we will expose integrity meta-data to
> Sagi> user-space, we'll need to refuse appending a page with a gap (if
> Sagi> the queue virtual boundary is set).
>
Hi Martin,
> Code looks fine but my concerns outlined in:
>
> http://lists.infradead.org/pipermail/linux-nvme/2015-July/002063.html
So given that this is a bug fix (was completely not handled before) I
think we should make an incremental progress towards getting it right.
Are you fine with that?
Now, given that drivers which set virt_boundary won't be able to accept
8-byte sg elements for DIX, would you prefer it being bounced rather
than refusing the merge? This would require some additional logic given
that we won't want to bounce on every merge attempt but only when the
request is kicked?
Sagi.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio
2015-09-06 7:46 ` Sagi Grimberg
@ 2015-09-08 18:24 ` Martin K. Petersen
2015-09-10 9:10 ` Sagi Grimberg
0 siblings, 1 reply; 11+ messages in thread
From: Martin K. Petersen @ 2015-09-08 18:24 UTC (permalink / raw)
>>>>> "Sagi" == Sagi Grimberg <sagig at dev.mellanox.co.il> writes:
Sagi,
Sagi> So given that this is a bug fix (was completely not handled
Sagi> before) I think we should make an incremental progress towards
Sagi> getting it right. Are you fine with that?
Yeah, I don't have an objection to your changes.
Sagi> Now, given that drivers which set virt_boundary won't be able to
Sagi> accept 8-byte sg elements for DIX, would you prefer it being
Sagi> bounced rather than refusing the merge? This would require some
Sagi> additional logic given that we won't want to bounce on every merge
Sagi> attempt but only when the request is kicked?
For some workloads I think bouncing is inevitable by virtue of the PI
payload being teeny tiny by design. I don't mind that happening in block
as opposed to the driver. This would also help things like NVMe.
--
Martin K. Petersen Oracle Linux Engineering
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio
2015-09-08 18:24 ` Martin K. Petersen
@ 2015-09-10 9:10 ` Sagi Grimberg
2015-09-10 22:22 ` Martin K. Petersen
0 siblings, 1 reply; 11+ messages in thread
From: Sagi Grimberg @ 2015-09-10 9:10 UTC (permalink / raw)
On 9/8/2015 9:24 PM, Martin K. Petersen wrote:
>>>>>> "Sagi" == Sagi Grimberg <sagig at dev.mellanox.co.il> writes:
>
> Sagi,
>
> Sagi> So given that this is a bug fix (was completely not handled
> Sagi> before) I think we should make an incremental progress towards
> Sagi> getting it right. Are you fine with that?
>
> Yeah, I don't have an objection to your changes.
>
> Sagi> Now, given that drivers which set virt_boundary won't be able to
> Sagi> accept 8-byte sg elements for DIX, would you prefer it being
> Sagi> bounced rather than refusing the merge? This would require some
> Sagi> additional logic given that we won't want to bounce on every merge
> Sagi> attempt but only when the request is kicked?
>
> For some workloads I think bouncing is inevitable by virtue of the PI
> payload being teeny tiny by design.
I see. I guess we can bounce DIX data, but is it always "right" to
bounce? I don't know, as you say, it is workload dependent.
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio
2015-09-10 9:10 ` Sagi Grimberg
@ 2015-09-10 22:22 ` Martin K. Petersen
2015-09-13 7:53 ` Sagi Grimberg
0 siblings, 1 reply; 11+ messages in thread
From: Martin K. Petersen @ 2015-09-10 22:22 UTC (permalink / raw)
>>>>> "Sagi" == Sagi Grimberg <sagig at dev.mellanox.co.il> writes:
>> For some workloads I think bouncing is inevitable by virtue of the PI
>> payload being teeny tiny by design.
Sagi> I see. I guess we can bounce DIX data, but is it always "right" to
Sagi> bounce? I don't know, as you say, it is workload dependent.
The right place to make that call, I guess, is in
bio_integrity_add_page() based on whether we exceed any of the
constraints set by the LLD.
--
Martin K. Petersen Oracle Linux Engineering
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio
2015-09-10 22:22 ` Martin K. Petersen
@ 2015-09-13 7:53 ` Sagi Grimberg
0 siblings, 0 replies; 11+ messages in thread
From: Sagi Grimberg @ 2015-09-13 7:53 UTC (permalink / raw)
On 9/11/2015 1:22 AM, Martin K. Petersen wrote:
>>>>>> "Sagi" == Sagi Grimberg <sagig at dev.mellanox.co.il> writes:
>
>>> For some workloads I think bouncing is inevitable by virtue of the PI
>>> payload being teeny tiny by design.
>
> Sagi> I see. I guess we can bounce DIX data, but is it always "right" to
> Sagi> bounce? I don't know, as you say, it is workload dependent.
>
> The right place to make that call, I guess, is in
> bio_integrity_add_page() based on whether we exceed any of the
> constraints set by the LLD.
In case the consumer passed a gappy integrity payload then we can bounce
but AFAICT, merge flows do not include a call site to
bio_integrity_add_page.
It would be easy enough to bounce gappy IOs and refuse gappy merges,
would that be sufficient?
Sagi.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2015-09-13 7:53 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-03 16:28 [PATCH 0/4] Fix gaps detection in bio merges Sagi Grimberg
2015-09-03 16:28 ` [PATCH 1/4] block: Check for gaps on front and back merges Sagi Grimberg
2015-09-03 16:28 ` [PATCH 2/4] block: Refuse request/bio merges with gaps in the integrity payload Sagi Grimberg
2015-09-03 16:28 ` [PATCH 3/4] block: Refuse adding appending a gaped integrity page to a bio Sagi Grimberg
2015-09-04 3:24 ` Martin K. Petersen
2015-09-06 7:46 ` Sagi Grimberg
2015-09-08 18:24 ` Martin K. Petersen
2015-09-10 9:10 ` Sagi Grimberg
2015-09-10 22:22 ` Martin K. Petersen
2015-09-13 7:53 ` Sagi Grimberg
2015-09-03 16:28 ` [PATCH 4/4] block: Copy a user iovec if it includes gaps Sagi Grimberg
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox