merging discard request in the block layer

public inbox for linux-kernel@vger.kernel.org

* merging discard request in the block layer
From: Christoph Hellwig @ 2011-03-22 19:47 UTC
  To: Jens Axboe; +Cc: linux-kernel

It seems the current block layer will happily try to re-merge discard
requests that were split because they hit the maximum size that bi_size
can hold.  At least that's what the

	blk: request botched

messages make me believe when testing XFS code that allows multiple
asynchronous discard requests, unlike the current blkdev_issue_discard,
which always waits for one to complete before starting the next.

I tried this little snippet to prevent it:

Index: xfs/block/blk-merge.c
===================================================================
--- xfs.orig/block/blk-merge.c	2011-03-22 13:07:24.733857580 +0100
+++ xfs/block/blk-merge.c	2011-03-22 13:08:17.448856577 +0100
@@ -373,7 +373,7 @@ static int attempt_merge(struct request_
 	/*
 	 * Don't merge file system requests and discard requests
 	 */
-	if ((req->cmd_flags & REQ_DISCARD) != (next->cmd_flags & REQ_DISCARD))
+	if ((req->cmd_flags & REQ_DISCARD) || (next->cmd_flags & REQ_DISCARD))
 		return 0;
 
 	/*

but it has no effect.  Using the big hammer and bypassing the whole
I/O scheduler logic, on the other hand, works fine:

Index: xfs/block/blk-core.c
===================================================================
--- xfs.orig/block/blk-core.c	2011-03-22 13:07:24.717855861 +0100
+++ xfs/block/blk-core.c	2011-03-22 14:56:13.424856289 +0100
@@ -1218,7 +1218,7 @@ static int __make_request(struct request
 
 	spin_lock_irq(q->queue_lock);
 
-	if (bio->bi_rw & (REQ_FLUSH | REQ_FUA)) {
+	if (bio->bi_rw & (REQ_FLUSH | REQ_FUA | REQ_DISCARD)) {
 		where = ELEVATOR_INSERT_FRONT;
 		goto get_rq;
 	}
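
As an illustration that is not part of the original message, the two
versions of the attempt_merge() check can be compared in a minimal
user-space sketch. The flag value and the function names below are
assumptions made for the example; only the two boolean expressions are
taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical flag value for illustration; the kernel defines its own. */
#define SKETCH_REQ_DISCARD (1u << 7)

/*
 * Original check: refuse the merge only when exactly one of the two
 * requests is a discard. Two discards are still allowed to merge.
 */
static bool refuse_mixed_only(unsigned int req_flags, unsigned int next_flags)
{
	return (req_flags & SKETCH_REQ_DISCARD) !=
	       (next_flags & SKETCH_REQ_DISCARD);
}

/*
 * Patched check: refuse the merge whenever either request is a discard,
 * so discard/discard merges are rejected as well.
 */
static bool refuse_any_discard(unsigned int req_flags, unsigned int next_flags)
{
	return (req_flags & SKETCH_REQ_DISCARD) ||
	       (next_flags & SKETCH_REQ_DISCARD);
}
```

The only input on which the two predicates differ is the case where both
requests carry the discard flag, which is the case the patch is aimed at.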


* Re: merging discard request in the block layer
From: Jens Axboe @ 2011-03-22 19:54 UTC
  To: Christoph Hellwig; +Cc: linux-kernel@vger.kernel.org

On 2011-03-22 20:47, Christoph Hellwig wrote:
> It seems the current block layer will happily try to re-merge discard
> requests that were split because they hit the maximum size that bi_size
> can hold.  At least that's what the
> 
> 	blk: request botched

That would seem to indicate a bug in the merging logic instead.

> messages make me believe when testing XFS code that allows multiple
> asynchronous discard requests, unlike the current blkdev_issue_discard,
> which always waits for one to complete before starting the next.
> 
> I tried this little snippet to prevent it:
> 
> Index: xfs/block/blk-merge.c
> ===================================================================
> --- xfs.orig/block/blk-merge.c	2011-03-22 13:07:24.733857580 +0100
> +++ xfs/block/blk-merge.c	2011-03-22 13:08:17.448856577 +0100
> @@ -373,7 +373,7 @@ static int attempt_merge(struct request_
>  	/*
>  	 * Don't merge file system requests and discard requests
>  	 */
> -	if ((req->cmd_flags & REQ_DISCARD) != (next->cmd_flags & REQ_DISCARD))
> +	if ((req->cmd_flags & REQ_DISCARD) || (next->cmd_flags & REQ_DISCARD))
>  		return 0;
>  
>  	/*

That's not going to be enough; you also want to disable bio-to-request
merging of discards in elevator.c:elv_rq_merge_ok().  Does that then
fix it?


-- 
Jens Axboe



* Re: merging discard request in the block layer
From: Christoph Hellwig @ 2011-03-22 21:00 UTC
  To: Jens Axboe; +Cc: Christoph Hellwig, linux-kernel@vger.kernel.org

On Tue, Mar 22, 2011 at 08:54:06PM +0100, Jens Axboe wrote:
> > Index: xfs/block/blk-merge.c
> > ===================================================================
> > --- xfs.orig/block/blk-merge.c	2011-03-22 13:07:24.733857580 +0100
> > +++ xfs/block/blk-merge.c	2011-03-22 13:08:17.448856577 +0100
> > @@ -373,7 +373,7 @@ static int attempt_merge(struct request_
> >  	/*
> >  	 * Don't merge file system requests and discard requests
> >  	 */
> > -	if ((req->cmd_flags & REQ_DISCARD) != (next->cmd_flags & REQ_DISCARD))
> > +	if ((req->cmd_flags & REQ_DISCARD) || (next->cmd_flags & REQ_DISCARD))
> >  		return 0;
> >  
> >  	/*
> 
> That's not going to be enough; you also want to disable bio-to-request
> merging of discards in elevator.c:elv_rq_merge_ok().  Does that then
> fix it?

Applying the same fix in elv_rq_merge_ok seems to fix the issue; at
least the xfstests test case that usually hits it now completes OK.



* Re: merging discard request in the block layer
From: Jens Axboe @ 2011-03-22 21:03 UTC
  To: Christoph Hellwig; +Cc: linux-kernel@vger.kernel.org

On 2011-03-22 20:54, Jens Axboe wrote:
> On 2011-03-22 20:47, Christoph Hellwig wrote:
>> It seems the current block layer will happily try to re-merge discard
>> requests that were split because they hit the maximum size that bi_size
>> can hold.  At least that's what the
>>
>> 	blk: request botched
> 
> That would seem to indicate a bug in the merging logic instead.

What kind of max discard size does your device have? If the max discard
size is smaller than the regular request size, this could help.

diff --git a/block/blk-merge.c b/block/blk-merge.c
index cfcc37c..76cdfb7 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -232,8 +232,12 @@ int ll_back_merge_fn(struct request_queue *q, struct request *req,
 
 	if (unlikely(req->cmd_type == REQ_TYPE_BLOCK_PC))
 		max_sectors = queue_max_hw_sectors(q);
-	else
-		max_sectors = queue_max_sectors(q);
+	else {
+		if (req->cmd_flags & REQ_DISCARD)
+			max_sectors = q->limits.max_discard_sectors;
+		else
+			max_sectors = queue_max_sectors(q);
+	}
 
 	if (blk_rq_sectors(req) + bio_sectors(bio) > max_sectors) {
 		req->cmd_flags |= REQ_NOMERGE;
@@ -256,9 +260,12 @@ int ll_front_merge_fn(struct request_queue *q, struct request *req,
 
 	if (unlikely(req->cmd_type == REQ_TYPE_BLOCK_PC))
 		max_sectors = queue_max_hw_sectors(q);
-	else
-		max_sectors = queue_max_sectors(q);
-
+	else {
+		if (req->cmd_flags & REQ_DISCARD)
+			max_sectors = q->limits.max_discard_sectors;
+		else
+			max_sectors = queue_max_sectors(q);
+	}
 
 	if (blk_rq_sectors(req) + bio_sectors(bio) > max_sectors) {
 		req->cmd_flags |= REQ_NOMERGE;
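
As an illustration that is not part of the original message, the limit
selection in the patch above can be sketched in user space as follows.
The struct, enum, and function names are assumptions made for the
example, and the request type is collapsed into a single enum, whereas
the kernel code distinguishes cmd_type from cmd_flags.

```c
#include <stdbool.h>

/* Simplified stand-in for the queue limits; field names mirror the patch. */
struct sketch_limits {
	unsigned int max_sectors;         /* regular file system requests */
	unsigned int max_hw_sectors;      /* BLOCK_PC (passthrough) requests */
	unsigned int max_discard_sectors; /* discard requests */
};

/* Hypothetical request classification used only in this sketch. */
enum sketch_kind { SKETCH_FS, SKETCH_BLOCK_PC, SKETCH_DISCARD };

/* Pick the size limit that applies to a request of the given kind. */
static unsigned int sketch_merge_limit(const struct sketch_limits *lim,
				       enum sketch_kind kind)
{
	switch (kind) {
	case SKETCH_BLOCK_PC:
		return lim->max_hw_sectors;
	case SKETCH_DISCARD:
		return lim->max_discard_sectors;
	default:
		return lim->max_sectors;
	}
}

/* A bio may be merged into a request only if the sum stays within the limit. */
static bool sketch_can_merge(const struct sketch_limits *lim,
			     enum sketch_kind kind,
			     unsigned int rq_sectors, unsigned int bio_sectors)
{
	unsigned long long total = (unsigned long long)rq_sectors + bio_sectors;

	return total <= sketch_merge_limit(lim, kind);
}
```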


-- 
Jens Axboe



* Re: merging discard request in the block layer
From: Christoph Hellwig @ 2011-03-23 13:01 UTC
  To: Jens Axboe; +Cc: linux-kernel@vger.kernel.org

On Tue, Mar 22, 2011 at 10:03:57PM +0100, Jens Axboe wrote:
> > That would seem to indicate a bug in the merging logic instead.
> 
> What kind of max discard size does your device have? If the max discard
> size is smaller than the regular request size, this could help.

It's a SCSI device, so the max discard size is a lot larger:

# cat /sys/block/sda/queue/max_sectors_kb
512
# cat /sys/block/sda/queue/max_hw_sectors_kb 
32767
# cat /sys/block/sda/queue/discard_max_bytes 
4294966784
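
As an illustration that is not part of the original message, the three
sysfs values can be put into the same unit. Assuming the usual 512-byte
sector, they come to 1024 sectors (max_sectors_kb), 65534 sectors
(max_hw_sectors_kb) and 8388607 sectors (discard_max_bytes), so on this
device a discard-specific limit is far looser than the regular one. The
helper names below are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed sector size: 512 bytes. */
#define SKETCH_SECTOR_SIZE 512u

/* Convert a sysfs "*_kb" value (KiB) to 512-byte sectors. */
static uint64_t sketch_kb_to_sectors(uint64_t kb)
{
	return kb * 1024u / SKETCH_SECTOR_SIZE;
}

/* Convert a sysfs "*_bytes" value to 512-byte sectors. */
static uint64_t sketch_bytes_to_sectors(uint64_t bytes)
{
	return bytes / SKETCH_SECTOR_SIZE;
}
```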



* Re: merging discard request in the block layer
From: Jens Axboe @ 2011-03-23 15:26 UTC
  To: Christoph Hellwig; +Cc: linux-kernel@vger.kernel.org

On 2011-03-23 14:01, Christoph Hellwig wrote:
> On Tue, Mar 22, 2011 at 10:03:57PM +0100, Jens Axboe wrote:
>>> That would seem to indicate a bug in the merging logic instead.
>>
>> What kind of max discard size does your device have? If the max discard
>> size is smaller than the regular request size, this could help.
> 
> It's a SCSI device, so the max discard size is a lot larger:
> 
> # cat /sys/block/sda/queue/max_sectors_kb
> 512
> # cat /sys/block/sda/queue/max_hw_sectors_kb 
> 32767
> # cat /sys/block/sda/queue/discard_max_bytes 
> 4294966784

I'll try and throw a synthetic test at it that produces a slew of
discard merging and see what happens.

-- 
Jens Axboe



* Re: merging discard request in the block layer
From: Christoph Hellwig @ 2011-03-30 14:16 UTC
  To: Jens Axboe; +Cc: Christoph Hellwig, linux-kernel@vger.kernel.org

On Tue, Mar 22, 2011 at 10:03:57PM +0100, Jens Axboe wrote:
> On 2011-03-22 20:54, Jens Axboe wrote:
> > On 2011-03-22 20:47, Christoph Hellwig wrote:
> >> It seems the current block layer will happily try to re-merge discard
> >> requests that were split because they hit the maximum size that bi_size
> >> can hold.  At least that's what the
> >>
> >> 	blk: request botched
> > 
> > That would seem to indicate a bug in the merging logic instead.
> 
> What kind of max discard size does your device have? If the max discard
> size is smaller than the regular request size, this could help.

I've done some heavier testing, and both the extended check for
mergeable requests and your patch with different limits hang the test
box hard, with no way to get a backtrace.  Using my original patch to
completely skip the merging logic seems to work fine.



* Re: merging discard request in the block layer
From: Christoph Hellwig @ 2011-05-03 18:05 UTC
  To: Jens Axboe; +Cc: Christoph Hellwig, linux-kernel@vger.kernel.org

I finally managed to debug this a bit further and have found a cure
for the null pointer dereferences I got on recent kernels.

The problem is that bio_has_data thinks discard requests have a payload,
so the merging code pokes into its pages when trying to merge requests.
Moving the REQ_DISCARD check into bio_has_data fixes that.  I've also
tried to special-case discard requests in bio_cur_bytes, but that
doesn't fix the "request botched" messages yet.  I suspect the merging
code might need some additions to update the bio_size for discard
requests, which it currently skips.

Index: xfs/block/blk-core.c
===================================================================
--- xfs.orig/block/blk-core.c	2011-05-03 19:45:51.980219652 +0200
+++ xfs/block/blk-core.c	2011-05-03 19:47:51.756237436 +0200
@@ -1645,7 +1645,7 @@ void submit_bio(int rw, struct bio *bio)
 	 * If it's a regular read/write or a barrier with data attached,
 	 * go through the normal accounting stuff before submission.
 	 */
-	if (bio_has_data(bio) && !(rw & REQ_DISCARD)) {
+	if (bio_has_data(bio)) {
 		if (rw & WRITE) {
 			count_vm_events(PGPGOUT, count);
 		} else {
Index: xfs/include/linux/bio.h
===================================================================
--- xfs.orig/include/linux/bio.h	2011-05-03 19:43:28.537663414 +0200
+++ xfs/include/linux/bio.h	2011-05-03 19:48:18.632758500 +0200
@@ -69,7 +69,7 @@
 
 static inline unsigned int bio_cur_bytes(struct bio *bio)
 {
-	if (bio->bi_vcnt)
+	if (bio->bi_vcnt && !(bio->bi_rw & REQ_DISCARD))
 		return bio_iovec(bio)->bv_len;
 	else /* dataless requests such as discard */
 		return bio->bi_size;
@@ -368,7 +368,7 @@ static inline char *__bio_kmap_irq(struc
  */
 static inline int bio_has_data(struct bio *bio)
 {
-	return bio && bio->bi_io_vec != NULL;
+	return bio && bio->bi_io_vec != NULL && !(bio->bi_rw & REQ_DISCARD);
 }
 
 /*
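
As an illustration that is not part of the original message, the effect
of the two patched helpers can be sketched in user space. The struct
layouts, the flag value, and the sketch_ names are assumptions made for
the example; the real bio_cur_bytes uses bio_iovec(), which this sketch
simplifies to the first vector entry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value for illustration; the kernel defines its own. */
#define SKETCH_REQ_DISCARD (1ul << 7)

/* Minimal stand-ins for struct bio_vec and struct bio. */
struct sketch_bio_vec {
	unsigned int bv_len;
};

struct sketch_bio {
	unsigned long bi_rw;              /* request flags */
	unsigned short bi_vcnt;           /* number of vector entries */
	unsigned int bi_size;             /* total size in bytes */
	struct sketch_bio_vec *bi_io_vec; /* vector list, may be non-NULL for discards */
};

/* Patched bio_has_data: a discard never counts as carrying a payload. */
static bool sketch_bio_has_data(const struct sketch_bio *bio)
{
	return bio && bio->bi_io_vec != NULL &&
	       !(bio->bi_rw & SKETCH_REQ_DISCARD);
}

/*
 * Patched bio_cur_bytes: for discards, report bi_size instead of reading
 * the length of a vector entry that does not describe real data.
 */
static unsigned int sketch_bio_cur_bytes(const struct sketch_bio *bio)
{
	if (bio->bi_vcnt && !(bio->bi_rw & SKETCH_REQ_DISCARD))
		return bio->bi_io_vec[0].bv_len;
	return bio->bi_size; /* dataless requests such as discard */
}
```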
