linux-fsdevel.vger.kernel.org archive mirror
* [PATCH V3] bio: modify __bio_add_page() to accept pages that don't start a new segment
@ 2014-05-01 13:56 Maurizio Lombardi
  2014-05-27 10:15 ` Ming Lei
  0 siblings, 1 reply; 6+ messages in thread
From: Maurizio Lombardi @ 2014-05-01 13:56 UTC (permalink / raw)
  To: viro
  Cc: akpm, linux-fsdevel, JBottomley, hch, linux-scsi, kmo,
	linux-kernel, m.lombardi85

The original behaviour is to refuse to add a new page once the maximum number
of segments has been reached, regardless of whether the page being added
could be merged into the last segment.

Unfortunately, when the system runs under heavy memory fragmentation conditions,
a driver may try to add multiple pages to the last segment.
The original code won't accept them and EBUSY will be reported to
userspace.

This patch modifies the function so that it refuses to add a page
only if that page starts a new segment and the maximum number
of segments has already been reached.
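
To illustrate the difference between the two policies, here is a minimal
user-space sketch. It is not kernel code: the model_* names, the flat
physical-address representation and the always-recount behaviour are
assumptions made for illustration only (the real function caches
bi_phys_segments and recounts at most once).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_VECS 32

/* Hypothetical stand-in for a bio_vec: one physically contiguous chunk. */
struct model_vec {
	unsigned long phys;	/* start physical address */
	unsigned int len;	/* length in bytes */
};

/* Hypothetical stand-in for a bio plus the queue's segment limit. */
struct model_bio {
	struct model_vec vecs[MODEL_MAX_VECS];
	size_t vcnt;
	size_t max_segments;
};

/* Count segments: physically contiguous neighbours share one segment. */
static size_t model_count_segments(const struct model_bio *bio)
{
	size_t i, segs = 0;

	for (i = 0; i < bio->vcnt; i++) {
		const struct model_vec *prev = &bio->vecs[i - (i ? 1 : 0)];

		if (i == 0 || prev->phys + prev->len != bio->vecs[i].phys)
			segs++;
	}
	return segs;
}

/* Old policy: refuse once the limit is reached, without looking at the page. */
static bool model_add_old(struct model_bio *bio, unsigned long phys,
			  unsigned int len)
{
	assert(bio->vcnt < MODEL_MAX_VECS);
	if (model_count_segments(bio) >= bio->max_segments)
		return false;
	bio->vecs[bio->vcnt].phys = phys;
	bio->vecs[bio->vcnt].len = len;
	bio->vcnt++;
	return true;
}

/* New policy: add tentatively, recount, undo only if the limit is exceeded. */
static bool model_add_new(struct model_bio *bio, unsigned long phys,
			  unsigned int len)
{
	assert(bio->vcnt < MODEL_MAX_VECS);
	bio->vecs[bio->vcnt].phys = phys;
	bio->vecs[bio->vcnt].len = len;
	bio->vcnt++;
	if (model_count_segments(bio) > bio->max_segments) {
		bio->vcnt--;	/* roll back the tentative entry */
		return false;
	}
	return true;
}
```

With max_segments set to 1, the old policy rejects a second page even when
it directly follows the first one in physical memory, while the new policy
accepts it and rejects only a page that would open a second segment.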

The bug can be easily reproduced with the st driver:

1) set CONFIG_SCSI_MPT2SAS_MAX_SGE or CONFIG_SCSI_MPT3SAS_MAX_SGE to 16
2) modprobe st buffer_kbs=1024
3) #dd if=/dev/zero of=/dev/st0 bs=1M count=10
   dd: error writing ‘/dev/st0’: Device or resource busy

Changes in V3:

In case of error, V2 restored the previous number of segments but left
the BIO_SEG_VALID flag set.
To avoid problems, after the page is removed from the bio vec,
V3 performs a recount of the segments in the error code path.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
---
 fs/bio.c | 48 ++++++++++++++++++++++++++----------------------
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index 6f0362b..9bf512e 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -750,29 +750,31 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
 		return 0;
 
 	/*
-	 * we might lose a segment or two here, but rather that than
-	 * make this too complex.
+	 * setup the new entry, we might clear it again later if we
+	 * cannot add the page
+	 */
+	bvec = &bio->bi_io_vec[bio->bi_vcnt];
+	bvec->bv_page = page;
+	bvec->bv_len = len;
+	bvec->bv_offset = offset;
+	bio->bi_vcnt++;
+	bio->bi_phys_segments++;
+
+	/*
+	 * Perform a recount if the number of segments is greater
+	 * than queue_max_segments(q).
 	 */
 
-	while (bio->bi_phys_segments >= queue_max_segments(q)) {
+	while (bio->bi_phys_segments > queue_max_segments(q)) {
 
 		if (retried_segments)
-			return 0;
+			goto failed;
 
 		retried_segments = 1;
 		blk_recount_segments(q, bio);
 	}
 
 	/*
-	 * setup the new entry, we might clear it again later if we
-	 * cannot add the page
-	 */
-	bvec = &bio->bi_io_vec[bio->bi_vcnt];
-	bvec->bv_page = page;
-	bvec->bv_len = len;
-	bvec->bv_offset = offset;
-
-	/*
 	 * if queue has other restrictions (eg varying max sector size
 	 * depending on offset), it can specify a merge_bvec_fn in the
 	 * queue to get further control
@@ -789,23 +791,25 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
 		 * merge_bvec_fn() returns number of bytes it can accept
 		 * at this offset
 		 */
-		if (q->merge_bvec_fn(q, &bvm, bvec) < bvec->bv_len) {
-			bvec->bv_page = NULL;
-			bvec->bv_len = 0;
-			bvec->bv_offset = 0;
-			return 0;
-		}
+		if (q->merge_bvec_fn(q, &bvm, bvec) < bvec->bv_len)
+			goto failed;
 	}
 
 	/* If we may be able to merge these biovecs, force a recount */
-	if (bio->bi_vcnt && (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec)))
+	if (bio->bi_vcnt > 1 && (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec)))
 		bio->bi_flags &= ~(1 << BIO_SEG_VALID);
 
-	bio->bi_vcnt++;
-	bio->bi_phys_segments++;
  done:
 	bio->bi_iter.bi_size += len;
 	return len;
+
+ failed:
+	bvec->bv_page = NULL;
+	bvec->bv_len = 0;
+	bvec->bv_offset = 0;
+	bio->bi_vcnt--;
+	blk_recount_segments(q, bio);
+	return 0;
 }
 
 /**
-- 
Maurizio Lombardi


* Re: [PATCH V3] bio: modify __bio_add_page() to accept pages that don't start a new segment
  2014-05-01 13:56 [PATCH V3] bio: modify __bio_add_page() to accept pages that don't start a new segment Maurizio Lombardi
@ 2014-05-27 10:15 ` Ming Lei
  2014-05-27 11:14   ` Maurizio Lombardi
  2014-05-27 11:58   ` Maurizio Lombardi
  0 siblings, 2 replies; 6+ messages in thread
From: Ming Lei @ 2014-05-27 10:15 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: Al Viro, Andrew Morton, Linux FS Devel, James E.J. Bottomley,
	Christoph Hellwig, Linux SCSI List, Kent Overstreet,
	Linux Kernel Mailing List, m.lombardi85

Hi Maurizio,

On Thu, May 1, 2014 at 9:56 PM, Maurizio Lombardi <mlombard@redhat.com> wrote:
> The original behaviour is to refuse to add a new page once the maximum number
> of segments has been reached, regardless of whether the page being added
> could be merged into the last segment.

If the page can be merged to last segment, it should have been
covered by code in branch of 'if (bio->bi_vcnt > 0) ...', shouldn't it?

Or maybe it is better to make that code cover your case since
looks your case is similar with that one according to your commit
log.


Thanks,
-- 
Ming Lei


* Re: [PATCH V3] bio: modify __bio_add_page() to accept pages that don't start a new segment
  2014-05-27 10:15 ` Ming Lei
@ 2014-05-27 11:14   ` Maurizio Lombardi
  2014-05-27 11:58   ` Maurizio Lombardi
  1 sibling, 0 replies; 6+ messages in thread
From: Maurizio Lombardi @ 2014-05-27 11:14 UTC (permalink / raw)
  To: Ming Lei
  Cc: Al Viro, Andrew Morton, Linux FS Devel, James E.J. Bottomley,
	Christoph Hellwig, linux-scsi, Kent Overstreet,
	Linux Kernel Mailing List, m.lombardi85

On Tue, May 27, 2014 at 06:15:45PM +0800, Ming Lei wrote:
> 
> If the page can be merged to last segment, it should have been
> covered by code in branch of 'if (bio->bi_vcnt > 0) ...', shouldn't it?
> 
> Or maybe it is better to make that code cover your case since
> looks your case is similar with that one according to your commit
> log.

The code in that branch does not cover our case; it is intended to cover the case
where __bio_add_page() is called multiple times with the *same* page as a parameter.
My patch deals with the case where __bio_add_page() is called with *different* pages
that are physically adjacent to each other.
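
The distinction can be sketched as two separate predicates. This is a
simplified user-space model, not the kernel code: the model_* names, the
page-frame-number representation and the fixed 4096-byte page size are
assumptions for illustration. The first check mirrors the same-page test in
the 'if (bio->bi_vcnt > 0)' branch, the second mirrors what
BIOVEC_PHYS_MERGEABLE() compares.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical stand-in for a bio_vec, with a page frame number for the page. */
struct model_bvec {
	unsigned long pfn;	/* which page */
	unsigned int offset;	/* offset inside the page */
	unsigned int len;	/* length in bytes */
};

/* Physical address of the first byte described by the vector. */
static unsigned long model_phys(const struct model_bvec *v)
{
	assert(v->offset + v->len <= MODEL_PAGE_SIZE);
	return v->pfn * MODEL_PAGE_SIZE + v->offset;
}

/* Same-page case: the new data continues the previous vector in the same page. */
static bool model_same_page_extends(const struct model_bvec *prev,
				    unsigned long pfn, unsigned int offset)
{
	return pfn == prev->pfn && offset == prev->offset + prev->len;
}

/* Adjacent-page case: the previous vector ends exactly where the next begins. */
static bool model_phys_mergeable(const struct model_bvec *prev,
				 const struct model_bvec *next)
{
	return model_phys(prev) + prev->len == model_phys(next);
}
```

Two full pages with consecutive frame numbers fail the same-page test but
pass the physical-adjacency test, which is the situation the patch targets.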

That said, it is true that this branch could perhaps be extended to cover the case
I'm dealing with as well, and to try to avoid the problem that commit 3979ef4dcf introduced.

Thanks,
Maurizio Lombardi


* Re: [PATCH V3] bio: modify __bio_add_page() to accept pages that don't start a new segment
  2014-05-27 10:15 ` Ming Lei
  2014-05-27 11:14   ` Maurizio Lombardi
@ 2014-05-27 11:58   ` Maurizio Lombardi
  2014-05-27 14:04     ` Ming Lei
  1 sibling, 1 reply; 6+ messages in thread
From: Maurizio Lombardi @ 2014-05-27 11:58 UTC (permalink / raw)
  To: Ming Lei
  Cc: Al Viro, Andrew Morton, Linux FS Devel, James E.J. Bottomley,
	Christoph Hellwig, linux-scsi, Kent Overstreet,
	Linux Kernel Mailing List, m.lombardi85

On Tue, May 27, 2014 at 06:15:45PM +0800, Ming Lei wrote:
> 
> If the page can be merged to last segment, it should have been
> covered by code in branch of 'if (bio->bi_vcnt > 0) ...', shouldn't it?
> 
> Or maybe it is better to make that code cover your case since
> looks your case is similar with that one according to your commit
> log.
>

I realized that maybe you mean this branch:

if (bio->bi_vcnt > 1 && (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec)))

right?
In this case the branch is not even reached in the original code because
the function returned an error way before executing it (line 760)

There was already a patchset trying to modify the code in a different way than mine:

https://groups.google.com/forum/#!msg/linux.kernel/3IanUpBVhFQ/3Xbg3yLRFp4J

but it has been ignored and in my opinion it takes a more
complicated approach.

Regards,
Maurizio Lombardi


* Re: [PATCH V3] bio: modify __bio_add_page() to accept pages that don't start a new segment
  2014-05-27 11:58   ` Maurizio Lombardi
@ 2014-05-27 14:04     ` Ming Lei
  2014-05-27 14:27       ` Maurizio Lombardi
  0 siblings, 1 reply; 6+ messages in thread
From: Ming Lei @ 2014-05-27 14:04 UTC (permalink / raw)
  To: Maurizio Lombardi
  Cc: Al Viro, Andrew Morton, Linux FS Devel, James E.J. Bottomley,
	Christoph Hellwig, Linux SCSI List, Kent Overstreet,
	Linux Kernel Mailing List, Maurizio Lombardi

On Tue, May 27, 2014 at 7:58 PM, Maurizio Lombardi <mlombard@redhat.com> wrote:
> On Tue, May 27, 2014 at 06:15:45PM +0800, Ming Lei wrote:
>>
>> If the page can be merged to last segment, it should have been
>> covered by code in branch of 'if (bio->bi_vcnt > 0) ...', shouldn't it?
>>
>> Or maybe it is better to make that code cover your case since
>> looks your case is similar with that one according to your commit
>> log.
>>
>
> I realized that maybe you mean this branch:
>
> if (bio->bi_vcnt > 1 && (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec)))
>
> right?
> In this case the branch is not even reached in the original code because
> the function returned an error way before executing it (line 760)
>
> There was already a patchset trying to modify the code in a different way than mine:
>
> https://groups.google.com/forum/#!msg/linux.kernel/3IanUpBVhFQ/3Xbg3yLRFp4J
>
> but it has been ignored and in my opinion it takes a more
> complicated approach.

Looks your approach is simpler.

But looks there is one problem if I understand correctly:
__blk_recalc_rq_segments() may not cover the last vector
because bio->bi_iter.bi_size isn't updated until the end of
__bio_add_page().

But it shouldn't have been related with current virtio-blk problem.

Thanks,
-- 
Ming Lei


* Re: [PATCH V3] bio: modify __bio_add_page() to accept pages that don't start a new segment
  2014-05-27 14:04     ` Ming Lei
@ 2014-05-27 14:27       ` Maurizio Lombardi
  0 siblings, 0 replies; 6+ messages in thread
From: Maurizio Lombardi @ 2014-05-27 14:27 UTC (permalink / raw)
  To: Ming Lei, axboe
  Cc: Al Viro, Andrew Morton, Linux FS Devel, James E.J. Bottomley,
	Christoph Hellwig, linux-scsi, Kent Overstreet,
	Linux Kernel Mailing List, Maurizio Lombardi

On Tue, May 27, 2014 at 10:04:58PM +0800, Ming Lei wrote:
> 
> Looks your approach is simpler.
> 
> But looks there is one problem if I understand correctly:
> __blk_recalc_rq_segments() may not cover the last vector
> because bio->bi_iter.bi_size isn't updated until the end of
> __bio_add_page().
> 
> But it shouldn't have been related with current virtio-blk problem.
>

This is a valid point: bi_iter.bi_size influences the behaviour of
blk_recount_segments(). Maybe Jens can confirm your observation.
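
A rough model of the concern, assuming the recount walks the bio with an
iterator that stops once bi_iter.bi_size bytes have been consumed. The
function name and its arguments are hypothetical and this is not kernel
code; it only shows why a vector that is already in bi_io_vec but not yet
accounted in bi_size could be skipped.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return how many vectors a size-bounded walk visits.
 * lens[i] is the length of vector i, vcnt the number of vectors,
 * bi_size the number of bytes the iterator believes the bio holds.
 */
static size_t model_vecs_visited(const unsigned int *lens, size_t vcnt,
				 unsigned int bi_size)
{
	size_t i = 0;

	assert(lens != NULL);
	while (i < vcnt && bi_size > 0) {
		unsigned int step = lens[i] < bi_size ? lens[i] : bi_size;

		bi_size -= step;	/* consume this vector's bytes */
		i++;
	}
	return i;
}
```

With three 4096-byte vectors and bi_size still at 8192, the walk visits only
two of them; the third is seen only after bi_size has been raised to 12288.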

Anyway, it doesn't explain the reason behind the regression
introduced by commit 3979ef4dcf.

Maurizio Lombardi

