From: Neil Brown <neilb@suse.de>
To: "Kai" <epimetreus@fastmail.fm>,
Andrew Morton <akpm@linux-foundation.org>,
stable@suse.de, org@suse.de
Cc: linux-kernel@vger.kernel.org, Jens Axboe <jens.axboe@oracle.com>
Subject: [PATCH] Re: Bio device too big | kernel BUG at mm/filemap.c:537!
Date: Wed, 7 Feb 2007 10:26:56 +1100
Message-ID: <17865.3776.511594.763544@notabene.brown>
In-Reply-To: message from Neil Brown on Tuesday February 6
On Tuesday February 6, neilb@suse.de wrote:
>
> This patch should fix the worst of the offences, but I'd like to
> experiment and think a bit more before I submit it to stable.
> And probably test it too - as yet I have only compile and brain
> tested.
Ok, I've experimented and tested and now I know what was causing the
double-unlock.
The following patch is suitable for 2.6.20.1 and mainline. There is
room for a bit more improvement, but only for performance, not
correctness. I'll look into that later.
Thanks,
NeilBrown
------------------------------------
Fix various bugs with aligned reads in RAID5.
It is possible for raid5 to be sent a bio that is too big
for an underlying device. So if it is a READ that we
pass straight down to a device, it will fail and confuse
RAID5.
So in 'chunk_aligned_read' we check that the bio fits within the
parameters for the target device and if it doesn't fit, fall back
on reading through the stripe cache and making lots of one-page
requests.
Note that this is the earliest time we can check against the device
because earlier we don't have a lock on the device, so it could change
underneath us.
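For illustration only, the fit check described above can be modelled in
userspace roughly as follows. This is a sketch, not the kernel code: the
struct and function names (queue_limits_model, bio_model, bio_fits_limits)
are hypothetical stand-ins for the request queue limits and the bio, and
the segment counts are assumed to have been computed already.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the limits a device's request queue advertises. */
struct queue_limits_model {
	uint32_t max_sectors;        /* largest request, in 512-byte sectors */
	uint32_t max_phys_segments;  /* most physical segments per request */
	uint32_t max_hw_segments;    /* most hardware segments per request */
	bool has_merge_bvec_fn;      /* device applies extra, position-dependent limits */
};

/* Hypothetical stand-in for the few bio fields the check looks at. */
struct bio_model {
	uint32_t size_bytes;
	uint32_t phys_segments;
	uint32_t hw_segments;
};

/* Return true only if the request can be passed straight to the device. */
static bool bio_fits_limits(const struct bio_model *bi,
			    const struct queue_limits_model *q)
{
	/* Size limit: convert bytes to 512-byte sectors before comparing. */
	if ((bi->size_bytes >> 9) > q->max_sectors)
		return false;

	/* Segment limits. */
	if (bi->phys_segments > q->max_phys_segments ||
	    bi->hw_segments > q->max_hw_segments)
		return false;

	/* Position-dependent limits are not evaluated here; be conservative. */
	if (q->has_merge_bvec_fn)
		return false;

	return true;
}
```

A caller that gets false back would drop the direct read and go through
the stripe cache instead, which is what the patch does in
chunk_aligned_read.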
Also, the code for handling a retry through the cache when a read
fails had not been tested and was badly broken. This patch fixes that
code.
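As a rough userspace model of the retry path after this patch: walk the
request one stripe at a time, skip stripes already handled, and if a
stripe cannot be queued, record how far we got so that a later call can
resume from there. Everything named here (retry_state, retry_read,
flaky_submit, STRIPE_SECTORS_MODEL) is hypothetical; stripes_done plays
the role that bi_hw_segments plays in the patch, and marking the request
complete by setting stripes_done at the end is a simplification of this
sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define STRIPE_SECTORS_MODEL 8UL  /* assumed stripe unit, in 512-byte sectors */

struct retry_state {
	unsigned long first_sector;
	unsigned long nr_sectors;
	int stripes_done;  /* count of stripes already processed */
};

/* Callback that tries to queue one stripe; returns false if it cannot. */
typedef bool (*submit_fn)(unsigned long sector, void *ctx);

/* Returns the number of stripes handled by this call. */
static int retry_read(struct retry_state *st, submit_fn submit, void *ctx)
{
	unsigned long sector = st->first_sector;
	unsigned long last = st->first_sector + st->nr_sectors;
	int scnt = 0, handled = 0;

	/* The sector and the stripe counter advance together on every pass. */
	for (; sector < last; sector += STRIPE_SECTORS_MODEL, scnt++) {
		if (scnt < st->stripes_done)
			continue;  /* already done this stripe */

		if (!submit(sector, ctx)) {
			/* Could not queue: remember progress, retry later. */
			st->stripes_done = scnt;
			return handled;
		}
		handled++;
	}
	st->stripes_done = scnt;  /* sketch only: mark everything done */
	return handled;
}

/* Example callback: fails exactly once, at a chosen sector. */
struct flaky_ctx {
	unsigned long fail_sector;
	bool failed_once;
	int calls;
};

static bool flaky_submit(unsigned long sector, void *ctx)
{
	struct flaky_ctx *c = ctx;

	c->calls++;
	if (sector == c->fail_sector && !c->failed_once) {
		c->failed_once = true;
		return false;
	}
	return true;
}
```

Calling retry_read a second time with the same retry_state picks up at
the stripe that failed, without resubmitting the earlier ones.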
Signed-off-by: Neil Brown <neilb@suse.de>
### Diffstat output
./drivers/md/raid5.c | 42 +++++++++++++++++++++++++++++++++++++++---
1 file changed, 39 insertions(+), 3 deletions(-)
diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c 2007-02-02 14:17:55.000000000 +1100
+++ ./drivers/md/raid5.c 2007-02-06 19:19:01.000000000 +1100
@@ -2570,7 +2570,7 @@ static struct bio *remove_bio_from_retry
}
bi = conf->retry_read_aligned_list;
if(bi) {
- conf->retry_read_aligned = bi->bi_next;
+ conf->retry_read_aligned_list = bi->bi_next;
bi->bi_next = NULL;
bi->bi_phys_segments = 1; /* biased count of active stripes */
bi->bi_hw_segments = 0; /* count of processed stripes */
@@ -2619,6 +2619,27 @@ static int raid5_align_endio(struct bio
return 0;
}
+static int bio_fits_rdev(struct bio *bi)
+{
+ request_queue_t *q = bdev_get_queue(bi->bi_bdev);
+
+ if ((bi->bi_size>>9) > q->max_sectors)
+ return 0;
+ blk_recount_segments(q, bi);
+ if (bi->bi_phys_segments > q->max_phys_segments ||
+ bi->bi_hw_segments > q->max_hw_segments)
+ return 0;
+
+ if (q->merge_bvec_fn)
+ /* it's too hard to apply the merge_bvec_fn at this stage,
+ * just give up
+ */
+ return 0;
+
+ return 1;
+}
+
+
static int chunk_aligned_read(request_queue_t *q, struct bio * raid_bio)
{
mddev_t *mddev = q->queuedata;
@@ -2665,6 +2686,13 @@ static int chunk_aligned_read(request_qu
align_bi->bi_flags &= ~(1 << BIO_SEG_VALID);
align_bi->bi_sector += rdev->data_offset;
+ if (!bio_fits_rdev(align_bi)) {
+ /* too big in some way */
+ bio_put(align_bi);
+ rdev_dec_pending(rdev, mddev);
+ return 0;
+ }
+
spin_lock_irq(&conf->device_lock);
wait_event_lock_irq(conf->wait_for_stripe,
conf->quiesce == 0,
@@ -3055,7 +3083,9 @@ static int retry_aligned_read(raid5_con
last_sector = raid_bio->bi_sector + (raid_bio->bi_size>>9);
for (; logical_sector < last_sector;
- logical_sector += STRIPE_SECTORS, scnt++) {
+ logical_sector += STRIPE_SECTORS,
+ sector += STRIPE_SECTORS,
+ scnt++) {
if (scnt < raid_bio->bi_hw_segments)
/* already done this stripe */
@@ -3071,7 +3101,13 @@ static int retry_aligned_read(raid5_con
}
set_bit(R5_ReadError, &sh->dev[dd_idx].flags);
- add_stripe_bio(sh, raid_bio, dd_idx, 0);
+ if (!add_stripe_bio(sh, raid_bio, dd_idx, 0)) {
+ release_stripe(sh);
+ raid_bio->bi_hw_segments = scnt;
+ conf->retry_read_aligned = raid_bio;
+ return handled;
+ }
+
handle_stripe(sh, NULL);
release_stripe(sh);
handled++;