From: Paul Clements <paul.clements@steeleye.com>
To: linux-raid@vger.kernel.org, Neil Brown <neilb@suse.de>,
kernel list <linux-kernel@vger.kernel.org>
Cc: ian.campbell@citrix.com
Subject: [BUG] raid1 behind writes alter bio structure illegally
Date: Wed, 29 Jul 2009 12:14:48 -0400
Message-ID: <4A707578.3010901@steeleye.com>
[-- Attachment #1: Type: text/plain, Size: 2037 bytes --]
I've run into this bug on a 2.6.18 kernel, but I think the fix is still
applicable to the latest kernels (even though the symptoms would be
slightly different).
Perhaps someone who knows the block and/or SCSI layers well can comment
on the legality of attaching new pages to a bio without fixing up the
internal bio counters (details below)?
Thanks,
Paul
Environment:
-----------
Citrix XenServer 5.5 (2.6.18 Red Hat-derived kernel)
LVM over raid1 over SCSI/nbd
Description:
-----------
The problem is due to the behind-write code in raid1. It turns out the
code is doing something a little non-kosher with the bios and the pages
associated with them. This causes (at least) the SCSI layer to get upset
and fail the write requests.
Basically, when we do behind writes in raid1, we have to make a copy of
the original data that is being written, since we're going to complete
the request back up to user level before all the devices are finished
writing the data (e.g., the SCSI disk completes the write and raid1 then
completes the write back to user level, while nbd is still sending data
across the network).
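To make the copy step concrete, here is a minimal userspace sketch, not the raid1 code itself. The names toy_vec, toy_alloc_behind_pages and toy_free_behind_pages are hypothetical, and malloc stands in for the kernel's page allocation. It only illustrates why the copy exists: once the data has been duplicated, the caller's buffer can be reused while the slow device still reads from the copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096

/* Hypothetical stand-in for one bio_vec entry: a buffer of up to one page. */
struct toy_vec {
    unsigned char *page;
    size_t len;
};

/*
 * Copy the payload of every vec into freshly allocated pages.
 * Returns an array of n page pointers, or NULL on allocation failure.
 */
unsigned char **toy_alloc_behind_pages(const struct toy_vec *vecs, size_t n)
{
    unsigned char **pages = calloc(n, sizeof(*pages));

    if (!pages)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        assert(vecs[i].len <= TOY_PAGE_SIZE);
        pages[i] = malloc(TOY_PAGE_SIZE);
        if (!pages[i]) {
            /* Undo the allocations made so far. */
            while (i--)
                free(pages[i]);
            free(pages);
            return NULL;
        }
        memcpy(pages[i], vecs[i].page, vecs[i].len);
    }
    return pages;
}

/* Release the copies once the slow (write-mostly) device has finished. */
void toy_free_behind_pages(unsigned char **pages, size_t n)
{
    for (size_t i = 0; i < n; i++)
        free(pages[i]);
    free(pages);
}
```

Note that every copy lives at a different address than the buffer it was copied from, which is what matters for the rest of this report.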
The problem is actually a pretty simple one -- these copied pages
(behind_pages in raid1 code) are allocated at different memory addresses
than the original ones (obviously). This can cause the internal segment
counts (nr_phys_segments) that were calculated in the bio when it was
originally created (or cloned) to be invalid. Specifically, the SCSI
layer notices the values are invalid when it tries to build its
scatter-gather list. The error:
Incorrect number of segments after building list
counted 94, received 64
req nr_sec 992, cur_nr_sec 8
appears in the kernel logs when this happens. (This exact message is no
longer present in the kernel, but SCSI still appears to be building its
scatter-gather list in a similar fashion.)
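The dependence of the segment count on page addresses can be shown with a simplified model. This is a sketch under stated assumptions, not the kernel's algorithm: toy_seg and toy_count_phys_segments are hypothetical names, and the only merge rule modelled is that two adjacent entries share a segment when the second starts exactly where the first ends. The real counting code also takes queue limits into account, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bio_vec: physical address plus length. */
struct toy_seg {
    uint64_t phys;  /* physical address of the first byte */
    uint32_t len;   /* length in bytes */
};

/*
 * Count physical segments: an entry starts a new segment unless it begins
 * exactly where the previous entry ended.
 */
unsigned int toy_count_phys_segments(const struct toy_seg *v, size_t n)
{
    unsigned int segs = 0;

    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i == 0 || v[i - 1].phys + v[i - 1].len != v[i].phys)
            segs++;
    }
    return segs;
}
```

With three 4096-byte entries at 0x10000, 0x11000 and 0x12000 this model counts one segment. If the same data is copied to pages at 0x40000, 0x90000 and 0x23000, it counts three. A count cached before the copy is therefore too low afterwards, which matches the direction of the mismatch in the error message above.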
Solution:
--------
The patch adds a call to blk_recount_segments to fix up the bio
structure to account for the new page addresses that have
been attached to the bio.
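The shape of the fix can be sketched in userspace as follows. This is an illustration only: toy_bio, toy_recount_segments and toy_attach_pages are hypothetical names, the struct is not the kernel's struct bio, and the counting rule is the same simplified contiguity check as above rather than what blk_recount_segments actually does. The point is just that the cached count is recomputed in the same step that swaps the page addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_VECS 8

/* Hypothetical toy bio: per-entry addresses plus a cached segment count. */
struct toy_bio {
    uint64_t phys[TOY_MAX_VECS];
    uint32_t len[TOY_MAX_VECS];
    size_t vcnt;
    unsigned int nr_phys_segments;  /* cached; only meaningful if seg_valid */
    int seg_valid;
};

/* Recompute the cached count from the current addresses. */
void toy_recount_segments(struct toy_bio *bio)
{
    unsigned int segs = 0;

    assert(bio->vcnt <= TOY_MAX_VECS);
    for (size_t i = 0; i < bio->vcnt; i++) {
        if (i == 0 || bio->phys[i - 1] + bio->len[i - 1] != bio->phys[i])
            segs++;
    }
    bio->nr_phys_segments = segs;
    bio->seg_valid = 1;
}

/* Swap in the copied pages, then fix up the cached count right away. */
void toy_attach_pages(struct toy_bio *bio, const uint64_t *new_phys)
{
    for (size_t i = 0; i < bio->vcnt; i++)
        bio->phys[i] = new_phys[i];
    toy_recount_segments(bio);
}
```

Without the recount at the end of toy_attach_pages, the toy bio would keep advertising the count computed for the original pages, which is the stale state described in this report.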
[-- Attachment #2: xen-5.5-raid1-blk_recount_segments_fix.diff --]
[-- Type: text/x-diff, Size: 1175 bytes --]
diff -purN --exclude-from=/export/public/clemep/tmp/dontdiff linux-orig/block/ll_rw_blk.c linux-2.6.18-128.1.6.el5.xs5.5.0.496.1012xen/block/ll_rw_blk.c
--- linux-orig/block/ll_rw_blk.c 2009-05-29 07:29:54.000000000 -0400
+++ linux-2.6.18-128.1.6.el5.xs5.5.0.496.1012xen/block/ll_rw_blk.c 2009-07-28 13:36:19.000000000 -0400
@@ -1374,6 +1374,7 @@ new_hw_segment:
 	bio->bi_flags |= (1 << BIO_SEG_VALID);
 }
+EXPORT_SYMBOL(blk_recount_segments);
 static int blk_phys_contig_segment(request_queue_t *q, struct bio *bio,
 				   struct bio *nxt)
diff -purN --exclude-from=/export/public/clemep/tmp/dontdiff linux-orig/drivers/md/raid1.c linux-2.6.18-128.1.6.el5.xs5.5.0.496.1012xen/drivers/md/raid1.c
--- linux-orig/drivers/md/raid1.c 2009-05-29 07:29:54.000000000 -0400
+++ linux-2.6.18-128.1.6.el5.xs5.5.0.496.1012xen/drivers/md/raid1.c 2009-07-28 13:35:36.000000000 -0400
@@ -900,6 +900,7 @@ static int make_request(request_queue_t
 			 */
 			__bio_for_each_segment(bvec, mbio, j, 0)
 				bvec->bv_page = behind_pages[j];
+			blk_recount_segments(q, mbio);
 			if (test_bit(WriteMostly, &conf->mirrors[i].rdev->flags))
 				atomic_inc(&r1_bio->behind_remaining);
 		}