* [PATCH 001 of 2] md: Fix two raid10 bugs.
2007-06-12 1:09 [PATCH 000 of 2] md: Introduction - bugfixes for md/raid{1,10} NeilBrown
@ 2007-06-12 1:09 ` NeilBrown
2007-06-12 1:09 ` [PATCH 002 of 2] md: Fix bug in error handling during raid1 repair NeilBrown
2007-06-12 17:04 ` [PATCH 000 of 2] md: Introduction - bugfixes for md/raid{1,10} Bill Davidsen
2 siblings, 0 replies; 4+ messages in thread
From: NeilBrown @ 2007-06-12 1:09 UTC
To: Andrew Morton; +Cc: linux-raid, linux-kernel, Neil Brown, stable
1/ When resyncing a degraded raid10 which has more than 2 copies of each block,
garbage can get synced on top of good data.
2/ We round the wrong way in part of the device size calculation, which
can cause confusion.
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: stable@kernel.org
### Diffstat output
./drivers/md/raid10.c | 6 ++++++
1 file changed, 6 insertions(+)
diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
--- .prev/drivers/md/raid10.c 2007-06-12 10:19:04.000000000 +1000
+++ ./drivers/md/raid10.c 2007-06-12 10:20:31.000000000 +1000
@@ -1866,6 +1866,7 @@ static sector_t sync_request(mddev_t *md
int d = r10_bio->devs[i].devnum;
bio = r10_bio->devs[i].bio;
bio->bi_end_io = NULL;
+ clear_bit(BIO_UPTODATE, &bio->bi_flags);
if (conf->mirrors[d].rdev == NULL ||
test_bit(Faulty, &conf->mirrors[d].rdev->flags))
continue;
@@ -2036,6 +2037,11 @@ static int run(mddev_t *mddev)
/* 'size' is now the number of chunks in the array */
/* calculate "used chunks per device" in 'stride' */
stride = size * conf->copies;
+
+ /* We need to round up when dividing by raid_disks to
+ * get the stride size.
+ */
+ stride += conf->raid_disks - 1;
sector_div(stride, conf->raid_disks);
mddev->size = stride << (conf->chunk_shift-1);
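For readers following the second fix, here is a minimal userspace sketch of the rounding difference. It is not the kernel code: the helper names are hypothetical, and plain 64-bit division stands in for sector_div(). It only illustrates why adding raid_disks - 1 before dividing rounds the per-device chunk count up instead of down.

```c
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; the kernel computes this
 * inline in run() using sector_div().
 */

/* Chunks used per device when the division truncates (pre-patch behaviour). */
static uint64_t stride_round_down(uint64_t chunks, unsigned copies,
                                  unsigned raid_disks)
{
    return (chunks * copies) / raid_disks;
}

/* Chunks used per device when the division rounds up (post-patch behaviour). */
static uint64_t stride_round_up(uint64_t chunks, unsigned copies,
                                unsigned raid_disks)
{
    return (chunks * copies + raid_disks - 1) / raid_disks;
}
```

With 10 chunks, 2 copies and 3 devices the first helper gives 6 and the second gives 7. When the product divides evenly (9 chunks, 2 copies, 3 devices) both give 6, so the added term only matters when there is a remainder.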
* [PATCH 002 of 2] md: Fix bug in error handling during raid1 repair.
From: NeilBrown @ 2007-06-12 1:09 UTC
To: Andrew Morton; +Cc: linux-raid, linux-kernel, Mike Accetta, Neil Brown, stable
From: Mike Accetta <maccetta@laurelnetworks.com>
If raid1/repair (which reads all blocks and fixes any differences
it finds) hits a read error, it doesn't reset the bio for writing
before writing correct data back, so the read error isn't fixed,
and the device probably gets a zero-length write which it might
complain about.
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: stable@kernel.org
### Diffstat output
./drivers/md/raid1.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c 2007-06-12 10:48:57.000000000 +1000
+++ ./drivers/md/raid1.c 2007-06-12 10:49:05.000000000 +1000
@@ -1240,17 +1240,24 @@ static void sync_request_write(mddev_t *
}
r1_bio->read_disk = primary;
for (i=0; i<mddev->raid_disks; i++)
- if (r1_bio->bios[i]->bi_end_io == end_sync_read &&
- test_bit(BIO_UPTODATE, &r1_bio->bios[i]->bi_flags)) {
+ if (r1_bio->bios[i]->bi_end_io == end_sync_read) {
int j;
int vcnt = r1_bio->sectors >> (PAGE_SHIFT- 9);
struct bio *pbio = r1_bio->bios[primary];
struct bio *sbio = r1_bio->bios[i];
- for (j = vcnt; j-- ; )
- if (memcmp(page_address(pbio->bi_io_vec[j].bv_page),
- page_address(sbio->bi_io_vec[j].bv_page),
- PAGE_SIZE))
- break;
+
+ if (test_bit(BIO_UPTODATE, &sbio->bi_flags)) {
+ for (j = vcnt; j-- ; ) {
+ struct page *p, *s;
+ p = pbio->bi_io_vec[j].bv_page;
+ s = sbio->bi_io_vec[j].bv_page;
+ if (memcmp(page_address(p),
+ page_address(s),
+ PAGE_SIZE))
+ break;
+ }
+ } else
+ j = 0;
if (j >= 0)
mddev->resync_mismatches += r1_bio->sectors;
if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
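The control flow in the hunk above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: last_mismatch() and SKETCH_PAGE_SIZE are hypothetical names, flat byte buffers stand in for the bio page vectors, and a bool stands in for the BIO_UPTODATE test. The point it shows is that a failed read now yields j = 0, so the caller takes the same "mismatch" path as a real data difference and the block is rewritten.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed page size for the sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Hypothetical helper. Compares vcnt pages of two buffers, scanning from
 * the last page down to the first.
 * Returns -1 when every page matches, otherwise the index of the page
 * where the scan stopped. If the secondary read failed, returns 0 without
 * comparing, so the caller treats the block as mismatched.
 */
static int last_mismatch(const unsigned char *primary,
                         const unsigned char *secondary,
                         int vcnt, bool secondary_uptodate)
{
    int j;

    if (!secondary_uptodate)
        return 0;

    for (j = vcnt; j--; ) {
        const unsigned char *p = primary + (size_t)j * SKETCH_PAGE_SIZE;
        const unsigned char *s = secondary + (size_t)j * SKETCH_PAGE_SIZE;

        if (memcmp(p, s, SKETCH_PAGE_SIZE))
            break;
    }
    return j;
}
```

A caller would then follow the two tests in the patch: a result >= 0 counts the sectors as mismatched, and a result < 0 (or a check-only run) skips the write-back.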
* Re: [PATCH 000 of 2] md: Introduction - bugfixes for md/raid{1,10}
From: Bill Davidsen @ 2007-06-12 17:04 UTC
To: NeilBrown; +Cc: Andrew Morton, linux-raid, linux-kernel, Mike Accetta, stable
NeilBrown wrote:
> Following are a couple of bugfixes for raid10 and raid1. They only
> affect fairly uncommon configurations (more than 2 mirrors) and can
> cause data corruption. They are suitable for 2.6.22 and 21-stable.
>
> Thanks,
> NeilBrown
>
>
> [PATCH 001 of 2] md: Fix two raid10 bugs.
> [PATCH 002 of 2] md: Fix bug in error handling during raid1 repair.
I don't know about uncommon, given that I have six machines in this
building with three way RAID-1 for the boot partition, to be sure I can
get off the ground enough to get the other partitions up.
And since you added "write-mostly" for remote mirrors I do have a few
systems doing >2 mirrors as well. This set of patches definitely will be
in my kernel by this afternoon.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979