* [bug?] raid1 integrity checking is broken on 2.6.18-rc4
@ 2006-08-12 6:49 Chuck Ebbert
2006-08-12 9:13 ` Justin Piszcz
2006-08-14 6:14 ` Neil Brown
0 siblings, 2 replies; 7+ messages in thread
From: Chuck Ebbert @ 2006-08-12 6:49 UTC (permalink / raw)
To: linux-raid; +Cc: Neil Brown, linux-kernel
Doing this on a raid1 array:
echo "check" >/sys/block/md0/md/sync_action
On 2.6.16.27:
Activity lights on both mirrors show activity for a while,
then the array status prints on the console.
On 2.6.18-rc4 + the below patch:
Drive activity light blinks once on one drive, then the
array status prints (obviously no checking takes place.)
Applied hotfix on 2.6.18-rc4:
--- .prev/drivers/md/md.c 2006-08-08 09:00:44.000000000 +1000
+++ ./drivers/md/md.c 2006-08-08 09:04:04.000000000 +1000
@@ -1597,6 +1597,19 @@ void md_update_sb(mddev_t * mddev)
repeat:
spin_lock_irq(&mddev->write_lock);
+
+ if (mddev->degraded && mddev->sb_dirty == 3)
+ /* If the array is degraded, then skipping spares is both
+ * dangerous and fairly pointless.
+ * Dangerous because a device that was removed from the array
+ * might have a event_count that still looks up-to-date,
+ * so it can be re-added without a resync.
+ * Pointless because if there are any spares to skip,
+ * then a recovery will happen and soon that array won't
+ * be degraded any more and the spare can go back to sleep then.
+ */
+ mddev->sb_dirty = 1;
+
sync_req = mddev->in_sync;
mddev->utime = get_seconds();
if (mddev->sb_dirty == 3)
--
Chuck
* Re: [bug?] raid1 integrity checking is broken on 2.6.18-rc4
From: Justin Piszcz @ 2006-08-12 9:13 UTC (permalink / raw)
To: Chuck Ebbert; +Cc: linux-raid, Neil Brown, linux-kernel
On Sat, 12 Aug 2006, Chuck Ebbert wrote:
> Doing this on a raid1 array:
> echo "check" >/sys/block/md0/md/sync_action
>
> On 2.6.16.27:
> Activity lights on both mirrors show activity for a while,
> then the array status prints on the console.
>
> On 2.6.18-rc4 + the below patch:
> Drive activity light blinks once on one drive, then the
> array status prints (obviously no checking takes place.)
>
>
> Applied hotfix on 2.6.18-rc4:
>
> --- .prev/drivers/md/md.c 2006-08-08 09:00:44.000000000 +1000
> +++ ./drivers/md/md.c 2006-08-08 09:04:04.000000000 +1000
> @@ -1597,6 +1597,19 @@ void md_update_sb(mddev_t * mddev)
>
> repeat:
> spin_lock_irq(&mddev->write_lock);
> +
> + if (mddev->degraded && mddev->sb_dirty == 3)
> + /* If the array is degraded, then skipping spares is both
> + * dangerous and fairly pointless.
> + * Dangerous because a device that was removed from the array
> + * might have a event_count that still looks up-to-date,
> + * so it can be re-added without a resync.
> + * Pointless because if there are any spares to skip,
> + * then a recovery will happen and soon that array won't
> + * be degraded any more and the spare can go back to sleep then.
> + */
> + mddev->sb_dirty = 1;
> +
> sync_req = mddev->in_sync;
> mddev->utime = get_seconds();
> if (mddev->sb_dirty == 3)
> --
> Chuck
Is there a doc for all of the options you can echo into sync_action?
I'm assuming mdadm does these as well and echo is just another way to
work with the array?
* Re: [bug?] raid1 integrity checking is broken on 2.6.18-rc4
From: Michael Tokarev @ 2006-08-12 11:59 UTC (permalink / raw)
To: Justin Piszcz; +Cc: Chuck Ebbert, linux-raid, Neil Brown, linux-kernel
Justin Piszcz wrote:
> Is there a doc for all of the options you can echo into sync_action?
> I'm assuming mdadm does these as well and echo is just another way to
> work with the array?
How about the obvious, Documentation/md.txt ?
And no, mdadm does not perform or trigger data integrity checking.
/mjt
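For readers who want to drive sync_action from a program rather than from echo, here is a minimal sketch in C. It assumes nothing beyond ordinary file I/O on the sysfs attribute; the helper names attr_write and attr_read are hypothetical and are not part of mdadm or the kernel. The accepted words (such as "check" and "repair") are the ones listed in Documentation/md.txt for your kernel version.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a short string to a sysfs-style attribute file.
 * Returns 0 on success, -1 on failure. (Hypothetical helper.) */
static int attr_write(const char *path, const char *value)
{
    assert(path != NULL && value != NULL);
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = fputs(value, f) >= 0;
    if (fclose(f) != 0)     /* the write may only fail when flushed */
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the first line of an attribute into buf and strip the newline.
 * Returns 0 on success, -1 on failure. (Hypothetical helper.) */
static int attr_read(const char *path, char *buf, size_t len)
{
    assert(path != NULL && buf != NULL && len > 0);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *got = fgets(buf, (int)len, f);
    fclose(f);
    if (got == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

As root, attr_write("/sys/block/md0/md/sync_action", "check") would be the equivalent of the echo command in the original report, and attr_read on the same path returns the action currently in progress.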
* Re: [bug?] raid1 integrity checking is broken on 2.6.18-rc4
From: Neil Brown @ 2006-08-14 6:14 UTC (permalink / raw)
To: Chuck Ebbert; +Cc: linux-raid, linux-kernel
On Saturday August 12, 76306.1226@compuserve.com wrote:
> Doing this on a raid1 array:
> echo "check" >/sys/block/md0/md/sync_action
>
> On 2.6.16.27:
> Activity lights on both mirrors show activity for a while,
> then the array status prints on the console.
>
> On 2.6.18-rc4 + the below patch:
> Drive activity light blinks once on one drive, then the
> array status prints (obviously no checking takes place.)
>
Thanks for the report.
Easily duplicated, easily fixed.
I'll make sure this patch gets into 2.6.18.
Thanks again,
NeilBrown
Signed-off-by: Neil Brown <neilb@suse.de>
diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c 2006-07-31 17:24:36.000000000 +1000
+++ ./drivers/md/raid1.c 2006-08-14 15:52:48.000000000 +1000
@@ -1644,15 +1644,16 @@ static sector_t sync_request(mddev_t *md
return 0;
}
- /* before building a request, check if we can skip these blocks..
- * This call the bitmap_start_sync doesn't actually record anything
- */
if (mddev->bitmap == NULL &&
mddev->recovery_cp == MaxSector &&
+ !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) &&
conf->fullsync == 0) {
*skipped = 1;
return max_sector - sector_nr;
}
+ /* before building a request, check if we can skip these blocks..
+ * This call the bitmap_start_sync doesn't actually record anything
+ */
if (!bitmap_start_sync(mddev->bitmap, sector_nr, &sync_blocks, 1) &&
!conf->fullsync && !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
/* We can skip this block, and probably several more */
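The effect of the one-line change can be seen in a small user-space model. This is only a sketch: struct sync_state and both function names are hypothetical stand-ins for the fields the patch tests, and it assumes that writing "check" to sync_action sets MD_RECOVERY_REQUESTED, as the patch implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SECTOR UINT64_MAX   /* stand-in for the kernel's MaxSector */

/* Hypothetical, simplified view of the state sync_request() consults. */
struct sync_state {
    bool     has_bitmap;    /* mddev->bitmap != NULL */
    uint64_t recovery_cp;   /* MAX_SECTOR: array is believed to be in sync */
    bool     requested;     /* MD_RECOVERY_REQUESTED is set */
    bool     fullsync;      /* conf->fullsync */
};

/* Before the patch: a clean array without a bitmap is skipped entirely,
 * even when the user explicitly asked for a pass over it. */
static bool skip_all_before(const struct sync_state *s)
{
    assert(s != NULL);
    return !s->has_bitmap && s->recovery_cp == MAX_SECTOR && !s->fullsync;
}

/* After the patch: a user-requested pass is never skipped. */
static bool skip_all_after(const struct sync_state *s)
{
    assert(s != NULL);
    return !s->has_bitmap && s->recovery_cp == MAX_SECTOR &&
           !s->requested && !s->fullsync;
}
```

For a clean array with no bitmap and requested set, skip_all_before() returns true, which matches the single blink of the activity light in the report, while skip_all_after() returns false so the whole array is read.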
* Re: [bug?] raid1 integrity checking is broken on 2.6.18-rc4
From: Chuck Ebbert @ 2006-08-17 20:03 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-kernel, linux-raid
In-Reply-To: <17632.5294.559058.66914@cse.unsw.edu.au>
On Mon, 14 Aug 2006 16:14:06 +1000, Neil Brown wrote:
> > On 2.6.18-rc4 + the below patch:
> > Drive activity light blinks once on one drive, then the
> > array status prints (obviously no checking takes place.)
> >
>
> Thanks for the report.
> Easily duplicated, easily fixed.
> I'll make sure this patch gets into 2.6.18.
>
> Thanks again,
> NeilBrown
>
I just tried the patch and now it seems to be syncing the drives instead
of only checking them? (At the very least the message is misleading.)
# echo "check" >/sys/block/md0/md/sync_action
# dmesg | tail -9
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 104256 blocks.
md: md0: sync done.
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:hda9
disk 1, wo:0, o:1, dev:sda5
> Signed-off-by: Neil Brown <neilb@suse.de>
>
> diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
> --- .prev/drivers/md/raid1.c 2006-07-31 17:24:36.000000000 +1000
> +++ ./drivers/md/raid1.c 2006-08-14 15:52:48.000000000 +1000
> @@ -1644,15 +1644,16 @@ static sector_t sync_request(mddev_t *md
> return 0;
> }
>
> - /* before building a request, check if we can skip these blocks..
> - * This call the bitmap_start_sync doesn't actually record anything
> - */
> if (mddev->bitmap == NULL &&
> mddev->recovery_cp == MaxSector &&
> + !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) &&
> conf->fullsync == 0) {
> *skipped = 1;
> return max_sector - sector_nr;
> }
> + /* before building a request, check if we can skip these blocks..
> + * This call the bitmap_start_sync doesn't actually record anything
> + */
> if (!bitmap_start_sync(mddev->bitmap, sector_nr, &sync_blocks, 1) &&
> !conf->fullsync && !test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
> /* We can skip this block, and probably several more */
--
Chuck
* Re: [bug?] raid1 integrity checking is broken on 2.6.18-rc4
From: raid @ 2006-08-18 16:38 UTC (permalink / raw)
To: linux-raid
Neil introduced read-checking into 2.6.16. In versions prior, mirror copies were overwritten instead of checked.
I'm running 2.6.17-rc4:
# echo "check" > /sys/block/md0/md/sync_action
# dmesg
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 15542784 blocks.
# dstat -d -D sda,sdg
--disk/sda----disk/sdg-
_read write:_read write
81k 30k: 81k 30k
58M 0 : 58M 0
58M 0 : 57M 0
57M 0 : 58M 0
58M 0 : 57M 0
Although the message uses the word "reconstruction," the drives are being checked for consistency: dstat shows both disks being read at about 58M per interval with no writes.
-----------------
I just tried the patch and now it seems to be syncing the drives instead
of only checking them? (At the very least the message is misleading.)
# echo "check" >/sys/block/md0/md/sync_action
# dmesg | tail -9
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
md: using 128k window, over a total of 104256 blocks.
md: md0: sync done.
RAID1 conf printout:
--- wd:2 rd:2
disk 0, wo:0, o:1, dev:hda9
disk 1, wo:0, o:1, dev:sda5
* Re: [bug?] raid1 integrity checking is broken on 2.6.18-rc4
From: Neil Brown @ 2006-08-28 3:47 UTC (permalink / raw)
To: Chuck Ebbert; +Cc: linux-kernel, linux-raid
On Thursday August 17, 76306.1226@compuserve.com wrote:
>
> I just tried the patch and now it seems to be syncing the drives instead
> of only checking them? (At the very least the message is misleading.)
>
Yes, the message is misleading. I should fix that.
NeilBrown
> # echo "check" >/sys/block/md0/md/sync_action
> # dmesg | tail -9
> md: syncing RAID array md0
> md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
> md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
> md: using 128k window, over a total of 104256 blocks.
> md: md0: sync done.
> RAID1 conf printout:
> --- wd:2 rd:2
> disk 0, wo:0, o:1, dev:hda9
> disk 1, wo:0, o:1, dev:sda5
>
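One way the log line could be made to reflect the actual operation is to pick the wording from the recovery flags. The following is a hypothetical user-space sketch, not the fix that was applied: struct sync_flags is modeled loosely on the MD_RECOVERY_* bits, and the label strings are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags, loosely modeled on the MD_RECOVERY_* bits. */
struct sync_flags {
    bool sync;       /* resync-type pass, not recovery onto a spare */
    bool requested;  /* started by a write to sync_action */
    bool check;      /* read and compare only, do not rewrite */
};

/* Choose a label for the log message that names what is really running. */
static const char *sync_desc(const struct sync_flags *f)
{
    assert(f != NULL);
    if (!f->sync)
        return "recovery";
    if (f->check)
        return "data-check";
    if (f->requested)
        return "requested-resync";
    return "resync";
}
```

With a label like this in the message, a "check" would no longer be reported with the same "syncing RAID array" wording as a real resync.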