From: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
To: device-mapper development <dm-devel@redhat.com>
Cc: akpm@osdl.org, linux-kernel@vger.kernel.org
Subject: [PATCH] drivers/md/dm-raid1.c: Fix inconsistent mirroring after interrupted recovery
Date: Tue, 10 Jan 2006 17:20:03 -0500
Message-ID: <43C43313.3000402@ce.jp.nec.com>
[-- Attachment #1: Type: text/plain, Size: 1314 bytes --]
Hi,
dm-mirror has a potential data corruption problem:
the on-disk log shows that all disk contents are in sync
while the actual contents of the disks are not synchronized.
This problem occurs if the initial recovery (syncing) is
interrupted and resumed.
The attached patch fixes this problem.
Please consider applying it.
Background:
rh_dec() changes the region state from RH_NOSYNC (out-of-sync)
to RH_CLEAN (in-sync), which results in the corresponding bit
of clean_bits being set.
This is harmful if an on-disk log is used and the map is
removed/suspended before the initial sync is completed.
clean_bits is written to the on-disk log at map removal
and, upon resume, it is read back and copied to sync_bits.
Since the recovery process refers to sync_bits to find
a region to be recovered, a region whose state was changed
from RH_NOSYNC to RH_CLEAN is never recovered.
If you haven't applied the dm-raid1-read-balancing.patch proposed
on dm-devel some time ago, the contents of the mirrored disk
are silently corrupted.
If you have, a balanced read may get bogus data from out-of-sync
disks.
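To illustrate the sequence described above, here is a toy model of the log round trip. It is only a sketch, not the kernel code: struct toy_log, the disk_bits array and all toy_* functions are made-up names, and the bitmaps are plain bool arrays rather than the real log structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NR_REGIONS 4

/* Hypothetical, simplified model of the log bitmaps; not the kernel structures. */
struct toy_log {
	bool clean_bits[TOY_NR_REGIONS]; /* in-memory bits, flushed at map removal */
	bool disk_bits[TOY_NR_REGIONS];  /* contents of the on-disk log */
	bool sync_bits[TOY_NR_REGIONS];  /* bits the recovery process consults */
};

/*
 * Pre-patch effect of rh_dec() when the last pending I/O on a region
 * completes: the region is marked clean even if it was RH_NOSYNC.
 */
static void toy_io_done_prepatch(struct toy_log *log, unsigned int region)
{
	assert(region < TOY_NR_REGIONS);
	log->clean_bits[region] = true;
}

/* Map removal/suspend: clean_bits is written to the on-disk log. */
static void toy_suspend(struct toy_log *log)
{
	memcpy(log->disk_bits, log->clean_bits, sizeof(log->disk_bits));
}

/* Resume: the on-disk bits are read back and copied to sync_bits. */
static void toy_resume(struct toy_log *log)
{
	memcpy(log->clean_bits, log->disk_bits, sizeof(log->clean_bits));
	memcpy(log->sync_bits, log->clean_bits, sizeof(log->sync_bits));
}

/* Recovery picks the first region whose sync bit is clear, or -1 if none. */
static int toy_next_recovery_region(const struct toy_log *log)
{
	for (int i = 0; i < TOY_NR_REGIONS; i++)
		if (!log->sync_bits[i])
			return i;
	return -1;
}
```

In this model, if regions 0 and 1 have been recovered, a write to the still out-of-sync region 2 completes, and the map is then suspended and resumed, toy_next_recovery_region() moves on to region 3 and region 2 is never picked.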
The patch keeps the RH_NOSYNC state unchanged.
The region will be changed to RH_RECOVERING when recovery starts
and is reclaimed when the recovery completes.
So the patch doesn't leak the region hash entry.
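The change in behaviour can be summarised as a small state function. This is a sketch for illustration only: the state names follow the patch, but the toy_ prefix, the enum values and the "patched" flag are assumptions made here, and list handling is reduced to comments.

```c
#include <assert.h>
#include <stdbool.h>

/* State names follow the patch; the numeric values here are arbitrary. */
enum toy_rh_state {
	TOY_RH_CLEAN,
	TOY_RH_DIRTY,
	TOY_RH_NOSYNC,
	TOY_RH_RECOVERING
};

/*
 * State of a region after its last pending I/O completes.
 * patched == false models the old "else" branch,
 * patched == true models the new "else if (state == RH_DIRTY)" branch.
 */
static enum toy_rh_state toy_state_after_last_io(enum toy_rh_state state,
						 bool patched)
{
	assert(state <= TOY_RH_RECOVERING);

	if (state == TOY_RH_RECOVERING)
		return state; /* the real code queues it on quiesced_regions */

	if (patched)
		/* only RH_DIRTY becomes RH_CLEAN; RH_NOSYNC is left alone */
		return state == TOY_RH_DIRTY ? TOY_RH_CLEAN : state;

	return TOY_RH_CLEAN; /* pre-patch: every other state becomes clean */
}
```

With patched set to false, TOY_RH_NOSYNC comes back as TOY_RH_CLEAN, which is the transition that sets the clean bit; with patched set to true it stays TOY_RH_NOSYNC until recovery picks the region up.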
Thanks,
Jun'ichi "Nick" Nomura
[-- Attachment #2: dm-mirror-keepnosync.patch --]
[-- Type: text/x-patch, Size: 2102 bytes --]
Keep RH_NOSYNC state unchanged when I/O on the region completes.
rh_dec() changes the region state from RH_NOSYNC (out-of-sync)
to RH_CLEAN (in-sync), which results in the corresponding bit
of clean_bits being set.
This is harmful if an on-disk log is used and the map is
removed/suspended before the initial sync is completed.
clean_bits is written to the on-disk log at map removal
and, upon resume, it is read back and copied to sync_bits.
Since the recovery process refers to sync_bits to find
a region to be recovered, a region whose state was changed
from RH_NOSYNC to RH_CLEAN is never recovered.
If you haven't applied the dm-raid1-read-balancing.patch proposed
on dm-devel some time ago, the contents of the mirrored disk
are silently corrupted.
If you have, a balanced read may get bogus data from out-of-sync
disks.
The RH_NOSYNC region will be changed to RH_RECOVERING when
recovery starts on the region and is reclaimed when the recovery
completes.
So the patch doesn't leak the region hash entry.
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
diff -urp linux.orig/drivers/md/dm-raid1.c linux/drivers/md/dm-raid1.c
--- linux.orig/drivers/md/dm-raid1.c 2005-12-26 05:25:04.000000000 -0500
+++ linux/drivers/md/dm-raid1.c 2006-01-06 10:19:49.000000000 -0500
@@ -414,9 +414,21 @@ static void rh_dec(struct region_hash *r
spin_lock_irqsave(&rh->region_lock, flags);
if (atomic_dec_and_test(&reg->pending)) {
+ /*
+ * There is no pending I/O for this region.
+ * We can move the region to corresponding list for next action.
+ * At this point, the region is not yet connected to any list.
+ *
+ * If the state is RH_NOSYNC, the region should be kept off
+ * from clean list.
+ * The hash entry for RH_NOSYNC will remain in memory
+ * until the region is recovered or the map is reloaded.
+ */
+
+ /* do nothing for RH_NOSYNC */
if (reg->state == RH_RECOVERING) {
list_add_tail(&reg->list, &rh->quiesced_regions);
- } else {
+ } else if (reg->state == RH_DIRTY) {
reg->state = RH_CLEAN;
list_add(&reg->list, &rh->clean_regions);
}