public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Jun'ichi Nomura" <j-nomura@ce.jp.nec.com>
To: device-mapper development <dm-devel@redhat.com>
Cc: linux-kernel@vger.kernel.org, akpm@osdl.org
Subject: [PATCH] drivers/md/dm-raid1.c: Fix inconsistent mirroring after interrupted recovery
Date: Tue, 10 Jan 2006 17:20:03 -0500	[thread overview]
Message-ID: <43C43313.3000402@ce.jp.nec.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 1314 bytes --]

Hi,

dm-mirror has potential data corruption problem:
while on-disk log shows that all disk contents are in-sync,
actual contents of the disks are not synchronized.
This problem occurs if initial recovery (synching) is
interrupted and resumed.

Attached patch fixes this problem.
Please consider to apply.

Background:

rh_dec() changes the region state from RH_NOSYNC (out-of-sync)
to RH_CLEAN (in-sync), which results in the corresponding bit
of clean_bits being set.

This is harmful if on-disk log is used and the map is
removed/suspended before the initial sync is completed.
The clean_bits is written down to the on-disk log at the map
removal, and, upon resume, it's read and copied to sync_bits.
Since the recovery process refers to the sync_bits to find
a region to be recovered, the region whose state was changed
from RH_NOSYNC to RH_CLEAN is no longer recovered.

If you haven't applied dm-raid1-read-balancing.patch proposed
in dm-devel sometimes ago, the contents of the mirrored disk
just corrupt silently.
If you have, balanced read may get bogus data from out-of-sync
disks.

The patch keeps RH_NOSYNC state unchanged.
It will be changed to RH_RECOVERING when recovery starts
and get reclaimed when the recovery completes.
So it doesn't leak the region hash entry.

Thanks,
Jun'ichi "Nick" Nomura

[-- Attachment #2: dm-mirror-keepnosync.patch --]
[-- Type: text/x-patch, Size: 2102 bytes --]

Keep RH_NOSYNC state unchanged when I/O on the region completes.

rh_dec() changes the region state from RH_NOSYNC (out-of-sync)
to RH_CLEAN (in-sync), which results in the corresponding bit
of clean_bits being set.

This is harmful if on-disk log is used and the map is
removed/suspended before the initial sync is completed.
The clean_bits is written down to the on-disk log at the map
removal, and, upon resume, it's read and copied to sync_bits.
Since the recovery process refers to the sync_bits to find
a region to be recovered, the region whose state was changed
from RH_NOSYNC to RH_CLEAN is no longer recovered.

If you haven't applied dm-raid1-read-balancing.patch proposed
in dm-devel sometimes ago, the contents of the mirrored disk
just corrupt silently.
If you have, balanced read may get bogus data from out-of-sync
disks.

The RH_NOSYNC region will be changed to RH_RECOVERING when
recovery starts on the region and get reclaimed when the recovery
completes.
So it doesn't leak the region hash entry.

Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>

diff -urp linux.orig/drivers/md/dm-raid1.c linux/drivers/md/dm-raid1.c
--- linux.orig/drivers/md/dm-raid1.c	2005-12-26 05:25:04.000000000 -0500
+++ linux/drivers/md/dm-raid1.c	2006-01-06 10:19:49.000000000 -0500
@@ -414,9 +414,21 @@ static void rh_dec(struct region_hash *r
 
 	spin_lock_irqsave(&rh->region_lock, flags);
 	if (atomic_dec_and_test(&reg->pending)) {
+		/*
+		 * There is no pending I/O for this region.
+		 * We can move the region to corresponding list for next action.
+		 * At this point, the region is not yet connected to any list.
+		 *
+		 * If the state is RH_NOSYNC, the region should be kept off
+		 * from clean list.
+		 * The hash entry for RH_NOSYNC will remain in memory
+		 * until the region is recovered or the map is reloaded.
+		 */
+
+		/* do nothing for RH_NOSYNC */
 		if (reg->state == RH_RECOVERING) {
 			list_add_tail(&reg->list, &rh->quiesced_regions);
-		} else {
+		} else if (reg->state == RH_DIRTY) {
 			reg->state = RH_CLEAN;
 			list_add(&reg->list, &rh->clean_regions);
 		}

                 reply	other threads:[~2006-01-10 22:20 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=43C43313.3000402@ce.jp.nec.com \
    --to=j-nomura@ce.jp.nec.com \
    --cc=akpm@osdl.org \
    --cc=dm-devel@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox