linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: NeilBrown <neilb@suse.de>
Cc: "Kwolek, Adam" <adam.kwolek@intel.com>,
	"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
	"Ciechanowski, Ed" <ed.ciechanowski@intel.com>,
	"Neubauer, Wojciech" <Wojciech.Neubauer@intel.com>,
	"Wojcik, Krzysztof" <krzysztof.wojcik@intel.com>
Subject: Re: Something wrong with __prep_thunderdome in super-intel.c
Date: Thu, 24 Mar 2011 19:40:46 -0700	[thread overview]
Message-ID: <1301020846.15264.12.camel@dwillia2-linux> (raw)
In-Reply-To: <20110322132307.34e9bb3b@notabene.brown>

On Mon, 2011-03-21 at 19:23 -0700, NeilBrown wrote:
> Hi Dan,
> 
>  I think you were the original author of imsm_thunderdome and
>  __prep_thunderdome - yes?

yes.

> 
>  I found a case (thank to the test suite) where it isn't working correctly,
> 
>  If I have a container with 2 devices, and the second one is failed, then
>  imsm_thunderdome returns NULL.
> 
>  __prep_thunderdome sees the first and adds it to the table of superblocks.
>  Then it sees the second notices the family_num and checksum are the same, and
>  so replaces the first with the second in the table.
> 
>  Then in imsm_thunderdome, d->serial is full of 'nul', so disk_list_get
>  doesn't find anything so the super_table becomes empty and nothing works.
> 
>  So it could be:
>     load_and_parse_mpb is wrong for putting the nul serial in there
>     __prep_thunderdome is wrong for thinking the two are equivalent
>     imsm_thunderdome is wrong for giving up when just one device isn't found
> 
>  and I really don't know which.

hmm...

<context switch out of isci driver review mode>

> 
> You can easily reproduce with
> 
>   ./mdadm -CR /dev/md/imsm -e imsm -n 2 /dev/loop[01]
>   ./mdadm -CR /dev/md/r1 -l1 -n2 /dev/md/imsm
>   ./mdadm /dev/md/r1 -f /dev/loop1
>   ./mdadm -E /dev/md/imsm
> 
> and notice that nothing gets printed.
> If you fail loop0 instead, it works properly.

The first thought is that the generation number on the good device
should be ahead of the bad, but lo and behold it isn't.  We should stop
advancing the generation number on failed devices.  The regression that
Krzysztof's patch somewhat addresses is that thunderdome expects that
failed devices should at least be able to look themselves up in their
own mpb.  Fixing up the serial number so that other devices see the
other disk as out of date is the expectation.

Stopping metadata write outs to failed devices fixes both problems.
Krzysztof care to try the following with your recent change reverted?  I
verified it addresses the problem above, but I have not had a chance to
double check the orom compatibility (although I suspect it will be ok).
This also simplifies the calling convention to mark_missing and
mark_failure.

--
Dan

diff --git a/super-intel.c b/super-intel.c
index 2b41e08..6a6f738 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -5213,13 +5213,14 @@ static int is_resyncing(struct imsm_dev *dev)
 }
 
 /* return true if we recorded new information */
-static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
+static int mark_failure(struct imsm_dev *dev, struct dl *dl)
 {
 	__u32 ord;
 	int slot;
 	struct imsm_map *map;
 	char buf[MAX_RAID_SERIAL_LEN+3];
-	unsigned int len, shift = 0;
+	struct imsm_disk *disk = &dl->disk;
+	unsigned int len, shift = 0, idx = dl->index;
 
 	/* new failures are always set in map[0] */
 	map = get_imsm_map(dev, 0);
@@ -5232,6 +5233,7 @@ static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
 	if (is_failed(disk) && (ord & IMSM_ORD_REBUILD))
 		return 0;
 
+	dl->index = -2;
 	sprintf(buf, "%s:0", disk->serial);
 	if ((len = strlen(buf)) >= MAX_RAID_SERIAL_LEN)
 		shift = len - MAX_RAID_SERIAL_LEN + 1;
@@ -5244,9 +5246,11 @@ static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
 	return 1;
 }
 
-static void mark_missing(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
+static void mark_missing(struct imsm_dev *dev, struct dl *dl)
 {
-	mark_failure(dev, disk, idx);
+	struct imsm_disk *disk = &dl->disk;
+
+	mark_failure(dev, dl);
 
 	if (disk->scsi_id == __cpu_to_le32(~(__u32)0))
 		return;
@@ -5269,7 +5273,7 @@ static void handle_missing(struct intel_super *super, struct imsm_dev *dev)
 	dprintf("imsm: mark missing\n");
 	end_migration(dev, map_state);
 	for (dl = super->missing; dl; dl = dl->next)
-		mark_missing(dev, &dl->disk, dl->index);
+		mark_missing(dev, dl);
 	super->updates_pending++;
 }
 
@@ -5497,7 +5501,7 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
 	struct intel_super *super = a->container->sb;
 	struct imsm_dev *dev = get_imsm_dev(super, inst);
 	struct imsm_map *map = get_imsm_map(dev, 0);
-	struct imsm_disk *disk;
+	struct dl *dl;
 	int failed;
 	__u32 ord;
 	__u8 map_state;
@@ -5512,11 +5516,11 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
 	dprintf("imsm: set_disk %d:%x\n", n, state);
 
 	ord = get_imsm_ord_tbl_ent(dev, n, -1);
-	disk = get_imsm_disk(super, ord_to_idx(ord));
+	dl = get_imsm_dl_disk(super, ord_to_idx(ord));
 
 	/* check for new failures */
 	if (state & DS_FAULTY) {
-		if (mark_failure(dev, disk, ord_to_idx(ord)))
+		if (mark_failure(dev, dl))
 			super->updates_pending++;
 	}
 
@@ -6265,7 +6269,7 @@ static int apply_takeover_update(struct imsm_update_takeover *u,
 	for (du = super->missing; du; du = du->next)
 		if (du->index >= 0) {
 			set_imsm_ord_tbl_ent(map, du->index, du->index);
-			mark_missing(dev_new, &du->disk, du->index);
+			mark_missing(dev_new, du);
 		}
 
 	return 1;



  reply	other threads:[~2011-03-25  2:40 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-14 14:09 [PATCH 0/3] UT and error case changes Adam Kwolek
2011-03-14 14:09 ` [PATCH 1/3] imsm: FIX: existing backup file fails unit tests Adam Kwolek
2011-03-14 14:09 ` [PATCH 2/3] External metadata has to be restored to initial state in error case Adam Kwolek
2011-03-14 14:09 ` [PATCH 3/3] imsm: Add metadata abort changes handler template Adam Kwolek
2011-03-14 21:53 ` [PATCH 0/3] UT and error case changes NeilBrown
2011-03-15  7:28   ` Kwolek, Adam
2011-03-18  2:07     ` NeilBrown
2011-03-22  2:23     ` Something wrong with __prep_thunderdome in super-intel.c NeilBrown
2011-03-25  2:40       ` Dan Williams [this message]
2011-03-25  8:43         ` Kwolek, Adam
2011-03-25 18:50           ` Dan Williams
2011-03-28  2:28           ` NeilBrown
2011-03-28  1:35         ` NeilBrown
2011-03-28 16:56           ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1301020846.15264.12.camel@dwillia2-linux \
    --to=dan.j.williams@intel.com \
    --cc=Wojciech.Neubauer@intel.com \
    --cc=adam.kwolek@intel.com \
    --cc=ed.ciechanowski@intel.com \
    --cc=krzysztof.wojcik@intel.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).