* [PATCH 0/3] UT and error case changes
@ 2011-03-14 14:09 Adam Kwolek
2011-03-14 14:09 ` [PATCH 1/3] imsm: FIX: existing backup file fails unit tests Adam Kwolek
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Adam Kwolek @ 2011-03-14 14:09 UTC (permalink / raw)
To: neilb; +Cc: linux-raid, dan.j.williams, ed.ciechanowski, wojciech.neubauer
The following series implements 2 changes:
1. Fix for unit tests failure.
UT suits 12 and 13 fails, when backup file cannot be opened
for grow operation (backup file exists already).
2. I've got proposal for handling/rollback metadata in error case.
Currently in case of error external metadata can remain in reshape state.
In some cases metadata can be automatically restored to initial state
(i.e. metadata during imsm container operation can be rolled back
when error occurs on first reshaped array before reshape is started).
For such cases, additional superswitch function can be introduced.
Metadata shouldn't be rolled back in restart case.
I'm passing restart flag to abort function in Grow.c only, as this is general rule.
In the same way array reshape state is checked.
This is proposal, so I've put no implementation in to imsm handler (no metadata update is created yet).
Please let me know your opinion. If you will like it, I'll fill out gaps in imsm code.
BR
Adam
---
Adam Kwolek (3):
imsm: Add metadata abort changes handler template
External metadata has to be restored to initial state in error case
imsm: FIX: existing backup file fails unit tests
Grow.c | 26 ++++++++++++++++++++++++++
mdadm.h | 10 ++++++++++
super-intel.c | 15 +++++++++++++++
tests/imsm-grow-template | 8 ++++++--
4 files changed, 57 insertions(+), 2 deletions(-)
--
Signature
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/3] imsm: FIX: existing backup file fails unit tests
2011-03-14 14:09 [PATCH 0/3] UT and error case changes Adam Kwolek
@ 2011-03-14 14:09 ` Adam Kwolek
2011-03-14 14:09 ` [PATCH 2/3] External metadata has to be restored to initial state in error case Adam Kwolek
` (2 subsequent siblings)
3 siblings, 0 replies; 14+ messages in thread
From: Adam Kwolek @ 2011-03-14 14:09 UTC (permalink / raw)
To: neilb; +Cc: linux-raid, dan.j.williams, ed.ciechanowski, wojciech.neubauer
During normal test execution, backup file is deleted after test execution.
If test is interrupted/broken, backup file can remain for next run.
When backup file exists before unit test run, suits 12 and 13 fails.
To avoid this remove backup file before grow is executed.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
---
tests/imsm-grow-template | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/tests/imsm-grow-template b/tests/imsm-grow-template
index 7c212c4..d03752d 100644
--- a/tests/imsm-grow-template
+++ b/tests/imsm-grow-template
@@ -17,8 +17,10 @@ function grow_member() {
local offset=$6
local chunk=$7
local array_size=$((comps * size))
+ local backup_imsm=/tmp/backup_imsm
- ( set -ex; mdadm --grow $member --chunk=$chunk --level=$level --backup-file=/tmp/backup_imsm )
+ rm -f $backup_imsm
+ ( set -ex; mdadm --grow $member --chunk=$chunk --level=$level --backup-file=$backup_imsm )
local status=$?
if [ $negative_test -ne 0 ]; then
if [ $status -eq 0 ]; then
@@ -71,6 +73,7 @@ done
imsm_check container $num_disks
num_disks=$((num_disks + add_to_num_disks))
+backup_imsm=/tmp/backup_imsm
# Grow each member or a container depending on the type of an operation
if [ $migration_test -ne 0 ]; then
@@ -82,7 +85,8 @@ if [ $migration_test -ne 0 ]; then
grow_member $member1 $new_num_disks $vol1_new_num_comps $vol1_new_level $vol1_comp_size $vol1_offset $vol1_new_chunk
fi
else
- ( set -x; mdadm --grow $container --raid-disks=$num_disks --backup-file=/tmp/backup_imsm )
+ rm -f $backup_imsm
+ ( set -x; mdadm --grow $container --raid-disks=$num_disks --backup-file=$backup_imsm )
grow_status=$?
if [ $negative_test -ne 0 ]; then
if [ $grow_status -eq 0 ]; then
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/3] External metadata has to be restored to initial state in error case
2011-03-14 14:09 [PATCH 0/3] UT and error case changes Adam Kwolek
2011-03-14 14:09 ` [PATCH 1/3] imsm: FIX: existing backup file fails unit tests Adam Kwolek
@ 2011-03-14 14:09 ` Adam Kwolek
2011-03-14 14:09 ` [PATCH 3/3] imsm: Add metadata abort changes handler template Adam Kwolek
2011-03-14 21:53 ` [PATCH 0/3] UT and error case changes NeilBrown
3 siblings, 0 replies; 14+ messages in thread
From: Adam Kwolek @ 2011-03-14 14:09 UTC (permalink / raw)
To: neilb; +Cc: linux-raid, dan.j.williams, ed.ciechanowski, wojciech.neubauer
When any error occurs in reshape_array(), mdadm should have ability
to rollback changes. Based on metadata knowledge abort_metadata_changes()
should decide:
1. metadata abort is allowed?
In case of container operation abort can be allowed on first array only
It depends on metadata implementation.
2. If rollback is possible, execute required action to prepare update.
Returned value is:
1 if rollback succeed (send update)
0 if rollback not possible (no action)
-1 for other error
Metadata handler can be called only if reshape is not started in md.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
---
Grow.c | 26 ++++++++++++++++++++++++++
mdadm.h | 10 ++++++++++
2 files changed, 36 insertions(+), 0 deletions(-)
diff --git a/Grow.c b/Grow.c
index 40e693e..26fd7eb 100644
--- a/Grow.c
+++ b/Grow.c
@@ -594,6 +594,31 @@ static void sync_metadata(struct supertype *st)
}
}
+static int abort_metadata_changes(struct supertype *st, struct mdinfo *sra,
+ int restart)
+{
+ int ret_val = 0;
+ char action[40];
+
+ /* for not restarted external metadata only
+ */
+ if (!st->ss->external || restart)
+ return ret_val;
+
+ /* check if reshape is not started in md
+ */
+ if (sra && sysfs_get_str(sra, NULL, "sync_action", action, 40) > 0 &&
+ strncmp(action, "reshape", 7) == 0)
+ return ret_val;
+
+ if (st->ss->abort_metadata_changes)
+ ret_val = st->ss->abort_metadata_changes(st);
+ if (ret_val > 0)
+ sync_metadata(st);
+
+ return ret_val;
+}
+
static int subarray_set_num(char *container, struct mdinfo *sra, char *name, int n)
{
/* when dealing with external metadata subarrays we need to be
@@ -2187,6 +2212,7 @@ out:
exit(0);
release:
+ abort_metadata_changes(st, sra, restart);
if (orig_level != UnSet && sra) {
c = map_num(pers, orig_level);
if (c && sysfs_set_str(sra, NULL, "level", c) == 0)
diff --git a/mdadm.h b/mdadm.h
index d3ed50a..d502b13 100644
--- a/mdadm.h
+++ b/mdadm.h
@@ -727,6 +727,16 @@ extern struct superswitch {
struct supertype *st, unsigned long blocks,
int *fds, unsigned long long *offsets,
int dests, int *destfd, unsigned long long *destoffsets);
+ /* abort_metadata_changes() will rollbacks metadata changes
+ * in case of error.
+ * It can be called for not restart case
+ * and when reshape has not been started in md.
+ * Function should return values:
+ * 1 if abort succeed
+ * 0 if abort not possible
+ * -1 for other error
+ */
+ int (*abort_metadata_changes)(struct supertype *st);
/* for mdmon */
int (*open_new)(struct supertype *c, struct active_array *a,
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 3/3] imsm: Add metadata abort changes handler template
2011-03-14 14:09 [PATCH 0/3] UT and error case changes Adam Kwolek
2011-03-14 14:09 ` [PATCH 1/3] imsm: FIX: existing backup file fails unit tests Adam Kwolek
2011-03-14 14:09 ` [PATCH 2/3] External metadata has to be restored to initial state in error case Adam Kwolek
@ 2011-03-14 14:09 ` Adam Kwolek
2011-03-14 21:53 ` [PATCH 0/3] UT and error case changes NeilBrown
3 siblings, 0 replies; 14+ messages in thread
From: Adam Kwolek @ 2011-03-14 14:09 UTC (permalink / raw)
To: neilb; +Cc: linux-raid, dan.j.williams, ed.ciechanowski, wojciech.neubauer
Add metadata rollback function implementation/placeholder
for imsm metadata.
Signed-off-by: Adam Kwolek <adam.kwolek@intel.com>
---
super-intel.c | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)
diff --git a/super-intel.c b/super-intel.c
index 44c100b..2c2bd1e 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -7364,6 +7364,20 @@ static int imsm_manage_reshape(
afd, sra, reshape, st, stripes,
fds, offsets, dests, destfd, destoffsets);
}
+
+/* imsm metadata rollback handler, manage metadata for error scenario
+ */
+static int imsm_abort_metadata_changes(struct supertype *st)
+{
+ int ret_val = -1;
+
+ dprintf("imsm: imsm_abort_metadata_changes() called\n");
+
+ /* ToDo: when possible, prepare rollback metadata update
+ */
+
+ return ret_val;
+}
#endif /* MDASSEMBLE */
struct superswitch super_imsm = {
@@ -7386,6 +7400,7 @@ struct superswitch super_imsm = {
.get_disk_controller_domain = imsm_get_disk_controller_domain,
.reshape_super = imsm_reshape_super,
.manage_reshape = imsm_manage_reshape,
+ .abort_metadata_changes = imsm_abort_metadata_changes,
#endif
.match_home = match_home_imsm,
.uuid_from_super= uuid_from_super_imsm,
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] UT and error case changes
2011-03-14 14:09 [PATCH 0/3] UT and error case changes Adam Kwolek
` (2 preceding siblings ...)
2011-03-14 14:09 ` [PATCH 3/3] imsm: Add metadata abort changes handler template Adam Kwolek
@ 2011-03-14 21:53 ` NeilBrown
2011-03-15 7:28 ` Kwolek, Adam
3 siblings, 1 reply; 14+ messages in thread
From: NeilBrown @ 2011-03-14 21:53 UTC (permalink / raw)
To: Adam Kwolek
Cc: linux-raid, dan.j.williams, ed.ciechanowski, wojciech.neubauer
On Mon, 14 Mar 2011 15:09:20 +0100 Adam Kwolek <adam.kwolek@intel.com> wrote:
> The following series implements 2 changes:
> 1. Fix for unit tests failure.
> UT suits 12 and 13 fails, when backup file cannot be opened
> for grow operation (backup file exists already).
Thanks - applied.
>
> 2. I've got proposal for handling/rollback metadata in error case.
> Currently in case of error external metadata can remain in reshape state.
> In some cases metadata can be automatically restored to initial state
> (i.e. metadata during imsm container operation can be rolled back
> when error occurs on first reshaped array before reshape is started).
> For such cases, additional superswitch function can be introduced.
>
> Metadata shouldn't be rolled back in restart case.
> I'm passing restart flag to abort function in Grow.c only, as this is general rule.
> In the same way array reshape state is checked.
>
> This is proposal, so I've put no implementation in to imsm handler (no metadata update is created yet).
> Please let me know your opinion. If you will like it, I'll fill out gaps in imsm code.
I'm not sure.. it sounds like it might be a good idea, but I'd like to have
some concrete examples to help me think about it.
Do you have an example in mind of an error which might be detected after
the metadata has been updated, but before the reshape has actually started,
and for which reverting the metadata update is likely to be useful?
The more general case of changing your mind just after a reshape has
started and asking the reshape to revert would certainly be useful, but
md isn't capable of that in general (yet).
Thanks,
NeilBrown
^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 0/3] UT and error case changes
2011-03-14 21:53 ` [PATCH 0/3] UT and error case changes NeilBrown
@ 2011-03-15 7:28 ` Kwolek, Adam
2011-03-18 2:07 ` NeilBrown
2011-03-22 2:23 ` Something wrong with __prep_thunderdome in super-intel.c NeilBrown
0 siblings, 2 replies; 14+ messages in thread
From: Kwolek, Adam @ 2011-03-15 7:28 UTC (permalink / raw)
To: NeilBrown
Cc: linux-raid@vger.kernel.org, Williams, Dan J, Ciechanowski, Ed,
Neubauer, Wojciech
> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of NeilBrown
> Sent: Monday, March 14, 2011 10:54 PM
> To: Kwolek, Adam
> Cc: linux-raid@vger.kernel.org; Williams, Dan J; Ciechanowski, Ed;
> Neubauer, Wojciech
> Subject: Re: [PATCH 0/3] UT and error case changes
>
> On Mon, 14 Mar 2011 15:09:20 +0100 Adam Kwolek <adam.kwolek@intel.com>
> wrote:
>
> > The following series implements 2 changes:
> > 1. Fix for unit tests failure.
> > UT suits 12 and 13 fails, when backup file cannot be opened
> > for grow operation (backup file exists already).
>
> Thanks - applied.
>
>
> >
> > 2. I've got proposal for handling/rollback metadata in error case.
> > Currently in case of error external metadata can remain in reshape
> state.
> > In some cases metadata can be automatically restored to initial state
> > (i.e. metadata during imsm container operation can be rolled back
> > when error occurs on first reshaped array before reshape is started).
> > For such cases, additional superswitch function can be introduced.
> >
> > Metadata shouldn't be rolled back in restart case.
> > I'm passing restart flag to abort function in Grow.c only, as this is
> general rule.
> > In the same way array reshape state is checked.
> >
> > This is proposal, so I've put no implementation in to imsm handler (no
> metadata update is created yet).
> > Please let me know your opinion. If you will like it, I'll fill out
> gaps in imsm code.
>
> I'm not sure.. it sounds like it might be a good idea, but I'd like to
> have
> some concrete examples to help me think about it.
I've got this idea during correcting UT so, i.e.:
wrong/already existing/ backup file name passed by used.
Backup file verification/opening is when metadata is in reshape state already.
This error causes mdadm exit at reshape position == 0. It is too early position for reshape restart.
User has manually restore metadata information and this can be quite difficult.
Metadata rollback can be possible for container operation if:
1. external metadata case (checked in Grow.c)
2. This is action on first array in container (check in metadata handler)
2. md doesn't start reshape (checked in Grow.c)
3. Other conditions/metadata specific (checked in Grow.c)
We can also pass some additional status to reshape_container() to indicate that metadata is rolled back
and allow for container unfreeze in such case also.
This can allow mdadm to leave array/container in state that work is started from (hopefully ;)).
BR
Adam
>
> Do you have an example in mind of an error which might be detected after
> the metadata has been updated, but before the reshape has actually
> started,
> and for which reverting the metadata update is likely to be useful?
>
> The more general case of changing your mind just after a reshape has
> started and asking the reshape to revert would certainly be useful, but
> md isn't capable of that in general (yet).
>
> Thanks,
> NeilBrown
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] UT and error case changes
2011-03-15 7:28 ` Kwolek, Adam
@ 2011-03-18 2:07 ` NeilBrown
2011-03-22 2:23 ` Something wrong with __prep_thunderdome in super-intel.c NeilBrown
1 sibling, 0 replies; 14+ messages in thread
From: NeilBrown @ 2011-03-18 2:07 UTC (permalink / raw)
To: Kwolek, Adam
Cc: linux-raid@vger.kernel.org, Williams, Dan J, Ciechanowski, Ed,
Neubauer, Wojciech
On Tue, 15 Mar 2011 07:28:00 +0000 "Kwolek, Adam" <adam.kwolek@intel.com>
wrote:
>
>
> > -----Original Message-----
> > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> > owner@vger.kernel.org] On Behalf Of NeilBrown
> > Sent: Monday, March 14, 2011 10:54 PM
> > To: Kwolek, Adam
> > Cc: linux-raid@vger.kernel.org; Williams, Dan J; Ciechanowski, Ed;
> > Neubauer, Wojciech
> > Subject: Re: [PATCH 0/3] UT and error case changes
> >
> > On Mon, 14 Mar 2011 15:09:20 +0100 Adam Kwolek <adam.kwolek@intel.com>
> > wrote:
> >
> > > The following series implements 2 changes:
> > > 1. Fix for unit tests failure.
> > > UT suits 12 and 13 fails, when backup file cannot be opened
> > > for grow operation (backup file exists already).
> >
> > Thanks - applied.
> >
> >
> > >
> > > 2. I've got proposal for handling/rollback metadata in error case.
> > > Currently in case of error external metadata can remain in reshape
> > state.
> > > In some cases metadata can be automatically restored to initial state
> > > (i.e. metadata during imsm container operation can be rolled back
> > > when error occurs on first reshaped array before reshape is started).
> > > For such cases, additional superswitch function can be introduced.
> > >
> > > Metadata shouldn't be rolled back in restart case.
> > > I'm passing restart flag to abort function in Grow.c only, as this is
> > general rule.
> > > In the same way array reshape state is checked.
> > >
> > > This is proposal, so I've put no implementation in to imsm handler (no
> > metadata update is created yet).
> > > Please let me know your opinion. If you will like it, I'll fill out
> > gaps in imsm code.
> >
> > I'm not sure.. it sounds like it might be a good idea, but I'd like to
> > have
> > some concrete examples to help me think about it.
>
>
>
> I've got this idea during correcting UT so, i.e.:
> wrong/already existing/ backup file name passed by used.
> Backup file verification/opening is when metadata is in reshape state already.
> This error causes mdadm exit at reshape position == 0. It is too early position for reshape restart.
> User has manually restore metadata information and this can be quite difficult.
>
> Metadata rollback can be possible for container operation if:
> 1. external metadata case (checked in Grow.c)
> 2. This is action on first array in container (check in metadata handler)
> 2. md doesn't start reshape (checked in Grow.c)
> 3. Other conditions/metadata specific (checked in Grow.c)
>
> We can also pass some additional status to reshape_container() to indicate that metadata is rolled back
> and allow for container unfreeze in such case also.
> This can allow mdadm to leave array/container in state that work is started from (hopefully ;)).
>
> BR
> Adam
>
I think I see your point - there are some issues with some failure
possibilities at start-up.
We should of course do as much checking as possible before committing to the
reshape, and I think we do. But it is still possible that something could go
wrong.
It would probably be good to take the first backup after freezing IO, but
before updating the metadata to show that a reshape has started. That would
ensure that a crash at any time will either be able to replay the backup and
continue the reshape, or won't see the reshape at all.
It might be a bit awkward to do that with the current code, I'm not sure.
Another possibility is to at least write out a 'zeroed' backup file and
arrange that when we restart and array, if the backup file is empty, then
that is a clear sign that no reshape has started and
->update_super("_reshape_progress") could revert the reshape.
Longer term: I want to eventually teach md to be able to revert a reshape
that has already started. In that case we will need some sort of metadata
method to record that reshape is going backwards - though it may end up
looking like something we already have...
NeilBrown
^ permalink raw reply [flat|nested] 14+ messages in thread
* Something wrong with __prep_thunderdome in super-intel.c
2011-03-15 7:28 ` Kwolek, Adam
2011-03-18 2:07 ` NeilBrown
@ 2011-03-22 2:23 ` NeilBrown
2011-03-25 2:40 ` Dan Williams
1 sibling, 1 reply; 14+ messages in thread
From: NeilBrown @ 2011-03-22 2:23 UTC (permalink / raw)
To: Williams, Dan J
Cc: Kwolek, Adam, linux-raid@vger.kernel.org, Ciechanowski, Ed,
Neubauer, Wojciech
Hi Dan,
I think you were the original author of imsm_thunderdome and
__prep_thunderdome - yes?
I found a case (thank to the test suite) where it isn't working correctly,
If I have a container with 2 devices, and the second one is failed, then
imsm_thunderdome returns NULL.
__prep_thunderdome sees the first and adds it to the table of superblocks.
Then it sees the second notices the family_num and checksum are the same, and
so replaces the first with the second in the table.
Then in imsm_thunderdome, d->serial is full of 'nul', so disk_list_get
doesn't find anything so the super_table becomes empty and nothing works.
So it could be:
load_and_parse_mpb is wrong for putting the nul serial in there
__prep_thunderdome is wrong for thinking the two are equivalent
imsm_thunderdome is wrong for giving up when just one device isn't found
and I really don't know which.
You can easily reproduce with
./mdadm -CR /dev/md/imsm -e imsm -n 2 /dev/loop[01]
./mdadm -CR /dev/md/r1 -l1 -n2 /dev/md/imsm
./mdadm /dev/md/r1 -f /dev/loop1
./mdadm -E /dev/md/imsm
and notice that nothing gets printed.
If you fail loop0 instead, it works properly.
Thanks,
NeilBrown
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Something wrong with __prep_thunderdome in super-intel.c
2011-03-22 2:23 ` Something wrong with __prep_thunderdome in super-intel.c NeilBrown
@ 2011-03-25 2:40 ` Dan Williams
2011-03-25 8:43 ` Kwolek, Adam
2011-03-28 1:35 ` NeilBrown
0 siblings, 2 replies; 14+ messages in thread
From: Dan Williams @ 2011-03-25 2:40 UTC (permalink / raw)
To: NeilBrown
Cc: Kwolek, Adam, linux-raid@vger.kernel.org, Ciechanowski, Ed,
Neubauer, Wojciech, Wojcik, Krzysztof
On Mon, 2011-03-21 at 19:23 -0700, NeilBrown wrote:
> Hi Dan,
>
> I think you were the original author of imsm_thunderdome and
> __prep_thunderdome - yes?
yes.
>
> I found a case (thank to the test suite) where it isn't working correctly,
>
> If I have a container with 2 devices, and the second one is failed, then
> imsm_thunderdome returns NULL.
>
> __prep_thunderdome sees the first and adds it to the table of superblocks.
> Then it sees the second notices the family_num and checksum are the same, and
> so replaces the first with the second in the table.
>
> Then in imsm_thunderdome, d->serial is full of 'nul', so disk_list_get
> doesn't find anything so the super_table becomes empty and nothing works.
>
> So it could be:
> load_and_parse_mpb is wrong for putting the nul serial in there
> __prep_thunderdome is wrong for thinking the two are equivalent
> imsm_thunderdome is wrong for giving up when just one device isn't found
>
> and I really don't know which.
hmm...
<context switch out of isci driver review mode>
>
> You can easily reproduce with
>
> ./mdadm -CR /dev/md/imsm -e imsm -n 2 /dev/loop[01]
> ./mdadm -CR /dev/md/r1 -l1 -n2 /dev/md/imsm
> ./mdadm /dev/md/r1 -f /dev/loop1
> ./mdadm -E /dev/md/imsm
>
> and notice that nothing gets printed.
> If you fail loop0 instead, it works properly.
The first thought is that the generation number on the good device
should be ahead of the bad, but lo and behold it isn't. We should stop
advancing the generation number on failed devices. The regression that
Krzysztof's patch somewhat addresses is that thunderdome expects that
failed devices should at least be able to look themselves up in their
own mpb. Fixing up the serial number so that other devices see the
other disk as out of date is the expectation.
Stopping metadata write outs to failed devices fixes both problems.
Krzysztof care to try the following with your recent change reverted? I
verified it addresses the problem above, but I have not had a chance to
double check the orom compatibility (although I suspect it will be ok).
This also simplifies the calling convention to mark_missing and
mark_failure.
--
Dan
diff --git a/super-intel.c b/super-intel.c
index 2b41e08..6a6f738 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -5213,13 +5213,14 @@ static int is_resyncing(struct imsm_dev *dev)
}
/* return true if we recorded new information */
-static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
+static int mark_failure(struct imsm_dev *dev, struct dl *dl)
{
__u32 ord;
int slot;
struct imsm_map *map;
char buf[MAX_RAID_SERIAL_LEN+3];
- unsigned int len, shift = 0;
+ struct imsm_disk *disk = &dl->disk;
+ unsigned int len, shift = 0, idx = dl->index;
/* new failures are always set in map[0] */
map = get_imsm_map(dev, 0);
@@ -5232,6 +5233,7 @@ static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
if (is_failed(disk) && (ord & IMSM_ORD_REBUILD))
return 0;
+ dl->index = -2;
sprintf(buf, "%s:0", disk->serial);
if ((len = strlen(buf)) >= MAX_RAID_SERIAL_LEN)
shift = len - MAX_RAID_SERIAL_LEN + 1;
@@ -5244,9 +5246,11 @@ static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
return 1;
}
-static void mark_missing(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
+static void mark_missing(struct imsm_dev *dev, struct dl *dl)
{
- mark_failure(dev, disk, idx);
+ struct imsm_disk *disk = &dl->disk;
+
+ mark_failure(dev, dl);
if (disk->scsi_id == __cpu_to_le32(~(__u32)0))
return;
@@ -5269,7 +5273,7 @@ static void handle_missing(struct intel_super *super, struct imsm_dev *dev)
dprintf("imsm: mark missing\n");
end_migration(dev, map_state);
for (dl = super->missing; dl; dl = dl->next)
- mark_missing(dev, &dl->disk, dl->index);
+ mark_missing(dev, dl);
super->updates_pending++;
}
@@ -5497,7 +5501,7 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
struct intel_super *super = a->container->sb;
struct imsm_dev *dev = get_imsm_dev(super, inst);
struct imsm_map *map = get_imsm_map(dev, 0);
- struct imsm_disk *disk;
+ struct dl *dl;
int failed;
__u32 ord;
__u8 map_state;
@@ -5512,11 +5516,11 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
dprintf("imsm: set_disk %d:%x\n", n, state);
ord = get_imsm_ord_tbl_ent(dev, n, -1);
- disk = get_imsm_disk(super, ord_to_idx(ord));
+ dl = get_imsm_dl_disk(super, ord_to_idx(ord));
/* check for new failures */
if (state & DS_FAULTY) {
- if (mark_failure(dev, disk, ord_to_idx(ord)))
+ if (mark_failure(dev, dl))
super->updates_pending++;
}
@@ -6265,7 +6269,7 @@ static int apply_takeover_update(struct imsm_update_takeover *u,
for (du = super->missing; du; du = du->next)
if (du->index >= 0) {
set_imsm_ord_tbl_ent(map, du->index, du->index);
- mark_missing(dev_new, &du->disk, du->index);
+ mark_missing(dev_new, du);
}
return 1;
^ permalink raw reply related [flat|nested] 14+ messages in thread
* RE: Something wrong with __prep_thunderdome in super-intel.c
2011-03-25 2:40 ` Dan Williams
@ 2011-03-25 8:43 ` Kwolek, Adam
2011-03-25 18:50 ` Dan Williams
2011-03-28 2:28 ` NeilBrown
2011-03-28 1:35 ` NeilBrown
1 sibling, 2 replies; 14+ messages in thread
From: Kwolek, Adam @ 2011-03-25 8:43 UTC (permalink / raw)
To: Williams, Dan J, NeilBrown
Cc: linux-raid@vger.kernel.org, Ciechanowski, Ed, Neubauer, Wojciech,
Wojcik, Krzysztof
> -----Original Message-----
> From: Williams, Dan J
> Sent: Friday, March 25, 2011 3:41 AM
> To: NeilBrown
> Cc: Kwolek, Adam; linux-raid@vger.kernel.org; Ciechanowski, Ed;
> Neubauer, Wojciech; Wojcik, Krzysztof
> Subject: Re: Something wrong with __prep_thunderdome in super-intel.c
>
> On Mon, 2011-03-21 at 19:23 -0700, NeilBrown wrote:
> > Hi Dan,
> >
> > I think you were the original author of imsm_thunderdome and
> > __prep_thunderdome - yes?
>
> yes.
>
> >
> > I found a case (thank to the test suite) where it isn't working
> correctly,
> >
> > If I have a container with 2 devices, and the second one is failed,
> then
> > imsm_thunderdome returns NULL.
> >
> > __prep_thunderdome sees the first and adds it to the table of
> superblocks.
> > Then it sees the second notices the family_num and checksum are the
> same, and
> > so replaces the first with the second in the table.
> >
> > Then in imsm_thunderdome, d->serial is full of 'nul', so
> disk_list_get
> > doesn't find anything so the super_table becomes empty and nothing
> works.
> >
> > So it could be:
> > load_and_parse_mpb is wrong for putting the nul serial in there
> > __prep_thunderdome is wrong for thinking the two are equivalent
> > imsm_thunderdome is wrong for giving up when just one device isn't
> found
> >
> > and I really don't know which.
>
> hmm...
>
> <context switch out of isci driver review mode>
>
> >
> > You can easily reproduce with
> >
> > ./mdadm -CR /dev/md/imsm -e imsm -n 2 /dev/loop[01]
> > ./mdadm -CR /dev/md/r1 -l1 -n2 /dev/md/imsm
> > ./mdadm /dev/md/r1 -f /dev/loop1
> > ./mdadm -E /dev/md/imsm
> >
> > and notice that nothing gets printed.
> > If you fail loop0 instead, it works properly.
>
> The first thought is that the generation number on the good device
> should be ahead of the bad, but lo and behold it isn't. We should stop
> advancing the generation number on failed devices. The regression that
> Krzysztof's patch somewhat addresses is that thunderdome expects that
> failed devices should at least be able to look themselves up in their
> own mpb. Fixing up the serial number so that other devices see the
> other disk as out of date is the expectation.
>
> Stopping metadata write outs to failed devices fixes both problems.
> Krzysztof care to try the following with your recent change reverted? I
> verified it addresses the problem above, but I have not had a chance to
> double check the orom compatibility (although I suspect it will be ok).
> This also simplifies the calling convention to mark_missing and
> mark_failure.
>
> --
> Dan
>
> diff --git a/super-intel.c b/super-intel.c
> index 2b41e08..6a6f738 100644
> --- a/super-intel.c
> +++ b/super-intel.c
> @@ -5213,13 +5213,14 @@ static int is_resyncing(struct imsm_dev *dev)
> }
>
> /* return true if we recorded new information */
> -static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk,
> int idx)
> +static int mark_failure(struct imsm_dev *dev, struct dl *dl)
> {
> __u32 ord;
> int slot;
> struct imsm_map *map;
> char buf[MAX_RAID_SERIAL_LEN+3];
> - unsigned int len, shift = 0;
> + struct imsm_disk *disk = &dl->disk;
> + unsigned int len, shift = 0, idx = dl->index;
>
> /* new failures are always set in map[0] */
> map = get_imsm_map(dev, 0);
> @@ -5232,6 +5233,7 @@ static int mark_failure(struct imsm_dev *dev,
> struct imsm_disk *disk, int idx)
> if (is_failed(disk) && (ord & IMSM_ORD_REBUILD))
> return 0;
>
> + dl->index = -2;
> sprintf(buf, "%s:0", disk->serial);
> if ((len = strlen(buf)) >= MAX_RAID_SERIAL_LEN)
> shift = len - MAX_RAID_SERIAL_LEN + 1;
> @@ -5244,9 +5246,11 @@ static int mark_failure(struct imsm_dev *dev,
> struct imsm_disk *disk, int idx)
> return 1;
> }
>
> -static void mark_missing(struct imsm_dev *dev, struct imsm_disk *disk,
> int idx)
> +static void mark_missing(struct imsm_dev *dev, struct dl *dl)
> {
> - mark_failure(dev, disk, idx);
> + struct imsm_disk *disk = &dl->disk;
> +
> + mark_failure(dev, dl);
>
> if (disk->scsi_id == __cpu_to_le32(~(__u32)0))
> return;
> @@ -5269,7 +5273,7 @@ static void handle_missing(struct intel_super
> *super, struct imsm_dev *dev)
> dprintf("imsm: mark missing\n");
> end_migration(dev, map_state);
> for (dl = super->missing; dl; dl = dl->next)
> - mark_missing(dev, &dl->disk, dl->index);
> + mark_missing(dev, dl);
> super->updates_pending++;
> }
>
> @@ -5497,7 +5501,7 @@ static void imsm_set_disk(struct active_array *a,
> int n, int state)
> struct intel_super *super = a->container->sb;
> struct imsm_dev *dev = get_imsm_dev(super, inst);
> struct imsm_map *map = get_imsm_map(dev, 0);
> - struct imsm_disk *disk;
> + struct dl *dl;
> int failed;
> __u32 ord;
> __u8 map_state;
> @@ -5512,11 +5516,11 @@ static void imsm_set_disk(struct active_array
> *a, int n, int state)
> dprintf("imsm: set_disk %d:%x\n", n, state);
>
> ord = get_imsm_ord_tbl_ent(dev, n, -1);
> - disk = get_imsm_disk(super, ord_to_idx(ord));
> + dl = get_imsm_dl_disk(super, ord_to_idx(ord));
>
> /* check for new failures */
> if (state & DS_FAULTY) {
> - if (mark_failure(dev, disk, ord_to_idx(ord)))
> + if (mark_failure(dev, dl))
> super->updates_pending++;
> }
>
> @@ -6265,7 +6269,7 @@ static int apply_takeover_update(struct
> imsm_update_takeover *u,
> for (du = super->missing; du; du = du->next)
> if (du->index >= 0) {
> set_imsm_ord_tbl_ent(map, du->index, du->index);
> - mark_missing(dev_new, &du->disk, du->index);
> + mark_missing(dev_new, du);
> }
>
> return 1;
>
Unfortunately this doesn't close issue.
I've reached the same from IMSM compatibility point of view, but have in mind that unit test suit 11 fails using this approach.
Put focus on test 5 in this suit (I'm using it, as this is the only one that fails for me) for IMSM (note that first test pass is for native metadata).
Using your path tests 5b and 5c fails. My code fails 5c test only. I'm currently working on it.
BR
Adam
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Something wrong with __prep_thunderdome in super-intel.c
2011-03-25 8:43 ` Kwolek, Adam
@ 2011-03-25 18:50 ` Dan Williams
2011-03-28 2:28 ` NeilBrown
1 sibling, 0 replies; 14+ messages in thread
From: Dan Williams @ 2011-03-25 18:50 UTC (permalink / raw)
To: Kwolek, Adam
Cc: NeilBrown, linux-raid@vger.kernel.org, Ciechanowski, Ed,
Neubauer, Wojciech, Wojcik, Krzysztof
On 3/25/2011 1:43 AM, Kwolek, Adam wrote:
>
> Unfortunately this doesn't close issue.
>
> I've reached the same from IMSM compatibility point of view, but have in mind that unit test suit 11 fails using this approach.
> Put focus on test 5 in this suit (I'm using it, as this is the only one that fails for me) for IMSM (note that first test pass is for native metadata).
> Using your path tests 5b and 5c fails. My code fails 5c test only. I'm currently working on it.
I am seeing a different failing set (after applying Anna's patches): 1,
1a, 2, 4, 5b, 5c...
One thing too look out for is that failed disks will no longer self
identify. Prior to this mdadm would update the metadata on failed disks
to show "failed", now the disk must be viewed in relation to others to
determine if it is stale or not. This makes the mdadm -f case more in
line with the hot removed case.
--
Dan
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Something wrong with __prep_thunderdome in super-intel.c
2011-03-25 2:40 ` Dan Williams
2011-03-25 8:43 ` Kwolek, Adam
@ 2011-03-28 1:35 ` NeilBrown
2011-03-28 16:56 ` Dan Williams
1 sibling, 1 reply; 14+ messages in thread
From: NeilBrown @ 2011-03-28 1:35 UTC (permalink / raw)
To: Dan Williams
Cc: Kwolek, Adam, linux-raid@vger.kernel.org, Ciechanowski, Ed,
Neubauer, Wojciech, Wojcik, Krzysztof
On Thu, 24 Mar 2011 19:40:46 -0700 Dan Williams <dan.j.williams@intel.com>
wrote:
>
> hmm...
>
> <context switch out of isci driver review mode>
:-)
>
> >
> > You can easily reproduce with
> >
> > ./mdadm -CR /dev/md/imsm -e imsm -n 2 /dev/loop[01]
> > ./mdadm -CR /dev/md/r1 -l1 -n2 /dev/md/imsm
> > ./mdadm /dev/md/r1 -f /dev/loop1
> > ./mdadm -E /dev/md/imsm
> >
> > and notice that nothing gets printed.
> > If you fail loop0 instead, it works properly.
>
> The first thought is that the generation number on the good device
> should be ahead of the bad, but lo and behold it isn't. We should stop
> advancing the generation number on failed devices. The regression that
> Krzysztof's patch somewhat addresses is that thunderdome expects that
> failed devices should at least be able to look themselves up in their
> own mpb. Fixing up the serial number so that other devices see the
> other disk as out of date is the expectation.
>
> Stopping metadata write outs to failed devices fixes both problems.
> Krzysztof care to try the following with your recent change reverted? I
> verified it addresses the problem above, but I have not had a chance to
> double check the orom compatibility (although I suspect it will be ok).
> This also simplifies the calling convention to mark_missing and
> mark_failure.
Thanks for looking at this Dan.
I tried your patch, however ...
>
> diff --git a/super-intel.c b/super-intel.c
> index 2b41e08..6a6f738 100644
> --- a/super-intel.c
> +++ b/super-intel.c
> @@ -5213,13 +5213,14 @@ static int is_resyncing(struct imsm_dev *dev)
> }
>
> /* return true if we recorded new information */
> -static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
> +static int mark_failure(struct imsm_dev *dev, struct dl *dl)
> {
> __u32 ord;
> int slot;
> struct imsm_map *map;
> char buf[MAX_RAID_SERIAL_LEN+3];
> - unsigned int len, shift = 0;
> + struct imsm_disk *disk = &dl->disk;
> + unsigned int len, shift = 0, idx = dl->index;
>
> /* new failures are always set in map[0] */
> map = get_imsm_map(dev, 0);
> @@ -5232,6 +5233,7 @@ static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
> if (is_failed(disk) && (ord & IMSM_ORD_REBUILD))
> return 0;
>
> + dl->index = -2;
> sprintf(buf, "%s:0", disk->serial);
> if ((len = strlen(buf)) >= MAX_RAID_SERIAL_LEN)
> shift = len - MAX_RAID_SERIAL_LEN + 1;
> @@ -5244,9 +5246,11 @@ static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
> return 1;
> }
>
> -static void mark_missing(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
> +static void mark_missing(struct imsm_dev *dev, struct dl *dl)
> {
> - mark_failure(dev, disk, idx);
> + struct imsm_disk *disk = &dl->disk;
> +
> + mark_failure(dev, dl);
>
> if (disk->scsi_id == __cpu_to_le32(~(__u32)0))
> return;
> @@ -5269,7 +5273,7 @@ static void handle_missing(struct intel_super *super, struct imsm_dev *dev)
> dprintf("imsm: mark missing\n");
> end_migration(dev, map_state);
> for (dl = super->missing; dl; dl = dl->next)
> - mark_missing(dev, &dl->disk, dl->index);
> + mark_missing(dev, dl);
> super->updates_pending++;
> }
>
> @@ -5497,7 +5501,7 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
> struct intel_super *super = a->container->sb;
> struct imsm_dev *dev = get_imsm_dev(super, inst);
> struct imsm_map *map = get_imsm_map(dev, 0);
> - struct imsm_disk *disk;
> + struct dl *dl;
> int failed;
> __u32 ord;
> __u8 map_state;
> @@ -5512,11 +5516,11 @@ static void imsm_set_disk(struct active_array *a, int n, int state)
> dprintf("imsm: set_disk %d:%x\n", n, state);
>
> ord = get_imsm_ord_tbl_ent(dev, n, -1);
> - disk = get_imsm_disk(super, ord_to_idx(ord));
> + dl = get_imsm_dl_disk(super, ord_to_idx(ord));
This sometimes return NULL, leading to bad stuff and mdmon crashing....
So there is more to this than meets the eye...
I'll stop trying this patch.
Thanks,
NeilBrown
>
> /* check for new failures */
> if (state & DS_FAULTY) {
> - if (mark_failure(dev, disk, ord_to_idx(ord)))
> + if (mark_failure(dev, dl))
> super->updates_pending++;
> }
>
> @@ -6265,7 +6269,7 @@ static int apply_takeover_update(struct imsm_update_takeover *u,
> for (du = super->missing; du; du = du->next)
> if (du->index >= 0) {
> set_imsm_ord_tbl_ent(map, du->index, du->index);
> - mark_missing(dev_new, &du->disk, du->index);
> + mark_missing(dev_new, du);
> }
>
> return 1;
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Something wrong with __prep_thunderdome in super-intel.c
2011-03-25 8:43 ` Kwolek, Adam
2011-03-25 18:50 ` Dan Williams
@ 2011-03-28 2:28 ` NeilBrown
1 sibling, 0 replies; 14+ messages in thread
From: NeilBrown @ 2011-03-28 2:28 UTC (permalink / raw)
To: Kwolek, Adam
Cc: Williams, Dan J, linux-raid@vger.kernel.org, Ciechanowski, Ed,
Neubauer, Wojciech, Wojcik, Krzysztof
On Fri, 25 Mar 2011 08:43:26 +0000 "Kwolek, Adam" <adam.kwolek@intel.com>
wrote:
>
> Unfortunately this doesn't close issue.
>
> I've reached the same from IMSM compatibility point of view, but have in mind that unit test suit 11 fails using this approach.
> Put focus on test 5 in this suit (I'm using it, as this is the only one that fails for me) for IMSM (note that first test pass is for native metadata).
> Using your path tests 5b and 5c fails. My code fails 5c test only. I'm currently working on it.
I've got it passing all migration tests with your patch.
The '5' tests have a problem that they are trying to use both dev6 and dev7
which have the same backing device and are used for multipath testing.
I change it to use dev10 instead of dev7 and everything passes.
So I'll be releasing 3.2.1 shortly with what I've got.
Thanks,
NeilBrown
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Something wrong with __prep_thunderdome in super-intel.c
2011-03-28 1:35 ` NeilBrown
@ 2011-03-28 16:56 ` Dan Williams
0 siblings, 0 replies; 14+ messages in thread
From: Dan Williams @ 2011-03-28 16:56 UTC (permalink / raw)
To: NeilBrown
Cc: Kwolek, Adam, linux-raid@vger.kernel.org, Ciechanowski, Ed,
Neubauer, Wojciech, Wojcik, Krzysztof
On Sun, 2011-03-27 at 18:35 -0700, NeilBrown wrote:
> On Thu, 24 Mar 2011 19:40:46 -0700 Dan Williams <dan.j.williams@intel.com>
> wrote:
>
> > <context switch out of isci driver review mode>
>
> :-)
>
[..]
> > - disk = get_imsm_disk(super, ord_to_idx(ord));
> > + dl = get_imsm_dl_disk(super, ord_to_idx(ord));
>
> This sometimes return NULL, leading to bad stuff and mdmon crashing....
>
> So there is more to this than meets the eye...
Yes, (and I chalk this up to context switch latency), setting the index
to -2 is not correct as other paths need to be able to reference a valid
disk index until the failed device is removed via a rebuild.
> I'll stop trying this patch.
Ok, here is a proposed v2 on top of the latest devel-3.2, but I need to
play with it a bit more, and figure out what the spare migration test is
complaining about.
diff --git a/super-intel.c b/super-intel.c
index 6e12af2..e2f66aa 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -3993,7 +3993,7 @@ static int write_super_imsm(struct supertype *st, int doclose)
/* write the mpb for disks that compose raid devices */
for (d = super->disks; d ; d = d->next) {
- if (d->index < 0)
+ if (d->index < 0 || is_failed(&d->disk))
continue;
if (store_imsm_mpb(d->fd, mpb))
fprintf(stderr, "%s: failed for device %d:%d %s\n",
@@ -5218,6 +5218,8 @@ static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
__u32 ord;
int slot;
struct imsm_map *map;
+ char buf[MAX_RAID_SERIAL_LEN+3];
+ unsigned int len, shift = 0;
/* new failures are always set in map[0] */
map = get_imsm_map(dev, 0);
@@ -5230,6 +5232,11 @@ static int mark_failure(struct imsm_dev *dev, struct imsm_disk *disk, int idx)
if (is_failed(disk) && (ord & IMSM_ORD_REBUILD))
return 0;
+ sprintf(buf, "%s:0", disk->serial);
+ if ((len = strlen(buf)) >= MAX_RAID_SERIAL_LEN)
+ shift = len - MAX_RAID_SERIAL_LEN + 1;
+ strncpy((char *)disk->serial, &buf[shift], MAX_RAID_SERIAL_LEN);
+
disk->status |= FAILED_DISK;
set_imsm_ord_tbl_ent(map, slot, idx | IMSM_ORD_REBUILD);
if (map->failed_disk_num == 0xff)
^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads:[~2011-03-28 16:56 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-03-14 14:09 [PATCH 0/3] UT and error case changes Adam Kwolek
2011-03-14 14:09 ` [PATCH 1/3] imsm: FIX: existing backup file fails unit tests Adam Kwolek
2011-03-14 14:09 ` [PATCH 2/3] External metadata has to be restored to initial state in error case Adam Kwolek
2011-03-14 14:09 ` [PATCH 3/3] imsm: Add metadata abort changes handler template Adam Kwolek
2011-03-14 21:53 ` [PATCH 0/3] UT and error case changes NeilBrown
2011-03-15 7:28 ` Kwolek, Adam
2011-03-18 2:07 ` NeilBrown
2011-03-22 2:23 ` Something wrong with __prep_thunderdome in super-intel.c NeilBrown
2011-03-25 2:40 ` Dan Williams
2011-03-25 8:43 ` Kwolek, Adam
2011-03-25 18:50 ` Dan Williams
2011-03-28 2:28 ` NeilBrown
2011-03-28 1:35 ` NeilBrown
2011-03-28 16:56 ` Dan Williams
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).