From: Martin Wilck <mwilck@arcor.de>
To: NeilBrown <neilb@suse.de>, linux-raid@vger.kernel.org
Subject: Re: Suspicious test failure - mdmon misses recovery events on loop devices
Date: Tue, 30 Jul 2013 23:16:43 +0200
Message-ID: <51F82D3B.6060104@arcor.de>
In-Reply-To: <20130730104206.3ffc9f00@notabene.brown>
On 07/30/2013 02:42 AM, NeilBrown wrote:
> On Mon, 29 Jul 2013 22:42:25 +0200 Martin Wilck <mwilck@arcor.de> wrote:
>
>>
>>> My current idea to solve this is yet another separate thread just for
>>> monitoring kernel state changes. Don't have it ready yet, though.
>>
>> Another idea would be in manage_member, after queueing the metadata
>> update and waking up the monitor, to wait for the metadata to finish
>> processing before actually starting the recovery (writing "recover" to
>> sync_action).
>>
>> Martin
>
> I hope an extra thread won't be necessary :-)

I think the general problem is real: mdmon may be busy writing to disk
while something changes in the kernel. But introducing an extra thread
would make things even more complex than they are now, so it is
probably something to avoid.

> I think that manage_member is the place to fix this. However it might be
> even simpler than you suggest.
>
> We currently have
>
> 	replace_array(container, a, newa);
> 	sysfs_set_str(&a->info, NULL, "sync_action", "recover");
>
> monitor subsequently takes that 'newa', looks at 'sync_action', sees that it
> is 'idle' and assumes that the recovery never happened.
>
> Suppose we change it to:
>
> 	if (sysfs_set_str(&a->info, NULL, "sync_action", "recover") == 0)
> 		newa->prev_action = newa->curr_action = recovery;
> 	replace_array(container, a, newa);
>
> Then it wouldn't matter if monitor never saw the 'recovery' state as manager
> explicitly told it that recovery had started.
>
> Could you try that?
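
To spell out what that change buys us, here is a minimal, self-contained
sketch. It is not mdmon code: the struct, the fake kernel_sync_action
variable and the helper names are stand-ins invented for illustration,
and the enum value "recover" is an assumption on my part. It only models
a monitor that treats a recover -> idle transition as "recovery finished".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for mdmon's types; names follow the thread, bodies are hypothetical. */
enum sync_action { idle, recover };

struct active_array {
	enum sync_action prev_action, curr_action;
	int recoveries_completed;	/* hypothetical counter, demo only */
};

/* Hypothetical stand-in for the kernel's sync_action attribute. */
const char *kernel_sync_action = "idle";

/* Stand-in for sysfs_set_str(..., "sync_action", "recover"); 0 on success. */
int start_recovery(void)
{
	kernel_sync_action = "recover";
	return 0;
}

/* Manager side: record the recovery in 'newa' before handing it over. */
void manager_start_recovery(struct active_array *newa)
{
	assert(newa != NULL);
	if (start_recovery() == 0)
		newa->prev_action = newa->curr_action = recover;
}

/* Monitor side: one pass that notices a recover -> idle transition. */
void monitor_pass(struct active_array *a)
{
	a->curr_action = strcmp(kernel_sync_action, "recover") == 0 ? recover : idle;
	if (a->prev_action == recover && a->curr_action == idle)
		a->recoveries_completed++;
	a->prev_action = a->curr_action;
}
```

If the recovery completes before the monitor gets to look (as it can on
small loop devices), the monitor in this model still registers the
completion, because the manager has already recorded the recover state.
Without that, the monitor only ever sees idle -> idle and concludes
nothing happened.
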

Ingenious idea :-) Unfortunately it isn't sufficient. However this PLUS
waiting for the metadata update to finish makes my test succeed reliably
(10/10, in the previous failure scenario). I'll send in the current
status of patches in a minute.
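
The ordering I mean by "waiting for the metadata update to finish" can be
sketched roughly as below. This is a single-threaded model, not the
patch itself: the event log, queue_update(), monitor_step() and
update_pending() are hypothetical helpers that stand in for the update
queue and the monitor thread, and the busy loop stands in for whatever
sleep or blocking the real code would use.

```c
#include <assert.h>
#include <string.h>

#define MAX_EVENTS 8

/* Event log so the ordering can be inspected; purely illustrative. */
const char *events[MAX_EVENTS];
int n_events;

void log_event(const char *what)
{
	assert(n_events < MAX_EVENTS);
	events[n_events++] = what;
}

/* Hypothetical model of a slow metadata write: it needs several monitor passes. */
int passes_left;

void queue_update(int passes_needed)
{
	passes_left = passes_needed;
	log_event("update queued");
}

/* One monitor pass; stands in for the monitor thread making progress. */
void monitor_step(void)
{
	if (passes_left > 0 && --passes_left == 0)
		log_event("metadata written");
}

int update_pending(void)
{
	return passes_left > 0;
}

/* Manager: queue the update, wait until it has drained, then start recovery. */
void manage_member_sketch(int passes_needed)
{
	queue_update(passes_needed);
	while (update_pending())
		monitor_step();	/* real code would sleep or block here instead */
	log_event("recover started");
}
```

The point is only the order of events: in this model the recovery is
never started while a metadata update is still outstanding, however many
passes the write takes.
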
Martin
>
> Thanks,
> NeilBrown
Thread overview: 22+ messages
2013-07-26 20:58 Suspicious test failure - mdmon misses recovery events on loop devices Martin Wilck
2013-07-29 6:55 ` NeilBrown
2013-07-29 20:39 ` Martin Wilck
2013-07-29 20:42 ` Martin Wilck
2013-07-30 0:42 ` NeilBrown
2013-07-30 21:16 ` Martin Wilck [this message]
2013-07-30 21:18 ` [PATCH 00/10] Two bug fixes and a lot of debug code mwilck
2013-07-31 3:10 ` NeilBrown
2013-07-30 21:18 ` [PATCH 01/10] DDF: ddf_activate_spare: bugfix for 62ff3c40 mwilck
2013-07-30 21:18 ` [PATCH 02/10] DDF: log disk status changes more nicely mwilck
2013-07-30 21:18 ` [PATCH 03/10] DDF: ddf_process_update: log offsets for conf changes mwilck
2013-07-30 21:18 ` [PATCH 04/10] DDF: load_ddf_header: more error logging mwilck
2013-07-30 21:18 ` [PATCH 05/10] DDF: ddf_set_disk: add some debug messages mwilck
2013-07-30 21:18 ` [PATCH 06/10] monitor: read_and_act: log status when called mwilck
2013-07-31 2:59 ` NeilBrown
2013-07-31 5:28 ` Martin Wilck
2013-07-30 21:18 ` [PATCH 07/10] mdmon: wait_and_act: fix debug message for SIGUSR1 mwilck
2013-07-30 21:18 ` [PATCH 08/10] mdmon: manage_member: debug messages for array state mwilck
2013-07-30 21:18 ` [PATCH 09/10] mdmon: manage_member: fix race condition during slow meta data writes mwilck
2013-07-30 21:18 ` [PATCH 10/10] tests/10ddf-create-fail-rebuild: new unit test for DDF mwilck
2013-07-31 5:36 ` [PATCH] tests/env-ddf-template: helper for new unit test mwilck
2013-07-31 6:49 ` NeilBrown