linux-raid.vger.kernel.org archive mirror
From: Dan Williams <dan.j.williams@intel.com>
To: Neil Brown <neilb@suse.de>
Cc: "Ciechanowski, Ed" <ed.ciechanowski@intel.com>,
	"Labun, Marcin" <Marcin.Labun@intel.com>,
	"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Re: [GIT PATCH 0/2] external-metadata recovery checkpointing for 2.6.33
Date: Tue, 15 Dec 2009 11:03:06 -0700
Message-ID: <e9c3a7c20912151003y942a4aex803e1e6722f23f31@mail.gmail.com>
In-Reply-To: <e9c3a7c20912142019j280a243csf8c39a73fc3d0b06@mail.gmail.com>

On Mon, Dec 14, 2009 at 9:19 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> On second thought, if we get to activate_spare() it's already too
> late.  Moving this to mdadm at assembly time (prior to setting
> readonly) is a better approach.
>

Problem.  slot_store() in the array inactive case currently does:

                /* assume it is working */
                clear_bit(Faulty, &rdev->flags);
                clear_bit(WriteMostly, &rdev->flags);
                set_bit(In_sync, &rdev->flags);
                sysfs_notify_dirent(rdev->sysfs_state);

i.e. it marks the disk In_sync even if we specified a recovery_start <
MaxSector.  If userspace can guarantee that the array stays inactive,
then it can write to 'recovery_start' after 'slot' and catch attempts
to cold_add() out-of-sync disks on pre-2.6.33 kernels, but that leaves
a window of invalid configuration between the two writes.  The other
fix is to remove the set_bit(In_sync), and then for the pre-2.6.33
case userspace would need to disallow adding out-of-sync disks and
force them through the hot_add() case.  This is how mdadm/mdmon
currently operates, but it is a surprising ABI quirk when switching
to/from 2.6.33.  A third option is to allow 'recovery_start' to be
written while the array is read-only.  It is not my favorite, because
it requires tricky mdmon logic to catch activate_spare() attempts
before the monitor thread starts touching the array, but it has the
benefit of not changing any old behavior and leaving no window of
invalid configuration.  Thoughts?
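
To make the window concrete, here is a rough userspace sketch of the
write-'slot'-then-'recovery_start' ordering described above.  It is a toy
model, not kernel code: struct toy_rdev, the toy_* functions, MAX_SECTOR and
the flag constants are all made up for illustration, and the assumption that
writing 'recovery_start' clears In_sync for values below MaxSector is mine,
not something the patches are confirmed to do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy userspace model of the ordering issue; NOT kernel code. */
typedef uint64_t sector_t;
#define MAX_SECTOR (~(sector_t)0)      /* stand-in for the kernel's MaxSector */

enum {
	FAULTY       = 1u << 0,
	WRITE_MOSTLY = 1u << 1,
	IN_SYNC      = 1u << 2,
};

struct toy_rdev {                      /* hypothetical stand-in for an rdev */
	unsigned int flags;
	sector_t recovery_offset;
	int slot;
};

/* Mirrors the quoted inactive-array path: the disk is assumed working. */
static void toy_slot_store_inactive(struct toy_rdev *rdev, int slot)
{
	rdev->slot = slot;
	rdev->flags &= ~(FAULTY | WRITE_MOSTLY);
	rdev->flags |= IN_SYNC;
}

/*
 * ASSUMPTION: a 'recovery_start' write records the checkpoint and makes
 * In_sync agree with it (set only when the value is MAX_SECTOR).
 */
static void toy_recovery_start_store(struct toy_rdev *rdev, sector_t start)
{
	rdev->recovery_offset = start;
	if (start == MAX_SECTOR)
		rdev->flags |= IN_SYNC;
	else
		rdev->flags &= ~IN_SYNC;
}

/*
 * The configuration is consistent when In_sync agrees with what userspace
 * knows about the disk: in sync only if nothing is left to recover.
 */
static bool toy_config_is_valid(const struct toy_rdev *rdev,
				sector_t intended_start)
{
	bool in_sync = (rdev->flags & IN_SYNC) != 0;

	return in_sync == (intended_start == MAX_SECTOR);
}
```

With an intended checkpoint of, say, sector 1024, toy_config_is_valid()
reports false right after toy_slot_store_inactive() and true again only
once toy_recovery_start_store() has run.  That gap is the window of invalid
configuration, and it is only harmless if the array is guaranteed to stay
inactive in between.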

--
Dan

Thread overview: 11+ messages
2009-12-13  4:17 [GIT PATCH 0/2] external-metadata recovery checkpointing for 2.6.33 Dan Williams
2009-12-13  4:17 ` [PATCH 1/2] md: rcu_read_lock() walk of mddev->disks in md_do_sync() Dan Williams
2009-12-13  4:17 ` [PATCH 2/2] md: add 'recovery_start' sysfs attribute Dan Williams
2009-12-14  4:07 ` [GIT PATCH 0/2] external-metadata recovery checkpointing for 2.6.33 Neil Brown
2009-12-14  4:49   ` Dan Williams
2009-12-14  5:35     ` Neil Brown
2009-12-15  0:37   ` Dan Williams
2009-12-15  4:19     ` Dan Williams
2009-12-15 18:03       ` Dan Williams [this message]
2009-12-16  5:16         ` Neil Brown
2009-12-16  6:24           ` Dan Williams
