linux-raid.vger.kernel.org archive mirror
* [mdadm git pull] support for removed disks / imsm updates
@ 2009-02-28  0:05 Dan Williams
  2009-03-04 22:41 ` Neil Brown
  0 siblings, 1 reply; 3+ messages in thread
From: Dan Williams @ 2009-02-28  0:05 UTC
  To: NeilBrown; +Cc: linux-raid, Jacek Danecki, Ed Ciechanowski

Hi Neil,

This update brings:

1/ Various imsm fixes, importantly some updates due to a clarification
of how the Windows driver marks disk failures
2/ Support for handling removed disks as currently all container
manipulations fail once a live disk is hot-unplugged.
3/ An initial mdmon man page
4/ imsm auto layout support
5/ Updates to --incremental in pursuit of assembling external metadata
arrays in the initramfs via udev events

The one "fix" that is missing from this update is to teach mdmon to kick
"non-fresh" drives similar to what the kernel does at initial assembly.
I dropped the attempt after realizing I would need to take an O_EXCL
open on the container in an awkward place.  I guess it is not necessary,
but it is a quirk of containers that known failed drives can be allowed
back into the container.
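
For context, the kernel behaviour referred to above can be sketched
roughly as follows. This is an illustration in Python rather than kernel
C. It assumes that "freshness" is judged by an event counter stored in
each member's metadata, and that any member behind the highest count is
treated as non-fresh. The Member type and the split_fresh name are
hypothetical, and the real kernel logic has more cases than this.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    events: int  # assumed: event counter recorded in this member's metadata


def split_fresh(members):
    """Split members into (fresh, non_fresh) by comparing event counters.

    Simplified assumption: a member is fresh only if its counter equals
    the highest counter seen across all members.
    """
    if not members:
        return [], []
    newest = max(m.events for m in members)
    fresh = [m for m in members if m.events == newest]
    non_fresh = [m for m in members if m.events < newest]
    return fresh, non_fresh
```

A container that skips this step keeps the members that would land in
non_fresh, which is the quirk described above.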

Thanks to Jacek Danecki and my colleagues in Poland for identifying the
removed disks problem and working through the regressions on this
release.  Also along these lines you will see that I have moved
development to github to allow more collaboration on this tree.

Please have a look.

Is it about time for a -devel3 release?

Regards,
Dan

---

The following changes since commit 6c40598f598874d1d4c2c4d0da0c2a9b873d768d:
  NeilBrown (1):
        Merge branch 'master' into devel-3.0

are available in the git repository at:

  git://github.com/djbw/mdadm.git master

Dan Williams (24):
      imsm: don't check raid1 chunk size
      imsm: block creation of devices with identical names
      imsm: provide a simulated option-rom for regression tests
      test: fix a call to udevsettle
      imsm: fix missing initializations of the per-disk extents pointer
      imsm: fixup container spare uuids by default
      imsm: fix activate spare to ignore foreign disks
      imsm: introduce get_imsm_disk_slot
      imsm: fix mark_failure / introduce mark_missing
      imsm: verify single sector mpb checksums
      imsm: retry load_imsm_mpb if we suspect mdmon has made modifications
      sysfs: allow sysfs_read to detect and drop removed disks
      mdmon: fix removed disk handling
      mdmon: record added disks
      Manage: permit '--remove detached' for containers
      Create: wait_for container creation
      Create: fixup 'insert_point', dependent on 'subdevs', for auto-layout
      imsm: auto layout
      mdmon: fix missed 'clean' event
      mdmon: man page
      mdmon: update cmdline when scanning
      Incremental: fix 'name_to_use' in the container case
      Incremental: honor --no-degraded to delay assembly
      imsm: display supported chunk sizes in --detail-platform

 Create.c                                           |    7 +-
 Incremental.c                                      |   16 +-
 Manage.c                                           |    9 +-
 managemon.c                                        |   32 ++-
 mdadm.8                                            |    5 +
 mdadm.c                                            |    1 +
 mdadm.h                                            |   34 +-
 mdmon.8                                            |  138 +++++++
 mdmon.c                                            |    9 +-
 monitor.c                                          |   49 +--
 platform-intel.c                                   |   18 +
 super-ddf.c                                        |    8 +-
 super-intel.c                                      |  386 ++++++++++++++++----
 sysfs.c                                            |   26 +-
 test                                               |    2 +-
 tests/09imsm-create-fail-rebuild                   |   56 +++
 tests/env-08imsm-overlap                           |   69 ++++-
 tests/{env-imsm => env-09imsm-create-fail-rebuild} |   35 ++-
 18 files changed, 760 insertions(+), 140 deletions(-)
 create mode 100644 mdmon.8
 create mode 100644 tests/09imsm-create-fail-rebuild
 mode change 120000 => 100644 tests/env-08imsm-overlap
 rename tests/{env-imsm => env-09imsm-create-fail-rebuild} (66%)




* Re: [mdadm git pull] support for removed disks / imsm updates
  2009-02-28  0:05 [mdadm git pull] support for removed disks / imsm updates Dan Williams
@ 2009-03-04 22:41 ` Neil Brown
  2009-03-04 23:59   ` Dan Williams
  0 siblings, 1 reply; 3+ messages in thread
From: Neil Brown @ 2009-03-04 22:41 UTC
  To: Dan Williams; +Cc: linux-raid, Jacek Danecki, Ed Ciechanowski

On Friday February 27, dan.j.williams@intel.com wrote:
> Hi Neil,
> 
> This update brings:
> 
> 1/ Various imsm fixes, importantly some updates due to a clarification
> of how the Windows driver marks disk failures
> 2/ Support for handling removed disks as currently all container
> manipulations fail once a live disk is hot-unplugged.

So this is when md thinks the device is in the array, but the device
has actually been removed, so the block/dev file is missing or
empty, or the status is not 'online'.

But we only check for that if mdmon is running.  For some reason that
seems odd, but I'm not really sure.
Why do we want to treat this case differently depending on whether
mdmon is running or not?
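
A minimal sketch of the check as worded above, in Python for brevity
(mdadm itself is C). It assumes a per-member directory that contains a
block/dev file. The location of the status file and the exact string
that means "healthy" are not given in this thread, so status_rel and
online_value are hypothetical parameters, as are the function names.

```python
import os


def read_text(path):
    """Return the stripped contents of path, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def member_is_gone(dev_dir, status_rel="block/device/state", online_value="online"):
    """Report whether a member device appears to have been removed.

    A member counts as gone when block/dev is missing or empty, or when a
    status file exists and does not read online_value. The default values
    of status_rel and online_value are assumptions made for illustration.
    """
    dev = read_text(os.path.join(dev_dir, "block", "dev"))
    if not dev:
        return True
    status = read_text(os.path.join(dev_dir, status_rel))
    # In this sketch a missing status file is not treated as a removal.
    return status is not None and status != online_value
```

The function takes the directory as an argument so that it can be
exercised against a scratch directory instead of a live sysfs tree.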

> 3/ An initial mdmon man page
> 4/ imsm auto layout support
> 5/ Updates to --incremental in pursuit of assembling external metadata
> arrays in the initramfs via udev events

Thanks.

Most look good.
My attention was caught by Create: wait_for container creation.

I vaguely remember trying that and it didn't work.  Something about
the md array not being in the right sort of state for udev to create a
device, or something...  But I expect you have tested it so maybe I'm
remembering something else.

> 
> The one "fix" that is missing from this update is to teach mdmon to kick
> "non-fresh" drives similar to what the kernel does at initial assembly.
> I dropped the attempt after realizing I would need to take an O_EXCL
> open on the container in an awkward place.  I guess it is not necessary,
> but it is a quirk of containers that known failed drives can be allowed
> back into the container.

I always thought it was a slightly odd quirk that if you had an array
with failed drives, then stopped and restarted the array, those failed
drives would no longer be there.
My feeling is that it doesn't matter a great deal one way or the
other.  The important thing is that when mdadm describes the state of
an array, it describes it in a way that doesn't confuse people (an
area in which v1.x metadata lets us down at the moment).

> 
> Thanks to Jacek Danecki and my colleagues in Poland for identifying the
> removed disks problem and working through the regressions on this
> release.  Also along these lines you will see that I have moved
> development to github to allow more collaboration on this tree.

Yes, thanks!  I see a lot of polishing happening in this patch set
which is nice.

I had noticed you had a presence on github.  I tried playing with it
one afternoon.  It seems awkwardly slow and very poorly documented.
Then I went looking at gitorious and got distracted...

> 
> Please have a look.
> 
> Is it about time for a -devel3 release?

Yes, or a -rc1.  RSN !! :-)

For now, all these patches have been pulled and pushed to neil.brown.name/mdadm

NeilBrown


* Re: [mdadm git pull] support for removed disks / imsm updates
  2009-03-04 22:41 ` Neil Brown
@ 2009-03-04 23:59   ` Dan Williams
  0 siblings, 0 replies; 3+ messages in thread
From: Dan Williams @ 2009-03-04 23:59 UTC
  To: Neil Brown; +Cc: linux-raid, Jacek Danecki, Ed Ciechanowski

On Wed, Mar 4, 2009 at 3:41 PM, Neil Brown <neilb@suse.de> wrote:
> On Friday February 27, dan.j.williams@intel.com wrote:
>> 2/ Support for handling removed disks as currently all container
>> manipulations fail once a live disk is hot-unplugged.
>
> So this is when md thinks the device is in the array, but the device
> has actually been removed, so the block/dev file is missing or
> empty, or the status is not 'online'.
>
> But we only check for that if mdmon is running.  For some reason that
> seems odd, but I'm not really sure.
> Why do we want to treat this case differently depending on whether
> mdmon is running or not?

The thinking, dubious or otherwise, is that if mdmon is not running
then the administrator is in charge of managing the container, and
would want to know about these errors.  I could not convince myself
that we *always* wanted to ignore missing disks here... so I erred
conservative.

However, we have already found another location where SKIP_GONE_DEVS
is needed, so part of me wonders about just making it the default?
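
The policy described above can be sketched as follows, in Python for
brevity (mdadm itself is C). Only the SKIP_GONE_DEVS name comes from
this thread. ReadOpt, GoneDeviceError, scan_members and options_for are
hypothetical names, and the caller supplies the is_gone predicate, so
nothing here reflects the actual sysfs_read code.

```python
from enum import Flag, auto


class ReadOpt(Flag):
    NONE = 0
    SKIP_GONE_DEVS = auto()


class GoneDeviceError(Exception):
    """Raised when a removed member is found and skipping is not enabled."""


def scan_members(members, options, is_gone):
    """Return the members that are still present.

    With SKIP_GONE_DEVS set, removed members are dropped silently;
    without it, the first removed member aborts the scan with an error.
    """
    kept = []
    for member in members:
        if is_gone(member):
            if options & ReadOpt.SKIP_GONE_DEVS:
                continue
            raise GoneDeviceError(member)
        kept.append(member)
    return kept


def options_for(mdmon_running):
    """Current behaviour as described: skip removed members only under mdmon."""
    return ReadOpt.SKIP_GONE_DEVS if mdmon_running else ReadOpt.NONE
```

Making the flag the default would amount to options_for always
returning SKIP_GONE_DEVS, at the cost of hiding these errors from an
administrator who manages the container by hand.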

>> 3/ An initial mdmon man page
>> 4/ imsm auto layout support
>> 5/ Updates to --incremental in pursuit of assembling external metadata
>> arrays in the initramfs via udev events
>
> Thanks.
>
> Most look good.
> My attention was caught by Create: wait_for container creation.
>
> I vaguely remember trying that and it didn't work.  Something about
> the md array not being in the right sort of state for udev to create a
> device, or something...  But I expect you have tested it so maybe I'm
> remembering something else.

It corrected a test script failure here fwiw, but will keep an eye out
for container creation deadlocks.

>>
>> The one "fix" that is missing from this update is to teach mdmon to kick
>> "non-fresh" drives similar to what the kernel does at initial assembly.
>> I dropped the attempt after realizing I would need to take an O_EXCL
>> open on the container in an awkward place.  I guess it is not necessary,
>> but it is a quirk of containers that known failed drives can be allowed
>> back into the container.
>
> I always thought it was a slightly odd quirk that if you had an array
> with failed drives, then stopped and restarted the array, those failed
> drives would no longer be there.
> My feeling is that it doesn't matter a great deal one way or the
> other.  The important thing is that when mdadm describes the state of
> an array, it describes it in a way that doesn't confuse people (an
> area in which v1.x metadata lets us down at the moment).

Ok, that clarifies things...

[..]
> For now, all these patches have been pulled and pushed to neil.brown.name/mdadm

Thanks!

--
Dan