linux-raid.vger.kernel.org archive mirror
* [Patch 00/17] Autorebuild
@ 2010-10-29 14:13 Czarnowska, Anna
  2010-11-17 10:22 ` Neil Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Czarnowska, Anna @ 2010-10-29 14:13 UTC
  To: Neil Brown
  Cc: linux-raid@vger.kernel.org, Neubauer, Wojciech, Williams, Dan J,
	Ciechanowski, Ed, Labun, Marcin,
	Hawrylewicz Czarnowski, Przemyslaw, Czarnowska, Anna

This is an updated series of patches that adds autorebuild functionality to the mdadm monitor, based on the new policy code.

Autorebuild Monitoring application:
The autorebuild monitor is part of the monitor application (mdadm -F). In the current mdadm monitor code, the autorebuild feature is based on spare-group assignment in the mdadm.conf file and works only for native metadata. This code has been retained for compatibility with the old config format.
The new autorebuild implementation also works for external metadata types. It uses the concept of domains in mdadm.conf introduced by Neil Brown.
The monitoring application periodically checks the state of active MD arrays and triggers a rebuild if there are eligible spare disks in other arrays/containers.
Degraded arrays are checked one by one. If there is a spare disk in another array/container that matches the domain of the degraded array, and the domain action allows spare sharing, the spare is moved using the existing Manage_subdevs function. If the addition fails, the spare device is moved back to its original container and the next potential spare is tried. This is repeated until all arrays have been checked; the process then sleeps for the configured period.
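
The pass described above can be pictured with a small sketch. It is only an illustration under stated assumptions: struct arr, struct disk, add_fn, add_if_big_enough and rebuild_pass are hypothetical names invented here, not mdadm's real types or functions, and the callback stands in for the real addition step (which the patches perform through Manage_subdevs). The check that a donor must not itself be degraded comes from the built-in assumptions listed further down in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified types; these are NOT mdadm's real structures. */
struct arr {
    const char *domain;           /* policy domain the array belongs to */
    bool degraded;                /* array is missing a device */
    bool sharing_allowed;         /* domain action permits spare sharing */
    unsigned long long min_size;  /* smallest disk the array can use */
};

struct disk {
    size_t holder;                /* index of the array/container holding this spare */
    unsigned long long size;
};

/* Stand-in for the real addition step; returns true on success. */
typedef bool (*add_fn)(const struct disk *d, const struct arr *to);

/* Example callback (an assumption): addition succeeds only if the disk is big enough. */
bool add_if_big_enough(const struct disk *d, const struct arr *to)
{
    return d->size >= to->min_size;
}

/* One monitoring pass; returns how many degraded arrays received a spare. */
int rebuild_pass(const struct arr *arrays, size_t n_arrays,
                 struct disk *spares, size_t n_spares, add_fn try_add)
{
    int moved = 0;

    assert(arrays != NULL && spares != NULL && try_add != NULL);
    for (size_t i = 0; i < n_arrays; i++) {
        const struct arr *to = &arrays[i];

        if (!to->degraded || !to->sharing_allowed)
            continue;
        for (size_t s = 0; s < n_spares; s++) {
            size_t from = spares[s].holder;

            if (from == i || arrays[from].degraded)
                continue;             /* own disk, or donor is itself degraded */
            if (strcmp(arrays[from].domain, to->domain) != 0)
                continue;             /* spare sits in a different domain */
            spares[s].holder = i;     /* move the spare over */
            if (try_add(&spares[s], to)) {
                moved++;
                break;                /* this array is served */
            }
            spares[s].holder = from;  /* addition failed: move it back */
        }
    }
    return moved;
}
```

A real monitor would call something like rebuild_pass once per polling interval and then sleep; the sketch leaves the sleep and all device I/O out.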

A new option, --no-sharing, has been added to Monitor mode so that monitoring can be run on its own (without moving spares). This is recommended when many instances of the monitor are to be run on the same set of devices.
Spare sharing is allowed in only one instance of Monitor, running with the --scan option. Users can still start monitoring functions in multiple instances without the --scan option.

The autorebuild built-in assumptions are:
1. Spares are shared only between arrays of the same metadata type.
2. Spares are moved only from containers/volumes that are not degraded.
3. Spares are moved only to containers/volumes lacking a *good* spare (one of sufficient size).
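
As a rough illustration, the three assumptions can be read as a single eligibility test. The sketch below is hypothetical: struct container, its fields and may_move_spare are names made up for this example, and the final check that the candidate spare itself is large enough is an assumption about what a "good" spare means, not something taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical summary of a container/volume; not an mdadm structure. */
struct container {
    const char *metadata;              /* metadata type name, e.g. "imsm" */
    bool degraded;
    unsigned long long largest_spare;  /* size of its biggest spare, 0 if none */
    unsigned long long needed_size;    /* size a spare must have to be usable (> 0) */
};

/* Returns true if a spare of spare_size may be moved from 'from' to 'to'. */
bool may_move_spare(const struct container *from,
                    const struct container *to,
                    unsigned long long spare_size)
{
    assert(from != NULL && to != NULL);
    if (strcmp(from->metadata, to->metadata) != 0)
        return false;   /* assumption 1: same metadata only */
    if (from->degraded)
        return false;   /* assumption 2: donor must not be degraded */
    if (to->largest_spare >= to->needed_size)
        return false;   /* assumption 3: recipient already has a good spare */
    /* Assumed extra check: the candidate itself must be big enough. */
    return spare_size >= to->needed_size;
}
```

Note that a recipient holding only an undersized spare still counts as lacking a good one, so a larger spare can be moved to it.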

Anna Czarnowska
Przemyslaw Hawrylewicz-Czarnowski
Marcin Labun

0001-added-path-path_id-to-give-the-information-on-the-pa.patch
0002-Update-of-udev-rules-to-support-IMSM-devices.patch
0003-extension-of-IncrementalRemove-to-store-location-pat.patch
0004-Incremental-for-bare-disks-implementation-of-spare-s.patch
0005-Util-get-device-size-from-id.patch
0006-Monitor-set-err-on-arrays-not-in-mdstat.patch
0007-Monitor-spare-group-based-spare-sharing-moved-to-sep.patch
0008-mdadm-added-no-sharing-option-for-Monitor-mode.patch
0009-Monitor-avoid-skipping-checks-on-external-arays.patch
0010-Monitor-include-containers-in-scan-mode.patch
0011-Monitor-link-containers-with-subarrays-in-statelist.patch
0012-imsm-create-mdinfo-list-of-disks-in-a-container-from.patch
0013-Monitor-autorebuild-functionality-added.patch
0014-Monitor-Respect-policy-in-auto-rebuild-in-mdadm-moni.patch
0015-Monitor-more-accurate-size-check-when-looking-for-sp.patch
0016-IMSM-Fix-problem-in-mdmon-monitor-of-using-removed-d.patch
0017-Policy-is-aware-of-metadata-disk-s-controller-domain.patch

Incremental.c      |  230 +++++++++++++++---
Makefile           |    3 +
Monitor.c          |  691 ++++++++++++++++++++++++++++++++++++++++++++--------
ReadMe.c           |    4 +
managemon.c        |   38 +++
mdadm.c            |   29 ++-
mdadm.h            |   49 ++++-
policy.c           |  134 +++++++++-
super-intel.c      |  274 ++++++++++++++++++---
udev-md-raid.rules |    7 +-
util.c             |   23 ++
11 files changed, 1290 insertions(+), 192 deletions(-)
---------------------------------------------------------------------
Intel Technology Poland sp. z o.o.
z siedziba w Gdansku
ul. Slowackiego 173
80-298 Gdansk

Sad Rejonowy Gdansk Polnoc w Gdansku, 
VII Wydzial Gospodarczy Krajowego Rejestru Sadowego, 
numer KRS 101882

NIP 957-07-52-316
Kapital zakladowy 200.000 zl

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


* RE: Devel 3.2 branch issues
@ 2010-11-22 22:39 Czarnowska, Anna
  2010-11-23  0:52 ` Neil Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Czarnowska, Anna @ 2010-11-22 22:39 UTC
  To: Neil Brown
  Cc: linux-raid@vger.kernel.org, Neubauer, Wojciech, Williams, Dan J,
	Ciechanowski, Ed, Labun, Marcin,
	Hawrylewicz Czarnowski, Przemyslaw


> by the way, some of the changes in the patches you sent have not been
> included in any form.  They include:
>
> - the getinfo_super_disks method.  I couldn't see why you need this.  All the
>   info about the state of the arrays should already be available.
>   If there is something that you need that we don't have, please explain and
>   we can see how best to add it back in.

Marcin has already answered this, but here is my explanation.
The current test devstate[i]==0 is always true for a container, so any device looks like a good candidate to move.
To be able to identify members, failed devices and real spares, we updated devstate for containers.
To find members we can just check which disks are used in subarrays, but a failed disk is removed from its subarray after a short while, and as soon as that happens we cannot tell the failed disk from a spare unless we look at the metadata.
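
To make the distinction concrete, here is a minimal sketch of the three-way classification. The names disk_view, disk_role and classify are hypothetical, and the two flags stand in for "is used in a subarray" and "is marked failed in the metadata"; how those facts are actually obtained is exactly what getinfo_super_disks was meant to provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-disk facts for a disk inside a container. */
struct disk_view {
    bool used_in_subarray;    /* some subarray currently uses this disk */
    bool failed_in_metadata;  /* the metadata records this disk as failed */
};

enum disk_role { DISK_MEMBER, DISK_FAILED, DISK_SPARE };

/* Classify a container disk. The failed flag is checked first because a
 * failed disk stays in its subarray only for a short while; once removed,
 * the metadata is the only thing that separates it from a real spare. */
enum disk_role classify(const struct disk_view *d)
{
    assert(d != NULL);
    if (d->failed_in_metadata)
        return DISK_FAILED;
    if (d->used_in_subarray)
        return DISK_MEMBER;
    return DISK_SPARE;
}
```

Without the metadata flag, the second and third cases collapse into one for any disk that has already been dropped from its subarray.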

> - min_active_disk_size_in_array.  I don't think the minimum current size is
>   really a good guide.  I've kept the code for letting the metadata handler
>   check the size, but anything beyond that should be done with domains I
>   think.
>   E.g. have a domain '2G-or-greater' which is assigned to all 2G or greater
>   devices.  Then anything smaller will automatically be excluded from arrays
>   with those devices.
 
So if someone doesn't base domains on size, they may have a small spare added to an array where it cannot be used.
Min_active_disk_size was more than required for an array that didn't occupy the whole disk, but at least it ensured that we were not throwing in something that wouldn't help. If we do add such a disk, the array will remain degraded but will have a spare, so Monitor may think it does not need more.
For this reason we also checked the case where there was already a spare in the "to" container. If that spare was not suitable (a size check here too), we would still look for a good one.

And now back to assembly. There is still a segmentation fault when we try to assemble a subarray. It occurs when there is any config file and we run "mdadm -As" or "mdadm -Asc /etc/mdadm.conf": content is NULL when we try to compare the uuid at line 413 of Assemble.c.
We are going to prepare some tests to add to the current suite, so it will be easier to verify new patches.

Regards
Anna

end of thread, other threads:[~2010-12-23 15:44 UTC | newest]

Thread overview: 34+ messages
2010-10-29 14:13 [Patch 00/17] Autorebuild Czarnowska, Anna
2010-11-17 10:22 ` Neil Brown
2010-11-17 16:04   ` Labun, Marcin
2010-11-18 23:14   ` Czarnowska, Anna
2010-11-19 12:43     ` Devel 3.2 branch issues Czarnowska, Anna
2010-11-22  3:29       ` Neil Brown
2010-11-22 17:18         ` Labun, Marcin
2010-11-22 18:47           ` Dan Williams
2010-11-23 17:34         ` Labun, Marcin
2010-11-19 15:12     ` Autorebuild, new dynamic udev rules for hot-plugs Hawrylewicz Czarnowski, Przemyslaw
2010-11-22  5:02       ` Neil Brown
2010-11-22 23:50         ` Hawrylewicz Czarnowski, Przemyslaw
2010-11-23  0:11           ` Dan Williams
2010-11-23  1:17             ` Neil Brown
2010-11-23  5:04               ` Dan Williams
2010-11-23  5:27                 ` Neil Brown
2010-11-23  6:17                   ` Dan Williams
2010-11-23 17:01               ` Hawrylewicz Czarnowski, Przemyslaw
2010-12-23 15:44                 ` Hawrylewicz Czarnowski, Przemyslaw
2010-11-22  2:16     ` [Patch 00/17] Autorebuild Neil Brown
2010-11-22 15:08       ` Czarnowska, Anna
2010-11-23  1:34         ` Neil Brown
2010-11-23 18:20           ` Labun, Marcin
2010-12-09 11:40             ` Czarnowska, Anna
2010-12-13  0:21               ` Neil Brown
2010-12-14 14:47                 ` [PATCH] fix: Monitor doesn't return after starting daemon Czarnowska, Anna
2010-12-14 21:58                   ` Neil Brown
  -- strict thread matches above, loose matches on Subject: below --
2010-11-22 22:39 Devel 3.2 branch issues Czarnowska, Anna
2010-11-23  0:52 ` Neil Brown
2010-11-23 12:04   ` Czarnowska, Anna
2010-11-25  8:01   ` Neil Brown
2010-11-25 10:28     ` Czarnowska, Anna
2010-11-26 18:23       ` Czarnowska, Anna
2010-11-28 22:59         ` Neil Brown
