linux-raid.vger.kernel.org archive mirror
* [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy
From: Labun, Marcin @ 2010-10-01 12:36 UTC
  To: Neil Brown
  Cc: linux-raid@vger.kernel.org, Czarnowska, Anna,
	Hawrylewicz Czarnowski, Przemyslaw, Neubauer, Wojciech,
	Williams, Dan J, Ciechanowski, Ed, dledford@redhat.com

From f423b226f10cfe3b416c5e0580dde45cd8ca887d Mon Sep 17 00:00:00 2001
From: Marcin Labun <marcin.labun@intel.com>
Date: Wed, 29 Sep 2010 06:12:38 +0200
Subject: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy

This is an updated series of patches that adds autorebuild functionality to the
mdadm monitor, based on the new policy code.

Autorebuild monitoring application:
The autorebuild monitor is part of the monitor application (mdadm -F). In the
current mdadm monitor code, the autorebuild feature is based on spare-group
assignment in the mdadm.conf file and works only for native metadata.
The new autorebuild implementation works for all metadata types. It uses
the concept of domains in mdadm.conf introduced by Neil Brown.
The monitoring application periodically checks the state of active MD arrays
and triggers a rebuild if there are eligible spare disks in other containers.
Degraded arrays are checked one by one. For each array, a potential spare disk
is searched for. If the spare disk matches the domain of the degraded array and
the domain action allows spare sharing, the spare is moved using the existing
Manage_subdevs function. If the addition fails, the spare device is moved back
to its original container and the next potential spare is tried. This is
repeated until all arrays have been checked; the process then sleeps
for a configured period.
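
The pass described above can be sketched roughly as follows. This is only an
illustrative model, not the code from the patches: the struct layouts, the
add_fails simulation knob, and the names try_move and autorebuild_pass are
assumptions made for the example, and try_move merely stands in for the
Manage_subdevs-based move.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; these are not mdadm's real structures. */
struct spare {
    const char *domain;   /* policy domain the disk belongs to */
    int container;        /* index of the container currently holding it */
    bool add_fails;       /* simulation knob: pretend the add is rejected */
    bool in_use;          /* already handed to an array during this pass */
};

struct array {
    const char *domain;   /* policy domain of the array */
    int container;        /* index of the container the array lives in */
    bool degraded;
    bool sharing_allowed; /* does the domain action permit spare sharing? */
    bool got_spare;       /* set once a spare has been moved in */
};

/* Stand-in for the Manage_subdevs-based move; returns true on success. */
static bool try_move(struct spare *sp, int to_container)
{
    int from = sp->container;

    sp->container = to_container;   /* move the spare to the target */
    if (sp->add_fails) {
        sp->container = from;       /* add failed: move it back */
        return false;
    }
    return true;
}

/* One monitoring pass; returns the number of arrays that received a spare. */
static int autorebuild_pass(struct array *arrays, size_t n_arrays,
                            struct spare *spares, size_t n_spares)
{
    int moved = 0;

    assert(arrays != NULL && spares != NULL);
    for (size_t a = 0; a < n_arrays; a++) {
        struct array *arr = &arrays[a];

        if (!arr->degraded || !arr->sharing_allowed || arr->got_spare)
            continue;
        for (size_t s = 0; s < n_spares; s++) {
            struct spare *sp = &spares[s];

            if (sp->in_use || sp->container == arr->container)
                continue;           /* only free spares from other containers */
            if (strcmp(sp->domain, arr->domain) != 0)
                continue;           /* spare must match the array's domain */
            if (try_move(sp, arr->container)) {
                sp->in_use = true;
                arr->got_spare = true;
                moved++;
                break;              /* go on to the next degraded array */
            }
            /* on failure the spare is back home; try the next candidate */
        }
    }
    return moved;
}
```

A caller would run autorebuild_pass once per polling interval and sleep in
between; the sleep is left out so the sketch stays deterministic.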

The design of the mdadm monitor requires that only one autorebuild process is running.
Therefore a new option, --no-sharing, has been added to Monitor mode, and spare sharing is
allowed in only one instance of Monitor. The user can still start monitoring functions
in multiple instances.

The built-in assumptions of autorebuild are:
1/ spares are shared only between arrays with the same metadata
2/ spares are moved only from containers/volumes that are not degraded
3/ spares are moved only to containers/volumes lacking a *good* spare (one of sufficient size)
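
As a rough illustration, the three assumptions can be read as a single
eligibility check. The sketch below is hypothetical: struct container_info,
its fields, and may_move_spare are invented for the example and do not appear
in the patches, and "good" is taken to mean "at least as large as the
destination needs".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical summary of a container or volume; not an mdadm structure. */
struct container_info {
    const char *metadata;              /* e.g. "imsm" or "1.2" */
    bool degraded;
    unsigned long long largest_spare;  /* size of its biggest spare, 0 if none */
    unsigned long long min_spare_size; /* smallest disk it could rebuild onto */
};

/* Does the destination already own a spare that is big enough to use? */
static bool has_good_spare(const struct container_info *c)
{
    return c->largest_spare != 0 && c->largest_spare >= c->min_spare_size;
}

/* Sizes may be in any unit as long as all of them use the same one. */
static bool may_move_spare(const struct container_info *from,
                           const struct container_info *to,
                           unsigned long long spare_size)
{
    assert(from != NULL && to != NULL);
    if (strcmp(from->metadata, to->metadata) != 0)
        return false;   /* 1/ same metadata only */
    if (from->degraded)
        return false;   /* 2/ never take a spare from a degraded source */
    if (has_good_spare(to))
        return false;   /* 3/ destination already has a usable spare */
    return spare_size >= to->min_spare_size; /* candidate must itself be good */
}
```

Note that a destination holding only an undersized spare still qualifies,
which is one possible reading of "lacking a *good* spare".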


0001-Monitor-set-err-on-arrays-not-in-mdstat.patch
0002-Monitor-removed-spare-group-based-spare-sharing-code.patch
0003-mdadm-added-no-sharing-parameter-for-Monitor-mode.patch
0004-Monitor-link-container-volumes-in-statelist.patch
0005-imsm-create-mdinfo-list-of-disks-in-a-container-from.patch
0006-Monitor-autorebuild-funcionality-added.patch
0007-Monitor-Respect-policy-in-auto-rebuild-in-mdadm-moni.patch
0008-Monitor-Helper-functions-added-for-spare_sharing-in-.patch


 Monitor.c     |  605 +++++++++++++++++++++++++++++++++++++++++++++++----------
 ReadMe.c      |    2 +
 mdadm.c       |    8 +-
 mdadm.h       |    8 +-
 super-intel.c |   53 +++++
 5 files changed, 565 insertions(+), 111 deletions(-)




* Re: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy
From: Neil Brown @ 2010-10-19  0:40 UTC
  To: Labun, Marcin
  Cc: linux-raid@vger.kernel.org, Czarnowska, Anna,
	Hawrylewicz Czarnowski, Przemyslaw, Neubauer, Wojciech,
	Williams, Dan J, Ciechanowski, Ed, dledford@redhat.com

On Fri, 1 Oct 2010 13:36:48 +0100
"Labun, Marcin" <Marcin.Labun@intel.com> wrote:

> >From f423b226f10cfe3b416c5e0580dde45cd8ca887d Mon Sep 17 00:00:00 2001
> From: Marcin Labun <marcin.labun@intel.com>
> Date: Wed, 29 Sep 2010 06:12:38 +0200
> Subject: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy
> 
> This is updated series of patches forming autorebuild functionality in mdadm 
> monitor based on new policy code.

Hi Marcin,
 thanks for this, and apologies for not replying sooner.
 I've had a bit of a look and some of it seems good.
 I haven't had a thorough look yet as I am in the middle of doing some fairly
 serious refactoring of mdadm (the supertype, and mdinfo structures are going
 to be heavily changed and largely merged - some super_switch methods will
 disappear (e.g. getinfo_super) and others will appear (load_container)).
 Once I have finished that I will review your code more thoroughly and merge
 it into the new code base.

 One concern I do have is patch 0002 which removes the spare-group based
 spare migration.  That functionality needs to stay, though obviously the
 implementation can change.  I imagine the 'spare-group' information would be
 added to each member device as a 'domain' name.
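
 One way to picture this, purely as an assumption about how it might look:
 every device carries a small set of domain names, a spare-group name is
 added to that set like any other domain, and two devices may exchange
 spares when their sets intersect. The names dev_domains, add_domain and
 share_domain below are hypothetical and are not taken from mdadm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DOMAINS 8

/* Hypothetical per-device set of domain names. */
struct dev_domains {
    const char *names[MAX_DOMAINS];
    size_t count;
};

/* Add a domain name (for instance a spare-group name) to a device's set. */
static bool add_domain(struct dev_domains *d, const char *name)
{
    assert(d != NULL);
    if (name == NULL)
        return true;                /* no spare-group set: nothing to add */
    for (size_t i = 0; i < d->count; i++)
        if (strcmp(d->names[i], name) == 0)
            return true;            /* already present */
    if (d->count == MAX_DOMAINS)
        return false;               /* set is full */
    d->names[d->count++] = name;
    return true;
}

/* Two devices may share spares if they have at least one domain in common. */
static bool share_domain(const struct dev_domains *a,
                         const struct dev_domains *b)
{
    for (size_t i = 0; i < a->count; i++)
        for (size_t j = 0; j < b->count; j++)
            if (strcmp(a->names[i], b->names[j]) == 0)
                return true;
    return false;
}
```

 With this shape, existing spare-group configurations would keep working
 while policy-defined domains use the same matching path.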

 Also it is best not to remove functionality and then re-add it a different
 way, but rather to make sure the functionality works after every change, but
 just gets extended at various points.

Thanks,
NeilBrown





* Re: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy
From: Dan Williams @ 2010-10-19  6:54 UTC
  To: Neil Brown
  Cc: Labun, Marcin, linux-raid@vger.kernel.org, Czarnowska, Anna,
	Hawrylewicz Czarnowski, Przemyslaw, Neubauer, Wojciech,
	Ciechanowski, Ed, dledford@redhat.com

On 10/18/2010 5:40 PM, Neil Brown wrote:
> On Fri, 1 Oct 2010 13:36:48 +0100
> "Labun, Marcin"<Marcin.Labun@intel.com>  wrote:
>
>> > From f423b226f10cfe3b416c5e0580dde45cd8ca887d Mon Sep 17 00:00:00 2001
>> From: Marcin Labun<marcin.labun@intel.com>
>> Date: Wed, 29 Sep 2010 06:12:38 +0200
>> Subject: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy
>>
>> This is updated series of patches forming autorebuild functionality in mdadm
>> monitor based on new policy code.
>
> Hi Marcin,
>   thanks for this, and apologies for not replying sooner.
>   I've had a bit of a look and some of it seems good.
>   I haven't had a thorough look yet as I am in the middle of doing some fairly
>   serious refactoring of mdadm (the supertype, and mdinfo structures are going
>   to be heavily changed and largely merged - some super_switch methods will
>   disappear (e.g. getinfo_super) and others will appear (load_container)).
>   Once I have finished that I will review your code more thoroughly and merge
>   it into the new code base.
>
>   One concern I do have is patch 0002 which removes the spare-group based
>   spare migration.  That functionality needs to stay, though obviously the
>   implementation can change.  I imagine the 'spare-group' information would be
>   added to each member device as a 'domain' name.
>
>   Also it is best not to remove functionality and then re-add it a different
>   way, but rather to make sure the functionality works after every change, but
>   just gets extended at various points.

Hi Neil,

I made a similar comment on this patch during our internal review.  We 
also talked about the need for superswitch methods that can be used to 
1/ determine which devices in a container are spares versus stale disks 
2/ what the minimum size a bare disk needs to be to join a container. 
I'll wait to see if these items will be easier to determine with the new 
mdinfo/supertype refactoring.
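
To make the two requested methods concrete, here is a hypothetical shape they
could take as a table of function pointers with a toy backend. Nothing here
is an existing mdadm interface: spare_ops, classify, min_spare_size, the
generation-based staleness test, and the "largest per-member footprint" size
rule are all assumptions chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum disk_role { DISK_MEMBER, DISK_SPARE, DISK_STALE };

/* Hypothetical views of a disk and its container. */
struct disk_view {
    unsigned long long generation;  /* metadata generation found on the disk */
    bool in_volume;                 /* disk is referenced by some volume */
};

struct container_view {
    unsigned long long generation;         /* current metadata generation */
    const unsigned long long *member_used; /* space used on each member disk */
    size_t n_members;
};

/* Hypothetical per-metadata methods answering the two questions. */
struct spare_ops {
    enum disk_role (*classify)(const struct container_view *,
                               const struct disk_view *);
    unsigned long long (*min_spare_size)(const struct container_view *);
};

/* Toy rule: a disk carrying an older generation than the container is stale. */
static enum disk_role toy_classify(const struct container_view *c,
                                   const struct disk_view *d)
{
    if (d->generation < c->generation)
        return DISK_STALE;
    return d->in_volume ? DISK_MEMBER : DISK_SPARE;
}

/* Toy rule: a bare disk must hold the largest footprint used on any member. */
static unsigned long long toy_min_spare_size(const struct container_view *c)
{
    unsigned long long need = 0;

    for (size_t i = 0; i < c->n_members; i++)
        if (c->member_used[i] > need)
            need = c->member_used[i];
    return need;
}

static const struct spare_ops toy_ops = { toy_classify, toy_min_spare_size };
```

A monitor loop could then ask toy_ops.classify() before treating a disk as a
movable spare and compare a candidate's size against
toy_ops.min_spare_size().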

Other notes:
The --activate-domains option [1] to validate the configuration file and 
install custom/filtered udev rules for the ports we care about, seemed 
like a good idea at the time.  Now that things are a bit further along 
do you have a better solution in mind or is this still the approach we 
want to take?  Przemek currently has a patch to filter all block device 
events through mdadm to query the configuration file for domain events 
which seems like overkill if not a performance problem for large disk 
count environments.

We also talked about migration, but I'll put those details in a separate 
thread.

Thanks,
Dan

[1]: http://marc.info/?l=linux-raid&m=127001124615043&w=2


* RE: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy
From: Labun, Marcin @ 2010-10-20 15:41 UTC
  To: Neil Brown
  Cc: linux-raid@vger.kernel.org, Czarnowska, Anna,
	Hawrylewicz Czarnowski, Przemyslaw, Neubauer, Wojciech,
	Williams, Dan J, Ciechanowski, Ed, dledford@redhat.com

> -----Original Message-----
> From: Neil Brown [mailto:neilb@suse.de]
> 
> On Fri, 1 Oct 2010 13:36:48 +0100
> "Labun, Marcin" <Marcin.Labun@intel.com> wrote:
> 
> > >From f423b226f10cfe3b416c5e0580dde45cd8ca887d Mon Sep 17 00:00:00 2001
> > From: Marcin Labun <marcin.labun@intel.com>
> > Date: Wed, 29 Sep 2010 06:12:38 +0200
> > Subject: [AUTOREBUILD 0/8] Autorebuild monitor patches based on user defined policy
> >
> > This is updated series of patches forming autorebuild functionality in mdadm
> > monitor based on new policy code.
> 
> Hi Marcin,
>  thanks for this, and apologies for not replying sooner.
>  I've had a bit of a look and some of it seems good.
>  I haven't had a thorough look yet as I am in the middle of doing some fairly
>  serious refactoring of mdadm (the supertype, and mdinfo structures are going
>  to be heavily changed and largely merged - some super_switch methods will
>  disappear (e.g. getinfo_super) and others will appear (load_container)).
>  Once I have finished that I will review your code more thoroughly and merge
>  it into the new code base.
> 
>  One concern I do have is patch 0002 which removes the spare-group based
>  spare migration.  That functionality needs to stay, though obviously the
>  implementation can change.  I imagine the 'spare-group' information would be
>  added to each member device as a 'domain' name.
> 
>  Also it is best not to remove functionality and then re-add it a different
>  way, but rather to make sure the functionality works after every change, but
>  just gets extended at various points.
> 
Hi Neil,
Next week we are planning to make another drop that includes spare-groups and a number of code rework changes.
Thanks,
Marcin

