linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* mdadm udev rules change
@ 2009-04-06 14:57 Doug Ledford
  2009-04-08  7:31 ` Neil Brown
  0 siblings, 1 reply; 4+ messages in thread
From: Doug Ledford @ 2009-04-06 14:57 UTC (permalink / raw)
  To: LinuxRaid RAID; +Cc: Neil Brown

[-- Attachment #1: Type: text/plain, Size: 1262 bytes --]

I'm not attaching a patch for this because it's so simple.  Long story  
short, watching both add and change events in udev rules is bad for md  
devices.  Specifically, the kernel will generate a change event on  
things like array stop, and on things like fdisk close.  In the case  
of array stop, it can result in the array being assembled again  
immediately.  In the case of fdisk close, the situation is worse.   
Let's say you stop all the md devices on some block device in order to  
repartition.  You run fdisk, change the partition table, then issue a  
write of the table.  The write of the table triggers the change event  
*before* the kernel updates the partition table in memory for the  
block device, causing udev to rerun the incremental rules on the old  
partition table and restart all the arrays you just stopped with the  
old partition table layout, at which point the kernel is unable to  
reread the partition table.  So, once you've enable incremental  
assembly, it becomes apparent that what we really want is to only  
start devices on add, not on add|change.

--

Doug Ledford <dledford@redhat.com>

GPG KeyID: CFBFF194
http://people.redhat.com/dledford

InfiniBand Specific RPMS
http://people.redhat.com/dledford/Infiniband





[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 203 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: mdadm udev rules change
  2009-04-06 14:57 mdadm udev rules change Doug Ledford
@ 2009-04-08  7:31 ` Neil Brown
  2009-04-09 19:20   ` Doug Ledford
  0 siblings, 1 reply; 4+ messages in thread
From: Neil Brown @ 2009-04-08  7:31 UTC (permalink / raw)
  To: Doug Ledford; +Cc: LinuxRaid RAID

On Monday April 6, dledford@redhat.com wrote:
> I'm not attaching a patch for this because it's so simple.  Long story  
> short, watching both add and change events in udev rules is bad for md  
> devices.  Specifically, the kernel will generate a change event on  
> things like array stop, and on things like fdisk close.  In the case  
> of array stop, it can result in the array being assembled again  
> immediately.  In the case of fdisk close, the situation is worse.   
> Let's say you stop all the md devices on some block device in order to  
> repartition.  You run fdisk, change the partition table, then issue a  
> write of the table.  The write of the table triggers the change event  
> *before* the kernel updates the partition table in memory for the  
> block device, causing udev to rerun the incremental rules on the old  
> partition table and restart all the arrays you just stopped with the  
> old partition table layout, at which point the kernel is unable to  
> reread the partition table.  So, once you've enable incremental  
> assembly, it becomes apparent that what we really want is to only  
> start devices on add, not on add|change.

So you aren't really talking about events on md devices, but rather on
events that might become components of md devices - correct?
So in udev-md-raid.rules we currently have

....
ACTION!="add|change", GOTO="md_end"

# import data from a raid member and activate it
#ENV{ID_FS_TYPE}=="linux_raid_member", IMPORT{program}="/sbin/mdadm --examine --export $tempnode", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
# import data from a raid set
KERNEL!="md*", GOTO="md_end"
......

And you are saying that if we uncomment the "#ENV{...." line, then we
only want it to fire on add devices.  So something like:

....
 ACTION!="add|change", GOTO="md_end"
+ACTION=="change", GOTO="no_incr"
 
 # import data from a raid member and activate it
 #ENV{ID_FS_TYPE}=="linux_raid_member", IMPORT{program}="/sbin/mdadm --examine --export $tempnode", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
 # import data from a raid set
+LABEL="no_incr"
 KERNEL!="md*", GOTO="md_end"
......

Is that correct?

Your explanation certainly seems reasonable.

Thanks.
NeilBrown

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: mdadm udev rules change
  2009-04-08  7:31 ` Neil Brown
@ 2009-04-09 19:20   ` Doug Ledford
  2009-04-10  9:22     ` Neil Brown
  0 siblings, 1 reply; 4+ messages in thread
From: Doug Ledford @ 2009-04-09 19:20 UTC (permalink / raw)
  To: Neil Brown; +Cc: LinuxRaid RAID

[-- Attachment #1: Type: text/plain, Size: 3615 bytes --]

On Apr 8, 2009, at 3:31 AM, Neil Brown wrote:
> On Monday April 6, dledford@redhat.com wrote:
>> I'm not attaching a patch for this because it's so simple.  Long  
>> story
>> short, watching both add and change events in udev rules is bad for  
>> md
>> devices.  Specifically, the kernel will generate a change event on
>> things like array stop, and on things like fdisk close.  In the case
>> of array stop, it can result in the array being assembled again
>> immediately.  In the case of fdisk close, the situation is worse.
>> Let's say you stop all the md devices on some block device in order  
>> to
>> repartition.  You run fdisk, change the partition table, then issue a
>> write of the table.  The write of the table triggers the change event
>> *before* the kernel updates the partition table in memory for the
>> block device, causing udev to rerun the incremental rules on the old
>> partition table and restart all the arrays you just stopped with the
>> old partition table layout, at which point the kernel is unable to
>> reread the partition table.  So, once you've enable incremental
>> assembly, it becomes apparent that what we really want is to only
>> start devices on add, not on add|change.
>
> So you aren't really talking about events on md devices, but rather on
> events that might become components of md devices - correct?
> So in udev-md-raid.rules we currently have
> ....
> ACTION!="add|change", GOTO="md_end"
>
> # import data from a raid member and activate it
> #ENV{ID_FS_TYPE}=="linux_raid_member", IMPORT{program}="/sbin/mdadm  
> --examine --export $tempnode", RUN+="/sbin/mdadm --incremental  
> $env{DEVNAME}"
> # import data from a raid set
> KERNEL!="md*", GOTO="md_end"
> ......
>
> And you are saying that if we uncomment the "#ENV{...." line, then we
> only want it to fire on add devices.  So something like:
>
> ....
> ACTION!="add|change", GOTO="md_end"
> +ACTION=="change", GOTO="no_incr"
>
> # import data from a raid member and activate it
> #ENV{ID_FS_TYPE}=="linux_raid_member", IMPORT{program}="/sbin/mdadm  
> --examine --export $tempnode", RUN+="/sbin/mdadm --incremental  
> $env{DEVNAME}"
> # import data from a raid set
> +LABEL="no_incr"
> KERNEL!="md*", GOTO="md_end"
> ......
>
> Is that correct?
>
> Your explanation certainly seems reasonable.

As I was reading through this, I certainly thought the above  
implementation was reasonable, and then I got this new bug report:

https://bugzilla.redhat.com/show_bug.cgi?id=495034

So, mdadm-3.0-0.devel3.5 leaves the default udev rules file in place  
and only defines a little 70-mdadm-incremental.rules that only does  
the incremental assembly.  That file only watches for add events.  The  
earlier mdadm that the user says doesn't have the bug replaces the  
existing rules file entirely and uncomments the incremental part.   
Long story short, it would appear the add|change in the normal rules  
file is possible responsible for a long string of these:

# udevadm monitor
....
UDEV  [1239266575.473953] change   /devices/virtual/block/md5 (block)
UEVENT[1239266575.510739] change   /devices/virtual/block/md5 (block)
UDEV  [1239266575.516114] change   /devices/virtual/block/md5 (block)
UEVENT[1239266575.552984] change   /devices/virtual/block/md5 (block)
....

I would do some investigation on your own machines, but given udev  
change semantics, I think the proper course of action is simply to not  
watch changes.


--

Doug Ledford <dledford@redhat.com>

GPG KeyID: CFBFF194
http://people.redhat.com/dledford

InfiniBand Specific RPMS
http://people.redhat.com/dledford/Infiniband





[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 203 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: mdadm udev rules change
  2009-04-09 19:20   ` Doug Ledford
@ 2009-04-10  9:22     ` Neil Brown
  0 siblings, 0 replies; 4+ messages in thread
From: Neil Brown @ 2009-04-10  9:22 UTC (permalink / raw)
  To: Doug Ledford; +Cc: LinuxRaid RAID

On Thursday April 9, dledford@redhat.com wrote:
> On Apr 8, 2009, at 3:31 AM, Neil Brown wrote:
> >
> > And you are saying that if we uncomment the "#ENV{...." line, then we
> > only want it to fire on add devices.  So something like:
> >
> > ....
> > ACTION!="add|change", GOTO="md_end"
> > +ACTION=="change", GOTO="no_incr"
> >
> > # import data from a raid member and activate it
> > #ENV{ID_FS_TYPE}=="linux_raid_member", IMPORT{program}="/sbin/mdadm  
> > --examine --export $tempnode", RUN+="/sbin/mdadm --incremental  
> > $env{DEVNAME}"
> > # import data from a raid set
> > +LABEL="no_incr"
> > KERNEL!="md*", GOTO="md_end"
> > ......
> >
> > Is that correct?
> >
> > Your explanation certainly seems reasonable.
> 
> As I was reading through this, I certainly thought the above  
> implementation was reasonable, and then I got this new bug report:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=495034

I'm guessing this bug is because the line

TEST!="md/array_state", GOTO="md_end"

is missing from the udev rules file??  It is a relatively recent
addition.

> 
> I would do some investigation on your own machines, but given udev  
> change semantics, I think the proper course of action is simply to not  
> watch changes.

We absolutely have to watch change events on md devices.
When the "new" event happens, the array doesn't exist yet.  We have
just started building it.  Once it is fully assembled and started a
change event happens and it is on that event that we create /dev
files, look for partitions, and mount filesystems.

NeilBrown

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2009-04-10  9:22 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-04-06 14:57 mdadm udev rules change Doug Ledford
2009-04-08  7:31 ` Neil Brown
2009-04-09 19:20   ` Doug Ledford
2009-04-10  9:22     ` Neil Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).