From: Neil Brown <neilb@suse.com>
To: Simon Guinot <simon.guinot@sequanux.org>
Cc: linux-raid@vger.kernel.org,
Remi Rerolle <remi.rerolle@seagate.com>,
Vincent Donnefort <vdonnefort@gmail.com>,
Yoann Sculo <yoann@printk.fr>
Subject: Re: Remove inactive array created by open
Date: Tue, 27 Oct 2015 07:10:47 +0900
Message-ID: <87a8r5e0wo.fsf@notabene.neil.brown.name>
In-Reply-To: <20151026211252.GC4665@kw.sim.vm.gnt>
On Tue, Oct 27 2015, Simon Guinot wrote:
> On Fri, Oct 23, 2015 at 08:23:38AM +1100, Neil Brown wrote:
>> Simon Guinot <simon.guinot@sequanux.org> writes:
>>
>> > Hi Neil,
>> >
>> > I'd like to have your advice about destroying an array created by open
>> > at close time if it has not been configured, rather than waiting for an
>> > ioctl or a sysfs configuration. This would allow us to get rid of the
>> > inactive md devices created by an "accidental" open.
>> >
>> > On the Linux distribution embedded in LaCie NAS devices, we can observe
>> > the following scenario:
>> >
>> > 1. A RAID array is stopped with a command such as mdadm --stop /dev/mdx.
>> > 2. The node /dev/mdx is still present, because mdadm does not remove it
>> > when stopping the array.
>> > 3. /dev/mdx is opened by a process such as udev or mdadm --monitor.
>> > 4. An inactive RAID array mdx is created and an "add" uevent is
>> > broadcast to userland. It is left to userland to work out that this
>> > event must be discarded (a minimal reproduction is sketched below).
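For illustration, a minimal reproduction of this sequence might look like the
following (md0 and blkid are only stand-ins for any numeric array and for any
process that happens to open the node):

  mdadm --stop /dev/md0    # the array is stopped, but the /dev/md0 node remains
  blkid /dev/md0           # any open() of the node re-creates the array
  cat /proc/mdstat         # typically shows "md0 : inactive" again

Running "udevadm monitor" alongside would show the spurious "add" uevent.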
>> >
>> > You have to admit that this behaviour is at best awkward :)
>>
>> No argument there.
>>
>>
>> >
>> > I read the commit d3374825ce57
>> > "md: make devices disappear when they are no longer needed" in which
>> > you expressed some concerns about an infinite loop due to udev always
>> > opening newly created devices. Is that still an issue?
>> >
>> > In your opinion, how could we get rid of an inactive RAID array created
>> > by open? Maybe we could switch the hold_active flag from UNTIL_IOCTL to
>> > 0 after some delay (enough to prevent udev from looping)? In addition,
>> > maybe we could remove the device node from within mdadm --stop? Or maybe
>> > something else :)
>> >
>> > If you are interested in any of these solutions, or one of your own, I'll
>> > be happy to work on it.
>>
>> By far the best solution here is to use named md devices. These are
>> relatively recent and I wouldn't be surprised if you weren't aware of
>> them.
>>
>> md devices 9,0 to 9,511 (those are major,minor numbers) are "numeric" md
>> devices. They have in-kernel names md%d which appear in /proc/mdstat
>> and /sys/block/.
>>
>> If you create a block-special-device node with these numbers, that will
>> create the md device if it doesn't already exist.
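For example, a minimal sketch (the minor number 5 is arbitrary, and any open
of the node is enough):

  mknod /dev/md5 b 9 5                 # block-special node: major 9, "numeric" minor
  dd if=/dev/md5 of=/dev/null count=0  # merely opening the node creates md5
  grep md5 /proc/mdstat                # typically shows "md5 : inactive"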
>>
>> md devices 9,512 to 9,$BIGNUM are "named" md devices. These have
>> in-kernel names like md_whatever-you-like.
>> If you create a block-special-device with device number 9,512 and try to
>> open it, you will get -ENODEV.
>> To create these, you run:
>> echo whatever-you-like > /sys/module/md_mod/parameters/new_array
>>
>> A number 512 or greater will be allocated as the minor number.
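Put together, a sketch might look like this ("home" is only an example name,
the exact minor may vary, and the /dev/md_home node is assumed to be created
by udev):

  echo home > /sys/module/md_mod/parameters/new_array
  cat /sys/block/md_home/dev      # e.g. "9:512" for the first named array
  grep md_home /proc/mdstat       # "md_home : inactive" until it is assembled
  mdadm --stop /dev/md_home       # once stopped, the device disappears again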
>>
>> These arrays behave as you would want them to. They are only created
>> when explicitly requested and they disappear when stopped.
>>
>> mdadm will create this sort of array if you add
>> CREATE names=yes
>> to mdadm.conf and don't use numeric device names.
>> i.e. if you ask for /dev/md0, you will still get 9,0.
>> But if you ask for /dev/md/home, you will get 9,512, whereas
>> with names=no (the default) you would probably get 9,127.
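As a sketch of that configuration (the member devices below are hypothetical):

  # /etc/mdadm.conf
  CREATE names=yes

  mdadm --create /dev/md/home --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
  cat /sys/block/md_home/dev    # shows a minor >= 512 rather than e.g. 9:127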
>
> Thanks for describing the usage of the named md devices. We will look
> into converting our userland from numeric to named md devices.
I think that is the best approach, if you can make it work.
>
>>
>> A timeout for dropping idle numeric md devices might make sense, but it
>> would need to be several seconds at least, as udev can sometimes get very
>> backlogged and we wouldn't want to add to that. Would 5 minutes be
>> soon enough to meet your need?
>
> No, unfortunately, it would not. In our case, I think we need to
> remove the inactive numeric md device at close, or very soon afterwards.
>
> Considering that for a numeric md device we need to add the gendisk at
> probe, and that we can't destroy it at close if it is inactive (due to the
> udev issue), I don't think there is a solution to our problem. But I was
> kind of hoping you had an idea :)
The only thing I can think of is suppressing the ADD uevent until
something happens which would make the device more permanent. However, I
suspect that would require an ugly hack, so I have serious doubts about
it being accepted upstream (I'm not sure I would accept it :-)
NeilBrown