linux-raid.vger.kernel.org archive mirror
From: Michael Tokarev <mjt@tls.msk.ru>
To: Neil Brown <neilb@suse.de>
Cc: dean gaudet <dean@arctic.org>, Kay Sievers <kay.sievers@vrfy.org>,
	Greg KH <gregkh@suse.de>, Andrew Morton <akpm@osdl.org>,
	linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 001 of 6] md: Send online/offline uevents when an md array starts/stops.
Date: Thu, 09 Nov 2006 13:10:28 +0300
Message-ID: <4552FE94.20400@tls.msk.ru>
In-Reply-To: <17744.5139.805134.577609@cse.unsw.edu.au>

Neil Brown wrote:
[/dev/mdx...]
>> (much like how /dev/ptmx is used to create /dev/pts/N entries.)
[]
> I have the following patch sitting in my patch queue (since about
> March).
> It does what you suggest via /sys/module/md-mod/parameters/MAGIC_FILE
> which is the only md-specific part of the /sys namespace that I could
> find.
> 
> However I'm not at all convinced that it is a good idea.  I would much
> rather have mdadm control device naming than leave it up to udev.

This is again the same "device naming" question that pops up every time
someone mentions udev.  And as usual, I'm suggesting the following, which
should - hopefully - make everyone happy:

  create kernel names *always*, be it /dev/mdN or /dev/sdF or whatever,
  so that things like /proc/partitions, /proc/mdstat etc will be useful.
  For this, the ideal solution - IMHO - is to have a mini-devfs-like filesystem
  mounted as /dev, so that it is possible to have "bare" names without any
  help from external programs like udev, but I don't want to start another
  flamewar here, esp. since it's off-topic to *this* discussion.
  Note /dev/mdN is as good as /dev/md/N - because only a few active
  devices appear in /dev, there's no "risk" of having "too many" files in
  /dev, hence no need to put them into subdirs like /dev/md/, /dev/sd/ etc.

  If so desired, create *symlinks* in /dev with appropriate user-controlled
  names pointing to those official kernel device nodes.  Be it /dev/disk/by-label/
  or /dev/cdrom0 or whatever.
  The links can be created by mdadm, OR by udev - in this case, it's really
  irrelevant.  Udev rules already do a good job of creating the /dev/disk/
  hierarchy, and that seems to be sufficient - I see no reason to have mdadm
  create other device nodes (symlinks).
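
As a rough illustration of the "kernel name plus symlink" scheme described
above, here is a minimal Python sketch.  It works in a scratch directory
rather than the real /dev, and the helper name add_alias and the label
"backup" are made up for the example; it is not what udev or mdadm
actually run.

```python
import os
import tempfile

def add_alias(dev_root, kernel_name, alias):
    """Create dev_root/alias as a relative symlink to dev_root/kernel_name.

    Hypothetical helper: only the link is user-controlled, the kernel
    name itself is never renamed.
    """
    link_path = os.path.join(dev_root, alias)
    link_dir = os.path.dirname(link_path)
    os.makedirs(link_dir, exist_ok=True)
    # A relative target keeps the link valid if dev_root is mounted elsewhere.
    target = os.path.relpath(os.path.join(dev_root, kernel_name), link_dir)
    if os.path.lexists(link_path):
        os.remove(link_path)
    os.symlink(target, link_path)
    return target

# Scratch directory standing in for /dev.
dev_root = tempfile.mkdtemp()
# Empty file standing in for the kernel-named node /dev/md1.
open(os.path.join(dev_root, "md1"), "w").close()

rel = add_alias(dev_root, "md1", "disk/by-label/backup")
resolved = os.path.realpath(os.path.join(dev_root, "disk/by-label/backup"))
```

The kernel name (md1) stays what the kernel reports in /proc/mdstat, and
any number of friendly names can point at it.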

By the way, unlike /dev/sdE and /dev/hdF entries, /dev/mdN nodes are pretty
stable.  Even if scsi disks get reordered, mdadm finds the component devices
by UUID (if "DEVICE partitions" is given in the config file), and you have /dev/md1
pointing to the same "logical partition" (with the same filesystem or data)
regardless of how you shuffle your disks (IF mdadm was able to find all components
and assemble the array, anyway).  So sometimes, I use md/mdadm on systems
WITHOUT any "raided" drives, but where I suspect disk devices may change for
whatever reason - I just create raid0 "arrays" composed of a single partition
and let mdadm find them in /dev/sd* and assemble stable-numbered /dev/mdN
devices - without any help from udev or anything else (I for one dislike udev for
several reasons).
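
The stability argument can be modelled with a toy Python sketch.  This is
not mdadm's actual algorithm: the scan results, the placeholder UUID
strings and the function names are all assumptions for illustration.  The
point is only that the md name is chosen by the UUID found on the
component, not by the component's kernel name.

```python
def assemble(scan, config):
    """Map md names to their component devices.

    scan:   kernel device name -> array UUID read from the component
            (hypothetical data, standing in for a superblock scan)
    config: array UUID -> md device name (as a config file might list)
    """
    arrays = {}
    for kernel_name, uuid in scan.items():
        md_name = config.get(uuid)
        if md_name is None:
            continue  # component of an array not listed in config
        arrays.setdefault(md_name, []).append(kernel_name)
    return {md: sorted(parts) for md, parts in arrays.items()}

def content_of(arrays, scan):
    """Report which UUID (i.e. which data) each md name ends up holding."""
    return {md: scan[parts[0]] for md, parts in arrays.items()}

config = {"uuid-root": "/dev/md1", "uuid-data": "/dev/md2"}

# Two boots of the same machine; the disks come up in swapped order.
boot_a = {"/dev/sda1": "uuid-root", "/dev/sdb1": "uuid-data"}
boot_b = {"/dev/sda1": "uuid-data", "/dev/sdb1": "uuid-root"}

view_a = content_of(assemble(boot_a, config), boot_a)
view_b = content_of(assemble(boot_b, config), boot_b)
```

In both boots /dev/md1 holds the "uuid-root" data, even though the
underlying partition moved from sda1 to sdb1.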

> And in any case, we have the semantic that opening an md device-file
> creates the device, and we cannot get rid of that semantic without a
> lot of warning and a lot of pain.  And adding a new semantic isn't
> really going to help.

I don't think so.  With the new semantic in place, we have two options (provided
the current semantics stay, and I don't see a strong reason why they should be
removed, except for the bloat):

 a) with a new mdadm utilizing the new semantics, there's nothing to change in udev --
    it will all Just Work, by mdadm opening /dev/md-control-node (or whatever it's called)
    and assembling devices using that, and during assembly, udev will receive proper
    events about new "disks" appearing and will handle them as usual.

 b) without new mdadm, it will work as before (now).  And in this case, let's not
    send any udev events, as mdadm already created the nodes etc.

So if a user wants neat and nice md/udev integration, the way to go is case "a".
If it's not required, either case will do.

Sure, eventually, long term, support for case "b" can be removed.  Or not - depending
on how things are implemented, because when done properly, both cases will
call the same routine(s), but case "b" will just skip sending uevents, so the ioctl handlers
become two- or one-liners (two in case a and one in case b), which isn't really bloat ;)
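
The shape of that argument, sketched in Python rather than kernel C: one
shared routine does the work, and the two entry points differ only in
whether an event is announced.  Every name here (start_array, the two
wrappers, the events list) is hypothetical and does not correspond to
real md driver functions.

```python
events = []  # stand-in for uevents delivered to udev

def start_array(name, send_uevent):
    """Shared routine: bring the array up and optionally announce it."""
    state = {"name": name, "running": True}
    if send_uevent:
        events.append(("online", name))
    return state

def start_via_control_node(name):
    # Case "a": new-style path, udev is told about the new device.
    return start_array(name, send_uevent=True)

def start_via_legacy_open(name):
    # Case "b": old-style path, mdadm already created the node, stay silent.
    return start_array(name, send_uevent=False)

a = start_via_control_node("md1")
b = start_via_legacy_open("md2")
```

Both wrappers stay trivial, so keeping the legacy path costs very little.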

/mjt
