linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <teheo@suse.de>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Neil Brown <neilb@suse.de>,
	linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	Doug Ledford <dledford@redhat.com>, Greg KH <greg@kroah.com>,
	Jens Axboe <jens.axboe@oracle.com>
Subject: Re: [PATCH 1/2] md: make devices disappear when they are no longer needed.
Date: Tue, 25 Nov 2008 01:08:09 +0900	[thread overview]
Message-ID: <492AD169.9000805@suse.de> (raw)
In-Reply-To: <20081124144823.GC28946@ZenIV.linux.org.uk>

Al Viro wrote:
> With ->fops in already freed memory?  Good idea, that...

The existence can be tested on registration and remembered.  Existence
of fops->disk_release signifies that the object and module are around
till the last object is released.

> Look, that's the wrong object *and* the wrong place; gendisk is separate
> from driver-specific data structures for a good reason.
> 
> _IF_ you want to tie something to "opened or in process of being opened",
> the right place is __blkdev_get()/blkdev_put() (BTW, note that get_gendisk()
> is essentially a part of __blkdev_get()).
> 
> It's not about rmmod of module while some disk is opened; _that_ is impossible
> as is.  It's about e.g. modules like sd.ko that would be flat-out impossible
> to unload at all.
> 
> Think for a minute: we do have a bunch of gendisks created by sd; they stay
> around until we rmmod the damn thing and we really want it that way.
> We can't have their existence pinning sd down.  What we do is "module is
> busy if any of those is opened or in the middle of opening".

Devices can be killed from userland via sysfs for SCSI or mdadm for
md.  It's true that such approach is less convenient for unloading but
if it can make usual cases easier, why not?

For both char and block devices, having ->open lookup and grab the
device were easy when each driver supported limited number of devices
and things were pretty much static, but it grows more awkward as
things become more dynamic.  Having to have a lookup table just for
->open lookup isn't pretty at all and missing ->release can get quite
uncomfortable too.

> The same goes for just about any block driver out there; md is weird for
> one reason - it wants to have devices possible to open without any prior
> event that would mark the beginning of availability (such as e.g. hardware
> detection for SCSI, IDE, etc.).  That is a side effect of bad API - there
> *is* a moment for md that would be a good candidate for such event (array
> being completely configured and available for normal IO), but that would
> require an API for creating and configuring these suckers that wouldn't
> start with "First you open the very device you are trying to create..."
> Too bad - we are tied to lousy userland interface and can't get rid of
> it.  That is where all the PITA is coming from.
> 
> And the right way to deal with that is to have explicit boundaries for
> "opened or in process of being opened"; we almost have them (probe and
> final release), so the only point we are missing is on failure exit from
> __blkdev_get()...
> 
> I really think that it's much saner than trying to change the lifetime
> rules for gendisk, etc.

Well, I don't know.  It seems like a lot of trouble just to allow
"rmmod something" without first killing the devices and as people are
now so used to reference counted objects and ->release, not having it
on cdev or gendisk is quite a PITA.  (BTW, Greg, can you please drop
cdev->release patch for now, it's wrong as it currently stands).

Can you see any problem with caching ->disk_release existence on
registration and wrap __module_get/put() around its invocation?  It
wouldn't change behavior of any existing drivers and md can use it if
it wants.  Doing "mdadm --stop --scan" would be enough to unload the
module and md can do whatever forward or back reference it wants to do
to work out the weird userland interface.

Thanks.

-- 
tejun

  reply	other threads:[~2008-11-24 16:08 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-24  3:55 [PATCH 0/2] RFC: allow md devices to disappear when not in use NeilBrown
2008-11-24  3:55 ` [PATCH 1/2] md: make devices disappear when they are no longer needed NeilBrown
2008-11-24  4:18   ` Tejun Heo
2008-11-24  5:13     ` Neil Brown
2008-11-24  5:34       ` Tejun Heo
2008-11-24  6:10         ` NeilBrown
2008-11-24  6:12           ` Tejun Heo
2008-11-24  6:24         ` Al Viro
2008-11-24  6:56           ` Tejun Heo
2008-11-24 13:31             ` Al Viro
2008-11-24 14:04               ` Tejun Heo
2008-11-24 14:26                 ` Tejun Heo
2008-11-24 14:48                   ` Al Viro
2008-11-24 16:08                     ` Tejun Heo [this message]
2008-11-24 16:42                       ` Al Viro
2008-11-24 17:18                         ` Tejun Heo
2008-11-28  0:23                     ` Neil Brown
2008-11-24  4:24   ` Al Viro
2008-11-24  4:47     ` Neil Brown
2008-11-24  6:38       ` Al Viro
2008-11-24  3:55 ` [PATCH 2/2] Allow md devices to be created by name NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=492AD169.9000805@suse.de \
    --to=teheo@suse.de \
    --cc=dledford@redhat.com \
    --cc=greg@kroah.com \
    --cc=jens.axboe@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).