public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Reference counting of MMC host driver modules
@ 2009-01-09 17:07 Enrik Berkhan
  2009-01-09 18:20 ` David Vrabel
  0 siblings, 1 reply; 8+ messages in thread
From: Enrik Berkhan @ 2009-01-09 17:07 UTC (permalink / raw)
  To: Pierre Ossman; +Cc: linux-kernel

Hi,

I've noticed recently that the MMC/SD block driver does not reference 
count the MMC/SD host driver module that it uses via the MMC/SD core 
layer. Thus, I can rmmod my host driver module while, for example, a 
partition on a SD card is mounted.

Assuming this does not happen intentionally, what would be the correct 
fix? Is a try_module_get(card->host->parent->driver->owner) in 
mmc_blk_alloc() in drivers/mmc/card/block.c the right thing to do? If 
so, I could provide a patch fixing the issue. If not, please give me 
advice how to do it better :) (May be I'm missing something that has to 
do with hotplugging, as the devices I'm using are non-removeable).

Or am I possibly doing something wrong in my host driver and this 
problem is not present for others at all?

Enrik

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Reference counting of MMC host driver modules
  2009-01-09 17:07 Reference counting of MMC host driver modules Enrik Berkhan
@ 2009-01-09 18:20 ` David Vrabel
  2009-01-09 20:00   ` Enrik Berkhan
  0 siblings, 1 reply; 8+ messages in thread
From: David Vrabel @ 2009-01-09 18:20 UTC (permalink / raw)
  To: Enrik Berkhan; +Cc: Pierre Ossman, linux-kernel

Enrik Berkhan wrote:
> Hi,
> 
> I've noticed recently that the MMC/SD block driver does not reference 
> count the MMC/SD host driver module that it uses via the MMC/SD core 
> layer. Thus, I can rmmod my host driver module while, for example, a 
> partition on a SD card is mounted.

This is same as removing the card while it's in use.

David
-- 
David Vrabel, Senior Software Engineer, Drivers
CSR, Churchill House, Cambridge Business Park,  Tel: +44 (0)1223 692562
Cowley Road, Cambridge, CB4 0WZ                 http://www.csr.com/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Reference counting of MMC host driver modules
  2009-01-09 18:20 ` David Vrabel
@ 2009-01-09 20:00   ` Enrik Berkhan
  2009-01-10  9:49     ` Stefan Richter
  0 siblings, 1 reply; 8+ messages in thread
From: Enrik Berkhan @ 2009-01-09 20:00 UTC (permalink / raw)
  To: David Vrabel; +Cc: Pierre Ossman, linux-kernel

On Fri, Jan 09, 2009 at 06:20:32PM +0000, David Vrabel wrote:
> Enrik Berkhan wrote:
> > I've noticed recently that the MMC/SD block driver does not reference 
> > count the MMC/SD host driver module that it uses via the MMC/SD core 
> > layer. Thus, I can rmmod my host driver module while, for example, a 
> > partition on a SD card is mounted.
> 
> This is same as removing the card while it's in use.

I don't think so.

Being able to remove the card while it is in use is a bug of the
mechanical interface. Can't be fixed easily.

Being able to remove the module while it is in use is a bug of the
software. Can be fixed.

Enrik

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Reference counting of MMC host driver modules
  2009-01-09 20:00   ` Enrik Berkhan
@ 2009-01-10  9:49     ` Stefan Richter
  2009-01-11  9:41       ` Pierre Ossman
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Richter @ 2009-01-10  9:49 UTC (permalink / raw)
  To: Enrik Berkhan; +Cc: David Vrabel, Pierre Ossman, linux-kernel

Enrik Berkhan wrote:
> On Fri, Jan 09, 2009 at 06:20:32PM +0000, David Vrabel wrote:
>> Enrik Berkhan wrote:
>>> I've noticed recently that the MMC/SD block driver does not reference 
>>> count the MMC/SD host driver module that it uses via the MMC/SD core 
>>> layer. Thus, I can rmmod my host driver module while, for example, a 
>>> partition on a SD card is mounted.
>> This is same as removing the card while it's in use.
> 
> I don't think so.
> 
> Being able to remove the card while it is in use is a bug of the
> mechanical interface. Can't be fixed easily.
> 
> Being able to remove the module while it is in use is a bug of the
> software. Can be fixed.

Here is my implementation experience with the issue, concerning the two
FireWire driver stacks which I maintain:

In case of firewire (new FireWire stack), module removal of the
controller driver will trigger removal of all child and grandchild
devices, e.g. sd_shutdown of FireWire HDDs.

This will have the same effect as if the cable of the HDD was pulled.
I don't see a need to prevent this because whoever runs
# modprobe -r firewire-ohci
is supposed to know what he is doing.

But in case of ieee1394 (old FireWire stack), the sbp2 storage driver
calls try_module_get() on the host driver whenever it starts using a
device, and of course module_put() when it is done.  I.e. modprobe -r
ohci1394 is blocked as long as the sbp2 driver is in touch with a
FireWire drive.

I added this at some point because of unclean layering of the ieee1394
stack:  The high-level drivers video1394 and dv1394 do not only depend
on the ieee1394 core driver, but also directly on the ohci1394 low-level
driver.  Now, if somebody runs
# modprobe -r video1394
it would also remove ohci1394 if sbp2 hadn't artificially increased
ohci1394's reference count.

*However*, note that try_module_get() and module_put() don't actually do
what you asked for:  They do _not_ guarantee that the host driver gets
unbound from the MMC (or FireWire...) controller:
You can still do
# echo -n ${ID} > /sys/module/${MODULE}/drivers/${DRIVER}/unbind

So, if somebody asked me to copy my ieee1394/sbp2 safeguard into
firewire/fw-sbp2, I would reject that on the grounds that killing the
connection to the FireWire disk is the *expected result* of
# modprobe -r firewire-ohci
-- 
Stefan Richter
-=====-==--= ---= -=-=-
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Reference counting of MMC host driver modules
  2009-01-10  9:49     ` Stefan Richter
@ 2009-01-11  9:41       ` Pierre Ossman
  2009-01-12 15:41         ` Enrik Berkhan
  0 siblings, 1 reply; 8+ messages in thread
From: Pierre Ossman @ 2009-01-11  9:41 UTC (permalink / raw)
  To: Stefan Richter; +Cc: Enrik Berkhan, David Vrabel, linux-kernel

On Sat, 10 Jan 2009 10:49:00 +0100
Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:

> 
> So, if somebody asked me to copy my ieee1394/sbp2 safeguard into
> firewire/fw-sbp2, I would reject that on the grounds that killing the
> connection to the FireWire disk is the *expected result* of
> # modprobe -r firewire-ohci

I have to agree with Stefan's reasoning here. The reference counting is
about protecting kernel integrity, not about saving the user's foot.

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  rdesktop, core developer          http://www.rdesktop.org

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Reference counting of MMC host driver modules
  2009-01-11  9:41       ` Pierre Ossman
@ 2009-01-12 15:41         ` Enrik Berkhan
  2009-01-12 17:10           ` Stefan Richter
  0 siblings, 1 reply; 8+ messages in thread
From: Enrik Berkhan @ 2009-01-12 15:41 UTC (permalink / raw)
  To: Pierre Ossman; +Cc: Stefan Richter, David Vrabel, linux-kernel

Hi,

Pierre Ossman wrote:
> Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:
> > So, if somebody asked me to copy my ieee1394/sbp2 safeguard into
> > firewire/fw-sbp2, I would reject that on the grounds that killing the
> > connection to the FireWire disk is the *expected result* of
> > # modprobe -r firewire-ohci
> I have to agree with Stefan's reasoning here. The reference counting is
> about protecting kernel integrity, not about saving the user's foot.

Yes, meanwhile, I have to admit you all are right and I have been
looking in the wrong direction.

My original problem was the following:

- MD raid0 made of some MMC/SD devices
- raid0 activated
- rmmod mmc_host_driver (user error, not noticing the raid is active)
- insmod mmc_host_driver (user error, still not noticing the raid is active)
- write to raid0 device
- kernel crash

I have debugged this a little and found the following reason for the
crash:

When removing the mmc_host_driver, everything seems to be fine; the
MMC/SD block device has been deactivated by mmc_blk_remove(), that in
turn has stopped the queue via mmc_cleanup_queue(). mmc_cleanup_queue()
calls blk_cleanup_queue() on the underlying struct request_queue. By
this, the reference count of the struct request_queues kboj drops to
zero. The MD driver still has the block device open and, actually,
things work fine unless the memory of the struct request_queue isn't
touched, because it is marked dead. Of course, accessing the MD device
returns EIO, but that's fine.

When the mmc_host_driver is reloaded, new struct request_queues will be
allocated and with some probability, the old memory will be re-used for
them or the old memory locations will be re-used for something else. The
key point is that the queues still in use by the MD layer will
effectively no longer be marked dead or completely corrupted.

I don't know if this is a problem of the MD layer, the MMC/SD block
driver, a more general problem or even no problem at all :)

How can this be fixed?

Enrik

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Reference counting of MMC host driver modules
  2009-01-12 15:41         ` Enrik Berkhan
@ 2009-01-12 17:10           ` Stefan Richter
  2009-01-12 20:01             ` Enrik Berkhan
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Richter @ 2009-01-12 17:10 UTC (permalink / raw)
  To: Enrik Berkhan; +Cc: Pierre Ossman, David Vrabel, linux-kernel

Enrik Berkhan wrote:
> When removing the mmc_host_driver, everything seems to be fine; the
> MMC/SD block device has been deactivated by mmc_blk_remove(), that in
> turn has stopped the queue via mmc_cleanup_queue(). mmc_cleanup_queue()
> calls blk_cleanup_queue() on the underlying struct request_queue. By
> this, the reference count of the struct request_queues kboj drops to
> zero. The MD driver still has the block device open and, actually,
> things work fine unless the memory of the struct request_queue isn't
> touched, because it is marked dead. Of course, accessing the MD device
> returns EIO, but that's fine.
> 
> When the mmc_host_driver is reloaded, new struct request_queues will be
> allocated and with some probability, the old memory will be re-used for
> them or the old memory locations will be re-used for something else. The
> key point is that the queues still in use by the MD layer will
> effectively no longer be marked dead or completely corrupted.

So in short, the request_queue's reference count goes to zero even
though something still points to it?
-- 
Stefan Richter
-=====-==--= ---= -==--
http://arcgraph.de/sr/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Reference counting of MMC host driver modules
  2009-01-12 17:10           ` Stefan Richter
@ 2009-01-12 20:01             ` Enrik Berkhan
  0 siblings, 0 replies; 8+ messages in thread
From: Enrik Berkhan @ 2009-01-12 20:01 UTC (permalink / raw)
  To: Stefan Richter; +Cc: Pierre Ossman, David Vrabel, linux-kernel

Stefan Richter wrote:
> Enrik Berkhan wrote:
> > When removing the mmc_host_driver, everything seems to be fine; the
> > MMC/SD block device has been deactivated by mmc_blk_remove(), that in
> > turn has stopped the queue via mmc_cleanup_queue(). mmc_cleanup_queue()
> > calls blk_cleanup_queue() on the underlying struct request_queue. By
> > this, the reference count of the struct request_queues kboj drops to
> > zero. The MD driver still has the block device open and, actually,
> > things work fine unless the memory of the struct request_queue isn't
> > touched, because it is marked dead. Of course, accessing the MD device
> > returns EIO, but that's fine.
> > 
> > When the mmc_host_driver is reloaded, new struct request_queues will be
> > allocated and with some probability, the old memory will be re-used for
> > them or the old memory locations will be re-used for something else. The
> > key point is that the queues still in use by the MD layer will
> > effectively no longer be marked dead or completely corrupted.
> 
> So in short, the request_queue's reference count goes to zero even
> though something still points to it?

Exactly. AFAICS.

I haven't checked yet if this happens using other block device
infrastructure, too.

Enrik

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-01-12 20:02 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-09 17:07 Reference counting of MMC host driver modules Enrik Berkhan
2009-01-09 18:20 ` David Vrabel
2009-01-09 20:00   ` Enrik Berkhan
2009-01-10  9:49     ` Stefan Richter
2009-01-11  9:41       ` Pierre Ossman
2009-01-12 15:41         ` Enrik Berkhan
2009-01-12 17:10           ` Stefan Richter
2009-01-12 20:01             ` Enrik Berkhan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox