public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* 2.6.35-rc2 module reference counting broken
@ 2010-06-07  5:20 Jari Ruusu
  2010-06-07  6:44 ` Al Viro
  0 siblings, 1 reply; 6+ messages in thread
From: Jari Ruusu @ 2010-06-07  5:20 UTC (permalink / raw)
  To: linux-kernel; +Cc: Rusty Russell, Al Viro, Linus Torvalds

Someone broke block device module reference counting. Problem occours when a
modular block device is mounted and unmounted. Not when it is directly read.
2.6.34 kernel works OK, but 2.6.35-rc2 kernel seems to increase usage count
by one for each mount + umount pair.

# uname -s -r -m
Linux 2.6.35-rc2 i686
# grep CONFIG_SMP /usr/src/linux-2.6.35-rc2/.config
# CONFIG_SMP is not set
# grep CONFIG_MODULE_UNLOAD /usr/src/linux-2.6.35-rc2/.config
CONFIG_MODULE_UNLOAD=y
# grep CONFIG_BLK_DEV_FD /usr/src/linux-2.6.35-rc2/.config
CONFIG_BLK_DEV_FD=m
# lsmod
Module                  Size  Used by
# modprobe floppy
# lsmod
Module                  Size  Used by
floppy                 40029  0
# mount -t ext2 /dev/fd0 /mnt
# umount /mnt
# lsmod
Module                  Size  Used by
floppy                 40029  1
# rmmod floppy
ERROR: Module floppy is in use
# echo $?
1
# 

(reboot)

# uname -s -r -m
Linux 2.6.35-rc2 i686
# lsmod
Module                  Size  Used by
# modprobe floppy
# lsmod
Module                  Size  Used by
floppy                 40029  0
# dd if=/dev/fd0 of=/dev/null bs=4096 count=1 conv=notrunc 2>/dev/null
# lsmod
Module                  Size  Used by
floppy                 40029  0
# rmmod floppy
# echo $?
0
# lsmod
Module                  Size  Used by
# 

-- 
Jari Ruusu  1024R/3A220F51 5B 4B F9 BB D3 3F 52 E9  DB 1D EB E3 24 0E A9 DD

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 2.6.35-rc2 module reference counting broken
  2010-06-07  5:20 2.6.35-rc2 module reference counting broken Jari Ruusu
@ 2010-06-07  6:44 ` Al Viro
  2010-06-08 23:48   ` Al Viro
  0 siblings, 1 reply; 6+ messages in thread
From: Al Viro @ 2010-06-07  6:44 UTC (permalink / raw)
  To: Jari Ruusu; +Cc: linux-kernel, Rusty Russell, Linus Torvalds

On Mon, Jun 07, 2010 at 08:20:30AM +0300, Jari Ruusu wrote:
> Someone broke block device module reference counting. Problem occours when a
> modular block device is mounted and unmounted. Not when it is directly read.
> 2.6.34 kernel works OK, but 2.6.35-rc2 kernel seems to increase usage count
> by one for each mount + umount pair.

Very interesting...  Looks like mount() bumps refcount by 2.  umount() after
that drops refcount by 1, so it's not leaking superblocks.

Which probably means that open_bdev_exclusive() is fscked.  Interesting...
FWIW, quick look through the history seems to point to this:
commit 6b4517a7913a09d3259bb1d21c9cb300f12294bd
Author: Tejun Heo <tj@kernel.org>
Date:   Wed Apr 7 18:53:59 2010 +0900

    block: implement bd_claiming and claiming block

I'm far too sleepy right now, but I'd start with reviewing what that
thing is doing to module refcounting...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 2.6.35-rc2 module reference counting broken
  2010-06-07  6:44 ` Al Viro
@ 2010-06-08 23:48   ` Al Viro
  2010-06-09  7:01     ` Tejun Heo
  0 siblings, 1 reply; 6+ messages in thread
From: Al Viro @ 2010-06-08 23:48 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Rusty Russell, Linus Torvalds, Jari Ruusu

On Mon, Jun 07, 2010 at 07:44:12AM +0100, Al Viro wrote:
> On Mon, Jun 07, 2010 at 08:20:30AM +0300, Jari Ruusu wrote:
> > Someone broke block device module reference counting. Problem occours when a
> > modular block device is mounted and unmounted. Not when it is directly read.
> > 2.6.34 kernel works OK, but 2.6.35-rc2 kernel seems to increase usage count
> > by one for each mount + umount pair.
> 
> Very interesting...  Looks like mount() bumps refcount by 2.  umount() after
> that drops refcount by 1, so it's not leaking superblocks.
> 
> Which probably means that open_bdev_exclusive() is fscked.  Interesting...
> FWIW, quick look through the history seems to point to this:
> commit 6b4517a7913a09d3259bb1d21c9cb300f12294bd
> Author: Tejun Heo <tj@kernel.org>
> Date:   Wed Apr 7 18:53:59 2010 +0900
> 
>     block: implement bd_claiming and claiming block
> 
> I'm far too sleepy right now, but I'd start with reviewing what that
> thing is doing to module refcounting...

Yeah...  bd_start_claiming() grabs a reference to gendisk and we never
let it go.  There's your leak...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 2.6.35-rc2 module reference counting broken
  2010-06-08 23:48   ` Al Viro
@ 2010-06-09  7:01     ` Tejun Heo
  2010-06-10  6:34       ` Jari Ruusu
  0 siblings, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2010-06-09  7:01 UTC (permalink / raw)
  To: Al Viro
  Cc: linux-kernel, Rusty Russell, Linus Torvalds, Jari Ruusu,
	Nick Piggin, Jens Axboe

Hello,

On 06/09/2010 01:48 AM, Al Viro wrote:
>>     block: implement bd_claiming and claiming block
>>
>> I'm far too sleepy right now, but I'd start with reviewing what that
>> thing is doing to module refcounting...
> 
> Yeah...  bd_start_claiming() grabs a reference to gendisk and we never
> let it go.  There's your leak...

Eh, I thought you were cc'd.  Sorry.  This was fixed sometime back by
Nick and queued in block tree (delayed due to mail misdelivery).

  http://thread.gmane.org/gmane.linux.file-systems/40655

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 2.6.35-rc2 module reference counting broken
  2010-06-09  7:01     ` Tejun Heo
@ 2010-06-10  6:34       ` Jari Ruusu
  2010-06-10 11:31         ` Tejun Heo
  0 siblings, 1 reply; 6+ messages in thread
From: Jari Ruusu @ 2010-06-10  6:34 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Al Viro, linux-kernel, Rusty Russell, Linus Torvalds, Nick Piggin,
	Jens Axboe

Tejun Heo wrote:
> On 06/09/2010 01:48 AM, Al Viro wrote:
> > Yeah...  bd_start_claiming() grabs a reference to gendisk and we never
> > let it go.  There's your leak...
> 
> Eh, I thought you were cc'd.  Sorry.  This was fixed sometime back by
> Nick and queued in block tree (delayed due to mail misdelivery).
> 
>   http://thread.gmane.org/gmane.linux.file-systems/40655

That one liner patch makes module refcount mismatch go away.

However, I am not sure if that is the right place to insert that
module_put(). The problem with Nick Piggin's (2010-05-25 15:50:21 GMT) patch
is that it makes module refcount temporarily drop to zero.

I added this line right after that "module_put(disk->fops->owner);" fix:

 if(disk->fops->owner){printk("bd_start_claiming: module_refcount=%u\n", module_refcount(disk->fops->owner));}

And that said "module_refcount=0" when I tried it with my silly floppy
module mount+umount test.

Later in the mount system call handling the module refrence count is
incremented. But to me that looks like there is a window of opportunity for
things to go wrong. What is there to prevent module from being removed at
zero refcount?

-- 
Jari Ruusu  1024R/3A220F51 5B 4B F9 BB D3 3F 52 E9  DB 1D EB E3 24 0E A9 DD

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: 2.6.35-rc2 module reference counting broken
  2010-06-10  6:34       ` Jari Ruusu
@ 2010-06-10 11:31         ` Tejun Heo
  0 siblings, 0 replies; 6+ messages in thread
From: Tejun Heo @ 2010-06-10 11:31 UTC (permalink / raw)
  To: Jari Ruusu
  Cc: Al Viro, linux-kernel, Rusty Russell, Linus Torvalds, Nick Piggin,
	Jens Axboe

Hello,

On 06/10/2010 08:34 AM, Jari Ruusu wrote:
> Later in the mount system call handling the module refrence count is
> incremented. But to me that looks like there is a window of opportunity for
> things to go wrong. What is there to prevent module from being removed at
> zero refcount?

It can be removed, in which case blkdev_get() fails and the whole open
attempt fails, which is the expected behavior.  Claiming block just
needs access to the containing struct block_device, caring for the
actual device and backing module is blkdev_get()'s job.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-06-10 11:31 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-07  5:20 2.6.35-rc2 module reference counting broken Jari Ruusu
2010-06-07  6:44 ` Al Viro
2010-06-08 23:48   ` Al Viro
2010-06-09  7:01     ` Tejun Heo
2010-06-10  6:34       ` Jari Ruusu
2010-06-10 11:31         ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox