Naughty ramdrives

public inbox for linux-kernel@vger.kernel.org

* Naughty ramdrives
@ 2006-09-07 20:59 Alexey Dobriyan
  2006-09-07 21:54 ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Alexey Dobriyan @ 2006-09-07 20:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton

You'd laugh, but...

Summary:

	After loading and unloading rd.ko many times "ls -l /dev/ram*"
	results are not persistent.

Steps to reproduce:

	# while true; do modprobe rd && rmmod rd; done
		[wait ~10 seconds]
	^C
	# modprobe rd

	# ls -l /dev/ram*
	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram12 -> rd/12
	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram6 -> rd/6
	# ls -l /dev/ram*
	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram0 -> rd/0
	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram13 -> rd/13
	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram6 -> rd/6
	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram7 -> rd/7
	# ls -l /dev/ram*
	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram0 -> rd/0
	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram1 -> rd/1
	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram11 -> rd/11
	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram12 -> rd/12
	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram14 -> rd/14
	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram15 -> rd/15
	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram3 -> rd/3
	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram7 -> rd/7
	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram8 -> rd/8
	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram9 -> rd/9

Versions:

	Linux 2.6.18-rc5
	udev 087

P.S.:

This was noticed while investigating #4899
http://bugme.osdl.org/show_bug.cgi?id=4899
where /dev/ram0 when opened, pins module indefinitely. It seems that
adding ->release() which undoes

	inode = igrab(bdev->bd_inode);

should do the trick. Am I right?



* Re: Naughty ramdrives
  2006-09-07 20:59 Naughty ramdrives Alexey Dobriyan
@ 2006-09-07 21:54 ` Andrew Morton
  2006-09-07 22:05   ` Greg KH
  2006-09-07 22:08   ` Alexey Dobriyan
  0 siblings, 2 replies; 8+ messages in thread
From: Andrew Morton @ 2006-09-07 21:54 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: linux-kernel, Al Viro, Christoph Hellwig, Greg KH

On Fri, 8 Sep 2006 00:59:27 +0400
Alexey Dobriyan <adobriyan@gmail.com> wrote:

> You'd laugh, but...
> 
> Summary:
> 
> 	After loading and unloading rd.ko many times "ls -l /dev/ram*"
> 	results are not persistent.
> 
> Steps to reproduce:
> 
> 	# while true; do modprobe rd && rmmod rd; done
> 		[wait ~10 seconds]
> 	^C
> 	# modprobe rd
> 
> 	# ls -l /dev/ram*
> 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram12 -> rd/12
> 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram6 -> rd/6
> 	# ls -l /dev/ram*
> 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram0 -> rd/0
> 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram13 -> rd/13
> 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram6 -> rd/6
> 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram7 -> rd/7
> 	# ls -l /dev/ram*
> 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram0 -> rd/0
> 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram1 -> rd/1
> 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram11 -> rd/11
> 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram12 -> rd/12
> 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram14 -> rd/14
> 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram15 -> rd/15
> 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram3 -> rd/3
> 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram7 -> rd/7
> 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram8 -> rd/8
> 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram9 -> rd/9
> 
> Versions:
> 
> 	Linux 2.6.18-rc5
> 	udev 087

So I assume udev is still madly crunching on its message backlog while
this is happening?

If so, ug.

> P.S.:
> 
> This was noticed while investigating #4899
> http://bugme.osdl.org/show_bug.cgi?id=4899
> where /dev/ram0 when opened, pins module indefinitely. It seems that
> adding ->release() which undoes
> 
> 	inode = igrab(bdev->bd_inode);
> 
> should do the trick. Am I right?

Looks right.

I'm not sure that igrab() is needed though.  Probably bd_openers is
sufficient.

I'm also not sure that rd_open() needs to play with bd_openers. 
fs/block_dev.c:do_open() already does that.



* Re: Naughty ramdrives
  2006-09-07 21:54 ` Andrew Morton
@ 2006-09-07 22:05   ` Greg KH
  2006-09-07 22:08   ` Alexey Dobriyan
  1 sibling, 0 replies; 8+ messages in thread
From: Greg KH @ 2006-09-07 22:05 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alexey Dobriyan, linux-kernel, Al Viro, Christoph Hellwig

On Thu, Sep 07, 2006 at 02:54:12PM -0700, Andrew Morton wrote:
> On Fri, 8 Sep 2006 00:59:27 +0400
> Alexey Dobriyan <adobriyan@gmail.com> wrote:
> 
> > You'd laugh, but...
> > 
> > Summary:
> > 
> > 	After loading and unloading rd.ko many times "ls -l /dev/ram*"
> > 	results are not persistent.
> > 
> > Steps to reproduce:
> > 
> > 	# while true; do modprobe rd && rmmod rd; done
> > 		[wait ~10 seconds]
> > 	^C
> > 	# modprobe rd
> > 
> > 	# ls -l /dev/ram*
> > 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram12 -> rd/12
> > 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram6 -> rd/6
> > 	# ls -l /dev/ram*
> > 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram0 -> rd/0
> > 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram13 -> rd/13
> > 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram6 -> rd/6
> > 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram7 -> rd/7
> > 	# ls -l /dev/ram*
> > 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram0 -> rd/0
> > 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram1 -> rd/1
> > 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram11 -> rd/11
> > 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram12 -> rd/12
> > 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram14 -> rd/14
> > 	lrwxrwxrwx 1 root root 5 Sep  8 00:35 /dev/ram15 -> rd/15
> > 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram3 -> rd/3
> > 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram7 -> rd/7
> > 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram8 -> rd/8
> > 	lrwxrwxrwx 1 root root 4 Sep  8 00:35 /dev/ram9 -> rd/9
> > 
> > Versions:
> > 
> > 	Linux 2.6.18-rc5
> > 	udev 087
> 
> So I assume udev is still madly crunching on its message backlog while
> this is happening?

It shouldn't be; this should not take that long.  Run 'udevmonitor' to
see what udev is doing at the moment and check whether that is the case.

> If so, ug.

I agree.  What distro is this?

I just tested this on my box running Gentoo and a newer version of udev
(099), and it worked just fine.  It took a while for udev to catch back
up with the flood of events, but it did and everything was fine.  No
harm done in the end.
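
For anyone who wants to script that "has it caught up yet" check rather
than eyeball repeated listings, here is a minimal sketch that polls a
glob until two consecutive listings are identical.  The helper name
wait_until_listing_stable and its arguments are hypothetical, not part
of udev, and two matching polls only suggest that the nodes have
stopped changing; they do not prove that udev's queue is empty.

```shell
# Hypothetical helper: poll a glob until two consecutive listings match.
# $1 = glob pattern (quote it at the call site), $2 = maximum number of
# polls, $3 = seconds to sleep between polls.
# Returns 0 and prints the stable listing, or 1 if it never settled.
wait_until_listing_stable() {
    pattern=$1
    tries=${2:-50}
    interval=${3:-1}
    prev=
    i=0
    while [ "$i" -lt "$tries" ]; do
        # $pattern is left unquoted on purpose so the shell expands the
        # glob here; paths containing whitespace are not supported.
        cur=$(ls -d $pattern 2>/dev/null | sort)
        if [ "$i" -gt 0 ] && [ "$cur" = "$prev" ]; then
            printf '%s\n' "$cur"
            return 0
        fi
        prev=$cur
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}
```

Typical use after the reproduction loop would be something like

	wait_until_listing_stable '/dev/ram*' 50 1
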

thanks,

greg k-h


* Re: Naughty ramdrives
  2006-09-07 21:54 ` Andrew Morton
  2006-09-07 22:05   ` Greg KH
@ 2006-09-07 22:08   ` Alexey Dobriyan
  2006-09-07 22:20     ` Andrew Morton
  1 sibling, 1 reply; 8+ messages in thread
From: Alexey Dobriyan @ 2006-09-07 22:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Al Viro, Christoph Hellwig, Greg KH

> So I assume udev is still madly crunching on its message backlog while
> this is happening?
>
> If so, ug.

OK. I'll let it stabilize, sorry.

> > This was noticed while investigating #4899
> > http://bugme.osdl.org/show_bug.cgi?id=4899
> > where /dev/ram0 when opened, pins module indefinitely. It seems that
> > adding ->release() which undoes
> >
> > 	inode = igrab(bdev->bd_inode);
> >
> > should do the trick. Am I right?

> Looks right.
>
> I'm not sure that igrab() is needed though.  Probably bd_openers is
> sufficient.
>
> I'm also not sure that rd_open() needs to play with bd_openers.
> fs/block_dev.c:do_open() already does that.

Maybe start with closing open/open race?
That's what drivers/char/raw.c does...
------------------------------------------------
[PATCH 1/2] rd: protect rd_bdev[] with mutex

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 drivers/block/rd.c |    4 ++++
 1 file changed, 4 insertions(+)

--- a/drivers/block/rd.c
+++ b/drivers/block/rd.c
@@ -56,6 +56,7 @@ #include <linux/buffer_head.h>		/* for i
 #include <linux/backing-dev.h>
 #include <linux/blkpg.h>
 #include <linux/writeback.h>
+#include <linux/mutex.h>
 
 #include <asm/uaccess.h>
 
@@ -63,6 +64,7 @@ #include <asm/uaccess.h>
  */
 
 static struct gendisk *rd_disks[CONFIG_BLK_DEV_RAM_COUNT];
+static DEFINE_MUTEX(rd_mutex);
 static struct block_device *rd_bdev[CONFIG_BLK_DEV_RAM_COUNT];/* Protected device data */
 static struct request_queue *rd_queue[CONFIG_BLK_DEV_RAM_COUNT];
 
@@ -343,6 +345,7 @@ static int rd_open(struct inode *inode, 
 {
 	unsigned unit = iminor(inode);
 
+	mutex_lock(&rd_mutex);
 	if (rd_bdev[unit] == NULL) {
 		struct block_device *bdev = inode->i_bdev;
 		struct address_space *mapping;
@@ -382,6 +385,7 @@ static int rd_open(struct inode *inode, 
 		gfp_mask |= __GFP_HIGH;
 		mapping_set_gfp_mask(mapping, gfp_mask);
 	}
+	mutex_unlock(&rd_mutex);
 
 	return 0;
 }



* Re: Naughty ramdrives
  2006-09-07 22:08   ` Alexey Dobriyan
@ 2006-09-07 22:20     ` Andrew Morton
  2006-09-07 23:01       ` Greg KH
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2006-09-07 22:20 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: linux-kernel, Al Viro, Christoph Hellwig, Greg KH

On Fri, 8 Sep 2006 02:08:53 +0400
Alexey Dobriyan <adobriyan@gmail.com> wrote:

> > So I assume udev is still madly crunching on its message backlog while
> > this is happening?
> >
> > If so, ug.
> 
> OK. I'll let it stabilize, sorry.

You shouldn't have to.

> > > This was noticed while investigating #4899
> > > http://bugme.osdl.org/show_bug.cgi?id=4899
> > > where /dev/ram0 when opened, pins module indefinitely. It seems that
> > > adding ->release() which undoes
> > >
> > > 	inode = igrab(bdev->bd_inode);
> > >
> > > should do the trick. Am I right?
> 
> > Looks right.
> >
> > I'm not sure that igrab() is needed though.  Probably bd_openers is
> > sufficient.
> >
> > I'm also not sure that rd_open() needs to play with bd_openers.
> > fs/block_dev.c:do_open() already does that.
> 
> Maybe start with closing open/open race?
> That's what drivers/char/raw.c does...
> ------------------------------------------------
> [PATCH 1/2] rd: protect rd_bdev[] with mutex
> 
> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
> ---
> 
>  drivers/block/rd.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> --- a/drivers/block/rd.c
> +++ b/drivers/block/rd.c
> @@ -56,6 +56,7 @@ #include <linux/buffer_head.h>		/* for i
>  #include <linux/backing-dev.h>
>  #include <linux/blkpg.h>
>  #include <linux/writeback.h>
> +#include <linux/mutex.h>
>  
>  #include <asm/uaccess.h>
>  
> @@ -63,6 +64,7 @@ #include <asm/uaccess.h>
>   */
>  
>  static struct gendisk *rd_disks[CONFIG_BLK_DEV_RAM_COUNT];
> +static DEFINE_MUTEX(rd_mutex);

This could be static to rd_open().

>  static struct block_device *rd_bdev[CONFIG_BLK_DEV_RAM_COUNT];/* Protected device data */
>  static struct request_queue *rd_queue[CONFIG_BLK_DEV_RAM_COUNT];
>  
> @@ -343,6 +345,7 @@ static int rd_open(struct inode *inode, 
>  {
>  	unsigned unit = iminor(inode);
>  
> +	mutex_lock(&rd_mutex);

I suspect that we've inherited enough locking from the caller to not need
this.  That's if fs/block_dev.c:do_open() is the only caller.  Not sure if
that's true if someone goes and partitions a ramdisk (is that possible?).

All this gendisk/blockdev/contains/partitions/bd_inode stuff is quite
ghastly.  Every six months I spend long enough staring at it to
half-understand it and then promptly forget how it all works.




* Re: Naughty ramdrives
  2006-09-07 22:20     ` Andrew Morton
@ 2006-09-07 23:01       ` Greg KH
  2006-09-07 23:28         ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2006-09-07 23:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alexey Dobriyan, linux-kernel, Al Viro, Christoph Hellwig

On Thu, Sep 07, 2006 at 03:20:37PM -0700, Andrew Morton wrote:
> On Fri, 8 Sep 2006 02:08:53 +0400
> Alexey Dobriyan <adobriyan@gmail.com> wrote:
> 
> > > So I assume udev is still madly crunching on its message backlog while
> > > this is happening?
> > >
> > > If so, ug.
> > 
> > OK. I'll let it stabilize, sorry.
> 
> You shouldn't have to.

You shouldn't have to what?  You purposefully add and remove a block
driver as fast as is possible, creating a ton of new events and you
expect userspace processing of those events to be able to keep up in
real-time with it?

On the later versions of udev we are _way_ faster, we only listen to the
netlink socket, no extra programs are spawned, but still, we can only
work so fast :)

My machine had no interactive response issues while this was happening,
even with both processors being run at 100% cpu usage until I stopped the
loop and then udev recovered a few seconds later.  This is even with
HALD receiving all of these events from udev, remember it's not just
udev in the event processing chain.

thanks,

greg k-h


* Re: Naughty ramdrives
  2006-09-07 23:01       ` Greg KH
@ 2006-09-07 23:28         ` Andrew Morton
  2006-09-07 23:49           ` Greg KH
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2006-09-07 23:28 UTC (permalink / raw)
  To: Greg KH; +Cc: Alexey Dobriyan, linux-kernel, Al Viro, Christoph Hellwig

On Thu, 7 Sep 2006 16:01:30 -0700
Greg KH <greg@kroah.com> wrote:

> On Thu, Sep 07, 2006 at 03:20:37PM -0700, Andrew Morton wrote:
> > On Fri, 8 Sep 2006 02:08:53 +0400
> > Alexey Dobriyan <adobriyan@gmail.com> wrote:
> > 
> > > > So I assume udev is still madly crunching on its message backlog while
> > > > this is happening?
> > > >
> > > > If so, ug.
> > > 
> > > OK. I'll let it stabilize, sorry.
> > 
> > You shouldn't have to.
> 
> You shouldn't have to what?  You purposefully add and remove a block
> driver as fast as is possible, creating a ton of new events and you
> expect userspace processing of those events to be able to keep up in
> real-time with it?

Absolutely.  sys_init_module() should not return until the device nodes
have stabilised.  There is no other sane interface the kernel can offer.

ho hum.

Perhaps there's some hacklet we can put into modprobe, to allow it to peek
at the udev sequence numbering, wait until all the events which were
associated with this modprobe have been serviced?  Or maybe a standalone
tool?

Say, just a loopback message: send it into the kernel, knowing that it will
be appended to the queue.  Wait until a reply comes, so you know that all
preceding events in the queue have been serviced?

Or whatever.  Right now, there's no sane way to do

	modprobe rd
	mkfs /dev/ram0

so instead we could do

	modprobe rd
	/sbin/wait-for-udev-to-catch-up
	mkfs /dev/ram0

Or something.
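
In its crudest form, that wait step could be a polling loop on the node
itself.  A minimal sketch, assuming a hypothetical helper named
wait_for_node (not an existing tool) and a whole-second timeout.  It
only shows that the node exists and resolves, not that udev has
finished applying every rule for it:

```shell
# Hypothetical helper: wait until a path exists, polling once a second.
# $1 = path to wait for, $2 = timeout in whole seconds (default 10).
# Returns 0 once the path exists, 1 if the timeout expires first.
wait_for_node() {
    node=$1
    timeout=${2:-10}
    waited=0
    # -e follows symlinks, so a dangling /dev/ramN -> rd/N link does
    # not count as present.
    while [ ! -e "$node" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

With that in place the sequence would read

	modprobe rd
	wait_for_node /dev/ram0 10 && mkfs /dev/ram0
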


* Re: Naughty ramdrives
  2006-09-07 23:28         ` Andrew Morton
@ 2006-09-07 23:49           ` Greg KH
  0 siblings, 0 replies; 8+ messages in thread
From: Greg KH @ 2006-09-07 23:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexey Dobriyan, linux-kernel, Al Viro, Christoph Hellwig,
	Kay Sievers

On Thu, Sep 07, 2006 at 04:28:00PM -0700, Andrew Morton wrote:
> On Thu, 7 Sep 2006 16:01:30 -0700
> Greg KH <greg@kroah.com> wrote:
> 
> > On Thu, Sep 07, 2006 at 03:20:37PM -0700, Andrew Morton wrote:
> > > On Fri, 8 Sep 2006 02:08:53 +0400
> > > Alexey Dobriyan <adobriyan@gmail.com> wrote:
> > > 
> > > > > So I assume udev is still madly crunching on its message backlog while
> > > > > this is happening?
> > > > >
> > > > > If so, ug.
> > > > 
> > > > OK. I'll let it stabilize, sorry.
> > > 
> > > You shouldn't have to.
> > 
> > You shouldn't have to what?  You purposefully add and remove a block
> > driver as fast as is possible, creating a ton of new events and you
> > expect userspace processing of those events to be able to keep up in
> > real-time with it?
> 
> Absolutely.  sys_init_module() should not return until the device nodes
> have stabilised.  There is no other sane interface the kernel can offer.

No, the module does not ever know that userspace is even _using_ udev,
let alone care if it is finished or not.

> ho hum.
> 
> Perhaps there's some hacklet we can put into modprobe, to allow it to peek
> at the udev sequence numbering, wait until all the events which were
> associated with this modprobe have been serviced?  Or maybe a standalone
> tool?
>
> Say, just a loopback message: send it into the kernel, knowing that it will
> be appended to the queue.  Wait until a reply comes, so you know that all
> preceding events in the queue have been serviced?

Kay does have some thoughts as to this idea, but it's more like using
the kernel to relay those "all done" events from udev back out to
whoever wants to hear it.

> Or whatever.  Right now, there's no sane way to do
> 
> 	modprobe rd
> 	mkfs /dev/ram0
> 
> so instead we could do
> 
> 	modprobe rd
> 	/sbin/wait-for-udev-to-catch-up
> 	mkfs /dev/ram0
> 
> Or something.

That's why people are switching to event driven startup logic, which
makes all of this not an issue anymore.  See
	http://www.netsplit.com/blog/2006/09/01
for one such example of a system that can handle the above situation
just fine.

"normally", udev creates those device nodes just fine, and fast enough
for your first example to work.  But if you don't want that, use
'udevsettle', which will wait until udev is finished:
	modprobe rd
	/sbin/udevsettle
	mkfs /dev/ram0

It's in use already today by some startup scripts that don't want to
switch over to being event driven.
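
A script that has to run on both older and newer udev releases could
wrap the settle step.  A minimal sketch, assuming a hypothetical wrapper
named settle_udev; 'udevadm settle' is the form that later udev releases
provide, and the standalone udevsettle is called with no arguments,
exactly as above:

```shell
# Hypothetical wrapper: wait for udev to finish its pending events using
# whichever settle tool is installed.
# $1 = timeout in seconds, passed to 'udevadm settle' only (default 30).
# Returns the tool's exit status, or 127 if neither tool is found.
settle_udev() {
    timeout=${1:-30}
    if command -v udevadm >/dev/null 2>&1; then
        udevadm settle --timeout="$timeout"
    elif command -v udevsettle >/dev/null 2>&1; then
        # Older standalone tool; run with its default timeout.
        udevsettle
    else
        return 127
    fi
}
```

The caller would then use

	modprobe rd
	settle_udev 30 && mkfs /dev/ram0
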

thanks,

greg k-h
