dev->destructor

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* dev->destructor
@ 2003-04-30  6:26 David S. Miller
  2003-04-30 16:33 ` dev->destructor Stephen Hemminger
  2003-05-01  1:10 ` dev->destructor kuznet
  0 siblings, 2 replies; 28+ messages in thread
From: David S. Miller @ 2003-04-30  6:26 UTC (permalink / raw)
  To: shemminger; +Cc: netdev, kuznet

Stephen, you're right about the dev->destructor problem.
I misread your postings, and I'm very sorry about that.
We were talking about two different things, and admittedly
I had forgotten how some of this stuff works.

Alexey, currently dev->{open,close} are what does get/put
of device module reference.

However, device unregister can explode if dev->destructor is
present.  Unlike in dev->destructor==NULL case, we do not
wait for remnant dev->refcnt to go away.  Therefore we could
invoke dev->destructor() after module is unloaded.

I guess there are two ways to address this problem:

1) dev_get() gets module reference and dev_put() puts is.
   Ugly, as this means dev_get() can fail, but this does
   cover all the possible cases.

2) Make unregister_netdev() wait for refcount to reach 1
   regardless of whether dev->destructor is NULL or not.

I don't like #1.  Do you see some holes in #2?

As Stephen brought up, this also means we should do something
about that NETDEV_UNREGISTER code in dst_dev_event() :-(

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-04-30  6:26 dev->destructor David S. Miller
@ 2003-04-30 16:33 ` Stephen Hemminger
  2003-05-01  1:10 ` dev->destructor kuznet
  1 sibling, 0 replies; 28+ messages in thread
From: Stephen Hemminger @ 2003-04-30 16:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, kuznet

On Tue, 29 Apr 2003 23:26:31 -0700 (PDT)
"David S. Miller" <davem@redhat.com> wrote:

> 
> Stephen, you're right about the dev->destructor problem.
> I misread your postings, and I'm very sorry about that.
> We were talking about two different things, and admittedly
> I had forgotten how some of this stuff works.
> 
> Alexey, currently dev->{open,close} are what does get/put
> of device module reference.
> 
> However, device unregister can explode if dev->destructor is
> present.  Unlike in dev->destructor==NULL case, we do not
> wait for remnant dev->refcnt to go away.  Therefore we could
> invoke dev->destructor() after module is unloaded.
> 
> I guess there are two ways to address this problem:
> 
> 1) dev_get() gets module reference and dev_put() puts is.
>    Ugly, as this means dev_get() can fail, but this does
>    cover all the possible cases.
> 
> 2) Make unregister_netdev() wait for refcount to reach 1
>    regardless of whether dev->destructor is NULL or not.
> 
> I don't like #1.  Do you see some holes in #2?
> 
> As Stephen brought up, this also means we should do something
> about that NETDEV_UNREGISTER code in dst_dev_event() :-(

There are other (not nice possibilities).

A) Require driver have a wait queue per device and wait after unregister.
   Ugly and requires repeated work.

B) Put wait queue and wakeup logic in netdevice/unregister_netdev.
   Adds more to already overloaded netdevice structure, but
   could cleanup existing polling stuff.

C) Audit unregister notifier callbacks to ensure they all dev_put,
   all references to device.  This works for 
   normal IPv4 except for the dst cache.  The following patch causes
   the dst cache to do this.

--- linux-2.5/net/core/dst.c	2003-04-29 11:54:39.000000000 -0700
+++ linux-2.5-bridge/net/core/dst.c	2003-04-29 10:17:15.000000000 -0700
@@ -228,7 +228,7 @@
 				   _race_ _condition_.
 				 */
 				if (event!=NETDEV_DOWN &&
-				    dev->destructor == NULL &&
+				    (dev->destructor == NULL  || dev->owner) &&
 				    dst->output == dst_blackhole) {
 					dst->dev = &loopback_dev;
 					dev_put(dev);

D) Your option #2 only needs to be done if device is a module
   otherwise, can just let destructor run later.

--- linux-2.5/net/core/dev.c	2003-04-29 11:54:39.000000000 -0700
+++ linux-2.5-bridge/net/core/dev.c	2003-04-30 09:28:34.000000000 -0700
@@ -2737,7 +2737,7 @@
 	free_divert_blk(dev);
 #endif
 
-	if (dev->destructor != NULL) {
+	if (dev->destructor != NULL && dev->owner == NULL) {
 #ifdef NET_REFCNT_DEBUG
 		if (atomic_read(&dev->refcnt) != 1)
 			printk(KERN_DEBUG "unregister_netdevice: holding %s "

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-04-30  6:26 dev->destructor David S. Miller
  2003-04-30 16:33 ` dev->destructor Stephen Hemminger
@ 2003-05-01  1:10 ` kuznet
  2003-05-01  7:00   ` dev->destructor David S. Miller
  1 sibling, 1 reply; 28+ messages in thread
From: kuznet @ 2003-05-01  1:10 UTC (permalink / raw)
  To: David S. Miller; +Cc: shemminger, netdev

Hello!

> 1) dev_get() gets module reference and dev_put() puts is.
>    Ugly, as this means dev_get() can fail, but this does
>    cover all the possible cases.

Seems, you eventually _really_ understood why I histerically moan
about bogosity of modules and maybe ready to recongnize that it is
not just histerical moaning in some part. :-)

> 2) Make unregister_netdev() wait for refcount to reach 1
>    regardless of whether dev->destructor is NULL or not.
> 
> I don't like #1.  Do you see some holes in #2?

It is deadlock.

> As Stephen brought up, this also means we should do something
> about that NETDEV_UNREGISTER code in dst_dev_event() :-(

It is just one and the simplest subcase of general situation.
Module must not be unloaded while device is held, that's all.
Also, it is not specific to netdevices. The same applies to each object
in the stack, which is under control of a module. You may like 1), you
may dislike it, but people who designed modules _really_ expect
we must use this idiotic module_get() each time when referencing some object.
Remember about sockets?

Well, my alternatives are:

0) To make module unloading right, rather then preserve creepy scheme.

3) To prohibit unloading such modules and live in peace with code not
   crippled with midule_get(). I hear, hear your growling, but I feel
   enough comfortable with this and you may fell better too after comparing
   to 1), which is exactly the way how all this queer engine was designed.

Alexey

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-01  1:10 ` dev->destructor kuznet
@ 2003-05-01  7:00   ` David S. Miller
  2003-05-01 12:01     ` dev->destructor Rusty Russell
  2003-05-02  4:06     ` dev->destructor kuznet
  0 siblings, 2 replies; 28+ messages in thread
From: David S. Miller @ 2003-05-01  7:00 UTC (permalink / raw)
  To: kuznet; +Cc: shemminger, netdev, acme, rusty

   From: kuznet@ms2.inr.ac.ru
   Date: Thu, 1 May 2003 05:10:33 +0400 (MSD)

[ Rusty, just skip down to "Ok ok ok!", it's something we've
  discussed before.  Some of these problems are becomming so
  widespread that we need to implement a fix, I'll probably be
  the one to end up doing it... ]

   > 1) dev_get() gets module reference and dev_put() puts is.
   >    Ugly, as this means dev_get() can fail, but this does
   >    cover all the possible cases.

   Seems, you eventually _really_ understood why I histerically moan
   about bogosity of modules and maybe ready to recongnize that it is
   not just histerical moaning in some part. :-)

Yes, I know, and we can talk about this until the cows come
home... :-)

   > 2) Make unregister_netdev() wait for refcount to reach 1
   >    regardless of whether dev->destructor is NULL or not.
   > 
   > I don't like #1.  Do you see some holes in #2?

   It is deadlock.

What exactly is this deadlock?  Let me think...

None of destructors kill reference to device object, and I mean none
of them.  It is why I thought the idea works.

Also, holding RTNL semaphore does not block potential holders
of device reference.  Or does it?

   > As Stephen brought up, this also means we should do something
   > about that NETDEV_UNREGISTER code in dst_dev_event() :-(

   It is just one and the simplest subcase of general situation.
   Module must not be unloaded while device is held, that's all.

Ok ok ok!!!  Let us depict how it might work in your idealized
module scheme ok?

Your idea, as I understand it, is to add callback to module that
freezes module then at some time in the future makes indication
that module is clean and may be unloaded safely.

The logic is that module knows about reference, internal attributes,
etc. and thus can check for unloadability better than any simple
refcounting system can.

The best argument for this are things like ipv6 which are enormously
complicated to load/unload safely.  If we were to use the module
refcounting system to make ipv6 unloading work cleanly, every other
line of the ipv6 code would be some module get/put. :-)

So the unload sequence is something like:

	/* Shutdown the module, so that no new references may
	 * be created.
	 */
	rc = module->shutdown(module);
	if (rc)
		goto out_err;

	wait_event(&module->unload_waitq, module->unloadable);
	module->exit();

Then, using netdevice as an example, register_netdevice() would
do something like:

	module_add_instance(dev->owner);

which is simply:

void module_add_instance(struct module *module)
{
	if (module)
		atomic_inc(&module->instances);
}

The net device drivers add:

module_shutdown(netdev_module_shutdown);

And then netdev_module_shutdown() would go:

int netdev_module_shutdown(struct module *module)
{
	struct net_device *d, *d_next;

	rtnl_lock();
	d_next = NULL;
	for (d = dev_base; d != NULL; d = d_next) {
		d_next = d->next;
		if (d->owner == module) {
			if (unregister_netdevice(d))
				BUG();

			/* Keep traversing, this module may drive
			 * multiple device instances.
			 */
		}
	}
	rtnl_unlock();

	return 0;
}

And, at final dev_put(), netdev_finish_unregister() is invoked,
and we change it to look something like this:

int netdev_finish_unregister(struct net_device *dev)
{
        BUG_TRAP(!dev->ip_ptr);
        BUG_TRAP(!dev->ip6_ptr);
        BUG_TRAP(!dev->dn_ptr);

        if (!dev->deadbeaf) {
                printk(KERN_ERR "Freeing alive device %p, %s\n",
                       dev, dev->name);
                return 0;
        }
#ifdef NET_REFCNT_DEBUG
        printk(KERN_DEBUG "netdev_finish_unregister: %s%s.\n",
                dev->name,
               (dev->destructor != NULL)?"":", old style");
#endif
        if (dev->destructor)
                dev->destructor(dev);

	module_dec_instance(dev->owner);

        return 0;
}

We implement module_dev_instance as:

void module_dec_instance(struct module *module)
{
	if (module && atomic_dec_and_test(&module->instances))
		module_is_unloadable(module);
}

Finally, we implement module_is_unloadable() which is simply:

void module_is_unloadable(struct module *module)
{
	if (module) {
		module->unloadable = 1;
		wake_up(&module->unload_waitq);
	}
}

Next, let us make socket example for "simple protocol".

static int __init simple_proto_init(void)
{
	int rc;

	rc = sock_register(&simple_proto_ops);
	if (rc)
		return rc;

	return 0;
}

static int __exit simple_proto_shutdown(struct module *module)
{
	int rc;

	rc = sock_unregister(AF_SIMPLE);
	if (rc)
		return rc;

	return 0;
}

module_init(simple_proto_init);
module_shutdown(simple_proto_shutdown);

Now, where to place the code to mark module as unloadable?
We could maintain state internally using:

static atomic_t nr_simple_sockets;

And as sockets are created/destroyed, we inc/dec this thing.
When it is decreamented to zero, we can say

	module_is_unloadable(THIS_MODULE);

Question is, where to make this?  It cannot be in the module
itself of course, that would create race condition similar to
one which exists today making this exercise quite pointless :-)

Alexey and I had some private conversations about this with Rusty,
he agreed with our sentiments mostly, but he was concerned about
unload semantics and some unfortunate side effects of two-stage
unload.  Specifically, consider:

int example_module_shutdown(struct module *module)
{
	int rc;

	rc = unregister_foo(&foo);
	if (rc)
		return rc;

	rc = unregister_bar(&bar);
	if (rc) {
		register_foo(&foo);
		return rc;
	}

	return 0;
}

Note the problem.  If between the unregister_foo() and re-register of
foo in the failure path, someone asks to open a "foo" it will fail.
Those semantics absolutely stink.

Rusty went on to describe that this is one of the reasons for the
current "enable refcounts" scheme with try_module_get().  A single
binary state enables all of the interfaces to start to succeed.
ie. opening an object created by the module does not succeed until
module->init() finishes successfully and thus try_module_get()
starts to return success.

I think he was trying to hint to us that some analogue needs to
occur for shutdown as well.  Ie. to rewrite my shutdown logic
above:

	rc = stop_refcounts(mod);
	if (rc)
		goto out_err;

	if (mod->shutdown) {
		/* Shutdown the module, so that no new references may
		 * be created.
		 */
		rc = mod->shutdown(mod);
		if (rc) {
			restart_refcounts(mod);
			goto out_err;
		}

		wait_event(&mod->unload_waitq, mod->unloadable);
		if (module_refcount(mod) != 0)
			BUG();
	}
	restart_refcounts(mod);

	/* ... put here all the existing code in sys_delete_module()
	 * ... dealing O_NONBLOCK etc. etc.
	 */		

	module->exit();
	module_free(mod);

I am not even sure this is right.  It's quite a tricky area and
requires real brains to solve.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-01 12:01     ` dev->destructor Rusty Russell
@ 2003-05-01 11:09       ` David S. Miller
  2003-05-01 17:51         ` dev->destructor Arnaldo Carvalho de Melo
  2003-05-01 17:28       ` dev->destructor Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 28+ messages in thread
From: David S. Miller @ 2003-05-01 11:09 UTC (permalink / raw)
  To: rusty; +Cc: kuznet, shemminger, netdev, acme, wa

   From: Rusty Russell <rusty@rustcorp.com.au>
   Date: Thu, 01 May 2003 22:01:19 +1000

   There are 70 calls to dev_hold() in the kernel.  The vast majority of
   them already have a reference, they just want another one: dev_hold()
   can do __module_get().

Rusty, this is precisely the what Alexey and myself want to avoid.  On
the surface, it looks fine, only 70 dev_get's in the kernel right?
But think further...

So you propose to add this kind of thing for every ARP entry, every
route cache entry, every IPSEC policy, every socket, every struct
sock, every networking dynamic object ever created?

When we add SKB recycling, will we need to do a module get/put on
every SKB alloc/free/clone/copy?

I think this way lies insanity :) You may make the decision to eat
this kind of overhead inside of netfilter, but Alexey and I do not
accept this.  I disagreed with Alexey initially, but now I truly see
his wisdom.

This networking device example is just the tip of the iceberg.  We can
continue to add bandaids across the kernel, instead of solving the
real problem that modules need to manage their own removal.

It is at the core of the reason the current module scheme has to be
extended to let the module manage unloading.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-01  7:00   ` dev->destructor David S. Miller
@ 2003-05-01 12:01     ` Rusty Russell
  2003-05-01 11:09       ` dev->destructor David S. Miller
  2003-05-01 17:28       ` dev->destructor Arnaldo Carvalho de Melo
  2003-05-02  4:06     ` dev->destructor kuznet
  1 sibling, 2 replies; 28+ messages in thread
From: Rusty Russell @ 2003-05-01 12:01 UTC (permalink / raw)
  To: David S. Miller; +Cc: kuznet, shemminger, netdev, acme, Werner Almesberger

In message <20030501.000058.39187964.davem@redhat.com> you write:
>    From: kuznet@ms2.inr.ac.ru
>    Date: Thu, 1 May 2003 05:10:33 +0400 (MSD)
> 
> [ Rusty, just skip down to "Ok ok ok!", it's something we've
>   discussed before.  Some of these problems are becomming so
>   widespread that we need to implement a fix, I'll probably be
>   the one to end up doing it... ]
> 
>    > 1) dev_get() gets module reference and dev_put() puts is.
>    >    Ugly, as this means dev_get() can fail, but this does
>    >    cover all the possible cases.
>    
>    Seems, you eventually _really_ understood why I histerically moan
>    about bogosity of modules and maybe ready to recongnize that it is
>    not just histerical moaning in some part. :-)
>    
> Yes, I know, and we can talk about this until the cows come
> home... :-)

I agree with Alexey.  Modules are poor-man's microkernel: allowing
them to be unloaded has always been a horror.  But I failed to
convince my collegues of this at the 2002 kernel summit, so I did the
best with what we had.  If I had my way, we would *never* remove
modules (even on failed init: we might re-try init later, but never
free the memory).

But before we redesign module architecture from scratch, let's look at
a solution with what we do have (assuming Linus takes my damn
__module_get() patch some day, see below).

There are 70 calls to dev_hold() in the kernel.  The vast majority of
them already have a reference, they just want another one: dev_hold()
can do __module_get().

There are a few *sources* of devices: dev_get, dev_get_by*.  These
should check and fail, using "try_dev_hold()" or something.
Unfortunately auditing all the __dev_get_by* is quite a task, since
it's used very widely (and I think, sometime erroneously).

Completely untested patch below other patch.

I need more time to digest your proposal in detail, Dave.  Expect
reply w/in 24 hours.

Thanks.
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Name: __module_get
Author: Rusty Russell
Status: Tested on 2.5.68-bk10

D: Introduces __module_get for places where we know we already hold
D: a reference and ignoring the fact that the module is being "rmmod --wait"ed
D: is simpler.

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .5001-linux-2.5.67-bk5/fs/filesystems.c .5001-linux-2.5.67-bk5.updated/fs/filesystems.c
--- .5001-linux-2.5.67-bk5/fs/filesystems.c	2003-04-14 13:45:44.000000000 +1000
+++ .5001-linux-2.5.67-bk5.updated/fs/filesystems.c	2003-04-14 15:44:36.000000000 +1000
@@ -32,17 +32,7 @@ static rwlock_t file_systems_lock = RW_L
 /* WARNING: This can be used only if we _already_ own a reference */
 void get_filesystem(struct file_system_type *fs)
 {
-	if (!try_module_get(fs->owner)) {
-#ifdef CONFIG_MODULE_UNLOAD
-		unsigned int cpu = get_cpu();
-		local_inc(&fs->owner->ref[cpu].count);
-		put_cpu();
-#else
-		/* Getting filesystem while it's starting up?  We're
-                   already supposed to have a reference. */
-		BUG();
-#endif
-	}
+	__module_get(fs->owner);
 }
 
 void put_filesystem(struct file_system_type *fs)
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .5001-linux-2.5.67-bk5/include/linux/module.h .5001-linux-2.5.67-bk5.updated/include/linux/module.h
--- .5001-linux-2.5.67-bk5/include/linux/module.h	2003-04-08 11:15:01.000000000 +1000
+++ .5001-linux-2.5.67-bk5.updated/include/linux/module.h	2003-04-14 15:45:15.000000000 +1000
@@ -255,6 +255,7 @@ struct module *module_text_address(unsig
 
 #ifdef CONFIG_MODULE_UNLOAD
 
+unsigned int module_refcount(struct module *mod);
 void __symbol_put(const char *symbol);
 #define symbol_put(x) __symbol_put(MODULE_SYMBOL_PREFIX #x)
 void symbol_put_addr(void *addr);
@@ -265,6 +266,17 @@ void symbol_put_addr(void *addr);
 #define local_dec(x) atomic_dec(x)
 #endif
 
+/* Sometimes we know we already have a refcount, and it's easier not
+   to handle the error case (which only happens with rmmod --wait). */
+static inline void __module_get(struct module *module)
+{
+	if (module) {
+		BUG_ON(module_refcount(module) == 0);
+		local_inc(&module->ref[get_cpu()].count);
+		put_cpu();
+	}
+}
+
 static inline int try_module_get(struct module *module)
 {
 	int ret = 1;
@@ -300,6 +317,9 @@ static inline int try_module_get(struct 
 static inline void module_put(struct module *module)
 {
 }
+static inline void __module_get(struct module *module)
+{
+}
 #define symbol_put(x) do { } while(0)
 #define symbol_put_addr(p) do { } while(0)
 
@@ -357,6 +377,10 @@ static inline struct module *module_text
 #define symbol_put(x) do { } while(0)
 #define symbol_put_addr(x) do { } while(0)
 
+static inline void __module_get(struct module *module)
+{
+}
+
 static inline int try_module_get(struct module *module)
 {
 	return 1;
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .5001-linux-2.5.67-bk5/kernel/module.c .5001-linux-2.5.67-bk5.updated/kernel/module.c
--- .5001-linux-2.5.67-bk5/kernel/module.c	2003-04-14 13:45:46.000000000 +1000
+++ .5001-linux-2.5.67-bk5.updated/kernel/module.c	2003-04-14 15:44:36.000000000 +1000
@@ -431,7 +431,7 @@ static inline void restart_refcounts(voi
 }
 #endif
 
-static unsigned int module_refcount(struct module *mod)
+unsigned int module_refcount(struct module *mod)
 {
 	unsigned int i, total = 0;
 
@@ -439,6 +439,7 @@ static unsigned int module_refcount(stru
 		total += atomic_read(&mod->ref[i].count);
 	return total;
 }
+EXPORT_SYMBOL(module_refcount);
 
 /* This exists whether we can unload or not */
 static void free_module(struct module *mod);


================

Name: try_dev_hold
Author: Rusty Russell
Status: Experimental
Depends: Module/module_dup.patch.gz

D: Make dev_hold() actually duplicate the module reference count, and
D: introduce try_dev_hold() for where refcount may be zero.  Some
D: places seemed to use dev_hold() to initialize the device reference
D: count to 1.
D:
D: Lots of fixmes caused by quick audit of __dev_get_by* and
D: dev_getbyhwaddr.

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/drivers/net/shaper.c working-2.5.68-bk10-netdevice/drivers/net/shaper.c
--- linux-2.5.68-bk10/drivers/net/shaper.c	2003-04-08 11:14:26.000000000 +1000
+++ working-2.5.68-bk10-netdevice/drivers/net/shaper.c	2003-05-01 21:21:46.000000000 +1000
@@ -526,6 +526,7 @@ static int shaper_neigh_setup_dev(struct
 
 static int shaper_attach(struct net_device *shdev, struct shaper *sh, struct net_device *dev)
 {
+	/* FIXME: No reference to dev --RR */
 	sh->dev = dev;
 	sh->hard_start_xmit=dev->hard_start_xmit;
 	sh->get_stats=dev->get_stats;
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/drivers/net/wan/dlci.c working-2.5.68-bk10-netdevice/drivers/net/wan/dlci.c
--- linux-2.5.68-bk10/drivers/net/wan/dlci.c	2003-03-18 05:01:46.000000000 +1100
+++ working-2.5.68-bk10-netdevice/drivers/net/wan/dlci.c	2003-05-01 21:20:31.000000000 +1000
@@ -457,6 +457,7 @@ int dlci_add(struct dlci_add *dlci)
 	*(short *)(master->dev_addr) = dlci->dlci;
 
 	dlp = (struct dlci_local *) master->priv;
+	/* FIXME: We have no reference to slave here.  --RR */
 	dlp->slave = slave;
 
 	flp = slave->priv;
@@ -484,6 +485,7 @@ int dlci_del(struct dlci_add *dlci)
 
 	/* validate slave device */
 	master = __dev_get_by_name(dlci->devname);
+	/* FIXME: No lock, no reference held to master. --RR */
 	if (!master)
 		return(-ENODEV);
 
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/include/linux/netdevice.h working-2.5.68-bk10-netdevice/include/linux/netdevice.h
--- linux-2.5.68-bk10/include/linux/netdevice.h	2003-04-08 11:15:01.000000000 +1000
+++ working-2.5.68-bk10-netdevice/include/linux/netdevice.h	2003-05-01 20:03:27.000000000 +1000
@@ -29,6 +29,7 @@
 #include <linux/if_ether.h>
 #include <linux/if_packet.h>
 #include <linux/kobject.h>
+#include <linux/module.h>
 
 #include <asm/atomic.h>
 #include <asm/cache.h>
@@ -634,7 +635,25 @@ static inline void dev_put(struct net_de
 }
 
 #define __dev_put(dev) atomic_dec(&(dev)->refcnt)
-#define dev_hold(dev) atomic_inc(&(dev)->refcnt)
+
+/* If you already have a reference, and are duplicating it. */
+#define dev_hold(dev)				\
+do {						\
+	atomic_inc(&(dev)->refcnt);		\
+	__module_get((dev)->owner);		\
+} while(0)
+
+/* If you need a new reference, or will be holding it for a long time.
+   If this returns 0, pretend dev doesn't exist (it's being removed now). */
+#define try_dev_hold(dev)			\
+({						\
+	int __ret = 1;				\
+	if (try_module_get((dev)->owner))	\
+		atomic_inc(&(dev)->refcnt);	\
+	else					\
+		__ret = 0;			\
+	__ret;					\
+})
 
 /* Carrier loss detection, dial on demand. The functions netif_carrier_on
  * and _off may be called from IRQ context, but it is caller
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/802/tr.c working-2.5.68-bk10-netdevice/net/802/tr.c
--- linux-2.5.68-bk10/net/802/tr.c	2003-02-07 19:21:54.000000000 +1100
+++ working-2.5.68-bk10-netdevice/net/802/tr.c	2003-05-01 21:24:46.000000000 +1000
@@ -479,6 +479,7 @@ static int rif_get_info(char *buffer,cha
 	for(i=0;i < RIF_TABLE_SIZE;i++) 
 	{
 		for(entry=rif_table[i];entry;entry=entry->next) {
+			/* FIXME: No lock, and no reference to dev. --RR */
 			struct net_device *dev = __dev_get_by_index(entry->iface);
 
 			size=sprintf(buffer+len,"%s %02X:%02X:%02X:%02X:%02X:%02X %7li ",
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/appletalk/ddp.c working-2.5.68-bk10-netdevice/net/appletalk/ddp.c
--- linux-2.5.68-bk10/net/appletalk/ddp.c	2003-05-01 09:29:34.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/appletalk/ddp.c	2003-05-01 21:28:03.000000000 +1000
@@ -920,6 +920,7 @@ static int atrtr_ioctl(unsigned int cmd,
 			 * space, isn't it?
 			 */
 			if (rt.rt_dev) {
+				/* FIXME: No lock, and no reference to dev --RR */
 				dev = __dev_get_by_name(rt.rt_dev);
 				if (!dev)
 					return -ENODEV;
@@ -1217,6 +1218,7 @@ static __inline__ int is_ip_over_ddp(str
 
 static int handle_ip_over_ddp(struct sk_buff *skb)
 {
+	/* FIXME: No lock, and no reference held to dev. --RR */
         struct net_device *dev = __dev_get_by_name("ipddp0");
 	struct net_device_stats *stats;
 
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/core/dev.c working-2.5.68-bk10-netdevice/net/core/dev.c
--- linux-2.5.68-bk10/net/core/dev.c	2003-05-01 09:29:35.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/core/dev.c	2003-05-01 20:19:24.000000000 +1000
@@ -440,8 +440,8 @@ struct net_device *dev_get_by_name(const
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_name(name);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !try_dev_hold(dev))
+		dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -513,8 +513,8 @@ struct net_device *dev_get_by_index(int 
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_index(ifindex);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !try_dev_hold(dev))
+		dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -563,8 +563,8 @@ struct net_device * dev_get_by_flags(uns
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_flags(if_flags, mask);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !try_dev_hold(dev))
+		dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -2611,7 +2611,7 @@ int register_netdevice(struct net_device
 	dev_init_scheduler(dev);
 	write_lock_bh(&dev_base_lock);
 	*dp = dev;
-	dev_hold(dev);
+	atomic_set(&dev->refcnt, 1);
 	dev->deadbeaf = 0;
 	write_unlock_bh(&dev_base_lock);
 
@@ -2899,7 +2899,7 @@ static int __init net_dev_init(void)
 #endif
 		dev->xmit_lock_owner = -1;
 		dev->iflink = -1;
-		dev_hold(dev);
+		atomic_set(&dev->refcnt, 1);
 
 		/*
 		 * Allocate name. If the init() fails
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/core/pktgen.c working-2.5.68-bk10-netdevice/net/core/pktgen.c
--- linux-2.5.68-bk10/net/core/pktgen.c	2003-05-01 09:29:35.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/core/pktgen.c	2003-05-01 21:31:45.000000000 +1000
@@ -226,21 +226,20 @@ static struct net_device *setup_inject(s
 {
 	struct net_device *odev;
 
-	rtnl_lock();
-	odev = __dev_get_by_name(info->outdev);
+	odev = dev_get(info->outdev);
 	if (!odev) {
 		sprintf(info->result, "No such netdevice: \"%s\"", info->outdev);
-		goto out_unlock;
+		return NULL;
 	}
 
 	if (odev->type != ARPHRD_ETHER) {
 		sprintf(info->result, "Not ethernet device: \"%s\"", info->outdev);
-		goto out_unlock;
+		goto out_put;
 	}
 
 	if (!netif_running(odev)) {
 		sprintf(info->result, "Device is down: \"%s\"", info->outdev);
-		goto out_unlock;
+		goto out_put;
 	}
 
 	/* Default to the interface's mac if not explicitly set. */
@@ -281,14 +280,11 @@ static struct net_device *setup_inject(s
 	info->cur_daddr = info->daddr_min;
 	info->cur_udp_dst = info->udp_dst_min;
 	info->cur_udp_src = info->udp_src_min;
-	
-	atomic_inc(&odev->refcnt);
-	rtnl_unlock();
 
 	return odev;
 
-out_unlock:
-	rtnl_unlock();
+out_put:
+	dev_put(odev);
 	return NULL;
 }
 
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/decnet/dn_dev.c working-2.5.68-bk10-netdevice/net/decnet/dn_dev.c
--- linux-2.5.68-bk10/net/decnet/dn_dev.c	2003-04-20 18:05:16.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/decnet/dn_dev.c	2003-05-01 21:35:29.000000000 +1000
@@ -312,9 +312,7 @@ struct net_device *dn_dev_get_default(vo
 	read_lock(&dndev_lock);
 	dev = decnet_default_device;
 	if (dev) {
-		if (dev->dn_ptr)
-			dev_hold(dev);
-		else
+		if (!dev->dn_ptr || !try_dev_hold(dev))
 			dev = NULL;
 	}
 	read_unlock(&dndev_lock);
@@ -584,6 +582,8 @@ int dn_dev_ioctl(unsigned int cmd, void 
 		goto done;
 	}
 
+	/* FIXME: if cmd == SIOCGIFADDR, don't hold lock, and don't
+           have reference to dev. --RR */
 	if ((dn_db = dev->dn_ptr) != NULL) {
 		for (ifap = &dn_db->ifa_list; (ifa=*ifap) != NULL; ifap = &ifa->ifa_next)
 			if (strcmp(ifr->ifr_name, ifa->ifa_label) == 0)
@@ -677,6 +677,7 @@ static int dn_dev_rtm_newaddr(struct sk_
 	if (rta[IFA_LOCAL-1] == NULL)
 		return -EINVAL;
 
+	/* FIXME: Don't have lock, and don't hold reference to dev. --RR */
 	if ((dev = __dev_get_by_index(ifm->ifa_index)) == NULL)
 		return -ENODEV;
 
@@ -1189,9 +1190,10 @@ void dn_dev_up(struct net_device *dev)
 	 * configured ethernet card in the system.
 	 */
 	if (maybe_default) {
-		dev_hold(dev);
-		if (dn_dev_set_default(dev, 0))
-			dev_put(dev);
+		if (try_dev_hold(dev)) {
+			if (dn_dev_set_default(dev, 0))
+				dev_put(dev);
+		}
 	}
 }
 
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/decnet/dn_fib.c working-2.5.68-bk10-netdevice/net/decnet/dn_fib.c
--- linux-2.5.68-bk10/net/decnet/dn_fib.c	2003-04-20 18:05:16.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/decnet/dn_fib.c	2003-05-01 21:43:34.000000000 +1000
@@ -218,7 +218,9 @@ static int dn_fib_check_nh(const struct 
 			if (!(dev->flags&IFF_UP))
 				return -ENETDOWN;
 			nh->nh_dev = dev;
-			atomic_inc(&dev->refcnt);
+			/* FIXME:  Must hold lock, or use dev_get_by_index.
+			   --RR */
+			dev_hold(dev);
 			nh->nh_scope = RT_SCOPE_LINK;
 			return 0;
 		}
@@ -262,7 +264,7 @@ out:
 		if (!(dev->flags&IFF_UP))
 			return -ENETDOWN;
 		nh->nh_dev = dev;
-		atomic_inc(&nh->nh_dev->refcnt);
+		dev_hold(&nh->nh_dev);
 		nh->nh_scope = RT_SCOPE_HOST;
 	}
 
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/decnet/dn_route.c working-2.5.68-bk10-netdevice/net/decnet/dn_route.c
--- linux-2.5.68-bk10/net/decnet/dn_route.c	2003-05-01 09:29:35.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/decnet/dn_route.c	2003-05-01 20:58:08.000000000 +1000
@@ -891,7 +891,9 @@ static int dn_route_output_slow(struct d
 		read_unlock(&dev_base_lock);
 		if (dev_out == NULL)
 			goto out;
-		dev_hold(dev_out);
+		/* FIXME: Shouldn't this be inside the lock? --RR */
+		if (!try_dev_hold(dev_out))
+			goto out;
 source_ok:
 		;
 	}
@@ -960,8 +962,10 @@ source_ok:
 					} else {
 						dev_out = neigh->dev;
 					}
-					dev_hold(dev_out);
-					goto select_source;
+					if (try_dev_hold(dev_out))
+						goto select_source;
+					else
+						dev_out = NULL;
 				}
 			}
 		}
@@ -1035,7 +1039,10 @@ select_source:
 	if (dev_out)
 		dev_put(dev_out);
 	dev_out = DN_FIB_RES_DEV(res);
-	dev_hold(dev_out);
+	if (!try_dev_hold(dev_out)) {
+		dev_out = NULL;
+		goto e_addr;
+	}
 	fl.oif = dev_out->ifindex;
 	gateway = DN_FIB_RES_GW(res);
 
@@ -1231,7 +1238,8 @@ static int dn_route_input_slow(struct sk
 						 "No output device\n");
 			goto e_inval;
 		}
-		dev_hold(out_dev);
+		if (!try_dev_hold(out_dev))
+			goto e_inval;
 
 		if (res.r)
 			src_map = dn_fib_rules_policy(fl.fld_src, &res, &flags);
@@ -1349,8 +1357,12 @@ make_route:
 			rt->u.dst.input = dn_blackhole;
 	}
 	rt->rt_flags = flags;
-	if (rt->u.dst.dev)
-		dev_hold(rt->u.dst.dev);
+	if (rt->u.dst.dev) {
+		if (!try_dev_hold(rt->u.dst.dev)) {
+			dst_free(&rt->u.dst);
+			goto e_inval;
+		}
+	}
 
 	if (dn_rt_set_next_hop(rt, &res))
 		goto e_neighbour;
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/decnet/dn_rules.c working-2.5.68-bk10-netdevice/net/decnet/dn_rules.c
--- linux-2.5.68-bk10/net/decnet/dn_rules.c	2003-04-20 18:05:16.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/decnet/dn_rules.c	2003-05-01 21:37:58.000000000 +1000
@@ -173,6 +173,7 @@ int dn_fib_rtm_newrule(struct sk_buff *s
 		memcpy(new_r->r_ifname, RTA_DATA(rta[RTA_IIF-1]), IFNAMSIZ);
 		new_r->r_ifname[IFNAMSIZ-1] = 0;
 		new_r->r_ifindex = -1;
+		/* FIXME: Don't hold lock, and don't get reference. --RR */
 		dev = __dev_get_by_name(new_r->r_ifname);
 		if (dev)
 			new_r->r_ifindex = dev->ifindex;
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/ipv4/fib_frontend.c working-2.5.68-bk10-netdevice/net/ipv4/fib_frontend.c
--- linux-2.5.68-bk10/net/ipv4/fib_frontend.c	2003-05-01 20:36:37.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/ipv4/fib_frontend.c	2003-01-02 12:30:47.000000000 +1100
@@ -115,8 +115,8 @@ struct net_device * ip_dev_find(u32 addr
 	if (res.type != RTN_LOCAL)
 		goto out;
 	dev = FIB_RES_DEV(res);
-	if (dev && !try_dev_get(dev))
-		dev = NULL;
+	if (dev)
+		atomic_inc(&dev->refcnt);
 
 out:
 	fib_res_put(&res);
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/ipv4/fib_semantics.c working-2.5.68-bk10-netdevice/net/ipv4/fib_semantics.c
--- linux-2.5.68-bk10/net/ipv4/fib_semantics.c	2003-05-01 09:29:35.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/ipv4/fib_semantics.c	2003-05-01 21:39:20.000000000 +1000
@@ -405,6 +405,8 @@ static int fib_check_nh(const struct rtm
 				return -ENODEV;
 			if (!(dev->flags&IFF_UP))
 				return -ENETDOWN;
+			/* FIXME:  Must hold lock, or use dev_get_by_index.
+			   --RR */
 			nh->nh_dev = dev;
 			atomic_inc(&dev->refcnt);
 			nh->nh_scope = RT_SCOPE_LINK;
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/ipv4/ip_gre.c working-2.5.68-bk10-netdevice/net/ipv4/ip_gre.c
--- linux-2.5.68-bk10/net/ipv4/ip_gre.c	2003-05-01 09:29:35.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/ipv4/ip_gre.c	2003-05-01 21:50:07.000000000 +1000
@@ -289,7 +289,7 @@ static struct ip_tunnel * ipgre_tunnel_l
 	if (register_netdevice(dev) < 0)
 		goto failed;
 
-	dev_hold(dev);
+	atomic_set(&dev->refcnt, 1);
 	ipgre_tunnel_link(nt);
 	/* Do not decrement MOD_USE_COUNT here. */
 	return nt;
@@ -1205,6 +1205,7 @@ static int ipgre_tunnel_init(struct net_
 	}
 
 	if (!tdev && tunnel->parms.link)
+		/* FIXME: Don't hold lock, don't grab reference. --RR */
 		tdev = __dev_get_by_index(tunnel->parms.link);
 
 	if (tdev) {
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/ipv4/ipip.c working-2.5.68-bk10-netdevice/net/ipv4/ipip.c
--- linux-2.5.68-bk10/net/ipv4/ipip.c	2003-04-20 18:05:16.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/ipv4/ipip.c	2003-05-01 21:51:47.000000000 +1000
@@ -259,7 +259,7 @@ static struct ip_tunnel * ipip_tunnel_lo
 	if (register_netdevice(dev) < 0)
 		goto failed;
 
-	dev_hold(dev);
+	atomic_set(&dev->refcnt, 1);
 	ipip_tunnel_link(nt);
 	/* Do not decrement MOD_USE_COUNT here. */
 	return nt;
@@ -841,6 +841,7 @@ static int ipip_tunnel_init(struct net_d
 	}
 
 	if (!tdev && tunnel->parms.link)
+		/* FIXME: Don't hold lock, don't grab reference. --RR */
 		tdev = __dev_get_by_index(tunnel->parms.link);
 
 	if (tdev) {
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/ipv4/ipmr.c working-2.5.68-bk10-netdevice/net/ipv4/ipmr.c
--- linux-2.5.68-bk10/net/ipv4/ipmr.c	2003-03-25 12:17:32.000000000 +1100
+++ working-2.5.68-bk10-netdevice/net/ipv4/ipmr.c	2003-05-01 20:39:29.000000000 +1000
@@ -443,7 +443,6 @@ static int vif_add(struct vifctl *vifc, 
 
 	/* And finish update writing critical data */
 	write_lock_bh(&mrt_lock);
-	dev_hold(dev);
 	v->dev=dev;
 #ifdef CONFIG_IP_PIMSM
 	if (v->flags&VIFF_REGISTER)
@@ -1441,8 +1440,8 @@ int pim_rcv_v1(struct sk_buff * skb)
 	read_lock(&mrt_lock);
 	if (reg_vif_num >= 0)
 		reg_dev = vif_table[reg_vif_num].dev;
-	if (reg_dev)
-		dev_hold(reg_dev);
+	if (reg_dev && !try_dev_hold(reg_dev))
+		reg_dev = NULL;
 	read_unlock(&mrt_lock);
 
 	if (reg_dev == NULL) {
@@ -1508,8 +1507,8 @@ int pim_rcv(struct sk_buff * skb)
 	read_lock(&mrt_lock);
 	if (reg_vif_num >= 0)
 		reg_dev = vif_table[reg_vif_num].dev;
-	if (reg_dev)
-		dev_hold(reg_dev);
+	if (reg_dev && !try_dev_hold(reg_dev))
+		reg_dev = NULL;
 	read_unlock(&mrt_lock);
 
 	if (reg_dev == NULL) {
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/ipv4/route.c working-2.5.68-bk10-netdevice/net/ipv4/route.c
--- linux-2.5.68-bk10/net/ipv4/route.c	2003-05-01 09:29:35.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/ipv4/route.c	2003-05-01 20:41:28.000000000 +1000
@@ -1992,7 +1992,8 @@ int ip_route_output_slow(struct rtable *
 	if (dev_out)
 		dev_put(dev_out);
 	dev_out = FIB_RES_DEV(res);
-	dev_hold(dev_out);
+	if (!try_dev_hold(dev_out))
+		goto e_inval;
 	fl.oif = dev_out->ifindex;
 
 make_route:
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/ipv6/sit.c working-2.5.68-bk10-netdevice/net/ipv6/sit.c
--- linux-2.5.68-bk10/net/ipv6/sit.c	2003-04-20 18:05:17.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/ipv6/sit.c	2003-05-01 21:55:21.000000000 +1000
@@ -197,7 +197,7 @@ static struct ip_tunnel * ipip6_tunnel_l
 	if (register_netdevice(dev) < 0)
 		goto failed;
 
-	dev_hold(dev);
+	atomic_set(&dev->refcount, 1);
 	ipip6_tunnel_link(nt);
 	/* Do not decrement MOD_USE_COUNT here. */
 	return nt;
@@ -778,6 +778,7 @@ static int ipip6_tunnel_init(struct net_
 	}
 
 	if (!tdev && tunnel->parms.link)
+		/* FIXME: Don't hold lock, don't grab reference. --RR */
 		tdev = __dev_get_by_index(tunnel->parms.link);
 
 	if (tdev) {
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/llc/af_llc.c working-2.5.68-bk10-netdevice/net/llc/af_llc.c
--- linux-2.5.68-bk10/net/llc/af_llc.c	2003-05-01 09:29:36.000000000 +1000
+++ working-2.5.68-bk10-netdevice/net/llc/af_llc.c	2003-05-01 21:13:49.000000000 +1000
@@ -256,6 +256,7 @@ static int llc_ui_autobind(struct socket
 		rc = -ENETUNREACH;
 		if (!dev)
 			goto out;
+		/* FIXME: We don't hold a reference to dev --RR */
 		llc->dev = dev;
 	}
 	/* bind to a specific sap, optional. */
@@ -419,6 +420,7 @@ static int llc_ui_connect(struct socket 
 		rtnl_unlock();
 		if (!dev)
 			goto out;
+		/* FIXME: We don't hold a reference to dev --RR */
 		llc->dev = dev;
 	} else
 		dev = llc->dev;
@@ -764,6 +766,7 @@ static int llc_ui_sendmsg(struct kiocb *
 		dev = dev_getbyhwaddr(addr->sllc_arphrd, addr->sllc_smac);
 		rtnl_unlock();
 		rc = -ENETUNREACH;
+		/* FIXME: We don't hold a reference to dev --RR */
 		if (!dev)
 			goto release;
 	} else
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/netrom/nr_route.c working-2.5.68-bk10-netdevice/net/netrom/nr_route.c
--- linux-2.5.68-bk10/net/netrom/nr_route.c	2003-01-02 12:27:51.000000000 +1100
+++ working-2.5.68-bk10-netdevice/net/netrom/nr_route.c	2003-05-01 20:51:51.000000000 +1000
@@ -567,8 +567,8 @@ struct net_device *nr_dev_get(ax25_addre
 	read_lock(&dev_base_lock);
 	for (dev = dev_base; dev != NULL; dev = dev->next) {
 		if ((dev->flags & IFF_UP) && dev->type == ARPHRD_NETROM && ax25cmp(addr, (ax25_address *)dev->dev_addr) == 0) {
-			dev_hold(dev);
-			goto out;
+			if (try_dev_hold(dev))
+				goto out;
 		}
 	}
 out:
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.68-bk10/net/rose/rose_route.c working-2.5.68-bk10-netdevice/net/rose/rose_route.c
--- linux-2.5.68-bk10/net/rose/rose_route.c	2003-03-18 12:21:41.000000000 +1100
+++ working-2.5.68-bk10-netdevice/net/rose/rose_route.c	2003-05-01 20:51:59.000000000 +1000
@@ -629,8 +629,8 @@ struct net_device *rose_dev_get(rose_add
 	read_lock(&dev_base_lock);
 	for (dev = dev_base; dev != NULL; dev = dev->next) {
 		if ((dev->flags & IFF_UP) && dev->type == ARPHRD_ROSE && rosecmp(addr, (rose_address *)dev->dev_addr) == 0) {
-			dev_hold(dev);
-			goto out;
+			if (try_dev_hold(dev))
+				goto out;
 		}
 	}
 out:

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-01 17:51         ` dev->destructor Arnaldo Carvalho de Melo
@ 2003-05-01 16:55           ` David S. Miller
  0 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2003-05-01 16:55 UTC (permalink / raw)
  To: acme; +Cc: rusty, kuznet, shemminger, netdev, wa

   From: Arnaldo Carvalho de Melo <acme@conectiva.com.br>
   Date: Thu, 1 May 2003 14:51:11 -0300

   Well, I think that because there are a graph of relationships here we perhaps
   can be safe by protecting just some of the higher level objects (e.g. struct
   sock, struct socket, struct net_device) while leaving some other lower level
   objects managed by those higher level ones, e.g. struct sk_buff managed by
   struct sock.

The graphs are unfortunately not completely connected.

For example, sk_buff's can be sent not assosciated with any socket.
Routing cache entries are not attached to any particular client,
similar with ARP/neighbour entires, and sk_buff's in turn hold
references to these things.

See, long ago we used to not do proper reference counting
on struct sock's.  We used to rely on graphs of relationships
and certain sock states to control destruction of these objects.
The networking was riddled with obscure bugs because of this.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-01 12:01     ` dev->destructor Rusty Russell
  2003-05-01 11:09       ` dev->destructor David S. Miller
@ 2003-05-01 17:28       ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2003-05-01 17:28 UTC (permalink / raw)
  To: Rusty Russell
  Cc: David S. Miller, kuznet, shemminger, netdev, Werner Almesberger

Em Thu, May 01, 2003 at 10:01:19PM +1000, Rusty Russell escreveu:
 
> But before we redesign module architecture from scratch, let's look at
> a solution with what we do have (assuming Linus takes my damn
> __module_get() patch some day, see below).

Linus took the __module_get patch, I even used it in redesigning the way
struct sock and struct socket are handled in response to Max Krasnyansky
alternative patches
 
> There are 70 calls to dev_hold() in the kernel.  The vast majority of
> them already have a reference, they just want another one: dev_hold()
> can do __module_get().

yes
 
> There are a few *sources* of devices: dev_get, dev_get_by*.  These
> should check and fail, using "try_dev_hold()" or something.
> Unfortunately auditing all the __dev_get_by* is quite a task, since
> it's used very widely (and I think, sometime erroneously).
> 
> Completely untested patch below other patch.
> 
> I need more time to digest your proposal in detail, Dave.  Expect
> reply w/in 24 hours.

I'm digesting it as well 8)

- Arnaldo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-01 11:09       ` dev->destructor David S. Miller
@ 2003-05-01 17:51         ` Arnaldo Carvalho de Melo
  2003-05-01 16:55           ` dev->destructor David S. Miller
  0 siblings, 1 reply; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2003-05-01 17:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: rusty, kuznet, shemminger, netdev, wa

Em Thu, May 01, 2003 at 04:09:35AM -0700, David S. Miller escreveu:
>    From: Rusty Russell <rusty@rustcorp.com.au>
>    Date: Thu, 01 May 2003 22:01:19 +1000
>    
>    There are 70 calls to dev_hold() in the kernel.  The vast majority of
>    them already have a reference, they just want another one: dev_hold()
>    can do __module_get().
> 
> Rusty, this is precisely the what Alexey and myself want to avoid.  On
> the surface, it looks fine, only 70 dev_get's in the kernel right?
> But think further...
> 
> So you propose to add this kind of thing for every ARP entry, every
> route cache entry, every IPSEC policy, every socket, every struct
> sock, every networking dynamic object ever created?

ALERT: brainstorming and expecting for comments from the people who knows this
better.

Well, I think that because there are a graph of relationships here we perhaps
can be safe by protecting just some of the higher level objects (e.g. struct
sock, struct socket, struct net_device) while leaving some other lower level
objects managed by those higher level ones, e.g. struct sk_buff managed by
struct sock.

This came to me while discussing the struct socket and struct sock module
infrastructure with Max, specifically when net family modules (e.g. AF_INET)
doesn't requires protecting for each and every struct socket created, as the
protocol modules (e.g.: udp, raw, tcp) have to somehow register with the net
family module and by just using one exported function (register_protocol type
functions: register_pppox_proto, bt_sock_register, register_8022_client,
register_snap_client, llc_sap_open, etc) makes the net family module/lower
level protocol protected.

So we need to have a graph of these relationships to see what we have to
protect at a higher level, reducing the overhead of otherwise having to call
try_module_get/__module_get & module_put on _all_ objects creation/use.

- Arnaldo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-01  7:00   ` dev->destructor David S. Miller
  2003-05-01 12:01     ` dev->destructor Rusty Russell
@ 2003-05-02  4:06     ` kuznet
  2003-05-02  5:25       ` dev->destructor Rusty Russell
  1 sibling, 1 reply; 28+ messages in thread
From: kuznet @ 2003-05-02  4:06 UTC (permalink / raw)
  To: David S. Miller; +Cc: shemminger, netdev, acme, rusty

Hello!

> None of destructors kill reference to device object, and I mean none
> of them.  It is why I thought the idea works.

When you call unregister_something() you hold a reference to this
something, and you have no idea how much of the references
you hold. This is invisible when unregister_something() is called
from a single place sort of cleanup_module().

> Also, holding RTNL semaphore does not block potential holders
> of device reference.  Or does it?

When that branch with waiting inside unregister() exists you can't hold
any reference when grabbing this semaphore. That's why dev_ioctl()
takes the semaphore first.

It is damnly inconvenient, fragile, et cetera and such bugs do exist.
That's why unregister_netdev() is logically wrong function: it takes
dev as argument, so any sane programmer would assume caller holds
a reference. But he can't. So, call of the function is allowed
only from contexts where device is presumed to be held, i.e. from
cleanup_module() and from no other places.

> Your idea, as I understand it, is to add callback to module that
> freezes module then at some time in the future makes indication
> that module is clean and may be unloaded safely.

Yes, the description is mostly right.

> Question is, where to make this?  It cannot be in the module
> itself of course, that would create race condition similar to
> one which exists today making this exercise quite pointless :-)

Probably, you should look at the most first module implementation. :-)
Desite of it was horrible (like almost everything in kernel that days)
it was logically correct.

rmmod deleted module not depending on refcnt and module body was destroyed
later, when refcnt reached zero. See?

So, that cleanup_module() is replaced with shutdown() and a destructor
is added to allow to cleanup something but memory, if it is necessary.

And to handle the situation when we do not want to use module refcnt,
a predicate to ask module for readiness to kill is added. I think
it can be combined to destructor, so that for such modules destructor
can return -EAGAIN. Well, when refcnt is zero you can try to destroy module
and it might disagree.

> Note the problem.  If between the unregister_foo() and re-register of
> foo in the failure path, someone asks to open a "foo" it will fail.
> Those semantics absolutely stink.

It is not a problem at all comparing to real ones. :-)

> Rusty went on to describe that this is one of the reasons for the
> current "enable refcounts" scheme with try_module_get().

Rusty forgot that crippling xxx_get() is million times more painful. :-)

He also forgot that in 99% of cases there is a single registry and
this registry must be self-consistent, so all the work is already done
and module.c just invades area out of its competence.

Anyway, this approach is legal and the simplest one, I understand this.
It can be used optionally. Only, frankly speaking, I do not see any
applications for this, because when more than one registry exists,
module is surely so complicated that tracking references is painful
enough to forget about this. Anyway, we always know number of sockets et al.,
so why to count them in more than one place?

netdevices is the simplest example, but it shows the most didctively
that all the ocurences of module_** there are illegal. We want
to register/unregister them dynamically, we have to do all the job not
depending on modules. We have to do our own refcounting. And incorrect
design of modules only prevents to make final small step to make this
right. Well, the key moment is that while device is registered, its
module refcnt is not zero logically, but we can't unload the module
in this case, so we have to do funny try_* each lookup.

Alexey

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-02  4:06     ` dev->destructor kuznet
@ 2003-05-02  5:25       ` Rusty Russell
  2003-05-02 20:48         ` dev->destructor David S. Miller
  0 siblings, 1 reply; 28+ messages in thread
From: Rusty Russell @ 2003-05-02  5:25 UTC (permalink / raw)
  To: kuznet; +Cc: David S. Miller, shemminger, netdev, acme

In message <200305020406.IAA10719@sex.inr.ac.ru> you write:
> Hello!

Hi Alexey!

> It is damnly inconvenient, fragile, et cetera and such bugs do exist.
> That's why unregister_netdev() is logically wrong function: it takes
> dev as argument, so any sane programmer would assume caller holds
> a reference. But he can't. So, call of the function is allowed
> only from contexts where device is presumed to be held, i.e. from
> cleanup_module() and from no other places.

If this is true, I think you can use the module reference count only,
and your code will be faster, too.  I can prepare the patch for you
later tonight, to see how it looks.

> netdevices is the simplest example, but it shows the most didctively
> that all the ocurences of module_** there are illegal. We want
> to register/unregister them dynamically, we have to do all the job not
> depending on modules. We have to do our own refcounting. And incorrect
> design of modules only prevents to make final small step to make this
> right. Well, the key moment is that while device is registered, its
> module refcnt is not zero logically, but we can't unload the module
> in this case, so we have to do funny try_* each lookup.

Alexey, you are using a module but don't want to reference count it.
I made module reference counts very cheap so you don't have to worry,
but you still are trying to cheat 8)

You want to be very tricky and count all ways into the module,
instead.  Clearly this is mathematically possible, but in practice
very tricky.  And all solutions I have seen which do this are ugly,
and leave us with "remove may not succeed, it may hang forever, and
you won't know, and you can't replace the module and need to reboot if
it happens". 8(

Better, I think, to make CONFIG_MODULE_UNLOAD=n, and make
CONFIG_MODULE_FORCE_UNLOAD work even if CONFIG_MODULE_UNLOAD=n.

Hope that clarifies?
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-02  5:25       ` dev->destructor Rusty Russell
@ 2003-05-02 20:48         ` David S. Miller
  2003-05-03  4:07           ` dev->destructor Rusty Russell
  0 siblings, 1 reply; 28+ messages in thread
From: David S. Miller @ 2003-05-02 20:48 UTC (permalink / raw)
  To: rusty; +Cc: kuznet, shemminger, netdev, acme

   From: Rusty Russell <rusty@rustcorp.com.au>
   Date: Fri, 02 May 2003 15:25:15 +1000

   If this is true, I think you can use the module reference count only,
   and your code will be faster, too.  I can prepare the patch for you
   later tonight, to see how it looks.

And where do we get the counter from when dev->owner is NULL
(ie. non-modular)?  We need the reference counting regardless of
whether the device is implemented statically in the kernel or modular.

Do you propose to attach dummy struct module to non-modular case?
I am curious...

   Alexey, you are using a module but don't want to reference count it.
   I made module reference counts very cheap so you don't have to worry,
   but you still are trying to cheat 8)

Understood.

I think even stronger part of Alexey's argument is that all of
this "if (x->owner)" all over the place takes away some of the
gains of compiling things statically into the kernel.  Why extra
branches all over the place?

   You want to be very tricky and count all ways into the module,
   instead.  Clearly this is mathematically possible, but in practice
   very tricky.  And all solutions I have seen which do this are ugly,
   and leave us with "remove may not succeed, it may hang forever, and
   you won't know, and you can't replace the module and need to reboot if
   it happens". 8(

As long as I can Control-C rmmod when it waits like this, which would
be the case, what is the problem?

Also, not only is this mathematically possible it is DONE already.
Hmmm, there seems to be massive disconnect here between what we
understand here and what you appear to.  Let me try to describe it
in detail.

All reasonable protocol code must do exactly this.  Any module which
does not properly keep track of the objects it is creating has
problems bigger than proper module handling.

It is not "very tricky", but rather "required".

Look at it this way, when module kmalloc's something does it
immediately forget about this?  This seems to be what you suggest, and
it is a dangerous way to think!

No, rather, it remembers that it did this, either by setting '1'
to refcount of this object, or attaching it to some hash table, list,
tree, or other global data structure it maintains.  Any time this
object is attached somewhere else, reference count is incremented.
Anytime it is detached or destroyed, refcount is decremented and final
decrement to zero makes final killing of this object.  It is ABCs of
programming. :-)

Apply this to every dynamic object created by a module, and the end
result is that it makes the work of counting all internal references.
Ergo, module refcounting is superfluous.  Look, once external view
into module (ie. socket operations, superblock ops, netdev registry)
is removed, all that remains to reference object is exactly these
objects.  It is the only different part about modules
vs. non-modules.

After threading the networking and adding true refcounting to sockets
I will never forget these rules. :-)

   Better, I think, to make CONFIG_MODULE_UNLOAD=n, and make
   CONFIG_MODULE_FORCE_UNLOAD work even if CONFIG_MODULE_UNLOAD=n.

As much as I'd like to be able to accept that behavior,
it's too much breakage.  So many people periodically make
rmmod attempts to unload unused modules, distributions even
make this by default (or at least used to).

Let's look at this aspect of behavior:

1) Some people think that -EBUSY return is unexpected.
   I fall into this category.

2) It is argued that some other people think the "wait until
   unloadable" behavior is unexpected.

But nobody would be surprised if rmmod told them:

====================
Trying to unload %s, waiting for all references to go away.
Perhaps you have programs running which still reference this
module?  (Hit Ctrl-C to interrupt unload to go and remove
those references)
====================

nobody would ask what does this mean. :-)

In fact, what IF rmmod was able to know it was unloading a filesystem
and therefore could walk the mount list to find mounted instances of
this filesystem and print that to the user in the rmmod message?  Or
for network protocols to print the socket list of
sockets/routes/devices open to that module and even making 'lsof' to
print process name/pid holding open such sockets?

I bet even Linus himself would exclaim "wow, that's nice."

Compare this to "-EBUSY". :-)))))))))

And I want to mention that in some cases you have to "wait".  The best
example are TCP_TIME_WAIT sockets.  Even after users downs all the
interfaces, and closes all the sockets, these remnants must remain for
their full life of 60 seconds.

I really am concerned at both sides, both user observed behavior and
kernel side correctness.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-03  4:07           ` dev->destructor Rusty Russell
@ 2003-05-03  3:46             ` David S. Miller
  2003-05-05  5:18               ` dev->destructor Rusty Russell
  2003-05-03  4:00             ` dev->destructor David S. Miller
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 28+ messages in thread
From: David S. Miller @ 2003-05-03  3:46 UTC (permalink / raw)
  To: rusty; +Cc: kuznet, shemminger, netdev, acme

   From: Rusty Russell <rusty@rustcorp.com.au>
   Date: Sat, 03 May 2003 14:07:41 +1000

   I imagined schemes where the kernel would be basically stopped during
   module remove, so the half-remove and unremove would appear atomic.
   I shied away from implementing such a monster without deadlock, but it
   might be possible.  Then we would truly have nirvana 8)

I think this is an interesting area for exploration.

This isn't a unique requirement BTW Rusty.  IP conntrack rehashing
in the presence of RCU would want something just like this now
wouldn't it?  Consider other applications, such as hot plug memory.
I'm sure tons of other interesting examples could be imagined.

So indeed, the key to nirvana would indeed be here :-)

I think it can work Rusty, in short you create 1 freeze thread
per cpu.  You wake up all the freeze threads on non-local cpus,
and they indicate their presence via some bitmask.

Once the master cpu sees all non-local-cpu bits set in the bitmask,
it begins the unload sequence, after the unload the cpu mask is
cleared and this signals the freeze threads to break from their spin
loops and schedule().

This means the local master cpu executes the unload sequence.  It may
sleep in order to yield to, for example, semaphore holders, it may
also sleep to yield to kswapd and friends for the sake of memory
allocation.  I mean... consider all the situations and please try to
find some hole in this.  We can make all try_to_*() sleep at this
time too... this in particular needs more thought.

To make these freeze threads globally useful, we allow them to
run atomicity commands.  The two defined commands are "local_irq_*()"
and "local_bh_*()", two bitmasks control this and the freeze threads
check the bits in their spin loops.

Do you see?  Maybe... it is nearly Nirvana! :-)))))

Our ability to implement this changes the rest of the conversation,
so let us resolve this first.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-03  4:07           ` dev->destructor Rusty Russell
  2003-05-03  3:46             ` dev->destructor David S. Miller
@ 2003-05-03  4:00             ` David S. Miller
  2003-05-05 16:08             ` dev->destructor Stephen Hemminger
  2003-05-05 20:00             ` dev->destructor Stephen Hemminger
  3 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2003-05-03  4:00 UTC (permalink / raw)
  To: rusty; +Cc: kuznet, shemminger, netdev, acme

   From: Rusty Russell <rusty@rustcorp.com.au>
   Date: Sat, 03 May 2003 14:07:41 +1000

   This argument applies to all objects.  If you reference count
   everything which holds a reference to an object, you can infer the
   reference count of the object from the sum of reference counts of its
   referees.

   In practice, as you pointed out in an earlier mail (I think sockets
   were your example), doing this proves to be extremely painful.  And
   we're feeling the pain now.

Please ignore the example code I wrote in that email.
Most of it is inconsistent and frankly garbage. :-)

The "->can_unload()" check is actually simpler than we might initially
suspect.  Something like ipv6 might check:

	if (atomic_read(&inet6_sock_nr) == 0 &&
	    atomic_read(&inet6_dev_nr) == 0 &&
	    rt6_cache_empty())
		return 1;
	return 0;

Now, here is the important part!  When this thing returns "1" the
module.c code does this:

	call_rcu(&mod->rcu_head, mod->cleanup, NULL);

This makes sure the guy who killed the last object has indeed
left the module code.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-02 20:48         ` dev->destructor David S. Miller
@ 2003-05-03  4:07           ` Rusty Russell
  2003-05-03  3:46             ` dev->destructor David S. Miller
                               ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Rusty Russell @ 2003-05-03  4:07 UTC (permalink / raw)
  To: David S. Miller; +Cc: kuznet, shemminger, netdev, acme

In message <20030502.134804.78707298.davem@redhat.com> you write:
>    From: Rusty Russell <rusty@rustcorp.com.au>
>    Date: Fri, 02 May 2003 15:25:15 +1000
> 
>    If this is true, I think you can use the module reference count only,
>    and your code will be faster, too.  I can prepare the patch for you
>    later tonight, to see how it looks.
>    
> And where do we get the counter from when dev->owner is NULL
> (ie. non-modular)?  We need the reference counting regardless of
> whether the device is implemented statically in the kernel or modular.

But Alexey said you can only call unregister_netdev from module
unload, ie. if not a module, it can't be unloaded, hence no refcount
needed.  I wrote the above paragraph because I'm not sure if I
understood Alexey correctly?

>    Alexey, you are using a module but don't want to reference count it.
>    I made module reference counts very cheap so you don't have to worry,
>    but you still are trying to cheat 8)
>    
> Understood.
> 
> I think even stronger part of Alexey's argument is that all of
> this "if (x->owner)" all over the place takes away some of the
> gains of compiling things statically into the kernel.  Why extra
> branches all over the place?

Agreed.  I have considered removing that, and making THIS_MODULE equal
a dummy module struct for the core kernel.  I think that would be a
win (we've already established that the actual refcount is cheap,
possibly cheaper than the branch in practice).

BTW, we should look at making local_inc() etc. a first-class citizen:
it has uses outside modules.

> As long as I can Control-C rmmod when it waits like this, which would
> be the case, what is the problem?

If you can, and the device is still usable afterwards, it would be
nirvana 8)

[ Refcounting tutorial skipped: I went through the same pain with
  early conntrack code, and learned to refcount everything now 8) ]

> Look, once external view into module (ie. socket operations,
> superblock ops, netdev registry) is removed, all that remains to
> reference object is exactly these objects.  It is the only different
> part about modules vs. non-modules.

This argument applies to all objects.  If you reference count
everything which holds a reference to an object, you can infer the
reference count of the object from the sum of reference counts of its
referees.

In practice, as you pointed out in an earlier mail (I think sockets
were your example), doing this proves to be extremely painful.  And
we're feeling the pain now.

The module functions, and all its data, are objects.  For convenience,
size, and speed, we don't reference count them separately.

OK, so why doesn't, say, struct netdevice grab the module refcount on
registration (as is logical), and drop it on unregistration?  Because
the module holds a reference to the struct device, so now you have a
classic circular reference count problem: the module reference count
will never fall to zero.  That's why we grab a reference just before
use.

You can, of course, do two-stage module cleanup (ie. first stage with
refcounts non-zero), but the user wants clean "remove or fail to
remove" semantics, so half-way through the cleanup you realize it's
still in use, and restore things: now you have the spurious failure
problem where it was half-unregistered for a while, and AFAICT you
have to rewrite 1400 modules's cleanup routines.

I imagined schemes where the kernel would be basically stopped during
module remove, so the half-remove and unremove would appear atomic.
I shied away from implementing such a monster without deadlock, but it
might be possible.  Then we would truly have nirvana 8)

> Trying to unload %s, waiting for all references to go away.
> Perhaps you have programs running which still reference this
> module?  (Hit Ctrl-C to interrupt unload to go and remove
> those references)
> ====================
> 
> nobody would ask what does this mean. :-)

Please, implement such a thing.  I was unable to, without introducing
spurious failure in the components, *and* rewriting every
module_cleanup function 8( Hence rmmod does not wait by default, but
says "module is busy".

> And I want to mention that in some cases you have to "wait".  The best
> example are TCP_TIME_WAIT sockets.  Even after users downs all the
> interfaces, and closes all the sockets, these remnants must remain for
> their full life of 60 seconds.

Yes, certainly.  The two-stage unload provided by rmmod --wait ensures
that the reference count decreases to zero (it makes try_module_get
fail).  This was my desire if unloading security modules or some
netfilter modules was going to be reasonable.

> I really am concerned at both sides, both user observed behavior and
> kernel side correctness.

There are shades of correctness, too.  Not jumping into a module which
has gone away is probably the most important.  Having finite unload
time is also nice.  Not having spurious failures because someone tried
to unload a module is nice too.

Roman Zippel suggested (among other things) that every unregister fail
when busy, and that modules the reinitialize.  This gives the spurious
failure problem, and means rewriting every unregister interface in the
kernel, every module cleanup function, and dealing with the case where
you're cleaning up a failed initialization and your unregister failed.

Adam Richter suggested the module or interface register with a central
repository to say "this modules uses this interface", then you can
atomically freeze all the module interfaces (say they all share a
single lock) and see if it's in use, but you can't really call the
module's cleanup routine, so you have to atomically deactivate all
these registrations before dropping the lock, and that's the same as
try_module_get().

As you know, I love radical change.  But I want be make sure we're
going to end up somewhere we like at the end of it, which is why I
didn't do it.  Anyway, putting the module loader in the kernel was
enough to sate my appetite for change 8)

Thanks!
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-03  3:46             ` dev->destructor David S. Miller
@ 2003-05-05  5:18               ` Rusty Russell
  0 siblings, 0 replies; 28+ messages in thread
From: Rusty Russell @ 2003-05-05  5:18 UTC (permalink / raw)
  To: David S. Miller; +Cc: kuznet, shemminger, netdev, acme

In message <20030502.204628.35664814.davem@redhat.com> you write:
> I think it can work Rusty, in short you create 1 freeze thread
> per cpu.  You wake up all the freeze threads on non-local cpus,
> and they indicate their presence via some bitmask.

This code is already in module.c.  I'm glad you like it though 8).

But we disable local irqs as well: this is what I call a "bogolock"
(the read-side of a bogolock is prempt_disable()/preempt_enable(): you
could temporarily disable preemption and force the scheduler to run
every preempted thread, and remove this).

> This means the local master cpu executes the unload sequence.  It may
> sleep in order to yield to, for example, semaphore holders, it may
> also sleep to yield to kswapd and friends for the sake of memory
> allocation.  I mean... consider all the situations and please try to
> find some hole in this.  We can make all try_to_*() sleep at this
> time too... this in particular needs more thought.

Well, it's a big task.  Holding interrupts disabled for unbounded time
on CPUs needs to be thought about, but I think can be fixed.  try_xxx
can be called from interrupt context: you really want to get rid of
interrupts, too...

During previous discussions, I called this "return to primordial
soup": back to like during init.  Ideally, only userspace context (no
interrupts, timers, bottom halves), and life is easy.

> To make these freeze threads globally useful, we allow them to
> run atomicity commands.  The two defined commands are "local_irq_*()"
> and "local_bh_*()", two bitmasks control this and the freeze threads
> check the bits in their spin loops.

Something like this?

	/* Tell all freeze threads to disable bottom halves. */
	void global_bh_disable(void);
	void global_bh_enable(void);

	/* Tell all freeze threads to disable interrupts halves. */
	void global_irq_disable(void);
	void global_irq_enable(void);

> Do you see?  Maybe... it is nearly Nirvana! :-)))))

Yes, but I worry it might be an illusion 8)

> Our ability to implement this changes the rest of the conversation,
> so let us resolve this first.

Yes, but it's a big IF.  I think it might be easier to make all
unregistrations runnable in interrupt context 8(

Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-03  4:07           ` dev->destructor Rusty Russell
  2003-05-03  3:46             ` dev->destructor David S. Miller
  2003-05-03  4:00             ` dev->destructor David S. Miller
@ 2003-05-05 16:08             ` Stephen Hemminger
  2003-05-06 14:25               ` dev->destructor David S. Miller
  2003-05-05 20:00             ` dev->destructor Stephen Hemminger
  3 siblings, 1 reply; 28+ messages in thread
From: Stephen Hemminger @ 2003-05-05 16:08 UTC (permalink / raw)
  To: Rusty Russell; +Cc: davem, kuznet, netdev, acme

On Sat, 03 May 2003 14:07:41 +1000
Rusty Russell <rusty@rustcorp.com.au> wrote:

> In message <20030502.134804.78707298.davem@redhat.com> you write:
> >    From: Rusty Russell <rusty@rustcorp.com.au>
> >    Date: Fri, 02 May 2003 15:25:15 +1000
> > 
> >    If this is true, I think you can use the module reference count only,
> >    and your code will be faster, too.  I can prepare the patch for you
> >    later tonight, to see how it looks.
> >    
> > And where do we get the counter from when dev->owner is NULL
> > (ie. non-modular)?  We need the reference counting regardless of
> > whether the device is implemented statically in the kernel or modular.
> 
> But Alexey said you can only call unregister_netdev from module
> unload, ie. if not a module, it can't be unloaded, hence no refcount
> needed.  I wrote the above paragraph because I'm not sure if I
> understood Alexey correctly?

There are several flavors of pseudo-network devices like bridging and VLAN that
dynamically create/destroy netdev's even when they are not modules.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-03  4:07           ` dev->destructor Rusty Russell
                               ` (2 preceding siblings ...)
  2003-05-05 16:08             ` dev->destructor Stephen Hemminger
@ 2003-05-05 20:00             ` Stephen Hemminger
  2003-05-06  4:18               ` dev->destructor Rusty Russell
  3 siblings, 1 reply; 28+ messages in thread
From: Stephen Hemminger @ 2003-05-05 20:00 UTC (permalink / raw)
  To: Rusty Russell; +Cc: davem, kuznet, netdev, acme

[-- Attachment #1: Type: text/plain, Size: 2385 bytes --]

As an experiment, tried acquiring module ref count every time network
device is ref counted.  The result is discovering that there are cases
in the Ethernet module init path where there is a call to dev_hold()
without a previous explicit ref count.

kernel BUG at include/linux/module.h:284!
invalid operand: 0000 [#1]
CPU:    0
EIP:    0060:[<c028fd02>]    Not tainted
EFLAGS: 00010246
EIP is at linkwatch_fire_event+0x170/0x1a3
eax: 00000000   ebx: c047fad0   ecx: 00000020   edx: f88c8100
esi: f88c7100   edi: f6fa7000   ebp: f6f15de4   esp: f6f15dc8
ds: 007b   es: 007b   ss: 0068
Process modprobe (pid: 408, threadinfo=f6f14000 task=f78f46a0)
Stack: f88c7100 c03f7008 00000246 f6f14000 f6fa7000 00000000 fffc829b f6f15df8 
       f88c0ab9 f6fa7000 f6fa71e0 033002a8 f6f15e24 f88c00e0 f6fa71e0 00007148 
       c03e2e80 f6fa7320 c011eb46 ffffffef f6f15e3e f6fa71e0 fffc829b f6f15e50 
Call Trace:
 [<f88c7100>] +0x0/0x1180 [e100]
 [<f88c0ab9>] e100_update_link_state+0x97/0xa2 [e100]
 [<f88c00e0>] e100_find_speed_duplex+0x20/0x26a [e100]
 [<c011eb46>] sys_sched_yield+0xc0/0xfe
 [<f88c07de>] e100_auto_neg+0x114/0x11c [e100]
 [<c01fa4f8>] __delay+0x14/0x18
 [<f88c081d>] e100_phy_set_speed_duplex+0x37/0xa4 [e100]
 [<f88c0997>] e100_phy_init+0x69/0x78 [e100]
 [<f88ba1dc>] e100_hw_init+0x14/0x11e [e100]
 [<f88bc432>] e100_rd_pwa_no+0x32/0x40 [e100]
 [<f88ba058>] e100_init+0xf6/0x126 [e100]
 [<f88b9273>] e100_found1+0x1a9/0x42e [e100]
 [<f88c5b25>] e100_driver_version+0x0/0xb [e100]
 [<f88c5e40>] e100_driver+0x0/0xa0 [e100]
 [<c01fe884>] pci_device_probe+0x5a/0x68
 [<f88c5b60>] e100_id_table+0x0/0x2e0 [e100]
 [<f88c5e68>] e100_driver+0x28/0xa0 [e100]
 [<c0241853>] bus_match+0x43/0x6e
 [<f88c5e68>] e100_driver+0x28/0xa0 [e100]
 [<f88c5e68>] e100_driver+0x28/0xa0 [e100]
 [<c0241956>] driver_attach+0x5c/0x60
 [<f88c5e68>] e100_driver+0x28/0xa0 [e100]
 [<f88c5e68>] e100_driver+0x28/0xa0 [e100]
 [<c0241c2a>] bus_add_driver+0xb2/0xc8
 [<f88c5e68>] e100_driver+0x28/0xa0 [e100]
 [<f88c7100>] +0x0/0x1180 [e100]
 [<c01fe99a>] pci_register_driver+0x46/0x56
 [<f88c5e68>] e100_driver+0x28/0xa0 [e100]
 [<f880c015>] +0x15/0x3e [e100]
 [<f88c5e40>] e100_driver+0x0/0xa0 [e100]
 [<c013b034>] sys_init_module+0x1b0/0x292
all_call+0x7/0xb

Code: 0f 0b 1c 01 97 5a 32 c0 e9 d9 fe ff ff c7 04 24 0c 00 00 00 
 ./ifup: line 91:   408 Segmentation fault      modprobe $1 >/dev/null 2>&1


[-- Attachment #2: netdev-module.diff --]
[-- Type: application/octet-stream, Size: 3843 bytes --]

diff -urNp -X dontdiff linux-2.5/include/asm-i386/asm_offsets.h linux-2.5-dev/include/asm-i386/asm_offsets.h
--- linux-2.5/include/asm-i386/asm_offsets.h	1969-12-31 16:00:00.000000000 -0800
+++ linux-2.5-dev/include/asm-i386/asm_offsets.h	2003-05-05 10:11:50.000000000 -0700
@@ -0,0 +1,22 @@
+#ifndef __ASM_OFFSETS_H__
+#define __ASM_OFFSETS_H__
+/*
+ * DO NOT MODIFY.
+ *
+ * This file was generated by arch/i386/Makefile
+ *
+ */
+
+#define SIGCONTEXT_eax 44 /* offsetof (struct sigcontext, eax) */
+#define SIGCONTEXT_ebx 32 /* offsetof (struct sigcontext, ebx) */
+#define SIGCONTEXT_ecx 40 /* offsetof (struct sigcontext, ecx) */
+#define SIGCONTEXT_edx 36 /* offsetof (struct sigcontext, edx) */
+#define SIGCONTEXT_esi 20 /* offsetof (struct sigcontext, esi) */
+#define SIGCONTEXT_edi 16 /* offsetof (struct sigcontext, edi) */
+#define SIGCONTEXT_ebp 24 /* offsetof (struct sigcontext, ebp) */
+#define SIGCONTEXT_esp 28 /* offsetof (struct sigcontext, esp) */
+#define SIGCONTEXT_eip 56 /* offsetof (struct sigcontext, eip) */
+
+#define RT_SIGFRAME_sigcontext 164 /* offsetof (struct rt_sigframe, uc.uc_mcontext) */
+
+#endif
diff -urNp -X dontdiff linux-2.5/include/linux/netdevice.h linux-2.5-dev/include/linux/netdevice.h
--- linux-2.5/include/linux/netdevice.h	2003-04-14 13:32:21.000000000 -0700
+++ linux-2.5-dev/include/linux/netdevice.h	2003-05-05 10:06:19.000000000 -0700
@@ -29,6 +29,7 @@
 #include <linux/if_ether.h>
 #include <linux/if_packet.h>
 #include <linux/kobject.h>
+#include <linux/module.h>
 
 #include <asm/atomic.h>
 #include <asm/cache.h>
@@ -629,12 +630,32 @@ extern int netdev_finish_unregister(stru
 
 static inline void dev_put(struct net_device *dev)
 {
+	module_put(dev->owner);
 	if (atomic_dec_and_test(&dev->refcnt))
 		netdev_finish_unregister(dev);
 }
 
-#define __dev_put(dev) atomic_dec(&(dev)->refcnt)
-#define dev_hold(dev) atomic_inc(&(dev)->refcnt)
+static inline void __dev_put(struct net_device *dev)
+{
+	module_put(dev->owner);
+	atomic_dec(&dev->refcnt);
+}
+
+static inline void dev_hold(struct net_device *dev)
+{
+	__module_get(dev->owner);
+	atomic_inc(&dev->refcnt);
+}
+
+static inline int dev_try_hold(struct net_device *dev)
+{
+	int ret = 0;
+	if (try_module_get(dev->owner)){
+		atomic_inc(&dev->refcnt);
+		ret = 1;
+	}
+	return ret;
+}
 
 /* Carrier loss detection, dial on demand. The functions netif_carrier_on
  * and _off may be called from IRQ context, but it is caller
diff -urNp -X dontdiff linux-2.5/net/core/dev.c linux-2.5-dev/net/core/dev.c
--- linux-2.5/net/core/dev.c	2003-05-05 09:41:03.000000000 -0700
+++ linux-2.5-dev/net/core/dev.c	2003-05-05 10:06:20.000000000 -0700
@@ -1,3 +1,4 @@
+#define NET_REFCNT_DEBUG	1
 /*
  * 	NET3	Protocol independent device support routines.
  *
@@ -440,8 +441,8 @@ struct net_device *dev_get_by_name(const
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_name(name);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !dev_try_hold(dev))
+		dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -513,8 +514,8 @@ struct net_device *dev_get_by_index(int 
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_index(ifindex);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !dev_try_hold(dev))
+		dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -563,8 +564,8 @@ struct net_device * dev_get_by_flags(uns
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_flags(if_flags, mask);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !dev_try_hold(dev))
+	    dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -2609,10 +2610,11 @@ int register_netdevice(struct net_device
 	set_bit(__LINK_STATE_PRESENT, &dev->state);
 
 	dev->next = NULL;
+	atomic_inc(&dev->refcnt);
 	dev_init_scheduler(dev);
 	write_lock_bh(&dev_base_lock);
 	*dp = dev;
-	dev_hold(dev);
+
 	dev->deadbeaf = 0;
 	write_unlock_bh(&dev_base_lock);
 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-05 20:00             ` dev->destructor Stephen Hemminger
@ 2003-05-06  4:18               ` Rusty Russell
  2003-05-06 14:23                 ` dev->destructor David S. Miller
  2003-05-06 22:35                 ` dev->destructor Stephen Hemminger
  0 siblings, 2 replies; 28+ messages in thread
From: Rusty Russell @ 2003-05-06  4:18 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: davem, kuznet, netdev, acme

In message <20030505130050.4b9868bb.shemminger@osdl.org> you write:
> As an experiment, tried acquiring module ref count every time network
> device is ref counted.

Brave man 8)

> The result is discovering that there are cases in the Ethernet
> module init path where there is a call to dev_hold() without a
> previous explicit ref count.

Well caught: this is in fact a false alarm.  Coming, as we do, out of
module_init(), we actually hold an implicit reference.

It's logically consistent to make it implicit, and cuts out some code
in the unload path.

How's this?
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Name: Hold Initial Reference To Module
Author: Rusty Russell
Status: Tested on 2.5.69

D: __module_get is theoretically allowed on module inside init, since
D: we already hold an implicit reference.  Currently this BUG()s: make
D: the reference count explicit, which also simplifies delete path.

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal linux-2.5.69/kernel/module.c working-2.5.69-module-init-ref/kernel/module.c
--- linux-2.5.69/kernel/module.c	2003-05-05 12:37:13.000000000 +1000
+++ working-2.5.69-module-init-ref/kernel/module.c	2003-05-06 12:11:18.000000000 +1000
@@ -214,6 +214,8 @@ static void module_unload_init(struct mo
 	INIT_LIST_HEAD(&mod->modules_which_use_me);
 	for (i = 0; i < NR_CPUS; i++)
 		atomic_set(&mod->ref[i].count, 0);
+	/* Hold reference count during initialization. */
+	atomic_set(&mod->ref[smp_processor_id()].count, 1);
 	/* Backwards compatibility macros put refcount during init. */
 	mod->waiter = current;
 }
@@ -500,16 +502,6 @@ sys_delete_module(const char __user *nam
 		goto out;
 	}
 
-	/* Coming up?  Allow force on stuck modules. */
-	if (mod->state == MODULE_STATE_COMING) {
-		forced = try_force(flags);
-		if (!forced) {
-			/* This module can't be removed */
-			ret = -EBUSY;
-			goto out;
-		}
-	}
-
 	/* If it has an init func, it must have an exit func to unload */
 	if ((mod->init != init_module && mod->exit == cleanup_module)
 	    || mod->unsafe) {
@@ -1448,6 +1440,7 @@ sys_init_module(void __user *umod,
 			printk(KERN_ERR "%s: module is now stuck!\n",
 			       mod->name);
 		else {
+			module_put(mod);
 			down(&module_mutex);
 			free_module(mod);
 			up(&module_mutex);
@@ -1463,6 +1456,9 @@ sys_init_module(void __user *umod,
 	mod->init_size = 0;
 	up(&module_mutex);
 
+	/* Drop initial reference. */
+	module_put(mod);
+
 	return 0;
 }
 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-06  4:18               ` dev->destructor Rusty Russell
@ 2003-05-06 14:23                 ` David S. Miller
  2003-05-06 22:32                   ` [PATCH 2.5.69] IPV4 should use dev_hold Stephen Hemminger
  2003-05-07  2:50                   ` dev->destructor Rusty Russell
  2003-05-06 22:35                 ` dev->destructor Stephen Hemminger
  1 sibling, 2 replies; 28+ messages in thread
From: David S. Miller @ 2003-05-06 14:23 UTC (permalink / raw)
  To: rusty; +Cc: shemminger, kuznet, netdev, acme

   From: Rusty Russell <rusty@rustcorp.com.au>
   Date: Tue, 06 May 2003 14:18:36 +1000

   It's logically consistent to make it implicit, and cuts out some
   code in the unload path.

   How's this?

This looks fine to me.

How hard would it be to make this completely consistent in that
no module code is ever invoked with modcount == 0?  By this I mean
keeping the implicit reference after modload succeeds, and then
calling ->cleanup() is valid once the count drops to '1'.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-05 16:08             ` dev->destructor Stephen Hemminger
@ 2003-05-06 14:25               ` David S. Miller
  2003-05-07  2:54                 ` dev->destructor Rusty Russell
  0 siblings, 1 reply; 28+ messages in thread
From: David S. Miller @ 2003-05-06 14:25 UTC (permalink / raw)
  To: shemminger; +Cc: rusty, kuznet, netdev, acme

   From: Stephen Hemminger <shemminger@osdl.org>
   Date: Mon, 5 May 2003 09:08:20 -0700

   On Sat, 03 May 2003 14:07:41 +1000
   Rusty Russell <rusty@rustcorp.com.au> wrote:
   
   > But Alexey said you can only call unregister_netdev from module
   > unload, ie. if not a module, it can't be unloaded, hence no refcount
   > needed.  I wrote the above paragraph because I'm not sure if I
   > understood Alexey correctly?

   There are several flavors of pseudo-network devices like bridging
   and VLAN that dynamically create/destroy netdev's even when they
   are not modules.

I think you'll understand what Alexey/Rusty are saying better
if you consider statically compiled kernel code as a module with
an implicit non-zero reference count :-)

   

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 2.5.69] IPV4 should use dev_hold
  2003-05-06 14:23                 ` dev->destructor David S. Miller
@ 2003-05-06 22:32                   ` Stephen Hemminger
  2003-05-07  7:32                     ` David S. Miller
  2003-05-07  2:50                   ` dev->destructor Rusty Russell
  1 sibling, 1 reply; 28+ messages in thread
From: Stephen Hemminger @ 2003-05-06 22:32 UTC (permalink / raw)
  To: David S. Miller; +Cc: rusty, kuznet, netdev, acme

David,

While debugging possible changes to dev refcounting, discovered
a couple of places in IPV4 that were directly incrementing rather
than using the inline dev_hold.  Let's hide the implementation of
net device ref counting so  future module ref count fixes will be easier.

diff -urN -X dontdiff linux-2.5/net/ipv4/fib_frontend.c linux-2.5-dev/net/ipv4/fib_frontend.c
--- linux-2.5/net/ipv4/fib_frontend.c	2003-04-14 13:32:26.000000000 -0700
+++ linux-2.5-dev/net/ipv4/fib_frontend.c	2003-05-06 15:16:03.000000000 -0700
@@ -115,9 +115,9 @@
 	if (res.type != RTN_LOCAL)
 		goto out;
 	dev = FIB_RES_DEV(res);
-	if (dev)
-		atomic_inc(&dev->refcnt);
 
+	if (dev)
+		dev_hold(dev);
 out:
 	fib_res_put(&res);
 	return dev;
diff -urN -X dontdiff linux-2.5/net/ipv4/fib_semantics.c linux-2.5-dev/net/ipv4/fib_semantics.c
--- linux-2.5/net/ipv4/fib_semantics.c	2003-04-29 09:57:41.000000000 -0700
+++ linux-2.5-dev/net/ipv4/fib_semantics.c	2003-05-06 15:10:40.000000000 -0700
@@ -406,7 +406,7 @@
 			if (!(dev->flags&IFF_UP))
 				return -ENETDOWN;
 			nh->nh_dev = dev;
-			atomic_inc(&dev->refcnt);
+			dev_hold(dev);
 			nh->nh_scope = RT_SCOPE_LINK;
 			return 0;
 		}
@@ -429,7 +429,7 @@
 		nh->nh_oif = FIB_RES_OIF(res);
 		if ((nh->nh_dev = FIB_RES_DEV(res)) == NULL)
 			goto out;
-		atomic_inc(&nh->nh_dev->refcnt);
+		dev_hold(nh->nh_dev);
 		err = -ENETDOWN;
 		if (!(nh->nh_dev->flags & IFF_UP))
 			goto out;
@@ -451,7 +451,7 @@
 			return -ENETDOWN;
 		}
 		nh->nh_dev = in_dev->dev;
-		atomic_inc(&nh->nh_dev->refcnt);
+		dev_hold(nh->nh_dev);
 		nh->nh_scope = RT_SCOPE_HOST;
 		in_dev_put(in_dev);
 	}

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-06  4:18               ` dev->destructor Rusty Russell
  2003-05-06 14:23                 ` dev->destructor David S. Miller
@ 2003-05-06 22:35                 ` Stephen Hemminger
  2003-05-06 23:51                   ` [RFC] Experiment with dev and module ref counts Stephen Hemminger
  1 sibling, 1 reply; 28+ messages in thread
From: Stephen Hemminger @ 2003-05-06 22:35 UTC (permalink / raw)
  To: Rusty Russell; +Cc: davem, kuznet, netdev, acme

On Tue, 06 May 2003 14:18:36 +1000
Rusty Russell <rusty@rustcorp.com.au> wrote:

> In message <20030505130050.4b9868bb.shemminger@osdl.org> you write:
> > As an experiment, tried acquiring module ref count every time network
> > device is ref counted.
> 
> Brave man 8)
> 
> > The result is discovering that there are cases in the Ethernet
> > module init path where there is a call to dev_hold() without a
> > previous explicit ref count.
> 
> Well caught: this is in fact a false alarm.  Coming, as we do, out of
> module_init(), we actually hold an implicit reference.
> 
> It's logically consistent to make it implicit, and cuts out some code
> in the unload path.
> 
> How's this?
> Rusty.

Thanks, with that change and the following patches, the system does
boot and correctly ref counts the modules.  Still have problems
on unregister and shutdown, but it is a start.

diff -urN -X dontdiff linux-2.5/include/linux/netdevice.h linux-2.5-dev/include/linux/netdevice.h
--- linux-2.5/include/linux/netdevice.h	2003-04-14 13:32:21.000000000 -0700
+++ linux-2.5-dev/include/linux/netdevice.h	2003-05-06 15:11:25.000000000 -0700
@@ -29,6 +29,7 @@
 #include <linux/if_ether.h>
 #include <linux/if_packet.h>
 #include <linux/kobject.h>
+#include <linux/module.h>
 
 #include <asm/atomic.h>
 #include <asm/cache.h>
@@ -629,12 +630,32 @@
 
 static inline void dev_put(struct net_device *dev)
 {
+	module_put(dev->owner);
 	if (atomic_dec_and_test(&dev->refcnt))
 		netdev_finish_unregister(dev);
 }
 
-#define __dev_put(dev) atomic_dec(&(dev)->refcnt)
-#define dev_hold(dev) atomic_inc(&(dev)->refcnt)
+static inline void __dev_put(struct net_device *dev)
+{
+	module_put(dev->owner);
+	atomic_dec(&dev->refcnt);
+}
+
+static inline void dev_hold(struct net_device *dev)
+{
+	__module_get(dev->owner);
+	atomic_inc(&dev->refcnt);
+}
+
+static inline int dev_try_hold(struct net_device *dev)
+{
+	int ret = 0;
+	if (try_module_get(dev->owner)){
+		atomic_inc(&dev->refcnt);
+		ret = 1;
+	}
+	return ret;
+}
 
 /* Carrier loss detection, dial on demand. The functions netif_carrier_on
  * and _off may be called from IRQ context, but it is caller
diff -urN -X dontdiff linux-2.5/net/core/dev.c linux-2.5-dev/net/core/dev.c
--- linux-2.5/net/core/dev.c	2003-05-05 09:41:03.000000000 -0700
+++ linux-2.5-dev/net/core/dev.c	2003-05-06 15:12:24.000000000 -0700
@@ -1,3 +1,4 @@
+#define NET_REFCNT_DEBUG	1
 /*
  * 	NET3	Protocol independent device support routines.
  *
@@ -440,8 +441,8 @@
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_name(name);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !dev_try_hold(dev))
+		dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -513,8 +514,8 @@
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_index(ifindex);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !dev_try_hold(dev))
+		dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -563,8 +564,8 @@
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_flags(if_flags, mask);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !dev_try_hold(dev))
+	    dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -1312,7 +1313,9 @@
 				goto drop;
 
 enqueue:
-			dev_hold(skb->dev);
+			if (!dev_try_hold(skb->dev)) 
+				goto drop;
+
 			__skb_queue_tail(&queue->input_pkt_queue, skb);
 #ifndef OFFLINE_SAMPLE
 			get_sample_stats(this_cpu);
@@ -1990,9 +1993,8 @@
 	ASSERT_RTNL();
 
 	if (master) {
-		if (old)
+		if (old || !dev_try_hold(master))
 			return -EBUSY;
-		dev_hold(master);
 	}
 
 	br_write_lock_bh(BR_NETPROTO_LOCK);
@@ -2609,10 +2611,11 @@
 	set_bit(__LINK_STATE_PRESENT, &dev->state);
 
 	dev->next = NULL;
+	atomic_inc(&dev->refcnt);
 	dev_init_scheduler(dev);
 	write_lock_bh(&dev_base_lock);
 	*dp = dev;
-	dev_hold(dev);
+
 	dev->deadbeaf = 0;
 	write_unlock_bh(&dev_base_lock);
 
@@ -2900,7 +2903,11 @@
 #endif
 		dev->xmit_lock_owner = -1;
 		dev->iflink = -1;
-		dev_hold(dev);
+
+		if (!dev_try_hold(dev)) {
+			dev->deadbeaf = 1;
+			dp = &dev->next;
+		}
 
 		/*
 		 * Allocate name. If the init() fails
diff -urN -X dontdiff linux-2.5/net/ipv4/devinet.c linux-2.5-dev/net/ipv4/devinet.c
--- linux-2.5/net/ipv4/devinet.c	2003-04-14 13:32:26.000000000 -0700
+++ linux-2.5-dev/net/ipv4/devinet.c	2003-05-06 15:16:03.000000000 -0700
@@ -559,6 +559,9 @@
 	if ((dev = __dev_get_by_name(ifr.ifr_name)) == NULL)
 		goto done;
 
+	if (!dev_try_hold(dev))
+		goto done;
+
 	if (colon)
 		*colon = ':';
 
@@ -591,7 +594,7 @@
 
 	ret = -EADDRNOTAVAIL;
 	if (!ifa && cmd != SIOCSIFADDR && cmd != SIOCSIFFLAGS)
-		goto done;
+		goto put;
 
 	switch(cmd) {
 	case SIOCGIFADDR:	/* Get interface address */
@@ -700,12 +703,15 @@
 		}
 		break;
 	}
+put:
+	dev_put(dev);
 done:
 	rtnl_unlock();
 	dev_probe_unlock();
 out:
 	return ret;
 rarok:
+	dev_put(dev);
 	rtnl_unlock();
 	dev_probe_unlock();
 	ret = copy_to_user(arg, &ifr, sizeof(struct ifreq)) ? -EFAULT : 0;

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC] Experiment with dev and module ref counts
  2003-05-06 22:35                 ` dev->destructor Stephen Hemminger
@ 2003-05-06 23:51                   ` Stephen Hemminger
  0 siblings, 0 replies; 28+ messages in thread
From: Stephen Hemminger @ 2003-05-06 23:51 UTC (permalink / raw)
  To: rusty, davem, kuznet, acme; +Cc: netdev

This patch ties network device refcounts and module ref counts 
together.  It works for IPV4 and the dev->destructor problem is fixed,
at least for the Ethernet bridge device. Unregister and shutdown
work correctly, and modules can be removed when expected.

BUT: lots of dev_hold usage would need more auditing.
No idea what the performance impact is yet.

diff -urNp -X dontdiff linux-2.5/include/linux/netdevice.h linux-2.5-dev/include/linux/netdevice.h
--- linux-2.5/include/linux/netdevice.h	2003-04-14 13:32:21.000000000 -0700
+++ linux-2.5-dev/include/linux/netdevice.h	2003-05-06 15:11:25.000000000 -0700
@@ -29,6 +29,7 @@
 #include <linux/if_ether.h>
 #include <linux/if_packet.h>
 #include <linux/kobject.h>
+#include <linux/module.h>
 
 #include <asm/atomic.h>
 #include <asm/cache.h>
@@ -629,12 +630,32 @@ extern int netdev_finish_unregister(stru
 
 static inline void dev_put(struct net_device *dev)
 {
+	module_put(dev->owner);
 	if (atomic_dec_and_test(&dev->refcnt))
 		netdev_finish_unregister(dev);
 }
 
-#define __dev_put(dev) atomic_dec(&(dev)->refcnt)
-#define dev_hold(dev) atomic_inc(&(dev)->refcnt)
+static inline void __dev_put(struct net_device *dev)
+{
+	module_put(dev->owner);
+	atomic_dec(&dev->refcnt);
+}
+
+static inline void dev_hold(struct net_device *dev)
+{
+	__module_get(dev->owner);
+	atomic_inc(&dev->refcnt);
+}
+
+static inline int dev_try_hold(struct net_device *dev)
+{
+	int ret = 0;
+	if (try_module_get(dev->owner)){
+		atomic_inc(&dev->refcnt);
+		ret = 1;
+	}
+	return ret;
+}
 
 /* Carrier loss detection, dial on demand. The functions netif_carrier_on
  * and _off may be called from IRQ context, but it is caller
diff -urNp -X dontdiff linux-2.5/kernel/module.c linux-2.5-dev/kernel/module.c
--- linux-2.5/kernel/module.c	2003-04-30 11:19:23.000000000 -0700
+++ linux-2.5-dev/kernel/module.c	2003-05-06 13:13:20.000000000 -0700
@@ -214,6 +214,8 @@ static void module_unload_init(struct mo
 	INIT_LIST_HEAD(&mod->modules_which_use_me);
 	for (i = 0; i < NR_CPUS; i++)
 		atomic_set(&mod->ref[i].count, 0);
+	/* Hold reference count during initialization. */
+	atomic_set(&mod->ref[smp_processor_id()].count, 1);
 	/* Backwards compatibility macros put refcount during init. */
 	mod->waiter = current;
 }
@@ -500,16 +502,6 @@ sys_delete_module(const char __user *nam
 		goto out;
 	}
 
-	/* Coming up?  Allow force on stuck modules. */
-	if (mod->state == MODULE_STATE_COMING) {
-		forced = try_force(flags);
-		if (!forced) {
-			/* This module can't be removed */
-			ret = -EBUSY;
-			goto out;
-		}
-	}
-
 	/* If it has an init func, it must have an exit func to unload */
 	if ((mod->init != init_module && mod->exit == cleanup_module)
 	    || mod->unsafe) {
@@ -1448,6 +1440,7 @@ sys_init_module(void __user *umod,
 			printk(KERN_ERR "%s: module is now stuck!\n",
 			       mod->name);
 		else {
+			module_put(mod);
 			down(&module_mutex);
 			free_module(mod);
 			up(&module_mutex);
@@ -1463,6 +1456,9 @@ sys_init_module(void __user *umod,
 	mod->init_size = 0;
 	up(&module_mutex);
 
+	/* Drop initial reference. */
+	module_put(mod);
+
 	return 0;
 }
 
diff -urNp -X dontdiff linux-2.5/net/core/dev.c linux-2.5-dev/net/core/dev.c
--- linux-2.5/net/core/dev.c	2003-05-05 09:41:03.000000000 -0700
+++ linux-2.5-dev/net/core/dev.c	2003-05-06 16:07:15.000000000 -0700
@@ -1,3 +1,4 @@
+#define NET_REFCNT_DEBUG	1
 /*
  * 	NET3	Protocol independent device support routines.
  *
@@ -440,8 +441,8 @@ struct net_device *dev_get_by_name(const
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_name(name);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !dev_try_hold(dev))
+		dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -513,8 +514,8 @@ struct net_device *dev_get_by_index(int 
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_index(ifindex);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !dev_try_hold(dev))
+		dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -563,8 +564,8 @@ struct net_device * dev_get_by_flags(uns
 
 	read_lock(&dev_base_lock);
 	dev = __dev_get_by_flags(if_flags, mask);
-	if (dev)
-		dev_hold(dev);
+	if (dev && !dev_try_hold(dev))
+	    dev = NULL;
 	read_unlock(&dev_base_lock);
 	return dev;
 }
@@ -1312,7 +1313,9 @@ int netif_rx(struct sk_buff *skb)
 				goto drop;
 
 enqueue:
-			dev_hold(skb->dev);
+			if (!dev_try_hold(skb->dev)) 
+				goto drop;
+
 			__skb_queue_tail(&queue->input_pkt_queue, skb);
 #ifndef OFFLINE_SAMPLE
 			get_sample_stats(this_cpu);
@@ -1990,9 +1993,8 @@ int netdev_set_master(struct net_device 
 	ASSERT_RTNL();
 
 	if (master) {
-		if (old)
+		if (old || !dev_try_hold(master))
 			return -EBUSY;
-		dev_hold(master);
 	}
 
 	br_write_lock_bh(BR_NETPROTO_LOCK);
@@ -2609,10 +2611,11 @@ int register_netdevice(struct net_device
 	set_bit(__LINK_STATE_PRESENT, &dev->state);
 
 	dev->next = NULL;
+	atomic_inc(&dev->refcnt);
 	dev_init_scheduler(dev);
 	write_lock_bh(&dev_base_lock);
 	*dp = dev;
-	dev_hold(dev);
+
 	dev->deadbeaf = 0;
 	write_unlock_bh(&dev_base_lock);
 
@@ -2809,7 +2812,9 @@ int unregister_netdevice(struct net_devi
 	}
 out:
 	kobject_unregister(&dev->kobj);
-	dev_put(dev);
+
+	atomic_dec(&dev->refcnt);
+
 	return 0;
 }
 
@@ -2900,7 +2905,11 @@ static int __init net_dev_init(void)
 #endif
 		dev->xmit_lock_owner = -1;
 		dev->iflink = -1;
-		dev_hold(dev);
+
+		if (!dev_try_hold(dev)) {
+			dev->deadbeaf = 1;
+			dp = &dev->next;
+		}
 
 		/*
 		 * Allocate name. If the init() fails
diff -urNp -X dontdiff linux-2.5/net/ipv4/devinet.c linux-2.5-dev/net/ipv4/devinet.c
--- linux-2.5/net/ipv4/devinet.c	2003-04-14 13:32:26.000000000 -0700
+++ linux-2.5-dev/net/ipv4/devinet.c	2003-05-06 15:16:03.000000000 -0700
@@ -559,6 +559,9 @@ int devinet_ioctl(unsigned int cmd, void
 	if ((dev = __dev_get_by_name(ifr.ifr_name)) == NULL)
 		goto done;
 
+	if (!dev_try_hold(dev))
+		goto done;
+
 	if (colon)
 		*colon = ':';
 
@@ -591,7 +594,7 @@ int devinet_ioctl(unsigned int cmd, void
 
 	ret = -EADDRNOTAVAIL;
 	if (!ifa && cmd != SIOCSIFADDR && cmd != SIOCSIFFLAGS)
-		goto done;
+		goto put;
 
 	switch(cmd) {
 	case SIOCGIFADDR:	/* Get interface address */
@@ -700,12 +703,15 @@ int devinet_ioctl(unsigned int cmd, void
 		}
 		break;
 	}
+put:
+	dev_put(dev);
 done:
 	rtnl_unlock();
 	dev_probe_unlock();
 out:
 	return ret;
 rarok:
+	dev_put(dev);
 	rtnl_unlock();
 	dev_probe_unlock();
 	ret = copy_to_user(arg, &ifr, sizeof(struct ifreq)) ? -EFAULT : 0;
diff -urNp -X dontdiff linux-2.5/net/ipv4/fib_frontend.c linux-2.5-dev/net/ipv4/fib_frontend.c
--- linux-2.5/net/ipv4/fib_frontend.c	2003-04-14 13:32:26.000000000 -0700
+++ linux-2.5-dev/net/ipv4/fib_frontend.c	2003-05-06 15:16:03.000000000 -0700
@@ -115,9 +115,9 @@ struct net_device * ip_dev_find(u32 addr
 	if (res.type != RTN_LOCAL)
 		goto out;
 	dev = FIB_RES_DEV(res);
-	if (dev)
-		atomic_inc(&dev->refcnt);
 
+	if (dev)
+		dev_hold(dev);
 out:
 	fib_res_put(&res);
 	return dev;
diff -urNp -X dontdiff linux-2.5/net/ipv4/fib_semantics.c linux-2.5-dev/net/ipv4/fib_semantics.c
--- linux-2.5/net/ipv4/fib_semantics.c	2003-04-29 09:57:41.000000000 -0700
+++ linux-2.5-dev/net/ipv4/fib_semantics.c	2003-05-06 15:10:40.000000000 -0700
@@ -406,7 +406,7 @@ static int fib_check_nh(const struct rtm
 			if (!(dev->flags&IFF_UP))
 				return -ENETDOWN;
 			nh->nh_dev = dev;
-			atomic_inc(&dev->refcnt);
+			dev_hold(dev);
 			nh->nh_scope = RT_SCOPE_LINK;
 			return 0;
 		}
@@ -429,7 +429,7 @@ static int fib_check_nh(const struct rtm
 		nh->nh_oif = FIB_RES_OIF(res);
 		if ((nh->nh_dev = FIB_RES_DEV(res)) == NULL)
 			goto out;
-		atomic_inc(&nh->nh_dev->refcnt);
+		dev_hold(nh->nh_dev);
 		err = -ENETDOWN;
 		if (!(nh->nh_dev->flags & IFF_UP))
 			goto out;
@@ -451,7 +451,7 @@ out:
 			return -ENETDOWN;
 		}
 		nh->nh_dev = in_dev->dev;
-		atomic_inc(&nh->nh_dev->refcnt);
+		dev_hold(nh->nh_dev);
 		nh->nh_scope = RT_SCOPE_HOST;
 		in_dev_put(in_dev);
 	}

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-06 14:23                 ` dev->destructor David S. Miller
  2003-05-06 22:32                   ` [PATCH 2.5.69] IPV4 should use dev_hold Stephen Hemminger
@ 2003-05-07  2:50                   ` Rusty Russell
  2003-05-07  3:58                     ` dev->destructor David S. Miller
  1 sibling, 1 reply; 28+ messages in thread
From: Rusty Russell @ 2003-05-07  2:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: shemminger, kuznet, netdev, acme

In message <20030506.072338.39479306.davem@redhat.com> you write:
>    From: Rusty Russell <rusty@rustcorp.com.au>
>    Date: Tue, 06 May 2003 14:18:36 +1000
>    
>    It's logically consistent to make it implicit, and cuts out some
>    code in the unload path.
>    
>    How's this?
> 
> This looks fine to me.
> 
> How hard would it be to make this completely consistent in that
> no module code is ever invoked with modcount == 0?  By this I mean
> keeping the implicit reference after modload succeeds, and then
> calling ->cleanup() is valid once the count drops to '1'.

I had to think hard about this 8).

There's nothing *wrong* with leaving the refcount at 1: it's just a
matter of adjusting various checks for 0 to 1 (including the BUG()
Stephen hit), and subbing 1 in /proc/modules...

But that's a sideshow: you'd *still* want an extra refcount around the
call to module_init().  Because it's the module state being
MODULE_STATE_COMING which stops the module from being unloaded,
ie. holds the extra reference count.  Once init is finished, module
state goes to MODULE_STATE_LIVE, and the reference must be dropped.

Now, grabbing a similar reference when deleting a module might make
sense, but it's actually kinda pointless when you look at the code.

Hope that clarifies,
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-06 14:25               ` dev->destructor David S. Miller
@ 2003-05-07  2:54                 ` Rusty Russell
  0 siblings, 0 replies; 28+ messages in thread
From: Rusty Russell @ 2003-05-07  2:54 UTC (permalink / raw)
  To: David S. Miller; +Cc: shemminger, rusty, kuznet, netdev, acme

In message <20030506.072529.52888036.davem@redhat.com> you write:
>    From: Stephen Hemminger <shemminger@osdl.org>
>    Date: Mon, 5 May 2003 09:08:20 -0700
> 
>    On Sat, 03 May 2003 14:07:41 +1000
>    Rusty Russell <rusty@rustcorp.com.au> wrote:
>    
>    > But Alexey said you can only call unregister_netdev from module
>    > unload, ie. if not a module, it can't be unloaded, hence no refcount
>    > needed.  I wrote the above paragraph because I'm not sure if I
>    > understood Alexey correctly?
> 
>    There are several flavors of pseudo-network devices like bridging
>    and VLAN that dynamically create/destroy netdev's even when they
>    are not modules.
> 
> I think you'll understand what Alexey/Rusty are saying better
> if you consider statically compiled kernel code as a module with
> an implicit non-zero reference count :-)

Yes, but his point is valid.  We *do* want to destroy netdev's at
random times, not just from module cleanup code.  Hotplug, for
example.

So me saying "just rely on the owner refcnt" was wrong.

Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: dev->destructor
  2003-05-07  2:50                   ` dev->destructor Rusty Russell
@ 2003-05-07  3:58                     ` David S. Miller
  0 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2003-05-07  3:58 UTC (permalink / raw)
  To: rusty; +Cc: shemminger, kuznet, netdev, acme

   From: Rusty Russell <rusty@rustcorp.com.au>
   Date: Wed, 07 May 2003 12:50:07 +1000

   But that's a sideshow: you'd *still* want an extra refcount around the
   call to module_init().
 ...   
   Now, grabbing a similar reference when deleting a module might make
   sense, but it's actually kinda pointless when you look at the code.

Indeed, forget my idea ;-)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2.5.69] IPV4 should use dev_hold
  2003-05-06 22:32                   ` [PATCH 2.5.69] IPV4 should use dev_hold Stephen Hemminger
@ 2003-05-07  7:32                     ` David S. Miller
  0 siblings, 0 replies; 28+ messages in thread
From: David S. Miller @ 2003-05-07  7:32 UTC (permalink / raw)
  To: shemminger; +Cc: rusty, kuznet, netdev, acme

   From: Stephen Hemminger <shemminger@osdl.org>
   Date: Tue, 6 May 2003 15:32:34 -0700

   While debugging possible changes to dev refcounting, discovered
   a couple of places in IPV4 that were directly incrementing rather
   than using the inline dev_hold.  Let's hide the implementation of
   net device ref counting so  future module ref count fixes will be easier.

Applied, thanks.  If you find any more, just send along
another patch :-)

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2003-05-07  7:32 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-04-30  6:26 dev->destructor David S. Miller
2003-04-30 16:33 ` dev->destructor Stephen Hemminger
2003-05-01  1:10 ` dev->destructor kuznet
2003-05-01  7:00   ` dev->destructor David S. Miller
2003-05-01 12:01     ` dev->destructor Rusty Russell
2003-05-01 11:09       ` dev->destructor David S. Miller
2003-05-01 17:51         ` dev->destructor Arnaldo Carvalho de Melo
2003-05-01 16:55           ` dev->destructor David S. Miller
2003-05-01 17:28       ` dev->destructor Arnaldo Carvalho de Melo
2003-05-02  4:06     ` dev->destructor kuznet
2003-05-02  5:25       ` dev->destructor Rusty Russell
2003-05-02 20:48         ` dev->destructor David S. Miller
2003-05-03  4:07           ` dev->destructor Rusty Russell
2003-05-03  3:46             ` dev->destructor David S. Miller
2003-05-05  5:18               ` dev->destructor Rusty Russell
2003-05-03  4:00             ` dev->destructor David S. Miller
2003-05-05 16:08             ` dev->destructor Stephen Hemminger
2003-05-06 14:25               ` dev->destructor David S. Miller
2003-05-07  2:54                 ` dev->destructor Rusty Russell
2003-05-05 20:00             ` dev->destructor Stephen Hemminger
2003-05-06  4:18               ` dev->destructor Rusty Russell
2003-05-06 14:23                 ` dev->destructor David S. Miller
2003-05-06 22:32                   ` [PATCH 2.5.69] IPV4 should use dev_hold Stephen Hemminger
2003-05-07  7:32                     ` David S. Miller
2003-05-07  2:50                   ` dev->destructor Rusty Russell
2003-05-07  3:58                     ` dev->destructor David S. Miller
2003-05-06 22:35                 ` dev->destructor Stephen Hemminger
2003-05-06 23:51                   ` [RFC] Experiment with dev and module ref counts Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).