Re: Followup to netpoll issues

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: Followup to netpoll issues
       [not found]   ` <20050107011547.GE27896@electric-eye.fr.zoreil.com>
@ 2005-01-07 17:01     ` Matt Mackall
  2005-01-07 21:42       ` Francois Romieu
  0 siblings, 1 reply; 7+ messages in thread
From: Matt Mackall @ 2005-01-07 17:01 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Mark Broadbent, netdev

On Fri, Jan 07, 2005 at 02:15:47AM +0100, Francois Romieu wrote:
> Matt Mackall <mpm@selenic.com> :
> [...]
> > I've tinkered with this idea as well, but I'm not convinced it's the
> > right idea. All current netpoll users (netconsole, kgdb, netdump) are
> > intrinsically concerned with timeliness of delivery. Queueing packets
>                           ^^^^^^^^^^^^^^^
> (as well as with reliability of the delivery maybe)

> > for later delivery simply doesn't work with the latter two and might
> > mislead with the former (eg netconsole message and other network
> > traffic could be reordered).
> 
> If kernel/printk.c::vprintk can not take the console semaphore, I
> would say that messages are reordered with regard to the order in
> which xmit_lock has been taken even without Mark's patch.
> 
> So what do you exactly mean by "reordered" ?

printk("debug: at point A");
do_something_that_sends_a_packet_B();
printk("debug: at point C");

Depending on locking and queueing, we might see on our network dump B,
A, C, and wrongly conclude that whatever did B was not between A and C.
That's a bad way for printk to work.

> > Also note that there's a closely related class of deadlock that we
> > can't detect: netconsole-induced recursion on driver-private locks. We
> > haven't actually seen one of these yet and haven't attempted to audit
> > for them, but my plan all along has been to treat it as a bug and
> > hoist offending printks out of the locked regions. Note that this is
> > really a netconsole-specific problem as the other netpoll clients are
> > unlikely to have such usage patterns.
> 
> Is is still a problem if Mark turns the spinlock in tx_retry_wq() into
> an irq safe trylock ?
> 
> (driver-private bugs could/will be inherited but fixing these is not
> netconsole's job).

The bugs I'm talking about are identical to the xmit_lock deadlock
except with locks we can't see outside the driver. In other words,
this patch addresses the easy part of larger problem by adding a bunch
of complexity that doesn't help in the larger problem. To me, that's a
hint that it's the wrong fix.

> > Since the xmit_lock is so similar, it seems things that hit it ought
> > to get the same treatment. So the rule becomes: no printk with network
> > driver locks held (public or private). This is obviously broken in the
> > face of oopsing in the driver, but netconsole can't be expected to
> > work under such conditions anyway.
> 
> I am not convinced that people will be satisfied with a rule which
> states that printk _from anywhere_ are lost as soon as a CPU enters
> in the xmit_lock zone but, hey, it's just me.

It should only be dropped on the CPU holding the lock, with a loud
warning to follow shortly.

> Objection against Ccing further messages to netdev ? It is better indexed
> than our mailboxes.

Nope.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Followup to netpoll issues
       [not found] ` <20050107002053.GD27896@electric-eye.fr.zoreil.com>
@ 2005-01-07 20:14   ` Mark Broadbent
  2005-01-07 23:18     ` Francois Romieu
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Broadbent @ 2005-01-07 20:14 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Matt Mackall, Network Development (Linux)

[-- Attachment #1: Type: text/plain, Size: 6913 bytes --]

[cc'ing netdev]

On Fri, 2005-01-07 at 01:20 +0100, Francois Romieu wrote:
> Mark Broadbent <markb@wetlettuce.com> :
> [...]
> > I had a think about the netpoll deadlock following the initial detective
> > work by you and Francois did.  I came up with a sample fix given below
> > that defers the packet transmission within netpoll until the xmit_lock
> > can be grabbed.  An advantage of this is that is maintains the packet
> > ordering.  I have attached the patch.
> 
> Nice.
> 
> Comments below. Off to bed now.
> 
> > It is possible for netpoll to deadlock by attempting to take the network
> > device xmit_lock twice on different processors.  This patch resolves this
> > by deferring to transmission of the second and subsequent skbs.
> > 
> > Signed-Off-By: Mark Broadbent <markb@wetlettuce.com>
> > 
> > --- linux-2.6.10-org/net/core/netpoll.c	2005-01-05 22:30:53.000000000 +0000
> > +++ linux-2.6.10/net/core/netpoll.c	2005-01-06 21:00:24.000000000 +0000
> [...]
> > @@ -178,17 +201,14 @@ repeat:
> >  	return skb;
> >  }
> >  
> > -void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
> > +/* This must be called with np->dev->xmit_lock spinlock held - this function 
> > + * function will drop the lock before returning indicating if it needs to be 
> > + * retried
> > + */
> > +static int transmit_skb(struct netpoll *np, struct sk_buff *skb)
> 
> It may seem like cosmetic but I do not like functions which unlock a lock
> they have not acquired themselves (and I am probably not alone). Would you
> consider pushing the trylock() in transmit_skb() itself ?
> 
> All the callers of this functions have to issue the trylock anyway so it
> should not matter from a performance POV.

Good point, I was having doubts about that.

> [...]
> > +	
> > +	spin_lock(&tx_list_lock);
>         ^^^^^^^^^
> kernel/workqueue.c::run_workqueue() could reenable interrupts before
> it calls tx_retry_wq(). An irq safe version of the underlined lock
> seems required to avoid deadlocking with printks potentially issued
> from interrupt context.

Fixed.

> > +	
> > +	list_for_each_safe(tx_ctr, tx_tmp, &tx_queue.link) {
> > +		tx_el = list_entry(tx_ctr, struct tx_queue_list, link);
> > +		
> > +		while (tx_el->np && tx_el->np->dev && 
> > +				netif_running(tx_el->np->dev)) {
> 
> netif_running() without xmit_lock held ? Uh oh...

This is done in the original code (and still is in other parts of the code).

> Imho you should try to keep transmit_skb() closer to the former
> netpoll_send_skb() (modulo the try_lock change of course).
> Btw I do not see what will prevent cleanup_netconsole() to succeed
> when tx_retry_wq() is scheduled for later execution. Add some module
> refcounting ?

No need, if netpoll_cleanup is called whilst a retry is pending the tx
list lock is taken.  All the references to the netpoll pointer are
deleted from the pending list before the lock is released.

> > +void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
> > +{
> > +repeat:
> > +	if(!np || !np->dev || !netif_running(np->dev))
> > +		goto out_free;
> 
> I am not sure that the first two tests above really made sense in
> the former version of the code:
> -> kernel/printk.c::vprintk() (downs console_sem)
>    -> kernel/printk.c::release_console_sem 
>       -> kernel/printk.c::call_console_drivers
>          [blah blah]
>          -> drivers/net/netconsole.c::netconsole.write_msg
>             -> drivers/net/netconsole.c::netpoll_send_udp
>                -> drivers/net/netconsole.c::netpoll_send_skb
> 
> -> drivers/net/netconsole.c::cleanup_netconsole
>    -> kernel/printk.c::unregister_console
>       -> kernel/printk.c::acquire_console_sem
>          -> try to down(&console_sem)
> 
> dev_put() has not been issued, np->dev is still among us and != NULL.
> So far, so good.
> 
> However netpoll_send_skb can be called through:
> netpoll_rx -> arp_reply -> netpoll_send_skb
> Here I fail to see why netpoll_cleanup could not happen at the same time
> and issue np->dev = NULL;

Could this solved by having a per netpoll structure rw_lock to protect
against changing elements in the structure?

e.g. 
struct netpoll {
	...
	rwlock_t netpoll_lock;
};

write_lock_irqsave(&np->netpoll_lock, flags);
np->dev = NULL;
write_unlock_irqsave(&np->netpoll_lock, flags);

> > +
> > +	/* If the list is non-empty then pkts are pending tx,
> > +	 * append to the end of list to stop reordering of the pkts
> > +	 */
> > +	if (!list_empty(&tx_queue.link) || !spin_trylock(&np->dev->xmit_lock)) {
> > +		struct tx_queue_list *txel;
> > +		
> > +		/* If we failed to get the xmit_lock then netpoll probably 
> > +		 * already holds this lock but will deadlock here if we relock,
> 
> Ok ("probably" ?).

:) Fixed

> > +		 * queue the skb for transmission later
> > +		 */
> > +		spin_lock(&tx_list_lock);
> 
> No spin_lock_irqsave ? Hmmm...
> The only use of netpoll_send_skb() goes through netpoll_send_udp() which is
> issued in drivers/net/netconsole.c::write_msg() with local_irq disabled.
> You are right but an extra comment would not hurt imho.

Is it safe to issue spin_lock_irqsave/spin_unlock_restore in this
context?  Reason: What if it wasn't netconsole calling netpoll_send_udp
and not with local interrupts disabled?

> > +		
> > +		txel = kmalloc(sizeof(struct tx_queue_list), GFP_ATOMIC);
> > +		if (!txel)
> > +			goto out_free;
> > +		
> > +		txel->skb = skb_get(skb);
> > +		txel->np  = np;
> > +		
> > +		list_add_tail(&txel->link, &tx_queue.link);
> > +		
> > +		queue_delayed_work(tx_wq, &tx_wq_obj, HZ/100);
> > +		
> > +		spin_unlock(&tx_list_lock);
> > +	} else if (transmit_skb(np, skb))
> > +		goto repeat;
> 
> As I read it, the previous code could loop as long as netif_queue_stopped()
> is true and it can acquire the lock. Any reason to not defer the skb for
> transmission in tx_retry_wq() ?

Nope, no good reason.  If transmit_skb fails it'll now defer.

> 
> > +	
> > +	return;
> > +	
> > +out_free:
> > +	__kfree_skb(skb);
> >  }
> >  
> >  void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
> > @@ -636,6 +729,15 @@ int netpoll_setup(struct netpoll *np)
> >  		spin_unlock_irqrestore(&rx_list_lock, flags);
> >  	}
> >  
> > +	/* Workqueue to schedule the tx retries */
> > +	if (!tx_wq)
> > +		tx_wq = create_workqueue("netpoll");
> > +	
> > +	if (!tx_wq) {
> > +		printk(KERN_ERR "Failed to create netpoll workqueue\n");
> > +		goto release;
> > +	}
> > +	
> 
> This queue is created but never removed. May be it could/should go in a
> separate function and be anchored to net_dev_init(). The network maintainers
> will tell if they care or not.
> 
> Pure curiosity: why a specific queue instead of schedule_delayed_work() ?

Pure ignorance I'm afraid ;).  That'll more what I was looking for
originally so I've changed it.

I've attached a revised version.

Thanks for you comments (and Matts)

Mark

-- 
Mark Broadbent <markb@wetlettuce.com>

[-- Attachment #2: netpoll-fix-tx-deadlock-2.patch --]
[-- Type: text/x-patch, Size: 4765 bytes --]

It is possible for netpoll to deadlock by attempting to take the network
device xmit_lock twice on different processors.  This patch resolves this
by deferring to transmission of the second and subsequent skbs.

Signed-Off-By: Mark Broadbent <markb@wetlettuce.com>

--- linux-2.6.10-org/net/core/netpoll.c	2005-01-05 22:30:53.000000000 +0000
+++ linux-2.6.10/net/core/netpoll.c	2005-01-07 20:10:20.000000000 +0000
@@ -1,6 +1,9 @@
 /*
  * Common framework for low-level network console, dump, and debugger code
  *
+ * Jan 5 2005 Mark Broadbent <markb@wetlettuce.com>
+ * avoid deadlock on transmission
+ *
  * Sep 8 2003  Matt Mackall <mpm@selenic.com>
  *
  * based on the netconsole code from:
@@ -19,6 +22,8 @@
 #include <linux/netpoll.h>
 #include <linux/sched.h>
 #include <linux/rcupdate.h>
+#include <linux/workqueue.h>
+#include <linux/list.h>
 #include <net/tcp.h>
 #include <net/udp.h>
 #include <asm/unaligned.h>
@@ -31,6 +36,22 @@
 #define MAX_SKBS 32
 #define MAX_UDP_CHUNK 1460
 
+static void tx_retry_wq(void *p);
+
+struct tx_queue_list {
+	struct list_head link;
+	struct sk_buff *skb;
+	struct netpoll *np;
+};
+
+static spinlock_t tx_list_lock = SPIN_LOCK_UNLOCKED;
+static struct tx_queue_list tx_queue = {
+	.link = LIST_HEAD_INIT(tx_queue.link)
+};
+
+static DECLARE_WORK(tx_wq_obj, tx_retry_wq, NULL);
+
+
 static spinlock_t skb_list_lock = SPIN_LOCK_UNLOCKED;
 static int nr_skbs;
 static struct sk_buff *skbs;
@@ -178,17 +199,21 @@ repeat:
 	return skb;
 }
 
-void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
+/* This must be called with np->dev->xmit_lock spinlock held - this function 
+ * function will drop the lock before returning indicating if it needs to be 
+ * retried
+ */
+static int transmit_skb(struct netpoll *np, struct sk_buff *skb)
 {
 	int status;
 
-repeat:
-	if(!np || !np->dev || !netif_running(np->dev)) {
-		__kfree_skb(skb);
-		return;
-	}
-
-	spin_lock(&np->dev->xmit_lock);
+	if (np && np->dev && netif_running(np->dev))
+		 return -1;
+	
+	/* Attempt the xmit lock as we may already hold it */
+	if (!spin_trylock(&np->dev->xmit_lock))
+		return -1;
+	
 	np->dev->xmit_lock_owner = smp_processor_id();
 
 	/*
@@ -200,7 +225,7 @@ repeat:
 		spin_unlock(&np->dev->xmit_lock);
 
 		netpoll_poll(np);
-		goto repeat;
+		return -1;
 	}
 
 	status = np->dev->hard_start_xmit(skb, np->dev);
@@ -210,8 +235,73 @@ repeat:
 	/* transmit busy */
 	if(status) {
 		netpoll_poll(np);
-		goto repeat;
+		return -1;
 	}
+	
+	return 0;
+}
+
+static void tx_retry_wq(void *p)
+{
+	struct list_head *tx_ctr, *tx_tmp;
+	struct tx_queue_list *tx_el;
+	unsigned long flags;
+	
+	spin_lock_irqsave(&tx_list_lock, flags);
+	
+	list_for_each_safe(tx_ctr, tx_tmp, &tx_queue.link) {
+		tx_el = list_entry(tx_ctr, struct tx_queue_list, link);
+		
+		if (transmit_skb(tx_el->np, tx_el->skb)) {
+			// Transmit failed - reschedule
+			schedule_delayed_work(&tx_wq_obj, HZ/100);
+			goto out_unlock;
+		}
+
+		list_del(tx_ctr);
+		__kfree_skb(tx_el->skb);
+		kfree(tx_el);
+	}
+	
+out_unlock:
+	spin_unlock_irqrestore(&tx_list_lock, flags);
+}
+
+void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
+{
+	/* If the list is non-empty (packets pending tx) or the skb transmit
+	 * failed then pkts are pending tx, append to the end of list to stop
+	 * reordering of the pkts
+	 */
+	if (!list_empty(&tx_queue.link) || transmit_skb(np, skb)) {
+		struct tx_queue_list *txel;
+		
+		/* If we failed to get the xmit_lock then netpoll already holds
+		 * this lock but will deadlock here if we relock, queue the skb
+		 * for transmission later.
+		 * This function is always called with local irqs disabled, hence
+		 * the lack of spin_lock_irqsave
+		 */
+		spin_lock(&tx_list_lock);
+		
+		txel = kmalloc(sizeof(struct tx_queue_list), GFP_ATOMIC);
+		if (!txel)
+			goto out_free;
+		
+		txel->skb = skb_get(skb);
+		txel->np  = np;
+		
+		list_add_tail(&txel->link, &tx_queue.link);
+		
+		schedule_delayed_work(&tx_wq_obj, HZ/100);
+		
+		spin_unlock(&tx_list_lock);
+	}
+	
+	return;
+	
+out_free:
+	__kfree_skb(skb);
 }
 
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
@@ -644,6 +734,23 @@ int netpoll_setup(struct netpoll *np)
 
 void netpoll_cleanup(struct netpoll *np)
 {
+	struct list_head *tx_ctr, *tx_tmp;
+	struct tx_queue_list *tx_el;
+	unsigned long flags;
+	
+	spin_lock_irqsave(&tx_list_lock, flags);
+	list_for_each_safe(tx_ctr, tx_tmp, &tx_queue.link) {
+		tx_el = list_entry(tx_ctr, struct tx_queue_list, link);
+		
+		// This netpoll is going away - remove just these elements
+		if (tx_el->np == np) {
+			list_del(tx_ctr);
+			__kfree_skb(tx_el->skb);
+			kfree(tx_el);
+		}
+	}
+	spin_unlock_irqrestore(&tx_list_lock, flags);
+	
 	if (np->rx_hook) {
 		unsigned long flags;
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Followup to netpoll issues
  2005-01-07 17:01     ` Followup to netpoll issues Matt Mackall
@ 2005-01-07 21:42       ` Francois Romieu
  2005-01-07 22:07         ` Matt Mackall
  0 siblings, 1 reply; 7+ messages in thread
From: Francois Romieu @ 2005-01-07 21:42 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Mark Broadbent, netdev

Matt Mackall <mpm@selenic.com> :
[...]
> printk("debug: at point A");
> do_something_that_sends_a_packet_B();
> printk("debug: at point C");
> 
> Depending on locking and queueing, we might see on our network dump B,
> A, C, and wrongly conclude that whatever did B was not between A and C.
> That's a bad way for printk to work.

I completely agree that it is not perfect.

However netconsole is currently unable to guarantee that you will
always see both A and B (currently = assuming the skb is dropped
when netconsole fails trylock as suggested by Jamal). So there is
an issue with the reliability of the delivery as well.

I won't push harder on the queuing side as I believe that it will
be possible to add it as an extra choice to the user whatever form
netconsole takes to stop deadlocking.

[...]
> The bugs I'm talking about are identical to the xmit_lock deadlock
> except with locks we can't see outside the driver. In other words,

Right, it's clearer now. Thanks for the reminder.

User space takes device's private lock -> printks -> netconsole.write
-> hard_start_xmit -> device's private lock -> splat. Same thing from
interrupt context (in_irq() can probably help though).

So we ought to check rtnl_sem as well (dev_base_lock anyone ?).

/me scratches neck...

> this patch addresses the easy part of larger problem by adding a bunch
> of complexity that doesn't help in the larger problem. To me, that's a
> hint that it's the wrong fix.

Too big. It won't bite. :o)

[...]
> > I am not convinced that people will be satisfied with a rule which
> > states that printk _from anywhere_ are lost as soon as a CPU enters
> > in the xmit_lock zone but, hey, it's just me.
> 
> It should only be dropped on the CPU holding the lock, with a loud
> warning to follow shortly.

Sorry if I was not clear: "from anywhere" meant printk issued from
any part of the kernel which can interrupt the xmit_locked section
of a qdisc_run(), i.e. printk from irq handlers. 

If I read correctly the suggested design, the remaining CPUs should
loop in netpoll_send_skb() when they notice that they can not take
the lock and that their CPU do not own it, right ?

--
Ueimor

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Followup to netpoll issues
  2005-01-07 21:42       ` Francois Romieu
@ 2005-01-07 22:07         ` Matt Mackall
  2005-01-07 22:41           ` Francois Romieu
  0 siblings, 1 reply; 7+ messages in thread
From: Matt Mackall @ 2005-01-07 22:07 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Mark Broadbent, netdev

On Fri, Jan 07, 2005 at 10:42:54PM +0100, Francois Romieu wrote:
> Matt Mackall <mpm@selenic.com> :
> [...]
> > printk("debug: at point A");
> > do_something_that_sends_a_packet_B();
> > printk("debug: at point C");
> > 
> > Depending on locking and queueing, we might see on our network dump B,
> > A, C, and wrongly conclude that whatever did B was not between A and C.
> > That's a bad way for printk to work.
> 
> I completely agree that it is not perfect.
> 
> However netconsole is currently unable to guarantee that you will
> always see both A and B (currently = assuming the skb is dropped
> when netconsole fails trylock as suggested by Jamal). So there is
> an issue with the reliability of the delivery as well.

Indeed, and we can't really do much about it generally as we're on a
lossy media. But I think that's less misleading than potential
reordering. I acknowledge that packets can get reordered on the wire
potentially as well but for typical LAN segments, this will be uncommon.

> I won't push harder on the queuing side as I believe that it will
> be possible to add it as an extra choice to the user whatever form
> netconsole takes to stop deadlocking.

I might be willing to compromise on a queue depth of one. Much simpler
code, much less effort to paper over bugs that should be fixed in a
different way.

But then we've got to have a way to disable it for kgdb-over-ethernet
or netdump, where the delayed work simply won't get processed because
the box is stopped. Admittedly trying to trace your way into the
network driver is asking for a beating here, but the general issue is
that netpoll clients can stop interrupt processing and netpoll should
tell such clients that the packets its trying to send aren't going
anywhere rather than quietly scheduling them for later delivery, see?
 
> [...]
> > The bugs I'm talking about are identical to the xmit_lock deadlock
> > except with locks we can't see outside the driver. In other words,
> 
> Right, it's clearer now. Thanks for the reminder.
> 
> User space takes device's private lock -> printks -> netconsole.write
> -> hard_start_xmit -> device's private lock -> splat. Same thing from
> interrupt context (in_irq() can probably help though).
> 
> So we ought to check rtnl_sem as well (dev_base_lock anyone ?).
> 
> /me scratches neck...
> 
> > this patch addresses the easy part of larger problem by adding a bunch
> > of complexity that doesn't help in the larger problem. To me, that's a
> > hint that it's the wrong fix.
> 
> Too big. It won't bite. :o)
> 
> [...]
> > > I am not convinced that people will be satisfied with a rule which
> > > states that printk _from anywhere_ are lost as soon as a CPU enters
> > > in the xmit_lock zone but, hey, it's just me.
> > 
> > It should only be dropped on the CPU holding the lock, with a loud
> > warning to follow shortly.
> 
> Sorry if I was not clear: "from anywhere" meant printk issued from
> any part of the kernel which can interrupt the xmit_locked section
> of a qdisc_run(), i.e. printk from irq handlers. 

Hrmm. Yes, that's uglier than I realized.

Perhaps there's a way to use the existing IRQ handler reentrancy
avoidance to detect that we might be about to recurse on a lock.

> If I read correctly the suggested design, the remaining CPUs should
> loop in netpoll_send_skb() when they notice that they can not take
> the lock and that their CPU do not own it, right ?

Yep.

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Followup to netpoll issues
  2005-01-07 22:07         ` Matt Mackall
@ 2005-01-07 22:41           ` Francois Romieu
  0 siblings, 0 replies; 7+ messages in thread
From: Francois Romieu @ 2005-01-07 22:41 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Mark Broadbent, netdev

Matt Mackall <mpm@selenic.com> :
[...]
> > Sorry if I was not clear: "from anywhere" meant printk issued from
> > any part of the kernel which can interrupt the xmit_locked section
> > of a qdisc_run(), i.e. printk from irq handlers. 
> 
> Hrmm. Yes, that's uglier than I realized.
> 
> Perhaps there's a way to use the existing IRQ handler reentrancy
> avoidance to detect that we might be about to recurse on a lock.

It would help if qdisc_restart() disabled irq until xmit_lock_owner
is set (an extra memory barrier would be required btw). 

I'd say to go with in_irq() + try_lock() but one could object that
try_lock() leaves a window where a late thread could steal the
lock.

--
Ueimor

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Followup to netpoll issues
  2005-01-07 20:14   ` Mark Broadbent
@ 2005-01-07 23:18     ` Francois Romieu
  2005-01-10 22:03       ` Mark Broadbent
  0 siblings, 1 reply; 7+ messages in thread
From: Francois Romieu @ 2005-01-07 23:18 UTC (permalink / raw)
  To: Mark Broadbent; +Cc: Matt Mackall, netdev

Mark Broadbent <markb@wetlettuce.com> :
[...]
> No need, if netpoll_cleanup is called whilst a retry is pending the tx
> list lock is taken.  All the references to the netpoll pointer are
> deleted from the pending list before the lock is released.

I may be a bit dense but:

t0             : tx_retry_wq().queue_delayed_work(tx_wq, &tx_wq_obj, HZ/100);

t0 + 10*HZ/1000: tx_retry_wq() is done (it was not fast for sure)

t0 + 20*HZ/1000: netpoll_cleanup() + module removal

t0 + HZ/100    : tx_retry_wq() <- Where are its code and data ?

[...]
> > However netpoll_send_skb can be called through:
> > netpoll_rx -> arp_reply -> netpoll_send_skb
> > Here I fail to see why netpoll_cleanup could not happen at the same time
> > and issue np->dev = NULL;
> 
> Could this solved by having a per netpoll structure rw_lock to protect
> against changing elements in the structure?
> 
> e.g. 
> struct netpoll {
> 	...
> 	rwlock_t netpoll_lock;
> };
> 
> write_lock_irqsave(&np->netpoll_lock, flags);
> np->dev = NULL;
> write_unlock_irqsave(&np->netpoll_lock, flags);

I am not sure that a fine coarsed lock is necessary since there
will mostly be readers and netpoll_cleanup should be rare.

[...]
> Is it safe to issue spin_lock_irqsave/spin_unlock_restore in this
> context?  Reason: What if it wasn't netconsole calling netpoll_send_udp
> and not with local interrupts disabled?

The irq state is saved and restored.

Did I get the question right ?

[...]
> I've attached a revised version.
> 
> Thanks for you comments (and Matts)

The malloc() in netpoll_send_skb() is heavy-weight and it is not that
uncommon to fail allocation in network context: preallocate when the
np is first set up. It will make netpoll_send_skb() lighter.

Issue schedule_delayed_work() in netpoll_send_skb() only if when the
task is not already scheduled ? It is an information you can get/set
while tx_list_lock is taken.

I'll read the patch (or any updated version) more closely once I have
recovered my sleep quota. It should not be too hard to isolate the
queueing specific stuff from the real fixes.

--
Ueimor

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Followup to netpoll issues
  2005-01-07 23:18     ` Francois Romieu
@ 2005-01-10 22:03       ` Mark Broadbent
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Broadbent @ 2005-01-10 22:03 UTC (permalink / raw)
  To: romieu; +Cc: mpm, netdev


Francois Romieu said:
> Mark Broadbent <markb@wetlettuce.com> :
> [...]
>> No need, if netpoll_cleanup is called whilst a retry is pending the tx
>> list lock is taken.  All the references to the netpoll pointer are
>> deleted from the pending list before the lock is released.
>
> I may be a bit dense but:
>
> t0             : tx_retry_wq().queue_delayed_work(tx_wq, &tx_wq_obj,
> HZ/100);
>
> t0 + 10*HZ/1000: tx_retry_wq() is done (it was not fast for sure)
>
> t0 + 20*HZ/1000: netpoll_cleanup() + module removal
>
> t0 + HZ/100    : tx_retry_wq() <- Where are its code and data ?

You logic is correct but I'm positive that netpoll.o is always built into
the kernel image and hence cannot be unloaded.
> [...]

Thanks
Mark



-- 
Mark Broadbent <markb@wetlettuce.com>
Web: http://www.wetlettuce.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2005-01-10 22:03 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <1105045914.7687.3.camel@tigger>
     [not found] ` <20050106234610.GT2940@waste.org>
     [not found]   ` <20050107011547.GE27896@electric-eye.fr.zoreil.com>
2005-01-07 17:01     ` Followup to netpoll issues Matt Mackall
2005-01-07 21:42       ` Francois Romieu
2005-01-07 22:07         ` Matt Mackall
2005-01-07 22:41           ` Francois Romieu
     [not found] ` <20050107002053.GD27896@electric-eye.fr.zoreil.com>
2005-01-07 20:14   ` Mark Broadbent
2005-01-07 23:18     ` Francois Romieu
2005-01-10 22:03       ` Mark Broadbent

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).