All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jarek Poplawski <jarkao2@o2.pl>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Folkert van Heusden <folkert@vanheusden.com>,
	linux-kernel@vger.kernel.org,
	Jason Wessel <jason.wessel@windriver.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	stable@kernel.org, netdev@vger.kernel.org
Subject: [PATCH] Re: [2.6.21.1] soft lockup when removing netconsole module
Date: Wed, 13 Jun 2007 11:25:37 +0200	[thread overview]
Message-ID: <20070613092537.GA2432@ff.dom.local> (raw)
In-Reply-To: <20070612110233.GA3281@ff.dom.local>

On Tue, Jun 12, 2007 at 01:02:33PM +0200, Jarek Poplawski wrote:
...
> Of course such a problem should preferably be fixed by somebody who
> knows the code (alas I don't know netconsole), to be sure all needed
> cancels are still done after this change. I hope Jason's patch is
> right but I'm a little surprised I can't see netdev in cc (I'll try
> to fix this).

So, I've had a look into netpoll and, unfortunately, I don't
think this patch is right... 

> > From: Jason Wessel <jason.wessel@windriver.com>
> > 
> > Do not call cancel_rearming_delayed_work() if there is no
> > pending work.
> > 
> > Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > ---
> > 
> >  net/core/netpoll.c |    6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff -puN net/core/netpoll.c~a net/core/netpoll.c
> > --- a/net/core/netpoll.c~a
> > +++ a/net/core/netpoll.c
> > @@ -784,8 +784,10 @@ void netpoll_cleanup(struct netpoll *np)
> >  			if (atomic_dec_and_test(&npinfo->refcnt)) {
> >  				skb_queue_purge(&npinfo->arp_tx);
> >  				skb_queue_purge(&npinfo->txq);
> > -				cancel_rearming_delayed_work(&npinfo->tx_work);
> > -				flush_scheduled_work();
> > +				if (delayed_work_pending(&npinfo->tx_work)) {
> > +					cancel_rearming_delayed_work(&npinfo->tx_work);
> > +					flush_scheduled_work();
> > +				}
> >  
> >  				kfree(npinfo);
> >  			}
> > _

There are such possibilities:

1. After positive delayed_work_pending(&npinfo->tx_work) test
some work is queued, but there is no guarantee that when running
it'll rearm again, so cancel_rearming_delayed_work can loop again;

2. After negative delayed_work_pending(&npinfo->tx_work) test
a work is just running, eg. waiting on netif_tx_lock, while
kfree(npinfo) is done here (oops?!).

I've found an additional problem here with or without this patch:
after deleting a timer in cancel_rearming_delayed_work() there could
stay a last skb queued in npinfo->txq, and after kfree(npinfo)
we have small memory leak. If I'm right here similar fix is needed
in the current netpoll code: additional npinfo->txq purging only
or maybe the whole cancel_rearming_ changed like this.

I've tried to eliminate these problems in attached below patch
proposal. I'm not sure it's all right: as I've written earlier I
don't know netconsole enough, but it's probably a little better
than above solution.

I've some doubts yet (I didn't have time to check this all):

1. I hope this other schedule_delayed_work() from netpoll_send_skb()
is not possible when netpoll_cleanup() runs - if I'm wrong additional
check of npinfo->refcnt should be done there;
2. I also hope npinfo->refcnt before scheduling should be enough here
- if not - another possibility is adding some locking eg.:
netif_tx_lock before cancel for synchronization.

Of course it would be very nice if somebody could test or verify
this patch more.

Regards,
Jarek P.


Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>

---

diff -Nurp 2.6.21-/net/core/netpoll.c 2.6.21/net/core/netpoll.c
--- 2.6.21-/net/core/netpoll.c	2007-04-26 15:08:32.000000000 +0200
+++ 2.6.21/net/core/netpoll.c	2007-06-12 21:05:23.000000000 +0200
@@ -73,7 +73,8 @@ static void queue_process(struct work_st
 			netif_tx_unlock(dev);
 			local_irq_restore(flags);
 
-			schedule_delayed_work(&npinfo->tx_work, HZ/10);
+			if (atomic_read(&npinfo->refcnt))
+				schedule_delayed_work(&npinfo->tx_work, HZ/10);
 			return;
 		}
 		netif_tx_unlock(dev);
@@ -780,9 +781,15 @@ void netpoll_cleanup(struct netpoll *np)
 			if (atomic_dec_and_test(&npinfo->refcnt)) {
 				skb_queue_purge(&npinfo->arp_tx);
 				skb_queue_purge(&npinfo->txq);
-				cancel_rearming_delayed_work(&npinfo->tx_work);
+				cancel_delayed_work(&npinfo->tx_work);
 				flush_scheduled_work();
 
+				/* clean after last, unfinished work */
+				if (!skb_queue_empty(&npinfo->txq)) {
+					struct sk_buff *skb;
+					skb = __skb_dequeue(&npinfo->txq);
+					kfree_skb(skb);
+				}
 				kfree(npinfo);
 			}
 		}

  reply	other threads:[~2007-06-13  9:18 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-26 15:40 [2.6.21.1] soft lockup when removing netconsole module Folkert van Heusden
2007-05-26 15:53 ` Parag Warudkar
2007-05-26 16:12   ` Thomas Gleixner
2007-05-26 16:17     ` Folkert van Heusden
2007-05-26 16:35       ` Thomas Gleixner
2007-05-26 16:49         ` Folkert van Heusden
2007-05-26 17:06           ` Thomas Gleixner
2007-05-26 17:12             ` Folkert van Heusden
2007-05-27 20:32         ` Matt Mackall
2007-05-29  7:56 ` Andrew Morton
2007-05-30 13:28   ` [PATCH] " Jason Wessel
2007-05-30 20:38     ` Folkert van Heusden
2007-06-12 11:02   ` Jarek Poplawski
2007-06-13  9:25     ` Jarek Poplawski [this message]
2007-06-26 23:07       ` [PATCH] " Andrew Morton
2007-06-27  0:46         ` Wessel, Jason
2007-06-27  1:00           ` Andrew Morton
2007-06-27  7:24             ` Jarek Poplawski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070613092537.GA2432@ff.dom.local \
    --to=jarkao2@o2.pl \
    --cc=akpm@linux-foundation.org \
    --cc=folkert@vanheusden.com \
    --cc=jason.wessel@windriver.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=stable@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.