netdev.vger.kernel.org archive mirror
From: Jarek Poplawski <jarkao2@o2.pl>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"David S. Miller" <davem@davemloft.net>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [NETPOLL] netconsole: fix soft lockup when removing module
Date: Mon, 2 Jul 2007 08:34:24 +0200
Message-ID: <20070702063424.GA1639@ff.dom.local>
In-Reply-To: <20070701173558.GA207@tv-sign.ru>

On Sun, Jul 01, 2007 at 09:35:58PM +0400, Oleg Nesterov wrote:
> Jarek Poplawski wrote:
> >
> >    #1
> >    Until kernel ver. 2.6.21 (including) cancel_rearming_delayed_work()
> >    required a work function should always (unconditionally) rearm with
> >    delay > 0 - otherwise it would endlessly loop. This patch replaces
> >    this function with cancel_delayed_work(). Later kernel versions don't
> >    require this, so here it's only for uniformity.
> 
> But 2.6.22 doesn't need this change, why it was merged?

One bad reason is given above. Should I look for another one?

> 
> In fact, I suspect this change adds a race,

You are right!

> 
> > --- a/net/core/netpoll.c
> > +++ b/net/core/netpoll.c
> > @@ -72,7 +72,8 @@ static void queue_process(struct work_struct *work)
> >  			netif_tx_unlock(dev);
> >  			local_irq_restore(flags);
> >  
> > -			schedule_delayed_work(&npinfo->tx_work, HZ/10);
> > +			if (atomic_read(&npinfo->refcnt))
> > +				schedule_delayed_work(&npinfo->tx_work, HZ/10);
> >  			return;
> >  		}
> >  		netif_tx_unlock(dev);
> > @@ -785,9 +786,15 @@ void netpoll_cleanup(struct netpoll *np)
> >  			if (atomic_dec_and_test(&npinfo->refcnt)) {
> >  				skb_queue_purge(&npinfo->arp_tx);
> >  				skb_queue_purge(&npinfo->txq);
> > -				cancel_rearming_delayed_work(&npinfo->tx_work);
> > +				cancel_delayed_work(&npinfo->tx_work);
> >  				flush_scheduled_work();
> 
> Suppose that ->refcnt == 1, and queue_process() was preempted just after
> atomic_read(&npinfo->refcnt).
> 
> netpoll_cleanup() comes, cancel_delayed_work() does nothing, flush_scheduled_work()
> sleeps.
> 
> queue_process() gets CPU, re-schedules ->tx_work, and returns.
> 
> flush_scheduled_work() completes, netpoll_cleanup() frees npinfo and returns
> while ->tx_work is pending.
> 
> No?

No no. (Yes!)

I had some doubts about this, and you have found a very good reason
for them.

I'll soon send a patch to restore cancel_rearming_delayed_work()
in 2.6.22.
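
To make the interleaving above concrete, here is a small userspace model of
it. This is only a sketch, not kernel code: struct sim_npinfo,
sim_queue_process_resume() and sim_cleanup_race() are hypothetical names, the
"timer" is a plain flag, and the rearm-safe cancel is modeled simply as
"cancel once more after the handler has finished", which is an assumption
about what such a cancel has to achieve, not a copy of the real
implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct netpoll_info; every name here is hypothetical. */
struct sim_npinfo {
	int  refcnt;         /* models the atomic npinfo->refcnt */
	bool timer_pending;  /* models a pending delayed-work timer for tx_work */
	bool freed;          /* set once cleanup has "freed" the structure */
};

/* Second half of queue_process(): runs after the refcnt check was made. */
static void sim_queue_process_resume(struct sim_npinfo *n, bool saw_refcnt)
{
	if (saw_refcnt)
		n->timer_pending = true;  /* schedule_delayed_work(..., HZ/10) */
}

/*
 * Replays the interleaving step by step in a single thread.
 * Returns true if tx_work is still pending after npinfo was freed.
 */
bool sim_cleanup_race(bool rearm_safe_cancel)
{
	struct sim_npinfo n = { .refcnt = 1, .timer_pending = false, .freed = false };

	/* queue_process(): atomic_read() sees 1, then the worker is preempted. */
	bool saw_refcnt = (n.refcnt != 0);
	assert(saw_refcnt);

	/* netpoll_cleanup(): atomic_dec_and_test() drops the last reference. */
	n.refcnt--;

	/* cancel_delayed_work(): no timer is pending yet, so this does nothing. */
	if (n.timer_pending)
		n.timer_pending = false;

	/* flush_scheduled_work(): sleeps until the running handler returns. */
	sim_queue_process_resume(&n, saw_refcnt);

	/* Assumed behavior of a rearm-safe cancel: cancel again after the wait. */
	if (rearm_safe_cancel)
		n.timer_pending = false;

	/* netpoll_cleanup() frees npinfo and returns. */
	n.freed = true;

	return n.freed && n.timer_pending;
}
```

In this model sim_cleanup_race(false) returns true (the work is left pending
on freed memory), while sim_cleanup_race(true) returns false.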

So, 2.6.21 needs something better (maybe you've found it btw.?),
but they weren't too interested, anyway.

Thanks very much & regards,
Jarek P.
