All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Herbert Xu <herbert@gondor.hengli.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	Qianfeng Zhang <frzhang@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	netdev@vger.kernel.org, WANG Cong <amwang@redhat.com>,
	Stephen Hemminger <shemminger@vyatta.com>,
	Matt Mackall <mpm@selenic.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: [PATCH 6/8] netpoll: Allow netpoll_setup/cleanup recursion
Date: Thu, 24 Jun 2010 18:21:23 -0700	[thread overview]
Message-ID: <20100624182123.45264dfe.akpm@linux-foundation.org> (raw)
In-Reply-To: <E1OMtjk-0006P2-6r@gondolin.me.apana.org.au>

On Fri, 11 Jun 2010 12:12:48 +1000
Herbert Xu <herbert@gondor.hengli.com.au> wrote:

> netpoll: Allow netpoll_setup/cleanup recursion
> 
> This patch adds the functions __netpoll_setup/__netpoll_cleanup
> which is designed to be called recursively through ndo_netpoll_seutp.
> 
> They must be called with RTNL held, and the caller must initialise
> np->dev and ensure that it has a valid reference count.
> 
> ...
>
> -int netpoll_setup(struct netpoll *np)
> +int __netpoll_setup(struct netpoll *np)
>  {
> -	struct net_device *ndev = NULL;
> -	struct in_device *in_dev;
> +	struct net_device *ndev = np->dev;
>  	struct netpoll_info *npinfo;
>  	const struct net_device_ops *ops;
>  	unsigned long flags;
>  	int err;
>  
> +	if ((ndev->priv_flags & IFF_DISABLE_NETPOLL) ||
> +	    !ndev->netdev_ops->ndo_poll_controller) {
> +		printk(KERN_ERR "%s: %s doesn't support polling, aborting.\n",
> +		       np->name, np->dev_name);
> +		err = -ENOTSUPP;
> +		goto out;
> +	}
> +
> +	if (!ndev->npinfo) {
> +		npinfo = kmalloc(sizeof(*npinfo), GFP_KERNEL);
> +		if (!npinfo) {
> +			err = -ENOMEM;
> +			goto out;
> +		}
> +
> +		npinfo->rx_flags = 0;
> +		INIT_LIST_HEAD(&npinfo->rx_np);
> +
> +		spin_lock_init(&npinfo->rx_lock);
> +		skb_queue_head_init(&npinfo->arp_tx);
> +		skb_queue_head_init(&npinfo->txq);
> +		INIT_DELAYED_WORK(&npinfo->tx_work, queue_process);
> +
> +		atomic_set(&npinfo->refcnt, 1);
> +
> +		ops = np->dev->netdev_ops;
> +		if (ops->ndo_netpoll_setup) {
> +			err = ops->ndo_netpoll_setup(ndev, npinfo);
> +			if (err)
> +				goto free_npinfo;
> +		}
> +	} else {
> +		npinfo = ndev->npinfo;
> +		atomic_inc(&npinfo->refcnt);
> +	}
> +
> +	npinfo->netpoll = np;
> +
> +	if (np->rx_hook) {
> +		spin_lock_irqsave(&npinfo->rx_lock, flags);
> +		npinfo->rx_flags |= NETPOLL_RX_ENABLED;
> +		list_add_tail(&np->rx, &npinfo->rx_np);
> +		spin_unlock_irqrestore(&npinfo->rx_lock, flags);
> +	}
> +
> +	/* last thing to do is link it to the net device structure */
> +	rcu_assign_pointer(ndev->npinfo, npinfo);
> +	rtnl_unlock();

And there it is, an unbalanced rtnl_unlock().

This stupid little thing took me over a day's work to find - it's just
been awful.

The user-visible symptom was that a bug in the netpoll code causes the
machine to hang after loading ipv6 (!), because
addrconf_fixup_forwarding()'s rtnl_trylock() kept on failing, and the
restart_syscall() kept on getting restarted, so an initscripts procfs
write just kept banging its head against the excessively-unlocked
mutex.

The mutex code handles an excessively-unlocked mutex (mutex.count==2)
really badly.  Some API functions say "its locked", others say "it
isn't", etc.

Maybe it's better with mutex debugging enabled - didn't try that. 
Things get pretty user-unfriendly when there's a bug within the
netconsole code itself.

Enabling lockdep simply made the bug cure itself - I suspect the mutex
code's handling of mutexes is different if lockdep is enabled.  That
would be pretty bad behaviour from the lockdep code.

I just removed the rtnl_unlock() - I couldn't see much in there which
needed rtnl_locking..

Dave, the fixup should be folded into the original patch please -
otherwise we'll have a machine-hangs-up bisection hole which spans two
weeks work of commits.


  reply	other threads:[~2010-06-25  1:21 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-06-10 12:40 [0/8] netpoll/bridge fixes Herbert Xu
2010-06-10 12:42 ` [PATCH 1/7] netpoll: Set npinfo to NULL even with ndo_netpoll_cleanup Herbert Xu
2010-06-10 12:42 ` [PATCH 2/7] bridge: Remove redundant npinfo NULL setting Herbert Xu
2010-06-10 12:42 ` [PATCH 3/7] netpoll: Fix RCU usage Herbert Xu
2010-06-10 12:42 ` [PATCH 4/7] netpoll: Add locking for netpoll_setup/cleanup Herbert Xu
2010-06-10 12:42 ` [PATCH 5/7] netpoll: Add ndo_netpoll_setup Herbert Xu
2010-06-10 12:42 ` [PATCH 6/7] netpoll: Allow netpoll_setup/cleanup recursion Herbert Xu
2010-06-10 12:42 ` [PATCH 7/7] bridge: Fix netpoll support Herbert Xu
2010-06-10 14:49 ` [0/8] netpoll/bridge fixes Stephen Hemminger
2010-06-10 21:56   ` Herbert Xu
2010-06-10 21:59     ` Stephen Hemminger
2010-06-10 22:48       ` Herbert Xu
2010-06-11  2:11         ` Herbert Xu
2010-06-11  2:12           ` [PATCH 1/8] netpoll: Set npinfo to NULL even with ndo_netpoll_cleanup Herbert Xu
2010-06-11  2:12           ` [PATCH 2/8] bridge: Remove redundant npinfo NULL setting Herbert Xu
2010-06-11  2:12           ` [PATCH 3/8] netpoll: Fix RCU usage Herbert Xu
2010-06-11 23:10             ` Paul E. McKenney
2010-06-11  2:12           ` [PATCH 4/8] netpoll: Add locking for netpoll_setup/cleanup Herbert Xu
2010-06-11  2:12           ` [PATCH 5/8] netpoll: Add ndo_netpoll_setup Herbert Xu
2010-06-11  2:12           ` [PATCH 6/8] netpoll: Allow netpoll_setup/cleanup recursion Herbert Xu
2010-06-25  1:21             ` Andrew Morton [this message]
2010-06-25  3:01               ` Herbert Xu
2010-06-25  3:30               ` David Miller
2010-06-25  3:50                 ` Andrew Morton
2010-06-25  4:27                   ` David Miller
2010-06-25  4:42                     ` Andrew Morton
2010-06-25  4:52                       ` David Miller
2010-06-25  8:08                       ` Peter Zijlstra
2010-06-25  8:42                         ` Andrew Morton
2010-06-25  9:45                           ` Peter Zijlstra
2010-06-25  8:46                       ` Ingo Molnar
2010-06-25 10:08                       ` Nick Piggin
2010-06-11  2:12           ` [PATCH 7/8] netpoll: Add netpoll_tx_running Herbert Xu
2010-06-11  2:12           ` [PATCH 8/8] bridge: Fix netpoll support Herbert Xu
2010-06-11  3:08             ` fired a bug report on bugzilla.redhat.com Qianfeng Zhang
2010-06-15 10:28             ` [PATCH 8/8] bridge: Fix netpoll support Cong Wang
2010-06-17 10:38               ` Herbert Xu
2010-06-17 10:57                 ` Cong Wang
2010-06-17 10:55                   ` Herbert Xu
2010-06-18  3:06                     ` Cong Wang
2010-06-11 20:03           ` [0/8] netpoll/bridge fixes Matt Mackall
2010-06-15 10:17           ` Cong Wang
2010-06-15 18:39           ` David Miller
2010-06-16  2:58             ` Eric Dumazet
2010-06-16  3:03               ` Eric Dumazet
2010-06-16  3:33                 ` Herbert Xu
2010-06-16  4:47                   ` David Miller
2010-06-16 23:02                     ` Paul E. McKenney
2010-06-17 10:18                       ` Michael S. Tsirkin
2010-06-17 21:26                         ` Paul E. McKenney
2010-06-16  6:16                   ` Eric Dumazet
2010-06-16  5:08               ` Paul E. McKenney
2010-06-16  6:21                 ` Eric Dumazet
2010-06-16 16:01                   ` Paul E. McKenney
2010-07-19 10:19           ` Michael S. Tsirkin
2010-07-19 10:53             ` Herbert Xu
2010-07-19 11:54               ` Herbert Xu
2010-07-19 16:05                 ` David Miller
2010-07-19 16:52                   ` Eric Dumazet
2010-07-19 20:35                     ` David Miller
2010-07-20  5:26                   ` Herbert Xu
2010-07-20  6:28                     ` David Miller
2010-06-29 12:53 ` Yanko Kaneti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100624182123.45264dfe.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=amwang@redhat.com \
    --cc=davem@davemloft.net \
    --cc=frzhang@redhat.com \
    --cc=herbert@gondor.hengli.com.au \
    --cc=mingo@elte.hu \
    --cc=mpm@selenic.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=shemminger@vyatta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.