All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jay Vosburgh <fubar@us.ibm.com>
To: Nikolay Aleksandrov <nikolay@redhat.com>
Cc: netdev@vger.kernel.org, andy@greyhouse.net
Subject: Re: [PATCH net-next] bonding: fix system hang due to fast igmp timer rescheduling
Date: Tue, 30 Jul 2013 11:40:40 -0700	[thread overview]
Message-ID: <7289.1375209640@death.nxdomain> (raw)
In-Reply-To: <1375205852-31325-1-git-send-email-nikolay@redhat.com>

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>From: Nikolay Aleksandrov <Nikolay Aleksandrov nikolay@redhat.com>
>
>After commit 4aa5dee4d9 ("net: convert resend IGMP to notifier event")
>we try to acquire rtnl in bond_resend_igmp_join_requests but it can be
>scheduled with rtnl already held (e.g. when bond_change_active_slave is
>called with rtnl) causing a loop of immediate reschedules + calls because
>rtnl_trylock fails each time since it's being already held.
>For me this issue leads to system hangs very easy:
>modprobe bonding; ifconfig bond0 up; ifenslave bond0 eth0; rmmod
>bonding;

	I believe that bond_change_active_slave is always called with
rtnl held, and it is the only caller of bond_resend_igmp_join_requests
(well, "caller" in the sense that it queues the delayed work for
mcast_work that runs the function, currently with delay of 0).

>The fix is to introduce a small (1 jiffy) delay which is enough for the
>sections holding rtnl to finish without putting any strain on the system.

	Should the delay also be in the bond_change_active_slave queue
work call as well, to eliminate one loop of the "rtnl_trylock failing ->
queue_delayed_work" sequence in bond_resend_igmp_join_requests?

	-J

>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>---
> drivers/net/bonding/bond_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index da3af63..9d94313 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -723,7 +723,7 @@ static int bond_set_allmulti(struct bonding *bond, int inc)
> static void bond_resend_igmp_join_requests(struct bonding *bond)
> {
> 	if (!rtnl_trylock()) {
>-		queue_delayed_work(bond->wq, &bond->mcast_work, 0);
>+		queue_delayed_work(bond->wq, &bond->mcast_work, 1);
> 		return;
> 	}
> 	call_netdevice_notifiers(NETDEV_RESEND_IGMP, bond->dev);
>-- 
>1.8.1.4

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

  reply	other threads:[~2013-07-30 18:45 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-30 17:37 [PATCH net-next] bonding: fix system hang due to fast igmp timer rescheduling Nikolay Aleksandrov
2013-07-30 18:40 ` Jay Vosburgh [this message]
2013-07-30 23:25   ` Nikolay Aleksandrov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7289.1375209640@death.nxdomain \
    --to=fubar@us.ibm.com \
    --cc=andy@greyhouse.net \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.