From: Andy Gospodarek <andy@greyhouse.net>
To: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>,
Krzysztof Oledzki <olel@ans.pl>,
netdev@vger.kernel.org, Jeff Garzik <jgarzik@pobox.com>,
David Miller <davem@davemloft.net>,
Herbert Xu <herbert@gondor.apana.org.au>
Subject: Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
Date: Wed, 9 Jan 2008 15:17:09 -0500
Message-ID: <20080109201709.GF8728@gospo.usersys.redhat.com>
In-Reply-To: <32361.1199901296@death>
On Wed, Jan 09, 2008 at 09:54:56AM -0800, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
> [...]
> >My initial concern was that a slave device could disappear out from
> >under us, but it seems like this certainly isn't the case since all
> >calls to bond_release are protected by rtnl-locks, so I think you are
> >correct that we are safe. I'll test this on my setup here and let you
> >know if I see any problems.
>
> Yep, all entries into enslave or remove come in with RTNL, so if
> we have RTNL there then slaves can't vanish.
>
> On further inspection, I don't think it's safe to simply drop
> the locks in bond_set_multicast_list, I'm seeing a couple of cases that
> could be troublesome:
>
> bond_set_promiscuity and bond_set_allmulti both reference
> curr_active_slave, which isn't protected from change by RTNL, so that
> could conflict with a change_active_slave calling bond_mc_swap (which is
> also holding the wrong locks for dev_set_promisc/allmulti).
>
> It also looks like there are paths (igmp6 for one) into
> dev_mc_add that just hold a bunch of regular locks, and not RTNL, so
> those wouldn't be safe from having slaves vanish due to concurrent
> deslavement.
Eeeek! I didn't realize that rtnl wasn't held for all those calls. If
that's the case we can't drop all the locks.
> Looks like read_lock_bh for bond-lock and curr_slave_lock is
> needed in bond_set_multicast_list, and some dropping of locks is needed
> inside bond_set_promisc/allmulti. Methinks that without any locks,
> bond_mc_add/delete could race with either a change of active slave or a
> de-enslavement of the active slave.
Agreed. And despite Herbert's opinion that this isn't the correct fix,
I think this will work fine. This is one of the cases where we can take
a write_lock(bond->lock) in softirq context, so we need to drop that (or
make sure all the read_locks are read_lock_bh). The latter isn't
really an option, since having a majority of the bonding code run in
softirq context is what we were trying to avoid with the workqueue
conversion.
> I'm wondering if this is worth trying to make perfect for 2.6.24
> (and maybe making things worse), and, instead, just do this:
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 77d004d..8b9e33a 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -3937,7 +3937,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
>  	struct bonding *bond = bond_dev->priv;
>  	struct dev_mc_list *dmi;
>
> -	write_lock_bh(&bond->lock);
> +	read_lock_bh(&bond->lock);
>
>  	/*
>  	 * Do promisc before checking multicast_mode
> @@ -3979,7 +3979,7 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
>  	bond_mc_list_destroy(bond);
>  	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
>
> -	write_unlock_bh(&bond->lock);
> +	read_unlock_bh(&bond->lock);
>  }
>
>  /*
>
>
> This should silence the lockdep (if I'm understanding what
> everybody's saying), and keep the change set to a minimum. This might
> not even be worth pushing for 2.6.24; I'm not exactly sure how difficult
> the lockdep problem would be to trigger.
>
The lockdep problem is easy to trigger. The lockdep code does a good
job of noticing problems quickly regardless of how easy the deadlocks
are to create.

I'd like to see it go in there (for correctness) and to avoid hearing
about these lockdep issues for the next few months until it makes it
into 2.6.25.
> The other stuff I mention above can be dealt with later; they're
> very low-probability races that would be pretty difficult to hit even on
> purpose.
>
> Thoughts?
>
> -J
>
> ---
> -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com