From: Jarod Wilson <jarod@redhat.com>
To: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
Mahesh Bandewar <maheshb@google.com>,
Veaceslav Falico <vfalico@gmail.com>,
Andy Gospodarek <andy@greyhouse.net>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Thomas Davis <tadavis@lbl.gov>,
Netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH net] bonding: reduce rtnl lock contention in mii monitor thread
Date: Wed, 9 Dec 2020 10:26:38 -0500 [thread overview]
Message-ID: <CAKfmpSfbvi4hVCCotFLQB3MzvCx9ZRX7ttVww136pPkm1bN6PQ@mail.gmail.com> (raw)
In-Reply-To: <11900.1607459690@famine>
On Tue, Dec 8, 2020 at 3:35 PM Jay Vosburgh <jay.vosburgh@canonical.com> wrote:
>
> Jarod Wilson <jarod@redhat.com> wrote:
>
> >I'm seeing a system get stuck unable to bring a downed interface back up
> >when it's got an updelay value set, behavior which ceased when logging
> >spew was removed from bond_miimon_inspect(). I'm monitoring logs on this
> >system over another network connection, and it seems that the act of
> >spewing logs at all there increases rtnl lock contention, because
> >instrumented code showed bond_mii_monitor() never able to succeed in it's
> >attempts to call rtnl_trylock() to actually commit link state changes,
> >leaving the downed link stuck in BOND_LINK_DOWN. The system in question
> >appears to be fine with the log spew being moved to
> >bond_commit_link_state(), which is called after the successful
> >rtnl_trylock(). I'm actually wondering if perhaps we ultimately need/want
> >some bond-specific lock here to prevent racing with bond_close() instead
> >of using rtnl, but this shift of the output appears to work. I believe
> >this started happening when de77ecd4ef02 ("bonding: improve link-status
> >update in mii-monitoring") went in, but I'm not 100% on that.
>
> We use RTNL not to avoid deadlock with bonding itself, but
> because the "commit" side undertakes actions which require RTNL, e.g.,
> various events will eventually call netdev_lower_state_changed.
>
> However, the RTNL acquisition is a trylock to avoid the deadlock
> with bond_close. Moving that out of line here (e.g., putting the commit
> into another work queue event or the like) has the same problem, in that
> bond_close needs to wait for all of the work queue events to finish, and
> it holds RTNL.
Ah, okay, it wasn't clear to me that we actually do need RTNL here,
I'd thought it was just for the deadlock avoidance with bond_close,
based on the comments in the source.
> Also, a dim memory says that the various notification messages
> were mostly placed in the "inspect" phase and not the "commit" phase to
> avoid doing printk-like activities with RTNL held. As a general
> principle, I don't think we want to add more verbiage under RTNL.
Yeah, that's fair.
> >The addition of a case BOND_LINK_BACK in bond_miimon_inspect() is somewhat
> >separate from the fix for the actual hang, but it eliminates a constant
> >"invalid new link 3 on slave" message seen related to this issue, and it's
> >not actually an invalid state here, so we shouldn't be reporting it as an
> >error.
>
> Do you mean bond_miimon_commit here and not bond_miimon_inspect
> (which already has a case for BOND_LINK_BACK)?
Whoops, yes.
> In principle, bond_miimon_commit should not see _BACK or _FAIL
> state as a new link state, because those states should be managed at the
> bond_miimon_inspect level (as they are the result of updelay and
> downdelay). These states should not be "committed" in the sense of
> causing notifications or doing actions that require RTNL.
>
> My recollection is that the "invalid new link" messages were the
> result of a bug in de77ecd4ef02, which was fixed in 1899bb325149
> ("bonding: fix state transition issue in link monitoring"), but maybe
> the RTNL problem here induces that in some other fashion.
>
> Either way, I believe this message is correct as-is.
Hm, okay, definitely seeing this message pop up regularly when a link
recovers, using a fairly simple reproducer:
slave1=p6p1
slave2=p6p2
modprobe -rv bonding
modprobe -v bonding mode=2 miimon=100 updelay=200
ip link set bond0 up
ifenslave bond0 $slave1 $slave2
sleep 10
while :
do
ip link set $slave1 down
sleep 1
ip link set $slave1 up
sleep 1
done
I wasn't actually seeing the problem until I was running a 'watch -n 1
"dmesg | tail -n 50"' or similar in a remote ssh session on the host.
I should add the caveat that this was also initially seen on an older
kernel, but with a fairly up-to-date bonding driver, which does
include both de77ecd4ef02 and 1899bb325149. I'm going to keep prodding
w/a more recent upstreamier kernel, and see if I can get a better idea
of what's actually going on.
--
Jarod Wilson
jarod@redhat.com
next prev parent reply other threads:[~2020-12-09 15:28 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-05 23:43 [PATCH net] bonding: reduce rtnl lock contention in mii monitor thread Jarod Wilson
2020-12-08 19:38 ` Jakub Kicinski
2020-12-09 15:08 ` Jarod Wilson
2020-12-08 20:34 ` Jay Vosburgh
2020-12-09 15:26 ` Jarod Wilson [this message]
2020-12-18 23:48 ` Jarod Wilson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAKfmpSfbvi4hVCCotFLQB3MzvCx9ZRX7ttVww136pPkm1bN6PQ@mail.gmail.com \
--to=jarod@redhat.com \
--cc=andy@greyhouse.net \
--cc=davem@davemloft.net \
--cc=jay.vosburgh@canonical.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=maheshb@google.com \
--cc=netdev@vger.kernel.org \
--cc=tadavis@lbl.gov \
--cc=vfalico@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).