Re: [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jay Vosburgh <jay.vosburgh@canonical.com>
To: Nikolay Aleksandrov <nikolay@redhat.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Mahesh Bandewar <maheshb@google.com>,
	Andy Gospodarek <andy@greyhouse.net>,
	Veaceslav Falico <vfalico@gmail.com>,
	David Miller <davem@davemloft.net>,
	Maciej Zenczykowski <maze@google.com>,
	netdev <netdev@vger.kernel.org>,
	Eric Dumazet <edumazet@google.com>
Subject: Re: [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications
Date: Thu, 12 Mar 2015 12:10:07 -0700	[thread overview]
Message-ID: <13968.1426187407@famine> (raw)
In-Reply-To: <5501A1B1.3030907@redhat.com>

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>On 12/03/15 15:21, Nikolay Aleksandrov wrote:
>> On 12/03/15 15:11, Eric Dumazet wrote:
>>> On Thu, 2015-03-12 at 13:09 +0100, Nikolay Aleksandrov wrote:
>>>> On 12/03/15 06:54, Mahesh Bandewar wrote:
>>>>> This patch series tries to address the issue discovered in various work-
>>>>> queues and the way these handlers deal with the RTNL. Especially for 
>>>>> notification handling. If RTNL can not be acquired, these handlers ignore
>>>>> sending notifications and just re-arm the timer. This could be very 
>>>>> problematic if the re-arm timer has larger value (e.g. in minutes).
>>>>>
>>>>>
>>>>> Mahesh Bandewar (4):
>>>>>   bonding: Handle notifications during work-queue processing gracefully
>>>>>   bonding: Do not ignore notifications for miimon-work-queue
>>>>>   bonding: Do not ignore notifications for AD-work-queue
>>>>>   bonding: Do not ignore notifications for ARP-work-queue
>>>>>
>>>>>  drivers/net/bonding/bond_3ad.c  | 22 ++++++++++++---
>>>>>  drivers/net/bonding/bond_main.c | 62 ++++++++++++++++++++++++-----------------
>>>>>  include/net/bonding.h           | 22 +++++++++++++++
>>>>>  3 files changed, 77 insertions(+), 29 deletions(-)
>>>>>
>>>>
>>>> Hello Mahesh,
>>>> The current behaviour was chosen as a best-effort type because such change
>>>> could skew the monitoring/AD timers because you re-schedule every time you
>>>> cannot acquire rtnl and move their timers with 1 tick ahead which could lead 
>>>> to false link drops, state machines running late and so on.
>>>> Currently the monitoring/AD functions have priority over the notifications as in
>>>> we might miss a notification but we try not to miss a monitoring/AD event, thus I'd
>>>> suggest to solve this in a different manner. If you'd like to have the best of both
>>>> worlds i.e. not miss a monitoring event and also not miss any notifications you should
>>>> use a different strategy. 
>>>
>>>
>>> I think I disagree here.
>>>
>>> All rtnl_trylock() call sites must be very careful about what they are
>>> doing.
>>>
>>> bonding is not, and we should fix this.
>>>
>>> Mahesh fix seems very reasonable. If you need something else, please
>>> provide your own patch.
>>>
>>> When code is the following :
>>>
>>>         if (!rtnl_trylock())
>>>                  return;
>>>         call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, bond->dev);
>>>         rtnl_unlock();
>>>
>>> Then, one must understand what happens if at this point rtnl_trylock()
>>> never succeeds.
>>>
>>> Like, what happens if you remove the whole thing.
>>>
>>> If the code is not needed, remove it, instead of having flaky behavior.
>>>
>> 
>> I agree that it should be fixed and that would work, my only concern is that in
>> rare cases this might lead to skewing of the monitoring/AD timers because with
>> every failed attempt at obtaining rtnl it moves the associated timer with 1 tick
>> ahead, because when it successfully obtains it then the timer gets re-armed with the
>> full timeout. What I suggested has already been done actually by the slave array update
>> which uses a work item of its own because of the rtnl update, so we could just pull all
>> of the call sites that request rtnl in a single work item which checks some bits and
>> calls the appropriate notifications or re-schedules itself if necessary, that would keep
>> all the monitoring/AD timers closer to being correct.
>> I can see that this would be a very rare case so I don't have a strong feeling about
>> my argument, I just wanted to make sure it gets considered. If you and the others decide
>> it's not worth it, then so be it. Also pulling all rtnl call sites in a single place
>> looks cleaner to me instead of having the same logic in each work item's function.
>> 
>> Cheers,
>>  Nik
>> 
>
>Well, of course that has its own problems of missed updates. Hrm.. Okay
>let's leave it this way. Never mind my argument about the timer
>skewing, consider only fixing the RCU problem.

	I don't see an issue with how this changes the notification
processing.  I was initially concerned with the 802.3ad state machine,
but the actual state machine processing is skipped for the "accelerated"
invocation that sends the notifier.  I don't see that the state machine
itself will become skewed due to delays unless rtnl_trylock fails
continuously for an extended period (upwards of 2 seconds, I think, long
enough to risk the 3 second LACPDU timeout if LACP rate is set to fast).

	When the "rtnl_trylock" business was added, the primary concern
was deadlock avoidance, not timely delivery of the notifications.  The
deadlock still looks to be a concern; some paths come in with RTNL held
and acquire the bond->mode_lock, the various monitoring functions are
doing the opposite order.

	I recall another suggestion a while back to modify the bonding
periodic functions to acquire RTNL for every pass (followed by the
mode_lock) to eliminate the trylock, but I don't see that as preferrable
due to the amount of spurious RTNL acquisitions.

	That said, it would be better to have a more deterministic
system (e.g., moving the "needs RTNL" logic into its own work item
without conditional locking), but as an incremental change to what's
there now, I don't see a problem with this patch set.

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

next prev parent reply	other threads:[~2015-03-12 19:10 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-12  5:54 [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications Mahesh Bandewar
2015-03-12 12:09 ` Nikolay Aleksandrov
2015-03-12 14:11   ` Eric Dumazet
2015-03-12 14:21     ` Nikolay Aleksandrov
2015-03-12 14:24       ` Nikolay Aleksandrov
2015-03-12 19:10         ` Jay Vosburgh [this message]
2015-03-12 19:20 ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=13968.1426187407@famine \
    --to=jay.vosburgh@canonical.com \
    --cc=andy@greyhouse.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=maheshb@google.com \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@redhat.com \
    --cc=vfalico@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.