* [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications @ 2015-03-12 5:54 Mahesh Bandewar 2015-03-12 12:09 ` Nikolay Aleksandrov 2015-03-12 19:20 ` David Miller 0 siblings, 2 replies; 7+ messages in thread From: Mahesh Bandewar @ 2015-03-12 5:54 UTC (permalink / raw) To: Jay Vosburgh, Andy Gospodarek, Veaceslav Falico, Nikolay Aleksandrov, David Miller Cc: Mahesh Bandewar, Maciej Zenczykowski, netdev, Eric Dumazet This patch series tries to address the issue discovered in various work- queues and the way these handlers deal with the RTNL. Especially for notification handling. If RTNL can not be acquired, these handlers ignore sending notifications and just re-arm the timer. This could be very problematic if the re-arm timer has larger value (e.g. in minutes). Mahesh Bandewar (4): bonding: Handle notifications during work-queue processing gracefully bonding: Do not ignore notifications for miimon-work-queue bonding: Do not ignore notifications for AD-work-queue bonding: Do not ignore notifications for ARP-work-queue drivers/net/bonding/bond_3ad.c | 22 ++++++++++++--- drivers/net/bonding/bond_main.c | 62 ++++++++++++++++++++++++----------------- include/net/bonding.h | 22 +++++++++++++++ 3 files changed, 77 insertions(+), 29 deletions(-) -- 2.2.0.rc0.207.ga3a616c ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications 2015-03-12 5:54 [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications Mahesh Bandewar @ 2015-03-12 12:09 ` Nikolay Aleksandrov 2015-03-12 14:11 ` Eric Dumazet 2015-03-12 19:20 ` David Miller 1 sibling, 1 reply; 7+ messages in thread From: Nikolay Aleksandrov @ 2015-03-12 12:09 UTC (permalink / raw) To: Mahesh Bandewar, Jay Vosburgh, Andy Gospodarek, Veaceslav Falico, David Miller Cc: Maciej Zenczykowski, netdev, Eric Dumazet On 12/03/15 06:54, Mahesh Bandewar wrote: > This patch series tries to address the issue discovered in various work- > queues and the way these handlers deal with the RTNL. Especially for > notification handling. If RTNL can not be acquired, these handlers ignore > sending notifications and just re-arm the timer. This could be very > problematic if the re-arm timer has larger value (e.g. in minutes). > > > Mahesh Bandewar (4): > bonding: Handle notifications during work-queue processing gracefully > bonding: Do not ignore notifications for miimon-work-queue > bonding: Do not ignore notifications for AD-work-queue > bonding: Do not ignore notifications for ARP-work-queue > > drivers/net/bonding/bond_3ad.c | 22 ++++++++++++--- > drivers/net/bonding/bond_main.c | 62 ++++++++++++++++++++++++----------------- > include/net/bonding.h | 22 +++++++++++++++ > 3 files changed, 77 insertions(+), 29 deletions(-) > Hello Mahesh, The current behaviour was chosen as a best-effort type because such change could skew the monitoring/AD timers because you re-schedule every time you cannot acquire rtnl and move their timers with 1 tick ahead which could lead to false link drops, state machines running late and so on. Currently the monitoring/AD functions have priority over the notifications as in we might miss a notification but we try not to miss a monitoring/AD event, thus I'd suggest to solve this in a different manner. If you'd like to have the best of both worlds i.e. not miss a monitoring event and also not miss any notifications you should use a different strategy. One idea would be to use a new "work" which checks some notification bits and sends various types of notifications, then the monitoring works can just schedule it whenever they need to send notifications and can keep their running times accurate. The new work should handle the rtnl side of things and re-schedule itself if necessary, now that is only an idea from the top of my head I haven't checked if it's feasible or if there isn't a better approach so I'd be open to other ideas. Also if you do decide to go this way, you should be very careful with the order of work cancellation in bond_close(). In fact, IIRC, this has already been done by the slave array update that you introduced and also by the igmp resend work. There's also a bug in bond_should_notify_peers(), as in it's not RCU safe on its own, it's currently safe because its callers use it in RCU protected sections but here you move it outside so it should be fixed first (take a look at how "slave" is used). Nik ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications 2015-03-12 12:09 ` Nikolay Aleksandrov @ 2015-03-12 14:11 ` Eric Dumazet 2015-03-12 14:21 ` Nikolay Aleksandrov 0 siblings, 1 reply; 7+ messages in thread From: Eric Dumazet @ 2015-03-12 14:11 UTC (permalink / raw) To: Nikolay Aleksandrov Cc: Mahesh Bandewar, Jay Vosburgh, Andy Gospodarek, Veaceslav Falico, David Miller, Maciej Zenczykowski, netdev, Eric Dumazet On Thu, 2015-03-12 at 13:09 +0100, Nikolay Aleksandrov wrote: > On 12/03/15 06:54, Mahesh Bandewar wrote: > > This patch series tries to address the issue discovered in various work- > > queues and the way these handlers deal with the RTNL. Especially for > > notification handling. If RTNL can not be acquired, these handlers ignore > > sending notifications and just re-arm the timer. This could be very > > problematic if the re-arm timer has larger value (e.g. in minutes). > > > > > > Mahesh Bandewar (4): > > bonding: Handle notifications during work-queue processing gracefully > > bonding: Do not ignore notifications for miimon-work-queue > > bonding: Do not ignore notifications for AD-work-queue > > bonding: Do not ignore notifications for ARP-work-queue > > > > drivers/net/bonding/bond_3ad.c | 22 ++++++++++++--- > > drivers/net/bonding/bond_main.c | 62 ++++++++++++++++++++++++----------------- > > include/net/bonding.h | 22 +++++++++++++++ > > 3 files changed, 77 insertions(+), 29 deletions(-) > > > > Hello Mahesh, > The current behaviour was chosen as a best-effort type because such change > could skew the monitoring/AD timers because you re-schedule every time you > cannot acquire rtnl and move their timers with 1 tick ahead which could lead > to false link drops, state machines running late and so on. > Currently the monitoring/AD functions have priority over the notifications as in > we might miss a notification but we try not to miss a monitoring/AD event, thus I'd > suggest to solve this in a different manner. If you'd like to have the best of both > worlds i.e. not miss a monitoring event and also not miss any notifications you should > use a different strategy. I think I disagree here. All rtnl_trylock() call sites must be very careful about what they are doing. bonding is not, and we should fix this. Mahesh fix seems very reasonable. If you need something else, please provide your own patch. When code is the following : if (!rtnl_trylock()) return; call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, bond->dev); rtnl_unlock(); Then, one must understand what happens if at this point rtnl_trylock() never succeeds. Like, what happens if you remove the whole thing. If the code is not needed, remove it, instead of having flaky behavior. diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index c026ce9cd7b6f52f1a6bff88b9e6053b13ecebcd..958c9a46345a59daee2dbd34d859b17af94931bc 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2167,12 +2167,6 @@ re_arm: if (bond->params.miimon) queue_delayed_work(bond->wq, &bond->mii_work, delay); - if (should_notify_peers) { - if (!rtnl_trylock()) - return; - call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, bond->dev); - rtnl_unlock(); - } } static bool bond_has_this_ip(struct bonding *bond, __be32 ip) ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications 2015-03-12 14:11 ` Eric Dumazet @ 2015-03-12 14:21 ` Nikolay Aleksandrov 2015-03-12 14:24 ` Nikolay Aleksandrov 0 siblings, 1 reply; 7+ messages in thread From: Nikolay Aleksandrov @ 2015-03-12 14:21 UTC (permalink / raw) To: Eric Dumazet Cc: Mahesh Bandewar, Jay Vosburgh, Andy Gospodarek, Veaceslav Falico, David Miller, Maciej Zenczykowski, netdev, Eric Dumazet On 12/03/15 15:11, Eric Dumazet wrote: > On Thu, 2015-03-12 at 13:09 +0100, Nikolay Aleksandrov wrote: >> On 12/03/15 06:54, Mahesh Bandewar wrote: >>> This patch series tries to address the issue discovered in various work- >>> queues and the way these handlers deal with the RTNL. Especially for >>> notification handling. If RTNL can not be acquired, these handlers ignore >>> sending notifications and just re-arm the timer. This could be very >>> problematic if the re-arm timer has larger value (e.g. in minutes). >>> >>> >>> Mahesh Bandewar (4): >>> bonding: Handle notifications during work-queue processing gracefully >>> bonding: Do not ignore notifications for miimon-work-queue >>> bonding: Do not ignore notifications for AD-work-queue >>> bonding: Do not ignore notifications for ARP-work-queue >>> >>> drivers/net/bonding/bond_3ad.c | 22 ++++++++++++--- >>> drivers/net/bonding/bond_main.c | 62 ++++++++++++++++++++++++----------------- >>> include/net/bonding.h | 22 +++++++++++++++ >>> 3 files changed, 77 insertions(+), 29 deletions(-) >>> >> >> Hello Mahesh, >> The current behaviour was chosen as a best-effort type because such change >> could skew the monitoring/AD timers because you re-schedule every time you >> cannot acquire rtnl and move their timers with 1 tick ahead which could lead >> to false link drops, state machines running late and so on. >> Currently the monitoring/AD functions have priority over the notifications as in >> we might miss a notification but we try not to miss a monitoring/AD event, thus I'd >> suggest to solve this in a different manner. If you'd like to have the best of both >> worlds i.e. not miss a monitoring event and also not miss any notifications you should >> use a different strategy. > > > I think I disagree here. > > All rtnl_trylock() call sites must be very careful about what they are > doing. > > bonding is not, and we should fix this. > > Mahesh fix seems very reasonable. If you need something else, please > provide your own patch. > > When code is the following : > > if (!rtnl_trylock()) > return; > call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, bond->dev); > rtnl_unlock(); > > Then, one must understand what happens if at this point rtnl_trylock() > never succeeds. > > Like, what happens if you remove the whole thing. > > If the code is not needed, remove it, instead of having flaky behavior. > I agree that it should be fixed and that would work, my only concern is that in rare cases this might lead to skewing of the monitoring/AD timers because with every failed attempt at obtaining rtnl it moves the associated timer with 1 tick ahead, because when it successfully obtains it then the timer gets re-armed with the full timeout. What I suggested has already been done actually by the slave array update which uses a work item of its own because of the rtnl update, so we could just pull all of the call sites that request rtnl in a single work item which checks some bits and calls the appropriate notifications or re-schedules itself if necessary, that would keep all the monitoring/AD timers closer to being correct. I can see that this would be a very rare case so I don't have a strong feeling about my argument, I just wanted to make sure it gets considered. If you and the others decide it's not worth it, then so be it. Also pulling all rtnl call sites in a single place looks cleaner to me instead of having the same logic in each work item's function. Cheers, Nik > > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index c026ce9cd7b6f52f1a6bff88b9e6053b13ecebcd..958c9a46345a59daee2dbd34d859b17af94931bc 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -2167,12 +2167,6 @@ re_arm: > if (bond->params.miimon) > queue_delayed_work(bond->wq, &bond->mii_work, delay); > > - if (should_notify_peers) { > - if (!rtnl_trylock()) > - return; > - call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, bond->dev); > - rtnl_unlock(); > - } > } > > static bool bond_has_this_ip(struct bonding *bond, __be32 ip) > > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications 2015-03-12 14:21 ` Nikolay Aleksandrov @ 2015-03-12 14:24 ` Nikolay Aleksandrov 2015-03-12 19:10 ` Jay Vosburgh 0 siblings, 1 reply; 7+ messages in thread From: Nikolay Aleksandrov @ 2015-03-12 14:24 UTC (permalink / raw) To: Eric Dumazet Cc: Mahesh Bandewar, Jay Vosburgh, Andy Gospodarek, Veaceslav Falico, David Miller, Maciej Zenczykowski, netdev, Eric Dumazet On 12/03/15 15:21, Nikolay Aleksandrov wrote: > On 12/03/15 15:11, Eric Dumazet wrote: >> On Thu, 2015-03-12 at 13:09 +0100, Nikolay Aleksandrov wrote: >>> On 12/03/15 06:54, Mahesh Bandewar wrote: >>>> This patch series tries to address the issue discovered in various work- >>>> queues and the way these handlers deal with the RTNL. Especially for >>>> notification handling. If RTNL can not be acquired, these handlers ignore >>>> sending notifications and just re-arm the timer. This could be very >>>> problematic if the re-arm timer has larger value (e.g. in minutes). >>>> >>>> >>>> Mahesh Bandewar (4): >>>> bonding: Handle notifications during work-queue processing gracefully >>>> bonding: Do not ignore notifications for miimon-work-queue >>>> bonding: Do not ignore notifications for AD-work-queue >>>> bonding: Do not ignore notifications for ARP-work-queue >>>> >>>> drivers/net/bonding/bond_3ad.c | 22 ++++++++++++--- >>>> drivers/net/bonding/bond_main.c | 62 ++++++++++++++++++++++++----------------- >>>> include/net/bonding.h | 22 +++++++++++++++ >>>> 3 files changed, 77 insertions(+), 29 deletions(-) >>>> >>> >>> Hello Mahesh, >>> The current behaviour was chosen as a best-effort type because such change >>> could skew the monitoring/AD timers because you re-schedule every time you >>> cannot acquire rtnl and move their timers with 1 tick ahead which could lead >>> to false link drops, state machines running late and so on. >>> Currently the monitoring/AD functions have priority over the notifications as in >>> we might miss a notification but we try not to miss a monitoring/AD event, thus I'd >>> suggest to solve this in a different manner. If you'd like to have the best of both >>> worlds i.e. not miss a monitoring event and also not miss any notifications you should >>> use a different strategy. >> >> >> I think I disagree here. >> >> All rtnl_trylock() call sites must be very careful about what they are >> doing. >> >> bonding is not, and we should fix this. >> >> Mahesh fix seems very reasonable. If you need something else, please >> provide your own patch. >> >> When code is the following : >> >> if (!rtnl_trylock()) >> return; >> call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, bond->dev); >> rtnl_unlock(); >> >> Then, one must understand what happens if at this point rtnl_trylock() >> never succeeds. >> >> Like, what happens if you remove the whole thing. >> >> If the code is not needed, remove it, instead of having flaky behavior. >> > > I agree that it should be fixed and that would work, my only concern is that in > rare cases this might lead to skewing of the monitoring/AD timers because with > every failed attempt at obtaining rtnl it moves the associated timer with 1 tick > ahead, because when it successfully obtains it then the timer gets re-armed with the > full timeout. What I suggested has already been done actually by the slave array update > which uses a work item of its own because of the rtnl update, so we could just pull all > of the call sites that request rtnl in a single work item which checks some bits and > calls the appropriate notifications or re-schedules itself if necessary, that would keep > all the monitoring/AD timers closer to being correct. > I can see that this would be a very rare case so I don't have a strong feeling about > my argument, I just wanted to make sure it gets considered. If you and the others decide > it's not worth it, then so be it. Also pulling all rtnl call sites in a single place > looks cleaner to me instead of having the same logic in each work item's function. > > Cheers, > Nik > Well, of course that has its own problems of missed updates. Hrm.. Okay let's leave it this way. Never mind my argument about the timer skewing, consider only fixing the RCU problem. Nik ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications 2015-03-12 14:24 ` Nikolay Aleksandrov @ 2015-03-12 19:10 ` Jay Vosburgh 0 siblings, 0 replies; 7+ messages in thread From: Jay Vosburgh @ 2015-03-12 19:10 UTC (permalink / raw) To: Nikolay Aleksandrov Cc: Eric Dumazet, Mahesh Bandewar, Andy Gospodarek, Veaceslav Falico, David Miller, Maciej Zenczykowski, netdev, Eric Dumazet Nikolay Aleksandrov <nikolay@redhat.com> wrote: >On 12/03/15 15:21, Nikolay Aleksandrov wrote: >> On 12/03/15 15:11, Eric Dumazet wrote: >>> On Thu, 2015-03-12 at 13:09 +0100, Nikolay Aleksandrov wrote: >>>> On 12/03/15 06:54, Mahesh Bandewar wrote: >>>>> This patch series tries to address the issue discovered in various work- >>>>> queues and the way these handlers deal with the RTNL. Especially for >>>>> notification handling. If RTNL can not be acquired, these handlers ignore >>>>> sending notifications and just re-arm the timer. This could be very >>>>> problematic if the re-arm timer has larger value (e.g. in minutes). >>>>> >>>>> >>>>> Mahesh Bandewar (4): >>>>> bonding: Handle notifications during work-queue processing gracefully >>>>> bonding: Do not ignore notifications for miimon-work-queue >>>>> bonding: Do not ignore notifications for AD-work-queue >>>>> bonding: Do not ignore notifications for ARP-work-queue >>>>> >>>>> drivers/net/bonding/bond_3ad.c | 22 ++++++++++++--- >>>>> drivers/net/bonding/bond_main.c | 62 ++++++++++++++++++++++++----------------- >>>>> include/net/bonding.h | 22 +++++++++++++++ >>>>> 3 files changed, 77 insertions(+), 29 deletions(-) >>>>> >>>> >>>> Hello Mahesh, >>>> The current behaviour was chosen as a best-effort type because such change >>>> could skew the monitoring/AD timers because you re-schedule every time you >>>> cannot acquire rtnl and move their timers with 1 tick ahead which could lead >>>> to false link drops, state machines running late and so on. >>>> Currently the monitoring/AD functions have priority over the notifications as in >>>> we might miss a notification but we try not to miss a monitoring/AD event, thus I'd >>>> suggest to solve this in a different manner. If you'd like to have the best of both >>>> worlds i.e. not miss a monitoring event and also not miss any notifications you should >>>> use a different strategy. >>> >>> >>> I think I disagree here. >>> >>> All rtnl_trylock() call sites must be very careful about what they are >>> doing. >>> >>> bonding is not, and we should fix this. >>> >>> Mahesh fix seems very reasonable. If you need something else, please >>> provide your own patch. >>> >>> When code is the following : >>> >>> if (!rtnl_trylock()) >>> return; >>> call_netdevice_notifiers(NETDEV_NOTIFY_PEERS, bond->dev); >>> rtnl_unlock(); >>> >>> Then, one must understand what happens if at this point rtnl_trylock() >>> never succeeds. >>> >>> Like, what happens if you remove the whole thing. >>> >>> If the code is not needed, remove it, instead of having flaky behavior. >>> >> >> I agree that it should be fixed and that would work, my only concern is that in >> rare cases this might lead to skewing of the monitoring/AD timers because with >> every failed attempt at obtaining rtnl it moves the associated timer with 1 tick >> ahead, because when it successfully obtains it then the timer gets re-armed with the >> full timeout. What I suggested has already been done actually by the slave array update >> which uses a work item of its own because of the rtnl update, so we could just pull all >> of the call sites that request rtnl in a single work item which checks some bits and >> calls the appropriate notifications or re-schedules itself if necessary, that would keep >> all the monitoring/AD timers closer to being correct. >> I can see that this would be a very rare case so I don't have a strong feeling about >> my argument, I just wanted to make sure it gets considered. If you and the others decide >> it's not worth it, then so be it. Also pulling all rtnl call sites in a single place >> looks cleaner to me instead of having the same logic in each work item's function. >> >> Cheers, >> Nik >> > >Well, of course that has its own problems of missed updates. Hrm.. Okay >let's leave it this way. Never mind my argument about the timer >skewing, consider only fixing the RCU problem. I don't see an issue with how this changes the notification processing. I was initially concerned with the 802.3ad state machine, but the actual state machine processing is skipped for the "accelerated" invocation that sends the notifier. I don't see that the state machine itself will become skewed due to delays unless rtnl_trylock fails continuously for an extended period (upwards of 2 seconds, I think, long enough to risk the 3 second LACPDU timeout if LACP rate is set to fast). When the "rtnl_trylock" business was added, the primary concern was deadlock avoidance, not timely delivery of the notifications. The deadlock still looks to be a concern; some paths come in with RTNL held and acquire the bond->mode_lock, the various monitoring functions are doing the opposite order. I recall another suggestion a while back to modify the bonding periodic functions to acquire RTNL for every pass (followed by the mode_lock) to eliminate the trylock, but I don't see that as preferrable due to the amount of spurious RTNL acquisitions. That said, it would be better to have a more deterministic system (e.g., moving the "needs RTNL" logic into its own work item without conditional locking), but as an incremental change to what's there now, I don't see a problem with this patch set. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications 2015-03-12 5:54 [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications Mahesh Bandewar 2015-03-12 12:09 ` Nikolay Aleksandrov @ 2015-03-12 19:20 ` David Miller 1 sibling, 0 replies; 7+ messages in thread From: David Miller @ 2015-03-12 19:20 UTC (permalink / raw) To: maheshb; +Cc: j.vosburgh, andy, vfalico, nikolay, maze, netdev, edumazet From: Mahesh Bandewar <maheshb@google.com> Date: Wed, 11 Mar 2015 22:54:33 -0700 > This patch series tries to address the issue discovered in various work- > queues and the way these handlers deal with the RTNL. Especially for > notification handling. If RTNL can not be acquired, these handlers ignore > sending notifications and just re-arm the timer. This could be very > problematic if the re-arm timer has larger value (e.g. in minutes). Any timeout selected is completely arbitrary, and if the RTNL semaphore is really being held for a non-trivial amount of time, you're going to just thrash and potentially even steal cycles and cache lines from the thing that's taking so long. Please do something more sane, perhaps similar to the netdev_run_todo() function which is invoked at rtnl_unlock() time. You'll need a different synchronization scheme that that which netdev_run_todo() uses, but that shouldn't be too hard. This way, when the RTNL mutex is dropped your notifications will get generated promptly, and without all of the thrashing side-effects and arbitrary timeouts. Thanks. ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-03-12 19:20 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-03-12 5:54 [PATCH net-next 0/4] bonding work-queues, try_rtnl() & notifications Mahesh Bandewar 2015-03-12 12:09 ` Nikolay Aleksandrov 2015-03-12 14:11 ` Eric Dumazet 2015-03-12 14:21 ` Nikolay Aleksandrov 2015-03-12 14:24 ` Nikolay Aleksandrov 2015-03-12 19:10 ` Jay Vosburgh 2015-03-12 19:20 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).