* [PATCH v2 net] bonding: don't use stale speed and duplex information
@ 2016-02-08 20:10 Jay Vosburgh
2016-02-14 2:36 ` Ding Tianhong
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Jay Vosburgh @ 2016-02-08 20:10 UTC
To: netdev, Tantilov, Emil S, zhuyj
Cc: Veaceslav Falico, dingtianhong, Andy Gospodarek, David S. Miller
There is presently a race condition between the bonding periodic
link monitor and the updating of a slave's speed and duplex. The former
occurs on a periodic basis, and the latter in response to a driver's
calling of netif_carrier_on.

It is possible for the periodic monitor to run between the
driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
event that causes bonding to update the slave's speed and duplex. This
manifests most notably as a report that a slave is up and "0 Mbps full
duplex" after enslavement, but in principle could report an incorrect
speed and duplex after any link up event if the device comes up with a
different speed or duplex. This affects the 802.3ad aggregator
selection, as the speed and duplex are selection criteria.

This is fixed by updating the speed and duplex in the periodic
monitor, prior to using that information.

This was done historically in bonding, but the call to
bond_update_speed_duplex was removed in commit 876254ae2758 ("bonding:
don't call update_speed_duplex() under spinlocks"), as it might sleep
under lock. Later, the locking was changed to only hold RTNL, and so
after commit 876254ae2758 ("bonding: don't call update_speed_duplex()
under spinlocks") this call is again safe.

Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
---
v2: Correct Veaceslav's email address
Note: The "Fixes" commit is the commit that makes this operation safe
again, not the commit that originally introduced the race. I don't see
any simple way to resolve this bug between these two commits.
drivers/net/bonding/bond_main.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 56b560558884..cabaeb61333d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2127,6 +2127,7 @@ static void bond_miimon_commit(struct bonding *bond)
 			continue;
 
 		case BOND_LINK_UP:
+			bond_update_speed_duplex(slave);
 			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_NOW);
 			slave->last_link_up = jiffies;
--
1.9.1
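
The window described in the commit message can be sketched with a small userspace model in C. This is only an illustration under stated assumptions, not bonding code: every `model_*` name is hypothetical, the "hardware" speed is a plain field, and the event delivery is a direct function call. It shows the monitor acting on the cached value before the NETDEV_CHANGE stand-in has refreshed it, and how refreshing inside the commit step closes the window.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_slave {
	int hw_speed;     /* what the device would report if queried now (Mbps) */
	int cached_speed; /* the copy the bond keeps and bases decisions on */
	bool carrier;
};

/* Stand-in for the driver bringing the link up (netif_carrier_on). */
static void model_carrier_on(struct model_slave *s, int mbps)
{
	s->hw_speed = mbps;
	s->carrier = true;
}

/* Stand-in for the NETDEV_CHANGE handler refreshing the cached copy. */
static void model_netdev_change(struct model_slave *s)
{
	s->cached_speed = s->hw_speed;
}

/*
 * Stand-in for the monitor's commit step on a link-up transition.
 * Returns the speed the bond would act on.
 */
static int model_miimon_commit(struct model_slave *s, bool refresh_first)
{
	if (refresh_first)
		s->cached_speed = s->hw_speed; /* models the added call */
	return s->cached_speed;
}

/* Speed seen by the monitor when it runs before the event is delivered. */
int model_race(bool refresh_first)
{
	struct model_slave s = { .hw_speed = 0, .cached_speed = 0, .carrier = false };
	int seen;

	model_carrier_on(&s, 10000);
	seen = model_miimon_commit(&s, refresh_first);
	model_netdev_change(&s); /* arrives too late to help the monitor */
	return seen;
}

/* Usage: aborts if the model does not behave as described. */
void model_race_selftest(void)
{
	assert(model_race(false) == 0);     /* stale value, "0 Mbps" */
	assert(model_race(true) == 10000);  /* refreshed before use */
}
```

The model deliberately ignores locking and duplex; it is meant only to show the ordering problem.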
* Re: [PATCH v2 net] bonding: don't use stale speed and duplex information
2016-02-08 20:10 [PATCH v2 net] bonding: don't use stale speed and duplex information Jay Vosburgh
@ 2016-02-14 2:36 ` Ding Tianhong
2016-02-16 20:14 ` David Miller
2016-02-25 8:35 ` zhuyj
2 siblings, 0 replies; 10+ messages in thread
From: Ding Tianhong @ 2016-02-14 2:36 UTC
To: Jay Vosburgh, netdev, Tantilov, Emil S, zhuyj
Cc: Veaceslav Falico, Andy Gospodarek, David S. Miller
On 2016/2/9 4:10, Jay Vosburgh wrote:
>
> There is presently a race condition between the bonding periodic
> link monitor and the updating of a slave's speed and duplex. The former
> occurs on a periodic basis, and the latter in response to a driver's
> calling of netif_carrier_on.
>
> It is possible for the periodic monitor to run between the
> driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
> event that causes bonding to update the slave's speed and duplex. This
> manifests most notably as a report that a slave is up and "0 Mbps full
> duplex" after enslavement, but in principle could report an incorrect
> speed and duplex after any link up event if the device comes up with a
> different speed or duplex. This affects the 802.3ad aggregator
> selection, as the speed and duplex are selection criteria.
>
> This is fixed by updating the speed and duplex in the periodic
> monitor, prior to using that information.
>
> This was done historically in bonding, but the call to
> bond_update_speed_duplex was removed in commit 876254ae2758 ("bonding:
> don't call update_speed_duplex() under spinlocks"), as it might sleep
> under lock. Later, the locking was changed to only hold RTNL, and so
> after commit 876254ae2758 ("bonding: don't call update_speed_duplex()
> under spinlocks") this call is again safe.
>
> Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
> Cc: Veaceslav Falico <vfalico@gmail.com>
> Cc: dingtianhong <dingtianhong@huawei.com>
> Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
>
> ---
>
> v2: Correct Veaceslav's email address
>
> Note: The "Fixes" commit is the commit that makes this operation safe
> again, not the commit that originally introduced the race. I don't see
> any simple way to resolve this bug between these two commits.
>
> drivers/net/bonding/bond_main.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 56b560558884..cabaeb61333d 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2127,6 +2127,7 @@ static void bond_miimon_commit(struct bonding *bond)
> continue;
>
> case BOND_LINK_UP:
> + bond_update_speed_duplex(slave);
> bond_set_slave_link_state(slave, BOND_LINK_UP,
> BOND_SLAVE_NOTIFY_NOW);
> slave->last_link_up = jiffies;
>
* Re: [PATCH v2 net] bonding: don't use stale speed and duplex information
2016-02-08 20:10 [PATCH v2 net] bonding: don't use stale speed and duplex information Jay Vosburgh
2016-02-14 2:36 ` Ding Tianhong
@ 2016-02-16 20:14 ` David Miller
2016-02-18 20:25 ` Jay Vosburgh
2016-02-25 8:35 ` zhuyj
2 siblings, 1 reply; 10+ messages in thread
From: David Miller @ 2016-02-16 20:14 UTC
To: jay.vosburgh
Cc: netdev, emil.s.tantilov, zyjzyj2000, vfalico, dingtianhong, gospo
From: Jay Vosburgh <jay.vosburgh@canonical.com>
Date: Mon, 08 Feb 2016 12:10:02 -0800
> There is presently a race condition between the bonding periodic
> link monitor and the updating of a slave's speed and duplex. The former
> occurs on a periodic basis, and the latter in response to a driver's
> calling of netif_carrier_on.
>
> It is possible for the periodic monitor to run between the
> driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
> event that causes bonding to update the slave's speed and duplex. This
> manifests most notably as a report that a slave is up and "0 Mbps full
> duplex" after enslavement, but in principle could report an incorrect
> speed and duplex after any link up event if the device comes up with a
> different speed or duplex. This affects the 802.3ad aggregator
> selection, as the speed and duplex are selection criteria.
>
> This is fixed by updating the speed and duplex in the periodic
> monitor, prior to using that information.
>
> This was done historically in bonding, but the call to
> bond_update_speed_duplex was removed in commit 876254ae2758 ("bonding:
> don't call update_speed_duplex() under spinlocks"), as it might sleep
> under lock. Later, the locking was changed to only hold RTNL, and so
> after commit 876254ae2758 ("bonding: don't call update_speed_duplex()
> under spinlocks") this call is again safe.
>
> Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
> Cc: Veaceslav Falico <vfalico@gmail.com>
> Cc: dingtianhong <dingtianhong@huawei.com>
> Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Applied, thanks Jay.
* Re: [PATCH v2 net] bonding: don't use stale speed and duplex information
2016-02-16 20:14 ` David Miller
@ 2016-02-18 20:25 ` Jay Vosburgh
2016-02-18 20:27 ` David Miller
0 siblings, 1 reply; 10+ messages in thread
From: Jay Vosburgh @ 2016-02-18 20:25 UTC
To: David Miller
Cc: netdev, emil.s.tantilov, zyjzyj2000, vfalico, dingtianhong, gospo
David Miller <davem@davemloft.net> wrote:
[...]
>> This was done historically in bonding, but the call to
>> bond_update_speed_duplex was removed in commit 876254ae2758 ("bonding:
>> don't call update_speed_duplex() under spinlocks"), as it might sleep
>> under lock. Later, the locking was changed to only hold RTNL, and so
>> after commit 876254ae2758 ("bonding: don't call update_speed_duplex()
>> under spinlocks") this call is again safe.
>>
>> Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
>> Cc: Veaceslav Falico <vfalico@gmail.com>
>> Cc: dingtianhong <dingtianhong@huawei.com>
>> Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
>> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
>
>Applied, thanks Jay.
Rereading the above, I just noticed that I put the wrong commit
into the fixes tag (and the "Later, the locking was changed" text); the
correct fixes tag should be:
Fixes: 4cb4f97b7e36 ("bonding: rebuild the lock use for bond_mii_monitor()")
Kernels between 876254ae2758 and 4cb4f97b7e36 should not have
this patch applied, as it might sleep under lock.
Sorry for the error,
-J
---
-Jay Vosburgh, jay.vosburgh@canonical.com
* Re: [PATCH v2 net] bonding: don't use stale speed and duplex information
2016-02-18 20:25 ` Jay Vosburgh
@ 2016-02-18 20:27 ` David Miller
0 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2016-02-18 20:27 UTC
To: jay.vosburgh
Cc: netdev, emil.s.tantilov, zyjzyj2000, vfalico, dingtianhong, gospo
From: Jay Vosburgh <jay.vosburgh@canonical.com>
Date: Thu, 18 Feb 2016 12:25:52 -0800
> David Miller <davem@davemloft.net> wrote:
> [...]
>>> This was done historically in bonding, but the call to
>>> bond_update_speed_duplex was removed in commit 876254ae2758 ("bonding:
>>> don't call update_speed_duplex() under spinlocks"), as it might sleep
>>> under lock. Later, the locking was changed to only hold RTNL, and so
>>> after commit 876254ae2758 ("bonding: don't call update_speed_duplex()
>>> under spinlocks") this call is again safe.
>>>
>>> Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
>>> Cc: Veaceslav Falico <vfalico@gmail.com>
>>> Cc: dingtianhong <dingtianhong@huawei.com>
>>> Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
>>> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
>>
>>Applied, thanks Jay.
>
> Rereading the above, I just noticed that I put the wrong commit
> into the fixes tag (and the "Later, the locking was changed" text); the
> correct fixes tag should be:
>
> Fixes: 4cb4f97b7e36 ("bonding: rebuild the lock use for bond_mii_monitor()")
>
> Kernels between 876254ae2758 and 4cb4f97b7e36 should not have
> this patch applied, as it might sleep under lock.
>
> Sorry for the error,
Ok, thanks for the info.
* Re: [PATCH v2 net] bonding: don't use stale speed and duplex information
2016-02-08 20:10 [PATCH v2 net] bonding: don't use stale speed and duplex information Jay Vosburgh
2016-02-14 2:36 ` Ding Tianhong
2016-02-16 20:14 ` David Miller
@ 2016-02-25 8:35 ` zhuyj
2016-02-25 13:33 ` Jay Vosburgh
2 siblings, 1 reply; 10+ messages in thread
From: zhuyj @ 2016-02-25 8:35 UTC
To: Jay Vosburgh, netdev, Tantilov, Emil S
Cc: Veaceslav Falico, dingtianhong, Andy Gospodarek, David S. Miller
On 02/09/2016 04:10 AM, Jay Vosburgh wrote:
> There is presently a race condition between the bonding periodic
> link monitor and the updating of a slave's speed and duplex. The former
> occurs on a periodic basis, and the latter in response to a driver's
> calling of netif_carrier_on.
>
> It is possible for the periodic monitor to run between the
> driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
> event that causes bonding to update the slave's speed and duplex. This
> manifests most notably as a report that a slave is up and "0 Mbps full
> duplex" after enslavement, but in principle could report an incorrect
> speed and duplex after any link up event if the device comes up with a
> different speed or duplex. This affects the 802.3ad aggregator
> selection, as the speed and duplex are selection criteria.
>
> This is fixed by updating the speed and duplex in the periodic
> monitor, prior to using that information.
>
> This was done historically in bonding, but the call to
> bond_update_speed_duplex was removed in commit 876254ae2758 ("bonding:
> don't call update_speed_duplex() under spinlocks"), as it might sleep
> under lock. Later, the locking was changed to only hold RTNL, and so
> after commit 876254ae2758 ("bonding: don't call update_speed_duplex()
> under spinlocks") this call is again safe.
>
> Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
> Cc: Veaceslav Falico <vfalico@gmail.com>
> Cc: dingtianhong <dingtianhong@huawei.com>
> Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
>
> ---
>
> v2: Correct Veaceslav's email address
>
> Note: The "Fixes" commit is the commit that makes this operation safe
> again, not the commit that originally introduced the race. I don't see
> any simple way to resolve this bug between these two commits.
>
> drivers/net/bonding/bond_main.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 56b560558884..cabaeb61333d 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -2127,6 +2127,7 @@ static void bond_miimon_commit(struct bonding *bond)
> continue;
>
> case BOND_LINK_UP:
> + bond_update_speed_duplex(slave);
> bond_set_slave_link_state(slave, BOND_LINK_UP,
> BOND_SLAVE_NOTIFY_NOW);
> slave->last_link_up = jiffies;
Hi, Jay
Thanks for your patch.
I delved into the source code and Emil's tests. I think that the problem
this patch is meant to fix occurs very rarely.
Do you agree?
If so, maybe the following patch can reduce the performance loss.
Please comment on it. Thanks a lot.
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index b7f1a99..c4c511a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2129,7 +2129,9 @@ static void bond_miimon_commit(struct bonding *bond)
 			continue;
 
 		case BOND_LINK_UP:
-			bond_update_speed_duplex(slave);
+			if (slave->speed == SPEED_UNKNOWN)
+				bond_update_speed_duplex(slave);
+
 			bond_set_slave_link_state(slave, BOND_LINK_UP,
 						  BOND_SLAVE_NOTIFY_NOW);
 			slave->last_link_up = jiffies;
Best Regards!
Zhu Yanjun
* Re: [PATCH v2 net] bonding: don't use stale speed and duplex information
2016-02-25 8:35 ` zhuyj
@ 2016-02-25 13:33 ` Jay Vosburgh
2016-02-26 2:21 ` zhuyj
0 siblings, 1 reply; 10+ messages in thread
From: Jay Vosburgh @ 2016-02-25 13:33 UTC
To: zhuyj
Cc: netdev, Tantilov, Emil S, Veaceslav Falico, dingtianhong,
Andy Gospodarek, David S. Miller
zhuyj <zyjzyj2000@gmail.com> wrote:
[...]
>I delved into the source code and Emil's tests. I think that the problem
>that this patch expects to fix occurs very unusually.
>
>Do you agree with me?
>
>If so, maybe the following patch can reduce the performance loss.
>Please comment on it. Thanks a lot.
>
>
>diff --git a/drivers/net/bonding/bond_main.c
>b/drivers/net/bonding/bond_main.c
>index b7f1a99..c4c511a 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2129,7 +2129,9 @@ static void bond_miimon_commit(struct bonding *bond)
> continue;
>
> case BOND_LINK_UP:
>- bond_update_speed_duplex(slave);
>+ if (slave->speed == SPEED_UNKNOWN)
>+ bond_update_speed_duplex(slave);
>+
> bond_set_slave_link_state(slave, BOND_LINK_UP,
>BOND_SLAVE_NOTIFY_NOW);
> slave->last_link_up = jiffies;
I don't believe the speed is necessarily SPEED_UNKNOWN coming in
here. If the race occurs at a time later than the initial enslavement,
speed may already be set (and the race manifests if the new speed
changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec), so I don't
think this is functionally correct.
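
As a rough illustration of the objection above, the two policies can be compared in a few lines of userspace C. This is a sketch only: the function names are hypothetical, `MODEL_SPEED_UNKNOWN` merely stands in for the kernel's SPEED_UNKNOWN, and speeds are plain integers in Mbps. It assumes the cached value at a later link-up is whatever the previous link negotiated.

```c
#include <assert.h>

#define MODEL_SPEED_UNKNOWN (-1) /* stand-in for the kernel's SPEED_UNKNOWN */

/* Refresh only when the cached value is unknown (the proposed guard). */
int model_guarded_refresh(int cached, int hw)
{
	return cached == MODEL_SPEED_UNKNOWN ? hw : cached;
}

/* Refresh unconditionally (the behaviour of the applied patch). */
int model_unconditional_refresh(int cached, int hw)
{
	(void)cached; /* the old value is never trusted */
	return hw;
}

/* Usage: aborts if the model does not behave as described. */
void model_guard_selftest(void)
{
	/* First link-up with nothing cached: both policies agree. */
	assert(model_guarded_refresh(MODEL_SPEED_UNKNOWN, 1000) == 1000);
	assert(model_unconditional_refresh(MODEL_SPEED_UNKNOWN, 1000) == 1000);

	/* Link returns at 10 Gb/sec while 1 Gb/sec is still cached. */
	assert(model_guarded_refresh(1000, 10000) == 1000);        /* stale */
	assert(model_unconditional_refresh(1000, 10000) == 10000); /* fresh */
}
```

In this model the guard only helps when nothing has been cached yet; any previously valid speed defeats it.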
Also, the call to bond_miimon_commit itself is already gated by
bond_miimon_inspect finding a link state change. The performance impact
here should be minimal.
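
The gating argument can likewise be sketched in userspace C. The names here are hypothetical, and the model assumes the simplest possible monitor: one carrier sample per polling tick, no updelay or downdelay, and a commit step that runs only on ticks where the sample differs from the recorded link state. Under those assumptions the number of speed refreshes equals the number of link transitions, not the number of ticks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Count how many times the commit step (and hence the speed refresh)
 * would run over a series of polling ticks.
 */
size_t model_count_commits(const bool *carrier, size_t ticks, bool initial_up)
{
	bool link_up = initial_up;
	size_t commits = 0;

	for (size_t i = 0; i < ticks; i++) {
		if (carrier[i] != link_up) {  /* inspect found a change */
			link_up = carrier[i]; /* commit records the new state */
			commits++;
		}
	}
	return commits;
}

/* Usage: eight ticks containing three transitions give three commits. */
void model_gating_selftest(void)
{
	const bool samples[] = { false, true, true, true, true, false, true, true };

	assert(model_count_commits(samples, 8, false) == 3);
}
```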
-J
---
-Jay Vosburgh, jay.vosburgh@canonical.com
* Re: [PATCH v2 net] bonding: don't use stale speed and duplex information
2016-02-25 13:33 ` Jay Vosburgh
@ 2016-02-26 2:21 ` zhuyj
2016-02-29 5:39 ` Jay Vosburgh
0 siblings, 1 reply; 10+ messages in thread
From: zhuyj @ 2016-02-26 2:21 UTC
To: Jay Vosburgh
Cc: netdev, Tantilov, Emil S, Veaceslav Falico, dingtianhong,
Andy Gospodarek, David S. Miller
On 02/25/2016 09:33 PM, Jay Vosburgh wrote:
> zhuyj <zyjzyj2000@gmail.com> wrote:
> [...]
>> I delved into the source code and Emil's tests. I think that the problem
>> that this patch expects to fix occurs very unusually.
>>
>> Do you agree with me?
>>
>> If so, maybe the following patch can reduce the performance loss.
>> Please comment on it. Thanks a lot.
>>
>>
>> diff --git a/drivers/net/bonding/bond_main.c
>> b/drivers/net/bonding/bond_main.c
>> index b7f1a99..c4c511a 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -2129,7 +2129,9 @@ static void bond_miimon_commit(struct bonding *bond)
>> continue;
>>
>> case BOND_LINK_UP:
>> - bond_update_speed_duplex(slave);
>> + if (slave->speed == SPEED_UNKNOWN)
>> + bond_update_speed_duplex(slave);
>> +
>> bond_set_slave_link_state(slave, BOND_LINK_UP,
>> BOND_SLAVE_NOTIFY_NOW);
>> slave->last_link_up = jiffies;
> I don't believe the speed is necessarily SPEED_UNKNOWN coming in
> here. If the race occurs at a time later than the initial enslavement,
> speed may already be set (and the race manifests if the new speed
> changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec), so I don't
> think this is functionally correct.
Hi, Jay
Thanks for your reply.
IMHO, "If the race occurs at a time later than the initial enslavement,
speed may already be set (and the race manifests if the new speed
changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec)", from my test,
this will not happen because the previous source code make the speed
correct.
This "bond_update_speed_duplex" call only repeats the work of getting
the correct speed.
That is, this patch fixes the error at initial enslavement. The
scenario you mention will not occur.
Even though the performance impact is minimal, if we can avoid it,
why not?
Best Regards!
Zhu Yanjun
>
> Also, the call to bond_miimon_commit itself is already gated by
> bond_miimon_inspect finding a link state change. The performance impact
> here should be minimal.
>
> -J
>
> ---
> -Jay Vosburgh, jay.vosburgh@canonical.com
* Re: [PATCH v2 net] bonding: don't use stale speed and duplex information
2016-02-26 2:21 ` zhuyj
@ 2016-02-29 5:39 ` Jay Vosburgh
2016-02-29 6:41 ` zhuyj
0 siblings, 1 reply; 10+ messages in thread
From: Jay Vosburgh @ 2016-02-29 5:39 UTC
To: zhuyj
Cc: netdev, Tantilov, Emil S, Veaceslav Falico, dingtianhong,
Andy Gospodarek, David S. Miller
zhuyj <zyjzyj2000@gmail.com> wrote:
>On 02/25/2016 09:33 PM, Jay Vosburgh wrote:
>> zhuyj <zyjzyj2000@gmail.com> wrote:
>> [...]
>>> I delved into the source code and Emil's tests. I think that the problem
>>> that this patch expects to fix occurs very unusually.
>>>
>>> Do you agree with me?
>>>
>>> If so, maybe the following patch can reduce the performance loss.
>>> Please comment on it. Thanks a lot.
>>>
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c
>>> b/drivers/net/bonding/bond_main.c
>>> index b7f1a99..c4c511a 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -2129,7 +2129,9 @@ static void bond_miimon_commit(struct bonding *bond)
>>> continue;
>>>
>>> case BOND_LINK_UP:
>>> - bond_update_speed_duplex(slave);
>>> + if (slave->speed == SPEED_UNKNOWN)
>>> + bond_update_speed_duplex(slave);
>>> +
>>> bond_set_slave_link_state(slave, BOND_LINK_UP,
>>> BOND_SLAVE_NOTIFY_NOW);
>>> slave->last_link_up = jiffies;
>> I don't believe the speed is necessarily SPEED_UNKNOWN coming in
>> here. If the race occurs at a time later than the initial enslavement,
>> speed may already be set (and the race manifests if the new speed
>> changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec), so I don't
>> think this is functionally correct.
>Hi, Jay
>
>Thanks for your reply.
>
>IMHO, "If the race occurs at a time later than the initial enslavement,
>speed may already be set (and the race manifests if the new speed
>changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec)", from my test,
>this will not happen because the previous source code make the speed
>correct.
How, exactly, will "the previous source code make the speed
correct"?
>This "bond_update_speed_duplex" repeats to get the correct speed.
>
>That is, this patch is to fix the error in initial enslavement. The
>mentioned scenario will not occur.
I see nothing in the code that limits the race to happening only
at enslavement time.
If the bond_mii_monitor call executes between the device going
link up and the arrival of the NETDEV_CHANGE or NETDEV_UP callback, the
stored speed and duplex are stale. The stale speed value is not
guaranteed to be SPEED_UNKNOWN, so your patch is not functionally
correct.
-J
>Even though the performance impact is minimal, if we can avoid this
>performance
>impact, why not ?
>
>Best Regards!
>Zhu Yanjun
>
>>
>> Also, the call to bond_miimon_commit itself is already gated by
>> bond_miimon_inspect finding a link state change. The performance impact
>> here should be minimal.
>>
>> -J
---
-Jay Vosburgh, jay.vosburgh@canonical.com
* Re: [PATCH v2 net] bonding: don't use stale speed and duplex information
2016-02-29 5:39 ` Jay Vosburgh
@ 2016-02-29 6:41 ` zhuyj
0 siblings, 0 replies; 10+ messages in thread
From: zhuyj @ 2016-02-29 6:41 UTC
To: Jay Vosburgh
Cc: netdev, Tantilov, Emil S, Veaceslav Falico, dingtianhong,
Andy Gospodarek, David S. Miller
On 02/29/2016 01:39 PM, Jay Vosburgh wrote:
> zhuyj <zyjzyj2000@gmail.com> wrote:
>
>> On 02/25/2016 09:33 PM, Jay Vosburgh wrote:
>>> zhuyj <zyjzyj2000@gmail.com> wrote:
>>> [...]
>>>> I delved into the source code and Emil's tests. I think that the problem
>>>> that this patch expects to fix occurs very unusually.
>>>>
>>>> Do you agree with me?
>>>>
>>>> If so, maybe the following patch can reduce the performance loss.
>>>> Please comment on it. Thanks a lot.
>>>>
>>>>
>>>> diff --git a/drivers/net/bonding/bond_main.c
>>>> b/drivers/net/bonding/bond_main.c
>>>> index b7f1a99..c4c511a 100644
>>>> --- a/drivers/net/bonding/bond_main.c
>>>> +++ b/drivers/net/bonding/bond_main.c
>>>> @@ -2129,7 +2129,9 @@ static void bond_miimon_commit(struct bonding *bond)
>>>> continue;
>>>>
>>>> case BOND_LINK_UP:
>>>> - bond_update_speed_duplex(slave);
>>>> + if (slave->speed == SPEED_UNKNOWN)
>>>> + bond_update_speed_duplex(slave);
>>>> +
>>>> bond_set_slave_link_state(slave, BOND_LINK_UP,
>>>> BOND_SLAVE_NOTIFY_NOW);
>>>> slave->last_link_up = jiffies;
>>> I don't believe the speed is necessarily SPEED_UNKNOWN coming in
>>> here. If the race occurs at a time later than the initial enslavement,
>>> speed may already be set (and the race manifests if the new speed
>>> changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec), so I don't
>>> think this is functionally correct.
>> Hi, Jay
>>
>> Thanks for your reply.
>>
>> IMHO, "If the race occurs at a time later than the initial enslavement,
>> speed may already be set (and the race manifests if the new speed
>> changes, i.e., the link changes from 1 Gb/sec to 10 Gb/sec)", from my test,
>> this will not happen because the previous source code make the speed
>> correct.
> How, exactly, will "the previous source code make the speed
> correct"?
>
>> This "bond_update_speed_duplex" repeats to get the correct speed.
>>
>> That is, this patch is to fix the error in initial enslavement. The
>> mentioned scenario will not occur.
> I see nothing in the code that limits the race to happening only
> at enslavement time.
>
> If the bond_mii_monitor call executes between the device going
> link up and the arrival of the NETDEV_CHANGE or NETDEV_UP callback, the
> stored speed and duplex are stale. The stale speed value is not
> guaranteed to be SPEED_UNKNOWN, so your patch is not functionally
> correct.
Hi, Jay
The speed is updated in the function bond_slave_netdev_event.
Best Regards!
Zhu Yanjun
>
> -J
>
>> Even though the performance impact is minimal, if we can avoid this
>> performance
>> impact, why not ?
>>
>> Best Regards!
>> Zhu Yanjun
>>
>>> Also, the call to bond_miimon_commit itself is already gated by
>>> bond_miimon_inspect finding a link state change. The performance impact
>>> here should be minimal.
>>>
>>> -J
> ---
> -Jay Vosburgh, jay.vosburgh@canonical.com