From: John Fastabend <john.fastabend@gmail.com>
To: Jiri Pirko <jiri@resnulli.us>
Cc: netdev@vger.kernel.org, davem@davemloft.net, idosch@mellanox.com,
eladr@mellanox.com, sfeldma@gmail.com, f.fainelli@gmail.com,
linux@roeck-us.net, vivien.didelot@savoirfairelinux.com,
andrew@lunn.ch, David.Laight@ACULAB.COM,
stephen@networkplumber.org
Subject: Re: [patch net-next v5 3/8] switchdev: allow caller to explicitly request attr_set as deferred
Date: Fri, 16 Oct 2015 19:11:07 -0700 [thread overview]
Message-ID: <5621AE3B.2030601@gmail.com> (raw)
In-Reply-To: <20151016082347.GC2194@nanopsycho.orion>
On 15-10-16 01:23 AM, Jiri Pirko wrote:
> Thu, Oct 15, 2015 at 05:21:22PM CEST, john.fastabend@gmail.com wrote:
>> On 15-10-14 10:40 AM, Jiri Pirko wrote:
>>> From: Jiri Pirko <jiri@mellanox.com>
>>>
>>> Caller should know if he can call attr_set directly (when holding RTNL)
>>> or if he has to defer the att_set processing for later.
>>>
>>> This also allows drivers to sleep inside attr_set and report operation
>>> status back to switchdev core. Switchdev core then warns if status is
>>> not ok, instead of silent errors happening in drivers.
>>>
>>> Benefit from newly introduced switchdev deferred ops infrastructure.
>>>
>>
>> A nit but the patch description should note your setting the defer bit
>> on the bridge set state.
>>
>>> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
>>> ---
>>> include/net/switchdev.h | 1 +
>>> net/bridge/br_stp.c | 3 +-
>>> net/switchdev/switchdev.c | 108 ++++++++++++++++++----------------------------
>>> 3 files changed, 46 insertions(+), 66 deletions(-)
>>>
>>> diff --git a/include/net/switchdev.h b/include/net/switchdev.h
>>> index d1c7f90..f7de6f8 100644
>>> --- a/include/net/switchdev.h
>>> +++ b/include/net/switchdev.h
>>> @@ -17,6 +17,7 @@
>>>
>>> #define SWITCHDEV_F_NO_RECURSE BIT(0)
>>> #define SWITCHDEV_F_SKIP_EOPNOTSUPP BIT(1)
>>> +#define SWITCHDEV_F_DEFER BIT(2)
>>>
>>> struct switchdev_trans_item {
>>> struct list_head list;
>>> diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
>>> index db6d243de..80c34d7 100644
>>> --- a/net/bridge/br_stp.c
>>> +++ b/net/bridge/br_stp.c
>>> @@ -41,13 +41,14 @@ void br_set_state(struct net_bridge_port *p, unsigned int state)
>>> {
>>> struct switchdev_attr attr = {
>>> .id = SWITCHDEV_ATTR_ID_PORT_STP_STATE,
>>> + .flags = SWITCHDEV_F_DEFER,
>>> .u.stp_state = state,
>>> };
>>
>>
>> This creates a possible race (with 6/8) I think, please check!
>
> Wait. This patch does not change the previous behaviour. Patch 6 does,
> so I don't understand why you are asking here. Confusing.
>
Sorry if its confusing I keyed of the addition of the SWITCHDEV_F_DEFER
here.
>
>>
>> In del_nbp() we call br_stp_disable_port() to set the port state
>> to BR_STATE_DISABLE and disabling learning events. But with this
>> patch it can be deferred. Also note the STP agent may be in userspace
>> which actually seems more likely the case because you likely want to
>> run some more modern variant of STP than the kernel supports.
>>
>> So at some point in the future the driver will turn off learning. At
>> the same time we call br_fdb_delete_by_port which calls a deferred
>> set of fdb deletes.
>>
>> I don't see how you guarantee learning is off before you start doing
>> the deletes here and possibly learning new addresses after the software
>> side believes the port is down.
>>
>> So
>>
>> br_stp_disable_port
>> br_fdb_delete_by_port
>> {fdb_del_external_learn}
>> [hw learns a fdb]
>> [hw disables learning]
>>
>> What stops this from happening?
>
> Okay. This behaviour is the same as without the patchset. What would
> resolve the issue it to put switchdev_deferred_process() after
> br_stp_disable_port() and before br_fdb_delete_by_port() call.
> That would enforce stp change to happen in hw before fdbs are explicitly
> deleted. Sound good to you?
OK so putting the switchdev_deferred_process() between the disable_port
and the delete_by_port will enforce the stp change to happen in hw
before the fdbs are explicitly deleted. I think this is minimally
required. I don't like scattering these flush_workqueue() calls all
over the place but I don't have any better ideas right now so sounds
good enough.
But now I'm wondering if you can have a deferred fdb add in the rocker
driver (rocker_port_fdb_learn_work) running in parallel with this that
could happen after the delete and add a bogus fdb entry. I think you
also need to have a flush in rocker_port_stp_update() to handle this
case.
Also I agree these issues were not completely caused by your patches.
Thanks,
John
next prev parent reply other threads:[~2015-10-17 2:11 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-14 17:40 [patch net-next v5 0/8] switchdev: change locking Jiri Pirko
2015-10-14 17:40 ` [patch net-next v5 1/8] switchdev: introduce switchdev deferred ops infrastructure Jiri Pirko
2015-10-14 17:40 ` [patch net-next v5 2/8] switchdev: make struct switchdev_attr parameter const for attr_set calls Jiri Pirko
2015-10-14 19:01 ` Vivien Didelot
2015-10-14 17:40 ` [patch net-next v5 3/8] switchdev: allow caller to explicitly request attr_set as deferred Jiri Pirko
2015-10-15 4:34 ` Scott Feldman
2015-10-15 5:57 ` Jiri Pirko
2015-10-15 15:21 ` John Fastabend
2015-10-16 8:23 ` Jiri Pirko
2015-10-16 16:20 ` Scott Feldman
2015-10-16 17:16 ` John Fastabend
2015-10-17 2:11 ` John Fastabend [this message]
2015-10-14 17:40 ` [patch net-next v5 4/8] switchdev: remove pointers from switchdev objects Jiri Pirko
2015-10-14 17:40 ` [patch net-next v5 5/8] switchdev: introduce possibility to defer obj_add/del Jiri Pirko
2015-10-19 7:55 ` Scott Feldman
2015-10-19 8:24 ` Jiri Pirko
2015-10-14 17:40 ` [patch net-next v5 6/8] bridge: defer switchdev fdb del call in fdb_del_external_learn Jiri Pirko
2015-10-14 17:40 ` [patch net-next v5 7/8] rocker: remove nowait from switchdev callbacks Jiri Pirko
2015-10-14 17:40 ` [patch net-next v5 8/8] switchdev: assert rtnl mutex when going over lower netdevs Jiri Pirko
2015-10-15 13:10 ` [patch net-next v5 0/8] switchdev: change locking David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5621AE3B.2030601@gmail.com \
--to=john.fastabend@gmail.com \
--cc=David.Laight@ACULAB.COM \
--cc=andrew@lunn.ch \
--cc=davem@davemloft.net \
--cc=eladr@mellanox.com \
--cc=f.fainelli@gmail.com \
--cc=idosch@mellanox.com \
--cc=jiri@resnulli.us \
--cc=linux@roeck-us.net \
--cc=netdev@vger.kernel.org \
--cc=sfeldma@gmail.com \
--cc=stephen@networkplumber.org \
--cc=vivien.didelot@savoirfairelinux.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.