* Re: [PATCH 1/6] bridge: add support to parse multiple vlan info attributes in IFLA_AF_SPEC
From: Scott Feldman @ 2014-12-30 5:31 UTC (permalink / raw)
To: roopa; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <54A2374F.6050901@cumulusnetworks.com>
On Mon, Dec 29, 2014 at 9:25 PM, roopa <roopa@cumulusnetworks.com> wrote:
> On 12/29/14, 4:26 PM, Scott Feldman wrote:
>>
>> On Mon, Dec 29, 2014 at 4:07 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>
>>> On Mon, Dec 29, 2014 at 3:47 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>>
>>>> On Mon, Dec 29, 2014 at 2:10 PM, roopa <roopa@cumulusnetworks.com>
>>>> wrote:
>>>>>
>>>>> On 12/29/14, 1:40 PM, Scott Feldman wrote:
>>>>>>
>>>>>> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>>>>>>>
>>>>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>>>>
>>>>>>> This patch changes bridge IFLA_AF_SPEC netlink attribute parser to
>>>>>>> look for more than one IFLA_BRIDGE_VLAN_INFO attribute. This allows
>>>>>>> userspace to pack more than one vlan in the setlink msg.
>>>>>>>
>>>>>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>>>> ---
>>>>>>> net/bridge/br_netlink.c | 18 +++++++++---------
>>>>>>> 1 file changed, 9 insertions(+), 9 deletions(-)
>>>>>>>
>>>>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>>>>>> index 9f5eb55..75971b1 100644
>>>>>>> --- a/net/bridge/br_netlink.c
>>>>>>> +++ b/net/bridge/br_netlink.c
>>>>>>> @@ -230,18 +230,18 @@ static int br_afspec(struct net_bridge *br,
>>>>>>> struct nlattr *af_spec,
>>>>>>> int cmd)
>>>>>>> {
>>>>>>> - struct nlattr *tb[IFLA_BRIDGE_MAX+1];
>>>>>>> + struct bridge_vlan_info *vinfo;
>>>>>>> int err = 0;
>>>>>>> + struct nlattr *attr;
>>>>>>> + int err = 0;
>>>>>>> + int rem;
>>>>>>> + u16 vid;
>>>>>>>
>>>>>>> - err = nla_parse_nested(tb, IFLA_BRIDGE_MAX, af_spec,
>>>>>>> ifla_br_policy);
>>>>>>
>>>>>> Removing this call orphans ifla_br_policy...should ifla_br_policy be
>>>>>> removed?
>>>>>
>>>>>
>>>>> good question. Its a good place to see the type. In-fact userspace
>>>>> programs
>>>>> also copy the same policy to parse netlink attributes. hmmm..
>>>>> I would like to keep it if it does not throw a warning.
>>>>
>>>> I don't know what the policy (sorry, no pun intended) on leaving dead
>>>> code. I say remove it.
>>>
>>> You know, not using the policy seems like a step backwards, and maybe
>>> it suggests a problem with the attr packing.
>>>
>>> We had:
>>>
>>> ifla_br_policy
>>> IFLA_BRIDGE_FLAGS
>>> IFLA_BRIDGE_MODE
>>> IFLA_BRIDGE_VLAN_INFO
>>>
>>> This patch set makes it:
>>>
>>> ifla_br_policy
>>> IFLA_BRIDGE_FLAGS
>>> IFLA_BRIDGE_MODE
>>> IFLA_BRIDGE_VLAN_INFO
>>> IFLA_BRIDGE_VLAN_RANGE_INFO
>>>
>>> Which is fine, but now VLAN_INFO and VLAN_RANGE_INFO can be repeated.
>>> I think you want some nesting to clarify:
>>>
>>> ifla_br_policy
>>> IFLA_BRIDGE_FLAGS
>>> IFLA_BRIDGE_MODE
>>> IFLA_BRIDGE_VLAN_INFO
>>> IFLA_BRIDGE_VLAN_LIST_INFO // nested array of
>>> IFLA_BRIDGE_VLAN_INFO
>>> IFLA_BRIDGE_VLAN_RANGE_INFO
>>>
>>> Now you can keep the policy for the top-level parsing, and loop only
>>> on the nested array VLAN_LIST_INFO. Actually, now you can use just
>>> RANGE_INFO in array and have:
>>>
>>> ifla_br_policy
>>> IFLA_BRIDGE_FLAGS
>>> IFLA_BRIDGE_MODE
>>> IFLA_BRIDGE_VLAN_INFO
>>> IFLA_BRIDGE_VLAN_LIST_INFO // nested array of
>>> IFLA_BRIDGE_VLAN_RANGE_INFO
>>>
>>> And use VLAN_RANGE_INFO for both ranges of vids as well as single
>>> vids. That'll simplify your filling algo in patch 5.
>>
>> Hmmmm...do you even need VLAN_RANGE_INFO? How about just using
>> existing VLAN_INFO and add some more flags:
>>
>> #define BRIDGE_VLAN_INFO_RANGE_START (1<<3)
>> #define BRIDGE_VLAN_INFO_RANGE_END (1<<4)
>>
>> Now you can have:
>>
>> IFLA_BRIDGE_FLAGS
>> IFLA_BRIDGE_MODE
>> IFLA_BRIDGE_VLAN_INFO
>> IFLA_BRIDGE_VLAN_INFO_LIST // nested array of
>> IFLA_BRIDGE_VLAN_INFO
>>
>> Don't set START or END for single vids in list.
>
>
> ok. I was debating yesterday about introducing another nest. This looks
> good.
> My only reason to not use existing IFLA_BRIDGE_VLAN_INFO was to make sure it
> works for existing users.
> I see that in this case since IFLA_BRIDGE_VLAN_INFO_LIST is new, it will not
> affect existing users.
>
> But, i cant use IFLA_BRIDGE_VLAN_INFO (ie an attribute in ifla_br_policy)
> under IFLA_BRIDGE_VLAN_INFO_LIST ?.
I don't see why not. You're not going to parse the array with a
policy anyway (since it's an array). And attr INFO_LIST will be .type
= NLA_NESTED in ifla_br_policy.
>
> IFLA_BRIDGE_VLAN_INFO_LIST will have its own policy and its own attributes.
>
> Which will make it look something like below ?
>
> IFLA_BRIDGE_FLAGS
> IFLA_BRIDGE_MODE
> IFLA_BRIDGE_VLAN_INFO
> IFLA_BRIDGE_VLAN_INFO_LIST // nested array of
> IFLA_BRIDGE_VLAN_INFO_LIST_ENTRY
>
>
> Thanks,
> Roopa
^ permalink raw reply
* Re: [PATCH 5/6] bridge: new function to pack vlans using both IFLA_BRIDGE_VLAN_INFO and IFLA_BRIDGE_VLAN_RANGE_INFO
From: roopa @ 2014-12-30 5:31 UTC (permalink / raw)
To: Scott Feldman; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson kok
In-Reply-To: <CAE4R7bC63CpvpAuakFvEtEJvOsowjP40q1cs2eh0R=bf-F6w=Q@mail.gmail.com>
On 12/29/14, 3:25 PM, Scott Feldman wrote:
> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> This patch adds new function to compress vlans into ranges.
>> Vlans are compressed into ranges only if the fill request is called with
>> RTEXT_FILTER_BRVLAN_COMPRESSED in filtermask.
>>
>> Old vlan packing code is moved to a new function and continues to be
>> called when filter_mask is RTEXT_FILTER_BRVLAN
>>
>> Signed-off-by: Wilson kok <wkok@cumulusnetworks.com>
>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>> ---
>> net/bridge/br_netlink.c | 157 +++++++++++++++++++++++++++++++++++++++++------
>> 1 file changed, 137 insertions(+), 20 deletions(-)
>>
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index 4c47ba0..16bdd5a 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -67,6 +67,133 @@ static int br_port_fill_attrs(struct sk_buff *skb,
>> return 0;
>> }
>>
>> +static int br_fill_ifvlaninfo_bitmap(struct sk_buff *skb,
>> + const unsigned long *vlan_bmp, u16 flags)
>> +{
>> + struct bridge_vlan_range_info vinfo_range;
>> + struct bridge_vlan_info vinfo;
>> + u16 vid, start = 0, end = 0;
> One per line?
ack
>> + u16 pvid;
> Aren't you getting an "unused" compile warning on this ^^^?
will double check..
>> +
>> + /* handle the untagged */
> This comment ^^^ doesn't make sense here.
>
>> + for_each_set_bit(vid, vlan_bmp, VLAN_N_VID) {
>> + if (start == 0) {
>> + start = vid;
>> + end = vid;
>> + }
>> + if ((vid - end) > 1) {
>> + memset(&vinfo_range, 0, sizeof(vinfo_range));
>> + vinfo_range.flags |= flags;
>> + vinfo_range.vid = start;
>> + vinfo_range.vid_end = end;
>> + if (nla_put(skb, IFLA_BRIDGE_VLAN_RANGE_INFO,
>> + sizeof(vinfo_range), &vinfo_range))
>> + goto nla_put_failure;
>> + start = vid;
>> + end = vid;
>> + } else {
>> + end = vid;
>> + }
>> + }
> What happens with this set {1-10, 12, 20-25, 30}? On vid 12, will
> that put a VLAN_RANGE_INFO with start=end=12? Seems strange to use
> RANGE_INFO for single vlan.
>
> Can the algorithm be simplified? Maybe there are other examples in
> the kernel of finding ranges/singles from a bitmap we could borrow
> code from?
let me see...
>> + if (start != 0 && end != 0) {
>> + if (start != end) {
>> + memset(&vinfo_range, 0, sizeof(vinfo_range));
>> + vinfo_range.flags |= flags;
>> + vinfo_range.vid = start;
>> + vinfo_range.vid_end = end;
>> + if (nla_put(skb, IFLA_BRIDGE_VLAN_RANGE_INFO,
>> + sizeof(vinfo_range), &vinfo_range))
>> + goto nla_put_failure;
>> + } else {
>> + memset(&vinfo, 0, sizeof(vinfo));
>> + vinfo.flags |= flags;
>> + vinfo.vid = start;
>> + if (nla_put(skb, IFLA_BRIDGE_VLAN_INFO,
>> + sizeof(vinfo), &vinfo))
>> + goto nla_put_failure;
> How many IFLA_BRIDGE_VLAN_INFOs can you fit into skb? It seems the
> worst case is you'll have 4094/2 = 2047 VLAN_INFOs if using all
> even-numbered vids, for example. Will that fit? I guess the original
> code had the same concern...wonder if anyone checked this worst-case?
I will check this again. Remember doing worst case tests for setlinks.
will get back with some worst cases tests in the next submission.
>
>> + }
>> + }
>> +
>> +nla_put_failure:
>> + return -EMSGSIZE;
>> +}
>> +
>> +static int br_fill_ifvlaninfo_compressed(struct sk_buff *skb,
>> + const struct net_port_vlans *pv)
>> +{
>> + unsigned long vlan_bmp_copy[BR_VLAN_BITMAP_LEN];
>> + unsigned long untagged_bmp_copy[BR_VLAN_BITMAP_LEN];
> Lots of automatic space on the stack...can you use dynamic mem
> (kzalloc) for these?
Let me see. Or make it a static global ?
>
>> + struct bridge_vlan_range_info vinfo_range;
>> + struct bridge_vlan_info vinfo;
>> + u16 pvid;
>> +
>> + memset(vlan_bmp_copy, 0,
>> + sizeof(unsigned long) * BR_VLAN_BITMAP_LEN);
>> + bitmap_copy(vlan_bmp_copy, pv->vlan_bitmap, VLAN_N_VID);
>> +
>> + memset(untagged_bmp_copy, 0,
>> + sizeof(unsigned long) * BR_VLAN_BITMAP_LEN);
>> + bitmap_copy(untagged_bmp_copy, pv->untagged_bitmap, VLAN_N_VID);
>> +
>> + /* send the pvid separately first */
>> + pvid = br_get_pvid(pv);
>> + if (pvid && (pvid != VLAN_N_VID)) {
>> + memset(&vinfo, 0, sizeof(vinfo));
>> + vinfo.flags |= BRIDGE_VLAN_INFO_PVID;
>> + if (test_bit(pvid, untagged_bmp_copy)) {
>> + vinfo.flags |= BRIDGE_VLAN_INFO_UNTAGGED;
>> + clear_bit(pvid, untagged_bmp_copy);
>> + }
>> + clear_bit(pvid, vlan_bmp_copy);
>> + vinfo.vid = pvid;
>> + if (nla_put(skb, IFLA_BRIDGE_VLAN_INFO,
>> + sizeof(vinfo), &vinfo))
>> + goto nla_put_failure;
>> + }
> What if pvid is not in vlan_bmp_copy? This statement block seems
> slightly different than the original logic where BRIDGE_VLAN_INFO_PVID
> was set only when looping thru vlan_bitmap and vid == pvid. Now can
> you send a IFLA_BRIDGE_VLAN_INFO with pvid even if pvid isn't in
> vlan_bitmap?
yes, you can AFAIK. pvid is usually untagged.
> Maybe pvid is always in vlan_bitmap, so it doesn't
> matter. But there is a subtle logic change here, so something to
> consider.
Understand what you are saying, I will check this with v2.
>> +
>> + /* fill untagged vlans */
>> + br_fill_ifvlaninfo_bitmap(skb, untagged_bmp_copy,
>> + BRIDGE_VLAN_INFO_UNTAGGED);
> This might be doing more than original logic. Original logic looped
> thru vid in vlan_btimap and checked if vid was in untagged_bitmap.
> Here you're dumping all bits in untagged_bitmap, without looking if
> bit was set in vlan_bitmap. Maybe you want to do an intersection of
> sets vlan_bitmap and untagged_bitmap.
There is a clear_bit for vlan_bmp_copy below.
And all this works as expected on our boxes (Its tested code).
But, understand the difference with the original loop. Will confirm.
plus, we will see if we can use bitmap intersection here.
>> + for_each_set_bit(vid, untagged_bmp_copy, VLAN_N_VID)
>> + clear_bit(vid, vlan_bmp_copy);
>> +
>> + /* fill tagged vlans */
>> + br_fill_ifvlaninfo_bitmap(skb, vlan_bmp_copy, 0);
>> +
>> + return 0;
>> +
>> +nla_put_failure:
>> + return -EMSGSIZE;
>> +}
>> +
>> +static int br_fill_ifvlaninfo(struct sk_buff *skb,
>> + const struct net_port_vlans *pv)
>> +{
>> + struct bridge_vlan_info vinfo;
>> + u16 pvid, vid;
>> +
>> + pvid = br_get_pvid(pv);
>> + for_each_set_bit(vid, pv->vlan_bitmap, VLAN_N_VID) {
>> + vinfo.vid = vid;
>> + vinfo.flags = 0;
>> + if (vid == pvid)
>> + vinfo.flags |= BRIDGE_VLAN_INFO_PVID;
>> +
>> + if (test_bit(vid, pv->untagged_bitmap))
>> + vinfo.flags |= BRIDGE_VLAN_INFO_UNTAGGED;
>> +
>> + if (nla_put(skb, IFLA_BRIDGE_VLAN_INFO,
>> + sizeof(vinfo), &vinfo))
>> + goto nla_put_failure;
>> + }
>> +
>> + return 0;
>> +
>> +nla_put_failure:
>> + return -EMSGSIZE;
>> +}
>> +
>> /*
>> * Create one netlink message for one interface
>> * Contains port and master info as well as carrier and bridge state.
>> @@ -121,12 +248,11 @@ static int br_fill_ifinfo(struct sk_buff *skb,
>> }
>>
>> /* Check if the VID information is requested */
>> - if (filter_mask & RTEXT_FILTER_BRVLAN) {
>> - struct nlattr *af;
>> + if ((filter_mask & RTEXT_FILTER_BRVLAN) ||
>> + (filter_mask & RTEXT_FILTER_BRVLAN_COMPRESSED)) {
>> const struct net_port_vlans *pv;
>> - struct bridge_vlan_info vinfo;
>> - u16 vid;
>> - u16 pvid;
>> + struct nlattr *af;
>> + int err;
>>
>> if (port)
>> pv = nbp_get_vlan_info(port);
>> @@ -140,21 +266,12 @@ static int br_fill_ifinfo(struct sk_buff *skb,
>> if (!af)
>> goto nla_put_failure;
>>
>> - pvid = br_get_pvid(pv);
>> - for_each_set_bit(vid, pv->vlan_bitmap, VLAN_N_VID) {
>> - vinfo.vid = vid;
>> - vinfo.flags = 0;
>> - if (vid == pvid)
>> - vinfo.flags |= BRIDGE_VLAN_INFO_PVID;
>> -
>> - if (test_bit(vid, pv->untagged_bitmap))
>> - vinfo.flags |= BRIDGE_VLAN_INFO_UNTAGGED;
>> -
>> - if (nla_put(skb, IFLA_BRIDGE_VLAN_INFO,
>> - sizeof(vinfo), &vinfo))
>> - goto nla_put_failure;
>> - }
>> -
>> + if (filter_mask & RTEXT_FILTER_BRVLAN_COMPRESSED)
>> + err = br_fill_ifvlaninfo_compressed(skb, pv);
>> + else
>> + err = br_fill_ifvlaninfo(skb, pv);
>> + if (err)
>> + goto nla_put_failure;
>> nla_nest_end(skb, af);
>> }
>>
>> --
>> 1.7.10.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 1/6] bridge: add support to parse multiple vlan info attributes in IFLA_AF_SPEC
From: roopa @ 2014-12-30 5:25 UTC (permalink / raw)
To: Scott Feldman; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <CAE4R7bC7pgH35pkduV0n3G0UpoXoAUeJzmLE8ucJAwONJB2BZg@mail.gmail.com>
On 12/29/14, 4:26 PM, Scott Feldman wrote:
> On Mon, Dec 29, 2014 at 4:07 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>> On Mon, Dec 29, 2014 at 3:47 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>>> On Mon, Dec 29, 2014 at 2:10 PM, roopa <roopa@cumulusnetworks.com> wrote:
>>>> On 12/29/14, 1:40 PM, Scott Feldman wrote:
>>>>> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>>>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>>>
>>>>>> This patch changes bridge IFLA_AF_SPEC netlink attribute parser to
>>>>>> look for more than one IFLA_BRIDGE_VLAN_INFO attribute. This allows
>>>>>> userspace to pack more than one vlan in the setlink msg.
>>>>>>
>>>>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>>> ---
>>>>>> net/bridge/br_netlink.c | 18 +++++++++---------
>>>>>> 1 file changed, 9 insertions(+), 9 deletions(-)
>>>>>>
>>>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>>>>> index 9f5eb55..75971b1 100644
>>>>>> --- a/net/bridge/br_netlink.c
>>>>>> +++ b/net/bridge/br_netlink.c
>>>>>> @@ -230,18 +230,18 @@ static int br_afspec(struct net_bridge *br,
>>>>>> struct nlattr *af_spec,
>>>>>> int cmd)
>>>>>> {
>>>>>> - struct nlattr *tb[IFLA_BRIDGE_MAX+1];
>>>>>> + struct bridge_vlan_info *vinfo;
>>>>>> int err = 0;
>>>>>> + struct nlattr *attr;
>>>>>> + int err = 0;
>>>>>> + int rem;
>>>>>> + u16 vid;
>>>>>>
>>>>>> - err = nla_parse_nested(tb, IFLA_BRIDGE_MAX, af_spec,
>>>>>> ifla_br_policy);
>>>>> Removing this call orphans ifla_br_policy...should ifla_br_policy be
>>>>> removed?
>>>>
>>>> good question. Its a good place to see the type. In-fact userspace programs
>>>> also copy the same policy to parse netlink attributes. hmmm..
>>>> I would like to keep it if it does not throw a warning.
>>> I don't know what the policy (sorry, no pun intended) on leaving dead
>>> code. I say remove it.
>> You know, not using the policy seems like a step backwards, and maybe
>> it suggests a problem with the attr packing.
>>
>> We had:
>>
>> ifla_br_policy
>> IFLA_BRIDGE_FLAGS
>> IFLA_BRIDGE_MODE
>> IFLA_BRIDGE_VLAN_INFO
>>
>> This patch set makes it:
>>
>> ifla_br_policy
>> IFLA_BRIDGE_FLAGS
>> IFLA_BRIDGE_MODE
>> IFLA_BRIDGE_VLAN_INFO
>> IFLA_BRIDGE_VLAN_RANGE_INFO
>>
>> Which is fine, but now VLAN_INFO and VLAN_RANGE_INFO can be repeated.
>> I think you want some nesting to clarify:
>>
>> ifla_br_policy
>> IFLA_BRIDGE_FLAGS
>> IFLA_BRIDGE_MODE
>> IFLA_BRIDGE_VLAN_INFO
>> IFLA_BRIDGE_VLAN_LIST_INFO // nested array of
>> IFLA_BRIDGE_VLAN_INFO
>> IFLA_BRIDGE_VLAN_RANGE_INFO
>>
>> Now you can keep the policy for the top-level parsing, and loop only
>> on the nested array VLAN_LIST_INFO. Actually, now you can use just
>> RANGE_INFO in array and have:
>>
>> ifla_br_policy
>> IFLA_BRIDGE_FLAGS
>> IFLA_BRIDGE_MODE
>> IFLA_BRIDGE_VLAN_INFO
>> IFLA_BRIDGE_VLAN_LIST_INFO // nested array of
>> IFLA_BRIDGE_VLAN_RANGE_INFO
>>
>> And use VLAN_RANGE_INFO for both ranges of vids as well as single
>> vids. That'll simplify your filling algo in patch 5.
> Hmmmm...do you even need VLAN_RANGE_INFO? How about just using
> existing VLAN_INFO and add some more flags:
>
> #define BRIDGE_VLAN_INFO_RANGE_START (1<<3)
> #define BRIDGE_VLAN_INFO_RANGE_END (1<<4)
>
> Now you can have:
>
> IFLA_BRIDGE_FLAGS
> IFLA_BRIDGE_MODE
> IFLA_BRIDGE_VLAN_INFO
> IFLA_BRIDGE_VLAN_INFO_LIST // nested array of
> IFLA_BRIDGE_VLAN_INFO
>
> Don't set START or END for single vids in list.
ok. I was debating yesterday about introducing another nest. This looks
good.
My only reason to not use existing IFLA_BRIDGE_VLAN_INFO was to make
sure it works for existing users.
I see that in this case since IFLA_BRIDGE_VLAN_INFO_LIST is new, it will
not affect existing users.
But, i cant use IFLA_BRIDGE_VLAN_INFO (ie an attribute in
ifla_br_policy) under IFLA_BRIDGE_VLAN_INFO_LIST ?.
IFLA_BRIDGE_VLAN_INFO_LIST will have its own policy and its own attributes.
Which will make it look something like below ?
IFLA_BRIDGE_FLAGS
IFLA_BRIDGE_MODE
IFLA_BRIDGE_VLAN_INFO
IFLA_BRIDGE_VLAN_INFO_LIST // nested array of
IFLA_BRIDGE_VLAN_INFO_LIST_ENTRY
Thanks,
Roopa
^ permalink raw reply
* Re: [PATCH 6/6] bridge: add vlan info to bridge setlink and dellink notification messages
From: Scott Feldman @ 2014-12-30 5:25 UTC (permalink / raw)
To: roopa; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <54A2355B.6060903@cumulusnetworks.com>
On Mon, Dec 29, 2014 at 9:17 PM, roopa <roopa@cumulusnetworks.com> wrote:
> On 12/29/14, 3:42 PM, Scott Feldman wrote:
>>
>> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>>>
>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>
>>> vlan add/deletes are not notified to userspace today. This patch fixes
>>> it.
>>> Notifications will contain vlans compressed into ranges whereever
>>> applicable.
>>
>> Why is this notification needed? On VLAN add/del, the port driver
>> will already get called if it implements ndo_vlan_rx_add_vid and
>> ndo_vlan_rx_kill_vid.
>
> This is strictly for userspace.
>
>> The port driver already gets the notification
>> it needs to filter vlans. User space can query for vlan port
>> membership if it needs to know.
>
>
> You mean request a dump ?. yes it can if it needs all the configured vlans.
> But for notifications, we must at the least notify what changed with respect
> to vlans with RTM_SETLINK/DELLINK, no ?
You'll get that notification if you implement ndo_vlan_rx_add/kill_vid
in your port driver, which is called from RTM_SETLINK/DELLINK. Your
port driver can send to user-space, I suppose, if user-space needs
notification.
^ permalink raw reply
* Re: [PATCH 6/6] bridge: add vlan info to bridge setlink and dellink notification messages
From: roopa @ 2014-12-30 5:17 UTC (permalink / raw)
To: Scott Feldman; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <CAE4R7bCnqBX8TWmG4O_wppKtbnnCEWaaz+bHreybd8pPfUPs+w@mail.gmail.com>
On 12/29/14, 3:42 PM, Scott Feldman wrote:
> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> vlan add/deletes are not notified to userspace today. This patch fixes it.
>> Notifications will contain vlans compressed into ranges whereever applicable.
> Why is this notification needed? On VLAN add/del, the port driver
> will already get called if it implements ndo_vlan_rx_add_vid and
> ndo_vlan_rx_kill_vid.
This is strictly for userspace.
> The port driver already gets the notification
> it needs to filter vlans. User space can query for vlan port
> membership if it needs to know.
You mean request a dump ?. yes it can if it needs all the configured vlans.
But for notifications, we must at the least notify what changed with
respect to vlans with RTM_SETLINK/DELLINK, no ?
>
>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>> ---
>> net/bridge/br_netlink.c | 3 ++-
>> net/core/rtnetlink.c | 3 ++-
>> 2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index 16bdd5a..cebfb03 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -303,7 +303,8 @@ void br_ifinfo_notify(int event, struct net_bridge_port *port)
>> if (skb == NULL)
>> goto errout;
>>
>> - err = br_fill_ifinfo(skb, port, 0, 0, event, 0, 0, port->dev);
>> + err = br_fill_ifinfo(skb, port, 0, 0, event, 0,
>> + RTEXT_FILTER_BRVLAN_COMPRESSED, port->dev);
> This doesn't look right. The skb wasn't allocated large enough to
> hold vlan info. If this is working in your tests, you're getting
> lucky due to some skb padding.
>
> skb was allocated with br_nlmsg_size(), which doesn't include vlan info.
>
>> if (err < 0) {
>> /* -EMSGSIZE implies BUG in br_nlmsg_size() */
>> WARN_ON(err == -EMSGSIZE);
>> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
>> index d06107d..dad5fb6 100644
>> --- a/net/core/rtnetlink.c
>> +++ b/net/core/rtnetlink.c
>> @@ -2878,7 +2878,8 @@ static int rtnl_bridge_notify(struct net_device *dev, u16 flags)
>>
>> if ((!flags || (flags & BRIDGE_FLAGS_MASTER)) &&
>> br_dev && br_dev->netdev_ops->ndo_bridge_getlink) {
>> - err = br_dev->netdev_ops->ndo_bridge_getlink(skb, 0, 0, dev, 0);
>> + err = br_dev->netdev_ops->ndo_bridge_getlink(skb, 0, 0, dev,
>> + RTEXT_FILTER_BRVLAN_COMPRESSED);
>> if (err < 0)
>> goto errout;
>> }
>> --
>> 1.7.10.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH] bonding: move ipoib_header_ops to vmlinux
From: David Miller @ 2014-12-30 4:25 UTC (permalink / raw)
To: wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA
Cc: cwang-xCSkyg8dI+0RB7SZvlqPiA, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <54A21596.7000706-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
From: Wengang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Date: Tue, 30 Dec 2014 11:01:42 +0800
> There are more than one way we do things. For this case, considering
> needs, complexity and stability I think moving ipoib_header_ops is the
> right way to go.
I completely disagree, it's a gross hack at best.
It's papering over the real problem.
When we have references to released objects in other areas of the
networking stack, we don't move those objects into the static kernel
image as a fix.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH] bonding: move ipoib_header_ops to vmlinux
From: David Miller @ 2014-12-30 4:23 UTC (permalink / raw)
To: wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1419908682-17012-1-git-send-email-wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
From: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Date: Tue, 30 Dec 2014 11:04:42 +0800
> When last slave of a bonding master is removed, the bonding then does not work.
> At the time if packet_snd is called against with a master net_device, it calls
> then header_ops->create which points to slave's header_ops. In case the slave
> is ipoib and the module is unloaded, header_ops would point to invalid address.
> Accessing it will cause problem.
> This patch tries to fix this issue by moving ipoib_header_ops to vmlinux to keep
> it valid even when ipoib module is unloaded.
>
> Signed-off-by: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Like others, I absolutely do not consider it sustainable to keep
moving header_ops implementations into the static kernel image.
We're just papering over the real problem, which is making sure all
dangling references to something really go away when we release/unload
an object.
Point it to a dummy set of ops and do a synchronize_net() or similar.
I'm not applying this patch, sorry.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH] bonding: move ipoib_header_ops to vmlinux
From: Wengang Wang @ 2014-12-30 3:04 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
davem-fT/PcQaiUtIeIZ0/mPfg9Q
Cc: wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA
When last slave of a bonding master is removed, the bonding then does not work.
At the time if packet_snd is called against with a master net_device, it calls
then header_ops->create which points to slave's header_ops. In case the slave
is ipoib and the module is unloaded, header_ops would point to invalid address.
Accessing it will cause problem.
This patch tries to fix this issue by moving ipoib_header_ops to vmlinux to keep
it valid even when ipoib module is unloaded.
Signed-off-by: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
drivers/infiniband/ulp/ipoib/ipoib.h | 10 ---------
drivers/infiniband/ulp/ipoib/ipoib_main.c | 28 +------------------------
include/linux/ibdevice.h | 15 ++++++++++++++
include/linux/if_infiniband.h | 11 ++++++++++
include/uapi/linux/if_infiniband.h | 16 ++++++++++++---
net/Makefile | 2 +-
net/infiniband/Makefile | 5 +++++
net/infiniband/infiniband.c | 34 +++++++++++++++++++++++++++++++
8 files changed, 80 insertions(+), 41 deletions(-)
create mode 100644 include/linux/ibdevice.h
create mode 100644 include/linux/if_infiniband.h
create mode 100644 net/infiniband/Makefile
create mode 100644 net/infiniband/infiniband.c
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index d7562be..7c25670 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -121,16 +121,6 @@ enum {
/* structs */
-struct ipoib_header {
- __be16 proto;
- u16 reserved;
-};
-
-struct ipoib_cb {
- struct qdisc_skb_cb qdisc_cb;
- u8 hwaddr[INFINIBAND_ALEN];
-};
-
static inline struct ipoib_cb *ipoib_skb_cb(const struct sk_buff *skb)
{
BUILD_BUG_ON(sizeof(skb->cb) < sizeof(struct ipoib_cb));
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 58b5aa3..9233085 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -34,6 +34,7 @@
#include "ipoib.h"
+#include <linux/ibdevice.h>
#include <linux/module.h>
#include <linux/init.h>
@@ -807,29 +808,6 @@ static void ipoib_timeout(struct net_device *dev)
/* XXX reset QP, etc. */
}
-static int ipoib_hard_header(struct sk_buff *skb,
- struct net_device *dev,
- unsigned short type,
- const void *daddr, const void *saddr, unsigned len)
-{
- struct ipoib_header *header;
- struct ipoib_cb *cb = ipoib_skb_cb(skb);
-
- header = (struct ipoib_header *) skb_push(skb, sizeof *header);
-
- header->proto = htons(type);
- header->reserved = 0;
-
- /*
- * we don't rely on dst_entry structure, always stuff the
- * destination address into skb->cb so we can figure out where
- * to send the packet later.
- */
- memcpy(cb->hwaddr, daddr, INFINIBAND_ALEN);
-
- return sizeof *header;
-}
-
static void ipoib_set_mcast_list(struct net_device *dev)
{
struct ipoib_dev_priv *priv = netdev_priv(dev);
@@ -1328,10 +1306,6 @@ void ipoib_dev_cleanup(struct net_device *dev)
ipoib_neigh_hash_uninit(dev);
}
-static const struct header_ops ipoib_header_ops = {
- .create = ipoib_hard_header,
-};
-
static const struct net_device_ops ipoib_netdev_ops = {
.ndo_uninit = ipoib_uninit,
.ndo_open = ipoib_open,
diff --git a/include/linux/ibdevice.h b/include/linux/ibdevice.h
new file mode 100644
index 0000000..8418974
--- /dev/null
+++ b/include/linux/ibdevice.h
@@ -0,0 +1,15 @@
+/*
+ * ipoib Implementation of ipoib_header_ops here.
+ *
+ * Authors: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
+ */
+#ifndef _LINUX_IBDEVICE_H
+#define _LINUX_IBDEVICE_H
+
+#include <linux/netdevice.h>
+
+#ifdef __KERNEL__
+extern const struct header_ops ipoib_header_ops;
+#endif /* __KERNEL__ */
+
+#endif /* _LINUX_IBDEVICE_H */
diff --git a/include/linux/if_infiniband.h b/include/linux/if_infiniband.h
new file mode 100644
index 0000000..9f2d0cf
--- /dev/null
+++ b/include/linux/if_infiniband.h
@@ -0,0 +1,11 @@
+/*
+ * ipoib Implementation of ipoib_header_ops here.
+ *
+ * Authors: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
+ */
+#ifndef _LINUX_IF_INFINIBAND_H
+#define _LINUX_IF_INFINIBAND_H
+
+#include <uapi/linux/if_infiniband.h>
+
+#endif /* _LINUX_IF_INFINIBAND_H */
diff --git a/include/uapi/linux/if_infiniband.h b/include/uapi/linux/if_infiniband.h
index 7d958475..9190ee3 100644
--- a/include/uapi/linux/if_infiniband.h
+++ b/include/uapi/linux/if_infiniband.h
@@ -21,9 +21,19 @@
* $Id$
*/
-#ifndef _LINUX_IF_INFINIBAND_H
-#define _LINUX_IF_INFINIBAND_H
+#ifndef _UAPI_LINUX_IF_INFINIBAND_H
+#define _UAPI_LINUX_IF_INFINIBAND_H
+#include <net/sch_generic.h>
#define INFINIBAND_ALEN 20 /* Octets in IPoIB HW addr */
-#endif /* _LINUX_IF_INFINIBAND_H */
+struct ipoib_header {
+ __be16 proto;
+ u16 reserved;
+};
+
+struct ipoib_cb {
+ struct qdisc_skb_cb qdisc_cb;
+ u8 hwaddr[INFINIBAND_ALEN];
+};
+#endif /* _UAPI_LINUX_IF_INFINIBAND_H */
diff --git a/net/Makefile b/net/Makefile
index 7ed1970..5d00a13 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -14,7 +14,7 @@ obj-$(CONFIG_NET) += $(tmp-y)
# LLC has to be linked before the files in net/802/
obj-$(CONFIG_LLC) += llc/
-obj-$(CONFIG_NET) += ethernet/ 802/ sched/ netlink/
+obj-$(CONFIG_NET) += ethernet/ 802/ sched/ netlink/ infiniband/
obj-$(CONFIG_NETFILTER) += netfilter/
obj-$(CONFIG_INET) += ipv4/
obj-$(CONFIG_XFRM) += xfrm/
diff --git a/net/infiniband/Makefile b/net/infiniband/Makefile
new file mode 100644
index 0000000..c8a5be0
--- /dev/null
+++ b/net/infiniband/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for the Linux ip over infiniband layer.
+#
+
+obj-y += infiniband.o
diff --git a/net/infiniband/infiniband.c b/net/infiniband/infiniband.c
new file mode 100644
index 0000000..a458ec5
--- /dev/null
+++ b/net/infiniband/infiniband.c
@@ -0,0 +1,34 @@
+/*
+ * ipoib Implementation of ipoib_header_ops here.
+ *
+ * Authors: Wengang Wang <wen.gang.wang-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
+ */
+
+#include <linux/if_infiniband.h>
+
+static int ipoib_hard_header(struct sk_buff *skb,
+ struct net_device *dev,
+ unsigned short type,
+ const void *daddr, const void *saddr, unsigned len)
+{
+ struct ipoib_header *header;
+ struct ipoib_cb *cb = (struct ipoib_cb *)skb->cb;
+
+ header = (struct ipoib_header *)skb_push(skb, sizeof(*header));
+
+ header->proto = htons(type);
+ header->reserved = 0;
+
+ /* we don't rely on dst_entry structure, always stuff the
+ * destination address into skb->cb so we can figure out where
+ * to send the packet later.
+ */
+ memcpy(cb->hwaddr, daddr, INFINIBAND_ALEN);
+
+ return sizeof(*header);
+}
+
+const struct header_ops ipoib_header_ops = {
+ .create = ipoib_hard_header,
+};
+EXPORT_SYMBOL(ipoib_header_ops);
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply related
* Re: [PATCH] bonding: move ipoib_header_ops to vmlinux
From: Wengang @ 2014-12-30 3:01 UTC (permalink / raw)
To: Cong Wang; +Cc: netdev, linux-rdma-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <CAHA+R7M-Nm_R-AaKQ0nX4L3O=BN5_m1v-8sWJSaE_UmGyo8zwA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
于 2014年12月30日 05:32, Cong Wang 写道:
> On Mon, Nov 24, 2014 at 9:36 PM, Wengang Wang <wen.gang.wang@oracle.com> wrote:
>> When last slave of a bonding master is removed, the bonding then does not work.
>> At the time if packet_snd is called against with a master net_device, it calls
>> then header_ops->create which points to slave's header_ops. In case the slave
>> is ipoib and the module is unloaded, header_ops would point to invalid address.
>> Accessing it will cause problem.
>> This patch tries to fix this issue by moving ipoib_header_ops to vmlinux to keep
>> it valid even when ipoib module is unloaded.
>>
> I still don't think this is the right way to fix the bug, because
> there are other modules which have the same bug, for example,
> net/bluetooth/6lowpan.c (unless it can't be enslaved into a bond
> of course, I haven't verified). We don't want to move them all
> to builtin kernel, do we?
I don't think bluetooth needs bonding(and you didn't verify at all).
Do you have strong reason(examples)?
> On the other hand, as Jay explained, bonding is probably
> the only one which has the problem, why not solve it in bonding?
> I mean why do we have to adjust other modules just for bonding?
There are more than one way we do things. For this case, considering
needs, complexity and stability I think moving ipoib_header_ops is the
right way to go.
> When its slave device gets removed, some netdev event is sent
> to the master, so why not reset the header_ops to ether header ops
> up on this event? Something like below (not even compile tested):
>
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 184c434..aedccae 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -1726,6 +1726,7 @@ static int __bond_release_one(struct net_device *bond_dev,
> if (!bond_has_slaves(bond)) {
> bond_set_carrier(bond);
> eth_hw_addr_random(bond_dev);
> + ether_setup(bond_dev);
> }
>
> unblock_netpoll_tx();
>
> It should solve more than just your ipoib case.
Do you think it's safe in a race condition? Before you change
header_ops, you have no
way to prevent another thread from accessing the old one.
> Last but not least, since you said you saw the crash in packet_snd(),
> it would also be interesting to know why packet_snd() still picks this
> bond dev after all its slaves are gone, after all it is not a functional
> device without any slave (its carrier is off).
It's quite normal in a race situation.
thanks,
wengang
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 1/6] bridge: add support to parse multiple vlan info attributes in IFLA_AF_SPEC
From: Scott Feldman @ 2014-12-30 0:26 UTC (permalink / raw)
To: roopa; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <CAE4R7bAHSNePcX3aAa72kYReC4c8fRKF+BzL1f-bdk-M_pLxUg@mail.gmail.com>
On Mon, Dec 29, 2014 at 4:07 PM, Scott Feldman <sfeldma@gmail.com> wrote:
> On Mon, Dec 29, 2014 at 3:47 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>> On Mon, Dec 29, 2014 at 2:10 PM, roopa <roopa@cumulusnetworks.com> wrote:
>>> On 12/29/14, 1:40 PM, Scott Feldman wrote:
>>>>
>>>> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>>>>>
>>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>>
>>>>> This patch changes bridge IFLA_AF_SPEC netlink attribute parser to
>>>>> look for more than one IFLA_BRIDGE_VLAN_INFO attribute. This allows
>>>>> userspace to pack more than one vlan in the setlink msg.
>>>>>
>>>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>> ---
>>>>> net/bridge/br_netlink.c | 18 +++++++++---------
>>>>> 1 file changed, 9 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>>>> index 9f5eb55..75971b1 100644
>>>>> --- a/net/bridge/br_netlink.c
>>>>> +++ b/net/bridge/br_netlink.c
>>>>> @@ -230,18 +230,18 @@ static int br_afspec(struct net_bridge *br,
>>>>> struct nlattr *af_spec,
>>>>> int cmd)
>>>>> {
>>>>> - struct nlattr *tb[IFLA_BRIDGE_MAX+1];
>>>>> + struct bridge_vlan_info *vinfo;
>>>>> int err = 0;
>>>>> + struct nlattr *attr;
>>>>> + int err = 0;
>>>>> + int rem;
>>>>> + u16 vid;
>>>>>
>>>>> - err = nla_parse_nested(tb, IFLA_BRIDGE_MAX, af_spec,
>>>>> ifla_br_policy);
>>>>
>>>> Removing this call orphans ifla_br_policy...should ifla_br_policy be
>>>> removed?
>>>
>>>
>>> good question. Its a good place to see the type. In-fact userspace programs
>>> also copy the same policy to parse netlink attributes. hmmm..
>>> I would like to keep it if it does not throw a warning.
>>
>> I don't know what the policy (sorry, no pun intended) on leaving dead
>> code. I say remove it.
>
> You know, not using the policy seems like a step backwards, and maybe
> it suggests a problem with the attr packing.
>
> We had:
>
> ifla_br_policy
> IFLA_BRIDGE_FLAGS
> IFLA_BRIDGE_MODE
> IFLA_BRIDGE_VLAN_INFO
>
> This patch set makes it:
>
> ifla_br_policy
> IFLA_BRIDGE_FLAGS
> IFLA_BRIDGE_MODE
> IFLA_BRIDGE_VLAN_INFO
> IFLA_BRIDGE_VLAN_RANGE_INFO
>
> Which is fine, but now VLAN_INFO and VLAN_RANGE_INFO can be repeated.
> I think you want some nesting to clarify:
>
> ifla_br_policy
> IFLA_BRIDGE_FLAGS
> IFLA_BRIDGE_MODE
> IFLA_BRIDGE_VLAN_INFO
> IFLA_BRIDGE_VLAN_LIST_INFO // nested array of
> IFLA_BRIDGE_VLAN_INFO
> IFLA_BRIDGE_VLAN_RANGE_INFO
>
> Now you can keep the policy for the top-level parsing, and loop only
> on the nested array VLAN_LIST_INFO. Actually, now you can use just
> RANGE_INFO in array and have:
>
> ifla_br_policy
> IFLA_BRIDGE_FLAGS
> IFLA_BRIDGE_MODE
> IFLA_BRIDGE_VLAN_INFO
> IFLA_BRIDGE_VLAN_LIST_INFO // nested array of
> IFLA_BRIDGE_VLAN_RANGE_INFO
>
> And use VLAN_RANGE_INFO for both ranges of vids as well as single
> vids. That'll simplify your filling algo in patch 5.
Hmmmm...do you even need VLAN_RANGE_INFO? How about just using
existing VLAN_INFO and add some more flags:
#define BRIDGE_VLAN_INFO_RANGE_START (1<<3)
#define BRIDGE_VLAN_INFO_RANGE_END (1<<4)
Now you can have:
IFLA_BRIDGE_FLAGS
IFLA_BRIDGE_MODE
IFLA_BRIDGE_VLAN_INFO
IFLA_BRIDGE_VLAN_INFO_LIST // nested array of
IFLA_BRIDGE_VLAN_INFO
Don't set START or END for single vids in list.
^ permalink raw reply
* Re: [PATCH 1/6] bridge: add support to parse multiple vlan info attributes in IFLA_AF_SPEC
From: Scott Feldman @ 2014-12-30 0:07 UTC (permalink / raw)
To: roopa; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <CAE4R7bDiRd6WY7ry-vFCj2HgAFtxWznwJTX4vqxt_v5Q8LXdOg@mail.gmail.com>
On Mon, Dec 29, 2014 at 3:47 PM, Scott Feldman <sfeldma@gmail.com> wrote:
> On Mon, Dec 29, 2014 at 2:10 PM, roopa <roopa@cumulusnetworks.com> wrote:
>> On 12/29/14, 1:40 PM, Scott Feldman wrote:
>>>
>>> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>>>>
>>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>>
>>>> This patch changes bridge IFLA_AF_SPEC netlink attribute parser to
>>>> look for more than one IFLA_BRIDGE_VLAN_INFO attribute. This allows
>>>> userspace to pack more than one vlan in the setlink msg.
>>>>
>>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>> ---
>>>> net/bridge/br_netlink.c | 18 +++++++++---------
>>>> 1 file changed, 9 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>>> index 9f5eb55..75971b1 100644
>>>> --- a/net/bridge/br_netlink.c
>>>> +++ b/net/bridge/br_netlink.c
>>>> @@ -230,18 +230,18 @@ static int br_afspec(struct net_bridge *br,
>>>> struct nlattr *af_spec,
>>>> int cmd)
>>>> {
>>>> - struct nlattr *tb[IFLA_BRIDGE_MAX+1];
>>>> + struct bridge_vlan_info *vinfo;
>>>> int err = 0;
>>>> + struct nlattr *attr;
>>>> + int err = 0;
>>>> + int rem;
>>>> + u16 vid;
>>>>
>>>> - err = nla_parse_nested(tb, IFLA_BRIDGE_MAX, af_spec,
>>>> ifla_br_policy);
>>>
>>> Removing this call orphans ifla_br_policy...should ifla_br_policy be
>>> removed?
>>
>>
>> good question. Its a good place to see the type. In-fact userspace programs
>> also copy the same policy to parse netlink attributes. hmmm..
>> I would like to keep it if it does not throw a warning.
>
> I don't know what the policy (sorry, no pun intended) on leaving dead
> code. I say remove it.
You know, not using the policy seems like a step backwards, and maybe
it suggests a problem with the attr packing.
We had:
ifla_br_policy
IFLA_BRIDGE_FLAGS
IFLA_BRIDGE_MODE
IFLA_BRIDGE_VLAN_INFO
This patch set makes it:
ifla_br_policy
IFLA_BRIDGE_FLAGS
IFLA_BRIDGE_MODE
IFLA_BRIDGE_VLAN_INFO
IFLA_BRIDGE_VLAN_RANGE_INFO
Which is fine, but now VLAN_INFO and VLAN_RANGE_INFO can be repeated.
I think you want some nesting to clarify:
ifla_br_policy
IFLA_BRIDGE_FLAGS
IFLA_BRIDGE_MODE
IFLA_BRIDGE_VLAN_INFO
IFLA_BRIDGE_VLAN_LIST_INFO // nested array of
IFLA_BRIDGE_VLAN_INFO
IFLA_BRIDGE_VLAN_RANGE_INFO
Now you can keep the policy for the top-level parsing, and loop only
on the nested array VLAN_LIST_INFO. Actually, now you can use just
RANGE_INFO in array and have:
ifla_br_policy
IFLA_BRIDGE_FLAGS
IFLA_BRIDGE_MODE
IFLA_BRIDGE_VLAN_INFO
IFLA_BRIDGE_VLAN_LIST_INFO // nested array of
IFLA_BRIDGE_VLAN_RANGE_INFO
And use VLAN_RANGE_INFO for both ranges of vids as well as single
vids. That'll simplify your filling algo in patch 5.
-scott
^ permalink raw reply
* [PATCH iproute2 2/2] ss: Unify packet stats output from netlink and proc
From: Vadim Kochan @ 2014-12-29 23:56 UTC (permalink / raw)
To: netdev; +Cc: Vadim Kochan
In-Reply-To: <1419897380-21012-1-git-send-email-vadim4j@gmail.com>
From: Vadim Kochan <vadim4j@gmail.com>
Refactored to use one func for output packet stats info
from both /proc and netlink.
Added possibility to get packet stats info from /proc
by setting environment variable PROC_ROOT or PROC_NET_PACKET.
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
misc/ss.c | 211 ++++++++++++++++++++++++++++----------------------------------
1 file changed, 95 insertions(+), 116 deletions(-)
diff --git a/misc/ss.c b/misc/ss.c
index 88b14f8..641c56b 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -2536,69 +2536,86 @@ static int unix_show(struct filter *f)
return 0;
}
-static int packet_show_sock(const struct sockaddr_nl *addr,
- struct nlmsghdr *nlh, void *arg)
-{
- struct packet_diag_msg *r = NLMSG_DATA(nlh);
- struct rtattr *tb[PACKET_DIAG_MAX+1];
- __u32 rq;
-
- parse_rtattr(tb, PACKET_DIAG_MAX, (struct rtattr*)(r+1),
- nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)));
+struct pktstat {
+ int type;
+ int prot;
+ int iface;
+ int state;
+ int rq;
+ int uid;
+ int ino;
+};
- /* use /proc/net/packet if all info are not available */
- if (!tb[PACKET_DIAG_MEMINFO])
- return -1;
+static void packet_stats_print(struct pktstat *s)
+{
+ char *buf = NULL;
if (netid_width)
printf("%-*s ", netid_width,
- r->pdiag_type == SOCK_RAW ? "p_raw" : "p_dgr");
+ s->type == SOCK_RAW ? "p_raw" : "p_dgr");
if (state_width)
printf("%-*s ", state_width, "UNCONN");
- if (tb[PACKET_DIAG_MEMINFO]) {
- __u32 *skmeminfo = RTA_DATA(tb[PACKET_DIAG_MEMINFO]);
-
- rq = skmeminfo[SK_MEMINFO_RMEM_ALLOC];
- } else
- rq = 0;
- printf("%-6d %-6d ", rq, 0);
-
- if (r->pdiag_num == 3) {
+ printf("%-6d %-6d ", s->rq, 0);
+ if (s->prot == 3) {
printf("%*s:", addr_width, "*");
} else {
- char tb2[16];
+ char tb[16];
printf("%*s:", addr_width,
- ll_proto_n2a(htons(r->pdiag_num), tb2, sizeof(tb2)));
+ ll_proto_n2a(htons(s->prot), tb, sizeof(tb)));
}
- if (tb[PACKET_DIAG_INFO]) {
- struct packet_diag_info *pinfo = RTA_DATA(tb[PACKET_DIAG_INFO]);
-
- if (pinfo->pdi_index == 0)
- printf("%-*s ", serv_width, "*");
- else
- printf("%-*s ", serv_width, xll_index_to_name(pinfo->pdi_index));
- } else
+ if (s->iface == 0) {
printf("%-*s ", serv_width, "*");
+ } else {
+ printf("%-*s ", serv_width, xll_index_to_name(s->iface));
+ }
- printf("%*s*%-*s",
- addr_width, "", serv_width, "");
-
- char *buf = NULL;
+ printf("%*s*%-*s", addr_width, "", serv_width, "");
if (show_proc_ctx || show_sock_ctx) {
- if (find_entry(r->pdiag_ino, &buf,
- (show_proc_ctx & show_sock_ctx) ?
- PROC_SOCK_CTX : PROC_CTX) > 0) {
+ if (find_entry(s->ino, &buf,
+ (show_proc_ctx & show_sock_ctx) ?
+ PROC_SOCK_CTX : PROC_CTX) > 0) {
printf(" users:(%s)", buf);
free(buf);
}
} else if (show_users) {
- if (find_entry(r->pdiag_ino, &buf, USERS) > 0) {
+ if (find_entry(s->ino, &buf, USERS) > 0) {
printf(" users:(%s)", buf);
free(buf);
}
}
+}
+
+static int packet_show_sock(const struct sockaddr_nl *addr,
+ struct nlmsghdr *nlh, void *arg)
+{
+ struct packet_diag_msg *r = NLMSG_DATA(nlh);
+ struct rtattr *tb[PACKET_DIAG_MAX+1];
+ struct pktstat stat = {};
+
+ parse_rtattr(tb, PACKET_DIAG_MAX, (struct rtattr*)(r+1),
+ nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)));
+
+ /* use /proc/net/packet if all info are not available */
+ if (!tb[PACKET_DIAG_MEMINFO])
+ return -1;
+
+ stat.type = r->pdiag_type;
+ stat.prot = r->pdiag_num;
+ stat.ino = r->pdiag_ino;
+
+ if (tb[PACKET_DIAG_MEMINFO]) {
+ __u32 *skmeminfo = RTA_DATA(tb[PACKET_DIAG_MEMINFO]);
+ stat.rq = skmeminfo[SK_MEMINFO_RMEM_ALLOC];
+ }
+
+ if (tb[PACKET_DIAG_INFO]) {
+ struct packet_diag_info *pinfo = RTA_DATA(tb[PACKET_DIAG_INFO]);
+ stat.iface = pinfo->pdi_index;
+ }
+
+ packet_stats_print(&stat);
if (show_details) {
__u32 uid = 0;
@@ -2640,94 +2657,56 @@ static int packet_show_netlink(struct filter *f)
return handle_netlink_request(f, &req.nlh, sizeof(req), packet_show_sock);
}
+static int packet_show_line(char *buf, const struct filter *f, int fam)
+{
+ unsigned long long sk;
+ struct pktstat stat = {};
+
+ sscanf(buf, "%llx %*d %d %x %d %d %u %u %u",
+ &sk,
+ &stat.type, &stat.prot, &stat.iface, &stat.state,
+ &stat.rq, &stat.uid, &stat.ino);
+
+ if (stat.type == SOCK_RAW && !(f->dbs&(1<<PACKET_R_DB)))
+ return 0;
+ if (stat.type == SOCK_DGRAM && !(f->dbs&(1<<PACKET_DG_DB)))
+ return 0;
+ if (f->f) {
+ struct tcpstat tst;
+ tst.local.family = AF_PACKET;
+ tst.remote.family = AF_PACKET;
+ tst.rport = 0;
+ tst.lport = stat.iface;
+ tst.local.data[0] = stat.prot;
+ tst.remote.data[0] = 0;
+ if (run_ssfilter(f->f, &tst) == 0)
+ return 0;
+ }
+
+ packet_stats_print(&stat);
+
+ if (show_details) {
+ printf(" ino=%u uid=%u sk=%llx", stat.ino, stat.uid, sk);
+ }
+ printf("\n");
+ return 0;
+}
static int packet_show(struct filter *f)
{
FILE *fp;
- char buf[256];
- int type;
- int prot;
- int iface;
- int state;
- int rq;
- int uid;
- int ino;
- unsigned long long sk;
if (preferred_family != AF_PACKET && !(f->states & (1 << SS_CLOSE)))
return 0;
- if (packet_show_netlink(f) == 0)
+ if (!getenv("PROC_NET_PACKET") && !getenv("PROC_ROOT") &&
+ packet_show_netlink(f) == 0)
return 0;
if ((fp = net_packet_open()) == NULL)
return -1;
- fgets(buf, sizeof(buf)-1, fp);
-
- while (fgets(buf, sizeof(buf)-1, fp)) {
- sscanf(buf, "%llx %*d %d %x %d %d %u %u %u",
- &sk,
- &type, &prot, &iface, &state,
- &rq, &uid, &ino);
-
- if (type == SOCK_RAW && !(f->dbs&(1<<PACKET_R_DB)))
- continue;
- if (type == SOCK_DGRAM && !(f->dbs&(1<<PACKET_DG_DB)))
- continue;
- if (f->f) {
- struct tcpstat tst;
- tst.local.family = AF_PACKET;
- tst.remote.family = AF_PACKET;
- tst.rport = 0;
- tst.lport = iface;
- tst.local.data[0] = prot;
- tst.remote.data[0] = 0;
- if (run_ssfilter(f->f, &tst) == 0)
- continue;
- }
-
- if (netid_width)
- printf("%-*s ", netid_width,
- type == SOCK_RAW ? "p_raw" : "p_dgr");
- if (state_width)
- printf("%-*s ", state_width, "UNCONN");
- printf("%-6d %-6d ", rq, 0);
- if (prot == 3) {
- printf("%*s:", addr_width, "*");
- } else {
- char tb[16];
- printf("%*s:", addr_width,
- ll_proto_n2a(htons(prot), tb, sizeof(tb)));
- }
- if (iface == 0) {
- printf("%-*s ", serv_width, "*");
- } else {
- printf("%-*s ", serv_width, xll_index_to_name(iface));
- }
- printf("%*s*%-*s",
- addr_width, "", serv_width, "");
-
- char *buf = NULL;
-
- if (show_proc_ctx || show_sock_ctx) {
- if (find_entry(ino, &buf,
- (show_proc_ctx & show_sock_ctx) ?
- PROC_SOCK_CTX : PROC_CTX) > 0) {
- printf(" users:(%s)", buf);
- free(buf);
- }
- } else if (show_users) {
- if (find_entry(ino, &buf, USERS) > 0) {
- printf(" users:(%s)", buf);
- free(buf);
- }
- }
-
- if (show_details) {
- printf(" ino=%u uid=%u sk=%llx", ino, uid, sk);
- }
- printf("\n");
- }
+ if (generic_record_read(fp, packet_show_line, f, AF_PACKET))
+ return -1;
return 0;
}
--
2.1.3
^ permalink raw reply related
* [PATCH iproute2 1/2] ss: Unify unix stats output from netlink and proc
From: Vadim Kochan @ 2014-12-29 23:56 UTC (permalink / raw)
To: netdev; +Cc: Vadim Kochan
In-Reply-To: <1419897380-21012-1-git-send-email-vadim4j@gmail.com>
From: Vadim Kochan <vadim4j@gmail.com>
Refactored to use one func for output unix stats info
from both /proc and netlink.
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
---
misc/ss.c | 99 ++++++++++++++++++++++++---------------------------------------
1 file changed, 37 insertions(+), 62 deletions(-)
diff --git a/misc/ss.c b/misc/ss.c
index f0c7b34..88b14f8 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -2234,6 +2234,7 @@ struct unixstat
struct unixstat *next;
int ino;
int peer;
+ char *peer_name;
int rq;
int wq;
int state;
@@ -2279,7 +2280,18 @@ static const char *unix_netid_name(int type)
return netid;
}
-static void unix_list_print(struct unixstat *list, struct filter *f)
+static int unix_type_skip(struct unixstat *s, struct filter *f)
+{
+ if (s->type == SOCK_STREAM && !(f->dbs&(1<<UNIX_ST_DB)))
+ return 1;
+ if (s->type == SOCK_DGRAM && !(f->dbs&(1<<UNIX_DG_DB)))
+ return 1;
+ if (s->type == SOCK_SEQPACKET && !(f->dbs&(1<<UNIX_SQ_DB)))
+ return 1;
+ return 0;
+}
+
+static void unix_stats_print(struct unixstat *list, struct filter *f)
{
struct unixstat *s;
char *peer;
@@ -2287,15 +2299,14 @@ static void unix_list_print(struct unixstat *list, struct filter *f)
for (s = list; s; s = s->next) {
if (!(f->states & (1<<s->state)))
continue;
- if (s->type == SOCK_STREAM && !(f->dbs&(1<<UNIX_ST_DB)))
- continue;
- if (s->type == SOCK_DGRAM && !(f->dbs&(1<<UNIX_DG_DB)))
- continue;
- if (s->type == SOCK_SEQPACKET && !(f->dbs&(1<<UNIX_SQ_DB)))
+ if (unix_type_skip(s, f))
continue;
peer = "*";
- if (s->peer) {
+ if (s->peer_name)
+ peer = s->peer_name;
+
+ if (s->peer && !s->peer_name) {
struct unixstat *p;
for (p = list; p; p = p->next) {
if (s->peer == p->ino)
@@ -2356,36 +2367,24 @@ static int unix_show_sock(const struct sockaddr_nl *addr, struct nlmsghdr *nlh,
struct unix_diag_msg *r = NLMSG_DATA(nlh);
struct rtattr *tb[UNIX_DIAG_MAX+1];
char name[128];
- int peer_ino;
- __u32 rqlen, wqlen;
+ struct unixstat stat = { .name = "*" , .peer_name = "*" };
parse_rtattr(tb, UNIX_DIAG_MAX, (struct rtattr*)(r+1),
nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)));
- if (r->udiag_type == SOCK_STREAM && !(f->dbs&(1<<UNIX_ST_DB)))
- return 0;
- if (r->udiag_type == SOCK_DGRAM && !(f->dbs&(1<<UNIX_DG_DB)))
- return 0;
- if (r->udiag_type == SOCK_SEQPACKET && !(f->dbs&(1<<UNIX_SQ_DB)))
- return 0;
+ f->f = NULL;
+ stat.type = r->udiag_type;
+ stat.state = r->udiag_state;
+ stat.ino = r->udiag_ino;
- if (netid_width)
- printf("%-*s ", netid_width,
- unix_netid_name(r->udiag_type));
- if (state_width)
- printf("%-*s ", state_width, sstate_name[r->udiag_state]);
+ if (unix_type_skip(&stat, f))
+ return 0;
if (tb[UNIX_DIAG_RQLEN]) {
struct unix_diag_rqlen *rql = RTA_DATA(tb[UNIX_DIAG_RQLEN]);
- rqlen = rql->udiag_rqueue;
- wqlen = rql->udiag_wqueue;
- } else {
- rqlen = 0;
- wqlen = 0;
+ stat.rq = rql->udiag_rqueue;
+ stat.wq = rql->udiag_wqueue;
}
-
- printf("%-6u %-6u ", rqlen, wqlen);
-
if (tb[UNIX_DIAG_NAME]) {
int len = RTA_PAYLOAD(tb[UNIX_DIAG_NAME]);
@@ -2393,41 +2392,17 @@ static int unix_show_sock(const struct sockaddr_nl *addr, struct nlmsghdr *nlh,
name[len] = '\0';
if (name[0] == '\0')
name[0] = '@';
- } else
- sprintf(name, "*");
-
+ stat.name = &name[0];
+ }
if (tb[UNIX_DIAG_PEER])
- peer_ino = rta_getattr_u32(tb[UNIX_DIAG_PEER]);
- else
- peer_ino = 0;
-
- printf("%*s %-*d %*s %-*d",
- addr_width, name,
- serv_width, r->udiag_ino,
- addr_width, "*", /* FIXME */
- serv_width, peer_ino);
-
- char *buf = NULL;
+ stat.peer = rta_getattr_u32(tb[UNIX_DIAG_PEER]);
- if (show_proc_ctx || show_sock_ctx) {
- if (find_entry(r->udiag_ino, &buf,
- (show_proc_ctx & show_sock_ctx) ?
- PROC_SOCK_CTX : PROC_CTX) > 0) {
- printf(" users:(%s)", buf);
- free(buf);
- }
- } else if (show_users) {
- if (find_entry(r->udiag_ino, &buf, USERS) > 0) {
- printf(" users:(%s)", buf);
- free(buf);
- }
- }
+ unix_stats_print(&stat, f);
if (show_mem) {
- printf("\n\t");
+ printf("\t");
print_skmeminfo(tb, UNIX_DIAG_MEMINFO);
}
-
if (show_details) {
if (tb[UNIX_DIAG_SHUTDOWN]) {
unsigned char mask;
@@ -2435,9 +2410,8 @@ static int unix_show_sock(const struct sockaddr_nl *addr, struct nlmsghdr *nlh,
printf(" %c-%c", mask & 1 ? '-' : '<', mask & 2 ? '-' : '>');
}
}
-
- printf("\n");
-
+ if (show_mem || show_details)
+ printf("\n");
return 0;
}
@@ -2505,6 +2479,7 @@ static int unix_show(struct filter *f)
if (!(u = malloc(sizeof(*u))))
break;
u->name = NULL;
+ u->peer_name = NULL;
if (sscanf(buf, "%x: %x %x %x %x %x %d %s",
&u->peer, &u->rq, &u->wq, &flags, &u->type,
@@ -2544,7 +2519,7 @@ static int unix_show(struct filter *f)
strcpy(u->name, name);
}
if (++cnt > MAX_UNIX_REMEMBER) {
- unix_list_print(list, f);
+ unix_stats_print(list, f);
unix_list_free(list);
list = NULL;
cnt = 0;
@@ -2552,7 +2527,7 @@ static int unix_show(struct filter *f)
}
fclose(fp);
if (list) {
- unix_list_print(list, f);
+ unix_stats_print(list, f);
unix_list_free(list);
list = NULL;
cnt = 0;
--
2.1.3
^ permalink raw reply related
* [PATCH iproute2 0/2] Use one func to output stats from Netlink and /proc
From: Vadim Kochan @ 2014-12-29 23:56 UTC (permalink / raw)
To: netdev; +Cc: Vadim Kochan
Refactoring to use one func to output sock stats
from Netlink and /proc for unix and packet socket types.
I did some testing but I might miss some cases, would be good if someone
can test it too.
Vadim Kochan (2):
ss: Unify unix stats output from netlink and proc
ss: Unify packet stats output from netlink and proc
misc/ss.c | 310 ++++++++++++++++++++++++++------------------------------------
1 file changed, 132 insertions(+), 178 deletions(-)
--
2.1.3
^ permalink raw reply
* Re: [PATCH 1/6] bridge: add support to parse multiple vlan info attributes in IFLA_AF_SPEC
From: Scott Feldman @ 2014-12-29 23:47 UTC (permalink / raw)
To: roopa; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <54A1D157.1000002@cumulusnetworks.com>
On Mon, Dec 29, 2014 at 2:10 PM, roopa <roopa@cumulusnetworks.com> wrote:
> On 12/29/14, 1:40 PM, Scott Feldman wrote:
>>
>> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>>>
>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>
>>> This patch changes bridge IFLA_AF_SPEC netlink attribute parser to
>>> look for more than one IFLA_BRIDGE_VLAN_INFO attribute. This allows
>>> userspace to pack more than one vlan in the setlink msg.
>>>
>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>>> ---
>>> net/bridge/br_netlink.c | 18 +++++++++---------
>>> 1 file changed, 9 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>> index 9f5eb55..75971b1 100644
>>> --- a/net/bridge/br_netlink.c
>>> +++ b/net/bridge/br_netlink.c
>>> @@ -230,18 +230,18 @@ static int br_afspec(struct net_bridge *br,
>>> struct nlattr *af_spec,
>>> int cmd)
>>> {
>>> - struct nlattr *tb[IFLA_BRIDGE_MAX+1];
>>> + struct bridge_vlan_info *vinfo;
>>> int err = 0;
>>> + struct nlattr *attr;
>>> + int err = 0;
>>> + int rem;
>>> + u16 vid;
>>>
>>> - err = nla_parse_nested(tb, IFLA_BRIDGE_MAX, af_spec,
>>> ifla_br_policy);
>>
>> Removing this call orphans ifla_br_policy...should ifla_br_policy be
>> removed?
>
>
> good question. Its a good place to see the type. In-fact userspace programs
> also copy the same policy to parse netlink attributes. hmmm..
> I would like to keep it if it does not throw a warning.
I don't know what the policy (sorry, no pun intended) on leaving dead
code. I say remove it.
^ permalink raw reply
* Re: [PATCH 6/6] bridge: add vlan info to bridge setlink and dellink notification messages
From: Scott Feldman @ 2014-12-29 23:42 UTC (permalink / raw)
To: Roopa Prabhu; +Cc: Netdev, shemminger, vyasevic@redhat.com
In-Reply-To: <1419887132-7084-7-git-send-email-roopa@cumulusnetworks.com>
On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> vlan add/deletes are not notified to userspace today. This patch fixes it.
> Notifications will contain vlans compressed into ranges whereever applicable.
Why is this notification needed? On VLAN add/del, the port driver
will already get called if it implements ndo_vlan_rx_add_vid and
ndo_vlan_rx_kill_vid. The port driver already gets the notification
it needs to filter vlans. User space can query for vlan port
membership if it needs to know.
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---
> net/bridge/br_netlink.c | 3 ++-
> net/core/rtnetlink.c | 3 ++-
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 16bdd5a..cebfb03 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -303,7 +303,8 @@ void br_ifinfo_notify(int event, struct net_bridge_port *port)
> if (skb == NULL)
> goto errout;
>
> - err = br_fill_ifinfo(skb, port, 0, 0, event, 0, 0, port->dev);
> + err = br_fill_ifinfo(skb, port, 0, 0, event, 0,
> + RTEXT_FILTER_BRVLAN_COMPRESSED, port->dev);
This doesn't look right. The skb wasn't allocated large enough to
hold vlan info. If this is working in your tests, you're getting
lucky due to some skb padding.
skb was allocated with br_nlmsg_size(), which doesn't include vlan info.
> if (err < 0) {
> /* -EMSGSIZE implies BUG in br_nlmsg_size() */
> WARN_ON(err == -EMSGSIZE);
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index d06107d..dad5fb6 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -2878,7 +2878,8 @@ static int rtnl_bridge_notify(struct net_device *dev, u16 flags)
>
> if ((!flags || (flags & BRIDGE_FLAGS_MASTER)) &&
> br_dev && br_dev->netdev_ops->ndo_bridge_getlink) {
> - err = br_dev->netdev_ops->ndo_bridge_getlink(skb, 0, 0, dev, 0);
> + err = br_dev->netdev_ops->ndo_bridge_getlink(skb, 0, 0, dev,
> + RTEXT_FILTER_BRVLAN_COMPRESSED);
> if (err < 0)
> goto errout;
> }
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 5/6] bridge: new function to pack vlans using both IFLA_BRIDGE_VLAN_INFO and IFLA_BRIDGE_VLAN_RANGE_INFO
From: Scott Feldman @ 2014-12-29 23:25 UTC (permalink / raw)
To: Roopa Prabhu; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson kok
In-Reply-To: <1419887132-7084-6-git-send-email-roopa@cumulusnetworks.com>
On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> This patch adds new function to compress vlans into ranges.
> Vlans are compressed into ranges only if the fill request is called with
> RTEXT_FILTER_BRVLAN_COMPRESSED in filtermask.
>
> Old vlan packing code is moved to a new function and continues to be
> called when filter_mask is RTEXT_FILTER_BRVLAN
>
> Signed-off-by: Wilson kok <wkok@cumulusnetworks.com>
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---
> net/bridge/br_netlink.c | 157 +++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 137 insertions(+), 20 deletions(-)
>
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 4c47ba0..16bdd5a 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -67,6 +67,133 @@ static int br_port_fill_attrs(struct sk_buff *skb,
> return 0;
> }
>
> +static int br_fill_ifvlaninfo_bitmap(struct sk_buff *skb,
> + const unsigned long *vlan_bmp, u16 flags)
> +{
> + struct bridge_vlan_range_info vinfo_range;
> + struct bridge_vlan_info vinfo;
> + u16 vid, start = 0, end = 0;
One per line?
> + u16 pvid;
Aren't you getting an "unused" compile warning on this ^^^?
> +
> + /* handle the untagged */
This comment ^^^ doesn't make sense here.
> + for_each_set_bit(vid, vlan_bmp, VLAN_N_VID) {
> + if (start == 0) {
> + start = vid;
> + end = vid;
> + }
> + if ((vid - end) > 1) {
> + memset(&vinfo_range, 0, sizeof(vinfo_range));
> + vinfo_range.flags |= flags;
> + vinfo_range.vid = start;
> + vinfo_range.vid_end = end;
> + if (nla_put(skb, IFLA_BRIDGE_VLAN_RANGE_INFO,
> + sizeof(vinfo_range), &vinfo_range))
> + goto nla_put_failure;
> + start = vid;
> + end = vid;
> + } else {
> + end = vid;
> + }
> + }
What happens with this set {1-10, 12, 20-25, 30}? On vid 12, will
that put a VLAN_RANGE_INFO with start=end=12? Seems strange to use
RANGE_INFO for single vlan.
Can the algorithm be simplified? Maybe there are other examples in
the kernel of finding ranges/singles from a bitmap we could borrow
code from?
> + if (start != 0 && end != 0) {
> + if (start != end) {
> + memset(&vinfo_range, 0, sizeof(vinfo_range));
> + vinfo_range.flags |= flags;
> + vinfo_range.vid = start;
> + vinfo_range.vid_end = end;
> + if (nla_put(skb, IFLA_BRIDGE_VLAN_RANGE_INFO,
> + sizeof(vinfo_range), &vinfo_range))
> + goto nla_put_failure;
> + } else {
> + memset(&vinfo, 0, sizeof(vinfo));
> + vinfo.flags |= flags;
> + vinfo.vid = start;
> + if (nla_put(skb, IFLA_BRIDGE_VLAN_INFO,
> + sizeof(vinfo), &vinfo))
> + goto nla_put_failure;
How many IFLA_BRIDGE_VLAN_INFOs can you fit into skb? It seems the
worst case is you'll have 4094/2 = 2047 VLAN_INFOs if using all
even-numbered vids, for example. Will that fit? I guess the original
code had the same concern...wonder if anyone checked this worst-case?
> + }
> + }
> +
> +nla_put_failure:
> + return -EMSGSIZE;
> +}
> +
> +static int br_fill_ifvlaninfo_compressed(struct sk_buff *skb,
> + const struct net_port_vlans *pv)
> +{
> + unsigned long vlan_bmp_copy[BR_VLAN_BITMAP_LEN];
> + unsigned long untagged_bmp_copy[BR_VLAN_BITMAP_LEN];
Lots of automatic space on the stack...can you use dynamic mem
(kzalloc) for these?
> + struct bridge_vlan_range_info vinfo_range;
> + struct bridge_vlan_info vinfo;
> + u16 pvid;
> +
> + memset(vlan_bmp_copy, 0,
> + sizeof(unsigned long) * BR_VLAN_BITMAP_LEN);
> + bitmap_copy(vlan_bmp_copy, pv->vlan_bitmap, VLAN_N_VID);
> +
> + memset(untagged_bmp_copy, 0,
> + sizeof(unsigned long) * BR_VLAN_BITMAP_LEN);
> + bitmap_copy(untagged_bmp_copy, pv->untagged_bitmap, VLAN_N_VID);
> +
> + /* send the pvid separately first */
> + pvid = br_get_pvid(pv);
> + if (pvid && (pvid != VLAN_N_VID)) {
> + memset(&vinfo, 0, sizeof(vinfo));
> + vinfo.flags |= BRIDGE_VLAN_INFO_PVID;
> + if (test_bit(pvid, untagged_bmp_copy)) {
> + vinfo.flags |= BRIDGE_VLAN_INFO_UNTAGGED;
> + clear_bit(pvid, untagged_bmp_copy);
> + }
> + clear_bit(pvid, vlan_bmp_copy);
> + vinfo.vid = pvid;
> + if (nla_put(skb, IFLA_BRIDGE_VLAN_INFO,
> + sizeof(vinfo), &vinfo))
> + goto nla_put_failure;
> + }
What if pvid is not in vlan_bmp_copy? This statement block seems
slightly different than the original logic where BRIDGE_VLAN_INFO_PVID
was set only when looping thru vlan_bitmap and vid == pvid. Now can
you send a IFLA_BRIDGE_VLAN_INFO with pvid even if pvid isn't in
vlan_bitmap? Maybe pvid is always in vlan_bitmap, so it doesn't
matter. But there is a subtle logic change here, so something to
consider.
> +
> + /* fill untagged vlans */
> + br_fill_ifvlaninfo_bitmap(skb, untagged_bmp_copy,
> + BRIDGE_VLAN_INFO_UNTAGGED);
This might be doing more than original logic. Original logic looped
thru vid in vlan_btimap and checked if vid was in untagged_bitmap.
Here you're dumping all bits in untagged_bitmap, without looking if
bit was set in vlan_bitmap. Maybe you want to do an intersection of
sets vlan_bitmap and untagged_bitmap.
> + for_each_set_bit(vid, untagged_bmp_copy, VLAN_N_VID)
> + clear_bit(vid, vlan_bmp_copy);
> +
> + /* fill tagged vlans */
> + br_fill_ifvlaninfo_bitmap(skb, vlan_bmp_copy, 0);
> +
> + return 0;
> +
> +nla_put_failure:
> + return -EMSGSIZE;
> +}
> +
> +static int br_fill_ifvlaninfo(struct sk_buff *skb,
> + const struct net_port_vlans *pv)
> +{
> + struct bridge_vlan_info vinfo;
> + u16 pvid, vid;
> +
> + pvid = br_get_pvid(pv);
> + for_each_set_bit(vid, pv->vlan_bitmap, VLAN_N_VID) {
> + vinfo.vid = vid;
> + vinfo.flags = 0;
> + if (vid == pvid)
> + vinfo.flags |= BRIDGE_VLAN_INFO_PVID;
> +
> + if (test_bit(vid, pv->untagged_bitmap))
> + vinfo.flags |= BRIDGE_VLAN_INFO_UNTAGGED;
> +
> + if (nla_put(skb, IFLA_BRIDGE_VLAN_INFO,
> + sizeof(vinfo), &vinfo))
> + goto nla_put_failure;
> + }
> +
> + return 0;
> +
> +nla_put_failure:
> + return -EMSGSIZE;
> +}
> +
> /*
> * Create one netlink message for one interface
> * Contains port and master info as well as carrier and bridge state.
> @@ -121,12 +248,11 @@ static int br_fill_ifinfo(struct sk_buff *skb,
> }
>
> /* Check if the VID information is requested */
> - if (filter_mask & RTEXT_FILTER_BRVLAN) {
> - struct nlattr *af;
> + if ((filter_mask & RTEXT_FILTER_BRVLAN) ||
> + (filter_mask & RTEXT_FILTER_BRVLAN_COMPRESSED)) {
> const struct net_port_vlans *pv;
> - struct bridge_vlan_info vinfo;
> - u16 vid;
> - u16 pvid;
> + struct nlattr *af;
> + int err;
>
> if (port)
> pv = nbp_get_vlan_info(port);
> @@ -140,21 +266,12 @@ static int br_fill_ifinfo(struct sk_buff *skb,
> if (!af)
> goto nla_put_failure;
>
> - pvid = br_get_pvid(pv);
> - for_each_set_bit(vid, pv->vlan_bitmap, VLAN_N_VID) {
> - vinfo.vid = vid;
> - vinfo.flags = 0;
> - if (vid == pvid)
> - vinfo.flags |= BRIDGE_VLAN_INFO_PVID;
> -
> - if (test_bit(vid, pv->untagged_bitmap))
> - vinfo.flags |= BRIDGE_VLAN_INFO_UNTAGGED;
> -
> - if (nla_put(skb, IFLA_BRIDGE_VLAN_INFO,
> - sizeof(vinfo), &vinfo))
> - goto nla_put_failure;
> - }
> -
> + if (filter_mask & RTEXT_FILTER_BRVLAN_COMPRESSED)
> + err = br_fill_ifvlaninfo_compressed(skb, pv);
> + else
> + err = br_fill_ifvlaninfo(skb, pv);
> + if (err)
> + goto nla_put_failure;
> nla_nest_end(skb, af);
> }
>
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: Marvell Kirkwood - MV643XX: near 100% UDP RX packet loss
From: Rick Jones @ 2014-12-29 23:25 UTC (permalink / raw)
To: Bruno Prémont; +Cc: Sebastian Hesselbarth, netdev
In-Reply-To: <20141227121727.20fe52c8@neptune.home>
On 12/27/2014 03:17 AM, Bruno Prémont wrote:
> On Thu, 25 December 2014 Rick Jones <rick.jones2@hp.com> wrote:
>>> Why are so many packets being discarded?
>>
>> You should also check the netstat statistics, particularly UDP on the
>> receiving side. Look before and after the test and see how they change,
>> if at all.
>
> Here they go.
>
> Summary of numbers:
> iperf UDP run, 5 seconds @ 1Gb/s
> lost 71216/71776 packets
>
> before after delta
> ethtool:
> rx_packets: 420001 688424 268423
> rx_bytes: 433251917 809803463 376551546
> rx_errors: 0 0 0
> rx_dropped: 0 0 0
> bad_octets_received: 0 0 0
> bad_frames_received: 0 0 0
> rx_discard: 159691 323123 163432
> rx_overrun: 0 0 0
> netstat, udp:
> packets received 15559 16137 578
> packets to unknown port received 18 18 0
> packet receive errors 41599 83890 42291
> packets sent 34697 34770 73
> receive buffer errors 0 0 0
> send buffer errors 0 0 0
>
Well, it certainly looks like a decent fraction of your lost traffic are
UDP packet receive errors. Overrunning the SO_RCVBUF on the receiving
side presumably. You can either start walking-down the transmission
rate of the iperf client, or try a larger receive socket buffer size on
the iperf server, though that will only help if those drops are from the
receiving side being only occasionally slower than the sending side.
You might also want to make sure the UDP datagrams being sent are huge
and so getting fragmented. All it takes is to lose one fragment of an
IP datagram to render the entire datagram useless.
As for the rx_discard in the ethtool stats, someone more familiar with
the hardware will have to describe the various reasons for that stat to
be incremented.
rick jones
^ permalink raw reply
* Re: [PATCH 3/6] bridge: modify IFLA_AF_SPEC parser to parse IFLA_BRIDGE_VLAN_RANGE_INFO
From: roopa @ 2014-12-29 23:19 UTC (permalink / raw)
To: Scott Feldman; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson kok
In-Reply-To: <CAE4R7bA5ZhwX=agK-BOPWoJ1sctQuVcJ92z78Rrtp-=pBWi4jg@mail.gmail.com>
On 12/29/14, 2:04 PM, Scott Feldman wrote:
> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> This patch modifies br_afspec to parse incoming IFLA_BRIDGE_VLAN_RANGE_INFO
>>
>> Signed-off-by: Wilson kok <wkok@cumulusnetworks.com>
>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>> ---
>> net/bridge/br_netlink.c | 70 +++++++++++++++++++++++++++++++++--------------
>> 1 file changed, 49 insertions(+), 21 deletions(-)
>>
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index e7d1fc0..4c47ba0 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -227,48 +227,76 @@ static const struct nla_policy ifla_br_policy[IFLA_MAX+1] = {
>> .len = sizeof(struct bridge_vlan_range_info), },
>> };
>>
>> +static int br_afspec_vlan_add(struct net_bridge *br,
>> + struct net_bridge_port *p,
>> + u16 vid, u16 flags)
>> +{
>> + int err = 0;
>> +
>> + if (p) {
>> + err = nbp_vlan_add(p, vid, flags);
>> + if (err)
>> + return err;
>> +
>> + if (flags & BRIDGE_VLAN_INFO_MASTER)
>> + err = br_vlan_add(p->br, vid, flags);
>> + } else {
>> + err = br_vlan_add(br, vid, flags);
>> + }
>> +
>> + return err;
>> +}
>> +
>> +static void br_afspec_vlan_del(struct net_bridge *br,
>> + struct net_bridge_port *p,
>> + u16 vid, u16 flags)
>> +{
>> + if (p) {
>> + nbp_vlan_delete(p, vid);
>> + if (flags & BRIDGE_VLAN_INFO_MASTER)
>> + br_vlan_delete(p->br, vid);
>> + } else {
>> + br_vlan_delete(br, vid);
>> + }
>> +}
>> +
>> static int br_afspec(struct net_bridge *br,
>> struct net_bridge_port *p,
>> struct nlattr *af_spec,
>> int cmd)
>> {
>> - struct bridge_vlan_info *vinfo;
>> - int err = 0;
>> + struct bridge_vlan_range_info *vinfo;
>> struct nlattr *attr;
>> int err = 0;
>> int rem;
>> u16 vid;
>>
>> nla_for_each_nested(attr, af_spec, rem) {
>> - if (nla_type(attr) != IFLA_BRIDGE_VLAN_INFO)
>> + if (nla_type(attr) != IFLA_BRIDGE_VLAN_INFO &&
>> + nla_type(attr) != IFLA_BRIDGE_VLAN_RANGE_INFO)
>> continue;
>> -
>> vinfo = nla_data(attr);
> Check attr size.
>
>> - if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK)
>> +
>> + if (nla_type(attr) == IFLA_BRIDGE_VLAN_INFO)
>> + vinfo->vid_end = vinfo->vid;
> Maybe a switch(nla_type(attr)) would be better to explicitly handle
> each case? Then the size check can be done for each case, as well as
> this fix-up for vid_end.
sure, i can do that.
>
>> +
>> + if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK ||
>> + vinfo->vid_end >= VLAN_VID_MASK ||
>> + vinfo->vid > vinfo->vid_end)
>> return -EINVAL;
>>
>> switch (cmd) {
>> case RTM_SETLINK:
>> - if (p) {
>> - err = nbp_vlan_add(p, vinfo->vid, vinfo->flags);
>> + for (vid = vinfo->vid; vid <= vinfo->vid_end; vid++) {
>> + err = br_afspec_vlan_add(br, p, vid,
>> + vinfo->flags);
>> if (err)
> Do you want to unwind on failure? Seems it should be an
> all-or-nothing operation.
I could, but looking at other link operations, I don't seen anybody
unwinding. An AF_UNSPEC link netlink request can contain multiple link
attributes and not all maybe applied AFAICS. But i don't have a strong
opinion here. I can unwind if needed.
>> break;
>> -
>> - if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
>> - err = br_vlan_add(p->br, vinfo->vid,
>> - vinfo->flags);
>> - } else
>> - err = br_vlan_add(br, vinfo->vid, vinfo->flags);
>> -
>> + }
>> break;
>> -
>> case RTM_DELLINK:
>> - if (p) {
>> - nbp_vlan_delete(p, vinfo->vid);
>> - if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
>> - br_vlan_delete(p->br, vinfo->vid);
>> - } else
>> - br_vlan_delete(br, vinfo->vid);
>> + for (vid = vinfo->vid; vid <= vinfo->vid_end; vid++)
>> + br_afspec_vlan_del(br, p, vid, vinfo->flags);
>> break;
>> }
>> }
>> --
>> 1.7.10.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH/RFC flow-net-next 02/10] net: flow: Add features to tables
From: Cong Wang @ 2014-12-29 23:03 UTC (permalink / raw)
To: Simon Horman; +Cc: John Fastabend, netdev
In-Reply-To: <1419819340-19000-3-git-send-email-simon.horman@netronome.com>
On Sun, Dec 28, 2014 at 6:15 PM, Simon Horman
<simon.horman@netronome.com> wrote:
> +static struct net_flow_table *net_flow_table_get_table(struct net_device *dev,
> + int table_uid)
> +{
> + struct net_flow_table **tables;
> + int i;
> +
> + tables = net_flow_get_tables(dev);
> + if (IS_ERR(tables))
> + return ERR_PTR(PTR_ERR(tables));
Seriously? :)
^ permalink raw reply
* Re: [PATCH 2/6] bridge: add new attribute IFLA_BRIDGE_VLAN_RANGE_INFO to represent vlan range
From: roopa @ 2014-12-29 22:10 UTC (permalink / raw)
To: Scott Feldman; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <CAE4R7bCYJam86BYf9N8Z63g58njHrjpaH3sdHoSqJBxyFq=Y4g@mail.gmail.com>
On 12/29/14, 1:52 PM, Scott Feldman wrote:
> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> This patch adds new bridge netlink attribute IFLA_BRIDGE_VLAN_RANGE_INFO
>> to represent vlan range.
>>
>> Current IFLA_BRIDGE_VLAN_INFO attribute represents a single vlan and
>> does not scale well when the port has large number of vlans.
>>
>> IFLA_BRIDGE_VLAN_RANGE_INFO is similar to IFLA_BRIDGE_VLAN_INFO but helps
>> with packing consecutive vlans in a single attribute. Thus reducing the
>> size of the netlink msg during adds and also during dumps.
>>
>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>> ---
>> include/uapi/linux/if_bridge.h | 7 +++++++
>> net/bridge/br_netlink.c | 2 ++
>> 2 files changed, 9 insertions(+)
>>
>> diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
>> index b03ee8f..d41a346 100644
>> --- a/include/uapi/linux/if_bridge.h
>> +++ b/include/uapi/linux/if_bridge.h
>> @@ -118,6 +118,7 @@ enum {
>> IFLA_BRIDGE_FLAGS,
>> IFLA_BRIDGE_MODE,
>> IFLA_BRIDGE_VLAN_INFO,
>> + IFLA_BRIDGE_VLAN_RANGE_INFO,
>> __IFLA_BRIDGE_MAX,
>> };
>> #define IFLA_BRIDGE_MAX (__IFLA_BRIDGE_MAX - 1)
>> @@ -131,6 +132,12 @@ struct bridge_vlan_info {
>> __u16 vid;
>> };
>>
>> +struct bridge_vlan_range_info {
>> + __u16 flags;
>> + __u16 vid;
> vid_start or vid_begin to be consistent with vid_end?
ack..
>
>> + __u16 vid_end;
>> +};
>> +
>> /* Bridge multicast database attributes
>> * [MDBA_MDB] = {
>> * [MDBA_MDB_ENTRY] = {
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index 75971b1..e7d1fc0 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -223,6 +223,8 @@ static const struct nla_policy ifla_br_policy[IFLA_MAX+1] = {
>> [IFLA_BRIDGE_MODE] = { .type = NLA_U16 },
>> [IFLA_BRIDGE_VLAN_INFO] = { .type = NLA_BINARY,
>> .len = sizeof(struct bridge_vlan_info), },
>> + [IFLA_BRIDGE_VLAN_RANGE_INFO] = { .type = NLA_BINARY,
>> + .len = sizeof(struct bridge_vlan_range_info), },
> ifla_br_policy isn't used anymore from patch 1/6, so no need to add this.
>
>> };
>>
>> static int br_afspec(struct net_bridge *br,
>> --
>> 1.7.10.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 1/6] bridge: add support to parse multiple vlan info attributes in IFLA_AF_SPEC
From: roopa @ 2014-12-29 22:10 UTC (permalink / raw)
To: Scott Feldman; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <CAE4R7bAyWy83S-K_XtiRNJO2rZXtVpijPxqj_oq_sZqPKW-h-Q@mail.gmail.com>
On 12/29/14, 1:40 PM, Scott Feldman wrote:
> On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> This patch changes bridge IFLA_AF_SPEC netlink attribute parser to
>> look for more than one IFLA_BRIDGE_VLAN_INFO attribute. This allows
>> userspace to pack more than one vlan in the setlink msg.
>>
>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>> ---
>> net/bridge/br_netlink.c | 18 +++++++++---------
>> 1 file changed, 9 insertions(+), 9 deletions(-)
>>
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index 9f5eb55..75971b1 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -230,18 +230,18 @@ static int br_afspec(struct net_bridge *br,
>> struct nlattr *af_spec,
>> int cmd)
>> {
>> - struct nlattr *tb[IFLA_BRIDGE_MAX+1];
>> + struct bridge_vlan_info *vinfo;
>> int err = 0;
>> + struct nlattr *attr;
>> + int err = 0;
>> + int rem;
>> + u16 vid;
>>
>> - err = nla_parse_nested(tb, IFLA_BRIDGE_MAX, af_spec, ifla_br_policy);
> Removing this call orphans ifla_br_policy...should ifla_br_policy be removed?
good question. Its a good place to see the type. In-fact userspace
programs also copy the same policy to parse netlink attributes. hmmm..
I would like to keep it if it does not throw a warning.
>
>> - if (err)
>> - return err;
>> -
>> - if (tb[IFLA_BRIDGE_VLAN_INFO]) {
>> - struct bridge_vlan_info *vinfo;
>> -
>> - vinfo = nla_data(tb[IFLA_BRIDGE_VLAN_INFO]);
>> + nla_for_each_nested(attr, af_spec, rem) {
>> + if (nla_type(attr) != IFLA_BRIDGE_VLAN_INFO)
>> + continue;
> Need to validate size of attr is sizeof(struct bridge_vlan_info).
ack, will fix.
>
>> + vinfo = nla_data(attr);
>> if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK)
>> return -EINVAL;
>>
>> --
>> 1.7.10.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 3/6] bridge: modify IFLA_AF_SPEC parser to parse IFLA_BRIDGE_VLAN_RANGE_INFO
From: Scott Feldman @ 2014-12-29 22:04 UTC (permalink / raw)
To: Roopa Prabhu; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson kok
In-Reply-To: <1419887132-7084-4-git-send-email-roopa@cumulusnetworks.com>
On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> This patch modifies br_afspec to parse incoming IFLA_BRIDGE_VLAN_RANGE_INFO
>
> Signed-off-by: Wilson kok <wkok@cumulusnetworks.com>
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---
> net/bridge/br_netlink.c | 70 +++++++++++++++++++++++++++++++++--------------
> 1 file changed, 49 insertions(+), 21 deletions(-)
>
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index e7d1fc0..4c47ba0 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -227,48 +227,76 @@ static const struct nla_policy ifla_br_policy[IFLA_MAX+1] = {
> .len = sizeof(struct bridge_vlan_range_info), },
> };
>
> +static int br_afspec_vlan_add(struct net_bridge *br,
> + struct net_bridge_port *p,
> + u16 vid, u16 flags)
> +{
> + int err = 0;
> +
> + if (p) {
> + err = nbp_vlan_add(p, vid, flags);
> + if (err)
> + return err;
> +
> + if (flags & BRIDGE_VLAN_INFO_MASTER)
> + err = br_vlan_add(p->br, vid, flags);
> + } else {
> + err = br_vlan_add(br, vid, flags);
> + }
> +
> + return err;
> +}
> +
> +static void br_afspec_vlan_del(struct net_bridge *br,
> + struct net_bridge_port *p,
> + u16 vid, u16 flags)
> +{
> + if (p) {
> + nbp_vlan_delete(p, vid);
> + if (flags & BRIDGE_VLAN_INFO_MASTER)
> + br_vlan_delete(p->br, vid);
> + } else {
> + br_vlan_delete(br, vid);
> + }
> +}
> +
> static int br_afspec(struct net_bridge *br,
> struct net_bridge_port *p,
> struct nlattr *af_spec,
> int cmd)
> {
> - struct bridge_vlan_info *vinfo;
> - int err = 0;
> + struct bridge_vlan_range_info *vinfo;
> struct nlattr *attr;
> int err = 0;
> int rem;
> u16 vid;
>
> nla_for_each_nested(attr, af_spec, rem) {
> - if (nla_type(attr) != IFLA_BRIDGE_VLAN_INFO)
> + if (nla_type(attr) != IFLA_BRIDGE_VLAN_INFO &&
> + nla_type(attr) != IFLA_BRIDGE_VLAN_RANGE_INFO)
> continue;
> -
> vinfo = nla_data(attr);
Check attr size.
> - if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK)
> +
> + if (nla_type(attr) == IFLA_BRIDGE_VLAN_INFO)
> + vinfo->vid_end = vinfo->vid;
Maybe a switch(nla_type(attr)) would be better to explicitly handle
each case? Then the size check can be done for each case, as well as
this fix-up for vid_end.
> +
> + if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK ||
> + vinfo->vid_end >= VLAN_VID_MASK ||
> + vinfo->vid > vinfo->vid_end)
> return -EINVAL;
>
> switch (cmd) {
> case RTM_SETLINK:
> - if (p) {
> - err = nbp_vlan_add(p, vinfo->vid, vinfo->flags);
> + for (vid = vinfo->vid; vid <= vinfo->vid_end; vid++) {
> + err = br_afspec_vlan_add(br, p, vid,
> + vinfo->flags);
> if (err)
Do you want to unwind on failure? Seems it should be an
all-or-nothing operation.
> break;
> -
> - if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
> - err = br_vlan_add(p->br, vinfo->vid,
> - vinfo->flags);
> - } else
> - err = br_vlan_add(br, vinfo->vid, vinfo->flags);
> -
> + }
> break;
> -
> case RTM_DELLINK:
> - if (p) {
> - nbp_vlan_delete(p, vinfo->vid);
> - if (vinfo->flags & BRIDGE_VLAN_INFO_MASTER)
> - br_vlan_delete(p->br, vinfo->vid);
> - } else
> - br_vlan_delete(br, vinfo->vid);
> + for (vid = vinfo->vid; vid <= vinfo->vid_end; vid++)
> + br_afspec_vlan_del(br, p, vid, vinfo->flags);
> break;
> }
> }
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 2/6] bridge: add new attribute IFLA_BRIDGE_VLAN_RANGE_INFO to represent vlan range
From: Scott Feldman @ 2014-12-29 21:52 UTC (permalink / raw)
To: Roopa Prabhu; +Cc: Netdev, shemminger, vyasevic@redhat.com, Wilson Kok
In-Reply-To: <1419887132-7084-3-git-send-email-roopa@cumulusnetworks.com>
On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> This patch adds new bridge netlink attribute IFLA_BRIDGE_VLAN_RANGE_INFO
> to represent vlan range.
>
> Current IFLA_BRIDGE_VLAN_INFO attribute represents a single vlan and
> does not scale well when the port has large number of vlans.
>
> IFLA_BRIDGE_VLAN_RANGE_INFO is similar to IFLA_BRIDGE_VLAN_INFO but helps
> with packing consecutive vlans in a single attribute. Thus reducing the
> size of the netlink msg during adds and also during dumps.
>
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
> ---
> include/uapi/linux/if_bridge.h | 7 +++++++
> net/bridge/br_netlink.c | 2 ++
> 2 files changed, 9 insertions(+)
>
> diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
> index b03ee8f..d41a346 100644
> --- a/include/uapi/linux/if_bridge.h
> +++ b/include/uapi/linux/if_bridge.h
> @@ -118,6 +118,7 @@ enum {
> IFLA_BRIDGE_FLAGS,
> IFLA_BRIDGE_MODE,
> IFLA_BRIDGE_VLAN_INFO,
> + IFLA_BRIDGE_VLAN_RANGE_INFO,
> __IFLA_BRIDGE_MAX,
> };
> #define IFLA_BRIDGE_MAX (__IFLA_BRIDGE_MAX - 1)
> @@ -131,6 +132,12 @@ struct bridge_vlan_info {
> __u16 vid;
> };
>
> +struct bridge_vlan_range_info {
> + __u16 flags;
> + __u16 vid;
vid_start or vid_begin to be consistent with vid_end?
> + __u16 vid_end;
> +};
> +
> /* Bridge multicast database attributes
> * [MDBA_MDB] = {
> * [MDBA_MDB_ENTRY] = {
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 75971b1..e7d1fc0 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -223,6 +223,8 @@ static const struct nla_policy ifla_br_policy[IFLA_MAX+1] = {
> [IFLA_BRIDGE_MODE] = { .type = NLA_U16 },
> [IFLA_BRIDGE_VLAN_INFO] = { .type = NLA_BINARY,
> .len = sizeof(struct bridge_vlan_info), },
> + [IFLA_BRIDGE_VLAN_RANGE_INFO] = { .type = NLA_BINARY,
> + .len = sizeof(struct bridge_vlan_range_info), },
ifla_br_policy isn't used anymore from patch 1/6, so no need to add this.
> };
>
> static int br_afspec(struct net_bridge *br,
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH 1/6] bridge: add support to parse multiple vlan info attributes in IFLA_AF_SPEC
From: Scott Feldman @ 2014-12-29 21:40 UTC (permalink / raw)
To: Roopa Prabhu; +Cc: Netdev, shemminger, vyasevic@redhat.com
In-Reply-To: <1419887132-7084-2-git-send-email-roopa@cumulusnetworks.com>
On Mon, Dec 29, 2014 at 1:05 PM, <roopa@cumulusnetworks.com> wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>
> This patch changes bridge IFLA_AF_SPEC netlink attribute parser to
> look for more than one IFLA_BRIDGE_VLAN_INFO attribute. This allows
> userspace to pack more than one vlan in the setlink msg.
>
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> ---
> net/bridge/br_netlink.c | 18 +++++++++---------
> 1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
> index 9f5eb55..75971b1 100644
> --- a/net/bridge/br_netlink.c
> +++ b/net/bridge/br_netlink.c
> @@ -230,18 +230,18 @@ static int br_afspec(struct net_bridge *br,
> struct nlattr *af_spec,
> int cmd)
> {
> - struct nlattr *tb[IFLA_BRIDGE_MAX+1];
> + struct bridge_vlan_info *vinfo;
> int err = 0;
> + struct nlattr *attr;
> + int err = 0;
> + int rem;
> + u16 vid;
>
> - err = nla_parse_nested(tb, IFLA_BRIDGE_MAX, af_spec, ifla_br_policy);
Removing this call orphans ifla_br_policy...should ifla_br_policy be removed?
> - if (err)
> - return err;
> -
> - if (tb[IFLA_BRIDGE_VLAN_INFO]) {
> - struct bridge_vlan_info *vinfo;
> -
> - vinfo = nla_data(tb[IFLA_BRIDGE_VLAN_INFO]);
> + nla_for_each_nested(attr, af_spec, rem) {
> + if (nla_type(attr) != IFLA_BRIDGE_VLAN_INFO)
> + continue;
Need to validate size of attr is sizeof(struct bridge_vlan_info).
>
> + vinfo = nla_data(attr);
> if (!vinfo->vid || vinfo->vid >= VLAN_VID_MASK)
> return -EINVAL;
>
> --
> 1.7.10.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox