Netdev List

* Re: [PATCH net-next 2/2] net sched act_vlan: VLAN action rewrite to use RCU lock/unlock and update
From: Cong Wang @ 2017-10-11 16:27 UTC (permalink / raw)
  To: Manish Kurup
  Cc: Jamal Hadi Salim, Jiri Pirko, David Miller,
	Linux Kernel Network Developers, LKML, Alexander Aring,
	Roman Mashak, manish.kurup
In-Reply-To: <1507689219-22993-1-git-send-email-manish.kurup@verizon.com>

On Tue, Oct 10, 2017 at 7:33 PM, Manish Kurup <kurup.manish@gmail.com> wrote:
> diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
> index 14c262c..9bb0236 100644
> --- a/net/sched/act_vlan.c
> +++ b/net/sched/act_vlan.c
> @@ -29,31 +29,37 @@ static int tcf_vlan(struct sk_buff *skb, const struct tc_action *a,
>         int action;
>         int err;
>         u16 tci;
> +       struct tcf_vlan_params *p;
>
>         tcf_lastuse_update(&v->tcf_tm);
>         bstats_cpu_update(this_cpu_ptr(v->common.cpu_bstats), skb);
>
> -       spin_lock(&v->tcf_lock);
> -       action = v->tcf_action;
> -

spin_lock() is removed here, see below.


>         /* Ensure 'data' points at mac_header prior calling vlan manipulating
>          * functions.
>          */
>         if (skb_at_tc_ingress(skb))
>                 skb_push_rcsum(skb, skb->mac_len);
>
> -       switch (v->tcfv_action) {
> +       rcu_read_lock();
> +
> +       action = READ_ONCE(v->tcf_action);
> +
> +       p = rcu_dereference(v->vlan_p);
> +
> +       switch (p->tcfv_action) {
>         case TCA_VLAN_ACT_POP:
>                 err = skb_vlan_pop(skb);
>                 if (err)
>                         goto drop;
>                 break;
> +
>         case TCA_VLAN_ACT_PUSH:
> -               err = skb_vlan_push(skb, v->tcfv_push_proto, v->tcfv_push_vid |
> -                                   (v->tcfv_push_prio << VLAN_PRIO_SHIFT));
> +               err = skb_vlan_push(skb, p->tcfv_push_proto, p->tcfv_push_vid |
> +                               (p->tcfv_push_prio << VLAN_PRIO_SHIFT));
>                 if (err)
>                         goto drop;
>                 break;
> +
>         case TCA_VLAN_ACT_MODIFY:
>                 /* No-op if no vlan tag (either hw-accel or in-payload) */
>                 if (!skb_vlan_tagged(skb))
> @@ -69,15 +75,16 @@ static int tcf_vlan(struct sk_buff *skb, const struct tc_action *a,
>                                 goto drop;
>                 }
>                 /* replace the vid */
> -               tci = (tci & ~VLAN_VID_MASK) | v->tcfv_push_vid;
> +               tci = (tci & ~VLAN_VID_MASK) | p->tcfv_push_vid;
>                 /* replace prio bits, if tcfv_push_prio specified */
> -               if (v->tcfv_push_prio) {
> +               if (p->tcfv_push_prio) {
>                         tci &= ~VLAN_PRIO_MASK;
> -                       tci |= v->tcfv_push_prio << VLAN_PRIO_SHIFT;
> +                       tci |= p->tcfv_push_prio << VLAN_PRIO_SHIFT;
>                 }
>                 /* put updated tci as hwaccel tag */
> -               __vlan_hwaccel_put_tag(skb, v->tcfv_push_proto, tci);
> +               __vlan_hwaccel_put_tag(skb, p->tcfv_push_proto, tci);
>                 break;
> +
>         default:
>                 BUG();
>         }
> @@ -89,6 +96,7 @@ static int tcf_vlan(struct sk_buff *skb, const struct tc_action *a,
>         qstats_drop_inc(this_cpu_ptr(v->common.cpu_qstats));
>
>  unlock:
> +       rcu_read_unlock();
>         if (skb_at_tc_ingress(skb))
>                 skb_pull_rcsum(skb, skb->mac_len);
>


But here spin_unlock() is not removed... At least it doesn't show in the
diff context. This probably leaves the spinlock unbalanced.


> @@ -111,6 +119,7 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
>         struct nlattr *tb[TCA_VLAN_MAX + 1];
>         struct tc_vlan *parm;
>         struct tcf_vlan *v;
> +       struct tcf_vlan_params *p, *p_old;
>         int action;
>         __be16 push_vid = 0;
>         __be16 push_proto = 0;
> @@ -187,16 +196,33 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
>
>         v = to_vlan(*a);
>
> -       spin_lock_bh(&v->tcf_lock);
> -
> -       v->tcfv_action = action;
> -       v->tcfv_push_vid = push_vid;
> -       v->tcfv_push_prio = push_prio;
> -       v->tcfv_push_proto = push_proto;
> +       ASSERT_RTNL();
> +       p = kzalloc(sizeof(*p), GFP_KERNEL);
> +       if (unlikely(!p)) {
> +               if (ovr)
> +                       tcf_idr_release(*a, bind);
> +               return -ENOMEM;
> +       }
>
>         v->tcf_action = parm->action;
>
> -       spin_unlock_bh(&v->tcf_lock);
> +       p_old = rtnl_dereference(v->vlan_p);
> +
> +       if (ovr)
> +               spin_lock_bh(&v->tcf_lock);

Why still take spinlock when you already have RTNL lock?
What's the point?
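
The pattern under review, a reader that takes one snapshot of a parameter
block while a writer publishes a replacement, can be sketched in userspace
C11. This is only an analogy and rests on several assumptions: the acquire
load and the atomic exchange stand in for rcu_dereference() and
rcu_assign_pointer(), struct vlan_params and struct vlan_action are
hypothetical stand-ins for the patch's structures, and there is no grace
period, so the caller must know that no reader still uses the old block
before freeing it.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define VLAN_PRIO_SHIFT 13	/* same value as the kernel's VLAN_PRIO_SHIFT */

/* Hypothetical stand-in for the patch's struct tcf_vlan_params. */
struct vlan_params {
	int action;
	uint16_t push_vid;
	uint8_t push_prio;
};

/* Hypothetical stand-in for struct tcf_vlan; params plays the role of vlan_p. */
struct vlan_action {
	_Atomic(struct vlan_params *) params;
};

/* Reader: load the pointer once, then use only that snapshot. */
uint16_t vlan_read_tci(struct vlan_action *v)
{
	struct vlan_params *p =
		atomic_load_explicit(&v->params, memory_order_acquire);

	if (!p)
		return 0;
	return (uint16_t)(p->push_vid | (p->push_prio << VLAN_PRIO_SHIFT));
}

/* Writer: fill a fresh block, publish it, and hand the old block back.
 * Returns 0 on success, -1 if the allocation fails.
 */
int vlan_publish(struct vlan_action *v, int action, uint16_t vid,
		 uint8_t prio, struct vlan_params **old)
{
	struct vlan_params *p = calloc(1, sizeof(*p));

	if (!p)
		return -1;
	p->action = action;
	p->push_vid = vid;
	p->push_prio = prio;
	*old = atomic_exchange_explicit(&v->params, p, memory_order_acq_rel);
	return 0;
}
```

Because the writer never modifies a block that readers can already see, the
reader needs no lock; whether the writer needs one depends only on whether
writers can race with each other.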


* Re: BUG:af_packet fails to TX TSO frames
From: Anton Ivanov @ 2017-10-11 16:32 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development, David Miller
In-Reply-To: <CAF=yD-+JaS8JwEC2TyUG=Ebaho-MM+gjumrbHYM-H0EK1Hu62Q@mail.gmail.com>

Working through it at the moment.

The validation logic is prohibiting what the hardware considers to be a 
perfectly legit skb.

Once I narrow down the culprit I will come back with my findings.

Thanks for pointing me where to look by the way.

A.

On 10/11/17 17:26, Willem de Bruijn wrote:
> On Wed, Oct 11, 2017 at 11:54 AM, Anton Ivanov
> <anton.ivanov@cambridgegreys.com> wrote:
>> It is that patch.
>>
>> I rolled it back and immediately got it to work correctly on a Broadcom
>> Tigon. I can test on all other scenarios, I have tried, I suspect it will
>> come back alive on all of them.
>>
>> I am going to try to trace it through and see exactly where it drops a skb
>> which the card has no issues in accepting.
> It might be in the initialization of gso_type and csum. The virtio_net_hdr
> can encode various combinations of flags that are not allowed by the
> validation logic.
>

-- 
Anton R. Ivanov

Cambridge Greys Limited, England and Wales company No 10273661
http://www.cambridgegreys.com/


* Re: [patch net-next 1/4] net: sched: make tc_action_ops->get_dev return dev and avoid passing net
From: Cong Wang @ 2017-10-11 16:34 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Linux Kernel Network Developers, David Miller, Jamal Hadi Salim,
	Saeed Mahameed, matanb, leonro, mlxsw
In-Reply-To: <20171010211926.GL2033@nanopsycho>

On Tue, Oct 10, 2017 at 2:19 PM, Jiri Pirko <jiri@resnulli.us> wrote:
> Tue, Oct 10, 2017 at 07:44:53PM CEST, xiyou.wangcong@gmail.com wrote:
>>On Tue, Oct 10, 2017 at 12:30 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>>> -static int tcf_mirred_device(const struct tc_action *a, struct net *net,
>>> -                            struct net_device **mirred_dev)
>>> +static struct net_device *tcf_mirred_get_dev(const struct tc_action *a)
>>>  {
>>> -       int ifindex = tcf_mirred_ifindex(a);
>>> +       struct tcf_mirred *m = to_mirred(a);
>>>
>>> -       *mirred_dev = __dev_get_by_index(net, ifindex);
>>> -       if (!*mirred_dev)
>>> -               return -EINVAL;
>>> -       return 0;
>>> +       return __dev_get_by_index(m->net, m->tcfm_ifindex);
>>
>>Hmm, why not just return m->tcfm_dev?
>
> I just follow the existing code. The change you suggest should be a
> separate follow-up patch.

Why?

Your goal is "make tc_action_ops->get_dev return dev and avoid passing net",
using m->tcfm_dev is simpler and could save you from adding a net pointer
to struct tcf_mirred too.


* Ethtool question
From: Ben Greear @ 2017-10-11 16:51 UTC (permalink / raw)
  To: netdev

I noticed today that setting some ethtool settings to the value they
already have returns an error code.  I would think this should silently
return success instead?  That would make it easier to call from scripts.
For example:

[root@lf0313-6477 lanforge]# ethtool -L eth3 combined 1
combined unmodified, ignoring
no channel parameters changed, aborting
current values: tx 0 rx 0 other 1 combined 1
[root@lf0313-6477 lanforge]# echo $?
1

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: [PATCH] rtl8xxxu: mark expected switch fall-throughs
From: Kees Cook @ 2017-10-11 17:00 UTC (permalink / raw)
  To: Gustavo A. R. Silva
  Cc: Jes Sorensen, Kalle Valo, linux-wireless, Network Development,
	LKML
In-Reply-To: <20171011093248.Horde.Jwmh0VKhmCeOxBgQzLLpYeZ@gator4166.hostgator.com>

On Wed, Oct 11, 2017 at 7:32 AM, Gustavo A. R. Silva
<garsilva@embeddedor.com> wrote:
> Quoting Jes Sorensen <jes.sorensen@gmail.com>:
>> On 10/11/2017 04:41 AM, Kalle Valo wrote:
>>> Jes Sorensen <jes.sorensen@gmail.com> writes:
>>>> On 10/10/2017 03:30 PM, Gustavo A. R. Silva wrote:
>>>>>
>>>>> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
>>>>> where we are expecting to fall through.
>>>>
>>>> While this isn't harmful, to me this looks like pointless patch churn
>>>> for zero gain and it's just ugly.
>>>
>>> In general I find it useful to mark fall through cases. And it's just a
>>> comment with two words, so they cannot hurt your eyes that much.
>>
>> I don't see them being harmful in the code, but I don't see them of much
>> use either. If it happened as part of natural code development, fine. My
>> objection is to people running around doing this systematically causing
>> patch churn for little to zero gain.
>
> I understand that you think this is of zero gain for you, but as Florian
> Fainelli pointed out:
>
> "That is the canonical way to tell static analyzers and compilers that
> fall throughs are wanted and not accidental mistakes in the code. For
> people that deal with these kinds of errors, it's quite helpful, unless
> you suggest disabling that particular GCC warning specific for that
> file/directory?"
>
> this is very helpful for people working on fixing issues reported by static
> analyzers. It saves a huge amount of time when dealing with False Positives.
> Also, there are cases when an apparently intentional fall-through turns out
> to be an actual missing break or continue.
>
> So there is an ongoing effort to detect such cases and avoid them to show up
> in the future by at least warning people about a potential issue in their
> code. And this is helpful for everybody.

This is an unfortunate omission in the C language, and thankfully both
gcc and clang have stepped up to solve this the same way static
analyzers have solved it. It's not exactly pretty, but it does both
document the intention for humans and provide a way for analyzers to
report issues. Having the compiler help us not make mistakes is quite
handy, and with Gustavo grinding through all the Coverity warnings,
he's found actual bugs with missing "break"s, so I think this has a
demonstrable benefit to the code-base as a whole. It makes things
unambiguous to someone else reviewing the code.
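
As a small illustration of the annotation being discussed: at its default
level, gcc's -Wimplicit-fallthrough accepts a comment such as
/* fall through */ placed directly before the next case label. The function
and its stage numbers below are invented for this sketch and are not taken
from rtl8xxxu.

```c
/* Count how many cleanup steps run when teardown starts at a given stage.
 * Each stage deliberately falls through to the cleanup of earlier stages.
 */
int teardown_steps(int stage)
{
	int steps = 0;

	switch (stage) {
	case 3:
		steps++;	/* undo stage 3 */
		/* fall through */
	case 2:
		steps++;	/* undo stage 2 */
		/* fall through */
	case 1:
		steps++;	/* undo stage 1 */
		break;
	default:
		break;
	}
	return steps;
}
```

If one of the comments were removed, the compiler could no longer tell an
intended fall-through from a forgotten break and would warn.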

-Kees

-- 
Kees Cook
Pixel Security


* Re: [PATCH v2] XDP Program for Ip forward
From: Christina Jacob @ 2017-10-11 17:06 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, linux-kernel, linux-arm-kernel, Sunil.Goutham, daniel,
	David Ahern, Christina Jacob
In-Reply-To: <20171010160057.32678367@redhat.com>

On Tue, Oct 10, 2017 at 7:30 PM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> On Tue, 10 Oct 2017 15:12:31 +0200
> Jesper Dangaard Brouer <brouer@redhat.com> wrote:
>
> > I'll try to test/benchmark your program...
>
> In my initial testing, I cannot get this to work...
>

What test setup are you using? I would like to test with it as well.
I verified the program in this minimal test setup and did not see any
issue with the MAC addresses being set.

Below is the test setup.

machine 1                                machine 2
(90.0.0.2) port 1 ===============> port 3 (90.0.0.1)
                                      ||
(80.0.0.2) port 2 <=============== port 4 (80.0.0.1)
traffic gen                              xdp program

Below are the steps followed to run the program.

Assigned IPs in different subnets to interfaces 3 and 4.
The IP of port 1 is in the same subnet as port 3.
The IP of port 2 is in the same subnet as port 4.
sysctl -w net.core.bpf_jit_enable=1
./xdp_router_ipv4 eth0 eth1

Note: The program will not generate ARP requests. Proper ARP entries
need to be added to get the packets forwarded correctly.

> You do seem to XDP_REDIRECT out the right interface, but you have an
> error with setting the correct MAC address.
>
> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer


* Re: pull-request: mac80211-next 2017-10-11
From: David Miller @ 2017-10-11 17:15 UTC (permalink / raw)
  To: johannes-cdvu00un1VgdHxzADdlk8Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20171011123613.28890-1-johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org>

From: Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org>
Date: Wed, 11 Oct 2017 14:36:12 +0200

> Here's a -next pull request. The only bigger thing here is the
> addition of the regulatory database as firmware, which will
> allow us to - over time - get rid of CRDA, as well as having
> the option of adding more fields to the database where needed,
> this would've been extremely complex with CRDA because it had
> not been built with extensibility in mind.
> 
> Please pull and let me know if there's any problem.

Pulled, thanks!


* Re: [PATCH net-next 2/2] net sched actions: fix module auto-loading
From: Cong Wang @ 2017-10-11 17:30 UTC (permalink / raw)
  To: Roman Mashak
  Cc: David Miller, Linux Kernel Network Developers, Jamal Hadi Salim
In-Reply-To: <1507733430-17860-3-git-send-email-mrv@mojatatu.com>

On Wed, Oct 11, 2017 at 7:50 AM, Roman Mashak <mrv@mojatatu.com> wrote:
> Macro __stringify_1() can stringify a macro argument, however IFE_META_*
> are enums, so they never expand, however request_module expects an integer
> in IFE module name, so as a result it always fails to auto-load.
>
> Fixes: ef6980b6becb ("introduce IFE action")
> Signed-off-by: Roman Mashak <mrv@mojatatu.com>

Good catch!

Alternatively, it seems we can also do this:

...
#define _IFE_META_PRIO 3

enum {
...
        IFE_META_PRIO = _IFE_META_PRIO,
...
};



#define MODULE_ALIAS_IFE_META(metan)   MODULE_ALIAS("ifemeta" ##_
__stringify_1(metan))

But it does _not_ look any better than yours.
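
The difference between the two cases can be reproduced in plain C. The
sketch below uses hypothetical names (STR, META_ENUM, META_MACRO, "ifemeta"
as a literal prefix) rather than the kernel's headers: an enumerator is not
a preprocessor macro, so stringification yields its name, while a #define
expands to its value first.

```c
#include <ctype.h>
#include <string.h>

#define STR_1(x) #x
#define STR(x) STR_1(x)		/* expand the argument, then stringify */

enum { META_ENUM = 3 };		/* invisible to the preprocessor */
#define META_MACRO 3		/* expanded before stringification */

/* Alias built from the enumerator: the suffix is the identifier itself. */
const char *alias_from_enum(void)
{
	return "ifemeta" STR(META_ENUM);
}

/* Alias built from the macro: the suffix is the number. */
const char *alias_from_macro(void)
{
	return "ifemeta" STR(META_MACRO);
}

/* Return 1 if the alias ends in a decimal digit, 0 otherwise. */
int alias_has_numeric_suffix(const char *alias)
{
	size_t len = strlen(alias);

	return len > 0 && isdigit((unsigned char)alias[len - 1]);
}
```

Only the macro variant produces a name with a numeric suffix, which is the
form the quoted description says request_module expects.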

So,

Acked-by: Cong Wang <xiyou.wangcong@gmail.com>


* Re: [PATCH iproute2] iproute: build more easily on Android
From: Stephen Hemminger @ 2017-10-11 17:45 UTC (permalink / raw)
  To: Lorenzo Colitti; +Cc: netdev, enh
In-Reply-To: <20171002170337.42235-1-lorenzo@google.com>

On Tue,  3 Oct 2017 02:03:37 +0900
Lorenzo Colitti <lorenzo@google.com> wrote:

> iproute2 contains a bunch of kernel headers, including uapi ones.
> Android's libc uses uapi headers almost directly, and uses a
> script to fix kernel types that don't match what userspace
> expects.
> 
> For example: https://issuetracker.google.com/36987220 reports
> that our struct ip_mreq_source contains "__be32 imr_multiaddr"
> rather than "struct in_addr imr_multiaddr". The script addresses
> this by replacing the uapi struct definition with a #include
> <bits/ip_mreq.h> which contains the traditional userspace
> definition.
> 
> Unfortunately, when we compile iproute2, this definition
> conflicts with the one in iproute2's linux/in.h.
> 
> Historically we've just solved this problem by running "git rm"
> on all the iproute2 include/linux headers that break Android's
> libc.  However, deleting the files in this way makes it harder to
> keep up with upstream, because every upstream change to
> an include file causes a merge conflict with the delete.
> 
> This patch fixes the problem by moving the iproute2 linux headers
> from include/linux to include/uapi/linux.
> 
> Tested: compiles on ubuntu trusty (glibc)
> 
> Signed-off-by: Elliott Hughes <enh@google.com>
> Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
>

I went ahead and did this for 4.14 (and net-next).
Applied.


* Re: [jkirsher/next-queue PATCH v4 0/6] tc-flower based cloud filters in i40e
From: Alexander Duyck @ 2017-10-11 17:46 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Amritha Nambiar, intel-wired-lan, Jeff Kirsher,
	Duyck, Alexander H, Netdev, Jamal Hadi Salim, Cong Wang
In-Reply-To: <20171011125635.GD2039@nanopsycho>

On Wed, Oct 11, 2017 at 5:56 AM, Jiri Pirko <jiri@resnulli.us> wrote:
> Wed, Oct 11, 2017 at 02:24:12AM CEST, amritha.nambiar@intel.com wrote:
>>This patch series enables configuring cloud filters in i40e
>>using the tc-flower classifier. The classification function
>>of the filter is to match a packet to a class. cls_flower is
>>extended to offload classid to hardware. The offloaded classid
>>is used to direct matched packets to a traffic class on the device.
>>The approach here is similar to the tc 'prio' qdisc which uses
>>the classid for band selection. The ingress qdisc is called ffff:0,
>>so traffic classes are ffff:1 to ffff:8 (i40e has max of 8 TCs).
>
>
> NACK. This clearly looks like abuse of classid to something
> else. Classid is here to identify qdisc instance. However, you use it
> for hw tclass identification. This is mixing of apples and oranges.
>
> Why?
>
> Please don't try to abuse things! This is not nice.

This isn't an abuse. This is reproducing in hardware what is already
the behavior for software. Isn't that how offloads are supposed to
work?

This is exactly how prio currently handles this. We are essentially
doing the exact same thing in the hardware where we are choosing a
queueing group based on the class ID. You could set up a prio qdisc. If
you are offloading a qdisc behavior into hardware how are you supposed
to emulate the behavior if you aren't allowing the offload to use the
same mechanism?
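
To make the mapping concrete, here is a minimal sketch of turning a classid
such as ffff:1 into a hardware traffic class index. The macros are local
copies of the TC_H_* helpers from <linux/pkt_sched.h>; the function name
classid_to_hw_tc and the rule "minor N selects traffic class N - 1" are
assumptions made for illustration, not code from the i40e series.

```c
#include <stdint.h>

/* Local copies of the handle helpers from <linux/pkt_sched.h>. */
#define TC_H_MAJ_MASK	0xFFFF0000U
#define TC_H_MIN_MASK	0x0000FFFFU
#define TC_H_MAJ(h)	((h) & TC_H_MAJ_MASK)
#define TC_H_MIN(h)	((h) & TC_H_MIN_MASK)
#define TC_H_MAKE(maj, min) (((maj) & TC_H_MAJ_MASK) | ((min) & TC_H_MIN_MASK))

/* Hypothetical helper: map classid <major>:<minor> to a zero-based traffic
 * class, assuming minor 1..num_tc selects class 0..num_tc - 1.
 * Returns -1 if the classid does not belong to the qdisc or is out of range.
 */
int classid_to_hw_tc(uint32_t classid, uint32_t qdisc_handle,
		     unsigned int num_tc)
{
	uint32_t minor = TC_H_MIN(classid);

	if (TC_H_MAJ(classid) != TC_H_MAJ(qdisc_handle))
		return -1;
	if (minor == 0 || minor > num_tc)
		return -1;
	return (int)(minor - 1);
}
```

With the ingress qdisc at ffff:0 and 8 traffic classes, ffff:1 through
ffff:8 map to classes 0 through 7 under this assumption, and anything else
is rejected.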

- Alex


* Re: [PATCH] rtl8xxxu: mark expected switch fall-throughs
From: Joe Perches @ 2017-10-11 17:47 UTC (permalink / raw)
  To: David Laight, Gustavo A. R. Silva, Jes Sorensen, Kalle Valo
  Cc: linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Arnaldo Carvalho de Melo
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DD009123C@AcuExch.aculab.com>

On Wed, 2017-10-11 at 12:54 +0000, David Laight wrote:
> From: Joe Perches
> > Sent: 11 October 2017 11:21
> > On Tue, 2017-10-10 at 14:30 -0500, Gustavo A. R. Silva wrote:
> > > In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> > > where we are expecting to fall through.
> > 
> > perhaps use Arnaldo's idea:
> > 
> > https://lkml.org/lkml/2017/2/9/845
> > https://lkml.org/lkml/2017/2/10/485
> 
> gah, that is even uglier and requires a chase through
> headers to find out what it means.

Sure, if you think __fallthrough; isn't self-documenting.

	case foo:
		bar;
		__fallthrough;
	case baz:
		etc...
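
A compilable version of that idea might look like the sketch below. It is
written for userspace, so it uses the hypothetical name FALLTHROUGH instead
of a reserved double-underscore identifier, and it assumes a compiler that
either supports __attribute__((fallthrough)) (gcc 7 and later, recent clang)
or can live with the empty-statement fallback.

```c
/* Use the statement attribute when the compiler advertises it. */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define FALLTHROUGH __attribute__((fallthrough))
# endif
#endif
#ifndef FALLTHROUGH
# define FALLTHROUGH do { } while (0)	/* fallback: plain empty statement */
#endif

/* Score a command character: 'a' gets its own bonus plus the score for 'b'. */
int classify(int c)
{
	int score = 0;

	switch (c) {
	case 'a':
		score += 10;
		FALLTHROUGH;
	case 'b':
		score += 1;
		break;
	default:
		score = -1;
		break;
	}
	return score;
}
```

The marker reads as a statement at the point of the fall-through, and its
definition lives in one place instead of in comment-matching rules.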



		


* Re: [PATCH iproute2 v2 0/3] ss: add AF_VSOCK support
From: Stephen Hemminger @ 2017-10-11 17:52 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: netdev, Jorgen Hansen, Dexuan Cui
In-Reply-To: <20171006154841.10495-1-stefanha@redhat.com>

On Fri,  6 Oct 2017 11:48:38 -0400
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> v2:
>  * Use uint64_t instead of __u64 for filter->families
>  * Added reference to net-next commit that merged vsock_diag.ko
> 
> This patch series adds AF_VSOCK support to ss(8).  AF_VSOCK is a host<->guest
> communications channel supported by VMware, KVM (virtio-vsock), and Hyper-V.
> 
> To dump AF_VSOCK sockets:
> 
>   $ ss --vsock
> 
> The vsock_diag.ko module has now been merged in the Linux net-next tree.  I
> have verified that the <linux/vm_sockets_diag.h> header copy in this patch
> series is in sync with Linux net-next.  See commit
> 5820299a271fd3dc9b1733e1e10cd7b983edd028 ("Merge branch 'VSOCK-sock_diag'").
> 
> Stefan Hajnoczi (3):
>   ss: allow AF_FAMILY constants >32
>   include: add <linux/vm_sockets_diag.h>
>   ss: add AF_VSOCK support
> 
>  include/linux/vm_sockets_diag.h |  33 ++++++
>  misc/ss.c                       | 238 +++++++++++++++++++++++++++++++++++-----
>  man/man8/ss.8                   |   8 +-
>  3 files changed, 249 insertions(+), 30 deletions(-)
>  create mode 100644 include/linux/vm_sockets_diag.h
> 

Thanks Stefan.
Applied to iproute2 net-next branch.
Headers got rearranged recently.


* Re: [PATCH iproute2 1/1] color: Fix ip segfault in color_fprintf() when using --color switch
From: Stephen Hemminger @ 2017-10-11 17:53 UTC (permalink / raw)
  To: Petr Vorel; +Cc: netdev, Julien Fortin
In-Reply-To: <20171008143847.21699-1-petr.vorel@gmail.com>

On Sun,  8 Oct 2017 16:38:47 +0200
Petr Vorel <petr.vorel@gmail.com> wrote:

> diff --git a/include/json_print.h b/include/json_print.h
> index b6ce1f9f..2f3f07c8 100644
> --- a/include/json_print.h
> +++ b/include/json_print.h
> @@ -53,7 +53,7 @@ void close_json_array(enum output_type type, const char *delim);
>  					     const char *fmt,		\
>  					     type value)		\
>  	{								\
> -		print_color_##type_name(t, -1, key, fmt, value);	\
> +		print_color_##type_name(t, 0, key, fmt, value);	\
>  	}
>  _PRINT_FUNC(int, int);
>  _PRINT_FUNC(bool, bool);
> diff --git a/lib/color.c b/lib/color.c
> index 79d5e289..e597798f 100644
> --- a/lib/color.c
> +++ b/lib/color.c
> @@ -110,7 +110,7 @@ int color_fprintf(FILE *fp, enum color_attr attr, const char *fmt, ...)
>  	}
>  
>  	ret += fprintf(fp, "%s",
> -		       color_codes[attr_colors[is_dark_bg ? attr + 8 : attr]]);
> +		       color_codes[attr_colors[is_dark_bg ? attr + 6 : attr - 1]]);

Magic offsets (8 and -1) are error prone. Can this be changed to an enum value from colors?
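
One way to avoid offset arithmetic altogether is to index a two-dimensional
table with enum values. The sketch below is not iproute2's lib/color.c; all
enum names, the table contents and the function color_for() are hypothetical
and only show the shape of such a change.

```c
/* Hypothetical attribute list; COLOR_ATTR_MAX doubles as the table width. */
enum color_attr {
	COLOR_IFNAME,
	COLOR_MAC,
	COLOR_INET,
	COLOR_ATTR_MAX
};

/* Hypothetical color list. */
enum color {
	C_RED,
	C_GREEN,
	C_BOLD_RED,
	C_BOLD_GREEN,
	C_CLEAR
};

enum palette {
	PALETTE_LIGHT_BG,
	PALETTE_DARK_BG,
	PALETTE_MAX
};

/* One row per palette, so no "+ N" offset is needed to reach the dark set. */
static const enum color attr_colors[PALETTE_MAX][COLOR_ATTR_MAX] = {
	[PALETTE_LIGHT_BG] = {
		[COLOR_IFNAME] = C_RED,
		[COLOR_MAC] = C_GREEN,
		[COLOR_INET] = C_RED,
	},
	[PALETTE_DARK_BG] = {
		[COLOR_IFNAME] = C_BOLD_RED,
		[COLOR_MAC] = C_BOLD_GREEN,
		[COLOR_INET] = C_BOLD_RED,
	},
};

/* Look up the color for an attribute; out-of-range attributes get C_CLEAR
 * instead of indexing past the table.
 */
enum color color_for(enum color_attr attr, int is_dark_bg)
{
	if ((unsigned int)attr >= COLOR_ATTR_MAX)
		return C_CLEAR;
	return attr_colors[is_dark_bg ? PALETTE_DARK_BG : PALETTE_LIGHT_BG][attr];
}
```

Adding an attribute then means adding one enumerator and one entry per row,
and the bounds check guards against sentinel values such as -1.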


* Re: [PATCH iproute2 net-next] ip: mroute: Print offload indication
From: Stephen Hemminger @ 2017-10-11 17:55 UTC (permalink / raw)
  To: Yotam Gigi; +Cc: netdev, davem, mlxsw
In-Reply-To: <20171008144304.48850-1-yotamg@mellanox.com>

On Sun,  8 Oct 2017 17:43:04 +0300
Yotam Gigi <yotamg@mellanox.com> wrote:

> Since kernel net-next commit c7c0bbeae950 ("net: ipmr: Add MFC offload
> indication") the kernel indicates on an MFC entry whether it was offloaded
> using the RTNH_F_OFFLOAD flag. Update the "ip mroute show" command to
> indicate when a route is offloaded, similarly to the "ip route show"
> command.
> 
> Example output:
> $ ip mroute
> (0.0.0.0, 239.255.0.1)      Iif: sw1p7  Oifs: t_br0 State: resolved offload
> (192.168.1.1, 239.255.0.1)  Iif: sw1p7  Oifs: sw1p4 State: resolved offload
> 
> Signed-off-by: Yotam Gigi <yotamg@mellanox.com>

Looks good, applied.
Thanks.


* Re: [PATCH iproute2] iplink: new option to set neigh suppression on a bridge port
From: Stephen Hemminger @ 2017-10-11 17:58 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: netdev, nikolay
In-Reply-To: <1507610533-18724-1-git-send-email-roopa@cumulusnetworks.com>

On Mon,  9 Oct 2017 21:42:13 -0700
Roopa Prabhu <roopa@cumulusnetworks.com> wrote:

> From: Roopa Prabhu <roopa@cumulusnetworks.com>
> 
> neigh suppression can be used to suppress arp and nd flood
> to bridge ports. It maps to the recently added
> kernel support for bridge port flag IFLA_BRPORT_NEIGH_SUPPRESS.
> 
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>

Applied to net-next branch.


* [PATCH net-next 0/5] Enable ACB for bcm_sf2 and bcmsysport
From: Florian Fainelli @ 2017-10-11 17:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, andrew, vivien.didelot, Florian Fainelli

Hi all,

This patch series enables Broadcom's Advanced Congestion Buffering mechanism
which requires cooperation between the CPU/Management Ethernet MAC controller
and the switch.

I took the notifier approach because ultimately the information we need to
carry to the master network device is DSA specific and I saw little room for
generalizing beyond what DSA requires. Chances are that this is highly specific
to the Broadcom HW, as I don't know of any other HW that supports anything
similar for similar or identical needs.

Florian Fainelli (5):
  net: dsa: Add support for DSA specific notifiers
  net: dsa: tag_brcm: Indicate to master netdevice port + queue
  net: systemport: Establish lower/upper queue mapping
  net: dsa: bcm_sf2: Turn on ACB at the switch level
  net: systemport: Turn on ACB at the SYSTEMPORT level

 drivers/net/dsa/bcm_sf2.c                  |  30 ++++++++
 drivers/net/dsa/bcm_sf2_regs.h             |  23 ++++++
 drivers/net/ethernet/broadcom/bcmsysport.c | 119 ++++++++++++++++++++++++++++-
 drivers/net/ethernet/broadcom/bcmsysport.h |  11 ++-
 include/net/dsa.h                          |  50 ++++++++++++
 net/dsa/dsa.c                              |  23 ++++++
 net/dsa/slave.c                            |  13 ++++
 net/dsa/tag_brcm.c                         |   6 ++
 8 files changed, 270 insertions(+), 5 deletions(-)

-- 
2.9.3


* [PATCH net-next 1/5] net: dsa: Add support for DSA specific notifiers
From: Florian Fainelli @ 2017-10-11 17:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, andrew, vivien.didelot, Florian Fainelli
In-Reply-To: <20171011175752.22030-1-f.fainelli@gmail.com>

In preparation for communicating a given DSA network device's port
number and switch index, create a specialized DSA notifier and two
events, DSA_PORT_REGISTER and DSA_PORT_UNREGISTER, that communicate the
slave network device (slave_dev), the port number and the switch number
in the tree.

This will later be used by network device drivers like bcmsysport which
need to cooperate with their DSA network devices to set up queue mapping
and scheduling.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 include/net/dsa.h | 45 +++++++++++++++++++++++++++++++++++++++++++++
 net/dsa/dsa.c     | 23 +++++++++++++++++++++++
 net/dsa/slave.c   | 13 +++++++++++++
 3 files changed, 81 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 10dceccd9ce8..40a709a0754d 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -471,4 +471,49 @@ static inline int dsa_switch_resume(struct dsa_switch *ds)
 }
 #endif /* CONFIG_PM_SLEEP */
 
+enum dsa_notifier_type {
+	DSA_PORT_REGISTER,
+	DSA_PORT_UNREGISTER,
+};
+
+struct dsa_notifier_info {
+	struct net_device *dev;
+};
+
+struct dsa_notifier_register_info {
+	struct dsa_notifier_info info;	/* must be first */
+	struct net_device *master;
+	unsigned int port_number;
+	unsigned int switch_number;
+};
+
+static inline struct net_device *
+dsa_notifier_info_to_dev(const struct dsa_notifier_info *info)
+{
+	return info->dev;
+}
+
+#if IS_ENABLED(CONFIG_NET_DSA)
+int register_dsa_notifier(struct notifier_block *nb);
+int unregister_dsa_notifier(struct notifier_block *nb);
+int call_dsa_notifiers(unsigned long val, struct net_device *dev,
+		       struct dsa_notifier_info *info);
+#else
+static inline int register_dsa_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+
+static inline int unregister_dsa_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+
+static inline int call_dsa_notifiers(unsigned long val, struct net_device *dev,
+				     struct dsa_notifier_info *info)
+{
+	return NOTIFY_DONE;
+}
+#endif
+
 #endif
diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 51ca2a524a27..832c659ff993 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -14,6 +14,7 @@
 #include <linux/platform_device.h>
 #include <linux/slab.h>
 #include <linux/module.h>
+#include <linux/notifier.h>
 #include <linux/of.h>
 #include <linux/of_mdio.h>
 #include <linux/of_platform.h>
@@ -261,6 +262,28 @@ bool dsa_schedule_work(struct work_struct *work)
 	return queue_work(dsa_owq, work);
 }
 
+static ATOMIC_NOTIFIER_HEAD(dsa_notif_chain);
+
+int register_dsa_notifier(struct notifier_block *nb)
+{
+	return atomic_notifier_chain_register(&dsa_notif_chain, nb);
+}
+EXPORT_SYMBOL_GPL(register_dsa_notifier);
+
+int unregister_dsa_notifier(struct notifier_block *nb)
+{
+	return atomic_notifier_chain_unregister(&dsa_notif_chain, nb);
+}
+EXPORT_SYMBOL_GPL(unregister_dsa_notifier);
+
+int call_dsa_notifiers(unsigned long val, struct net_device *dev,
+		       struct dsa_notifier_info *info)
+{
+	info->dev = dev;
+	return atomic_notifier_call_chain(&dsa_notif_chain, val, info);
+}
+EXPORT_SYMBOL_GPL(call_dsa_notifiers);
+
 static int __init dsa_init_module(void)
 {
 	int rc;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index fb2954ff198c..45f4ea845c07 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1116,6 +1116,7 @@ int dsa_slave_resume(struct net_device *slave_dev)
 
 int dsa_slave_create(struct dsa_port *port, const char *name)
 {
+	struct dsa_notifier_register_info rinfo = { };
 	struct dsa_switch *ds = port->ds;
 	struct net_device *master;
 	struct net_device *slave_dev;
@@ -1177,6 +1178,12 @@ int dsa_slave_create(struct dsa_port *port, const char *name)
 		goto out_free;
 	}
 
+	rinfo.info.dev = slave_dev;
+	rinfo.master = master;
+	rinfo.port_number = p->dp->index;
+	rinfo.switch_number = p->dp->ds->index;
+	call_dsa_notifiers(DSA_PORT_REGISTER, slave_dev, &rinfo.info);
+
 	ret = register_netdev(slave_dev);
 	if (ret) {
 		netdev_err(master, "error %d registering interface %s\n",
@@ -1200,6 +1207,7 @@ int dsa_slave_create(struct dsa_port *port, const char *name)
 void dsa_slave_destroy(struct net_device *slave_dev)
 {
 	struct dsa_slave_priv *p = netdev_priv(slave_dev);
+	struct dsa_notifier_register_info rinfo = { };
 	struct device_node *port_dn;
 
 	port_dn = p->dp->dn;
@@ -1211,6 +1219,11 @@ void dsa_slave_destroy(struct net_device *slave_dev)
 		if (of_phy_is_fixed_link(port_dn))
 			of_phy_deregister_fixed_link(port_dn);
 	}
+	rinfo.info.dev = slave_dev;
+	rinfo.master = p->dp->cpu_dp->netdev;
+	rinfo.port_number = p->dp->index;
+	rinfo.switch_number = p->dp->ds->index;
+	call_dsa_notifiers(DSA_PORT_UNREGISTER, slave_dev, &rinfo.info);
 	unregister_netdev(slave_dev);
 	free_percpu(p->stats64);
 	free_netdev(slave_dev);
-- 
2.9.3
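
For readers wondering how a master driver would consume these events, the
following userspace mock shows the intended use of the "must be first"
layout: the callback receives the embedded struct dsa_notifier_info and
recovers the enclosing struct dsa_notifier_register_info. The structures
copy the patch; struct net_device, struct master_priv, container_of and
master_dsa_event() are simplified, hypothetical stand-ins, and a real kernel
callback would instead take (struct notifier_block *, unsigned long, void *).

```c
#include <stddef.h>

struct net_device { const char *name; };	/* stand-in for the kernel type */

enum dsa_notifier_type {
	DSA_PORT_REGISTER,
	DSA_PORT_UNREGISTER,
};

struct dsa_notifier_info {
	struct net_device *dev;
};

struct dsa_notifier_register_info {
	struct dsa_notifier_info info;	/* must be first */
	struct net_device *master;
	unsigned int port_number;
	unsigned int switch_number;
};

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical private state of a master device driver. */
struct master_priv {
	struct net_device *self;
	unsigned int ports_seen;
	unsigned int last_port;
};

/* Handle one DSA event. Returns 1 if it was for this master, 0 otherwise. */
int master_dsa_event(struct master_priv *priv, unsigned long event,
		     struct dsa_notifier_info *info)
{
	struct dsa_notifier_register_info *rinfo =
		container_of(info, struct dsa_notifier_register_info, info);

	if (rinfo->master != priv->self)
		return 0;	/* event belongs to another master device */

	switch (event) {
	case DSA_PORT_REGISTER:
		priv->ports_seen++;
		priv->last_port = rinfo->port_number;
		return 1;
	case DSA_PORT_UNREGISTER:
		if (priv->ports_seen)
			priv->ports_seen--;
		return 1;
	default:
		return 0;
	}
}
```

Filtering on rinfo->master matters because the notifier chain is global: every
registered listener sees the events of every DSA switch in the system.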


* [PATCH net-next 2/5] net: dsa: tag_brcm: Indicate to master netdevice port + queue
From: Florian Fainelli @ 2017-10-11 17:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, andrew, vivien.didelot, Florian Fainelli
In-Reply-To: <20171011175752.22030-1-f.fainelli@gmail.com>

We need to tell the DSA master network device doing the actual
transmission what the desired switch port and queue number is for it to
resolve that to the internal transmit queue it is mapped to.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 include/net/dsa.h  | 5 +++++
 net/dsa/tag_brcm.c | 6 ++++++
 2 files changed, 11 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 40a709a0754d..ce1d622734d7 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -516,4 +516,9 @@ static inline int call_dsa_notifiers(unsigned long val, struct net_device *dev,
 }
 #endif
 
+/* Broadcom tag specific helpers to insert and extract queue/port number */
+#define BRCM_TAG_SET_PORT_QUEUE(p, q)	((p) << 8 | q)
+#define BRCM_TAG_GET_PORT(v)		((v) >> 8)
+#define BRCM_TAG_GET_QUEUE(v)		((v) & 0xff)
+
 #endif
diff --git a/net/dsa/tag_brcm.c b/net/dsa/tag_brcm.c
index 8e4bdb9d9ae3..cc4f472fbd77 100644
--- a/net/dsa/tag_brcm.c
+++ b/net/dsa/tag_brcm.c
@@ -86,6 +86,12 @@ static struct sk_buff *brcm_tag_xmit(struct sk_buff *skb, struct net_device *dev
 		brcm_tag[2] = BRCM_IG_DSTMAP2_MASK;
 	brcm_tag[3] = (1 << p->dp->index) & BRCM_IG_DSTMAP1_MASK;
 
+	/* Now tell the master network device about the desired output queue
+	 * as well
+	 */
+	skb_set_queue_mapping(skb, BRCM_TAG_SET_PORT_QUEUE(p->dp->index,
+							   queue));
+
 	return skb;
 }
 
-- 
2.9.3
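
The encoding carried in the skb queue mapping is simply "port in the upper
byte, queue in the lower byte". A standalone sketch of the round trip
follows; the three macros are taken from the patch except that q is
parenthesized here so that an expression argument groups as intended, and
pack_port_queue() is a hypothetical wrapper assuming a 16-bit queue mapping
field.

```c
#include <stdint.h>

/* Same layout as the patch; (q) is parenthesized in this sketch. */
#define BRCM_TAG_SET_PORT_QUEUE(p, q)	((p) << 8 | (q))
#define BRCM_TAG_GET_PORT(v)		((v) >> 8)
#define BRCM_TAG_GET_QUEUE(v)		((v) & 0xff)

/* Hypothetical wrapper: pack a port and queue into a 16-bit mapping value.
 * Both inputs are masked to 8 bits so neither can spill into the other.
 */
uint16_t pack_port_queue(unsigned int port, unsigned int queue)
{
	return (uint16_t)BRCM_TAG_SET_PORT_QUEUE(port & 0xff, queue & 0xff);
}
```

For example, port 3 and queue 5 pack to 0x0305, from which
BRCM_TAG_GET_PORT() and BRCM_TAG_GET_QUEUE() recover 3 and 5.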


* [PATCH net-next 3/5] net: systemport: Establish lower/upper queue mapping
From: Florian Fainelli @ 2017-10-11 17:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, andrew, vivien.didelot, Florian Fainelli
In-Reply-To: <20171011175752.22030-1-f.fainelli@gmail.com>

Establish a queue mapping between the DSA slave network device queues
created that correspond to switch port queues, and the transmit queue
that SYSTEMPORT manages.

We need to configure the SYSTEMPORT transmit queue with the switch port number
and switch port queue number in order for the switch and SYSTEMPORT hardware to
utilize the out of band congestion notification. This hardware mechanism works
by looking at the switch port egress queue and determining whether there are
enough buffers for this queue, with that class of service, for a successful
transmission; if not, it backpressures the SYSTEMPORT queue that is being used.
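
A minimal sketch of how the port and queue numbers end up in the ring
mapping register, using the mask and shift values from bcmsysport.h as
changed by this patch. ring_mapping_value() is a hypothetical helper
that works on a plain integer instead of going through
tdma_readl()/tdma_writel().

```c
#include <assert.h>
#include <stdint.h>

/* Field definitions as they appear in bcmsysport.h after this patch */
#define RING_QID_MASK		0x7
#define RING_PORT_ID_SHIFT	3
#define RING_PORT_ID_MASK	0x7
#define RING_IGNORE_STATUS	(1 << 6)

/* Hypothetical helper: compute the new mapping register value from the
 * current one, following the read-modify-write done in
 * bcm_sysport_init_tx_ring().
 */
static uint32_t ring_mapping_value(uint32_t reg, unsigned int port,
				   unsigned int queue)
{
	assert(port <= RING_PORT_ID_MASK && queue <= RING_QID_MASK);

	/* Clear the queue ID (bits 2:0) and port ID (bits 5:3) fields */
	reg &= ~(uint32_t)(RING_QID_MASK |
			   RING_PORT_ID_MASK << RING_PORT_ID_SHIFT);
	reg |= queue & RING_QID_MASK;
	reg |= (port & RING_PORT_ID_MASK) << RING_PORT_ID_SHIFT;
	/* This patch still leaves the out of band status ignored */
	reg |= RING_IGNORE_STATUS;

	return reg;
}
```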

For this to work, we implement a notifier which looks at the
DSA_PORT_REGISTER event. When DSA network devices are registered, the
framework calls the DSA notifiers; our notifier extracts the number of
queues for these devices and their associated port number, remembers
that in the driver private structure and linearly maps those queues to
TX rings/queues that we manage.

This scheme works because DSA slave network devices always transmit
through SYSTEMPORT so when DSA slave network devices are
destroyed/brought down, the corresponding SYSTEMPORT queues are no
longer used. Also, by design of the DSA framework, the master network
device (SYSTEMPORT) is registered first.

For faster lookups we use an array of up to DSA_MAX_PORTS * number of
queues per port, and then map pointers to bcm_sysport_tx_ring such that
our ndo_select_queue() implementation can just index into that array to
locate the corresponding ring index.
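
The lookup table can be pictured with the following sketch. The struct
and function names are hypothetical stand-ins for bcm_sysport_priv and
bcm_sysport_tx_ring, and MAX_PORTS is an assumed value standing in for
DSA_MAX_PORTS; only the "q + port * queues per port" indexing is taken
from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS		12	/* assumption: stands in for DSA_MAX_PORTS */
#define MAX_QUEUES_PER_PORT	8

/* Hypothetical, trimmed-down stand-in for struct bcm_sysport_tx_ring */
struct tx_ring {
	unsigned int index;		/* local TX ring index */
	unsigned int switch_queue;	/* switch port queue number */
	unsigned int switch_port;	/* switch port number */
};

/* Hypothetical stand-in for the mapping part of struct bcm_sysport_priv */
struct ring_table {
	unsigned int per_port_num_tx_queues;
	struct tx_ring *ring_map[MAX_PORTS * MAX_QUEUES_PER_PORT];
};

/* Flat slot for a (port, queue) pair: all queues of one port are contiguous */
static size_t ring_map_slot(const struct ring_table *t, unsigned int port,
			    unsigned int q)
{
	assert(t->per_port_num_tx_queues <= MAX_QUEUES_PER_PORT);
	assert(port < MAX_PORTS && q < t->per_port_num_tx_queues);

	return q + (size_t)port * t->per_port_num_tx_queues;
}

/* Record which local ring serves the ring's switch port/queue pair */
static void ring_map_set(struct ring_table *t, struct tx_ring *ring)
{
	t->ring_map[ring_map_slot(t, ring->switch_port,
				  ring->switch_queue)] = ring;
}

/* Constant-time lookup, as an ndo_select_queue() implementation would do */
static struct tx_ring *ring_map_get(const struct ring_table *t,
				    unsigned int port, unsigned int q)
{
	return t->ring_map[ring_map_slot(t, port, q)];
}
```

With 8 queues per port, port 2 queue 1 lands in slot 17, so the select
queue path is a single array access with no searching.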

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 115 ++++++++++++++++++++++++++++-
 drivers/net/ethernet/broadcom/bcmsysport.h |  11 ++-
 2 files changed, 121 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 83eec9a8c275..78bed9a84e81 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -1416,7 +1416,14 @@ static int bcm_sysport_init_tx_ring(struct bcm_sysport_priv *priv,
 	tdma_writel(priv, 0, TDMA_DESC_RING_COUNT(index));
 	tdma_writel(priv, 1, TDMA_DESC_RING_INTR_CONTROL(index));
 	tdma_writel(priv, 0, TDMA_DESC_RING_PROD_CONS_INDEX(index));
-	tdma_writel(priv, RING_IGNORE_STATUS, TDMA_DESC_RING_MAPPING(index));
+
+	/* Configure QID and port mapping */
+	reg = tdma_readl(priv, TDMA_DESC_RING_MAPPING(index));
+	reg &= ~(RING_QID_MASK | RING_PORT_ID_MASK << RING_PORT_ID_SHIFT);
+	reg |= ring->switch_queue & RING_QID_MASK;
+	reg |= ring->switch_port << RING_PORT_ID_SHIFT;
+	reg |= RING_IGNORE_STATUS;
+	tdma_writel(priv, reg, TDMA_DESC_RING_MAPPING(index));
 	tdma_writel(priv, 0, TDMA_DESC_RING_PCP_DEI_VID(index));
 
 	/* Do not use tdma_control_bit() here because TSB_SWAP1 collides
@@ -1447,8 +1454,9 @@ static int bcm_sysport_init_tx_ring(struct bcm_sysport_priv *priv,
 	napi_enable(&ring->napi);
 
 	netif_dbg(priv, hw, priv->netdev,
-		  "TDMA cfg, size=%d, desc_cpu=%p\n",
-		  ring->size, ring->desc_cpu);
+		  "TDMA cfg, size=%d, desc_cpu=%p switch q=%d,port=%d\n",
+		  ring->size, ring->desc_cpu, ring->switch_queue,
+		  ring->switch_port);
 
 	return 0;
 }
@@ -2011,6 +2019,92 @@ static const struct ethtool_ops bcm_sysport_ethtool_ops = {
 	.set_link_ksettings     = phy_ethtool_set_link_ksettings,
 };
 
+static u16 bcm_sysport_select_queue(struct net_device *dev, struct sk_buff *skb,
+				    void *accel_priv,
+				    select_queue_fallback_t fallback)
+{
+	struct bcm_sysport_priv *priv = netdev_priv(dev);
+	u16 queue = skb_get_queue_mapping(skb);
+	struct bcm_sysport_tx_ring *tx_ring;
+	unsigned int q, port;
+
+	if (!netdev_uses_dsa(dev))
+		return fallback(dev, skb);
+
+	/* DSA tagging layer will have configured the correct queue */
+	q = BRCM_TAG_GET_QUEUE(queue);
+	port = BRCM_TAG_GET_PORT(queue);
+	tx_ring = priv->ring_map[q + port * priv->per_port_num_tx_queues];
+
+	return tx_ring->index;
+}
+
+static int bcm_sysport_map_queues(struct net_device *dev,
+				  struct dsa_notifier_register_info *info)
+{
+	struct bcm_sysport_priv *priv = netdev_priv(dev);
+	struct bcm_sysport_tx_ring *ring;
+	struct net_device *slave_dev;
+	unsigned int num_tx_queues;
+	unsigned int q, start, port;
+
+	/* We can't be setting up queue inspection for non directly attached
+	 * switches
+	 */
+	if (info->switch_number)
+		return 0;
+
+	port = info->port_number;
+	slave_dev = info->info.dev;
+
+	/* On SYSTEMPORT Lite we have half as many queues, so we cannot do a
+	 * 1:1 mapping, we can only do a 2:1 mapping. By reducing the number of
+	 * per-port (slave_dev) network device queues, we achieve just that.
+	 * This needs to happen now, before any slave network device is used,
+	 * such that it accurately reflects the number of real TX queues.
+	 */
+	if (priv->is_lite)
+		netif_set_real_num_tx_queues(slave_dev,
+					     slave_dev->num_tx_queues / 2);
+	num_tx_queues = slave_dev->real_num_tx_queues;
+
+	if (priv->per_port_num_tx_queues &&
+	    priv->per_port_num_tx_queues != num_tx_queues)
+		netdev_warn(slave_dev, "asymmetric number of per-port queues\n");
+
+	priv->per_port_num_tx_queues = num_tx_queues;
+
+	start = find_first_zero_bit(&priv->queue_bitmap, dev->num_tx_queues);
+	for (q = 0; q < num_tx_queues; q++) {
+		ring = &priv->tx_rings[q + start];
+
+		/* Just remember the mapping; the actual programming is done
+		 * during bcm_sysport_init_tx_ring
+		 */
+		ring->switch_queue = q;
+		ring->switch_port = port;
+		priv->ring_map[q + port * num_tx_queues] = ring;
+
+		/* Set all queues as being used now */
+		set_bit(q + start, &priv->queue_bitmap);
+	}
+
+	return 0;
+}
+
+static int bcm_sysport_dsa_notifier(struct notifier_block *unused,
+				    unsigned long event, void *ptr)
+{
+	struct dsa_notifier_register_info *info;
+
+	if (event != DSA_PORT_REGISTER)
+		return NOTIFY_DONE;
+
+	info = ptr;
+
+	return notifier_from_errno(bcm_sysport_map_queues(info->master, info));
+}
+
 static const struct net_device_ops bcm_sysport_netdev_ops = {
 	.ndo_start_xmit		= bcm_sysport_xmit,
 	.ndo_tx_timeout		= bcm_sysport_tx_timeout,
@@ -2023,6 +2117,7 @@ static const struct net_device_ops bcm_sysport_netdev_ops = {
 	.ndo_poll_controller	= bcm_sysport_poll_controller,
 #endif
 	.ndo_get_stats64	= bcm_sysport_get_stats64,
+	.ndo_select_queue	= bcm_sysport_select_queue,
 };
 
 #define REV_FMT	"v%2x.%02x"
@@ -2172,10 +2267,18 @@ static int bcm_sysport_probe(struct platform_device *pdev)
 
 	u64_stats_init(&priv->syncp);
 
+	priv->dsa_notifier.notifier_call = bcm_sysport_dsa_notifier;
+
+	ret = register_dsa_notifier(&priv->dsa_notifier);
+	if (ret) {
+		dev_err(&pdev->dev, "failed to register DSA notifier\n");
+		goto err_deregister_fixed_link;
+	}
+
 	ret = register_netdev(dev);
 	if (ret) {
 		dev_err(&pdev->dev, "failed to register net_device\n");
-		goto err_deregister_fixed_link;
+		goto err_deregister_notifier;
 	}
 
 	priv->rev = topctrl_readl(priv, REV_CNTL) & REV_MASK;
@@ -2188,6 +2291,8 @@ static int bcm_sysport_probe(struct platform_device *pdev)
 
 	return 0;
 
+err_deregister_notifier:
+	unregister_dsa_notifier(&priv->dsa_notifier);
 err_deregister_fixed_link:
 	if (of_phy_is_fixed_link(dn))
 		of_phy_deregister_fixed_link(dn);
@@ -2199,11 +2304,13 @@ static int bcm_sysport_probe(struct platform_device *pdev)
 static int bcm_sysport_remove(struct platform_device *pdev)
 {
 	struct net_device *dev = dev_get_drvdata(&pdev->dev);
+	struct bcm_sysport_priv *priv = netdev_priv(dev);
 	struct device_node *dn = pdev->dev.of_node;
 
 	/* Not much to do, ndo_close has been called
 	 * and we use managed allocations
 	 */
+	unregister_dsa_notifier(&priv->dsa_notifier);
 	unregister_netdev(dev);
 	if (of_phy_is_fixed_link(dn))
 		of_phy_deregister_fixed_link(dn);
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.h b/drivers/net/ethernet/broadcom/bcmsysport.h
index 82e401df199e..82f70a6783cb 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.h
+++ b/drivers/net/ethernet/broadcom/bcmsysport.h
@@ -404,7 +404,7 @@ struct bcm_rsb {
 #define  RING_CONS_INDEX_MASK		0xffff
 
 #define RING_MAPPING			0x14
-#define  RING_QID_MASK			0x3
+#define  RING_QID_MASK			0x7
 #define  RING_PORT_ID_SHIFT		3
 #define  RING_PORT_ID_MASK		0x7
 #define  RING_IGNORE_STATUS		(1 << 6)
@@ -712,6 +712,8 @@ struct bcm_sysport_tx_ring {
 	struct bcm_sysport_priv *priv;	/* private context backpointer */
 	unsigned long	packets;	/* packets statistics */
 	unsigned long	bytes;		/* bytes statistics */
+	unsigned int	switch_queue;	/* switch port queue number */
+	unsigned int	switch_port;	/* switch port number */
 };
 
 /* Driver private structure */
@@ -765,5 +767,12 @@ struct bcm_sysport_priv {
 
 	/* For atomic update generic 64bit value on 32bit Machine */
 	struct u64_stats_sync	syncp;
+
+	/* map information between switch port queues and local queues */
+	struct notifier_block	dsa_notifier;
+	unsigned int		per_port_num_tx_queues;
+	unsigned long		queue_bitmap;
+	struct bcm_sysport_tx_ring *ring_map[DSA_MAX_PORTS * 8];
+
 };
 #endif /* __BCM_SYSPORT_H */
-- 
2.9.3


* [PATCH net-next 4/5] net: dsa: bcm_sf2: Turn on ACB at the switch level
From: Florian Fainelli @ 2017-10-11 17:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, andrew, vivien.didelot, Florian Fainelli
In-Reply-To: <20171011175752.22030-1-f.fainelli@gmail.com>

Turn on the out of band Advanced Congestion Buffering (ACB) mechanism at
the switch level now that we have properly established the queue mapping
between the switch egress queues and the SYSTEMPORT egress queues. This
allows the switch to correctly backpressure the host system when one of
its queues drops below the configured thresholds.

This also helps achieve so-called "lossless" behavior by adapting
the TX interrupt pacing to the actual speed and capacity of the switch
port.
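
For reference, the per-queue ACB configuration registers are laid out
as one 32-bit register per (port, queue) pair, and only the XOFF
threshold field is rewritten. The sketch below uses the offsets and
masks from bcm_sf2_regs.h in this patch; the value of
SF2_NUM_EGRESS_QUEUES is assumed to be 8, and both helper functions are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SF2_NUM_EGRESS_QUEUES	8	/* assumption: not shown in this patch */

/* Definitions taken from bcm_sf2_regs.h as added by this patch */
#define ACB_QUEUE_0_CFG		0x08
#define  XOFF_THRESHOLD_MASK	0x7ff
#define ACB_QUEUE_CFG(x)	(ACB_QUEUE_0_CFG + ((x) * 0x4))

/* Hypothetical helper: register offset for a given port and egress queue */
static uint32_t acb_queue_cfg_offset(unsigned int port, unsigned int queue)
{
	assert(queue < SF2_NUM_EGRESS_QUEUES);

	return ACB_QUEUE_CFG(port * SF2_NUM_EGRESS_QUEUES + queue);
}

/* Hypothetical helper: replace the XOFF threshold field (bits 10:0) and
 * keep every other bit of the register untouched.
 */
static uint32_t acb_set_xoff_threshold(uint32_t reg, uint32_t threshold)
{
	assert(threshold <= XOFF_THRESHOLD_MASK);

	reg &= ~(uint32_t)XOFF_THRESHOLD_MASK;
	reg |= threshold;

	return reg;
}
```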

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/dsa/bcm_sf2.c      | 30 ++++++++++++++++++++++++++++++
 drivers/net/dsa/bcm_sf2_regs.h | 23 +++++++++++++++++++++++
 2 files changed, 53 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 7aecc98d0a18..32025b990437 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -205,6 +205,19 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, int port,
 	if (port == priv->moca_port)
 		bcm_sf2_port_intr_enable(priv, port);
 
+	/* Set per-queue pause threshold to 32 */
+	core_writel(priv, 32, CORE_TXQ_THD_PAUSE_QN_PORT(port));
+
+	/* Set ACB threshold to 24 */
+	for (i = 0; i < SF2_NUM_EGRESS_QUEUES; i++) {
+		reg = acb_readl(priv, ACB_QUEUE_CFG(port *
+						    SF2_NUM_EGRESS_QUEUES + i));
+		reg &= ~XOFF_THRESHOLD_MASK;
+		reg |= 24;
+		acb_writel(priv, reg, ACB_QUEUE_CFG(port *
+						    SF2_NUM_EGRESS_QUEUES + i));
+	}
+
 	return b53_enable_port(ds, port, phy);
 }
 
@@ -613,6 +626,20 @@ static void bcm_sf2_sw_fixed_link_update(struct dsa_switch *ds, int port,
 		status->pause = 1;
 }
 
+static void bcm_sf2_enable_acb(struct dsa_switch *ds)
+{
+	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
+	u32 reg;
+
+	/* Enable ACB globally */
+	reg = acb_readl(priv, ACB_CONTROL);
+	reg |= (ACB_FLUSH_MASK << ACB_FLUSH_SHIFT);
+	acb_writel(priv, reg, ACB_CONTROL);
+	reg &= ~(ACB_FLUSH_MASK << ACB_FLUSH_SHIFT);
+	reg |= ACB_EN | ACB_ALGORITHM;
+	acb_writel(priv, reg, ACB_CONTROL);
+}
+
 static int bcm_sf2_sw_suspend(struct dsa_switch *ds)
 {
 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
@@ -655,6 +682,8 @@ static int bcm_sf2_sw_resume(struct dsa_switch *ds)
 			bcm_sf2_imp_setup(ds, port);
 	}
 
+	bcm_sf2_enable_acb(ds);
+
 	return 0;
 }
 
@@ -766,6 +795,7 @@ static int bcm_sf2_sw_setup(struct dsa_switch *ds)
 	}
 
 	bcm_sf2_sw_configure_vlan(ds);
+	bcm_sf2_enable_acb(ds);
 
 	return 0;
 }
diff --git a/drivers/net/dsa/bcm_sf2_regs.h b/drivers/net/dsa/bcm_sf2_regs.h
index d8b8074a47b9..d1596dfca323 100644
--- a/drivers/net/dsa/bcm_sf2_regs.h
+++ b/drivers/net/dsa/bcm_sf2_regs.h
@@ -115,6 +115,24 @@ enum bcm_sf2_reg_offs {
 #define P7_IRQ_OFF			0
 #define P_IRQ_OFF(x)			((6 - (x)) * P_NUM_IRQ)
 
+/* Register set relative to 'ACB' */
+#define ACB_CONTROL			0x00
+#define  ACB_EN				(1 << 0)
+#define  ACB_ALGORITHM			(1 << 1)
+#define  ACB_FLUSH_SHIFT		2
+#define  ACB_FLUSH_MASK			0x3
+
+#define ACB_QUEUE_0_CFG			0x08
+#define  XOFF_THRESHOLD_MASK		0x7ff
+#define  XON_EN				(1 << 11)
+#define  TOTAL_XOFF_THRESHOLD_SHIFT	12
+#define  TOTAL_XOFF_THRESHOLD_MASK	0x7ff
+#define  TOTAL_XOFF_EN			(1 << 23)
+#define  TOTAL_XON_EN			(1 << 24)
+#define  PKTLEN_SHIFT			25
+#define  PKTLEN_MASK			0x3f
+#define ACB_QUEUE_CFG(x)		(ACB_QUEUE_0_CFG + ((x) * 0x4))
+
 /* Register set relative to 'CORE' */
 #define CORE_G_PCTL_PORT0		0x00000
 #define CORE_G_PCTL_PORT(x)		(CORE_G_PCTL_PORT0 + (x * 0x4))
@@ -237,6 +255,11 @@ enum bcm_sf2_reg_offs {
 #define CORE_PORT_VLAN_CTL_PORT(x)	(0xc400 + ((x) * 0x8))
 #define  PORT_VLAN_CTRL_MASK		0x1ff
 
+#define CORE_TXQ_THD_PAUSE_QN_PORT_0	0x2c80
+#define  TXQ_PAUSE_THD_MASK		0x7ff
+#define CORE_TXQ_THD_PAUSE_QN_PORT(x)	(CORE_TXQ_THD_PAUSE_QN_PORT_0 + \
+					(x) * 0x8)
+
 #define CORE_DEFAULT_1Q_TAG_P(x)	(0xd040 + ((x) * 8))
 #define  CFI_SHIFT			12
 #define  PRI_SHIFT			13
-- 
2.9.3


* [PATCH net-next 5/5] net: systemport: Turn on ACB at the SYSTEMPORT level
From: Florian Fainelli @ 2017-10-11 17:57 UTC (permalink / raw)
  To: netdev; +Cc: davem, andrew, vivien.didelot, Florian Fainelli
In-Reply-To: <20171011175752.22030-1-f.fainelli@gmail.com>

Now that we have established the queue mapping between the switch port
egress queues and the SYSTEMPORT egress queues, we can turn on Advanced
Congestion Buffering (ACB) at the SYSTEMPORT level. This enables the
Ethernet MAC controller to get out of band flow control information
directly from the switch port and queue that it monitors such that its
internal TDMA can be appropriately backpressured.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/ethernet/broadcom/bcmsysport.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 78bed9a84e81..dafc26690555 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -1422,10 +1422,14 @@ static int bcm_sysport_init_tx_ring(struct bcm_sysport_priv *priv,
 	reg &= ~(RING_QID_MASK | RING_PORT_ID_MASK << RING_PORT_ID_SHIFT);
 	reg |= ring->switch_queue & RING_QID_MASK;
 	reg |= ring->switch_port << RING_PORT_ID_SHIFT;
-	reg |= RING_IGNORE_STATUS;
 	tdma_writel(priv, reg, TDMA_DESC_RING_MAPPING(index));
 	tdma_writel(priv, 0, TDMA_DESC_RING_PCP_DEI_VID(index));
 
+	/* Enable ACB algorithm 2 */
+	reg = tdma_readl(priv, TDMA_CONTROL);
+	reg |= tdma_control_bit(priv, ACB_ALGO);
+	tdma_writel(priv, reg, TDMA_CONTROL);
+
 	/* Do not use tdma_control_bit() here because TSB_SWAP1 collides
 	 * with the original definition of ACB_ALGO
 	 */
-- 
2.9.3


* Re: [PATCH] tests: Remove bashisms (s/source/.)
From: Stephen Hemminger @ 2017-10-11 18:01 UTC (permalink / raw)
  To: Petr Vorel; +Cc: netdev
In-Reply-To: <20171008143916.21813-1-petr.vorel@gmail.com>

On Sun,  8 Oct 2017 16:39:16 +0200
Petr Vorel <petr.vorel@gmail.com> wrote:

> Signed-off-by: Petr Vorel <petr.vorel@gmail.com>

Ok, applied. But iproute2 is really limited to Linux and bash is lingua franca on Linux


* Re: [PATCH] tests: Remove bashisms (s/source/.)
From: Randy Dunlap @ 2017-10-11 18:02 UTC (permalink / raw)
  To: Stephen Hemminger, Petr Vorel; +Cc: netdev
In-Reply-To: <20171011110115.13739488@xeon-e3>

On 10/11/17 11:01, Stephen Hemminger wrote:
> On Sun,  8 Oct 2017 16:39:16 +0200
> Petr Vorel <petr.vorel@gmail.com> wrote:
> 
>> Signed-off-by: Petr Vorel <petr.vorel@gmail.com>
> 
> Ok, applied. But iproute2 is really limited to Linux and bash is lingua franca on Linux
> 

no French, please. Some distros use dash, but being POSIX is usually good.

-- 
~Randy


* Re: [PATCH v3] lib: fix multiple strlcpy definition
From: Stephen Hemminger @ 2017-10-11 18:03 UTC (permalink / raw)
  To: Baruch Siach; +Cc: netdev, Phil Sutter
In-Reply-To: <9a00dc2dca0650efd7e169db9bb15ae1ec043c08.1507528184.git.baruch@tkos.co.il>

On Mon,  9 Oct 2017 08:49:44 +0300
Baruch Siach <baruch@tkos.co.il> wrote:

> Some C libraries, like uClibc and musl, provide BSD compatible
> strlcpy(). Add check_strlcpy() to configure, and avoid defining strlcpy
> and strlcat when the C library provides them.
> 
> This fixes the following static link error with uClibc-ng:
> 
> .../sysroot/usr/lib/libc.a(strlcpy.os): In function `strlcpy':
> strlcpy.c:(.text+0x0): multiple definition of `strlcpy'
> ../lib/libutil.a(utils.o):utils.c:(.text+0x1ddc): first defined here
> collect2: error: ld returned 1 exit status
> 
> Acked-by: Phil Sutter <phil@nwl.cc>
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>

Thanks for fixing. Most people never use other versions of libc
so things get broken rather often.


* Re: [PATCH iproute2 2/2] ss: print MD5 signature keys configured on TCP sockets
From: Stephen Hemminger @ 2017-10-11 18:06 UTC (permalink / raw)
  To: Ivan Delalande; +Cc: netdev
In-Reply-To: <20171006234820.27567-2-colona@arista.com>

On Fri,  6 Oct 2017 16:48:20 -0700
Ivan Delalande <colona@arista.com> wrote:

> These keys are reported by kernel 4.14 and later under the
> INET_DIAG_MD5SIG attribute, when INET_DIAG_INFO is requested (ss -i)
> and we have CAP_NET_ADMIN. The additional output looks like:
> 
> 	md5keys:fe80::/64=signing_key,10.1.2.0/24=foobar,::1/128=Test
> 
> Signed-off-by: Ivan Delalande <colona@arista.com>

Sure, makes sense. Applied.

