Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}
From: Peter Zijlstra @ 2018-10-19  8:04 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, paulmck, will.deacon, acme, yhs, john.fastabend,
	netdev
In-Reply-To: <20181018153307.ayvmq6du3gnsyvro@ast-mbp.dhcp.thefacebook.com>

On Thu, Oct 18, 2018 at 08:33:09AM -0700, Alexei Starovoitov wrote:
> On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote:
> >  #endif /* _TOOLS_LINUX_ASM_IA64_BARRIER_H */
> > diff --git a/tools/arch/powerpc/include/asm/barrier.h b/tools/arch/powerpc/include/asm/barrier.h
> > index a634da0..905a2c6 100644
> > --- a/tools/arch/powerpc/include/asm/barrier.h
> > +++ b/tools/arch/powerpc/include/asm/barrier.h
> > @@ -27,4 +27,20 @@
> >  #define rmb()  __asm__ __volatile__ ("sync" : : : "memory")
> >  #define wmb()  __asm__ __volatile__ ("sync" : : : "memory")
> > 
> > +#if defined(__powerpc64__)
> > +#define smp_lwsync()	__asm__ __volatile__ ("lwsync" : : : "memory")
> > +
> > +#define smp_store_release(p, v)			\
> > +do {						\
> > +	smp_lwsync();				\
> > +	WRITE_ONCE(*p, v);			\
> > +} while (0)
> > +
> > +#define smp_load_acquire(p)			\
> > +({						\
> > +	typeof(*p) ___p1 = READ_ONCE(*p);	\
> > +	smp_lwsync();				\
> > +	___p1;					\
> 
> I don't like this proliferation of asm.
> Why do we think that we can do better job than compiler?
> can we please use gcc builtins instead?
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
> __atomic_store_n(ptr, val, __ATOMIC_RELEASE);
> are done specifically for this use case if I'm not mistaken.
> I think it pays to learn what compiler provides.

My problem with using the C11 stuff for this is that we're then limited
to compilers that actually support that. The kernel has a minimum of
gcc-4.6 (and thus perf does too I think) and gcc-4.6 does not have C11.

What Daniel writes is also true; the kernel and C11 memory models don't
align; but you're right in that for this purpose the C11 load-acquire
and store-release would indeed suffice.

^ permalink raw reply

* Re: [PATCH bpf-next 2/2] samples: bpf: get ifindex from ifname
From: Matteo Croce @ 2018-10-19  8:42 UTC (permalink / raw)
  To: ys114321; +Cc: netdev, Alexei Starovoitov, Daniel Borkmann
In-Reply-To: <CAH3MdRUM+Dw3-Es45bjyGdyF8QO6Vb2-EPf9HStb2_1L5pOErA@mail.gmail.com>

On Fri, Oct 19, 2018 at 5:35 AM Y Song <ys114321@gmail.com> wrote:
>
> On Thu, Oct 18, 2018 at 1:48 PM Matteo Croce <mcroce@redhat.com> wrote:
> >
> > Find the ifindex via ioctl(SIOCGIFINDEX) instead of requiring the
> > numeric ifindex.
>
> Maybe use if_nametoindex which is simpler?
>
> >
> > Signed-off-by: Matteo Croce <mcroce@redhat.com>
> > ---
> >  samples/bpf/xdp1_user.c | 26 ++++++++++++++++++++++++--
> >  1 file changed, 24 insertions(+), 2 deletions(-)
> >
> > diff --git a/samples/bpf/xdp1_user.c b/samples/bpf/xdp1_user.c
> > index 4f3d824fc044..a1d0c5dcee9c 100644
> > --- a/samples/bpf/xdp1_user.c
> > +++ b/samples/bpf/xdp1_user.c
> > @@ -15,6 +15,9 @@
> >  #include <unistd.h>
> >  #include <libgen.h>
> >  #include <sys/resource.h>
> > +#include <sys/ioctl.h>
> > +#include <sys/socket.h>
> > +#include <linux/if.h>
> >
> >  #include "bpf_util.h"
> >  #include "bpf/bpf.h"
> > @@ -59,7 +62,7 @@ static void poll_stats(int map_fd, int interval)
> >  static void usage(const char *prog)
> >  {
> >         fprintf(stderr,
> > -               "usage: %s [OPTS] IFINDEX\n\n"
> > +               "usage: %s [OPTS] IFACE\n\n"
> >                 "OPTS:\n"
> >                 "    -S    use skb-mode\n"
> >                 "    -N    enforce native mode\n",
> > @@ -74,9 +77,11 @@ int main(int argc, char **argv)
> >         };
> >         const char *optstr = "SN";
> >         int prog_fd, map_fd, opt;
> > +       struct ifreq ifr = { 0 };
> >         struct bpf_object *obj;
> >         struct bpf_map *map;
> >         char filename[256];
> > +       int sock;
> >
> >         while ((opt = getopt(argc, argv, optstr)) != -1) {
> >                 switch (opt) {
> > @@ -102,7 +107,24 @@ int main(int argc, char **argv)
> >                 return 1;
> >         }
> >
> > -       ifindex = strtoul(argv[optind], NULL, 0);
> > +       sock = socket(AF_UNIX, SOCK_DGRAM, 0);
> > +       if (sock == -1) {
> > +               perror("socket");
> > +               return 1;
> > +       }
> > +
> > +       if (strlen(argv[optind]) >= IFNAMSIZ) {
> > +               printf("invalid ifname '%s'\n", argv[optind]);
> > +               return 1;
> > +       }
> > +
> > +       strcpy(ifr.ifr_name, argv[optind]);
> > +       if (ioctl(sock, SIOCGIFINDEX, &ifr) < 0) {
> > +               perror("SIOCGIFINDEX");
> > +               return 1;
> > +       }
> > +       close(sock);
> > +       ifindex = ifr.ifr_ifindex;
> >
> >         snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
> >         prog_load_attr.file = filename;
> > --
> > 2.19.1
> >

Right, even better. Will do a v2, thanks.

-- 
Matteo Croce
per aspera ad upstream

^ permalink raw reply

* Re: [net-next PATCH] net: sched: cls_flower: Classify packets using port ranges
From: Jiri Pirko @ 2018-10-19  8:52 UTC (permalink / raw)
  To: Nambiar, Amritha
  Cc: netdev, davem, jakub.kicinski, sridhar.samudrala, jhs,
	xiyou.wangcong
In-Reply-To: <28d1bd21-fd85-07e5-c6e6-18e7a0b19da4@intel.com>

Thu, Oct 18, 2018 at 08:24:44PM CEST, amritha.nambiar@intel.com wrote:
>On 10/18/2018 5:17 AM, Jiri Pirko wrote:
>> Fri, Oct 12, 2018 at 03:53:30PM CEST, amritha.nambiar@intel.com wrote:
>>> Added support in tc flower for filtering based on port ranges.
>>> This is a rework of the RFC patch at:
>>> https://patchwork.ozlabs.org/patch/969595/
>>>
>>> Example:
>>> 1. Match on a port range:
>>> -------------------------
>>> $ tc filter add dev enp4s0 protocol ip parent ffff:\
>>>  prio 1 flower ip_proto tcp dst_port range 20-30 skip_hw\
>>>  action drop
>>>
>>> $ tc -s filter show dev enp4s0 parent ffff:
>>> filter protocol ip pref 1 flower chain 0
>>> filter protocol ip pref 1 flower chain 0 handle 0x1
>>>  eth_type ipv4
>>>  ip_proto tcp
>>>  dst_port_min 20
>>>  dst_port_max 30
>>>  skip_hw
>>>  not_in_hw
>>>        action order 1: gact action drop
>>>         random type none pass val 0
>>>         index 1 ref 1 bind 1 installed 181 sec used 5 sec
>>>        Action statistics:
>>>        Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
>>>        backlog 0b 0p requeues 0
>>>
>>> 2. Match on IP address and port range:
>>> --------------------------------------
>>> $ tc filter add dev enp4s0 protocol ip parent ffff:\
>>>  prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port range 100-200\
>>>  skip_hw action drop
>>>
>>> $ tc -s filter show dev enp4s0 parent ffff:
>>> filter protocol ip pref 1 flower chain 0 handle 0x2
>>>  eth_type ipv4
>>>  ip_proto tcp
>>>  dst_ip 192.168.1.1
>>>  dst_port_min 100
>>>  dst_port_max 200
>>>  skip_hw
>>>  not_in_hw
>>>        action order 1: gact action drop
>>>         random type none pass val 0
>>>         index 2 ref 1 bind 1 installed 28 sec used 6 sec
>>>        Action statistics:
>>>        Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0)
>>>        backlog 0b 0p requeues 0
>>>
>>> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
>>> ---
>>> include/uapi/linux/pkt_cls.h |    5 ++
>>> net/sched/cls_flower.c       |  134 ++++++++++++++++++++++++++++++++++++++++--
>>> 2 files changed, 132 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
>>> index 401d0c1..b569308 100644
>>> --- a/include/uapi/linux/pkt_cls.h
>>> +++ b/include/uapi/linux/pkt_cls.h
>>> @@ -405,6 +405,11 @@ enum {
>>> 	TCA_FLOWER_KEY_UDP_SRC,		/* be16 */
>>> 	TCA_FLOWER_KEY_UDP_DST,		/* be16 */
>>>
>>> +	TCA_FLOWER_KEY_PORT_SRC_MIN,	/* be16 */
>>> +	TCA_FLOWER_KEY_PORT_SRC_MAX,	/* be16 */
>>> +	TCA_FLOWER_KEY_PORT_DST_MIN,	/* be16 */
>>> +	TCA_FLOWER_KEY_PORT_DST_MAX,	/* be16 */
>>> +
>>> 	TCA_FLOWER_FLAGS,
>>> 	TCA_FLOWER_KEY_VLAN_ID,		/* be16 */
>>> 	TCA_FLOWER_KEY_VLAN_PRIO,	/* u8   */
>>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>>> index 9aada2d..5f135f0 100644
>>> --- a/net/sched/cls_flower.c
>>> +++ b/net/sched/cls_flower.c
>>> @@ -55,6 +55,9 @@ struct fl_flow_key {
>>> 	struct flow_dissector_key_ip ip;
>>> 	struct flow_dissector_key_ip enc_ip;
>>> 	struct flow_dissector_key_enc_opts enc_opts;
>>> +
>>> +	struct flow_dissector_key_ports tp_min;
>>> +	struct flow_dissector_key_ports tp_max;
>>> } __aligned(BITS_PER_LONG / 8); /* Ensure that we can do comparisons as longs. */
>>>
>>> struct fl_flow_mask_range {
>>> @@ -103,6 +106,11 @@ struct cls_fl_filter {
>>> 	struct net_device *hw_dev;
>>> };
>>>
>>> +enum fl_endpoint {
>>> +	FLOWER_ENDPOINT_DST,
>>> +	FLOWER_ENDPOINT_SRC
>>> +};
>>> +
>>> static const struct rhashtable_params mask_ht_params = {
>>> 	.key_offset = offsetof(struct fl_flow_mask, key),
>>> 	.key_len = sizeof(struct fl_flow_key),
>>> @@ -179,11 +187,86 @@ static void fl_clear_masked_range(struct fl_flow_key *key,
>>> 	memset(fl_key_get_start(key, mask), 0, fl_mask_range(mask));
>>> }
>>>
>>> +static int fl_range_compare_params(struct cls_fl_filter *filter,
>>> +				   struct fl_flow_key *key,
>>> +				   struct fl_flow_key *mkey,
>>> +				   enum fl_endpoint endpoint)
>>> +{
>>> +	__be16 min_mask, max_mask, min_val, max_val;
>>> +
>>> +	if (endpoint == FLOWER_ENDPOINT_DST) {
>>> +		min_mask = htons(filter->mask->key.tp_min.dst);
>>> +		max_mask = htons(filter->mask->key.tp_max.dst);
>>> +		min_val = htons(filter->key.tp_min.dst);
>>> +		max_val = htons(filter->key.tp_max.dst);
>>> +
>>> +		if (min_mask && max_mask) {
>>> +			if (htons(key->tp.dst) < min_val ||
>>> +			    htons(key->tp.dst) > max_val)
>>> +				return -1;
>>> +
>>> +			/* skb does not have min and max values */
>>> +			mkey->tp_min.dst = filter->mkey.tp_min.dst;
>>> +			mkey->tp_max.dst = filter->mkey.tp_max.dst;
>>> +		}
>>> +	} else {
>>> +		min_mask = htons(filter->mask->key.tp_min.src);
>>> +		max_mask = htons(filter->mask->key.tp_max.src);
>>> +		min_val = htons(filter->key.tp_min.src);
>>> +		max_val = htons(filter->key.tp_max.src);
>>> +
>>> +		if (min_mask && max_mask) {
>>> +			if (htons(key->tp.src) < min_val ||
>>> +			    htons(key->tp.src) > max_val)
>>> +				return -1;
>>> +
>>> +			/* skb does not have min and max values */
>>> +			mkey->tp_min.src = filter->mkey.tp_min.src;
>>> +			mkey->tp_max.src = filter->mkey.tp_max.src;
>>> +		}
>> 
>> You basically have 2 functions in 1 here. Just have 2 functions:
>> fl_port_range_dst_cmp()
>> and
>> fl_port_range_src_cmp()
>> 
>> And avoid the "endpoint enum.
>> Also, as you return -1 or 0, just make it bool.
>> 
>
>Makes sense. Will do.
>
>> 
>>> +	}
>>> +	return 0;
>>> +}
>>> +
>>> +static struct cls_fl_filter *fl_lookup_range(struct fl_flow_mask *mask,
>>> +					     struct fl_flow_key *mkey,
>>> +					     struct fl_flow_key *key)
>>> +{
>>> +	struct cls_fl_filter *filter, *f;
>>> +	int ret;
>>> +
>>> +	list_for_each_entry_rcu(filter, &mask->filters, list) {
>>> +		ret = fl_range_compare_params(filter, key, mkey,
>>> +					      FLOWER_ENDPOINT_DST);
>>> +		if (ret < 0)
>>> +			continue;
>>> +
>>> +		ret = fl_range_compare_params(filter, key, mkey,
>>> +					      FLOWER_ENDPOINT_SRC);
>>> +		if (ret < 0)
>>> +			continue;
>>> +
>>> +		f = rhashtable_lookup_fast(&mask->ht,
>>> +					   fl_key_get_start(mkey, mask),
>>> +					   mask->filter_ht_params);
>>> +		if (f)
>>> +			return f;
>>> +	}
>>> +	return NULL;
>>> +}
>>> +
>>> static struct cls_fl_filter *fl_lookup(struct fl_flow_mask *mask,
>>> -				       struct fl_flow_key *mkey)
>>> +				       struct fl_flow_key *mkey,
>>> +				       struct fl_flow_key *key, bool is_skb)
>>> {
>>> -	return rhashtable_lookup_fast(&mask->ht, fl_key_get_start(mkey, mask),
>>> -				      mask->filter_ht_params);
>>> +	if ((!(mask->key.tp_min.dst && mask->key.tp_max.dst) &&
>>> +	     !(mask->key.tp_min.src && mask->key.tp_max.src)) || !is_skb) {
>> 
>> Would be probably good to have a dedicated bit to check for and decide
>> if you do normal/range lookup. This is fast path. 
>> 
>
>Will fix in v2.
>
>> 
>>> +		return  rhashtable_lookup_fast(&mask->ht,
>> 
>> Remove double space   ^^
>> 
>
>Will fix in v2.
>
>> 
>>> +					       fl_key_get_start(mkey, mask),
>>> +					       mask->filter_ht_params);
>>> +	}
>>> +	/* Classify based on range */
>>> +	return fl_lookup_range(mask, mkey, key);
>>> }
>>>
>>> static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>>> @@ -207,8 +290,8 @@ static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>>> 		skb_flow_dissect(skb, &mask->dissector, &skb_key, 0);
>>>
>>> 		fl_set_masked_key(&skb_mkey, &skb_key, mask);
>>> +		f = fl_lookup(mask, &skb_mkey, &skb_key, true);
>>>
>>> -		f = fl_lookup(mask, &skb_mkey);
>>> 		if (f && !tc_skip_sw(f->flags)) {
>>> 			*res = f->res;
>>> 			return tcf_exts_exec(skb, &f->exts, res);
>>> @@ -909,6 +992,23 @@ static int fl_set_key(struct net *net, struct nlattr **tb,
>>> 			       sizeof(key->arp.tha));
>>> 	}
>>>
>>> +	if (key->basic.ip_proto == IPPROTO_TCP ||
>>> +	    key->basic.ip_proto == IPPROTO_UDP ||
>>> +	    key->basic.ip_proto == IPPROTO_SCTP) {
>>> +		fl_set_key_val(tb, &key->tp_min.dst,
>>> +			       TCA_FLOWER_KEY_PORT_DST_MIN, &mask->tp_min.dst,
>>> +			       TCA_FLOWER_UNSPEC, sizeof(key->tp_min.dst));
>>> +		fl_set_key_val(tb, &key->tp_max.dst,
>>> +			       TCA_FLOWER_KEY_PORT_DST_MAX, &mask->tp_max.dst,
>>> +			       TCA_FLOWER_UNSPEC, sizeof(key->tp_max.dst));
>>> +		fl_set_key_val(tb, &key->tp_min.src,
>>> +			       TCA_FLOWER_KEY_PORT_SRC_MIN, &mask->tp_min.src,
>>> +			       TCA_FLOWER_UNSPEC, sizeof(key->tp_min.src));
>>> +		fl_set_key_val(tb, &key->tp_max.src,
>>> +			       TCA_FLOWER_KEY_PORT_SRC_MAX, &mask->tp_max.src,
>>> +			       TCA_FLOWER_UNSPEC, sizeof(key->tp_max.src));
>>> +	}
>>> +
>>> 	if (tb[TCA_FLOWER_KEY_ENC_IPV4_SRC] ||
>>> 	    tb[TCA_FLOWER_KEY_ENC_IPV4_DST]) {
>>> 		key->enc_control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
>>> @@ -1026,8 +1126,7 @@ static void fl_init_dissector(struct flow_dissector *dissector,
>>> 			     FLOW_DISSECTOR_KEY_IPV4_ADDRS, ipv4);
>>> 	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
>>> 			     FLOW_DISSECTOR_KEY_IPV6_ADDRS, ipv6);
>>> -	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
>>> -			     FLOW_DISSECTOR_KEY_PORTS, tp);
>>> +	FL_KEY_SET(keys, cnt, FLOW_DISSECTOR_KEY_PORTS, tp);
>>> 	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
>>> 			     FLOW_DISSECTOR_KEY_IP, ip);
>>> 	FL_KEY_SET_IF_MASKED(mask, keys, cnt,
>>> @@ -1227,7 +1326,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
>>> 		goto errout_idr;
>>>
>>> 	if (!tc_skip_sw(fnew->flags)) {
>>> -		if (!fold && fl_lookup(fnew->mask, &fnew->mkey)) {
>>> +		if (!fold && fl_lookup(fnew->mask, &fnew->mkey, NULL, false)) {
>> 
>> 
>> I don't undestand why do you need the "is_skb" arg here. Could you
>> please explain?
>> 
>> Thanks!
>> 
>
>The reason to keep the 'is_skb' arg is because, fl_lookup is called in
>two cases, one for skb classification and another for checking if a
>filter exists every-time a new filter is added. In case of skb
>classification, we need to go through the range-comparator to decide if
>the skb's port-value falls within the range-filter's min and max limits.
>In case of filter validation, the range-filter that we are trying to add
>will have min and max values, and we are validating it against other
>range-filters with min and max values. So, rhashtable lookup will
>suffice here and there is no need to go through the range-comparator in
>this case. In the above code, we are validating if a range-filter
>exists, so 'is_skb' is false.

Got it. In that case, please just have a helper:
static struct cls_fl_filter *__fl_lookup(struct fl_flow_mask *mask,
					 struct fl_flow_key *mkey)
{
	return rhashtable_lookup_fast(&mask->ht, fl_key_get_start(mkey, mask),
				      mask->filter_ht_params);
}

Call this helper from both fl_lookup() and fl_change()


>
>> 
>>> 			err = -EEXIST;
>>> 			goto errout_mask;
>>> 		}
>>> @@ -1800,6 +1899,27 @@ static int fl_dump_key(struct sk_buff *skb, struct net *net,
>>> 				  sizeof(key->arp.tha))))
>>> 		goto nla_put_failure;
>>>
>>> +	if ((key->basic.ip_proto == IPPROTO_TCP ||
>>> +	     key->basic.ip_proto == IPPROTO_UDP ||

>>> +	     key->basic.ip_proto == IPPROTO_SCTP) &&
>>> +	     (fl_dump_key_val(skb, &key->tp_min.dst,
>>> +			      TCA_FLOWER_KEY_PORT_DST_MIN,
>>> +			      &mask->tp_min.dst, TCA_FLOWER_UNSPEC,
>>> +			      sizeof(key->tp_min.dst)) ||
>>> +	      fl_dump_key_val(skb, &key->tp_max.dst,
>>> +			      TCA_FLOWER_KEY_PORT_DST_MAX,
>>> +			      &mask->tp_max.dst, TCA_FLOWER_UNSPEC,
>>> +			      sizeof(key->tp_max.dst)) ||
>>> +	      fl_dump_key_val(skb, &key->tp_min.src,
>>> +			      TCA_FLOWER_KEY_PORT_SRC_MIN,
>>> +			      &mask->tp_min.src, TCA_FLOWER_UNSPEC,
>>> +			      sizeof(key->tp_min.src)) ||
>>> +	      fl_dump_key_val(skb, &key->tp_max.src,
>>> +			      TCA_FLOWER_KEY_PORT_SRC_MAX,
>>> +			      &mask->tp_max.src, TCA_FLOWER_UNSPEC,
>>> +			      sizeof(key->tp_max.src))))
>>> +		goto nla_put_failure;
>>> +
>>> 	if (key->enc_control.addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS &&
>>> 	    (fl_dump_key_val(skb, &key->enc_ipv4.src,
>>> 			    TCA_FLOWER_KEY_ENC_IPV4_SRC, &mask->enc_ipv4.src,
>>>

^ permalink raw reply

* Re: [PATCH net-next] net: ethernet: ti: cpsw: don't flush mcast entries while switch promisc mode
From: Grygorii Strashko @ 2018-10-19 17:23 UTC (permalink / raw)
  To: davem, linux-omap, netdev, linux-kernel
In-Reply-To: <20181019120408.GA3909@khorivan>



On 10/19/18 7:04 AM, Ivan Khoronzhuk wrote:
> On Thu, Oct 18, 2018 at 07:03:06PM -0500, Grygorii Strashko wrote:
>>
>>
>> On 10/18/18 1:00 PM, Ivan Khoronzhuk wrote:
>>> No need now to flush mcast entries in switch mode while toggling to
>>> promiscuous mode. It's not needed as vlan reg_mcast = ALL_PORTS
>>> and mcast/vlan ports = ALL_PORTS, the same happening for vlan
>>> unreg_mcast, it's set to ALL_PORT_MASK just after calling promisc
>>> mode routine by calling set allmulti. I suppose main reason to flush
>>> them is to use unreg_mcast to receive all to host port. Thus, now, all
>>> mcast packets are received anyway and no reason to flush mcast entries
>>> unsafely, as they were synced with __dev_mc_sync() previously and are
>>> not restored. Another way is to _dev_mc_unsync() them, but no need.
>>
>> User have possibility to add additional mcast entries or edit existing 
>> in switch-mode, which is now done using custom tool. So, Host in promisc
>> mode will not receive packets for mcast address X if port mask for this
>> addr set to (ALL_PORTS - HOST_PORT). Am I missing smth?
> 
> I didn't take into account the custom tool changing entries directly,
> but even in this case there is at least a couple of interesting
> questions:
> 
> 1) Before the patch applied only several days ago -
>    5da1948969bc1991920987ce4361ea56046e5a98
>    "ti: cpsw: fix lost of mcast packets while rx_mode update"
>    It was impossible to do correctly anyway, as all mcast entries not
>    in the mc list were flushed (after rx_mode cb), by:
>    cpsw_ale_flush_multicast(cpsw->ale, ALE_ALL_PORTS, vid);
>    and those in mc, rewritten by adding them back in corrected form.
>    ... or this cb was not supposed to be called at all ...

It's not allowed to manipulate ALE table in dual_mac mode, so your
patches are safe in dual_mac mode. For switch-mode (unless we  move
forward with switch dev) standard linux interfaces allows create
default mcast entries which then (if required) corrected using custom
tool now.

> 
> 2) What is the reason to add mcast switch entires
>    (ALL_PORTS - HOST_PORT) if its function is added anyway by
>    unreq_mcast & (ALL_PORTS - HOST_PORT) ?
>    So, doesn't matter it's added or not - it will work :-|.

because in switch mode not all traffic directed to the Host port -
only in promisc mode. Reason safety and performance - Host should not
receive traffic which is not designated for it.

promiscuous in switch mode:
- disables learning
- enables unicast flooding to Host port
- enables unregistered multi-cast flooding to the Host port
In other words, CPSW will continue forwarding packets between P1&P2, but 
also will "duplicate" packets to Host port. This will work only for
vlans which have host port as member.

> 
> 3) Even so, toggling promisc mode will clear all these changes anyway,
>    even I will call _dev_mc_unsync() after flushing them.

there can be records which are not under control of Linux now.

> 
> 4) If user can tune ALE table by hand, what stops him do it after moving
>    to promisc mode, seems he knows what he's doing?
> 
> 5) It could be possible only for not default vlan entries, but mcast
>    vlan support is not supported yet. Who is gona restore those
>    entries after promisc off?
> 
> This behaviour is arguable, and flushing mcast entries can bring more
> issues then leaving. For me it doesn't matter, I can archive the same
> by adding after flush one line, it's even shorter:
> __dev_mc_unsync(priv->ndev, NULL);

Again, unless we move forward with switch dev you can't assume that
Linux stack has full control over ALE table. Sry, hence this patch is
not a fix and can introduce changes in current behavior and cause
regression reports - NACK.

-- 
regards,
-grygorii

^ permalink raw reply

* [PATCH v3 0/2] phy: ocelot-serdes: fix out-of-bounds read
From: Gustavo A. R. Silva @ 2018-10-19  9:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rob Herring, Mark Rutland, devicetree, Kishon Vijay Abraham I,
	David S. Miller, Quentin Schulz, netdev, Gustavo A. R. Silva

This patchset aims to fix an out-of-bounds bug in
the phy-ocelot-serdes driver.

Currently, there is an out-of-bounds read on array ctrl->phys,
once variable i reaches the maximum array size of SERDES_MAX
in the for loop.

Quentin Schulz pointed out that SERDES_MAX is a valid value to
index ctrl->phys. So, I updated SERDES_MAX to be SERDES6G_MAX + 1
in include/dt-bindings/phy/phy-ocelot-serdes.h.

Then I changed the condition in the for loop from
i <= SERDES_MAX to i < SERDES_MAX in order to
complete the fix.

The reason I'm sending this fix as series is because
checkpatch reported an error when I first tried to
integrate the whole solution into a singe patch. So,
changes to dt-bindings should be sent as a separate
patch.

Thanks!

Changes in v3:
 - Post the series to netdev, so Dave can take it.

Changes in v2:
 - Send the whole series to Kishon Vijay Abraham I, so it
   can be taken into the PHY tree.
 - Add Quentin's Reviewed-by to commit log in both patches.

Gustavo A. R. Silva (2):
  dt-bindings: phy: Update SERDES_MAX to be SERDES_MAX + 1
  phy: ocelot-serdes: fix out-of-bounds read

 drivers/phy/mscc/phy-ocelot-serdes.c        | 4 ++--
 include/dt-bindings/phy/phy-ocelot-serdes.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

-- 
2.7.4

^ permalink raw reply

* Re: netconsole warning in 4.19.0-rc7
From: David Miller @ 2018-10-19 17:45 UTC (permalink / raw)
  To: davej; +Cc: mroos, xiyou.wangcong, linux-kernel, netdev
In-Reply-To: <20181019134704.ky2vqeij6m77qrpg@codemonkey.org.uk>

From: Dave Jones <davej@codemonkey.org.uk>
Date: Fri, 19 Oct 2018 09:47:04 -0400

> On Fri, Oct 19, 2018 at 12:51:38PM +0300, Meelis Roos wrote:
>  > >   > > I took another look at that error path. Turns out this is all we need I
>  > >   > > think..
>  > >   >
>  > >   > With this patch applied on top of 4.19-rc8, I stll get the warning:
>  > >   >
>  > >   > [   13.722919] WARNING: CPU: 0 PID: 0 at kernel/softirq.c:168 __local_bh_enable_ip+0x2e/0x44
>  > > 
>  > > It's going to be a couple days before I get chance to get back to this.
>  > > Did the previous patch in this thread (without the _bh) fare any better?
>  > 
>  > Tried it with rcu_read_unlock() instead, still the same.
> 
> Ok, this is going to take more time than I have.  DaveM, do you want
> to revert 6fe9487892b32cb1c8b8b0d552ed7222a527fe30, or do you want a
> patch doing the same ?
> 
> That'll bring back the rcu warning, but fewer people were complaining
> about that than this new issue..

I'll revert.

Thanks Dave.

^ permalink raw reply

* Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}
From: Peter Zijlstra @ 2018-10-19  9:44 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: alexei.starovoitov, paulmck, will.deacon, acme, yhs,
	john.fastabend, netdev
In-Reply-To: <df1da9c7-f9b4-a482-0bbc-a6455b57a476@iogearbox.net>

On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote:
> diff --git a/tools/include/linux/ring_buffer.h b/tools/include/linux/ring_buffer.h
> new file mode 100644
> index 0000000..48200e0
> --- /dev/null
> +++ b/tools/include/linux/ring_buffer.h
> @@ -0,0 +1,69 @@
> +#ifndef _TOOLS_LINUX_RING_BUFFER_H_
> +#define _TOOLS_LINUX_RING_BUFFER_H_
> +
> +#include <linux/compiler.h>
> +#include <asm/barrier.h>
> +
> +/*
> + * Below barriers pair as follows (kernel/events/ring_buffer.c):
> + *
> + * Since the mmap() consumer (userspace) can run on a different CPU:
> + *
> + *   kernel                             user
> + *
> + *   if (LOAD ->data_tail) {            LOAD ->data_head
> + *                      (A)             smp_rmb()       (C)
> + *      STORE $data                     LOAD $data
> + *      smp_wmb()       (B)             smp_mb()        (D)
> + *      STORE ->data_head               STORE ->data_tail
> + *   }
> + *
> + * Where A pairs with D, and B pairs with C.
> + *
> + * In our case A is a control dependency that separates the load
> + * of the ->data_tail and the stores of $data. In case ->data_tail
> + * indicates there is no room in the buffer to store $data we do not.
> + *
> + * D needs to be a full barrier since it separates the data READ
> + * from the tail WRITE.
> + *
> + * For B a WMB is sufficient since it separates two WRITEs, and for
> + * C an RMB is sufficient since it separates two READs.
> + */
> +
> +/*
> + * Note, instead of B, C, D we could also use smp_store_release()
> + * in B and D as well as smp_load_acquire() in C. However, this
> + * optimization makes sense not for all architectures since it
> + * would resolve into READ_ONCE() + smp_mb() pair for smp_load_acquire()
> + * and smp_mb() + WRITE_ONCE() pair for smp_store_release(), thus
> + * for those smp_wmb() in B and smp_rmb() in C would still be less
> + * expensive. For the case of D this has either the same cost or
> + * is less expensive. For example, due to TSO (total store order),
> + * x86 can avoid the CPU barrier entirely.
> + */
> +
> +static inline u64 ring_buffer_read_head(struct perf_event_mmap_page *base)
> +{
> +/*
> + * Architectures where smp_load_acquire() does not fallback to
> + * READ_ONCE() + smp_mb() pair.
> + */
> +#if defined(__x86_64__) || defined(__aarch64__) || defined(__powerpc64__) || \
> +    defined(__ia64__) || defined(__sparc__) && defined(__arch64__)
> +	return smp_load_acquire(&base->data_head);
> +#else
> +	u64 head = READ_ONCE(base->data_head);
> +
> +	smp_rmb();
> +	return head;
> +#endif
> +}
> +
> +static inline void ring_buffer_write_tail(struct perf_event_mmap_page *base,
> +					  u64 tail)
> +{
> +	smp_store_release(&base->data_tail, tail);
> +}
> +
> +#endif /* _TOOLS_LINUX_RING_BUFFER_H_ */

(for the whole patch, but in particular the above)

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply

* Improving accuracy of PHC readings
From: Miroslav Lichvar @ 2018-10-19  9:51 UTC (permalink / raw)
  To: netdev; +Cc: Richard Cochran, Keller, Jacob E

I think there might be a way how we could significantly improve
accuracy of synchronization between the system clock and a PTP
hardware clock, at least with some network drivers.

Currently, the PTP_SYS_OFFSET ioctl reads the system clock, reads the
PHC using the gettime64 function of the driver, and reads the system
clock again. The ioctl can repeat this to provide multiple readings to
the user space.

phc2sys (or another program synchronizing the system clock to the PHC)
assumes the PHC timestamps were captured in the middle between the two
closest system clock timestamps.

The trouble is that gettime64 typically reads multiple (2-3) registers
and the timestamp is latched on the first one, so the assumption about
middle point is wrong. There is an asymmetry, even if the delays on
the PCIe bus are perfectly symmetric.

A solution to this would be a new driver function that wraps the
latching register read with readings of the system clock and return
three timestamps instead of one. For example:

        ktime_get_real_ts64(&sys_ts1);
	IXGBE_READ_REG(hw, IXGBE_SYSTIMR);
	ktime_get_real_ts64(&sys_ts2);
	phc_ts.tv_nsec = IXGBE_READ_REG(hw, IXGBE_SYSTIML);
	phc_ts.tv_sec = IXGBE_READ_REG(hw, IXGBE_SYSTIMH);
 
The extra timestamp doesn't fit the API of the PTP_SYS_OFFSET ioctl,
so it would need to shift the timestamp it returns by the missing
intervals (assuming the frequency offset between the PHC and system
clock is small), or a new ioctl could be introduced that would return
all timestamps in an array looking like this:

	[sys, phc, sys, sys, phc, sys, ...]

This should significantly improve the accuracy of the synchronization,
reduce the uncertainty in the readings to less than a half or third,
and also reduce the jitter as there are fewer register reads sensitive
to the PCIe delay.

What do you think?

-- 
Miroslav Lichvar

^ permalink raw reply

* [PATCH] mISDN: Fix type of switch control variable in ctrl_teimanager
From: Nathan Chancellor @ 2018-10-19 18:00 UTC (permalink / raw)
  To: Karsten Keil, David S. Miller; +Cc: netdev, linux-kernel, Nathan Chancellor

Clang warns (trimmed for brevity):

drivers/isdn/mISDN/tei.c:1193:7: warning: overflow converting case value
to switch condition type (2147764552 to 18446744071562348872) [-Wswitch]
        case IMHOLD_L1:
             ^
drivers/isdn/mISDN/tei.c:1187:7: warning: overflow converting case value
to switch condition type (2147764550 to 18446744071562348870) [-Wswitch]
        case IMCLEAR_L2:
             ^
2 warnings generated.

The root cause is that the _IOC macro can generate really large numbers,
which don't find into type int. My research into how GCC and Clang are
handling this at a low level didn't prove fruitful and surveying the
kernel tree shows that aside from here and a few places in the scsi
subsystem, everything that uses _IOC is at least of type 'unsigned int'.
Make that change here because as nothing in this function cares about
the signedness of the variable and it removes ambiguity, which is never
good when dealing with compilers.

While we're here, remove the unnecessary local variable ret (just return
-EINVAL and 0 directly).

Link: https://github.com/ClangBuiltLinux/linux/issues/67
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
---
 drivers/isdn/mISDN/tei.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/isdn/mISDN/tei.c b/drivers/isdn/mISDN/tei.c
index 12d9e5f4beb1..58635b5f296f 100644
--- a/drivers/isdn/mISDN/tei.c
+++ b/drivers/isdn/mISDN/tei.c
@@ -1180,8 +1180,7 @@ static int
 ctrl_teimanager(struct manager *mgr, void *arg)
 {
 	/* currently we only have one option */
-	int	*val = (int *)arg;
-	int	ret = 0;
+	unsigned int *val = (unsigned int *)arg;
 
 	switch (val[0]) {
 	case IMCLEAR_L2:
@@ -1197,9 +1196,9 @@ ctrl_teimanager(struct manager *mgr, void *arg)
 			test_and_clear_bit(OPTION_L1_HOLD, &mgr->options);
 		break;
 	default:
-		ret = -EINVAL;
+		return -EINVAL;
 	}
-	return ret;
+	return 0;
 }
 
 /* This function does create a L2 for fixed TEI in NT Mode */
-- 
2.19.1

^ permalink raw reply related

* Re: [PATCH] selftests/bpf: add missing executables to .gitignore
From: Y Song @ 2018-10-19 18:00 UTC (permalink / raw)
  To: Anders Roxell
  Cc: Alexei Starovoitov, Daniel Borkmann, Shuah Khan, netdev, LKML,
	linux-kselftest
In-Reply-To: <20181019142436.2955-1-anders.roxell@linaro.org>

On Fri, Oct 19, 2018 at 7:25 AM Anders Roxell <anders.roxell@linaro.org> wrote:
>
> Fixes: 371e4fcc9d96 ("selftests/bpf: cgroup local storage-based network counters")
> Fixes: 370920c47b26 ("selftests/bpf: Test libbpf_{prog,attach}_type_by_name")
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>

Acked-by: Yonghong Song <yhs@fb.com>

^ permalink raw reply

* [PATCH] bpf/test_run: Add braces to initialization in bpf_prog_test_run_skb
From: Nathan Chancellor @ 2018-10-19 18:26 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, linux-kernel, Nathan Chancellor

Clang warns:

net/bpf/test_run.c:120:20: error: suggest braces around initialization
of subobject [-Werror,-Wmissing-braces]
        struct sock sk = {0};
                          ^
                          {}

Add the braces to properly initialize all subobjects.

Fixes: 75079847e9d0 ("bpf: add tests for direct packet access from CGROUP_SKB")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
---
 net/bpf/test_run.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 8dccac305268..65e049c61a7a 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -117,7 +117,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
 	u32 retval, duration;
 	int hh_len = ETH_HLEN;
 	struct sk_buff *skb;
-	struct sock sk = {0};
+	struct sock sk = { { {0} } };
 	void *data;
 	int ret;
 
-- 
2.19.1

^ permalink raw reply related

* Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}
From: Daniel Borkmann @ 2018-10-19 10:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: alexei.starovoitov, paulmck, will.deacon, acme, yhs,
	john.fastabend, netdev
In-Reply-To: <20181019094417.GE3121@hirez.programming.kicks-ass.net>

On 10/19/2018 11:44 AM, Peter Zijlstra wrote:
> On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote:
>> diff --git a/tools/include/linux/ring_buffer.h b/tools/include/linux/ring_buffer.h
>> new file mode 100644
>> index 0000000..48200e0
>> --- /dev/null
>> +++ b/tools/include/linux/ring_buffer.h
>> @@ -0,0 +1,69 @@
>> +#ifndef _TOOLS_LINUX_RING_BUFFER_H_
>> +#define _TOOLS_LINUX_RING_BUFFER_H_
>> +
>> +#include <linux/compiler.h>
>> +#include <asm/barrier.h>
>> +
>> +/*
>> + * Below barriers pair as follows (kernel/events/ring_buffer.c):
>> + *
>> + * Since the mmap() consumer (userspace) can run on a different CPU:
>> + *
>> + *   kernel                             user
>> + *
>> + *   if (LOAD ->data_tail) {            LOAD ->data_head
>> + *                      (A)             smp_rmb()       (C)
>> + *      STORE $data                     LOAD $data
>> + *      smp_wmb()       (B)             smp_mb()        (D)
>> + *      STORE ->data_head               STORE ->data_tail
>> + *   }
>> + *
>> + * Where A pairs with D, and B pairs with C.
>> + *
>> + * In our case A is a control dependency that separates the load
>> + * of the ->data_tail and the stores of $data. In case ->data_tail
>> + * indicates there is no room in the buffer to store $data we do not.
>> + *
>> + * D needs to be a full barrier since it separates the data READ
>> + * from the tail WRITE.
>> + *
>> + * For B a WMB is sufficient since it separates two WRITEs, and for
>> + * C an RMB is sufficient since it separates two READs.
>> + */
>> +
>> +/*
>> + * Note, instead of B, C, D we could also use smp_store_release()
>> + * in B and D as well as smp_load_acquire() in C. However, this
>> + * optimization makes sense not for all architectures since it
>> + * would resolve into READ_ONCE() + smp_mb() pair for smp_load_acquire()
>> + * and smp_mb() + WRITE_ONCE() pair for smp_store_release(), thus
>> + * for those smp_wmb() in B and smp_rmb() in C would still be less
>> + * expensive. For the case of D this has either the same cost or
>> + * is less expensive. For example, due to TSO (total store order),
>> + * x86 can avoid the CPU barrier entirely.
>> + */
>> +
>> +static inline u64 ring_buffer_read_head(struct perf_event_mmap_page *base)
>> +{
>> +/*
>> + * Architectures where smp_load_acquire() does not fallback to
>> + * READ_ONCE() + smp_mb() pair.
>> + */
>> +#if defined(__x86_64__) || defined(__aarch64__) || defined(__powerpc64__) || \
>> +    defined(__ia64__) || defined(__sparc__) && defined(__arch64__)
>> +	return smp_load_acquire(&base->data_head);
>> +#else
>> +	u64 head = READ_ONCE(base->data_head);
>> +
>> +	smp_rmb();
>> +	return head;
>> +#endif
>> +}
>> +
>> +static inline void ring_buffer_write_tail(struct perf_event_mmap_page *base,
>> +					  u64 tail)
>> +{
>> +	smp_store_release(&base->data_tail, tail);
>> +}
>> +
>> +#endif /* _TOOLS_LINUX_RING_BUFFER_H_ */
> 
> (for the whole patch, but in particular the above)
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Great, thanks a lot, Peter! Will flush out v2 in a bit.

^ permalink raw reply

* Re: [PATCH] bpf/test_run: Add braces to initialization in bpf_prog_test_run_skb
From: Eric Dumazet @ 2018-10-19 18:46 UTC (permalink / raw)
  To: Nathan Chancellor, Alexei Starovoitov, Daniel Borkmann
  Cc: netdev, linux-kernel
In-Reply-To: <20181019182649.24301-1-natechancellor@gmail.com>



On 10/19/2018 11:26 AM, Nathan Chancellor wrote:
> Clang warns:
> 
> net/bpf/test_run.c:120:20: error: suggest braces around initialization
> of subobject [-Werror,-Wmissing-braces]
>         struct sock sk = {0};
>                           ^
>                           {}
> 
> Add the braces to properly initialize all subobjects.
> 
> Fixes: 75079847e9d0 ("bpf: add tests for direct packet access from CGROUP_SKB")
> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
> ---
>  net/bpf/test_run.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 8dccac305268..65e049c61a7a 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -117,7 +117,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>  	u32 retval, duration;
>  	int hh_len = ETH_HLEN;
>  	struct sk_buff *skb;
> -	struct sock sk = {0};
> +	struct sock sk = { { {0} } };
>  	void *data;
>  	int ret;
>  
> 

Strange, I thought this patch was still under discussion.
Has an old version of it being merged somewhere ?

^ permalink raw reply

* Re: [PATCH] bpf/test_run: Add braces to initialization in bpf_prog_test_run_skb
From: Alexei Starovoitov @ 2018-10-19 18:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: natechancellor, Alexei Starovoitov, Daniel Borkmann,
	Network Development, LKML
In-Reply-To: <3e2e5343-3ead-5b4d-758d-14f04183b39e@gmail.com>

On Fri, Oct 19, 2018 at 11:46 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 10/19/2018 11:26 AM, Nathan Chancellor wrote:
> > Clang warns:
> >
> > net/bpf/test_run.c:120:20: error: suggest braces around initialization
> > of subobject [-Werror,-Wmissing-braces]
> >         struct sock sk = {0};
> >                           ^
> >                           {}
> >
> > Add the braces to properly initialize all subobjects.
> >
> > Fixes: 75079847e9d0 ("bpf: add tests for direct packet access from CGROUP_SKB")
> > Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
> > ---
> >  net/bpf/test_run.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > index 8dccac305268..65e049c61a7a 100644
> > --- a/net/bpf/test_run.c
> > +++ b/net/bpf/test_run.c
> > @@ -117,7 +117,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> >       u32 retval, duration;
> >       int hh_len = ETH_HLEN;
> >       struct sk_buff *skb;
> > -     struct sock sk = {0};
> > +     struct sock sk = { { {0} } };
> >       void *data;
> >       int ret;
> >
> >
>
> Strange, I thought this patch was still under discussion.
> Has an old version of it being merged somewhere ?

merged and reverted. This patch is not necessary.

^ permalink raw reply

* Re: [PATCH] bpf/test_run: Add braces to initialization in bpf_prog_test_run_skb
From: Nathan Chancellor @ 2018-10-19 19:03 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Eric Dumazet, Alexei Starovoitov, Daniel Borkmann,
	Network Development, LKML
In-Reply-To: <CAADnVQKyaU=kPZN5+Bi=NJDGh6AMV0HKZoDg+tzLGQdq+L8xaQ@mail.gmail.com>

On Fri, Oct 19, 2018 at 11:47:34AM -0700, Alexei Starovoitov wrote:
> On Fri, Oct 19, 2018 at 11:46 AM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> >
> >
> > On 10/19/2018 11:26 AM, Nathan Chancellor wrote:
> > > Clang warns:
> > >
> > > net/bpf/test_run.c:120:20: error: suggest braces around initialization
> > > of subobject [-Werror,-Wmissing-braces]
> > >         struct sock sk = {0};
> > >                           ^
> > >                           {}
> > >
> > > Add the braces to properly initialize all subobjects.
> > >
> > > Fixes: 75079847e9d0 ("bpf: add tests for direct packet access from CGROUP_SKB")
> > > Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
> > > ---
> > >  net/bpf/test_run.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > > index 8dccac305268..65e049c61a7a 100644
> > > --- a/net/bpf/test_run.c
> > > +++ b/net/bpf/test_run.c
> > > @@ -117,7 +117,7 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> > >       u32 retval, duration;
> > >       int hh_len = ETH_HLEN;
> > >       struct sk_buff *skb;
> > > -     struct sock sk = {0};
> > > +     struct sock sk = { { {0} } };
> > >       void *data;
> > >       int ret;
> > >
> > >
> >
> > Strange, I thought this patch was still under discussion.
> > Has an old version of it being merged somewhere ?

Looks like it made its way into -next in the 20181019 version, which
is what I am working off of.

> 
> merged and reverted. This patch is not necessary.

Thank you for the heads up and sorry for the noise!
Nathan

^ permalink raw reply

* Re: [PATCH bpf-next 2/3] tools, perf: use smp_{rmb,mb} barriers instead of {rmb,mb}
From: Will Deacon @ 2018-10-19 11:02 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, Peter Zijlstra, paulmck, acme, yhs,
	john.fastabend, netdev
In-Reply-To: <20181019035340.ahjocmdj2o2zam4m@ast-mbp.dhcp.thefacebook.com>

On Thu, Oct 18, 2018 at 08:53:42PM -0700, Alexei Starovoitov wrote:
> On Thu, Oct 18, 2018 at 09:00:46PM +0200, Daniel Borkmann wrote:
> > On 10/18/2018 05:33 PM, Alexei Starovoitov wrote:
> > > On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote:
> > >>  #endif /* _TOOLS_LINUX_ASM_IA64_BARRIER_H */
> > >> diff --git a/tools/arch/powerpc/include/asm/barrier.h b/tools/arch/powerpc/include/asm/barrier.h
> > >> index a634da0..905a2c6 100644
> > >> --- a/tools/arch/powerpc/include/asm/barrier.h
> > >> +++ b/tools/arch/powerpc/include/asm/barrier.h
> > >> @@ -27,4 +27,20 @@
> > >>  #define rmb()  __asm__ __volatile__ ("sync" : : : "memory")
> > >>  #define wmb()  __asm__ __volatile__ ("sync" : : : "memory")
> > >>
> > >> +#if defined(__powerpc64__)
> > >> +#define smp_lwsync()	__asm__ __volatile__ ("lwsync" : : : "memory")
> > >> +
> > >> +#define smp_store_release(p, v)			\
> > >> +do {						\
> > >> +	smp_lwsync();				\
> > >> +	WRITE_ONCE(*p, v);			\
> > >> +} while (0)
> > >> +
> > >> +#define smp_load_acquire(p)			\
> > >> +({						\
> > >> +	typeof(*p) ___p1 = READ_ONCE(*p);	\
> > >> +	smp_lwsync();				\
> > >> +	___p1;					\
> > > 
> > > I don't like this proliferation of asm.
> > > Why do we think that we can do better job than compiler?
> > > can we please use gcc builtins instead?
> > > https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> > > __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
> > > __atomic_store_n(ptr, val, __ATOMIC_RELEASE);
> > > are done specifically for this use case if I'm not mistaken.
> > > I think it pays to learn what compiler provides.
> > 
> > But are you sure the C11 memory model matches exact same model as kernel?
> > Seems like last time Will looked into it [0] it wasn't the case ...
> 
> I'm only suggesting equivalence of __atomic_load_n(ptr, __ATOMIC_ACQUIRE)
> with kernel's smp_load_acquire().
> I've seen a bunch of user space ring buffer implementations implemented
> with __atomic_load_n() primitives.
> But let's ask experts who live in both worlds.

One thing to be wary of is if there is an implementation choice between
how to implement load-acquire and store-release for a given architecture.
In these situations, it's often important that concurrent software agrees
on the "mapping", so we'd need to be sure that (a) All userspace compilers
that we care about have compatible mappings and (b) These mappings are
compatible with the kernel code.

Will

^ permalink raw reply

* [PATCH net-next 0/7] Adds support of RAS Error Handling in HNS3 Driver
From: Salil Mehta @ 2018-10-19 19:15 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm

This patch-set adds support related to RAS Error handling to the HNS3
Ethernet PF Driver. Set of errors occurred in the HNS3 hardware are
reported to the driver through the PCIe AER interface. The received
error information is then used to classify the received errors and
then decide the appropriate receovery action depending on the type
of error.


Shiju Jose (7):
  net: hns3: Add PCIe AER callback error_detected
  net: hns3: Add PCIe AER error recovery
  net: hns3: Add support to enable and disable hw errors
  net: hns3: Add enable and process common ecc errors
  net: hns3: Add enable and process hw errors from IGU, EGU and NCSI
  net: hns3: Add enable and process hw errors from PPP
  net: hns3: Add enable and process hw errors of TM scheduler

 drivers/net/ethernet/hisilicon/hns3/hnae3.h        |    3 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    |   50 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/Makefile    |    2 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |   22 +
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c | 1088 ++++++++++++++++++++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h |   83 ++
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    |   33 +-
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c  |    3 +-
 8 files changed, 1276 insertions(+), 8 deletions(-)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h

-- 
2.7.4

^ permalink raw reply

* [PATCH net-next 1/7] net: hns3: Add PCIe AER callback error_detected
From: Salil Mehta @ 2018-10-19 19:15 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm, Shiju Jose
In-Reply-To: <20181019191532.10088-1-salil.mehta@huawei.com>

From: Shiju Jose <shiju.jose@huawei.com>

Set of hw errors occurred in the HNS3 are reported to the
hns3 driver through PCIe AER and RAS.The error info will be
processed and appropriately recovered.
This patch adds error_detected callback and error processing.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h        |  1 +
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    | 30 +++++++++++++++++
 .../net/ethernet/hisilicon/hns3/hns3pf/Makefile    |  2 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c | 38 ++++++++++++++++++++++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h | 29 +++++++++++++++++
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    |  2 ++
 6 files changed, 101 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
 create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index c3bd2a1..2af3a2d 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -429,6 +429,7 @@ struct hnae3_ae_ops {
 				struct ethtool_rxnfc *cmd, u32 *rule_locs);
 	int (*restore_fd_rules)(struct hnae3_handle *handle);
 	void (*enable_fd)(struct hnae3_handle *handle, bool enable);
+	pci_ers_result_t (*process_hw_error)(struct hnae3_ae_dev *ae_dev);
 };
 
 struct hnae3_dcb_ops {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 76ce2f2..3c6fa39 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -1771,6 +1771,35 @@ static void hns3_shutdown(struct pci_dev *pdev)
 		pci_set_power_state(pdev, PCI_D3hot);
 }
 
+static pci_ers_result_t hns3_error_detected(struct pci_dev *pdev,
+					    pci_channel_state_t state)
+{
+	struct hnae3_ae_dev *ae_dev = pci_get_drvdata(pdev);
+	pci_ers_result_t ret;
+
+	dev_info(&pdev->dev, "PCI error detected, state(=%d)!!\n", state);
+
+	if (state == pci_channel_io_perm_failure)
+		return PCI_ERS_RESULT_DISCONNECT;
+
+	if (!ae_dev) {
+		dev_err(&pdev->dev,
+			"Can't recover - error happened during device init\n");
+		return PCI_ERS_RESULT_NONE;
+	}
+
+	if (ae_dev->ops->process_hw_error)
+		ret = ae_dev->ops->process_hw_error(ae_dev);
+	else
+		return PCI_ERS_RESULT_NONE;
+
+	return ret;
+}
+
+static const struct pci_error_handlers hns3_err_handler = {
+	.error_detected = hns3_error_detected,
+};
+
 static struct pci_driver hns3_driver = {
 	.name     = hns3_driver_name,
 	.id_table = hns3_pci_tbl,
@@ -1778,6 +1807,7 @@ static struct pci_driver hns3_driver = {
 	.remove   = hns3_remove,
 	.shutdown = hns3_shutdown,
 	.sriov_configure = hns3_pci_sriov_configure,
+	.err_handler    = &hns3_err_handler,
 };
 
 /* set default feature to hns3 */
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile b/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
index cb8ddd0..580e817 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
@@ -6,6 +6,6 @@
 ccflags-y := -Idrivers/net/ethernet/hisilicon/hns3
 
 obj-$(CONFIG_HNS3_HCLGE) += hclge.o
-hclge-objs = hclge_main.o hclge_cmd.o hclge_mdio.o hclge_tm.o hclge_mbx.o
+hclge-objs = hclge_main.o hclge_cmd.o hclge_mdio.o hclge_tm.o hclge_mbx.o hclge_err.o
 
 hclge-$(CONFIG_HNS3_DCB) += hclge_dcb.o
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
new file mode 100644
index 0000000..83aca6f
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0+
+/* Copyright (c) 2016-2017 Hisilicon Limited. */
+
+#include "hclge_err.h"
+
+static const struct hclge_hw_blk hw_blk[] = {
+	{ /* sentinel */ }
+};
+
+pci_ers_result_t hclge_process_ras_hw_error(struct hnae3_ae_dev *ae_dev)
+{
+	struct hclge_dev *hdev = ae_dev->priv;
+	struct device *dev = &hdev->pdev->dev;
+	u32 sts, val;
+	int i = 0;
+
+	sts = hclge_read_dev(&hdev->hw, HCLGE_RAS_PF_OTHER_INT_STS_REG);
+
+	/* Processing Non-fatal errors */
+	if (sts & HCLGE_RAS_REG_NFE_MASK) {
+		val = (sts >> HCLGE_RAS_REG_NFE_SHIFT) & 0xFF;
+		i = 0;
+		while (hw_blk[i].name) {
+			if (!(hw_blk[i].msk & val)) {
+				i++;
+				continue;
+			}
+			dev_warn(dev, "%s ras non-fatal error identified\n",
+				 hw_blk[i].name);
+			if (hw_blk[i].process_error)
+				hw_blk[i].process_error(hdev,
+							 HCLGE_ERR_INT_RAS_NFE);
+			i++;
+		}
+	}
+
+	return PCI_ERS_RESULT_NEED_RESET;
+}
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
new file mode 100644
index 0000000..ea1637c
--- /dev/null
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*  Copyright (c) 2016-2017 Hisilicon Limited. */
+
+#ifndef __HCLGE_ERR_H
+#define __HCLGE_ERR_H
+
+#include "hclge_main.h"
+
+#define HCLGE_RAS_PF_OTHER_INT_STS_REG   0x20B00
+#define HCLGE_RAS_REG_FE_MASK    0xFF
+#define HCLGE_RAS_REG_NFE_MASK   0xFF00
+#define HCLGE_RAS_REG_NFE_SHIFT	8
+
+enum hclge_err_int_type {
+	HCLGE_ERR_INT_MSIX = 0,
+	HCLGE_ERR_INT_RAS_CE = 1,
+	HCLGE_ERR_INT_RAS_NFE = 2,
+	HCLGE_ERR_INT_RAS_FE = 3,
+};
+
+struct hclge_hw_blk {
+	u32 msk;
+	const char *name;
+	void (*process_error)(struct hclge_dev *hdev,
+			      enum hclge_err_int_type type);
+};
+
+pci_ers_result_t hclge_process_ras_hw_error(struct hnae3_ae_dev *ae_dev);
+#endif
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 1bd83e8..94d3678 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -19,6 +19,7 @@
 #include "hclge_mbx.h"
 #include "hclge_mdio.h"
 #include "hclge_tm.h"
+#include "hclge_err.h"
 #include "hnae3.h"
 
 #define HCLGE_NAME			"hclge"
@@ -7312,6 +7313,7 @@ static const struct hnae3_ae_ops hclge_ops = {
 	.get_fd_all_rules = hclge_get_all_rules,
 	.restore_fd_rules = hclge_restore_fd_entries,
 	.enable_fd = hclge_enable_fd,
+	.process_hw_error = hclge_process_ras_hw_error,
 };
 
 static struct hnae3_ae_algo ae_algo = {
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next 2/7] net: hns3: Add PCIe AER error recovery
From: Salil Mehta @ 2018-10-19 19:15 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm, Shiju Jose
In-Reply-To: <20181019191532.10088-1-salil.mehta@huawei.com>

From: Shiju Jose <shiju.jose@huawei.com>

This patch adds the error recovery for the HNS hw errors.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hnae3.h          |  2 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c      | 20 +++++++++++++++++++-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c  | 17 +++++++++++++----
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c    |  3 ++-
 4 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index 2af3a2d..e82e4ca 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -402,7 +402,7 @@ struct hnae3_ae_ops {
 	int (*set_vf_vlan_filter)(struct hnae3_handle *handle, int vfid,
 				  u16 vlan, u8 qos, __be16 proto);
 	int (*enable_hw_strip_rxvtag)(struct hnae3_handle *handle, bool enable);
-	void (*reset_event)(struct hnae3_handle *handle);
+	void (*reset_event)(struct pci_dev *pdev, struct hnae3_handle *handle);
 	void (*get_channels)(struct hnae3_handle *handle,
 			     struct ethtool_channels *ch);
 	void (*get_tqps_and_rss_info)(struct hnae3_handle *h,
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 3c6fa39..32f3aca8 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -9,6 +9,7 @@
 #include <linux/ipv6.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <linux/aer.h>
 #include <linux/skbuff.h>
 #include <linux/sctp.h>
 #include <linux/vermagic.h>
@@ -1613,7 +1614,7 @@ static void hns3_nic_net_timeout(struct net_device *ndev)
 
 	/* request the reset */
 	if (h->ae_algo->ops->reset_event)
-		h->ae_algo->ops->reset_event(h);
+		h->ae_algo->ops->reset_event(h->pdev, h);
 }
 
 static const struct net_device_ops hns3_nic_netdev_ops = {
@@ -1796,8 +1797,25 @@ static pci_ers_result_t hns3_error_detected(struct pci_dev *pdev,
 	return ret;
 }
 
+static pci_ers_result_t hns3_slot_reset(struct pci_dev *pdev)
+{
+	struct hnae3_ae_dev *ae_dev = pci_get_drvdata(pdev);
+	struct device *dev = &pdev->dev;
+
+	dev_info(dev, "requesting reset due to PCI error\n");
+
+	/* request the reset */
+	if (ae_dev->ops->reset_event) {
+		ae_dev->ops->reset_event(pdev, NULL);
+		return PCI_ERS_RESULT_RECOVERED;
+	}
+
+	return PCI_ERS_RESULT_DISCONNECT;
+}
+
 static const struct pci_error_handlers hns3_err_handler = {
 	.error_detected = hns3_error_detected,
+	.slot_reset     = hns3_slot_reset,
 };
 
 static struct pci_driver hns3_driver = {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 94d3678..5075365 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2489,12 +2489,18 @@ static void hclge_reset(struct hclge_dev *hdev)
 	ae_dev->reset_type = HNAE3_NONE_RESET;
 }
 
-static void hclge_reset_event(struct hnae3_handle *handle)
+static void hclge_reset_event(struct pci_dev *pdev, struct hnae3_handle *handle)
 {
-	struct hclge_vport *vport = hclge_get_vport(handle);
-	struct hclge_dev *hdev = vport->back;
+	struct hnae3_ae_dev *ae_dev = pci_get_drvdata(pdev);
+	struct hclge_dev *hdev = ae_dev->priv;
 
-	/* check if this is a new reset request and we are not here just because
+	/* We might end up getting called broadly because of 2 below cases:
+	 * 1. Recoverable error was conveyed through APEI and only way to bring
+	 *    normalcy is to reset.
+	 * 2. A new reset request from the stack due to timeout
+	 *
+	 * For the first case,error event might not have ae handle available.
+	 * check if this is a new reset request and we are not here just because
 	 * last reset attempt did not succeed and watchdog hit us again. We will
 	 * know this if last reset request did not occur very recently (watchdog
 	 * timer = 5*HZ, let us check after sufficiently large time, say 4*5*Hz)
@@ -2503,6 +2509,9 @@ static void hclge_reset_event(struct hnae3_handle *handle)
 	 * want to make sure we throttle the reset request. Therefore, we will
 	 * not allow it again before 3*HZ times.
 	 */
+	if (!handle)
+		handle = &hdev->vport[0].nic;
+
 	if (time_before(jiffies, (handle->last_reset_time + 3 * HZ)))
 		return;
 	else if (time_after(jiffies, (handle->last_reset_time + 4 * 5 * HZ)))
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index ac67fec..e0a86a5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -1214,7 +1214,8 @@ static int hclgevf_do_reset(struct hclgevf_dev *hdev)
 	return status;
 }
 
-static void hclgevf_reset_event(struct hnae3_handle *handle)
+static void hclgevf_reset_event(struct pci_dev *pdev,
+				struct hnae3_handle *handle)
 {
 	struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next 3/7] net: hns3: Add support to enable and disable hw errors
From: Salil Mehta @ 2018-10-19 19:15 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm, Shiju Jose
In-Reply-To: <20181019191532.10088-1-salil.mehta@huawei.com>

From: Shiju Jose <shiju.jose@huawei.com>

This patch adds functions to enable and disable hw errors.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c | 22 ++++++++++++++++++++++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h |  2 ++
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    |  8 ++++++++
 3 files changed, 32 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
index 83aca6f..d2640d1 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
@@ -7,6 +7,28 @@ static const struct hclge_hw_blk hw_blk[] = {
 	{ /* sentinel */ }
 };
 
+int hclge_hw_error_set_state(struct hclge_dev *hdev, bool state)
+{
+	struct device *dev = &hdev->pdev->dev;
+	int ret = 0;
+	int i = 0;
+
+	while (hw_blk[i].name) {
+		if (!hw_blk[i].enable_error) {
+			i++;
+			continue;
+		}
+		ret = hw_blk[i].enable_error(hdev, state);
+		if (ret) {
+			dev_err(dev, "fail(%d) to en/disable err int\n", ret);
+			return ret;
+		}
+		i++;
+	}
+
+	return ret;
+}
+
 pci_ers_result_t hclge_process_ras_hw_error(struct hnae3_ae_dev *ae_dev)
 {
 	struct hclge_dev *hdev = ae_dev->priv;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
index ea1637c..373e9bf 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
@@ -21,9 +21,11 @@ enum hclge_err_int_type {
 struct hclge_hw_blk {
 	u32 msk;
 	const char *name;
+	int (*enable_error)(struct hclge_dev *hdev, bool en);
 	void (*process_error)(struct hclge_dev *hdev,
 			      enum hclge_err_int_type type);
 };
 
+int hclge_hw_error_set_state(struct hclge_dev *hdev, bool state);
 pci_ers_result_t hclge_process_ras_hw_error(struct hnae3_ae_dev *ae_dev);
 #endif
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 5075365..082ea97 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -6759,6 +6759,13 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
 		goto err_mdiobus_unreg;
 	}
 
+	ret = hclge_hw_error_set_state(hdev, true);
+	if (ret) {
+		dev_err(&pdev->dev,
+			"hw error interrupts enable failed, ret =%d\n", ret);
+		goto err_mdiobus_unreg;
+	}
+
 	hclge_dcb_ops_set(hdev);
 
 	timer_setup(&hdev->service_timer, hclge_service_timer, 0);
@@ -6896,6 +6903,7 @@ static void hclge_uninit_ae_dev(struct hnae3_ae_dev *ae_dev)
 	hclge_enable_vector(&hdev->misc_vector, false);
 	synchronize_irq(hdev->misc_vector.vector_irq);
 
+	hclge_hw_error_set_state(hdev, false);
 	hclge_destroy_cmd_queue(&hdev->hw);
 	hclge_misc_irq_uninit(hdev);
 	hclge_pci_uninit(hdev);
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next 4/7] net: hns3: Add enable and process common ecc errors
From: Salil Mehta @ 2018-10-19 19:15 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm, Shiju Jose
In-Reply-To: <20181019191532.10088-1-salil.mehta@huawei.com>

From: Shiju Jose <shiju.jose@huawei.com>

This patch adds enable and processing of ecc errors from
common HNS blocks, CMDQ(Command Queue),
IMP(Integrated Management Processor) and TQP(Task Queue Pair).

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |   3 +
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c | 285 +++++++++++++++++++++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h |  30 +++
 3 files changed, 318 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
index 1ccde67..8525f18 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
@@ -209,6 +209,9 @@ enum hclge_opcode_type {
 
 	/* Led command */
 	HCLGE_OPC_LED_STATUS_CFG	= 0xB000,
+
+	/* Error INT commands */
+	HCLGE_COMMON_ECC_INT_CFG	= 0x1505,
 };
 
 #define HCLGE_TQP_REG_OFFSET		0x80000
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
index d2640d1..8b37de4 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
@@ -3,7 +3,292 @@
 
 #include "hclge_err.h"
 
+static const struct hclge_hw_error hclge_imp_tcm_ecc_int[] = {
+	{ .int_msk = BIT(0), .msg = "imp_itcm0_ecc_1bit_err" },
+	{ .int_msk = BIT(1), .msg = "imp_itcm0_ecc_mbit_err" },
+	{ .int_msk = BIT(2), .msg = "imp_itcm1_ecc_1bit_err" },
+	{ .int_msk = BIT(3), .msg = "imp_itcm1_ecc_mbit_err" },
+	{ .int_msk = BIT(4), .msg = "imp_itcm2_ecc_1bit_err" },
+	{ .int_msk = BIT(5), .msg = "imp_itcm2_ecc_mbit_err" },
+	{ .int_msk = BIT(6), .msg = "imp_itcm3_ecc_1bit_err" },
+	{ .int_msk = BIT(7), .msg = "imp_itcm3_ecc_mbit_err" },
+	{ .int_msk = BIT(8), .msg = "imp_dtcm0_mem0_ecc_1bit_err" },
+	{ .int_msk = BIT(9), .msg = "imp_dtcm0_mem0_ecc_mbit_err" },
+	{ .int_msk = BIT(10), .msg = "imp_dtcm0_mem1_ecc_1bit_err" },
+	{ .int_msk = BIT(11), .msg = "imp_dtcm0_mem1_ecc_mbit_err" },
+	{ .int_msk = BIT(12), .msg = "imp_dtcm1_mem0_ecc_1bit_err" },
+	{ .int_msk = BIT(13), .msg = "imp_dtcm1_mem0_ecc_mbit_err" },
+	{ .int_msk = BIT(14), .msg = "imp_dtcm1_mem1_ecc_1bit_err" },
+	{ .int_msk = BIT(15), .msg = "imp_dtcm1_mem1_ecc_mbit_err" },
+	{ /* sentinel */ }
+};
+
+static const struct hclge_hw_error hclge_imp_itcm4_ecc_int[] = {
+	{ .int_msk = BIT(0), .msg = "imp_itcm4_ecc_1bit_err" },
+	{ .int_msk = BIT(1), .msg = "imp_itcm4_ecc_mbit_err" },
+	{ /* sentinel */ }
+};
+
+static const struct hclge_hw_error hclge_cmdq_nic_mem_ecc_int[] = {
+	{ .int_msk = BIT(0), .msg = "cmdq_nic_rx_depth_ecc_1bit_err" },
+	{ .int_msk = BIT(1), .msg = "cmdq_nic_rx_depth_ecc_mbit_err" },
+	{ .int_msk = BIT(2), .msg = "cmdq_nic_tx_depth_ecc_1bit_err" },
+	{ .int_msk = BIT(3), .msg = "cmdq_nic_tx_depth_ecc_mbit_err" },
+	{ .int_msk = BIT(4), .msg = "cmdq_nic_rx_tail_ecc_1bit_err" },
+	{ .int_msk = BIT(5), .msg = "cmdq_nic_rx_tail_ecc_mbit_err" },
+	{ .int_msk = BIT(6), .msg = "cmdq_nic_tx_tail_ecc_1bit_err" },
+	{ .int_msk = BIT(7), .msg = "cmdq_nic_tx_tail_ecc_mbit_err" },
+	{ .int_msk = BIT(8), .msg = "cmdq_nic_rx_head_ecc_1bit_err" },
+	{ .int_msk = BIT(9), .msg = "cmdq_nic_rx_head_ecc_mbit_err" },
+	{ .int_msk = BIT(10), .msg = "cmdq_nic_tx_head_ecc_1bit_err" },
+	{ .int_msk = BIT(11), .msg = "cmdq_nic_tx_head_ecc_mbit_err" },
+	{ .int_msk = BIT(12), .msg = "cmdq_nic_rx_addr_ecc_1bit_err" },
+	{ .int_msk = BIT(13), .msg = "cmdq_nic_rx_addr_ecc_mbit_err" },
+	{ .int_msk = BIT(14), .msg = "cmdq_nic_tx_addr_ecc_1bit_err" },
+	{ .int_msk = BIT(15), .msg = "cmdq_nic_tx_addr_ecc_mbit_err" },
+	{ /* sentinel */ }
+};
+
+static const struct hclge_hw_error hclge_cmdq_rocee_mem_ecc_int[] = {
+	{ .int_msk = BIT(0), .msg = "cmdq_rocee_rx_depth_ecc_1bit_err" },
+	{ .int_msk = BIT(1), .msg = "cmdq_rocee_rx_depth_ecc_mbit_err" },
+	{ .int_msk = BIT(2), .msg = "cmdq_rocee_tx_depth_ecc_1bit_err" },
+	{ .int_msk = BIT(3), .msg = "cmdq_rocee_tx_depth_ecc_mbit_err" },
+	{ .int_msk = BIT(4), .msg = "cmdq_rocee_rx_tail_ecc_1bit_err" },
+	{ .int_msk = BIT(5), .msg = "cmdq_rocee_rx_tail_ecc_mbit_err" },
+	{ .int_msk = BIT(6), .msg = "cmdq_rocee_tx_tail_ecc_1bit_err" },
+	{ .int_msk = BIT(7), .msg = "cmdq_rocee_tx_tail_ecc_mbit_err" },
+	{ .int_msk = BIT(8), .msg = "cmdq_rocee_rx_head_ecc_1bit_err" },
+	{ .int_msk = BIT(9), .msg = "cmdq_rocee_rx_head_ecc_mbit_err" },
+	{ .int_msk = BIT(10), .msg = "cmdq_rocee_tx_head_ecc_1bit_err" },
+	{ .int_msk = BIT(11), .msg = "cmdq_rocee_tx_head_ecc_mbit_err" },
+	{ .int_msk = BIT(12), .msg = "cmdq_rocee_rx_addr_ecc_1bit_err" },
+	{ .int_msk = BIT(13), .msg = "cmdq_rocee_rx_addr_ecc_mbit_err" },
+	{ .int_msk = BIT(14), .msg = "cmdq_rocee_tx_addr_ecc_1bit_err" },
+	{ .int_msk = BIT(15), .msg = "cmdq_rocee_tx_addr_ecc_mbit_err" },
+	{ /* sentinel */ }
+};
+
+static const struct hclge_hw_error hclge_tqp_int_ecc_int[] = {
+	{ .int_msk = BIT(0), .msg = "tqp_int_cfg_even_ecc_1bit_err" },
+	{ .int_msk = BIT(1), .msg = "tqp_int_cfg_odd_ecc_1bit_err" },
+	{ .int_msk = BIT(2), .msg = "tqp_int_ctrl_even_ecc_1bit_err" },
+	{ .int_msk = BIT(3), .msg = "tqp_int_ctrl_odd_ecc_1bit_err" },
+	{ .int_msk = BIT(4), .msg = "tx_que_scan_int_ecc_1bit_err" },
+	{ .int_msk = BIT(5), .msg = "rx_que_scan_int_ecc_1bit_err" },
+	{ .int_msk = BIT(6), .msg = "tqp_int_cfg_even_ecc_mbit_err" },
+	{ .int_msk = BIT(7), .msg = "tqp_int_cfg_odd_ecc_mbit_err" },
+	{ .int_msk = BIT(8), .msg = "tqp_int_ctrl_even_ecc_mbit_err" },
+	{ .int_msk = BIT(9), .msg = "tqp_int_ctrl_odd_ecc_mbit_err" },
+	{ .int_msk = BIT(10), .msg = "tx_que_scan_int_ecc_mbit_err" },
+	{ .int_msk = BIT(11), .msg = "rx_que_scan_int_ecc_mbit_err" },
+	{ /* sentinel */ }
+};
+
+static void hclge_log_error(struct device *dev,
+			    const struct hclge_hw_error *err_list,
+			    u32 err_sts)
+{
+	const struct hclge_hw_error *err;
+	int i = 0;
+
+	while (err_list[i].msg) {
+		err = &err_list[i];
+		if (!(err->int_msk & err_sts)) {
+			i++;
+			continue;
+		}
+		dev_warn(dev, "%s [error status=0x%x] found\n",
+			 err->msg, err_sts);
+		i++;
+	}
+}
+
+/* hclge_cmd_query_error: read the error information
+ * @hdev: pointer to struct hclge_dev
+ * @desc: descriptor for describing the command
+ * @cmd:  command opcode
+ * @flag: flag for extended command structure
+ * @w_num: offset for setting the read interrupt type.
+ * @int_type: select which type of the interrupt for which the error
+ * info will be read(RAS-CE/RAS-NFE/RAS-FE etc).
+ *
+ * This function query the error info from hw register/s using command
+ */
+static int hclge_cmd_query_error(struct hclge_dev *hdev,
+				 struct hclge_desc *desc, u32 cmd,
+				 u16 flag, u8 w_num,
+				 enum hclge_err_int_type int_type)
+{
+	struct device *dev = &hdev->pdev->dev;
+	int num = 1;
+	int ret;
+
+	hclge_cmd_setup_basic_desc(&desc[0], cmd, true);
+	if (flag) {
+		desc[0].flag |= cpu_to_le16(flag);
+		hclge_cmd_setup_basic_desc(&desc[1], cmd, true);
+		num = 2;
+	}
+	if (w_num)
+		desc[0].data[w_num] = cpu_to_le32(int_type);
+
+	ret = hclge_cmd_send(&hdev->hw, &desc[0], num);
+	if (ret)
+		dev_err(dev, "query error cmd failed (%d)\n", ret);
+
+	return ret;
+}
+
+/* hclge_cmd_clear_error: clear the error status
+ * @hdev: pointer to struct hclge_dev
+ * @desc: descriptor for describing the command
+ * @desc_src: prefilled descriptor from the previous command for reusing
+ * @cmd:  command opcode
+ * @flag: flag for extended command structure
+ *
+ * This function clear the error status in the hw register/s using command
+ */
+static int hclge_cmd_clear_error(struct hclge_dev *hdev,
+				 struct hclge_desc *desc,
+				 struct hclge_desc *desc_src,
+				 u32 cmd, u16 flag)
+{
+	struct device *dev = &hdev->pdev->dev;
+	int num = 1;
+	int ret, i;
+
+	if (cmd) {
+		hclge_cmd_setup_basic_desc(&desc[0], cmd, false);
+		if (flag) {
+			desc[0].flag |= cpu_to_le16(flag);
+			hclge_cmd_setup_basic_desc(&desc[1], cmd, false);
+			num = 2;
+		}
+		if (desc_src) {
+			for (i = 0; i < 6; i++) {
+				desc[0].data[i] = desc_src[0].data[i];
+				if (flag)
+					desc[1].data[i] = desc_src[1].data[i];
+			}
+		}
+	} else {
+		hclge_cmd_reuse_desc(&desc[0], false);
+		if (flag) {
+			desc[0].flag |= cpu_to_le16(flag);
+			hclge_cmd_reuse_desc(&desc[1], false);
+			num = 2;
+		}
+	}
+	ret = hclge_cmd_send(&hdev->hw, &desc[0], num);
+	if (ret)
+		dev_err(dev, "clear error cmd failed (%d)\n", ret);
+
+	return ret;
+}
+
+static int hclge_enable_common_error(struct hclge_dev *hdev, bool en)
+{
+	struct device *dev = &hdev->pdev->dev;
+	struct hclge_desc desc[2];
+	int ret;
+
+	hclge_cmd_setup_basic_desc(&desc[0], HCLGE_COMMON_ECC_INT_CFG, false);
+	desc[0].flag |= cpu_to_le16(HCLGE_CMD_FLAG_NEXT);
+	hclge_cmd_setup_basic_desc(&desc[1], HCLGE_COMMON_ECC_INT_CFG, false);
+
+	if (en) {
+		/* enable COMMON error interrupts */
+		desc[0].data[0] = cpu_to_le32(HCLGE_IMP_TCM_ECC_ERR_INT_EN);
+		desc[0].data[2] = cpu_to_le32(HCLGE_CMDQ_NIC_ECC_ERR_INT_EN |
+					HCLGE_CMDQ_ROCEE_ECC_ERR_INT_EN);
+		desc[0].data[3] = cpu_to_le32(HCLGE_IMP_RD_POISON_ERR_INT_EN);
+		desc[0].data[4] = cpu_to_le32(HCLGE_TQP_ECC_ERR_INT_EN);
+		desc[0].data[5] = cpu_to_le32(HCLGE_IMP_ITCM4_ECC_ERR_INT_EN);
+	} else {
+		/* disable COMMON error interrupts */
+		desc[0].data[0] = 0;
+		desc[0].data[2] = 0;
+		desc[0].data[3] = 0;
+		desc[0].data[4] = 0;
+		desc[0].data[5] = 0;
+	}
+	desc[1].data[0] = cpu_to_le32(HCLGE_IMP_TCM_ECC_ERR_INT_EN_MASK);
+	desc[1].data[2] = cpu_to_le32(HCLGE_CMDQ_NIC_ECC_ERR_INT_EN_MASK |
+				HCLGE_CMDQ_ROCEE_ECC_ERR_INT_EN_MASK);
+	desc[1].data[3] = cpu_to_le32(HCLGE_IMP_RD_POISON_ERR_INT_EN_MASK);
+	desc[1].data[4] = cpu_to_le32(HCLGE_TQP_ECC_ERR_INT_EN_MASK);
+	desc[1].data[5] = cpu_to_le32(HCLGE_IMP_ITCM4_ECC_ERR_INT_EN_MASK);
+
+	ret = hclge_cmd_send(&hdev->hw, &desc[0], 2);
+	if (ret)
+		dev_err(dev,
+			"failed(%d) to enable/disable COMMON err interrupts\n",
+			ret);
+
+	return ret;
+}
+
+static void hclge_process_common_error(struct hclge_dev *hdev,
+				       enum hclge_err_int_type type)
+{
+	struct device *dev = &hdev->pdev->dev;
+	struct hclge_desc desc[2];
+	u32 err_sts;
+	int ret;
+
+	/* read err sts */
+	ret = hclge_cmd_query_error(hdev, &desc[0],
+				    HCLGE_COMMON_ECC_INT_CFG,
+				    HCLGE_CMD_FLAG_NEXT, 0, 0);
+	if (ret) {
+		dev_err(dev,
+			"failed(=%d) to query COMMON error interrupt status\n",
+			ret);
+		return;
+	}
+
+	/* log err */
+	err_sts = (le32_to_cpu(desc[0].data[0])) & HCLGE_IMP_TCM_ECC_INT_MASK;
+	hclge_log_error(dev, &hclge_imp_tcm_ecc_int[0], err_sts);
+
+	err_sts = (le32_to_cpu(desc[0].data[1])) & HCLGE_CMDQ_ECC_INT_MASK;
+	hclge_log_error(dev, &hclge_cmdq_nic_mem_ecc_int[0], err_sts);
+
+	err_sts = (le32_to_cpu(desc[0].data[1]) >> HCLGE_CMDQ_ROC_ECC_INT_SHIFT)
+		   & HCLGE_CMDQ_ECC_INT_MASK;
+	hclge_log_error(dev, &hclge_cmdq_rocee_mem_ecc_int[0], err_sts);
+
+	if ((le32_to_cpu(desc[0].data[3])) & BIT(0))
+		dev_warn(dev, "imp_rd_data_poison_err found\n");
+
+	err_sts = (le32_to_cpu(desc[0].data[3]) >> HCLGE_TQP_ECC_INT_SHIFT) &
+		   HCLGE_TQP_ECC_INT_MASK;
+	hclge_log_error(dev, &hclge_tqp_int_ecc_int[0], err_sts);
+
+	err_sts = (le32_to_cpu(desc[0].data[5])) &
+		   HCLGE_IMP_ITCM4_ECC_INT_MASK;
+	hclge_log_error(dev, &hclge_imp_itcm4_ecc_int[0], err_sts);
+
+	/* clear error interrupts */
+	desc[1].data[0] = cpu_to_le32(HCLGE_IMP_TCM_ECC_CLR_MASK);
+	desc[1].data[1] = cpu_to_le32(HCLGE_CMDQ_NIC_ECC_CLR_MASK |
+				HCLGE_CMDQ_ROCEE_ECC_CLR_MASK);
+	desc[1].data[3] = cpu_to_le32(HCLGE_TQP_IMP_ERR_CLR_MASK);
+	desc[1].data[5] = cpu_to_le32(HCLGE_IMP_ITCM4_ECC_CLR_MASK);
+
+	ret = hclge_cmd_clear_error(hdev, &desc[0], NULL, 0,
+				    HCLGE_CMD_FLAG_NEXT);
+	if (ret)
+		dev_err(dev,
+			"failed(%d) to clear COMMON error interrupt status\n",
+			ret);
+}
+
 static const struct hclge_hw_blk hw_blk[] = {
+	{ .msk = BIT(5), .name = "COMMON",
+	  .enable_error = hclge_enable_common_error,
+	  .process_error = hclge_process_common_error, },
 	{ /* sentinel */ }
 };
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
index 373e9bf..b413141 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
@@ -11,6 +11,31 @@
 #define HCLGE_RAS_REG_NFE_MASK   0xFF00
 #define HCLGE_RAS_REG_NFE_SHIFT	8
 
+#define HCLGE_IMP_TCM_ECC_ERR_INT_EN	0xFFFF0000
+#define HCLGE_IMP_TCM_ECC_ERR_INT_EN_MASK	0xFFFF0000
+#define HCLGE_IMP_ITCM4_ECC_ERR_INT_EN	0x300
+#define HCLGE_IMP_ITCM4_ECC_ERR_INT_EN_MASK	0x300
+#define HCLGE_CMDQ_NIC_ECC_ERR_INT_EN	0xFFFF
+#define HCLGE_CMDQ_NIC_ECC_ERR_INT_EN_MASK	0xFFFF
+#define HCLGE_CMDQ_ROCEE_ECC_ERR_INT_EN	0xFFFF0000
+#define HCLGE_CMDQ_ROCEE_ECC_ERR_INT_EN_MASK	0xFFFF0000
+#define HCLGE_IMP_RD_POISON_ERR_INT_EN	0x0100
+#define HCLGE_IMP_RD_POISON_ERR_INT_EN_MASK	0x0100
+#define HCLGE_TQP_ECC_ERR_INT_EN	0x0FFF
+#define HCLGE_TQP_ECC_ERR_INT_EN_MASK	0x0FFF
+
+#define HCLGE_IMP_TCM_ECC_INT_MASK	0xFFFF
+#define HCLGE_IMP_ITCM4_ECC_INT_MASK	0x3
+#define HCLGE_CMDQ_ECC_INT_MASK		0xFFFF
+#define HCLGE_CMDQ_ROC_ECC_INT_SHIFT	16
+#define HCLGE_TQP_ECC_INT_MASK		0xFFF
+#define HCLGE_TQP_ECC_INT_SHIFT		16
+#define HCLGE_IMP_TCM_ECC_CLR_MASK	0xFFFF
+#define HCLGE_IMP_ITCM4_ECC_CLR_MASK	0x3
+#define HCLGE_CMDQ_NIC_ECC_CLR_MASK	0xFFFF
+#define HCLGE_CMDQ_ROCEE_ECC_CLR_MASK	0xFFFF0000
+#define HCLGE_TQP_IMP_ERR_CLR_MASK	0x0FFF0001
+
 enum hclge_err_int_type {
 	HCLGE_ERR_INT_MSIX = 0,
 	HCLGE_ERR_INT_RAS_CE = 1,
@@ -26,6 +51,11 @@ struct hclge_hw_blk {
 			      enum hclge_err_int_type type);
 };
 
+struct hclge_hw_error {
+	u32 int_msk;
+	const char *msg;
+};
+
 int hclge_hw_error_set_state(struct hclge_dev *hdev, bool state);
 pci_ers_result_t hclge_process_ras_hw_error(struct hnae3_ae_dev *ae_dev);
 #endif
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next 5/7] net: hns3: Add enable and process hw errors from IGU, EGU and NCSI
From: Salil Mehta @ 2018-10-19 19:15 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm, Shiju Jose
In-Reply-To: <20181019191532.10088-1-salil.mehta@huawei.com>

From: Shiju Jose <shiju.jose@huawei.com>

This patch adds enable and processing of hw errors from IGU(Ingress Unit),
EGU(Egress Unit) and NCSI(Network Controller Sideband Interface).

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |   9 +
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c | 190 +++++++++++++++++++++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h |   8 +
 3 files changed, 207 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
index 8525f18..97c90aa 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
@@ -212,6 +212,15 @@ enum hclge_opcode_type {
 
 	/* Error INT commands */
 	HCLGE_COMMON_ECC_INT_CFG	= 0x1505,
+	HCLGE_IGU_EGU_TNL_INT_QUERY	= 0x1802,
+	HCLGE_IGU_EGU_TNL_INT_EN	= 0x1803,
+	HCLGE_IGU_EGU_TNL_INT_CLR	= 0x1804,
+	HCLGE_IGU_COMMON_INT_QUERY	= 0x1805,
+	HCLGE_IGU_COMMON_INT_EN		= 0x1806,
+	HCLGE_IGU_COMMON_INT_CLR	= 0x1807,
+	HCLGE_NCSI_INT_QUERY		= 0x2400,
+	HCLGE_NCSI_INT_EN		= 0x2401,
+	HCLGE_NCSI_INT_CLR		= 0x2402,
 };
 
 #define HCLGE_TQP_REG_OFFSET		0x80000
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
index 8b37de4..34c4edd 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
@@ -85,6 +85,30 @@ static const struct hclge_hw_error hclge_tqp_int_ecc_int[] = {
 	{ /* sentinel */ }
 };
 
+static const struct hclge_hw_error hclge_igu_com_err_int[] = {
+	{ .int_msk = BIT(0), .msg = "igu_rx_buf0_ecc_mbit_err" },
+	{ .int_msk = BIT(1), .msg = "igu_rx_buf0_ecc_1bit_err" },
+	{ .int_msk = BIT(2), .msg = "igu_rx_buf1_ecc_mbit_err" },
+	{ .int_msk = BIT(3), .msg = "igu_rx_buf1_ecc_1bit_err" },
+	{ /* sentinel */ }
+};
+
+static const struct hclge_hw_error hclge_igu_egu_tnl_err_int[] = {
+	{ .int_msk = BIT(0), .msg = "rx_buf_overflow" },
+	{ .int_msk = BIT(1), .msg = "rx_stp_fifo_overflow" },
+	{ .int_msk = BIT(2), .msg = "rx_stp_fifo_undeflow" },
+	{ .int_msk = BIT(3), .msg = "tx_buf_overflow" },
+	{ .int_msk = BIT(4), .msg = "tx_buf_underrun" },
+	{ .int_msk = BIT(5), .msg = "rx_stp_buf_overflow" },
+	{ /* sentinel */ }
+};
+
+static const struct hclge_hw_error hclge_ncsi_err_int[] = {
+	{ .int_msk = BIT(0), .msg = "ncsi_tx_ecc_1bit_err" },
+	{ .int_msk = BIT(1), .msg = "ncsi_tx_ecc_mbit_err" },
+	{ /* sentinel */ }
+};
+
 static void hclge_log_error(struct device *dev,
 			    const struct hclge_hw_error *err_list,
 			    u32 err_sts)
@@ -229,6 +253,75 @@ static int hclge_enable_common_error(struct hclge_dev *hdev, bool en)
 	return ret;
 }
 
+static int hclge_enable_ncsi_error(struct hclge_dev *hdev, bool en)
+{
+	struct device *dev = &hdev->pdev->dev;
+	struct hclge_desc desc;
+	int ret;
+
+	if (hdev->pdev->revision < 0x21)
+		return 0;
+
+	/* enable/disable NCSI  error interrupts */
+	hclge_cmd_setup_basic_desc(&desc, HCLGE_NCSI_INT_EN, false);
+	if (en)
+		desc.data[0] = cpu_to_le32(HCLGE_NCSI_ERR_INT_EN);
+	else
+		desc.data[0] = 0;
+
+	ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+	if (ret)
+		dev_err(dev,
+			"failed(%d) to enable/disable NCSI error interrupts\n",
+			ret);
+
+	return ret;
+}
+
+static int hclge_enable_igu_egu_error(struct hclge_dev *hdev, bool en)
+{
+	struct device *dev = &hdev->pdev->dev;
+	struct hclge_desc desc;
+	int ret;
+
+	/* enable/disable error interrupts */
+	hclge_cmd_setup_basic_desc(&desc, HCLGE_IGU_COMMON_INT_EN, false);
+	if (en)
+		desc.data[0] = cpu_to_le32(HCLGE_IGU_ERR_INT_EN);
+	else
+		desc.data[0] = 0;
+	desc.data[1] = cpu_to_le32(HCLGE_IGU_ERR_INT_EN_MASK);
+
+	ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+	if (ret) {
+		dev_err(dev,
+			"failed(%d) to enable/disable IGU common interrupts\n",
+			ret);
+		return ret;
+	}
+
+	hclge_cmd_setup_basic_desc(&desc, HCLGE_IGU_EGU_TNL_INT_EN, false);
+	if (en)
+		desc.data[0] = cpu_to_le32(HCLGE_IGU_TNL_ERR_INT_EN);
+	else
+		desc.data[0] = 0;
+	desc.data[1] = cpu_to_le32(HCLGE_IGU_TNL_ERR_INT_EN_MASK);
+
+	ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+	if (ret) {
+		dev_err(dev,
+			"failed(%d) to enable/disable IGU-EGU TNL interrupts\n",
+			ret);
+		return ret;
+	}
+
+	ret = hclge_enable_ncsi_error(hdev, en);
+	if (ret)
+		dev_err(dev, "fail(%d) to en/disable err int\n", ret);
+
+	return ret;
+}
+
 static void hclge_process_common_error(struct hclge_dev *hdev,
 				       enum hclge_err_int_type type)
 {
@@ -285,7 +378,104 @@ static void hclge_process_common_error(struct hclge_dev *hdev,
 			ret);
 }
 
+static void hclge_process_ncsi_error(struct hclge_dev *hdev,
+				     enum hclge_err_int_type type)
+{
+	struct device *dev = &hdev->pdev->dev;
+	struct hclge_desc desc_rd;
+	struct hclge_desc desc_wr;
+	u32 err_sts;
+	int ret;
+
+	if (hdev->pdev->revision < 0x21)
+		return;
+
+	/* read NCSI error status */
+	ret = hclge_cmd_query_error(hdev, &desc_rd, HCLGE_NCSI_INT_QUERY,
+				    0, 1, HCLGE_NCSI_ERR_INT_TYPE);
+	if (ret) {
+		dev_err(dev,
+			"failed(=%d) to query NCSI error interrupt status\n",
+			ret);
+		return;
+	}
+
+	/* log err */
+	err_sts = le32_to_cpu(desc_rd.data[0]);
+	hclge_log_error(dev, &hclge_ncsi_err_int[0], err_sts);
+
+	/* clear err int */
+	ret = hclge_cmd_clear_error(hdev, &desc_wr, &desc_rd,
+				    HCLGE_NCSI_INT_CLR, 0);
+	if (ret)
+		dev_err(dev, "failed(=%d) to clear NCSI intrerrupt status\n",
+			ret);
+}
+
+static void hclge_process_igu_egu_error(struct hclge_dev *hdev,
+					enum hclge_err_int_type int_type)
+{
+	struct device *dev = &hdev->pdev->dev;
+	struct hclge_desc desc_rd;
+	struct hclge_desc desc_wr;
+	u32 err_sts;
+	int ret;
+
+	/* read IGU common err sts */
+	ret = hclge_cmd_query_error(hdev, &desc_rd,
+				    HCLGE_IGU_COMMON_INT_QUERY,
+				    0, 1, int_type);
+	if (ret) {
+		dev_err(dev, "failed(=%d) to query IGU common int status\n",
+			ret);
+		return;
+	}
+
+	/* log err */
+	err_sts = le32_to_cpu(desc_rd.data[0]) &
+				   HCLGE_IGU_COM_INT_MASK;
+	hclge_log_error(dev, &hclge_igu_com_err_int[0], err_sts);
+
+	/* clear err int */
+	ret = hclge_cmd_clear_error(hdev, &desc_wr, &desc_rd,
+				    HCLGE_IGU_COMMON_INT_CLR, 0);
+	if (ret) {
+		dev_err(dev, "failed(=%d) to clear IGU common int status\n",
+			ret);
+		return;
+	}
+
+	/* read IGU-EGU TNL err sts */
+	ret = hclge_cmd_query_error(hdev, &desc_rd,
+				    HCLGE_IGU_EGU_TNL_INT_QUERY,
+				    0, 1, int_type);
+	if (ret) {
+		dev_err(dev, "failed(=%d) to query IGU-EGU TNL int status\n",
+			ret);
+		return;
+	}
+
+	/* log err */
+	err_sts = le32_to_cpu(desc_rd.data[0]) &
+				   HCLGE_IGU_EGU_TNL_INT_MASK;
+	hclge_log_error(dev, &hclge_igu_egu_tnl_err_int[0], err_sts);
+
+	/* clear err int */
+	ret = hclge_cmd_clear_error(hdev, &desc_wr, &desc_rd,
+				    HCLGE_IGU_EGU_TNL_INT_CLR, 0);
+	if (ret) {
+		dev_err(dev, "failed(=%d) to clear IGU-EGU TNL int status\n",
+			ret);
+		return;
+	}
+
+	hclge_process_ncsi_error(hdev, HCLGE_ERR_INT_RAS_NFE);
+}
+
 static const struct hclge_hw_blk hw_blk[] = {
+	{ .msk = BIT(0), .name = "IGU_EGU",
+	  .enable_error = hclge_enable_igu_egu_error,
+	  .process_error = hclge_process_igu_egu_error, },
 	{ .msk = BIT(5), .name = "COMMON",
 	  .enable_error = hclge_enable_common_error,
 	  .process_error = hclge_process_common_error, },
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
index b413141..f46c8c2 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
@@ -23,6 +23,12 @@
 #define HCLGE_IMP_RD_POISON_ERR_INT_EN_MASK	0x0100
 #define HCLGE_TQP_ECC_ERR_INT_EN	0x0FFF
 #define HCLGE_TQP_ECC_ERR_INT_EN_MASK	0x0FFF
+#define HCLGE_IGU_ERR_INT_EN	0x0000066F
+#define HCLGE_IGU_ERR_INT_EN_MASK	0x000F
+#define HCLGE_IGU_TNL_ERR_INT_EN    0x0002AABF
+#define HCLGE_IGU_TNL_ERR_INT_EN_MASK  0x003F
+#define HCLGE_NCSI_ERR_INT_EN	0x3
+#define HCLGE_NCSI_ERR_INT_TYPE	0x9
 
 #define HCLGE_IMP_TCM_ECC_INT_MASK	0xFFFF
 #define HCLGE_IMP_ITCM4_ECC_INT_MASK	0x3
@@ -35,6 +41,8 @@
 #define HCLGE_CMDQ_NIC_ECC_CLR_MASK	0xFFFF
 #define HCLGE_CMDQ_ROCEE_ECC_CLR_MASK	0xFFFF0000
 #define HCLGE_TQP_IMP_ERR_CLR_MASK	0x0FFF0001
+#define HCLGE_IGU_COM_INT_MASK		0xF
+#define HCLGE_IGU_EGU_TNL_INT_MASK	0x3F
 
 enum hclge_err_int_type {
 	HCLGE_ERR_INT_MSIX = 0,
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next 6/7] net: hns3: Add enable and process hw errors from PPP
From: Salil Mehta @ 2018-10-19 19:15 UTC (permalink / raw)
  To: davem
  Cc: salil.mehta, yisen.zhuang, lipeng321, mehta.salil, netdev,
	linux-kernel, linuxarm, Shiju Jose
In-Reply-To: <20181019191532.10088-1-salil.mehta@huawei.com>

From: Shiju Jose <shiju.jose@huawei.com>

This patch enables and process hw errors from the
PPP(Programmable Packet Process) block.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h |   2 +
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c | 267 +++++++++++++++++++++
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h |  11 +
 3 files changed, 280 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
index 97c90aa..f23cf78 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
@@ -218,6 +218,8 @@ enum hclge_opcode_type {
 	HCLGE_IGU_COMMON_INT_QUERY	= 0x1805,
 	HCLGE_IGU_COMMON_INT_EN		= 0x1806,
 	HCLGE_IGU_COMMON_INT_CLR	= 0x1807,
+	HCLGE_PPP_CMD0_INT_CMD		= 0x2100,
+	HCLGE_PPP_CMD1_INT_CMD		= 0x2101,
 	HCLGE_NCSI_INT_QUERY		= 0x2400,
 	HCLGE_NCSI_INT_EN		= 0x2401,
 	HCLGE_NCSI_INT_CLR		= 0x2402,
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
index 34c4edd..ea73def 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c
@@ -109,6 +109,110 @@ static const struct hclge_hw_error hclge_ncsi_err_int[] = {
 	{ /* sentinel */ }
 };
 
+static const struct hclge_hw_error hclge_ppp_mpf_int0[] = {
+	{ .int_msk = BIT(0), .msg = "vf_vlan_ad_mem_ecc_1bit_err" },
+	{ .int_msk = BIT(1), .msg = "umv_mcast_group_mem_ecc_1bit_err" },
+	{ .int_msk = BIT(2), .msg = "umv_key_mem0_ecc_1bit_err" },
+	{ .int_msk = BIT(3), .msg = "umv_key_mem1_ecc_1bit_err" },
+	{ .int_msk = BIT(4), .msg = "umv_key_mem2_ecc_1bit_err" },
+	{ .int_msk = BIT(5), .msg = "umv_key_mem3_ecc_1bit_err" },
+	{ .int_msk = BIT(6), .msg = "umv_ad_mem_ecc_1bit_err" },
+	{ .int_msk = BIT(7), .msg = "rss_tc_mode_mem_ecc_1bit_err" },
+	{ .int_msk = BIT(8), .msg = "rss_idt_mem0_ecc_1bit_err" },
+	{ .int_msk = BIT(9), .msg = "rss_idt_mem1_ecc_1bit_err" },
+	{ .int_msk = BIT(10), .msg = "rss_idt_mem2_ecc_1bit_err" },
+	{ .int_msk = BIT(11), .msg = "rss_idt_mem3_ecc_1bit_err" },
+	{ .int_msk = BIT(12), .msg = "rss_idt_mem4_ecc_1bit_err" },
+	{ .int_msk = BIT(13), .msg = "rss_idt_mem5_ecc_1bit_err" },
+	{ .int_msk = BIT(14), .msg = "rss_idt_mem6_ecc_1bit_err" },
+	{ .int_msk = BIT(15), .msg = "rss_idt_mem7_ecc_1bit_err" },
+	{ .int_msk = BIT(16), .msg = "rss_idt_mem8_ecc_1bit_err" },
+	{ .int_msk = BIT(17), .msg = "rss_idt_mem9_ecc_1bit_err" },
+	{ .int_msk = BIT(18), .msg = "rss_idt_mem10_ecc_1bit_err" },
+	{ .int_msk = BIT(19), .msg = "rss_idt_mem11_ecc_1bit_err" },
+	{ .int_msk = BIT(20), .msg = "rss_idt_mem12_ecc_1bit_err" },
+	{ .int_msk = BIT(21), .msg = "rss_idt_mem13_ecc_1bit_err" },
+	{ .int_msk = BIT(22), .msg = "rss_idt_mem14_ecc_1bit_err" },
+	{ .int_msk = BIT(23), .msg = "rss_idt_mem15_ecc_1bit_err" },
+	{ .int_msk = BIT(24), .msg = "port_vlan_mem_ecc_1bit_err" },
+	{ .int_msk = BIT(25), .msg = "mcast_linear_table_mem_ecc_1bit_err" },
+	{ .int_msk = BIT(26), .msg = "mcast_result_mem_ecc_1bit_err" },
+	{ .int_msk = BIT(27),
+		.msg = "flow_director_ad_mem0_ecc_1bit_err" },
+	{ .int_msk = BIT(28),
+		.msg = "flow_director_ad_mem1_ecc_1bit_err" },
+	{ .int_msk = BIT(29),
+		.msg = "rx_vlan_tag_memory_ecc_1bit_err" },
+	{ .int_msk = BIT(30),
+		.msg = "Tx_UP_mapping_config_mem_ecc_1bit_err" },
+	{ /* sentinel */ }
+};
+
+static const struct hclge_hw_error hclge_ppp_mpf_int1[] = {
+	{ .int_msk = BIT(0), .msg = "vf_vlan_ad_mem_ecc_mbit_err" },
+	{ .int_msk = BIT(1), .msg = "umv_mcast_group_mem_ecc_mbit_err" },
+	{ .int_msk = BIT(2), .msg = "umv_key_mem0_ecc_mbit_err" },
+	{ .int_msk = BIT(3), .msg = "umv_key_mem1_ecc_mbit_err" },
+	{ .int_msk = BIT(4), .msg = "umv_key_mem2_ecc_mbit_err" },
+	{ .int_msk = BIT(5), .msg = "umv_key_mem3_ecc_mbit_err" },
+	{ .int_msk = BIT(6), .msg = "umv_ad_mem_ecc_mbit_erre" },
+	{ .int_msk = BIT(7), .msg = "rss_tc_mode_mem_ecc_mbit_err" },
+	{ .int_msk = BIT(8), .msg = "rss_idt_mem0_ecc_mbit_err" },
+	{ .int_msk = BIT(9), .msg = "rss_idt_mem1_ecc_mbit_err" },
+	{ .int_msk = BIT(10), .msg = "rss_idt_mem2_ecc_mbit_err" },
+	{ .int_msk = BIT(11), .msg = "rss_idt_mem3_ecc_mbit_err" },
+	{ .int_msk = BIT(12), .msg = "rss_idt_mem4_ecc_mbit_err" },
+	{ .int_msk = BIT(13), .msg = "rss_idt_mem5_ecc_mbit_err" },
+	{ .int_msk = BIT(14), .msg = "rss_idt_mem6_ecc_mbit_err" },
+	{ .int_msk = BIT(15), .msg = "rss_idt_mem7_ecc_mbit_err" },
+	{ .int_msk = BIT(16), .msg = "rss_idt_mem8_ecc_mbit_err" },
+	{ .int_msk = BIT(17), .msg = "rss_idt_mem9_ecc_mbit_err" },
+	{ .int_msk = BIT(18), .msg = "rss_idt_mem10_ecc_m1bit_err" },
+	{ .int_msk = BIT(19), .msg = "rss_idt_mem11_ecc_mbit_err" },
+	{ .int_msk = BIT(20), .msg = "rss_idt_mem12_ecc_mbit_err" },
+	{ .int_msk = BIT(21), .msg = "rss_idt_mem13_ecc_mbit_err" },
+	{ .int_msk = BIT(22), .msg = "rss_idt_mem14_ecc_mbit_err" },
+	{ .int_msk = BIT(23), .msg = "rss_idt_mem15_ecc_mbit_err" },
+	{ .int_msk = BIT(24), .msg = "port_vlan_mem_ecc_mbit_err" },
+	{ .int_msk = BIT(25), .msg = "mcast_linear_table_mem_ecc_mbit_err" },
+	{ .int_msk = BIT(26), .msg = "mcast_result_mem_ecc_mbit_err" },
+	{ .int_msk = BIT(27),
+		.msg = "flow_director_ad_mem0_ecc_mbit_err" },
+	{ .int_msk = BIT(28),
+		.msg = "flow_director_ad_mem1_ecc_mbit_err" },
+	{ .int_msk = BIT(29),
+		.msg = "rx_vlan_tag_memory_ecc_mbit_err" },
+	{ .int_msk = BIT(30),
+		.msg = "Tx_UP_mapping_config_mem_ecc_mbit_err" },
+	{ /* sentinel */ }
+};
+
+static const struct hclge_hw_error hclge_ppp_pf_int[] = {
+	{ .int_msk = BIT(0), .msg = "Tx_vlan_tag_err" },
+	{ .int_msk = BIT(1), .msg = "rss_list_tc_unassigned_queue_err" },
+	{ /* sentinel */ }
+};
+
+static const struct hclge_hw_error hclge_ppp_mpf_int2[] = {
+	{ .int_msk = BIT(0), .msg = "hfs_fifo_mem_ecc_1bit_err" },
+	{ .int_msk = BIT(1), .msg = "rslt_descr_fifo_mem_ecc_1bit_err" },
+	{ .int_msk = BIT(2), .msg = "tx_vlan_tag_mem_ecc_1bit_err" },
+	{ .int_msk = BIT(3), .msg = "FD_CN0_memory_ecc_1bit_err" },
+	{ .int_msk = BIT(4), .msg = "FD_CN1_memory_ecc_1bit_err" },
+	{ .int_msk = BIT(5), .msg = "GRO_AD_memory_ecc_1bit_err" },
+	{ /* sentinel */ }
+};
+
+static const struct hclge_hw_error hclge_ppp_mpf_int3[] = {
+	{ .int_msk = BIT(0), .msg = "hfs_fifo_mem_ecc_mbit_err" },
+	{ .int_msk = BIT(1), .msg = "rslt_descr_fifo_mem_ecc_mbit_err" },
+	{ .int_msk = BIT(2), .msg = "tx_vlan_tag_mem_ecc_mbit_err" },
+	{ .int_msk = BIT(3), .msg = "FD_CN0_memory_ecc_mbit_err" },
+	{ .int_msk = BIT(4), .msg = "FD_CN1_memory_ecc_mbit_err" },
+	{ .int_msk = BIT(5), .msg = "GRO_AD_memory_ecc_mbit_err" },
+	{ /* sentinel */ }
+};
+
 static void hclge_log_error(struct device *dev,
 			    const struct hclge_hw_error *err_list,
 			    u32 err_sts)
@@ -322,6 +426,81 @@ static int hclge_enable_igu_egu_error(struct hclge_dev *hdev, bool en)
 	return ret;
 }
 
+static int hclge_enable_ppp_error_interrupt(struct hclge_dev *hdev, u32 cmd,
+					    bool en)
+{
+	struct device *dev = &hdev->pdev->dev;
+	struct hclge_desc desc[2];
+	int ret;
+
+	/* enable/disable PPP error interrupts */
+	hclge_cmd_setup_basic_desc(&desc[0], cmd, false);
+	desc[0].flag |= cpu_to_le16(HCLGE_CMD_FLAG_NEXT);
+	hclge_cmd_setup_basic_desc(&desc[1], cmd, false);
+
+	if (cmd == HCLGE_PPP_CMD0_INT_CMD) {
+		if (en) {
+			desc[0].data[0] =
+				cpu_to_le32(HCLGE_PPP_MPF_ECC_ERR_INT0_EN);
+			desc[0].data[1] =
+				cpu_to_le32(HCLGE_PPP_MPF_ECC_ERR_INT1_EN);
+		} else {
+			desc[0].data[0] = 0;
+			desc[0].data[1] = 0;
+		}
+		desc[1].data[0] =
+			cpu_to_le32(HCLGE_PPP_MPF_ECC_ERR_INT0_EN_MASK);
+		desc[1].data[1] =
+			cpu_to_le32(HCLGE_PPP_MPF_ECC_ERR_INT1_EN_MASK);
+	} else if (cmd == HCLGE_PPP_CMD1_INT_CMD) {
+		if (en) {
+			desc[0].data[0] =
+				cpu_to_le32(HCLGE_PPP_MPF_ECC_ERR_INT2_EN);
+			desc[0].data[1] =
+				cpu_to_le32(HCLGE_PPP_MPF_ECC_ERR_INT3_EN);
+		} else {
+			desc[0].data[0] = 0;
+			desc[0].data[1] = 0;
+		}
+		desc[1].data[0] =
+				cpu_to_le32(HCLGE_PPP_MPF_ECC_ERR_INT2_EN_MASK);
+		desc[1].data[1] =
+				cpu_to_le32(HCLGE_PPP_MPF_ECC_ERR_INT3_EN_MASK);
+	}
+
+	ret = hclge_cmd_send(&hdev->hw, &desc[0], 2);
+	if (ret)
+		dev_err(dev,
+			"failed(%d) to enable/disable PPP error interrupts\n",
+			ret);
+
+	return ret;
+}
+
+static int hclge_enable_ppp_error(struct hclge_dev *hdev, bool en)
+{
+	struct device *dev = &hdev->pdev->dev;
+	int ret;
+
+	ret = hclge_enable_ppp_error_interrupt(hdev, HCLGE_PPP_CMD0_INT_CMD,
+					       en);
+	if (ret) {
+		dev_err(dev,
+			"failed(%d) to enable/disable PPP error intr 0,1\n",
+			ret);
+		return ret;
+	}
+
+	ret = hclge_enable_ppp_error_interrupt(hdev, HCLGE_PPP_CMD1_INT_CMD,
+					       en);
+	if (ret)
+		dev_err(dev,
+			"failed(%d) to enable/disable PPP error intr 2,3\n",
+			ret);
+
+	return ret;
+}
+
 static void hclge_process_common_error(struct hclge_dev *hdev,
 				       enum hclge_err_int_type type)
 {
@@ -472,6 +651,91 @@ static void hclge_process_igu_egu_error(struct hclge_dev *hdev,
 	hclge_process_ncsi_error(hdev, HCLGE_ERR_INT_RAS_NFE);
 }
 
+static int hclge_log_and_clear_ppp_error(struct hclge_dev *hdev, u32 cmd,
+					 enum hclge_err_int_type int_type)
+{
+	enum hnae3_reset_type reset_level = HNAE3_NONE_RESET;
+	struct device *dev = &hdev->pdev->dev;
+	const struct hclge_hw_error *hw_err_lst1, *hw_err_lst2, *hw_err_lst3;
+	struct hclge_desc desc[2];
+	u32 err_sts;
+	int ret;
+
+	/* read PPP INT sts */
+	ret = hclge_cmd_query_error(hdev, &desc[0], cmd,
+				    HCLGE_CMD_FLAG_NEXT, 5, int_type);
+	if (ret) {
+		dev_err(dev, "failed(=%d) to query PPP interrupt status\n",
+			ret);
+		return -EIO;
+	}
+
+	/* log error */
+	if (cmd == HCLGE_PPP_CMD0_INT_CMD) {
+		hw_err_lst1 = &hclge_ppp_mpf_int0[0];
+		hw_err_lst2 = &hclge_ppp_mpf_int1[0];
+		hw_err_lst3 = &hclge_ppp_pf_int[0];
+	} else if (cmd == HCLGE_PPP_CMD1_INT_CMD) {
+		hw_err_lst1 = &hclge_ppp_mpf_int2[0];
+		hw_err_lst2 = &hclge_ppp_mpf_int3[0];
+	} else {
+		dev_err(dev, "invalid command(=%d)\n", cmd);
+		return -EINVAL;
+	}
+
+	err_sts = le32_to_cpu(desc[0].data[2]);
+	if (err_sts) {
+		hclge_log_error(dev, hw_err_lst1, err_sts);
+		reset_level = HNAE3_FUNC_RESET;
+	}
+
+	err_sts = le32_to_cpu(desc[0].data[3]);
+	if (err_sts) {
+		hclge_log_error(dev, hw_err_lst2, err_sts);
+		reset_level = HNAE3_FUNC_RESET;
+	}
+
+	err_sts = (le32_to_cpu(desc[0].data[4]) >> 8) & 0x3;
+	if (err_sts) {
+		hclge_log_error(dev, hw_err_lst3, err_sts);
+		reset_level = HNAE3_FUNC_RESET;
+	}
+
+	/* clear PPP INT */
+	ret = hclge_cmd_clear_error(hdev, &desc[0], NULL, 0,
+				    HCLGE_CMD_FLAG_NEXT);
+	if (ret) {
+		dev_err(dev, "failed(=%d) to clear PPP interrupt status\n",
+			ret);
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static void hclge_process_ppp_error(struct hclge_dev *hdev,
+				    enum hclge_err_int_type int_type)
+{
+	struct device *dev = &hdev->pdev->dev;
+	int ret;
+
+	/* read PPP INT0,1 sts */
+	ret = hclge_log_and_clear_ppp_error(hdev, HCLGE_PPP_CMD0_INT_CMD,
+					    int_type);
+	if (ret < 0) {
+		dev_err(dev, "failed(=%d) to clear PPP interrupt 0,1 status\n",
+			ret);
+		return;
+	}
+
+	/* read err PPP INT2,3 sts */
+	ret = hclge_log_and_clear_ppp_error(hdev, HCLGE_PPP_CMD1_INT_CMD,
+					    int_type);
+	if (ret < 0)
+		dev_err(dev, "failed(=%d) to clear PPP interrupt 2,3 status\n",
+			ret);
+}
+
 static const struct hclge_hw_blk hw_blk[] = {
 	{ .msk = BIT(0), .name = "IGU_EGU",
 	  .enable_error = hclge_enable_igu_egu_error,
@@ -479,6 +743,9 @@ static const struct hclge_hw_blk hw_blk[] = {
 	{ .msk = BIT(5), .name = "COMMON",
 	  .enable_error = hclge_enable_common_error,
 	  .process_error = hclge_process_common_error, },
+	{ .msk = BIT(1), .name = "PPP",
+	  .enable_error = hclge_enable_ppp_error,
+	  .process_error = hclge_process_ppp_error, },
 	{ /* sentinel */ }
 };
 
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
index f46c8c2..c6d3739 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.h
@@ -27,6 +27,16 @@
 #define HCLGE_IGU_ERR_INT_EN_MASK	0x000F
 #define HCLGE_IGU_TNL_ERR_INT_EN    0x0002AABF
 #define HCLGE_IGU_TNL_ERR_INT_EN_MASK  0x003F
+#define HCLGE_PPP_MPF_ECC_ERR_INT0_EN	0xFFFFFFFF
+#define HCLGE_PPP_MPF_ECC_ERR_INT0_EN_MASK	0xFFFFFFFF
+#define HCLGE_PPP_MPF_ECC_ERR_INT1_EN	0xFFFFFFFF
+#define HCLGE_PPP_MPF_ECC_ERR_INT1_EN_MASK	0xFFFFFFFF
+#define HCLGE_PPP_PF_ERR_INT_EN	0x0003
+#define HCLGE_PPP_PF_ERR_INT_EN_MASK	0x0003
+#define HCLGE_PPP_MPF_ECC_ERR_INT2_EN	0x003F
+#define HCLGE_PPP_MPF_ECC_ERR_INT2_EN_MASK	0x003F
+#define HCLGE_PPP_MPF_ECC_ERR_INT3_EN	0x003F
+#define HCLGE_PPP_MPF_ECC_ERR_INT3_EN_MASK	0x003F
 #define HCLGE_NCSI_ERR_INT_EN	0x3
 #define HCLGE_NCSI_ERR_INT_TYPE	0x9
 
@@ -43,6 +53,7 @@
 #define HCLGE_TQP_IMP_ERR_CLR_MASK	0x0FFF0001
 #define HCLGE_IGU_COM_INT_MASK		0xF
 #define HCLGE_IGU_EGU_TNL_INT_MASK	0x3F
+#define HCLGE_PPP_PF_INT_MASK		0x100
 
 enum hclge_err_int_type {
 	HCLGE_ERR_INT_MSIX = 0,
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH ghak90 (was ghak32) V4 02/10] audit: add container id
From: Paul Moore @ 2018-10-19 19:40 UTC (permalink / raw)
  To: rgb
  Cc: containers, linux-api, linux-audit, linux-fsdevel, linux-kernel,
	netdev, netfilter-devel, ebiederm, luto, carlos, dhowells, viro,
	simo, Eric Paris, Serge Hallyn
In-Reply-To: <CAHC9VhTT7n4wDoaDGFXAK+wtcsMDDMRkQVCQ2jNYLKSFqFc+bA@mail.gmail.com>

Ooops, I hit send prematurely on this :/  My comments below should
stand, but for things like this I usually try to get through the
entire patchset before sending my comments as later patches can affect
my comments on the earlier patches.

On Fri, Oct 19, 2018 at 3:38 PM Paul Moore <paul@paul-moore.com> wrote:
> On Tue, Jul 31, 2018 at 4:11 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> >
> > Implement the proc fs write to set the audit container identifier of a
> > process, emitting an AUDIT_CONTAINER_OP record to document the event.
> >
> > This is a write from the container orchestrator task to a proc entry of
> > the form /proc/PID/audit_containerid where PID is the process ID of the
> > newly created task that is to become the first task in a container, or
> > an additional task added to a container.
> >
> > The write expects up to a u64 value (unset: 18446744073709551615).
> >
> > The writer must have capability CAP_AUDIT_CONTROL.
> >
> > This will produce a record such as this:
> >   type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root uid=root tty=ttyS0 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash exe=/usr/bin/bash res=yes
>
> You need to update the record type in the example above.
>
> > The "op" field indicates an initial set.  The "pid" to "ses" fields are
> > the orchestrator while the "opid" field is the object's PID, the process
> > being "contained".  Old and new audit container identifier values are
> > given in the "contid" fields, while res indicates its success.
>
> I understand Steve's concern around the "op" field, but I think it
> might be a bit premature to think we might not need to do some sort of
> audit container ID management in the future that would want to make
> use of the CONTAINER_OP message type.  I would like to see the "op"
> field preserved.
>
> > It is not permitted to unset the audit container identifier.
> > A child inherits its parent's audit container identifier.
> >
> > See: https://github.com/linux-audit/audit-kernel/issues/90
> > See: https://github.com/linux-audit/audit-userspace/issues/51
> > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> >
> > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> > Acked-by: Serge Hallyn <serge@hallyn.com>
> > Acked-by: Steve Grubb <sgrubb@redhat.com>
> > ---
> >  fs/proc/base.c             | 37 +++++++++++++++++++++++++
> >  include/linux/audit.h      | 24 ++++++++++++++++
> >  include/uapi/linux/audit.h |  2 ++
> >  kernel/auditsc.c           | 68 ++++++++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 131 insertions(+)
>
> ...
>
> > @@ -2112,6 +2114,72 @@ int audit_set_loginuid(kuid_t loginuid)
> >  }
> >
> >  /**
> > + * audit_set_contid - set current task's audit_context contid
> > + * @contid: contid value
> > + *
> > + * Returns 0 on success, -EPERM on permission failure.
> > + *
> > + * Called (set) from fs/proc/base.c::proc_contid_write().
> > + */
> > +int audit_set_contid(struct task_struct *task, u64 contid)
> > +{
> > +       u64 oldcontid;
> > +       int rc = 0;
> > +       struct audit_buffer *ab;
> > +       uid_t uid;
> > +       struct tty_struct *tty;
> > +       char comm[sizeof(current->comm)];
> > +
> > +       task_lock(task);
> > +       /* Can't set if audit disabled */
> > +       if (!task->audit) {
> > +               task_unlock(task);
> > +               return -ENOPROTOOPT;
> > +       }
> > +       oldcontid = audit_get_contid(task);
> > +       read_lock(&tasklist_lock);
>
> I assume lockdep was happy with nesting the tasklist_lock inside the task lock?
>
> > +       /* Don't allow the audit containerid to be unset */
> > +       if (!audit_contid_valid(contid))
> > +               rc = -EINVAL;
> > +       /* if we don't have caps, reject */
> > +       else if (!capable(CAP_AUDIT_CONTROL))
> > +               rc = -EPERM;
> > +       /* if task has children or is not single-threaded, deny */
> > +       else if (!list_empty(&task->children))
> > +               rc = -EBUSY;
> > +       else if (!(thread_group_leader(task) && thread_group_empty(task)))
> > +               rc = -EALREADY;
> > +       read_unlock(&tasklist_lock);
> > +       if (!rc)
> > +               task->audit->contid = contid;
> > +       task_unlock(task);
> > +
> > +       if (!audit_enabled)
> > +               return rc;
> > +
> > +       ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_OP);
> > +       if (!ab)
> > +               return rc;
> > +
> > +       uid = from_kuid(&init_user_ns, task_uid(current));
> > +       tty = audit_get_tty(current);
> > +       audit_log_format(ab, "op=set opid=%d old-contid=%llu contid=%llu pid=%d uid=%u auid=%u tty=%s ses=%u",
> > +                        task_tgid_nr(task), oldcontid, contid,
> > +                        task_tgid_nr(current), uid,
> > +                        from_kuid(&init_user_ns, audit_get_loginuid(current)),
> > +                        tty ? tty_name(tty) : "(none)",
> > +                        audit_get_sessionid(current));
> > +       audit_put_tty(tty);
> > +       audit_log_task_context(ab);
> > +       audit_log_format(ab, " comm=");
> > +       audit_log_untrustedstring(ab, get_task_comm(comm, current));
> > +       audit_log_d_path_exe(ab, current->mm);
> > +       audit_log_format(ab, " res=%d", !rc);
> > +       audit_log_end(ab);
> > +       return rc;
> > +}
>
> --
> paul moore
> www.paul-moore.com



-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* [PATCH net-next] rocker: Drop pointless static qualifier
From: YueHaibing @ 2018-10-19 12:02 UTC (permalink / raw)
  To: Jiri Pirko, davem; +Cc: YueHaibing, netdev, kernel-janitors

There is no need to have the 'struct rocker_desc_info *desc_info'
variable static since new value always be assigned before use it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/ethernet/rocker/rocker_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index 8721c05..beb0662 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -371,7 +371,7 @@ static void rocker_desc_cookie_ptr_set(const struct rocker_desc_info *desc_info,
 static struct rocker_desc_info *
 rocker_desc_head_get(const struct rocker_dma_ring_info *info)
 {
-	static struct rocker_desc_info *desc_info;
+	struct rocker_desc_info *desc_info;
 	u32 head = __pos_inc(info->head, info->size);
 
 	desc_info = &info->desc_info[info->head];
@@ -402,7 +402,7 @@ static void rocker_desc_head_set(const struct rocker *rocker,
 static struct rocker_desc_info *
 rocker_desc_tail_get(struct rocker_dma_ring_info *info)
 {
-	static struct rocker_desc_info *desc_info;
+	struct rocker_desc_info *desc_info;
 
 	if (info->tail == info->head)
 		return NULL; /* nothing to be done between head and tail */

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox