Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH bpf] bpf: fix wrong helper enablement in cgroup local storage
From: Roman Gushchin @ 2018-10-26 22:57 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: ast@kernel.org, netdev@vger.kernel.org
In-Reply-To: <20181026224902.12455-1-daniel@iogearbox.net>

On Sat, Oct 27, 2018 at 12:49:02AM +0200, Daniel Borkmann wrote:
> Commit cd3394317653 ("bpf: introduce the bpf_get_local_storage()
> helper function") enabled the bpf_get_local_storage() helper also
> for BPF program types where it does not make sense to use them.
> 
> They have been added both in sk_skb_func_proto() and sk_msg_func_proto()
> even though both program types are not invoked in combination with
> cgroups, and neither through BPF_PROG_RUN_ARRAY(). In the latter the
> bpf_cgroup_storage_set() is set shortly before BPF program invocation.
> 
> Later, the helper bpf_get_local_storage() retrieves this prior set
> up per-cpu pointer and hands the buffer to the BPF program. The map
> argument in there solely retrieves the enum bpf_cgroup_storage_type
> from a local storage map associated with the program and based on the
> type returns either the global or per-cpu storage. However, there
> is no specific association between the program's map and the actual
> content in bpf_cgroup_storage[].
> 
> Meaning, any BPF program that would have been properly run from the
> cgroup side through BPF_PROG_RUN_ARRAY() where bpf_cgroup_storage_set()
> was performed, and that is later unloaded such that prog / maps are
> teared down will cause a use after free if that pointer is retrieved
> from programs that are not run through BPF_PROG_RUN_ARRAY() but have
> the cgroup local storage helper enabled in their func proto.
> 
> Lets just remove it from the two sock_map program types to fix it.
> Auditing through the types where this helper is enabled, it appears
> that these are the only ones where it was mistakenly allowed.
> 
> Fixes: cd3394317653 ("bpf: introduce the bpf_get_local_storage() helper function")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Roman Gushchin <guro@fb.com>
> Acked-by: John Fastabend <john.fastabend@gmail.com>


Acked-by: Roman Gushchin <guro@fb.com>

Thanks, Daniel!

^ permalink raw reply

* Re: [PATCH net] ipv6/ndisc: Preserve IPv6 control buffer if protocol error handlers are called
From: David Miller @ 2018-10-26 22:58 UTC (permalink / raw)
  To: sbrivio; +Cc: yoshfuji, netdev
In-Reply-To: <7acb12cb7a35c7a5f9670fc5b1373610b4d5ed67.1540384081.git.sbrivio@redhat.com>

From: Stefano Brivio <sbrivio@redhat.com>
Date: Wed, 24 Oct 2018 14:37:21 +0200

> Commit a61bbcf28a8c ("[NET]: Store skb->timestamp as offset to a base
> timestamp") introduces a neighbour control buffer and zeroes it out in
> ndisc_rcv(), as ndisc_recv_ns() uses it.
> 
> Commit f2776ff04722 ("[IPV6]: Fix address/interface handling in UDP and
> DCCP, according to the scoping architecture.") introduces the usage of the
> IPv6 control buffer in protocol error handlers (e.g. inet6_iif() in
> present-day __udp6_lib_err()).
> 
> Now, with commit b94f1c0904da ("ipv6: Use icmpv6_notify() to propagate
> redirect, instead of rt6_redirect()."), we call protocol error handlers
> from ndisc_redirect_rcv(), after the control buffer is already stolen and
> some parts are already zeroed out. This implies that inet6_iif() on this
> path will always return zero.
> 
> This gives unexpected results on UDP socket lookup in __udp6_lib_err(), as
> we might actually need to match sockets for a given interface.
> 
> Instead of always claiming the control buffer in ndisc_rcv(), do that only
> when needed.
> 
> Fixes: b94f1c0904da ("ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().")
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>

Applied and queued up for -stable, thank you.

^ permalink raw reply

* Re: [PATCH net V2 1/1] net/smc: fix smc_buf_unuse to use the lgr pointer
From: David Miller @ 2018-10-26 23:00 UTC (permalink / raw)
  To: ubraun; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, raspl
In-Reply-To: <20181025112528.77307-1-ubraun@linux.ibm.com>

From: Ursula Braun <ubraun@linux.ibm.com>
Date: Thu, 25 Oct 2018 13:25:28 +0200

> From: Karsten Graul <kgraul@linux.ibm.com>
> 
> The pointer to the link group is unset in the smc connection structure
> right before the call to smc_buf_unuse. Provide the lgr pointer to
> smc_buf_unuse explicitly.
> And move the call to smc_lgr_schedule_free_work to the end of
> smc_conn_free.
> 
> Fixes: a6920d1d130c ("net/smc: handle unregistered buffers")
> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>

Applied and queued up for -stable, thanks!

^ permalink raw reply

* Re: [PATCH net] bridge: do not add port to router list when receives query with source 0.0.0.0
From: David Miller @ 2018-10-26 23:02 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev, nikolay, jiri, linus.luessing
In-Reply-To: <1540520923-17589-1-git-send-email-liuhangbin@gmail.com>

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Fri, 26 Oct 2018 10:28:43 +0800

> Based on RFC 4541, 2.1.1.  IGMP Forwarding Rules
> 
>   The switch supporting IGMP snooping must maintain a list of
>   multicast routers and the ports on which they are attached.  This
>   list can be constructed in any combination of the following ways:
> 
>   a) This list should be built by the snooping switch sending
>      Multicast Router Solicitation messages as described in IGMP
>      Multicast Router Discovery [MRDISC].  It may also snoop
>      Multicast Router Advertisement messages sent by and to other
>      nodes.
> 
>   b) The arrival port for IGMP Queries (sent by multicast routers)
>      where the source address is not 0.0.0.0.
> 
> We should not add the port to router list when receives query with source
> 0.0.0.0.
> 
> Reported-by: Ying Xu <yinxu@redhat.com>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH] net: allow traceroute with a specified interface in a vrf
From: David Miller @ 2018-10-26 23:03 UTC (permalink / raw)
  To: mmanning; +Cc: netdev
In-Reply-To: <20181026112435.12159-1-mmanning@vyatta.att-mail.com>

From: Mike Manning <mmanning@vyatta.att-mail.com>
Date: Fri, 26 Oct 2018 12:24:35 +0100

> Traceroute executed in a vrf succeeds if no device is given or if the
> vrf is given as the device, but fails if the interface is given as the
> device. This is for default UDP probes, it succeeds for TCP SYN or ICMP
> ECHO probes. As the skb bound dev is the interface and the sk dev is
> the vrf, sk lookup fails for ICMP_DEST_UNREACH and ICMP_TIME_EXCEEDED
> messages. The solution is for the secondary dev to be passed so that
> the interface is available for the device match to succeed, in the same
> way as is already done for non-error cases.
> 
> Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] net/neigh: fix NULL deref in pneigh_dump_table()
From: David Miller @ 2018-10-26 23:04 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, dsahern
In-Reply-To: <20181026163327.219797-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Fri, 26 Oct 2018 09:33:27 -0700

> pneigh can have NULL device pointer, so we need to make
> neigh_master_filtered() and neigh_ifindex_filtered() more robust.
 ...
> Fixes: 6f52f80e8530 ("net/neigh: Extend dump filter to proxy neighbor dumps")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: David Ahern <dsahern@gmail.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [PATCH bpf] bpf: fix wrong helper enablement in cgroup local storage
From: Alexei Starovoitov @ 2018-10-26 23:05 UTC (permalink / raw)
  To: Roman Gushchin; +Cc: Daniel Borkmann, ast@kernel.org, netdev@vger.kernel.org
In-Reply-To: <20181026225709.GA12467@tower.DHCP.thefacebook.com>

On Fri, Oct 26, 2018 at 10:57:18PM +0000, Roman Gushchin wrote:
> On Sat, Oct 27, 2018 at 12:49:02AM +0200, Daniel Borkmann wrote:
> > Commit cd3394317653 ("bpf: introduce the bpf_get_local_storage()
> > helper function") enabled the bpf_get_local_storage() helper also
> > for BPF program types where it does not make sense to use them.
> > 
> > They have been added both in sk_skb_func_proto() and sk_msg_func_proto()
> > even though both program types are not invoked in combination with
> > cgroups, and neither through BPF_PROG_RUN_ARRAY(). In the latter the
> > bpf_cgroup_storage_set() is set shortly before BPF program invocation.
> > 
> > Later, the helper bpf_get_local_storage() retrieves this prior set
> > up per-cpu pointer and hands the buffer to the BPF program. The map
> > argument in there solely retrieves the enum bpf_cgroup_storage_type
> > from a local storage map associated with the program and based on the
> > type returns either the global or per-cpu storage. However, there
> > is no specific association between the program's map and the actual
> > content in bpf_cgroup_storage[].
> > 
> > Meaning, any BPF program that would have been properly run from the
> > cgroup side through BPF_PROG_RUN_ARRAY() where bpf_cgroup_storage_set()
> > was performed, and that is later unloaded such that prog / maps are
> > teared down will cause a use after free if that pointer is retrieved
> > from programs that are not run through BPF_PROG_RUN_ARRAY() but have
> > the cgroup local storage helper enabled in their func proto.
> > 
> > Lets just remove it from the two sock_map program types to fix it.
> > Auditing through the types where this helper is enabled, it appears
> > that these are the only ones where it was mistakenly allowed.
> > 
> > Fixes: cd3394317653 ("bpf: introduce the bpf_get_local_storage() helper function")
> > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> > Cc: Roman Gushchin <guro@fb.com>
> > Acked-by: John Fastabend <john.fastabend@gmail.com>
> 
> 
> Acked-by: Roman Gushchin <guro@fb.com>

Applied, Thanks

^ permalink raw reply

* pull-request: bpf 2018-10-27
From: Daniel Borkmann @ 2018-10-27  0:07 UTC (permalink / raw)
  To: davem; +Cc: daniel, ast, netdev

Hi David,

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix toctou race in BTF header validation, from Martin and Wenwen.

2) Fix devmap interface comparison in notifier call which was
   neglecting netns, from Taehee.

3) Several fixes in various places, for example, correcting direct
   packet access and helper function availability, from Daniel.

4) Fix BPF kselftest config fragment to include af_xdp and sockmap,
   from Naresh.

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Thanks a lot!

----------------------------------------------------------------

The following changes since commit 42d0f71c9b5fd48861d61cfc05c9e001f847c9d5:

  octeontx2-af: Use GFP_ATOMIC under spin lock (2018-10-25 11:36:29 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git 

for you to fetch changes up to d8fd9e106fbc291167ebb675ad69234597d0fd98:

  bpf: fix wrong helper enablement in cgroup local storage (2018-10-26 16:03:30 -0700)

----------------------------------------------------------------
Alexei Starovoitov (1):
      Merge branch 'pkt-access-fixes'

Daniel Borkmann (9):
      bpf: fix test suite to enable all unpriv program types
      bpf: disallow direct packet access for unpriv in cg_skb
      bpf: fix direct packet access for flow dissector progs
      bpf: fix cg_skb types to hint access type in may_access_direct_pkt_data
      bpf: fix direct packet write into pop/peek helpers
      bpf: fix leaking uninitialized memory on pop/peek helpers
      bpf: make direct packet write unclone more robust
      bpf: add bpf_jit_limit knob to restrict unpriv allocations
      bpf: fix wrong helper enablement in cgroup local storage

Martin Lau (1):
      bpf, btf: fix a missing check bug in btf_parse

Naresh Kamboju (1):
      selftests/bpf: add config fragments BPF_STREAM_PARSER and XDP_SOCKETS

Taehee Yoo (1):
      bpf: devmap: fix wrong interface selection in notifier_call

 Documentation/sysctl/net.txt                |  8 ++++
 include/linux/filter.h                      |  1 +
 kernel/bpf/btf.c                            | 58 +++++++++++++----------------
 kernel/bpf/core.c                           | 49 ++++++++++++++++++++++--
 kernel/bpf/devmap.c                         |  3 +-
 kernel/bpf/helpers.c                        |  2 -
 kernel/bpf/queue_stack_maps.c               |  2 +
 kernel/bpf/verifier.c                       | 13 +++++--
 net/core/filter.c                           | 21 +++++++++--
 net/core/sysctl_net_core.c                  | 10 ++++-
 tools/testing/selftests/bpf/config          |  2 +
 tools/testing/selftests/bpf/test_verifier.c | 15 +++++++-
 12 files changed, 133 insertions(+), 51 deletions(-)

^ permalink raw reply

* [PATCH net] net: bridge: remove ipv6 zero address check in mcast queries
From: Nikolay Aleksandrov @ 2018-10-27  9:07 UTC (permalink / raw)
  To: netdev; +Cc: Nikolay Aleksandrov, roopa, bridge, yinxu, liuhangbin, davem
In-Reply-To: <90c5f2fe-1743-6b17-2e44-eba58cdbbb35@cumulusnetworks.com>

Recently a check was added which prevents marking of routers with zero
source address, but for IPv6 that cannot happen as the relevant RFCs
actually forbid such packets:
RFC 2710 (MLDv1):
"To be valid, the Query message MUST
 come from a link-local IPv6 Source Address, be at least 24 octets
 long, and have a correct MLD checksum."

Same goes for RFC 3810.

And also it can be seen as a requirement in ipv6_mc_check_mld_query()
which is used by the bridge to validate the message before processing
it. Thus any queries with :: source address won't be processed anyway.
So just remove the check for zero IPv6 source address from the query
processing function.

Fixes: 5a2de63fd1a5 ("bridge: do not add port to router list when receives query with source 0.0.0.0")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
 net/bridge/br_multicast.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 41cdafbf2ebe..6bac0d6b7b94 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1428,8 +1428,7 @@ static void br_multicast_query_received(struct net_bridge *br,
 	 * is 0.0.0.0 should not be added to router port list.
 	 */
 	if ((saddr->proto == htons(ETH_P_IP) && saddr->u.ip4) ||
-	    (saddr->proto == htons(ETH_P_IPV6) &&
-	     !ipv6_addr_any(&saddr->u.ip6)))
+	    saddr->proto == htons(ETH_P_IPV6))
 		br_multicast_mark_router(br, port);
 }
 
-- 
2.17.2

^ permalink raw reply related

* [net:master 17/19] net//bridge/br_multicast.c:1432:32: error: 'union <anonymous>' has no member named 'ip6'; did you mean 'ip4'?
From: kbuild test robot @ 2018-10-27  0:50 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: kbuild-all, netdev

[-- Attachment #1: Type: text/plain, Size: 2074 bytes --]

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master
head:   aab456dfa404f3a16d6f1780e62a6a8533c4d008
commit: 5a2de63fd1a59c30c02526d427bc014b98adf508 [17/19] bridge: do not add port to router list when receives query with source 0.0.0.0
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 5a2de63fd1a59c30c02526d427bc014b98adf508
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   net//bridge/br_multicast.c: In function 'br_multicast_query_received':
>> net//bridge/br_multicast.c:1432:32: error: 'union <anonymous>' has no member named 'ip6'; did you mean 'ip4'?
          !ipv6_addr_any(&saddr->u.ip6)))
                                   ^~~
                                   ip4

vim +1432 net//bridge/br_multicast.c

  1414	
  1415	static void br_multicast_query_received(struct net_bridge *br,
  1416						struct net_bridge_port *port,
  1417						struct bridge_mcast_other_query *query,
  1418						struct br_ip *saddr,
  1419						unsigned long max_delay)
  1420	{
  1421		if (!br_multicast_select_querier(br, port, saddr))
  1422			return;
  1423	
  1424		br_multicast_update_query_timer(br, query, max_delay);
  1425	
  1426		/* Based on RFC4541, section 2.1.1 IGMP Forwarding Rules,
  1427		 * the arrival port for IGMP Queries where the source address
  1428		 * is 0.0.0.0 should not be added to router port list.
  1429		 */
  1430		if ((saddr->proto == htons(ETH_P_IP) && saddr->u.ip4) ||
  1431		    (saddr->proto == htons(ETH_P_IPV6) &&
> 1432		     !ipv6_addr_any(&saddr->u.ip6)))
  1433			br_multicast_mark_router(br, port);
  1434	}
  1435	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 23983 bytes --]

^ permalink raw reply

* checksumming on non-local forward path
From: Jason A. Donenfeld @ 2018-10-27  0:58 UTC (permalink / raw)
  To: Netdev

Hey netdev,

In a protocol like wireguard, if a packet arrives on the other end of
the tunnel, and the crypto checks out, then we're absolutely certain
that the packet hasn't been modified in transit. In this case, there's
no useful information that validating the inner checksums of the
various headers would give us. Every byte of the packet has arrived
intact, and we're certain of it.

Therefore, you might think in a situation like this, before calling
netif_receive_skb or the like, we can just set ip_summed to
CHECKSUM_UNNECESSARY, csum_level to ~0, and feel glad that we're not
wasting cycles unnecessarily validating the checksum.

But what if the receiving computer forwards the packet on across a
real physical network? In that case, it's probably important that the
kernel that originally produced the packet in the first place ensures
it has a valid checksum before encrypting it and sending it through
the wireguard tunnel. That's fine, but it does mean that in the case
of the packet being locally received on the other end, and not
forwarded, the checksumming by the original sender was not needed and
was therefore wasteful.

What would you think of a flag on the receiving end like,
"CHECKSUM_INVALID_BUT_UNNECESSARY"? It would be treated as
CHECKSUM_UNNECESSARY in the case that the the packet is locally
received. But if the packet is going to be forwarded instead, then
skb_checksum_help is called on it before forwarding onward.

AFAICT, wireguard isn't the only thing that could benefit from this:
virtio is another case where it's not always necessary for the sender
to call skb_checksum_help, when the receiver could just do it
conditionally based on whether it's being forwarded.

Thoughts?

Regards,
Jason

^ permalink raw reply

* [Bug?] ss command display unix domain port overflow
From: Suoben @ 2018-10-27  1:34 UTC (permalink / raw)
  To: Stephen Hemminger, netdev@vger.kernel.org
  Cc: Wencongyang (UVP), sangxu, guijianfeng, Wanghui (John),
	chenminhua (A)

[-- Attachment #1: Type: text/plain, Size: 2618 bytes --]


> Dear all，
> 
> When l use ss command to display unix domain socket, the ss shows as below, the local address port and peer address port may overflow which looks like equal to the socket fd inode .
> 
> /**********
> # ss –apx
> Netid  State      Recv-Q Send-Q     Local Address:Port                                                                      Peer Address:Port
> u_str  ESTAB      0      0         /var/run/libvirt/libvirt-sock -1351067523                                                    * -1351073721           users:(("libvirtd",pid=2295,fd=39))
> u_str  ESTAB      0      0         /var/run/libvirt/libvirt-sock -1351121518                                                    * -1351223988           users:(("libvirtd",pid=2295,fd=21))
> u_str  ESTAB      0      0         /var/run/libvirt/libvirt-sock -1351245849                                                    * -1351128250           users:(("libvirtd",pid=2295,fd=40))
> u_str  ESTAB      0      0         /var/run/libvirt/libvirt-sock -1351042552                                                    * -1351050742           users:(("libvirtd",pid=2295,fd=51))
> 
> #ll /proc/2295/fd/51
> lrwx------ 1 root root 64 Oct 10 10:34 51 -> socket:[2943924744]
> 
> **********/
> 
> When I read the iproute source code., I find that the ‘lport’ and 
> ‘rport’ was defined as type ‘int’ in ‘struct sockstat’, which I think 
> should be ‘unsigned int’, also the ‘fd’ in ‘struct user_ent’ should 
> be.  And,
> 
> when printing the port, In function ‘static const char 
> *resolve_service(int port)’ uses “%u” to print tcp port, however, in 
> function ‘static void unix_stats_print(struct sockstat *s, struct 
> filter *f)’ uses “%d” to
> 
> print port when showing unix info, what is the problem I think.
> 
> Can you tell me if there are some reasons by using the type ‘int’ to define the port? Or is it a bug? Can I use “%u” to print the port when showing the unix socket info?
> 
> /**********
> static void unix_stats_print(struct sockstat *s, struct filter *f) {
>          char port_name[30] = {};
> 
>          sock_state_print(s);
> 
>          sock_addr_print(s->name ?: "*", " ",
>                             int_to_str(s->lport, port_name), NULL);
>          sock_addr_print(s->peer_name ?: "*", " ",
>                             int_to_str(s->rport, port_name), NULL);
> 
>          proc_ctx_print(s);
> }
> **********/
> 
> 
> Best regards，
> SangXu

Please  report problems to netdev@vger.kernel.org.
Yes, it looks like a bug.

[-- Attachment #2: 0001-ss-Use-u-to-print-port-when-show-unix-socket.patch --]
[-- Type: application/octet-stream, Size: 1539 bytes --]

From 777eba52309e229853a4dd08566e3df4980de740 Mon Sep 17 00:00:00 2001
From: sangxu <sangxu@huawei.com>
Date: Fri, 26 Oct 2018 16:34:43 +0800
Subject: [PATCH] ss:Use "%u" to print port when show unix socket

When using ss command to display unix domain socket, the ss shows
as below, the local address port and peer address port may
overflow which looks like equal to the socket fd inode.

Netid  State  Recv-Q Send-Q Local Address:Port Peer Address:Port
u_str  ESTAB  0      0  /var/run/libvirt/libvirt-sock -1351042552
 * -1351050742           users:(("libvirtd",pid=2295,fd=51))

lrwx------ 1 root root 64 Oct 10 10:34 51 -> socket:[2943924744]

As we can see, it uses "%u" to print port when printing tcp socket
in function "resolve_service". So, it seems that it should use "%u"
to pirnt port when showing unix socket info as well.

---
 misc/ss.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index f99b687..abee4e4 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -3540,10 +3540,11 @@ static void unix_stats_print(struct sockstat *s, struct filter *f)
 
 	sock_state_print(s);
 
-	sock_addr_print(s->name ?: "*", " ",
-			int_to_str(s->lport, port_name), NULL);
-	sock_addr_print(s->peer_name ?: "*", " ",
-			int_to_str(s->rport, port_name), NULL);
+	sprintf(port_name, "%u", s->lport);
+	sock_addr_print(s->name ?: "*", " ", port_name, NULL);
+
+	sprintf(port_name, "%u", s->rport);
+	sock_addr_print(s->peer_name ?: "*", " ", port_name, NULL);
 
 	proc_ctx_print(s);
 }
-- 
1.8.3.1


^ permalink raw reply related

* [net:master 17/19] net/bridge/br_multicast.c:1432:31: error: 'union <anonymous>' has no member named 'ip6'
From: kbuild test robot @ 2018-10-27  1:49 UTC (permalink / raw)
  To: Hangbin Liu; +Cc: kbuild-all, netdev

[-- Attachment #1: Type: text/plain, Size: 1841 bytes --]

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master
head:   aab456dfa404f3a16d6f1780e62a6a8533c4d008
commit: 5a2de63fd1a59c30c02526d427bc014b98adf508 [17/19] bridge: do not add port to router list when receives query with source 0.0.0.0
config: i386-randconfig-x0-10270816 (attached as .config)
compiler: gcc-5 (Debian 5.5.0-3) 5.4.1 20171010
reproduce:
        git checkout 5a2de63fd1a59c30c02526d427bc014b98adf508
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   net/bridge/br_multicast.c: In function 'br_multicast_query_received':
>> net/bridge/br_multicast.c:1432:31: error: 'union <anonymous>' has no member named 'ip6'
          !ipv6_addr_any(&saddr->u.ip6)))
                                  ^

vim +1432 net/bridge/br_multicast.c

  1414	
  1415	static void br_multicast_query_received(struct net_bridge *br,
  1416						struct net_bridge_port *port,
  1417						struct bridge_mcast_other_query *query,
  1418						struct br_ip *saddr,
  1419						unsigned long max_delay)
  1420	{
  1421		if (!br_multicast_select_querier(br, port, saddr))
  1422			return;
  1423	
  1424		br_multicast_update_query_timer(br, query, max_delay);
  1425	
  1426		/* Based on RFC4541, section 2.1.1 IGMP Forwarding Rules,
  1427		 * the arrival port for IGMP Queries where the source address
  1428		 * is 0.0.0.0 should not be added to router port list.
  1429		 */
  1430		if ((saddr->proto == htons(ETH_P_IP) && saddr->u.ip4) ||
  1431		    (saddr->proto == htons(ETH_P_IPV6) &&
> 1432		     !ipv6_addr_any(&saddr->u.ip6)))
  1433			br_multicast_mark_router(br, port);
  1434	}
  1435	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34058 bytes --]

^ permalink raw reply

* Re: [PATCH net v2] net: udp: fix handling of CHECKSUM_COMPLETE packets
From: Maciej Żenczykowski @ 2018-10-27  1:53 UTC (permalink / raw)
  To: David Miller
  Cc: stranche, Eric Dumazet, Linux NetDev, samanthakumar, Eric Dumazet
In-Reply-To: <20181024.142105.1871980098995024381.davem@davemloft.net>

Possibly worth mentioning that without this fix you can also end up
with valid udp packets being dropped (ie. the reverse of the commit
description which talks about receiving invalid ones).

The (approximate?) requirement is:
(a) nic generates CHECKSUM_COMPLETE, but gets the actual checksum wrong
(b) udp packet is longer then 76 bytes L2 (CHECKSUM_BREAK), so we
don't calculate checksum inline prior to pulling udp header
(c) udp socket has no bpf filter, so we delay checksum until we do
copy to userspace from recvmsg()

76 bytes is 76 - 14 ether - 20 ipv4 - 8 udp header = 34 udp ipv4 payload bytes
On Wed, Oct 24, 2018 at 2:22 PM David Miller <davem@davemloft.net> wrote:
>
> From: Sean Tranchetti <stranche@codeaurora.org>
> Date: Tue, 23 Oct 2018 16:04:31 -0600
>
> > Current handling of CHECKSUM_COMPLETE packets by the UDP stack is
> > incorrect for any packet that has an incorrect checksum value.
> >
> > udp4/6_csum_init() will both make a call to
> > __skb_checksum_validate_complete() to initialize/validate the csum
> > field when receiving a CHECKSUM_COMPLETE packet. When this packet
> > fails validation, skb->csum will be overwritten with the pseudoheader
> > checksum so the packet can be fully validated by software, but the
> > skb->ip_summed value will be left as CHECKSUM_COMPLETE so that way
> > the stack can later warn the user about their hardware spewing bad
> > checksums. Unfortunately, leaving the SKB in this state can cause
> > problems later on in the checksum calculation.
> >
> > Since the the packet is still marked as CHECKSUM_COMPLETE,
> > udp_csum_pull_header() will SUBTRACT the checksum of the UDP header
> > from skb->csum instead of adding it, leaving us with a garbage value
> > in that field. Once we try to copy the packet to userspace in the
> > udp4/6_recvmsg(), we'll make a call to skb_copy_and_csum_datagram_msg()
> > to checksum the packet data and add it in the garbage skb->csum value
> > to perform our final validation check.
> >
> > Since the value we're validating is not the proper checksum, it's possible
> > that the folded value could come out to 0, causing us not to drop the
> > packet. Instead, we believe that the packet was checksummed incorrectly
> > by hardware since skb->ip_summed is still CHECKSUM_COMPLETE, and we attempt
> > to warn the user with netdev_rx_csum_fault(skb->dev);
> >
> > Unfortunately, since this is the UDP path, skb->dev has been overwritten
> > by skb->dev_scratch and is no longer a valid pointer, so we end up
> > reading invalid memory.
>
> Just want to say that it has always been complicated in this area due to
> the fact that we do this deferral of final checksum validation to when we
> copy the packet into userspace.  For example, poll() needs to do special
> things, etc.
>
> Because we have to make it seem as if we dropped the packet with a bad
> checksum from the point of view of what the user sees during recvmsg()
> and poll() calls.  But until we do that checksum validation, we don't
> know exactly what the situation is.
>
> > This patch addresses this problem in two ways:
> >       1) Do not use the dev pointer when calling netdev_rx_csum_fault()
> >          from skb_copy_and_csum_datagram_msg(). Since this gets called
> >          from the UDP path where skb->dev has been overwritten, we have
> >          no way of knowing if the pointer is still valid. Also for the
> >          sake of consistency with the other uses of
> >          netdev_rx_csum_fault(), don't attempt to call it if the
> >          packet was checksummed by software.
> >
> >       2) Add better CHECKSUM_COMPLETE handling to udp4/6_csum_init().
> >          If we receive a packet that's CHECKSUM_COMPLETE that fails
> >          verification (i.e. skb->csum_valid == 0), check who performed
> >          the calculation. It's possible that the checksum was done in
> >          software by the network stack earlier (such as Netfilter's
> >          CONNTRACK module), and if that says the checksum is bad,
> >          we can drop the packet immediately instead of waiting until
> >          we try and copy it to userspace. Otherwise, we need to
> >          mark the SKB as CHECKSUM_NONE, since the skb->csum field
> >          no longer contains the full packet checksum after the
> >          call to __skb_checksum_validate_complete().
> >
> > Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
>
> Can't count on my hands how many regressions are a result of that change and
> it's subtle side effects. :-/
>
> > Fixes: c84d949057ca ("udp: copy skb->truesize in the first cache line")
> > Cc: Sam Kumar <samanthakumar@google.com>
> > Cc: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
>
> Applied and queued up for -stable, thank you.

^ permalink raw reply

* [Patch net 03/11] net: hns3: bugfix for reporting unknown vector0 interrupt repeatly problem
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan
In-Reply-To: <1540608118-27449-1-git-send-email-tanhuazhong@huawei.com>

The current driver supports handling two vector0 interrupts, reset and
mailbox. When the hardware reports an interrupt of another type of
interrupt source, if the driver does not process the interrupt and
enables the interrupt, the hardware will repeatedly report the unknown
interrupt.

Therefore, the driver enables the vector0 interrupt after clearing the
known type of interrupt source. Other conditions are not enabled.

Fixes: cd8c5c269b1d ("net: hns3: Fix for hclge_reset running repeatly problem")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 5234b53..2a63147 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2236,7 +2236,7 @@ static irqreturn_t hclge_misc_irq_handle(int irq, void *data)
 	}
 
 	/* clear the source of interrupt if it is not cause by reset */
-	if (event_cause != HCLGE_VECTOR0_EVENT_RST) {
+	if (event_cause == HCLGE_VECTOR0_EVENT_MBX) {
 		hclge_clear_event_cause(hdev, event_cause, clearval);
 		hclge_enable_vector(&hdev->misc_vector, true);
 	}
-- 
2.7.4

^ permalink raw reply related

* [Patch net 00/11] Bugfix for the HNS3 driver
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan

This patch series include bugfix for the HNS3 ethernet
controller driver.

Huazhong Tan (11):
  net: hns3: add error handler for hns3_nic_init_vector_data()
  net: hns3: add error handler for
    hns3_get_ring_config/hns3_queue_to_ring
  net: hns3: bugfix for reporting unknown vector0 interrupt repeatly
    problem
  net: hns3: bugfix for the initialization of command queue's spin lock
  net: hns3: remove unnecessary queue reset in the
    hns3_uninit_all_ring()
  net: hns3: bugfix for is_valid_csq_clean_head()
  net: hns3: bugfix for hclge_mdio_write and hclge_mdio_read
  net: hns3: fix incorrect return value/type of some functions
  net: hns3: bugfix for handling mailbox while the command queue
    reinitialized
  net: hns3: bugfix for rtnl_lock's range in the hclge_reset()
  net: hns3: bugfix for rtnl_lock's range in the hclgevf_reset()

 drivers/net/ethernet/hisilicon/hns3/hnae3.h        |   6 +-
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c    | 106 +++++++++++++++------
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.h    |   2 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c |  26 +++--
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c    |  40 ++++----
 .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h    |   2 +-
 .../net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c |   6 ++
 .../ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c    |   4 +-
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c  |  12 ++-
 9 files changed, 133 insertions(+), 71 deletions(-)

-- 
2.7.4

^ permalink raw reply

* [Patch net 04/11] net: hns3: bugfix for the initialization of command queue's spin lock
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan
In-Reply-To: <1540608118-27449-1-git-send-email-tanhuazhong@huawei.com>

The spin lock of the command queue only needs to be initialized once
when the driver initializes the command queue. It is not necessary to
initialize the spin lock when resetting. At the same time, the
modification of the queue member should be performed after acquiring
the lock.

Fixes: 3efb960f056d ("net: hns3: Refactor the initialization of command queue")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
index ac13cb2..68026a5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
@@ -304,6 +304,10 @@ int hclge_cmd_queue_init(struct hclge_dev *hdev)
 {
 	int ret;
 
+	/* Setup the lock for command queue */
+	spin_lock_init(&hdev->hw.cmq.csq.lock);
+	spin_lock_init(&hdev->hw.cmq.crq.lock);
+
 	/* Setup the queue entries for use cmd queue */
 	hdev->hw.cmq.csq.desc_num = HCLGE_NIC_CMQ_DESC_NUM;
 	hdev->hw.cmq.crq.desc_num = HCLGE_NIC_CMQ_DESC_NUM;
@@ -337,18 +341,20 @@ int hclge_cmd_init(struct hclge_dev *hdev)
 	u32 version;
 	int ret;
 
+	spin_lock_bh(&hdev->hw.cmq.csq.lock);
+	spin_lock_bh(&hdev->hw.cmq.crq.lock);
+
 	hdev->hw.cmq.csq.next_to_clean = 0;
 	hdev->hw.cmq.csq.next_to_use = 0;
 	hdev->hw.cmq.crq.next_to_clean = 0;
 	hdev->hw.cmq.crq.next_to_use = 0;
 
-	/* Setup the lock for command queue */
-	spin_lock_init(&hdev->hw.cmq.csq.lock);
-	spin_lock_init(&hdev->hw.cmq.crq.lock);
-
 	hclge_cmd_init_regs(&hdev->hw);
 	clear_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state);
 
+	spin_unlock_bh(&hdev->hw.cmq.crq.lock);
+	spin_unlock_bh(&hdev->hw.cmq.csq.lock);
+
 	ret = hclge_cmd_query_firmware_version(&hdev->hw, &version);
 	if (ret) {
 		dev_err(&hdev->pdev->dev,
-- 
2.7.4

^ permalink raw reply related

* [Patch net 07/11] net: hns3: bugfix for hclge_mdio_write and hclge_mdio_read
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan
In-Reply-To: <1540608118-27449-1-git-send-email-tanhuazhong@huawei.com>

When there is a PHY, the driver needs to complete some operations through
MDIO during reset reinitialization, so HCLGE_STATE_CMD_DISABLE is more
suitable than HCLGE_STATE_RST_HANDLING to prevent the MDIO operation from
being sent during the hardware reset.

Fixes: b50ae26c57cb ("net: hns3: never send command queue message to IMP when reset)
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
index 24b1f2a..0301863 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
@@ -52,7 +52,7 @@ static int hclge_mdio_write(struct mii_bus *bus, int phyid, int regnum,
 	struct hclge_desc desc;
 	int ret;
 
-	if (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
+	if (test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state))
 		return 0;
 
 	hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_MDIO_CONFIG, false);
@@ -90,7 +90,7 @@ static int hclge_mdio_read(struct mii_bus *bus, int phyid, int regnum)
 	struct hclge_desc desc;
 	int ret;
 
-	if (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
+	if (test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state))
 		return 0;
 
 	hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_MDIO_CONFIG, true);
-- 
2.7.4

^ permalink raw reply related

* [Patch net 05/11] net: hns3: remove unnecessary queue reset in the hns3_uninit_all_ring()
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan
In-Reply-To: <1540608118-27449-1-git-send-email-tanhuazhong@huawei.com>

It is not necessary to reset the queue in the hns3_uninit_all_ring(),
since the queue is stopped in the down operation, and will be resetted
in the up operaton. And the judgment of the HCLGE_STATE_RST_HANDLING
flag in the hclge_reset_tqp() is not correct, because we need to reset
tqp during pf reset, otherwise it may cause queue not be resetted to
working state problem.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c         | 3 ---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 3 ---
 2 files changed, 6 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 6f0fd62..a80ecfb 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -3240,9 +3240,6 @@ int hns3_uninit_all_ring(struct hns3_nic_priv *priv)
 	int i;
 
 	for (i = 0; i < h->kinfo.num_tqps; i++) {
-		if (h->ae_algo->ops->reset_queue)
-			h->ae_algo->ops->reset_queue(h, i);
-
 		hns3_fini_ring(priv->ring_data[i].ring);
 		hns3_fini_ring(priv->ring_data[i + h->kinfo.num_tqps].ring);
 	}
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 2a63147..4dd0506 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -6116,9 +6116,6 @@ void hclge_reset_tqp(struct hnae3_handle *handle, u16 queue_id)
 	u16 queue_gid;
 	int ret;
 
-	if (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
-		return;
-
 	queue_gid = hclge_covert_handle_qid_global(handle, queue_id);
 
 	ret = hclge_tqp_enable(hdev, queue_id, 0, false);
-- 
2.7.4

^ permalink raw reply related

* [Patch net 06/11] net: hns3: bugfix for is_valid_csq_clean_head()
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan
In-Reply-To: <1540608118-27449-1-git-send-email-tanhuazhong@huawei.com>

The HEAD pointer of the hardware command queue maybe equal to the command
queue's next_to_use the driver, so that does not belong to the invalid
HEAD pointer, since the hardware may not process the command in time,
causing the HEAD pointer to be too late to update. The variables' name
in this function is unreadable, so give them a more readable one.

Fixes: 3ff504908f95 ("net: hns3: fix a dead loop in hclge_cmd_csq_clean")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
index 68026a5..690f62e 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.c
@@ -24,15 +24,15 @@ static int hclge_ring_space(struct hclge_cmq_ring *ring)
 	return ring->desc_num - used - 1;
 }
 
-static int is_valid_csq_clean_head(struct hclge_cmq_ring *ring, int h)
+static int is_valid_csq_clean_head(struct hclge_cmq_ring *ring, int head)
 {
-	int u = ring->next_to_use;
-	int c = ring->next_to_clean;
+	int ntu = ring->next_to_use;
+	int ntc = ring->next_to_clean;
 
-	if (unlikely(h >= ring->desc_num))
-		return 0;
+	if (ntu > ntc)
+		return head >= ntc && head <= ntu;
 
-	return u > c ? (h > c && h <= u) : (h > c || h <= u);
+	return head >= ntc || head <= ntu;
 }
 
 static int hclge_alloc_cmd_desc(struct hclge_cmq_ring *ring)
-- 
2.7.4

^ permalink raw reply related

* [Patch net 01/11] net: hns3: add error handler for hns3_nic_init_vector_data()
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan
In-Reply-To: <1540608118-27449-1-git-send-email-tanhuazhong@huawei.com>

When hns3_nic_init_vector_data() failed for mapping ring to vector,
it should cancel the netif_napi_add() that have been successfully done
and then exit.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 32f3aca8..d9066c5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -2821,7 +2821,7 @@ static int hns3_nic_init_vector_data(struct hns3_nic_priv *priv)
 	struct hnae3_handle *h = priv->ae_handle;
 	struct hns3_enet_tqp_vector *tqp_vector;
 	int ret = 0;
-	u16 i;
+	int i, j;
 
 	hns3_nic_set_cpumask(priv);
 
@@ -2868,13 +2868,19 @@ static int hns3_nic_init_vector_data(struct hns3_nic_priv *priv)
 		hns3_free_vector_ring_chain(tqp_vector, &vector_ring_chain);
 
 		if (ret)
-			return ret;
+			goto map_ring_fail;
 
 		netif_napi_add(priv->netdev, &tqp_vector->napi,
 			       hns3_nic_common_poll, NAPI_POLL_WEIGHT);
 	}
 
 	return 0;
+
+map_ring_fail:
+	for (j = i - 1; j >= 0; j--)
+		netif_napi_del(&priv->tqp_vector[j].napi);
+
+	return ret;
 }
 
 static int hns3_nic_alloc_vector_data(struct hns3_nic_priv *priv)
-- 
2.7.4

^ permalink raw reply related

* [Patch net 02/11] net: hns3: add error handler for hns3_get_ring_config/hns3_queue_to_ring
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan
In-Reply-To: <1540608118-27449-1-git-send-email-tanhuazhong@huawei.com>

When hns3_get_ring_config()/hns3_queue_to_ring() failed during resetting,
the allocated memory has not been freed before hns3_get_ring_config() and
hns3_queue_to_ring() return. So this patch fixes the buffer not freeing
problem during resetting.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index d9066c5..6f0fd62 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -3037,8 +3037,10 @@ static int hns3_queue_to_ring(struct hnae3_queue *tqp,
 		return ret;
 
 	ret = hns3_ring_get_cfg(tqp, priv, HNAE3_RING_TYPE_RX);
-	if (ret)
+	if (ret) {
+		devm_kfree(priv->dev, priv->ring_data[tqp->tqp_index].ring);
 		return ret;
+	}
 
 	return 0;
 }
@@ -3047,7 +3049,7 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
 {
 	struct hnae3_handle *h = priv->ae_handle;
 	struct pci_dev *pdev = h->pdev;
-	int i, ret;
+	int i, j, ret;
 
 	priv->ring_data =  devm_kzalloc(&pdev->dev,
 					array3_size(h->kinfo.num_tqps,
@@ -3065,6 +3067,12 @@ static int hns3_get_ring_config(struct hns3_nic_priv *priv)
 
 	return 0;
 err:
+	for (j = i - 1; j >= 0; j--) {
+		devm_kfree(priv->dev, priv->ring_data[j].ring);
+		devm_kfree(priv->dev,
+			   priv->ring_data[j + h->kinfo.num_tqps].ring);
+	}
+
 	devm_kfree(&pdev->dev, priv->ring_data);
 	return ret;
 }
-- 
2.7.4

^ permalink raw reply related

* [Patch net 10/11] net: hns3: bugfix for rtnl_lock's range in the hclge_reset()
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan
In-Reply-To: <1540608118-27449-1-git-send-email-tanhuazhong@huawei.com>

Since hclge_reset_wait() is used to wait for the hardware to complete
the reset, it is not necessary to hold the rtnl_lock during
hclge_reset_wait(). So this patch release the lock for the duration
of hclge_reset_wait().

Fixes: 6d4fab39533f ("net: hns3: Reset net device with rtnl_lock")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index a0c2834..394f797 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -2470,14 +2470,17 @@ static void hclge_reset(struct hclge_dev *hdev)
 	handle = &hdev->vport[0].nic;
 	rtnl_lock();
 	hclge_notify_client(hdev, HNAE3_DOWN_CLIENT);
+	rtnl_unlock();
 
 	if (!hclge_reset_wait(hdev)) {
+		rtnl_lock();
 		hclge_notify_client(hdev, HNAE3_UNINIT_CLIENT);
 		hclge_reset_ae_dev(hdev->ae_dev);
 		hclge_notify_client(hdev, HNAE3_INIT_CLIENT);
 
 		hclge_clear_reset_cause(hdev);
 	} else {
+		rtnl_lock();
 		/* schedule again to check pending resets later */
 		set_bit(hdev->reset_type, &hdev->reset_pending);
 		hclge_reset_task_schedule(hdev);
-- 
2.7.4

^ permalink raw reply related

* [Patch net 11/11] net: hns3: bugfix for rtnl_lock's range in the hclgevf_reset()
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan
In-Reply-To: <1540608118-27449-1-git-send-email-tanhuazhong@huawei.com>

Since hclgevf_reset_wait() is used to wait for the hardware to complete
the reset, it is not necessary to hold the rtnl_lock during
hclgevf_reset_wait(). So this patch release the lock for the duration
of hclgevf_reset_wait().

Fixes: 6988eb2a9b77 ("net: hns3: Add support to reset the enet/ring mgmt layer")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 552b708..fa6251e 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -1170,6 +1170,8 @@ static int hclgevf_reset(struct hclgevf_dev *hdev)
 	/* bring down the nic to stop any ongoing TX/RX */
 	hclgevf_notify_client(hdev, HNAE3_DOWN_CLIENT);
 
+	rtnl_unlock();
+
 	/* check if VF could successfully fetch the hardware reset completion
 	 * status from the hardware
 	 */
@@ -1187,6 +1189,8 @@ static int hclgevf_reset(struct hclgevf_dev *hdev)
 		return ret;
 	}
 
+	rtnl_lock();
+
 	/* now, re-initialize the nic client and ae device*/
 	ret = hclgevf_reset_stack(hdev);
 	if (ret)
-- 
2.7.4

^ permalink raw reply related

* [Patch net 09/11] net: hns3: bugfix for handling mailbox while the command queue reinitialized
From: Huazhong Tan @ 2018-10-27  2:41 UTC (permalink / raw)
  To: davem; +Cc: netdev, linuxarm, salil.mehta, yisen.zhuang, lipeng321,
	Huazhong Tan
In-Reply-To: <1540608118-27449-1-git-send-email-tanhuazhong@huawei.com>

In a multi-core machine, the mailbox service and reset service
will be executed at the same time. The reset server will re-initialize
the commond queue, before that, the mailbox handler can only get some
invalid messages.

The HCLGE_STATE_CMD_DISABLE flag means that the command queue is not
available and needs to be reinitialized. Therefore, when the mailbox
hanlder recognizes this flag, it should not process the command.

Fixes: dde1a86e93ca ("net: hns3: Add mailbox support to PF driver")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
index 04462a3..6ac2fab 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c
@@ -400,6 +400,12 @@ void hclge_mbx_handler(struct hclge_dev *hdev)
 
 	/* handle all the mailbox requests in the queue */
 	while (!hclge_cmd_crq_empty(&hdev->hw)) {
+		if (test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state)) {
+			dev_warn(&hdev->pdev->dev,
+				 "command queue need re-initialize\n");
+			return;
+		}
+
 		desc = &crq->desc[crq->next_to_use];
 		req = (struct hclge_mbx_vf_to_pf_cmd *)desc->data;
 
-- 
2.7.4

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox