Netdev List
* Re: [PATCH bpf 2/5] tcp, ulp: fix leftover icsk_ulp_ops preventing sock from reattach
From: Song Liu @ 2018-08-16 21:26 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, John Fastabend, Networking
In-Reply-To: <20180816194910.9040-3-daniel@iogearbox.net>

On Thu, Aug 16, 2018 at 12:49 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> I found that in BPF sockmap programs, once we either delete a socket
> from the map or update a map slot so that the old socket is purged
> from the map, these sockets can never get reattached into a map
> even though their related psock has been dropped entirely at that
> point.
>
> Reason is that tcp_cleanup_ulp() leaves the old icsk->icsk_ulp_ops
> intact, so that on the next tcp_set_ulp_id() the kernel returns an
> -EEXIST thinking there is still some active ULP attached.
>
> BPF sockmap is the only one that has this issue as the other user,
> kTLS, only calls tcp_cleanup_ulp() from tcp_v4_destroy_sock() whereas
> sockmap semantics allow dropping the socket from the map with all
> related psock state being cleaned up.
>
> Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: John Fastabend <john.fastabend@gmail.com>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  net/ipv4/tcp_ulp.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/ipv4/tcp_ulp.c b/net/ipv4/tcp_ulp.c
> index 7dd44b6..a5995bb 100644
> --- a/net/ipv4/tcp_ulp.c
> +++ b/net/ipv4/tcp_ulp.c
> @@ -129,6 +129,8 @@ void tcp_cleanup_ulp(struct sock *sk)
>         if (icsk->icsk_ulp_ops->release)
>                 icsk->icsk_ulp_ops->release(sk);
>         module_put(icsk->icsk_ulp_ops->owner);
> +
> +       icsk->icsk_ulp_ops = NULL;
>  }
>
>  /* Change upper layer protocol for socket */
> --
> 2.9.5
>

^ permalink raw reply

* Re: [PATCH bpf 1/5] tcp, ulp: add alias for all ulp modules
From: Song Liu @ 2018-08-16 21:25 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, John Fastabend, Networking
In-Reply-To: <20180816194910.9040-2-daniel@iogearbox.net>

On Thu, Aug 16, 2018 at 12:49 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> Let's not turn the TCP ULP lookup into an arbitrary module loader as
> we only intend to load ULP modules through this mechanism, not other
> unrelated kernel modules:
>
>   [root@bar]# cat foo.c
>   #include <sys/types.h>
>   #include <sys/socket.h>
>   #include <linux/tcp.h>
>   #include <linux/in.h>
>
>   int main(void)
>   {
>       int sock = socket(PF_INET, SOCK_STREAM, 0);
>       setsockopt(sock, IPPROTO_TCP, TCP_ULP, "sctp", sizeof("sctp"));
>       return 0;
>   }
>
>   [root@bar]# gcc foo.c -O2 -Wall
>   [root@bar]# lsmod | grep sctp
>   [root@bar]# ./a.out
>   [root@bar]# lsmod | grep sctp
>   sctp                 1077248  4
>   libcrc32c              16384  3 nf_conntrack,nf_nat,sctp
>   [root@bar]#
>
> Fix it by adding module alias to TCP ULP modules, so probing module
> via request_module() will be limited to tcp-ulp-[name]. The existing
> modules like kTLS will load fine given tcp-ulp-tls alias, but others
> will fail to load:
>
>   [root@bar]# lsmod | grep sctp
>   [root@bar]# ./a.out
>   [root@bar]# lsmod | grep sctp
>   [root@bar]#
>
> Sockmap is not affected by this since it's either built-in or not.
>
> Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: John Fastabend <john.fastabend@gmail.com>

Acked-by: Song Liu <songliubraving@fb.com>
> ---
>  include/net/tcp.h  | 4 ++++
>  net/ipv4/tcp_ulp.c | 2 +-
>  net/tls/tls_main.c | 1 +
>  3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index d196901..770917d 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -2065,6 +2065,10 @@ int tcp_set_ulp_id(struct sock *sk, const int ulp);
>  void tcp_get_available_ulp(char *buf, size_t len);
>  void tcp_cleanup_ulp(struct sock *sk);
>
> +#define MODULE_ALIAS_TCP_ULP(name)                             \
> +       __MODULE_INFO(alias, alias_userspace, name);            \
> +       __MODULE_INFO(alias, alias_tcp_ulp, "tcp-ulp-" name)
> +
>  /* Call BPF_SOCK_OPS program that returns an int. If the return value
>   * is < 0, then the BPF op failed (for example if the loaded BPF
>   * program does not support the chosen operation or there is no BPF
> diff --git a/net/ipv4/tcp_ulp.c b/net/ipv4/tcp_ulp.c
> index 622caa4..7dd44b6 100644
> --- a/net/ipv4/tcp_ulp.c
> +++ b/net/ipv4/tcp_ulp.c
> @@ -51,7 +51,7 @@ static const struct tcp_ulp_ops *__tcp_ulp_find_autoload(const char *name)
>  #ifdef CONFIG_MODULES
>         if (!ulp && capable(CAP_NET_ADMIN)) {
>                 rcu_read_unlock();
> -               request_module("%s", name);
> +               request_module("tcp-ulp-%s", name);
>                 rcu_read_lock();
>                 ulp = tcp_ulp_find(name);
>         }
> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index b09867c..93c0c22 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
> @@ -45,6 +45,7 @@
>  MODULE_AUTHOR("Mellanox Technologies");
>  MODULE_DESCRIPTION("Transport Layer Security Support");
>  MODULE_LICENSE("Dual BSD/GPL");
> +MODULE_ALIAS_TCP_ULP("tls");
>
>  enum {
>         TLSV4,
> --
> 2.9.5
>

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH next-queue 0/8] ixgbe/ixgbevf: IPsec offload support for VFs
From: Alexander Duyck @ 2018-08-16 21:15 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: intel-wired-lan, Jeff Kirsher, Steffen Klassert, Netdev
In-Reply-To: <5ff4527d-7557-10f4-f41e-3e618f0e5863@oracle.com>

On Tue, Aug 14, 2018 at 10:10 AM Shannon Nelson
<shannon.nelson@oracle.com> wrote:
>
> On 8/14/2018 8:30 AM, Alexander Duyck wrote:
> > On Mon, Aug 13, 2018 at 11:43 AM Shannon Nelson
> > <shannon.nelson@oracle.com> wrote:
> >>
> >> This set of patches implements IPsec hardware offload for VF devices in
> >> Intel's 10Gbe x540 family of Ethernet devices.
>
> [...]
>
> >
> > So the one question I would have about this patch set is what happens
> > if you are setting up an IPsec connection between the PF and one of the
> > VFs on the same port/function? Do the IPsec offloads get translated
> > across the Tx loopback or do they end up causing issues? Specifically
> > I would be interested in seeing the results of a test either between
> > two VFs, or the PF and one of the VFs on the same port.
> >
> > - Alex
> >
>
> There is definitely something funky in the internal switch connection,
> as messages going from PF to VF with an offloaded encryption don't seem
> to get received by the VF, at least when in a VEB setup.  If I only set
> up offloads on the Rx on both PF and VF, and don't offload the Tx, then
> things work.
>
> I don't have a setup to test this, but I suspect that in a VEPA
> configuration, with packets going out to the switch and turned around
> back in, the Tx encryption offload would happen as expected.
>
> sln

We should probably look at adding at least one patch to the set then
that disables IPsec Tx offload if SR-IOV is enabled with VEB so that
we don't end up breaking connections should a VF be migrated from a
remote system to a local one that it is connected to.

- Alex

^ permalink raw reply

* [PATCH] Revert "net/smc: Replace ib_query_gid with rdma_get_gid_attr"
From: Jason Gunthorpe @ 2018-08-16 20:31 UTC (permalink / raw)
  To: Parav Pandit, Leon Romanovsky, Ursula Braun; +Cc: linux-rdma, netdev

This reverts commit ddb457c6993babbcdd41fca638b870d2a2fc3941.

The include rdma/ib_cache.h is kept, and we have to add a memset
to the compat wrapper to avoid compiler warnings in gcc-7.

This revert is done to avoid extensive merge conflicts with SMC
changes in netdev during the 4.19 merge window.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 include/rdma/ib_cache.h |  1 +
 net/smc/smc_core.c      | 19 ++++++++++---------
 net/smc/smc_ib.c        | 24 ++++++++++--------------
 3 files changed, 21 insertions(+), 23 deletions(-)

As discussed before, the above patch to SMC in rdma.git causes too
many merge conflicts, so I am reverting it prior to sending the pull
request for RDMA and instead relying on the ib_query_gid() compat
wrapper that has been in linux-next for some time.

Parav, please respin this patch against this branch:

https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git/log/?h=wip/jgg-for-next

Thanks,
Jason

diff --git a/include/rdma/ib_cache.h b/include/rdma/ib_cache.h
index a4ce441f36f0ad..3e11e7cc60b745 100644
--- a/include/rdma/ib_cache.h
+++ b/include/rdma/ib_cache.h
@@ -143,6 +143,7 @@ static inline __deprecated int ib_query_gid(struct ib_device *device,
 {
 	const struct ib_gid_attr *attr;
 
+	memset(attr_out, 0, sizeof(*attr_out));
 	attr = rdma_get_gid_attr(device, port_num, index);
 	if (IS_ERR(attr))
 		return PTR_ERR(attr);
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index d99a75f75e42be..15bad268f37d8b 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -451,7 +451,8 @@ static int smc_vlan_by_tcpsk(struct socket *clcsock, unsigned short *vlan_id)
 static int smc_link_determine_gid(struct smc_link_group *lgr)
 {
 	struct smc_link *lnk = &lgr->lnk[SMC_SINGLE_LINK];
-	const struct ib_gid_attr *gattr;
+	struct ib_gid_attr gattr;
+	union ib_gid gid;
 	int i;
 
 	if (!lgr->vlan_id) {
@@ -461,18 +462,18 @@ static int smc_link_determine_gid(struct smc_link_group *lgr)
 
 	for (i = 0; i < lnk->smcibdev->pattr[lnk->ibport - 1].gid_tbl_len;
 	     i++) {
-		gattr = rdma_get_gid_attr(lnk->smcibdev->ibdev, lnk->ibport, i);
-		if (IS_ERR(gattr))
+		if (ib_query_gid(lnk->smcibdev->ibdev, lnk->ibport, i, &gid,
+				 &gattr))
 			continue;
-		if (gattr->ndev) {
-			if (is_vlan_dev(gattr->ndev) &&
-			    vlan_dev_vlan_id(gattr->ndev) == lgr->vlan_id) {
-				lnk->gid = gattr->gid;
-				rdma_put_gid_attr(gattr);
+		if (gattr.ndev) {
+			if (is_vlan_dev(gattr.ndev) &&
+			    vlan_dev_vlan_id(gattr.ndev) == lgr->vlan_id) {
+				lnk->gid = gid;
+				dev_put(gattr.ndev);
 				return 0;
 			}
+			dev_put(gattr.ndev);
 		}
-		rdma_put_gid_attr(gattr);
 	}
 	return -ENODEV;
 }
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 74f29f814ec1f9..117b05f1a49475 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -373,21 +373,17 @@ void smc_ib_buf_unmap_sg(struct smc_ib_device *smcibdev,
 
 static int smc_ib_fill_gid_and_mac(struct smc_ib_device *smcibdev, u8 ibport)
 {
-	const struct ib_gid_attr *gattr;
-	int rc = 0;
+	struct ib_gid_attr gattr;
+	int rc;
 
-	gattr = rdma_get_gid_attr(smcibdev->ibdev, ibport, 0);
-	if (IS_ERR(gattr))
-		return PTR_ERR(gattr);
-	if (!gattr->ndev) {
-		rc = -ENODEV;
-		goto done;
-	}
-	smcibdev->gid[ibport - 1] = gattr->gid;
-	memcpy(smcibdev->mac[ibport - 1], gattr->ndev->dev_addr, ETH_ALEN);
-done:
-	rdma_put_gid_attr(gattr);
-	return rc;
+	rc = ib_query_gid(smcibdev->ibdev, ibport, 0,
+			  &smcibdev->gid[ibport - 1], &gattr);
+	if (rc || !gattr.ndev)
+		return -ENODEV;
+
+	memcpy(smcibdev->mac[ibport - 1], gattr.ndev->dev_addr, ETH_ALEN);
+	dev_put(gattr.ndev);
+	return 0;
 }
 
 /* Create an identifier unique for this instance of SMC-R.
-- 
2.18.0

^ permalink raw reply related

* [PATCH bpf 5/5] bpf, sockmap: fix sock_map_ctx_update_elem race with exist/noexist
From: Daniel Borkmann @ 2018-08-16 19:49 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: john.fastabend, netdev, Daniel Borkmann
In-Reply-To: <20180816194910.9040-1-daniel@iogearbox.net>

The current code in sock_map_ctx_update_elem() allows for BPF_EXIST
and BPF_NOEXIST map update flags. While on array-like maps this approach
is rather uncommon, e.g. bpf_fd_array_map_update_elem() and others
enforce map update flags to be BPF_ANY such that xchg() can be used
directly, the current implementation in sock map does not guarantee
that such operation with BPF_EXIST / BPF_NOEXIST is atomic.

The initial test does a READ_ONCE(stab->sock_map[i]) to fetch the
socket from the slot which is then tested for NULL / non-NULL. However
later after __sock_map_ctx_update_elem(), the actual update is done
through osock = xchg(&stab->sock_map[i], sock). Problem is that in
the meantime a different CPU could have updated / deleted a socket
on that specific slot and thus flag constraints won't hold anymore.

I've been thinking whether best would be to just break UAPI and do
an enforcement of BPF_ANY to check if someone actually complains,
however trouble is that already in BPF kselftest we use BPF_NOEXIST
for the map update, and therefore it might have been copied into
applications already. The fix to keep the current behavior intact
would be to add a map lock similar to the sock hash bucket lock only
for covering the whole map.

Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/sockmap.c | 106 +++++++++++++++++++++++++++------------------------
 1 file changed, 57 insertions(+), 49 deletions(-)

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 921cb6b..98e621a 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -58,6 +58,7 @@ struct bpf_stab {
 	struct bpf_map map;
 	struct sock **sock_map;
 	struct bpf_sock_progs progs;
+	raw_spinlock_t lock;
 };
 
 struct bucket {
@@ -89,9 +90,9 @@ enum smap_psock_state {
 
 struct smap_psock_map_entry {
 	struct list_head list;
+	struct bpf_map *map;
 	struct sock **entry;
 	struct htab_elem __rcu *hash_link;
-	struct bpf_htab __rcu *htab;
 };
 
 struct smap_psock {
@@ -343,13 +344,18 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
 	e = psock_map_pop(sk, psock);
 	while (e) {
 		if (e->entry) {
-			osk = cmpxchg(e->entry, sk, NULL);
+			struct bpf_stab *stab = container_of(e->map, struct bpf_stab, map);
+
+			raw_spin_lock_bh(&stab->lock);
+			osk = *e->entry;
 			if (osk == sk) {
+				*e->entry = NULL;
 				smap_release_sock(psock, sk);
 			}
+			raw_spin_unlock_bh(&stab->lock);
 		} else {
 			struct htab_elem *link = rcu_dereference(e->hash_link);
-			struct bpf_htab *htab = rcu_dereference(e->htab);
+			struct bpf_htab *htab = container_of(e->map, struct bpf_htab, map);
 			struct hlist_head *head;
 			struct htab_elem *l;
 			struct bucket *b;
@@ -1642,6 +1648,7 @@ static struct bpf_map *sock_map_alloc(union bpf_attr *attr)
 		return ERR_PTR(-ENOMEM);
 
 	bpf_map_init_from_attr(&stab->map, attr);
+	raw_spin_lock_init(&stab->lock);
 
 	/* make sure page count doesn't overflow */
 	cost = (u64) stab->map.max_entries * sizeof(struct sock *);
@@ -1716,14 +1723,15 @@ static void sock_map_free(struct bpf_map *map)
 	 * and a grace period expire to ensure psock is really safe to remove.
 	 */
 	rcu_read_lock();
+	raw_spin_lock_bh(&stab->lock);
 	for (i = 0; i < stab->map.max_entries; i++) {
 		struct smap_psock *psock;
 		struct sock *sock;
 
-		sock = xchg(&stab->sock_map[i], NULL);
+		sock = stab->sock_map[i];
 		if (!sock)
 			continue;
-
+		stab->sock_map[i] = NULL;
 		psock = smap_psock_sk(sock);
 		/* This check handles a racing sock event that can get the
 		 * sk_callback_lock before this case but after xchg happens
@@ -1735,6 +1743,7 @@ static void sock_map_free(struct bpf_map *map)
 			smap_release_sock(psock, sock);
 		}
 	}
+	raw_spin_unlock_bh(&stab->lock);
 	rcu_read_unlock();
 
 	sock_map_remove_complete(stab);
@@ -1778,14 +1787,16 @@ static int sock_map_delete_elem(struct bpf_map *map, void *key)
 	if (k >= map->max_entries)
 		return -EINVAL;
 
-	sock = xchg(&stab->sock_map[k], NULL);
+	raw_spin_lock_bh(&stab->lock);
+	sock = stab->sock_map[k];
+	stab->sock_map[k] = NULL;
+	raw_spin_unlock_bh(&stab->lock);
 	if (!sock)
 		return -EINVAL;
 
 	psock = smap_psock_sk(sock);
 	if (!psock)
-		goto out;
-
+		return 0;
 	if (psock->bpf_parse) {
 		write_lock_bh(&sock->sk_callback_lock);
 		smap_stop_sock(psock, sock);
@@ -1793,7 +1804,6 @@ static int sock_map_delete_elem(struct bpf_map *map, void *key)
 	}
 	smap_list_map_remove(psock, &stab->sock_map[k]);
 	smap_release_sock(psock, sock);
-out:
 	return 0;
 }
 
@@ -1829,11 +1839,9 @@ static int sock_map_delete_elem(struct bpf_map *map, void *key)
 static int __sock_map_ctx_update_elem(struct bpf_map *map,
 				      struct bpf_sock_progs *progs,
 				      struct sock *sock,
-				      struct sock **map_link,
 				      void *key)
 {
 	struct bpf_prog *verdict, *parse, *tx_msg;
-	struct smap_psock_map_entry *e = NULL;
 	struct smap_psock *psock;
 	bool new = false;
 	int err = 0;
@@ -1906,14 +1914,6 @@ static int __sock_map_ctx_update_elem(struct bpf_map *map,
 		new = true;
 	}
 
-	if (map_link) {
-		e = kzalloc(sizeof(*e), GFP_ATOMIC | __GFP_NOWARN);
-		if (!e) {
-			err = -ENOMEM;
-			goto out_free;
-		}
-	}
-
 	/* 3. At this point we have a reference to a valid psock that is
 	 * running. Attach any BPF programs needed.
 	 */
@@ -1935,17 +1935,6 @@ static int __sock_map_ctx_update_elem(struct bpf_map *map,
 		write_unlock_bh(&sock->sk_callback_lock);
 	}
 
-	/* 4. Place psock in sockmap for use and stop any programs on
-	 * the old sock assuming its not the same sock we are replacing
-	 * it with. Because we can only have a single set of programs if
-	 * old_sock has a strp we can stop it.
-	 */
-	if (map_link) {
-		e->entry = map_link;
-		spin_lock_bh(&psock->maps_lock);
-		list_add_tail(&e->list, &psock->maps);
-		spin_unlock_bh(&psock->maps_lock);
-	}
 	return err;
 out_free:
 	smap_release_sock(psock, sock);
@@ -1956,7 +1945,6 @@ static int __sock_map_ctx_update_elem(struct bpf_map *map,
 	}
 	if (tx_msg)
 		bpf_prog_put(tx_msg);
-	kfree(e);
 	return err;
 }
 
@@ -1966,36 +1954,57 @@ static int sock_map_ctx_update_elem(struct bpf_sock_ops_kern *skops,
 {
 	struct bpf_stab *stab = container_of(map, struct bpf_stab, map);
 	struct bpf_sock_progs *progs = &stab->progs;
-	struct sock *osock, *sock;
+	struct sock *osock, *sock = skops->sk;
+	struct smap_psock_map_entry *e;
+	struct smap_psock *psock;
 	u32 i = *(u32 *)key;
 	int err;
 
 	if (unlikely(flags > BPF_EXIST))
 		return -EINVAL;
-
 	if (unlikely(i >= stab->map.max_entries))
 		return -E2BIG;
 
-	sock = READ_ONCE(stab->sock_map[i]);
-	if (flags == BPF_EXIST && !sock)
-		return -ENOENT;
-	else if (flags == BPF_NOEXIST && sock)
-		return -EEXIST;
+	e = kzalloc(sizeof(*e), GFP_ATOMIC | __GFP_NOWARN);
+	if (!e)
+		return -ENOMEM;
 
-	sock = skops->sk;
-	err = __sock_map_ctx_update_elem(map, progs, sock, &stab->sock_map[i],
-					 key);
+	err = __sock_map_ctx_update_elem(map, progs, sock, key);
 	if (err)
 		goto out;
 
-	osock = xchg(&stab->sock_map[i], sock);
-	if (osock) {
-		struct smap_psock *opsock = smap_psock_sk(osock);
+	/* psock guaranteed to be present. */
+	psock = smap_psock_sk(sock);
+	raw_spin_lock_bh(&stab->lock);
+	osock = stab->sock_map[i];
+	if (osock && flags == BPF_NOEXIST) {
+		err = -EEXIST;
+		goto out_unlock;
+	}
+	if (!osock && flags == BPF_EXIST) {
+		err = -ENOENT;
+		goto out_unlock;
+	}
+
+	e->entry = &stab->sock_map[i];
+	e->map = map;
+	spin_lock_bh(&psock->maps_lock);
+	list_add_tail(&e->list, &psock->maps);
+	spin_unlock_bh(&psock->maps_lock);
 
-		smap_list_map_remove(opsock, &stab->sock_map[i]);
-		smap_release_sock(opsock, osock);
+	stab->sock_map[i] = sock;
+	if (osock) {
+		psock = smap_psock_sk(osock);
+		smap_list_map_remove(psock, &stab->sock_map[i]);
+		smap_release_sock(psock, osock);
 	}
+	raw_spin_unlock_bh(&stab->lock);
+	return 0;
+out_unlock:
+	smap_release_sock(psock, sock);
+	raw_spin_unlock_bh(&stab->lock);
 out:
+	kfree(e);
 	return err;
 }
 
@@ -2358,7 +2367,7 @@ static int sock_hash_ctx_update_elem(struct bpf_sock_ops_kern *skops,
 	b = __select_bucket(htab, hash);
 	head = &b->head;
 
-	err = __sock_map_ctx_update_elem(map, progs, sock, NULL, key);
+	err = __sock_map_ctx_update_elem(map, progs, sock, key);
 	if (err)
 		goto err;
 
@@ -2384,8 +2393,7 @@ static int sock_hash_ctx_update_elem(struct bpf_sock_ops_kern *skops,
 	}
 
 	rcu_assign_pointer(e->hash_link, l_new);
-	rcu_assign_pointer(e->htab,
-			   container_of(map, struct bpf_htab, map));
+	e->map = map;
 	spin_lock_bh(&psock->maps_lock);
 	list_add_tail(&e->list, &psock->maps);
 	spin_unlock_bh(&psock->maps_lock);
-- 
2.9.5

^ permalink raw reply related
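
The check-then-update race described in patch 5/5 above can be illustrated with a small user-space sketch. It is not the kernel code: `struct demo_map`, `demo_update()` and the `DEMO_*` flags are hypothetical names, and a pthread mutex stands in for the new `stab->lock`. The point it tries to show is only that the flag test and the slot write happen inside one critical section.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical flags mirroring BPF_ANY / BPF_NOEXIST / BPF_EXIST. */
enum { DEMO_ANY = 0, DEMO_NOEXIST = 1, DEMO_EXIST = 2 };

#define DEMO_SLOTS 4

/* Hypothetical stand-in for the sock array plus its per-map lock. */
struct demo_map {
	pthread_mutex_t lock;
	void *slot[DEMO_SLOTS];
};

static void demo_map_init(struct demo_map *m)
{
	size_t i;

	pthread_mutex_init(&m->lock, NULL);
	for (i = 0; i < DEMO_SLOTS; i++)
		m->slot[i] = NULL;
}

/* Test the flag constraint and write the slot under the same lock, so no
 * other thread can change the slot between the test and the update.
 */
static int demo_update(struct demo_map *m, unsigned int i, void *val,
		       unsigned int flags)
{
	int err = 0;

	if (flags > DEMO_EXIST)
		return -EINVAL;
	if (i >= DEMO_SLOTS)
		return -E2BIG;

	pthread_mutex_lock(&m->lock);
	if (m->slot[i] && flags == DEMO_NOEXIST)
		err = -EEXIST;
	else if (!m->slot[i] && flags == DEMO_EXIST)
		err = -ENOENT;
	else
		m->slot[i] = val;
	pthread_mutex_unlock(&m->lock);
	return err;
}
```

With the test outside the lock (as in the pre-patch READ_ONCE() followed by xchg()), two callers passing DEMO_NOEXIST could both see an empty slot and both succeed.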

* [PATCH bpf 4/5] bpf, sockmap: fix map elem deletion race with smap_stop_sock
From: Daniel Borkmann @ 2018-08-16 19:49 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: john.fastabend, netdev, Daniel Borkmann
In-Reply-To: <20180816194910.9040-1-daniel@iogearbox.net>

The smap_start_sock() and smap_stop_sock() are each protected under
the sock->sk_callback_lock from their call-sites except in the case
of sock_map_delete_elem() where we drop the old socket from the map
slot. This is racy because the same sock could be part of multiple
sock maps, so we run smap_stop_sock() in parallel, and given at that
point psock->strp_enabled might be true on both CPUs, we might for
example wrongly restore the sk->sk_data_ready / sk->sk_write_space.
Therefore, hold the sock->sk_callback_lock as well on delete. Looks
like 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add
multi-map support") had this right, but later on e9db4ef6bf4c ("bpf:
sockhash fix omitted bucket lock in sock_close") removed it again
from delete leaving this smap_stop_sock() instance unprotected.

Fixes: e9db4ef6bf4c ("bpf: sockhash fix omitted bucket lock in sock_close")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/sockmap.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 94a324b..921cb6b 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -1786,8 +1786,11 @@ static int sock_map_delete_elem(struct bpf_map *map, void *key)
 	if (!psock)
 		goto out;
 
-	if (psock->bpf_parse)
+	if (psock->bpf_parse) {
+		write_lock_bh(&sock->sk_callback_lock);
 		smap_stop_sock(psock, sock);
+		write_unlock_bh(&sock->sk_callback_lock);
+	}
 	smap_list_map_remove(psock, &stab->sock_map[k]);
 	smap_release_sock(psock, sock);
 out:
-- 
2.9.5

^ permalink raw reply related
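
As a rough user-space illustration of why patch 4/5 above takes the callback lock on delete: the sketch below restores a saved callback only while holding a lock, so a second, concurrent stop cannot observe a half-updated state. All names (`struct demo_sock`, `demo_stop_sock()`, `demo_delete_elem()`) are hypothetical, and a pthread mutex stands in for the write side of `sk_callback_lock`.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*demo_cb)(void);

/* Hypothetical callbacks: the socket's original one and the parser's. */
static void demo_orig_ready(void) { }
static void demo_strp_ready(void) { }

/* Hypothetical stand-in for the socket state touched by start/stop. */
struct demo_sock {
	pthread_mutex_t callback_lock;
	demo_cb data_ready;
	demo_cb saved_data_ready;
	bool strp_enabled;
};

/* Install the parser callback and remember the original one. */
static void demo_start_sock(struct demo_sock *sk)
{
	pthread_mutex_lock(&sk->callback_lock);
	if (!sk->strp_enabled) {
		sk->saved_data_ready = sk->data_ready;
		sk->data_ready = demo_strp_ready;
		sk->strp_enabled = true;
	}
	pthread_mutex_unlock(&sk->callback_lock);
}

/* Caller must hold callback_lock; restores the original callback once. */
static void demo_stop_sock(struct demo_sock *sk)
{
	if (!sk->strp_enabled)
		return;
	sk->data_ready = sk->saved_data_ready;
	sk->strp_enabled = false;
}

/* Delete path: take the lock around the stop, as the patch does. */
static void demo_delete_elem(struct demo_sock *sk)
{
	pthread_mutex_lock(&sk->callback_lock);
	demo_stop_sock(sk);
	pthread_mutex_unlock(&sk->callback_lock);
}
```

Calling `demo_delete_elem()` twice, for example from two maps holding the same socket, leaves the original callback in place because the second call sees `strp_enabled == false` under the lock.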

* [PATCH bpf 1/5] tcp, ulp: add alias for all ulp modules
From: Daniel Borkmann @ 2018-08-16 19:49 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: john.fastabend, netdev, Daniel Borkmann
In-Reply-To: <20180816194910.9040-1-daniel@iogearbox.net>

Let's not turn the TCP ULP lookup into an arbitrary module loader as
we only intend to load ULP modules through this mechanism, not other
unrelated kernel modules:

  [root@bar]# cat foo.c
  #include <sys/types.h>
  #include <sys/socket.h>
  #include <linux/tcp.h>
  #include <linux/in.h>

  int main(void)
  {
      int sock = socket(PF_INET, SOCK_STREAM, 0);
      setsockopt(sock, IPPROTO_TCP, TCP_ULP, "sctp", sizeof("sctp"));
      return 0;
  }

  [root@bar]# gcc foo.c -O2 -Wall
  [root@bar]# lsmod | grep sctp
  [root@bar]# ./a.out
  [root@bar]# lsmod | grep sctp
  sctp                 1077248  4
  libcrc32c              16384  3 nf_conntrack,nf_nat,sctp
  [root@bar]#

Fix it by adding module alias to TCP ULP modules, so probing module
via request_module() will be limited to tcp-ulp-[name]. The existing
modules like kTLS will load fine given tcp-ulp-tls alias, but others
will fail to load:

  [root@bar]# lsmod | grep sctp
  [root@bar]# ./a.out
  [root@bar]# lsmod | grep sctp
  [root@bar]#

Sockmap is not affected by this since it's either built-in or not.

Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 include/net/tcp.h  | 4 ++++
 net/ipv4/tcp_ulp.c | 2 +-
 net/tls/tls_main.c | 1 +
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index d196901..770917d 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2065,6 +2065,10 @@ int tcp_set_ulp_id(struct sock *sk, const int ulp);
 void tcp_get_available_ulp(char *buf, size_t len);
 void tcp_cleanup_ulp(struct sock *sk);
 
+#define MODULE_ALIAS_TCP_ULP(name)				\
+	__MODULE_INFO(alias, alias_userspace, name);		\
+	__MODULE_INFO(alias, alias_tcp_ulp, "tcp-ulp-" name)
+
 /* Call BPF_SOCK_OPS program that returns an int. If the return value
  * is < 0, then the BPF op failed (for example if the loaded BPF
  * program does not support the chosen operation or there is no BPF
diff --git a/net/ipv4/tcp_ulp.c b/net/ipv4/tcp_ulp.c
index 622caa4..7dd44b6 100644
--- a/net/ipv4/tcp_ulp.c
+++ b/net/ipv4/tcp_ulp.c
@@ -51,7 +51,7 @@ static const struct tcp_ulp_ops *__tcp_ulp_find_autoload(const char *name)
 #ifdef CONFIG_MODULES
 	if (!ulp && capable(CAP_NET_ADMIN)) {
 		rcu_read_unlock();
-		request_module("%s", name);
+		request_module("tcp-ulp-%s", name);
 		rcu_read_lock();
 		ulp = tcp_ulp_find(name);
 	}
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index b09867c..93c0c22 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -45,6 +45,7 @@
 MODULE_AUTHOR("Mellanox Technologies");
 MODULE_DESCRIPTION("Transport Layer Security Support");
 MODULE_LICENSE("Dual BSD/GPL");
+MODULE_ALIAS_TCP_ULP("tls");
 
 enum {
 	TLSV4,
-- 
2.9.5

^ permalink raw reply related
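
The effect of the prefixed request in patch 1/5 above can be sketched in user space. This is only a model: `known_aliases`, `ulp_alias()` and `ulp_request_would_load()` are hypothetical, and the table is an assumed stand-in for whatever names and aliases the installed modules advertise. It shows that once the request string is built as "tcp-ulp-<name>", a plain module name such as "sctp" no longer matches.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical set of names the module loader could resolve: the alias
 * added for kTLS by the patch, and the plain name of an unrelated module.
 */
static const char *const known_aliases[] = { "tcp-ulp-tls", "sctp" };

/* Build the string that would be handed to the module loader. */
static int ulp_alias(char *buf, size_t len, const char *name)
{
	int n = snprintf(buf, len, "tcp-ulp-%s", name);

	/* Reject encoding errors and truncated results. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Returns 1 if the given string matches an entry in the table. */
static int alias_resolves(const char *alias)
{
	size_t i;

	for (i = 0; i < sizeof(known_aliases) / sizeof(known_aliases[0]); i++)
		if (strcmp(known_aliases[i], alias) == 0)
			return 1;
	return 0;
}

/* Returns 1 if requesting ULP "name" would find a module to load. */
static int ulp_request_would_load(const char *name)
{
	char alias[64];

	if (ulp_alias(alias, sizeof(alias), name))
		return 0;
	return alias_resolves(alias);
}
```

In this model "tls" resolves through its "tcp-ulp-tls" alias, while "sctp" becomes "tcp-ulp-sctp" and finds nothing, which matches the before/after lsmod output in the commit message.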

* [PATCH bpf 3/5] bpf, sockmap: fix leakage of smap_psock_map_entry
From: Daniel Borkmann @ 2018-08-16 19:49 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: john.fastabend, netdev, Daniel Borkmann
In-Reply-To: <20180816194910.9040-1-daniel@iogearbox.net>

While working on sockmap I noticed that we do not always kfree the
struct smap_psock_map_entry list elements which track psocks attached
to maps. In the case of sock_hash_ctx_update_elem(), these map entries
are allocated outside of __sock_map_ctx_update_elem() with their
linkage to the socket hash table filled. In the case of sock array,
the map entries are allocated inside of __sock_map_ctx_update_elem()
and added with their linkage to the psock->maps. Both additions are
under psock->maps_lock each.

Now, we drop these elements from their psock->maps list on a few
occasions: i) in sock array via smap_list_map_remove() when an entry
is either deleted from the map from user space, or updated via
user space or BPF program where we drop the old socket at that map
slot, or the sock array is freed via sock_map_free() and drops all
its elements; ii) for sock hash via smap_list_hash_remove() on exactly
the same occasions as just described for sock array; iii) in the
bpf_tcp_close() where we remove the elements from the list via
psock_map_pop() and iterate over them dropping themselves from either
sock array or sock hash; and last but not least iv) once again in
smap_gc_work() which is a callback for deferring the work once the
psock refcount hit zero and thus the socket is being destroyed.

Problem is that the only case where we kfree() the list entry is
in case iv), which at that point should have an empty list in
normal cases. So in cases i) to iii) we unlink the elements
without freeing them, after which they are out of our reach. Hence
the fix is to properly kfree() them as well to stop the leakage. Given
these are all handled under psock->maps_lock, there is no need for
deferred RCU freeing.

I later also ran with kmemleak detector and it confirmed the finding
as well where in the state before the fix the object goes unreferenced
while after the patch no kmemleak report related to BPF showed up.

  [...]
  unreferenced object 0xffff880378eadae0 (size 64):
    comm "test_sockmap", pid 2225, jiffies 4294720701 (age 43.504s)
    hex dump (first 32 bytes):
      00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de  ................
      50 4d 75 5d 03 88 ff ff 00 00 00 00 00 00 00 00  PMu]............
    backtrace:
      [<000000005225ac3c>] sock_map_ctx_update_elem.isra.21+0xd8/0x210
      [<0000000045dd6d3c>] bpf_sock_map_update+0x29/0x60
      [<00000000877723aa>] ___bpf_prog_run+0x1e1f/0x4960
      [<000000002ef89e83>] 0xffffffffffffffff
  unreferenced object 0xffff880378ead240 (size 64):
    comm "test_sockmap", pid 2225, jiffies 4294720701 (age 43.504s)
    hex dump (first 32 bytes):
      00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de  ................
      00 44 75 5d 03 88 ff ff 00 00 00 00 00 00 00 00  .Du]............
    backtrace:
      [<000000005225ac3c>] sock_map_ctx_update_elem.isra.21+0xd8/0x210
      [<0000000030e37a3a>] sock_map_update_elem+0x125/0x240
      [<000000002e5ce36e>] map_update_elem+0x4eb/0x7b0
      [<00000000db453cc9>] __x64_sys_bpf+0x1f9/0x360
      [<0000000000763660>] do_syscall_64+0x9a/0x300
      [<00000000422a2bb2>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [<000000002ef89e83>] 0xffffffffffffffff
  [...]

Fixes: e9db4ef6bf4c ("bpf: sockhash fix omitted bucket lock in sock_close")
Fixes: 54fedb42c653 ("bpf: sockmap, fix smap_list_map_remove when psock is in many maps")
Fixes: 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/sockmap.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 0c1a696..94a324b 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -370,6 +370,7 @@ static void bpf_tcp_close(struct sock *sk, long timeout)
 			}
 			raw_spin_unlock_bh(&b->lock);
 		}
+		kfree(e);
 		e = psock_map_pop(sk, psock);
 	}
 	rcu_read_unlock();
@@ -1675,8 +1676,10 @@ static void smap_list_map_remove(struct smap_psock *psock,
 
 	spin_lock_bh(&psock->maps_lock);
 	list_for_each_entry_safe(e, tmp, &psock->maps, list) {
-		if (e->entry == entry)
+		if (e->entry == entry) {
 			list_del(&e->list);
+			kfree(e);
+		}
 	}
 	spin_unlock_bh(&psock->maps_lock);
 }
@@ -1690,8 +1693,10 @@ static void smap_list_hash_remove(struct smap_psock *psock,
 	list_for_each_entry_safe(e, tmp, &psock->maps, list) {
 		struct htab_elem *c = rcu_dereference(e->hash_link);
 
-		if (c == hash_link)
+		if (c == hash_link) {
 			list_del(&e->list);
+			kfree(e);
+		}
 	}
 	spin_unlock_bh(&psock->maps_lock);
 }
-- 
2.9.5

^ permalink raw reply related
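
The unlink-without-free pattern fixed by patch 3/5 above is easy to reproduce outside the kernel. The following is a minimal sketch using a hypothetical singly linked list (`struct demo_entry`, `demo_list_add()`, `demo_list_remove()`), not the kernel's list_head API: every entry that is unlinked is also freed in the same step, since nothing else references it afterwards.

```c
#include <stdlib.h>

/* Hypothetical tracking entry: which map slot a socket is linked into. */
struct demo_entry {
	struct demo_entry *next;
	void *slot;
};

/* Allocate an entry for slot and push it onto the list. */
static int demo_list_add(struct demo_entry **head, void *slot)
{
	struct demo_entry *e = calloc(1, sizeof(*e));

	if (!e)
		return -1;
	e->slot = slot;
	e->next = *head;
	*head = e;
	return 0;
}

/* Unlink every entry referring to slot and free it right away.
 * Returns the number of entries freed.
 */
static int demo_list_remove(struct demo_entry **head, void *slot)
{
	int freed = 0;

	while (*head) {
		struct demo_entry *e = *head;

		if (e->slot == slot) {
			*head = e->next;	/* unlink ... */
			free(e);		/* ... and free, or it leaks */
			freed++;
		} else {
			head = &e->next;
		}
	}
	return freed;
}
```

Dropping the `free(e)` line gives the pre-patch behaviour: the entry disappears from the list but its memory stays allocated, which is what kmemleak reported.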

* [PATCH bpf 2/5] tcp, ulp: fix leftover icsk_ulp_ops preventing sock from reattach
From: Daniel Borkmann @ 2018-08-16 19:49 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: john.fastabend, netdev, Daniel Borkmann
In-Reply-To: <20180816194910.9040-1-daniel@iogearbox.net>

I found that in BPF sockmap programs, once we either delete a socket
from the map or update a map slot so that the old socket is purged
from the map, these sockets can never get reattached into a map
even though their related psock has been dropped entirely at that
point.

Reason is that tcp_cleanup_ulp() leaves the old icsk->icsk_ulp_ops
intact, so that on the next tcp_set_ulp_id() the kernel returns an
-EEXIST thinking there is still some active ULP attached.

BPF sockmap is the only one that has this issue as the other user,
kTLS, only calls tcp_cleanup_ulp() from tcp_v4_destroy_sock() whereas
sockmap semantics allow dropping the socket from the map with all
related psock state being cleaned up.

Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 net/ipv4/tcp_ulp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv4/tcp_ulp.c b/net/ipv4/tcp_ulp.c
index 7dd44b6..a5995bb 100644
--- a/net/ipv4/tcp_ulp.c
+++ b/net/ipv4/tcp_ulp.c
@@ -129,6 +129,8 @@ void tcp_cleanup_ulp(struct sock *sk)
 	if (icsk->icsk_ulp_ops->release)
 		icsk->icsk_ulp_ops->release(sk);
 	module_put(icsk->icsk_ulp_ops->owner);
+
+	icsk->icsk_ulp_ops = NULL;
 }
 
 /* Change upper layer protocol for socket */
-- 
2.9.5

^ permalink raw reply related
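
The stale-pointer problem from patch 2/5 above can be modelled in a few lines of user-space C. The types and functions here (`struct mock_sock`, `mock_set_ulp()`, `mock_cleanup_ulp_old()`, `mock_cleanup_ulp_fixed()`) are hypothetical stand-ins, not the kernel definitions; the sketch only assumes, as the commit message says, that attach fails with -EEXIST while an ops pointer is still set.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct tcp_ulp_ops. */
struct mock_ulp_ops {
	const char *name;
};

/* Hypothetical stand-in for the socket; ulp_ops plays the role of
 * icsk->icsk_ulp_ops.
 */
struct mock_sock {
	const struct mock_ulp_ops *ulp_ops;
};

/* Attach refuses to proceed while an ops pointer is still set. */
static int mock_set_ulp(struct mock_sock *sk, const struct mock_ulp_ops *ops)
{
	if (sk->ulp_ops)
		return -EEXIST;
	sk->ulp_ops = ops;
	return 0;
}

/* Cleanup before the fix: state is released, the pointer is left behind. */
static void mock_cleanup_ulp_old(struct mock_sock *sk)
{
	(void)sk;
}

/* Cleanup after the fix: the pointer is cleared so attach works again. */
static void mock_cleanup_ulp_fixed(struct mock_sock *sk)
{
	sk->ulp_ops = NULL;
}
```

After `mock_cleanup_ulp_old()` a second `mock_set_ulp()` returns -EEXIST; after `mock_cleanup_ulp_fixed()` it succeeds, which is the reattach behaviour the patch restores.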

* [PATCH bpf 0/5] BPF sockmap and ulp fixes
From: Daniel Borkmann @ 2018-08-16 19:49 UTC (permalink / raw)
  To: alexei.starovoitov; +Cc: john.fastabend, netdev, Daniel Borkmann

Batch of various fixes related to BPF sockmap and ULP, including
adding a module alias to restrict module requests, and fixing races
and memory leaks in the sockmap code. For details please refer to the
individual patches. Thanks!

Daniel Borkmann (5):
  tcp, ulp: add alias for all ulp modules
  tcp, ulp: fix leftover icsk_ulp_ops preventing sock from reattach
  bpf, sockmap: fix leakage of smap_psock_map_entry
  bpf, sockmap: fix map elem deletion race with smap_stop_sock
  bpf, sockmap: fix sock_map_ctx_update_elem race with exist/noexist

 include/net/tcp.h    |   4 ++
 kernel/bpf/sockmap.c | 120 +++++++++++++++++++++++++++++----------------------
 net/ipv4/tcp_ulp.c   |   4 +-
 net/tls/tls_main.c   |   1 +
 4 files changed, 76 insertions(+), 53 deletions(-)

-- 
2.9.5

* Re: phylib: Any PHY which reports link up during autoneg ?
From: Heiner Kallweit @ 2018-08-16 19:46 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: netdev@vger.kernel.org
In-Reply-To: <45f80146-f041-d6e6-a78b-b997456e03e1@gmail.com>

On 16.08.2018 21:21, Florian Fainelli wrote:
> On 08/16/2018 12:15 PM, Heiner Kallweit wrote:
>> When reading through the state machine code in phy.c I wondered whether
>> there is any PHY which reports the link as up during autonegotiation.
>> (It's about handling PHY_AN in the state machine, once we know the link
>> is up we still check whether aneg was completed. Is this needed?)
>>
>> At least the PHYs I have access to all report the link as down when
>> autonegotiating.
>> I also checked clause 22 of 802.3u, but found no clear definition of
>> when a link should be considered up. There it's stated that it's
>> PHY-dependent.
>>
>> Can you shed any light on this?
> 
> I think the answer is no, there are no PHYs that can report a link UP
> during auto-negotiation, that is, between the time you ask for an
> auto-negotiation to occur/restart and the PHY reporting that it is
> done. By definition, auto-negotiation needs to figure out the common
> denominator between link partners; if, and only if, that process
> completes (ANEG done), then the link should be UP.
> 
> If the link was reported UP during auto-negotiation, it would not be
> possible to be deterministic about:
> 
> - which link parameters changed between negotiation attempts
> - whether the link is DOWN due to a physical disconnection or a failure
> to agree on parameters
> 
Thanks for the clarification.
My confusion was mainly about the case of an autoneg restart when
there's an established connection. Because autoneg parameters are
exchanged by sideband signaling (link pulses), normal traffic may not
be affected, unless autoneg requires the rx/tx units to switch to some
autoneg processing mode.

Of course at some point in time, when both link partners have agreed
on a parameter set, they have to switch the mode, resulting in at
least a small window where the link is down.
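
The check in question can be pictured with a small model. This is a
sketch only, with hypothetical names (model_phy_state, model_poll_an());
it is not the code in phy.c. The middle branch is the one that would
only ever be taken for a PHY that reports link up while autoneg is
still in progress.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of states; not the kernel's enum phy_state. */
enum model_phy_state {
	MODEL_PHY_AN,
	MODEL_PHY_NOLINK,
	MODEL_PHY_RUNNING,
};

/* One poll while in the autonegotiating state. link_up and aneg_done
 * stand for whatever the PHY currently reports.
 */
static enum model_phy_state model_poll_an(bool link_up, bool aneg_done)
{
	if (!link_up)
		return MODEL_PHY_NOLINK;	/* no link: stop waiting for aneg */
	if (!aneg_done)
		return MODEL_PHY_AN;		/* link up, aneg pending: keep polling */
	return MODEL_PHY_RUNNING;		/* link up and aneg done */
}
```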

* Re: [PATCH v3] net/mlx5e: Delete unneeded function argument
From: David Miller @ 2018-08-16 19:28 UTC (permalink / raw)
  To: yuval.shaia; +Cc: saeedm, leon, netdev, linux-rdma
In-Reply-To: <20180816090220.4537-1-yuval.shaia@oracle.com>

From: Yuval Shaia <yuval.shaia@oracle.com>
Date: Thu, 16 Aug 2018 12:02:20 +0300

> priv argument is not used by the function, delete it.
> 
> Fixes: a89842811ea98 ("net/mlx5e: Merge per priority stats groups")
> Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> ---
> v1 -> v2:
> 	* Remove blank line as pointed out by Leon.
> 
> v2 -> v3:
> 	* Change prefix to mlx5e

Applied, thank you.

* Re: [PATCH] net: dsa: add support for ksz9897 ethernet switch
From: David Miller @ 2018-08-16 19:25 UTC (permalink / raw)
  To: prabhakar.csengg
  Cc: Woojung.Huh, UNGLinuxDriver, andrew, vivien.didelot, f.fainelli,
	netdev, devicetree
In-Reply-To: <1534348283-12790-1-git-send-email-prabhakar.csengg@gmail.com>

From: Lad Prabhakar <prabhakar.csengg@gmail.com>
Date: Wed, 15 Aug 2018 16:51:23 +0100

> From: "Lad, Prabhakar" <prabhakar.csengg@gmail.com>
> 
> ksz9477 is a superset of the ksz9xx series; the driver just works
> out of the box for the ksz9897 chip with this patch.
> 
> Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>

Since this is just adding chip IDs and such, this can go in now.

Applied, thanks.

* Re: [PATCH v3 net-next] veth: Free queues on link delete
From: David Miller @ 2018-08-16 19:22 UTC (permalink / raw)
  To: makita.toshiaki; +Cc: netdev, dsahern, dsahern
In-Reply-To: <1534320449-2433-1-git-send-email-makita.toshiaki@lab.ntt.co.jp>

From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Date: Wed, 15 Aug 2018 17:07:29 +0900

> David Ahern reported a memory leak in veth.
 ...
> veth_rq allocated in veth_newlink() was not freed on dellink.
> 
> We need to free them after veth_close() so that no packets will
> reference the queues afterwards. Thus free them in veth_dev_free() in
> the same way as freeing the stats structure (vstats).
> 
> Also move queues allocation to veth_dev_init() to be in line with stats
> allocation.
> 
> Fixes: 638264dc90227 ("veth: Support per queue XDP ring")
> Reported-by: David Ahern <dsahern@gmail.com>
> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>

Applied, thank you.

* Re: phylib: Any PHY which reports link up during autoneg ?
From: Florian Fainelli @ 2018-08-16 19:21 UTC (permalink / raw)
  To: Heiner Kallweit, Andrew Lunn; +Cc: netdev@vger.kernel.org
In-Reply-To: <9996748f-c274-e1de-6a5a-962f2035d5c2@gmail.com>

On 08/16/2018 12:15 PM, Heiner Kallweit wrote:
> When reading through the state machine code in phy.c I wondered whether
> there is any PHY which reports the link as up during autonegotiation.
> (It's about handling PHY_AN in the state machine, once we know the link
> is up we still check whether aneg was completed. Is this needed?)
> 
> At least the PHYs I have access to all report the link as down when
> autonegotiating.
> I also checked clause 22 of 802.3u, but found no clear definition of
> when a link should be considered up. There it's stated that it's
> PHY-dependent.
> 
> Can you shed any light on this?

I think the answer is no, there are no PHYs that can report a link UP
during auto-negotiation, that is, between the time you ask for an
auto-negotiation to occur/restart and the PHY reporting that it is
done. By definition, auto-negotiation needs to figure out the common
denominator between link partners; if, and only if, that process
completes (ANEG done), then the link should be UP.

If the link was reported UP during auto-negotiation, it would not be
possible to be deterministic about:

- which link parameters changed between negotiation attempts
- whether the link is DOWN due to a physical disconnection or a failure
to agree on parameters
-- 
Florian

* Re: [PATCH] r8169: don't use MSI-X on RTL8106e
From: Heiner Kallweit @ 2018-08-16 19:18 UTC (permalink / raw)
  To: Jian-Hong Pan, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <ff5ea624-0bd5-ef9e-9c02-deb4c1de601b@gmail.com>

On 16.08.2018 20:59, Heiner Kallweit wrote:
>> From: Jian-Hong Pan <jian-hong@endlessm.com>
>>
>> Found the ethernet network on ASUS X441UAR doesn't come back on resume
>> from suspend when using MSI-X.  The chip is RTL8106e - version 39.
>>
> The patch itself looks good, just the commit message is wrong in one
> place and a little bit long.
> 
The patch should also be annotated "net", and it is missing a "Fixes" tag.

>> asus@endless:~$ dmesg | grep r8169
>> [   21.848357] libphy: r8169: probed
>> [   21.848473] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4, XID
>> 44900000, IRQ 127
>> [   22.518860] r8169 0000:02:00.0 enp2s0: renamed from eth0
>> [   29.458041] Generic PHY r8169-200:00: attached PHY driver [Generic
>> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
>> [   63.227398] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
>> flow control off
>> [  124.514648] Generic PHY r8169-200:00: attached PHY driver [Generic
>> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
>>
>> Here is the ethernet controller in detail:
>>
>> asus@endless:~$ sudo lspci -nnvs 02:00.0
>> [sudo] password for asus:
>> 02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
>> RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136]
>> (rev 07)
>> 	Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast
>> Ethernet controller [1043:200f]
>> 	Flags: bus master, fast devsel, latency 0, IRQ 16
>> 	I/O ports at e000 [size=256]
>> 	Memory at ef100000 (64-bit, non-prefetchable) [size=4K]
>> 	Memory at e0000000 (64-bit, prefetchable) [size=16K]
>> 	Capabilities: [40] Power Management version 3
>> 	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
>> 	Capabilities: [70] Express Endpoint, MSI 01
>> 	Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
>> 	Capabilities: [d0] Vital Product Data
>> 	Capabilities: [100] Advanced Error Reporting
>> 	Capabilities: [140] Virtual Channel
>> 	Capabilities: [160] Device Serial Number 01-00-00-00-36-4c-e0-00
>> 	Capabilities: [170] Latency Tolerance Reporting
>> 	Kernel driver in use: r8169
>> 	Kernel modules: r8169
>>
>> Here is the system interrupt table:
>>
>> asus@endless:~$ cat /proc/interrupts
>>             CPU0       CPU1       CPU2       CPU3
>>    0:         22          0          0          0   IO-APIC    2-edge
>> timer
>>    1:        157         42          0          0   IO-APIC    1-edge
>> i8042
>>    8:          0          0          1          0   IO-APIC    8-edge
>> rtc0
>>    9:         10         13          0          0   IO-APIC    9-fasteoi
>> acpi
>>   16:          0          0          0          0   IO-APIC   16-fasteoi
>> i2c_designware.0, i801_smbus
>>   17:       2445          0       3453          0   IO-APIC   17-fasteoi
>> i2c_designware.1, rtl_pci
>>  109:          2          0          0          1   IO-APIC  109-fasteoi
>> FTE1200:00
>>  120:          0          0          0          0   PCI-MSI 458752-edge
>> PCIe PME
>>  121:          0          0          0          0   PCI-MSI 466944-edge
>> PCIe PME
>>  122:          0          0          0          0   PCI-MSI 468992-edge
>> PCIe PME
>>  123:       1465          0          0      21263   PCI-MSI 376832-edge
>> ahci[0000:00:17.0]
>>  124:          0        530          0          0   PCI-MSI 327680-edge
>> xhci_hcd
>>  125:       5204          0          0          0   PCI-MSI 32768-edge
>> i915
>>  126:          0          0        149          0   PCI-MSI 514048-edge
>> snd_hda_intel:card0
>>  127:          0          0        337          0   PCI-MSI 1048576-edge
>> enp2s0
>>  NMI:          0          0          0          0   Non-maskable
>> interrupts
>>  LOC:      45049      39474      38978      46677   Local timer
>> interrupts
>>  SPU:          0          0          0          0   Spurious interrupts
>>  PMI:          0          0          0          0   Performance
>> monitoring interrupts
>>  IWI:        619          8          0          1   IRQ work interrupts
>>  RTR:          6          0          0          0   APIC ICR read
>> retries
>>  RES:       4918       4436       3835       2943   Rescheduling
>> interrupts
>>  CAL:       1399       1478       1598       1465   Function call
>> interrupts
>>  TLB:        608        513        723        559   TLB shootdowns
>>  TRM:          0          0          0          0   Thermal event
>> interrupts
>>  THR:          0          0          0          0   Threshold APIC
>> interrupts
>>  DFR:          0          0          0          0   Deferred Error APIC
>> interrupts
>>  MCE:          0          0          0          0   Machine check
>> exceptions
>>  MCP:          3          4          4          4   Machine check polls
>>  ERR:          0
>>  MIS:          0
>>  PIN:          0          0          0          0   Posted-interrupt
>> notification event
>>  NPI:          0          0          0          0   Nested
>> posted-interrupt event
>>  PIW:          0          0          0          0   Posted-interrupt
>> wakeup event
>>
>> IRQ 127 (PCI-MSI) is the one used by enp2s0.  However, lspci lists MSI
>> as disabled and MSI-X as enabled, which conflicts with the interrupt
>> table.
>>
> Both types of interrupts, MSI and MSI-X, are listed with irq chip name
> "PCI-MSI", because MSI-X is treated as a sub-feature of MSI.
> Therefore the output of /proc/interrupts doesn't make it possible to
> tell whether an MSI or MSI-X interrupt is used, and as a consequence
> there is no such conflict.
> Indeed only lspci provides the information whether MSI or MSI-X is used.
> 
>> Falling back to MSI fixes the issue.
>>
>> Here is the test result with this patch in dmesg:
>>
>> asus@endless:~$ dmesg | grep r8169
>> [   22.017477] libphy: r8169: probed
>> [   22.017735] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4, XID
>> 44900000, IRQ 127
>> [   22.041489] r8169 0000:02:00.0 enp2s0: renamed from eth0
>> [   29.138312] Generic PHY r8169-200:00: attached PHY driver [Generic
>> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
>> [   30.927359] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
>> flow control off
>> [  289.998077] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
>> flow control off
>> [  290.508084] Generic PHY r8169-200:00: attached PHY driver [Generic
>> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
>> [  290.745690] r8169 0000:02:00.0 enp2s0: Link is Down
>> [  292.367717] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
>> flow control off
>>
>> lspci lists MSI as enabled and MSI-X as disabled with this patch:
>>
>> asus@endless:~/linux-net$ sudo lspci -nnvs 02:00.0
>> [sudo] password for asus:
>> 02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
>> RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136]
>> (rev 07)
>> 	Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast
>> Ethernet controller [1043:200f]
>> 	Flags: bus master, fast devsel, latency 0, IRQ 127
>> 	I/O ports at e000 [size=256]
>> 	Memory at ef100000 (64-bit, non-prefetchable) [size=4K]
>> 	Memory at e0000000 (64-bit, prefetchable) [size=16K]
>> 	Capabilities: [40] Power Management version 3
>> 	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
>> 	Capabilities: [70] Express Endpoint, MSI 01
>> 	Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
>> 	Capabilities: [d0] Vital Product Data
>> 	Capabilities: [100] Advanced Error Reporting
>> 	Capabilities: [140] Virtual Channel
>> 	Capabilities: [160] Device Serial Number 01-00-00-00-36-4c-e0-00
>> 	Capabilities: [170] Latency Tolerance Reporting
>> 	Kernel driver in use: r8169
>> 	Kernel modules: r8169
>>
>> Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
>> ---
>>  drivers/net/ethernet/realtek/r8169.c | 9 ++++++---
>>  1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>> index 0d9c3831838f..0efa977c422d 100644
>> --- a/drivers/net/ethernet/realtek/r8169.c
>> +++ b/drivers/net/ethernet/realtek/r8169.c
>> @@ -7071,17 +7071,20 @@ static int rtl_alloc_irq(struct rtl8169_private *tp)
>>  {
>>  	unsigned int flags;
>>  
>> -	if (tp->mac_version <= RTL_GIGA_MAC_VER_06) {
>> +	switch (tp->mac_version) {
>> +	case RTL_GIGA_MAC_VER_01 ... RTL_GIGA_MAC_VER_06:
>>  		RTL_W8(tp, Cfg9346, Cfg9346_Unlock);
>>  		RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~MSIEnable);
>>  		RTL_W8(tp, Cfg9346, Cfg9346_Lock);
>>  		flags = PCI_IRQ_LEGACY;
>> -	} else if (tp->mac_version == RTL_GIGA_MAC_VER_40) {
>> +		break;
>> +	case RTL_GIGA_MAC_VER_39 ... RTL_GIGA_MAC_VER_40:
>>  		/* This version was reported to have issues with resume
>>  		 * from suspend when using MSI-X
>>  		 */
>>  		flags = PCI_IRQ_LEGACY | PCI_IRQ_MSI;
>> -	} else {
>> +		break;
>> +	default:
>>  		flags = PCI_IRQ_ALL_TYPES;
>>  	}
>>  
>>
> 
> 

* [PATCH] net: nixge: Add support for 64-bit platforms
From: Moritz Fischer @ 2018-08-16 19:07 UTC (permalink / raw)
  To: davem
  Cc: keescook, netdev, alex.williams, moritz.fischer, Moritz Fischer,
	Florian Fainelli

Add support for 64-bit platforms to the driver.

The hardware only supports 32-bit register accesses
so the accesses need to be split up into two writes
when setting the current and tail descriptor values.
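
As a rough userspace illustration of that split (a sketch, not the
driver code): model_lo32()/model_hi32() are stand-ins for the kernel's
lower_32_bits()/upper_32_bits(), and regs is a plain array assumed to
play the role of the MMIO register window. All model_* names are
hypothetical.

```c
#include <stdint.h>

/* Userspace stand-ins for lower_32_bits()/upper_32_bits(). */
static uint32_t model_lo32(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

static uint32_t model_hi32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* Write a 64-bit descriptor address as two 32-bit register writes:
 * low half at byte offset `offset`, high half at `offset + 4`.
 */
static void model_write_desc_reg(uint32_t *regs, unsigned int offset,
				 uint64_t addr)
{
	regs[offset / 4] = model_lo32(addr);
	regs[offset / 4 + 1] = model_hi32(addr);
}

/* Reassemble the 64-bit address from the two 32-bit halves. */
static uint64_t model_read_desc_reg(const uint32_t *regs,
				    unsigned int offset)
{
	return ((uint64_t)regs[offset / 4 + 1] << 32) | regs[offset / 4];
}
```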

Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
---

Changes from RFC:
- Work around a build warning by casting dma_addr to (u64) when
  printing it with netdev_err()
- Use nixge_hw_dma_bd_set_{offset,phys} to zero out descs
- Change Kconfig from "depends on ARCH_ZYNQ" to "depends on HAS_IOMEM &&
  HAS_DMA"

---
 drivers/net/ethernet/ni/Kconfig |   3 +-
 drivers/net/ethernet/ni/nixge.c | 168 ++++++++++++++++++++++----------
 2 files changed, 116 insertions(+), 55 deletions(-)

diff --git a/drivers/net/ethernet/ni/Kconfig b/drivers/net/ethernet/ni/Kconfig
index aa41e5f6e437..04e315704f71 100644
--- a/drivers/net/ethernet/ni/Kconfig
+++ b/drivers/net/ethernet/ni/Kconfig
@@ -18,8 +18,9 @@ if NET_VENDOR_NI
 
 config NI_XGE_MANAGEMENT_ENET
 	tristate "National Instruments XGE management enet support"
-	depends on ARCH_ZYNQ
+	depends on HAS_IOMEM && HAS_DMA
 	select PHYLIB
+	select OF_MDIO
 	help
 	  Simple LAN device for debug or management purposes. Can
 	  support either 10G or 1G PHYs via SFP+ ports.
diff --git a/drivers/net/ethernet/ni/nixge.c b/drivers/net/ethernet/ni/nixge.c
index 76efed058f33..74cf52e3fb09 100644
--- a/drivers/net/ethernet/ni/nixge.c
+++ b/drivers/net/ethernet/ni/nixge.c
@@ -106,10 +106,10 @@
 	(NIXGE_JUMBO_MTU + NIXGE_HDR_SIZE + NIXGE_TRL_SIZE)
 
 struct nixge_hw_dma_bd {
-	u32 next;
-	u32 reserved1;
-	u32 phys;
-	u32 reserved2;
+	u32 next_lo;
+	u32 next_hi;
+	u32 phys_lo;
+	u32 phys_hi;
 	u32 reserved3;
 	u32 reserved4;
 	u32 cntrl;
@@ -119,11 +119,39 @@ struct nixge_hw_dma_bd {
 	u32 app2;
 	u32 app3;
 	u32 app4;
-	u32 sw_id_offset;
-	u32 reserved5;
+	u32 sw_id_offset_lo;
+	u32 sw_id_offset_hi;
 	u32 reserved6;
 };
 
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+#define nixge_hw_dma_bd_set_addr(bd, field, addr) \
+	do { \
+		(bd)->field##_lo = lower_32_bits(((u64)addr)); \
+		(bd)->field##_hi = upper_32_bits(((u64)addr)); \
+	} while (0)
+#else
+#define nixge_hw_dma_bd_set_addr(bd, field, addr) \
+	((bd)->field##_lo = lower_32_bits((addr)))
+#endif
+
+#define nixge_hw_dma_bd_set_phys(bd, addr) \
+	nixge_hw_dma_bd_set_addr((bd), phys, (addr))
+
+#define nixge_hw_dma_bd_set_next(bd, addr) \
+	nixge_hw_dma_bd_set_addr((bd), next, (addr))
+
+#define nixge_hw_dma_bd_set_offset(bd, addr) \
+	nixge_hw_dma_bd_set_addr((bd), sw_id_offset, (addr))
+
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+#define nixge_hw_dma_bd_get_addr(bd, field) \
+	(dma_addr_t)((((u64)(bd)->field##_hi) << 32) | ((bd)->field##_lo))
+#else
+#define nixge_hw_dma_bd_get_addr(bd, field) \
+	(dma_addr_t)((bd)->field##_lo)
+#endif
+
 struct nixge_tx_skb {
 	struct sk_buff *skb;
 	dma_addr_t mapping;
@@ -176,6 +204,15 @@ static void nixge_dma_write_reg(struct nixge_priv *priv, off_t offset, u32 val)
 	writel(val, priv->dma_regs + offset);
 }
 
+static void nixge_dma_write_desc_reg(struct nixge_priv *priv, off_t offset,
+				     dma_addr_t addr)
+{
+	writel(lower_32_bits(addr), priv->dma_regs + offset);
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+	writel(upper_32_bits(addr), priv->dma_regs + offset + 4);
+#endif
+}
+
 static u32 nixge_dma_read_reg(const struct nixge_priv *priv, off_t offset)
 {
 	return readl(priv->dma_regs + offset);
@@ -202,13 +239,22 @@ static u32 nixge_ctrl_read_reg(struct nixge_priv *priv, off_t offset)
 static void nixge_hw_dma_bd_release(struct net_device *ndev)
 {
 	struct nixge_priv *priv = netdev_priv(ndev);
+	dma_addr_t phys_addr;
+	struct sk_buff *skb;
 	int i;
 
 	for (i = 0; i < RX_BD_NUM; i++) {
-		dma_unmap_single(ndev->dev.parent, priv->rx_bd_v[i].phys,
-				 NIXGE_MAX_JUMBO_FRAME_SIZE, DMA_FROM_DEVICE);
-		dev_kfree_skb((struct sk_buff *)
-			      (priv->rx_bd_v[i].sw_id_offset));
+		phys_addr = nixge_hw_dma_bd_get_addr(&priv->rx_bd_v[i],
+						     phys);
+
+		dma_unmap_single(ndev->dev.parent, phys_addr,
+				 NIXGE_MAX_JUMBO_FRAME_SIZE,
+				 DMA_FROM_DEVICE);
+
+		skb = (struct sk_buff *)
+			nixge_hw_dma_bd_get_addr(&priv->rx_bd_v[i],
+						 sw_id_offset);
+		dev_kfree_skb(skb);
 	}
 
 	if (priv->rx_bd_v)
@@ -231,6 +277,7 @@ static int nixge_hw_dma_bd_init(struct net_device *ndev)
 {
 	struct nixge_priv *priv = netdev_priv(ndev);
 	struct sk_buff *skb;
+	dma_addr_t phys;
 	u32 cr;
 	int i;
 
@@ -259,27 +306,30 @@ static int nixge_hw_dma_bd_init(struct net_device *ndev)
 		goto out;
 
 	for (i = 0; i < TX_BD_NUM; i++) {
-		priv->tx_bd_v[i].next = priv->tx_bd_p +
-				      sizeof(*priv->tx_bd_v) *
-				      ((i + 1) % TX_BD_NUM);
+		nixge_hw_dma_bd_set_next(&priv->tx_bd_v[i],
+					 priv->tx_bd_p +
+					 sizeof(*priv->tx_bd_v) *
+					 ((i + 1) % TX_BD_NUM));
 	}
 
 	for (i = 0; i < RX_BD_NUM; i++) {
-		priv->rx_bd_v[i].next = priv->rx_bd_p +
-				      sizeof(*priv->rx_bd_v) *
-				      ((i + 1) % RX_BD_NUM);
+		nixge_hw_dma_bd_set_next(&priv->rx_bd_v[i],
+					 priv->rx_bd_p
+					 + sizeof(*priv->rx_bd_v) *
+					 ((i + 1) % RX_BD_NUM));
 
 		skb = netdev_alloc_skb_ip_align(ndev,
 						NIXGE_MAX_JUMBO_FRAME_SIZE);
 		if (!skb)
 			goto out;
 
-		priv->rx_bd_v[i].sw_id_offset = (u32)skb;
-		priv->rx_bd_v[i].phys =
-			dma_map_single(ndev->dev.parent,
-				       skb->data,
-				       NIXGE_MAX_JUMBO_FRAME_SIZE,
-				       DMA_FROM_DEVICE);
+		nixge_hw_dma_bd_set_offset(&priv->rx_bd_v[i], skb);
+		phys = dma_map_single(ndev->dev.parent, skb->data,
+				      NIXGE_MAX_JUMBO_FRAME_SIZE,
+				      DMA_FROM_DEVICE);
+
+		nixge_hw_dma_bd_set_phys(&priv->rx_bd_v[i], phys);
+
 		priv->rx_bd_v[i].cntrl = NIXGE_MAX_JUMBO_FRAME_SIZE;
 	}
 
@@ -312,18 +362,18 @@ static int nixge_hw_dma_bd_init(struct net_device *ndev)
 	/* Populate the tail pointer and bring the Rx Axi DMA engine out of
 	 * halted state. This will make the Rx side ready for reception.
 	 */
-	nixge_dma_write_reg(priv, XAXIDMA_RX_CDESC_OFFSET, priv->rx_bd_p);
+	nixge_dma_write_desc_reg(priv, XAXIDMA_RX_CDESC_OFFSET, priv->rx_bd_p);
 	cr = nixge_dma_read_reg(priv, XAXIDMA_RX_CR_OFFSET);
 	nixge_dma_write_reg(priv, XAXIDMA_RX_CR_OFFSET,
 			    cr | XAXIDMA_CR_RUNSTOP_MASK);
-	nixge_dma_write_reg(priv, XAXIDMA_RX_TDESC_OFFSET, priv->rx_bd_p +
+	nixge_dma_write_desc_reg(priv, XAXIDMA_RX_TDESC_OFFSET, priv->rx_bd_p +
 			    (sizeof(*priv->rx_bd_v) * (RX_BD_NUM - 1)));
 
 	/* Write to the RS (Run-stop) bit in the Tx channel control register.
 	 * Tx channel is now ready to run. But only after we write to the
 	 * tail pointer register that the Tx channel will start transmitting.
 	 */
-	nixge_dma_write_reg(priv, XAXIDMA_TX_CDESC_OFFSET, priv->tx_bd_p);
+	nixge_dma_write_desc_reg(priv, XAXIDMA_TX_CDESC_OFFSET, priv->tx_bd_p);
 	cr = nixge_dma_read_reg(priv, XAXIDMA_TX_CR_OFFSET);
 	nixge_dma_write_reg(priv, XAXIDMA_TX_CR_OFFSET,
 			    cr | XAXIDMA_CR_RUNSTOP_MASK);
@@ -451,7 +501,7 @@ static int nixge_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	struct nixge_priv *priv = netdev_priv(ndev);
 	struct nixge_hw_dma_bd *cur_p;
 	struct nixge_tx_skb *tx_skb;
-	dma_addr_t tail_p;
+	dma_addr_t tail_p, cur_phys;
 	skb_frag_t *frag;
 	u32 num_frag;
 	u32 ii;
@@ -466,15 +516,16 @@ static int nixge_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 		return NETDEV_TX_OK;
 	}
 
-	cur_p->phys = dma_map_single(ndev->dev.parent, skb->data,
-				     skb_headlen(skb), DMA_TO_DEVICE);
-	if (dma_mapping_error(ndev->dev.parent, cur_p->phys))
+	cur_phys = dma_map_single(ndev->dev.parent, skb->data,
+				  skb_headlen(skb), DMA_TO_DEVICE);
+	if (dma_mapping_error(ndev->dev.parent, cur_phys))
 		goto drop;
+	nixge_hw_dma_bd_set_phys(cur_p, cur_phys);
 
 	cur_p->cntrl = skb_headlen(skb) | XAXIDMA_BD_CTRL_TXSOF_MASK;
 
 	tx_skb->skb = NULL;
-	tx_skb->mapping = cur_p->phys;
+	tx_skb->mapping = cur_phys;
 	tx_skb->size = skb_headlen(skb);
 	tx_skb->mapped_as_page = false;
 
@@ -485,16 +536,17 @@ static int nixge_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 		tx_skb = &priv->tx_skb[priv->tx_bd_tail];
 		frag = &skb_shinfo(skb)->frags[ii];
 
-		cur_p->phys = skb_frag_dma_map(ndev->dev.parent, frag, 0,
-					       skb_frag_size(frag),
-					       DMA_TO_DEVICE);
-		if (dma_mapping_error(ndev->dev.parent, cur_p->phys))
+		cur_phys = skb_frag_dma_map(ndev->dev.parent, frag, 0,
+					    skb_frag_size(frag),
+					    DMA_TO_DEVICE);
+		if (dma_mapping_error(ndev->dev.parent, cur_phys))
 			goto frag_err;
+		nixge_hw_dma_bd_set_phys(cur_p, cur_phys);
 
 		cur_p->cntrl = skb_frag_size(frag);
 
 		tx_skb->skb = NULL;
-		tx_skb->mapping = cur_p->phys;
+		tx_skb->mapping = cur_phys;
 		tx_skb->size = skb_frag_size(frag);
 		tx_skb->mapped_as_page = true;
 	}
@@ -506,7 +558,7 @@ static int nixge_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 
 	tail_p = priv->tx_bd_p + sizeof(*priv->tx_bd_v) * priv->tx_bd_tail;
 	/* Start the transfer */
-	nixge_dma_write_reg(priv, XAXIDMA_TX_TDESC_OFFSET, tail_p);
+	nixge_dma_write_desc_reg(priv, XAXIDMA_TX_TDESC_OFFSET, tail_p);
 	++priv->tx_bd_tail;
 	priv->tx_bd_tail %= TX_BD_NUM;
 
@@ -537,7 +589,7 @@ static int nixge_recv(struct net_device *ndev, int budget)
 	struct nixge_priv *priv = netdev_priv(ndev);
 	struct sk_buff *skb, *new_skb;
 	struct nixge_hw_dma_bd *cur_p;
-	dma_addr_t tail_p = 0;
+	dma_addr_t tail_p = 0, cur_phys = 0;
 	u32 packets = 0;
 	u32 length = 0;
 	u32 size = 0;
@@ -549,13 +601,15 @@ static int nixge_recv(struct net_device *ndev, int budget)
 		tail_p = priv->rx_bd_p + sizeof(*priv->rx_bd_v) *
 			 priv->rx_bd_ci;
 
-		skb = (struct sk_buff *)(cur_p->sw_id_offset);
+		skb = (struct sk_buff *)nixge_hw_dma_bd_get_addr(cur_p,
+								 sw_id_offset);
 
 		length = cur_p->status & XAXIDMA_BD_STS_ACTUAL_LEN_MASK;
 		if (length > NIXGE_MAX_JUMBO_FRAME_SIZE)
 			length = NIXGE_MAX_JUMBO_FRAME_SIZE;
 
-		dma_unmap_single(ndev->dev.parent, cur_p->phys,
+		dma_unmap_single(ndev->dev.parent,
+				 nixge_hw_dma_bd_get_addr(cur_p, phys),
 				 NIXGE_MAX_JUMBO_FRAME_SIZE,
 				 DMA_FROM_DEVICE);
 
@@ -579,16 +633,17 @@ static int nixge_recv(struct net_device *ndev, int budget)
 		if (!new_skb)
 			return packets;
 
-		cur_p->phys = dma_map_single(ndev->dev.parent, new_skb->data,
-					     NIXGE_MAX_JUMBO_FRAME_SIZE,
-					     DMA_FROM_DEVICE);
-		if (dma_mapping_error(ndev->dev.parent, cur_p->phys)) {
+		cur_phys = dma_map_single(ndev->dev.parent, new_skb->data,
+					  NIXGE_MAX_JUMBO_FRAME_SIZE,
+					  DMA_FROM_DEVICE);
+		if (dma_mapping_error(ndev->dev.parent, cur_phys)) {
 			/* FIXME: bail out and clean up */
 			netdev_err(ndev, "Failed to map ...\n");
 		}
+		nixge_hw_dma_bd_set_phys(cur_p, cur_phys);
 		cur_p->cntrl = NIXGE_MAX_JUMBO_FRAME_SIZE;
 		cur_p->status = 0;
-		cur_p->sw_id_offset = (u32)new_skb;
+		nixge_hw_dma_bd_set_offset(cur_p, new_skb);
 
 		++priv->rx_bd_ci;
 		priv->rx_bd_ci %= RX_BD_NUM;
@@ -599,7 +654,7 @@ static int nixge_recv(struct net_device *ndev, int budget)
 	ndev->stats.rx_bytes += size;
 
 	if (tail_p)
-		nixge_dma_write_reg(priv, XAXIDMA_RX_TDESC_OFFSET, tail_p);
+		nixge_dma_write_desc_reg(priv, XAXIDMA_RX_TDESC_OFFSET, tail_p);
 
 	return packets;
 }
@@ -637,6 +692,7 @@ static irqreturn_t nixge_tx_irq(int irq, void *_ndev)
 	struct nixge_priv *priv = netdev_priv(_ndev);
 	struct net_device *ndev = _ndev;
 	unsigned int status;
+	dma_addr_t phys;
 	u32 cr;
 
 	status = nixge_dma_read_reg(priv, XAXIDMA_TX_SR_OFFSET);
@@ -650,9 +706,11 @@ static irqreturn_t nixge_tx_irq(int irq, void *_ndev)
 		return IRQ_NONE;
 	}
 	if (status & XAXIDMA_IRQ_ERROR_MASK) {
+		phys = nixge_hw_dma_bd_get_addr(&priv->tx_bd_v[priv->tx_bd_ci],
+						phys);
+
 		netdev_err(ndev, "DMA Tx error 0x%x\n", status);
-		netdev_err(ndev, "Current BD is at: 0x%x\n",
-			   (priv->tx_bd_v[priv->tx_bd_ci]).phys);
+		netdev_err(ndev, "Current BD is at: 0x%llx\n", (u64)phys);
 
 		cr = nixge_dma_read_reg(priv, XAXIDMA_TX_CR_OFFSET);
 		/* Disable coalesce, delay timer and error interrupts */
@@ -678,6 +736,7 @@ static irqreturn_t nixge_rx_irq(int irq, void *_ndev)
 	struct nixge_priv *priv = netdev_priv(_ndev);
 	struct net_device *ndev = _ndev;
 	unsigned int status;
+	dma_addr_t phys;
 	u32 cr;
 
 	status = nixge_dma_read_reg(priv, XAXIDMA_RX_SR_OFFSET);
@@ -697,9 +756,10 @@ static irqreturn_t nixge_rx_irq(int irq, void *_ndev)
 		return IRQ_NONE;
 	}
 	if (status & XAXIDMA_IRQ_ERROR_MASK) {
+		phys = nixge_hw_dma_bd_get_addr(&priv->rx_bd_v[priv->rx_bd_ci],
+						phys);
 		netdev_err(ndev, "DMA Rx error 0x%x\n", status);
-		netdev_err(ndev, "Current BD is at: 0x%x\n",
-			   (priv->rx_bd_v[priv->rx_bd_ci]).phys);
+		netdev_err(ndev, "Current BD is at: 0x%llx\n", (u64)phys);
 
 		cr = nixge_dma_read_reg(priv, XAXIDMA_TX_CR_OFFSET);
 		/* Disable coalesce, delay timer and error interrupts */
@@ -735,10 +795,10 @@ static void nixge_dma_err_handler(unsigned long data)
 		tx_skb = &lp->tx_skb[i];
 		nixge_tx_skb_unmap(lp, tx_skb);
 
-		cur_p->phys = 0;
+		nixge_hw_dma_bd_set_phys(cur_p, 0);
 		cur_p->cntrl = 0;
 		cur_p->status = 0;
-		cur_p->sw_id_offset = 0;
+		nixge_hw_dma_bd_set_offset(cur_p, 0);
 	}
 
 	for (i = 0; i < RX_BD_NUM; i++) {
@@ -779,18 +839,18 @@ static void nixge_dma_err_handler(unsigned long data)
 	/* Populate the tail pointer and bring the Rx Axi DMA engine out of
 	 * halted state. This will make the Rx side ready for reception.
 	 */
-	nixge_dma_write_reg(lp, XAXIDMA_RX_CDESC_OFFSET, lp->rx_bd_p);
+	nixge_dma_write_desc_reg(lp, XAXIDMA_RX_CDESC_OFFSET, lp->rx_bd_p);
 	cr = nixge_dma_read_reg(lp, XAXIDMA_RX_CR_OFFSET);
 	nixge_dma_write_reg(lp, XAXIDMA_RX_CR_OFFSET,
 			    cr | XAXIDMA_CR_RUNSTOP_MASK);
-	nixge_dma_write_reg(lp, XAXIDMA_RX_TDESC_OFFSET, lp->rx_bd_p +
+	nixge_dma_write_desc_reg(lp, XAXIDMA_RX_TDESC_OFFSET, lp->rx_bd_p +
 			    (sizeof(*lp->rx_bd_v) * (RX_BD_NUM - 1)));
 
 	/* Write to the RS (Run-stop) bit in the Tx channel control register.
 	 * Tx channel is now ready to run. But only after we write to the
 	 * tail pointer register that the Tx channel will start transmitting
 	 */
-	nixge_dma_write_reg(lp, XAXIDMA_TX_CDESC_OFFSET, lp->tx_bd_p);
+	nixge_dma_write_desc_reg(lp, XAXIDMA_TX_CDESC_OFFSET, lp->tx_bd_p);
 	cr = nixge_dma_read_reg(lp, XAXIDMA_TX_CR_OFFSET);
 	nixge_dma_write_reg(lp, XAXIDMA_TX_CR_OFFSET,
 			    cr | XAXIDMA_CR_RUNSTOP_MASK);
-- 
2.18.0

* phylib: Any PHY which reports link up during autoneg ?
From: Heiner Kallweit @ 2018-08-16 19:15 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: netdev@vger.kernel.org

When reading through the state machine code in phy.c I wondered whether
there is any PHY which reports the link as up during autonegotiation.
(It's about handling PHY_AN in the state machine, once we know the link
is up we still check whether aneg was completed. Is this needed?)

At least the PHYs I have access to all report the link as down when
autonegotiating.
I also checked clause 22 of 802.3u, but found no clear definition of
when a link should be considered up. There it's stated that it's
PHY-dependent.

Can you shed any light on this?

Thanks, Heiner

* Re: [Patch net-next] ila: make lockdep happy again
From: David Miller @ 2018-08-16 19:15 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, tom
In-Reply-To: <20180814222131.19259-1-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Tue, 14 Aug 2018 15:21:31 -0700

> Previously, alloc_ila_locks() and bucket_table_alloc() call
> spin_lock_init() separately, therefore they have two different
> lock names and lock class keys. However, after commit b893281715ab
> ("ila: Call library function alloc_bucket_locks") they both call
> helper alloc_bucket_spinlocks() which now only has one lock
> name and lock class key. This causes a few bogus lockdep warnings
> as reported by syzbot.
> 
> Fix this by making alloc_bucket_locks() a macro and pass declaration
> name as lock name and a static lock class key inside the macro.
> 
> Fixes: b893281715ab ("ila: Call library function alloc_bucket_locks")
> Reported-by: <syzbot+b66a5a554991a8ed027c@syzkaller.appspotmail.com>
> Cc: Tom Herbert <tom@quantonium.net>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied, thank you.
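
For readers following along, the idea in the commit message (a macro that carries its own static lock class key and uses the declaration name as the lock name) can be sketched in plain C. This is an illustration under assumptions, not the kernel helper: every toy_* name is hypothetical and the key is just a dummy object whose address identifies the class.

```c
#include <stddef.h>

/* Hypothetical stand-in for a lock class key: only its address matters. */
struct toy_lock_class_key { char dummy; };

struct toy_lock {
	const char *name;
	const struct toy_lock_class_key *key;
};

static void toy_lock_init(struct toy_lock *l, const char *name,
			  const struct toy_lock_class_key *key)
{
	l->name = name;
	l->key = key;
}

/* Function helper: every caller shares one key and one name. */
static void toy_alloc_locks_fn(struct toy_lock *locks, size_t n)
{
	static struct toy_lock_class_key key;
	size_t i;

	for (i = 0; i < n; i++)
		toy_lock_init(&locks[i], "toy_alloc_locks_fn", &key);
}

/* Macro helper: each expansion site gets its own static key, and the
 * caller's variable name (#locks) becomes the lock name.
 */
#define toy_alloc_locks(locks, n)					\
	do {								\
		static struct toy_lock_class_key key_;			\
		size_t i_;						\
		for (i_ = 0; i_ < (n); i_++)				\
			toy_lock_init(&(locks)[i_], #locks, &key_);	\
	} while (0)
```

With the function helper, two unrelated lock arrays end up in the same class; with the macro, each call site is a separate class, which is what keeps lockdep from mixing them up.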

^ permalink raw reply

* Re: [PATCH net-next] net: sched: act_ife: always release ife action on init error
From: David Miller @ 2018-08-16 19:12 UTC (permalink / raw)
  To: vladbu; +Cc: netdev, jhs, xiyou.wangcong, jiri
In-Reply-To: <1534267796-9841-1-git-send-email-vladbu@mellanox.com>

From: Vlad Buslov <vladbu@mellanox.com>
Date: Tue, 14 Aug 2018 20:29:56 +0300

> Action init API was changed to always take reference to action, even when
> overwriting existing action. Substitute conditional action release, which
> was executed only if action is newly created, with unconditional release in
> tcf_ife_init() error handling code to prevent double free or memory leak in
> case of overwrite.
> 
> Fixes: 4e8ddd7f1758 ("net: sched: don't release reference on action overwrite")
> Reported-by: Cong Wang <xiyou.wangcong@gmail.com>
> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net] cls_matchall: fix tcf_unbind_filter missing
From: David Miller @ 2018-08-16 19:09 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev, xiyou.wangcong, yotamg, jiri
In-Reply-To: <1534238906-16097-1-git-send-email-liuhangbin@gmail.com>

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Tue, 14 Aug 2018 17:28:26 +0800

> Fix tcf_unbind_filter missing in cls_matchall as this will trigger
> WARN_ON() in cbq_destroy_class().
> 
> Fixes: fd62d9f5c575f ("net/sched: matchall: Fix configuration race")
> Reported-by: Li Shuang <shuali@redhat.com>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: Under what conditions is phy_device "adjust_link()" called?
From: Florian Fainelli @ 2018-08-16 18:59 UTC (permalink / raw)
  To: rpjday, netdev
In-Reply-To: <20180816132658.Horde.v9igInCJzJMqJVzetKhC8Jh@crashcourse.ca>

On 08/16/2018 10:26 AM, rpjday@crashcourse.ca wrote:
> 
> I can see from the documentation that the callback adjust_link() is invoked
> "for the enet controller to respond to changes in the link state." Is there
> a specific list of the events that would generate such a change? Are we
> talking initially opening the device, ifup/ifdown, physically unplugging
> from the port, some or all of the above?

adjust_link() is typically called on transitions from link UP to DOWN
and DOWN to UP. This may include the initial configuration of the PHY
during e.g. phy_connect(), and then typically whenever an event occurs
that requires a re-configuration of the MAC: link parameters (speed,
status, duplex, pause) changed.
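
As a rough illustration of what a driver's callback usually does with this, here is a self-contained sketch. It does not use the real phy_device or net_device structures; struct toy_link, struct toy_mac and toy_adjust_link() are hypothetical stand-ins, and the "reprogramming" is reduced to caching the new values.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the link parameters a PHY reports. */
struct toy_link {
	bool up;
	int speed;		/* Mbps */
	bool full_duplex;
	bool pause;
};

/* Hypothetical MAC state: the parameters last programmed into the MAC. */
struct toy_mac {
	struct toy_link cur;
	int reconfig_count;	/* number of times the MAC was reprogrammed */
};

/* Compare the PHY's view with the cached one and reconfigure on change.
 * Returns true if the MAC had to be reconfigured.
 */
static bool toy_adjust_link(struct toy_mac *mac, const struct toy_link *phy)
{
	bool changed = mac->cur.up != phy->up;

	/* Speed, duplex and pause only matter while the link is up. */
	if (phy->up && (mac->cur.speed != phy->speed ||
			mac->cur.full_duplex != phy->full_duplex ||
			mac->cur.pause != phy->pause))
		changed = true;

	if (!changed)
		return false;

	mac->cur = *phy;	/* stands in for writing the MAC registers */
	mac->reconfig_count++;
	return true;
}
```

The callback itself can fire without anything relevant having changed, so comparing against cached state before touching the hardware is a common pattern.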

> 
> Not a network expert (yet), so I'm still digging through the code. Thanks.

Reading the PHY state machine under drivers/net/phy/phy.c will make this
more clear IMHO.
-- 
Florian

^ permalink raw reply

* Re: [PATCH] r8169: don't use MSI-X on RTL8106e
From: Heiner Kallweit @ 2018-08-16 18:59 UTC (permalink / raw)
  To: Jian-Hong Pan, David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <2676c1ed-4450-d720-84a0-95e5884490cb@web.de>

> From: Jian-Hong Pan <jian-hong@endlessm.com>
> 
> Found the ethernet network on ASUS X441UAR doesn't come back on resume
> from suspend when using MSI-X.  The chip is RTL8106e - version 39.
> 
The patch itself looks good, just the commit message is wrong in one
place and a little bit long.

> asus@endless:~$ dmesg | grep r8169
> [   21.848357] libphy: r8169: probed
> [   21.848473] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4, XID
> 44900000, IRQ 127
> [   22.518860] r8169 0000:02:00.0 enp2s0: renamed from eth0
> [   29.458041] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> [   63.227398] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
> flow control off
> [  124.514648] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> 
> Here is the ethernet controller in detail:
> 
> asus@endless:~$ sudo lspci -nnvs 02:00.0
> [sudo] password for asus:
> 02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
> RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136]
> (rev 07)
> 	Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast
> Ethernet controller [1043:200f]
> 	Flags: bus master, fast devsel, latency 0, IRQ 16
> 	I/O ports at e000 [size=256]
> 	Memory at ef100000 (64-bit, non-prefetchable) [size=4K]
> 	Memory at e0000000 (64-bit, prefetchable) [size=16K]
> 	Capabilities: [40] Power Management version 3
> 	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
> 	Capabilities: [70] Express Endpoint, MSI 01
> 	Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
> 	Capabilities: [d0] Vital Product Data
> 	Capabilities: [100] Advanced Error Reporting
> 	Capabilities: [140] Virtual Channel
> 	Capabilities: [160] Device Serial Number 01-00-00-00-36-4c-e0-00
> 	Capabilities: [170] Latency Tolerance Reporting
> 	Kernel driver in use: r8169
> 	Kernel modules: r8169
> 
> Here is the system interrupt table:
> 
> asus@endless:~$ cat /proc/interrupts
>             CPU0       CPU1       CPU2       CPU3
>    0:         22          0          0          0   IO-APIC    2-edge
> timer
>    1:        157         42          0          0   IO-APIC    1-edge
> i8042
>    8:          0          0          1          0   IO-APIC    8-edge
> rtc0
>    9:         10         13          0          0   IO-APIC    9-fasteoi
> acpi
>   16:          0          0          0          0   IO-APIC   16-fasteoi
> i2c_designware.0, i801_smbus
>   17:       2445          0       3453          0   IO-APIC   17-fasteoi
> i2c_designware.1, rtl_pci
>  109:          2          0          0          1   IO-APIC  109-fasteoi
> FTE1200:00
>  120:          0          0          0          0   PCI-MSI 458752-edge
> PCIe PME
>  121:          0          0          0          0   PCI-MSI 466944-edge
> PCIe PME
>  122:          0          0          0          0   PCI-MSI 468992-edge
> PCIe PME
>  123:       1465          0          0      21263   PCI-MSI 376832-edge
> ahci[0000:00:17.0]
>  124:          0        530          0          0   PCI-MSI 327680-edge
> xhci_hcd
>  125:       5204          0          0          0   PCI-MSI 32768-edge
> i915
>  126:          0          0        149          0   PCI-MSI 514048-edge
> snd_hda_intel:card0
>  127:          0          0        337          0   PCI-MSI 1048576-edge
> enp2s0
>  NMI:          0          0          0          0   Non-maskable
> interrupts
>  LOC:      45049      39474      38978      46677   Local timer
> interrupts
>  SPU:          0          0          0          0   Spurious interrupts
>  PMI:          0          0          0          0   Performance
> monitoring interrupts
>  IWI:        619          8          0          1   IRQ work interrupts
>  RTR:          6          0          0          0   APIC ICR read
> retries
>  RES:       4918       4436       3835       2943   Rescheduling
> interrupts
>  CAL:       1399       1478       1598       1465   Function call
> interrupts
>  TLB:        608        513        723        559   TLB shootdowns
>  TRM:          0          0          0          0   Thermal event
> interrupts
>  THR:          0          0          0          0   Threshold APIC
> interrupts
>  DFR:          0          0          0          0   Deferred Error APIC
> interrupts
>  MCE:          0          0          0          0   Machine check
> exceptions
>  MCP:          3          4          4          4   Machine check polls
>  ERR:          0
>  MIS:          0
>  PIN:          0          0          0          0   Posted-interrupt
> notification event
>  NPI:          0          0          0          0   Nested
> posted-interrupt event
>  PIW:          0          0          0          0   Posted-interrupt
> wakeup event
> 
> It is the IRQ 127 - PCI-MSI used by enp2s0.  However, lspci lists MSI is
> disabled and MSI-X is enabled which conflicts to the interrupt table.
> 
Both types of interrupts, MSI and MSI-X, are listed with irq chip name
"PCI-MSI", because MSI-X is treated as a sub-feature of MSI.
Therefore the output of /proc/interrupts doesn't tell you whether an
MSI or MSI-X interrupt is used, and as a consequence there is no such
conflict.
Indeed, only lspci shows whether MSI or MSI-X is used.
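
If one wanted to check this from a script or test harness, the capability lines in the lspci output quoted here are enough. The helper below is a minimal sketch that only string-matches the "Enable+" markers exactly as they appear in that output; the function and enum names are hypothetical and it is not part of any existing tool.

```c
#include <string.h>

enum toy_irq_mode { TOY_IRQ_NONE, TOY_IRQ_MSI, TOY_IRQ_MSIX };

/* Classify `lspci -v` text by which MSI capability is enabled.
 * TOY_IRQ_NONE means neither capability shows "Enable+".
 */
static enum toy_irq_mode toy_irq_mode_from_lspci(const char *text)
{
	if (strstr(text, "MSI-X: Enable+"))
		return TOY_IRQ_MSIX;
	if (strstr(text, "MSI: Enable+"))
		return TOY_IRQ_MSI;
	return TOY_IRQ_NONE;
}
```

Fed the two lspci dumps in this mail, it would classify the first as MSI-X and the one taken with the patch applied as MSI.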

> Falling back to MSI fixes the issue.
> 
> Here is the test result with this patch in dmesg:
> 
> asus@endless:~$ dmesg | grep r8169
> [   22.017477] libphy: r8169: probed
> [   22.017735] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4, XID
> 44900000, IRQ 127
> [   22.041489] r8169 0000:02:00.0 enp2s0: renamed from eth0
> [   29.138312] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> [   30.927359] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
> flow control off
> [  289.998077] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
> flow control off
> [  290.508084] Generic PHY r8169-200:00: attached PHY driver [Generic
> PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
> [  290.745690] r8169 0000:02:00.0 enp2s0: Link is Down
> [  292.367717] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
> flow control off
> 
> lspci lists MSI is enabled and MSI-X is disabled with this patch:
> 
> asus@endless:~/linux-net$ sudo lspci -nnvs 02:00.0
> [sudo] password for asus:
> 02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
> RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136]
> (rev 07)
> 	Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast
> Ethernet controller [1043:200f]
> 	Flags: bus master, fast devsel, latency 0, IRQ 127
> 	I/O ports at e000 [size=256]
> 	Memory at ef100000 (64-bit, non-prefetchable) [size=4K]
> 	Memory at e0000000 (64-bit, prefetchable) [size=16K]
> 	Capabilities: [40] Power Management version 3
> 	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
> 	Capabilities: [70] Express Endpoint, MSI 01
> 	Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
> 	Capabilities: [d0] Vital Product Data
> 	Capabilities: [100] Advanced Error Reporting
> 	Capabilities: [140] Virtual Channel
> 	Capabilities: [160] Device Serial Number 01-00-00-00-36-4c-e0-00
> 	Capabilities: [170] Latency Tolerance Reporting
> 	Kernel driver in use: r8169
> 	Kernel modules: r8169
> 
> Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
> ---
>  drivers/net/ethernet/realtek/r8169.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index 0d9c3831838f..0efa977c422d 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -7071,17 +7071,20 @@ static int rtl_alloc_irq(struct rtl8169_private *tp)
>  {
>  	unsigned int flags;
>  
> -	if (tp->mac_version <= RTL_GIGA_MAC_VER_06) {
> +	switch (tp->mac_version) {
> +	case RTL_GIGA_MAC_VER_01 ... RTL_GIGA_MAC_VER_06:
>  		RTL_W8(tp, Cfg9346, Cfg9346_Unlock);
>  		RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~MSIEnable);
>  		RTL_W8(tp, Cfg9346, Cfg9346_Lock);
>  		flags = PCI_IRQ_LEGACY;
> -	} else if (tp->mac_version == RTL_GIGA_MAC_VER_40) {
> +		break;
> +	case RTL_GIGA_MAC_VER_39 ... RTL_GIGA_MAC_VER_40:
>  		/* This version was reported to have issues with resume
>  		 * from suspend when using MSI-X
>  		 */
>  		flags = PCI_IRQ_LEGACY | PCI_IRQ_MSI;
> -	} else {
> +		break;
> +	default:
>  		flags = PCI_IRQ_ALL_TYPES;
>  	}
>  
> 
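
For clarity, the resulting selection logic can be modelled outside the driver as below. This is a sketch under assumptions: the flag values mirror the PCI_IRQ_* bits in include/linux/pci.h, but mac_version is simplified to the number N from RTL_GIGA_MAC_VER_N (not the driver's actual enum values), and plain comparisons replace the patch's case ranges so that it compiles as standard C.

```c
/* Bit values mirror the PCI_IRQ_* flags in include/linux/pci.h. */
#define TOY_PCI_IRQ_LEGACY	(1u << 0)
#define TOY_PCI_IRQ_MSI		(1u << 1)
#define TOY_PCI_IRQ_MSIX	(1u << 2)
#define TOY_PCI_IRQ_ALL_TYPES \
	(TOY_PCI_IRQ_LEGACY | TOY_PCI_IRQ_MSI | TOY_PCI_IRQ_MSIX)

/* mac_version: hypothetical encoding, the N in RTL_GIGA_MAC_VER_N. */
static unsigned int toy_irq_flags(int mac_version)
{
	/* Oldest chips: legacy interrupts only. */
	if (mac_version >= 1 && mac_version <= 6)
		return TOY_PCI_IRQ_LEGACY;

	/* Versions reported to have resume issues with MSI-X. */
	if (mac_version == 39 || mac_version == 40)
		return TOY_PCI_IRQ_LEGACY | TOY_PCI_IRQ_MSI;

	return TOY_PCI_IRQ_ALL_TYPES;
}
```

The only behavioural change the patch makes is that version 39 now falls into the same bucket as version 40, i.e. MSI-X is no longer offered for it.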

^ permalink raw reply

* Re: ANNOUNCE: pahole v1.12 (BTF edition)
From: Jan Engelhardt @ 2018-08-16 21:49 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: dwarves, Dodji Seketeli,
	Linux Networking Development Mailing List,
	Linux Kernel Mailing List
In-Reply-To: <20180816200942.GA19939@kernel.org>

On Thursday 2018-08-16 22:09, Arnaldo Carvalho de Melo wrote:

>	After a long time without announces, here is pahole 1.12,
>available at:
>
>	https://fedorapeople.org/~acme/dwarves/dwarves-1.12.tar.bz2
>
>	git://git.kernel.org/pub/scm/devel/pahole/pahole.git	
>
>	Some distros haven't picked 1.11, that comes with several
>goodies, my bad for not having announced it at that time more widely,

Missing announcements can be forgiven. But there are automatic tools
that scrape the web for updates (usually by scanning the enclosing
directory of the last known URL), so uploads are essential.
Since 1.11 was never uploaded, it never found its way into distros.
(One could grab a tarball that gitweb generated from the tag,
but only if one knew there was a 1.11 in the first place.)


Can we have signatures for the release tarballs?
(Only if you think it's worth having.)


>Please report any problems to me, I'll try and get problems fixed.

Here's one (or six):


$ cat x.cpp 
#include <utility>
struct F {
        template<typename T, typename... A> F(T &, T &&, A &&...x) { }
        F clone() const && { int q; return F(q, 3, 4); }
        int xpub() { return xprot(); }
        protected:
        int xprot() { return xpriv(); }
        private:
        int xpriv() { return 0; }
};
int z;
F f(z,2,3,4);
int main()
{
        f.xpub();
        std::move(f).clone();
}


$ g++-7 x.cpp -c -ggdb3 -Wall && pahole x.o
die__process_function: tag not supported 0x2f (template_type_parameter)!
//expected: handle type
die__process_function: tag not supported 0x4107 (GNU_template_parameter_pack)!
//expected: handle type
die__process_function: tag not supported 0x4108 (GNU_formal_parameter_pack)!
//expected: handle type
ftype__recode_dwarf_types: couldn't find 0x321 abstract_origin for 0x397 (formal_parameter)!
//expected: handle type
ftype__recode_dwarf_types: couldn't find 0x326 abstract_origin for 0x39f (formal_parameter)!
ftype__recode_dwarf_types: couldn't find 0x3e0 abstract_origin for 0x447 (formal_parameter)!
struct F {
        class F clone(const class F  *);
	//expected: "struct F clone(const struct F *&&);"

        int xpub(class F *);

protected:

        int xprot(class F *);

private:

        int xpriv(class F *);

//expected: "public:"

        void F<int, int, int>(class F *, int &, , , );
	//expected: "void F<int, int, int>(struct F *, int &, int &&, int &&, int &&);"

        void F<int, int>(class F *, int &, , );
	//expected: "void F<int, int>(struct F *, int &, int &&, int &&);"

        /* size: 1, cachelines: 0, members: 0 */
        /* last cacheline: 1 bytes */
};

^ permalink raw reply

* Re: [PATCH ethtool v2 0/3] ethtool: Wake-on-LAN using filters
From: John W. Linville @ 2018-08-16 18:32 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: netdev, davem, andrew
In-Reply-To: <20180809180402.19430-1-f.fainelli@gmail.com>

On Thu, Aug 09, 2018 at 11:03:59AM -0700, Florian Fainelli wrote:
> Hi John,
> 
> This patch series syncs up ethtool-copy.h to get the new definitions
> required for supporting wake-on-LAN using filters: WAKE_FILTER and
> RX_CLS_FLOW_WAKE and then updates the rxclass.c code to allow us to
> specify action -2 (RX_CLS_FLOW_WAKE).
> 
> Let me know if you would like this to be done differently.
> 
> Thanks!
> 
> Changes in v2:
> 
> - properly put the man page hunk describing action -2 into patch #3
> 
> Florian Fainelli (3):
>   ethtool-copy.h: sync with net-next
>   ethtool: Add support for WAKE_FILTER (WoL using filters)
>   ethtool: Add support for action value -2 (wake-up filter)
> 
>  ethtool-copy.h | 15 +++++++++++----
>  ethtool.8.in   |  4 +++-
>  ethtool.c      |  5 +++++
>  rxclass.c      |  8 +++++---
>  4 files changed, 24 insertions(+), 8 deletions(-)

Thanks, Florian -- LGTM!

Patches merged and pushed-out, queued for next release (probably next week)...

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply

