Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next 05/11] tuntap: tweak on the path of non-xdp case in tun_build_skb()
From: Michael S. Tsirkin @ 2018-09-06 17:16 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20180906040526.22518-6-jasowang@redhat.com>

On Thu, Sep 06, 2018 at 12:05:20PM +0800, Jason Wang wrote:
> If we're sure not to go native XDP, there's no need for several things
> like bh and rcu stuffs.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

True...

> ---
>  drivers/net/tun.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index f8cdcfa392c3..389aa0727cc6 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1675,10 +1675,12 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  	 * of xdp_prog above, this should be rare and for simplicity
>  	 * we do XDP on skb in case the headroom is not enough.
>  	 */
> -	if (hdr->gso_type || !xdp_prog)
> +	if (hdr->gso_type || !xdp_prog) {
>  		*skb_xdp = 1;
> -	else
> -		*skb_xdp = 0;
> +		goto build;
> +	}
> +
> +	*skb_xdp = 0;
>  
>  	local_bh_disable();
>  	rcu_read_lock();
> @@ -1724,6 +1726,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  	rcu_read_unlock();
>  	local_bh_enable();
>  
> +build:

But this is spaghetti code. Please just put common
code into functions and call them, don't goto.

>  	skb = build_skb(buf, buflen);
>  	if (!skb) {
>  		skb = ERR_PTR(-ENOMEM);
> -- 
> 2.17.1

^ permalink raw reply

* Re: [PATCH net-next 04/11] tuntap: simplify error handling in tun_build_skb()
From: Michael S. Tsirkin @ 2018-09-06 17:14 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20180906040526.22518-5-jasowang@redhat.com>

On Thu, Sep 06, 2018 at 12:05:19PM +0800, Jason Wang wrote:
> There's no need to duplicate page get logic in each action. So this
> patch tries to get page and calculate the offset before processing XDP
> actions, and undo them when meet errors (we don't care the performance
> on errors). This will be used for factoring out XDP logic.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

I see some issues with this one.

> ---
>  drivers/net/tun.c | 37 ++++++++++++++++---------------------
>  1 file changed, 16 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 372caf7d67d9..f8cdcfa392c3 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1642,7 +1642,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  				     int len, int *skb_xdp)
>  {
>  	struct page_frag *alloc_frag = &current->task_frag;
> -	struct sk_buff *skb;
> +	struct sk_buff *skb = NULL;
>  	struct bpf_prog *xdp_prog;
>  	int buflen = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>  	unsigned int delta = 0;
> @@ -1668,6 +1668,9 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  	if (copied != len)
>  		return ERR_PTR(-EFAULT);
>  
> +	get_page(alloc_frag->page);
> +	alloc_frag->offset += buflen;
> +

This adds an atomic op on XDP_DROP which is a data path
operation for some workloads.

>  	/* There's a small window that XDP may be set after the check
>  	 * of xdp_prog above, this should be rare and for simplicity
>  	 * we do XDP on skb in case the headroom is not enough.
> @@ -1695,23 +1698,15 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  
>  		switch (act) {
>  		case XDP_REDIRECT:
> -			get_page(alloc_frag->page);
> -			alloc_frag->offset += buflen;
>  			err = xdp_do_redirect(tun->dev, &xdp, xdp_prog);
>  			xdp_do_flush_map();
>  			if (err)
> -				goto err_redirect;
> -			rcu_read_unlock();
> -			local_bh_enable();
> -			return NULL;
> +				goto err_xdp;
> +			goto out;
>  		case XDP_TX:
> -			get_page(alloc_frag->page);
> -			alloc_frag->offset += buflen;
>  			if (tun_xdp_tx(tun->dev, &xdp) < 0)
> -				goto err_redirect;
> -			rcu_read_unlock();
> -			local_bh_enable();
> -			return NULL;
> +				goto err_xdp;
> +			goto out;
>  		case XDP_PASS:
>  			delta = orig_data - xdp.data;
>  			len = xdp.data_end - xdp.data;
> @@ -1730,23 +1725,23 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  	local_bh_enable();
>  
>  	skb = build_skb(buf, buflen);
> -	if (!skb)
> -		return ERR_PTR(-ENOMEM);
> +	if (!skb) {
> +		skb = ERR_PTR(-ENOMEM);
> +		goto out;

So goto out will skip put_page, and we did
do get_page above. Seems wrong. You should
goto err_skb or something like this.


> +	}
>  
>  	skb_reserve(skb, pad - delta);
>  	skb_put(skb, len);
> -	get_page(alloc_frag->page);
> -	alloc_frag->offset += buflen;
>  
>  	return skb;
>  
> -err_redirect:
> -	put_page(alloc_frag->page);
>  err_xdp:
> +	alloc_frag->offset -= buflen;
> +	put_page(alloc_frag->page);
> +out:

Out here isn't an error at all, is it?  You should not mix return and
error handling IMHO.



>  	rcu_read_unlock();
>  	local_bh_enable();
> -	this_cpu_inc(tun->pcpu_stats->rx_dropped);

Doesn't this break rx_dropped accounting?

> -	return NULL;
> +	return skb;
>  }
>  
>  /* Get packet from user space buffer */
> -- 
> 2.17.1

^ permalink raw reply

* [PATCH net-next] net: stmmac: Enable TC Ops for GMAC >= 4
From: Jose Abreu @ 2018-09-06 12:29 UTC (permalink / raw)
  To: netdev
  Cc: Jose Abreu, David S. Miller, Joao Pinto, Giuseppe Cavallaro,
	Alexandre Torgue

GMAC >= 4 also supports CBS. Lets enable the TC Ops for these versions.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/hwif.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.c b/drivers/net/ethernet/stmicro/stmmac/hwif.c
index d9a34a4d08b3..81b966a8261b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.c
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.c
@@ -133,7 +133,7 @@ static const struct stmmac_hwif_entry {
 		.mac = &dwmac4_ops,
 		.hwtimestamp = &stmmac_ptp,
 		.mode = NULL,
-		.tc = NULL,
+		.tc = &dwmac510_tc_ops,
 		.setup = dwmac4_setup,
 		.quirks = stmmac_dwmac4_quirks,
 	}, {
@@ -150,7 +150,7 @@ static const struct stmmac_hwif_entry {
 		.mac = &dwmac410_ops,
 		.hwtimestamp = &stmmac_ptp,
 		.mode = &dwmac4_ring_mode_ops,
-		.tc = NULL,
+		.tc = &dwmac510_tc_ops,
 		.setup = dwmac4_setup,
 		.quirks = NULL,
 	}, {
@@ -167,7 +167,7 @@ static const struct stmmac_hwif_entry {
 		.mac = &dwmac410_ops,
 		.hwtimestamp = &stmmac_ptp,
 		.mode = &dwmac4_ring_mode_ops,
-		.tc = NULL,
+		.tc = &dwmac510_tc_ops,
 		.setup = dwmac4_setup,
 		.quirks = NULL,
 	}, {
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH net-next 03/11] tuntap: enable bh early during processing XDP
From: Michael S. Tsirkin @ 2018-09-06 17:02 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20180906040526.22518-4-jasowang@redhat.com>

On Thu, Sep 06, 2018 at 12:05:18PM +0800, Jason Wang wrote:
> This patch move the bh enabling a little bit earlier, this will be
> used for factoring out the core XDP logic of tuntap.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

no reason to disable bh for non-xdp things.

> ---
>  drivers/net/tun.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index d3677a544b56..372caf7d67d9 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1726,22 +1726,18 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  			goto err_xdp;
>  		}
>  	}
> +	rcu_read_unlock();
> +	local_bh_enable();
>  
>  	skb = build_skb(buf, buflen);
> -	if (!skb) {
> -		rcu_read_unlock();
> -		local_bh_enable();
> +	if (!skb)
>  		return ERR_PTR(-ENOMEM);
> -	}
>  
>  	skb_reserve(skb, pad - delta);
>  	skb_put(skb, len);
>  	get_page(alloc_frag->page);
>  	alloc_frag->offset += buflen;
>  
> -	rcu_read_unlock();
> -	local_bh_enable();
> -
>  	return skb;
>  
>  err_redirect:
> -- 
> 2.17.1

^ permalink raw reply

* Re: [PATCH net-next 02/11] tuntap: switch to use XDP_PACKET_HEADROOM
From: Michael S. Tsirkin @ 2018-09-06 16:57 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20180906040526.22518-3-jasowang@redhat.com>

On Thu, Sep 06, 2018 at 12:05:17PM +0800, Jason Wang wrote:
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

any idea why didn't we do this straight away?

> ---
>  drivers/net/tun.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 2c548bd20393..d3677a544b56 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -113,7 +113,6 @@ do {								\
>  } while (0)
>  #endif
>  
> -#define TUN_HEADROOM 256
>  #define TUN_RX_PAD (NET_IP_ALIGN + NET_SKB_PAD)
>  
>  /* TUN device flags */
> @@ -1654,7 +1653,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
>  	rcu_read_lock();
>  	xdp_prog = rcu_dereference(tun->xdp_prog);
>  	if (xdp_prog)
> -		pad += TUN_HEADROOM;
> +		pad += XDP_PACKET_HEADROOM;
>  	buflen += SKB_DATA_ALIGN(len + pad);
>  	rcu_read_unlock();
>  
> -- 
> 2.17.1

^ permalink raw reply

* Re: [PATCH net-next 01/11] net: sock: introduce SOCK_XDP
From: Michael S. Tsirkin @ 2018-09-06 16:56 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20180906040526.22518-2-jasowang@redhat.com>

On Thu, Sep 06, 2018 at 12:05:16PM +0800, Jason Wang wrote:
> This patch introduces a new sock flag - SOCK_XDP. This will be used
> for notifying the upper layer that XDP program is attached on the
> lower socket, and requires for extra headroom.
> 
> TUN will be the first user.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

In fact vhost is the 1st user, right? So this can be
pushed out to become patch 10/11.

> ---
>  drivers/net/tun.c  | 19 +++++++++++++++++++
>  include/net/sock.h |  1 +
>  2 files changed, 20 insertions(+)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index ebd07ad82431..2c548bd20393 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -869,6 +869,9 @@ static int tun_attach(struct tun_struct *tun, struct file *file,
>  		tun_napi_init(tun, tfile, napi);
>  	}
>  
> +	if (rtnl_dereference(tun->xdp_prog))
> +		sock_set_flag(&tfile->sk, SOCK_XDP);
> +
>  	tun_set_real_num_queues(tun);
>  
>  	/* device is allowed to go away first, so no need to hold extra
> @@ -1241,13 +1244,29 @@ static int tun_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>  		       struct netlink_ext_ack *extack)
>  {
>  	struct tun_struct *tun = netdev_priv(dev);
> +	struct tun_file *tfile;
>  	struct bpf_prog *old_prog;
> +	int i;
>  
>  	old_prog = rtnl_dereference(tun->xdp_prog);
>  	rcu_assign_pointer(tun->xdp_prog, prog);
>  	if (old_prog)
>  		bpf_prog_put(old_prog);
>  
> +	for (i = 0; i < tun->numqueues; i++) {
> +		tfile = rtnl_dereference(tun->tfiles[i]);
> +		if (prog)
> +			sock_set_flag(&tfile->sk, SOCK_XDP);
> +		else
> +			sock_reset_flag(&tfile->sk, SOCK_XDP);
> +	}
> +	list_for_each_entry(tfile, &tun->disabled, next) {
> +		if (prog)
> +			sock_set_flag(&tfile->sk, SOCK_XDP);
> +		else
> +			sock_reset_flag(&tfile->sk, SOCK_XDP);
> +	}
> +
>  	return 0;
>  }
>  
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 433f45fc2d68..38cae35f6e16 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -800,6 +800,7 @@ enum sock_flags {
>  	SOCK_SELECT_ERR_QUEUE, /* Wake select on error queue */
>  	SOCK_RCU_FREE, /* wait rcu grace period in sk_destruct() */
>  	SOCK_TXTIME,
> +	SOCK_XDP, /* XDP is attached */
>  };
>  
>  #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
> -- 
> 2.17.1

^ permalink raw reply

* Re: [PATCH net-next 08/11] tun: switch to new type of msg_control
From: Michael S. Tsirkin @ 2018-09-06 16:54 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20180906040526.22518-9-jasowang@redhat.com>

On Thu, Sep 06, 2018 at 12:05:23PM +0800, Jason Wang wrote:
> This patch introduces to a new tun/tap specific msg_control:
> 
> #define TUN_MSG_UBUF 1
> #define TUN_MSG_PTR  2
> struct tun_msg_ctl {
>        int type;
>        void *ptr;
> };
> 
> This allows us to pass different kinds of msg_control through
> sendmsg(). The first supported type is ubuf (TUN_MSG_UBUF) which will
> be used by the existed vhost_net zerocopy code. The second is XDP
> buff, which allows vhost_net to pass XDP buff to TUN. This could be
> used to implement accepting an array of XDP buffs from vhost_net in
> the following patches.
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>

At this point, do we want to just add a new sock opt for tap's
benefit? Seems cleaner than (ab)using sendmsg.

> ---
>  drivers/net/tap.c      | 18 ++++++++++++------
>  drivers/net/tun.c      |  6 +++++-
>  drivers/vhost/net.c    |  7 +++++--
>  include/linux/if_tun.h |  7 +++++++
>  4 files changed, 29 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/tap.c b/drivers/net/tap.c
> index f0f7cd977667..7996ed7cbf18 100644
> --- a/drivers/net/tap.c
> +++ b/drivers/net/tap.c
> @@ -619,7 +619,7 @@ static inline struct sk_buff *tap_alloc_skb(struct sock *sk, size_t prepad,
>  #define TAP_RESERVE HH_DATA_OFF(ETH_HLEN)
>  
>  /* Get packet from user space buffer */
> -static ssize_t tap_get_user(struct tap_queue *q, struct msghdr *m,
> +static ssize_t tap_get_user(struct tap_queue *q, void *msg_control,
>  			    struct iov_iter *from, int noblock)
>  {
>  	int good_linear = SKB_MAX_HEAD(TAP_RESERVE);
> @@ -663,7 +663,7 @@ static ssize_t tap_get_user(struct tap_queue *q, struct msghdr *m,
>  	if (unlikely(len < ETH_HLEN))
>  		goto err;
>  
> -	if (m && m->msg_control && sock_flag(&q->sk, SOCK_ZEROCOPY)) {
> +	if (msg_control && sock_flag(&q->sk, SOCK_ZEROCOPY)) {
>  		struct iov_iter i;
>  
>  		copylen = vnet_hdr.hdr_len ?
> @@ -724,11 +724,11 @@ static ssize_t tap_get_user(struct tap_queue *q, struct msghdr *m,
>  	tap = rcu_dereference(q->tap);
>  	/* copy skb_ubuf_info for callback when skb has no error */
>  	if (zerocopy) {
> -		skb_shinfo(skb)->destructor_arg = m->msg_control;
> +		skb_shinfo(skb)->destructor_arg = msg_control;
>  		skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY;
>  		skb_shinfo(skb)->tx_flags |= SKBTX_SHARED_FRAG;
> -	} else if (m && m->msg_control) {
> -		struct ubuf_info *uarg = m->msg_control;
> +	} else if (msg_control) {
> +		struct ubuf_info *uarg = msg_control;
>  		uarg->callback(uarg, false);
>  	}
>  
> @@ -1150,7 +1150,13 @@ static int tap_sendmsg(struct socket *sock, struct msghdr *m,
>  		       size_t total_len)
>  {
>  	struct tap_queue *q = container_of(sock, struct tap_queue, sock);
> -	return tap_get_user(q, m, &m->msg_iter, m->msg_flags & MSG_DONTWAIT);
> +	struct tun_msg_ctl *ctl = m->msg_control;
> +
> +	if (ctl && ctl->type != TUN_MSG_UBUF)
> +		return -EINVAL;
> +
> +	return tap_get_user(q, ctl ? ctl->ptr : NULL, &m->msg_iter,
> +			    m->msg_flags & MSG_DONTWAIT);
>  }
>  
>  static int tap_recvmsg(struct socket *sock, struct msghdr *m,
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index ff1cbf3ebd50..c839a4bdcbd9 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -2429,11 +2429,15 @@ static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
>  	int ret;
>  	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
>  	struct tun_struct *tun = tun_get(tfile);
> +	struct tun_msg_ctl *ctl = m->msg_control;
>  
>  	if (!tun)
>  		return -EBADFD;
>  
> -	ret = tun_get_user(tun, tfile, m->msg_control, &m->msg_iter,
> +	if (ctl && ctl->type != TUN_MSG_UBUF)
> +		return -EINVAL;
> +
> +	ret = tun_get_user(tun, tfile, ctl ? ctl->ptr : NULL, &m->msg_iter,
>  			   m->msg_flags & MSG_DONTWAIT,
>  			   m->msg_flags & MSG_MORE);
>  	tun_put(tun);
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 4e656f89cb22..fb01ce6d981c 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -620,6 +620,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>  		.msg_controllen = 0,
>  		.msg_flags = MSG_DONTWAIT,
>  	};
> +	struct tun_msg_ctl ctl;
>  	size_t len, total_len = 0;
>  	int err;
>  	struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
> @@ -664,8 +665,10 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
>  			ubuf->ctx = nvq->ubufs;
>  			ubuf->desc = nvq->upend_idx;
>  			refcount_set(&ubuf->refcnt, 1);
> -			msg.msg_control = ubuf;
> -			msg.msg_controllen = sizeof(ubuf);
> +			msg.msg_control = &ctl;
> +			ctl.type = TUN_MSG_UBUF;
> +			ctl.ptr = ubuf;
> +			msg.msg_controllen = sizeof(ctl);
>  			ubufs = nvq->ubufs;
>  			atomic_inc(&ubufs->refcount);
>  			nvq->upend_idx = (nvq->upend_idx + 1) % UIO_MAXIOV;
> diff --git a/include/linux/if_tun.h b/include/linux/if_tun.h
> index 3d2996dc7d85..ba46dced1f38 100644
> --- a/include/linux/if_tun.h
> +++ b/include/linux/if_tun.h
> @@ -19,6 +19,13 @@
>  
>  #define TUN_XDP_FLAG 0x1UL
>  
> +#define TUN_MSG_UBUF 1
> +#define TUN_MSG_PTR  2

Looks like TUN_MSG_PTR should be pushed out to a follow-up patch?

> +struct tun_msg_ctl {
> +	int type;
> +	void *ptr;
> +};
> +

type actually includes a size. Why not two short fields then?


>  #if defined(CONFIG_TUN) || defined(CONFIG_TUN_MODULE)
>  struct socket *tun_get_socket(struct file *);
>  struct ptr_ring *tun_get_tx_ring(struct file *file);
> -- 
> 2.17.1

^ permalink raw reply

* Re: [PATCH net-next 11/11] vhost_net: batch submitting XDP buffers to underlayer sockets
From: Michael S. Tsirkin @ 2018-09-06 16:46 UTC (permalink / raw)
  To: Jason Wang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <20180906040526.22518-12-jasowang@redhat.com>

On Thu, Sep 06, 2018 at 12:05:26PM +0800, Jason Wang wrote:
> This patch implements XDP batching for vhost_net. The idea is first to
> try to do userspace copy and build XDP buff directly in vhost. Instead
> of submitting the packet immediately, vhost_net will batch them in an
> array and submit every 64 (VHOST_NET_BATCH) packets to the under layer
> sockets through msg_control of sendmsg().
> 
> When XDP is enabled on the TUN/TAP, TUN/TAP can process XDP inside a
> loop without caring GUP thus it can do batch map flushing. When XDP is
> not enabled or not supported, the underlayer socket need to build skb
> and pass it to network core. The batched packet submission allows us
> to do batching like netif_receive_skb_list() in the future.
> 
> This saves lots of indirect calls for better cache utilization. For
> the case that we can't so batching e.g when sndbuf is limited or
> packet size is too large, we will go for usual one packet per
> sendmsg() way.
> 
> Doing testpmd on various setups gives us:
> 
> Test                /+pps%
> XDP_DROP on TAP     /+44.8%
> XDP_REDIRECT on TAP /+29%
> macvtap (skb)       /+26%
> 
> Netperf tests shows obvious improvements for small packet transmission:
> 
> size/session/+thu%/+normalize%
>    64/     1/   +2%/    0%
>    64/     2/   +3%/   +1%
>    64/     4/   +7%/   +5%
>    64/     8/   +8%/   +6%
>   256/     1/   +3%/    0%
>   256/     2/  +10%/   +7%
>   256/     4/  +26%/  +22%
>   256/     8/  +27%/  +23%
>   512/     1/   +3%/   +2%
>   512/     2/  +19%/  +14%
>   512/     4/  +43%/  +40%
>   512/     8/  +45%/  +41%
>  1024/     1/   +4%/    0%
>  1024/     2/  +27%/  +21%
>  1024/     4/  +38%/  +73%
>  1024/     8/  +15%/  +24%
>  2048/     1/  +10%/   +7%
>  2048/     2/  +16%/  +12%
>  2048/     4/    0%/   +2%
>  2048/     8/    0%/   +2%
>  4096/     1/  +36%/  +60%
>  4096/     2/  -11%/  -26%
>  4096/     4/    0%/  +14%
>  4096/     8/    0%/   +4%
> 16384/     1/   -1%/   +5%
> 16384/     2/    0%/   +2%
> 16384/     4/    0%/   -3%
> 16384/     8/    0%/   +4%
> 65535/     1/    0%/  +10%
> 65535/     2/    0%/   +8%
> 65535/     4/    0%/   +1%
> 65535/     8/    0%/   +3%
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vhost/net.c | 164 ++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 151 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index fb01ce6d981c..1dd4239cbff8 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -116,6 +116,7 @@ struct vhost_net_virtqueue {
>  	 * For RX, number of batched heads
>  	 */
>  	int done_idx;
> +	int batched_xdp;

Pls add a comment documenting what does this new field do.

>  	/* an array of userspace buffers info */
>  	struct ubuf_info *ubuf_info;
>  	/* Reference counting for outstanding ubufs.
> @@ -123,6 +124,7 @@ struct vhost_net_virtqueue {
>  	struct vhost_net_ubuf_ref *ubufs;
>  	struct ptr_ring *rx_ring;
>  	struct vhost_net_buf rxq;
> +	struct xdp_buff xdp[VHOST_NET_BATCH];

This is a bit too much to have inline. 64 entries 48 bytes each, and we
have 2 of these structures, that's already >4K.

>  };
>  
>  struct vhost_net {
> @@ -338,6 +340,11 @@ static bool vhost_sock_zcopy(struct socket *sock)
>  		sock_flag(sock->sk, SOCK_ZEROCOPY);
>  }
>  
> +static bool vhost_sock_xdp(struct socket *sock)
> +{
> +	return sock_flag(sock->sk, SOCK_XDP);

what if an xdp program is attached while this processing
is going on? Flag value will be wrong - is this safe
and why? Pls add a comment.

> +}
> +
>  /* In case of DMA done not in order in lower device driver for some reason.
>   * upend_idx is used to track end of used idx, done_idx is used to track head
>   * of used idx. Once lower device DMA done contiguously, we will signal KVM
> @@ -444,10 +451,36 @@ static void vhost_net_signal_used(struct vhost_net_virtqueue *nvq)
>  	nvq->done_idx = 0;
>  }
>  
> +static void vhost_tx_batch(struct vhost_net *net,
> +			   struct vhost_net_virtqueue *nvq,
> +			   struct socket *sock,
> +			   struct msghdr *msghdr)
> +{
> +	struct tun_msg_ctl ctl = {
> +		.type = nvq->batched_xdp << 16 | TUN_MSG_PTR,
> +		.ptr = nvq->xdp,
> +	};
> +	int err;
> +
> +	if (nvq->batched_xdp == 0)
> +		goto signal_used;
> +
> +	msghdr->msg_control = &ctl;
> +	err = sock->ops->sendmsg(sock, msghdr, 0);
> +	if (unlikely(err < 0)) {
> +		vq_err(&nvq->vq, "Fail to batch sending packets\n");
> +		return;
> +	}
> +
> +signal_used:
> +	vhost_net_signal_used(nvq);
> +	nvq->batched_xdp = 0;
> +}
> +
>  static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
>  				    struct vhost_net_virtqueue *nvq,
>  				    unsigned int *out_num, unsigned int *in_num,
> -				    bool *busyloop_intr)
> +				    struct msghdr *msghdr, bool *busyloop_intr)
>  {
>  	struct vhost_virtqueue *vq = &nvq->vq;
>  	unsigned long uninitialized_var(endtime);
> @@ -455,8 +488,9 @@ static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
>  				  out_num, in_num, NULL, NULL);
>  
>  	if (r == vq->num && vq->busyloop_timeout) {
> +		/* Flush batched packets first */
>  		if (!vhost_sock_zcopy(vq->private_data))
> -			vhost_net_signal_used(nvq);
> +			vhost_tx_batch(net, nvq, vq->private_data, msghdr);
>  		preempt_disable();
>  		endtime = busy_clock() + vq->busyloop_timeout;
>  		while (vhost_can_busy_poll(endtime)) {
> @@ -512,7 +546,7 @@ static int get_tx_bufs(struct vhost_net *net,
>  	struct vhost_virtqueue *vq = &nvq->vq;
>  	int ret;
>  
> -	ret = vhost_net_tx_get_vq_desc(net, nvq, out, in, busyloop_intr);
> +	ret = vhost_net_tx_get_vq_desc(net, nvq, out, in, msg, busyloop_intr);
>  
>  	if (ret < 0 || ret == vq->num)
>  		return ret;
> @@ -540,6 +574,83 @@ static bool tx_can_batch(struct vhost_virtqueue *vq, size_t total_len)
>  	       !vhost_vq_avail_empty(vq->dev, vq);
>  }
>  
> +#define VHOST_NET_RX_PAD (NET_IP_ALIGN + NET_SKB_PAD)

I wonder whether NET_IP_ALIGN make sense for XDP.

> +
> +static int vhost_net_build_xdp(struct vhost_net_virtqueue *nvq,
> +			       struct iov_iter *from)
> +{
> +	struct vhost_virtqueue *vq = &nvq->vq;
> +	struct socket *sock = vq->private_data;
> +	struct page_frag *alloc_frag = &current->task_frag;
> +	struct virtio_net_hdr *gso;
> +	struct xdp_buff *xdp = &nvq->xdp[nvq->batched_xdp];
> +	size_t len = iov_iter_count(from);
> +	int headroom = vhost_sock_xdp(sock) ? XDP_PACKET_HEADROOM : 0;
> +	int buflen = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> +	int pad = SKB_DATA_ALIGN(VHOST_NET_RX_PAD + headroom + nvq->sock_hlen);
> +	int sock_hlen = nvq->sock_hlen;
> +	void *buf;
> +	int copied;
> +
> +	if (unlikely(len < nvq->sock_hlen))
> +		return -EFAULT;
> +
> +	if (SKB_DATA_ALIGN(len + pad) +
> +	    SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) > PAGE_SIZE)
> +		return -ENOSPC;
> +
> +	buflen += SKB_DATA_ALIGN(len + pad);
> +	alloc_frag->offset = ALIGN((u64)alloc_frag->offset, SMP_CACHE_BYTES);
> +	if (unlikely(!skb_page_frag_refill(buflen, alloc_frag, GFP_KERNEL)))
> +		return -ENOMEM;
> +
> +	buf = (char *)page_address(alloc_frag->page) + alloc_frag->offset;
> +
> +	/* We store two kinds of metadata in the header which will be
> +	 * used for XDP_PASS to do build_skb():
> +	 * offset 0: buflen
> +	 * offset sizeof(int): vnet header

Define a struct for the metadata then?


> +	 */
> +	copied = copy_page_from_iter(alloc_frag->page,
> +				     alloc_frag->offset + sizeof(int),
> +				     sock_hlen, from);
> +	if (copied != sock_hlen)
> +		return -EFAULT;
> +
> +	gso = (struct virtio_net_hdr *)(buf + sizeof(int));
> +
> +	if ((gso->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
> +	    vhost16_to_cpu(vq, gso->csum_start) +
> +	    vhost16_to_cpu(vq, gso->csum_offset) + 2 >
> +	    vhost16_to_cpu(vq, gso->hdr_len)) {
> +		gso->hdr_len = cpu_to_vhost16(vq,
> +			       vhost16_to_cpu(vq, gso->csum_start) +
> +			       vhost16_to_cpu(vq, gso->csum_offset) + 2);
> +
> +		if (vhost16_to_cpu(vq, gso->hdr_len) > len)
> +			return -EINVAL;
> +	}
> +
> +	len -= sock_hlen;
> +	copied = copy_page_from_iter(alloc_frag->page,
> +				     alloc_frag->offset + pad,
> +				     len, from);
> +	if (copied != len)
> +		return -EFAULT;
> +
> +	xdp->data_hard_start = buf;
> +	xdp->data = buf + pad;
> +	xdp->data_end = xdp->data + len;
> +	*(int *)(xdp->data_hard_start) = buflen;
> +
> +	get_page(alloc_frag->page);
> +	alloc_frag->offset += buflen;
> +
> +	++nvq->batched_xdp;
> +
> +	return 0;
> +}
> +
>  static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
>  {
>  	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
> @@ -556,10 +667,14 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
>  	size_t len, total_len = 0;
>  	int err;
>  	int sent_pkts = 0;
> +	bool bulking = (sock->sk->sk_sndbuf == INT_MAX);

What does bulking mean?

>  
>  	for (;;) {
>  		bool busyloop_intr = false;
>  
> +		if (nvq->done_idx == VHOST_NET_BATCH)
> +			vhost_tx_batch(net, nvq, sock, &msg);
> +
>  		head = get_tx_bufs(net, nvq, &msg, &out, &in, &len,
>  				   &busyloop_intr);
>  		/* On error, stop handling until the next kick. */
> @@ -577,14 +692,34 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
>  			break;
>  		}
>  
> -		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
> -		vq->heads[nvq->done_idx].len = 0;
> -
>  		total_len += len;
> -		if (tx_can_batch(vq, total_len))
> -			msg.msg_flags |= MSG_MORE;
> -		else
> -			msg.msg_flags &= ~MSG_MORE;
> +
> +		/* For simplicity, TX batching is only enabled if
> +		 * sndbuf is unlimited.

What if sndbuf changes while this processing is going on?

> +		 */
> +		if (bulking) {
> +			err = vhost_net_build_xdp(nvq, &msg.msg_iter);
> +			if (!err) {
> +				goto done;
> +			} else if (unlikely(err != -ENOSPC)) {
> +				vhost_tx_batch(net, nvq, sock, &msg);
> +				vhost_discard_vq_desc(vq, 1);
> +				vhost_net_enable_vq(net, vq);
> +				break;
> +			}
> +
> +			/* We can't build XDP buff, go for single
> +			 * packet path but let's flush batched
> +			 * packets.
> +			 */
> +			vhost_tx_batch(net, nvq, sock, &msg);
> +			msg.msg_control = NULL;
> +		} else {
> +			if (tx_can_batch(vq, total_len))
> +				msg.msg_flags |= MSG_MORE;
> +			else
> +				msg.msg_flags &= ~MSG_MORE;
> +		}
>  
>  		/* TODO: Check specific error and bomb out unless ENOBUFS? */
>  		err = sock->ops->sendmsg(sock, &msg, len);
> @@ -596,15 +731,17 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
>  		if (err != len)
>  			pr_debug("Truncated TX packet: len %d != %zd\n",
>  				 err, len);
> -		if (++nvq->done_idx >= VHOST_NET_BATCH)
> -			vhost_net_signal_used(nvq);
> +done:
> +		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
> +		vq->heads[nvq->done_idx].len = 0;
> +		++nvq->done_idx;
>  		if (vhost_exceeds_weight(++sent_pkts, total_len)) {
>  			vhost_poll_queue(&vq->poll);
>  			break;
>  		}
>  	}
>  
> -	vhost_net_signal_used(nvq);
> +	vhost_tx_batch(net, nvq, sock, &msg);
>  }
>  
>  static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
> @@ -1111,6 +1248,7 @@ static int vhost_net_open(struct inode *inode, struct file *f)
>  		n->vqs[i].ubuf_info = NULL;
>  		n->vqs[i].upend_idx = 0;
>  		n->vqs[i].done_idx = 0;
> +		n->vqs[i].batched_xdp = 0;
>  		n->vqs[i].vhost_hlen = 0;
>  		n->vqs[i].sock_hlen = 0;
>  		n->vqs[i].rx_ring = NULL;
> -- 
> 2.17.1

^ permalink raw reply

* KASAN: use-after-free Read in _decode_session6
From: syzbot @ 2018-09-06 16:41 UTC (permalink / raw)
  To: davem, herbert, kuznet, linux-kernel, netdev, steffen.klassert,
	syzkaller-bugs, yoshfuji

Hello,

syzbot found the following crash on:

HEAD commit:    b36fdc6853a3 Merge tag 'gpio-v4.19-2' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1556a396400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4c7e83258d6e0156
dashboard link: https://syzkaller.appspot.com/bug?extid=e8c1d30881266e47eb33
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=14d42021400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=13d09f1e400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+e8c1d30881266e47eb33@syzkaller.appspotmail.com

audit: type=1400 audit(1536202560.246:9): avc:  denied  { prog_run } for   
pid=4752 comm="syz-executor726"  
scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023  
tcontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 tclass=bpf  
permissive=1
==================================================================
BUG: KASAN: use-after-free in _decode_session6+0x1331/0x14e0  
net/ipv6/xfrm6_policy.c:161
Read of size 1 at addr ffff8801bb87e97f by task syz-executor726/4752

CPU: 0 PID: 4752 Comm: syz-executor726 Not tainted 4.19.0-rc2+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
  print_address_description+0x6c/0x20b mm/kasan/report.c:256
  kasan_report_error mm/kasan/report.c:354 [inline]
  kasan_report.cold.7+0x242/0x30d mm/kasan/report.c:412
  __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430
  _decode_session6+0x1331/0x14e0 net/ipv6/xfrm6_policy.c:161
  __xfrm_decode_session+0x71/0x140 net/xfrm/xfrm_policy.c:2299
  xfrm_decode_session include/net/xfrm.h:1232 [inline]
  vti6_tnl_xmit+0x3fc/0x1bb1 net/ipv6/ip6_vti.c:542
  __netdev_start_xmit include/linux/netdevice.h:4287 [inline]
  netdev_start_xmit include/linux/netdevice.h:4296 [inline]
  xmit_one net/core/dev.c:3216 [inline]
  dev_hard_start_xmit+0x272/0xc10 net/core/dev.c:3232
  __dev_queue_xmit+0x2ab2/0x3870 net/core/dev.c:3802
  dev_queue_xmit+0x17/0x20 net/core/dev.c:3835
  __bpf_tx_skb net/core/filter.c:2012 [inline]
  __bpf_redirect_common net/core/filter.c:2050 [inline]
  __bpf_redirect+0x5b7/0xae0 net/core/filter.c:2057
  ____bpf_clone_redirect net/core/filter.c:2090 [inline]
  bpf_clone_redirect+0x2f6/0x490 net/core/filter.c:2062
  bpf_prog_c39d1ba309a769f7+0xd1e/0x1000

Allocated by task 4752:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
  __do_kmalloc_node mm/slab.c:3682 [inline]
  __kmalloc_node_track_caller+0x47/0x70 mm/slab.c:3696
  __kmalloc_reserve.isra.41+0x3a/0xe0 net/core/skbuff.c:137
  pskb_expand_head+0x230/0x10e0 net/core/skbuff.c:1463
  skb_ensure_writable+0x3dd/0x640 net/core/skbuff.c:5129
  __bpf_try_make_writable net/core/filter.c:1633 [inline]
  bpf_try_make_writable net/core/filter.c:1639 [inline]
  bpf_try_make_head_writable net/core/filter.c:1647 [inline]
  ____bpf_clone_redirect net/core/filter.c:2084 [inline]
  bpf_clone_redirect+0x14a/0x490 net/core/filter.c:2062
  bpf_prog_c39d1ba309a769f7+0xd1e/0x1000

Freed by task 4752:
  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
  set_track mm/kasan/kasan.c:460 [inline]
  __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
  __cache_free mm/slab.c:3498 [inline]
  kfree+0xd9/0x210 mm/slab.c:3813
  skb_free_head+0x99/0xc0 net/core/skbuff.c:550
  skb_release_data+0x6a4/0x880 net/core/skbuff.c:570
  skb_release_all+0x4a/0x60 net/core/skbuff.c:627
  __kfree_skb net/core/skbuff.c:641 [inline]
  kfree_skb+0x19d/0x4e0 net/core/skbuff.c:659
  vti6_tnl_xmit+0x387/0x1bb1 net/ipv6/ip6_vti.c:561
  __netdev_start_xmit include/linux/netdevice.h:4287 [inline]
  netdev_start_xmit include/linux/netdevice.h:4296 [inline]
  xmit_one net/core/dev.c:3216 [inline]
  dev_hard_start_xmit+0x272/0xc10 net/core/dev.c:3232
  __dev_queue_xmit+0x2ab2/0x3870 net/core/dev.c:3802
  dev_queue_xmit+0x17/0x20 net/core/dev.c:3835
  __bpf_tx_skb net/core/filter.c:2012 [inline]
  __bpf_redirect_common net/core/filter.c:2050 [inline]
  __bpf_redirect+0x5b7/0xae0 net/core/filter.c:2057
  ____bpf_clone_redirect net/core/filter.c:2090 [inline]
  bpf_clone_redirect+0x2f6/0x490 net/core/filter.c:2062
  bpf_prog_c39d1ba309a769f7+0xd1e/0x1000

The buggy address belongs to the object at ffff8801bb87e780
  which belongs to the cache kmalloc-512 of size 512
The buggy address is located 511 bytes inside of
  512-byte region [ffff8801bb87e780, ffff8801bb87e980)
The buggy address belongs to the page:
page:ffffea0006ee1f80 count:1 mapcount:0 mapping:ffff8801dac00940 index:0x0
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffffea0006eefd48 ffffea00073f5f08 ffff8801dac00940
raw: 0000000000000000 ffff8801bb87e000 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff8801bb87e800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8801bb87e880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff8801bb87e900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                 ^
  ffff8801bb87e980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff8801bb87ea00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply

* [PATCH] r8169: inform about CLKRUN protocol issue when behind a CardBus bridge
From: Maciej S. Szmigiero @ 2018-09-06 16:10 UTC (permalink / raw)
  To: Realtek linux nic maintainers, David S. Miller; +Cc: netdev, linux-kernel

It turns out that at least some r8169 CardBus cards don't operate correctly
when CLKRUN protocol is enabled - the symptoms are recurring timeouts
during PHY reads / writes and a very high packet drop rate.
This is true of at least RTL8169sc/8110sc (XID 18000000) chip in
Sunrich C-160 CardBus NIC.

Such behavior was observed on two separate laptops, the first one has
TI PCIxx12 CardBus bridge, while the second one has Ricoh RL5c476II.

Setting CLKRUN_En bit in CONFIG 3 register via an EEPROM write didn't
improve things in either case (this is probably why it wasn't set by the
card manufacturer).
The only way to fix the issue was to disable the CLKRUN protocol either
in the CardBus bridge (only possible in the TI one) or in the southbridge.

Since the problem takes some time to debug let's warn people that have
the suspect configuration (Conventional PCI r8169 NIC behind a CardBus
bridge) so they know what they can do if they encounter it.

Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
---
 drivers/net/ethernet/realtek/r8169.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index b08d51bf7a20..b935a18358cb 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7254,6 +7254,22 @@ static int rtl_jumbo_max(struct rtl8169_private *tp)
 	}
 }
 
+static void rtl_pci_cardbus_check(struct pci_dev *pdev)
+{
+	struct pci_dev *parent = pdev;
+
+	while ((parent = pci_upstream_bridge(parent)) != NULL) {
+		if (parent->hdr_type != PCI_HEADER_TYPE_CARDBUS)
+			continue;
+
+		dev_info(&pdev->dev,
+			 "device is behind a CardBus bridge\n");
+		dev_info(&pdev->dev,
+			 "in case of erratic or no operation try disabling CLKRUN protocol in the CardBus bridge or in the southbridge\n");
+		break;
+	}
+}
+
 static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	const struct rtl_cfg_info *cfg = rtl_cfg_infos + ent->driver_data;
@@ -7305,8 +7321,10 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	tp->mmio_addr = pcim_iomap_table(pdev)[region];
 
-	if (!pci_is_pcie(pdev))
+	if (!pci_is_pcie(pdev)) {
 		dev_info(&pdev->dev, "not PCI Express\n");
+		rtl_pci_cardbus_check(pdev);
+	}
 
 	/* Identify chip attached to board */
 	rtl8169_get_mac_version(tp, cfg->default_ver);
-- 
2.17.0

^ permalink raw reply related

* Re: [PATCH 1/1] ath10k: avoid possible memory access violation
From: Kalle Valo @ 2018-09-06 16:04 UTC (permalink / raw)
  To: K.T.VIJAYAKUMAAR
  Cc: netdev, linux-wireless, linux-kernel, ath10k, cpgs, vijay.bvb,
	davem
In-Reply-To: <1533292805-9709-1-git-send-email-vijay.bvb@samsung.com>

"K.T.VIJAYAKUMAAR" <vijay.bvb@samsung.com> wrote:

> array "ctl_power_table" access index "pream" is initialized with -1 and
> is raised as a static analysis tool issue.
> [drivers\net\wireless\ath\ath10k\wmi.c:4719] ->
> [drivers\net\wireless\ath\ath10k\wmi.c:4730]: (error) Array index -1 is
> out of bounds.
> 
> Since the "pream" index for accessing ctl_power_table array is initialized
> with -1, there is a chance of memory access violation for the cases below.
> 1) wmi_pdev_tpc_final_table_event change frequency is between 2483 and 5180
> 2) pream_idx is out of the enumeration ranges of wmi_tpc_pream_2ghz,
> wmi_tpc_pream_5ghz
> 
> Signed-off-by: K.T.VIJAYAKUMAAR <vijay.bvb@samsung.com>
> [kvalo@codeaurora.org: clean up the warning message]
> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>

Patch applied to ath-next branch of ath.git, thanks.

97c69a70dc2c ath10k: avoid possible memory access violation

-- 
https://patchwork.kernel.org/patch/10554929/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH 1/1] ath10k: avoid possible memory access violation
From: Kalle Valo @ 2018-09-06 16:04 UTC (permalink / raw)
  To: K.T.VIJAYAKUMAAR
  Cc: davem, ath10k, linux-wireless, netdev, linux-kernel, cpgs,
	vijay.bvb
In-Reply-To: <1533292805-9709-1-git-send-email-vijay.bvb@samsung.com>

"K.T.VIJAYAKUMAAR" <vijay.bvb@samsung.com> wrote:

> array "ctl_power_table" access index "pream" is initialized with -1 and
> is raised as a static analysis tool issue.
> [drivers\net\wireless\ath\ath10k\wmi.c:4719] ->
> [drivers\net\wireless\ath\ath10k\wmi.c:4730]: (error) Array index -1 is
> out of bounds.
> 
> Since the "pream" index for accessing ctl_power_table array is initialized
> with -1, there is a chance of memory access violation for the cases below.
> 1) wmi_pdev_tpc_final_table_event change frequency is between 2483 and 5180
> 2) pream_idx is out of the enumeration ranges of wmi_tpc_pream_2ghz,
> wmi_tpc_pream_5ghz
> 
> Signed-off-by: K.T.VIJAYAKUMAAR <vijay.bvb@samsung.com>
> [kvalo@codeaurora.org: clean up the warning message]
> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>

Patch applied to ath-next branch of ath.git, thanks.

97c69a70dc2c ath10k: avoid possible memory access violation

-- 
https://patchwork.kernel.org/patch/10554929/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH] [RFC v2] Drop all 00-INDEX files from Documentation/
From: Steven Rostedt @ 2018-09-06 16:01 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Mark Rutland, linux-mips, linux-fbdev, x86, kvm, linux-doc,
	Peter Zijlstra, James Hogan, Henrik Austad, Will Deacon,
	dri-devel, Masahiro Yamada, Jan Kandziora, Paul Mackerras,
	Henrik Austad, Pavel Machek, H. Peter Anvin, Evgeniy Polyakov,
	linux-s390, Ian Kent, linux-security-module, Paul Moore,
	Michael Ellerman, Helge Deller, Radim Krčmář,
	James E.J. Bottomley
In-Reply-To: <20180906095804.5ab2716f@lwn.net>

On Thu, 6 Sep 2018 09:58:04 -0600
Jonathan Corbet <corbet@lwn.net> wrote:

> Thanks,
> 
> jon  (who is increasingly inclined to apply this patch)

As Colin Kaepernick now says... "Just do it!"

;-)

-- Steve
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply

* Re: [PATCH] wil6210: fix unsigned cid comparison with >= 0
From: Kalle Valo @ 2018-09-06 16:00 UTC (permalink / raw)
  To: Gustavo A. R. Silva
  Cc: Maya Erez, David S. Miller, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	wil6210-Rm6X0d1/PG5y9aJCnZT0Uw, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Gustavo A. R. Silva
In-Reply-To: <20180829175018.GA3776-L1vi/lXTdts+Va1GwOuvDg@public.gmane.org>

"Gustavo A. R. Silva" <gustavo-L1vi/lXTdts+Va1GwOuvDg@public.gmane.org> wrote:

> The comparison of cid >= 0 is always true because cid is of type u8
> (8 bits, unsigned).
> 
> Fix this by removing such comparison and updating the type of
> variable cid to u8 in the caller function.
> 
> Addresses-Coverity-ID: 1473079 ("Unsigned compared against 0")
> Fixes: b9010f105f21 ("wil6210: add FT roam support for AP and station")
> Signed-off-by: Gustavo A. R. Silva <gustavo-L1vi/lXTdts+Va1GwOuvDg@public.gmane.org>
> Signed-off-by: Kalle Valo <kvalo-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>

Patch applied to ath-next branch of ath.git, thanks.

49925f247016 wil6210: fix unsigned cid comparison with >= 0

-- 
https://patchwork.kernel.org/patch/10580739/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH] [RFC v2] Drop all 00-INDEX files from Documentation/
From: Jonathan Corbet @ 2018-09-06 15:58 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mark Rutland, linux-mips, linux-fbdev, x86, kvm, linux-doc,
	Peter Zijlstra, James Hogan, Henrik Austad, Will Deacon,
	dri-devel, Masahiro Yamada, Jan Kandziora, Paul Mackerras,
	Henrik Austad, Pavel Machek, H. Peter Anvin, Evgeniy Polyakov,
	linux-s390, Ian Kent, linux-security-module, Paul Moore,
	Michael Ellerman, Helge Deller, Radim Krčmář,
	James E.J. Bottomley
In-Reply-To: <20180904095908.13298b3d@gandalf.local.home>

On Tue, 4 Sep 2018 09:59:08 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 4 Sep 2018 13:30:30 +0200
> Pavel Machek <pavel@ucw.cz> wrote:
> 
> > I'd say this is still quite valueable, and it might be worth fixing,
> > rather then removing completely.  
> 
> I agree. Perhaps we should have a 00-DESCRIPTION file in each
> directory, and each file could start with a:
> 
>  DESCRIPTION: <one line description here>
> 
> and then these files could be generated by those that have these tags.

I really don't want to hack up yet another documentation syntax and
processing scheme.  We already have one that does all of this and more.
That energy would be far better spent bringing the docs into the RST
hierarchy, IMO.

Thanks,

jon  (who is increasingly inclined to apply this patch)
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply

* Re: [PATCH net-next v1] net/tls: Set count of SG entries if sk_alloc_sg returns -ENOSPC
From: Sabrina Dubroca @ 2018-09-06 11:22 UTC (permalink / raw)
  To: Vakul Garg; +Cc: netdev, borisp, aviadye, davejwatson, davem, doronrk
In-Reply-To: <20180905162743.10145-1-vakul.garg@nxp.com>

2018-09-05, 21:57:43 +0530, Vakul Garg wrote:
> tls_sw_sendmsg() allocates plaintext and encrypted SG entries using
> function sk_alloc_sg(). In case the number of SG entries hit
> MAX_SKB_FRAGS, sk_alloc_sg() returns -ENOSPC and sets the variable for
> current SG index to '0'. This leads to calling of function
> tls_push_record() with 'sg_encrypted_num_elem = 0' and later causes
> kernel crash. To fix this, set the number of SG elements to the number
> of elements in plaintext/encrypted SG arrays in case sk_alloc_sg()
> returns -ENOSPC.
> 
> Signed-off-by: Vakul Garg <vakul.garg@nxp.com>

What commit introduced this bug? Can you add Fixes: tag? And if it's a
bug present in the net tree, the fix should go into the net tree as
well.


Another thing I've noticed a few times with your patch submissions:
the clock on the system you're using for git-send-email seems to be
set something like 5 hours and 18 minutes in the future. Could you fix
it?  It makes your email appear out of order.

Date: Wed,  5 Sep 2018 21:57:43 +0530

Received: from lti.ap.freescale.net (14.143.30.134) by
        AM0PR04MB4243.eurprd04.prod.outlook.com (2603:10a6:208:66::29) with Microsoft
        SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
        15.20.1101.17; Wed, 5 Sep 2018 11:09:20 +0000

-- 
Sabrina

^ permalink raw reply

* Re: [PATCH v2 01/17] asm: simd context helper API
From: Jason A. Donenfeld @ 2018-09-06 15:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andrew Lutomirski, LKML, Netdev, David Miller, Greg Kroah-Hartman,
	Samuel Neves, linux-arch, Rik van Riel
In-Reply-To: <alpine.DEB.2.21.1809061542330.1570@nanos.tec.linutronix.de>

Hi Thomas,

On Thu, Sep 6, 2018 at 9:29 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Sat, 1 Sep 2018, Jason A. Donenfeld wrote:
> > On Sat, Sep 1, 2018 at 2:32 PM Andy Lutomirski <luto@kernel.org> wrote:
> > > I tend to think the right approach is to merge Jason's code and then
> > > make it better later.  Even with a totally perfect lazy FPU restore
> > > implementation on x86, we'll probably still need some way of dealing
> > > with SIMD contexts.  I think we're highly unlikely to ever a allow
> > > SIMD usage in all NMI contexts, for example, and there will always be
> > > cases where we specifically don't want to use all available SIMD
> > > capabilities even if we can.  For example, generating random numbers
> > > does crypto, but we probably don't want to do *SIMD* crypto, since
> > > that will force a save and restore and will probably fire up the
> > > AVX512 unit, and that's not worth it unless we're already using it for
> > > some other reason.
> > >
> > > Also, as Rik has discovered, lazy FPU restore is conceptually
> > > straightforward but isn't entirely trivial :)
> >
> > Sounds good. I'll move ahead on this basis.
>
> Fine with me.

Do you want to pull this single patch [01/17] into your tree now, and
then when I submit v3 of WireGuard and such, I can just drop this
patch from it, and then the rest will enter like usual networking
stuff through Dave's tree?

Jason

^ permalink raw reply

* Re: [PATCH net-next] net: sched: change tcf_del_walker() to use concurrent-safe delete
From: Vlad Buslov @ 2018-09-06 11:14 UTC (permalink / raw)
  To: Cong Wang
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
	David Miller
In-Reply-To: <CAM_iQpUY_-itn2oRQ-CEtOxZ-u1sKxCfbYWgPS1v6NTkT-NbSA@mail.gmail.com>


On Wed 05 Sep 2018 at 20:32, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Wed, Sep 5, 2018 at 12:05 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>>
>>
>> On Tue 04 Sep 2018 at 22:41, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> > On Mon, Sep 3, 2018 at 1:33 PM Vlad Buslov <vladbu@mellanox.com> wrote:
>> >>
>> >>
>> >> On Mon 03 Sep 2018 at 18:50, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> >> > On Mon, Sep 3, 2018 at 12:06 AM Vlad Buslov <vladbu@mellanox.com> wrote:
>> >> >>
>> >> >> Action API was changed to work with actions and action_idr in concurrency
>> >> >> safe manner, however tcf_del_walker() still uses actions without taking
>> >> >> reference to them first and deletes them directly, disregarding possible
>> >> >> concurrent delete.
>> >> >>
>> >> >> Change tcf_del_walker() to use tcf_idr_delete_index() that doesn't require
>> >> >> caller to hold reference to action and accepts action id as argument,
>> >> >> instead of direct action pointer.
>> >> >
>> >> > Hmm, why doesn't tcf_del_walker() just take idrinfo->lock? At least
>> >> > tcf_dump_walker() already does.
>> >>
>> >> Because tcf_del_walker() calls __tcf_idr_release(), which take
>> >> idrinfo->lock itself (deadlock). It also calls sleeping functions like
>> >
>> > Deadlock can be easily resolved by moving the lock out.
>> >
>> >
>> >> tcf_action_goto_chain_fini(), so just implementing function that
>> >> releases action without taking idrinfo->lock is not enough.
>> >
>> > Sleeping can be resolved either by making it atomic or
>> > deferring it to a work queue.
>> >
>> > None of your arguments here is a blocker to locking
>> > idrinfo->lock. You really should focus on if it is really
>> > necessary to lock idrinfo->lock in tcf_del_walker(), rather
>> > than these details.
>> >
>> > For me, if you need idrinfo->lock for dump walker, you must
>> > need it for delete walker too, because deletion is a writer
>> > which should require stronger protection than the dumper,
>> > which merely a reader.
>>
>> I don't get how it is necessary. Dump walker uses pointers to actions
>> directly, and in order to be concurrency-safe it must either hold the
>
> It uses the pointer in a read-only way, what you said doesn't change
> the fact that it is a reader. And, like other readers, it may not need
> to lock at all, which is a different topic.
>
>
>> lock or obtain reference to action. Note that del walker doesn't use the
>> action pointer, it only passed action id to tcf_idr_delete_index()
>> function, which does all the necessary locking and can deal with
>> potential concurrency issues (concurrent delete, etc.). This approach
>> also benefits from code reuse from other code paths that delete actions,
>> instead of implementing its own.
>
> Look at the difference below.
>
> With your change:
>
> idr_for_each_entry_ul{
>    spin_lock(&idrinfo->lock);
>    idr_remove();
>    spin_unlock(&idrinfo->lock);
> }
>
> With what I suggest:
>
> spin_lock(&idrinfo->lock);
> idr_for_each_entry_ul{
>    idr_remove();
> }
> spin_unlock(&idrinfo->lock);
>
> Isn't a concurrent tcf_idr_check_alloc() able to livelock here with
> your change?
>
> idr_for_each_entry_ul{
>    spin_lock(&idrinfo->lock);
>    idr_remove();
>    spin_unlock(&idrinfo->lock);
>       // tcf_idr_check_alloc() jumps in,
>      // allocates next ID which can be found
>       // by idr_get_next_ul()
> } // the whole loop goes _literately_ infinite...

idr_for_each_entry_ul traverses idr entries with ascending order of
identifiers, so infinite livelock like this is not possible because it
never goes back to newly added entries with id<current_id.

>
>
> Also, idr_for_each_entry_ul() is supposed to be protected either
> by RCU or idrinfo->lock, no? With your change or without any change,
> it doesn't even have any lock after removing RTNL?

After reading this comment I checked actual idr implementation and I
think you are right. Even though idr_for_each_entry_ul() macro (and
function idr_get_next_ul() that it uses to iterate over idr entries)
doesn't specify any locking requirements in comment description (that is
why this patch doesn't use any), its implementation seems to require
external synchronization.

You suggest I should just hold idrinfo->lock for whole del_walker loop
duration, or play nicely with potential concurrent users and
take/release it per action?

Thanks,
Vlad

^ permalink raw reply

* [PATCH net-next] liquidio CN23XX: Remove set but not used variable 'ring_flag'
From: YueHaibing @ 2018-09-06 11:22 UTC (permalink / raw)
  To: Derek Chickles, Satanand Burla, Felix Manlunas, Raghu Vatsavayi,
	David S. Miller
  Cc: YueHaibing, netdev, kernel-janitors

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c: In function 'cn23xx_setup_octeon_vf_device':
drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c:619:20: warning:
 variable 'ring_flag' set but not used [-Wunused-but-set-variable]

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c b/drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c
index 962bb62..fda4940 100644
--- a/drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c
+++ b/drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c
@@ -616,7 +616,7 @@ static void cn23xx_disable_vf_interrupt(struct octeon_device *oct, u8 intr_flag)
 int cn23xx_setup_octeon_vf_device(struct octeon_device *oct)
 {
 	struct octeon_cn23xx_vf *cn23xx = (struct octeon_cn23xx_vf *)oct->chip;
-	u32 rings_per_vf, ring_flag;
+	u32 rings_per_vf;
 	u64 reg_val;
 
 	if (octeon_map_pci_barx(oct, 0, 0))
@@ -634,8 +634,6 @@ int cn23xx_setup_octeon_vf_device(struct octeon_device *oct)
 
 	rings_per_vf = reg_val & CN23XX_PKT_INPUT_CTL_RPVF_MASK;
 
-	ring_flag = 0;
-
 	cn23xx->conf  = oct_get_config_info(oct, LIO_23XX);
 	if (!cn23xx->conf) {
 		dev_err(&oct->pci_dev->dev, "%s No Config found for CN23XX\n",

^ permalink raw reply related

* Re: [PATCH 2/7] mark root hnode explicitly
From: Jamal Hadi Salim @ 2018-09-06 11:04 UTC (permalink / raw)
  To: Al Viro; +Cc: netdev, Cong Wang, Jiri Pirko
In-Reply-To: <20180906105933.GU19965@ZenIV.linux.org.uk>

On 2018-09-06 6:59 a.m., Al Viro wrote:
> On Thu, Sep 06, 2018 at 06:34:00AM -0400, Jamal Hadi Salim wrote:
>> On 2018-09-06 6:28 a.m., Jamal Hadi Salim wrote:

[..]

> Point, and that one is IMO enough to give up on using ->flags for
> that.  How about simply
> 
> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
> index 3f985f29ef30..d14048e38b5c 100644
> --- a/net/sched/cls_u32.c
> +++ b/net/sched/cls_u32.c
> @@ -84,6 +84,7 @@ struct tc_u_hnode {
>   	int			refcnt;
>   	unsigned int		divisor;
>   	struct idr		handle_idr;
> +	bool			is_root;
>   	struct rcu_head		rcu;
>   	u32			flags;
>   	/* The 'ht' field MUST be the last field in structure to allow for
> @@ -377,6 +378,7 @@ static int u32_init(struct tcf_proto *tp)
>   	root_ht->refcnt++;
>   	root_ht->handle = tp_c ? gen_new_htid(tp_c, root_ht) : 0x80000000;
>   	root_ht->prio = tp->prio;
> +	root_ht->is_root = true;
>   	idr_init(&root_ht->handle_idr);
>   
>   	if (tp_c == NULL) {
> @@ -693,7 +695,7 @@ static int u32_delete(struct tcf_proto *tp, void *arg, bool *last,
>   		goto out;
>   	}
>   
> -	if (root_ht == ht) {
> +	if (ht->is_root) {
>   		NL_SET_ERR_MSG_MOD(extack, "Not allowed to delete root node");
>   		return -EINVAL;
>   	}
> @@ -795,6 +797,10 @@ static int u32_set_parms(struct net *net, struct tcf_proto *tp,
>   				NL_SET_ERR_MSG_MOD(extack, "Link hash table not found");
>   				return -EINVAL;
>   			}
> +			if (ht_down->is_root) {
> +				NL_SET_ERR_MSG_MOD(extack, "Not linking to root node");
> +				return -EINVAL;
> +			}
>   			ht_down->refcnt++;


Looks good to me ;->

cheers,
jamal

^ permalink raw reply

* Re: [PATCH 2/7] mark root hnode explicitly
From: Al Viro @ 2018-09-06 10:59 UTC (permalink / raw)
  To: Jamal Hadi Salim; +Cc: netdev, Cong Wang, Jiri Pirko
In-Reply-To: <5689c19a-c4fb-24c1-4594-86f2639faa98@mojatatu.com>

On Thu, Sep 06, 2018 at 06:34:00AM -0400, Jamal Hadi Salim wrote:
> On 2018-09-06 6:28 a.m., Jamal Hadi Salim wrote:
> > On 2018-09-05 3:04 p.m., Al Viro wrote:
> > > From: Al Viro <viro@zeniv.linux.org.uk>
> > > 
> > > ... and disallow deleting or linking to such
> > > 
> > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > 
> > Same comment as other one in regards to subject
> > 
> > Since the flag space is coming from htnode which is
> > exposed via uapi it makes sense to keep this one here
> > because it is for private use; but  a comment in
> > include/uapi/linux/pkt_cls.h that this flag or
> > maybe a set of bits is reserved for internal use.
> > Otherwise:
> > 
> > Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
> 
> 
> Sorry, additional comment:
> It makes sense to reject user space attempt to
> set TCA_CLS_FLAGS_U32_ROOT

Point, and that one is IMO enough to give up on using ->flags for
that.  How about simply

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 3f985f29ef30..d14048e38b5c 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -84,6 +84,7 @@ struct tc_u_hnode {
 	int			refcnt;
 	unsigned int		divisor;
 	struct idr		handle_idr;
+	bool			is_root;
 	struct rcu_head		rcu;
 	u32			flags;
 	/* The 'ht' field MUST be the last field in structure to allow for
@@ -377,6 +378,7 @@ static int u32_init(struct tcf_proto *tp)
 	root_ht->refcnt++;
 	root_ht->handle = tp_c ? gen_new_htid(tp_c, root_ht) : 0x80000000;
 	root_ht->prio = tp->prio;
+	root_ht->is_root = true;
 	idr_init(&root_ht->handle_idr);
 
 	if (tp_c == NULL) {
@@ -693,7 +695,7 @@ static int u32_delete(struct tcf_proto *tp, void *arg, bool *last,
 		goto out;
 	}
 
-	if (root_ht == ht) {
+	if (ht->is_root) {
 		NL_SET_ERR_MSG_MOD(extack, "Not allowed to delete root node");
 		return -EINVAL;
 	}
@@ -795,6 +797,10 @@ static int u32_set_parms(struct net *net, struct tcf_proto *tp,
 				NL_SET_ERR_MSG_MOD(extack, "Link hash table not found");
 				return -EINVAL;
 			}
+			if (ht_down->is_root) {
+				NL_SET_ERR_MSG_MOD(extack, "Not linking to root node");
+				return -EINVAL;
+			}
 			ht_down->refcnt++;
 		}
 

^ permalink raw reply related

* Re: [PATCH 2/7] mark root hnode explicitly
From: Jamal Hadi Salim @ 2018-09-06 10:42 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko
In-Reply-To: <5689c19a-c4fb-24c1-4594-86f2639faa98@mojatatu.com>

[-- Attachment #1: Type: text/plain, Size: 47 bytes --]


And a bunch of indentations...

cheers,
jamal

[-- Attachment #2: indent.p --]
[-- Type: text/x-pascal, Size: 975 bytes --]

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index 6d45ec4c218c..cb3bee12af78 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -485,7 +485,8 @@ static void u32_clear_hw_hnode(struct tcf_proto *tp, struct tc_u_hnode *h,
 	struct tc_cls_u32_offload cls_u32 = {};
 
 	tc_cls_common_offload_init(&cls_u32.common, tp,
-		h->flags & ~TCA_CLS_FLAGS_U32_ROOT, extack);
+				   h->flags & ~TCA_CLS_FLAGS_U32_ROOT,
+				   extack);
 	cls_u32.command = TC_CLSU32_DELETE_HNODE;
 	cls_u32.hnode.divisor = h->divisor;
 	cls_u32.hnode.handle = h->handle;
@@ -1215,7 +1216,7 @@ static int u32_reoffload_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
 	int err;
 
 	tc_cls_common_offload_init(&cls_u32.common, tp,
-		ht->flags & ~TCA_CLS_FLAGS_U32_ROOT, extack);
+				   ht->flags & ~TCA_CLS_FLAGS_U32_ROOT, extack);
 	cls_u32.command = add ? TC_CLSU32_NEW_HNODE : TC_CLSU32_DELETE_HNODE;
 	cls_u32.hnode.divisor = ht->divisor;
 	cls_u32.hnode.handle = ht->handle;

^ permalink raw reply related

* Re: [PATCH iproute2] tc/mqprio: Print extra info on invalid args.
From: Stephen Hemminger @ 2018-09-06 10:39 UTC (permalink / raw)
  To: Caleb Raitto; +Cc: netdev, jhs, xiyou.wangcong, jiri, Caleb Raitto
In-Reply-To: <20180905202419.238812-1-caleb.raitto@gmail.com>

On Wed,  5 Sep 2018 13:24:19 -0700
Caleb Raitto <caleb.raitto@gmail.com> wrote:

> From: Caleb Raitto <caraitto@google.com>
> 
> Print the name of the argument that wasn't understood, and also print
> the usage string.
> 
> Signed-off-by: Caleb Raitto <caraitto@google.com>


The standard code pattern in iproute2 is to use invarg().
Why not use that?

^ permalink raw reply

* Re: [PATCH] ath9k: add reset for airtime station debugfs
From: Louie Lu @ 2018-09-06 10:41 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: kvalo, ath9k-devel, davem, linux-wireless, netdev
In-Reply-To: <877ejzynz9.fsf@toke.dk>

Previous mail rejects by mailing list, re-send again...


Toke Høiland-Jørgensen <toke@toke.dk> 於 2018年9月6日 週四 下午5:27寫道:
>
> Louie Lu <git@louie.lu> writes:
>
> > Let user can reset station airtime status by debugfs, it will
> > reset all airtime deficit to ATH_AIRTIME_QUANTUM and reset rx/tx
> > airtime accumulate to 0.
>
> No objections to the patch, but I'm curious which issues you were
> debugging that led you to needing it? :)
>
I'm testing to get the packet queue time + airtime in ath_tx_process_buffer,
it would be useful if I can reset the station airtime accumulated
value, so I can observe in each test round (e.g. 5 ping) airtime
accumulated

Also to reset the deficit to make sure it runs like fresh one.

Louie.

>
> -Toke

^ permalink raw reply

* Re: [PATCH 7/7] clean tc_u_common hashtable
From: Jamal Hadi Salim @ 2018-09-06 10:36 UTC (permalink / raw)
  To: Al Viro, netdev; +Cc: Cong Wang, Jiri Pirko
In-Reply-To: <20180905190414.5477-7-viro@ZenIV.linux.org.uk>

On 2018-09-05 3:04 p.m., Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> * calculate key *once*, not for each hash chain element
> * let tc_u_hash() return the pointer to chain head rather than index -
> callers are cleaner that way.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox