Netdev List


* Re: [PATCH net-next 0/2] r8169: enable ASPM on RTL8168E-VL
From: David Miller @ 2018-06-23 11:55 UTC
  To: hkallweit1; +Cc: nic_swsd, netdev
In-Reply-To: <50617290-b86c-f5fd-5d74-87459b0e9a4d@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Sat, 23 Jun 2018 09:49:37 +0200

> This patch series enables ASPM for the RTL8168E-VL, first aligning ASPM
> entry latency handling with the vendor driver.

Looks good to me, series applied, thank you.

* Re: [PATCH] ipv6: avoid copy_from_user() via ipv6_renew_options_kern()
From: David Miller @ 2018-06-23 12:16 UTC
  To: pmoore; +Cc: netdev, selinux, linux-security-module
In-Reply-To: <152970230022.7734.15824980755229329454.stgit@chester>

From: Paul Moore <pmoore@redhat.com>
Date: Fri, 22 Jun 2018 17:18:20 -0400

> From: Paul Moore <paul@paul-moore.com>
> 
> The ipv6_renew_options_kern() function eventually called into
> copy_from_user(), despite it not using any userspace buffers, which
> was problematic as that ended up calling access_ok() which emitted
> a warning on x86 (and likely other arches as well).
> 
>   ipv6_renew_options_kern()
>     ipv6_renew_options()
>       ipv6_renew_option()
>         copy_from_user()
>           _copy_from_user()
>             access_ok()
> 
> The access_ok() check inside _copy_from_user() is obviously the right
> thing to do which means that calling copy_from_user() via
> ipv6_renew_options_kern() is obviously the wrong thing to do.

Ok, I re-read the code around here.

access_ok() is not warning because we are calling copy_from_user()
with a kernel pointer.  The set_fs(KERNEL_DS) adjusts the
user_addr_max() setting, and thus that check passes.

The problem is that we are invoking this from an interrupt, and this
triggers the WARN_ON_IN_IRQ() in access_ok().

Although I think that WARN_ON_IN_IRQ() is completely unnecessary when
KERNEL_DS is set, the situation that really causes this problem is not
at all clear from your commit message.

I guess that for now your fix is fine, but I want you to please adjust
the commit message.

Provide the _full_ annotated kernel backtrace from the warning that
triggers, because this will show the reader that we are in an
interrupt.  And explain that being in the interrupt is strictly what
causes this to warn, not that we are using kernel pointers.  The
latter is 100% valid when set_fs(KERNEL_DS) is performed.

Thank you.
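
The distinction drawn above can be modeled in user space. The following
is a simplified sketch, not the kernel's code: model_access_ok(),
struct model_ctx and the two limit constants are hypothetical names.
It assumes, as the message describes, that the check has two independent
parts: a range test against the current address limit, and a warning
that fires whenever the caller is not in task context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the user and kernel address limits. */
#define MODEL_USER_DS   UINT64_C(0x00007fffffffffff)
#define MODEL_KERNEL_DS UINT64_MAX

struct model_ctx {
	uint64_t addr_limit;	/* what set_fs() would adjust */
	bool in_task;		/* false when running in interrupt context */
	unsigned int warnings;	/* how many times the context check fired */
};

/* The range check and the context warning do not depend on each other. */
static bool model_access_ok(struct model_ctx *c, uint64_t addr, uint64_t size)
{
	assert(c != NULL);

	if (!c->in_task)
		c->warnings++;	/* models WARN_ON_IN_IRQ() */

	/* models the range test against user_addr_max() */
	return size <= c->addr_limit && addr <= c->addr_limit - size;
}
```

In this model a kernel address passes the range test once the limit is
raised to MODEL_KERNEL_DS, yet the warning counter still increments when
in_task is false, which is the situation the message asks the commit
log to describe.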

* Re: [PATCH net] xfrm: fix missing dst_release() after policy blocking lbcast and multicast
From: Steffen Klassert @ 2018-06-23 14:03 UTC
  To: Tommi Rantala
  Cc: netdev, huaibin Wang, Herbert Xu, David S. Miller, open list
In-Reply-To: <20180621063048.13847-1-tommi.t.rantala@nokia.com>

On Thu, Jun 21, 2018 at 09:30:47AM +0300, Tommi Rantala wrote:
> Fix missing dst_release() when local broadcast or multicast traffic is
> xfrm policy blocked.
> 
> For IPv4 this results in a dst leak: ip_route_output_flow() allocates
> dst_entry via __ip_route_output_key() and passes it to
> xfrm_lookup_route(). xfrm_lookup returns ERR_PTR(-EPERM) that is
> propagated. The dst that was allocated is never released.
> 
> IPv4 local broadcast testcase:
>  ping -b 192.168.1.255 &
>  sleep 1
>  ip xfrm policy add src 0.0.0.0/0 dst 192.168.1.255/32 dir out action block
> 
> IPv4 multicast testcase:
>  ping 224.0.0.1 &
>  sleep 1
>  ip xfrm policy add src 0.0.0.0/0 dst 224.0.0.1/32 dir out action block
> 
> For IPv6 the missing dst_release() causes trouble e.g. when used in netns:
>  ip netns add TEST
>  ip netns exec TEST ip link set lo up
>  ip link add dummy0 type dummy
>  ip link set dev dummy0 netns TEST
>  ip netns exec TEST ip addr add fd00::1111 dev dummy0
>  ip netns exec TEST ip link set dummy0 up
>  ip netns exec TEST ping -6 -c 5 ff02::1%dummy0 &
>  sleep 1
>  ip netns exec TEST ip xfrm policy add src ::/0 dst ff02::1 dir out action block
>  wait
>  ip netns del TEST
> 
> After netns deletion we see:
> [  258.239097] unregister_netdevice: waiting for lo to become free. Usage count = 2
> [  268.279061] unregister_netdevice: waiting for lo to become free. Usage count = 2
> [  278.367018] unregister_netdevice: waiting for lo to become free. Usage count = 2
> [  288.375259] unregister_netdevice: waiting for lo to become free. Usage count = 2
> 
> Fixes: ac37e2515c1a ("xfrm: release dst_orig in case of error in xfrm_lookup()")
> Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>

Patch applied, thanks a lot!
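
The leak described in the quoted commit message can be illustrated with
a small user-space model. This is only a sketch under assumptions:
struct model_dst, model_dst_release() and model_lookup_route() are
hypothetical names, and the release_on_error flag exists purely to
contrast the leaking and the fixed behaviour; it is not how the real
xfrm code is structured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted route entry. */
struct model_dst {
	int refcnt;
};

static void model_dst_release(struct model_dst *dst)
{
	assert(dst->refcnt > 0);
	dst->refcnt--;
}

/*
 * Models a policy lookup that takes ownership of dst_orig.  When the
 * policy blocks the flow it returns NULL; release_on_error selects
 * between the leaking behaviour and the fixed one.
 */
static struct model_dst *model_lookup_route(struct model_dst *dst_orig,
					    bool blocked, bool release_on_error)
{
	if (!blocked)
		return dst_orig;	/* caller keeps using the reference */

	if (release_on_error)
		model_dst_release(dst_orig);
	return NULL;
}
```

With release_on_error set to false, a blocked lookup leaves the
reference count at 1 even though no caller holds the entry any more,
which corresponds to the "waiting for lo to become free" symptom in the
report.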

* [PATCH net] cxgb4: when disabling dcb set txq dcb priority to 0
From: Ganesh Goudar @ 2018-06-23 14:58 UTC
  To: netdev, davem
  Cc: nirranjan, indranil, robert, Ganesh Goudar, David Ahern,
	Casey Leedom

When we are disabling DCB, store "0" in txq->dcb_prio
since that's used for future TX Work Request "OVLAN_IDX"
values. Setting non zero priority upon disabling DCB
would halt the traffic.

Reported-by: AMG Zollner Robert <robert@cloudmedia.eu>
CC: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 35cb3ae..aaaf775 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -263,7 +263,7 @@ static void dcb_tx_queue_prio_enable(struct net_device *dev, int enable)
 				"Can't %s DCB Priority on port %d, TX Queue %d: err=%d\n",
 				enable ? "set" : "unset", pi->port_id, i, -err);
 		else
-			txq->dcb_prio = value;
+			txq->dcb_prio = enable ? value : 0;
 	}
 }
 
-- 
2.1.0

* Re: [PATCH net-next 3/4] netdevsim: add ipsec offload testing
From: Shannon Nelson @ 2018-06-23 15:16 UTC
  To: Jakub Kicinski; +Cc: davem, netdev, anders.roxell, linux-kselftest
In-Reply-To: <20180622210708.43d50d15@cakuba.netronome.com>

On 6/22/2018 9:07 PM, Jakub Kicinski wrote:
> On Fri, 22 Jun 2018 17:31:37 -0700, Shannon Nelson wrote:
>> Implement the IPsec/XFRM offload API for testing.
>>
>> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
> 
> Thanks for the patch!  Just a number of stylistic nit picks.

Thanks for the comments, I'll do a v2 in a couple of days.
sln

> 
>> diff --git a/drivers/net/netdevsim/ipsec.c b/drivers/net/netdevsim/ipsec.c
>> new file mode 100644
>> index 0000000..ad64266
>> --- /dev/null
>> +++ b/drivers/net/netdevsim/ipsec.c
>> @@ -0,0 +1,345 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/* Copyright(c) 2018 Oracle and/or its affiliates. All rights reserved. */
>> +
>> +#include <net/xfrm.h>
>> +#include <crypto/aead.h>
>> +#include <linux/debugfs.h>
>> +#include "netdevsim.h"
> 
> Other files in the driver sort headers alphabetically and put an empty
> line between global and local headers.
> 
>> +#define NSIM_IPSEC_AUTH_BITS	128
>> +
>> +/**
>> + * nsim_ipsec_dbg_read - read for ipsec data
>> + * @filp: the opened file
>> + * @buffer: where to write the data for the user to read
>> + * @count: the size of the user's buffer
>> + * @ppos: file position offset
>> + **/
>> +static ssize_t nsim_dbg_netdev_ops_read(struct file *filp,
> 
> Doesn't match the kdoc.  Please run
> 
> ./scripts/kernel-doc -none $file
> 
> if you want kdoc.  Although IMHO you may as well drop the kdoc, your
> code is quite self explanatory and local.
> 
>> +					char __user *buffer,
>> +					size_t count, loff_t *ppos)
>> +{
>> +	struct netdevsim *ns = filp->private_data;
>> +	struct nsim_ipsec *ipsec = &ns->ipsec;
>> +	size_t bufsize;
>> +	char *buf, *p;
>> +	int len;
>> +	int i;
>> +
>> +	/* don't allow partial reads */
>> +	if (*ppos != 0)
>> +		return 0;
>> +
>> +	/* the buffer needed is
>> +	 * (num SAs * 3 lines each * ~60 bytes per line) + one more line
>> +	 */
>> +	bufsize = (ipsec->count * 4 * 60) + 60;
>> +	buf = kzalloc(bufsize, GFP_KERNEL);
>> +	if (!buf)
>> +		return -ENOMEM;
>> +
>> +	p = buf;
>> +	p += snprintf(p, bufsize - (p - buf),
>> +		      "SA count=%u tx=%u\n",
>> +		      ipsec->count, ipsec->tx);
>> +
>> +	for (i = 0; i < NSIM_IPSEC_MAX_SA_COUNT; i++) {
>> +		struct nsim_sa *sap = &ipsec->sa[i];
>> +
>> +		if (!sap->used)
>> +			continue;
>> +
>> +		p += snprintf(p, bufsize - (p - buf),
>> +			      "sa[%i] %cx ipaddr=0x%08x %08x %08x %08x\n",
>> +			      i, (sap->rx ? 'r' : 't'), sap->ipaddr[0],
>> +			      sap->ipaddr[1], sap->ipaddr[2], sap->ipaddr[3]);
>> +		p += snprintf(p, bufsize - (p - buf),
>> +			      "sa[%i]    spi=0x%08x proto=0x%x salt=0x%08x crypt=%d\n",
>> +			      i, be32_to_cpu(sap->xs->id.spi),
>> +			      sap->xs->id.proto, sap->salt, sap->crypt);
>> +		p += snprintf(p, bufsize - (p - buf),
>> +			      "sa[%i]    key=0x%08x %08x %08x %08x\n",
>> +			      i, sap->key[0], sap->key[1],
>> +			      sap->key[2], sap->key[3]);
>> +	}
>> +
>> +	len = simple_read_from_buffer(buffer, count, ppos, buf, p - buf);
> 
> Why not seq_file for this?
> 
>> +	kfree(buf);
>> +	return len;
>> +}
>> +
>> +static const struct file_operations ipsec_dbg_fops = {
>> +	.owner = THIS_MODULE,
>> +	.open = simple_open,
>> +	.read = nsim_dbg_netdev_ops_read,
>> +};
>> +
>> +/**
>> + * nsim_ipsec_find_empty_idx - find the first unused security parameter index
>> + * @ipsec: pointer to ipsec struct
>> + **/
>> +static int nsim_ipsec_find_empty_idx(struct nsim_ipsec *ipsec)
>> +{
>> +	u32 i;
>> +
>> +	if (ipsec->count == NSIM_IPSEC_MAX_SA_COUNT)
>> +		return -ENOSPC;
>> +
>> +	/* search sa table */
>> +	for (i = 0; i < NSIM_IPSEC_MAX_SA_COUNT; i++) {
>> +		if (!ipsec->sa[i].used)
>> +			return i;
>> +	}
>> +
>> +	return -ENOSPC;
> 
> FWIW I personally find bitmaps and find_first_zero_bit() etc. nice and
> concise for a small ID allocator, but no objection to open coding.
> 
>> +}
>> +
>> +/**
>> + * nsim_ipsec_parse_proto_keys - find the key and salt based on the protocol
>> + * @xs: pointer to xfrm_state struct
>> + * @mykey: pointer to key array to populate
>> + * @mysalt: pointer to salt value to populate
>> + *
>> + * This copies the protocol keys and salt to our own data tables.  The
>> + * 82599 family only supports the one algorithm.
> 
> 82599 is a fine chip, it's not netdevsim tho? ;)
> 
>> + **/
>> +static int nsim_ipsec_parse_proto_keys(struct xfrm_state *xs,
>> +				       u32 *mykey, u32 *mysalt)
>> +{
>> +	struct net_device *dev = xs->xso.dev;
>> +	unsigned char *key_data;
>> +	char *alg_name = NULL;
>> +	const char aes_gcm_name[] = "rfc4106(gcm(aes))";
>> +	int key_len;
> 
> reverse xmas tree please
> 
>> +
>> +	if (!xs->aead) {
>> +		netdev_err(dev, "Unsupported IPsec algorithm\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (xs->aead->alg_icv_len != NSIM_IPSEC_AUTH_BITS) {
>> +		netdev_err(dev, "IPsec offload requires %d bit authentication\n",
>> +			   NSIM_IPSEC_AUTH_BITS);
>> +		return -EINVAL;
>> +	}
>> +
>> +	key_data = &xs->aead->alg_key[0];
>> +	key_len = xs->aead->alg_key_len;
>> +	alg_name = xs->aead->alg_name;
>> +
>> +	if (strcmp(alg_name, aes_gcm_name)) {
>> +		netdev_err(dev, "Unsupported IPsec algorithm - please use %s\n",
>> +			   aes_gcm_name);
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* The key bytes come down in a bigendian array of bytes, so
>> +	 * we don't need to do any byteswapping.
> 
> Why the mention of bigendian?  82599 needs big endian? -.^
> 
>> +	 * 160 accounts for 16 byte key and 4 byte salt
>> +	 */
>> +	if (key_len > 128) {
> 
> s/128/NSIM_IPSEC_AUTH_BITS/ ?
> 
>> +		*mysalt = ((u32 *)key_data)[4];
> 
> Is alignment guaranteed?  There are the unaligned helpers if you need
> them..
> 
>> +	} else if (key_len == 128) {
>> +		*mysalt = 0;
>> +	} else {
>> +		netdev_err(dev, "IPsec hw offload only supports 128 bit keys with optional 32 bit salt\n");
>> +		return -EINVAL;
>> +	}
>> +	memcpy(mykey, key_data, 16);
>> +
>> +	return 0;
>> +}
>> +
>> +/**
>> + * nsim_ipsec_add_sa - program device with a security association
>> + * @xs: pointer to transformer state struct
>> + **/
>> +static int nsim_ipsec_add_sa(struct xfrm_state *xs)
>> +{
>> +	struct net_device *dev = xs->xso.dev;
>> +	struct netdevsim *ns = netdev_priv(dev);
>> +	struct nsim_ipsec *ipsec = &ns->ipsec;
> 
> xmas tree again (initialize out of line if you have to)
> 
>> +	struct nsim_sa sa;
>> +	u16 sa_idx;
>> +	int ret;
>> +
>> +	if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
>> +		netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
>> +			   xs->id.proto);
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (xs->calg) {
>> +		netdev_err(dev, "Compression offload not supported\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* find the first unused index */
>> +	ret = nsim_ipsec_find_empty_idx(ipsec);
>> +	if (ret < 0) {
>> +		netdev_err(dev, "No space for SA in Rx table!\n");
>> +		return ret;
>> +	}
>> +	sa_idx = (u16)ret;
>> +
>> +	memset(&sa, 0, sizeof(sa));
>> +	sa.used = true;
>> +	sa.xs = xs;
>> +
>> +	if (sa.xs->id.proto & IPPROTO_ESP)
>> +		sa.crypt = xs->ealg || xs->aead;
>> +
>> +	/* get the key and salt */
>> +	ret = nsim_ipsec_parse_proto_keys(xs, sa.key, &sa.salt);
>> +	if (ret) {
>> +		netdev_err(dev, "Failed to get key data for SA table\n");
>> +		return ret;
>> +	}
>> +
>> +	if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
>> +		sa.rx = true;
>> +
>> +		if (xs->props.family == AF_INET6)
>> +			memcpy(sa.ipaddr, &xs->id.daddr.a6, 16);
>> +		else
>> +			memcpy(&sa.ipaddr[3], &xs->id.daddr.a4, 4);
>> +	}
>> +
>> +	/* the preparations worked, so save the info */
>> +	memcpy(&ipsec->sa[sa_idx], &sa, sizeof(sa));
>> +
>> +	/* the XFRM stack doesn't like offload_handle == 0,
>> +	 * so add a bitflag in case our array index is 0
>> +	 */
>> +	xs->xso.offload_handle = sa_idx | NSIM_IPSEC_VALID;
>> +	ipsec->count++;
>> +
>> +	return 0;
>> +}
>> +
>> +/**
>> + * nsim_ipsec_del_sa - clear out this specific SA
>> + * @xs: pointer to transformer state struct
>> + **/
>> +static void nsim_ipsec_del_sa(struct xfrm_state *xs)
>> +{
>> +	struct net_device *dev = xs->xso.dev;
>> +	struct netdevsim *ns = netdev_priv(dev);
>> +	struct nsim_ipsec *ipsec = &ns->ipsec;
>> +	u16 sa_idx;
>> +
>> +	sa_idx = xs->xso.offload_handle & ~NSIM_IPSEC_VALID;
>> +	if (!ipsec->sa[sa_idx].used) {
>> +		netdev_err(dev, "Invalid SA for delete sa_idx=%d\n", sa_idx);
>> +		return;
>> +	}
>> +
>> +	memset(&ipsec->sa[sa_idx], 0, sizeof(struct nsim_sa));
>> +	ipsec->count--;
>> +}
>> +
>> +/**
>> + * nsim_ipsec_offload_ok - can this packet use the xfrm hw offload
>> + * @skb: current data packet
>> + * @xs: pointer to transformer state struct
>> + **/
>> +static bool nsim_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
>> +{
>> +	struct net_device *dev = xs->xso.dev;
>> +	struct netdevsim *ns = netdev_priv(dev);
>> +	struct nsim_ipsec *ipsec = &ns->ipsec;
>> +
>> +	ipsec->ok++;
>> +
>> +	return true;
>> +}
>> +
>> +static const struct xfrmdev_ops nsim_xfrmdev_ops = {
>> +	.xdo_dev_state_add = nsim_ipsec_add_sa,
>> +	.xdo_dev_state_delete = nsim_ipsec_del_sa,
>> +	.xdo_dev_offload_ok = nsim_ipsec_offload_ok,
> 
> Please align the initializers by adding tabs before '='.
> 
>> +};
>> +
>> +/**
>> + * nsim_ipsec_tx - check Tx packet for ipsec offload
>> + * @ns: pointer to ns structure
>> + * @skb: current data packet
>> + **/
>> +int nsim_ipsec_tx(struct netdevsim *ns, struct sk_buff *skb)
>> +{
>> +	struct nsim_ipsec *ipsec = &ns->ipsec;
>> +	struct xfrm_state *xs;
>> +	struct nsim_sa *tsa;
>> +	u32 sa_idx;
>> +
>> +	/* do we even need to check this packet? */
>> +	if (!skb->sp)
>> +		return 1;
>> +
>> +	if (unlikely(!skb->sp->len)) {
>> +		netdev_err(ns->netdev, "%s: no xfrm state len = %d\n",
>> +			   __func__, skb->sp->len);
> 
> Hmm..  __func__ started appearing in errors?  Perhaps either always or
> never add it?
> 
> Also, I know this is not a real device, but please always use rate
> limited print functions on the data path.
> 
>> +		return 0;
>> +	}
>> +
>> +	xs = xfrm_input_state(skb);
>> +	if (unlikely(!xs)) {
>> +		netdev_err(ns->netdev, "%s: no xfrm_input_state() xs = %p\n",
>> +			   __func__, xs);
>> +		return 0;
>> +	}
>> +
>> +	sa_idx = xs->xso.offload_handle & ~NSIM_IPSEC_VALID;
>> +	if (unlikely(sa_idx > NSIM_IPSEC_MAX_SA_COUNT)) {
>> +		netdev_err(ns->netdev, "%s: bad sa_idx=%d max=%d\n",
>> +			   __func__, sa_idx, NSIM_IPSEC_MAX_SA_COUNT);
>> +		return 0;
>> +	}
>> +
>> +	tsa = &ipsec->sa[sa_idx];
>> +	if (unlikely(!tsa->used)) {
>> +		netdev_err(ns->netdev, "%s: unused sa_idx=%d\n",
>> +			   __func__, sa_idx);
>> +		return 0;
>> +	}
>> +
>> +	if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
>> +		netdev_err(ns->netdev, "%s: unexpected proto=%d\n",
>> +			   __func__, xs->id.proto);
>> +		return 0;
>> +	}
>> +
>> +	ipsec->tx++;
>> +
>> +	return 1;
>> +}
> 
> Looks like the function should return bool?
> 
>> +
>> +/**
>> + * nsim_ipsec_init - initialize security registers for IPSec operation
>> + * @ns: board private structure
> 
> "board"?  Yes, the kdoc may be best removed ;)
> 
>> + **/
>> +void nsim_ipsec_init(struct netdevsim *ns)
>> +{
>> +	ns->netdev->xfrmdev_ops = &nsim_xfrmdev_ops;
>> +
>> +#define NSIM_ESP_FEATURES	(NETIF_F_HW_ESP | \
>> +				 NETIF_F_HW_ESP_TX_CSUM | \
>> +				 NETIF_F_GSO_ESP)
>> +
>> +	ns->netdev->features |= NSIM_ESP_FEATURES;
>> +	ns->netdev->hw_enc_features |= NSIM_ESP_FEATURES;
>> +
>> +	ns->ipsec.pfile = debugfs_create_file("ipsec", 0400, ns->ddir, ns,
>> +					      &ipsec_dbg_fops);
>> +}
>> +
>> +void nsim_ipsec_teardown(struct netdevsim *ns)
>> +{
>> +	struct nsim_ipsec *ipsec = &ns->ipsec;
>> +
>> +	if (ipsec->count)
>> +		netdev_err(ns->netdev, "%s: tearing down IPsec offload with %d SAs left\n",
>> +			   __func__, ipsec->count);
>> +	debugfs_remove_recursive(ipsec->pfile);
>> +}
>> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
>> index ec68f38..6ce8604d 100644
>> --- a/drivers/net/netdevsim/netdev.c
>> +++ b/drivers/net/netdevsim/netdev.c
>> @@ -171,6 +171,8 @@ static int nsim_init(struct net_device *dev)
>>   	if (err)
>>   		goto err_unreg_dev;
>>   
>> +	nsim_ipsec_init(ns);
>> +
>>   	return 0;
>>   
>>   err_unreg_dev:
>> @@ -186,6 +188,7 @@ static void nsim_uninit(struct net_device *dev)
>>   {
>>   	struct netdevsim *ns = netdev_priv(dev);
>>   
>> +	nsim_ipsec_teardown(ns);
>>   	nsim_devlink_teardown(ns);
>>   	debugfs_remove_recursive(ns->ddir);
>>   	nsim_bpf_uninit(ns);
>> @@ -203,11 +206,15 @@ static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
>>   {
>>   	struct netdevsim *ns = netdev_priv(dev);
>>   
>> +	if (!nsim_ipsec_tx(ns, skb))
>> +		goto out;
>> +
>>   	u64_stats_update_begin(&ns->syncp);
>>   	ns->tx_packets++;
>>   	ns->tx_bytes += skb->len;
>>   	u64_stats_update_end(&ns->syncp);
>>   
>> +out:
>>   	dev_kfree_skb(skb);
>>   
>>   	return NETDEV_TX_OK;
>> diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h
>> index 3a8581a..1708dee 100644
>> --- a/drivers/net/netdevsim/netdevsim.h
>> +++ b/drivers/net/netdevsim/netdevsim.h
>> @@ -29,6 +29,29 @@ struct bpf_prog;
>>   struct dentry;
>>   struct nsim_vf_config;
>>   
>> +#if IS_ENABLED(CONFIG_XFRM_OFFLOAD)
>> +#define NSIM_IPSEC_MAX_SA_COUNT		33
> 
> 33 caught my eye - out of curiosity is it 2^5 + 1 to catch some type of
> bug or failure mode?
> 
>> +#define NSIM_IPSEC_VALID		BIT(31)
>> +
>> +struct nsim_sa {
>> +	struct xfrm_state *xs;
>> +	__be32 ipaddr[4];
>> +	u32 key[4];
>> +	u32 salt;
>> +	bool used;
>> +	bool crypt;
>> +	bool rx;
>> +};
>> +
>> +struct nsim_ipsec {
>> +	struct nsim_sa sa[NSIM_IPSEC_MAX_SA_COUNT];
>> +	struct dentry *pfile;
>> +	u32 count;
>> +	u32 tx;
>> +	u32 ok;
>> +};
>> +#endif
> 
> No need to wrap struct definitions in #if/#endif.
> 
>>   struct netdevsim {
>>   	struct net_device *netdev;
>>   
>> @@ -67,6 +90,9 @@ struct netdevsim {
>>   #if IS_ENABLED(CONFIG_NET_DEVLINK)
>>   	struct devlink *devlink;
>>   #endif
>> +#if IS_ENABLED(CONFIG_XFRM_OFFLOAD)
>> +	struct nsim_ipsec ipsec;
>> +#endif
>>   };
>>   
>>   extern struct dentry *nsim_ddir;
>> @@ -147,6 +173,17 @@ static inline void nsim_devlink_exit(void)
>>   }
>>   #endif
>>   
>> +#if IS_ENABLED(CONFIG_XFRM_OFFLOAD)
>> +void nsim_ipsec_init(struct netdevsim *ns);
>> +void nsim_ipsec_teardown(struct netdevsim *ns);
>> +int nsim_ipsec_tx(struct netdevsim *ns, struct sk_buff *skb);
>> +#else
>> +static inline void nsim_ipsec_init(struct netdevsim *ns) {};
>> +static inline void nsim_ipsec_teardown(struct netdevsim *ns) {};
>> +static inline int nsim_ipsec_tx(struct netdevsim *ns, struct sk_buff *skb)
>> +								{ return 1; };
> 
> Please use the same formatting for static inlines as the rest of the
> file.  The ';' are also unnecessary.
> 
> Other than those formatting nit picks looks good to me :)
> 
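
On the alignment question raised in the review: casting the byte array
to u32 * and indexing it assumes the key buffer is 4-byte aligned. A
portable alternative, shown here as a user-space sketch, is to copy the
salt out with memcpy(). The function name load_salt() and the byte-based
length parameter are hypothetical; the kernel has its own unaligned
access helpers, which this sketch does not try to reproduce.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEY_BYTES  16	/* 128-bit key */
#define SALT_BYTES 4	/* optional 32-bit salt that follows the key */

/*
 * Return the salt that follows the key, or 0 when no salt is present.
 * memcpy() makes no alignment assumption about key_data.
 */
static uint32_t load_salt(const unsigned char *key_data, size_t key_len_bytes)
{
	uint32_t salt = 0;

	assert(key_data != NULL);
	if (key_len_bytes >= KEY_BYTES + SALT_BYTES)
		memcpy(&salt, key_data + KEY_BYTES, SALT_BYTES);
	return salt;
}
```

The bytes are copied in memory order, so no byte swapping takes place,
matching the intent of the comment in the quoted patch.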

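The bitmap approach suggested in the review for the SA index allocator
can be sketched in user space as follows. This is an illustration under
assumptions, not the kernel bitmap API: struct sa_ids, sa_id_alloc() and
sa_id_free() are hypothetical names, and the linear scan stands in for
what find_first_zero_bit() would do in the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SA_MAX 33	/* mirrors NSIM_IPSEC_MAX_SA_COUNT from the patch */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define SA_WORDS ((SA_MAX + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct sa_ids {
	unsigned long used[SA_WORDS];	/* one bit per SA slot */
};

/* Return the lowest free ID and mark it used, or -1 if the table is full. */
static int sa_id_alloc(struct sa_ids *ids)
{
	for (size_t i = 0; i < SA_MAX; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_WORD);

		if (!(ids->used[i / BITS_PER_WORD] & mask)) {
			ids->used[i / BITS_PER_WORD] |= mask;
			return (int)i;
		}
	}
	return -1;
}

/* Mark an ID as free again. */
static void sa_id_free(struct sa_ids *ids, int id)
{
	assert(id >= 0 && id < SA_MAX);
	ids->used[id / BITS_PER_WORD] &= ~(1UL << (id % BITS_PER_WORD));
}
```

Compared with a per-entry "used" flag, the bitmap keeps allocation state
in one or two machine words and makes the "table full" case fall out of
the same scan.
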
* [net regression] "fib_rules: move common handling of newrule delrule msgs into fib_nl2rule" breaks suppress_prefixlength
From: Jason A. Donenfeld @ 2018-06-23 15:46 UTC
  To: roopa; +Cc: Netdev

Hey Roopa,

On a kernel with a minimal networking config,
CONFIG_IP_MULTIPLE_TABLES appears to be broken for certain rules after
f9d4b0c1e9695e3de7af3768205bacc27312320c.

Try, for example, running:

$ ip -4 rule add table main suppress_prefixlength 0

It returns with EEXIST.

Perhaps the reason is that the new rule_find function does not match
on suppress_prefixlength? However, rule_exist from before didn't do
that either. I'll keep playing and see if I can track it down myself,
but thought I should let you know first.

A relevant .config can be found at https://א.cc/iq5HoUY0

Jason

* [PATCH] fib_rules: match rules based on suppress_* properties too
From: Jason A. Donenfeld @ 2018-06-23 15:59 UTC
  To: roopa, Netdev; +Cc: Jason A. Donenfeld
In-Reply-To: <CAHmME9rdxf+577u-=qx1Ss1YAz_zOPAHa6TM6ThewtybBE_R_g@mail.gmail.com>

Two rules with different values of suppress_prefixlen or suppress_ifgroup
are not the same. This fixes an -EEXIST when running:

   $ ip -4 rule add table main suppress_prefixlength 0

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Fixes: f9d4b0c1e969 ("fib_rules: move common handling of newrule delrule msgs into fib_nl2rule")
---
 net/core/fib_rules.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 126ffc5bc630..665799311b98 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -416,6 +416,12 @@ static struct fib_rule *rule_find(struct fib_rules_ops *ops,
 		if (rule->mark && r->mark != rule->mark)
 			continue;

+		if (r->suppress_ifgroup != rule->suppress_ifgroup)
+			continue;
+
+		if (r->suppress_prefixlen != rule->suppress_prefixlen)
+			continue;
+
 		if (rule->mark_mask && r->mark_mask != rule->mark_mask)
 			continue;

-- 
2.17.1

* Re: [PATCH] ipv6: avoid copy_from_user() via ipv6_renew_options_kern()
From: Paul Moore @ 2018-06-23 16:15 UTC
  To: davem; +Cc: Paul Moore, netdev, selinux, linux-security-module
In-Reply-To: <20180623.211618.698435338116789998.davem@davemloft.net>

On Sat, Jun 23, 2018 at 8:16 AM David Miller <davem@davemloft.net> wrote:
>
> From: Paul Moore <pmoore@redhat.com>
> Date: Fri, 22 Jun 2018 17:18:20 -0400
>
> > From: Paul Moore <paul@paul-moore.com>
> >
> > The ipv6_renew_options_kern() function eventually called into
> > copy_from_user(), despite it not using any userspace buffers, which
> > was problematic as that ended up calling access_ok() which emitted
> > a warning on x86 (and likely other arches as well).
> >
> >   ipv6_renew_options_kern()
> >     ipv6_renew_options()
> >       ipv6_renew_option()
> >         copy_from_user()
> >           _copy_from_user()
> >             access_ok()
> >
> > The access_ok() check inside _copy_from_user() is obviously the right
> > thing to do which means that calling copy_from_user() via
> > ipv6_renew_options_kern() is obviously the wrong thing to do.
>
> Ok, I re-read the code around here.
>
> access_ok() is not warning because we are calling copy_from_user()
> with a kernel pointer.  The set_fs(KERNEL_DS) adjusts the
> user_addr_max() setting, and thus that check passes.
>
> The problem is that we are invoking this from an interrupt, and this
> triggers the WARN_ON_IN_IRQ() in access_ok().
>
> Although I think that WARN_ON_IN_IRQ() is completely unnecessary when
> KERNEL_DS is set, the situation that really causes this problem is not
> at all clear from your commit message.
>
> I guess that for now your fix is fine, but I want you to please adjust
> the commit message.
>
> Provide the _full_ annotated kernel backtrace from the warning that
> triggers, because this will show the reader that we are in an
> interrupt.  And explain that being in the interrupt is strictly what
> causes this to warn, not that we are using kernel pointers.  The
> latter is 100% valid when set_fs(KERNEL_DS) is performed.
>
> Thank you.

Okay, so it's the right fix for all the wrong reasons :)

Thanks for the correction; I'll fix up the commit subject/description
and resend when I'm in front of the system with the patch (later this
weekend, early next week).

-- 
paul moore
www.paul-moore.com

* [Patch net-next] net_sched: remove unused htb drop_list
From: Cong Wang @ 2018-06-23 20:46 UTC
  To: netdev; +Cc: Cong Wang, Florian Westphal

After commit a09ceb0e0814 ("sched: remove qdisc->drop"),
it is no longer used.

Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/sch_htb.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 2a4ab7caf553..43c4bfe625a9 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -126,7 +126,6 @@ struct htb_class {
 
 	union {
 		struct htb_class_leaf {
-			struct list_head drop_list;
 			int		deficit[TC_HTB_MAXDEPTH];
 			struct Qdisc	*q;
 		} leaf;
@@ -171,7 +170,6 @@ struct htb_sched {
 	struct qdisc_watchdog	watchdog;
 
 	s64			now;	/* cached dequeue time */
-	struct list_head	drops[TC_HTB_NUMPRIO];/* active leaves (for drops) */
 
 	/* time of nearest event per level (row) */
 	s64			near_ev_cache[TC_HTB_MAXDEPTH];
@@ -562,8 +560,6 @@ static inline void htb_activate(struct htb_sched *q, struct htb_class *cl)
 	if (!cl->prio_activity) {
 		cl->prio_activity = 1 << cl->prio;
 		htb_activate_prios(q, cl);
-		list_add_tail(&cl->un.leaf.drop_list,
-			      q->drops + cl->prio);
 	}
 }
 
@@ -579,7 +575,6 @@ static inline void htb_deactivate(struct htb_sched *q, struct htb_class *cl)
 
 	htb_deactivate_prios(q, cl);
 	cl->prio_activity = 0;
-	list_del_init(&cl->un.leaf.drop_list);
 }
 
 static void htb_enqueue_tail(struct sk_buff *skb, struct Qdisc *sch,
@@ -981,7 +976,6 @@ static void htb_reset(struct Qdisc *sch)
 			else {
 				if (cl->un.leaf.q)
 					qdisc_reset(cl->un.leaf.q);
-				INIT_LIST_HEAD(&cl->un.leaf.drop_list);
 			}
 			cl->prio_activity = 0;
 			cl->cmode = HTB_CAN_SEND;
@@ -993,8 +987,6 @@ static void htb_reset(struct Qdisc *sch)
 	sch->qstats.backlog = 0;
 	memset(q->hlevel, 0, sizeof(q->hlevel));
 	memset(q->row_mask, 0, sizeof(q->row_mask));
-	for (i = 0; i < TC_HTB_NUMPRIO; i++)
-		INIT_LIST_HEAD(q->drops + i);
 }
 
 static const struct nla_policy htb_policy[TCA_HTB_MAX + 1] = {
@@ -1024,7 +1016,6 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt,
 	struct nlattr *tb[TCA_HTB_MAX + 1];
 	struct tc_htb_glob *gopt;
 	int err;
-	int i;
 
 	qdisc_watchdog_init(&q->watchdog, sch);
 	INIT_WORK(&q->work, htb_work_func);
@@ -1050,8 +1041,6 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt,
 	err = qdisc_class_hash_init(&q->clhash);
 	if (err < 0)
 		return err;
-	for (i = 0; i < TC_HTB_NUMPRIO; i++)
-		INIT_LIST_HEAD(q->drops + i);
 
 	qdisc_skb_head_init(&q->direct_queue);
 
@@ -1224,7 +1213,6 @@ static void htb_parent_to_leaf(struct htb_sched *q, struct htb_class *cl,
 
 	parent->level = 0;
 	memset(&parent->un.inner, 0, sizeof(parent->un.inner));
-	INIT_LIST_HEAD(&parent->un.leaf.drop_list);
 	parent->un.leaf.q = new_q ? new_q : &noop_qdisc;
 	parent->tokens = parent->buffer;
 	parent->ctokens = parent->cbuffer;
@@ -1418,7 +1406,6 @@ static int htb_change_class(struct Qdisc *sch, u32 classid,
 		}
 
 		cl->children = 0;
-		INIT_LIST_HEAD(&cl->un.leaf.drop_list);
 		RB_CLEAR_NODE(&cl->pq_node);
 
 		for (prio = 0; prio < TC_HTB_NUMPRIO; prio++)
-- 
2.14.4

* [PATCH v2 net-next] net/sched: add skbprio scheduler
From: Nishanth Devarajan @ 2018-06-23 20:47 UTC
  To: xiyou.wangcong, jhs, jiri, davem; +Cc: netdev, doucette, michel

net/sched: add skbprio scheduler

Skbprio (SKB Priority Queue) is a queueing discipline that prioritizes packets
according to their skb->priority field. Although Skbprio can be employed in any
scenario in which a higher skb->priority field means a higher priority packet,
Skbprio was conceived as a solution for denial-of-service defenses that need to
route packets with different priorities.

v2
*Use skb->priority field rather than DS field. Rename queueing discipline as
SKB Priority Queue (previously Gatekeeper Priority Queue).

*Queueing discipline is made classful to expose Skbprio's internal priority
queues.

Signed-off-by: Nishanth Devarajan <ndev2021@gmail.com>
Reviewed-by: Sachin Paryani <sachin.paryani@gmail.com>
Reviewed-by: Cody Doucette <doucette@bu.edu>
Reviewed-by: Michel Machado <michel@digirati.com.br>
---
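
The drop policy of this qdisc (a global packet limit; when full, evict
the lowest priority queued packet only if the arriving packet has higher
priority) can be modeled with per-priority counters. This is a
user-space sketch under assumptions, not the code in sch_skbprio.c:
struct prio_model and model_enqueue() are hypothetical names, and
dropping an arriving packet whose priority equals the lowest queued
priority is a choice made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PRIO 64	/* mirrors SKBPRIO_MAX_PRIORITY */

struct prio_model {
	unsigned int qlen[MAX_PRIO];	/* packets queued per priority */
	unsigned int total;		/* packets queued overall */
	unsigned int limit;		/* global limit in packets */
};

/* Returns true if the new packet was queued, false if it was dropped. */
static bool model_enqueue(struct prio_model *q, unsigned int prio)
{
	unsigned int lowest;

	assert(prio < MAX_PRIO);
	if (q->total < q->limit) {
		q->qlen[prio]++;
		q->total++;
		return true;
	}

	/* Full: find the lowest priority that currently holds a packet. */
	for (lowest = 0; lowest < MAX_PRIO && !q->qlen[lowest]; lowest++)
		;
	if (lowest == MAX_PRIO || prio <= lowest)
		return false;	/* new packet is not better: drop it */

	q->qlen[lowest]--;	/* evict one lowest-priority packet */
	q->qlen[prio]++;	/* total is unchanged: one out, one in */
	return true;
}
```

With a limit of 2 and packets of priority 1 and 3 queued, an arriving
priority 0 packet is dropped, while an arriving priority 5 packet
replaces the priority 1 packet.
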
 include/uapi/linux/pkt_sched.h |  15 ++
 net/sched/Kconfig              |  13 ++
 net/sched/Makefile             |   1 +
 net/sched/sch_skbprio.c        | 347 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 376 insertions(+)
 create mode 100644 net/sched/sch_skbprio.c

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 37b5096..6fd07e8 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -124,6 +124,21 @@ struct tc_fifo_qopt {
 	__u32	limit;	/* Queue length: bytes for bfifo, packets for pfifo */
 };
 
+/* SKBPRIO section */
+
+/*
+ * Priorities go from zero to (SKBPRIO_MAX_PRIORITY - 1).
+ * SKBPRIO_MAX_PRIORITY should be at least 64 in order for skbprio to be able
+ * to map one to one the DS field of IPV4 and IPV6 headers.
+ * Memory allocation grows linearly with SKBPRIO_MAX_PRIORITY.
+ */
+
+#define SKBPRIO_MAX_PRIORITY 64
+
+struct tc_skbprio_qopt {
+	__u32	limit; 	    	/* Queue length in packets. */
+};
+
 /* PRIO section */
 
 #define TCQ_PRIO_BANDS	16
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index a01169f..9ac4b53 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -240,6 +240,19 @@ config NET_SCH_MQPRIO
 
 	  If unsure, say N.
 
+config NET_SCH_SKBPRIO
+	tristate "SKB priority queue scheduler (SKBPRIO)"
+	help
+	  Say Y here if you want to use the SKB priority queue
+	  scheduler. This schedules packets according to skb->priority,
+	  which is useful for request packets in DoS mitigation systems such
+	  as Gatekeeper.
+
+	  To compile this driver as a module, choose M here: the module will
+	  be called sch_skbprio.
+
+	  If unsure, say N.
+
 config NET_SCH_CHOKE
 	tristate "CHOose and Keep responsive flow scheduler (CHOKE)"
 	help
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 8811d38..a4d8893 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_NET_SCH_NETEM)	+= sch_netem.o
 obj-$(CONFIG_NET_SCH_DRR)	+= sch_drr.o
 obj-$(CONFIG_NET_SCH_PLUG)	+= sch_plug.o
 obj-$(CONFIG_NET_SCH_MQPRIO)	+= sch_mqprio.o
+obj-$(CONFIG_NET_SCH_SKBPRIO)	+= sch_skbprio.o
 obj-$(CONFIG_NET_SCH_CHOKE)	+= sch_choke.o
 obj-$(CONFIG_NET_SCH_QFQ)	+= sch_qfq.o
 obj-$(CONFIG_NET_SCH_CODEL)	+= sch_codel.o
diff --git a/net/sched/sch_skbprio.c b/net/sched/sch_skbprio.c
new file mode 100644
index 0000000..5e89446
--- /dev/null
+++ b/net/sched/sch_skbprio.c
@@ -0,0 +1,347 @@
+/*
+ * net/sched/sch_skbprio.c  SKB Priority Queue.
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * Authors:	Nishanth Devarajan, <ndev2021@gmail.com>
+ *		Cody Doucette, <doucette@bu.edu>
+ *	        original idea by Michel Machado, Cody Doucette, and Qiaobin Fu
+ */
+
+#include <linux/string.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <net/pkt_sched.h>
+#include <net/sch_generic.h>
+#include <net/inet_ecn.h>
+
+
+/*	  SKB Priority Queue
+ *	=================================
+ *
+ * This qdisc schedules a packet according to skb->priority, where a higher
+ * value places the packet closer to the exit of the queue. When the queue is
+ * full, the lowest priority packet in the queue is dropped to make room for
+ * the packet to be added if it has higher priority. If the packet to be added
+ * has lower priority than all packets in the queue, it is dropped.
+ *
+ * Without the SKB priority queue, queue length limits must be imposed
+ * for individual queues, and there is no easy way to enforce a global queue
+ * length limit across all priorities. With the SKBprio queue, a global
+ * queue length limit can be enforced while not restricting the queue lengths
+ * of individual priorities.
+ *
+ * This is especially useful for a denial-of-service defense system like
+ * Gatekeeper, which prioritizes packets in flows that demonstrate expected
+ * behavior of legitimate users. The queue is flexible to allow any number
+ * of packets of any priority up to the global limit of the scheduler
+ * without risking resource overconsumption by a flood of low priority packets.
+ *
+ * The Gatekeeper codebase is found here:
+ *
+ *		https://github.com/AltraMayor/gatekeeper
+ */
+
+struct skbprio_sched_data {
+	/* Parameters. */
+	u32 max_limit;
+
+	/* Queue state. */
+	struct sk_buff_head qdiscs[SKBPRIO_MAX_PRIORITY];
+	struct gnet_stats_queue qstats[SKBPRIO_MAX_PRIORITY];
+	u16 highest_prio;
+	u16 lowest_prio;
+};
+
+static u16 calc_new_high_prio(const struct skbprio_sched_data *q)
+{
+	int prio;
+
+	for (prio = q->highest_prio - 1; prio >= q->lowest_prio; prio--) {
+		if (!skb_queue_empty(&q->qdiscs[prio]))
+			return prio;
+	}
+
+	/* SKB queue is empty, return 0 (default highest priority setting). */
+	return 0;
+}
+
+static u16 calc_new_low_prio(const struct skbprio_sched_data *q)
+{
+	int prio;
+
+	for (prio = q->lowest_prio + 1; prio <= q->highest_prio; prio++) {
+		if (!skb_queue_empty(&q->qdiscs[prio]))
+			return prio;
+	}
+
+	/* SKB queue is empty, return SKBPRIO_MAX_PRIORITY - 1
+	 * (default lowest priority setting).
+	 */
+	return SKBPRIO_MAX_PRIORITY - 1;
+}
+
+static int skbprio_enqueue(struct sk_buff *skb, struct Qdisc *sch,
+			  struct sk_buff **to_free)
+{
+	const unsigned int max_priority = SKBPRIO_MAX_PRIORITY - 1;
+	struct skbprio_sched_data *q = qdisc_priv(sch);
+	struct sk_buff_head *qdisc;
+	struct sk_buff_head *lp_qdisc;
+	struct sk_buff *to_drop;
+	u16 prio, lp;
+
+	/* Obtain the priority of @skb. */
+	prio = min(skb->priority, max_priority);
+
+	qdisc = &q->qdiscs[prio];
+	if (sch->q.qlen < q->max_limit) {
+		__skb_queue_tail(qdisc, skb);
+		qdisc_qstats_backlog_inc(sch, skb);
+		q->qstats[prio].backlog += qdisc_pkt_len(skb);
+
+		/* Check to update highest and lowest priorities. */
+		if (prio > q->highest_prio)
+			q->highest_prio = prio;
+
+		if (prio < q->lowest_prio)
+			q->lowest_prio = prio;
+
+		sch->q.qlen++;
+		return NET_XMIT_SUCCESS;
+	}
+
+	/* If this packet has the lowest priority, drop it. */
+	lp = q->lowest_prio;
+	if (prio <= lp) {
+		q->qstats[prio].drops++;
+		return qdisc_drop(skb, sch, to_free);
+	}
+
+	/* Drop the packet at the tail of the lowest priority qdisc. */
+	lp_qdisc = &q->qdiscs[lp];
+	to_drop = __skb_dequeue_tail(lp_qdisc);
+	BUG_ON(!to_drop);
+	qdisc_qstats_backlog_dec(sch, to_drop);
+	qdisc_drop(to_drop, sch, to_free);
+
+	q->qstats[lp].backlog -= qdisc_pkt_len(to_drop);
+	q->qstats[lp].drops++;
+
+	__skb_queue_tail(qdisc, skb);
+	qdisc_qstats_backlog_inc(sch, skb);
+	q->qstats[prio].backlog += qdisc_pkt_len(skb);
+
+	/* Check to update highest and lowest priorities. */
+	if (skb_queue_empty(lp_qdisc)) {
+		if (q->lowest_prio == q->highest_prio) {
+			BUG_ON(sch->q.qlen);
+			q->lowest_prio = prio;
+			q->highest_prio = prio;
+		} else {
+			q->lowest_prio = calc_new_low_prio(q);
+		}
+	}
+
+	if (prio > q->highest_prio)
+		q->highest_prio = prio;
+
+	return NET_XMIT_CN;
+}
+
+static struct sk_buff *skbprio_dequeue(struct Qdisc *sch)
+{
+	struct skbprio_sched_data *q = qdisc_priv(sch);
+	struct sk_buff_head *hpq = &q->qdiscs[q->highest_prio];
+	struct sk_buff *skb = __skb_dequeue(hpq);
+
+	if (unlikely(!skb))
+		return NULL;
+
+	sch->q.qlen--;
+	qdisc_qstats_backlog_dec(sch, skb);
+	qdisc_bstats_update(sch, skb);
+
+	q->qstats[q->highest_prio].backlog -= qdisc_pkt_len(skb);
+
+	/* Update highest priority field. */
+	if (skb_queue_empty(hpq)) {
+		if (q->lowest_prio == q->highest_prio) {
+			BUG_ON(sch->q.qlen);
+			q->highest_prio = 0;
+			q->lowest_prio = SKBPRIO_MAX_PRIORITY - 1;
+		} else {
+			q->highest_prio = calc_new_high_prio(q);
+		}
+	}
+	return skb;
+}
+
+static int skbprio_change(struct Qdisc *sch, struct nlattr *opt,
+			struct netlink_ext_ack *extack)
+{
+	struct skbprio_sched_data *q = qdisc_priv(sch);
+	struct tc_skbprio_qopt *ctl = nla_data(opt);
+	const unsigned int min_limit = 1;
+
+	if (ctl->limit == (typeof(ctl->limit))-1)
+		q->max_limit = max(qdisc_dev(sch)->tx_queue_len, min_limit);
+	else if (ctl->limit < min_limit ||
+		ctl->limit > qdisc_dev(sch)->tx_queue_len)
+		return -EINVAL;
+	else
+		q->max_limit = ctl->limit;
+
+	return 0;
+}
+
+static int skbprio_init(struct Qdisc *sch, struct nlattr *opt,
+			struct netlink_ext_ack *extack)
+{
+	struct skbprio_sched_data *q = qdisc_priv(sch);
+	const unsigned int min_limit = 1;
+	int prio;
+
+	/* Initialise all queues, one for each possible priority. */
+	for (prio = 0; prio < SKBPRIO_MAX_PRIORITY; prio++)
+		__skb_queue_head_init(&q->qdiscs[prio]);
+
+	memset(&q->qstats, 0, sizeof(q->qstats));
+	q->highest_prio = 0;
+	q->lowest_prio = SKBPRIO_MAX_PRIORITY - 1;
+	if (!opt) {
+		q->max_limit = max(qdisc_dev(sch)->tx_queue_len, min_limit);
+		return 0;
+	}
+	return skbprio_change(sch, opt, extack);
+}
+
+static int skbprio_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct skbprio_sched_data *q = qdisc_priv(sch);
+	struct tc_skbprio_qopt opt;
+
+	opt.limit = q->max_limit;
+
+	if (nla_put(skb, TCA_OPTIONS, sizeof(opt), &opt))
+		return -1;
+
+	return skb->len;
+}
+
+static void skbprio_reset(struct Qdisc *sch)
+{
+	struct skbprio_sched_data *q = qdisc_priv(sch);
+	int prio;
+
+	sch->qstats.backlog = 0;
+	sch->q.qlen = 0;
+
+	for (prio = 0; prio < SKBPRIO_MAX_PRIORITY; prio++)
+		__skb_queue_purge(&q->qdiscs[prio]);
+
+	memset(&q->qstats, 0, sizeof(q->qstats));
+	q->highest_prio = 0;
+	q->lowest_prio = SKBPRIO_MAX_PRIORITY - 1;
+}
+
+static void skbprio_destroy(struct Qdisc *sch)
+{
+	struct skbprio_sched_data *q = qdisc_priv(sch);
+	int prio;
+
+	for (prio = 0; prio < SKBPRIO_MAX_PRIORITY; prio++)
+		__skb_queue_purge(&q->qdiscs[prio]);
+}
+
+static struct Qdisc *skbprio_leaf(struct Qdisc *sch, unsigned long arg)
+{
+	return NULL;
+}
+
+static unsigned long skbprio_find(struct Qdisc *sch, u32 classid)
+{
+	return 0;
+}
+
+static int skbprio_dump_class(struct Qdisc *sch, unsigned long cl,
+			     struct sk_buff *skb, struct tcmsg *tcm)
+{
+	tcm->tcm_handle |= TC_H_MIN(cl);
+	return 0;
+}
+
+static int skbprio_dump_class_stats(struct Qdisc *sch, unsigned long cl,
+				   struct gnet_dump *d)
+{
+	struct skbprio_sched_data *q = qdisc_priv(sch);
+	if (gnet_stats_copy_queue(d, NULL, &q->qstats[cl - 1],
+		q->qstats[cl - 1].qlen) < 0)
+		return -1;
+	return 0;
+}
+
+static void skbprio_walk(struct Qdisc *sch, struct qdisc_walker *arg)
+{
+	unsigned int i;
+
+	if (arg->stop)
+		return;
+
+	for (i = 0; i < SKBPRIO_MAX_PRIORITY; i++) {
+		if (arg->count < arg->skip) {
+			arg->count++;
+			continue;
+		}
+		if (arg->fn(sch, i + 1, arg) < 0) {
+			arg->stop = 1;
+			break;
+		}
+		arg->count++;
+	}
+}
+
+static const struct Qdisc_class_ops skbprio_class_ops = {
+	.leaf		=	skbprio_leaf,
+	.find		=	skbprio_find,
+	.dump		=	skbprio_dump_class,
+	.dump_stats	=	skbprio_dump_class_stats,
+	.walk		=	skbprio_walk,
+};
+
+static struct Qdisc_ops skbprio_qdisc_ops __read_mostly = {
+	.cl_ops		=	&skbprio_class_ops,
+	.id		=	"skbprio",
+	.priv_size	=	sizeof(struct skbprio_sched_data),
+	.enqueue	=	skbprio_enqueue,
+	.dequeue	=	skbprio_dequeue,
+	.peek		=	qdisc_peek_dequeued,
+	.init		=	skbprio_init,
+	.reset		=	skbprio_reset,
+	.change		=	skbprio_change,
+	.dump		=	skbprio_dump,
+	.destroy	=	skbprio_destroy,
+	.owner		=	THIS_MODULE,
+};
+
+static int __init skbprio_module_init(void)
+{
+	return register_qdisc(&skbprio_qdisc_ops);
+}
+
+static void __exit skbprio_module_exit(void)
+{
+	unregister_qdisc(&skbprio_qdisc_ops);
+}
+
+module_init(skbprio_module_init)
+module_exit(skbprio_module_exit)
+
+MODULE_LICENSE("GPL");
-- 
1.9.1

^ permalink raw reply related

* [PATCH] qmi_wwan: add support for the Dell Wireless 5821e module
From: Aleksander Morgado @ 2018-06-23 21:22 UTC (permalink / raw)
  To: bjorn; +Cc: netdev, linux-usb, Aleksander Morgado

This module exposes two USB configurations: a QMI+AT capable setup on
USB config #1 and an MBIM capable setup on USB config #2.

By default the kernel will choose the MBIM capable configuration as
long as the cdc_mbim driver is available. This patch adds support for
the QMI port in the secondary configuration.

Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
---
 drivers/net/usb/qmi_wwan.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 8e8b51f171f4..8fac8e132c5b 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1246,6 +1246,7 @@ static const struct usb_device_id products[] = {
 	{QMI_FIXED_INTF(0x413c, 0x81b3, 8)},	/* Dell Wireless 5809e Gobi(TM) 4G LTE Mobile Broadband Card (rev3) */
 	{QMI_FIXED_INTF(0x413c, 0x81b6, 8)},	/* Dell Wireless 5811e */
 	{QMI_FIXED_INTF(0x413c, 0x81b6, 10)},	/* Dell Wireless 5811e */
+	{QMI_FIXED_INTF(0x413c, 0x81d7, 1)},	/* Dell Wireless 5821e */
 	{QMI_FIXED_INTF(0x03f0, 0x4e1d, 8)},	/* HP lt4111 LTE/EV-DO/HSPA+ Gobi 4G Module */
 	{QMI_FIXED_INTF(0x03f0, 0x9d1d, 1)},	/* HP lt4120 Snapdragon X5 LTE */
 	{QMI_FIXED_INTF(0x22de, 0x9061, 3)},	/* WeTelecom WPD-600N */
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH] ipv6: avoid copy_from_user() via ipv6_renew_options_kern()
From: Al Viro @ 2018-06-23 21:26 UTC (permalink / raw)
  To: David Miller; +Cc: pmoore, netdev, selinux, linux-security-module
In-Reply-To: <20180623.105706.385733107379565893.davem@davemloft.net>

On Sat, Jun 23, 2018 at 10:57:06AM +0900, David Miller wrote:
> From: Paul Moore <pmoore@redhat.com>
> Date: Fri, 22 Jun 2018 17:18:20 -0400
> 
> > -	const mm_segment_t old_fs = get_fs();
> > -
> > -	set_fs(KERNEL_DS);
> > -	ret_val = ipv6_renew_options(sk, opt, newtype,
> > -				     (struct ipv6_opt_hdr __user *)newopt,
> > -				     newoptlen);
> > -	set_fs(old_fs);
> 
> So is it really the case that the traditional construct:
> 
> 	set_fs(KERNEL_DS);
> 	... copy_{from,to}_user(...);
> 	set_fs(old_fs);
> 
> is no longer allowed?

s/no longer allowed/best avoided/, but IMO in this case the replacement is
too ugly to live ;-/

^ permalink raw reply

* Re: [PATCH v2 net-next] net/sched: add skbprio scheduler
From: Cong Wang @ 2018-06-23 21:43 UTC (permalink / raw)
  To: Nishanth Devarajan
  Cc: Jamal Hadi Salim, Jiri Pirko, David Miller,
	Linux Kernel Network Developers, Cody Doucette, Michel Machado
In-Reply-To: <20180623204745.GA4337@gmail.com>

On Sat, Jun 23, 2018 at 1:47 PM, Nishanth Devarajan <ndev2021@gmail.com> wrote:
> diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
> index 37b5096..6fd07e8 100644
> --- a/include/uapi/linux/pkt_sched.h
> +++ b/include/uapi/linux/pkt_sched.h
...
> +#define SKBPRIO_MAX_PRIORITY 64
> +
> +struct tc_skbprio_qopt {
> +       __u32   limit;          /* Queue length in packets. */
> +};


Since this is just an integer, could you make it an NLA_U32 attribute
instead of a struct?


> +static int skbprio_change(struct Qdisc *sch, struct nlattr *opt,
> +                       struct netlink_ext_ack *extack)
> +{
> +       struct skbprio_sched_data *q = qdisc_priv(sch);
> +       struct tc_skbprio_qopt *ctl = nla_data(opt);
> +       const unsigned int min_limit = 1;
> +
> +       if (ctl->limit == (typeof(ctl->limit))-1)
> +               q->max_limit = max(qdisc_dev(sch)->tx_queue_len, min_limit);
> +       else if (ctl->limit < min_limit ||
> +               ctl->limit > qdisc_dev(sch)->tx_queue_len)
> +               return -EINVAL;
> +       else
> +               q->max_limit = ctl->limit;
> +
> +       return 0;
> +}

Isn't q->max_limit the same as sch->limit?

Also, please avoid dev->tx_queue_len here, it may change
independently of your qdisc change, unless you want to implement
ops->change_tx_queue_len().
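
To make the concern concrete, here is a minimal userspace restatement of the
limit check quoted above, with the device queue length passed in as a plain
argument. The name model_change() and its signature are hypothetical and are
not part of the patch; it is only a sketch of the posted logic, not a
proposed replacement.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the limit handling in the quoted
 * skbprio_change(). tx_queue_len stands in for
 * qdisc_dev(sch)->tx_queue_len at the time of the call.
 */
static int model_change(uint32_t requested, uint32_t tx_queue_len,
			uint32_t *max_limit)
{
	const uint32_t min_limit = 1;

	assert(max_limit != NULL);

	/* (u32)-1 means "use the device default", but never less than 1. */
	if (requested == UINT32_MAX) {
		*max_limit = tx_queue_len > min_limit ? tx_queue_len : min_limit;
		return 0;
	}

	/* Reject limits below 1 or above the current device queue length. */
	if (requested < min_limit || requested > tx_queue_len)
		return -EINVAL;

	*max_limit = requested;
	return 0;
}
```

The same request is accepted or rejected depending on the value of
tx_queue_len at the moment of the call, and a limit that was stored earlier
is not revisited when the device value shrinks later.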

^ permalink raw reply

* Re: [PATCH] qmi_wwan: add support for the Dell Wireless 5821e module
From: Bjørn Mork @ 2018-06-23 21:56 UTC (permalink / raw)
  To: Aleksander Morgado; +Cc: netdev, linux-usb
In-Reply-To: <20180623212252.15026-1-aleksander@aleksander.es>

Aleksander Morgado <aleksander@aleksander.es> writes:

> This module exposes two USB configurations: a QMI+AT capable setup on
> USB config #1 and an MBIM capable setup on USB config #2.
>
> By default the kernel will choose the MBIM capable configuration as
> long as the cdc_mbim driver is available. This patch adds support for
> the QMI port in the secondary configuration.
>
> Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>

Acked-by: Bjørn Mork <bjorn@mork.no>

Please queue this up for stable too.  Thanks.

^ permalink raw reply

* Re: [PATCH v2 net-next] net/sched: add skbprio scheduler
From: Alexander Duyck @ 2018-06-23 22:10 UTC (permalink / raw)
  To: Nishanth Devarajan
  Cc: Cong Wang, Jamal Hadi Salim, Jiri Pirko, David Miller, Netdev,
	doucette, michel
In-Reply-To: <20180623204745.GA4337@gmail.com>

On Sat, Jun 23, 2018 at 1:47 PM, Nishanth Devarajan <ndev2021@gmail.com> wrote:
> net/sched: add skbprio scheduler
>
> Skbprio (SKB Priority Queue) is a queueing discipline that prioritizes packets
> according to their skb->priority field. Although Skbprio can be employed in any
> scenario in which a higher skb->priority field means a higher priority packet,
> Skbprio was conceived as a solution for denial-of-service defenses that need to
> route packets with different priorities.

Really this description is not very good. Reading it I was thinking to
myself "why do we need this, prio already does this". It wasn't until
I read through the code that I figured out that you are basically
adding dropping of lower priority frames.
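
For anyone else reading along, this is roughly how I understand the enqueue
policy. It is a userspace sketch with hypothetical names (prio_model,
model_enqueue) that only counts packets per priority. It scans for the lowest
busy priority instead of tracking it as the patch does, and it assumes a
limit of at least 1.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_PRIO 64	/* mirrors SKBPRIO_MAX_PRIORITY in the patch */

/* Hypothetical model: packet counts only, no real packets. */
struct prio_model {
	unsigned int count[MODEL_MAX_PRIO];	/* packets queued per priority */
	unsigned int qlen;			/* total packets queued */
	unsigned int limit;			/* global limit, assumed >= 1 */
};

/* Lowest priority that currently holds a packet; needs a non-empty model. */
static unsigned int model_lowest_busy(const struct prio_model *m)
{
	unsigned int p;

	assert(m->qlen > 0);
	for (p = 0; p < MODEL_MAX_PRIO; p++)
		if (m->count[p])
			return p;
	return MODEL_MAX_PRIO - 1;	/* not reached while qlen > 0 */
}

/* Returns true if the new packet was queued, false if it was dropped. */
static bool model_enqueue(struct prio_model *m, unsigned int prio)
{
	unsigned int lp;

	assert(m->limit >= 1);
	if (prio >= MODEL_MAX_PRIO)
		prio = MODEL_MAX_PRIO - 1;

	/* Room left: accept regardless of priority. */
	if (m->qlen < m->limit) {
		m->count[prio]++;
		m->qlen++;
		return true;
	}

	/* Full: a packet that is not above the lowest queued one is dropped. */
	lp = model_lowest_busy(m);
	if (prio <= lp)
		return false;

	/* Full: evict one lowest-priority packet to make room; qlen unchanged. */
	m->count[lp]--;
	m->count[prio]++;
	return true;
}
```

The difference from prio is the last branch: the total is bounded globally
and the lowest priority pays for any higher-priority arrival once the queue
is full.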

>
> v2
> *Use skb->priority field rather than DS field. Rename queueing discipline as
> SKB Priority Queue (previously Gatekeeper Priority Queue).
>
> *Queueing discipline is made classful to expose Skbprio's internal priority
> queues.
>
> Signed-off-by: Nishanth Devarajan <ndev2021@gmail.com>
> Reviewed-by: Sachin Paryani <sachin.paryani@gmail.com>
> Reviewed-by: Cody Doucette <doucette@bu.edu>
> Reviewed-by: Michel Machado <michel@digirati.com.br>
> ---
>  include/uapi/linux/pkt_sched.h |  15 ++
>  net/sched/Kconfig              |  13 ++
>  net/sched/Makefile             |   1 +
>  net/sched/sch_skbprio.c        | 347 +++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 376 insertions(+)
>  create mode 100644 net/sched/sch_skbprio.c
>
> diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
> index 37b5096..6fd07e8 100644
> --- a/include/uapi/linux/pkt_sched.h
> +++ b/include/uapi/linux/pkt_sched.h
> @@ -124,6 +124,21 @@ struct tc_fifo_qopt {
>         __u32   limit;  /* Queue length: bytes for bfifo, packets for pfifo */
>  };
>
> +/* SKBPRIO section */
> +
> +/*
> + * Priorities go from zero to (SKBPRIO_MAX_PRIORITY - 1).
> + * SKBPRIO_MAX_PRIORITY should be at least 64 in order for skbprio to be able
> + * to map one to one the DS field of IPV4 and IPV6 headers.
> + * Memory allocation grows linearly with SKBPRIO_MAX_PRIORITY.
> + */
> +
> +#define SKBPRIO_MAX_PRIORITY 64
> +
> +struct tc_skbprio_qopt {
> +       __u32   limit;          /* Queue length in packets. */
> +};
> +
>  /* PRIO section */
>
>  #define TCQ_PRIO_BANDS 16
> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
> index a01169f..9ac4b53 100644
> --- a/net/sched/Kconfig
> +++ b/net/sched/Kconfig
> @@ -240,6 +240,19 @@ config NET_SCH_MQPRIO
>
>           If unsure, say N.
>
> +config NET_SCH_SKBPRIO
> +       tristate "SKB priority queue scheduler (SKBPRIO)"
> +       help
> +         Say Y here if you want to use the SKB priority queue
> +         scheduler. This schedules packets according to skb->priority,
> +         which is useful for request packets in DoS mitigation systems such
> +         as Gatekeeper.
> +
> +         To compile this driver as a module, choose M here: the module will
> +         be called sch_skbprio.
> +
> +         If unsure, say N.
> +
>  config NET_SCH_CHOKE
>         tristate "CHOose and Keep responsive flow scheduler (CHOKE)"
>         help
> diff --git a/net/sched/Makefile b/net/sched/Makefile
> index 8811d38..a4d8893 100644
> --- a/net/sched/Makefile
> +++ b/net/sched/Makefile
> @@ -46,6 +46,7 @@ obj-$(CONFIG_NET_SCH_NETEM)   += sch_netem.o
>  obj-$(CONFIG_NET_SCH_DRR)      += sch_drr.o
>  obj-$(CONFIG_NET_SCH_PLUG)     += sch_plug.o
>  obj-$(CONFIG_NET_SCH_MQPRIO)   += sch_mqprio.o
> +obj-$(CONFIG_NET_SCH_SKBPRIO)  += sch_skbprio.o
>  obj-$(CONFIG_NET_SCH_CHOKE)    += sch_choke.o
>  obj-$(CONFIG_NET_SCH_QFQ)      += sch_qfq.o
>  obj-$(CONFIG_NET_SCH_CODEL)    += sch_codel.o
> diff --git a/net/sched/sch_skbprio.c b/net/sched/sch_skbprio.c
> new file mode 100644
> index 0000000..5e89446
> --- /dev/null
> +++ b/net/sched/sch_skbprio.c
> @@ -0,0 +1,347 @@
> +/*
> + * net/sched/sch_skbprio.c  SKB Priority Queue.
> + *
> + *             This program is free software; you can redistribute it and/or
> + *             modify it under the terms of the GNU General Public License
> + *             as published by the Free Software Foundation; either version
> + *             2 of the License, or (at your option) any later version.
> + *
> + * Authors:    Nishanth Devarajan, <ndev2021@gmail.com>
> + *             Cody Doucette, <doucette@bu.edu>
> + *             original idea by Michel Machado, Cody Doucette, and Qiaobin Fu
> + */
> +
> +#include <linux/string.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/errno.h>
> +#include <linux/skbuff.h>
> +#include <net/pkt_sched.h>
> +#include <net/sch_generic.h>
> +#include <net/inet_ecn.h>
> +
> +
> +/*       SKB Priority Queue
> + *     =================================
> + *
> + * This qdisc schedules a packet according to skb->priority, where a higher
> + * value places the packet closer to the exit of the queue. When the queue is
> + * full, the lowest priority packet in the queue is dropped to make room for
> + * the packet to be added if it has higher priority. If the packet to be added
> + * has lower priority than all packets in the queue, it is dropped.
> + *
> + * Without the SKB priority queue, queue length limits must be imposed
> + * for individual queues, and there is no easy way to enforce a global queue
> + * length limit across all priorities. With the SKBprio queue, a global
> + * queue length limit can be enforced while not restricting the queue lengths
> + * of individual priorities.
> + *
> + * This is especially useful for a denial-of-service defense system like
> + * Gatekeeper, which prioritizes packets in flows that demonstrate expected
> + * behavior of legitimate users. The queue is flexible to allow any number
> + * of packets of any priority up to the global limit of the scheduler
> + * without risking resource overconsumption by a flood of low priority packets.
> + *
> + * The Gatekeeper codebase is found here:
> + *
> + *             https://github.com/AltraMayor/gatekeeper

Do we really need to be linking to third party projects in the code? I
am not even really sure it is related here since I cannot find
anything that references the queuing discipline.

Also this seems like it should be in the Documentation/networking
folder rather than being stored in the code itself.

> + */
> +
> +struct skbprio_sched_data {
> +       /* Parameters. */
> +       u32 max_limit;
> +
> +       /* Queue state. */
> +       struct sk_buff_head qdiscs[SKBPRIO_MAX_PRIORITY];
> +       struct gnet_stats_queue qstats[SKBPRIO_MAX_PRIORITY];
> +       u16 highest_prio;
> +       u16 lowest_prio;
> +};
> +
> +static u16 calc_new_high_prio(const struct skbprio_sched_data *q)
> +{
> +       int prio;
> +
> +       for (prio = q->highest_prio - 1; prio >= q->lowest_prio; prio--) {
> +               if (!skb_queue_empty(&q->qdiscs[prio]))
> +                       return prio;
> +       }
> +
> +       /* SKB queue is empty, return 0 (default highest priority setting). */
> +       return 0;
> +}
> +
> +static u16 calc_new_low_prio(const struct skbprio_sched_data *q)
> +{
> +       int prio;
> +
> +       for (prio = q->lowest_prio + 1; prio <= q->highest_prio; prio++) {
> +               if (!skb_queue_empty(&q->qdiscs[prio]))
> +                       return prio;
> +       }
> +
> +       /* SKB queue is empty, return SKBPRIO_MAX_PRIORITY - 1
> +        * (default lowest priority setting).
> +        */
> +       return SKBPRIO_MAX_PRIORITY - 1;
> +}
> +
> +static int skbprio_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> +                         struct sk_buff **to_free)
> +{
> +       const unsigned int max_priority = SKBPRIO_MAX_PRIORITY - 1;
> +       struct skbprio_sched_data *q = qdisc_priv(sch);
> +       struct sk_buff_head *qdisc;
> +       struct sk_buff_head *lp_qdisc;
> +       struct sk_buff *to_drop;
> +       u16 prio, lp;
> +
> +       /* Obtain the priority of @skb. */
> +       prio = min(skb->priority, max_priority);
> +
> +       qdisc = &q->qdiscs[prio];
> +       if (sch->q.qlen < q->max_limit) {
> +               __skb_queue_tail(qdisc, skb);
> +               qdisc_qstats_backlog_inc(sch, skb);
> +               q->qstats[prio].backlog += qdisc_pkt_len(skb);
> +
> +               /* Check to update highest and lowest priorities. */
> +               if (prio > q->highest_prio)
> +                       q->highest_prio = prio;
> +
> +               if (prio < q->lowest_prio)
> +                       q->lowest_prio = prio;
> +
> +               sch->q.qlen++;
> +               return NET_XMIT_SUCCESS;
> +       }
> +
> +       /* If this packet has the lowest priority, drop it. */
> +       lp = q->lowest_prio;
> +       if (prio <= lp) {
> +               q->qstats[prio].drops++;
> +               return qdisc_drop(skb, sch, to_free);
> +       }
> +
> +       /* Drop the packet at the tail of the lowest priority qdisc. */
> +       lp_qdisc = &q->qdiscs[lp];
> +       to_drop = __skb_dequeue_tail(lp_qdisc);

Is there a specific reason for dropping tail instead of head? I would
think you might see better latency numbers with this if you were to
focus on dropping older packets from the lowest priority instead of
the newest ones. Then things like TCP would likely be able to respond
more quickly to the congestion effects and would be able to more
accurately determine actual round trip time versus buffer delay.
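
A small userspace sketch of what I mean, with hypothetical names and a fixed
four-slot queue that stores only arrival times. It is not kernel code and
makes no claim about measured latency; it just shows which packet survives
under each policy.

```c
#include <assert.h>
#include <stddef.h>

#define FIFO_CAP 4	/* hypothetical capacity, chosen only for the demo */

/* Hypothetical model of one priority's queue: arrival times, oldest first. */
struct fifo {
	unsigned int arrival[FIFO_CAP];
	size_t len;
};

/* Append a packet that arrived at time t; the caller ensures there is room. */
static void fifo_push(struct fifo *f, unsigned int t)
{
	assert(f->len < FIFO_CAP);
	f->arrival[f->len++] = t;
}

/* Drop the newest packet (tail drop). Returns its arrival time. */
static unsigned int fifo_drop_tail(struct fifo *f)
{
	assert(f->len > 0);
	return f->arrival[--f->len];
}

/* Drop the oldest packet (head drop). Returns its arrival time. */
static unsigned int fifo_drop_head(struct fifo *f)
{
	unsigned int t;
	size_t i;

	assert(f->len > 0);
	t = f->arrival[0];
	for (i = 1; i < f->len; i++)
		f->arrival[i - 1] = f->arrival[i];
	f->len--;
	return t;
}

/* Time the oldest remaining packet has spent in the queue at time now. */
static unsigned int fifo_oldest_age(const struct fifo *f, unsigned int now)
{
	assert(f->len > 0);
	return now - f->arrival[0];
}
```

With arrivals at t = 0, 1, 2, 3 and a drop at t = 4, tail drop leaves the
packet from t = 0 at the front of the queue, while head drop leaves the one
from t = 1, so what remains queued is fresher.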

> +       BUG_ON(!to_drop);
> +       qdisc_qstats_backlog_dec(sch, to_drop);
> +       qdisc_drop(to_drop, sch, to_free);
> +
> +       q->qstats[lp].backlog -= qdisc_pkt_len(to_drop);
> +       q->qstats[lp].drops++;
> +
> +       __skb_queue_tail(qdisc, skb);
> +       qdisc_qstats_backlog_inc(sch, skb);
> +       q->qstats[prio].backlog += qdisc_pkt_len(skb);
> +

You could probably just take care of enqueueing first, and then take
care of the dequeue. That way you can drop the references to skb
earlier from the stack or free up whatever register is storing it.

> +       /* Check to update highest and lowest priorities. */
> +       if (skb_queue_empty(lp_qdisc)) {
> +               if (q->lowest_prio == q->highest_prio) {
> +                       BUG_ON(sch->q.qlen);
> +                       q->lowest_prio = prio;
> +                       q->highest_prio = prio;
> +               } else {
> +                       q->lowest_prio = calc_new_low_prio(q);
> +               }
> +       }
> +
> +       if (prio > q->highest_prio)
> +               q->highest_prio = prio;
> +
> +       return NET_XMIT_CN;
> +}
> +
> +static struct sk_buff *skbprio_dequeue(struct Qdisc *sch)
> +{
> +       struct skbprio_sched_data *q = qdisc_priv(sch);
> +       struct sk_buff_head *hpq = &q->qdiscs[q->highest_prio];
> +       struct sk_buff *skb = __skb_dequeue(hpq);
> +
> +       if (unlikely(!skb))
> +               return NULL;
> +
> +       sch->q.qlen--;
> +       qdisc_qstats_backlog_dec(sch, skb);
> +       qdisc_bstats_update(sch, skb);
> +
> +       q->qstats[q->highest_prio].backlog -= qdisc_pkt_len(skb);
> +
> +       /* Update highest priority field. */
> +       if (skb_queue_empty(hpq)) {
> +               if (q->lowest_prio == q->highest_prio) {
> +                       BUG_ON(sch->q.qlen);
> +                       q->highest_prio = 0;
> +                       q->lowest_prio = SKBPRIO_MAX_PRIORITY - 1;
> +               } else {
> +                       q->highest_prio = calc_new_high_prio(q);
> +               }
> +       }
> +       return skb;
> +}
> +
> +static int skbprio_change(struct Qdisc *sch, struct nlattr *opt,
> +                       struct netlink_ext_ack *extack)
> +{
> +       struct skbprio_sched_data *q = qdisc_priv(sch);
> +       struct tc_skbprio_qopt *ctl = nla_data(opt);
> +       const unsigned int min_limit = 1;
> +
> +       if (ctl->limit == (typeof(ctl->limit))-1)
> +               q->max_limit = max(qdisc_dev(sch)->tx_queue_len, min_limit);
> +       else if (ctl->limit < min_limit ||
> +               ctl->limit > qdisc_dev(sch)->tx_queue_len)
> +               return -EINVAL;
> +       else
> +               q->max_limit = ctl->limit;
> +
> +       return 0;
> +}
> +
> +static int skbprio_init(struct Qdisc *sch, struct nlattr *opt,
> +                       struct netlink_ext_ack *extack)
> +{
> +       struct skbprio_sched_data *q = qdisc_priv(sch);
> +       const unsigned int min_limit = 1;
> +       int prio;
> +
> +       /* Initialise all queues, one for each possible priority. */
> +       for (prio = 0; prio < SKBPRIO_MAX_PRIORITY; prio++)
> +               __skb_queue_head_init(&q->qdiscs[prio]);
> +
> +       memset(&q->qstats, 0, sizeof(q->qstats));
> +       q->highest_prio = 0;
> +       q->lowest_prio = SKBPRIO_MAX_PRIORITY - 1;
> +       if (!opt) {
> +               q->max_limit = max(qdisc_dev(sch)->tx_queue_len, min_limit);
> +               return 0;
> +       }
> +       return skbprio_change(sch, opt, extack);
> +}
> +
> +static int skbprio_dump(struct Qdisc *sch, struct sk_buff *skb)
> +{
> +       struct skbprio_sched_data *q = qdisc_priv(sch);
> +       struct tc_skbprio_qopt opt;
> +
> +       opt.limit = q->max_limit;
> +
> +       if (nla_put(skb, TCA_OPTIONS, sizeof(opt), &opt))
> +               return -1;
> +
> +       return skb->len;
> +}
> +
> +static void skbprio_reset(struct Qdisc *sch)
> +{
> +       struct skbprio_sched_data *q = qdisc_priv(sch);
> +       int prio;
> +
> +       sch->qstats.backlog = 0;
> +       sch->q.qlen = 0;
> +
> +       for (prio = 0; prio < SKBPRIO_MAX_PRIORITY; prio++)
> +               __skb_queue_purge(&q->qdiscs[prio]);
> +
> +       memset(&q->qstats, 0, sizeof(q->qstats));
> +       q->highest_prio = 0;
> +       q->lowest_prio = SKBPRIO_MAX_PRIORITY - 1;
> +}
> +
> +static void skbprio_destroy(struct Qdisc *sch)
> +{
> +       struct skbprio_sched_data *q = qdisc_priv(sch);
> +       int prio;
> +
> +       for (prio = 0; prio < SKBPRIO_MAX_PRIORITY; prio++)
> +               __skb_queue_purge(&q->qdiscs[prio]);
> +}
> +
> +static struct Qdisc *skbprio_leaf(struct Qdisc *sch, unsigned long arg)
> +{
> +       return NULL;
> +}
> +
> +static unsigned long skbprio_find(struct Qdisc *sch, u32 classid)
> +{
> +       return 0;
> +}
> +
> +static int skbprio_dump_class(struct Qdisc *sch, unsigned long cl,
> +                            struct sk_buff *skb, struct tcmsg *tcm)
> +{
> +       tcm->tcm_handle |= TC_H_MIN(cl);
> +       return 0;
> +}
> +
> +static int skbprio_dump_class_stats(struct Qdisc *sch, unsigned long cl,
> +                                  struct gnet_dump *d)
> +{
> +       struct skbprio_sched_data *q = qdisc_priv(sch);
> +       if (gnet_stats_copy_queue(d, NULL, &q->qstats[cl - 1],
> +               q->qstats[cl - 1].qlen) < 0)
> +               return -1;
> +       return 0;
> +}
> +
> +static void skbprio_walk(struct Qdisc *sch, struct qdisc_walker *arg)
> +{
> +       unsigned int i;
> +
> +       if (arg->stop)
> +               return;
> +
> +       for (i = 0; i < SKBPRIO_MAX_PRIORITY; i++) {
> +               if (arg->count < arg->skip) {
> +                       arg->count++;
> +                       continue;
> +               }
> +               if (arg->fn(sch, i + 1, arg) < 0) {
> +                       arg->stop = 1;
> +                       break;
> +               }
> +               arg->count++;
> +       }
> +}
> +
> +static const struct Qdisc_class_ops skbprio_class_ops = {
> +       .leaf           =       skbprio_leaf,
> +       .find           =       skbprio_find,
> +       .dump           =       skbprio_dump_class,
> +       .dump_stats     =       skbprio_dump_class_stats,
> +       .walk           =       skbprio_walk,
> +};
> +
> +static struct Qdisc_ops skbprio_qdisc_ops __read_mostly = {
> +       .cl_ops         =       &skbprio_class_ops,
> +       .id             =       "skbprio",
> +       .priv_size      =       sizeof(struct skbprio_sched_data),
> +       .enqueue        =       skbprio_enqueue,
> +       .dequeue        =       skbprio_dequeue,
> +       .peek           =       qdisc_peek_dequeued,
> +       .init           =       skbprio_init,
> +       .reset          =       skbprio_reset,
> +       .change         =       skbprio_change,
> +       .dump           =       skbprio_dump,
> +       .destroy        =       skbprio_destroy,
> +       .owner          =       THIS_MODULE,
> +};
> +
> +static int __init skbprio_module_init(void)
> +{
> +       return register_qdisc(&skbprio_qdisc_ops);
> +}
> +
> +static void __exit skbprio_module_exit(void)
> +{
> +       unregister_qdisc(&skbprio_qdisc_ops);
> +}
> +
> +module_init(skbprio_module_init)
> +module_exit(skbprio_module_exit)
> +
> +MODULE_LICENSE("GPL");
> --
> 1.9.1
>

^ permalink raw reply

* [PATCH] brcmsmac: make function wlc_phy_workarounds_nphy_rev1 static
From: Colin King @ 2018-06-23 22:15 UTC (permalink / raw)
  To: Arend van Spriel, Franky Lin, Hante Meuleman, Chi-Hsien Lin,
	Wright Feng, Kalle Valo, David S . Miller, linux-wireless,
	brcm80211-dev-list.pdl, brcm80211-dev-list, netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

The function wlc_phy_workarounds_nphy_rev1 is local to the source and
does not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol 'wlc_phy_workarounds_nphy_rev1' was not declared. Should it
be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
index 1a187557982e..bedec1606caa 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c
@@ -16904,7 +16904,7 @@ static void wlc_phy_workarounds_nphy_rev3(struct brcms_phy *pi)
 	}
 }
 
-void wlc_phy_workarounds_nphy_rev1(struct brcms_phy *pi)
+static void wlc_phy_workarounds_nphy_rev1(struct brcms_phy *pi)
 {
 	static const u8 rfseq_rx2tx_events[] = {
 		NPHY_RFSEQ_CMD_NOP,
-- 
2.17.0

^ permalink raw reply related

* Re: [PATCH] ipv6: avoid copy_from_user() via ipv6_renew_options_kern()
From: Al Viro @ 2018-06-23 22:21 UTC (permalink / raw)
  To: David Miller; +Cc: pmoore, netdev, selinux, linux-security-module
In-Reply-To: <20180623212626.GD30522@ZenIV.linux.org.uk>

On Sat, Jun 23, 2018 at 10:26:27PM +0100, Al Viro wrote:
> On Sat, Jun 23, 2018 at 10:57:06AM +0900, David Miller wrote:
> > From: Paul Moore <pmoore@redhat.com>
> > Date: Fri, 22 Jun 2018 17:18:20 -0400
> > 
> > > -	const mm_segment_t old_fs = get_fs();
> > > -
> > > -	set_fs(KERNEL_DS);
> > > -	ret_val = ipv6_renew_options(sk, opt, newtype,
> > > -				     (struct ipv6_opt_hdr __user *)newopt,
> > > -				     newoptlen);
> > > -	set_fs(old_fs);
> > 
> > So is it really the case that the traditional construct:
> > 
> > 	set_fs(KERNEL_DS);
> > 	... copy_{from,to}_user(...);
> > 	set_fs(old_fs);
> > 
> > is no longer allowed?
> 
> s/no longer allowed/best avoided/, but IMO in this case the replacement is
> too ugly to live ;-/

BTW, I wonder if life would be simpler with do_ipv6_setsockopt() doing
the copy-in and verifying ipv6_optlen(*hdr) <= newoptlen; that would
simplify ipv6_renew_option{,s}() quite a bit and completely eliminate
ipv6_renew_options_kern()...

Incidentally, is copying the entire value in case newoptlen > ipv6_optlen(...)
the right thing?  After all, the next update of any of those options will
quietly lose everything past ipv6_optlen(...), won't it?
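
The check described above can be modelled in user space. This is only a sketch under stated assumptions: struct opt_hdr, opt_len() and copy_in_option() are hypothetical stand-ins (not kernel API), malloc()/memcpy() replace the kernel copy-in, and the helper keeps only the bytes the header claims, which is one possible answer to the question about data past ipv6_optlen(...).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* User-space model of the first two bytes of an IPv6 extension header. */
struct opt_hdr {
	uint8_t nexthdr;
	uint8_t hdrlen;	/* 8-octet units, not counting the first 8 octets */
};

/* Same arithmetic as the kernel's ipv6_optlen(): (hdrlen + 1) * 8. */
static size_t opt_len(const struct opt_hdr *h)
{
	return ((size_t)h->hdrlen + 1) << 3;
}

/*
 * Hypothetical helper: apply the setsockopt-style sanity checks to a
 * caller-supplied buffer, then copy only the opt_len() bytes that the
 * header describes.  Returns NULL if the buffer is rejected or the
 * allocation fails; the caller frees the result.
 */
static struct opt_hdr *copy_in_option(const void *buf, size_t buflen)
{
	struct opt_hdr hdr;
	struct opt_hdr *copy;

	assert(buf != NULL);

	/* too short, not a multiple of 8, or longer than 8 * 255 */
	if (buflen < sizeof(hdr) || (buflen & 0x7) || buflen > 8 * 255)
		return NULL;

	memcpy(&hdr, buf, sizeof(hdr));
	if (opt_len(&hdr) > buflen)	/* header claims more than was passed */
		return NULL;

	copy = malloc(opt_len(&hdr));
	if (copy)
		memcpy(copy, buf, opt_len(&hdr));
	return copy;
}
```

With the length validated once at copy-in, later code can rely on opt_len() alone and no separate length argument needs to be carried around.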

IOW, how about the following (completely untested):

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 16475c269749..d02881e4ad1f 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -355,14 +355,7 @@ struct ipv6_txoptions *ipv6_dup_options(struct sock *sk,
 struct ipv6_txoptions *ipv6_renew_options(struct sock *sk,
 					  struct ipv6_txoptions *opt,
 					  int newtype,
-					  struct ipv6_opt_hdr __user *newopt,
-					  int newoptlen);
-struct ipv6_txoptions *
-ipv6_renew_options_kern(struct sock *sk,
-			struct ipv6_txoptions *opt,
-			int newtype,
-			struct ipv6_opt_hdr *newopt,
-			int newoptlen);
+					  struct ipv6_opt_hdr *newopt);
 struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
 					  struct ipv6_txoptions *opt);
 
diff --git a/net/ipv6/calipso.c b/net/ipv6/calipso.c
index 1323b9679cf7..1c0bb9fb76e6 100644
--- a/net/ipv6/calipso.c
+++ b/net/ipv6/calipso.c
@@ -799,8 +799,7 @@ static int calipso_opt_update(struct sock *sk, struct ipv6_opt_hdr *hop)
 {
 	struct ipv6_txoptions *old = txopt_get(inet6_sk(sk)), *txopts;
 
-	txopts = ipv6_renew_options_kern(sk, old, IPV6_HOPOPTS,
-					 hop, hop ? ipv6_optlen(hop) : 0);
+	txopts = ipv6_renew_options(sk, old, IPV6_HOPOPTS, hop);
 	txopt_put(old);
 	if (IS_ERR(txopts))
 		return PTR_ERR(txopts);
@@ -1222,8 +1221,7 @@ static int calipso_req_setattr(struct request_sock *req,
 	if (IS_ERR(new))
 		return PTR_ERR(new);
 
-	txopts = ipv6_renew_options_kern(sk, req_inet->ipv6_opt, IPV6_HOPOPTS,
-					 new, new ? ipv6_optlen(new) : 0);
+	txopts = ipv6_renew_options(sk, req_inet->ipv6_opt, IPV6_HOPOPTS, new);
 
 	kfree(new);
 
@@ -1260,8 +1258,7 @@ static void calipso_req_delattr(struct request_sock *req)
 	if (calipso_opt_del(req_inet->ipv6_opt->hopopt, &new))
 		return; /* Nothing to do */
 
-	txopts = ipv6_renew_options_kern(sk, req_inet->ipv6_opt, IPV6_HOPOPTS,
-					 new, new ? ipv6_optlen(new) : 0);
+	txopts = ipv6_renew_options(sk, req_inet->ipv6_opt, IPV6_HOPOPTS, new);
 
 	if (!IS_ERR(txopts)) {
 		txopts = xchg(&req_inet->ipv6_opt, txopts);
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 5bc2bf3733ab..4b6915eedfc3 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -1015,29 +1015,15 @@ ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt)
 }
 EXPORT_SYMBOL_GPL(ipv6_dup_options);
 
-static int ipv6_renew_option(void *ohdr,
-			     struct ipv6_opt_hdr __user *newopt, int newoptlen,
-			     int inherit,
+static void ipv6_renew_option(struct ipv6_opt_hdr *opt,
 			     struct ipv6_opt_hdr **hdr,
 			     char **p)
 {
-	if (inherit) {
-		if (ohdr) {
-			memcpy(*p, ohdr, ipv6_optlen((struct ipv6_opt_hdr *)ohdr));
-			*hdr = (struct ipv6_opt_hdr *)*p;
-			*p += CMSG_ALIGN(ipv6_optlen(*hdr));
-		}
-	} else {
-		if (newopt) {
-			if (copy_from_user(*p, newopt, newoptlen))
-				return -EFAULT;
-			*hdr = (struct ipv6_opt_hdr *)*p;
-			if (ipv6_optlen(*hdr) > newoptlen)
-				return -EINVAL;
-			*p += CMSG_ALIGN(newoptlen);
-		}
+	if (opt) {
+		memcpy(*p, opt, ipv6_optlen(opt));
+		*hdr = (struct ipv6_opt_hdr *)*p;
+		*p += CMSG_ALIGN(ipv6_optlen(*hdr));
 	}
-	return 0;
 }
 
 /**
@@ -1063,13 +1049,11 @@ static int ipv6_renew_option(void *ohdr,
  */
 struct ipv6_txoptions *
 ipv6_renew_options(struct sock *sk, struct ipv6_txoptions *opt,
-		   int newtype,
-		   struct ipv6_opt_hdr __user *newopt, int newoptlen)
+		   int newtype, struct ipv6_opt_hdr *newopt)
 {
 	int tot_len = 0;
 	char *p;
 	struct ipv6_txoptions *opt2;
-	int err;
 
 	if (opt) {
 		if (newtype != IPV6_HOPOPTS && opt->hopopt)
@@ -1082,8 +1066,8 @@ ipv6_renew_options(struct sock *sk, struct ipv6_txoptions *opt,
 			tot_len += CMSG_ALIGN(ipv6_optlen(opt->dst1opt));
 	}
 
-	if (newopt && newoptlen)
-		tot_len += CMSG_ALIGN(newoptlen);
+	if (newopt)
+		tot_len += CMSG_ALIGN(ipv6_optlen(newopt));
 
 	if (!tot_len)
 		return NULL;
@@ -1098,67 +1082,25 @@ ipv6_renew_options(struct sock *sk, struct ipv6_txoptions *opt,
 	opt2->tot_len = tot_len;
 	p = (char *)(opt2 + 1);
 
-	err = ipv6_renew_option(opt ? opt->hopopt : NULL, newopt, newoptlen,
-				newtype != IPV6_HOPOPTS,
-				&opt2->hopopt, &p);
-	if (err)
-		goto out;
-
-	err = ipv6_renew_option(opt ? opt->dst0opt : NULL, newopt, newoptlen,
-				newtype != IPV6_RTHDRDSTOPTS,
-				&opt2->dst0opt, &p);
-	if (err)
-		goto out;
-
-	err = ipv6_renew_option(opt ? opt->srcrt : NULL, newopt, newoptlen,
-				newtype != IPV6_RTHDR,
-				(struct ipv6_opt_hdr **)&opt2->srcrt, &p);
-	if (err)
-		goto out;
-
-	err = ipv6_renew_option(opt ? opt->dst1opt : NULL, newopt, newoptlen,
-				newtype != IPV6_DSTOPTS,
-				&opt2->dst1opt, &p);
-	if (err)
-		goto out;
-
+	ipv6_renew_option(newtype == IPV6_HOPOPTS ? newopt :
+				opt ? opt->hopopt : NULL,
+			  &opt2->hopopt, &p);
+
+	ipv6_renew_option(newtype == IPV6_RTHDRDSTOPTS ? newopt :
+				opt ? opt->dst0opt : NULL,
+			&opt2->dst0opt, &p);
+	ipv6_renew_option(newtype == IPV6_RTHDR ? newopt :
+				opt ? (struct ipv6_opt_hdr *)opt->srcrt : NULL,
+			(struct ipv6_opt_hdr **)&opt2->srcrt, &p);
+	ipv6_renew_option(newtype == IPV6_DSTOPTS ? newopt :
+				opt ? opt->dst1opt : NULL,
+			&opt2->dst1opt, &p);
 	opt2->opt_nflen = (opt2->hopopt ? ipv6_optlen(opt2->hopopt) : 0) +
 			  (opt2->dst0opt ? ipv6_optlen(opt2->dst0opt) : 0) +
 			  (opt2->srcrt ? ipv6_optlen(opt2->srcrt) : 0);
 	opt2->opt_flen = (opt2->dst1opt ? ipv6_optlen(opt2->dst1opt) : 0);
 
 	return opt2;
-out:
-	sock_kfree_s(sk, opt2, opt2->tot_len);
-	return ERR_PTR(err);
-}
-
-/**
- * ipv6_renew_options_kern - replace a specific ext hdr with a new one.
- *
- * @sk: sock from which to allocate memory
- * @opt: original options
- * @newtype: option type to replace in @opt
- * @newopt: new option of type @newtype to replace (kernel-mem)
- * @newoptlen: length of @newopt
- *
- * See ipv6_renew_options().  The difference is that @newopt is
- * kernel memory, rather than user memory.
- */
-struct ipv6_txoptions *
-ipv6_renew_options_kern(struct sock *sk, struct ipv6_txoptions *opt,
-			int newtype, struct ipv6_opt_hdr *newopt,
-			int newoptlen)
-{
-	struct ipv6_txoptions *ret_val;
-	const mm_segment_t old_fs = get_fs();
-
-	set_fs(KERNEL_DS);
-	ret_val = ipv6_renew_options(sk, opt, newtype,
-				     (struct ipv6_opt_hdr __user *)newopt,
-				     newoptlen);
-	set_fs(old_fs);
-	return ret_val;
 }
 
 struct ipv6_txoptions *ipv6_fixup_options(struct ipv6_txoptions *opt_space,
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 4d780c7f0130..b69ba18e0138 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -398,6 +398,12 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 	case IPV6_DSTOPTS:
 	{
 		struct ipv6_txoptions *opt;
+		struct ipv6_opt_hdr *new = NULL;
+
+		/* hop-by-hop / destination options are privileged option */
+		retv = -EPERM;
+		if (optname != IPV6_RTHDR && !ns_capable(net->user_ns, CAP_NET_RAW))
+			break;
 
 		/* remove any sticky options header with a zero option
 		 * length, per RFC3542.
@@ -409,17 +415,23 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		else if (optlen < sizeof(struct ipv6_opt_hdr) ||
 			 optlen & 0x7 || optlen > 8 * 255)
 			goto e_inval;
-
-		/* hop-by-hop / destination options are privileged option */
-		retv = -EPERM;
-		if (optname != IPV6_RTHDR && !ns_capable(net->user_ns, CAP_NET_RAW))
-			break;
+		else {
+			new = memdup_user(optval, optlen);
+			if (IS_ERR(new)) {
+				retv = PTR_ERR(new);
+				break;
+			}
+			if (unlikely(ipv6_optlen(new) > optlen)) {
+				kfree(new);
+				retv = -EINVAL;
+				break;
+			}
+		}
 
 		opt = rcu_dereference_protected(np->opt,
 						lockdep_sock_is_held(sk));
-		opt = ipv6_renew_options(sk, opt, optname,
-					 (struct ipv6_opt_hdr __user *)optval,
-					 optlen);
+		opt = ipv6_renew_options(sk, opt, optname, new);
+		kfree(new);
 		if (IS_ERR(opt)) {
 			retv = PTR_ERR(opt);
 			break;

^ permalink raw reply related

* [PATCH net-next] net: phy: fixed-phy: Make the error path simpler
From: Fabio Estevam @ 2018-06-24  0:28 UTC (permalink / raw)
  To: davem; +Cc: andrew, f.fainelli, netdev, Fabio Estevam

From: Fabio Estevam <fabio.estevam@nxp.com>

When platform_device_register_simple() fails we can return
the error immediately instead of jumping to the 'err_pdev'
label.

This makes the error path a bit simpler.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
---
 drivers/net/phy/fixed_phy.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index 001fe1d..67b2608 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -259,10 +259,8 @@ static int __init fixed_mdio_bus_init(void)
 	int ret;
 
 	pdev = platform_device_register_simple("Fixed MDIO bus", 0, NULL, 0);
-	if (IS_ERR(pdev)) {
-		ret = PTR_ERR(pdev);
-		goto err_pdev;
-	}
+	if (IS_ERR(pdev))
+		return PTR_ERR(pdev);
 
 	fmb->mii_bus = mdiobus_alloc();
 	if (fmb->mii_bus == NULL) {
@@ -287,7 +285,6 @@ static int __init fixed_mdio_bus_init(void)
 	mdiobus_free(fmb->mii_bus);
 err_mdiobus_reg:
 	platform_device_unregister(pdev);
-err_pdev:
 	return ret;
 }
 module_init(fixed_mdio_bus_init);
-- 
2.7.4
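
The control flow after this change can be sketched in user space as follows. This is a minimal illustration, not the driver code: struct fake_bus, fake_bus_init() and the fail_* switches are hypothetical, and malloc()/free() stand in for the platform device and MDIO bus calls.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two resources acquired during init. */
struct fake_bus {
	void *pdev;
	void *mii;
};

/*
 * The first failure returns directly because nothing has been acquired
 * yet; later failures jump to a label that releases what was acquired.
 * The fail_* arguments only exist to force each path.
 */
static int fake_bus_init(struct fake_bus *b, int fail_pdev, int fail_mii)
{
	int ret;

	b->mii = NULL;
	b->pdev = fail_pdev ? NULL : malloc(1);
	if (!b->pdev)
		return -ENODEV;		/* nothing to undo, no label needed */

	b->mii = fail_mii ? NULL : malloc(1);
	if (!b->mii) {
		ret = -ENOMEM;
		goto err_release_pdev;	/* undo the first step only */
	}

	return 0;

err_release_pdev:
	free(b->pdev);
	b->pdev = NULL;
	return ret;
}
```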

^ permalink raw reply related

* Re: [virtio-dev] Re: [Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net
From: Michael S. Tsirkin @ 2018-06-24  1:45 UTC (permalink / raw)
  To: Siwei Liu
  Cc: Alexander Duyck, virtio-dev, Jiri Pirko, konrad.wilk,
	Jakub Kicinski, Samudrala, Sridhar, Cornelia Huck, qemu-devel,
	virtualization, Venu Busireddy, Netdev, boris.ostrovsky,
	aaron.f.brown, Joao Martins
In-Reply-To: <CADGSJ22DX8=rNPY+gNS_q=LCYLR5ieoE7oD8p4+8AnpzQsWSCg@mail.gmail.com>

On Fri, Jun 22, 2018 at 05:17:10PM -0700, Siwei Liu wrote:
> I forgot to mention the above has the assumption that we expose both
> STANDBY and UUID feature to QEMU user. In fact, as we're going towards
> not exposing the STANDBY feature directly to user, UUID may be always
> required to enable STANDBY.

Sounds good.

> If not, how do we make sure QEMU can
> control the visibility of primary device?

Hypervisors fundamentally always can control visibility of all
virtual devices.

> Something to be confirmed
> before implementing the code.
> 

^ permalink raw reply

* Re: [PATCH] fib_rules: match rules based on suppress_* properties too
From: David Miller @ 2018-06-24  2:25 UTC (permalink / raw)
  To: Jason; +Cc: roopa, netdev
In-Reply-To: <20180623155930.25983-1-Jason@zx2c4.com>

From: "Jason A. Donenfeld" <Jason@zx2c4.com>
Date: Sat, 23 Jun 2018 17:59:30 +0200

> Two rules with different values of suppress_prefix or suppress_ifgroup
> are not the same. This fixes an -EEXIST when running:
> 
>    $ ip -4 rule add table main suppress_prefixlength 0
> 
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> Fixes: f9d4b0c1e969 ("fib_rules: move common handling of newrule delrule msgs into fib_nl2rule")

But the old rule_find() code didn't check this key either, so I can't
see how the behavior in this area changed.

I think the behavior changed for a different reason.

The commit mentioned in your Fixes: tag changed newrule semantics
wrt. defaults or "any" values.

The original code matched on pure values of the keys, whereas the new
code only compares the keys when the new rule is not specifying an
"any" value.

-		if (r->table != rule->table)
+		if (rule->table && r->table != rule->table)
 			continue;

And I think these changes are what make your test case fail after the
commit.  Some other key didn't match previously, due to the handling of
"any" values.
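
The difference between the two matching styles can be sketched in user space like this. It is a simplified model only: struct rule has just two keys, 0 is assumed to mean "any", and match_strict()/match_wildcard() are hypothetical names; which key is actually responsible in the reported case is not established here, so pref is used purely as an example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, cut-down rule: two keys, 0 standing for "any"/unset. */
struct rule {
	uint32_t table;
	uint32_t pref;
};

/* Original style: every key is compared by value. */
static bool match_strict(const struct rule *r, const struct rule *new_rule)
{
	return r->table == new_rule->table && r->pref == new_rule->pref;
}

/* New style: a key is compared only when the new rule specifies it. */
static bool match_wildcard(const struct rule *r, const struct rule *new_rule)
{
	if (new_rule->table && r->table != new_rule->table)
		return false;
	if (new_rule->pref && r->pref != new_rule->pref)
		return false;
	return true;
}
```

With an existing rule { .table = 254, .pref = 32766 } and a new rule { .table = 254, .pref = 0 }, the strict comparison reports no match while the wildcard comparison reports a match, which is the kind of change that would turn a successful add into -EEXIST.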

^ permalink raw reply

* Re: [PATCH net] cxgb4: when disabling dcb set txq dcb priority to 0
From: David Miller @ 2018-06-24  2:43 UTC (permalink / raw)
  To: ganeshgr; +Cc: netdev, nirranjan, indranil, robert, dsa, leedom
In-Reply-To: <1529765906-7499-1-git-send-email-ganeshgr@chelsio.com>

From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Sat, 23 Jun 2018 20:28:26 +0530

> When we are disabling DCB, store "0" in txq->dcb_prio
> since that's used for future TX Work Request "OVLAN_IDX"
> values. Setting non zero priority upon disabling DCB
> would halt the traffic.
> 
> Reported-by: AMG Zollner Robert <robert@cloudmedia.eu>
> CC: David Ahern <dsa@cumulusnetworks.com>
> Signed-off-by: Casey Leedom <leedom@chelsio.com>
> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] qmi_wwan: add support for the Dell Wireless 5821e module
From: David Miller @ 2018-06-24  2:44 UTC (permalink / raw)
  To: aleksander; +Cc: bjorn, netdev, linux-usb
In-Reply-To: <20180623212252.15026-1-aleksander@aleksander.es>

From: Aleksander Morgado <aleksander@aleksander.es>
Date: Sat, 23 Jun 2018 23:22:52 +0200

> This module exposes two USB configurations: a QMI+AT capable setup on
> USB config #1 and a MBIM capable setup on USB config #2.
> 
> By default the kernel will choose the MBIM capable configuration as
> long as the cdc_mbim driver is available. This patch adds support for
> the QMI port in the secondary configuration.
> 
> Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>

Applied and queued up for -stable.

^ permalink raw reply

* BUG: unable to handle kernel paging request in bpf_int_jit_compile
From: syzbot @ 2018-06-24  4:09 UTC (permalink / raw)
  To: ast, daniel, davem, hpa, kuznet, linux-kernel, mingo, netdev,
	syzkaller-bugs, tglx, x86, yoshfuji

Hello,

syzbot found the following crash on:

HEAD commit:    5e2204832b20 Merge tag 'powerpc-4.18-2' of git://git.kerne..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=148b5a90400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=befbcd7305e41bb0
dashboard link: https://syzkaller.appspot.com/bug?extid=a4eb8c7766952a1ca872
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=10ee22d4400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+a4eb8c7766952a1ca872@syzkaller.appspotmail.com

RAX: ffffffffffffffda RBX: 0000000001429914 RCX: 0000000000455a99
RDX: 0000000000000048 RSI: 0000000020000240 RDI: 0000000000000005
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000005
R13: 00000000004bb7d5 R14: 00000000004c8508 R15: 0000000000000023
BUG: unable to handle kernel paging request at ffffffffa0008002
PGD 8e6d067 P4D 8e6d067 PUD 8e6e063 PMD 1b4528067 PTE 1d433d161
Oops: 0003 [#1] SMP KASAN
CPU: 1 PID: 4811 Comm: syz-executor0 Not tainted 4.18.0-rc1+ #114
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:bpf_jit_binary_lock_ro include/linux/filter.h:703 [inline]
RIP: 0010:bpf_int_jit_compile+0xc36/0xf30 arch/x86/net/bpf_jit_comp.c:1168
Code: b8 00 00 00 00 00 fc ff df 48 c1 ea 03 0f b6 04 02 4c 89 f2 83 e2 07 38 d0 7f 08 84 c0 0f 85 a0 02 00 00 48 8b 85 00 ff ff ff <80> 60 02 fe e9 c7 fb ff ff e8 ac 00 36 00 48 8b 8d 30 ff ff ff 48
RSP: 0018:ffff8801cfca7998 EFLAGS: 00010246
RAX: ffffffffa0008000 RBX: 0000000000000046 RCX: ffffffff81460e4a
RDX: 0000000000000002 RSI: ffffffff81460e58 RDI: 0000000000000005
RBP: ffff8801cfca7ab8 R08: ffff8801aa2121c0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90001938002
R13: ffff8801cfca7a90 R14: ffffffffa0008002 R15: 00000000fffffff4
FS:  0000000001429940(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa0008002 CR3: 00000001d2c40000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  bpf_prog_select_runtime+0x7db/0xa60 kernel/bpf/core.c:1505
  bpf_prog_load+0x1194/0x1c60 kernel/bpf/syscall.c:1356
  __do_sys_bpf kernel/bpf/syscall.c:2360 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:2322 [inline]
  __x64_sys_bpf+0x36c/0x510 kernel/bpf/syscall.c:2322
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x455a99
Code: 1d ba fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb b9 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffd396676f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000001429914 RCX: 0000000000455a99
RDX: 0000000000000048 RSI: 0000000020000240 RDI: 0000000000000005
RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000005
R13: 00000000004bb7d5 R14: 00000000004c8508 R15: 0000000000000023
Modules linked in:
Dumping ftrace buffer:
    (ftrace buffer empty)
CR2: ffffffffa0008002
---[ end trace fa548fc30dca8c15 ]---
RIP: 0010:bpf_jit_binary_lock_ro include/linux/filter.h:703 [inline]
RIP: 0010:bpf_int_jit_compile+0xc36/0xf30 arch/x86/net/bpf_jit_comp.c:1168
Code: b8 00 00 00 00 00 fc ff df 48 c1 ea 03 0f b6 04 02 4c 89 f2 83 e2 07 38 d0 7f 08 84 c0 0f 85 a0 02 00 00 48 8b 85 00 ff ff ff <80> 60 02 fe e9 c7 fb ff ff e8 ac 00 36 00 48 8b 8d 30 ff ff ff 48
RSP: 0018:ffff8801cfca7998 EFLAGS: 00010246
RAX: ffffffffa0008000 RBX: 0000000000000046 RCX: ffffffff81460e4a
RDX: 0000000000000002 RSI: ffffffff81460e58 RDI: 0000000000000005
RBP: ffff8801cfca7ab8 R08: ffff8801aa2121c0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90001938002
R13: ffff8801cfca7a90 R14: ffffffffa0008002 R15: 00000000fffffff4
FS:  0000000001429940(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffa0008002 CR3: 00000001d2c40000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply
