Netdev List
* RE: [PATCH 3/3] ocelot_ace: fix action of trap
From: Y.b. Lu @ 2019-08-13  2:12 UTC
  To: Allan W. Nielsen
  Cc: netdev@vger.kernel.org, David S . Miller, Alexandre Belloni,
	Microchip Linux Driver Support
In-Reply-To: <20190812123147.6jjd3kocityxbvcg@lx-anielsen.microsemi.net>

Hi Allan,

> -----Original Message-----
> From: Allan W. Nielsen <allan.nielsen@microchip.com>
> Sent: Monday, August 12, 2019 8:32 PM
> To: Y.b. Lu <yangbo.lu@nxp.com>
> Cc: netdev@vger.kernel.org; David S . Miller <davem@davemloft.net>;
> Alexandre Belloni <alexandre.belloni@bootlin.com>; Microchip Linux Driver
> Support <UNGLinuxDriver@microchip.com>
> Subject: Re: [PATCH 3/3] ocelot_ace: fix action of trap
> 
> The 08/12/2019 18:48, Yangbo Lu wrote:
> > The trap action should copy the frame to the CPU and drop it from
> > forwarding, but the current setting only copied the frame to the CPU.
> 
> Are there any actions which do a "copy-to-cpu" and still forward the frame in
> HW?

[Y.b. Lu] We're using the Felix switch, whose code has not yet been accepted upstream:
https://patchwork.ozlabs.org/project/netdev/list/?series=115399&state=*

I'd like to trap all IEEE 1588 PTP Ethernet frames to the CPU by matching etype 0x88f7.
With the current TRAP action, I found that the frames were not only copied to the CPU but also forwarded to other ports.
So in the patch I made the TRAP action the same as the DROP action, except that CPU_COPY_ENA is enabled.
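
The intended difference between the actions can be modelled in a few lines of userspace C. This is only a sketch of the semantics described above (trap = copy to CPU and do not forward). The enum, struct, and function names are hypothetical and do not correspond to the ocelot driver or to the VCAP IS2 register layout.

```c
#include <stdbool.h>

/* Hypothetical action kinds; NOT the driver's OCELOT_ACL_ACTION_* values. */
enum acl_action { ACL_FORWARD, ACL_DROP, ACL_COPY, ACL_TRAP };

/* What happens to a frame that matched a rule (illustrative model only). */
struct frame_fate {
	bool to_cpu;    /* a copy is delivered to the CPU */
	bool forwarded; /* the frame is still forwarded to other ports */
};

static struct frame_fate acl_apply(enum acl_action act)
{
	/* Default: normal forwarding, no CPU copy. */
	struct frame_fate f = { .to_cpu = false, .forwarded = true };

	switch (act) {
	case ACL_DROP:
		f.forwarded = false;
		break;
	case ACL_COPY:
		f.to_cpu = true;   /* copy, but keep forwarding */
		break;
	case ACL_TRAP:
		f.to_cpu = true;   /* copy ... */
		f.forwarded = false; /* ... and stop forwarding */
		break;
	case ACL_FORWARD:
	default:
		break;
	}
	return f;
}
```

In this model, the behaviour reported above (frames both copied and forwarded) corresponds to ACL_COPY, while the behaviour the patch aims for corresponds to ACL_TRAP.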

Thanks.

> 
> > Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> > ---
> >  drivers/net/ethernet/mscc/ocelot_ace.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/mscc/ocelot_ace.c
> > b/drivers/net/ethernet/mscc/ocelot_ace.c
> > index 91250f3..59ad590 100644
> > --- a/drivers/net/ethernet/mscc/ocelot_ace.c
> > +++ b/drivers/net/ethernet/mscc/ocelot_ace.c
> > @@ -317,9 +317,9 @@ static void is2_action_set(struct vcap_data *data,
> >  		break;
> >  	case OCELOT_ACL_ACTION_TRAP:
> >  		VCAP_ACT_SET(PORT_MASK, 0x0);
> > -		VCAP_ACT_SET(MASK_MODE, 0x0);
> > -		VCAP_ACT_SET(POLICE_ENA, 0x0);
> > -		VCAP_ACT_SET(POLICE_IDX, 0x0);
> > +		VCAP_ACT_SET(MASK_MODE, 0x1);
> > +		VCAP_ACT_SET(POLICE_ENA, 0x1);
> > +		VCAP_ACT_SET(POLICE_IDX, OCELOT_POLICER_DISCARD);
> This seems wrong. The policer is used to ensure that traffic is discarded, even
> in the case where other users of the code have requested it to go to the CPU.
> 
> Are you sure this is working? If it is working, then I fear we have an issue with
> the DROP action which uses this to discard frames.
> 
> >  		VCAP_ACT_SET(CPU_QU_NUM, 0x0);
> >  		VCAP_ACT_SET(CPU_COPY_ENA, 0x1);
> >  		break;
> > --
> > 2.7.4
> 
> --
> /Allan


* Re: [PATCH bpf-next v2 2/4] bpf: support cloning sk storage on accept()
From: Martin Lau @ 2019-08-13  1:47 UTC
  To: Stanislav Fomichev
  Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, davem@davemloft.net,
	ast@kernel.org, daniel@iogearbox.net, Yonghong Song
In-Reply-To: <20190809161038.186678-3-sdf@google.com>

On Fri, Aug 09, 2019 at 09:10:36AM -0700, Stanislav Fomichev wrote:
> Add a new helper, bpf_sk_storage_clone, which optionally clones sk storage,
> and call it from sk_clone_lock.
Thanks for v2.  Sorry for the delay.  I am traveling.

> 
> Cc: Martin KaFai Lau <kafai@fb.com>
> Cc: Yonghong Song <yhs@fb.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---
>  include/net/bpf_sk_storage.h |  10 ++++
>  include/uapi/linux/bpf.h     |   3 ++
>  net/core/bpf_sk_storage.c    | 100 +++++++++++++++++++++++++++++++++--
>  net/core/sock.c              |   9 ++--
>  4 files changed, 116 insertions(+), 6 deletions(-)
> 
> diff --git a/include/net/bpf_sk_storage.h b/include/net/bpf_sk_storage.h
> index b9dcb02e756b..8e4f831d2e52 100644
> --- a/include/net/bpf_sk_storage.h
> +++ b/include/net/bpf_sk_storage.h
> @@ -10,4 +10,14 @@ void bpf_sk_storage_free(struct sock *sk);
>  extern const struct bpf_func_proto bpf_sk_storage_get_proto;
>  extern const struct bpf_func_proto bpf_sk_storage_delete_proto;
>  
> +#ifdef CONFIG_BPF_SYSCALL
> +int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk);
> +#else
> +static inline int bpf_sk_storage_clone(const struct sock *sk,
> +				       struct sock *newsk)
> +{
> +	return 0;
> +}
> +#endif
> +
>  #endif /* _BPF_SK_STORAGE_H */
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 4393bd4b2419..0ef594ac3899 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -337,6 +337,9 @@ enum bpf_attach_type {
>  #define BPF_F_RDONLY_PROG	(1U << 7)
>  #define BPF_F_WRONLY_PROG	(1U << 8)
>  
> +/* Clone map from listener for newly accepted socket */
> +#define BPF_F_CLONE		(1U << 9)
> +
>  /* flags for BPF_PROG_QUERY */
>  #define BPF_F_QUERY_EFFECTIVE	(1U << 0)
>  
> diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
> index 94c7f77ecb6b..584e08ee0ca3 100644
> --- a/net/core/bpf_sk_storage.c
> +++ b/net/core/bpf_sk_storage.c
> @@ -12,6 +12,9 @@
>  
>  static atomic_t cache_idx;
>  
> +#define SK_STORAGE_CREATE_FLAG_MASK					\
> +	(BPF_F_NO_PREALLOC | BPF_F_CLONE)
> +
>  struct bucket {
>  	struct hlist_head list;
>  	raw_spinlock_t lock;
> @@ -209,7 +212,6 @@ static void selem_unlink_sk(struct bpf_sk_storage_elem *selem)
>  		kfree_rcu(sk_storage, rcu);
>  }
>  
> -/* sk_storage->lock must be held and sk_storage->list cannot be empty */
>  static void __selem_link_sk(struct bpf_sk_storage *sk_storage,
>  			    struct bpf_sk_storage_elem *selem)
>  {
> @@ -509,7 +511,7 @@ static int sk_storage_delete(struct sock *sk, struct bpf_map *map)
>  	return 0;
>  }
>  
> -/* Called by __sk_destruct() */
> +/* Called by __sk_destruct() & bpf_sk_storage_clone() */
>  void bpf_sk_storage_free(struct sock *sk)
>  {
>  	struct bpf_sk_storage_elem *selem;
> @@ -557,6 +559,11 @@ static void bpf_sk_storage_map_free(struct bpf_map *map)
>  
>  	smap = (struct bpf_sk_storage_map *)map;
>  
> +	/* Note that this map might be concurrently cloned from
> +	 * bpf_sk_storage_clone. Wait for any existing bpf_sk_storage_clone
> +	 * RCU read section to finish before proceeding. New RCU
> +	 * read sections should be prevented via bpf_map_inc_not_zero.
> +	 */
Thanks!

>  	synchronize_rcu();
>  
>  	/* bpf prog and the userspace can no longer access this map
> @@ -601,7 +608,8 @@ static void bpf_sk_storage_map_free(struct bpf_map *map)
>  
>  static int bpf_sk_storage_map_alloc_check(union bpf_attr *attr)
>  {
> -	if (attr->map_flags != BPF_F_NO_PREALLOC || attr->max_entries ||
> +	if (attr->map_flags & ~SK_STORAGE_CREATE_FLAG_MASK ||
> +	    attr->max_entries ||
I think a "!(attr->map_flags & BPF_F_NO_PREALLOC)" check is also needed.

>  	    attr->key_size != sizeof(int) || !attr->value_size ||
>  	    /* Enforce BTF for userspace sk dumping */
>  	    !attr->btf_key_type_id || !attr->btf_value_type_id)
> @@ -739,6 +747,92 @@ static int bpf_fd_sk_storage_delete_elem(struct bpf_map *map, void *key)
>  	return err;
>  }
>  
> +static struct bpf_sk_storage_elem *
> +bpf_sk_storage_clone_elem(struct sock *newsk,
> +			  struct bpf_sk_storage_map *smap,
> +			  struct bpf_sk_storage_elem *selem)
> +{
> +	struct bpf_sk_storage_elem *copy_selem;
> +
> +	copy_selem = selem_alloc(smap, newsk, NULL, true);
> +	if (!copy_selem)
> +		return NULL;
> +
> +	if (map_value_has_spin_lock(&smap->map))
> +		copy_map_value_locked(&smap->map, SDATA(copy_selem)->data,
> +				      SDATA(selem)->data, true);
> +	else
> +		copy_map_value(&smap->map, SDATA(copy_selem)->data,
> +			       SDATA(selem)->data);
> +
> +	return copy_selem;
> +}
> +
> +int bpf_sk_storage_clone(const struct sock *sk, struct sock *newsk)
> +{
> +	struct bpf_sk_storage *new_sk_storage = NULL;
> +	struct bpf_sk_storage *sk_storage;
> +	struct bpf_sk_storage_elem *selem;
> +	int ret;
> +
> +	RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
> +
> +	rcu_read_lock();
> +	sk_storage = rcu_dereference(sk->sk_bpf_storage);
> +
> +	if (!sk_storage || hlist_empty(&sk_storage->list))
> +		goto out;
> +
> +	hlist_for_each_entry_rcu(selem, &sk_storage->list, snode) {
> +		struct bpf_sk_storage_elem *copy_selem;
> +		struct bpf_sk_storage_map *smap;
> +		struct bpf_map *map;
> +		int refold;
> +
> +		smap = rcu_dereference(SDATA(selem)->smap);
> +		if (!(smap->map.map_flags & BPF_F_CLONE))
> +			continue;
> +
> +		map = bpf_map_inc_not_zero(&smap->map, false);
> +		if (IS_ERR(map))
> +			continue;
> +
> +		copy_selem = bpf_sk_storage_clone_elem(newsk, smap, selem);
> +		if (!copy_selem) {
> +			ret = -ENOMEM;
> +			bpf_map_put(map);
> +			goto err;
> +		}
> +
> +		if (new_sk_storage) {
> +			selem_link_map(smap, copy_selem);
> +			__selem_link_sk(new_sk_storage, copy_selem);
> +		} else {
> +			ret = sk_storage_alloc(newsk, smap, copy_selem);
> +			if (ret) {
> +				kfree(copy_selem);
> +				atomic_sub(smap->elem_size,
> +					   &newsk->sk_omem_alloc);
> +				bpf_map_put(map);
> +				goto err;
> +			}
> +
> +			new_sk_storage = rcu_dereference(copy_selem->sk_storage);
> +		}
> +		bpf_map_put(map);
> +	}
> +
> +out:
> +	rcu_read_unlock();
> +	return 0;
> +
> +err:
> +	rcu_read_unlock();
> +
> +	bpf_sk_storage_free(newsk);
Shouldn't the later sk_free_unlock_clone(newsk) eventually call
bpf_sk_storage_free(newsk) as well?

Others LGTM.

> +	return ret;
> +}
> +
>  BPF_CALL_4(bpf_sk_storage_get, struct bpf_map *, map, struct sock *, sk,
>  	   void *, value, u64, flags)
>  {
> diff --git a/net/core/sock.c b/net/core/sock.c
> index d57b0cc995a0..f5e801a9cea4 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1851,9 +1851,12 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
>  			goto out;
>  		}
>  		RCU_INIT_POINTER(newsk->sk_reuseport_cb, NULL);
> -#ifdef CONFIG_BPF_SYSCALL
> -		RCU_INIT_POINTER(newsk->sk_bpf_storage, NULL);
> -#endif
> +
> +		if (bpf_sk_storage_clone(sk, newsk)) {
> +			sk_free_unlock_clone(newsk);
> +			newsk = NULL;
> +			goto out;
> +		}
>  
>  		newsk->sk_err	   = 0;
>  		newsk->sk_err_soft = 0;
> -- 
> 2.23.0.rc1.153.gdeed80330f-goog
> 
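
The flag check discussed in the review above (reject unknown flags, and still require BPF_F_NO_PREALLOC) can be sketched as a stand-alone userspace function. The helper name check_map_flags() is hypothetical; the kernel check lives in bpf_sk_storage_map_alloc_check(). BPF_F_CLONE uses the value proposed in this patch, and BPF_F_NO_PREALLOC is redefined locally with the value from include/uapi/linux/bpf.h so that the example builds outside the kernel tree.

```c
#include <errno.h>
#include <stdint.h>

/* Local copies for illustration only. */
#define BPF_F_NO_PREALLOC	(1U << 0) /* as in include/uapi/linux/bpf.h */
#define BPF_F_CLONE		(1U << 9) /* value proposed in this patch */

#define SK_STORAGE_CREATE_FLAG_MASK \
	(BPF_F_NO_PREALLOC | BPF_F_CLONE)

/* Hypothetical helper: returns 0 if map_flags is acceptable, -EINVAL otherwise. */
static int check_map_flags(uint32_t map_flags)
{
	/* Reject any flag outside the allowed mask. */
	if (map_flags & ~SK_STORAGE_CREATE_FLAG_MASK)
		return -EINVAL;

	/* BPF_F_NO_PREALLOC stays mandatory, as it was before the patch. */
	if (!(map_flags & BPF_F_NO_PREALLOC))
		return -EINVAL;

	return 0;
}
```

With only the mask test, a map created with flags 0 or with BPF_F_CLONE alone would be accepted; the second test closes that gap.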


* Re: [patch net-next v3 0/3] net: devlink: Finish network namespace support
From: David Ahern @ 2019-08-13  1:46 UTC
  To: Jakub Kicinski; +Cc: Jiri Pirko, netdev, davem, stephen, mlxsw
In-Reply-To: <20190812181100.1cfd8b9d@cakuba.netronome.com>

On 8/12/19 7:11 PM, Jakub Kicinski wrote:
> If the devlink instance just disappeared - that'd be a very very strange
> thing. Only software objects disappear with the namespace. 
> Netdevices without ->rtnl_link_ops go back to init_net.

netdevsim still has rtnl_link_ops:

static struct rtnl_link_ops nsim_link_ops __read_mostly = {
        .kind           = DRV_NAME,
        .validate       = nsim_validate,
};


* Re: [patch net-next v3 1/3] net: devlink: allow to change namespaces
From: Jakub Kicinski @ 2019-08-13  1:21 UTC
  To: Jiri Pirko; +Cc: netdev, davem, stephen, dsahern, mlxsw
In-Reply-To: <20190812134751.30838-2-jiri@resnulli.us>

On Mon, 12 Aug 2019 15:47:49 +0200, Jiri Pirko wrote:
> @@ -6953,9 +7089,33 @@ int devlink_compat_switch_id_get(struct net_device *dev,
>  	return 0;
>  }
>  
> +static void __net_exit devlink_pernet_exit(struct net *net)
> +{
> +	struct devlink *devlink;
> +
> +	mutex_lock(&devlink_mutex);
> +	list_for_each_entry(devlink, &devlink_list, list)
> +		if (net_eq(devlink_net(devlink), net))
> +			devlink_netns_change(devlink, &init_net);
> +	mutex_unlock(&devlink_mutex);
> +}

Just to be sure - this will not cause any locking issues?
Usually the locking order goes devlink -> rtnl


* Re: [PATCH iproute2] tc: Fix block-handle support for filter operations
From: Stephen Hemminger @ 2019-08-13  1:16 UTC
  To: Ido Schimmel; +Cc: netdev, dsahern, jiri, mlxsw, Ido Schimmel
In-Reply-To: <20190812101706.15778-1-idosch@idosch.org>

On Mon, 12 Aug 2019 13:17:06 +0300
Ido Schimmel <idosch@idosch.org> wrote:

> From: Ido Schimmel <idosch@mellanox.com>
> 
> Commit e991c04d64c0 ("Revert "tc: Add batchsize feature for filter and
> actions"") reverted more than it should and broke shared block
> functionality. Fix this by restoring the original functionality.
> 
> To reproduce:
> 
> # tc qdisc add dev swp1 ingress_block 10 ingress
> # tc filter add block 10 proto ip pref 1 flower \
> 	dst_ip 192.0.2.0/24 action drop
> Unknown filter "block", hence option "10" is unparsable
> 
> Fixes: e991c04d64c0 ("Revert "tc: Add batchsize feature for filter and actions"")
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>

Applied


* [PATCH] netdevsim: fix ptr_ret.cocci warnings
From: kbuild test robot @ 2019-08-13  1:14 UTC
  To: Jiri Pirko; +Cc: kbuild-all, netdev, davem, jakub.kicinski, mlxsw
In-Reply-To: <20190812101620.7884-1-jiri@resnulli.us>

From: kbuild test robot <lkp@intel.com>

drivers/net/netdevsim/dev.c:297:1-3: WARNING: PTR_ERR_OR_ZERO can be used


 Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Generated by: scripts/coccinelle/api/ptr_ret.cocci

Fixes: e9cf98183f96 ("netdevsim: implement support for devlink region and snapshots")
CC: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: kbuild test robot <lkp@intel.com>
---

url:    https://github.com/0day-ci/linux/commits/Jiri-Pirko/netdevsim-implement-support-for-devlink-region-and-snapshots/20190813-002135

 dev.c |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -294,10 +294,7 @@ static int nsim_dev_dummy_region_init(st
 		devlink_region_create(devlink, "dummy",
 				      NSIM_DEV_DUMMY_REGION_SNAPSHOT_MAX,
 				      NSIM_DEV_DUMMY_REGION_SIZE);
-	if (IS_ERR(nsim_dev->dummy_region))
-		return PTR_ERR(nsim_dev->dummy_region);
-
-	return 0;
+	return PTR_ERR_OR_ZERO(nsim_dev->dummy_region);
 }
 
 static void nsim_dev_dummy_region_exit(struct nsim_dev *nsim_dev)
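
For readers unfamiliar with the error-pointer helpers, the equivalence behind this cleanup can be shown with a userspace re-implementation. This is a simplified sketch written for illustration, not the kernel's include/linux/err.h; it assumes the usual convention that the top 4095 addresses encode negative errno values.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

/* Decode the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

/* True if ptr lies in the last MAX_ERRNO addresses, i.e. encodes an errno. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Same result as: if (IS_ERR(ptr)) return PTR_ERR(ptr); return 0; */
static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
	return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}
```

A valid pointer (or NULL) yields 0, and ERR_PTR(-ENOMEM) yields -ENOMEM, which is why the four removed lines collapse into a single return statement.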


* Re: [patch net-next] netdevsim: implement support for devlink region and snapshots
From: kbuild test robot @ 2019-08-13  1:14 UTC
  To: Jiri Pirko; +Cc: kbuild-all, netdev, davem, jakub.kicinski, mlxsw
In-Reply-To: <20190812101620.7884-1-jiri@resnulli.us>

Hi Jiri,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Jiri-Pirko/netdevsim-implement-support-for-devlink-region-and-snapshots/20190813-002135

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>


coccinelle warnings: (new ones prefixed by >>)

>> drivers/net/netdevsim/dev.c:297:1-3: WARNING: PTR_ERR_OR_ZERO can be used

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


* Re: [patch net-next v3 0/3] net: devlink: Finish network namespace support
From: Jakub Kicinski @ 2019-08-13  1:11 UTC
  To: David Ahern; +Cc: Jiri Pirko, netdev, davem, stephen, mlxsw
In-Reply-To: <bfb879be-a232-0ef1-1c40-3a9c8bcba8f8@gmail.com>

On Mon, 12 Aug 2019 18:24:41 -0600, David Ahern wrote:
> On 8/12/19 7:47 AM, Jiri Pirko wrote:
> > From: Jiri Pirko <jiri@mellanox.com>
> > 
> > Devlink has accounted for network namespaces from the beginning, but
> > the instances have been fixed to init_net. The first patch allows the
> > user to move existing devlink instances into namespaces:
> > 
> > $ devlink dev
> > netdevsim/netdevsim1
> > $ ip netns add ns1
> > $ devlink dev set netdevsim/netdevsim1 netns ns1
> > $ devlink -N ns1 dev
> > netdevsim/netdevsim1
> > 
> > The last patch allows the user to create a new netdevsim instance
> > directly inside the network namespace of the caller.
> 
> The namespace behavior seems odd to me. If devlink instance is created
> in a namespace and never moved, it should die with the namespace. With
> this patch set, devlink instance and its ports are moved to init_net on
> namespace delete.

If the devlink instance just disappeared - that'd be a very very strange
thing. Only software objects disappear with the namespace. 
Netdevices without ->rtnl_link_ops go back to init_net.


* Re: [patch net-next] netdevsim: implement support for devlink region and snapshots
From: Jakub Kicinski @ 2019-08-13  0:58 UTC
  To: Jiri Pirko; +Cc: netdev, davem, mlxsw
In-Reply-To: <20190812101620.7884-1-jiri@resnulli.us>

On Mon, 12 Aug 2019 12:16:20 +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Implement a dummy region of size 32K and allow the user to create
> snapshots of random data using a debugfs file trigger.
> 
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>

I'm nacking all the netdevsim patches unless the selftest 
is posted at the same time :/

You're leaking those features one by one; what if you get distracted
and the tests never materialize? :/

This is all dead code.

> diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
> index 08ca59fc189b..e76ea6a3cb60 100644
> --- a/drivers/net/netdevsim/dev.c
> +++ b/drivers/net/netdevsim/dev.c
> @@ -27,6 +27,41 @@
>  
>  static struct dentry *nsim_dev_ddir;
>  
> +#define NSIM_DEV_DUMMY_REGION_SIZE (1024 * 32)
> +
> +static ssize_t nsim_dev_take_snapshot_write(struct file *file,
> +					    const char __user *data,
> +					    size_t count, loff_t *ppos)
> +{
> +	struct nsim_dev *nsim_dev = file->private_data;
> +	void *dummy_data;
> +	u32 id;
> +	int err;
> +
> +	dummy_data = kmalloc(NSIM_DEV_DUMMY_REGION_SIZE, GFP_KERNEL);
> +	if (!dummy_data) {
> +		pr_err("Failed to allocate memory for region snapshot\n");
> +		goto out;
> +	}
> +
> +	get_random_bytes(dummy_data, NSIM_DEV_DUMMY_REGION_SIZE);
> +
> +	id = devlink_region_shapshot_id_get(priv_to_devlink(nsim_dev));
> +	err = devlink_region_snapshot_create(nsim_dev->dummy_region,
> +					     dummy_data, id, kfree);
> +	if (err)
> +		pr_err("Failed to create region snapshot\n");
> +
> +out:
> +	return count;

why not return an error?

> +}
> +
> +static const struct file_operations nsim_dev_take_snapshot_fops = {
> +	.open = simple_open,
> +	.write = nsim_dev_take_snapshot_write,
> +	.llseek = generic_file_llseek,
> +};
> +
>  static int nsim_dev_debugfs_init(struct nsim_dev *nsim_dev)
>  {
>  	char dev_ddir_name[16];
> @@ -44,6 +79,8 @@ static int nsim_dev_debugfs_init(struct nsim_dev *nsim_dev)
>  			   &nsim_dev->max_macs);
>  	debugfs_create_bool("test1", 0600, nsim_dev->ddir,
>  			    &nsim_dev->test1);
> +	debugfs_create_file("take_snapshot", 0200, nsim_dev->ddir, nsim_dev,
> +			    &nsim_dev_take_snapshot_fops);
>  	return 0;
>  }
>  
> @@ -248,6 +285,26 @@ static void nsim_devlink_param_load_driverinit_values(struct devlink *devlink)
>  		nsim_dev->test1 = saved_value.vbool;
>  }
>  
> +#define NSIM_DEV_DUMMY_REGION_SNAPSHOT_MAX 16
> +
> +static int nsim_dev_dummy_region_init(struct nsim_dev *nsim_dev,
> +				      struct devlink *devlink)
> +{
> +	nsim_dev->dummy_region =
> +		devlink_region_create(devlink, "dummy",
> +				      NSIM_DEV_DUMMY_REGION_SNAPSHOT_MAX,
> +				      NSIM_DEV_DUMMY_REGION_SIZE);
> +	if (IS_ERR(nsim_dev->dummy_region))
> +		return PTR_ERR(nsim_dev->dummy_region);
> +
> +	return 0;

PTR_ERR_OR_ZERO()

> +}
> +
> +static void nsim_dev_dummy_region_exit(struct nsim_dev *nsim_dev)
> +{
> +	devlink_region_destroy(nsim_dev->dummy_region);
> +}
> +
>  static int nsim_dev_reload(struct devlink *devlink,
>  			   struct netlink_ext_ack *extack)
>  {
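
The "why not return an error?" comment can be illustrated with a userspace model of the write handler. Everything here is hypothetical: snapshot_create() stands in for devlink_region_snapshot_create(), the fail_create parameter exists only to exercise the error path, and the ownership rule (the callee frees the buffer on success, the caller frees it on failure) is an assumption made for this sketch, not a statement about the devlink API.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

#define REGION_SIZE (1024 * 32)

/*
 * Hypothetical stand-in for snapshot creation. Assumption for this sketch:
 * it takes ownership of data on success and leaves it to the caller on failure.
 */
static int snapshot_create(void *data, int fail)
{
	if (fail)
		return -EINVAL;
	free(data);
	return 0;
}

/* Model of a write handler that reports failures instead of hiding them. */
static ssize_t take_snapshot_write(size_t count, int fail_create)
{
	void *data;
	int err;

	data = malloc(REGION_SIZE);
	if (!data)
		return -ENOMEM; /* propagate instead of returning count */

	err = snapshot_create(data, fail_create);
	if (err) {
		free(data);
		return err;     /* the writer sees the failure */
	}

	return (ssize_t)count;  /* success: all bytes consumed */
}
```

Returning the negative errno lets a test script detect a failed snapshot from the exit status of the write, rather than having to parse the kernel log.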



* Re: [PATCH] tools: bpftool: add feature check for zlib
From: Jakub Kicinski @ 2019-08-13  0:53 UTC
  To: Peter Wu
  Cc: Daniel Borkmann, Stanislav Fomichev, Alexei Starovoitov, netdev,
	Quentin Monnet
In-Reply-To: <20190813003833.22042-2-peter@lekensteyn.nl>

On Tue, 13 Aug 2019 01:38:33 +0100, Peter Wu wrote:
> bpftool requires libelf, and zlib for decompressing /proc/config.gz.
> zlib is a transitive dependency via libelf, and became mandatory since
> elfutils 0.165 (Jan 2016). The feature check of libelf is already done
> in the elfdep target of tools/lib/bpf/Makefile, pulled in by bpftool via
> a dependency on libbpf.a. Add a similar feature check for zlib.
> 
> Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Signed-off-by: Peter Wu <peter@lekensteyn.nl>

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Thanks!


* Re: BUG: corrupted list in rxrpc_local_processor
From: syzbot @ 2019-08-13  0:51 UTC
  To: davem, dhowells, linux-afs, linux-kernel, netdev, syzkaller-bugs
In-Reply-To: <5014.1565649712@warthog.procyon.org.uk>

Hello,

syzbot has tested the proposed patch, but the reproducer still triggered a crash:
KASAN: use-after-free Read in rxrpc_queue_local

==================================================================
BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:26 [inline]
BUG: KASAN: use-after-free in rxrpc_queue_local+0x7c/0x3e0 net/rxrpc/local_object.c:354
Read of size 4 at addr ffff8880a82b56d4 by task syz-executor.0/11829

CPU: 1 PID: 11829 Comm: syz-executor.0 Not tainted 5.3.0-rc3+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  print_address_description.cold+0xd4/0x306 mm/kasan/report.c:351
  __kasan_report.cold+0x1b/0x36 mm/kasan/report.c:482
  kasan_report+0x12/0x17 mm/kasan/common.c:612
  check_memory_region_inline mm/kasan/generic.c:185 [inline]
  check_memory_region+0x134/0x1a0 mm/kasan/generic.c:192
  __kasan_check_read+0x11/0x20 mm/kasan/common.c:92
  atomic_read include/asm-generic/atomic-instrumented.h:26 [inline]
  rxrpc_queue_local+0x7c/0x3e0 net/rxrpc/local_object.c:354
  rxrpc_unuse_local+0x52/0x80 net/rxrpc/local_object.c:408
  rxrpc_release_sock net/rxrpc/af_rxrpc.c:904 [inline]
  rxrpc_release+0x47d/0x840 net/rxrpc/af_rxrpc.c:930
  __sock_release+0xce/0x280 net/socket.c:590
  sock_close+0x1e/0x30 net/socket.c:1268
  __fput+0x2ff/0x890 fs/file_table.c:280
  ____fput+0x16/0x20 fs/file_table.c:313
  task_work_run+0x145/0x1c0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x316/0x380 arch/x86/entry/common.c:163
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x5a9/0x6a0 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x413511
Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 1b 00 00 c3 48 83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007ffc204e87c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000413511
RDX: 0000001b2e420000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000000001 R08: ffffffffffffffff R09: ffffffffffffffff
R10: 00007ffc204e88a0 R11: 0000000000000293 R12: 000000000075bf20
R13: 000000000001ac29 R14: 0000000000760210 R15: ffffffffffffffff

Allocated by task 11830:
  save_stack+0x23/0x90 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_kmalloc mm/kasan/common.c:487 [inline]
  __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:460
  kasan_kmalloc+0x9/0x10 mm/kasan/common.c:501
  kmem_cache_alloc_trace+0x158/0x790 mm/slab.c:3550
  kmalloc include/linux/slab.h:552 [inline]
  kzalloc include/linux/slab.h:748 [inline]
  rxrpc_alloc_local net/rxrpc/local_object.c:79 [inline]
  rxrpc_lookup_local+0x562/0x1ba0 net/rxrpc/local_object.c:277
  rxrpc_sendmsg+0x379/0x5f0 net/rxrpc/af_rxrpc.c:566
  sock_sendmsg_nosec net/socket.c:637 [inline]
  sock_sendmsg+0xd7/0x130 net/socket.c:657
  ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311
  __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413
  __do_sys_sendmmsg net/socket.c:2442 [inline]
  __se_sys_sendmmsg net/socket.c:2439 [inline]
  __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439
  do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 16:
  save_stack+0x23/0x90 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_slab_free+0x102/0x150 mm/kasan/common.c:449
  kasan_slab_free+0xe/0x10 mm/kasan/common.c:457
  __cache_free mm/slab.c:3425 [inline]
  kfree+0x10a/0x2c0 mm/slab.c:3756
  rxrpc_local_rcu+0x62/0x80 net/rxrpc/local_object.c:495
  __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
  rcu_do_batch kernel/rcu/tree.c:2114 [inline]
  rcu_core+0x67f/0x1580 kernel/rcu/tree.c:2314
  rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2323
  __do_softirq+0x262/0x98c kernel/softirq.c:292

The buggy address belongs to the object at ffff8880a82b56c0
  which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 20 bytes inside of
  1024-byte region [ffff8880a82b56c0, ffff8880a82b5ac0)
The buggy address belongs to the page:
page:ffffea0002a0ad00 refcount:1 mapcount:0 mapping:ffff8880aa400c40 index:0xffff8880a82b5b40 compound_mapcount: 0
flags: 0x1fffc0000010200(slab|head)
raw: 01fffc0000010200 ffffea0002a1cb08 ffffea00023b0c88 ffff8880aa400c40
raw: ffff8880a82b5b40 ffff8880a82b4040 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff8880a82b5580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8880a82b5600: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ffff8880a82b5680: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
                                                  ^
  ffff8880a82b5700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8880a82b5780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


Tested on:

commit:         03a62469 rxrpc: Fix local endpoint replacement
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
console output: https://syzkaller.appspot.com/x/log.txt?x=15f1679a600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)



* Re: [PATCH] `iwlist scan` fails with many networks available
From: James Nylen @ 2019-08-13  0:43 UTC
  To: Johannes Berg; +Cc: David S. Miller, linux-wireless, netdev, linux-kernel
In-Reply-To: <f7de98001849bc98a0a084d2ffc369f4d9772d52.camel@sipsolutions.net>

>I suppose we could consider applying a workaround like this if it has a
>condition checking that the buffer passed in is the maximum possible
>buffer (65535 bytes, due to iw_point::length being u16)

This is what the latest patch does (attached to my email from
yesterday / https://lkml.org/lkml/2019/8/10/452 ).

If you'd like to apply it, I'm happy to make any needed revisions.
Otherwise I'm going to have to keep patching my kernels for this
issue; unfortunately, I don't have the time to try to get wicd to
migrate to a better solution.

On 8/11/19, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Sun, 2019-08-11 at 02:08 +0000, James Nylen wrote:
>> In 5.x it's still possible for `ieee80211_scan_results` (`iwlist
>> scan`) to fail when too many wireless networks are available.  This
>> code path is used by `wicd`.
>>
>> Previously: https://lkml.org/lkml/2017/4/2/192
>
> This has been known for probably a decade or longer. I don't know why
> 'wicd' still insists on using wext, unless it's no longer maintained at
> all. nl80211 doesn't have this problem at all, and I think gives more
> details about the networks found too.
>
>> I've been applying this updated patch to my own kernels since 2017 with
>> no issues.  I am sure it is not the ideal way to solve this problem, but
>> I'm making my fix available in case it helps others.
>
> I don't think silently dropping data is a good solution.
>
> I suppose we could consider applying a workaround like this if it has a
> condition checking that the buffer passed in is the maximum possible
> buffer (65535 bytes, due to iw_point::length being u16), but below that
> -E2BIG serves well-written implementations as an indicator that they
> need to retry with a bigger buffer.
>
>> Please advise on next steps or if this is a dead end.
>
> I think wireless extensions are in fact a dead end and all software
> (even 'wicd', which seems to be the lone holdout) should migrate to
> nl80211 instead.
>
> johannes
>
>
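
The retry convention described in the quoted reply (treat -E2BIG as "try again with a bigger buffer", up to the 65535-byte limit imposed by the u16 length field) can be sketched in userspace C. The scan source fake_get_scan() and the starting size of 4096 bytes are hypothetical; a real caller would issue the wireless-extensions scan ioctl instead.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SCAN_BUF_MAX 65535u /* iw_point::length is a u16 */

/*
 * Hypothetical scan source: the results need `needed` bytes. Returns the
 * number of bytes written, or -E2BIG if the buffer is too small.
 */
static int fake_get_scan(size_t needed, void *buf, uint16_t len)
{
	(void)buf;
	return len < needed ? -E2BIG : (int)needed;
}

/*
 * Grow the buffer until the scan fits or the u16 limit is reached.
 * On success, returns the byte count and stores the buffer size used.
 */
static int scan_with_retry(size_t needed, size_t *final_len)
{
	size_t len = 4096; /* arbitrary starting size for this sketch */

	for (;;) {
		void *buf = malloc(len);
		int ret;

		if (!buf)
			return -ENOMEM;

		ret = fake_get_scan(needed, buf, (uint16_t)len);
		free(buf);

		if (ret != -E2BIG) {
			*final_len = len;
			return ret;
		}
		if (len >= SCAN_BUF_MAX)
			return -E2BIG; /* already at the maximum, give up */

		/* Double the buffer, clamped to the u16 limit. */
		len = len * 2 > SCAN_BUF_MAX ? SCAN_BUF_MAX : len * 2;
	}
}
```

Only the final case, where -E2BIG is returned for a buffer that is already 65535 bytes long, leaves the caller with no way forward; that is the case the proposed workaround would target.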


* [PATCH] tools: bpftool: add feature check for zlib
From: Peter Wu @ 2019-08-13  0:38 UTC
  To: Daniel Borkmann, Jakub Kicinski
  Cc: Stanislav Fomichev, Alexei Starovoitov, netdev, Quentin Monnet
In-Reply-To: <20190813003833.22042-1-peter@lekensteyn.nl>

bpftool requires libelf, and zlib for decompressing /proc/config.gz.
zlib is a transitive dependency via libelf, and became mandatory since
elfutils 0.165 (Jan 2016). The feature check of libelf is already done
in the elfdep target of tools/lib/bpf/Makefile, pulled in by bpftool via
a dependency on libbpf.a. Add a similar feature check for zlib.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
---
Hi,

This is a follow-up to an earlier "tools: bpftool: fix reading from
/proc/config.gz" patch. It applies Jakub's and Daniel's suggestions from:
https://lkml.kernel.org/r/6154af6c-4f24-4b0a-25c2-a8a1d6c9948f@iogearbox.net
https://lkml.kernel.org/r/20190809140956.24369b00@cakuba.netronome.com

Feel free to massage the commit message and patch as you see fit.

Kind regards,
Peter
---
 tools/bpf/bpftool/Makefile | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index 078bd0dcfba5..4c9d1ffc3fc7 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -58,8 +58,8 @@ INSTALL ?= install
 RM ?= rm -f
 
 FEATURE_USER = .bpftool
-FEATURE_TESTS = libbfd disassembler-four-args reallocarray
-FEATURE_DISPLAY = libbfd disassembler-four-args
+FEATURE_TESTS = libbfd disassembler-four-args reallocarray zlib
+FEATURE_DISPLAY = libbfd disassembler-four-args zlib
 
 check_feat := 1
 NON_CHECK_FEAT_TARGETS := clean uninstall doc doc-clean doc-install doc-uninstall
@@ -111,6 +111,8 @@ OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o
 $(OUTPUT)disasm.o: $(srctree)/kernel/bpf/disasm.c
 	$(QUIET_CC)$(COMPILE.c) -MMD -o $@ $<
 
+$(OUTPUT)feature.o: | zdep
+
 $(OUTPUT)bpftool: $(OBJS) $(LIBBPF)
 	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^ $(LIBS)
 
@@ -149,6 +151,9 @@ doc-uninstall:
 
 FORCE:
 
-.PHONY: all FORCE clean install uninstall
+zdep:
+	@if [ "$(feature-zlib)" != "1" ]; then echo "No zlib found"; exit 1 ; fi
+
+.PHONY: all FORCE clean install uninstall zdep
 .PHONY: doc doc-clean doc-install doc-uninstall
 .DEFAULT_GOAL := all
-- 
2.22.0



* Re: KASAN: use-after-free Read in rxrpc_queue_local
From: syzbot @ 2019-08-13  0:36 UTC (permalink / raw)
  To: davem, dhowells, linux-afs, linux-kernel, netdev, syzkaller-bugs
In-Reply-To: <4694.1565649521@warthog.procyon.org.uk>

Hello,

syzbot has tested the proposed patch, but the reproducer still triggered a
crash:
KASAN: use-after-free Read in rxrpc_queue_local

==================================================================
BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:26 [inline]
BUG: KASAN: use-after-free in rxrpc_queue_local+0x7c/0x3e0 net/rxrpc/local_object.c:354
Read of size 4 at addr ffff888081e3db14 by task syz-executor.5/31180

CPU: 0 PID: 31180 Comm: syz-executor.5 Not tainted 5.3.0-rc3+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  print_address_description.cold+0xd4/0x306 mm/kasan/report.c:351
  __kasan_report.cold+0x1b/0x36 mm/kasan/report.c:482
  kasan_report+0x12/0x17 mm/kasan/common.c:612
  check_memory_region_inline mm/kasan/generic.c:185 [inline]
  check_memory_region+0x134/0x1a0 mm/kasan/generic.c:192
  __kasan_check_read+0x11/0x20 mm/kasan/common.c:92
  atomic_read include/asm-generic/atomic-instrumented.h:26 [inline]
  rxrpc_queue_local+0x7c/0x3e0 net/rxrpc/local_object.c:354
  rxrpc_unuse_local+0x52/0x80 net/rxrpc/local_object.c:408
  rxrpc_release_sock net/rxrpc/af_rxrpc.c:904 [inline]
  rxrpc_release+0x47d/0x840 net/rxrpc/af_rxrpc.c:930
  __sock_release+0xce/0x280 net/socket.c:590
  sock_close+0x1e/0x30 net/socket.c:1268
  __fput+0x2ff/0x890 fs/file_table.c:280
  ____fput+0x16/0x20 fs/file_table.c:313
  task_work_run+0x145/0x1c0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x316/0x380 arch/x86/entry/common.c:163
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x5a9/0x6a0 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x413511
Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 1b 00 00 c3 48 83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007fffc45736d0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000413511
RDX: 0000001b33920000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000000001 R08: ffffffffffffffff R09: ffffffffffffffff
R10: 00007fffc45737b0 R11: 0000000000000293 R12: 000000000075bf20
R13: 000000000003624a R14: 0000000000760270 R15: ffffffffffffffff

Allocated by task 31182:
  save_stack+0x23/0x90 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_kmalloc mm/kasan/common.c:487 [inline]
  __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:460
  kasan_kmalloc+0x9/0x10 mm/kasan/common.c:501
  kmem_cache_alloc_trace+0x158/0x790 mm/slab.c:3550
  kmalloc include/linux/slab.h:552 [inline]
  kzalloc include/linux/slab.h:748 [inline]
  rxrpc_alloc_local net/rxrpc/local_object.c:79 [inline]
  rxrpc_lookup_local+0x562/0x1ba0 net/rxrpc/local_object.c:277
  rxrpc_bind+0x34d/0x5e0 net/rxrpc/af_rxrpc.c:149
  __sys_bind+0x239/0x290 net/socket.c:1647
  __do_sys_bind net/socket.c:1658 [inline]
  __se_sys_bind net/socket.c:1656 [inline]
  __x64_sys_bind+0x73/0xb0 net/socket.c:1656
  do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 9:
  save_stack+0x23/0x90 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_slab_free+0x102/0x150 mm/kasan/common.c:449
  kasan_slab_free+0xe/0x10 mm/kasan/common.c:457
  __cache_free mm/slab.c:3425 [inline]
  kfree+0x10a/0x2c0 mm/slab.c:3756
  rxrpc_local_rcu+0x62/0x80 net/rxrpc/local_object.c:495
  __rcu_reclaim kernel/rcu/rcu.h:222 [inline]
  rcu_do_batch kernel/rcu/tree.c:2114 [inline]
  rcu_core+0x67f/0x1580 kernel/rcu/tree.c:2314
  rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2323
  __do_softirq+0x262/0x98c kernel/softirq.c:292

The buggy address belongs to the object at ffff888081e3db00
  which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 20 bytes inside of
  1024-byte region [ffff888081e3db00, ffff888081e3df00)
The buggy address belongs to the page:
page:ffffea0002078f00 refcount:1 mapcount:0 mapping:ffff8880aa400c40 index:0x0 compound_mapcount: 0
flags: 0x1fffc0000010200(slab|head)
raw: 01fffc0000010200 ffffea0002073288 ffffea00022f1608 ffff8880aa400c40
raw: 0000000000000000 ffff888081e3c000 0000000100000007 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff888081e3da00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff888081e3da80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff888081e3db00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                          ^
  ffff888081e3db80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff888081e3dc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


Tested on:

commit:         03a62469 rxrpc: Fix local endpoint replacement
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
console output: https://syzkaller.appspot.com/x/log.txt?x=14500d36600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)


^ permalink raw reply

* Re: [patch net-next rfc 3/7] net: rtnetlink: add commands to add and delete alternative ifnames
From: David Ahern @ 2019-08-13  0:29 UTC (permalink / raw)
  To: Jakub Kicinski, Roopa Prabhu
  Cc: Jiri Pirko, netdev, David Miller, Stephen Hemminger, dcbw,
	Michal Kubecek, Andrew Lunn, parav, Saeed Mahameed, mlxsw
In-Reply-To: <20190812144310.442869de@cakuba.netronome.com>

On 8/12/19 3:43 PM, Jakub Kicinski wrote:
> Is not adding commands better because it's easier to deal with the
> RTM_NEWLINK notification? I must say it's unclear from the thread why
> muxing the op through RTM_SETLINK is preferable. IMHO new op is
> cleaner, do we have precedent for such IFLA_.*_OP-style attributes?

An alternative name for a link is not a primary object; it is only an
attribute of a link and links are manipulated through RTM_*LINK commands.

^ permalink raw reply

* Re: [patch net-next v3 0/3] net: devlink: Finish network namespace support
From: David Ahern @ 2019-08-13  0:24 UTC (permalink / raw)
  To: Jiri Pirko, netdev; +Cc: davem, jakub.kicinski, stephen, mlxsw
In-Reply-To: <20190812134751.30838-1-jiri@resnulli.us>

On 8/12/19 7:47 AM, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@mellanox.com>
> 
> Devlink from the beginning counts with network namespaces, but the
> instances has been fixed to init_net. The first patch allows user
> to move existing devlink instances into namespaces:
> 
> $ devlink dev
> netdevsim/netdevsim1
> $ ip netns add ns1
> $ devlink dev set netdevsim/netdevsim1 netns ns1
> $ devlink -N ns1 dev
> netdevsim/netdevsim1
> 
> The last patch allows user to create new netdevsim instance directly
> inside network namespace of a caller.

The namespace behavior seems odd to me. If a devlink instance is created
in a namespace and never moved, it should die with the namespace. With
this patch set, the devlink instance and its ports are moved to init_net on
namespace delete.

The fib controller needs an update to return the namespace of the
devlink instance (on top of the patch applied to net):

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index 89795071f085..fa7e876f2d3b 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -114,11 +114,6 @@ static void nsim_dev_port_debugfs_exit(struct
nsim_dev_port *nsim_dev_port)
        debugfs_remove_recursive(nsim_dev_port->ddir);
 }

-static struct net *nsim_devlink_net(struct devlink *devlink)
-{
-       return &init_net;
-}
-
 static u64 nsim_dev_ipv4_fib_resource_occ_get(void *priv)
 {
        struct net *net = priv;
@@ -154,7 +149,7 @@ static int nsim_dev_resources_register(struct
devlink *devlink)
                .size_granularity = 1,
                .unit = DEVLINK_RESOURCE_UNIT_ENTRY
        };
-       struct net *net = nsim_devlink_net(devlink);
+       struct net *net = devlink_net(devlink);
        int err;
        u64 n;

@@ -309,7 +304,7 @@ static int nsim_dev_reload(struct devlink *devlink,
                NSIM_RESOURCE_IPV4_FIB, NSIM_RESOURCE_IPV4_FIB_RULES,
                NSIM_RESOURCE_IPV6_FIB, NSIM_RESOURCE_IPV6_FIB_RULES
        };
-       struct net *net = nsim_devlink_net(devlink);
+       struct net *net = devlink_net(devlink);
        int i;

        for (i = 0; i < ARRAY_SIZE(res_ids); ++i) {


^ permalink raw reply related

* Re: [PATCH net] ipv4/route: do not check saddr dev if iif is LOOPBACK_IFINDEX
From: David Ahern @ 2019-08-13  0:23 UTC (permalink / raw)
  To: Stefano Brivio, David Miller; +Cc: liuhangbin, netdev, mleitner
In-Reply-To: <20190813005830.41f92428@redhat.com>

On 8/12/19 4:58 PM, Stefano Brivio wrote:
> How so, actually? I don't see how that would happen. On the forwarding
> path, 'iif' is set (not to loopback interface), so that's not affected.
> 
> Is there any other route lookup possibility I'm missing?

The use case is saddr set and FLOWI_FLAG_ANYSRC not set, which seems
pretty common to me. From a quick look: icmp_route_lookup,
ipv4_update_pmtu, ipv4_redirect, inet_csk_route_req, ...

Enable trace_fib_table_lookup and look at the flags for various use cases.

^ permalink raw reply

* Re: [PATCH 3/3] riscv: dts: Add DT node for SiFive FU540 Ethernet controller driver
From: Rob Herring @ 2019-08-12 23:33 UTC (permalink / raw)
  To: Paul Walmsley
  Cc: Yash Shah, davem, sagar.kadam, netdev, devicetree, linux-kernel,
	linux-riscv, mark.rutland, palmer, aou, nicolas.ferre, ynezz,
	sachin.ghadi, andrew
In-Reply-To: <alpine.DEB.2.21.9999.1907221446340.5793@viisi.sifive.com>

On Mon, Jul 22, 2019 at 02:48:40PM -0700, Paul Walmsley wrote:
> On Fri, 19 Jul 2019, Yash Shah wrote:
> 
> > DT node for SiFive FU540-C000 GEMGXL Ethernet controller driver added
> > 
> > Signed-off-by: Yash Shah <yash.shah@sifive.com>
> 
> Thanks, queuing this one for v5.3-rc with Andrew's suggested change to 
> change phy1 to phy0.
> 
> Am assuming patches 1 and 2 will go in via -net.

I don't think that has happened.

Rob

^ permalink raw reply

* Re: [PATCH 1/3] macb: bindings doc: update sifive fu540-c000 binding
From: Rob Herring @ 2019-08-12 23:32 UTC (permalink / raw)
  To: Yash Shah
  Cc: davem, robh+dt, paul.walmsley, netdev, devicetree, linux-kernel,
	linux-riscv, mark.rutland, palmer, aou, nicolas.ferre, ynezz,
	sachin.ghadi, Yash Shah
In-Reply-To: <1563534631-15897-1-git-send-email-yash.shah@sifive.com>

On Fri, 19 Jul 2019 16:40:29 +0530, Yash Shah wrote:
> As per the discussion with Nicolas Ferre, rename the compatible property
> to a more appropriate and specific string.
> LINK: https://lkml.org/lkml/2019/7/17/200
> 
> Signed-off-by: Yash Shah <yash.shah@sifive.com>
> ---
>  Documentation/devicetree/bindings/net/macb.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Reviewed-by: Rob Herring <robh@kernel.org>

^ permalink raw reply

* [bpf-next] selftests/bpf: fix race in flow dissector tests
From: Petar Penkov @ 2019-08-12 23:30 UTC (permalink / raw)
  To: netdev, bpf; +Cc: davem, ast, daniel, sdf, Petar Penkov

From: Petar Penkov <ppenkov@google.com>

Since the "last_dissection" map holds only the flow keys for the most
recent packet, there is a small race in the skb-less flow dissector
tests if a new packet comes between transmitting the test packet, and
reading its keys from the map. If this happens, the test packet keys
will be overwritten and the test will fail.

Changing the "last_dissection" map to a hash map, keyed on the
source/dest port pair resolves this issue. Additionally, let's clear the
last test results from the map between tests to prevent previous test
cases from interfering with the following test cases.

Fixes: 0905beec9f52 ("selftests/bpf: run flow dissector tests in skb-less mode")
Signed-off-by: Petar Penkov <ppenkov@google.com>
---
 .../selftests/bpf/prog_tests/flow_dissector.c | 22 ++++++++++++++++++-
 tools/testing/selftests/bpf/progs/bpf_flow.c  | 13 +++++------
 2 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
index 700d73d2f22a..6892b88ae065 100644
--- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
+++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
@@ -109,6 +109,8 @@ struct test tests[] = {
 			.iph.protocol = IPPROTO_TCP,
 			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
 			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
 		},
 		.keys = {
 			.nhoff = ETH_HLEN,
@@ -116,6 +118,8 @@ struct test tests[] = {
 			.addr_proto = ETH_P_IP,
 			.ip_proto = IPPROTO_TCP,
 			.n_proto = __bpf_constant_htons(ETH_P_IP),
+			.sport = 80,
+			.dport = 8080,
 		},
 	},
 	{
@@ -125,6 +129,8 @@ struct test tests[] = {
 			.iph.nexthdr = IPPROTO_TCP,
 			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
 			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
 		},
 		.keys = {
 			.nhoff = ETH_HLEN,
@@ -132,6 +138,8 @@ struct test tests[] = {
 			.addr_proto = ETH_P_IPV6,
 			.ip_proto = IPPROTO_TCP,
 			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.sport = 80,
+			.dport = 8080,
 		},
 	},
 	{
@@ -143,6 +151,8 @@ struct test tests[] = {
 			.iph.protocol = IPPROTO_TCP,
 			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
 			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
 		},
 		.keys = {
 			.nhoff = ETH_HLEN + VLAN_HLEN,
@@ -150,6 +160,8 @@ struct test tests[] = {
 			.addr_proto = ETH_P_IP,
 			.ip_proto = IPPROTO_TCP,
 			.n_proto = __bpf_constant_htons(ETH_P_IP),
+			.sport = 80,
+			.dport = 8080,
 		},
 	},
 	{
@@ -161,6 +173,8 @@ struct test tests[] = {
 			.iph.nexthdr = IPPROTO_TCP,
 			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
 			.tcp.doff = 5,
+			.tcp.source = 80,
+			.tcp.dest = 8080,
 		},
 		.keys = {
 			.nhoff = ETH_HLEN + VLAN_HLEN * 2,
@@ -169,6 +183,8 @@ struct test tests[] = {
 			.addr_proto = ETH_P_IPV6,
 			.ip_proto = IPPROTO_TCP,
 			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
+			.sport = 80,
+			.dport = 8080,
 		},
 	},
 	{
@@ -487,7 +503,8 @@ void test_flow_dissector(void)
 			BPF_FLOW_DISSECTOR_F_PARSE_1ST_FRAG;
 		struct bpf_prog_test_run_attr tattr = {};
 		struct bpf_flow_keys flow_keys = {};
-		__u32 key = 0;
+		__u32 key = (__u32)(tests[i].keys.sport) << 16 |
+			    tests[i].keys.dport;
 
 		/* For skb-less case we can't pass input flags; run
 		 * only the tests that have a matching set of flags.
@@ -504,6 +521,9 @@ void test_flow_dissector(void)
 
 		CHECK_ATTR(err, tests[i].name, "skb-less err %d\n", err);
 		CHECK_FLOW_KEYS(tests[i].name, flow_keys, tests[i].keys);
+
+		err = bpf_map_delete_elem(keys_fd, &key);
+		CHECK_ATTR(err, tests[i].name, "bpf_map_delete_elem %d\n", err);
 	}
 
 	bpf_prog_detach(prog_fd, BPF_FLOW_DISSECTOR);
diff --git a/tools/testing/selftests/bpf/progs/bpf_flow.c b/tools/testing/selftests/bpf/progs/bpf_flow.c
index 08bd8b9d58d0..040a44206f29 100644
--- a/tools/testing/selftests/bpf/progs/bpf_flow.c
+++ b/tools/testing/selftests/bpf/progs/bpf_flow.c
@@ -65,8 +65,8 @@ struct {
 } jmp_table SEC(".maps");
 
 struct {
-	__uint(type, BPF_MAP_TYPE_ARRAY);
-	__uint(max_entries, 1);
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 1024);
 	__type(key, __u32);
 	__type(value, struct bpf_flow_keys);
 } last_dissection SEC(".maps");
@@ -74,12 +74,11 @@ struct {
 static __always_inline int export_flow_keys(struct bpf_flow_keys *keys,
 					    int ret)
 {
-	struct bpf_flow_keys *val;
-	__u32 key = 0;
+	__u32 key = (__u32)(keys->sport) << 16 | keys->dport;
+	struct bpf_flow_keys val;
 
-	val = bpf_map_lookup_elem(&last_dissection, &key);
-	if (val)
-		memcpy(val, keys, sizeof(*val));
+	memcpy(&val, keys, sizeof(val));
+	bpf_map_update_elem(&last_dissection, &key, &val, BPF_ANY);
 	return ret;
 }
 
-- 
2.23.0.rc1.153.gdeed80330f-goog


^ permalink raw reply related

* Re: [PATCH v1] dt-bindings: fec: explicitly mark deprecated properties
From: Rob Herring @ 2019-08-12 23:12 UTC (permalink / raw)
  To: Sven Van Asbroeck
  Cc: Fugang Duan, Mark Rutland, David S . Miller, netdev, devicetree,
	linux-kernel, Andrew Lunn, Fabio Estevam, Lucas Stach
In-Reply-To: <20190718201453.13062-1-TheSven73@gmail.com>

On Thu, 18 Jul 2019 16:14:53 -0400, Sven Van Asbroeck wrote:
> fec's gpio phy reset properties have been deprecated.
> Update the dt-bindings documentation to explicitly mark
> them as such, and provide a short description of the
> recommended alternative.
> 
> Signed-off-by: Sven Van Asbroeck <TheSven73@gmail.com>
> ---
>  .../devicetree/bindings/net/fsl-fec.txt       | 30 +++++++++++--------
>  1 file changed, 17 insertions(+), 13 deletions(-)
> 

Applied, thanks.

Rob

^ permalink raw reply

* Re: [PATCH net] ipv6: Fix return value of ipv6_mc_may_pull() for malformed packets
From: Guillaume Nault @ 2019-08-12 23:08 UTC (permalink / raw)
  To: Stefano Brivio
  Cc: David Miller, Hangbin Liu, Eric Dumazet, Linus Lüssing,
	netdev
In-Reply-To: <dc0d0b1bc3c67e2a1346b0dd1f68428eb956fbb7.1565649789.git.sbrivio@redhat.com>

On Tue, Aug 13, 2019 at 12:46:01AM +0200, Stefano Brivio wrote:
> Commit ba5ea614622d ("bridge: simplify ip_mc_check_igmp() and
> ipv6_mc_check_mld() calls") replaces direct calls to pskb_may_pull()
> in br_ipv6_multicast_mld2_report() with calls to ipv6_mc_may_pull(),
> that returns -EINVAL on buffers too short to be valid IPv6 packets,
> while maintaining the previous handling of the return code.
> 
> This leads to the direct opposite of the intended effect: if the
> packet is malformed, -EINVAL evaluates as true, and we'll happily
> proceed with the processing.
> 
> Return 0 if the packet is too short, in the same way as this was
> fixed for IPv4 by commit 083b78a9ed64 ("ip: fix ip_mc_may_pull()
> return value").
> 
> I don't have a reproducer for this, unlike the one referred to by
> the IPv4 commit, but this is clearly broken.
> 
> Fixes: ba5ea614622d ("bridge: simplify ip_mc_check_igmp() and ipv6_mc_check_mld() calls")
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> ---
>  include/net/addrconf.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/net/addrconf.h b/include/net/addrconf.h
> index becdad576859..3f62b347b04a 100644
> --- a/include/net/addrconf.h
> +++ b/include/net/addrconf.h
> @@ -206,7 +206,7 @@ static inline int ipv6_mc_may_pull(struct sk_buff *skb,
>  				   unsigned int len)
>  {
>  	if (skb_transport_offset(skb) + ipv6_transport_len(skb) < len)
> -		return -EINVAL;
> +		return 0;
>  
>  	return pskb_may_pull(skb, len);
>  }

Acked-by: Guillaume Nault <gnault@redhat.com>

^ permalink raw reply

* Re: tun: mark small packets as owned by the tap sock
From: Dave Jones @ 2019-08-12 22:19 UTC (permalink / raw)
  To: Alexis Bauvin; +Cc: netdev
In-Reply-To: <git-mailbomb-linux-master-4b663366246be1d1d4b1b8b01245b2e88ad9e706@kernel.org>

On Wed, Aug 07, 2019 at 12:30:07AM +0000, Linux Kernel wrote:
 > Commit:     4b663366246be1d1d4b1b8b01245b2e88ad9e706
 > Parent:     16b2084a8afa1432d14ba72b7c97d7908e178178
 > Web:        https://git.kernel.org/torvalds/c/4b663366246be1d1d4b1b8b01245b2e88ad9e706
 > Author:     Alexis Bauvin <abauvin@scaleway.com>
 > AuthorDate: Tue Jul 23 16:23:01 2019 +0200
 > 
 >     tun: mark small packets as owned by the tap sock
 >     
 >     - v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size
 >     
 >     Small packets going out of a tap device go through an optimized code
 >     path that uses build_skb() rather than sock_alloc_send_pskb(). The
 >     latter calls skb_set_owner_w(), but the small packet code path does not.
 >     
 >     The net effect is that small packets are not owned by the userland
 >     application's socket (e.g. QEMU), while large packets are.
 >     This can be seen with a TCP session, where packets are not owned when
 >     the window size is small enough (around PAGE_SIZE), while they are once
 >     the window grows (note that this requires the host to support virtio
 >     tso for the guest to offload segmentation).
 >     All this leads to inconsistent behaviour in the kernel, especially on
 >     netfilter modules that uses sk->socket (e.g. xt_owner).
 >     
 >     Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")
 >     Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
 >     Acked-by: Jason Wang <jasowang@redhat.com>

This commit breaks IPv6 routing when I deployed it on a Linode.
It seems to work briefly after boot, and then all packets get silently
dropped. (Presumably, it's dropping RA or ND packets.)

With this reverted, everything works as it did in rc3.

	Dave


^ permalink raw reply

* Re: [PATCH net] ipv4/route: do not check saddr dev if iif is LOOPBACK_IFINDEX
From: Stefano Brivio @ 2019-08-12 22:58 UTC (permalink / raw)
  To: David Miller; +Cc: dsahern, liuhangbin, netdev, mleitner
In-Reply-To: <20190811.204918.777837587917672157.davem@davemloft.net>

On Sun, 11 Aug 2019 20:49:18 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: David Ahern <dsahern@gmail.com>
> Date: Thu, 1 Aug 2019 22:16:00 -0600
> 
> > On 8/1/19 10:13 PM, Hangbin Liu wrote:  
> >> On Thu, Aug 01, 2019 at 01:51:25PM -0600, David Ahern wrote:  
> >>> On 8/1/19 2:29 AM, Hangbin Liu wrote:  
> >>>> Jianlin reported a bug that for IPv4, ip route get from src_addr would fail
> >>>> if src_addr is not an address on local system.
> >>>>
> >>>> \# ip route get 1.1.1.1 from 2.2.2.2
> >>>> RTNETLINK answers: Invalid argument  
> >>>
> >>> so this is a forwarding lookup in which case iif should be set. Based on  
> >> 
> >> with out setting iif in userspace, the kernel set iif to lo by default.  
> > 
> > right, it presumes locally generated traffic.  
> >>   
> >>> the above 'route get' inet_rtm_getroute is doing a lookup as if it is
> >>> locally generated traffic.  
> >> 
> >> yeah... but what about the IPv6 part. That cause a different behavior in
> >> userspace.  
> > 
> > just one of many, many annoying differences between v4 and v6. We could
> > try to catalog it.  
> 
> I think we just have to accept this difference because this change would
> change behavior for all route lookups, not just those done by ip route get.

How so, actually? I don't see how that would happen. On the forwarding
path, 'iif' is set (not to loopback interface), so that's not affected.

Is there any other route lookup possibility I'm missing?

-- 
Stefano

^ permalink raw reply

