Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next 7/7] mlxsw: spectrum: Add extack messages for enslave failures
From: Ido Schimmel @ 2017-10-04 13:24 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, j.vosburgh, vfalico, andy, jiri, idosch, davem, bridge
In-Reply-To: <1507093134-20406-8-git-send-email-dsahern@gmail.com>

On Tue, Oct 03, 2017 at 09:58:54PM -0700, David Ahern wrote:
> mlxsw fails device enslavement for a number of reasons. Use the extack
> facility to return an error message to the user stating why the enslave
> is failing.
> 
> Messages are prefixed with "spectrum" so users know it is a constraint
> imposed by the hardware driver. For example:
>     $ ip li add br0.11 link br0 type vlan id 11
>     $ ip li set swp11 master br0
>     Error: spectrum: Enslaving a port to a device that already has an upper device is not supported.
> 
> Signed-off-by: David Ahern <dsahern@gmail.com>

Great stuff, thanks David!

Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>

See one small nit below, but I would like to patch
mlxsw_sp_netdevice_port_vlan_event() as well, so I can take care of it
then.

> ---
>  drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 46 ++++++++++++++++++++------
>  1 file changed, 36 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
> index 3adf237c951a..a6cc00e4e543 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
> @@ -4019,14 +4019,20 @@ static int mlxsw_sp_lag_index_get(struct mlxsw_sp *mlxsw_sp,
>  static bool
>  mlxsw_sp_master_lag_check(struct mlxsw_sp *mlxsw_sp,
>  			  struct net_device *lag_dev,
> -			  struct netdev_lag_upper_info *lag_upper_info)
> +			  struct netdev_lag_upper_info *lag_upper_info,
> +			  struct netlink_ext_ack *extack)
>  {
>  	u16 lag_id;
>  
> -	if (mlxsw_sp_lag_index_get(mlxsw_sp, lag_dev, &lag_id) != 0)
> +	if (mlxsw_sp_lag_index_get(mlxsw_sp, lag_dev, &lag_id) != 0) {
> +		NL_SET_ERR_MSG(extack, "spectrum: Unknown LAG device");

A more accurate message would be:
"spectrum: Exceeded number of supported LAG devices"

>  		return false;
> -	if (lag_upper_info->tx_type != NETDEV_LAG_TX_TYPE_HASH)
> +	}
> +	if (lag_upper_info->tx_type != NETDEV_LAG_TX_TYPE_HASH) {
> +		NL_SET_ERR_MSG(extack,
> +			       "spectrum: LAG device using unsupported Tx type");
>  		return false;
> +	}
>  	return true;
>  }

^ permalink raw reply

* Re: etsec2 attached to sgmii phy
From: Andrew Lunn @ 2017-10-04 13:36 UTC (permalink / raw)
  To: Jörg Willmann; +Cc: netdev
In-Reply-To: <alpine.DEB.2.20.1710040756160.31757@brian.int1.clnt.de>

On Wed, Oct 04, 2017 at 07:56:53AM +0200, Jörg Willmann wrote:
> Hi,
> 
> we use a QorIQ P1011 connected via SGMII to a switch (Marvell 88E6352).
> Currently we still use a really old linux kernel (2.6.33) successfully.
> 
> For configuration of the MDIO Bus attached to the corresponding eTSEC/TBI
> Phy we use the following settings in the device tree:
> 
>                         mdio@25000 {
>                                    #address-cells = <0x1>;
>                                    #size-cells = <0x0>;
>                                    compatible = "fsl,etsec2-tbi";
>                                    reg = <0x25000 0x1000 0xb1030 0x4>;

Hi Joerg

Is 0xb1030 0x4 fixed by the silicon? Can it be expressed as an offset from
0x25000?

It seems like the idea behind the patch is to hard code some
things. If you can hard code the offset into get_etsec_tbipa(), i
think that would be an O.K. solution to your problem.

      Andrew

^ permalink raw reply

* [PATCH v1] s390/qeth: use kstrtobool() in qeth_bridgeport_hostnotification_store()
From: Andy Shevchenko @ 2017-10-04 13:38 UTC (permalink / raw)
  To: Julian Wiedmann, Ursula Braun, Martin Schwidefsky, Heiko Carstens,
	linux-s390, David Miller, netdev
  Cc: Andy Shevchenko

The sysfs enabled value is a boolean, so kstrtobool() is a better fit
for parsing the input string since it does the range checking for us.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 drivers/s390/net/qeth_l2_sys.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/s390/net/qeth_l2_sys.c b/drivers/s390/net/qeth_l2_sys.c
index 4608daedb204..470a4e5f3c62 100644
--- a/drivers/s390/net/qeth_l2_sys.c
+++ b/drivers/s390/net/qeth_l2_sys.c
@@ -146,18 +146,15 @@ static ssize_t qeth_bridgeport_hostnotification_store(struct device *dev,
 		struct device_attribute *attr, const char *buf, size_t count)
 {
 	struct qeth_card *card = dev_get_drvdata(dev);
-	int rc = 0;
-	int enable;
+	bool enable;
+	int rc;
 
 	if (!card)
 		return -EINVAL;
 
-	if (sysfs_streq(buf, "0"))
-		enable = 0;
-	else if (sysfs_streq(buf, "1"))
-		enable = 1;
-	else
-		return -EINVAL;
+	rc = kstrtobool(buf, &enable);
+	if (rc)
+		return rc;
 
 	mutex_lock(&card->conf_mutex);
 
-- 
2.14.2

^ permalink raw reply related

* [PATCH net-next] rtnetlink: remove slave_validate callback
From: Florian Westphal @ 2017-10-04 13:55 UTC (permalink / raw)
  To: netdev; +Cc: jiri, Florian Westphal

no users in the tree.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/rtnetlink.h | 3 ---
 net/core/rtnetlink.c    | 6 ------
 2 files changed, 9 deletions(-)

diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 21837ca68ecc..6520993ff449 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -93,9 +93,6 @@ struct rtnl_link_ops {
 
 	int			slave_maxtype;
 	const struct nla_policy	*slave_policy;
-	int			(*slave_validate)(struct nlattr *tb[],
-						  struct nlattr *data[],
-						  struct netlink_ext_ack *extack);
 	int			(*slave_changelink)(struct net_device *dev,
 						    struct net_device *slave_dev,
 						    struct nlattr *tb[],
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 3961f87cdc76..b63c5759641f 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2631,12 +2631,6 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 					return err;
 				slave_data = slave_attr;
 			}
-			if (m_ops->slave_validate) {
-				err = m_ops->slave_validate(tb, slave_data,
-							    extack);
-				if (err < 0)
-					return err;
-			}
 		}
 
 		if (dev) {
-- 
2.13.6

^ permalink raw reply related

* Re: [PATCH net-next] rtnetlink: remove slave_validate callback
From: Jiri Pirko @ 2017-10-04 13:56 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev
In-Reply-To: <20171004135529.3967-1-fw@strlen.de>

Wed, Oct 04, 2017 at 03:55:29PM CEST, fw@strlen.de wrote:
>no users in the tree.
>
>Signed-off-by: Florian Westphal <fw@strlen.de>

Acked-by: Jiri Pirko <jiri@mellanox.com>

^ permalink raw reply

* Re: [PATCH net-next v2 1/2] libbpf: parse maps sections of varying size
From: Craig Gallek @ 2017-10-04 13:58 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Alexei Starovoitov, Daniel Borkmann, David S . Miller,
	Chonggang Li, netdev, Eric Leblond, Wangnan (F)
In-Reply-To: <20171003160348.1d84548d@redhat.com>

On Tue, Oct 3, 2017 at 10:03 AM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
>
> First of all, thank you Craig for working on this.  As Alexei says, we
> need to improve tools/lib/bpf/libbpf and move towards converting users
> of bpf_load.c to this lib instead.
>
> Comments inlined below.
>
>> +                     obj->maps[map_idx].def = *def;
>
> I'm not too happy/comfortable with this way of copying the memory of
> "def" (the type-cased struct bpf_map_def).  I guess it works, and is
> part of the C-standard(?).

I believe this is a C++-ism.  I'm not sure if it was pulled into the
C99 standard or if it's just a gcc 'feature' now.  I kept it because
it was in the initial code, but I'm happy to do an explicit copy for
v3 if that looks better.

^ permalink raw reply

* [PATCH net-next] rtnetlink: remove __rtnl_af_unregister
From: Florian Westphal @ 2017-10-04 13:58 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

switch the only caller to rtnl_af_unregister.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/rtnetlink.h |  2 --
 net/core/rtnetlink.c    | 14 +-------------
 net/ipv6/addrconf.c     |  4 ++--
 3 files changed, 3 insertions(+), 17 deletions(-)

diff --git a/include/net/rtnetlink.h b/include/net/rtnetlink.h
index 6520993ff449..e3ca8e2e3103 100644
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -151,8 +151,6 @@ struct rtnl_af_ops {
 	size_t			(*get_stats_af_size)(const struct net_device *dev);
 };
 
-void __rtnl_af_unregister(struct rtnl_af_ops *ops);
-
 void rtnl_af_register(struct rtnl_af_ops *ops);
 void rtnl_af_unregister(struct rtnl_af_ops *ops);
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index b63c5759641f..3fb1ca33cba4 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -476,25 +476,13 @@ void rtnl_af_register(struct rtnl_af_ops *ops)
 EXPORT_SYMBOL_GPL(rtnl_af_register);
 
 /**
- * __rtnl_af_unregister - Unregister rtnl_af_ops from rtnetlink.
- * @ops: struct rtnl_af_ops * to unregister
- *
- * The caller must hold the rtnl_mutex.
- */
-void __rtnl_af_unregister(struct rtnl_af_ops *ops)
-{
-	list_del(&ops->list);
-}
-EXPORT_SYMBOL_GPL(__rtnl_af_unregister);
-
-/**
  * rtnl_af_unregister - Unregister rtnl_af_ops from rtnetlink.
  * @ops: struct rtnl_af_ops * to unregister
  */
 void rtnl_af_unregister(struct rtnl_af_ops *ops)
 {
 	rtnl_lock();
-	__rtnl_af_unregister(ops);
+	list_del(&ops->list);
 	rtnl_unlock();
 }
 EXPORT_SYMBOL_GPL(rtnl_af_unregister);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index f553f72d0bee..837418ff2d4b 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -6618,9 +6618,9 @@ void addrconf_cleanup(void)
 	unregister_pernet_subsys(&addrconf_ops);
 	ipv6_addr_label_cleanup();
 
-	rtnl_lock();
+	rtnl_af_unregister(&inet6_ops);
 
-	__rtnl_af_unregister(&inet6_ops);
+	rtnl_lock();
 
 	/* clean dev list */
 	for_each_netdev(&init_net, dev) {
-- 
2.13.6

^ permalink raw reply related

* Re: [PATCH net-next v2 1/2] libbpf: parse maps sections of varying size
From: Craig Gallek @ 2017-10-04 13:58 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Alexei Starovoitov, Daniel Borkmann, David S . Miller,
	Chonggang Li, netdev, Eric Leblond
In-Reply-To: <20171003161151.039979f4@redhat.com>

On Tue, Oct 3, 2017 at 10:11 AM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
> On Mon,  2 Oct 2017 12:41:28 -0400
> Craig Gallek <kraigatgoog@gmail.com> wrote:
>
>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>> index 4f402dcdf372..28b300868ad7 100644
>> --- a/tools/lib/bpf/libbpf.c
>> +++ b/tools/lib/bpf/libbpf.c
>> @@ -580,7 +580,7 @@ bpf_object__init_kversion(struct bpf_object *obj,
>>  }
>>
>>  static int
>> -bpf_object__validate_maps(struct bpf_object *obj)
>> +bpf_object__validate_maps(struct bpf_object *obj, int map_def_sz)
>>  {
>>       int i;
>>
>> @@ -595,9 +595,11 @@ bpf_object__validate_maps(struct bpf_object *obj)
>>               const struct bpf_map *a = &obj->maps[i - 1];
>>               const struct bpf_map *b = &obj->maps[i];
>>
>> -             if (b->offset - a->offset < sizeof(struct bpf_map_def)) {
>> -                     pr_warning("corrupted map section in %s: map \"%s\" too small\n",
>> -                                obj->path, a->name);
>> +             if (b->offset - a->offset < map_def_sz) {
>> +                     pr_warning("corrupted map section in %s: map \"%s\" too small "
>> +                                "(%zd vs %d)\n",
>> +                                obj->path, a->name, b->offset - a->offset,
>> +                                map_def_sz);
>>                       return -EINVAL;
>
> Hmm... one more comment.  You have just coded handling of ELF
> map_def_sz which are smaller in a safe manor, but here this case will
> get rejected (in bpf_object__validate_maps).  That cannot be the right
> intend?

I think you are right, but my test program didn't catch this...
Perhaps an off-by-one (I only used slightly smaller map_def sizes).  I
suppose this entire function is unnecessary now.  Even the old version
wouldn't catch the case where the last offset was out of bounds?

> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH net-next v2 1/2] libbpf: parse maps sections of varying size
From: Craig Gallek @ 2017-10-04 13:58 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Alexei Starovoitov, Daniel Borkmann, David S . Miller,
	Chonggang Li, netdev, Eric Leblond
In-Reply-To: <20171003161151.039979f4@redhat.com>

On Tue, Oct 3, 2017 at 10:11 AM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
> On Mon,  2 Oct 2017 12:41:28 -0400
> Craig Gallek <kraigatgoog@gmail.com> wrote:
>
>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>> index 4f402dcdf372..28b300868ad7 100644
>> --- a/tools/lib/bpf/libbpf.c
>> +++ b/tools/lib/bpf/libbpf.c
>> @@ -580,7 +580,7 @@ bpf_object__init_kversion(struct bpf_object *obj,
>>  }
>>
>>  static int
>> -bpf_object__validate_maps(struct bpf_object *obj)
>> +bpf_object__validate_maps(struct bpf_object *obj, int map_def_sz)
>>  {
>>       int i;
>>
>> @@ -595,9 +595,11 @@ bpf_object__validate_maps(struct bpf_object *obj)
>>               const struct bpf_map *a = &obj->maps[i - 1];
>>               const struct bpf_map *b = &obj->maps[i];
>>
>> -             if (b->offset - a->offset < sizeof(struct bpf_map_def)) {
>> -                     pr_warning("corrupted map section in %s: map \"%s\" too small\n",
>> -                                obj->path, a->name);
>> +             if (b->offset - a->offset < map_def_sz) {
>> +                     pr_warning("corrupted map section in %s: map \"%s\" too small "
>> +                                "(%zd vs %d)\n",
>> +                                obj->path, a->name, b->offset - a->offset,
>> +                                map_def_sz);
>>                       return -EINVAL;
>
> Hmm... one more comment.  You have just coded handling of ELF
> map_def_sz which are smaller in a safe manor, but here this case will
> get rejected (in bpf_object__validate_maps).  That cannot be the right
> intend?

I think you are right, but my test program didn't catch this...
Perhaps an off-by-one (I only used slightly smaller map_def sizes).  I
suppose this entire function is unnecessary now.  Even the old version
wouldn't catch the case where the last offset was out of bounds?

> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH net-next v2 1/2] libbpf: parse maps sections of varying size
From: Craig Gallek @ 2017-10-04 13:59 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexei Starovoitov, Jesper Dangaard Brouer, David S . Miller,
	Chonggang Li, netdev
In-Reply-To: <59D3A12B.7010101@iogearbox.net>

On Tue, Oct 3, 2017 at 10:39 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 10/03/2017 01:07 AM, Alexei Starovoitov wrote:
>>
>> On 10/2/17 9:41 AM, Craig Gallek wrote:
>>>
>>> +    /* Assume equally sized map definitions */
>>> +    map_def_sz = data->d_size / nr_maps;
>>> +    if (!data->d_size || (data->d_size % nr_maps) != 0) {
>>> +        pr_warning("unable to determine map definition size "
>>> +               "section %s, %d maps in %zd bytes\n",
>>> +               obj->path, nr_maps, data->d_size);
>>> +        return -EINVAL;
>>> +    }
>>
>>
>> this approach is not as flexible as done by samples/bpf/bpf_load.c
>> where it looks at every map independently by walking symtab,
>> but I guess it's ok.
>
>
> Regarding different map spec structs in a single prog: unless
> we have a good use case why we would need it (and I'm not aware
> of anything in particular), I would just go with a fixed size.
> I did kind of similar sanity checks in bpf_fetch_maps_end() in
> iproute2 loader as well.

Split vote?  I'm happy to try to make this work with varying sizes,
but I agree the usefulness seems low.  It would imply building map
sections with different definition structures and we would then need a
way to differentiate them.  If the goal is to allow for different map
definition structures, I don't think size alone is a sufficient
differentiator.

^ permalink raw reply

* RE: [PATCH net-next v2 1/2] libbpf: parse maps sections of varying size
From: David Laight @ 2017-10-04 14:12 UTC (permalink / raw)
  To: 'Craig Gallek', Jesper Dangaard Brouer
  Cc: Alexei Starovoitov, Daniel Borkmann, David S . Miller,
	Chonggang Li, netdev, Eric Leblond, Wangnan (F)
In-Reply-To: <CAEfhGizpYSn=sQ+XvuTUzTZDqLLMg8mODE18HgM119sq3hyhnw@mail.gmail.com>

From: Craig Gallek
> Sent: 04 October 2017 14:58
> >> +                     obj->maps[map_idx].def = *def;
> >
> > I'm not too happy/comfortable with this way of copying the memory of
> > "def" (the type-cased struct bpf_map_def).  I guess it works, and is
> > part of the C-standard(?).
> 
> I believe this is a C++-ism.  I'm not sure if it was pulled into the
> C99 standard or if it's just a gcc 'feature' now.  I kept it because
> it was in the initial code, but I'm happy to do an explicit copy for
> v3 if that looks better.

Structure copies have been valid C for ages.

Annoyingly gcc will call memcpy() directly (not a 'private' function).

	David


^ permalink raw reply

* Re: [PATCH net-next] selftests: rtnetlink: try concurrent change of ifalias
From: Eric Dumazet @ 2017-10-04 14:16 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev
In-Reply-To: <20171004115258.5878-1-fw@strlen.de>

On Wed, 2017-10-04 at 13:52 +0200, Florian Westphal wrote:
> to make sure this is serialized correctly.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  tools/testing/selftests/net/rtnetlink.sh | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh
> index 62c87da92770..13c29beb5769 100755
> --- a/tools/testing/selftests/net/rtnetlink.sh
> +++ b/tools/testing/selftests/net/rtnetlink.sh
> @@ -278,6 +278,14 @@ kci_test_ifalias()
>  	        ip link show "$devdummy" | grep -q "alias $namewant"
>  		check_fail $?
>  
> +		for i in $(seq 1 100); do
> +			uuidgen > "$syspathname" &
> +		done
> +
> +		for i in $(seq 1 100); do
> +			wait
> +		done
> +

A single wait is enough, you do not need a loop.

^ permalink raw reply

* Re: etsec2 attached to sgmii phy
From: Jörg Willmann @ 2017-10-04 14:19 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev
In-Reply-To: <20171004133642.GM30045@lunn.ch>

[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]

On Wed, 4 Oct 2017, Andrew Lunn wrote:

> On Wed, Oct 04, 2017 at 07:56:53AM +0200, Jörg Willmann wrote:
>> Hi,
>>
>> we use a QorIQ P1011 connected via SGMII to a switch (Marvell 88E6352).
>> Currently we still use a really old linux kernel (2.6.33) successfully.
>>
>> For configuration of the MDIO Bus attached to the corresponding eTSEC/TBI
>> Phy we use the following settings in the device tree:
>>
>>                         mdio@25000 {
>>                                    #address-cells = <0x1>;
>>                                    #size-cells = <0x0>;
>>                                    compatible = "fsl,etsec2-tbi";
>>                                    reg = <0x25000 0x1000 0xb1030 0x4>;
>
> Hi Joerg
>
> Is 0xb1030 0x4 fixed by the silicon? Can it be expressed as an offset from
> 0x25000?
>
> It seems like the idea behind the patch is to hard code some
> things. If you can hard code the offset into get_etsec_tbipa(), i
> think that would be an O.K. solution to your problem.
>
>      Andrew
>
Yes, the adress 0xb1030 is fixed but it's something totally different than 
the address range of 0x25000. 0xb0000, 0xb1000 and 0xb2000 are base 
addresses of the eTSEC MAC (TPIPA is a register within the MAC) and 
0x24000, 0x25000 and 0x26000 are the base registers of the corresponding 
MDIO controllers. So I wouldn't add a dependency between these two things.
>From my point of view, the implementation in the old kernel where 
get_gfar_tbipa() got the device tree node pointer as argument was not soo 
bad ;-)

 	Joerg

^ permalink raw reply

* [PATCH v2 net-next] selftests: rtnetlink: try concurrent change of ifalias
From: Florian Westphal @ 2017-10-04 14:22 UTC (permalink / raw)
  To: netdev; +Cc: Florian Westphal

to make sure this is serialized correctly.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 change since v1:
 Eric points out 'wait' blocks for all current children, so no need
 for another loop.

 tools/testing/selftests/net/rtnetlink.sh | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh
index 62c87da92770..e8c86c416ed0 100755
--- a/tools/testing/selftests/net/rtnetlink.sh
+++ b/tools/testing/selftests/net/rtnetlink.sh
@@ -278,6 +278,12 @@ kci_test_ifalias()
 	        ip link show "$devdummy" | grep -q "alias $namewant"
 		check_fail $?
 
+		for i in $(seq 1 100); do
+			uuidgen > "$syspathname" &
+		done
+
+		wait
+
 		# re-add the alias -- kernel should free mem when dummy dev is removed
 		ip link set dev "$devdummy" alias "$namewant"
 		check_err $?
-- 
2.13.6

^ permalink raw reply related

* Re: [PATCH iproute2 1/3] ss: allow AF_FAMILY constants >32
From: Stefan Hajnoczi @ 2017-10-04 15:01 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Jorgen Hansen, Dexuan Cui
In-Reply-To: <20171003112653.213c7cb3@xeon-e3>

On Tue, Oct 03, 2017 at 11:26:53AM -0700, Stephen Hemminger wrote:
> On Tue,  3 Oct 2017 13:57:42 -0400
> Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> > Linux has more than 32 address families defined in <bits/socket.h>.  Use
> > a 64-bit type so all of them can be represented in the filter->families
> > bitmask.
> > 
> > It's easy to introduce bugs when using (1 << AF_FAMILY) because the
> > value is 32-bit.  This can produce incorrect results from bitmask
> > operations so introduce the FAMILY_MASK() macro to eliminate these bugs.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  misc/ss.c | 54 ++++++++++++++++++++++++++++--------------------------
> >  1 file changed, 28 insertions(+), 26 deletions(-)
> > 
> > diff --git a/misc/ss.c b/misc/ss.c
> > index dd8dfaa4..12a31c90 100644
> > --- a/misc/ss.c
> > +++ b/misc/ss.c
> > @@ -170,55 +170,57 @@ enum {
> >  struct filter {
> >  	int dbs;
> >  	int states;
> > -	int families;
> > +	__u64 families;
> 
> Since this isn't a value that is coming from kernel. It should be uint64_t
> rather than __u64.

Okay, will fix in v2.

^ permalink raw reply

* RE: [PATCH iproute2 1/3] ss: allow AF_FAMILY constants >32
From: David Laight @ 2017-10-04 15:03 UTC (permalink / raw)
  To: 'Stefan Hajnoczi', Stephen Hemminger
  Cc: netdev@vger.kernel.org, Jorgen Hansen, Dexuan Cui
In-Reply-To: <20171004150119.GA3840@stefanha-x1.localdomain>

From: Stefan Hajnoczi
> Sent: 04 October 2017 16:01
...
> > > --- a/misc/ss.c
> > > +++ b/misc/ss.c
> > > @@ -170,55 +170,57 @@ enum {
> > >  struct filter {
> > >  	int dbs;
> > >  	int states;
> > > -	int families;
> > > +	__u64 families;
> >
> > Since this isn't a value that is coming from kernel. It should be uint64_t
> > rather than __u64.
> 
> Okay, will fix in v2.

But that looks like a kernel structure, not one exposed to userspace.
So surely __u64 is correct.

	David

^ permalink raw reply

* Re: [PATCH iproute2 2/3] include: add <linux/vm_sockets_diag.h>
From: Stefan Hajnoczi @ 2017-10-04 15:03 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Jorgen Hansen, Dexuan Cui
In-Reply-To: <20171003112455.5ebac674@xeon-e3>

On Tue, Oct 03, 2017 at 11:24:55AM -0700, Stephen Hemminger wrote:
> On Tue,  3 Oct 2017 13:57:43 -0400
> Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> > This new Linux header file defines the sock_diag interface used by
> > AF_VSOCK.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  include/linux/vm_sockets_diag.h | 33 +++++++++++++++++++++++++++++++++
> >  1 file changed, 33 insertions(+)
> >  create mode 100644 include/linux/vm_sockets_diag.h
> > 
> 
> All header files in linux must be headers that come from 'make install_headers'
> in kernel. I don't see vm_sockets_diag.h even in net-next.

This series depends on "[PATCH 0/5] VSOCK: add sock_diag interface".
Once that has been merged in Linux the <linux/vm_sockets_diag.h> header
will be available.

I sent the ss(8) patch at the same time and referenced the Linux patch
series in the cover letter so they can be reviewed/tested together.

Thanks,
Stefan

^ permalink raw reply

* Re: [PATCH 4/5] VSOCK: add sock_diag interface
From: Stefan Hajnoczi @ 2017-10-04 15:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, jhansen, decui
In-Reply-To: <20171003.214652.1920235341803755309.davem@davemloft.net>

On Tue, Oct 03, 2017 at 09:46:52PM -0700, David Miller wrote:
> From: Stefan Hajnoczi <stefanha@redhat.com>
> Date: Tue,  3 Oct 2017 11:39:42 -0400
> 
> > +static int sk_diag_fill(struct sock *sk, struct sk_buff *skb,
> > +			u32 portid, u32 seq, u32 flags)
> > +{
> > +	struct nlmsghdr *nlh;
> > +	struct vsock_diag_msg *rep;
> > +	struct vsock_sock *vsk = vsock_sk(sk);
> 
> Please order local variables from longest to shortest line.
> 
> > +static int vsock_diag_dump(struct sk_buff *skb, struct netlink_callback *cb)
> > +{
> > +	struct vsock_diag_req *req;
> > +	unsigned int table;
> > +	unsigned int bucket;
> > +	unsigned int last_i;
> > +	unsigned int i;
> > +	struct vsock_sock *vsk;
> > +	struct net *net;
> 
> Likewise.

Will fix in v2.

^ permalink raw reply

* Re: [PATCH 5/5] VSOCK: add tools/vsock/vsock_diag_test
From: Stefan Hajnoczi @ 2017-10-04 15:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, jhansen, decui
In-Reply-To: <20171003.214805.176732000485014040.davem@davemloft.net>

On Tue, Oct 03, 2017 at 09:48:05PM -0700, David Miller wrote:
> From: Stefan Hajnoczi <stefanha@redhat.com>
> Date: Tue,  3 Oct 2017 11:39:43 -0400
> 
> >  MAINTAINERS                   |   1 +
> >  tools/vsock/Makefile          |   9 +
> >  tools/vsock/control.h         |  13 +
> >  tools/vsock/timeout.h         |  14 +
> >  tools/vsock/control.c         | 219 ++++++++++++++
> >  tools/vsock/timeout.c         |  64 ++++
> >  tools/vsock/vsock_diag_test.c | 681 ++++++++++++++++++++++++++++++++++++++++++
> >  tools/vsock/.gitignore        |   2 +
> 
> Please don't create you own "special" directory for tests.
> 
> Tests belong under tools/testing/selftests/
> 
> If you put your tests in the proper place, and structure them properly (especially
> the Makefile rules), they will automatically be run by various automated build
> and test frameworks.

Cool, thanks for the hint.  I wasn't aware of that.

Will fix in v2.

Stefan

^ permalink raw reply

* Re: [PATCH 2/5] VSOCK: export __vsock_in_bound/connected_table()
From: Stefan Hajnoczi @ 2017-10-04 15:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, jhansen, decui
In-Reply-To: <20171003.214617.1472124119367144767.davem@davemloft.net>

On Tue, Oct 03, 2017 at 09:46:17PM -0700, David Miller wrote:
> From: Stefan Hajnoczi <stefanha@redhat.com>
> Date: Tue,  3 Oct 2017 11:39:40 -0400
> 
> > @@ -250,15 +250,17 @@ static struct sock *__vsock_find_connected_socket(struct sockaddr_vm *src,
> >  	return NULL;
> >  }
> >  
> > -static bool __vsock_in_bound_table(struct vsock_sock *vsk)
> > +bool __vsock_in_bound_table(struct vsock_sock *vsk)
> >  {
> >  	return !list_empty(&vsk->bound_table);
> >  }
> > +EXPORT_SYMBOL_GPL(__vsock_in_bound_table);
> >  
> > -static bool __vsock_in_connected_table(struct vsock_sock *vsk)
> > +bool __vsock_in_connected_table(struct vsock_sock *vsk)
> >  {
> >  	return !list_empty(&vsk->connected_table);
> >  }
> > +EXPORT_SYMBOL_GPL(__vsock_in_connected_table);
> 
> Maybe you can just make these inline helpers in af_vsock.h?

Will fix in v2.

^ permalink raw reply

* Re: [PATCH 0/1] XDP Program for Ip forward
From: Jesper Dangaard Brouer @ 2017-10-04 15:07 UTC (permalink / raw)
  To: cjacob; +Cc: brouer, netdev, linux-kernel, linux-arm-kernel
In-Reply-To: <1507016225-319-1-git-send-email-Christina.Jacob@cavium.com>

First of all thank you for working on this.

On Tue,  3 Oct 2017 13:07:04 +0530 cjacob <Christina.Jacob@cavium.com> wrote:

> 	Usage: ./xdp3 [-S] <ifindex1...ifindexn> 
> 
> 	-S to choose generic xdp implementation 
> 	  [Default is driver xdp implementation]
> 	ifindex - the index of the interface to which 
> 	the xdp program has to be attached.
> 	in 4.14-rc3 kernel.

I would prefer if we can name the program something more descriptive
than "xdp3".  What about "xdp_redirect_router" or "xdp_router_ipv4" ?

I would also appreciate if we can stop using ifindex'es, and instead
use normal device ifname's.  And simply do the lookup to the ifindex in
the program via if_nametoindex(ifname), see how in [1] and [2].

When adding more ifname's you can just use the same trick as with
multiple --cpu options like [1] and [2].

[1] http://lkml.kernel.org/r/150711864538.9499.11712573036995600273.stgit@firesoul
[2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/samples/bpf/xdp_redirect_cpu_user.c

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH net-next 6/7] net: bridge: Pass extack to down to netdev_master_upper_dev_link
From: Stephen Hemminger @ 2017-10-04 15:13 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev, j.vosburgh, vfalico, andy, jiri, idosch, davem, bridge
In-Reply-To: <1507093134-20406-7-git-send-email-dsahern@gmail.com>

On Tue,  3 Oct 2017 21:58:53 -0700
David Ahern <dsahern@gmail.com> wrote:

> Pass extack arg to br_add_if. Add messages for a couple of failures
> and pass arg to netdev_master_upper_dev_link.
> 
> Signed-off-by: David Ahern <dsahern@gmail.com>

This looks good. You might want to pass the netlink_ext_ack down as
an immutable pointer (const).

Acked-by: Stephen Hemminger <stephen@networkplumber.org>

^ permalink raw reply

* Re: [PATCH iproute2 1/3] ss: allow AF_FAMILY constants >32
From: Stephen Hemminger @ 2017-10-04 15:14 UTC (permalink / raw)
  To: David Laight
  Cc: 'Stefan Hajnoczi', netdev@vger.kernel.org, Jorgen Hansen,
	Dexuan Cui
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DD0089AB4@AcuExch.aculab.com>

On Wed, 4 Oct 2017 15:03:09 +0000
David Laight <David.Laight@ACULAB.COM> wrote:

> From: Stefan Hajnoczi
> > Sent: 04 October 2017 16:01  
> ...
> > > > --- a/misc/ss.c
> > > > +++ b/misc/ss.c
> > > > @@ -170,55 +170,57 @@ enum {
> > > >  struct filter {
> > > >  	int dbs;
> > > >  	int states;
> > > > -	int families;
> > > > +	__u64 families;  
> > >
> > > Since this isn't a value that is coming from kernel. It should be uint64_t
> > > rather than __u64.  
> > 
> > Okay, will fix in v2.  
> 
> But that looks like a kernel structure, not one exposed to userspace.
> So surely __u64 is correct.
> 
> 	David
> 

families is a bit field which is built from data, not from kernel.

^ permalink raw reply

* Re: Linux 4.12+ memory leak on router with i40e NICs
From: Alexander Duyck @ 2017-10-04 15:32 UTC (permalink / raw)
  To: Anders K. Pedersen | Cohaesio
  Cc: alexander.h.duyck@intel.com, netdev@vger.kernel.org,
	intel-wired-lan@lists.osuosl.org
In-Reply-To: <1507121766.30720.4.camel@cohaesio.com>

On Wed, Oct 4, 2017 at 5:56 AM, Anders K. Pedersen | Cohaesio
<akp@cohaesio.com> wrote:
> Hello,
>
> After updating one of our Linux based routers to kernel 4.13 it began
> leaking memory quite fast (about 1 GB every half hour). To narrow we
> tried various kernel versions and found that 4.11.12 is okay, while
> 4.12 also leaks, so we did a bisection between 4.11 and 4.12.
>
> The first bisection ended at
> "[6964e53f55837b0c49ed60d36656d2e0ee4fc27b] i40e: fix handling of HW
> ATR eviction", which fixes some flag handling that was broken by
> 47994c119a36 "i40e: remove hw_disabled_flags in favor of using separate
> flag bits", so I did a second bisection, where I added 6964e53f5583
> "i40e: fix handling of HW ATR eviction" to the steps that had
> 47994c119a36 "i40e: remove hw_disabled_flags in favor of using separate
> flag bits" in them.
>
> The second bisection ended at
> "[0e626ff7ccbfc43c6cc4aeea611c40b899682382] i40e: Fix support for flow
> director programming status", where I don't see any obvious problems,
> so I'm hoping for some assistance.
>
> The router is a PowerEdge R730 server (Haswell based) with three Intel
> NICs (all using the i40e driver):
>
> X710 quad port 10 GbE SFP+: eth0 eth1 eth2 eth3
> X710 quad port 10 GbE SFP+: eth4 eth5 eth6 eth7
> XL710 dual port 40 GbE QSFP+: eth8 eth9
>
> The NICs are aggregated with LACP with the team driver:
>
> team0: eth9 (40 GbE selected primary), and eth3, eth7 (10 GbE non-selected backups)
> team1: eth0, eth1, eth4, eth5 (all 10 GbE selected)
>
> team0 is used for internal networks and has one untagged and four
> tagged VLAN interfaces, while team1 has an external uplink connection
> without any VLANs.
>
> The router runs an eBGP session on team1 to one of our uplinks, and
> iBGP via team0 to our other border routers. It also runs OSPF on the
> internal VLANs on team0. One thing I've noticed is that when OSPF is
> not announcing a default gateway to the internal networks, so there is
> almost no traffic coming in on team0 and out on team1, but still plenty
> of traffic coming in on team1 and out via team0, there's no memory leak
> (or at least it is so small that we haven't detected it). But as soon
> as we configure OSPF to announce a default gateway to the internal
> VLANs, so we get traffic from team0 to team1 the leaking begins.
> Stopping the OSPF default gateway announcement again also stops the
> leaking, but does not release already leaked memory.
>
> So this leads to me suspect that the leaking is related to RX on team0
> (where XL710 eth9 is normally the only active interface) or TX on team1
> (X710 eth0, eth1, eth4, eth5). The first bad commit is related to RX
> cleaning, which suggests RX on team0. Since we're only seeing the leak
> for our outbound traffic, I suspect either a difference between the
> X710 vs. XL710 NICs, or that the inbound traffic is for relatively few
> destination addresses (only our own systems) while the outbound traffic
> is for many different addresses on the internet. But I'm just guessing
> here.
>
> I've tried kmemleak, but it only found a few kB of suspected memory
> leaks (several of which disappeared again after a while).
>
> Below I've included more details - git bisect logs, ethtool -i, dmesg,
> Kernel .config, and various memory related /proc files. Any help or
> suggestions would be much appreciated, and please let me know if more
> information is needed or there's something I should try.
>
> Regards,
> Anders K. Pedersen
>

Hi Anders,

I think I see the problem and should have a patch submitted shortly to
address it. From what I can tell it looks like the issue is that we
weren't properly recycling the pages associated with descriptors that
contained an Rx programming status. For now the workaround would be to
try disabling ATR via the "ethtool --set-priv-flags" command. I should
have a patch out in the next hour or so that you can try testing to
verify if it addresses the issue.

Thanks.

- Alex

^ permalink raw reply

* Re: etsec2 attached to sgmii phy
From: Andrew Lunn @ 2017-10-04 15:34 UTC (permalink / raw)
  To: Jörg Willmann; +Cc: netdev
In-Reply-To: <alpine.DEB.2.20.1710041613320.9785@brian.int1.clnt.de>

On Wed, Oct 04, 2017 at 04:19:23PM +0200, Jörg Willmann wrote:
> On Wed, 4 Oct 2017, Andrew Lunn wrote:
> 
> >On Wed, Oct 04, 2017 at 07:56:53AM +0200, Jörg Willmann wrote:
> >>Hi,
> >>
> >>we use a QorIQ P1011 connected via SGMII to a switch (Marvell 88E6352).
> >>Currently we still use a really old linux kernel (2.6.33) successfully.
> >>
> >>For configuration of the MDIO Bus attached to the corresponding eTSEC/TBI
> >>Phy we use the following settings in the device tree:
> >>
> >>                        mdio@25000 {
> >>                                   #address-cells = <0x1>;
> >>                                   #size-cells = <0x0>;
> >>                                   compatible = "fsl,etsec2-tbi";
> >>                                   reg = <0x25000 0x1000 0xb1030 0x4>;
> >
> >Hi Joerg
> >
> >Is 0xb1030 0x4 fixed by the silicon? Can it be expressed as an offset from
> >0x25000?
> >
> >It seems like the idea behind the patch is to hard code some
> >things. If you can hard code the offset into get_etsec_tbipa(), i
> >think that would be an O.K. solution to your problem.
> >
> >     Andrew
> >
> Yes, the adress 0xb1030 is fixed but it's something totally different than
> the address range of 0x25000. 0xb0000, 0xb1000 and 0xb2000 are base
> addresses of the eTSEC MAC (TPIPA is a register within the MAC) and 0x24000,
> 0x25000 and 0x26000 are the base registers of the corresponding MDIO
> controllers. So I wouldn't add a dependency between these two things.
> >From my point of view, the implementation in the old kernel where
> get_gfar_tbipa() got the device tree node pointer as argument was not soo
> bad ;-)

I took a quick look at the current device tree files. They all seem to
have the 0xb1030 0x4. So reading it inside of get_etsec_tbipa() is
O.K.

Looks like you need to modify all the get_tbipa() functions to take a
device_node *, and this code looks like it needs to change:

                        /*
                         * Add consistency check to make sure TBI is contained
                         * within the mapped range (not because we would get a
                         * segfault, rather to catch bugs in computing TBI
                         * address). Print error message but continue anyway.
                         */
                        if ((void *)tbipa > priv->map + resource_size(&res) - 4)
                                dev_err(&pdev->dev, "invalid register map (should be at least 0x%04zx to contain TBI address)\n",
                                        ((void *)tbipa - priv->map) + 4);

                        iowrite32be(be32_to_cpup(prop), tbipa);

	Andrew

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox