Netdev List
* Re: [PATCH for-next 1/4] devlink: refactor validation of finding required arguments
From: David Ahern @ 2019-02-11  2:46 UTC (permalink / raw)
  To: Aya Levin, netdev, David S. Miller, Jiri Pirko
  Cc: Moshe Shemesh, Eran Ben Elisha, Tal Alon, Ariel Almog
In-Reply-To: <1549823329-10377-2-git-send-email-ayal@mellanox.com>

On 2/10/19 11:28 AM, Aya Levin wrote:
> @@ -950,6 +951,51 @@ static int param_cmode_get(const char *cmodestr,
>  	return 0;
>  }
>  
> +struct dl_args_metadata {
> +	uint32_t o_flag;
> +	char err_msg[DL_ARGS_REQUIRED_MAX_ERR_LEN];
> +};
> +
> +static const struct dl_args_metadata dl_args_required[] = {
> +	{DL_OPT_PORT_TYPE,	      "Port type not set.\n"},
> +	{DL_OPT_PORT_COUNT,	      "Port split count option expected.\n"},
> +	{DL_OPT_SB_POOL,	      "Pool index option expected.\n"},
> +	{DL_OPT_SB_SIZE,	      "Pool size option expected.\n"},
> +	{DL_OPT_SB_TYPE,	      "Pool type option expected.\n"},
> +	{DL_OPT_SB_THTYPE,	      "Pool threshold type option expected.\n"},
> +	{DL_OPT_SB_TH,		      "Threshold option expected.\n"},
> +	{DL_OPT_SB_TC,		      "TC index option expected.\n"},
> +	{DL_OPT_ESWITCH_MODE,	      "E-Switch mode option expected.\n"},
> +	{DL_OPT_ESWITCH_INLINE_MODE,  "E-Switch inline-mode option expected.\n"},
> +	{DL_OPT_DPIPE_TABLE_NAME,     "Dpipe table name expected\n"},
> +	{DL_OPT_DPIPE_TABLE_COUNTERS, "Dpipe table counter state expected\n"},
> +	{DL_OPT_ESWITCH_ENCAP_MODE,   "E-Switch encapsulation option expected.\n"},
> +	{DL_OPT_PARAM_NAME,	      "Parameter name expected.\n"},
> +	{DL_OPT_PARAM_VALUE,	      "Value to set expected.\n"},
> +	{DL_OPT_PARAM_CMODE,	      "Configuration mode expected.\n"},
> +	{DL_OPT_REGION_SNAPSHOT_ID,   "Region snapshot id expected.\n"},
> +	{DL_OPT_REGION_ADDRESS,	      "Region address value expected.\n"},
> +	{DL_OPT_REGION_LENGTH,	      "Region length value expected.\n"},
> +};
> +
> +static int validate_finding_required_dl_args(uint32_t o_required,
> +					     uint32_t o_found)
> +{
> +	uint32_t dl_args_required_size;
> +	uint32_t o_flag;
> +	int i;
> +
> +	dl_args_required_size = ARRAY_SIZE(dl_args_required);
> +	for (i = 0; i < dl_args_required_size; i++) {
> +		o_flag = dl_args_required[i].o_flag;
> +		if ((o_required & o_flag) && !(o_found & o_flag)) {
> +			pr_err("%s", dl_args_required[i].err_msg);
> +			return -EINVAL;
> +		}
> +	}
> +	return 0;
> +}
> +

much better. Thank you for refactoring this.
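
For readers skimming the thread, the table-driven pattern in the quoted hunk can be tried outside of iproute2 with a sketch like the one below. The OPT_* bits, the table contents and the function name are hypothetical stand-ins, not the real DL_OPT_* definitions; only the loop shape follows the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option bits, standing in for the DL_OPT_* flags. */
#define OPT_PORT_TYPE  (1u << 0)
#define OPT_PORT_COUNT (1u << 1)
#define OPT_SB_POOL    (1u << 2)

struct required_arg {
	uint32_t flag;
	const char *err_msg;
};

/* One row per option that can be required, in reporting order. */
static const struct required_arg required_args[] = {
	{ OPT_PORT_TYPE,  "Port type not set." },
	{ OPT_PORT_COUNT, "Port split count option expected." },
	{ OPT_SB_POOL,    "Pool index option expected." },
};

/* Return the message for the first required-but-missing flag, or NULL. */
const char *first_missing_arg(uint32_t required, uint32_t found)
{
	size_t i;

	for (i = 0; i < sizeof(required_args) / sizeof(required_args[0]); i++) {
		uint32_t flag = required_args[i].flag;

		if ((required & flag) && !(found & flag))
			return required_args[i].err_msg;
	}
	return NULL;
}
```

Adding a new required option then only needs a new table row rather than another if block.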


^ permalink raw reply

* Re: [PATCH net-next 3/4] nfp: devlink: rename vendor to manufacture
From: Jakub Kicinski @ 2019-02-11  2:53 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: davem, netdev, oss-drivers
In-Reply-To: <20190209083644.GG2353@nanopsycho>

On Sat, 9 Feb 2019 09:36:44 +0100, Jiri Pirko wrote:
> >+	{ "board.manufacture",				"assembly.vendor", },  
> 
> I wonder, why this is not among generic?

No real reason, I'll move it in v2.

^ permalink raw reply

* [PATCH v2] net: fix IPv6 prefix route residue
From: Zhiqiang Liu @ 2019-02-11  2:57 UTC (permalink / raw)
  To: davem, kuznet, yoshfuji, 0xeffeff, edumazet
  Cc: netdev, mingfangsen, zhangwenhao8, wangxiaogang3, zhoukang7,
	dsahern, thaller, maowenan
In-Reply-To: <98d563a5-34f0-a504-d62f-d20ed7c770da@huawei.com>

From: Zhiqiang Liu <liuzhiqiang26@huawei.com>

Follow these steps:
 # ip addr add 2001:123::1/32 dev eth0
 # ip addr add 2001:123:456::2/64 dev eth0
 # ip addr del 2001:123::1/32 dev eth0
 # ip addr del 2001:123:456::2/64 dev eth0
and then the prefix route of 2001:123::1/32 will still exist.

This is because ipv6_prefix_equal(), as called from
check_cleanup_prefix_route(), does not check whether the two IPv6
addresses have the same prefix length. If the prefix of one address
starts with the shorter prefix of another address, ipv6_prefix_equal()
returns true even though the prefix lengths differ.

Add a check that the two addresses have the same prefix length
before deciding that their prefixes are equal.

Fixes: 5b84efecb7d9 ("ipv6 addrconf: don't cleanup prefix route for IFA_F_NOPREFIXROUTE")
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Reported-by: Wenhao Zhang <zhangwenhao8@huawei.com>
---
V1->V2:
- fix the indentation of the condition
- add Fixes tag

 net/ipv6/addrconf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 84c3588..72ffd3d 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1165,7 +1165,8 @@ enum cleanup_prefix_rt_t {
 	list_for_each_entry(ifa, &idev->addr_list, if_list) {
 		if (ifa == ifp)
 			continue;
-		if (!ipv6_prefix_equal(&ifa->addr, &ifp->addr,
+		if (ifa->prefix_len != ifp->prefix_len ||
+		    !ipv6_prefix_equal(&ifa->addr, &ifp->addr,
 				       ifp->prefix_len))
 			continue;
 		if (ifa->flags & (IFA_F_PERMANENT | IFA_F_NOPREFIXROUTE))
-- 
1.8.3.1
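
The failure mode described in the commit message can be reproduced in userspace with a rough sketch. prefix_bits_equal() below is a hypothetical stand-in for the kernel's ipv6_prefix_equal(), written from the behaviour the message describes rather than copied from the kernel; same_prefix() mirrors the extra length comparison the patch adds.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Compare the first prefix_len bits of two 16-byte IPv6 addresses.
 * Hypothetical userspace stand-in for ipv6_prefix_equal(). */
bool prefix_bits_equal(const uint8_t a[16], const uint8_t b[16],
		       unsigned int prefix_len)
{
	unsigned int full_bytes = prefix_len / 8;
	unsigned int rem_bits = prefix_len % 8;

	if (memcmp(a, b, full_bytes) != 0)
		return false;
	if (rem_bits) {
		uint8_t mask = (uint8_t)(0xff << (8 - rem_bits));

		return ((a[full_bytes] ^ b[full_bytes]) & mask) == 0;
	}
	return true;
}

/* Stricter test in the spirit of the patch: lengths must match as well. */
bool same_prefix(const uint8_t a[16], unsigned int a_len,
		 const uint8_t b[16], unsigned int b_len)
{
	return a_len == b_len && prefix_bits_equal(a, b, a_len);
}
```

With 2001:123::1/32 and 2001:123:456::2/64, comparing only the first 32 bits reports a match, while the length-aware check does not.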




^ permalink raw reply related

* Re: [PATCH v3] arm64: dts: lx2160aqds: Add mdio mux nodes
From: Shawn Guo @ 2019-02-11  3:00 UTC (permalink / raw)
  To: Pankaj Bansal
  Cc: Leo Li, Andrew Lunn, Florian Fainelli, netdev@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
In-Reply-To: <20190206150520.9683-1-pankaj.bansal@nxp.com>

On Wed, Feb 06, 2019 at 09:40:33AM +0000, Pankaj Bansal wrote:
> The two external MDIO buses used to communicate with phy devices that are
> external to SOC are muxed in LX2160AQDS board.
> 
> These buses can be routed to any one of the eight IO slots on LX2160AQDS
> board depending on value in fpga register 0x54.
> 
> Additionally the external MDIO1 is used to communicate to the onboard
> RGMII phy devices.
> 
> The mdio1 is controlled by bits 4-7 of fpga register and mdio2 is
> controlled by bits 0-3 of fpga register.
> 
> Signed-off-by: Pankaj Bansal <pankaj.bansal@nxp.com>
> ---
> 
> Notes:
>     V3:
>     - Add status = disabled in soc file and status = okay in board file
>       for external MDIO nodes
>     - Add interrupts property in external mdio nodes in soc file
>     V2:
>     - removed unnecessary TODO statements
>     - removed device_type from mdio nodes
>     - change the case of hex number to lowercase
>     - removed board specific comments from soc file
> 
>  .../boot/dts/freescale/fsl-lx2160a-qds.dts   | 123 +++++++++++++++++
>  .../boot/dts/freescale/fsl-lx2160a.dtsi      |  22 +++
>  2 files changed, 145 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts b/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
> index 99a22abbe725..079264b391a2 100644
> --- a/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
> +++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
> @@ -35,6 +35,14 @@
>  	status = "okay";
>  };
>  
> +&emdio1 {
> +	status = "okay";
> +};
> +
> +&emdio2 {
> +	status = "okay";
> +};
> +
>  &esdhc0 {
>  	status = "okay";
>  };
> @@ -46,6 +54,121 @@
>  &i2c0 {
>  	status = "okay";
>  
> +	fpga@66 {
> +		compatible = "fsl,lx2160aqds-fpga", "fsl,fpga-qixis-i2c";
> +		reg = <0x66>;
> +		#address-cells = <1>;
> +		#size-cells = <0>;
> +
> +		mdio-mux-1@54 {
> +			mdio-parent-bus = <&emdio1>;
> +			reg = <0x54>;		 /* BRDCFG4 */
> +			mux-mask = <0xf8>;      /* EMI1_MDIO */
> +			#address-cells=<1>;
> +			#size-cells = <0>;
> +
> +			mdio@0 {
> +				reg = <0x00>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};

Please have a newline between nodes.  It doesn't deserve a respin
though.  I can fix them up when applying if Leo is fine with this
version.

Shawn

> +			mdio@40 {
> +				reg = <0x40>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@c0 {
> +				reg = <0xc0>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@c8 {
> +				reg = <0xc8>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@d0 {
> +				reg = <0xd0>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@d8 {
> +				reg = <0xd8>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@e0 {
> +				reg = <0xe0>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@e8 {
> +				reg = <0xe8>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@f0 {
> +				reg = <0xf0>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@f8 {
> +				reg = <0xf8>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +		};
> +
> +		mdio-mux-2@54 {
> +			mdio-parent-bus = <&emdio2>;
> +			reg = <0x54>;		 /* BRDCFG4 */
> +			mux-mask = <0x07>;      /* EMI2_MDIO */
> +			#address-cells=<1>;
> +			#size-cells = <0>;
> +
> +			mdio@0 {
> +				reg = <0x00>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@1 {
> +				reg = <0x01>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@2 {
> +				reg = <0x02>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@3 {
> +				reg = <0x03>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@4 {
> +				reg = <0x04>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@5 {
> +				reg = <0x05>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@6 {
> +				reg = <0x06>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +			mdio@7 {
> +				reg = <0x07>;
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +			};
> +		};
> +	};
> +
>  	i2c-mux@77 {
>  		compatible = "nxp,pca9547";
>  		reg = <0x77>;
> diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
> index a79f5c1ea56d..7def5252ac1a 100644
> --- a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
> +++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
> @@ -762,5 +762,27 @@
>  				     <GIC_SPI 209 IRQ_TYPE_LEVEL_HIGH>;
>  			dma-coherent;
>  		};
> +
> +		/* WRIOP0: 0x8b8_0000, E-MDIO1: 0x1_6000 */
> +		emdio1: mdio@8b96000 {
> +			compatible = "fsl,fman-memac-mdio";
> +			reg = <0x0 0x8b96000 0x0 0x1000>;
> +			interrupts = <GIC_SPI 90 IRQ_TYPE_LEVEL_HIGH>;
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +			little-endian;	/* force the driver in LE mode */
> +			status = "disabled";
> +		};
> +
> +		/* WRIOP0: 0x8b8_0000, E-MDIO2: 0x1_7000 */
> +		emdio2: mdio@8b97000 {
> +			compatible = "fsl,fman-memac-mdio";
> +			reg = <0x0 0x8b97000 0x0 0x1000>;
> +			interrupts = <GIC_SPI 91 IRQ_TYPE_LEVEL_HIGH>;
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +			little-endian;	/* force the driver in LE mode */
> +			status = "disabled";
> +		};
>  	};
>  };
> -- 
> 2.17.1
> 
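
As a rough illustration of how the two mux-mask values in the quoted patch (0xf8 and 0x07) can share the single register at 0x54, here is a sketch of a masked read-modify-write. It assumes the mux driver only touches the bits selected by its mask, which is an assumption on my side since no driver code appears in this thread; mux_select() is a hypothetical helper name.

```c
#include <stdint.h>

/* Replace only the bits selected by mask; all other register bits
 * keep their previous value. Hypothetical helper for illustration. */
uint8_t mux_select(uint8_t reg_val, uint8_t mask, uint8_t sel)
{
	return (uint8_t)((reg_val & ~mask) | (sel & mask));
}
```

Selecting child 0x40 on the first mux and then child 0x02 on the second leaves both selections in place, because the masks do not overlap.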

^ permalink raw reply

* Re: [PATCH iproute2-next] use print_{,h}hu instead of print_uint when format specifier is %{,h}hu
From: David Ahern @ 2019-02-11  3:01 UTC (permalink / raw)
  To: Davide Caratti, Stephen Hemminger; +Cc: Andrea Claudi, netdev
In-Reply-To: <318e07e5b1d43e31f9664ff0b3628d4f2994aea9.1549536471.git.dcaratti@redhat.com>

On 2/7/19 3:51 AM, Davide Caratti wrote:
> in this way, a useless cast to unsigned int is avoided in bpf_print_ops()
> and print_tunnel().
> 
> Tested with:
>  # ./tdc.py -c bpf
> 
> Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
> Cc: Andrea Claudi <aclaudi@redhat.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>  ip/ipl2tp.c | 4 ++--
>  lib/bpf.c   | 6 +++---
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 

applied to iproute2-next. Thanks



^ permalink raw reply

* Re: [RFC] apparently bogus logics in unix_find_other() since 2002
From: Al Viro @ 2019-02-11  3:19 UTC (permalink / raw)
  To: netdev; +Cc: Solar Designer, David Miller
In-Reply-To: <20190210042414.GH2217@ZenIV.linux.org.uk>

On Sun, Feb 10, 2019 at 04:24:15AM +0000, Al Viro wrote:
> 
> Looks like that should be impossible; what am I missing here?  Incidentally,
> how can the quoted fragment in unix_stream_connect() be reached with NULL
> otheru->addr?  After all, otheru is unix_sock of a listener; how could
> we possibly have found it if it had NULL ->addr?
> 
> Confused...

BTW, speaking of interesting corner cases in AF_UNIX: am I right assuming that
identical abstract names with different protocols are considered entirely
independent?  Where is that thing (== abstract namespace) documented, anyway?

^ permalink raw reply

* Re: [PATCH net-next v5 09/12] socket: Add SO_TIMESTAMPING_NEW
From: Deepa Dinamani @ 2019-02-11  3:21 UTC (permalink / raw)
  To: Ran Rozenstein
  Cc: davem@davemloft.net, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, arnd@arndb.de, y2038@lists.linaro.org,
	chris@zankel.net, fenghua.yu@intel.com, rth@twiddle.net,
	tglx@linutronix.de, ubraun@linux.ibm.com,
	linux-alpha@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-ia64@vger.kernel.org, linux-mips@linux-mips.org,
	linux-s390@vger.kernel.org, linux-xtensa@linux-xtensa.org,
	sparclinux@vger.kernel.org
In-Reply-To: <AM4PR0501MB2769C4BE6CF1C5B051068D0BC56B0@AM4PR0501MB2769.eurprd05.prod.outlook.com>

On Feb 10, 2019, at 7:43 AM, Ran Rozenstein <ranro@mellanox.com> wrote:

>> Subject: [PATCH net-next v5 09/12] socket: Add SO_TIMESTAMPING_NEW
>>
>> Add SO_TIMESTAMPING_NEW variant of socket timestamp options.
>> This is the y2038 safe versions of the SO_TIMESTAMPING_OLD for all
>> architectures.
>>
>> Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
>> Acked-by: Willem de Bruijn <willemb@google.com>
>
>
> Hi,
>
> I have an app that includes:
>    #include <linux/errqueue.h>
>
> It now fails with this error:
>    In file included from timestamping.c:6:0:
>    /usr/include/linux/errqueue.h:46:27: error: array type has incomplete element type 'struct __kernel_timespec'
>      struct __kernel_timespec ts[3];
>                                       ^~
> I tried to do the trivial fix, to include time.h:
> In include/uapi/linux/errqueue.h
>    #include <linux/time.h>
>    #include <linux/types.h>
>
> But that just adds more noise:
>                In file included from /usr/include/linux/errqueue.h:5:0,
>                                 from timestamping.c:6:
>                /usr/include/linux/time.h:10:8: error: redefinition of 'struct timespec'
>                 struct timespec {
>                        ^~~~~~~~
>                In file included from /usr/include/sys/select.h:39:0,
>                                 from /usr/include/sys/types.h:197,
>                                 from /usr/include/stdlib.h:279,
>                                 from timestamping.c:2:
>                /usr/include/bits/types/struct_timespec.h:8:8: note: originally defined here
>                 struct timespec
>                        ^~~~~~~~
>                In file included from /usr/include/linux/errqueue.h:5:0,
>                                 from timestamping.c:6:
>                /usr/include/linux/time.h:16:8: error: redefinition of 'struct timeval'
>                 struct timeval {
>                        ^~~~~~~
>                In file included from /usr/include/sys/select.h:37:0,
>                                 from /usr/include/sys/types.h:197,
>                                 from /usr/include/stdlib.h:279,
>                                 from timestamping.c:2:
>                /usr/include/bits/types/struct_timeval.h:8:8: note: originally defined here
>                 struct timeval
>                        ^~~~~~~
>
>
> Can you please advise how to solve it?
>
> Thanks,
> Ran

The errqueue.h already had the same issue reported previously:
https://lore.kernel.org/netdev/CAF=yD-L2ntuH54J_SwN9WcpBMgkV_v0e-Q2Pu2mrQ3+1RozGFQ@mail.gmail.com/

Earlier, when I tested this with kernel selftests such as
tools/testing/selftests/networking/timestamping/rxtimestamp (the test
was broken to begin with because of a missing include of unistd.h), I
was using make.cross to build.
This does not put the headers in the right place
(obj-$ARCH/usr/include instead of usr/include). Hence, I did not
realize that this breaks the inclusion of errqueue.h due to the
missing __kernel_timespec definition.
I forgot that nobody seems to be using linux/time.h.

But if I add guards (#ifdef __KERNEL__) around struct timespec,
struct timeval, etc. in linux/time.h, then we can include it from
the userspace errqueue.h for __kernel_timespec:

--- a/include/uapi/linux/errqueue.h
+++ b/include/uapi/linux/errqueue.h
@@ -2,7 +2,7 @@
 #ifndef _UAPI_LINUX_ERRQUEUE_H
 #define _UAPI_LINUX_ERRQUEUE_H

-#include <linux/types.h>
+#include <linux/time.h>

 struct sock_extended_err {
        __u32   ee_errno;
diff --git a/include/uapi/linux/time.h b/include/uapi/linux/time.h
index a6aca9aaab80..40913d9a5bc8 100644
--- a/include/uapi/linux/time.h
+++ b/include/uapi/linux/time.h
@@ -5,6 +5,8 @@
 #include <linux/types.h>


+#ifdef __KERNEL__
+
 #ifndef _STRUCT_TIMESPEC
 #define _STRUCT_TIMESPEC
 struct timespec {
@@ -42,6 +44,8 @@ struct itimerval {
        struct timeval it_value;        /* current value */
 };

+#endif /* __KERNEL__ */
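
To show the effect of such a guard from the userspace side, here is a minimal sketch, assuming a normal libc build where __KERNEL__ is never defined. struct example_kernel_timespec is a hypothetical stand-in for __kernel_timespec, not the real UAPI definition.

```c
#include <time.h>	/* userspace: libc supplies struct timespec */

#ifdef __KERNEL__	/* not defined in a userspace build, so this is skipped */
struct timespec {
	long tv_sec;
	long tv_nsec;
};
#endif

/* Hypothetical stand-in for struct __kernel_timespec: 64-bit fields. */
struct example_kernel_timespec {
	long long tv_sec;
	long long tv_nsec;
};

/* Widen a libc timespec into the fixed-width layout. */
struct example_kernel_timespec to_example_kernel_timespec(struct timespec ts)
{
	struct example_kernel_timespec out = { ts.tv_sec, ts.tv_nsec };

	return out;
}
```

Because the guarded definition is compiled out, the libc struct timespec is the only one visible and the redefinition errors reported above do not occur in this sketch.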

Arnd,

I forgot how we plan to include the definition for __kernel_timespec
for libc or userspace. Does this seem right to you?
Also, these changes to errqueue.h probably need to be reverted, as
they break userspace.

Thanks,
-Deepa

^ permalink raw reply related

* [PATCH net-next v2 0/5] devlink: minor tweaks to reported device info
From: Jakub Kicinski @ 2019-02-11  3:35 UTC (permalink / raw)
  To: davem, jiri; +Cc: netdev, oss-drivers, Jakub Kicinski

Hi!

This series contains two minor touch ups for devlink code. First
|| is corrected to && in the ethtool compat code. Next patch
decreases the stack allocation size.

On the nfp side after further discussions with the manufacturing
team we decided to realign the serial number contents slightly and
rename one of the other fields from "vendor" to "mfr", short for
"manufacture".

v2: - add patch 3 - move board maker as a generic attribute.

Jakub Kicinski (5):
  devlink: fix condition for compat device info
  devlink: don't allocate attrs on the stack
  devlink: add a generic board.manufacture version name
  nfp: devlink: use the generic manufacture identifier instead of vendor
  nfp: devlink: include vendor/product info in serial number

 .../networking/devlink-info-versions.rst      |  5 ++++
 .../net/ethernet/netronome/nfp/nfp_devlink.c  | 23 +++++++++++++++----
 include/net/devlink.h                         |  2 ++
 net/core/devlink.c                            | 16 +++++++++----
 4 files changed, 37 insertions(+), 9 deletions(-)

-- 
2.19.2


^ permalink raw reply

* [PATCH net-next v2 1/5] devlink: fix condition for compat device info
From: Jakub Kicinski @ 2019-02-11  3:35 UTC (permalink / raw)
  To: davem, jiri; +Cc: netdev, oss-drivers, Jakub Kicinski
In-Reply-To: <20190211033531.12928-1-jakub.kicinski@netronome.com>

We need the port to be both ethernet and have the right netdev,
not one or the other.

Fixes: ddb6e99e2db1 ("ethtool: add compat for devlink info")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 net/core/devlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/devlink.c b/net/core/devlink.c
index e6a015b8ac9b..cf0f511bc56c 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -6385,7 +6385,7 @@ void devlink_compat_running_version(struct net_device *dev,
 	list_for_each_entry(devlink, &devlink_list, list) {
 		mutex_lock(&devlink->lock);
 		list_for_each_entry(devlink_port, &devlink->port_list, list) {
-			if (devlink_port->type == DEVLINK_PORT_TYPE_ETH ||
+			if (devlink_port->type == DEVLINK_PORT_TYPE_ETH &&
 			    devlink_port->type_dev == dev) {
 				__devlink_compat_running_version(devlink,
 								 buf, len);
-- 
2.19.2
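
The difference between the two conditions can be seen with a small standalone sketch. The enum, struct and function names are hypothetical simplifications of the devlink port fields used in the hunk above.

```c
#include <stdbool.h>

enum port_type { PORT_TYPE_NOTSET, PORT_TYPE_ETH, PORT_TYPE_IB };

struct port {
	enum port_type type;
	const void *type_dev;	/* netdev backing this port, if any */
};

/* Old form: any Ethernet port matches, regardless of which netdev
 * it belongs to. */
bool port_matches_or(const struct port *p, const void *dev)
{
	return p->type == PORT_TYPE_ETH || p->type_dev == dev;
}

/* Fixed form: the port must be Ethernet and belong to dev. */
bool port_matches_and(const struct port *p, const void *dev)
{
	return p->type == PORT_TYPE_ETH && p->type_dev == dev;
}
```

With the || form, the first Ethernet port on the list is picked even when it belongs to a different netdev.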


^ permalink raw reply related

* [PATCH net-next v2 2/5] devlink: don't allocate attrs on the stack
From: Jakub Kicinski @ 2019-02-11  3:35 UTC (permalink / raw)
  To: davem, jiri; +Cc: netdev, oss-drivers, Jakub Kicinski
In-Reply-To: <20190211033531.12928-1-jakub.kicinski@netronome.com>

The number of devlink attributes has grown past 128, causing the
following warning:

../net/core/devlink.c: In function ‘devlink_nl_cmd_region_read_dumpit’:
../net/core/devlink.c:3740:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=]
 }
  ^

Since the number of attributes is only going to grow, allocate
the array dynamically.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
 net/core/devlink.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/net/core/devlink.c b/net/core/devlink.c
index cf0f511bc56c..46c468a1f3dc 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -3629,26 +3629,30 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 					     struct netlink_callback *cb)
 {
 	u64 ret_offset, start_offset, end_offset = 0;
-	struct nlattr *attrs[DEVLINK_ATTR_MAX + 1];
 	const struct genl_ops *ops = cb->data;
 	struct devlink_region *region;
 	struct nlattr *chunks_attr;
 	const char *region_name;
 	struct devlink *devlink;
+	struct nlattr **attrs;
 	bool dump = true;
 	void *hdr;
 	int err;
 
 	start_offset = *((u64 *)&cb->args[0]);
 
+	attrs = kmalloc_array(DEVLINK_ATTR_MAX + 1, sizeof(*attrs), GFP_KERNEL);
+	if (!attrs)
+		return -ENOMEM;
+
 	err = nlmsg_parse(cb->nlh, GENL_HDRLEN + devlink_nl_family.hdrsize,
 			  attrs, DEVLINK_ATTR_MAX, ops->policy, cb->extack);
 	if (err)
-		goto out;
+		goto out_free;
 
 	devlink = devlink_get_from_attrs(sock_net(cb->skb->sk), attrs);
 	if (IS_ERR(devlink))
-		goto out;
+		goto out_free;
 
 	mutex_lock(&devlink_mutex);
 	mutex_lock(&devlink->lock);
@@ -3710,6 +3714,7 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 	genlmsg_end(skb, hdr);
 	mutex_unlock(&devlink->lock);
 	mutex_unlock(&devlink_mutex);
+	kfree(attrs);
 
 	return skb->len;
 
@@ -3718,7 +3723,8 @@ static int devlink_nl_cmd_region_read_dumpit(struct sk_buff *skb,
 out_unlock:
 	mutex_unlock(&devlink->lock);
 	mutex_unlock(&devlink_mutex);
-out:
+out_free:
+	kfree(attrs);
 	return 0;
 }
 
-- 
2.19.2
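
A userspace analogue of the same change, as a sketch only: the pointer table moves from the stack to the heap and every exit path after the allocation goes through one label that frees it. EXAMPLE_ATTR_MAX and example_dump() are hypothetical, and unlike the kernel function above this sketch propagates the error code instead of returning 0.

```c
#include <errno.h>
#include <stdlib.h>

#define EXAMPLE_ATTR_MAX 200	/* hypothetical; stands in for DEVLINK_ATTR_MAX */

/* parse_result simulates the outcome of the attribute parsing step. */
int example_dump(int parse_result)
{
	const void **attrs;
	int ret = 0;

	/* Heap allocation keeps the stack frame small however far the
	 * attribute count grows. */
	attrs = calloc(EXAMPLE_ATTR_MAX + 1, sizeof(*attrs));
	if (!attrs)
		return -ENOMEM;

	if (parse_result) {
		ret = parse_result;
		goto out_free;
	}

	/* ... the real code would use attrs[] here ... */

out_free:
	free(attrs);
	return ret;
}
```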


^ permalink raw reply related

* [PATCH net-next v2 5/5] nfp: devlink: include vendor/product info in serial number
From: Jakub Kicinski @ 2019-02-11  3:35 UTC (permalink / raw)
  To: davem, jiri; +Cc: netdev, oss-drivers, Jakub Kicinski
In-Reply-To: <20190211033531.12928-1-jakub.kicinski@netronome.com>

The manufacturing team requests we include vendor and product
in the serial number field, as the serial number itself is not
unique across manufacturing facilities and products.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 .../net/ethernet/netronome/nfp/nfp_devlink.c  | 21 ++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index bf4e124dbdd2..080a301f379e 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -258,18 +258,33 @@ nfp_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
 		     struct netlink_ext_ack *extack)
 {
 	struct nfp_pf *pf = devlink_priv(devlink);
+	const char *sn, *vendor, *part;
 	struct nfp_nsp *nsp;
 	char *buf = NULL;
-	const char *sn;
 	int err;
 
 	err = devlink_info_driver_name_put(req, "nfp");
 	if (err)
 		return err;
 
+	vendor = nfp_hwinfo_lookup(pf->hwinfo, "assembly.vendor");
+	part = nfp_hwinfo_lookup(pf->hwinfo, "assembly.partno");
 	sn = nfp_hwinfo_lookup(pf->hwinfo, "assembly.serial");
-	if (sn) {
-		err = devlink_info_serial_number_put(req, sn);
+	if (vendor && part && sn) {
+		char *buf;
+
+		buf = kmalloc(strlen(vendor) + strlen(part) + strlen(sn) + 1,
+			      GFP_KERNEL);
+		if (!buf)
+			return -ENOMEM;
+
+		buf[0] = '\0';
+		strcat(buf, vendor);
+		strcat(buf, part);
+		strcat(buf, sn);
+
+		err = devlink_info_serial_number_put(req, buf);
+		kfree(buf);
 		if (err)
 			return err;
 	}
-- 
2.19.2
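
For comparison, the same concatenation can be sketched in userspace with a single snprintf() call instead of the strcat() sequence in the patch. This is a swapped-in technique for illustration, not what the driver does, and example_build_serial() is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed string holding vendor, part and sn back to back,
 * or NULL if any input is missing or allocation fails. Caller frees. */
char *example_build_serial(const char *vendor, const char *part,
			   const char *sn)
{
	size_t len;
	char *buf;

	if (!vendor || !part || !sn)
		return NULL;

	len = strlen(vendor) + strlen(part) + strlen(sn) + 1;
	buf = malloc(len);
	if (!buf)
		return NULL;

	/* No separator is inserted, matching the patch. */
	snprintf(buf, len, "%s%s%s", vendor, part, sn);
	return buf;
}
```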


^ permalink raw reply related

* [PATCH net-next v2 4/5] nfp: devlink: use the generic manufacture identifier instead of vendor
From: Jakub Kicinski @ 2019-02-11  3:35 UTC (permalink / raw)
  To: davem, jiri; +Cc: netdev, oss-drivers, Jakub Kicinski
In-Reply-To: <20190211033531.12928-1-jakub.kicinski@netronome.com>

Vendor may sound ambiguous, let's rename the fab string to
"board.manufacture" (which was just added as a generic identifier).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_devlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
index dddbb0575be9..bf4e124dbdd2 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_devlink.c
@@ -178,7 +178,7 @@ static const struct nfp_devlink_versions_simple {
 } nfp_devlink_versions_hwinfo[] = {
 	{ DEVLINK_INFO_VERSION_GENERIC_BOARD_ID,	"assembly.partno", },
 	{ DEVLINK_INFO_VERSION_GENERIC_BOARD_REV,	"assembly.revision", },
-	{ "board.vendor", /* fab */			"assembly.vendor", },
+	{ DEVLINK_INFO_VERSION_GENERIC_BOARD_MANUFACTURE, "assembly.vendor", },
 	{ "board.model", /* code name */		"assembly.model", },
 };
 
-- 
2.19.2


^ permalink raw reply related

* [PATCH net-next v2 3/5] devlink: add a generic board.manufacture version name
From: Jakub Kicinski @ 2019-02-11  3:35 UTC (permalink / raw)
  To: davem, jiri; +Cc: netdev, oss-drivers, Jakub Kicinski
In-Reply-To: <20190211033531.12928-1-jakub.kicinski@netronome.com>

At Jiri's suggestion add a generic "board.manufacture"
version identifier.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 Documentation/networking/devlink-info-versions.rst | 5 +++++
 include/net/devlink.h                              | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/Documentation/networking/devlink-info-versions.rst b/Documentation/networking/devlink-info-versions.rst
index 7d4ecf6b6f34..c79ad8593383 100644
--- a/Documentation/networking/devlink-info-versions.rst
+++ b/Documentation/networking/devlink-info-versions.rst
@@ -14,6 +14,11 @@ board.rev
 
 Board design revision.
 
+board.manufacture
+=================
+
+An identifier of the company or the facility which produced the part.
+
 fw.mgmt
 =======
 
diff --git a/include/net/devlink.h b/include/net/devlink.h
index 2b384a38911b..07660fe4c0e3 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -435,6 +435,8 @@ enum devlink_param_wol_types {
 #define DEVLINK_INFO_VERSION_GENERIC_BOARD_ID	"board.id"
 /* Revision of board design */
 #define DEVLINK_INFO_VERSION_GENERIC_BOARD_REV	"board.rev"
+/* Maker of the board */
+#define DEVLINK_INFO_VERSION_GENERIC_BOARD_MANUFACTURE	"board.manufacture"
 
 /* Control processor FW version */
 #define DEVLINK_INFO_VERSION_GENERIC_FW_MGMT	"fw.mgmt"
-- 
2.19.2


^ permalink raw reply related

* linux-next: manual merge of the phy-next tree with the net-next tree
From: Stephen Rothwell @ 2019-02-11  3:42 UTC (permalink / raw)
  To: Kishon Vijay Abraham I, David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List, Russell King,
	Miquel Raynal, Igal Liberman, Evan Wang, Grzegorz Jaszczyk

[-- Attachment #1: Type: text/plain, Size: 2998 bytes --]

Hi all,

Today's linux-next merge of the phy-next tree got conflicts in:

  drivers/phy/marvell/Kconfig
  drivers/phy/marvell/Makefile

between commit:

  14dc100b4411 ("phy: armada38x: add common phy support")

from the net-next tree and commit:

  9695375a3f4a ("phy: add A3700 COMPHY support")
  cc8b7a0ae866 ("phy: add A3700 UTMI PHY driver")

from the phy-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/phy/marvell/Kconfig
index 224ea4e6a46d,b8e9dd38ad0d..000000000000
--- a/drivers/phy/marvell/Kconfig
+++ b/drivers/phy/marvell/Kconfig
@@@ -21,16 -21,27 +21,37 @@@ config PHY_BERLIN_US
  	help
  	  Enable this to support the USB PHY on Marvell Berlin SoCs.
  
+ config PHY_MVEBU_A3700_COMPHY
+ 	tristate "Marvell A3700 comphy driver"
+ 	depends on ARCH_MVEBU || COMPILE_TEST
+ 	depends on OF
+ 	depends on HAVE_ARM_SMCCC
+ 	default y
+ 	select GENERIC_PHY
+ 	help
+ 	  This driver allows to control the comphy, a hardware block providing
+ 	  shared serdes PHYs on Marvell Armada 3700. Its serdes lanes can be
+ 	  used by various controllers: Ethernet, SATA, USB3, PCIe.
+ 
+ config PHY_MVEBU_A3700_UTMI
+ 	tristate "Marvell A3700 UTMI driver"
+ 	depends on ARCH_MVEBU || COMPILE_TEST
+ 	depends on OF
+ 	default y
+ 	select GENERIC_PHY
+ 	help
+ 	  Enable this to support Marvell A3700 UTMI PHY driver.
+ 
 +config PHY_MVEBU_A38X_COMPHY
 +	tristate "Marvell Armada 38x comphy driver"
 +	depends on ARCH_MVEBU || COMPILE_TEST
 +	depends on OF
 +	select GENERIC_PHY
 +	help
 +	  This driver allows to control the comphy, an hardware block providing
 +	  shared serdes PHYs on Marvell Armada 38x. Its serdes lanes can be
 +	  used by various controllers (Ethernet, sata, usb, PCIe...).
 +
  config PHY_MVEBU_CP110_COMPHY
  	tristate "Marvell CP110 comphy driver"
  	depends on ARCH_MVEBU || COMPILE_TEST
diff --cc drivers/phy/marvell/Makefile
index 59b6c03ef756,82f291cf59ee..000000000000
--- a/drivers/phy/marvell/Makefile
+++ b/drivers/phy/marvell/Makefile
@@@ -2,7 -2,8 +2,9 @@@
  obj-$(CONFIG_ARMADA375_USBCLUSTER_PHY)	+= phy-armada375-usb2.o
  obj-$(CONFIG_PHY_BERLIN_SATA)		+= phy-berlin-sata.o
  obj-$(CONFIG_PHY_BERLIN_USB)		+= phy-berlin-usb.o
+ obj-$(CONFIG_PHY_MVEBU_A3700_COMPHY)	+= phy-mvebu-a3700-comphy.o
+ obj-$(CONFIG_PHY_MVEBU_A3700_UTMI)	+= phy-mvebu-a3700-utmi.o
 +obj-$(CONFIG_PHY_MVEBU_A38X_COMPHY)	+= phy-armada38x-comphy.o
  obj-$(CONFIG_PHY_MVEBU_CP110_COMPHY)	+= phy-mvebu-cp110-comphy.o
  obj-$(CONFIG_PHY_MVEBU_SATA)		+= phy-mvebu-sata.o
  obj-$(CONFIG_PHY_PXA_28NM_HSIC)		+= phy-pxa-28nm-hsic.o

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH bpf] bpf: Fix narrow load on a bpf_sock returned from sk_lookup()
From: Alexei Starovoitov @ 2019-02-11  3:54 UTC (permalink / raw)
  To: Martin Lau
  Cc: netdev@vger.kernel.org, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, Joe Stringer
In-Reply-To: <20190210071513.o56emdqcb23xtng3@kafai-mbp.dhcp.thefacebook.com>

On Sun, Feb 10, 2019 at 07:15:17AM +0000, Martin Lau wrote:
> > > Fixes: c64b7983288e ("bpf: Add PTR_TO_SOCKET verifier type")
> > > Cc: Joe Stringer <joe@wand.net.nz>
> > > Signed-off-by: Martin KaFai Lau <kafai@fb.com>
> > 
> > Applied to bpf tree.
> Thanks!
> 
> > 
> > Martin, if your is_fullsock work depends on it, I can apply the fix
> > to bpf-next as well. Just let me know.
> Yes, the is_fullsock work depends on it.
> I should have mentioned it in this commit log.

Ok. I've pushed it to bpf-next as well.
Last time we discussed this scenario at netconf and agreed that git should
do the right thing, since the commit is the same.
I think this is a case where it makes sense to give it a shot.
If we get issues during pulls/merges it will be a lesson to avoid
such things in the future, but if we don't try it we won't know.
So applied.


^ permalink raw reply

* [PATCH net-next v3] ipmr: ip6mr: Create new sockopt to clear mfc cache or vifs
From: Callum Sinclair @ 2019-02-11  3:54 UTC (permalink / raw)
  To: davem, kuznet, yoshfuji, nikolay, netdev, linux-kernel
  Cc: nicolas.dichtel, Callum Sinclair

Created a way to clear the multicast forwarding cache on a socket
without having to either remove the entries manually using the delete
entry socket option or destroy and recreate the multicast socket.

The MRT_FLUSH socket option accepts any combination of four flags, each
of which selects one class of state to clear:

MRT_FLUSH_MFC will clear all non-static mfc entries.
MRT_FLUSH_MFC_STATIC will clear all static mfc entries.
MRT_FLUSH_VIFS will clear all non-static interfaces.
MRT_FLUSH_VIFS_STATIC will clear all static interfaces.
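
For illustration only, a userspace caller might drive the new option
roughly as sketched below. The flag values and MRT_FLUSH = MRT_BASE+12
are copied from this patch (MRT_BASE is 200 in linux/mroute.h); the two
helper names are hypothetical and not part of the patch, and fd is
assumed to be a raw IGMP socket that has already issued MRT_INIT.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Values as proposed in this patch; guarded in case newer kernel
 * headers that already define them are pulled in by the caller.
 */
#ifndef MRT_FLUSH
#define MRT_FLUSH		(200 + 12)	/* MRT_BASE + 12 */
#endif
#ifndef MRT_FLUSH_MFC
#define MRT_FLUSH_MFC		1	/* non-static mfc entries */
#define MRT_FLUSH_MFC_STATIC	2	/* static mfc entries */
#define MRT_FLUSH_VIFS		4	/* non-static vifs */
#define MRT_FLUSH_VIFS_STATIC	8	/* static vifs */
#endif

/* Hypothetical helper: OR together the four independent choices. */
static int mrt_flush_flags(bool mfc, bool mfc_static,
			   bool vifs, bool vifs_static)
{
	int flags = 0;

	if (mfc)
		flags |= MRT_FLUSH_MFC;
	if (mfc_static)
		flags |= MRT_FLUSH_MFC_STATIC;
	if (vifs)
		flags |= MRT_FLUSH_VIFS;
	if (vifs_static)
		flags |= MRT_FLUSH_VIFS_STATIC;
	return flags;
}

/* Hypothetical wrapper: returns 0 on success, -1 with errno set.
 * The option value is a single int, matching the optlen check in
 * the patch.
 */
static int mrt_flush(int fd, int flags)
{
	assert(flags != 0);	/* a flush with no flags selects nothing */
	return setsockopt(fd, IPPROTO_IP, MRT_FLUSH, &flags, sizeof(flags));
}
```

With these helpers, mrt_flush(fd, mrt_flush_flags(true, false, true, false))
would request the same non-static-only flush that the kernel performs when
the multicast routing socket is closed.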

Callum Sinclair (1):
  ipmr: ip6mr: Create new sockopt to clear mfc cache or vifs

 include/uapi/linux/mroute.h  |  9 ++++-
 include/uapi/linux/mroute6.h |  9 ++++-
 net/ipv4/ipmr.c              | 73 ++++++++++++++++++++-------------
 net/ipv6/ip6mr.c             | 78 +++++++++++++++++++++++-------------
 4 files changed, 112 insertions(+), 57 deletions(-)

-- 
2.20.1



* [PATCH net-next v3] ipmr: ip6mr: Create new sockopt to clear mfc cache or vifs
From: Callum Sinclair @ 2019-02-11  3:54 UTC (permalink / raw)
  To: davem, kuznet, yoshfuji, nikolay, netdev, linux-kernel
  Cc: nicolas.dichtel, Callum Sinclair
In-Reply-To: <20190211035412.29218-1-callum.sinclair@alliedtelesis.co.nz>

v1 -> v2:
Implemented additional flags for static entries
v2 -> v3:
Cleaned up flag logic so any combination of routes can be cleared.
Fixed style errors
Fixed incorrect flag values

Currently the only way to clear the forwarding cache is to delete the
entries one by one using the MRT_DEL_MFC socket option or to destroy and
recreate the socket.

Create a new socket option which, with the use of optional flags, can
clear any combination of multicast entries (static or not static) and
multicast vifs (static or not static).

Calling the new socket option MRT_FLUSH with the flags MRT_FLUSH_MFC and
MRT_FLUSH_VIFS will clear all entries and vifs on the socket except for
static entries.

Signed-off-by: Callum Sinclair <callum.sinclair@alliedtelesis.co.nz>
---
 include/uapi/linux/mroute.h  |  9 ++++-
 include/uapi/linux/mroute6.h |  9 ++++-
 net/ipv4/ipmr.c              | 73 ++++++++++++++++++++-------------
 net/ipv6/ip6mr.c             | 78 +++++++++++++++++++++++-------------
 4 files changed, 112 insertions(+), 57 deletions(-)

diff --git a/include/uapi/linux/mroute.h b/include/uapi/linux/mroute.h
index 5d37a9ccce63..11c8c1fc1124 100644
--- a/include/uapi/linux/mroute.h
+++ b/include/uapi/linux/mroute.h
@@ -28,12 +28,19 @@
 #define MRT_TABLE	(MRT_BASE+9)	/* Specify mroute table ID		*/
 #define MRT_ADD_MFC_PROXY	(MRT_BASE+10)	/* Add a (*,*|G) mfc entry	*/
 #define MRT_DEL_MFC_PROXY	(MRT_BASE+11)	/* Del a (*,*|G) mfc entry	*/
-#define MRT_MAX		(MRT_BASE+11)
+#define MRT_FLUSH	(MRT_BASE+12)	/* Flush all mfc entries and/or vifs	*/
+#define MRT_MAX		(MRT_BASE+12)
 
 #define SIOCGETVIFCNT	SIOCPROTOPRIVATE	/* IP protocol privates */
 #define SIOCGETSGCNT	(SIOCPROTOPRIVATE+1)
 #define SIOCGETRPF	(SIOCPROTOPRIVATE+2)
 
+/* MRT_FLUSH optional flags */
+#define MRT_FLUSH_MFC	1	/* Flush multicast entries */
+#define MRT_FLUSH_MFC_STATIC	2	/* Flush static multicast entries */
+#define MRT_FLUSH_VIFS	4	/* Flush multicast vifs */
+#define MRT_FLUSH_VIFS_STATIC	8	/* Flush static multicast vifs */
+
 #define MAXVIFS		32
 typedef unsigned long vifbitmap_t;	/* User mode code depends on this lot */
 typedef unsigned short vifi_t;
diff --git a/include/uapi/linux/mroute6.h b/include/uapi/linux/mroute6.h
index 9999cc006390..ac84ef11b29c 100644
--- a/include/uapi/linux/mroute6.h
+++ b/include/uapi/linux/mroute6.h
@@ -31,12 +31,19 @@
 #define MRT6_TABLE	(MRT6_BASE+9)	/* Specify mroute table ID		*/
 #define MRT6_ADD_MFC_PROXY	(MRT6_BASE+10)	/* Add a (*,*|G) mfc entry	*/
 #define MRT6_DEL_MFC_PROXY	(MRT6_BASE+11)	/* Del a (*,*|G) mfc entry	*/
-#define MRT6_MAX	(MRT6_BASE+11)
+#define MRT6_FLUSH	(MRT6_BASE+12)	/* Flush all mfc entries and/or vifs	*/
+#define MRT6_MAX	(MRT6_BASE+12)
 
 #define SIOCGETMIFCNT_IN6	SIOCPROTOPRIVATE	/* IP protocol privates */
 #define SIOCGETSGCNT_IN6	(SIOCPROTOPRIVATE+1)
 #define SIOCGETRPF	(SIOCPROTOPRIVATE+2)
 
+/* MRT6_FLUSH optional flags */
+#define MRT6_FLUSH_MFC	1	/* Flush multicast entries */
+#define MRT6_FLUSH_MFC_STATIC	2	/* Flush static multicast entries */
+#define MRT6_FLUSH_VIFS	4	/* Flush multicast vifs */
+#define MRT6_FLUSH_VIFS_STATIC	8	/* Flush static multicast vifs */
+
 #define MAXMIFS		32
 typedef unsigned long mifbitmap_t;	/* User mode code depends on this lot */
 typedef unsigned short mifi_t;
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index e536970557dd..2c95ef8cf224 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -110,7 +110,7 @@ static int ipmr_cache_report(struct mr_table *mrt,
 static void mroute_netlink_event(struct mr_table *mrt, struct mfc_cache *mfc,
 				 int cmd);
 static void igmpmsg_netlink_event(struct mr_table *mrt, struct sk_buff *pkt);
-static void mroute_clean_tables(struct mr_table *mrt, bool all);
+static void mroute_clean_tables(struct mr_table *mrt, int flags);
 static void ipmr_expire_process(struct timer_list *t);
 
 #ifdef CONFIG_IP_MROUTE_MULTIPLE_TABLES
@@ -415,7 +415,8 @@ static struct mr_table *ipmr_new_table(struct net *net, u32 id)
 static void ipmr_free_table(struct mr_table *mrt)
 {
 	del_timer_sync(&mrt->ipmr_expire_timer);
-	mroute_clean_tables(mrt, true);
+	mroute_clean_tables(mrt, MRT_FLUSH_VIFS | MRT_FLUSH_VIFS_STATIC |
+						MRT_FLUSH_MFC | MRT_FLUSH_MFC_STATIC);
 	rhltable_destroy(&mrt->mfc_hash);
 	kfree(mrt);
 }
@@ -1296,7 +1297,7 @@ static int ipmr_mfc_add(struct net *net, struct mr_table *mrt,
 }
 
 /* Close the multicast socket, and clear the vif tables etc */
-static void mroute_clean_tables(struct mr_table *mrt, bool all)
+static void mroute_clean_tables(struct mr_table *mrt, int flags)
 {
 	struct net *net = read_pnet(&mrt->net);
 	struct mr_mfc *c, *tmp;
@@ -1305,35 +1306,42 @@ static void mroute_clean_tables(struct mr_table *mrt, bool all)
 	int i;
 
 	/* Shut down all active vif entries */
-	for (i = 0; i < mrt->maxvif; i++) {
-		if (!all && (mrt->vif_table[i].flags & VIFF_STATIC))
-			continue;
-		vif_delete(mrt, i, 0, &list);
+	if (flags & (MRT_FLUSH_VIFS | MRT_FLUSH_VIFS_STATIC)) {
+		for (i = 0; i < mrt->maxvif; i++) {
+			if (((mrt->vif_table[i].flags & VIFF_STATIC) &&
+			     !(flags & MRT_FLUSH_VIFS_STATIC)) ||
+			    (!(mrt->vif_table[i].flags & VIFF_STATIC) && !(flags & MRT_FLUSH_VIFS)))
+				continue;
+			vif_delete(mrt, i, 0, &list);
+		}
+		unregister_netdevice_many(&list);
 	}
-	unregister_netdevice_many(&list);
 
 	/* Wipe the cache */
-	list_for_each_entry_safe(c, tmp, &mrt->mfc_cache_list, list) {
-		if (!all && (c->mfc_flags & MFC_STATIC))
-			continue;
-		rhltable_remove(&mrt->mfc_hash, &c->mnode, ipmr_rht_params);
-		list_del_rcu(&c->list);
-		cache = (struct mfc_cache *)c;
-		call_ipmr_mfc_entry_notifiers(net, FIB_EVENT_ENTRY_DEL, cache,
-					      mrt->id);
-		mroute_netlink_event(mrt, cache, RTM_DELROUTE);
-		mr_cache_put(c);
-	}
-
-	if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
-		spin_lock_bh(&mfc_unres_lock);
-		list_for_each_entry_safe(c, tmp, &mrt->mfc_unres_queue, list) {
-			list_del(&c->list);
+	if (flags & (MRT_FLUSH_MFC | MRT_FLUSH_MFC_STATIC)) {
+		list_for_each_entry_safe(c, tmp, &mrt->mfc_cache_list, list) {
+			if (((c->mfc_flags & MFC_STATIC) && !(flags & MRT_FLUSH_MFC_STATIC)) ||
+			    (!(c->mfc_flags & MFC_STATIC) && !(flags & MRT_FLUSH_MFC)))
+				continue;
+			rhltable_remove(&mrt->mfc_hash, &c->mnode, ipmr_rht_params);
+			list_del_rcu(&c->list);
 			cache = (struct mfc_cache *)c;
+			call_ipmr_mfc_entry_notifiers(net, FIB_EVENT_ENTRY_DEL, cache,
+						      mrt->id);
 			mroute_netlink_event(mrt, cache, RTM_DELROUTE);
-			ipmr_destroy_unres(mrt, cache);
+			mr_cache_put(c);
+		}
+
+		if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
+			spin_lock_bh(&mfc_unres_lock);
+			list_for_each_entry_safe(c, tmp, &mrt->mfc_unres_queue, list) {
+				list_del(&c->list);
+				cache = (struct mfc_cache *)c;
+				mroute_netlink_event(mrt, cache, RTM_DELROUTE);
+				ipmr_destroy_unres(mrt, cache);
+			}
+			spin_unlock_bh(&mfc_unres_lock);
 		}
-		spin_unlock_bh(&mfc_unres_lock);
 	}
 }
 
@@ -1354,7 +1362,7 @@ static void mrtsock_destruct(struct sock *sk)
 						    NETCONFA_IFINDEX_ALL,
 						    net->ipv4.devconf_all);
 			RCU_INIT_POINTER(mrt->mroute_sk, NULL);
-			mroute_clean_tables(mrt, false);
+			mroute_clean_tables(mrt, MRT_FLUSH_VIFS | MRT_FLUSH_MFC);
 		}
 	}
 	rtnl_unlock();
@@ -1479,6 +1487,17 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, char __user *optval,
 					   sk == rtnl_dereference(mrt->mroute_sk),
 					   parent);
 		break;
+	case MRT_FLUSH:
+		if (optlen != sizeof(val)) {
+			ret = -EINVAL;
+			break;
+		}
+		if (get_user(val, (int __user *)optval)) {
+			ret = -EFAULT;
+			break;
+		}
+		mroute_clean_tables(mrt, val);
+		break;
 	/* Control PIM assert. */
 	case MRT_ASSERT:
 		if (optlen != sizeof(val)) {
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index cc01aa3f2b5e..c6909b5e927c 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -97,7 +97,7 @@ static void mr6_netlink_event(struct mr_table *mrt, struct mfc6_cache *mfc,
 static void mrt6msg_netlink_event(struct mr_table *mrt, struct sk_buff *pkt);
 static int ip6mr_rtm_dumproute(struct sk_buff *skb,
 			       struct netlink_callback *cb);
-static void mroute_clean_tables(struct mr_table *mrt, bool all);
+static void mroute_clean_tables(struct mr_table *mrt, int flags);
 static void ipmr_expire_process(struct timer_list *t);
 
 #ifdef CONFIG_IPV6_MROUTE_MULTIPLE_TABLES
@@ -393,7 +393,8 @@ static struct mr_table *ip6mr_new_table(struct net *net, u32 id)
 static void ip6mr_free_table(struct mr_table *mrt)
 {
 	del_timer_sync(&mrt->ipmr_expire_timer);
-	mroute_clean_tables(mrt, true);
+	mroute_clean_tables(mrt, MRT6_FLUSH_VIFS | MRT6_FLUSH_VIFS_STATIC |
+						MRT6_FLUSH_MFC | MRT6_FLUSH_MFC_STATIC);
 	rhltable_destroy(&mrt->mfc_hash);
 	kfree(mrt);
 }
@@ -1496,42 +1497,49 @@ static int ip6mr_mfc_add(struct net *net, struct mr_table *mrt,
  *	Close the multicast socket, and clear the vif tables etc
  */
 
-static void mroute_clean_tables(struct mr_table *mrt, bool all)
+static void mroute_clean_tables(struct mr_table *mrt, int flags)
 {
 	struct mr_mfc *c, *tmp;
 	LIST_HEAD(list);
 	int i;
 
 	/* Shut down all active vif entries */
-	for (i = 0; i < mrt->maxvif; i++) {
-		if (!all && (mrt->vif_table[i].flags & VIFF_STATIC))
-			continue;
-		mif6_delete(mrt, i, 0, &list);
+	if (flags & (MRT6_FLUSH_VIFS | MRT6_FLUSH_VIFS_STATIC)) {
+		for (i = 0; i < mrt->maxvif; i++) {
+			if (((mrt->vif_table[i].flags & VIFF_STATIC) &&
+			     !(flags & MRT6_FLUSH_VIFS_STATIC)) ||
+			    (!(mrt->vif_table[i].flags & VIFF_STATIC) && !(flags & MRT6_FLUSH_VIFS)))
+				continue;
+			mif6_delete(mrt, i, 0, &list);
+		}
+		unregister_netdevice_many(&list);
 	}
-	unregister_netdevice_many(&list);
 
 	/* Wipe the cache */
-	list_for_each_entry_safe(c, tmp, &mrt->mfc_cache_list, list) {
-		if (!all && (c->mfc_flags & MFC_STATIC))
-			continue;
-		rhltable_remove(&mrt->mfc_hash, &c->mnode, ip6mr_rht_params);
-		list_del_rcu(&c->list);
-		call_ip6mr_mfc_entry_notifiers(read_pnet(&mrt->net),
-					       FIB_EVENT_ENTRY_DEL,
-					       (struct mfc6_cache *)c, mrt->id);
-		mr6_netlink_event(mrt, (struct mfc6_cache *)c, RTM_DELROUTE);
-		mr_cache_put(c);
-	}
+	if (flags & (MRT6_FLUSH_MFC | MRT6_FLUSH_MFC_STATIC)) {
+		list_for_each_entry_safe(c, tmp, &mrt->mfc_cache_list, list) {
+			if (((c->mfc_flags & MFC_STATIC) && !(flags & MRT6_FLUSH_MFC_STATIC)) ||
+			    (!(c->mfc_flags & MFC_STATIC) && !(flags & MRT6_FLUSH_MFC)))
+				continue;
+			rhltable_remove(&mrt->mfc_hash, &c->mnode, ip6mr_rht_params);
+			list_del_rcu(&c->list);
+			call_ip6mr_mfc_entry_notifiers(read_pnet(&mrt->net),
+						       FIB_EVENT_ENTRY_DEL,
+						       (struct mfc6_cache *)c, mrt->id);
+			mr6_netlink_event(mrt, (struct mfc6_cache *)c, RTM_DELROUTE);
+			mr_cache_put(c);
+		}
 
-	if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
-		spin_lock_bh(&mfc_unres_lock);
-		list_for_each_entry_safe(c, tmp, &mrt->mfc_unres_queue, list) {
-			list_del(&c->list);
-			mr6_netlink_event(mrt, (struct mfc6_cache *)c,
-					  RTM_DELROUTE);
-			ip6mr_destroy_unres(mrt, (struct mfc6_cache *)c);
+		if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
+			spin_lock_bh(&mfc_unres_lock);
+			list_for_each_entry_safe(c, tmp, &mrt->mfc_unres_queue, list) {
+				list_del(&c->list);
+				mr6_netlink_event(mrt, (struct mfc6_cache *)c,
+						  RTM_DELROUTE);
+				ip6mr_destroy_unres(mrt, (struct mfc6_cache *)c);
+			}
+			spin_unlock_bh(&mfc_unres_lock);
 		}
-		spin_unlock_bh(&mfc_unres_lock);
 	}
 }
 
@@ -1587,7 +1595,7 @@ int ip6mr_sk_done(struct sock *sk)
 						     NETCONFA_IFINDEX_ALL,
 						     net->ipv6.devconf_all);
 
-			mroute_clean_tables(mrt, false);
+			mroute_clean_tables(mrt, MRT6_FLUSH_VIFS | MRT6_FLUSH_MFC);
 			err = 0;
 			break;
 		}
@@ -1703,6 +1711,20 @@ int ip6_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, uns
 		rtnl_unlock();
 		return ret;
 
+	case MRT6_FLUSH:
+	{
+		int flags;
+
+		if (optlen != sizeof(flags))
+			return -EINVAL;
+		if (get_user(flags, (int __user *)optval))
+			return -EFAULT;
+		rtnl_lock();
+		mroute_clean_tables(mrt, flags);
+		rtnl_unlock();
+		return 0;
+	}
+
 	/*
 	 *	Control PIM assert (to activate pim will activate assert)
 	 */
-- 
2.20.1



* Re: [PATCH v2 bpf-next 0/7] Add __sk_buff->sk, bpf_tcp_sock, BPF_FUNC_tcp_sock and BPF_FUNC_sk_fullsock
From: Alexei Starovoitov @ 2019-02-11  3:55 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team,
	Lawrence Brakmo
In-Reply-To: <20190210072220.1530061-1-kafai@fb.com>

On Sat, Feb 09, 2019 at 11:22:20PM -0800, Martin KaFai Lau wrote:
> This series adds __sk_buff->sk, "struct bpf_tcp_sock",
> BPF_FUNC_sk_fullsock and BPF_FUNC_tcp_sock.  Together, they provide
> a common way to expose the members of "struct tcp_sock" and
> "struct bpf_sock" for the bpf_prog to access.
> 
> The patch series first adds a bpf_sock pointer to __sk_buff
> and a new helper BPF_FUNC_sk_fullsock.
> 
> It then adds BPF_FUNC_tcp_sock to get a bpf_tcp_sock
> pointer from a bpf_sock pointer.
> 
> The current use case is to allow a cg_skb_bpf_prog to provide
> per cgroup traffic policing/shaping.
> 
> Please see individual patch for details.
> 
> v2:
> - Patch 1 depends on
>   commit d623876646be ("bpf: Fix narrow load on a bpf_sock returned from sk_lookup()")
>   in the bpf branch.
> - Add sk_to_full_sk() to bpf_sk_fullsock() and bpf_tcp_sock()
>   such that there is a way to access the listener's sk and tcp_sk
>   when __sk_buff->sk is a request_sock.
>   The comments in the uapi bpf.h are updated accordingly.
> - bpf_ctx_range_till() is used in bpf_sock_common_is_valid_access()
>   in patch 1.  Saved a few lines.
> - Patch 2 is new in v2 and it adds "state", "dst_ip4", "dst_ip6" and
>   "dst_port" to the bpf_sock.  Narrow load is allowed on them.
>   The "state" (i.e. sk_state) has already been used in
>   INET_DIAG (e.g. ss -t) and getsockopt(TCP_INFO).
> - While at it in the new patch 2, also allow narrow load on some
>   existing fields of the bpf_sock, which are "family", "type", "protocol"
>   and "src_port".  Only allow loading from first byte for now.
>   i.e. does not allow narrow load starting from the 2nd byte.
> - Add some narrow load tests to the test_verifier's sock.c

Daniel,
I believe this new revision addresses your concerns exactly as we discussed.
So I pushed it to bpf-next.
please double check that it's what you expected.
We can always revert.
Thanks everyone!
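
As a rough illustration of the "first byte only" narrow-load rule quoted
above, the check can be modeled as below. This is a simplified sketch and
not the verifier's actual code; the function name and parameters are
hypothetical, and offsets are byte offsets into the context structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a load of `size` bytes at offset `off` is
 * accepted for a field at [field_off, field_off + field_size) only if
 * it starts at the field's first byte, has a power-of-two size, and
 * does not exceed the field size.
 */
static bool narrow_load_ok(size_t off, size_t size,
			   size_t field_off, size_t field_size)
{
	bool pow2;

	assert(field_size > 0);
	pow2 = size != 0 && (size & (size - 1)) == 0;
	return pow2 && size <= field_size && off == field_off;
}
```

Under this model, a 1- or 2-byte load at the start of a 4-byte field is
accepted, while a 1-byte load starting at the field's second byte is
rejected.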



* Re: [PATCH bpf] bpf: only adjust gso_size on bytestream protocols
From: Alexei Starovoitov @ 2019-02-11  4:00 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, ast, daniel, posk.devel, dja, Willem de Bruijn
In-Reply-To: <20190207195416.27082-1-willemdebruijn.kernel@gmail.com>

On Thu, Feb 07, 2019 at 02:54:16PM -0500, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@google.com>
> 
> bpf_skb_change_proto and bpf_skb_adjust_room change skb header length.
> For GSO packets they adjust gso_size to maintain the same MTU.
> 
> The gso size can only be safely adjusted on bytestream protocols.
> Commit d02f51cbcf12 ("bpf: fix bpf_skb_adjust_net/bpf_skb_proto_xlat
> to deal with gso sctp skbs") excluded SKB_GSO_SCTP.
> 
> Since then type SKB_GSO_UDP_L4 has been added, whose contents are one
> gso_size unit per datagram. Also exclude these.
> 
> Move from a blacklist to a whitelist check to future proof against
> additional such new GSO types, e.g., for fraglist based GRO.
> 
> Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT")
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Applied to bpf tree.
I agree that whitelist approach is the most appropriate.
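
To make the blacklist vs. whitelist distinction concrete, here is a small
userspace model. The bit values are illustrative only and are not the
kernel's SKB_GSO_* constants; GSO_FUTURE is a hypothetical type standing
in for whatever gets added later, and the function names are made up for
this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the real SKB_GSO_* flags live in the
 * kernel's skbuff.h and differ from these.
 */
enum {
	GSO_TCPV4	= 1 << 0,
	GSO_TCPV6	= 1 << 1,
	GSO_SCTP	= 1 << 2,
	GSO_UDP_L4	= 1 << 3,
	GSO_FUTURE	= 1 << 4,	/* hypothetical later addition */
};

#define GSO_BYTESTREAM	(GSO_TCPV4 | GSO_TCPV6)

/* The whitelist must never include a datagram-per-segment type. */
static_assert((GSO_BYTESTREAM & (GSO_SCTP | GSO_UDP_L4)) == 0,
	      "bytestream mask overlaps an excluded GSO type");

/* Blacklist: adjust gso_size unless the type is known to be unsafe. */
static bool may_adjust_blacklist(unsigned int gso_type)
{
	return !(gso_type & GSO_SCTP);
}

/* Whitelist: adjust gso_size only for known bytestream types. */
static bool may_adjust_whitelist(unsigned int gso_type)
{
	return (gso_type & GSO_BYTESTREAM) != 0;
}
```

The blacklist lets GSO_UDP_L4 and any future type through until someone
remembers to exclude it; the whitelist rejects them by default.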



* Re: [PATCH bpf-next 2/3] selftests: bpf: extend sub-register mode compilation to all bpf object files
From: Alexei Starovoitov @ 2019-02-11  4:04 UTC (permalink / raw)
  To: Jiong Wang; +Cc: daniel, netdev, oss-drivers
In-Reply-To: <1549647681-13818-3-git-send-email-jiong.wang@netronome.com>

On Fri, Feb 08, 2019 at 05:41:20PM +0000, Jiong Wang wrote:
> At the moment, we only do extra sub-register mode compilation on bpf object
> files used by "test_progs". These object files are really loaded and
> executed.
> 
> This patch further extends sub-register mode compilation to all bpf object
> files, even those without corresponding runtime tests. This could help
> test LLVM sub-register code-gen, because the kernel bpf selftests have many
> more C testcases of reasonable size and complexity than the LLVM
> testsuite, which only contains unit tests.
> 
> There was some file duplication inside BPF_OBJ_FILES_DUAL_COMPILE, which
> is removed now.
> 
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
> ---
>  tools/testing/selftests/bpf/Makefile | 21 ++++++++-------------
>  1 file changed, 8 insertions(+), 13 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 383d2ff..70b2570 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -35,20 +35,15 @@ BPF_OBJ_FILES = \
>  	sendmsg4_prog.o sendmsg6_prog.o test_lirc_mode2_kern.o \
>  	get_cgroup_id_kern.o socket_cookie_prog.o test_select_reuseport_kern.o \
>  	test_skb_cgroup_id_kern.o bpf_flow.o netcnt_prog.o test_xdp_vlan.o \
> -	xdp_dummy.o test_map_in_map.o test_spin_lock.o test_map_lock.o
> -
> -# Objects are built with default compilation flags and with sub-register
> -# code-gen enabled.
> -BPF_OBJ_FILES_DUAL_COMPILE = \
> -	test_pkt_access.o test_pkt_access.o test_xdp.o test_adjust_tail.o \
> -	test_l4lb.o test_l4lb_noinline.o test_xdp_noinline.o test_tcp_estats.o \
> +	xdp_dummy.o test_map_in_map.o test_spin_lock.o test_map_lock.o \
> +	test_pkt_access.o test_xdp.o test_adjust_tail.o test_l4lb.o \
> +	test_l4lb_noinline.o test_xdp_noinline.o test_tcp_estats.o \
>  	test_obj_id.o test_pkt_md_access.o test_tracepoint.o \
> -	test_stacktrace_map.o test_stacktrace_map.o test_stacktrace_build_id.o \
> -	test_stacktrace_build_id.o test_get_stack_rawtp.o \
> -	test_get_stack_rawtp.o test_tracepoint.o test_sk_lookup_kern.o \
> -	test_queue_map.o test_stack_map.o
> +	test_stacktrace_map.o test_stacktrace_build_id.o \
> +	test_get_stack_rawtp.o test_sk_lookup_kern.o test_queue_map.o \
> +	test_stack_map.o
>  
> -TEST_GEN_FILES = $(BPF_OBJ_FILES) $(BPF_OBJ_FILES_DUAL_COMPILE)
> +TEST_GEN_FILES = $(BPF_OBJ_FILES)
>  
>  # Also test sub-register code-gen if LLVM + kernel both has eBPF v3 processor
>  # support which is the first version to contain both ALU32 and JMP32
> @@ -58,7 +53,7 @@ SUBREG_CODEGEN := $(shell echo "int cal(int a) { return a > 0; }" | \
>  			$(LLC) -mattr=+alu32 -mcpu=probe 2>&1 | \
>  			grep 'if w')

Build and test servers can be different.
Would it make sense to use -mcpu=v3 instead of -mcpu=probe?

Also, while testing, test_progs_32 fails like this:
libbpf: failed to open ./bpf_flow.o: No such file or directory
libbpf: failed to open ./test_spin_lock.o: No such file or directory
test_spin_lock:bpf_prog_load errno 2

Do you see the same?



* Re: [PATCH bpf-next 3/3] selftests: bpf: centre kernel bpf objects under new subdir "kern_progs"
From: Alexei Starovoitov @ 2019-02-11  4:06 UTC (permalink / raw)
  To: Jiong Wang; +Cc: daniel, netdev, oss-drivers
In-Reply-To: <1549647681-13818-4-git-send-email-jiong.wang@netronome.com>

On Fri, Feb 08, 2019 at 05:41:21PM +0000, Jiong Wang wrote:
> At the moment, all kernel bpf objects are listed under BPF_OBJ_FILES.
> Listing them manually sometimes causes patch conflicts when people are
> adding new testcases simultaneously.
> 
> It is better to centre all the related source files under a subdir
> "kern_progs", then auto-generate the object file list.
> 
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
> ---
>  tools/testing/selftests/bpf/Makefile               | 26 +++++-----------------
>  .../selftests/bpf/{ => kern_progs}/bpf_flow.c      |  0
>  .../selftests/bpf/{ => kern_progs}/connect4_prog.c |  0
>  .../selftests/bpf/{ => kern_progs}/connect6_prog.c |  0
>  .../selftests/bpf/{ => kern_progs}/dev_cgroup.c    |  0

Thanks a lot for the patch.
A tiny bit of bikeshedding...
'kern_progs' feels a bit too long and awkward to type.
Maybe just 'progs'?



* Re: [PATCH bpf] xsk: add missing smp_rmb() in xsk_mmap
From: Alexei Starovoitov @ 2019-02-11  4:08 UTC (permalink / raw)
  To: Magnus Karlsson; +Cc: bjorn.topel, ast, daniel, netdev
In-Reply-To: <1549631630-29208-1-git-send-email-magnus.karlsson@intel.com>

On Fri, Feb 08, 2019 at 02:13:50PM +0100, Magnus Karlsson wrote:
> All the setup code in AF_XDP is protected by a mutex with the
> exception of the mmap code that cannot use it. To make sure that a
> process banging on the mmap call at the same time as another process
> is setting up the socket, smp_wmb() calls were added in the umem
> registration code and the queue creation code, so that the published
> structures that xsk_mmap needs would be consistent. However, the
> corresponding smp_rmb() calls were not added to the xsk_mmap
> code. This patch adds these calls.
> 
> Fixes: 37b076933a8e3 ("xsk: add missing write- and data-dependency barrier")
> Fixes: c0c77d8fb787c ("xsk: add user memory registration support sockopt")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>

Applied, Thanks
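
The write-barrier / read-barrier pairing described in the commit message
can be illustrated with a userspace analogue using C11 atomics. This is
not the kernel code: the patch uses smp_wmb() and smp_rmb(), whereas this
sketch substitutes release and acquire ordering, and every struct and
function name here is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct ring {
	unsigned int nentries;
};

struct sock_model {
	struct ring storage;
	_Atomic(struct ring *) rx;	/* NULL until setup has published it */
};

static void sock_model_init(struct sock_model *s)
{
	s->storage.nentries = 0;
	atomic_init(&s->rx, NULL);
}

/* Setup side: fill in the structure first, then publish the pointer.
 * The release store plays the role of smp_wmb() before the pointer
 * assignment.
 */
static void publish_ring(struct sock_model *s, unsigned int nentries)
{
	s->storage.nentries = nentries;
	atomic_store_explicit(&s->rx, &s->storage, memory_order_release);
}

/* Reader side (the mmap path in the commit message): the acquire load
 * plays the role of reading the pointer followed by smp_rmb(), so the
 * contents read afterwards are the ones written before publication.
 * Returns 0 and fills *out, or -1 if nothing is published yet.
 */
static int ring_entries(struct sock_model *s, unsigned int *out)
{
	struct ring *r;

	assert(out != NULL);
	r = atomic_load_explicit(&s->rx, memory_order_acquire);
	if (!r)
		return -1;
	*out = r->nentries;
	return 0;
}
```

Without the acquire (or smp_rmb()) on the reader side, a weakly ordered
CPU could observe the published pointer yet still read stale contents
through it, which is the gap the patch closes.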



* Re: [PATCH net-next 0/2] Revert wake_on_lan devlink parameter
From: Vasundhara Volam @ 2019-02-11  4:39 UTC (permalink / raw)
  To: David Miller; +Cc: michael.chan@broadcom.com, Jiri Pirko, Netdev
In-Reply-To: <20190208.230707.337997172178002602.davem@davemloft.net>

On Sat, Feb 9, 2019 at 12:37 PM David Miller <davem@davemloft.net> wrote:
>
> From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
> Date: Fri,  8 Feb 2019 14:43:08 +0530
>
> > As per discussion with Jakub Kicinski and Michal Kubecek,
> > this will be better addressed by the soon-to-come ethtool netlink
> > API with an additional indication that a given WoL configuration request
> > is supposed to be persisted.
> >
> > Retain bnxt_en code for devlink port param table registration.
> > There will be follow up patches to add some devlink port params
> > for bnxt_en driver.
>
> Please fix the kbuild robot reported build failure and repost.
David, the second patch in this patchset already takes care of all
these failures.
Could you please apply both patches together?


* Re: r8169 Driver - Poor Network Performance Since Kernel 4.19
From: David Chang @ 2019-02-11  6:23 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: Realtek linux nic maintainers, netdev, Martti Laaksonen
In-Reply-To: <856b3a75-5daf-6ce8-7fa3-0405e3cefe97@gmail.com>

Hi Heiner,

Sorry for the late reply!

On Feb 05, 2019 at 19:50:30 +0100, Heiner Kallweit wrote:
> Hi David,
> 
> meanwhile there's the following bug report matching what you reported.
> It's even the same chip version (RTL8168h).
> https://bugzilla.redhat.com/show_bug.cgi?id=1671958
> 
> The symptom there is also a significant number of rx_missed packets.
> Could you try what I mentioned there last:
> Try building a kernel with the call to rtl_hw_aspm_clkreq_enable(tp, true) at the
> end of rtl_hw_start_8168h_1() being disabled.

Will do.

Thanks,
David Chang

> Heiner
> 
> 
> On 31.01.2019 03:32, David Chang wrote:
> > Hi,
> > 
> > We had a similar case here.
> > - Realtek r8169 receive performance regression in kernel 4.19
> >   https://bugzilla.suse.com/show_bug.cgi?id=1119649
> > 
> > kernel: r8169 0000:01:00.0 eth0: RTL8168h/8111h, XID 54100880
> > The major symptom is a high rx_missed count.
> > 
> > 
> > On Jan 30, 2019 at 20:15:45 +0100, Heiner Kallweit wrote:
> >> Hi Peter,
> >>
> >> recently I had somebody for whom pcie_aspm=off, for whatever reason, didn't
> >> do the trick. Can you also check with pcie_aspm.policy=performance?
> > 
> > We will give it a try later.
> > 
> >> And please check with "ethtool -S <if>" whether the chip statistics
> >> show a significant number of errors.
> >>
> >> If this doesn't help you may have to bisect to find the offending commit.
> > 
> > We tried falling back the driver to a few previous commits, as follows,
> > but with no luck.
> > 
> > 9675931e6b65 r8169: re-enable MSI-X on RTL8168g (v4.19)
> > 098b01ad9837 r8169: don't include asm headers directly (v4.19-rc1)
> > a2965f12fde6 r8169: remove rtl8169_set_speed_xmii (v4.19-rc1)
> > 6fcf9b1d4d6c r8169: fix runtime suspend (v4.19-rc1)
> > e397286b8e89 r8169: remove TBI 1000BaseX support (v4.19-rc1)
> > 
> > Thanks,
> > David Chang
> > 
> >>
> >> Heiner
> >>
> >>
> >> On 30.01.2019 10:59, Peter Ceiley wrote:
> >>> Hi Heiner,
> >>>
> >>> I tried disabling the ASPM using the pcie_aspm=off kernel parameter
> >>> and this made no difference.
> >>>
> >>> I tried compiling the 4.18.16 r8169.c with the 4.19.18 source and
> >>> subsequently loaded the module in the running 4.19.18 kernel. I can
> >>> confirm that this immediately resolved the issue and access to the NFS
> >>> shares operated as expected.
> >>>
> >>> I presume this means it is an issue with the r8169 driver included in
> >>> 4.19 onwards?
> >>>
> >>> To answer your last questions:
> >>>
> >>> Base Board Information
> >>>     Manufacturer: Alienware
> >>>     Product Name: 0PGRP5
> >>>     Version: A02
> >>>
> >>> ... and yes, the RTL8168 is the onboard network chip.
> >>>
> >>> Regards,
> >>>
> >>> Peter.
> >>>
> >>> On Tue, 29 Jan 2019 at 17:44, Heiner Kallweit <hkallweit1@gmail.com> wrote:
> >>>>
> >>>> Hi Peter,
> >>>>
> >>>> I think the vendor driver doesn't enable ASPM by default.
> >>>> So it's worth a try to disable ASPM in the BIOS or via sysfs.
> >>>> A few older systems seem to have issues with ASPM. What kind of
> >>>> system / mainboard are you using? Is the RTL8168 the onboard
> >>>> network chip?
> >>>>
> >>>> Rgds, Heiner
> >>>>
> >>>>
> >>>> On 29.01.2019 07:20, Peter Ceiley wrote:
> >>>>> Hi Heiner,
> >>>>>
> >>>>> Thanks, I'll do some more testing. It might not be the driver - I
> >>>>> assumed it was due to the fact that using the r8168 driver 'resolves'
> >>>>> the issue. I'll see if I can test the r8169.c on top of 4.19 - this is
> >>>>> a good idea.
> >>>>>
> >>>>> Cheers,
> >>>>>
> >>>>> Peter.
> >>>>>
> >>>>> On Tue, 29 Jan 2019 at 17:16, Heiner Kallweit <hkallweit1@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi Peter,
> >>>>>>
> >>>>>> at a first glance it doesn't look like a typical driver issue.
> >>>>>> What you could do:
> >>>>>>
> >>>>>> - Test the r8169.c from 4.18 on top of 4.19.
> >>>>>>
> >>>>>> - Check whether disabling ASPM (/sys/module/pcie_aspm) has an effect.
> >>>>>>
> >>>>>> - Bisect between 4.18 and 4.19 to find the offending commit.
> >>>>>>
> >>>>>> Any specific reason why you think the root cause is in the driver and not
> >>>>>> elsewhere in the network subsystem?
> >>>>>>
> >>>>>> Heiner
> >>>>>>
> >>>>>>
> >>>>>> On 28.01.2019 23:10, Peter Ceiley wrote:
> >>>>>>> Hi Heiner,
> >>>>>>>
> >>>>>>> Thanks for getting back to me.
> >>>>>>>
> >>>>>>> No, I don't use jumbo packets.
> >>>>>>>
> >>>>>>> Bandwidth is *generally* good, and iperf results to my NAS provide
> >>>>>>> over 900 Mbits/s in both circumstances. The issue seems to appear when
> >>>>>>> establishing a connection and is most notable, for example, on my
> >>>>>>> mounted NFS shares where it takes seconds (up to 10's of seconds on
> >>>>>>> larger directories) to list the contents of each directory. Once a
> >>>>>>> transfer begins on a file, I appear to get good bandwidth.
> >>>>>>>
> >>>>>>> I'm unsure of the best scientific data to provide you in order to
> >>>>>>> troubleshoot this issue. Running the following
> >>>>>>>
> >>>>>>>     netstat -s |grep retransmitted
> >>>>>>>
> >>>>>>> shows a steady increase in retransmitted segments each time I list the
> >>>>>>> contents of a remote directory, for example, running 'ls' on a
> >>>>>>> directory containing 345 media files did the following using kernel
> >>>>>>> 4.19.18:
> >>>>>>>
> >>>>>>> increased retransmitted segments by 21 and the 'time' command showed
> >>>>>>> the following:
> >>>>>>>     real    0m19.867s
> >>>>>>>     user    0m0.012s
> >>>>>>>     sys    0m0.036s
> >>>>>>>
> >>>>>>> The same command shows no retransmitted segments running kernel
> >>>>>>> 4.18.16 and 'time' showed:
> >>>>>>>     real    0m0.300s
> >>>>>>>     user    0m0.004s
> >>>>>>>     sys    0m0.007s
> >>>>>>>
> >>>>>>> ifconfig does not show any RX/TX errors nor dropped packets in either case.
> >>>>>>>
> >>>>>>> dmesg XID:
> >>>>>>> [    2.979984] r8169 0000:03:00.0 eth0: RTL8168g/8111g,
> >>>>>>> f8:b1:56:fe:67:e0, XID 4c000800, IRQ 32
> >>>>>>>
> >>>>>>> # lspci -vv
> >>>>>>> 03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
> >>>>>>> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
> >>>>>>>     Subsystem: Dell RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
> >>>>>>>     Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> >>>>>>> ParErr- Stepping- SERR- FastB2B- DisINTx+
> >>>>>>>     Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> >>>>>>> <TAbort- <MAbort- >SERR- <PERR- INTx-
> >>>>>>>     Latency: 0, Cache Line Size: 64 bytes
> >>>>>>>     Interrupt: pin A routed to IRQ 19
> >>>>>>>     Region 0: I/O ports at d000 [size=256]
> >>>>>>>     Region 2: Memory at f7b00000 (64-bit, non-prefetchable) [size=4K]
> >>>>>>>     Region 4: Memory at f2100000 (64-bit, prefetchable) [size=16K]
> >>>>>>>     Capabilities: [40] Power Management version 3
> >>>>>>>         Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA
> >>>>>>> PME(D0+,D1+,D2+,D3hot+,D3cold+)
> >>>>>>>         Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> >>>>>>>     Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
> >>>>>>>         Address: 0000000000000000  Data: 0000
> >>>>>>>     Capabilities: [70] Express (v2) Endpoint, MSI 01
> >>>>>>>         DevCap:    MaxPayload 128 bytes, PhantFunc 0, Latency L0s
> >>>>>>> <512ns, L1 <64us
> >>>>>>>             ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
> >>>>>>> SlotPowerLimit 10.000W
> >>>>>>>         DevCtl:    CorrErr- NonFatalErr- FatalErr- UnsupReq-
> >>>>>>>             RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> >>>>>>>             MaxPayload 128 bytes, MaxReadReq 4096 bytes
> >>>>>>>         DevSta:    CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
> >>>>>>>         LnkCap:    Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit
> >>>>>>> Latency L0s unlimited, L1 <64us
> >>>>>>>             ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
> >>>>>>>         LnkCtl:    ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
> >>>>>>>             ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
> >>>>>>>         LnkSta:    Speed 2.5GT/s (ok), Width x1 (ok)
> >>>>>>>             TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> >>>>>>>         DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via message/WAKE#
> >>>>>>>              AtomicOpsCap: 32bit- 64bit- 128bitCAS-
> >>>>>>>         DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
> >>>>>>>              AtomicOpsCtl: ReqEn-
> >>>>>>>         LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
> >>>>>>>              Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> >>>>>>>              Compliance De-emphasis: -6dB
> >>>>>>>         LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
> >>>>>>>              EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> >>>>>>>     Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
> >>>>>>>         Vector table: BAR=4 offset=00000000
> >>>>>>>         PBA: BAR=4 offset=00000800
> >>>>>>>     Capabilities: [d0] Vital Product Data
> >>>>>>> pcilib: sysfs_read_vpd: read failed: Input/output error
> >>>>>>>         Not readable
> >>>>>>>     Capabilities: [100 v1] Advanced Error Reporting
> >>>>>>>         UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >>>>>>>         UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> >>>>>>>         UESvrt:    DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> >>>>>>>         CESta:    RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ AdvNonFatalErr-
> >>>>>>>         CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
> >>>>>>>         AERCap:    First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
> >>>>>>>             MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
> >>>>>>>         HeaderLog: 00000000 00000000 00000000 00000000
> >>>>>>>     Capabilities: [140 v1] Virtual Channel
> >>>>>>>         Caps:    LPEVC=0 RefClk=100ns PATEntryBits=1
> >>>>>>>         Arb:    Fixed- WRR32- WRR64- WRR128-
> >>>>>>>         Ctrl:    ArbSelect=Fixed
> >>>>>>>         Status:    InProgress-
> >>>>>>>         VC0:    Caps:    PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> >>>>>>>             Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> >>>>>>>             Ctrl:    Enable+ ID=0 ArbSelect=Fixed TC/VC=01
> >>>>>>>             Status:    NegoPending- InProgress-
> >>>>>>>     Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
> >>>>>>>     Capabilities: [170 v1] Latency Tolerance Reporting
> >>>>>>>         Max snoop latency: 71680ns
> >>>>>>>         Max no snoop latency: 71680ns
> >>>>>>>     Kernel driver in use: r8169
> >>>>>>>     Kernel modules: r8169
> >>>>>>>
> >>>>>>> Please let me know if you have any other ideas in terms of testing.
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>>
> >>>>>>> Peter.
> >>>>>>>
> >>>>>>> On Tue, 29 Jan 2019 at 05:28, Heiner Kallweit <hkallweit1@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> On 28.01.2019 12:13, Peter Ceiley wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I have been experiencing very poor network performance since Kernel
> >>>>>>>>> 4.19 and I'm confident it's related to the r8169 driver.
> >>>>>>>>>
> >>>>>>>>> I have no issue with kernel versions 4.18 and prior. I am experiencing
> >>>>>>>>> this issue in kernels 4.19 and 4.20 (currently running/testing with
> >>>>>>>>> 4.20.4 & 4.19.18).
> >>>>>>>>>
> >>>>>>>>> If someone could guide me in the right direction, I'm happy to help
> >>>>>>>>> troubleshoot this issue. Note that I have been keeping an eye on one
> >>>>>>>>> issue related to loading of the PHY driver, however, my symptoms
> >>>>>>>>> differ in that I still have a network connection. I have attempted to
> >>>>>>>>> reload the driver on a running system, but this does not improve the
> >>>>>>>>> situation.
> >>>>>>>>>
> >>>>>>>>> Using the proprietary r8168 driver returns my device to proper working order.
> >>>>>>>>>
> >>>>>>>>> lshw shows:
> >>>>>>>>>        description: Ethernet interface
> >>>>>>>>>        product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
> >>>>>>>>>        vendor: Realtek Semiconductor Co., Ltd.
> >>>>>>>>>        physical id: 0
> >>>>>>>>>        bus info: pci@0000:03:00.0
> >>>>>>>>>        logical name: enp3s0
> >>>>>>>>>        version: 0c
> >>>>>>>>>        serial:
> >>>>>>>>>        size: 1Gbit/s
> >>>>>>>>>        capacity: 1Gbit/s
> >>>>>>>>>        width: 64 bits
> >>>>>>>>>        clock: 33MHz
> >>>>>>>>>        capabilities: pm msi pciexpress msix vpd bus_master cap_list ethernet physical tp aui bnc mii fibre 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
> >>>>>>>>>        configuration: autonegotiation=on broadcast=yes driver=r8169 duplex=full firmware=rtl8168g-2_0.0.1 02/06/13 ip=192.168.1.25 latency=0 link=yes multicast=yes port=MII speed=1Gbit/s
> >>>>>>>>>        resources: irq:19 ioport:d000(size=256) memory:f7b00000-f7b00fff memory:f2100000-f2103fff
> >>>>>>>>>
> >>>>>>>>> Kind Regards,
> >>>>>>>>>
> >>>>>>>>> Peter.
> >>>>>>>>>
> >>>>>>>> Hi Peter,
> >>>>>>>>
> >>>>>>>> the description "poor network performance" is quite vague, therefore:
> >>>>>>>>
> >>>>>>>> - Can you provide any measurements?
> >>>>>>>> - iperf results before and after
> >>>>>>>> - statistics about dropped packets (rx and/or tx)
> >>>>>>>> - Do you use jumbo packets?
> >>>>>>>>
> >>>>>>>> Also helpful would be the "lspci -vv" output for the network card and
> >>>>>>>> the dmesg output line with the chip XID.
> >>>>>>>>
> >>>>>>>> Heiner
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> > 
> 
> 
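
The drop statistics and the jumbo-packet question raised in the quoted
request can be answered from the per-interface counters that Linux exposes
under /sys/class/net/<iface>/. What follows is a minimal sketch and not part
of the original thread: the function names are hypothetical, and the idea is
simply to take one snapshot before an iperf run and one after it, then
compare them.

```python
"""Snapshot per-interface packet counters from sysfs (Linux only)."""
from pathlib import Path

# Counter files found under /sys/class/net/<iface>/statistics/ on Linux.
COUNTERS = ("rx_packets", "rx_errors", "rx_dropped",
            "tx_packets", "tx_errors", "tx_dropped")

STANDARD_MTU = 1500  # an MTU above this implies jumbo packets are configured


def read_link_stats(iface, root="/sys/class/net"):
    """Return the counters listed above plus the MTU for one interface.

    `root` is a parameter only so the function can be pointed at a
    different directory tree; on a real system the default is correct.
    """
    base = Path(root) / iface
    stats = {name: int((base / "statistics" / name).read_text())
             for name in COUNTERS}
    stats["mtu"] = int((base / "mtu").read_text())
    return stats


def counter_delta(before, after):
    """Per-counter increase between two snapshots of the same interface."""
    return {name: after[name] - before[name] for name in COUNTERS}


def uses_jumbo_packets(stats):
    """True if the configured MTU is larger than the standard 1500 bytes."""
    return stats["mtu"] > STANDARD_MTU
```

For the interface in this report, one would call read_link_stats("enp3s0")
before and after the iperf run and pass both results to counter_delta(). A
growing rx_dropped or rx_errors value under r8169 but not under r8168 would
be a concrete measurement to report back.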


* [PATCH] Documentation: fix some freescale dpio-driver.rst warnings
From: Randy Dunlap @ 2019-02-11  6:32 UTC (permalink / raw)
  To: netdev@vger.kernel.org
  Cc: Stuart Yoder, Laurentiu Tudor, Ioana Radulescu, Madalin Bucur,
	David Miller, linux-doc@vger.kernel.org

From: Randy Dunlap <rdunlap@infradead.org>

Fix markup warnings for one list by using correct list syntax.
Fix markup warnings for another list by using blank lines before the
list.

Documentation/networking/device_drivers/freescale/dpaa2/dpio-driver.rst:30: WARNING: Unexpected indentation.
Documentation/networking/device_drivers/freescale/dpaa2/dpio-driver.rst:143: WARNING: Unexpected indentation.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Cc: netdev@vger.kernel.org
Cc: Madalin Bucur <madalin.bucur@nxp.com>
---
This still leaves 2 other warnings that I don't yet see how to fix.

 Documentation/networking/device_drivers/freescale/dpaa2/dpio-driver.rst |   14 +++++-----
 1 file changed, 7 insertions(+), 7 deletions(-)

--- lnx-50-rc6.orig/Documentation/networking/device_drivers/freescale/dpaa2/dpio-driver.rst
+++ lnx-50-rc6/Documentation/networking/device_drivers/freescale/dpaa2/dpio-driver.rst
@@ -27,11 +27,12 @@ Driver Overview
 
 The DPIO driver is bound to DPIO objects discovered on the fsl-mc bus and
 provides services that:
-  A) allow other drivers, such as the Ethernet driver, to enqueue and dequeue
+
+  A. allow other drivers, such as the Ethernet driver, to enqueue and dequeue
      frames for their respective objects
-  B) allow drivers to register callbacks for data availability notifications
+  B. allow drivers to register callbacks for data availability notifications
      when data becomes available on a queue or channel
-  C) allow drivers to manage hardware buffer pools
+  C. allow drivers to manage hardware buffer pools
 
 The Linux DPIO driver consists of 3 primary components--
    DPIO object driver-- fsl-mc driver that manages the DPIO object
@@ -140,11 +141,10 @@ QBman portal interface (qbman-portal.c)
 
    The qbman-portal component provides APIs to do the low level hardware
    bit twiddling for operations such as:
-      -initializing Qman software portals
-
-      -building and sending portal commands
 
-      -portal interrupt configuration and processing
+      - initializing Qman software portals
+      - building and sending portal commands
+      - portal interrupt configuration and processing
 
    The qbman-portal APIs are not public to other drivers, and are
    only used by dpio-service.




