Netdev List
* Re: [PATCH net] hv_netvsc: ignore devices that are not PCI
From: David Miller @ 2018-08-21 19:03 UTC
  To: stephen; +Cc: kys, netdev, sthemmin
In-Reply-To: <20180821174038.7942-1-sthemmin@microsoft.com>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Tue, 21 Aug 2018 10:40:38 -0700

> Registering another device with same MAC address (such as TAP, VPN or
> DPDK KNI) will confuse the VF autobinding logic.  Restrict the search
> to only run if the device is known to be a PCI attached VF.
> 
> Fixes: e8ff40d4bff1 ("hv_netvsc: improve VF device matching")
> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>

Applied and queued up for -stable.
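
The restriction described in the quoted commit message can be sketched in
userspace C. This is only an illustration under stated assumptions, not the
hv_netvsc code: struct fake_netdev, enum bus_type and is_vf_candidate() are
hypothetical names invented here. The point is the order of the checks: the
bus test comes first, so a TAP, VPN or KNI device that shares the MAC
address is never considered.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
enum bus_type { BUS_NONE, BUS_PCI, BUS_VIRTUAL };

struct fake_netdev {
	const char *name;
	enum bus_type bus;	/* bus of the parent device, if any */
	unsigned char mac[6];
};

/* Return true only for a PCI-attached device whose MAC address matches
 * the synthetic NIC. A non-PCI device is rejected before the MAC
 * comparison is even attempted.
 */
static bool is_vf_candidate(const struct fake_netdev *dev,
			    const unsigned char synthetic_mac[6])
{
	if (dev->bus != BUS_PCI)
		return false;

	return memcmp(dev->mac, synthetic_mac, 6) == 0;
}
```

With this ordering, a software device with a duplicate MAC address is
filtered out the same way as a device with a different MAC address.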


* Re: [PATCH] selftests: net: move fragment forwarding/config up a level
From: Ido Schimmel @ 2018-08-21 18:56 UTC
  To: Anders Roxell; +Cc: davem, shuah, linux-kernel, netdev, linux-kselftest
In-Reply-To: <20180821161212.5750-1-anders.roxell@linaro.org>

On Tue, Aug 21, 2018 at 06:12:12PM +0200, Anders Roxell wrote:
> 'make kselftest-merge' assumes that the config files for the tests are
> located under the 'main' test dir, like tools/testing/selftests/net/ and
> not in a subdir to net.

The tests under tools/testing/selftests/net/forwarding/ aren't executed
as part of the Makefile. The config file is there mainly so that people
will know which config options they need in order to run the tests.

The tests can be added to the Makefile, but some of them take a few
minutes to complete which is probably against "Don't take too long;"
mentioned in Documentation/dev-tools/kselftest.rst.


* [PATCHv2 iproute2 3/3] iproute: make clang happy with iproute2 package
From: Mahesh Bandewar @ 2018-08-21 17:48 UTC
  To: Stephen Hemminger; +Cc: netdev, Mahesh Bandewar

From: Mahesh Bandewar <maheshb@google.com>

These are primarily fixes for "format string is not a string literal"
warnings / errors (with -Werror -Wformat-nonliteral). This should be a
no-op change. I had to replace a couple of print helper functions with
the code they call, as it was becoming harder to eliminate these
warnings; however, these helpers were used in only a couple of places,
so no major change as such.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
---
 include/json_writer.h |  3 +--
 ip/iplink_can.c       | 19 ++++++++++++-------
 lib/color.c           |  1 +
 lib/json_print.c      |  1 +
 lib/json_writer.c     | 15 +--------------
 misc/ss.c             |  3 ++-
 tc/m_ematch.c         |  1 +
 tc/m_ematch.h         |  1 +
 8 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/include/json_writer.h b/include/json_writer.h
index 9ab88e1dbdd9..0c8831c1136d 100644
--- a/include/json_writer.h
+++ b/include/json_writer.h
@@ -29,6 +29,7 @@ void jsonw_pretty(json_writer_t *self, bool on);
 void jsonw_name(json_writer_t *self, const char *name);
 
 /* Add value  */
+__attribute__((format(printf, 2, 3)))
 void jsonw_printf(json_writer_t *self, const char *fmt, ...);
 void jsonw_string(json_writer_t *self, const char *value);
 void jsonw_bool(json_writer_t *self, bool value);
@@ -59,8 +60,6 @@ void jsonw_luint_field(json_writer_t *self, const char *prop,
 			unsigned long int num);
 void jsonw_lluint_field(json_writer_t *self, const char *prop,
 			unsigned long long int num);
-void jsonw_float_field_fmt(json_writer_t *self, const char *prop,
-			   const char *fmt, double val);
 
 /* Collections */
 void jsonw_start_object(json_writer_t *self);
diff --git a/ip/iplink_can.c b/ip/iplink_can.c
index 587413da15c4..c0deeb1f1fcf 100644
--- a/ip/iplink_can.c
+++ b/ip/iplink_can.c
@@ -316,11 +316,14 @@ static void can_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
 		struct can_bittiming *bt = RTA_DATA(tb[IFLA_CAN_BITTIMING]);
 
 		if (is_json_context()) {
+			json_writer_t *jw;
+
 			open_json_object("bittiming");
 			print_int(PRINT_ANY, "bitrate", NULL, bt->bitrate);
-			jsonw_float_field_fmt(get_json_writer(),
-					      "sample_point", "%.3f",
-					      (float) bt->sample_point / 1000.);
+			jw = get_json_writer();
+			jsonw_name(jw, "sample_point");
+			jsonw_printf(jw, "%.3f",
+				     (float) bt->sample_point / 1000.);
 			print_int(PRINT_ANY, "tq", NULL, bt->tq);
 			print_int(PRINT_ANY, "prop_seg", NULL, bt->prop_seg);
 			print_int(PRINT_ANY, "phase_seg1",
@@ -415,12 +418,14 @@ static void can_print_opt(struct link_util *lu, FILE *f, struct rtattr *tb[])
 			RTA_DATA(tb[IFLA_CAN_DATA_BITTIMING]);
 
 		if (is_json_context()) {
+			json_writer_t *jw;
+
 			open_json_object("data_bittiming");
 			print_int(PRINT_JSON, "bitrate", NULL, dbt->bitrate);
-			jsonw_float_field_fmt(get_json_writer(),
-					      "sample_point",
-					      "%.3f",
-					      (float) dbt->sample_point / 1000.);
+			jw = get_json_writer();
+			jsonw_name(jw, "sample_point");
+			jsonw_printf(jw, "%.3f",
+				     (float) dbt->sample_point / 1000.);
 			print_int(PRINT_JSON, "tq", NULL, dbt->tq);
 			print_int(PRINT_JSON, "prop_seg", NULL, dbt->prop_seg);
 			print_int(PRINT_JSON, "phase_seg1",
diff --git a/lib/color.c b/lib/color.c
index eaf69e74d673..e5406294dfc4 100644
--- a/lib/color.c
+++ b/lib/color.c
@@ -132,6 +132,7 @@ void set_color_palette(void)
 		is_dark_bg = 1;
 }
 
+__attribute__((format(printf, 3, 4)))
 int color_fprintf(FILE *fp, enum color_attr attr, const char *fmt, ...)
 {
 	int ret = 0;
diff --git a/lib/json_print.c b/lib/json_print.c
index 5dc41bfabfd4..77902824a738 100644
--- a/lib/json_print.c
+++ b/lib/json_print.c
@@ -100,6 +100,7 @@ void close_json_array(enum output_type type, const char *str)
  * functions handling different types
  */
 #define _PRINT_FUNC(type_name, type)					\
+	__attribute__((format(printf, 4, 0)))				\
 	void print_color_##type_name(enum output_type t,		\
 				     enum color_attr color,		\
 				     const char *key,			\
diff --git a/lib/json_writer.c b/lib/json_writer.c
index aa9ce1c65e51..68890b34ee92 100644
--- a/lib/json_writer.c
+++ b/lib/json_writer.c
@@ -152,6 +152,7 @@ void jsonw_name(json_writer_t *self, const char *name)
 		putc(' ', self->out);
 }
 
+__attribute__((format(printf, 2, 3)))
 void jsonw_printf(json_writer_t *self, const char *fmt, ...)
 {
 	va_list ap;
@@ -205,11 +206,6 @@ void jsonw_null(json_writer_t *self)
 	jsonw_printf(self, "null");
 }
 
-void jsonw_float_fmt(json_writer_t *self, const char *fmt, double num)
-{
-	jsonw_printf(self, fmt, num);
-}
-
 void jsonw_float(json_writer_t *self, double num)
 {
 	jsonw_printf(self, "%g", num);
@@ -274,15 +270,6 @@ void jsonw_float_field(json_writer_t *self, const char *prop, double val)
 	jsonw_float(self, val);
 }
 
-void jsonw_float_field_fmt(json_writer_t *self,
-			   const char *prop,
-			   const char *fmt,
-			   double val)
-{
-	jsonw_name(self, prop);
-	jsonw_float_fmt(self, fmt, val);
-}
-
 void jsonw_uint_field(json_writer_t *self, const char *prop, unsigned int num)
 {
 	jsonw_name(self, prop);
diff --git a/misc/ss.c b/misc/ss.c
index 41e7762bb61f..93b1baf5dc40 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -976,6 +976,7 @@ static int buf_update(int len)
 }
 
 /* Append content to buffer as part of the current field */
+__attribute__((format(printf, 1, 2)))
 static void out(const char *fmt, ...)
 {
 	struct column *f = current_field;
@@ -1093,7 +1094,7 @@ static void print_header(void)
 {
 	while (!field_is_last(current_field)) {
 		if (!current_field->disabled)
-			out(current_field->header);
+			out("%s", current_field->header);
 		field_next();
 	}
 }
diff --git a/tc/m_ematch.c b/tc/m_ematch.c
index ace4b3dd738b..a524b520b276 100644
--- a/tc/m_ematch.c
+++ b/tc/m_ematch.c
@@ -277,6 +277,7 @@ static int flatten_tree(struct ematch *head, struct ematch *tree)
 	return count;
 }
 
+__attribute__((format(printf, 5, 6)))
 int em_parse_error(int err, struct bstr *args, struct bstr *carg,
 		   struct ematch_util *e, char *fmt, ...)
 {
diff --git a/tc/m_ematch.h b/tc/m_ematch.h
index 80b02cfad6cc..95515a074624 100644
--- a/tc/m_ematch.h
+++ b/tc/m_ematch.h
@@ -107,6 +107,7 @@ static inline int parse_layer(struct bstr *b)
 		return INT_MAX;
 }
 
+__attribute__((format(printf, 5, 6)))
 int em_parse_error(int err, struct bstr *args, struct bstr *carg,
 		   struct ematch_util *, char *fmt, ...);
 int print_ematch(FILE *, const struct rtattr *);
-- 
2.18.0.865.gffc8e1a3cd6-goog
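
A minimal sketch of the two techniques the patch above relies on, assuming
GCC or clang. buf_printf() and put_header() are hypothetical helpers
written for this note and are not part of iproute2. The format attribute
lets the compiler check callers and accept the non-literal fmt that is
forwarded to vsnprintf(); passing run-time text through "%s" mirrors the
out("%s", ...) change in misc/ss.c.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper for illustration: formats into buf. The attribute
 * says that argument 3 is a printf-style format string and that the
 * values it consumes start at argument 4.
 */
__attribute__((format(printf, 3, 4)))
static int buf_printf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	return ret;
}

/* Copies a header string that is only known at run time. */
static int put_header(char *buf, size_t size, const char *header)
{
	/* Calling buf_printf(buf, size, header) would interpret any '%'
	 * in header as a conversion, and is the pattern that
	 * -Wformat-nonliteral reports. A literal "%s" avoids both.
	 */
	return buf_printf(buf, size, "%s", header);
}
```

A header such as "100%" is then copied verbatim instead of being parsed as
a format string.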


* Re: [PATCHv2 iproute2 2/3] tc: remove extern from prototype declarations
From: Mahesh Bandewar (महेश बंडेवार) @ 2018-08-21 18:44 UTC
  To: Stephen Hemminger; +Cc: Mahesh Bandewar, netdev
In-Reply-To: <20180821111911.796b5c9f@xeon-e3>

On Tue, Aug 21, 2018 at 11:19 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Tue, 21 Aug 2018 10:48:54 -0700
> Mahesh Bandewar <mahesh@bandewar.net> wrote:
>
>> From: Mahesh Bandewar <maheshb@google.com>
>>
>> Signed-off-by: Mahesh Bandewar <maheshb@google.com>
>
> I already did this yesterday. Patch was on mailing list.
Hmm, I thought I had mentioned that I would take care of it in v2. In any
case, we can drop this one from the patch series if the remaining patches
are fine. Does that make sense?


* linux-next: build warning after merge of the net tree
From: Stephen Rothwell @ 2018-08-21 22:04 UTC
  To: David Miller, Networking
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List, Cong Wang


Hi all,

After merging the net tree, today's linux-next build (KCONFIG_NAME)
produced this warning:

drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c: In function 'tc_fill_actions':
drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c:64:6: warning: unused variable 'i' [-Wunused-variable]
  int i;
      ^

Introduced by commit

  244cd96adb5f ("net_sched: remove list_head from tc_action")

-- 
Cheers,
Stephen Rothwell



* [PATCH 06/11] net: ethernet: renesas: use SPDX identifier for Renesas drivers
From: Wolfram Sang @ 2018-08-21 22:02 UTC
  To: linux-renesas-soc
  Cc: Kuninori Morimoto, Wolfram Sang, Sergei Shtylyov, David S. Miller,
	netdev, linux-kernel
In-Reply-To: <20180821220233.9202-1-wsa+renesas@sang-engineering.com>

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---

To be applied individually per subsystem tree. Morimoto-san, could you maybe
ack this with your Renesas address?

 drivers/net/ethernet/renesas/ravb.h      |  5 +----
 drivers/net/ethernet/renesas/ravb_main.c |  5 +----
 drivers/net/ethernet/renesas/sh_eth.c    | 13 +------------
 drivers/net/ethernet/renesas/sh_eth.h    | 13 +------------
 4 files changed, 4 insertions(+), 32 deletions(-)

diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h
index b81f4faf7b10..1470fc12282b 100644
--- a/drivers/net/ethernet/renesas/ravb.h
+++ b/drivers/net/ethernet/renesas/ravb.h
@@ -1,3 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /* Renesas Ethernet AVB device driver
  *
  * Copyright (C) 2014-2015 Renesas Electronics Corporation
@@ -5,10 +6,6 @@
  * Copyright (C) 2015-2016 Cogent Embedded, Inc. <source@cogentembedded.com>
  *
  * Based on the SuperH Ethernet driver
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms and conditions of the GNU General Public License version 2,
- * as published by the Free Software Foundation.
  */
 
 #ifndef __RAVB_H__
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index c06f2df895c2..aff5516b781e 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /* Renesas Ethernet AVB device driver
  *
  * Copyright (C) 2014-2015 Renesas Electronics Corporation
@@ -5,10 +6,6 @@
  * Copyright (C) 2015-2016 Cogent Embedded, Inc. <source@cogentembedded.com>
  *
  * Based on the SuperH Ethernet driver
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms and conditions of the GNU General Public License version 2,
- * as published by the Free Software Foundation.
  */
 
 #include <linux/cache.h>
diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 5573199c4536..ad4433d59237 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*  SuperH Ethernet device driver
  *
  *  Copyright (C) 2014 Renesas Electronics Corporation
@@ -5,18 +6,6 @@
  *  Copyright (C) 2008-2014 Renesas Solutions Corp.
  *  Copyright (C) 2013-2017 Cogent Embedded, Inc.
  *  Copyright (C) 2014 Codethink Limited
- *
- *  This program is free software; you can redistribute it and/or modify it
- *  under the terms and conditions of the GNU General Public License,
- *  version 2, as published by the Free Software Foundation.
- *
- *  This program is distributed in the hope it will be useful, but WITHOUT
- *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- *  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- *  more details.
- *
- *  The full GNU General Public License is included in this distribution in
- *  the file called "COPYING".
  */
 
 #include <linux/module.h>
diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
index f94be99cf400..0c18650bbfe6 100644
--- a/drivers/net/ethernet/renesas/sh_eth.h
+++ b/drivers/net/ethernet/renesas/sh_eth.h
@@ -1,19 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*  SuperH Ethernet device driver
  *
  *  Copyright (C) 2006-2012 Nobuhiro Iwamatsu
  *  Copyright (C) 2008-2012 Renesas Solutions Corp.
- *
- *  This program is free software; you can redistribute it and/or modify it
- *  under the terms and conditions of the GNU General Public License,
- *  version 2, as published by the Free Software Foundation.
- *
- *  This program is distributed in the hope it will be useful, but WITHOUT
- *  ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- *  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- *  more details.
- *
- *  The full GNU General Public License is included in this distribution in
- *  the file called "COPYING".
  */
 
 #ifndef __SH_ETH_H__
-- 
2.11.0


* [PATCH 05/11] can: rcar: use SPDX identifier for Renesas drivers
From: Wolfram Sang @ 2018-08-21 22:02 UTC
  To: linux-renesas-soc
  Cc: Kuninori Morimoto, Wolfram Sang, Wolfgang Grandegger,
	Marc Kleine-Budde, David S. Miller, linux-can, netdev,
	linux-kernel
In-Reply-To: <20180821220233.9202-1-wsa+renesas@sang-engineering.com>

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---

To be applied individually per subsystem tree. Morimoto-san, could you maybe
ack this with your Renesas address?

 drivers/net/can/rcar/rcar_can.c   | 6 +-----
 drivers/net/can/rcar/rcar_canfd.c | 6 +-----
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/net/can/rcar/rcar_can.c b/drivers/net/can/rcar/rcar_can.c
index 11662f479e76..051bf4ef4be2 100644
--- a/drivers/net/can/rcar/rcar_can.c
+++ b/drivers/net/can/rcar/rcar_can.c
@@ -1,12 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
 /* Renesas R-Car CAN device driver
  *
  * Copyright (C) 2013 Cogent Embedded, Inc. <source@cogentembedded.com>
  * Copyright (C) 2013 Renesas Solutions Corp.
- *
- * This program is free software; you can redistribute  it and/or modify it
- * under  the terms of  the GNU General  Public License as published by the
- * Free Software Foundation;  either version 2 of the  License, or (at your
- * option) any later version.
  */
 
 #include <linux/module.h>
diff --git a/drivers/net/can/rcar/rcar_canfd.c b/drivers/net/can/rcar/rcar_canfd.c
index 602c19e23f05..09a5b038a9f0 100644
--- a/drivers/net/can/rcar/rcar_canfd.c
+++ b/drivers/net/can/rcar/rcar_canfd.c
@@ -1,11 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
 /* Renesas R-Car CAN FD device driver
  *
  * Copyright (C) 2015 Renesas Electronics Corp.
- *
- * This program is free software; you can redistribute  it and/or modify it
- * under  the terms of  the GNU General  Public License as published by the
- * Free Software Foundation;  either version 2 of the  License, or (at your
- * option) any later version.
  */
 
 /* The R-Car CAN FD controller can operate in either one of the below two modes
-- 
2.11.0


* Fw: [Bug 200879] New: Poor network performance using CX-5 Mellanox card
From: Stephen Hemminger @ 2018-08-21 18:39 UTC
  To: netdev



Begin forwarded message:

Date: Tue, 21 Aug 2018 18:37:01 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 200879] New: Poor network performance using CX-5 Mellanox card


https://bugzilla.kernel.org/show_bug.cgi?id=200879

            Bug ID: 200879
           Summary: Poor network performance using CX-5 Mellanox card
           Product: Networking
           Version: 2.5
    Kernel Version: 4.18
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: IPV4
          Assignee: stephen@networkplumber.org
          Reporter: kolga@netapp.com
        Regression: No

I'm having issues with the latest kernel (4.18) when using Mellanox CX-5
cards. I tested using a direct connection between the machines and via a 40G
link. Throughput is asymmetric.

I have tested some kernels between 4.15-rc4 and 4.18. What I notice is
asymmetric flow performance, where the "good" direction gets 20+G and the
"bad" direction gets 2+G. I measure performance over multiple runs (10).
While 4.15-rc4 was not perfect at getting symmetric 20+G performance all the
time (details below), the poor performance is more prevalent starting from
the 4.16 kernel.

In 4.15-rc4, 6 out of 10 runs show good performance (20+G) in the bad
direction. Performance in the other direction is mostly 28+G (7 out of 10
runs; in the other 3 runs it goes down to 15+G).
In 4.16, 3 out of 10 runs show good performance (20+G). Performance in the
other direction is mostly 20+G (7 out of 10 runs; in the other 3 runs it
goes down to 15G).
In 4.17, 0 out of 10 runs show good performance. Performance in the other
direction is mostly 20+G (7 out of 10 runs; in the other 3 runs it goes down
to 5G).
In 4.18, 0 out of 10 runs show good performance in the "bad" direction. The
"good" direction is now also pretty bad: 7 out of 10 runs had throughput
between 7-10G and 3 runs had 11-17G.

Here is a description of the setup. I have 2 machines:
sti-rx200-231 and sti-rx200-232. I have been running iperf3 to test
bandwidth, first with the server on -231 and then with the server on
-232. The "bad" direction is when -232 is the server and -231 is the
client.

> can you share some configuration details:
> CPU numa and affinity details:  

[kolga@sti-rx200-232 ~]$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 0 size: 32155 MB
node 0 free: 30394 MB
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
node 1 size: 32231 MB
node 1 free: 30486 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

[kolga@sti-rx200-231 ~]$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 0 size: 32161 MB
node 0 free: 31100 MB
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
node 1 size: 32231 MB
node 1 free: 29774 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

> ethtool -l  
[kolga@sti-rx200-231 ~]$ sudo ethtool -l enp4s0
Channel parameters for enp4s0:
Pre-set maximums:
RX:             0
TX:             0
Other:          0
Combined:       24
Current hardware settings:
RX:             0
TX:             0
Other:          0
Combined:       24

(same for the other machine)

> ethtool -g  
[kolga@sti-rx200-231 ~]$ sudo ethtool -g enp4s0
Ring parameters for enp4s0:
Pre-set maximums:
RX:             8192
RX Mini:        0
RX Jumbo:       0
TX:             8192
Current hardware settings:
RX:             1024
RX Mini:        0
RX Jumbo:       0
TX:             1024

(same for the other machine)

> ethtool -x  
[kolga@sti-rx200-231 ~]$ sudo ethtool -x enp4s0
RX flow hash indirection table for enp4s0 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:      0     1     2     3     4     5     6     7
   32:      8     9    10    11    12    13    14    15
   40:     16    17    18    19    20    21    22    23
   48:      0     1     2     3     4     5     6     7
   56:      8     9    10    11    12    13    14    15
   64:     16    17    18    19    20    21    22    23
   72:      0     1     2     3     4     5     6     7
   80:      8     9    10    11    12    13    14    15
   88:     16    17    18    19    20    21    22    23
   96:      0     1     2     3     4     5     6     7
  104:      8     9    10    11    12    13    14    15
  112:     16    17    18    19    20    21    22    23
  120:      0     1     2     3     4     5     6     7
RSS hash key:
5e:1c:93:e2:ec:b6:44:b1:e4:ec:b1:20:57:ab:90:f6:0c:1a:46:13:b8:19:66:c8:56:0c:06:b2:d5:53:a6:4d:89:6b:0b:b1:d4:30:90:31

(same for the other machine but the RSS hash key is different)
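
The indirection table printed above has 128 entries spread over 24 rings,
and its contents are consistent with a plain round-robin fill (entry i
points at ring i modulo 24). The sketch below reproduces that pattern. It
is inferred from the listing only and is not taken from the mlx5 driver or
from ethtool; fill_indir_table() is a hypothetical name.

```c
#include <stddef.h>

/* Fill an RSS indirection table so that entry i selects ring
 * i % n_rings. Returns 0 on success, -1 if n_rings is zero.
 */
static int fill_indir_table(unsigned int *table, size_t n_entries,
			    unsigned int n_rings)
{
	size_t i;

	if (n_rings == 0)
		return -1;

	for (i = 0; i < n_entries; i++)
		table[i] = (unsigned int)(i % n_rings);

	return 0;
}
```

Because 128 is not a multiple of 24, such a fill is slightly uneven: rings
0-7 each own 6 table entries while rings 8-23 each own 5.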

> ethtool -k  
[kolga@sti-rx200-231 ~]$ sudo ethtool -k enp4s0
Features for enp4s0:
Cannot get device udp-fragmentation-offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: on [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]

(I think it's the same for the other machine, but just in case here's the
output)
[kolga@sti-rx200-232 ~]$ sudo ethtool -k enp4s0
Features for enp4s0:
Cannot get device udp-fragmentation-offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: off
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: on [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]

> ethtool --show-priv-flags  
[kolga@sti-rx200-231 ~]$ sudo ethtool --show-priv-flags enp4s0
Private flags for enp4s0:
rx_cqe_moder   : on
tx_cqe_moder   : off
rx_cqe_compress: off
rx_striding_rq : on

(same for the other machine)

> ethtool -S // before and after the good and bad runs
> perf report/top while running the test.  

This is a run where -232 is a server and -231 is a client.

[kolga@sti-rx200-232 src]$ ./iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 172.20.35.189, port 37302
[  5] local 172.20.35.191 port 5201 connected to 172.20.35.189 port 37304
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   236 MBytes  1.98 Gbits/sec
[  5]   1.00-2.00   sec   233 MBytes  1.95 Gbits/sec
[  5]   2.00-3.00   sec   235 MBytes  1.97 Gbits/sec
[  5]   3.00-4.00   sec   231 MBytes  1.94 Gbits/sec
[  5]   4.00-5.00   sec   243 MBytes  2.04 Gbits/sec
[  5]   5.00-6.00   sec   238 MBytes  1.99 Gbits/sec
[  5]   6.00-7.00   sec   230 MBytes  1.93 Gbits/sec
[  5]   7.00-8.00   sec   232 MBytes  1.94 Gbits/sec
[  5]   8.00-9.00   sec   272 MBytes  2.28 Gbits/sec
[  5]   9.00-10.00  sec   249 MBytes  2.09 Gbits/sec
[  5]  10.00-10.05  sec  10.4 MBytes  1.90 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.05  sec  2.35 GBytes  2.01 Gbits/sec                  receiver

[kolga@sti-rx200-231 src]$ sudo ./iperf3 -c 172.20.35.191
Connecting to host 172.20.35.191, port 5201
[  5] local 172.20.35.189 port 37304 connected to 172.20.35.191 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   249 MBytes  2.09 Gbits/sec    4   1.93 MBytes
[  5]   1.00-2.00   sec   232 MBytes  1.95 Gbits/sec    1   1.69 MBytes
[  5]   2.00-3.00   sec   234 MBytes  1.96 Gbits/sec    0   2.24 MBytes
[  5]   3.00-4.00   sec   232 MBytes  1.95 Gbits/sec    0   2.66 MBytes
[  5]   4.00-5.00   sec   242 MBytes  2.03 Gbits/sec   18   1.32 MBytes
[  5]   5.00-6.00   sec   238 MBytes  1.99 Gbits/sec    1   1.63 MBytes
[  5]   6.00-7.00   sec   230 MBytes  1.93 Gbits/sec    0   2.18 MBytes
[  5]   7.00-8.00   sec   232 MBytes  1.95 Gbits/sec    8   1.38 MBytes
[  5]   8.00-9.00   sec   272 MBytes  2.29 Gbits/sec    2   1.45 MBytes
[  5]   9.00-10.00  sec   248 MBytes  2.08 Gbits/sec    2   1.66 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.35 GBytes  2.02 Gbits/sec   36             sender
[  5]   0.00-10.05  sec  2.35 GBytes  2.01 Gbits/sec                  receiver

iperf Done.
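
When reading the iperf3 tables, note that the Transfer column is in binary
units while the Bitrate column is in decimal units. A small sketch of the
conversion, assuming MBytes means 2^20 bytes and Gbits means 10^9 bits (an
assumption that matches the figures in the output above);
mbytes_to_gbits_per_sec() is a hypothetical helper name.

```c
/* Convert a transfer of `mbytes` (units of 2^20 bytes) completed in
 * `seconds` into a rate in Gbits/sec (units of 10^9 bits).
 */
static double mbytes_to_gbits_per_sec(double mbytes, double seconds)
{
	const double bytes = mbytes * 1024.0 * 1024.0;

	return bytes * 8.0 / seconds / 1e9;
}
```

For the first interval of the run above, 236 MBytes in one second works
out to about 1.98 Gbits/sec, which agrees with the reported bitrate.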

Top output from -231
Tasks: 412 total,   1 running, 209 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.3 sy,  0.0 ni, 99.2 id,  0.0 wa,  0.0 hi,  0.4 si,  0.0 st
KiB Mem : 65938632 total, 63195184 free,  2191948 used,   551500 buff/cache
KiB Swap: 33030140 total, 33030140 free,        0 used. 63105388 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  162 root      20   0       0      0      0 I   3.3  0.0   0:00.51 kworker/12+
 2530 root      20   0       0      0      0 I   1.7  0.0   0:00.39 kworker/u4+
 2742 root      20   0       0      0      0 I   1.3  0.0   0:00.17 kworker/u4+
 2394 root      20   0       0      0      0 I   1.0  0.0   0:00.16 kworker/17+
 2774 kolga     20   0  158000   4640   3676 R   0.3  0.0   0:00.03 top
    1 root      20   0  191980   6436   3900 S   0.0  0.0   0:26.21 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.01 kthreadd
    3 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
    5 root      20   0       0      0      0 I   0.0  0.0   0:00.02 kworker/0:+
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:+
    8 root      20   0       0      0      0 I   0.0  0.0   0:00.00 kworker/u4+
    9 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_+
   10 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0
   11 root      20   0       0      0      0 I   0.0  0.0   0:00.17 rcu_sched
   12 root      20   0       0      0      0 I   0.0  0.0   0:00.00 rcu_bh
   13 root      rt   0       0      0      0 S   0.0  0.0   0:00.06 migration/0

Top output from -232
Tasks: 400 total,   2 running, 202 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  1.5 sy,  0.0 ni, 97.3 id,  0.0 wa,  0.0 hi,  1.0 si,  0.0 st
KiB Mem : 65932576 total, 63205228 free,  2180492 used,   546856 buff/cache
KiB Swap: 28901372 total, 28901372 free,        0 used. 63114220 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2467 kolga     20   0   43716   4144   3608 R  37.9  0.0   0:03.32 lt-iperf3
 1199 root      20   0       0      0      0 D   2.7  0.0   0:00.18 kworker/23+
  565 root      20   0       0      0      0 I   1.7  0.0   0:00.23 kworker/2:+
 2360 root      20   0       0      0      0 I   1.3  0.0   0:00.16 kworker/u4+
  153 root      20   0       0      0      0 I   0.7  0.0   0:00.11 kworker/23+
 2273 root      20   0       0      0      0 I   0.7  0.0   0:01.13 kworker/u4+
  691 root      20   0       0      0      0 S   0.3  0.0   0:00.20 xfsaild/dm+
 2448 root      20   0       0      0      0 I   0.3  0.0   0:00.08 kworker/u4+
    1 root      20   0  191820   6108   3804 S   0.0  0.0   0:26.12 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.01 kthreadd
    3 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:+
    9 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_+
   10 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0
   11 root      20   0       0      0      0 I   0.0  0.0   0:00.13 rcu_sched
   12 root      20   0       0      0      0 I   0.0  0.0   0:00.00 rcu_bh


Here's a run where -231 is the server and -232 is the client (the "good"
direction):
[kolga@sti-rx200-231 src]$ ./iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 172.20.35.191, port 35060
[  5] local 172.20.35.189 port 5201 connected to 172.20.35.191 port 35062
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  1.81 GBytes  15.5 Gbits/sec
[  5]   1.00-2.00   sec  1.90 GBytes  16.3 Gbits/sec
[  5]   2.00-3.00   sec  1.25 GBytes  10.8 Gbits/sec
[  5]   3.00-4.00   sec   826 MBytes  6.93 Gbits/sec
[  5]   4.00-5.00   sec   819 MBytes  6.87 Gbits/sec
[  5]   5.00-6.00   sec  1.47 GBytes  12.6 Gbits/sec
[  5]   6.00-7.00   sec  1.79 GBytes  15.4 Gbits/sec
[  5]   7.00-8.00   sec  1.14 GBytes  9.75 Gbits/sec
[  5]   8.00-9.00   sec  1.81 GBytes  15.6 Gbits/sec
[  5]   9.00-10.00  sec  1.77 GBytes  15.2 Gbits/sec
[  5]  10.00-10.04  sec  78.2 MBytes  16.3 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.04  sec  14.6 GBytes  12.5 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------

./iperf3 -c 172.20.35.189
Connecting to host 172.20.35.189, port 5201
[  5] local 172.20.35.191 port 35062 connected to 172.20.35.189 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.89 GBytes  16.2 Gbits/sec  100   1.45 MBytes
[  5]   1.00-2.00   sec  1.90 GBytes  16.4 Gbits/sec   86    638 KBytes
[  5]   2.00-3.00   sec  1.21 GBytes  10.4 Gbits/sec   57   1.15 MBytes
[  5]   3.00-4.00   sec   830 MBytes  6.96 Gbits/sec    7   1.54 MBytes
[  5]   4.00-5.00   sec   815 MBytes  6.84 Gbits/sec   15   1.10 MBytes
[  5]   5.00-6.00   sec  1.51 GBytes  12.9 Gbits/sec  362    690 KBytes
[  5]   6.00-7.00   sec  1.72 GBytes  14.8 Gbits/sec  665    690 KBytes
[  5]   7.00-8.00   sec  1.20 GBytes  10.4 Gbits/sec  639    778 KBytes
[  5]   8.00-9.00   sec  1.81 GBytes  15.6 Gbits/sec  879    708 KBytes
[  5]   9.00-10.00  sec  1.77 GBytes  15.2 Gbits/sec  865    577 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  14.6 GBytes  12.6 Gbits/sec  3675             sender
[  5]   0.00-10.04  sec  14.6 GBytes  12.5 Gbits/sec                  receiver

iperf Done.
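
A note on units when reading these numbers: judging by the output above,
the Transfer column uses binary prefixes (1 MByte = 2^20 bytes) while the
Bitrate column uses decimal ones (1 Gbit = 10^9 bits). A small sketch of
the conversion in C, with a made-up helper name, which can be checked
against the 3.00-4.00 interval on the server side:

```c
/* Convert a transfer given in MBytes (assumed to be 2^20 bytes, based on
 * the output above) over a time span in seconds into Gbits/sec (10^9 bits).
 * Hypothetical helper, not part of iperf3. */
double mbytes_to_gbits_per_sec(double mbytes, double seconds)
{
	double bits = mbytes * 1048576.0 * 8.0;

	return bits / seconds / 1e9;
}
```

For 826 MBytes in one second this gives roughly 6.93 Gbits/sec, which
agrees with the interval line above.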

-232 top
Tasks: 391 total,   1 running, 202 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  1.4 sy,  0.0 ni, 97.2 id,  0.0 wa,  0.0 hi,  1.4 si,  0.0 st
KiB Mem : 65932576 total, 63203164 free,  2181920 used,   547492 buff/cache
KiB Swap: 28901372 total, 28901372 free,        0 used. 63111764 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2637 kolga     20   0   43716   4476   3944 S  23.6  0.0   0:02.09 lt-iperf3
 2510 root      20   0       0      0      0 I   4.3  0.0   0:00.46 kworker/10+
 2653 root      20   0       0      0      0 I   3.0  0.0   0:00.16 kworker/1:+
 2630 root      20   0       0      0      0 I   1.7  0.0   0:00.14 kworker/u4+
  565 root      20   0       0      0      0 I   1.0  0.0   0:00.32 kworker/2:+
 2609 root      20   0       0      0      0 I   1.0  0.0   0:00.48 kworker/u4+
 2273 root      20   0       0      0      0 I   0.7  0.0   0:01.67 kworker/u4+
   25 root      20   0       0      0      0 S   0.3  0.0   0:00.01 ksoftirqd/2
   74 root      20   0       0      0      0 S   0.3  0.0   0:00.03 ksoftirqd/+
 2651 kolga     20   0  158000   4568   3640 R   0.3  0.0   0:00.03 top
    1 root      20   0  191820   6108   3804 S   0.0  0.0   0:26.12 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.01 kthreadd
    3 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:+
    9 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_+
   10 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0

-231 top
Tasks: 413 total,   3 running, 209 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  3.1 sy,  0.0 ni, 94.7 id,  0.0 wa,  0.0 hi,  2.1 si,  0.0 st
KiB Mem : 65938632 total, 63202896 free,  2183572 used,   552164 buff/cache
KiB Swap: 33030140 total, 33030140 free,        0 used. 63113436 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2859 kolga     20   0   43716   4108   3572 R  70.8  0.0   0:05.75 lt-iperf3
 2875 root      20   0       0      0      0 I   7.3  0.0   0:00.36 kworker/9:+
  185 root      20   0       0      0      0 I   6.6  0.0   0:00.47 kworker/9:+
   68 root      20   0       0      0      0 S   5.0  0.0   0:00.25 ksoftirqd/9
 2421 root      20   0       0      0      0 I   3.0  0.0   0:00.10 kworker/13+
 2832 root      20   0       0      0      0 I   1.7  0.0   0:00.28 kworker/u4+
 2742 root      20   0       0      0      0 I   1.0  0.0   0:00.63 kworker/u4+
 2530 root      20   0       0      0      0 I   0.7  0.0   0:01.41 kworker/u4+
 2877 kolga     20   0  158000   4712   3740 R   0.3  0.0   0:00.02 top
    1 root      20   0  191980   6440   3900 S   0.0  0.0   0:26.23 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.01 kthreadd
    3 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
    5 root      20   0       0      0      0 I   0.0  0.0   0:00.02 kworker/0:+
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:+
    8 root      20   0       0      0      0 I   0.0  0.0   0:00.00 kworker/u4+
    9 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_+

I could attach the output from ethtool -S (too long to include inline).

From googling about CPU C-states and how to disable them:
[kolga@sti-rx200-232 ~]$ sudo cat /sys/module/intel_idle/parameters/max_cstate
9

Adding processor.max_cstate=0 and intel_idle.max_cstate=0 to the
kernel boot parameters made it so that max_cstate was reported as 0.

I re-ran the experiments and it made no difference.
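
For anyone who wants to check the parameter programmatically rather than
with cat, here is a minimal sketch in C that reads a single integer from a
sysfs-style file. The function name is hypothetical; the only path assumed
is the intel_idle one shown above.

```c
#include <stdio.h>

/* Read one integer from a sysfs-style file such as
 * /sys/module/intel_idle/parameters/max_cstate.
 * Returns 0 and stores the value in *out on success, -1 on failure
 * (file missing, unreadable, or not starting with an integer). */
int read_int_param(const char *path, int *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling it with the max_cstate path before and after changing the boot
parameters should show whether the setting actually took effect.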


^ permalink raw reply

* Re: [PATCHv2 iproute2 2/3] tc: remove extern from prototype declarations
From: Stephen Hemminger @ 2018-08-21 18:19 UTC (permalink / raw)
  To: Mahesh Bandewar; +Cc: netdev, Mahesh Bandewar
In-Reply-To: <20180821174854.208561-1-mahesh@bandewar.net>

On Tue, 21 Aug 2018 10:48:54 -0700
Mahesh Bandewar <mahesh@bandewar.net> wrote:

> From: Mahesh Bandewar <maheshb@google.com>
> 
> Signed-off-by: Mahesh Bandewar <maheshb@google.com>

I already did this yesterday. Patch was on mailing list.

^ permalink raw reply

* Re: ixgbe hangs when XDP_TX is enabled
From: Alexander Duyck @ 2018-08-21 18:13 UTC (permalink / raw)
  To: tehnerd; +Cc: Netdev, Jeff Kirsher
In-Reply-To: <20180821165858.GA1507@maindev>

On Tue, Aug 21, 2018 at 9:59 AM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
>
> On Tue, Aug 21, 2018 at 08:58:15AM -0700, Alexander Duyck wrote:
> > On Mon, Aug 20, 2018 at 12:32 PM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
> > >
> > > we are getting such errors:
> > >
> > > [  408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang (XDP)
> > >                  Tx Queue             <46>
> > >                  TDH, TDT             <0>, <2>
> > >                  next_to_use          <2>
> > >                  next_to_clean        <0>
> > >                tx_buffer_info[next_to_clean]
> > >                  time_stamp           <0>
> > >                  jiffies              <1000197c0>
> > > [  408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on queue 46, resetting adapter
> > > [  408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to tx timeout
> > > [  408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
> > > [  408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one or more queues not cleared within the polling period
> > > [  409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
> > > [  409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
> > >
> > > while running XDP prog on ixgbe nic.
> > > right now i'm seeing this on bpf-next kernel
> > > (latest commit from Wed Aug 15 15:04:25 2018 -0700 ;
> > > 9a76aba02a37718242d7cdc294f0a3901928aa57)
> > >
> > > looks like this is the same issue as reported by Brenden in
> > > https://www.spinics.net/lists/netdev/msg439438.html
> > >
> > > --
> > > Nikita V. Shirokov
> >
> > Could you provide some additional information about your setup.
> > Specifically useful would be "ethtool -i", "ethtool -l", and lspci
> > -vvv info for your device. The total number of CPUs on the system
> > would be useful to know as well. In addition could you try
> > reproducing
> sure:
>
> ethtool -l eth0
> Channel parameters for eth0:
> Pre-set maximums:
> RX:             0
> TX:             0
> Other:          1
> Combined:       63
> Current hardware settings:
> RX:             0
> TX:             0
> Other:          1
> Combined:       48
>
> # ethtool -i eth0
> driver: ixgbe
> version: 5.1.0-k
> firmware-version: 0x800006f1
> expansion-rom-version:
> bus-info: 0000:03:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
>
>
> # nproc
> 48
>
> lspci:
>
> 03:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
>         Subsystem: Intel Corporation Device 000d
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 32 bytes
>         Interrupt: pin A routed to IRQ 30
>         NUMA node: 0
>         Region 0: Memory at c7d00000 (64-bit, non-prefetchable) [size=1M]
>         Region 2: I/O ports at 6000 [size=32]
>         Region 4: Memory at c7e80000 (64-bit, non-prefetchable) [size=16K]
>         Expansion ROM at c7e00000 [disabled] [size=512K]
>         Capabilities: [40] Power Management version 3
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>                 Address: 0000000000000000  Data: 0000
>                 Masking: 00000000  Pending: 00000000
>         Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
>                 Vector table: BAR=4 offset=00000000
>                 PBA: BAR=4 offset=00002000
>         Capabilities: [a0] Express (v2) Endpoint, MSI 00
>                 DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
>                         RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
>                         MaxPayload 256 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend+
>                 LnkCap: Port #2, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 <8us
>                         ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>                 DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>                 LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>                          Compliance De-emphasis: -6dB
>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>         Capabilities: [100 v1] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>                 AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>         Capabilities: [140 v1] Device Serial Number 90-e2-ba-ff-ff-b6-b2-60
>         Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
>                 ARICap: MFVC- ACS-, Next Function: 0
>                 ARICtl: MFVC- ACS-, Function Group: 0
>         Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
>                 IOVCap: Migration-, Interrupt Message Number: 000
>                 IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
>                 IOVSta: Migration-
>                 Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
>                 VF offset: 128, stride: 2, Device ID: 10ed
>                 Supported Page Size: 00000553, System Page Size: 00000001
>                 Region 0: Memory at 00000000c7c00000 (64-bit, prefetchable)
>                 Region 3: Memory at 00000000c7b00000 (64-bit, prefetchable)
>                 VF Migration: offset: 00000000, BIR: 0
>         Kernel driver in use: ixgbe
>
>
>
>
> workaround for now is to do the same, as Brenden did in his original
> finding: make sure that combined + xdp queues < max_tx_queues
> (e.g. w/ combined == 14 the issue goes away).
>
> > the issue with one of the sample XDP programs provided with the kernel
> > such as the xdp2 which I believe uses the XDP_TX function. We need to
> > try and create a similar setup in our own environment for
> > reproduction and debugging.
>
> will try but this could take a while, because i'm not sure that we have
> ixgbe in our test lab (and it would be hard to run such test in prod)
>
> >
> > Thanks.
> >
> > - Alex
>
> --
> Nikita V. Shirokov

So I have been reading the datasheet
(https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82599-10-gbe-controller-datasheet.pdf)
and it looks like the assumption that Brenden came to in the earlier
referenced link is probably correct. From what I can tell there is a
limit of 64 queues in the base RSS mode of the device, so while it
supports more than 64 queues you can only make use of 64 as per table
7-25.

For now I think the workaround you are using is probably the only
viable solution. I myself don't have time to work on resolving this,
but I am sure one of the maintainers for ixgbe will be responding
shortly.
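
To make the arithmetic behind that workaround concrete, here is a rough
sketch in C. It assumes, as I understand the driver, that ixgbe sets up
one XDP Tx ring per possible CPU on top of the regular "combined" Tx
rings, and that the usable limit is the 64 queues mentioned above. The
helper name and the constant are made up for illustration and are not
driver symbols.

```c
#include <stdbool.h>

/* Assumed limit: 64 usable Tx queues in base RSS mode, per the reading of
 * the datasheet above. Hypothetical name, not taken from ixgbe. */
#define ASSUMED_TX_QUEUE_LIMIT 64u

/* Returns true when the regular Tx rings plus the XDP Tx rings stay below
 * the assumed limit, i.e. combined + xdp queues < max_tx_queues as in the
 * workaround described in this thread. */
bool tx_queues_fit(unsigned int combined, unsigned int xdp_rings)
{
	return combined + xdp_rings < ASSUMED_TX_QUEUE_LIMIT;
}
```

With the numbers from this report (48 CPUs, so 48 XDP rings under the
assumption above), tx_queues_fit(48, 48) is false because 96 is not below
64, while tx_queues_fit(14, 48) is true because 62 is, which lines up
with the hang going away at combined == 14.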

One possible solution we may want to look at would be to make use of
the 32 pool/VF mode in the MTQC register. That should enable us to
make use of all 128 queues but I am sure there would be other side
effects such as having to set the bits in the PFVFTE register in order
to enable the extra Tx queues.

Thanks.

- Alex

^ permalink raw reply

* Re: serdev: How to attach serdev devices to USB based tty devices?
From: Rob Herring @ 2018-08-21 18:01 UTC (permalink / raw)
  To: mailinglists
  Cc: Andreas Färber, open list:SERIAL DRIVERS, Linux USB List,
	Linux-MIPS, Xue Liu, Ben Whitten, devicetree, netdev, oneukum,
	Alexander Graf, LoRa_Community_Support, 潘建宏,
	rehm, moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE
In-Reply-To: <b00d8330-dab4-e444-e02c-dee6b54abc81@kunz-im-inter.net>

On Tue, Aug 21, 2018 at 11:33 AM Frank Kunz
<mailinglists@kunz-im-inter.net> wrote:
>
> Am 14.08.2018 um 04:28 schrieb Andreas Färber:
> > Hi Rob et al.,
> >
> > For my LoRa network driver project [1] I have found your serdev
> > framework to be a valuable help for dealing with hardware modules
> > exposing some textual or binary UART interface.
> >
> > In particular on arm(64) and mips this allows to define an unlimited
> > number of serdev drivers [2] that are associated via their Device Tree
> > compatible string and can optionally be configured via DT properties.
> >
> > And in theory it seems serdev has also grown support for ACPI.
> >
> > Now, a growing number of vendors are placing such modules on a USB stick
> > for easy evaluation on x86_64 PC hardware, or are designing mPCIe or M.2
> > cards using their USB pins. While I do not yet have access to such a
> > device myself, it is my understanding that devices with USB-UART bridge
> > chipsets (e.g., FTDI) will show up as /dev/ttyUSBx and devices with an
> > MCU implementing the CDC USB protocol (e.g., Pico-cell gateway = picoGW)
> > will show up as /dev/ttyACMx.
> > On the Raspberry Pi I've seen that Device Tree nodes can be used to pass
> > information to on-board devices such as MAC address to Ethernet chipset,
> > but that does not seem all that useful for passing a serdev child node
> > to hot-plugged devices at unpredictable hub/port location (where it
> > should not interfere with regular USB-UART cables for debugging), nor
> > would it help ACPI based platforms such as x86_64.
> >
> > My idea then was that if we had some unique criteria like vendor and
> > product IDs (or whatever is supported in usb_device_id), we could write
> > a usb_driver with suitable USB_DEVICE*() macro. In its probe function we
> > could call into the existing tty driver's probe function and afterwards
> > try creating and attaching the appropriate serdev device, i.e. a fixed
> > USB-to-serdev driver mapping. Problem is that most devices don't seem to
> > implement any unique identifier I could make this depend on - either by
> > using a standard FT232/FT2232/CH340G chip or by using STMicroelectronics
> > virtual com port identifiers in CDC firmware and only differing in the
> > textual description [3] the usb_device_id does not seem to match on.
> >
> > The obvious solution would of course be if hardware vendors could revise
> > their designs to configure FTDI/etc. chips uniquely. I hear that that
> > may involve exchanging the chipset, increasing costs, and may impact
> > existing drivers. Wouldn't help for devices out there today either.
>
> They need to put an extra eeprom (cents) into their design and program it.
>
> >
> > For the picoGW CDC firmware, Semtech does appear to own a USB vendor ID,
> > so it would seem possible to allocate their own product IDs for SX1301
> > and SX1308 respectively to replace the generic STMicroelectronics IDs,
> > which the various vendors could offer as firmware updates.
> >
> > All outside my control though.
> >
> > Oliver therefore suggested to not mess with USB drivers and instead use
> > a line discipline (ldisc). It seems that for example the userspace tool
> > slattach takes a tty device and performs an ioctl to switch the generic
> > tty device into a special N_SLIP protocol mode, implemented in [4].
> >
> > However, the existing number of such ldisc modes appears to be below 30,
> > with hardly any vendor-specific implementation, so polluting its number
> > space seems undesirable? And in some cases I would like to use the same
> > protocol implementation over direct UART and over USB, so would like to
> > avoid duplicate serdev_device_driver and tty_ldisc_ops implementations.
> >
> > Long story short, has there been any thinking about a userspace
> > interface to attach a given serdev driver to a tty device?
> >
> > Or is there, on OF_DYNAMIC platforms, a way from userspace to associate
> > a DT fragment (!= DT Overlay) with a given USB device dynamically, to
> > attach a serdev node with sub-nodes?
> >
> > Any other ideas how to cleanly solve this?
> >
> > In some cases we're talking about a "simple" AT-like command interface;
> > the picoGW implements a semi-generic USB-SPI bridge that may host a
> > choice of 2+ chipsets, which in turn has two further sub-devices with 3+
> > chipset choices (theoretically clk output and rx/tx options etc.) each.
> > (For the latter I'm thinking we'll need a serdev driver exposing a
> > regmap_bus and then implement regmap_bus based versions of the SPI
> > drivers like Ben and I refactored SX1257 in [2] last weekend.)
>
> There is an mPCIe module (RAK833) available from RAK Wireless that uses
> an FT2232 as a USB-SPI bridge, not UART. I have one here for experiments.
> It is detected as a generic FT2232 device on USB. As far as I understand
> so far, serdev only supports UART-based communication; is there a
> chance to get USB-SPI bridged modules working as well?

That should be somewhat easier than a UART because there's not the
interactions with the tty layer to deal with. You still have the issue
of what is the DT root for the FTDI device.

Rob

^ permalink raw reply

* Re: Experimental fix for MSI-X issue on r8169
From: Steve Dodd @ 2018-08-21 17:57 UTC (permalink / raw)
  To: Jian-Hong Pan; +Cc: Heiner Kallweit, Lou Reed, netdev@vger.kernel.org
In-Reply-To: <CAPpJ_ecsMOL23VYM2juZ9R8JLrSh1bjCet16XCSpv0mDaSYu6w@mail.gmail.com>

On 20 August 2018 at 04:47, Jian-Hong Pan <jian-hong@endlessm.com> wrote:

> There is no "MSIX address lost, re-configuring" in dmesg.
> The ethernet interface is still down after resume.

Sorry, only just seen this thread. I can confirm Jian-Hong's report --
this patch doesn't help (applied to 4.18.3); no message is output.
Thanks for the investigatory effort though!

Steve

^ permalink raw reply

* [PATCHv2 iproute2 2/3] tc: remove extern from prototype declarations
From: Mahesh Bandewar @ 2018-08-21 17:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Mahesh Bandewar

From: Mahesh Bandewar <maheshb@google.com>

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
---
 tc/m_ematch.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tc/m_ematch.h b/tc/m_ematch.h
index f634f19164fa..80b02cfad6cc 100644
--- a/tc/m_ematch.h
+++ b/tc/m_ematch.h
@@ -20,7 +20,7 @@ struct bstr
 	struct bstr	*next;
 };
 
-extern struct bstr * bstr_alloc(const char *text);
+struct bstr * bstr_alloc(const char *text);
 
 static inline struct bstr * bstr_new(char *data, unsigned int len)
 {
@@ -51,8 +51,8 @@ static inline struct bstr *bstr_next(struct bstr *b)
 	return b->next;
 }
 
-extern unsigned long bstrtoul(const struct bstr *b);
-extern void bstr_print(FILE *fd, const struct bstr *b, int ascii);
+unsigned long bstrtoul(const struct bstr *b);
+void bstr_print(FILE *fd, const struct bstr *b, int ascii);
 
 
 struct ematch
@@ -79,7 +79,7 @@ static inline struct ematch * new_ematch(struct bstr *args, int inverted)
 	return e;
 }
 
-extern void print_ematch_tree(const struct ematch *tree);
+void print_ematch_tree(const struct ematch *tree);
 
 
 struct ematch_util
@@ -107,9 +107,9 @@ static inline int parse_layer(struct bstr *b)
 		return INT_MAX;
 }
 
-extern int em_parse_error(int err, struct bstr *args, struct bstr *carg,
+int em_parse_error(int err, struct bstr *args, struct bstr *carg,
 		   struct ematch_util *, char *fmt, ...);
-extern int print_ematch(FILE *, const struct rtattr *);
-extern int parse_ematch(int *, char ***, int, struct nlmsghdr *);
+int print_ematch(FILE *, const struct rtattr *);
+int parse_ematch(int *, char ***, int, struct nlmsghdr *);
 
 #endif
-- 
2.18.0.865.gffc8e1a3cd6-goog

^ permalink raw reply related

* [PATCHv2 iproute2 1/3] ipmaddr: use preferred_family when given
From: Mahesh Bandewar @ 2018-08-21 17:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Mahesh Bandewar

From: Mahesh Bandewar <maheshb@google.com>

When creating socket() AF_INET is used irrespective of the family
that is given at the command-line (with -4, -6, or -0). This change
will open the socket with the preferred family.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
---
 ip/ipmaddr.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/ip/ipmaddr.c b/ip/ipmaddr.c
index a48499029e17..abf83784d0df 100644
--- a/ip/ipmaddr.c
+++ b/ip/ipmaddr.c
@@ -289,6 +289,7 @@ static int multiaddr_list(int argc, char **argv)
 static int multiaddr_modify(int cmd, int argc, char **argv)
 {
 	struct ifreq ifr = {};
+	int family;
 	int fd;
 
 	if (cmd == RTM_NEWADDR)
@@ -324,7 +325,17 @@ static int multiaddr_modify(int cmd, int argc, char **argv)
 		exit(-1);
 	}
 
-	fd = socket(AF_INET, SOCK_DGRAM, 0);
+	switch (preferred_family) {
+	case AF_INET6:
+	case AF_PACKET:
+	case AF_INET:
+		family = preferred_family;
+		break;
+	default:
+		family = AF_INET;
+	}
+
+	fd = socket(family, SOCK_DGRAM, 0);
 	if (fd < 0) {
 		perror("Cannot create socket");
 		exit(1);
-- 
2.18.0.865.gffc8e1a3cd6-goog

^ permalink raw reply related

* [PATCHv2 iproute2 0/3] clang + misc changes
From: Mahesh Bandewar @ 2018-08-21 17:48 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Mahesh Bandewar

From: Mahesh Bandewar <maheshb@google.com>

The primary theme is to make clang compile the iproute2 package without
warnings. Along with this there are two other misc patches in the series.

First patch uses the preferred_family when operating with maddr feature.
Prior to this patch, it would always open an AF_INET socket irrespective
of the family that is preferred via command-line. 

Second patch just removes extern from the prototype declarations from
the m_ematch.h header file.

Third patch mostly adds format attributes to make the clang compiler
happy so that it does not emit warnings.

Mahesh Bandewar (3):
  ipmaddr: use preferred_family when given
  tc: remove extern from prototype declarations
  iproute: make clang happy with iproute2 package

 include/json_writer.h |  3 +--
 ip/iplink_can.c       | 19 ++++++++++++-------
 ip/ipmaddr.c          | 13 ++++++++++++-
 lib/color.c           |  1 +
 lib/json_print.c      |  1 +
 lib/json_writer.c     | 15 +--------------
 misc/ss.c             |  3 ++-
 tc/m_ematch.c         |  1 +
 tc/m_ematch.h         | 15 ++++++++-------
 9 files changed, 39 insertions(+), 32 deletions(-)

-- 
2.18.0.865.gffc8e1a3cd6-goog

^ permalink raw reply

* [PATCH net] hv_netvsc: ignore devices that are not PCI
From: Stephen Hemminger @ 2018-08-21 17:40 UTC (permalink / raw)
  To: kys, davem; +Cc: netdev, Stephen Hemminger

Registering another device with same MAC address (such as TAP, VPN or
DPDK KNI) will confuse the VF autobinding logic.  Restrict the search
to only run if the device is known to be a PCI attached VF.

Fixes: e8ff40d4bff1 ("hv_netvsc: improve VF device matching")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
 drivers/net/hyperv/netvsc_drv.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 507f68190cb1..1121a1ec407c 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -29,6 +29,7 @@
 #include <linux/netdevice.h>
 #include <linux/inetdevice.h>
 #include <linux/etherdevice.h>
+#include <linux/pci.h>
 #include <linux/skbuff.h>
 #include <linux/if_vlan.h>
 #include <linux/in.h>
@@ -2039,12 +2040,16 @@ static int netvsc_register_vf(struct net_device *vf_netdev)
 {
 	struct net_device *ndev;
 	struct net_device_context *net_device_ctx;
+	struct device *pdev = vf_netdev->dev.parent;
 	struct netvsc_device *netvsc_dev;
 	int ret;
 
 	if (vf_netdev->addr_len != ETH_ALEN)
 		return NOTIFY_DONE;
 
+	if (!pdev || !dev_is_pci(pdev) || dev_is_pf(pdev))
+		return NOTIFY_DONE;
+
 	/*
 	 * We will use the MAC address to locate the synthetic interface to
 	 * associate with the VF interface. If we don't find a matching
-- 
2.18.0

^ permalink raw reply related

* Re: [PATCH] r8169: don't use MSI-X on RTL8106e
From: Heiner Kallweit @ 2018-08-21 20:54 UTC (permalink / raw)
  To: Marc Zyngier, Bjorn Helgaas, jian-hong
  Cc: David Miller, nic_swsd, netdev, linux-kernel, linux, linux-pci,
	Thomas Gleixner, Christoph Hellwig
In-Reply-To: <02c08346-3901-9c39-a837-f04e283794d5@arm.com>

On 21.08.2018 10:28, Marc Zyngier wrote:
> On 20/08/18 19:44, Bjorn Helgaas wrote:
>> [+cc Marc, Thomas, Christoph, linux-pci]
>> (beginning of thread at [1])
>>
>> On Thu, Aug 16, 2018 at 09:50:48PM +0200, Heiner Kallweit wrote:
>>> On 16.08.2018 21:39, David Miller wrote:
>>>> From: Heiner Kallweit <hkallweit1@gmail.com>
>>>> Date: Thu, 16 Aug 2018 21:37:31 +0200
>>>>
>>>>> On 16.08.2018 21:21, David Miller wrote:
>>>>>> From: <jian-hong@endlessm.com>
>>>>>> Date: Wed, 15 Aug 2018 14:21:10 +0800
>>>>>>
>>>>>>> Found the ethernet network on ASUS X441UAR doesn't come back on resume
>>>>>>> from suspend when using MSI-X.  The chip is RTL8106e - version 39.
>>>>>>
>>>>>> Heiner, please take a look at this.
>>>>>>
>>>>>> You recently disabled MSI-X on RTL8168g for similar reasons.
>>>>>>
>>>>>> Now that we've seen two chips like this, maybe there is some other
>>>>>> problem afoot.
>>>>>>
>>>>> Thanks for the hint. I saw it already and just contacted Realtek
>>>>> whether they are aware of any MSI-X issues with particular chip
>>>>> versions. With the chip versions I have access to MSI-X works fine.
>>>>>
>>>>> There's also the theoretical option that the issues are caused by
>>>>> broken BIOS's. But so far only chip versions have been reported
>>>>> which are very similar, at least with regard to version number
>>>>> (2x VER_40, 1x VER_39). So they may share some buggy component.
>>>>>
>>>>> Let's see whether Realtek can provide some hint.
>>>>> If more chip versions are reported having problems with MSI-X,
>>>>> then we could switch to a whitelist or disable MSI-X in general.
>>>>
>>>> It could be that we need to reprogram some register(s) on resume,
>>>> which normally might not be needed, and that is what is causing the
>>>> problem with some chips.
>>>>
>>> Indeed. That's what I'm checking with Realtek.
>>> In the register list in the r8169 driver there's one entry which
>>> seems to indicate that there are MSI-X specific settings.
>>> However this register isn't used, and the r8168 vendor driver
>>> uses only MSI. And there are no public datasheets.
>>
>> Do we have any information about these chip versions in other systems?
>> Or other devices using MSI-X in the same ASUS system?  It seems
>> possible that there's some PCI core or suspend/resume issue with MSI-X
>> and this patch just avoids it without fixing the root cause.
>>
>> It might be useful to have a kernel.org bugzilla with the complete
>> dmesg, "sudo lspci -vv" output, and /proc/interrupts contents archived
>> for future reference.
> 
> The one system I have with a Realtek chip seems happy enough with MSI-X,
> but it never gets suspended.

Other owners of affected chip versions had the same experience: MSI-X
works fine until resume from suspend.

> There is comment in the patch that I don't quite get:
> 
>> It is the IRQ 127 - PCI-MSI used by enp2s0.  However, lspci lists MSI is
>> disabled and MSI-X is enabled which conflicts to the interrupt table.
> 
> What do you mean by "conflicts"? With what? Another question is whether
> you've loaded any firmware (some versions of the Realtek HW seem to require
> it).
> 
These "conflicts" were a misunderstanding which was clarified with the
reporter. The irq chip name "PCI-MSI" in the /proc/interrupts output was
taken to mean that an MSI irq is in use rather than an MSI-X irq.

The firmware is for the PHY only, that's at least my experience on
the chip versions I have for testing.

> For the posterity, some data from my own system, which I don't know if it
> has any relevance to the problem at hand.
> 
> Thanks,
> 
> 	M.
> 
> [    2.624963] r8169 0000:02:00.0 eth0: RTL8168g/8111g, 5a:fe:ad:ce:11:00, XID 4c000800, IRQ 26
> [    2.633398] r8169 0000:02:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
> 
>  26:         50     997005          0          0       MSI 1048576 Edge      enp2s0
> 
> 02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
> 	Subsystem: Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 0, Cache Line Size: 64 bytes
> 	Interrupt: pin A routed to IRQ 25
> 	Region 0: I/O ports at 1000 [size=256]
> 	Region 2: Memory at 100004000 (64-bit, prefetchable) [size=4K]
> 	Region 4: Memory at 100000000 (64-bit, prefetchable) [size=16K]
> 	Capabilities: [40] Power Management version 3
> 		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
> 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> 	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
> 		Address: 0000000000000000  Data: 0000
> 	Capabilities: [70] Express (v2) Endpoint, MSI 01
> 		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
> 			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
> 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
> 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
> 			MaxPayload 128 bytes, MaxReadReq 4096 bytes
> 		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
> 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
> 			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
> 		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
> 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> 		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> 		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via message/WAKE#
> 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> 		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance De-emphasis: -6dB
> 		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
> 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> 	Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
> 		Vector table: BAR=4 offset=00000000
> 		PBA: BAR=4 offset=00000800
> 	Capabilities: [d0] Vital Product Data
> pcilib: sysfs_read_vpd: read failed: Input/output error
> 		Not readable
> 	Capabilities: [100 v1] Advanced Error Reporting
> 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> 		CESta:	RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
> 		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
> 	Capabilities: [140 v1] Virtual Channel
> 		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
> 		Arb:	Fixed- WRR32- WRR64- WRR128-
> 		Ctrl:	ArbSelect=Fixed
> 		Status:	InProgress-
> 		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> 			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> 			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
> 			Status:	NegoPending- InProgress-
> 	Capabilities: [160 v1] Device Serial Number 00-00-00-00-00-00-00-00
> 	Capabilities: [170 v1] Latency Tolerance Reporting
> 		Max snoop latency: 0ns
> 		Max no snoop latency: 0ns
> 	Kernel driver in use: r8169
> 
> 

^ permalink raw reply

* Re: [PATCH] r8169: don't use MSI-X on RTL8106e
From: Heiner Kallweit @ 2018-08-21 20:48 UTC (permalink / raw)
  To: David Miller
  Cc: helgaas, jian-hong, nic_swsd, netdev, linux-kernel, linux,
	linux-pci, marc.zyngier, tglx, hch
In-Reply-To: <20180821.123108.89921430801253333.davem@davemloft.net>

On 21.08.2018 21:31, David Miller wrote:
> From: Heiner Kallweit <hkallweit1@gmail.com>
> Date: Mon, 20 Aug 2018 22:46:48 +0200
> 
>> I'm in contact with Realtek and according to them few chip versions
>> seem to clear MSI-X table entries on resume from suspend. Checking
>> with them how this could be fixed / worked around.
>> Worst case we may have to disable MSI-X in general.
> 
> I worry that if the chip does this, and somehow MSI-X is enabled and
> an interrupt is generated, the chip will write to the cleared out
> MSI-X address.  This will either write garbage into memory or cause
> a bus error and require PCI error recovery.
> 
> It also looks like your test patch doesn't fix things for people who
> have tested it.
> 
The test patch was based on the first info from Realtek, which made me
think that the base address of the MSI-X table is cleared, which
obviously is not the case.

After some further tests it seems that the solution isn't as simple
as storing the MSI-X table entries on suspend and restoring them on
resume. On my system (where MSI-X works fine) the MSI-X table entries
on resume are partially different from the ones on suspend.
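
To illustrate the comparison described above, here is a minimal
user-space sketch, not driver code. It assumes the table has already been
read into memory before suspend and after resume; the struct and function
names are hypothetical. The 16-byte entry layout (address low/high, data,
vector control) is the one defined by the PCI specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One MSI-X table entry: four 32-bit words, 16 bytes in total. */
struct msix_entry_snapshot {
	uint32_t addr_lo;	/* Message Address, lower 32 bits */
	uint32_t addr_hi;	/* Message Upper Address */
	uint32_t data;		/* Message Data */
	uint32_t vector_ctrl;	/* Vector Control (bit 0 = mask) */
};

/*
 * Count the entries whose contents differ between the snapshot taken
 * before suspend and the one taken after resume.
 */
static size_t msix_count_changed(const struct msix_entry_snapshot *before,
				 const struct msix_entry_snapshot *after,
				 size_t n)
{
	size_t i, changed = 0;

	for (i = 0; i < n; i++)
		if (memcmp(&before[i], &after[i], sizeof(before[i])) != 0)
			changed++;

	return changed;
}
```

A non-zero result on a system where MSI-X keeps working would match the
observation that a plain save/restore of the table is not sufficient.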

Unfortunately I don't have affected test hardware, currently I'm
waiting for further feedback from Realtek.

> Hmmm...
> 

^ permalink raw reply

* Re: ixgbe hangs when XDP_TX is enabled
From: Nikita V. Shirokov @ 2018-08-21 16:58 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, jeffrey.t.kirsher
In-Reply-To: <CAKgT0UcNsFcNNUycTqZ59b5=dX4V=Fk5mVUQ8pOYT_nz194rqQ@mail.gmail.com>

On Tue, Aug 21, 2018 at 08:58:15AM -0700, Alexander Duyck wrote:
> On Mon, Aug 20, 2018 at 12:32 PM Nikita V. Shirokov <tehnerd@tehnerd.com> wrote:
> >
> > we are getting such errors:
> >
> > [  408.737313] ixgbe 0000:03:00.0 eth0: Detected Tx Unit Hang (XDP)
> >                  Tx Queue             <46>
> >                  TDH, TDT             <0>, <2>
> >                  next_to_use          <2>
> >                  next_to_clean        <0>
> >                tx_buffer_info[next_to_clean]
> >                  time_stamp           <0>
> >                  jiffies              <1000197c0>
> > [  408.804438] ixgbe 0000:03:00.0 eth0: tx hang 1 detected on queue 46, resetting adapter
> > [  408.804440] ixgbe 0000:03:00.0 eth0: initiating reset due to tx timeout
> > [  408.817679] ixgbe 0000:03:00.0 eth0: Reset adapter
> > [  408.866091] ixgbe 0000:03:00.0 eth0: TXDCTL.ENABLE for one or more queues not cleared within the polling period
> > [  409.345289] ixgbe 0000:03:00.0 eth0: detected SFP+: 3
> > [  409.497232] ixgbe 0000:03:00.0 eth0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
> >
> > while running XDP prog on ixgbe nic.
> > right now i'm seeing this on bpf-next kernel
> > (latest commit from Wed Aug 15 15:04:25 2018 -0700 ;
> > 9a76aba02a37718242d7cdc294f0a3901928aa57)
> >
> > looks like this is the same issue as reported by Brenden in
> > https://www.spinics.net/lists/netdev/msg439438.html
> >
> > --
> > Nikita V. Shirokov
> 
> Could you provide some additional information about your setup.
> Specifically useful would be "ethtool -i", "ethtool -l", and lspci
> -vvv info for your device. The total number of CPUs on the system
> would be useful to know as well. In addition could you try
> reproducing
sure:

ethtool -l eth0
Channel parameters for eth0:
Pre-set maximums:
RX:             0
TX:             0
Other:          1
Combined:       63
Current hardware settings:
RX:             0
TX:             0
Other:          1
Combined:       48

# ethtool -i eth0
driver: ixgbe
version: 5.1.0-k
firmware-version: 0x800006f1
expansion-rom-version:
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes


# nproc
48

lspci:

03:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
        Subsystem: Intel Corporation Device 000d
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 30
        NUMA node: 0
        Region 0: Memory at c7d00000 (64-bit, non-prefetchable) [size=1M]
        Region 2: I/O ports at 6000 [size=32]
        Region 4: Memory at c7e80000 (64-bit, non-prefetchable) [size=16K]
        Expansion ROM at c7e00000 [disabled] [size=512K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
                Vector table: BAR=4 offset=00000000
                PBA: BAR=4 offset=00002000
        Capabilities: [a0] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend+
                LnkCap: Port #2, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s unlimited, L1 <8us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [140 v1] Device Serial Number 90-e2-ba-ff-ff-b6-b2-60
        Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 128, stride: 2, Device ID: 10ed
                Supported Page Size: 00000553, System Page Size: 00000001
                Region 0: Memory at 00000000c7c00000 (64-bit, prefetchable)
                Region 3: Memory at 00000000c7b00000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Kernel driver in use: ixgbe




The workaround for now is the same one Brenden used in his original
report: make sure that combined + xdp queues < max_tx_queues
(e.g. with combined == 14 the issue goes away).
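
The condition above can be written down as a small check. This is only a
sketch with a hypothetical helper name. It assumes one XDP Tx queue per
CPU (48 here, from nproc) and treats the "Combined" pre-set maximum of 63
from the ethtool -l output as max_tx_queues; neither assumption has been
verified against the driver.

```c
#include <stdbool.h>

/*
 * Return true when the regular (combined) queues plus the XDP Tx queues
 * stay below the Tx queue limit, i.e. the workaround condition holds.
 */
static bool xdp_tx_queues_fit(unsigned int combined,
			      unsigned int xdp_queues,
			      unsigned int max_tx_queues)
{
	return combined + xdp_queues < max_tx_queues;
}
```

Under those assumptions, combined == 48 gives 48 + 48 = 96, which fails
the check, while combined == 14 gives 14 + 48 = 62, which passes.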

> the issue with one of the sample XDP programs provided with the kernel
> such as the xdp2 which I believe uses the XDP_TX function. We need to
> try and create a similar setup in our own environment for
> reproduction and debugging.

will try but this could take a while, because i'm not sure that we have
ixgbe in our test lab (and it would be hard to run such test in prod)

> 
> Thanks.
> 
> - Alex

^ permalink raw reply

* Re: [PATCH] libbpf: Remove the duplicate checking of function storage
From: Daniel Borkmann @ 2018-08-21 20:11 UTC (permalink / raw)
  To: Jakub Kicinski, Taeung Song; +Cc: Alexei Starovoitov, Linux Netdev List, LKML
In-Reply-To: <CAJpBn1zT1EnyKnzmoEO_4WwjR1qgY94wcQjHdKUPAa5z+5OvXw@mail.gmail.com>

On 08/21/2018 06:46 PM, Jakub Kicinski wrote:
> On Tue, Aug 21, 2018 at 6:12 PM, Taeung Song <treeze.taeung@gmail.com> wrote:
>> After the commit eac7d84519a3 ("tools: libbpf: don't return '.text'
>> as a program for multi-function programs"), bpf_program__next()
>> in bpf_object__for_each_program skips the function storage such as .text,
>> so eliminate the duplicate checking.
>>
>> Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
>> Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
> 
> Looks reasonable, but you may need to repost once bpf-next is open:
> 
> https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt
> 
> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Agree, please resubmit once bpf-next opens up again. Thanks!

^ permalink raw reply

* Re: [PATCH] libbpf: Remove the duplicate checking of function storage
From: Jakub Kicinski @ 2018-08-21 16:46 UTC (permalink / raw)
  To: Taeung Song; +Cc: Daniel Borkmann, Alexei Starovoitov, Linux Netdev List, LKML
In-Reply-To: <20180821161258.19718-1-treeze.taeung@gmail.com>

On Tue, Aug 21, 2018 at 6:12 PM, Taeung Song <treeze.taeung@gmail.com> wrote:
> After the commit eac7d84519a3 ("tools: libbpf: don't return '.text'
> as a program for multi-function programs"), bpf_program__next()
> in bpf_object__for_each_program skips the function storage such as .text,
> so eliminate the duplicate checking.
>
> Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
> Signed-off-by: Taeung Song <treeze.taeung@gmail.com>

Looks reasonable, but you may need to repost once bpf-next is open:

https://www.kernel.org/doc/Documentation/networking/netdev-FAQ.txt

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>

> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 2abd0f112627..8476da7f2720 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -2336,7 +2336,7 @@ int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr,
>                 bpf_program__set_expected_attach_type(prog,
>                                                       expected_attach_type);
>
> -               if (!bpf_program__is_function_storage(prog, obj) && !first_prog)
> +               if (!first_prog)
>                         first_prog = prog;
>         }
>
> --
> 2.17.1
>

^ permalink raw reply

* Re: [PATCH] rds: tcp: remove duplicated include from tcp.c
From: Santosh Shilimkar @ 2018-08-21 16:45 UTC (permalink / raw)
  To: Yue Haibing, David S. Miller
  Cc: netdev, linux-rdma, rds-devel, kernel-janitors
In-Reply-To: <1534860342-171157-1-git-send-email-yuehaibing@huawei.com>

On 8/21/2018 7:05 AM, Yue Haibing wrote:
> Remove duplicated include.
> 
> Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
> ---
Looks fine.

Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>

^ permalink raw reply

* Re: [PATCH] selftests: net: move fragment forwarding/config up a level
From: Shuah Khan @ 2018-08-21 19:41 UTC (permalink / raw)
  To: Ido Schimmel, Anders Roxell
  Cc: davem, linux-kernel, netdev, linux-kselftest, Shuah Khan
In-Reply-To: <20180821185625.GB27078@splinter>

On 08/21/2018 12:56 PM, Ido Schimmel wrote:
> On Tue, Aug 21, 2018 at 06:12:12PM +0200, Anders Roxell wrote:
>> 'make kselftest-merge' assumes that the config files for the tests are
>> located under the 'main' test dir, like tools/testing/selftests/net/ and
>> not in a subdir to net.
> 
> The tests under tools/testing/selftests/net/forwarding/ aren't executed
> as part of the Makefile. The config file is there mainly so that people
> will know which config options they need in order to run the tests.
> 
> The tests can be added to the Makefile, but some of them take a few
> minutes to complete which is probably against "Don't take too long;"
> mentioned in Documentation/dev-tools/kselftest.rst.
> 

I don't see any reason why these shouldn't be added. With the number of
tests that get run by default, time has gone up. The goal is to run more
tests not less. There are some stress/destructive tests that continue to
be left out of the Makefile.

thanks,
-- Shuah

^ permalink raw reply

* Re: [PATCH iproute2] iproute: make clang happy
From: Mahesh Bandewar (महेश बंडेवार) @ 2018-08-21 16:19 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Mahesh Bandewar, netdev
In-Reply-To: <20180820175232.6a0877fe@xeon-e3>

On Mon, Aug 20, 2018 at 5:52 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Mon, 20 Aug 2018 16:44:28 -0700
> Mahesh Bandewar (महेश बंडेवार) <maheshb@google.com> wrote:
>
>> On Mon, Aug 20, 2018 at 4:38 PM, Mahesh Bandewar (महेश बंडेवार)
>> <maheshb@google.com> wrote:
>> > On Mon, Aug 20, 2018 at 3:52 PM, Stephen Hemminger
>> > <stephen@networkplumber.org> wrote:
>> >> On Mon, 20 Aug 2018 14:42:15 -0700
>> >> Mahesh Bandewar <mahesh@bandewar.net> wrote:
>> >>
>> >>> diff --git a/tc/m_ematch.c b/tc/m_ematch.c
>> >>> index ace4b3dd738b..a524b520b276 100644
>> >>> --- a/tc/m_ematch.c
>> >>> +++ b/tc/m_ematch.c
>> >>> @@ -277,6 +277,7 @@ static int flatten_tree(struct ematch *head, struct ematch *tree)
>> >>>       return count;
>> >>>  }
>> >>>
>> >>> +__attribute__((format(printf, 5, 6)))
>> >>>  int em_parse_error(int err, struct bstr *args, struct bstr *carg,
>> >>>                  struct ematch_util *e, char *fmt, ...)
>> >>
>> >> I think the printf attribute needs to go on the function prototype
>> >> here:
>> >> tc/m_ematch.h:extern int em_parse_error(int err, struct bstr *args, struct bstr *carg,
>> >>
>> > The attributes are attached to the definitions only and not prototype
>> > declarations. Please see the definition/declaration for jsonw_printf()
>> > in the same patch.
>> I take that back. Seems like it's fine either way.
>
> The reason to put the attributes in the .h file is that then the compiler
> can test for misuse in other files.  For example if em_parse_error had
> a bad format in em_u32.c, then the warning would not happen unless
> the attribute was on the function prototype.
>
correct, will take care in v2
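
For reference, a minimal sketch of the pattern discussed above: the
format attribute sits on the prototype (which would normally live in the
.h file), so every file that includes the header gets its call sites
checked. The function report_error() is a made-up stand-in, not the
actual em_parse_error() from tc/m_ematch.h.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Prototype as it would appear in a header. Argument 2 is the format
 * string and the variadic arguments start at position 3.
 */
__attribute__((format(printf, 2, 3)))
int report_error(int err, const char *fmt, ...);

/* Definition: print the formatted message to stderr, hand back err. */
int report_error(int err, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);

	return err;
}
```

With the attribute visible to callers, a call such as
report_error(-1, "%s", 42) in any other file should produce a -Wformat
warning from gcc or clang.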

^ permalink raw reply

* Re: [PATCH] strparser: remove any offset before parsing messages
From: Dominique Martinet @ 2018-08-21 19:36 UTC (permalink / raw)
  To: Doron Roberts-Kedes
  Cc: Tom Herbert, Dave Watson, David S. Miller, netdev, linux-kernel
In-Reply-To: <20180821145321.GA44710@doronrk-mbp>

Doron Roberts-Kedes wrote on Tue, Aug 21, 2018:
> There are a few issues with this patch. First, it seems like you're
> trying to fix bugs in users of strparser by changing an implementation
> detail of strparser.

Yes, that's why I have been saying since the original discussion that I
do not like this fix. But as I said in the other thread and in v0 of this
patch, I do not know how to tell the bpf function to start at an offset
in the skb, e.g. in kcm_parse_func_strparser.

I could add the pull in that function, but that feels a bit wrong on a
separation level to me.

> Second, this implementation change can add malloc's and copies where
> there were none before.

Yes I agree this is more than suboptimal for tls, I've also said that.


> If strparser users do not handle non-zero offset properly, then that
> doesn't motivate changing the implementation of strparser to copy
> around data to accommodate those buggy users.
>
> Why not submit a patch that handles offset properly in the code you
> pointed out? 

One of the solutions I had suggested was adding a flag at strparser
setup time to only do that pull for users which cannot handle offset,
but nobody seemed interested two weeks ago. I can still do that.

That's still suboptimal, but I don't have any better idea.
To properly fix the users, I'd really need help with how bpf works to
even know if passing an offset would be possible in the first place, as
I do not see how at this time.
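
To make "handling the offset" concrete, here is a user-space analogy
only. It is not the strparser or bpf API; the function name and the
2-byte big-endian length header are assumptions for illustration. The
point is that the parser reads its header relative to an offset it is
given rather than from the start of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read a 2-byte big-endian length prefix that starts at 'offset' inside
 * 'buf' and return the full message length (header + payload).
 * Return -1 when the header is not completely inside the buffer.
 */
static long parse_msg_len(const uint8_t *buf, size_t buf_len, size_t offset)
{
	if (offset > buf_len || buf_len - offset < 2)
		return -1;

	return 2 + (((long)buf[offset] << 8) | buf[offset + 1]);
}
```

A parser that ignores the offset would read the header from byte 0 and
return a length derived from whatever data happens to precede the
message.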


Thanks,
-- 
Dominique Martinet

^ permalink raw reply

