Netdev List


* Re: [iproute PATCH v3] Make colored output configurable
From: Phil Sutter @ 2018-08-16  9:34 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, David Ahern, Till Maas
In-Reply-To: <20180816081950.5851-1-phil@nwl.cc>

On Thu, Aug 16, 2018 at 10:19:50AM +0200, Phil Sutter wrote:
> Allow for -color={never,auto,always} to have colored output disabled,
> enabled only if stdout is a terminal or enabled regardless of stdout
> state.
> 
> Signed-off-by: Phil Sutter <phil@nwl.cc>

Argh, please ignore this one - it doesn't even compile. It's just a
minor typo, but I'll respin.

Sorry, Phil


* Re: bnxt: card intermittently hanging and dropping link
From: Daniel Axtens @ 2018-08-16  9:09 UTC (permalink / raw)
  To: Michael Chan; +Cc: Netdev, jay.vosburgh
In-Reply-To: <CACKFLinW2xZCqOQSNVDt-SXU0XB0NRhoemdMbRGLtXSaxjLDuw@mail.gmail.com>

Hi Michael,

> The main issue is the TX timeout.
> .....
>
>> [ 2682.911693] bnxt_en 0000:3b:00.0 eth4: TX timeout detected, starting reset task!
>> [ 2683.782496] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
>> [ 2683.783061] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
>> [ 2684.634557] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
>> [ 2684.635120] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
>
> and it is not recovering.
>
> Please provide ethtool -i eth4 which will show the firmware version on
> the NIC.  Let's see if the firmware is too old.

driver: bnxt_en
version: 1.8.0
firmware-version: 20.6.151.0/pkg 20.06.05.11
expansion-rom-version: 
bus-info: 0000:3b:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no
supports-priv-flags: no

Thanks!

Regards,
Daniel


* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: maowenan @ 2018-08-16 12:05 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: dwmw2, gregkh, netdev, eric.dumazet, edumazet, davem, ycheng, jdw,
	stable, Takashi Iwai
In-Reply-To: <20180816113903.2stwrlm6frgmk2vs@unicorn.suse.cz>



On 2018/8/16 19:39, Michal Kubecek wrote:
> On Thu, Aug 16, 2018 at 03:55:16PM +0800, maowenan wrote:
>> On 2018/8/16 15:44, Michal Kubecek wrote:
>>> On Thu, Aug 16, 2018 at 03:39:14PM +0800, maowenan wrote:
>>>>
>>>>
>>>> On 2018/8/16 15:23, Michal Kubecek wrote:
>>>>> On Thu, Aug 16, 2018 at 03:19:12PM +0800, maowenan wrote:
>>>>>> On 2018/8/16 14:52, Michal Kubecek wrote:
>>>>>>>
>>>>>>> My point is that backporting all this into stable 4.4 is quite intrusive
>>>>>>> so that if we can achieve similar results with a simple fix of an
>>>>>>> obvious omission, it would be preferable.
>>>>>>
>>>>>> There are five patches in mainline to fix this CVE; backporting only
>>>>>> two of them has no effect on stable 4.4, mainly because 4.4 uses a
>>>>>> simple queue while mainline uses an RB tree.
>>>>>>
>>>>>> I have tried my best to fix this the easy way, by dropping packets
>>>>>> (12.5% or some other value) on the simple queue, but the result is
>>>>>> not very good, so the RB tree is needed, and with it the test results
>>>>>> are what I want.
>>>>>>
>>>>>> If we only backport two patches and they don't fix the issue, I think
>>>>>> they don't make any sense.
>>>>>
>>>>> There is an obvious omission in one of the two patches and Takashi's
>>>>> patch fixes it. If his follow-up fix (applied on top of what is in
>>>>> stable 4.4 now) addresses the problem, I would certainly prefer using it
>>>>> over backporting the whole series.
>>>>
>>>> Do you mean that the code below from Takashi can fix this CVE?
>>>> I already tested exactly that two days ago, and the effect was not good.
>>>
>>> IIRC what you proposed was different, you proposed to replace the "=" in
>>> the other branch by "+=".
>>
>> No, I think you misunderstood me. I have already tested stable 4.4 at
>> commit dc6ae4d with the same change as Takashi's, without any of the
>> code I sent in this patch series.
> 
> I suspect you may be doing something wrong with your tests. I checked
> the segmentsmack testcase and the CPU utilization on receiving side
> (with sending 10 times as many packets as default) went down from ~100%
> to ~3% even when comparing what is in stable 4.4 now against older 4.4
> kernel.

No obvious problem shows up when you send packets with the default
parameters of the SegmentSmack POC; the result also depends heavily on your
server's hardware configuration. Please try the parameters below to generate
OFO packets. You will then see high CPU usage, and perf top shows
tcp_data_queue taking about 55.6% of CPU.
If the dst port is 22, you will see sshd at about 95%.

int main(int argc, char **argv)
{
  // Adjust dst_mac, src_mac and dst_ip to match source and target!
  // Adjust dst_port to match the target, needs to be an open port!
  char dst_mac[6] = {0xb8,0x27,0xeb,0x54,0x23,0x4a};
  char src_mac[6] = {0x08,0x00,0x27,0xbc,0x91,0x93};
  uint32_t dst_ip = (192<<24)|(168<<16)|(1<<8)|225;
  uint32_t src_ip = 0;
  uint16_t dst_port = 22;   //attack existed ssh link
  uint16_t src_port = 0;

  ......

    for (j = 0; j < 102400*10; j++)    //10240->102400
    {
      for (i = 0; i < 1024; i++)      // 128->1024
      {
        tcp_set_ack_on(only_tcp[i]);
        tcp_set_src_port(only_tcp[i], src_port);
        tcp_set_dst_port(only_tcp[i], dst_port);
        tcp_set_seq_number(only_tcp[i], isn+2+2*(rand()%16384));
        //tcp_set_seq_number(only_tcp[i], isn+2);
        tcp_set_ack_number(only_tcp[i], other_isn+1);
        tcp_set_data_offset(only_tcp[i], 20);
        tcp_set_window(only_tcp[i], 65535);
        tcp_set_cksum_calc(ip, 20, only_tcp[i], sizeof(only_tcp[i]));
      }
      ret = ldp_out_inject_chunk(outq, pkt_tbl_chunk, 1024);  //128->1024
      printf("sent %d packets\n", ret);
      ldp_out_txsync(outq);
      usleep(10*1000); // Adjust this and packet count to match the target!, sleep 100ms->10ms
    }
  ......

> 
> This is actually not surprising. The testcase only sends 1-byte segments
> starting at even offsets so that receiver never gets two adjacent
> segments and the "range_truesize != head->truesize" condition would
> never be satisfied, whether you fix the backport or not.
> 
> Where the missing "range_truesize += skb->truesize" makes a difference
> is in the case when there are some adjacent out of order segments, e.g.
> if you omitted 1:10 and then sent 10:11, 11:12, 12:13, 13:14, ...
> Then the missing update of range_truesize would prevent collapsing the
> queue until the total length of the range would exceed the value of
> SKB_WITH_OVERHEAD(SK_MEM_QUANTUM) (i.e. a bit less than 4 KB).
> 
> Michal Kubecek
> 
> 
> .
> 
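
For reference, the range_truesize argument quoted above can be modelled in
user space. This is only a simplified sketch, not kernel code: struct seg,
count_collapsed_ranges() and MIN_COLLAPSE_RANGE are hypothetical names, the
threshold is a placeholder for SKB_WITH_OVERHEAD(SK_MEM_QUANTUM), and plain
integer comparisons replace the wraparound-safe before()/after() helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for SKB_WITH_OVERHEAD(SK_MEM_QUANTUM); the real value
 * depends on architecture and config ("a bit less than 4 KB"). */
#define MIN_COLLAPSE_RANGE 3776u

/* Minimal model of one out-of-order segment (not the kernel's sk_buff). */
struct seg {
	uint32_t seq;		/* first sequence number */
	uint32_t end_seq;	/* one past the last sequence number */
	unsigned int truesize;	/* memory charged for this segment */
};

/*
 * Count how many ranges of adjacent/overlapping segments would be handed
 * to the collapse step. 'segs' must be sorted by seq and contain no
 * sequence wraparound. A non-zero 'accumulate' models the
 * "range_truesize += skb->truesize" line; zero models its omission.
 */
static unsigned int count_collapsed_ranges(const struct seg *segs, size_t n,
					   int accumulate)
{
	unsigned int collapsed = 0;
	unsigned int range_truesize;
	uint32_t start, end;
	size_t head = 0, i;

	assert(segs && n > 0);
	start = segs[0].seq;
	end = segs[0].end_seq;
	range_truesize = segs[0].truesize;

	for (i = 1; i <= n; i++) {
		/* A range ends at a gap or at the end of the queue. */
		if (i == n || segs[i].seq > end || segs[i].end_seq < start) {
			/* Skip ranges that consist of one tiny segment. */
			if (range_truesize != segs[head].truesize ||
			    end - start >= MIN_COLLAPSE_RANGE)
				collapsed++;
			if (i == n)
				break;
			head = i;
			start = segs[i].seq;
			end = segs[i].end_seq;
			range_truesize = segs[i].truesize;
		} else {
			if (accumulate)
				range_truesize += segs[i].truesize;
			if (segs[i].seq < start)
				start = segs[i].seq;
			if (segs[i].end_seq > end)
				end = segs[i].end_seq;
		}
	}
	return collapsed;
}
```

In this model, four adjacent 1-byte segments (10:11, 11:12, 12:13, 13:14)
form one range that is collapsed only when the truesize is accumulated,
while 1-byte segments at even offsets never form a multi-segment range, so
the result is the same with or without the extra line.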


* [PATCH v3] net/mlx5e: Delete unneeded function argument
From: Yuval Shaia @ 2018-08-16  9:02 UTC (permalink / raw)
  To: saeedm, leon, davem, netdev, linux-rdma; +Cc: Yuval Shaia

priv argument is not used by the function, delete it.

Fixes: a89842811ea98 ("net/mlx5e: Merge per priority stats groups")
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
---
v1 -> v2:
	* Remove blank line as pointed by Leon.

v2 -> v3:
	* Change prefix to mlx5e
---
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 1646859974ce..4c4d779dafa8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -813,7 +813,7 @@ static const struct counter_desc pport_per_prio_traffic_stats_desc[] = {
 
 #define NUM_PPORT_PER_PRIO_TRAFFIC_COUNTERS	ARRAY_SIZE(pport_per_prio_traffic_stats_desc)
 
-static int mlx5e_grp_per_prio_traffic_get_num_stats(struct mlx5e_priv *priv)
+static int mlx5e_grp_per_prio_traffic_get_num_stats(void)
 {
 	return NUM_PPORT_PER_PRIO_TRAFFIC_COUNTERS * NUM_PPORT_PRIO;
 }
@@ -971,7 +971,7 @@ static int mlx5e_grp_per_prio_pfc_fill_stats(struct mlx5e_priv *priv,
 
 static int mlx5e_grp_per_prio_get_num_stats(struct mlx5e_priv *priv)
 {
-	return mlx5e_grp_per_prio_traffic_get_num_stats(priv) +
+	return mlx5e_grp_per_prio_traffic_get_num_stats() +
 		mlx5e_grp_per_prio_pfc_get_num_stats(priv);
 }
 
-- 
2.17.1


* Re: [PATCH v2] net/mlx5: Delete unneeded function argument
From: Yuval Shaia @ 2018-08-16  9:00 UTC (permalink / raw)
  To: Gal Pressman; +Cc: saeedm, leon, davem, netdev, linux-rdma
In-Reply-To: <ebbc0eff-eacf-fd30-bb03-bd7ef9e6c3c0@gmail.com>

On Thu, Aug 16, 2018 at 09:41:34AM +0300, Gal Pressman wrote:
> On 15-Aug-18 18:08, Yuval Shaia wrote:
> > priv argument is not used by the function, delete it.
> > 
> > Fixes: a89842811ea98 ("net/mlx5e: Merge per priority stats groups")
> > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> 
> nit: prefix should be net/mlx5e.

1. Will do.
2. Why? (asking since dir name is mlx5)

> 
> > ---
> > v1 -> v2:
> > 	* Remove blank line as pointed by Leon.
> > ---
> >  drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
> > index 1646859974ce..4c4d779dafa8 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
> > @@ -813,7 +813,7 @@ static const struct counter_desc pport_per_prio_traffic_stats_desc[] = {
> >  
> >  #define NUM_PPORT_PER_PRIO_TRAFFIC_COUNTERS	ARRAY_SIZE(pport_per_prio_traffic_stats_desc)
> >  
> > -static int mlx5e_grp_per_prio_traffic_get_num_stats(struct mlx5e_priv *priv)
> > +static int mlx5e_grp_per_prio_traffic_get_num_stats(void)
> >  {
> >  	return NUM_PPORT_PER_PRIO_TRAFFIC_COUNTERS * NUM_PPORT_PRIO;
> >  }
> > @@ -971,7 +971,7 @@ static int mlx5e_grp_per_prio_pfc_fill_stats(struct mlx5e_priv *priv,
> >  
> >  static int mlx5e_grp_per_prio_get_num_stats(struct mlx5e_priv *priv)
> >  {
> > -	return mlx5e_grp_per_prio_traffic_get_num_stats(priv) +
> > +	return mlx5e_grp_per_prio_traffic_get_num_stats() +
> >  		mlx5e_grp_per_prio_pfc_get_num_stats(priv);
> >  }
> >  
> > 
> 


* [iproute PATCH v3] Make colored output configurable
From: Phil Sutter @ 2018-08-16  8:19 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, David Ahern, Till Maas

Allow for -color={never,auto,always} to have colored output disabled,
enabled only if stdout is a terminal or enabled regardless of stdout
state.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
Changes since v1:
- Allow to override isatty() check by specifying '-color' flag more than
  once.
- Document new behaviour in man pages.

Changes since v2:
- Implement new -color=foo syntax.
- Update commit message and man page texts accordingly.
---
 bridge/bridge.c   |  3 +--
 include/color.h   |  7 +++++++
 ip/ip.c           |  3 +--
 lib/color.c       | 33 ++++++++++++++++++++++++++++++++-
 man/man8/bridge.8 | 13 +++++++++++--
 man/man8/ip.8     | 13 +++++++++++--
 man/man8/tc.8     | 13 +++++++++++--
 tc/tc.c           |  3 +--
 8 files changed, 75 insertions(+), 13 deletions(-)

diff --git a/bridge/bridge.c b/bridge/bridge.c
index 451d684e0bcfd..e35e5bdf7fb30 100644
--- a/bridge/bridge.c
+++ b/bridge/bridge.c
@@ -173,8 +173,7 @@ main(int argc, char **argv)
 			NEXT_ARG();
 			if (netns_switch(argv[1]))
 				exit(-1);
-		} else if (matches(opt, "-color") == 0) {
-			++color;
+		} else if (matches_color(opt, &color) == 0) {
 		} else if (matches(opt, "-compressvlans") == 0) {
 			++compress_vlans;
 		} else if (matches(opt, "-force") == 0) {
diff --git a/include/color.h b/include/color.h
index 4f2c918db7e43..42038dc2e7f87 100644
--- a/include/color.h
+++ b/include/color.h
@@ -12,8 +12,15 @@ enum color_attr {
 	COLOR_NONE
 };
 
+enum color_opt {
+	COLOR_OPT_NEVER = 0,
+	COLOR_OPT_AUTO = 1,
+	COLOR_OPT_ALWAYS = 2
+};
+
 void enable_color(void);
 int check_enable_color(int color, int json);
+int matches_color(const char *arg, int *val);
 void set_color_palette(void);
 int color_fprintf(FILE *fp, enum color_attr attr, const char *fmt, ...);
 enum color_attr ifa_family_color(__u8 ifa_family);
diff --git a/ip/ip.c b/ip/ip.c
index 38eac5ec1e17d..893c3c43ef99a 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -283,8 +283,7 @@ int main(int argc, char **argv)
 				exit(-1);
 			}
 			rcvbuf = size;
-		} else if (matches(opt, "-color") == 0) {
-			++color;
+		} else if (matches_color(opt, &color) == 0) {
 		} else if (matches(opt, "-help") == 0) {
 			usage();
 		} else if (matches(opt, "-netns") == 0) {
diff --git a/lib/color.c b/lib/color.c
index edf96e5c6ecd7..3ad1d6d647722 100644
--- a/lib/color.c
+++ b/lib/color.c
@@ -3,11 +3,13 @@
 #include <stdarg.h>
 #include <stdlib.h>
 #include <string.h>
+#include <unistd.h>
 #include <sys/socket.h>
 #include <sys/types.h>
 #include <linux/if.h>
 
 #include "color.h"
+#include "utils.h"
 
 enum color {
 	C_RED,
@@ -79,13 +81,42 @@ void enable_color(void)
 
 int check_enable_color(int color, int json)
 {
-	if (color && !json) {
+	if (json || color == COLOR_OPT_NEVER)
+		return 1;
+
+	if (color == COLOR_OPT_ALWAYS || isatty(fileno(stdout))) {
 		enable_color();
 		return 0;
 	}
 	return 1;
 }
 
+int matches_color(const char *arg, int *val)
+{
+	char *dup, *p;
+
+	if (!val)
+		return 1;
+
+	dup = strdupa(arg);
+	p = strchrnul(dup, '=');
+	if (*p)
+		*(p++) = '\0';
+
+	if (matches(dup, "-color"))
+		return 1;
+
+	if (*p == '\0' || !strcmp(p, "always"))
+		*val = COLOR_OPT_ALWAYS;
+	else if (!strcmp(p, "auto"))
+		*val = COLOR_OPT_AUTO;
+	else if (!strcmp(p, "never"))
+		*val = COLOR_OPT_NEVER;
+	else
+		return 1;
+	return 0;
+}
+
 void set_color_palette(void)
 {
 	char *p = getenv("COLORFGBG");
diff --git a/man/man8/bridge.8 b/man/man8/bridge.8
index f6d228c5ebfe7..6dfd4178a19cc 100644
--- a/man/man8/bridge.8
+++ b/man/man8/bridge.8
@@ -171,8 +171,17 @@ If there were any errors during execution of the commands, the application
 return code will be non zero.
 
 .TP
-.BR "\-c" , " -color"
-Use color output.
+.BR \-c [ color ][ = { always | auto | never }
+Configure color output. If parameter is omitted or
+.BR always ,
+color output is enabled regardless of stdout state. If parameter is
+.BR auto ,
+stdout is checked to be a terminal before enabling color output. If parameter is
+.BR never ,
+color output is disabled. If specified multiple times, the last one takes
+precedence. This flag is ignored if
+.B \-json
+is also given.
 
 .TP
 .BR "\-j", " \-json"
diff --git a/man/man8/ip.8 b/man/man8/ip.8
index 0087d18b74706..1d358879ec39c 100644
--- a/man/man8/ip.8
+++ b/man/man8/ip.8
@@ -187,8 +187,17 @@ to
 executes specified command over all objects, it depends if command supports this option.
 
 .TP
-.BR "\-c" , " -color"
-Use color output.
+.BR \-c [ color ][ = { always | auto | never }
+Configure color output. If parameter is omitted or
+.BR always ,
+color output is enabled regardless of stdout state. If parameter is
+.BR auto ,
+stdout is checked to be a terminal before enabling color output. If parameter is
+.BR never ,
+color output is disabled. If specified multiple times, the last one takes
+precedence. This flag is ignored if
+.B \-json
+is also given.
 
 .TP
 .BR "\-t" , " \-timestamp"
diff --git a/man/man8/tc.8 b/man/man8/tc.8
index fd33f9b2f1806..f98398a34bd3e 100644
--- a/man/man8/tc.8
+++ b/man/man8/tc.8
@@ -755,8 +755,17 @@ option was specified. Classes can be filtered only by
 option.
 
 .TP
-.BR "\ -color"
-Use color output.
+.BR \-c [ color ][ = { always | auto | never }
+Configure color output. If parameter is omitted or
+.BR always ,
+color output is enabled regardless of stdout state. If parameter is
+.BR auto ,
+stdout is checked to be a terminal before enabling color output. If parameter is
+.BR never ,
+color output is disabled. If specified multiple times, the last one takes
+precedence. This flag is ignored if
+.B \-json
+is also given.
 
 .TP
 .BR "\-j", " \-json"
diff --git a/tc/tc.c b/tc/tc.c
index 4c7a128c8103e..dc052d26dc8b3 100644
--- a/tc/tc.c
+++ b/tc/tc.c
@@ -496,8 +496,7 @@ int main(int argc, char **argv)
 				matches(argv[1], "-conf") == 0) {
 			NEXT_ARG();
 			conf_file = argv[1];
-		} else if (matches(argv[1], "-color") == 0) {
-			++color;
+		} else if (matches_color(opt, &color) == 0) {
 		} else if (matches(argv[1], "-timestamp") == 0) {
 			timestamp++;
 		} else if (matches(argv[1], "-tshort") == 0) {
-- 
2.18.0
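
A user-space sketch of the option semantics described in the commit message
may help when reviewing. It is not the code from the patch:
parse_color_arg() and color_enabled() are hypothetical names, the prefix
check is a stand-in for iproute2's matches(), strchr() replaces the GNU
strdupa()/strchrnul() pair, and color_enabled() returns 1 for "color on",
which is the opposite of check_enable_color()'s return convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum color_opt {
	COLOR_OPT_NEVER = 0,
	COLOR_OPT_AUTO = 1,
	COLOR_OPT_ALWAYS = 2
};

/*
 * Parse "-c[olor][=always|auto|never]". Returns 0 and stores the mode in
 * *val on success, 1 if arg is not a color option or the mode is unknown.
 * Unlike the real matches(), this sketch rejects an empty option name.
 */
static int parse_color_arg(const char *arg, enum color_opt *val)
{
	static const char name[] = "-color";
	const char *eq, *param;
	size_t optlen;

	assert(arg && val);
	eq = strchr(arg, '=');
	optlen = eq ? (size_t)(eq - arg) : strlen(arg);
	param = eq ? eq + 1 : "";

	/* Accept any non-empty abbreviation of "-color", e.g. "-c". */
	if (optlen == 0 || optlen > strlen(name) ||
	    memcmp(arg, name, optlen) != 0)
		return 1;

	if (*param == '\0' || strcmp(param, "always") == 0)
		*val = COLOR_OPT_ALWAYS;
	else if (strcmp(param, "auto") == 0)
		*val = COLOR_OPT_AUTO;
	else if (strcmp(param, "never") == 0)
		*val = COLOR_OPT_NEVER;
	else
		return 1;
	return 0;
}

/* Decide whether to color: JSON output and "never" always disable it,
 * "always" forces it, "auto" follows whether stdout is a terminal. */
static int color_enabled(enum color_opt opt, int json, int stdout_is_tty)
{
	if (json || opt == COLOR_OPT_NEVER)
		return 0;
	return opt == COLOR_OPT_ALWAYS || stdout_is_tty;
}
```

Under these assumptions, "-color=auto" piped into a pager yields no color
because stdout is not a terminal, while a bare "-c" keeps the old
always-on behaviour.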


* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: Michal Kubecek @ 2018-08-16  7:44 UTC (permalink / raw)
  To: maowenan
  Cc: dwmw2, gregkh, netdev, eric.dumazet, edumazet, davem, ycheng, jdw,
	stable, Takashi Iwai
In-Reply-To: <827465d9-4116-68a8-c699-d5b2aedce69f@huawei.com>

On Thu, Aug 16, 2018 at 03:39:14PM +0800, maowenan wrote:
> 
> 
> On 2018/8/16 15:23, Michal Kubecek wrote:
> > On Thu, Aug 16, 2018 at 03:19:12PM +0800, maowenan wrote:
> >> On 2018/8/16 14:52, Michal Kubecek wrote:
> >>>
> >>> My point is that backporting all this into stable 4.4 is quite intrusive
> >>> so that if we can achieve similar results with a simple fix of an
> >>> obvious omission, it would be preferable.
> >>
> >> There are five patches in mainline to fix this CVE; backporting only
> >> two of them has no effect on stable 4.4, mainly because 4.4 uses a
> >> simple queue while mainline uses an RB tree.
> >>
> >> I have tried my best to fix this the easy way, by dropping packets
> >> (12.5% or some other value) on the simple queue, but the result is
> >> not very good, so the RB tree is needed, and with it the test results
> >> are what I want.
> >>
> >> If we only backport two patches and they don't fix the issue, I think
> >> they don't make any sense.
> > 
> > There is an obvious omission in one of the two patches and Takashi's
> > patch fixes it. If his follow-up fix (applied on top of what is in
> > stable 4.4 now) addresses the problem, I would certainly prefer using it
> > over backporting the whole series.
> 
> Do you mean that the code below from Takashi can fix this CVE?
> I already tested exactly that two days ago, and the effect was not good.

IIRC what you proposed was different, you proposed to replace the "=" in
the other branch by "+=".

Michal Kubecek


> 
> Could you test with the POC program mentioned in my previous mail, in case I made a mistake?
> 
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4a261e078082..9c4c6cd0316e 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4835,6 +4835,7 @@ static void tcp_collapse_ofo_queue(struct sock *sk)
>  			end = TCP_SKB_CB(skb)->end_seq;
>  			range_truesize = skb->truesize;
>  		} else {
> +			range_truesize += skb->truesize;
>  			if (before(TCP_SKB_CB(skb)->seq, start))
>  				start = TCP_SKB_CB(skb)->seq;
>  			if (after(TCP_SKB_CB(skb)->end_seq, end))
> -- 
> 
> 
> > 
> > Michal Kubecek
> > 
> > 
> > .
> > 
> 


* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: Michal Kubecek @ 2018-08-16  7:23 UTC (permalink / raw)
  To: maowenan
  Cc: dwmw2, gregkh, netdev, eric.dumazet, edumazet, davem, ycheng, jdw,
	stable, Takashi Iwai
In-Reply-To: <a2f09efe-411c-4fce-af87-329841ee3fbe@huawei.com>

On Thu, Aug 16, 2018 at 03:19:12PM +0800, maowenan wrote:
> On 2018/8/16 14:52, Michal Kubecek wrote:
> > 
> > My point is that backporting all this into stable 4.4 is quite intrusive
> > so that if we can achieve similar results with a simple fix of an
> > obvious omission, it would be preferable.
> 
> There are five patches in mainline to fix this CVE; backporting only
> two of them has no effect on stable 4.4, mainly because 4.4 uses a
> simple queue while mainline uses an RB tree.
> 
> I have tried my best to fix this the easy way, by dropping packets
> (12.5% or some other value) on the simple queue, but the result is
> not very good, so the RB tree is needed, and with it the test results
> are what I want.
> 
> If we only backport two patches and they don't fix the issue, I think
> they don't make any sense.

There is an obvious omission in one of the two patches and Takashi's
patch fixes it. If his follow-up fix (applied on top of what is in
stable 4.4 now) addresses the problem, I would certainly prefer using it
over backporting the whole series.

Michal Kubecek


* Re: [PATCH v2] net/mlx5: Delete unneeded function argument
From: Gal Pressman @ 2018-08-16  6:41 UTC (permalink / raw)
  To: Yuval Shaia, saeedm, leon, davem, netdev, linux-rdma
In-Reply-To: <20180815150854.30993-1-yuval.shaia@oracle.com>

On 15-Aug-18 18:08, Yuval Shaia wrote:
> priv argument is not used by the function, delete it.
> 
> Fixes: a89842811ea98 ("net/mlx5e: Merge per priority stats groups")
> Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>

nit: prefix should be net/mlx5e.

> ---
> v1 -> v2:
> 	* Remove blank line as pointed by Leon.
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
> index 1646859974ce..4c4d779dafa8 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
> @@ -813,7 +813,7 @@ static const struct counter_desc pport_per_prio_traffic_stats_desc[] = {
>  
>  #define NUM_PPORT_PER_PRIO_TRAFFIC_COUNTERS	ARRAY_SIZE(pport_per_prio_traffic_stats_desc)
>  
> -static int mlx5e_grp_per_prio_traffic_get_num_stats(struct mlx5e_priv *priv)
> +static int mlx5e_grp_per_prio_traffic_get_num_stats(void)
>  {
>  	return NUM_PPORT_PER_PRIO_TRAFFIC_COUNTERS * NUM_PPORT_PRIO;
>  }
> @@ -971,7 +971,7 @@ static int mlx5e_grp_per_prio_pfc_fill_stats(struct mlx5e_priv *priv,
>  
>  static int mlx5e_grp_per_prio_get_num_stats(struct mlx5e_priv *priv)
>  {
> -	return mlx5e_grp_per_prio_traffic_get_num_stats(priv) +
> +	return mlx5e_grp_per_prio_traffic_get_num_stats() +
>  		mlx5e_grp_per_prio_pfc_get_num_stats(priv);
>  }
>  
> 


* [PATCH] net: phy: add tracepoints
From: Tobias Waldekranz @ 2018-08-16  9:34 UTC (permalink / raw)
  To: tobias
  Cc: Andrew Lunn, Florian Fainelli, David S. Miller, Steven Rostedt,
	Ingo Molnar, open list, open list:ETHERNET PHY LIBRARY

Two tracepoints for now:

   * `phy_interrupt` Pretty self-explanatory.

   * `phy_state_change` Whenever the PHY's state machine is run, trace
     the old and the new state.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
---
 drivers/net/phy/phy.c      |  4 +++
 include/trace/events/phy.h | 68 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+)
 create mode 100644 include/trace/events/phy.h

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 9aabfa1a455a..8d22926f3962 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -38,6 +38,9 @@
 
 #include <asm/irq.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/phy.h>
+
 #define PHY_STATE_STR(_state)			\
 	case PHY_##_state:			\
 		return __stringify(_state);	\
@@ -1039,6 +1042,7 @@ void phy_state_machine(struct work_struct *work)
 	if (err < 0)
 		phy_error(phydev);
 
+	trace_phy_state_change(phydev, old_state);
 	if (old_state != phydev->state)
 		phydev_dbg(phydev, "PHY state change %s -> %s\n",
 			   phy_state_to_str(old_state),
diff --git a/include/trace/events/phy.h b/include/trace/events/phy.h
new file mode 100644
index 000000000000..7ba6c0dda47e
--- /dev/null
+++ b/include/trace/events/phy.h
@@ -0,0 +1,68 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM phy
+
+#if !defined(_TRACE_PHY_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_PHY_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(phy_interrupt,
+	    TP_PROTO(int irq, struct phy_device *phydev),
+	    TP_ARGS(irq, phydev),
+	    TP_STRUCT__entry(
+		    __field(int, irq)
+		    __field(int, addr)
+		    __field(int, state)
+		    __array(char, ifname, IFNAMSIZ)
+		    ),
+	    TP_fast_assign(
+		    __entry->irq = irq;
+		    __entry->addr = phydev->mdio.addr;
+		    __entry->state = phydev->state;
+		    if (phydev->attached_dev)
+			    memcpy(__entry->ifname,
+				   netdev_name(phydev->attached_dev),
+				   IFNAMSIZ);
+		    else
+			    memset(__entry->ifname, 0, IFNAMSIZ);
+		    ),
+	    TP_printk("phy-%d-irq irq=%d ifname=%16s state=%d",
+		      __entry->addr,
+		      __entry->irq,
+		      __entry->ifname,
+		      __entry->state
+		    )
+	);
+
+TRACE_EVENT(phy_state_change,
+	    TP_PROTO(struct phy_device *phydev, enum phy_state old_state),
+	    TP_ARGS(phydev, old_state),
+	    TP_STRUCT__entry(
+		    __field(int, addr)
+		    __field(int, state)
+		    __field(int, old_state)
+		    __array(char, ifname, IFNAMSIZ)
+		    ),
+	    TP_fast_assign(
+		    __entry->addr = phydev->mdio.addr;
+		    __entry->state = phydev->state;
+		    __entry->old_state = old_state;
+		    if (phydev->attached_dev)
+			    memcpy(__entry->ifname,
+				   netdev_name(phydev->attached_dev),
+				   IFNAMSIZ);
+		    else
+			    memset(__entry->ifname, 0, IFNAMSIZ);
+		    ),
+	    TP_printk("phy-%d-change ifname=%16s old_state=%d state=%d",
+		      __entry->addr,
+		      __entry->ifname,
+		      __entry->old_state,
+		      __entry->state
+		    )
+	);
+
+#endif /* _TRACE_PHY_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
-- 
2.17.1


* [PATCH] net: phy: add tracepoints
From: Tobias Waldekranz @ 2018-08-16  9:30 UTC (permalink / raw)
  To: Tobias Waldekranz
  Cc: Andrew Lunn, Florian Fainelli, David S. Miller, Steven Rostedt,
	Ingo Molnar, open list, open list:ETHERNET PHY LIBRARY

Two tracepoints for now:

   * `phy_interrupt` Pretty self-explanatory.

   * `phy_state_change` Whenever the PHY's state machine is run, trace
     the old and the new state.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
---
 drivers/net/phy/phy.c      |  4 +++
 include/trace/events/phy.h | 68 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+)
 create mode 100644 include/trace/events/phy.h

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 9aabfa1a455a..8d22926f3962 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -38,6 +38,9 @@
 
 #include <asm/irq.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/phy.h>
+
 #define PHY_STATE_STR(_state)			\
 	case PHY_##_state:			\
 		return __stringify(_state);	\
@@ -1039,6 +1042,7 @@ void phy_state_machine(struct work_struct *work)
 	if (err < 0)
 		phy_error(phydev);
 
+	trace_phy_state_change(phydev, old_state);
 	if (old_state != phydev->state)
 		phydev_dbg(phydev, "PHY state change %s -> %s\n",
 			   phy_state_to_str(old_state),
diff --git a/include/trace/events/phy.h b/include/trace/events/phy.h
new file mode 100644
index 000000000000..7ba6c0dda47e
--- /dev/null
+++ b/include/trace/events/phy.h
@@ -0,0 +1,68 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM phy
+
+#if !defined(_TRACE_PHY_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_PHY_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(phy_interrupt,
+	    TP_PROTO(int irq, struct phy_device *phydev),
+	    TP_ARGS(irq, phydev),
+	    TP_STRUCT__entry(
+		    __field(int, irq)
+		    __field(int, addr)
+		    __field(int, state)
+		    __array(char, ifname, IFNAMSIZ)
+		    ),
+	    TP_fast_assign(
+		    __entry->irq = irq;
+		    __entry->addr = phydev->mdio.addr;
+		    __entry->state = phydev->state;
+		    if (phydev->attached_dev)
+			    memcpy(__entry->ifname,
+				   netdev_name(phydev->attached_dev),
+				   IFNAMSIZ);
+		    else
+			    memset(__entry->ifname, 0, IFNAMSIZ);
+		    ),
+	    TP_printk("phy-%d-irq irq=%d ifname=%16s state=%d",
+		      __entry->addr,
+		      __entry->irq,
+		      __entry->ifname,
+		      __entry->state
+		    )
+	);
+
+TRACE_EVENT(phy_state_change,
+	    TP_PROTO(struct phy_device *phydev, enum phy_state old_state),
+	    TP_ARGS(phydev, old_state),
+	    TP_STRUCT__entry(
+		    __field(int, addr)
+		    __field(int, state)
+		    __field(int, old_state)
+		    __array(char, ifname, IFNAMSIZ)
+		    ),
+	    TP_fast_assign(
+		    __entry->addr = phydev->mdio.addr;
+		    __entry->state = phydev->state;
+		    __entry->old_state = old_state;
+		    if (phydev->attached_dev)
+			    memcpy(__entry->ifname,
+				   netdev_name(phydev->attached_dev),
+				   IFNAMSIZ);
+		    else
+			    memset(__entry->ifname, 0, IFNAMSIZ);
+		    ),
+	    TP_printk("phy-%d-change ifname=%16s old_state=%d state=%d",
+		      __entry->addr,
+		      __entry->ifname,
+		      __entry->old_state,
+		      __entry->state
+		    )
+	);
+
+#endif /* _TRACE_PHY_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
-- 
2.17.1


* Re: [PATCH v1 2/3] zinc: Introduce minimal cryptography library
From: Jason A. Donenfeld @ 2018-08-16  6:31 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Eric Biggers, Linux Crypto Mailing List, LKML, Netdev,
	David Miller, Andrew Lutomirski, Greg Kroah-Hartman, Samuel Neves,
	Daniel J . Bernstein, Tanja Lange, Jean-Philippe Aumasson,
	Karthikeyan Bhargavan
In-Reply-To: <20180814211229.GB24575@gmail.com>

Hi Eric,

On Tue, Aug 14, 2018 at 2:12 PM Eric Biggers <ebiggers@kernel.org> wrote:
> On ARM Cortex-A7, OpenSSL's ChaCha20 implementation is 13.9 cpb (cycles per
> byte), whereas Linux's is faster: 11.9 cpb.
>
> The reason Linux's ChaCha20 NEON implementation is faster than OpenSSL's
>
> I understand there are tradeoffs, and different implementations can be faster on
> different CPUs.
>
> So if your proposal goes in, I'd likely need to write a patch
> to get the old performance back, at least on Cortex-A7...

Yes, absolutely. Different CPUs behave differently indeed, but if you
have improvements for hardware that matters to you, we should
certainly incorporate these, and also loop Andy Polyakov in (I've
added him to the CC for the WIP v2). ChaCha is generally pretty
obvious, but for big integer algorithms -- like Poly1305 and
Curve25519 -- I think it's all the more important to involve Andy and
the rest of the world in general, so that Linux benefits from bug
research and fuzzing in places that are typically and classically
prone to nasty issues. In other words, let's definitely incorporate
your improvements after the patchset goes in, and at the same time
we'll try to bring Andy and others into the fold, where our
improvements can generally track each other's.

> Also, I don't know whether Andy P. considered the 4xNEON implementation
> technique.  It could even be fastest on other ARM CPUs too, I don't know.

After v2, when he's CC'd in, let's plan to start discussing this with him.

Jason


* Re: bnxt: card intermittently hanging and dropping link
From: Michael Chan @ 2018-08-16  6:29 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: Netdev, jay.vosburgh
In-Reply-To: <87o9e26fqj.fsf@linkitivity.dja.id.au>

On Wed, Aug 15, 2018 at 10:29 PM, Daniel Axtens <dja@axtens.net> wrote:

> [ 2682.911295] ------------[ cut here ]------------
> [ 2682.911319] NETDEV WATCHDOG: eth4 (bnxt_en): transmit queue 0 timed out

The main issue is the TX timeout.
.....

> [ 2682.911693] bnxt_en 0000:3b:00.0 eth4: TX timeout detected, starting reset task!
> [ 2683.782496] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
> [ 2683.783061] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
> [ 2684.634557] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
> [ 2684.635120] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1

and it is not recovering.

Please provide ethtool -i eth4 which will show the firmware version on
the NIC.  Let's see if the firmware is too old.

Thanks.


* Re: [PATCH stable 4.4 1/9] Revert "tcp: detect malicious patterns in tcp_collapse_ofo_queue()"
From: Greg KH @ 2018-08-16  6:04 UTC (permalink / raw)
  To: maowenan; +Cc: dwmw2, netdev, eric.dumazet, edumazet, davem, ycheng, jdw
In-Reply-To: <865f3aa5-e763-a5fe-9ab0-eab42e0fc310@huawei.com>

On Thu, Aug 16, 2018 at 09:55:42AM +0800, maowenan wrote:
> 
> 
> On 2018/8/15 21:18, Greg KH wrote:
> > On Wed, Aug 15, 2018 at 09:21:00PM +0800, Mao Wenan wrote:
> >> This reverts commit dc6ae4dffd656811dee7151b19545e4cd839d378.
> > 
> > I need a reason why, and a signed-off-by line :(
> 
> Stable 4.4 only backported two patches to fix CVE-2018-5390. I have tested them, and they
> don't fully fix the issue because of the simple queue used in older versions, so we need to
> change the simple queue to an RB tree to finally resolve it. But 9f5afeae has many conflicts
> with "tcp: detect malicious patterns in tcp_collapse_ofo_queue()" and
> "tcp: avoid collapses in tcp_prune_queue() if possible", and there is a patch series from Eric
> in mainline to fix CVE-2018-5390, so I need to revert part of the patches in stable 4.4 first,
> then apply 9f5afeae, and then reapply the five patches from Eric.
> 9f5afeae tcp: use an RB tree for ooo receive queue

Then please put this information in the changelog text, that's what we
need to see here.

thanks,

greg k-h

^ permalink raw reply

* bnxt: card intermittently hanging and dropping link
From: Daniel Axtens @ 2018-08-16  5:29 UTC (permalink / raw)
  To: netdev, jay.vosburgh, michael.chan

Hi Michael,

I have some user reports of a Broadcom 57412 card intermittently
hanging and dropping the link.

The problem has been observed on a Dell server with an Ubuntu 4.13
kernel (bnxt_en version 1.7.0) and with an Ubuntu 4.15 kernel (bnxt_en
version 1.8.0). It seems to occur while mounting an XFS volume over
iSCSI, although running blkid on the partition succeeds, and it seems to
be a different volume each time.

I've included an excerpt from the kernel log below.

It seems that other people have reported this with a RHEL7 kernel - I
have no idea what driver version that would be running, or what workload
they were operating, but the warnings and messages look the same:
https://www.dell.com/community/PowerEdge-Hardware-General/Critical-network-bnxt-en-module-crashes-on-14G-servers/td-p/6031769/highlight/true
The forum poster reported that disconnecting the card from power for 5
minutes was sufficient to get things working again and I have asked our
user to test that.

Is this a known issue?

Regards,
Daniel

[ 2662.526151] scsi host14: iSCSI Initiator over TCP/IP
[ 2662.538110] scsi host15: iSCSI Initiator over TCP/IP
[ 2662.547350] scsi host16: iSCSI Initiator over TCP/IP
[ 2662.554660] scsi host17: iSCSI Initiator over TCP/IP
[ 2662.813860] scsi 15:0:0:1: Direct-Access     PURE     FlashArray       8888 PQ: 0 ANSI: 6
[ 2662.813888] scsi 14:0:0:1: Direct-Access     PURE     FlashArray       8888 PQ: 0 ANSI: 6
[ 2662.813972] scsi 16:0:0:1: Direct-Access     PURE     FlashArray       8888 PQ: 0 ANSI: 6
[ 2662.814322] sd 15:0:0:1: Attached scsi generic sg1 type 0
[ 2662.814553] sd 14:0:0:1: Attached scsi generic sg2 type 0
[ 2662.814554] scsi 17:0:0:1: Direct-Access     PURE     FlashArray       8888 PQ: 0 ANSI: 6
[ 2662.814612] sd 16:0:0:1: Attached scsi generic sg3 type 0
[ 2662.815081] sd 15:0:0:1: [sdb] 10737418240 512-byte logical blocks: (5.50 TB/5.00 TiB)
[ 2662.815195] sd 15:0:0:1: [sdb] Write Protect is off
[ 2662.815197] sd 15:0:0:1: [sdb] Mode Sense: 43 00 00 08
[ 2662.815229] sd 14:0:0:1: [sdc] 10737418240 512-byte logical blocks: (5.50 TB/5.00 TiB)
[ 2662.815292] sd 17:0:0:1: Attached scsi generic sg4 type 0
[ 2662.815342] sd 14:0:0:1: [sdc] Write Protect is off
[ 2662.815343] sd 14:0:0:1: [sdc] Mode Sense: 43 00 00 08
[ 2662.815419] sd 15:0:0:1: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 2662.815447] sd 16:0:0:1: [sdd] 10737418240 512-byte logical blocks: (5.50 TB/5.00 TiB)
[ 2662.815544] sd 14:0:0:1: [sdc] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 2662.815614] sd 16:0:0:1: [sdd] Write Protect is off
[ 2662.815615] sd 16:0:0:1: [sdd] Mode Sense: 43 00 00 08
[ 2662.815882] sd 16:0:0:1: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 2662.816188] sd 17:0:0:1: [sde] 10737418240 512-byte logical blocks: (5.50 TB/5.00 TiB)
[ 2662.816298] sd 17:0:0:1: [sde] Write Protect is off
[ 2662.816300] sd 17:0:0:1: [sde] Mode Sense: 43 00 00 08
[ 2662.816502] sd 17:0:0:1: [sde] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[ 2662.820080] sd 15:0:0:1: [sdb] Attached SCSI disk
[ 2662.820594] sd 14:0:0:1: [sdc] Attached SCSI disk
[ 2662.820995] sd 17:0:0:1: [sde] Attached SCSI disk
[ 2662.821176] sd 16:0:0:1: [sdd] Attached SCSI disk
[ 2662.913642] device-mapper: multipath round-robin: version 1.2.0 loaded
[ 2663.954001] XFS (dm-2): Mounting V5 Filesystem
[ 2673.186083]  connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295558209, last ping 4295559460, now 4295560768
[ 2673.186135]  connection3:0: detected conn error (1022)
[ 2673.186137]  connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295558209, last ping 4295559460, now 4295560768
[ 2673.186168]  connection2:0: detected conn error (1022)
[ 2673.186170]  connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295558209, last ping 4295559460, now 4295560768
[ 2673.186211]  connection1:0: detected conn error (1022)
[ 2674.209870]  connection4:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295558463, last ping 4295559720, now 4295561024
[ 2674.209924]  connection4:0: detected conn error (1022)
[ 2678.560630]  session1: session recovery timed out after 5 secs
[ 2678.560641]  session2: session recovery timed out after 5 secs
[ 2678.560647]  session3: session recovery timed out after 5 secs
[ 2678.951453] device-mapper: multipath: Failing path 8:32.
[ 2678.951509] device-mapper: multipath: Failing path 8:48.
[ 2678.951548] device-mapper: multipath: Failing path 8:16.
[ 2679.584302]  session4: session recovery timed out after 5 secs
[ 2679.584313] sd 17:0:0:1: rejecting I/O to offline device
[ 2679.584356] sd 17:0:0:1: [sde] killing request
[ 2679.584362] sd 17:0:0:1: rejecting I/O to offline device
[ 2679.584392] sd 17:0:0:1: [sde] killing request
[ 2679.584401] sd 17:0:0:1: [sde] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 2679.584402] sd 17:0:0:1: [sde] CDB: Read(16) 88 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00
[ 2679.584404] print_req_error: I/O error, dev sde, sector 0
[ 2679.584436] sd 17:0:0:1: [sde] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[ 2679.584438] sd 17:0:0:1: [sde] CDB: Write(16) 8a 00 00 00 00 00 80 10 7f 1e 00 00 10 00 00 00
[ 2679.584439] print_req_error: I/O error, dev sde, sector 2148564766
[ 2679.584464] device-mapper: multipath: Failing path 8:64.
[ 2679.584472] print_req_error: I/O error, dev dm-2, sector 2148564766
[ 2679.584562] XFS (dm-2): xfs_do_force_shutdown(0x1) called from line 1217 of file /build/linux-hwe-eGX82t/linux-hwe-4.15.0/fs/xfs/xfs_buf.c.  Return address = 0x00000000d95bdaac
[ 2679.584576] XFS (dm-2): I/O Error Detected. Shutting down filesystem
[ 2679.584595] XFS (dm-2): Please umount the filesystem and rectify the problem(s)
[ 2679.584613] XFS (dm-2): metadata I/O error: block 0x80107f1e ("xlog_bwrite") error 5 numblks 8192
[ 2679.584796] XFS (dm-2): failed to locate log tail
[ 2679.584798] XFS (dm-2): log mount/recovery failed: error -5
[ 2679.584910] XFS (dm-2): log mount failed
[ 2680.589182] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2680.589232] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2680.589250] Buffer I/O error on dev dm-2, logical block 1342177264, async page read
[ 2680.589584] print_req_error: I/O error, dev dm-2, sector 2
[ 2680.589611] EXT4-fs (dm-2): unable to read superblock
[ 2680.589781] print_req_error: I/O error, dev dm-2, sector 2
[ 2680.589804] EXT4-fs (dm-2): unable to read superblock
[ 2680.590019] print_req_error: I/O error, dev dm-2, sector 2
[ 2680.590078] EXT4-fs (dm-2): unable to read superblock
[ 2680.590298] print_req_error: I/O error, dev dm-2, sector 0
[ 2680.590358] SQUASHFS error: squashfs_read_data failed to read block 0x0
[ 2680.590390] squashfs: SQUASHFS error: unable to read squashfs_super_block
[ 2681.592134] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2681.592217] Buffer I/O error on dev dm-2, logical block 1342177264, async page read
[ 2681.592437] EXT4-fs (dm-2): unable to read superblock
[ 2681.592576] EXT4-fs (dm-2): unable to read superblock
[ 2681.592707] EXT4-fs (dm-2): unable to read superblock
[ 2681.592839] SQUASHFS error: squashfs_read_data failed to read block 0x0
[ 2681.592858] squashfs: SQUASHFS error: unable to read squashfs_super_block
[ 2682.661952] Buffer I/O error on dev dm-2, logical block 1342177264, async page read
[ 2682.667509] Buffer I/O error on dev dm-2, logical block 10737418112, async page read
[ 2682.667533] Buffer I/O error on dev dm-2, logical block 10737418113, async page read
[ 2682.667552] Buffer I/O error on dev dm-2, logical block 10737418114, async page read
[ 2682.667571] Buffer I/O error on dev dm-2, logical block 10737418115, async page read
[ 2682.667612] Buffer I/O error on dev dm-2, logical block 10737418116, async page read
[ 2682.667632] Buffer I/O error on dev dm-2, logical block 10737418117, async page read
[ 2682.667653] Buffer I/O error on dev dm-2, logical block 10737418118, async page read
[ 2682.911295] ------------[ cut here ]------------
[ 2682.911319] NETDEV WATCHDOG: eth4 (bnxt_en): transmit queue 0 timed out
[ 2682.911350] WARNING: CPU: 23 PID: 0 at /build/linux-hwe-eGX82t/linux-hwe-4.15.0/net/sched/sch_generic.c:323 dev_watchdog+0x222/0x230
[ 2682.911352] Modules linked in: dm_round_robin binfmt_misc cfg80211 xt_set xt_multiport iptable_raw iptable_mangle ip_set_hash_ip ip_set_hash_net ip_set ipip tunnel4 ip_tunnel veth xt_statistic xt_nat xt_recent ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_addrtype ip_vs ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_comment ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6_tables xt_mark iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack_netlink nfnetlink nf_conntrack xfrm_user xfrm_algo aufs bonding intel_rapl skx_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm dcdbas irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate
[ 2682.911430]  intel_rapl_perf mei_me lpc_ich shpchp mei ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi br_netfilter bridge stp llc dm_multipath autofs4 xfs bnx2x ptp mgag200 i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops pps_core bnxt_en mdio drm devlink libcrc32c ahci libahci
[ 2682.911482] CPU: 23 PID: 0 Comm: swapper/23 Not tainted 4.15.0-24-generic #26~16.04.1-Ubuntu
[ 2682.911483] Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 1.3.7 02/08/2018
[ 2682.911487] RIP: 0010:dev_watchdog+0x222/0x230
[ 2682.911489] RSP: 0018:ffff91eaeeec3e68 EFLAGS: 00010282
[ 2682.911492] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 2682.911493] RDX: 0000000000040400 RSI: 00000000000000f6 RDI: 0000000000000300
[ 2682.911495] RBP: ffff91eaeeec3e98 R08: 0000000000000001 R09: 000000000000071b
[ 2682.911497] R10: 0000000000000000 R11: 000000000000071b R12: 000000000000004a
[ 2682.911499] R13: ffff91aae1522000 R14: ffff91aae1522478 R15: ffff91aae066dbc0
[ 2682.911505] FS:  0000000000000000(0000) GS:ffff91eaeeec0000(0000) knlGS:0000000000000000
[ 2682.911507] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2682.911509] CR2: 00007f86f4058384 CR3: 00000033bd20a005 CR4: 00000000007606e0
[ 2682.911511] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2682.911513] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2682.911514] PKRU: 55555554
[ 2682.911515] Call Trace:
[ 2682.911518]  <IRQ>
[ 2682.911524]  ? dev_deactivate_queue.constprop.33+0x60/0x60
[ 2682.911534]  call_timer_fn+0x32/0x140
[ 2682.911539]  run_timer_softirq+0x1ed/0x440
[ 2682.911543]  ? ktime_get+0x3e/0xa0
[ 2682.911550]  ? lapic_next_deadline+0x26/0x30
[ 2682.911559]  __do_softirq+0xf2/0x288
[ 2682.911565]  irq_exit+0xb6/0xc0
[ 2682.911569]  smp_apic_timer_interrupt+0x71/0x140
[ 2682.911574]  apic_timer_interrupt+0x84/0x90
[ 2682.911575]  </IRQ>
[ 2682.911583] RIP: 0010:cpuidle_enter_state+0xa7/0x300
[ 2682.911585] RSP: 0018:ffffbac30c6efe60 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
[ 2682.911587] RAX: ffff91eaeeee2880 RBX: 0000000000000003 RCX: 000000000000001f
[ 2682.911589] RDX: 0000000000000000 RSI: 000000002f84dd9f RDI: 0000000000000000
[ 2682.911591] RBP: ffffbac30c6efe98 R08: ffff91eaeeee16a4 R09: 0000000000000018
[ 2682.911592] R10: ffffbac30c6efe30 R11: 000000000000266e R12: 0000000000000003
[ 2682.911594] R13: ffffdac2ff6c3138 R14: ffffffff85171d58 R15: 00000270a5abd299
[ 2682.911599]  cpuidle_enter+0x17/0x20
[ 2682.911606]  call_cpuidle+0x23/0x40
[ 2682.911610]  do_idle+0x197/0x200
[ 2682.911614]  cpu_startup_entry+0x73/0x80
[ 2682.911618]  start_secondary+0x1ab/0x200
[ 2682.911623]  secondary_startup_64+0xa5/0xb0
[ 2682.911626] Code: 38 00 49 63 4e e8 eb 92 4c 89 ef c6 05 76 f3 d8 00 01 e8 42 37 fd ff 89 d9 48 89 c2 4c 89 ee 48 c7 c7 78 9c d9 84 e8 1e b3 80 ff <0f> 0b eb c0 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 
[ 2682.911687] ---[ end trace a09e66b1b45bbbba ]---
[ 2682.911693] bnxt_en 0000:3b:00.0 eth4: TX timeout detected, starting reset task!
[ 2683.782496] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2683.783061] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
[ 2684.634557] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2684.635120] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
[ 2685.474679] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2685.475188] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
[ 2686.323579] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2686.324072] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
[ 2686.691126] print_req_error: 27 callbacks suppressed
[ 2686.691127] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2686.691604] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2686.692001] buffer_io_error: 10 callbacks suppressed
[ 2686.692002] Buffer I/O error on dev dm-2, logical block 1342177264, async page read
[ 2686.693881] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2686.694399] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2686.694856] Buffer I/O error on dev dm-2, logical block 10737418112, async page read
[ 2686.695298] print_req_error: I/O error, dev dm-2, sector 10737418113
[ 2686.695720] Buffer I/O error on dev dm-2, logical block 10737418113, async page read
[ 2686.696135] print_req_error: I/O error, dev dm-2, sector 10737418114
[ 2686.696545] Buffer I/O error on dev dm-2, logical block 10737418114, async page read
[ 2686.696962] print_req_error: I/O error, dev dm-2, sector 10737418115
[ 2686.697379] Buffer I/O error on dev dm-2, logical block 10737418115, async page read
[ 2686.697795] print_req_error: I/O error, dev dm-2, sector 10737418116
[ 2686.698223] Buffer I/O error on dev dm-2, logical block 10737418116, async page read
[ 2686.698637] print_req_error: I/O error, dev dm-2, sector 10737418117
[ 2686.699044] Buffer I/O error on dev dm-2, logical block 10737418117, async page read
[ 2686.699456] print_req_error: I/O error, dev dm-2, sector 10737418118
[ 2686.699869] Buffer I/O error on dev dm-2, logical block 10737418118, async page read
[ 2686.700291] Buffer I/O error on dev dm-2, logical block 10737418119, async page read
[ 2687.162049] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2687.162552] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free tx failed. rc:-1
[ 2688.001911] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2688.002335] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2688.836936] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2688.837357] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2689.670237] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2689.670617] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2690.519733] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2690.520109] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2691.142178] Buffer I/O error on dev dm-2, logical block 1342177264, async page read
[ 2691.358094] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2691.358503] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2692.210710] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2692.211091] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2693.052607] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2693.053029] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2693.907808] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2693.908233] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2694.761187] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2694.761626] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2695.609620] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2695.610055] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2696.450508] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2696.450939] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2697.293569] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2697.294007] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2698.138408] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2698.138854] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2698.992525] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2698.992979] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2699.605100] print_req_error: 12 callbacks suppressed
[ 2699.605111] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2699.605680] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2699.606212] buffer_io_error: 8 callbacks suppressed
[ 2699.606213] Buffer I/O error on dev dm-2, logical block 1342177264, async page read
[ 2699.608144] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2699.608722] print_req_error: I/O error, dev dm-2, sector 10737418112
[ 2699.609168] Buffer I/O error on dev dm-2, logical block 10737418112, async page read
[ 2699.609581] print_req_error: I/O error, dev dm-2, sector 10737418113
[ 2699.609987] Buffer I/O error on dev dm-2, logical block 10737418113, async page read
[ 2699.610422] print_req_error: I/O error, dev dm-2, sector 10737418114
[ 2699.610855] Buffer I/O error on dev dm-2, logical block 10737418114, async page read
[ 2699.611294] print_req_error: I/O error, dev dm-2, sector 10737418115
[ 2699.611727] Buffer I/O error on dev dm-2, logical block 10737418115, async page read
[ 2699.612133] print_req_error: I/O error, dev dm-2, sector 10737418116
[ 2699.612572] Buffer I/O error on dev dm-2, logical block 10737418116, async page read
[ 2699.612991] print_req_error: I/O error, dev dm-2, sector 10737418117
[ 2699.613407] Buffer I/O error on dev dm-2, logical block 10737418117, async page read
[ 2699.613836] print_req_error: I/O error, dev dm-2, sector 10737418118
[ 2699.614261] Buffer I/O error on dev dm-2, logical block 10737418118, async page read
[ 2699.614803] Buffer I/O error on dev dm-2, logical block 10737418119, async page read
[ 2699.845546] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2699.846018] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2700.688086] bnxt_en 0000:3b:00.0 eth4: Resp cmpl intr err msg: 0x51
[ 2700.688563] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free rx failed. rc:-1
[ 2701.523722] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x51 0xb6b} len:0
[ 2701.524174] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free cp failed. rc:-1
[ 2702.368502] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x51 0xb6c} len:0
[ 2702.368952] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free cp failed. rc:-1
[ 2703.222505] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x51 0xb6d} len:0
[ 2703.222945] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free cp failed. rc:-1
[ 2704.078981] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x51 0xb6e} len:0
[ 2704.079748] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free cp failed. rc:-1
[ 2704.913958] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x51 0xb6f} len:0
[ 2704.914393] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free cp failed. rc:-1
[ 2705.764961] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x51 0xb70} len:0
[ 2705.765400] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free cp failed. rc:-1
[ 2706.613411] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x51 0xb71} len:0
[ 2706.613839] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free cp failed. rc:-1
[ 2707.454748] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x51 0xb72} len:0
[ 2707.455187] bnxt_en 0000:3b:00.0 eth4: hwrm_ring_free cp failed. rc:-1
[ 2708.306362] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x61 0xb73} len:0
[ 2709.150419] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x50 0xb74} len:0
[ 2709.150880] bnxt_en 0000:3b:00.0 eth4: Invalid ring
[ 2709.151318] bnxt_en 0000:3b:00.0 eth4: hwrm ring alloc failure rc: ffffffff
[ 2709.989131] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x44 0xb75} len:0
[ 2709.989635] bnxt_en 0000:3b:00.0 eth4: hwrm vnic set tpa failure rc for vnic 0: ffffffff
[ 2710.826136] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0xb1 0xb76} len:0
[ 2710.826671] bnxt_en 0000:3b:00.0 eth4: bnxt_init_nic err: ffffffff
[ 2710.831139] bnxt_en 0000:3b:00.0 eth4: nic open fail (rc: ffffffff)
[ 2711.680497] bnxt_en 0000:3b:00.0 eth4: Error (timeout: 500) msg {0x20 0xb77} len:0
[ 2711.694743] bond1: link status definitely down for interface eth4, disabling it
[ 2711.694745] bond1: making interface eth5 the new active one
[ 2712.521953] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaa2} len:0
[ 2713.363445] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x93 0xaa3} len:0
[ 2713.364096] bnxt_en 0000:3b:00.1 eth5: HWRM cfa l2 rx mask failure rc: ffffffff
[ 2714.213610] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaa4} len:0
[ 2714.946281] device-mapper: multipath: Reinstating path 8:32.
[ 2714.946353] device-mapper: multipath: Reinstating path 8:48.
[ 2714.946409] device-mapper: multipath: Reinstating path 8:16.
[ 2715.052045] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaa5} len:0
[ 2715.892140] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaa6} len:0
[ 2715.987424] XFS (dm-2): Mounting V5 Filesystem
[ 2716.003187] XFS (dm-2): Starting recovery (logdev: internal)
[ 2716.018293] XFS (dm-2): Ending recovery (logdev: internal)
[ 2716.735471] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaa7} len:0
[ 2717.575403] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaa8} len:0
[ 2718.459295] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaa9} len:0
[ 2719.487382] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaaa} len:0
[ 2720.517072] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaab} len:0
[ 2721.525248] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaac} len:0
[ 2722.548765] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaad} len:0
[ 2723.572190] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaae} len:0
[ 2724.609428] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xaaf} len:0
[ 2724.882745]  connection4:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295571130, last ping 4295572416, now 4295573696
[ 2724.883488]  connection4:0: detected conn error (1022)
[ 2725.629310] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xab0} len:0
[ 2726.657752] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xab1} len:0
[ 2727.680069] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xab2} len:0
[ 2728.705973] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xab3} len:0
[ 2729.489387] bnxt_en 0000:3b:00.1 eth5: TX timeout detected, starting reset task!
[ 2729.726626] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x23 0xab4} len:0
...

[ 2744.875420] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2744.875775] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free tx failed. rc:-1
[ 2745.726024] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2745.726346] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free tx failed. rc:-1
[ 2746.579434] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2746.579709] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free tx failed. rc:-1
[ 2747.430915] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2747.431197] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free tx failed. rc:-1
[ 2748.285855] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2748.286132] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2749.141919] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2749.142170] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2749.986850] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2749.987190] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2750.821519] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2750.821848] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2751.657391] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2751.657709] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2752.512205] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2752.512543] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2753.352293] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2753.352613] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2754.187621] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2754.187946] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2755.022193] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2755.022524] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2755.856013] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2755.856329] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2756.694497] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2756.694840] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2757.527374] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2757.527732] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2758.363738] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2758.364115] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2759.215324] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2759.215701] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2760.061181] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2760.061552] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2760.898309] bnxt_en 0000:3b:00.1 eth5: Resp cmpl intr err msg: 0x51
[ 2760.898673] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free rx failed. rc:-1
[ 2761.732935] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x51 0xada} len:0
[ 2761.733303] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free cp failed. rc:-1
[ 2762.573293] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x51 0xadb} len:0
[ 2762.573700] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free cp failed. rc:-1
[ 2763.407002] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x51 0xadc} len:0
[ 2763.407411] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free cp failed. rc:-1
[ 2764.255762] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x51 0xadd} len:0
[ 2764.256162] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free cp failed. rc:-1
[ 2765.100554] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x51 0xade} len:0
[ 2765.100947] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free cp failed. rc:-1
[ 2765.941308] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x51 0xadf} len:0
[ 2765.941712] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free cp failed. rc:-1
[ 2766.793534] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x51 0xae0} len:0
[ 2766.794015] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free cp failed. rc:-1
[ 2767.642247] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x51 0xae1} len:0
[ 2767.642630] bnxt_en 0000:3b:00.1 eth5: hwrm_ring_free cp failed. rc:-1
[ 2768.496229] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x61 0xae2} len:0
[ 2769.356587] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x50 0xae3} len:0
[ 2769.356939] bnxt_en 0000:3b:00.1 eth5: Invalid ring
[ 2769.357277] bnxt_en 0000:3b:00.1 eth5: hwrm ring alloc failure rc: ffffffff
[ 2770.215233] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x44 0xae4} len:0
[ 2770.215619] bnxt_en 0000:3b:00.1 eth5: hwrm vnic set tpa failure rc for vnic 0: ffffffff
[ 2771.073430] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0xb1 0xae5} len:0
[ 2771.073803] bnxt_en 0000:3b:00.1 eth5: bnxt_init_nic err: ffffffff
[ 2771.076758] bnxt_en 0000:3b:00.1 eth5: nic open fail (rc: ffffffff)
[ 2771.925191] bnxt_en 0000:3b:00.1 eth5: Error (timeout: 500) msg {0x20 0xae6} len:0
[ 2771.936019] bond1: link status definitely down for interface eth5, disabling it
[ 2771.936023] bond1: now running without any active interface!
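
The sizes and block numbers in the log above can be cross-checked with a
little arithmetic. A minimal sketch follows. The helper names
device_bytes() and sector_to_block() are made up for illustration, and
the 4096-byte block size is an assumption that matches the numbers; the
log itself does not state it.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a device reported as `count` logical blocks of `lbs` bytes. */
uint64_t device_bytes(uint64_t count, uint32_t lbs)
{
	return count * lbs;
}

/* Convert a 512-byte sector number into a block index for `block_size` bytes. */
uint64_t sector_to_block(uint64_t sector, uint32_t block_size)
{
	/* Block sizes are assumed to be whole multiples of a 512-byte sector. */
	assert(block_size >= 512 && block_size % 512 == 0);
	return sector / (block_size / 512);
}
```

With these, device_bytes(10737418240, 512) is 5 * 2^40 bytes, which is
the "5.00 TiB" the sd driver prints, and sector_to_block(10737418112,
4096) is 1342177264, which matches the "logical block 1342177264" lines
that follow the errors at sector 10737418112. The later "logical block
10737418112" lines are consistent with a 512-byte block size instead.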

^ permalink raw reply

* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: maowenan @ 2018-08-16  7:55 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: dwmw2, gregkh, netdev, eric.dumazet, edumazet, davem, ycheng, jdw,
	stable, Takashi Iwai
In-Reply-To: <20180816074457.bjxbrthhvfazuzmd@unicorn.suse.cz>



On 2018/8/16 15:44, Michal Kubecek wrote:
> On Thu, Aug 16, 2018 at 03:39:14PM +0800, maowenan wrote:
>>
>>
>> On 2018/8/16 15:23, Michal Kubecek wrote:
>>> On Thu, Aug 16, 2018 at 03:19:12PM +0800, maowenan wrote:
>>>> On 2018/8/16 14:52, Michal Kubecek wrote:
>>>>>
>>>>> My point is that backporting all this into stable 4.4 is quite intrusive
>>>>> so that if we can achieve similar results with a simple fix of an
>>>>> obvious omission, it would be preferable.
>>>>
>>>> There are five patches in mainline to fix this CVE. Backporting only
>>>> two of them has no effect on stable 4.4; the main reason is that 4.4
>>>> uses a simple queue while mainline uses an RB tree.
>>>>
>>>> I tried my best to find an easy way to fix this by dropping 12.5% (or
>>>> some other share) of packets with the simple queue, but the results
>>>> were not good, so the RB tree is needed; with it, the test results
>>>> are what I want.
>>>>
>>>> If we only backport two patches and they don't fix the issue, I don't
>>>> think that makes any sense.
>>>
>>> There is an obvious omission in one of the two patches and Takashi's
>>> patch fixes it. If his follow-up fix (applied on top of what is in
>>> stable 4.4 now) addresses the problem, I would certainly prefer using it
>>> over backporting the whole series.
>>
>> Do you mean the code below from Takashi can fix this CVE?
>> I already tested exactly this two days ago, and it did not have a good effect.
> 
> IIRC what you proposed was different, you proposed to replace the "=" in
> the other branch by "+=".

No, I think you misunderstood me. I have already tested stable 4.4 at
commit dc6ae4d with the code changed as in Takashi's patch; that tree
did not contain any of the code I sent in this patch series.

I suggest that someone test again based on Takashi's patch.

dc6ae4d tcp: detect malicious patterns in tcp_collapse_ofo_queue()
5fbec48 tcp: avoid collapses in tcp_prune_queue() if possible
255924e tcp: do not delay ACK in DCTCP upon CE status change
0b1d40e tcp: do not cancel delay-AcK on DCTCP special ACK
17fea38e7 tcp: helpers to send special DCTCP ack
500e03f tcp: fix dctcp delayed ACK schedule
b04c9a0 rtnetlink: add rtnl_link_state check in rtnl_configure_link
73dad08 net/mlx4_core: Save the qpn from the input modifier in RST2INIT wrapper
48f41c0 ip: hash fragments consistently
54a634c MIPS: ath79: fix register address in ath79_ddr_wb_flush()
762b585 Linux 4.4.144


> 
> Michal Kubecek
> 
> 
>>
>> Could you try to test with POC programme mentioned previous mail in case I made mistake?
>>
>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index 4a261e078082..9c4c6cd0316e 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -4835,6 +4835,7 @@ static void tcp_collapse_ofo_queue(struct sock *sk)
>>  			end = TCP_SKB_CB(skb)->end_seq;
>>  			range_truesize = skb->truesize;
>>  		} else {
>> +			range_truesize += skb->truesize;
>>  			if (before(TCP_SKB_CB(skb)->seq, start))
>>  				start = TCP_SKB_CB(skb)->seq;
>>  			if (after(TCP_SKB_CB(skb)->end_seq, end))
>> -- 
>>
>>
>>>
>>> Michal Kubecek
>>>
>>>
>>> .
>>>
>>
> 
> .
> 


* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: maowenan @ 2018-08-16  7:39 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: dwmw2, gregkh, netdev, eric.dumazet, edumazet, davem, ycheng, jdw,
	stable, Takashi Iwai
In-Reply-To: <20180816072349.2ldy75jia3ubqg3u@unicorn.suse.cz>



On 2018/8/16 15:23, Michal Kubecek wrote:
> On Thu, Aug 16, 2018 at 03:19:12PM +0800, maowenan wrote:
>> On 2018/8/16 14:52, Michal Kubecek wrote:
>>>
>>> My point is that backporting all this into stable 4.4 is quite intrusive
>>> so that if we can achieve similar results with a simple fix of an
>>> obvious omission, it would be preferrable.
>>
>> There are five patches in mainline to fix this CVE, only two patches
>> have no effect on stable 4.4, the important reason is 4.4 use simple
>> queue but mainline use RB tree.
>>
>> I have tried my best to use easy way to fix this with dropping packets
>> 12.5%(or other value) based on simple queue, but the result is not
>> very well, so the RB tree is needed and tested result is my desire.
>>
>> If we only back port two patches but they don't fix the issue, I think
>> they don't make any sense.
> 
> There is an obvious omission in one of the two patches and Takashi's
> patch fixes it. If his follow-up fix (applied on top of what is in
> stable 4.4 now) addresses the problem, I would certainly prefer using it
> over backporting the whole series.

Do you mean that the change below from Takashi can fix this CVE?
I already tested exactly this two days ago, and it did not have a good effect.

Could you test with the PoC program mentioned in my previous mail, in case I made a mistake?

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4a261e078082..9c4c6cd0316e 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4835,6 +4835,7 @@ static void tcp_collapse_ofo_queue(struct sock *sk)
 			end = TCP_SKB_CB(skb)->end_seq;
 			range_truesize = skb->truesize;
 		} else {
+			range_truesize += skb->truesize;
 			if (before(TCP_SKB_CB(skb)->seq, start))
 				start = TCP_SKB_CB(skb)->seq;
 			if (after(TCP_SKB_CB(skb)->end_seq, end))
-- 
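
For readers following the range_truesize discussion, here is a minimal
standalone sketch of why the accumulation matters. It is a simplified
user-space model and not the kernel code: struct seg, seq_before(),
seq_after(), first_range_truesize() and the accumulate switch are
hypothetical names introduced only for illustration. Without the "+=",
the range total always equals the head segment's truesize, so a check of
the form range_truesize != head->truesize can never be true.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the out-of-order skb fields that matter here. */
struct seg {
	uint32_t seq;          /* first sequence number covered */
	uint32_t end_seq;      /* one past the last sequence number covered */
	unsigned int truesize; /* memory charged for this segment */
};

/* Wrap-safe sequence comparisons, in the style of before()/after(). */
static int seq_before(uint32_t a, uint32_t b) { return (int32_t)(a - b) < 0; }
static int seq_after(uint32_t a, uint32_t b)  { return (int32_t)(a - b) > 0; }

/*
 * Walk the queue from its head and return the truesize attributed to the
 * first run of adjacent or overlapping segments. With accumulate == 0 the
 * total is only ever set from the head (the omission under discussion);
 * with accumulate != 0 every segment in the run is added.
 */
static unsigned int first_range_truesize(const struct seg *segs, size_t n,
					 int accumulate)
{
	uint32_t start, end;
	unsigned int range_truesize;
	size_t i;

	if (n == 0)
		return 0;

	start = segs[0].seq;
	end = segs[0].end_seq;
	range_truesize = segs[0].truesize;

	for (i = 1; i < n; i++) {
		/* A gap on either side terminates the current range. */
		if (seq_after(segs[i].seq, end) ||
		    seq_before(segs[i].end_seq, start))
			break;

		if (accumulate)
			range_truesize += segs[i].truesize;
		if (seq_before(segs[i].seq, start))
			start = segs[i].seq;
		if (seq_after(segs[i].end_seq, end))
			end = segs[i].end_seq;
	}
	return range_truesize;
}
```

With three adjacent segments of equal truesize at the head of the queue,
the non-accumulating variant reports one segment's worth and the
accumulating variant reports three. Whether that alone brings the
ksoftirqd load down is exactly what the tests in this thread have to show.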


> 
> Michal Kubecek
> 
> 
> .
> 


* Re: [PATCH v1 2/3] zinc: Introduce minimal cryptography library
From: D. J. Bernstein @ 2018-08-16  4:24 UTC (permalink / raw)
  To: Eric Biggers, Jason A. Donenfeld, Eric Biggers,
	Linux Crypto Mailing List, LKML, Netdev, David Miller,
	Andrew Lutomirski, Greg Kroah-Hartman, Samuel Neves, Tanja Lange,
	Jean-Philippe Aumasson, Karthikeyan Bhargavan
In-Reply-To: <20180815195732.GA79500@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 7172 bytes --]

Eric Biggers writes:
> You'd probably attract more contributors if you followed established
> open source conventions.

SUPERCOP already has thousands of implementations from hundreds of
contributors. New speed records are more likely to appear in SUPERCOP
than in any other cryptographic software collection. The API is shared
by state-of-the-art benchmarks, state-of-the-art tests, three ongoing
competitions, and increasingly popular production libraries.

Am I correctly gathering from this thread that someone adding a new
implementation of a crypto primitive to the kernel has to worry about
checking the architecture and CPU features to figure out whether the
implementation will run? Wouldn't it make more sense to take this
error-prone work away from the implementor and have a robust automated
central testing mechanism, as in SUPERCOP?

Am I also correctly gathering that adding an extra implementation to the
kernel can hurt performance, unless the implementor goes to extra effort
to check for the CPUs where the previous implementation is faster---or
to build some ad-hoc timing mechanism ("raid6: using algorithm avx2x4
gen() 31737 MB/s")? Wouldn't it make more sense to take this error-prone
work away from the implementor and have a robust automated central
timing mechanism, as in SUPERCOP?
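
As a rough illustration of what such a central mechanism does, here is a
small sketch in C. It is neither SUPERCOP's code nor the kernel's raid6
selection logic: struct impl, pick_fastest() and the toy candidates are
hypothetical names made up for this example. The point is only that the
usability check and the timing loop live in one place, so each
implementation just registers itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical registration record for one implementation of a primitive. */
struct impl {
	const char *name;
	int (*usable)(void);                   /* nonzero if this CPU can run it */
	void (*run)(uint8_t *buf, size_t len); /* the operation being timed */
};

/*
 * Time every usable candidate on the same buffer and return the index of
 * the fastest one, or -1 if no candidate is usable on this machine.
 */
static int pick_fastest(const struct impl *impls, size_t n,
			uint8_t *buf, size_t len, int reps)
{
	int best = -1;
	clock_t best_ticks = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		clock_t t0, ticks;
		int r;

		if (!impls[i].usable())
			continue; /* never time or select unsupported code */

		t0 = clock();
		for (r = 0; r < reps; r++)
			impls[i].run(buf, len);
		ticks = clock() - t0;

		if (best < 0 || ticks < best_ticks) {
			best = (int)i;
			best_ticks = ticks;
		}
	}
	return best;
}

/* Toy candidates, only to make the sketch self-contained. */
static int always_usable(void) { return 1; }
static int never_usable(void) { return 0; }

static void xor_bytes(uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		buf[i] ^= 0x5a;
}
```

A real framework would use a cycle counter rather than clock(), repeat
the measurement to reduce noise, and run correctness tests before timing
anything; this sketch leaves all of that out.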

I also didn't notice anyone disputing Jason's comment about the "general
clunkiness" of the kernel's internal crypto API---but is there really no
consensus as to what the replacement API is supposed to be? Someone who
simply wants to implement some primitives has to decide on function-call
details, argue about the software location, add configuration options,
etc.? Wouldn't it make more sense to do this centrally, as in SUPERCOP?

And then there's the bigger question of how the community is organizing
ongoing work on accelerating---and auditing, and fixing, and hopefully
verifying---implementations of cryptographic primitives. Does it really
make sense that people looking for what's already been done have to go
poking around a bunch of separate libraries? Wouldn't it make more sense
to have one central collection of code, as in SUPERCOP? Is there any
fundamental obstacle to having libraries share code for primitives?

> there doesn't appear to be an official git repository for SUPERCOP,
> nor is there any mention of how to send patches, nor is there any
> COPYING or LICENSE file, nor even a README file.

https://bench.cr.yp.to/call-stream.html explains the API and submission
procedure for stream ciphers. There are similar pages for other types of
cryptographic primitives. https://bench.cr.yp.to/tips.html explains the
develop-test cycle and various useful options.

Licenses vary across implementations. There's a minimum requirement of
public distribution for verifiability of benchmark results, but it's up
to individual implementors to decide what they'll allow beyond that.
Patent status also varies; constant-time status varies; verification
status varies; code quality varies; cryptographic security varies; etc.
As I mentioned, SUPERCOP includes MD5 and Speck and RSA-512.

For comparison, where can I find an explanation of how to test kernel
crypto patches, and how fast is the develop-test cycle? Okay, I don't
have a kernel crypto patch, but I did write a kernel patch recently that
(I think) fixes some recent Lenovo ACPI stupidity:

   https://marc.info/?l=qubes-users&m=153308905514481

I'd propose this for review and upstream adoption _if_ it survives
enough tests---but what's the right test procedure? I see superficial
documentation of where to submit a patch for review, but am I really
supposed to do this before serious testing? The patch works on my
laptop, and several other people say it works, but obviously this is
missing the big question of whether the patch breaks _other_ laptops.
I see an online framework for testing, but using it looks awfully
complicated, and the level of coverage is unclear to me. Has anyone
tried to virtualize kernel testing---to capture hardware data from many
machines and then centrally simulate kernels running on those machines,
for example to check that those machines don't take certain code paths?
I suppose that people who work with the kernel all the time would know
what to do, but for me the lack of information was enough of a deterrent
that I switched to doing something else.

> Another issue is that the ChaCha code in SUPERCOP is duplicated for
> each number of rounds: 8, 12, and 20.

These are auto-generated, of course.

To understand this API detail, consider some of the possibilities for
the round counts supported by compiled code:

   * 20
   * 12
   * 8
   * caller selection from among 20 and 12 and 8
   * caller selection of any multiple of 4
   * caller selection of any multiple of 2
   * caller selection of anything

I hope that in the long term everyone is simply using 20, and then the
pure 20 is the simplest and smallest and most easily verified code, but
obviously there are other implementations today. An API with a separate
function for each round count allows any of these implementations to be
trivially benchmarked and used, whereas an API that insists on passing
the round count as an argument prohibits at least the first three and
maybe more.
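
To make the two API shapes concrete, here is a minimal sketch in C. It
is not taken from SUPERCOP or from the kernel; chacha_core(),
chacha20_block(), chacha12_block() and chacha8_block() are illustrative
names, and the code operates on a ready-made 16-word state rather than on
key, nonce and message buffers. The generic core corresponds to "caller
selection of any multiple of 2"; the fixed entry points correspond to the
first three bullets. An implementation that only supports 20 rounds can
provide chacha20_block() alone, but it cannot satisfy an interface that
takes the round count as an argument.

```c
#include <stdint.h>

#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))

/* The ChaCha quarter-round on four state words. */
#define QR(a, b, c, d) do {                \
	a += b; d ^= a; d = ROTL32(d, 16); \
	c += d; b ^= c; b = ROTL32(b, 12); \
	a += b; d ^= a; d = ROTL32(d, 8);  \
	c += d; b ^= c; b = ROTL32(b, 7);  \
} while (0)

/* Generic shape: the caller passes the round count (an even number). */
static void chacha_core(uint32_t out[16], const uint32_t in[16], int rounds)
{
	uint32_t x[16];
	int i;

	for (i = 0; i < 16; i++)
		x[i] = in[i];

	for (i = 0; i < rounds; i += 2) {
		/* column round */
		QR(x[0], x[4], x[8],  x[12]);
		QR(x[1], x[5], x[9],  x[13]);
		QR(x[2], x[6], x[10], x[14]);
		QR(x[3], x[7], x[11], x[15]);
		/* diagonal round */
		QR(x[0], x[5], x[10], x[15]);
		QR(x[1], x[6], x[11], x[12]);
		QR(x[2], x[7], x[8],  x[13]);
		QR(x[3], x[4], x[9],  x[14]);
	}

	/* feed-forward: add the input state to the permuted state */
	for (i = 0; i < 16; i++)
		out[i] = x[i] + in[i];
}

/* Per-round-count shape: one entry point for each supported count. */
static void chacha20_block(uint32_t out[16], const uint32_t in[16])
{
	chacha_core(out, in, 20);
}

static void chacha12_block(uint32_t out[16], const uint32_t in[16])
{
	chacha_core(out, in, 12);
}

static void chacha8_block(uint32_t out[16], const uint32_t in[16])
{
	chacha_core(out, in, 8);
}
```

Here the fixed entry points are thin wrappers for brevity. In an
optimized implementation each one could instead be a fully unrolled,
separately generated function, which is where the duplication mentioned
above comes from.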

> crypto_stream/chacha20/dolbeau/arm-neon/, which uses a method similar to the
> Linux implementation but it uses GCC intrinsics, so its performance will heavily
> depend on how the compiler assigns and spills registers, which can vary greatly
> depending on the compiler version and options.

Sure. The damage done by incompetent compilers is particularly clear for
in-order CPUs such as the Cortex-A7.

> I understand that Salsa20 is similar to ChaCha, and that ideas from Salsa20
> implementations often apply to ChaCha too.  But it's not always obvious what
> carries over and what doesn't; the rotation amounts can matter a lot, for
> example, as different rotations can be implemented in different ways.

This sounds backwards to me. ChaCha20 supports essentially all the
Salsa20 implementation techniques plus some extra streamlining: often a
bit less register pressure, often less data reorganization, and often
some rotation speedups.

> Nor is it always obvious which ideas from SSE2 or AVX2 implementations
> (for example) carry over to NEON implementations, as these instruction
> sets are different enough that each has its own unique quirks and
> optimizations.

Of course.

> Previously I also found that OpenSSL's ARM NEON implementation of Poly1305 is
> much faster than the implementations in SUPERCOP, as well as more
> understandable.  (I don't know the 'qhasm' language, for example.)  So from my
> perspective, I've had more luck with OpenSSL than SUPERCOP when looking for fast
> implementations of crypto algorithms.  Have you considered adding the OpenSSL
> implementations to SUPERCOP?

Almost all of the implementations in SUPERCOP were submitted by the
implementors, with a few exceptions for wrappers. Realistically, the
implementors are in the best position to check that they're getting the
expected results and to be in control of any necessary updates.

---Dan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]


* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: maowenan @ 2018-08-16  7:19 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: dwmw2, gregkh, netdev, eric.dumazet, edumazet, davem, ycheng, jdw,
	stable, Takashi Iwai
In-Reply-To: <20180816065210.iq7jyscajjoiw7k4@unicorn.suse.cz>



On 2018/8/16 14:52, Michal Kubecek wrote:
> On Thu, Aug 16, 2018 at 02:42:32PM +0800, maowenan wrote:
>> On 2018/8/16 14:16, Michal Kubecek wrote:
>>> On Thu, Aug 16, 2018 at 10:50:01AM +0800, Mao Wenan wrote:
>>>> There are five patches to fix CVE-2018-5390 in latest mainline 
>>>> branch, but only two patches exist in stable 4.4 and 3.18: 
>>>> dc6ae4d tcp: detect malicious patterns in tcp_collapse_ofo_queue()
>>>> 5fbec48 tcp: avoid collapses in tcp_prune_queue() if possible
>>>> I have tested with stable 4.4 kernel, and found the cpu usage was very high.
>>>> So I think only two patches can't fix the CVE-2018-5390.
>>>> test results:
>>>> with fix patch：     78.2%   ksoftirqd
>>>> withoutfix patch：   90%     ksoftirqd
>>>>
>>>> Then I try to imitate 72cd43ba(tcp: free batches of packets in tcp_prune_ofo_queue())
>>>> to drop at least 12.5 % of sk_rcvbuf to avoid malicious attacks with simple queue 
>>>> instead of RB tree. The result is not very well.
>>>>  
>>>> After analysing the codes of stable 4.4, and debuging the 
>>>> system, shows that search of ofo_queue(tcp ofo using a simple queue) cost more cycles.
>>>>
>>>> So I try to backport "tcp: use an RB tree for ooo receive queue" using RB tree 
>>>> instead of simple queue, then backport Eric Dumazet 5 fixed patches in mainline,
>>>> good news is that ksoftirqd is turn to about 20%, which is the same with mainline now.
>>>>
>>>> Stable 4.4 have already back port two patches, 
>>>> f4a3313d(tcp: avoid collapses in tcp_prune_queue() if possible)
>>>> 3d4bf93a(tcp: detect malicious patterns in tcp_collapse_ofo_queue())
>>>> If we want to change simple queue to RB tree to finally resolve, we should apply previous 
>>>> patch 9f5afeae(tcp: use an RB tree for ooo receive queue.) firstly, but 9f5afeae have many 
>>>> conflicts with 3d4bf93a and f4a3313d, which are part of patch series from Eric in 
>>>> mainline to fix CVE-2018-5390, so I need revert part of patches in stable 4.4 firstly, 
>>>> then apply 9f5afeae, and reapply five patches from Eric.
>>>
>>> There seems to be an obvious mistake in one of the backports. Could you
>>> check the results with Takashi's follow-up fix submitted at
>>>
>>>   http://lkml.kernel.org/r/20180815095846.7734-1-tiwai@suse.de
>>>
>>> (I would try myself but you don't mention what test you ran.)
>>
>> I have backport RB tree in stable 4.4, function
>> tcp_collapse_ofo_queue() has been refined, which keep the same with
>> mainline, so it seems no problem when apply Eric's patch 3d4bf93a(tcp:
>> detect malicious patterns in tcp_collapse_ofo_queue()).
>>
>> I also noticed that range_truesize != head->truesize will be always
>> false which mentioned in your URL, but this only based on stable 4.4's
>> codes, If I applied RB tree's patch 9f5afeae(tcp: use an RB tree for
>> ooo receive queue), and after apply 3d4bf93a,the codes should be
>> range_truesize += skb->truesize, and range_truesize != head->truesize
>> can be true.
> 
> My point is that backporting all this into stable 4.4 is quite intrusive
> so that if we can achieve similar results with a simple fix of an
> obvious omission, it would be preferrable.

There are five patches in mainline that fix this CVE; backporting only two of them has no effect on stable 4.4.
The main reason is that 4.4 uses a simple queue for out-of-order segments, whereas mainline uses an RB tree.

I tried my best to fix this the easy way, by dropping 12.5% (or some other value) of packets on top of the simple queue,
but the result was not good. So the RB tree is needed, and with it the test results are what I wanted.

If we backport only two patches and they don't fix the issue, I don't think the backport makes sense.

> 
> Michal Kubecek
> 
> .
> 


* Re: [RFC PATCH net-next V2 0/6] XDP rx handler
From: Jason Wang @ 2018-08-16  4:21 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Ahern, Jesper Dangaard Brouer, netdev, linux-kernel, ast,
	daniel, mst, Toshiaki Makita
In-Reply-To: <20180816024913.57ykirt7eqrjntfy@ast-mbp.dhcp.thefacebook.com>

On 2018年08月16日 10:49, Alexei Starovoitov wrote:
> On Wed, Aug 15, 2018 at 03:04:35PM +0800, Jason Wang wrote:
>>>> 3 Deliver XDP buff to userspace through macvtap.
>>> I think I'm getting what you're trying to achieve.
>>> You actually don't want any bpf programs in there at all.
>>> You want macvlan builtin logic to act on raw packet frames.
>> The built-in logic is just used to find the destination macvlan device. It
>> could be done by through another bpf program. Instead of inventing lots of
>> generic infrastructure on kernel with specific userspace API, built-in logic
>> has its own advantages:
>>
>> - support hundreds or even thousands of macvlans
> are you saying xdp bpf program cannot handle thousands macvlans?

Correct me if I'm wrong. It works well when the macvlans require
similar logic. But consider the case where each macvlan wants its
own specific logic. Is it possible to have thousands of different
policies and actions in a single BPF program? With the XDP rx handler,
the root device has no need to care about them; each macvlan only has
to care about itself. This is similar to how a qdisc can be attached
to each stacked device.

>
>> - using exist tools to configure network
>> - immunity to topology changes
> what do you mean specifically?

Take the example above again: if some macvlans are deleted or created,
we need to notify the root device and update its policies. This requires
a userspace control program to monitor those changes and notify the BPF
program through maps. Unless the BPF program is designed for some
specific configurations and setups, that would not be an easy task.

>
>> Besides the usage for containers, we can implement macvtap RX handler which
>> allows a fast packet forwarding to userspace.
> and try to reinvent af_xdp? the motivation for the patchset still escapes me.

Nope, macvtap is used for forwarding packets to a VM. This just tries to
deliver the XDP buff to the VM instead of an skb. A similar idea was used
for TUN/TAP, which showed amazing improvements.

>
>> Actually, the idea is not limited to macvlan but for all device that is
>> based on rx handler. Consider the case of bonding, this allows to set a very
>> simple XDP program on slaves and keep a single main logic XDP program on the
>> bond instead of duplicating it in all slaves.
> I think such mixed environment of hardcoded in-kernel things like bond
> mixed together with xdp programs will be difficult to manage and debug.
> How admin suppose to debug it?

Well, we already have the in-kernel XDP_TX routine. It should be no
harder than that.

>   Say something in the chain of
> nic -> native xdp -> bond with your xdp rx -> veth -> xdp prog -> consumer
> is dropping a packet. If all forwarding decisions are done by bpf progs
> the progs will have packet tracing facility (like cilium does) to
> show packet flow end-to-end. It works briliantly like traceroute within a host.

Does this work well for a veth pair as well? If yes, it should work for
the rx handler too. Or does it have some hard-coded logic like "ok, the
packet goes to veth, I'm sure it will be delivered to its peer"? The idea
of this series is not to forbid forwarding decisions being made by bpf
progs; if the code does so by accident, we can introduce a flag to
disable/enable the XDP rx handler.

And I believe redirection is part of XDP usage; we may still want things
like XDP_TX.

> But when you have things like macvlan, bond, bridge in the middle
> that can also act on packet, the admin will have a hard time.

I admit it may require admin help, but it gives us more flexibility.

>
> Essentially what you're proposing is to make all kernel builtin packet
> steering/forwarding facilities to understand raw xdp frames.

Probably not; at least this series focuses only on the rx handler. Fewer
than 10 devices use that.

> That's a lot of code
> and at the end of the chain you'd need fast xdp frame consumer otherwise
> perf benefits are lost.

The performance benefit is lost, but it is still the same as with skb.
And besides redirection, we do have other consumers like XDP_TX.

>   If that consumer is xdp bpf program
> why bother with xdp-fied macvlan or bond?

For macvlan, we may want to have different policies for different
devices. For bond, we don't want to duplicate the XDP logic in each slave,
and only the bond knows which slave can be used for XDP_TX.

>   If that consumer is tcp stack
> than forwarding via xdp-fied bond is no faster than via skb-based bond.
>

Yes.

Thanks


* Re: [PATCH v3] Add BPF_SYNCHRONIZE_MAP_TO_MAP_REFERENCES bpf(2) command
From: Alexei Starovoitov @ 2018-08-16  4:01 UTC (permalink / raw)
  To: Daniel Colascione
  Cc: Daniel Borkmann, Jakub Kicinski, Joel Fernandes, linux-kernel,
	Tim Murray, netdev, Lorenzo Colitti, Chenbo Feng,
	Mathieu Desnoyers, Alexei Starovoitov
In-Reply-To: <CAKOZueuACjK0W4byW7omPvEJ-H-8MK1OU3ORrAfNtsEi8NxoPQ@mail.gmail.com>

On Tue, Aug 14, 2018 at 01:37:12PM -0700, Daniel Colascione wrote:
> 
> > If we agree on that, should bpf_map_update handle it then?
> > Wouldn't it be much easier to understand and use from user pov?
> > No new commands to learn. map_update syscall replaced the map
> > and old map is no longer accessed by the program via this given map-in-map.
> 
> Maybe with a new BPF_SYNCHRONIZE flag for BPF_MAP_UPDATE_ELEM and
> BPF_MAP_DELETE_ELEM. Otherwise, it seems wrong to make every user of
> these commands pay for synchronization that only a few will need.

I don't think an extra flag is needed. An extra sync_rcu() for map-in-map
is useful for all users. I would consider it a bugfix,
since users that examine a deleted map have this race today,
and removing the race is always a good thing, especially since the cost
is small.


* Re: [PATCH] net: macb: Fix regression breaking non-MDIO fixed-link PHYs
From: Ahmad Fatoum @ 2018-08-16  6:54 UTC (permalink / raw)
  To: Andrew Lunn, Uwe Kleine-König
  Cc: David S. Miller, Nicolas Ferre, netdev, mdf, stable, kernel,
	Brad Mouring, Florian Fainelli
In-Reply-To: <20180815023233.GD11610@lunn.ch>

On 08/15/2018 04:32 AM, Andrew Lunn wrote:
> Ahmed, where is the device tree for the EVB-KSZ9477?

I've attached it [1]. It's still a work in progress (DSA doesn't work yet, for example), but Ethernet is usable with Linux v4.18 and my patch applied.

Cheers
Ahmad

[1]


=======================================
diff --git a/arch/arm/boot/dts/at91-sama5d3_xplained_ung8087.dts b/arch/arm/boot/dts/at91-sama5d3_xplained_ung8087.dts
new file mode 100644
index 000000000000..164b6a47d0bf
--- /dev/null
+++ b/arch/arm/boot/dts/at91-sama5d3_xplained_ung8087.dts
@@ -0,0 +1,393 @@
+/*
+ * at91-sama5d3_xplained_ung8087.dts - Device Tree file for the EVB-KSZ9477 board
+ *
+ *  Copyright (C) 2014 Atmel,
+ *		  2014 Nicolas Ferre <nicolas.ferre@atmel.com>
+ *		  2018 Ahmad Fatoum <a.fatoum@pengutronix.de>
+ *
+ * Licensed under GPLv2 or later.
+ */
+/dts-v1/;
+#include "sama5d36.dtsi"
+
+/ {
+	model = "SAMA5D3 Xplained";
+	compatible = "atmel,sama5d3-xplained", "atmel,sama5d3", "atmel,sama5";
+
+	chosen {
+		bootargs = "console=ttyS0,115200";
+	};
+
+	memory {
+		reg = <0x20000000 0x10000000>;
+	};
+
+	clocks {
+		slow_xtal {
+			clock-frequency = <32768>;
+		};
+
+		main_xtal {
+			clock-frequency = <12000000>;
+		};
+	};
+
+	ahb {
+		apb {
+			mmc0: mmc@f0000000 {
+				pinctrl-0 = <&pinctrl_mmc0_clk_cmd_dat0 &pinctrl_mmc0_dat1_3 &pinctrl_mmc0_dat4_7 &pinctrl_mmc0_cd>;
+				status = "okay";
+				slot@0 {
+					reg = <0>;
+					bus-width = <8>;
+					cd-gpios = <&pioE 0 GPIO_ACTIVE_LOW>;
+				};
+			};
+
+			spi0: spi@f0004000 {
+				cs-gpios = <&pioD 13 0>, <0>, <0>, <&pioD 16 0>;
+				id = <0>;
+				status = "okay";
+			};
+
+			can0: can@f000c000 {
+				status = "okay";
+			};
+
+			i2c0: i2c@f0014000 {
+				pinctrl-0 = <&pinctrl_i2c0_pu>;
+				status = "okay";
+			};
+
+			i2c1: i2c@f0018000 {
+				status = "okay";
+
+				pmic: act8865@5b {
+					compatible = "active-semi,act8865";
+					reg = <0x5b>;
+					status = "okay";
+
+					regulators {
+						vcc_1v8_reg: DCDC_REG1 {
+							regulator-name = "VCC_1V8";
+							regulator-min-microvolt = <1800000>;
+							regulator-max-microvolt = <1800000>;
+							regulator-always-on;
+						};
+
+						vcc_1v2_reg: DCDC_REG2 {
+							regulator-name = "VCC_1V2";
+							regulator-min-microvolt = <1200000>;
+							regulator-max-microvolt = <1200000>;
+							regulator-always-on;
+						};
+
+						vcc_3v3_reg: DCDC_REG3 {
+							regulator-name = "VCC_3V3";
+							regulator-min-microvolt = <3300000>;
+							regulator-max-microvolt = <3300000>;
+							regulator-always-on;
+						};
+
+						vddfuse_reg: LDO_REG1 {
+							regulator-name = "FUSE_2V5";
+							regulator-min-microvolt = <2500000>;
+							regulator-max-microvolt = <2500000>;
+						};
+
+						vddana_reg: LDO_REG2 {
+							regulator-name = "VDDANA";
+							regulator-min-microvolt = <3300000>;
+							regulator-max-microvolt = <3300000>;
+							regulator-always-on;
+						};
+					};
+				};
+			};
+
+			macb0: ethernet@f0028000 {
+				phy-mode = "rgmii";
+				gpios = <&pioB 28 GPIO_ACTIVE_LOW>;
+				reset-gpios = <&pioC 31 GPIO_ACTIVE_LOW>;
+				status = "okay";
+
+				fixed-link {
+					speed = <1000>;
+					full-duplex;
+				};
+			};
+
+			pwm0: pwm@f002c000 {
+				pinctrl-names = "default";
+				pinctrl-0 = <&pinctrl_pwm0_pwmh0_0 &pinctrl_pwm0_pwmh1_0>;
+				status = "okay";
+			};
+
+			usart0: serial@f001c000 {
+				status = "okay";
+			};
+
+			usart1: serial@f0020000 {
+				pinctrl-0 = <&pinctrl_usart1 &pinctrl_usart1_rts_cts>;
+				status = "okay";
+			};
+
+			uart0: serial@f0024000 {
+				status = "okay";
+			};
+
+			mmc1: mmc@f8000000 {
+				pinctrl-0 = <&pinctrl_mmc1_clk_cmd_dat0 &pinctrl_mmc1_dat1_3 &pinctrl_mmc1_cd>;
+				status = "okay";
+				slot@0 {
+					reg = <0>;
+					bus-width = <4>;
+					cd-gpios = <&pioE 1 GPIO_ACTIVE_HIGH>;
+				};
+			};
+
+			spi1: spi@f8008000 {
+				pinctrl-0 = <&pinctrl_spi_ksz>;
+				cs-gpios = <&pioC 25 0>;
+				id = <1>;
+				status = "okay";
+
+				ksz9477: ksz9477@0 {
+					compatible = "microchip,ksz9477", "microchip,ksz9893";
+					reg = <0>;
+
+					/* Bus clock is 132 MHz. */
+					spi-max-frequency = <44000000>;
+					spi-cpha;
+					spi-cpol;
+					gpios = <&pioB 28 GPIO_ACTIVE_LOW>;
+					status = "okay";
+
+					ports {
+						#address-cells = <1>;
+						#size-cells = <0>;
+
+						port@0 {
+							reg = <0>;
+							label = "lan0";
+						};
+
+						port@1 {
+							reg = <1>;
+							label = "lan1";
+						};
+
+						port@2 {
+							reg = <2>;
+							label = "lan2";
+						};
+
+						port@3 {
+							reg = <3>;
+							label = "lan3";
+						};
+
+						port@4 {
+							reg = <4>;
+							label = "lan4";
+						};
+
+						port@5 {
+							reg = <5>;
+							label = "cpu";
+							ethernet = <&macb0>;
+							phy-mode = "rgmii-id";
+
+							fixed-link {
+								speed = <0x3e8>;
+								full-duplex;
+							};
+						};
+
+						/* port 6 is connected to eth0 */
+					};
+				};
+			};
+
+			adc0: adc@f8018000 {
+				pinctrl-0 = <
+					&pinctrl_adc0_adtrg
+					&pinctrl_adc0_ad0
+					&pinctrl_adc0_ad1
+					&pinctrl_adc0_ad2
+					&pinctrl_adc0_ad3
+					&pinctrl_adc0_ad4
+					&pinctrl_adc0_ad5
+					&pinctrl_adc0_ad6
+					&pinctrl_adc0_ad7
+					&pinctrl_adc0_ad8
+					&pinctrl_adc0_ad9
+					>;
+				status = "okay";
+			};
+
+			i2c2: i2c@f801c000 {
+				dmas = <0>, <0>;	/* Do not use DMA for i2c2 */
+				pinctrl-0 = <&pinctrl_i2c2_pu>;
+				status = "okay";
+			};
+
+			macb1: ethernet@f802c000 {
+				phy-mode = "rmii";
+				status = "disabled";
+			};
+
+			dbgu: serial@ffffee00 {
+				status = "okay";
+			};
+
+			pinctrl@fffff200 {
+				board {
+					pinctrl_i2c0_pu: i2c0_pu {
+						atmel,pins =
+							<AT91_PIOA 30 AT91_PERIPH_A AT91_PINCTRL_PULL_UP>,
+							<AT91_PIOA 31 AT91_PERIPH_A AT91_PINCTRL_PULL_UP>;
+					};
+
+					pinctrl_i2c2_pu: i2c2_pu {
+						atmel,pins =
+							<AT91_PIOA 18 AT91_PERIPH_B AT91_PINCTRL_PULL_UP>,
+							<AT91_PIOA 19 AT91_PERIPH_B AT91_PINCTRL_PULL_UP>;
+					};
+
+					pinctrl_mmc0_cd: mmc0_cd {
+						atmel,pins =
+							<AT91_PIOE 0 AT91_PERIPH_GPIO AT91_PINCTRL_PULL_UP_DEGLITCH>;
+					};
+
+					pinctrl_mmc1_cd: mmc1_cd {
+						atmel,pins =
+							<AT91_PIOE 1 AT91_PERIPH_GPIO AT91_PINCTRL_PULL_UP_DEGLITCH>;
+					};
+
+					pinctrl_usba_vbus: usba_vbus {
+						atmel,pins =
+							<AT91_PIOE 9 AT91_PERIPH_GPIO AT91_PINCTRL_DEGLITCH>;	/* PE9, conflicts with A9 */
+					};
+
+					pinctrl_spi_ksz: spi_ksz {
+						atmel,pins =
+							<
+							AT91_PIOB 28 AT91_PERIPH_GPIO AT91_PINCTRL_DEGLITCH
+							AT91_PIOC 31 AT91_PERIPH_GPIO AT91_PINCTRL_DEGLITCH
+							>;
+					};
+				};
+			};
+
+			pmc: pmc@fffffc00 {
+				main: mainck {
+					clock-frequency = <12000000>;
+				};
+			};
+		};
+
+		ebi: ebi@10000000 {
+			pinctrl-0 = <&pinctrl_ebi_nand_addr>;
+			pinctrl-names = "default";
+			status = "okay";
+
+			nand_controller: nand-controller {
+				status = "okay";
+
+				nand@3 {
+					reg = <0x3 0x0 0x2>;
+					atmel,rb = <0>;
+					nand-bus-width = <8>;
+					nand-ecc-mode = "hw";
+					nand-ecc-strength = <4>;
+					nand-ecc-step-size = <512>;
+					nand-on-flash-bbt;
+					label = "atmel_nand";
+
+					partitions {
+						compatible = "fixed-partitions";
+						#address-cells = <1>;
+						#size-cells = <1>;
+
+						at91bootstrap@0 {
+							label = "at91bootstrap";
+							reg = <0x0 0x40000>;
+						};
+
+						bootloader@40000 {
+							label = "bootloader";
+							reg = <0x40000 0x80000>;
+						};
+
+						bootloaderenv@c0000 {
+							label = "bootloader env";
+							reg = <0xc0000 0xc0000>;
+						};
+
+						dtb@180000 {
+							label = "device tree";
+							reg = <0x180000 0x80000>;
+						};
+
+						kernel@200000 {
+							label = "kernel";
+							reg = <0x200000 0x600000>;
+						};
+
+						rootfs@800000 {
+							label = "rootfs";
+							reg = <0x800000 0x0f800000>;
+						};
+					};
+				};
+			};
+		};
+
+		usb0: gadget@500000 {
+			atmel,vbus-gpio = <&pioE 9 GPIO_ACTIVE_HIGH>;	/* PE9, conflicts with A9 */
+			pinctrl-names = "default";
+			pinctrl-0 = <&pinctrl_usba_vbus>;
+			status = "okay";
+		};
+
+		usb1: ohci@600000 {
+			num-ports = <3>;
+			atmel,vbus-gpio = <0
+					   &pioE 3 GPIO_ACTIVE_LOW
+					   &pioE 4 GPIO_ACTIVE_LOW
+					  >;
+			status = "okay";
+		};
+
+		usb2: ehci@700000 {
+			status = "okay";
+		};
+	};
+
+	gpio_keys {
+		compatible = "gpio-keys";
+
+		bp3 {
+			label = "PB_USER";
+			gpios = <&pioE 29 GPIO_ACTIVE_LOW>;
+			linux,code = <0x104>;
+			gpio-key,wakeup;
+		};
+	};
+
+	leds {
+		compatible = "gpio-leds";
+
+		d2 {
+			label = "d2";
+			gpios = <&pioE 23 GPIO_ACTIVE_LOW>;	/* PE23, conflicts with A23, CTS2 */
+			linux,default-trigger = "heartbeat";
+		};
+
+		d3 {
+			label = "d3";
+			gpios = <&pioE 24 GPIO_ACTIVE_HIGH>;
+		};
+	};
+};
-- 
2.18.0


* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: Michal Kubecek @ 2018-08-16  6:52 UTC (permalink / raw)
  To: maowenan
  Cc: dwmw2, gregkh, netdev, eric.dumazet, edumazet, davem, ycheng, jdw,
	stable, Takashi Iwai
In-Reply-To: <bcc20776-c8aa-f202-380f-979941d2d453@huawei.com>

On Thu, Aug 16, 2018 at 02:42:32PM +0800, maowenan wrote:
> On 2018/8/16 14:16, Michal Kubecek wrote:
> > On Thu, Aug 16, 2018 at 10:50:01AM +0800, Mao Wenan wrote:
> >> There are five patches to fix CVE-2018-5390 in latest mainline 
> >> branch, but only two patches exist in stable 4.4 and 3.18: 
> >> dc6ae4d tcp: detect malicious patterns in tcp_collapse_ofo_queue()
> >> 5fbec48 tcp: avoid collapses in tcp_prune_queue() if possible
> >> I have tested with stable 4.4 kernel, and found the cpu usage was very high.
> >> So I think only two patches can't fix the CVE-2018-5390.
> >> test results:
> >> with fix patch：     78.2%   ksoftirqd
> >> withoutfix patch：   90%     ksoftirqd
> >>
> >> Then I try to imitate 72cd43ba(tcp: free batches of packets in tcp_prune_ofo_queue())
> >> to drop at least 12.5 % of sk_rcvbuf to avoid malicious attacks with simple queue 
> >> instead of RB tree. The result is not very well.
> >>  
> >> After analysing the codes of stable 4.4, and debuging the 
> >> system, shows that search of ofo_queue(tcp ofo using a simple queue) cost more cycles.
> >>
> >> So I try to backport "tcp: use an RB tree for ooo receive queue" using RB tree 
> >> instead of simple queue, then backport Eric Dumazet 5 fixed patches in mainline,
> >> good news is that ksoftirqd is turn to about 20%, which is the same with mainline now.
> >>
> >> Stable 4.4 have already back port two patches, 
> >> f4a3313d(tcp: avoid collapses in tcp_prune_queue() if possible)
> >> 3d4bf93a(tcp: detect malicious patterns in tcp_collapse_ofo_queue())
> >> If we want to change simple queue to RB tree to finally resolve, we should apply previous 
> >> patch 9f5afeae(tcp: use an RB tree for ooo receive queue.) firstly, but 9f5afeae have many 
> >> conflicts with 3d4bf93a and f4a3313d, which are part of patch series from Eric in 
> >> mainline to fix CVE-2018-5390, so I need revert part of patches in stable 4.4 firstly, 
> >> then apply 9f5afeae, and reapply five patches from Eric.
> > 
> > There seems to be an obvious mistake in one of the backports. Could you
> > check the results with Takashi's follow-up fix submitted at
> > 
> >   http://lkml.kernel.org/r/20180815095846.7734-1-tiwai@suse.de
> > 
> > (I would try myself but you don't mention what test you ran.)
> 
> I have backport RB tree in stable 4.4, function
> tcp_collapse_ofo_queue() has been refined, which keep the same with
> mainline, so it seems no problem when apply Eric's patch 3d4bf93a(tcp:
> detect malicious patterns in tcp_collapse_ofo_queue()).
> 
> I also noticed that range_truesize != head->truesize will be always
> false which mentioned in your URL, but this only based on stable 4.4's
> codes, If I applied RB tree's patch 9f5afeae(tcp: use an RB tree for
> ooo receive queue), and after apply 3d4bf93a,the codes should be
> range_truesize += skb->truesize, and range_truesize != head->truesize
> can be true.

My point is that backporting all this into stable 4.4 is quite intrusive
so that if we can achieve similar results with a simple fix of an
obvious omission, it would be preferable.

Michal Kubecek


* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: maowenan @ 2018-08-16  6:42 UTC (permalink / raw)
  To: Michal Kubecek
  Cc: dwmw2, gregkh, netdev, eric.dumazet, edumazet, davem, ycheng, jdw,
	stable, Takashi Iwai
In-Reply-To: <20180816061621.imodnhlytllvnlkj@unicorn.suse.cz>



On 2018/8/16 14:16, Michal Kubecek wrote:
> On Thu, Aug 16, 2018 at 10:50:01AM +0800, Mao Wenan wrote:
>> There are five patches to fix CVE-2018-5390 in latest mainline 
>> branch, but only two patches exist in stable 4.4 and 3.18: 
>> dc6ae4d tcp: detect malicious patterns in tcp_collapse_ofo_queue()
>> 5fbec48 tcp: avoid collapses in tcp_prune_queue() if possible
>> I have tested with stable 4.4 kernel, and found the cpu usage was very high.
>> So I think only two patches can't fix the CVE-2018-5390.
>> test results:
>> with fix patch：     78.2%   ksoftirqd
>> withoutfix patch：   90%     ksoftirqd
>>
>> Then I try to imitate 72cd43ba(tcp: free batches of packets in tcp_prune_ofo_queue())
>> to drop at least 12.5 % of sk_rcvbuf to avoid malicious attacks with simple queue 
>> instead of RB tree. The result is not very well.
>>  
>> After analysing the codes of stable 4.4, and debuging the 
>> system, shows that search of ofo_queue(tcp ofo using a simple queue) cost more cycles.
>>
>> So I try to backport "tcp: use an RB tree for ooo receive queue" using RB tree 
>> instead of simple queue, then backport Eric Dumazet 5 fixed patches in mainline,
>> good news is that ksoftirqd is turn to about 20%, which is the same with mainline now.
>>
>> Stable 4.4 have already back port two patches, 
>> f4a3313d(tcp: avoid collapses in tcp_prune_queue() if possible)
>> 3d4bf93a(tcp: detect malicious patterns in tcp_collapse_ofo_queue())
>> If we want to change simple queue to RB tree to finally resolve, we should apply previous 
>> patch 9f5afeae(tcp: use an RB tree for ooo receive queue.) firstly, but 9f5afeae have many 
>> conflicts with 3d4bf93a and f4a3313d, which are part of patch series from Eric in 
>> mainline to fix CVE-2018-5390, so I need revert part of patches in stable 4.4 firstly, 
>> then apply 9f5afeae, and reapply five patches from Eric.
> 
> There seems to be an obvious mistake in one of the backports. Could you
> check the results with Takashi's follow-up fix submitted at
> 
>   http://lkml.kernel.org/r/20180815095846.7734-1-tiwai@suse.de
> 
> (I would try myself but you don't mention what test you ran.)

I have backported the RB tree to stable 4.4, so tcp_collapse_ofo_queue() has been reworked and now matches
mainline; there seems to be no problem when applying Eric's patch 3d4bf93a (tcp: detect malicious patterns in tcp_collapse_ofo_queue()).

I also noticed that range_truesize != head->truesize is always false, as mentioned at your URL, but that holds only for the stable 4.4 code.
If I apply the RB tree patch 9f5afeae (tcp: use an RB tree for ooo receive queue) and then apply 3d4bf93a, the code becomes
range_truesize += skb->truesize, and range_truesize != head->truesize can be true.

The test is a PoC program (named smack-for-ficora) that sends large out-of-order packets over an existing TCP connection; I then check the system's CPU usage with the top command.

> 
> Michal Kubecek
> 
> .
> 


* Re: [PATCH stable 4.4 0/9] fix SegmentSmack in stable branch (CVE-2018-5390)
From: Michal Kubecek @ 2018-08-16  6:16 UTC (permalink / raw)
  To: Mao Wenan
  Cc: dwmw2, gregkh, netdev, eric.dumazet, edumazet, davem, ycheng, jdw,
	stable, Takashi Iwai
In-Reply-To: <1534387810-121428-1-git-send-email-maowenan@huawei.com>

On Thu, Aug 16, 2018 at 10:50:01AM +0800, Mao Wenan wrote:
> There are five patches to fix CVE-2018-5390 in latest mainline 
> branch, but only two patches exist in stable 4.4 and 3.18: 
> dc6ae4d tcp: detect malicious patterns in tcp_collapse_ofo_queue()
> 5fbec48 tcp: avoid collapses in tcp_prune_queue() if possible
> I have tested with stable 4.4 kernel, and found the cpu usage was very high.
> So I think only two patches can't fix the CVE-2018-5390.
> test results:
> with fix patch：     78.2%   ksoftirqd
> withoutfix patch：   90%     ksoftirqd
> 
> Then I try to imitate 72cd43ba(tcp: free batches of packets in tcp_prune_ofo_queue())
> to drop at least 12.5 % of sk_rcvbuf to avoid malicious attacks with simple queue 
> instead of RB tree. The result is not very well.
>  
> After analysing the codes of stable 4.4, and debuging the 
> system, shows that search of ofo_queue(tcp ofo using a simple queue) cost more cycles.
> 
> So I try to backport "tcp: use an RB tree for ooo receive queue" using RB tree 
> instead of simple queue, then backport Eric Dumazet 5 fixed patches in mainline,
> good news is that ksoftirqd is turn to about 20%, which is the same with mainline now.
> 
> Stable 4.4 have already back port two patches, 
> f4a3313d(tcp: avoid collapses in tcp_prune_queue() if possible)
> 3d4bf93a(tcp: detect malicious patterns in tcp_collapse_ofo_queue())
> If we want to change simple queue to RB tree to finally resolve, we should apply previous 
> patch 9f5afeae(tcp: use an RB tree for ooo receive queue.) firstly, but 9f5afeae have many 
> conflicts with 3d4bf93a and f4a3313d, which are part of patch series from Eric in 
> mainline to fix CVE-2018-5390, so I need revert part of patches in stable 4.4 firstly, 
> then apply 9f5afeae, and reapply five patches from Eric.

There seems to be an obvious mistake in one of the backports. Could you
check the results with Takashi's follow-up fix submitted at

  http://lkml.kernel.org/r/20180815095846.7734-1-tiwai@suse.de

(I would try myself but you don't mention what test you ran.)

Michal Kubecek

