Re: Kernel 4.19 network performance - forwarding/routing normal users traffic


From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "Paweł Staszewski" <pstaszewski@itcare.pl>
Cc: Saeed Mahameed <saeedm@mellanox.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	brouer@redhat.com
Subject: Re: Kernel 4.19 network performance - forwarding/routing normal users traffic
Date: Sat, 10 Nov 2018 23:06:30 +0100
Message-ID: <20181110230630.0daeba8e@redhat.com>
In-Reply-To: <69c11e38-da50-15a3-2dfc-bc47ccc134b9@itcare.pl>

On Sat, 10 Nov 2018 20:56:02 +0100
Paweł Staszewski <pstaszewski@itcare.pl> wrote:

> On 10.11.2018 at 20:49, Paweł Staszewski wrote:
> >
> >
> > On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote:
> >> On Fri, 9 Nov 2018 23:20:38 +0100 Paweł Staszewski 
> >> <pstaszewski@itcare.pl> wrote:
> >>  
> >>> On 08.11.2018 at 20:12, Paweł Staszewski wrote:
> >>>> CPU load is lower than for connectx4 - but it looks like bandwidth
> >>>> limit is the same :)
> >>>> But also after reaching 60Gbit/60Gbit
> >>>>
> >>>>   bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
> >>>>    input: /proc/net/dev type: rate
> >>>>    -         iface                   Rx Tx Total
> >>>> ===================================================================
> >>>>
> >>>>
> >>>>           enp175s0:          45.09 Gb/s  15.09 Gb/s     60.18 Gb/s
> >>>>           enp216s0:          15.14 Gb/s  45.19 Gb/s     60.33 Gb/s
> >>>> -------------------------------------------------------------------
> >>>>
> >>>>
> >>>>              total:          60.45 Gb/s  60.48 Gb/s 120.93 Gb/s  
> >>> Today it reached 65/65Gbit/s
> >>>
> >>> But starting from 60Gbit/s RX / 60Gbit/s TX the NICs start to drop packets
> >>> (with 50% CPU on all 28 cores) - so there is still CPU power to use :).
> >> This is weird!
> >>
> >> How do you see / measure these drops?  
> >
> > A simple ICMP test like ping -i 0.1
> > I am testing via ICMP against the management IP address on a VLAN that
> > is attached to one NIC (the side that is more stressed with RX).
> > Another ICMP test is forwarded through this router to a host behind it.
> >
> > Both measurements show the same loss ratio, from 0.1 to 0.5%, after
> > reaching ~45Gbit/s on the RX side. Depending on how hard the RX side is
> > pushed, drops vary between 0.1 and 0.5%, even 0.6% :)
> >

Okay, good to know that you use an external measurement for this.  I do
think packets are getting dropped by the NIC.

> >>> So checked other stats.
> >>> softnet_stats shows average 1k squeezed per sec:  
> >> Is the output below the raw counters, not per-second rates?
> >>
> >> It would be valuable to see the per sec stats instead...
> >> I use this tool:
> >> https://github.com/netoptimizer/network-testing/blob/master/bin/softnet_stat.pl  
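The difference between raw counters and per-second rates can be sketched as
follows. This is a minimal illustration, not the softnet_stat.pl tool itself.
It assumes the /proc/net/softnet_stat layout of one line per CPU with
hexadecimal 32-bit columns, where the first three columns are total
processed, dropped and time_squeeze. The function names and the two sample
snapshots are hypothetical; the sample values were chosen to reproduce the
CPU:14 row quoted in this mail.

```python
# Assumed column positions in /proc/net/softnet_stat (one line per CPU, hex).
FIELDS = {"total": 0, "dropped": 1, "squeezed": 2}


def parse_softnet(text):
    """Parse softnet_stat text into a list of per-CPU counter dicts."""
    rows = []
    for line in text.strip().splitlines():
        cols = [int(tok, 16) for tok in line.split()]
        rows.append({name: cols[idx] for name, idx in FIELDS.items()})
    return rows


def per_second(before, after, interval_s):
    """Turn two raw snapshots into per-second rates.

    The counters are assumed to be 32 bits wide, so the delta is masked
    to survive a wraparound between the two snapshots.
    """
    rates = []
    for b, a in zip(before, after):
        rates.append(
            {k: ((a[k] - b[k]) & 0xFFFFFFFF) / interval_s for k in FIELDS}
        )
    return rates


# Synthetic snapshots taken one second apart (illustrative values only).
T0 = "00000000 00000000 00000000 00000000 00000000 00000000 " \
     "00000000 00000000 00000000 00000000 00000000"
T1 = "000768a2 00000000 0000002b 00000000 00000000 00000000 " \
     "00000000 00000000 00000000 00000000 00000000"

rates = per_second(parse_softnet(T0), parse_softnet(T1), 1.0)
print(rates[0])
```

On a live system the two snapshots would come from reading
/proc/net/softnet_stat twice with a known sleep in between.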
> CPU          total/sec     dropped/sec    squeezed/sec   collision/sec      rx_rps/sec  flow_limit/sec
> CPU:00               0               0               0               0               0               0
[...]
> CPU:13               0               0               0               0               0               0
> CPU:14          485538               0              43               0               0               0
> CPU:15          474794               0              51               0               0               0
> CPU:16          449322               0              41               0               0               0
> CPU:17          476420               0              46               0               0               0
> CPU:18          440436               0              38               0               0               0
> CPU:19          501499               0              49               0               0               0
> CPU:20          459468               0              49               0               0               0
> CPU:21          438928               0              47               0               0               0
> CPU:22          468983               0              40               0               0               0
> CPU:23          446253               0              47               0               0               0
> CPU:24          451909               0              46               0               0               0
> CPU:25          479373               0              55               0               0               0
> CPU:26          467848               0              49               0               0               0
> CPU:27          453153               0              51               0               0               0
> CPU:28               0               0               0               0               0               0
[...]
> CPU:40               0               0               0               0               0               0
> CPU:41               0               0               0               0               0               0
> CPU:42          466853               0              43               0               0               0
> CPU:43          453059               0              54               0               0               0
> CPU:44          363219               0              34               0               0               0
> CPU:45          353632               0              38               0               0               0
> CPU:46          371618               0              40               0               0               0
> CPU:47          350518               0              46               0               0               0
> CPU:48          397544               0              40               0               0               0
> CPU:49          364873               0              38               0               0               0
> CPU:50          383630               0              38               0               0               0
> CPU:51          358771               0              39               0               0               0
> CPU:52          372547               0              38               0               0               0
> CPU:53          372882               0              36               0               0               0
> CPU:54          366244               0              43               0               0               0
> CPU:55          365886               0              39               0               0               0
> 
> Summed:       11835201               0            1217               0               0               0


Do notice that the per-CPU squeeze count is not too large.
The summed 11.8 Mpps is a little high compared to the combined
rx_packets rate of the two NICs:

 Ethtool(enp216s0) stat: 4971677 (4,971,677) <= rx_packets /sec
 Ethtool(enp175s0) stat: 3717148 (3,717,148) <= rx_packets /sec
 Sum:  3717148+4971677 = 8688825 (8,688,825)
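The size of that gap can be checked with a few lines of arithmetic. This is
only a sketch of the comparison, using the per-second figures quoted above;
the variable names are made up for illustration, and it does not try to
explain where the extra softnet packets come from.

```python
# Per-second figures copied from the softnet_stat and ethtool output above.
softnet_total_pps = 11835201
rx_packets_pps = {"enp216s0": 4971677, "enp175s0": 3717148}

nic_sum_pps = sum(rx_packets_pps.values())
excess_pps = softnet_total_pps - nic_sum_pps
ratio = softnet_total_pps / nic_sum_pps

print(f"NIC rx_packets sum : {nic_sum_pps:,} pps")
print(f"softnet total      : {softnet_total_pps:,} pps")
print(f"excess             : {excess_pps:,} pps (ratio {ratio:.2f})")
```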


[...]
> >>>
> >>> Remember those tests are now on two separate connectx5 connected to
> >>> two separate pcie x16  gen 3.0  
> >>   That is strange... I still suspect some HW NIC issue, can you provide
> >> ethtool stats info via tool:
> >>
> >> https://github.com/netoptimizer/network-testing/blob/master/bin/ethtool_stats.pl
> >>
> >> $ ethtool_stats.pl --dev enp175s0 --dev enp216s0
> >>
> >> The tool removes zero-stat counters and reports per-second stats. It makes
> >> it easier to spot what is relevant for the given workload.
> > yes mlnx have just too many counters that are always 0 for my case :)
> > Will try this also
> >  
> But there are still a lot of non-zero counters
> Show adapter(s) (enp175s0 enp216s0) statistics (ONLY that changed!)
> Ethtool(enp175s0) stat:         8891 (          8,891) <= ch0_arm /sec
[...]

I have copied the stats over into another document so I can look at
them more closely... and I've found some interesting stats.

E.g. we can see that the NIC hardware is dropping packets.

RX-drops on enp175s0:

 (enp175s0) stat: 4850734036 ( 4,850,734,036) <= rx_bytes /sec
 (enp175s0) stat: 5069043007 ( 5,069,043,007) <= rx_bytes_phy /sec
                  -218308971 (  -218,308,971) Dropped bytes /sec
 
 (enp175s0) stat: 139602 ( 139,602) <= rx_discards_phy /sec

 (enp175s0) stat: 3717148 ( 3,717,148) <= rx_packets /sec
 (enp175s0) stat: 3862420 ( 3,862,420) <= rx_packets_phy /sec
                  -145272 (  -145,272) Dropped packets /sec


RX-drops on enp216s0 are lower:

 (enp216s0) stat: 2592286809 ( 2,592,286,809) <= rx_bytes /sec
 (enp216s0) stat: 2633575771 ( 2,633,575,771) <= rx_bytes_phy /sec
                   -41288962 (   -41,288,962) Dropped bytes /sec

 (enp216s0) stat:   464 (464) <= rx_discards_phy /sec

 (enp216s0) stat: 4971677 ( 4,971,677) <= rx_packets /sec
 (enp216s0) stat: 4975563 ( 4,975,563) <= rx_packets_phy /sec
                    -3886 (    -3,886) Dropped packets /sec
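The subtraction used above (the *_phy counter minus the corresponding
non-phy counter) can be written as a small helper that also expresses the
drop as a share of what arrived on the wire. This is a rough sketch using
the per-second numbers from this mail; the helper name and the percentage
column are additions for illustration and are not part of ethtool_stats.pl.

```python
def rx_drop(phy_per_sec, host_per_sec):
    """Return (dropped per second, dropped as a percentage of phy)."""
    dropped = phy_per_sec - host_per_sec
    return dropped, 100.0 * dropped / phy_per_sec


# (rx_packets_phy, rx_packets, rx_discards_phy) per second, from above.
nics = {
    "enp175s0": (3862420, 3717148, 139602),
    "enp216s0": (4975563, 4971677, 464),
}

results = {}
for name, (phy, host, discards) in nics.items():
    dropped, pct = rx_drop(phy, host)
    results[name] = (dropped, pct)
    print(f"{name}: {dropped:,} pkts/sec dropped ({pct:.2f}%), "
          f"rx_discards_phy {discards:,}/sec")
```

With these inputs the packet loss on enp175s0 comes out a little under 4%
of received packets, and on enp216s0 under 0.1%.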

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
