From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: kernel 4.15.0-rc9+ (net-next) high cpu load at 50Gbit/s - about 6Mpps
Date: Sun, 28 Jan 2018 10:59:35 -0800
Message-ID: <1517165975.3715.85.camel@gmail.com>
References: <3d6d4060-a6bd-483a-8101-029491cf0be7@itcare.pl> <5891f536-f5cf-3e9a-40c7-d6765d038587@itcare.pl>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
To: Paweł Staszewski , "netde >> Linux Kernel Network Developers"
Return-path:
Received: from mail-pg0-f42.google.com ([74.125.83.42]:46547 "EHLO mail-pg0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752159AbeA1S7i (ORCPT ); Sun, 28 Jan 2018 13:59:38 -0500
Received: by mail-pg0-f42.google.com with SMTP id s9so2714305pgq.13 for ; Sun, 28 Jan 2018 10:59:37 -0800 (PST)
In-Reply-To: <5891f536-f5cf-3e9a-40c7-d6765d038587@itcare.pl>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Sun, 2018-01-28 at 19:26 +0100, Paweł Staszewski wrote:
>
> On 27.01.2018 at 23:23, Paweł Staszewski wrote:
> > Hi
> >
> >
> > Today I made some real life traffic tests with kernel 4.15.0-rc9
> >
> > but when traffic reaches 50Gbit/s and about 6Mpps, cpu load rises fast
> > from 48% to 100% for all cpu cores.
> >
> > Here is a graph showing how cpu load rises as pps increases.
> >
> >
> > https://ibb.co/mhD5ob
> >
> >
> > Here is the perf record from that time:
> >
> > https://pastebin.com/3zqG1rvE
> >
> >
> > There are 8x 10G ixgbe 82599 interfaces teamed with teamd.
> >
> > No traffic queueing - only pfifo_fast on all interfaces.
> >
> > No NAT or iptables rules other than INPUT (about 30 rules)
> >
> > All NICs have the same ethtool settings:
> >
> > ethtool -k eth0
> > Features for eth0:
> > Cannot get device udp-fragmentation-offload settings: Operation not supported
> > rx-checksumming: on
> > tx-checksumming: on
> >         tx-checksum-ipv4: off [fixed]
> >         tx-checksum-ip-generic: on
> >         tx-checksum-ipv6: off [fixed]
> >         tx-checksum-fcoe-crc: off [fixed]
> >         tx-checksum-sctp: on
> > scatter-gather: on
> >         tx-scatter-gather: on
> >         tx-scatter-gather-fraglist: off [fixed]
> > tcp-segmentation-offload: on
> >         tx-tcp-segmentation: on
> >         tx-tcp-ecn-segmentation: off [fixed]
> >         tx-tcp-mangleid-segmentation: off
> >         tx-tcp6-segmentation: on
> > udp-fragmentation-offload: off
> > generic-segmentation-offload: on
> > generic-receive-offload: on
> > large-receive-offload: off
> > rx-vlan-offload: on
> > tx-vlan-offload: on
> > ntuple-filters: on
> > receive-hashing: on
> > highdma: on [fixed]
> > rx-vlan-filter: on
> > vlan-challenged: off [fixed]
> > tx-lockless: off [fixed]
> > netns-local: off [fixed]
> > tx-gso-robust: off [fixed]
> > tx-fcoe-segmentation: off [fixed]
> > tx-gre-segmentation: on
> > tx-gre-csum-segmentation: on
> > tx-ipxip4-segmentation: on
> > tx-ipxip6-segmentation: on
> > tx-udp_tnl-segmentation: on
> > tx-udp_tnl-csum-segmentation: on
> > tx-gso-partial: on
> > tx-sctp-segmentation: off [fixed]
> > tx-esp-segmentation: off [fixed]
> > fcoe-mtu: off [fixed]
> > tx-nocache-copy: off
> > loopback: off [fixed]
> > rx-fcs: off [fixed]
> > rx-all: off
> > tx-vlan-stag-hw-insert: off [fixed]
> > rx-vlan-stag-hw-parse: off [fixed]
> > rx-vlan-stag-filter: off [fixed]
> > l2-fwd-offload: off
> > hw-tc-offload: off
> > esp-hw-offload: off [fixed]
> > esp-tx-csum-hw-offload: off [fixed]
> > rx-udp_tunnel-port-offload: on
> >
> >
> > ethtool -g eth0
> > Ring parameters for eth0:
> > Pre-set maximums:
> > RX:             4096
> > RX Mini:        0
> > RX Jumbo:       0
> > TX:             4096
> > Current hardware settings:
> > RX:             4096
> > RX Mini:        0
> > RX Jumbo:       0
> > TX:             2048
> >
> >
> > ethtool -c eth0
> > Coalesce parameters for eth0:
> > Adaptive RX: off  TX: off
> > stats-block-usecs: 0
> > sample-interval: 0
> > pkt-rate-low: 0
> > pkt-rate-high: 0
> >
> > rx-usecs: 512
> > rx-frames: 0
> > rx-usecs-irq: 0
> > rx-frames-irq: 0
> >
> > tx-usecs: 0
> > tx-frames: 0
> > tx-usecs-irq: 0
> > tx-frames-irq: 0
> >
> > rx-usecs-low: 0
> > rx-frame-low: 0
> > tx-usecs-low: 0
> > tx-frame-low: 0
> >
> > rx-usecs-high: 0
> > rx-frame-high: 0
> > tx-usecs-high: 0
> > tx-frame-high: 0
> >
> >
>
> Perf top for kernel 4.15.0-rc9 below (all 40 cores at 100% cpu load with 6.3Mpps)
>
>     20.96%  [kernel]                [k] queued_spin_lock_slowpath
>      5.51%  [kernel]                [k] ixgbe_poll
>      5.49%  [kernel]                [k] ixgbe_xmit_frame_ring
>      4.39%  [kernel]                [k] do_raw_spin_lock
>      4.29%  [kernel]                [k] sch_direct_xmit
>      4.11%  [kernel]                [k] fib_table_lookup
>      3.11%  [team_mode_roundrobin]  [k] rr_transmit
>      2.71%  [kernel]                [k] __dev_queue_xmit
>      2.62%  [kernel]                [k] __ptr_ring_peek
>      2.39%  [kernel]                [k] skb_release_data
>      2.18%  [kernel]                [k] dev_gro_receive
>      1.75%  [kernel]                [k] __qdisc_run
>      1.67%  [kernel]                [k] pfifo_fast_enqueue
>      1.57%  [kernel]                [k] netdev_pick_tx
>      1.56%  [kernel]                [k] page_frag_free
>      1.48%  [kernel]                [k] ip_finish_output2
>      1.38%  [kernel]                [k] __slab_free
>      1.36%  [kernel]                [k] skb_unref
>      1.34%  [kernel]                [k] ixgbe_maybe_stop_tx
>      1.30%  [kernel]                [k] vlan_do_receive
>      1.28%  [kernel]                [k] pfifo_fast_dequeue
>      1.23%  [kernel]                [k] virt_to_head_page
>
>
>
> Same configuration, kernel 4.15.0-rc3 (50% cpu load on all 40 cores with 6.3Mpps)
>
>      7.81%  [kernel]                [k] ixgbe_xmit_frame_ring
>      7.61%  [kernel]                [k] ixgbe_poll
>      7.09%  [kernel]                [k] do_raw_spin_lock
>      5.63%  [kernel]                [k] fib_table_lookup
>      5.19%  [kernel]                [k] __dev_queue_xmit
>      4.38%  [team_mode_roundrobin]  [k] rr_transmit
>      3.10%  [kernel]                [k] netdev_pick_tx
>      2.79%  [kernel]                [k] skb_release_data
>      2.34%  [kernel]                [k] dev_gro_receive
>      1.99%  [kernel]                [k] page_frag_free
>      1.96%  [kernel]                [k] skb_unref
>      1.92%  [kernel]                [k] virt_to_head_page
>      1.90%  [kernel]                [k] ixgbe_maybe_stop_tx
>      1.82%  [kernel]                [k] vlan_do_receive
>      1.74%  [kernel]                [k] ip_finish_output2
>      1.73%  [kernel]                [k] __build_skb
>      1.68%  [kernel]                [k] __slab_free
>      1.67%  [kernel]                [k] __netif_receive_skb_core
>      1.60%  [kernel]                [k] inet_gro_receive
>      1.49%  [kernel]                [k] netif_skb_features
>      1.35%  [kernel]                [k] ip_rcv
>      1.33%  [kernel]                [k] ipt_do_table
>      1.30%  [kernel]                [k] compound_head
>      1.26%  [kernel]                [k] dev_hard_start_xmit
>      1.18%  [kernel]                [k] put_page
>      1.13%  [kernel]                [k] tcp_gro_receive
>      1.13%  [kernel]                [k] ip_forward
>      0.99%  [kernel]                [k] validate_xmit_skb
>      0.97%  [kernel]                [k] ip_route_input_rcu
>      0.88%  [kernel]                [k] inet_lookup_ifaddr_rcu
>      0.81%  [kernel]                [k] pfifo_fast_dequeue
>      0.77%  [kernel]                [k] vlan_dev_hard_start_xmit
>      0.64%  [kernel]                [k] ___slab_alloc

Please report:

1) "ethtool -l" information for your ethernet adapters.

2) IRQ configuration (grep eth /proc/interrupts)

3) tc -s qdisc show
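
The three items requested above could be gathered in one pass with a small shell helper. This is only a sketch: the function name collect_report is hypothetical, and the interface names are whatever the system actually uses (eth0 is taken from the ethtool output quoted earlier; the other team ports are not named in this thread). Each command is allowed to fail so that a missing tool or device does not abort the rest of the report.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the three requested
# reports for the interfaces given as arguments.
collect_report() {
    for dev in "$@"; do
        echo "=== ethtool -l $dev ==="
        # Channel (queue) counts for this adapter.
        ethtool -l "$dev" 2>&1 || true
    done

    echo "=== grep eth /proc/interrupts ==="
    # Per-CPU interrupt counters for the NIC queues.
    grep eth /proc/interrupts 2>&1 || true

    echo "=== tc -s qdisc show ==="
    # Qdisc statistics (drops, requeues, backlog) for all devices.
    tc -s qdisc show 2>&1 || true
}
```

For example, "collect_report eth0 > report.txt" would write the report for one adapter to a file; further interface names can be appended as extra arguments.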