From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: "Toke Høiland-Jørgensen" <toke@redhat.com>,
bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH RFC net-next 1/2] net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.
Date: Tue, 20 Feb 2024 16:32:06 +0100 [thread overview]
Message-ID: <20240220153206.AUZ_zP24@linutronix.de> (raw)
In-Reply-To: <07620deb-2b96-4bcc-a045-480568a27c58@kernel.org>
On 2024-02-20 13:57:24 [+0100], Jesper Dangaard Brouer wrote:
> > so I replaced nr_cpu_ids with 64 and booted with maxcpus=64 so that I can run
> > xdp-bench on the ixgbe.
> >
>
> Yes, ixgbe HW has a limited number of TX queues, and XDP tries to allocate a
> hardware TX queue for every CPU in the system. So, I guess you have too
> many CPUs in your system - lol.
>
> Other drivers have a fallback to a locked XDP TX path, so this is also
> something to look out for in the machine with i40e.
This locked XDP TX path starts at 64 CPUs, but xdp progs are rejected once the CPU count exceeds 64 * 2.
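For context, the locked fallback boils down to: with more CPUs than XDP TX queues, several CPUs share a queue and must serialize on a lock. A minimal sketch of that mapping (queue counts and names are illustrative, not the actual driver code):

```python
def xdp_tx_queue(cpu: int, nr_queues: int) -> int:
    """Map a CPU to an XDP TX queue; with nr_cpus > nr_queues, queues are shared."""
    return cpu % nr_queues

def xdp_tx_needs_lock(nr_cpus: int, nr_queues: int) -> bool:
    """Shared queues need locking once CPUs outnumber the available TX queues."""
    return nr_cpus > nr_queues

# Illustrative: a 64-queue NIC on an 80-CPU box.
print(xdp_tx_needs_lock(80, 64))   # → True
print(xdp_tx_queue(70, 64))        # → 6
```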
> > so: i40e sends, ixgbe receives.
> >
> > -t 2
> >
> > | Summary 2,348,800 rx/s 0 err/s
> > | receive total 2,348,800 pkt/s 2,348,800 drop/s 0 error/s
> > | cpu:0 2,348,800 pkt/s 2,348,800 drop/s 0 error/s
> > | xdp_exception 0 hit/s
> >
>
> This is way too low, with i40e sending.
>
> On my system with only -t 1 my i40e driver can send with approx 15Mpps:
>
> Ethtool(i40e2) stat: 15028585 ( 15,028,585) <= tx-0.packets /sec
> Ethtool(i40e2) stat: 15028589 ( 15,028,589) <= tx_packets /sec
-t 1 on ixgbe:
Show adapter(s) (eth1) statistics (ONLY that changed!)
Ethtool(eth1 ) stat: 107857263 ( 107,857,263) <= tx_bytes /sec
Ethtool(eth1 ) stat: 115047684 ( 115,047,684) <= tx_bytes_nic /sec
Ethtool(eth1 ) stat: 1797621 ( 1,797,621) <= tx_packets /sec
Ethtool(eth1 ) stat: 1797636 ( 1,797,636) <= tx_pkts_nic /sec
Ethtool(eth1 ) stat: 107857263 ( 107,857,263) <= tx_queue_0_bytes /sec
Ethtool(eth1 ) stat: 1797621 ( 1,797,621) <= tx_queue_0_packets /sec
-t i40e
Ethtool(eno2np1 ) stat: 90 ( 90) <= port.rx_bytes /sec
Ethtool(eno2np1 ) stat: 1 ( 1) <= port.rx_size_127 /sec
Ethtool(eno2np1 ) stat: 1 ( 1) <= port.rx_unicast /sec
Ethtool(eno2np1 ) stat: 79554379 ( 79,554,379) <= port.tx_bytes /sec
Ethtool(eno2np1 ) stat: 1243037 ( 1,243,037) <= port.tx_size_64 /sec
Ethtool(eno2np1 ) stat: 1243037 ( 1,243,037) <= port.tx_unicast /sec
Ethtool(eno2np1 ) stat: 86 ( 86) <= rx-32.bytes /sec
Ethtool(eno2np1 ) stat: 1 ( 1) <= rx-32.packets /sec
Ethtool(eno2np1 ) stat: 86 ( 86) <= rx_bytes /sec
Ethtool(eno2np1 ) stat: 1 ( 1) <= rx_cache_waive /sec
Ethtool(eno2np1 ) stat: 1 ( 1) <= rx_packets /sec
Ethtool(eno2np1 ) stat: 1 ( 1) <= rx_unicast /sec
Ethtool(eno2np1 ) stat: 74580821 ( 74,580,821) <= tx-0.bytes /sec
Ethtool(eno2np1 ) stat: 1243014 ( 1,243,014) <= tx-0.packets /sec
Ethtool(eno2np1 ) stat: 74580821 ( 74,580,821) <= tx_bytes /sec
Ethtool(eno2np1 ) stat: 1243014 ( 1,243,014) <= tx_packets /sec
Ethtool(eno2np1 ) stat: 1243037 ( 1,243,037) <= tx_unicast /sec
Mine is slightly slower, but this seems to match what I see on the RX
side.
> At this level, if you can verify that CPU:60 is 100% loaded, and packet
> generator is sending more than rx number, then it could work as a valid
> experiment.
i40e receiving on 8:
%Cpu8 : 0.0 us, 0.0 sy, 0.0 ni, 84.8 id, 0.0 wa, 0.0 hi, 15.2 si, 0.0 st
ixgbe receiving on 13:
%Cpu13 : 0.0 us, 0.0 sy, 0.0 ni, 56.7 id, 0.0 wa, 0.0 hi, 43.3 si, 0.0 st
looks idle. On the sending side kpktgend_0 is always at 100%.
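(The %Cpu lines above are top's per-CPU breakdown; a quick sketch of pulling the softirq share out of such a line, assuming top's default field order:)

```python
def softirq_pct(cpu_line: str) -> float:
    """Extract the softirq (si) percentage from a top per-CPU line."""
    for field in cpu_line.split(","):
        value, _, name = field.rpartition(" ")
        if name == "si":
            return float(value.split()[-1])
    raise ValueError("no si field found")

line = "%Cpu13 : 0.0 us, 0.0 sy, 0.0 ni, 56.7 id, 0.0 wa, 0.0 hi, 43.3 si, 0.0 st"
print(softirq_pct(line))  # → 43.3
```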
> > -t 18
> > | Summary 7,784,946 rx/s 0 err/s
> > | receive total 7,784,946 pkt/s 7,784,946 drop/s 0 error/s
> > | cpu:60 7,784,946 pkt/s 7,784,946 drop/s 0 error/s
> > | xdp_exception 0 hit/s
> >
> > after t18 it drop down to 2,…
> > Now I got worse than before since -t8 says 7,5… and it did 8,4 in the
> > morning. Do you have maybe a .config for me in case I did not enable the
> > performance switch?
> >
>
> I would look for root-cause with perf record +
> perf report --sort cpu,comm,dso,symbol --no-children
While sending with ixgbe, perf top on the box shows:
| Samples: 621K of event 'cycles', 4000 Hz, Event count (approx.): 49979376685 lost: 0/0 drop: 0/0
| Overhead CPU Command Shared Object Symbol
| 31.98% 000 kpktgend_0 [kernel] [k] xas_find
| 6.72% 000 kpktgend_0 [kernel] [k] pfn_to_dma_pte
| 5.63% 000 kpktgend_0 [kernel] [k] ixgbe_xmit_frame_ring
| 4.78% 000 kpktgend_0 [kernel] [k] dma_pte_clear_level
| 3.16% 000 kpktgend_0 [kernel] [k] __iommu_dma_unmap
| 2.30% 000 kpktgend_0 [kernel] [k] fq_ring_free_locked
| 1.99% 000 kpktgend_0 [kernel] [k] __domain_mapping
| 1.82% 000 kpktgend_0 [kernel] [k] iommu_dma_alloc_iova
| 1.80% 000 kpktgend_0 [kernel] [k] __iommu_map
| 1.72% 000 kpktgend_0 [kernel] [k] iommu_pgsize.isra.0
| 1.70% 000 kpktgend_0 [kernel] [k] __iommu_dma_map
| 1.63% 000 kpktgend_0 [kernel] [k] alloc_iova_fast
| 1.59% 000 kpktgend_0 [kernel] [k] _raw_spin_lock_irqsave
| 1.32% 000 kpktgend_0 [kernel] [k] iommu_map
| 1.30% 000 kpktgend_0 [kernel] [k] iommu_dma_map_page
| 1.23% 000 kpktgend_0 [kernel] [k] intel_iommu_iotlb_sync_map
| 1.21% 000 kpktgend_0 [kernel] [k] xa_find_after
| 1.17% 000 kpktgend_0 [kernel] [k] ixgbe_poll
| 1.06% 000 kpktgend_0 [kernel] [k] __iommu_unmap
| 1.04% 000 kpktgend_0 [kernel] [k] intel_iommu_unmap_pages
| 1.01% 000 kpktgend_0 [kernel] [k] free_iova_fast
| 0.96% 000 kpktgend_0 [pktgen] [k] pktgen_thread_worker
And the i40e box while sending:
|Samples: 400K of event 'cycles:P', 4000 Hz, Event count (approx.): 80512443924 lost: 0/0 drop: 0/0
|Overhead CPU Command Shared Object Symbol
| 24.04% 000 kpktgend_0 [kernel] [k] i40e_lan_xmit_frame
| 17.20% 019 swapper [kernel] [k] i40e_napi_poll
| 4.84% 019 swapper [kernel] [k] intel_idle_irq
| 4.20% 019 swapper [kernel] [k] napi_consume_skb
| 3.00% 000 kpktgend_0 [pktgen] [k] pktgen_thread_worker
| 2.76% 008 swapper [kernel] [k] i40e_napi_poll
| 2.36% 000 kpktgend_0 [kernel] [k] dma_map_page_attrs
| 1.93% 019 swapper [kernel] [k] dma_unmap_page_attrs
| 1.70% 008 swapper [kernel] [k] intel_idle_irq
| 1.44% 008 swapper [kernel] [k] __udp4_lib_rcv
| 1.44% 008 swapper [kernel] [k] __netif_receive_skb_core.constprop.0
| 1.40% 008 swapper [kernel] [k] napi_build_skb
| 1.28% 000 kpktgend_0 [kernel] [k] kfree_skb_reason
| 1.27% 008 swapper [kernel] [k] ip_rcv_core
| 1.19% 008 swapper [kernel] [k] inet_gro_receive
| 1.01% 008 swapper [kernel] [k] kmem_cache_free.part.0
> --Jesper
Sebastian
Thread overview: 25+ messages
2024-02-13 14:58 [PATCH RFC net-next 0/2] Use per-task storage for XDP-redirects on PREEMPT_RT Sebastian Andrzej Siewior
2024-02-13 14:58 ` [PATCH RFC net-next 1/2] net: Reference bpf_redirect_info via task_struct " Sebastian Andrzej Siewior
2024-02-13 20:50 ` Jesper Dangaard Brouer
2024-02-14 12:19 ` Sebastian Andrzej Siewior
2024-02-14 13:23 ` Toke Høiland-Jørgensen
2024-02-14 14:28 ` Sebastian Andrzej Siewior
2024-02-14 16:08 ` Toke Høiland-Jørgensen
2024-02-14 16:36 ` Sebastian Andrzej Siewior
2024-02-15 20:23 ` Toke Høiland-Jørgensen
2024-02-16 16:57 ` Sebastian Andrzej Siewior
2024-02-19 19:01 ` Toke Høiland-Jørgensen
2024-02-20 9:17 ` Jesper Dangaard Brouer
2024-02-20 10:17 ` Sebastian Andrzej Siewior
2024-02-20 10:42 ` Jesper Dangaard Brouer
2024-02-20 12:08 ` Sebastian Andrzej Siewior
2024-02-20 12:57 ` Jesper Dangaard Brouer
2024-02-20 15:32 ` Sebastian Andrzej Siewior [this message]
2024-02-22 9:22 ` Sebastian Andrzej Siewior
2024-02-22 10:10 ` Jesper Dangaard Brouer
2024-02-22 10:58 ` Sebastian Andrzej Siewior
2024-02-20 12:10 ` Dave Taht
2024-02-14 16:13 ` Toke Høiland-Jørgensen
2024-02-15 9:04 ` Sebastian Andrzej Siewior
2024-02-15 12:11 ` Toke Høiland-Jørgensen
2024-02-13 14:58 ` [PATCH RFC net-next 2/2] net: Move per-CPU flush-lists to bpf_xdp_storage " Sebastian Andrzej Siewior