netdev.vger.kernel.org archive mirror
* XDP Performance Regression in recent kernel versions
@ 2024-06-18 15:28 Sebastiano Miano
  2024-06-19  6:00 ` Tariq Toukan
  2024-06-19 16:27 ` Jesper Dangaard Brouer
  0 siblings, 2 replies; 20+ messages in thread
From: Sebastiano Miano @ 2024-06-18 15:28 UTC (permalink / raw)
  To: bpf, netdev; +Cc: saeedm, tariqt, hawk, edumazet, kuba, pabeni

Hi folks,

I have been conducting some basic experiments with XDP and have
observed a significant performance regression in recent kernel
versions compared to v5.15.

My setup is the following:
- Hardware: Two machines connected back-to-back with 100G Mellanox
ConnectX-6 Dx.
- DUT: 2x16 core Intel(R) Xeon(R) Silver 4314 CPU @ 2.40GHz.
- Software: xdp-bench program from [1] running on the DUT in both DROP
and TX modes.
- Traffic generator: Pktgen-DPDK sending traffic with a single 64B UDP
flow at ~130Mpps.
- Tests: Single core, HT disabled
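
For reference, the two xdp-bench modes correspond roughly to the minimal
XDP programs sketched below (simplified for illustration; the actual
programs in xdp-tools also keep per-CPU packet counters). They are loaded
in native mode with something like "xdp-bench drop <iface>" and
"xdp-bench tx <iface>".

/* SPDX-License-Identifier: GPL-2.0 */
/* Simplified sketch, not the exact xdp-tools sources.
 * Build with: clang -O2 -g -target bpf -c xdp_bench_sketch.c */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_basic_prog(struct xdp_md *ctx)
{
	/* DROP mode: discard every frame as early as possible */
	return XDP_DROP;
}

SEC("xdp")
int xdp_swap_macs_prog(struct xdp_md *ctx)
{
	void *data     = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;
	struct ethhdr *eth = data;
	unsigned char tmp[ETH_ALEN];

	if ((void *)(eth + 1) > data_end)
		return XDP_ABORTED;

	/* TX mode: swap src/dst MACs and bounce the frame back out of
	 * the same interface */
	__builtin_memcpy(tmp, eth->h_source, ETH_ALEN);
	__builtin_memcpy(eth->h_source, eth->h_dest, ETH_ALEN);
	__builtin_memcpy(eth->h_dest, tmp, ETH_ALEN);
	return XDP_TX;
}

char _license[] SEC("license") = "GPL";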

Results:

Kernel version |-------| XDP_DROP |--------|   XDP_TX  |
5.15                      30Mpps               16.1Mpps
6.2                       21.3Mpps             14.1Mpps
6.5                       19.9Mpps              8.6Mpps
bpf-next (6.10-rc2)       22.1Mpps              9.2Mpps

I repeated the experiments multiple times and consistently obtained
similar results.
Are you aware of any performance regressions in recent kernel versions
that could explain these results?

[1] https://github.com/xdp-project/xdp-tools

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-18 15:28 XDP Performance Regression in recent kernel versions Sebastiano Miano
@ 2024-06-19  6:00 ` Tariq Toukan
  2024-06-19 15:17   ` Sebastiano Miano
  2024-06-19 16:27 ` Jesper Dangaard Brouer
  1 sibling, 1 reply; 20+ messages in thread
From: Tariq Toukan @ 2024-06-19  6:00 UTC (permalink / raw)
  To: Sebastiano Miano, bpf, netdev
  Cc: saeedm, tariqt, hawk, edumazet, kuba, pabeni, Gal Pressman, amira


On 18/06/2024 18:28, Sebastiano Miano wrote:
> Hi folks,
> 
> I have been conducting some basic experiments with XDP and have
> observed a significant performance regression in recent kernel
> versions compared to v5.15.
> 

Hi,

> My setup is the following:
> - Hardware: Two machines connected back-to-back with 100G Mellanox
> ConnectX-6 Dx.
> - DUT: 2x16 core Intel(R) Xeon(R) Silver 4314 CPU @ 2.40GHz.
> - Software: xdp-bench program from [1] running on the DUT in both DROP
> and TX modes.
> - Traffic generator: Pktgen-DPDK sending traffic with a single 64B UDP
> flow at ~130Mpps.
> - Tests: Single core, HT disabled
> 
> Results:
> 
> Kernel version |-------| XDP_DROP |--------|   XDP_TX  |
> 5.15                      30Mpps               16.1Mpps
> 6.2                       21.3Mpps             14.1Mpps
> 6.5                       19.9Mpps              8.6Mpps
> bpf-next (6.10-rc2)       22.1Mpps              9.2Mpps
> 
> I repeated the experiments multiple times and consistently obtained
> similar results.
> Are you aware of any performance regressions in recent kernel versions
> that could explain these results?
> 
> [1] https://github.com/xdp-project/xdp-tools
> 

Thanks for your report.

I assume cpu util for the active core on the DUT is 100% in all cases, 
right?

Can you please share some more details, like relevant ethtool counters
and perf top output?

We'll check if this reproduces for us as well.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-19  6:00 ` Tariq Toukan
@ 2024-06-19 15:17   ` Sebastiano Miano
  0 siblings, 0 replies; 20+ messages in thread
From: Sebastiano Miano @ 2024-06-19 15:17 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: bpf, netdev, saeedm, hawk, edumazet, kuba, pabeni, Gal Pressman,
	amira

On Wed, 19 Jun 2024 at 08:00, Tariq Toukan <tariqt@nvidia.com> wrote:
>
> Thanks for your report.
>
> I assume cpu util for the active core on the DUT is 100% in all cases,
> right?

Yes, that's correct.
The IRQ is also pinned to a core on the right NUMA node, and I have
disabled CPU frequency scaling.

>
> Can you please share some more details? Like relevant ethtool counters,
> and perf top output.
>
> We'll check if this repro for us as well.

Sure, below you can find the reports for the XDP_DROP and XDP_TX cases.
I am attaching only the ones for kernels v5.15 vs v6.5.

--------------------------------------------------
ethtool output (5.15) - Missing counters are zero
--------------------------------------------------
NIC statistics:
     rx_packets: 333854100
     rx_bytes: 20031246044
     tx_packets: 25
     tx_bytes: 2070
     rx_csum_unnecessary: 333854079
     rx_xdp_drop: 3753342954
     rx_xdp_redirect: 0
     rx_xdp_tx_xmit: 5582660674
     rx_xdp_tx_mpwqe: 175018775
     rx_xdp_tx_inlnw: 8970048
     rx_xdp_tx_nops: 378338337
     rx_xdp_tx_full: 0
     rx_xdp_tx_err: 0
     rx_xdp_tx_cqe: 87229072
     rx_cache_reuse: 9369255040
     rx_cache_full: 68
     rx_cache_empty: 16153471
     rx_cache_busy: 193
     rx_cache_waive: 15864256
     rx_congst_umr: 158
     ch_events: 448
     ch_poll: 151091830
     ch_arm: 301
     rx_out_of_buffer: 990473555
     rx_if_down_packets: 67469721
     rx_steer_missed_packets: 1962570491
     rx_vport_unicast_packets: 38460159194
     rx_vport_unicast_bytes: 2461450188460
     tx_vport_unicast_packets: 5582654212
     tx_vport_unicast_bytes: 334959252764
     tx_packets_phy: 5588396729
     rx_packets_phy: 97052087562
     tx_bytes_phy: 357657403514
     rx_bytes_phy: 6211329423080
     tx_mac_control_phy: 5745055
     tx_pause_ctrl_phy: 5745055
     rx_discards_phy: 58591428329
     tx_discards_phy: 0
     tx_errors_phy: 0
     rx_undersize_pkts_phy: 0
     rx_fragments_phy: 0
     rx_jabbers_phy: 0
     rx_64_bytes_phy: 97052040472
     rx_65_to_127_bytes_phy: 3
     rx_128_to_255_bytes_phy: 0
     rx_256_to_511_bytes_phy: 26
     rx_512_to_1023_bytes_phy: 0
     rx_1024_to_1518_bytes_phy: 0
     rx_1519_to_2047_bytes_phy: 0
     rx_2048_to_4095_bytes_phy: 0
     rx_4096_to_8191_bytes_phy: 0
     rx_8192_to_10239_bytes_phy: 0
     rx_prio0_bytes: 6211318150440
     rx_prio0_packets: 38460533605
     rx_prio0_discards: 58591314012
     tx_prio0_bytes: 357288052986
     tx_prio0_packets: 5582625883
     tx_global_pause: 5745042
     tx_global_pause_duration: 771103810
     ch0_events: 55
     ch0_poll: 146981606
     ch0_arm: 35
     ch0_aff_change: 6
     ch0_force_irq: 0
     ch0_eq_rearm: 0
     rx0_packets: 70812690
     rx0_bytes: 4248761400
     rx0_csum_complete: 0
     rx0_csum_complete_tail: 0
     rx0_csum_complete_tail_slow: 0
     rx0_csum_unnecessary: 70812671
     rx0_csum_unnecessary_inner: 0
     rx0_csum_none: 19
     rx0_xdp_drop: 3753342954
     rx0_xdp_redirect: 0
     rx0_lro_packets: 0
     rx0_lro_bytes: 0
     rx0_ecn_mark: 0
     rx0_removed_vlan_packets: 0
     rx0_wqe_err: 0
     rx0_mpwqe_filler_cqes: 0
     rx0_mpwqe_filler_strides: 0
     rx0_oversize_pkts_sw_drop: 0
     rx0_buff_alloc_err: 0
     rx0_cqe_compress_blks: 0
     rx0_cqe_compress_pkts: 0
     rx0_cache_reuse: 9368316609
     rx0_cache_full: 2
     rx0_cache_empty: 11519
     rx0_cache_busy: 0
     rx0_cache_waive: 0
     rx0_congst_umr: 158
     rx0_arfs_err: 0
     rx0_recover: 0
     rx0_xdp_tx_xmit: 5582664928
     rx0_xdp_tx_mpwqe: 175018908
     rx0_xdp_tx_inlnw: 8970048
     rx0_xdp_tx_nops: 378338623
     rx0_xdp_tx_full: 0
     rx0_xdp_tx_err: 0
     rx0_xdp_tx_cqes: 87229139

--------------------------------------------------
perf top output (5.15) - XDP_DROP
--------------------------------------------------
19.27%  [kernel]                  [k] mlx5e_skb_from_cqe_mpwrq_linear
11.74%  [kernel]                  [k] mlx5e_handle_rx_cqe_mpwrq
9.82%   [kernel]                  [k] mlx5e_xdp_handle
9.43%   [kernel]                  [k] mlx5e_alloc_rx_mpwqe
9.29%   bpf_prog_xdp_basic_prog   [k] bpf_prog_5f76c01f0ff23233_xdp_basic_prog
7.06%   [kernel]                  [k] mlx5e_page_release_dynamic
6.95%   [kernel]                  [k] mlx5e_poll_rx_cq
5.89%   [kernel]                  [k] dma_sync_single_for_cpu
5.21%   [kernel]                  [k] dma_sync_single_for_device
4.12%   [kernel]                  [k] mlx5e_free_rx_mpwqe
1.65%   [kernel]                  [k] mlx5e_poll_ico_cq
1.60%   [kernel]                  [k] mlx5e_napi_poll
1.59%   [kernel]                  [k] bpf_get_smp_processor_id
0.94%   [kernel]                  [k] bpf_dispatcher_xdp_func
0.91%   [kernel]                  [k] net_rx_action
0.90%   bpf_prog_xdp_dispatcher   [k] bpf_prog_17d608957d1f805a_xdp_dispatcher
0.90%   [kernel]                  [k] bpf_dispatcher_xdp
0.64%   [kernel]                  [k] mlx5e_post_rx_mpwqes
0.64%   [kernel]                  [k] mlx5e_poll_xdpsq_cq
0.37%   [kernel]                  [k] __softirqentry_text_start

--------------------------------------------------
perf top output (5.15) - XDP_TX
--------------------------------------------------
13.84%  bpf_prog_xdp_swap_macs_prog  [k] bpf_prog_0a3ad412f28cbb6d_xdp_swap_macs_prog
11.43%  [kernel]                     [k] mlx5e_xmit_xdp_buff
10.69%  [kernel]                     [k] mlx5e_skb_from_cqe_mpwrq_linear
9.79%  [kernel]                      [k] mlx5e_xmit_xdp_frame_mpwqe
8.35%  [kernel]                      [k] mlx5e_handle_rx_cqe_mpwrq
6.34%  [kernel]                      [k] dma_sync_single_for_device
6.20%  [kernel]                      [k] mlx5e_poll_rx_cq
5.62%  [kernel]                      [k] mlx5e_page_release_dynamic
5.33%  [kernel]                      [k] mlx5e_xdp_handle
5.21%  [kernel]                      [k] mlx5e_alloc_rx_mpwqe
4.47%  [kernel]                      [k] mlx5e_free_xdpsq_desc
3.26%  [kernel]                      [k] dma_sync_single_for_cpu
1.47%  [kernel]                      [k] mlx5e_xmit_xdp_frame_check_mpwqe
1.22%  [kernel]                      [k] mlx5e_poll_xdpsq_cq
0.95%  [kernel]                      [k] net_rx_action
0.90%  [kernel]                      [k] bpf_get_smp_processor_id
0.80%  [kernel]                      [k] mlx5e_napi_poll
0.69%  [kernel]                      [k] mlx5e_xdp_mpwqe_session_start
0.63%  [kernel]                      [k] mlx5e_poll_ico_cq
0.49%  [kernel]                      [k] bpf_dispatcher_xdp
0.47%  [kernel]                      [k] bpf_dispatcher_xdp_func

---------------------------------------------------------------------------------------

--------------------------------------------------
ethtool output (6.5) - Missing counters are zero
--------------------------------------------------
NIC statistics:
     rx_packets: 7282880
     rx_bytes: 436973482
     tx_packets: 42
     tx_bytes: 3556
     rx_csum_unnecessary: 7282816
     rx_xdp_drop: 7783331724
     rx_xdp_redirect: 0
     rx_xdp_tx_xmit: 46956452544
     rx_xdp_tx_mpwqe: 4401807536
     rx_xdp_tx_inlnw: 46951234092
     rx_xdp_tx_nops: 4988835176
     rx_xdp_tx_full: 0
     rx_xdp_tx_err: 0
     rx_xdp_tx_cqe: 733694572
     rx_pp_alloc_fast: 3641784
     rx_pp_alloc_slow: 8
     rx_pp_alloc_slow_high_order: 0
     rx_pp_alloc_empty: 8
     rx_pp_alloc_refill: 0
     rx_pp_alloc_waive: 0
     rx_pp_recycle_cached: 3641280
     ch_events: 505
     ch_poll: 855423286
     rx_out_of_buffer: 534918379
     rx_if_down_packets: 4044804
     rx_steer_missed_packets: 298
     rx_vport_unicast_packets: 287214261626
     rx_vport_unicast_bytes: 18381712744116
     tx_vport_unicast_packets: 46956452544
     tx_vport_unicast_bytes: 2817387157674
     tx_packets_phy: 47000866603
     rx_packets_phy: 728277471186
     tx_bytes_phy: 3008055468662
     rx_bytes_phy: 46609758231313
     tx_mac_control_phy: 44414017
     tx_pause_ctrl_phy: 44414017
     rx_discards_phy: 441063206498
     rx_64_bytes_phy: 728277470842
     rx_65_to_127_bytes_phy: 133
     rx_128_to_255_bytes_phy: 0
     rx_256_to_511_bytes_phy: 211
     rx_512_to_1023_bytes_phy: 0
     rx_1024_to_1518_bytes_phy: 0
     rx_1519_to_2047_bytes_phy: 0
     rx_2048_to_4095_bytes_phy: 0
     rx_4096_to_8191_bytes_phy: 0
     rx_8192_to_10239_bytes_phy: 0
     rx_buffer_passed_thres_phy: 1192226
     rx_prio0_bytes: 46609758231313
     rx_prio0_packets: 287214264688
     rx_prio0_discards: 441063206498
     tx_prio0_bytes: 3005212971574
     tx_prio0_packets: 46956452586
     tx_global_pause: 44414017
     tx_global_pause_duration: 5961284324
     ch0_events: 120
     ch0_poll: 855423025
     ch0_arm: 100
     ch0_aff_change: 0
     ch0_force_irq: 0
     ch0_eq_rearm: 0
     rx0_packets: 7282880
     rx0_bytes: 436973482
     rx0_csum_complete: 0
     rx0_csum_complete_tail: 0
     rx0_csum_complete_tail_slow: 0
     rx0_csum_unnecessary: 7282816
     rx0_csum_unnecessary_inner: 0
     rx0_csum_none: 64
     rx0_xdp_drop: 7783331724
     rx0_xdp_redirect: 0
     rx0_lro_packets: 0
     rx0_lro_bytes: 0
     rx0_gro_packets: 0
     rx0_gro_bytes: 0
     rx0_gro_skbs: 0
     rx0_gro_match_packets: 0
     rx0_gro_large_hds: 0
     rx0_ecn_mark: 0
     rx0_removed_vlan_packets: 0
     rx0_wqe_err: 0
     rx0_mpwqe_filler_cqes: 0
     rx0_mpwqe_filler_strides: 0
     rx0_oversize_pkts_sw_drop: 0
     rx0_buff_alloc_err: 0
     rx0_cqe_compress_blks: 0
     rx0_cqe_compress_pkts: 0
     rx0_congst_umr: 0
     rx0_arfs_err: 0
     rx0_recover: 0
     rx0_pp_alloc_fast: 3641784
     rx0_pp_alloc_slow: 8
     rx0_pp_alloc_slow_high_order: 0
     rx0_pp_alloc_empty: 8
     rx0_pp_alloc_refill: 0
     rx0_pp_alloc_waive: 0
     rx0_pp_recycle_cached: 3641280
     rx0_pp_recycle_cache_full: 0
     rx0_pp_recycle_ring: 0
     rx0_pp_recycle_ring_full: 0
     rx0_pp_recycle_released_ref: 0
     rx0_xdp_tx_xmit: 46956452544
     rx0_xdp_tx_mpwqe: 4401807536
     rx0_xdp_tx_inlnw: 46951234092
     rx0_xdp_tx_nops: 4988835176
     rx0_xdp_tx_full: 0
     rx0_xdp_tx_err: 0
     rx0_xdp_tx_cqes: 733694572

--------------------------------------------------
perf top output (6.5) - XDP_DROP
--------------------------------------------------
27.63%  [kernel]                [k] mlx5e_skb_from_cqe_mpwrq_linear
12.61%  [kernel]                [k] mlx5e_handle_rx_cqe_mpwrq
8.38%  [kernel]                 [k] mlx5e_rx_cq_process_basic_cqe_comp
7.06%  [kernel]                 [k] page_pool_put_defragged_page
6.45%  [kernel]                 [k] mlx5e_xdp_handle
5.36%  bpf_prog_xdp_basic_prog  [k] bpf_prog_5f76c01f0ff23233_xdp_basic_prog
4.95%  [kernel]                 [k] dma_sync_single_for_device
4.89%  [kernel]                 [k] page_pool_alloc_pages
4.36%  [kernel]                 [k] mlx5e_alloc_rx_mpwqe
3.70%  [kernel]                 [k] dma_sync_single_for_cpu
2.71%  [kernel]                 [k] mlx5e_page_release_fragmented.isra.0
2.09%  [kernel]                 [k] bpf_dispatcher_xdp_func
1.95%  [kernel]                 [k] mlx5e_free_rx_mpwqe
1.10%  [kernel]                 [k] mlx5e_poll_ico_cq
1.07%  [kernel]                 [k] bpf_get_smp_processor_id
1.05%  [kernel]                 [k] mlx5e_napi_poll
0.85%  [kernel]                 [k] mlx5e_poll_xdpsq_cq
0.61%  [kernel]                 [k] net_rx_action
0.58%  bpf_prog_xdp_dispatcher  [k] bpf_prog_17d608957d1f805a_xdp_dispatcher
0.57%  [kernel]                 [k] bpf_dispatcher_xdp
0.53%  [kernel]                 [k] mlx5e_post_rx_mpwqes
0.27%  [kernel]                 [k] __do_softirq
0.25%  [kernel]                 [k] mlx5e_poll_tx_cq

--------------------------------------------------
perf top output (6.5) - XDP_TX
--------------------------------------------------
19.60%  [kernel]                    [k] mlx5e_xdp_mpwqe_add_dseg
14.61%  [kernel]                    [k] mlx5e_skb_from_cqe_mpwrq_linear
11.55%  [kernel]                    [k] mlx5e_xmit_xdp_buff
5.85%  [kernel]                     [k] mlx5e_handle_rx_cqe_mpwrq
5.73%  bpf_prog_xdp_swap_macs_prog  [k] bpf_prog_0a3a_xdp_swap_macs_prog
5.09%  [kernel]                     [k] mlx5e_free_xdpsq_desc
5.08%  [kernel]                     [k] dma_sync_single_for_device
4.66%  [kernel]                     [k] mlx5e_xmit_xdp_frame_mpwqe
3.64%  [kernel]                     [k] mlx5e_rx_cq_process_basic_cqe_comp
3.34%  [kernel]                     [k] page_pool_put_defragged_page
3.04%  [kernel]                     [k] mlx5e_xdp_handle
3.03%  [kernel]                     [k] mlx5e_page_release_fragmented.isra.0
2.56%  [kernel]                     [k] dma_sync_single_for_cpu
2.15%  [kernel]                     [k] mlx5e_alloc_rx_mpwqe
1.96%  [kernel]                     [k] page_pool_alloc_pages
1.06%  [kernel]                     [k] mlx5e_xmit_xdp_frame_check_mpwqe
1.02%  [kernel]                     [k] bpf_dispatcher_xdp_func
1.01%  [kernel]                     [k] mlx5e_free_rx_mpwqe
0.84%  [kernel]                     [k] mlx5e_poll_xdpsq_cq
0.62%  [kernel]                     [k] mlx5e_xdpsq_get_next_pi
0.53%  [kernel]                     [k] mlx5e_poll_ico_cq
0.48%  [kernel]                     [k] bpf_get_smp_processor_id
0.48%  [kernel]                     [k] net_rx_action
0.36%  [kernel]                     [k] mlx5e_napi_poll
0.32%  [kernel]                     [k] mlx5e_xdp_mpwqe_complete
0.25%  [kernel]                     [k] bpf_dispatcher_xdp
0.22%  bpf_prog_xdp_dispatcher      [k] bpf_prog_17d6_xdp_dispatcher
0.21%  [kernel]                     [k] mlx5e_post_rx_mpwqes
0.11%  [kernel]                     [k] __do_softirq

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-18 15:28 XDP Performance Regression in recent kernel versions Sebastiano Miano
  2024-06-19  6:00 ` Tariq Toukan
@ 2024-06-19 16:27 ` Jesper Dangaard Brouer
  2024-06-19 19:17   ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 20+ messages in thread
From: Jesper Dangaard Brouer @ 2024-06-19 16:27 UTC (permalink / raw)
  To: Sebastiano Miano, bpf, netdev, Toke Hoiland Jorgensen,
	Toke Høiland-Jørgensen
  Cc: saeedm, tariqt, edumazet, kuba, pabeni


On 18/06/2024 17.28, Sebastiano Miano wrote:
> Hi folks,
> 
> I have been conducting some basic experiments with XDP and have
> observed a significant performance regression in recent kernel
> versions compared to v5.15.
> 
> My setup is the following:
> - Hardware: Two machines connected back-to-back with 100G Mellanox
> ConnectX-6 Dx.
> - DUT: 2x16 core Intel(R) Xeon(R) Silver 4314 CPU @ 2.40GHz.
> - Software: xdp-bench program from [1] running on the DUT in both DROP
> and TX modes.
> - Traffic generator: Pktgen-DPDK sending traffic with a single 64B UDP
> flow at ~130Mpps.
> - Tests: Single core, HT disabled
> 
> Results:
> 
> Kernel version |-------| XDP_DROP |--------|   XDP_TX  |
> 5.15                      30Mpps               16.1Mpps
> 6.2                       21.3Mpps             14.1Mpps
> 6.5                       19.9Mpps              8.6Mpps
> bpf-next (6.10-rc2)       22.1Mpps              9.2Mpps
> 

Around when I left Red Hat there was a project with [LNST] that used
xdp-bench for tracking and finding regressions like this.

Perhaps Toke can enlighten us on whether that project has caught similar
regressions?

[LNST] https://github.com/LNST-project/lnst


> I repeated the experiments multiple times and consistently obtained
> similar results.
> Are you aware of any performance regressions in recent kernel versions
> that could explain these results?
> 
> [1] https://github.com/xdp-project/xdp-tools


--Jesper

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-19 16:27 ` Jesper Dangaard Brouer
@ 2024-06-19 19:17   ` Toke Høiland-Jørgensen
  2024-06-20  9:52     ` Daniel Borkmann
  0 siblings, 1 reply; 20+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-06-19 19:17 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Sebastiano Miano, bpf, netdev
  Cc: saeedm, tariqt, edumazet, kuba, pabeni, Samuel Dobron

Jesper Dangaard Brouer <hawk@kernel.org> writes:

> On 18/06/2024 17.28, Sebastiano Miano wrote:
>> Hi folks,
>> 
>> I have been conducting some basic experiments with XDP and have
>> observed a significant performance regression in recent kernel
>> versions compared to v5.15.
>> 
>> My setup is the following:
>> - Hardware: Two machines connected back-to-back with 100G Mellanox
>> ConnectX-6 Dx.
>> - DUT: 2x16 core Intel(R) Xeon(R) Silver 4314 CPU @ 2.40GHz.
>> - Software: xdp-bench program from [1] running on the DUT in both DROP
>> and TX modes.
>> - Traffic generator: Pktgen-DPDK sending traffic with a single 64B UDP
>> flow at ~130Mpps.
>> - Tests: Single core, HT disabled
>> 
>> Results:
>> 
>> Kernel version |-------| XDP_DROP |--------|   XDP_TX  |
>> 5.15                      30Mpps               16.1Mpps
>> 6.2                       21.3Mpps             14.1Mpps
>> 6.5                       19.9Mpps              8.6Mpps
>> bpf-next (6.10-rc2)       22.1Mpps              9.2Mpps
>> 
>
> Around when I left Red Hat there were a project with [LNST] that used
> xdp-bench for tracking and finding regressions like this.
>
> Perhaps Toke can enlighten us, if that project have caught similar 
> regressions?
>
> [LNST] https://github.com/LNST-project/lnst

Yes, actually, we have! Here's the bugzilla for it:
https://bugzilla.redhat.com/show_bug.cgi?id=2270408

I'm on PTO for the rest of this week, but I'm adding Samuel, who ran the
tests, to Cc; he should be able to provide more information if needed.

-Toke


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-19 19:17   ` Toke Høiland-Jørgensen
@ 2024-06-20  9:52     ` Daniel Borkmann
  2024-06-21 12:35       ` Samuel Dobron
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2024-06-20  9:52 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Jesper Dangaard Brouer,
	Sebastiano Miano, bpf, netdev
  Cc: saeedm, tariqt, edumazet, kuba, pabeni, Samuel Dobron, netdev

On 6/19/24 9:17 PM, Toke Høiland-Jørgensen wrote:
> Jesper Dangaard Brouer <hawk@kernel.org> writes:
>> On 18/06/2024 17.28, Sebastiano Miano wrote:
>>> I have been conducting some basic experiments with XDP and have
>>> observed a significant performance regression in recent kernel
>>> versions compared to v5.15.
>>>
>>> My setup is the following:
>>> - Hardware: Two machines connected back-to-back with 100G Mellanox
>>> ConnectX-6 Dx.
>>> - DUT: 2x16 core Intel(R) Xeon(R) Silver 4314 CPU @ 2.40GHz.
>>> - Software: xdp-bench program from [1] running on the DUT in both DROP
>>> and TX modes.
>>> - Traffic generator: Pktgen-DPDK sending traffic with a single 64B UDP
>>> flow at ~130Mpps.
>>> - Tests: Single core, HT disabled
>>>
>>> Results:
>>>
>>> Kernel version |-------| XDP_DROP |--------|   XDP_TX  |
>>> 5.15                      30Mpps               16.1Mpps
>>> 6.2                       21.3Mpps             14.1Mpps
>>> 6.5                       19.9Mpps              8.6Mpps
>>> bpf-next (6.10-rc2)       22.1Mpps              9.2Mpps
>>>
>>
>> Around when I left Red Hat there were a project with [LNST] that used
>> xdp-bench for tracking and finding regressions like this.
>>
>> Perhaps Toke can enlighten us, if that project have caught similar
>> regressions?
>>
>> [LNST] https://github.com/LNST-project/lnst
> 
> Yes, actually, we have! Here's the bugzilla for it:
> https://bugzilla.redhat.com/show_bug.cgi?id=2270408

 > We compared performance of ELN and RHEL9 candidate kernels and noticed significant
 > drop in XDP drop [1] on mlx5 (25G).
 >
 > On any rhel9 candidate kernel we are able to drop 19-20M pkts/sec but on an ELN
 > kernels, we are reaching just 15M pkts/sec (CPU utillization remains the same -
 > around 100%).
 >
 > We don't see such regression on ixgbe or i40e.

It looks like this is known since March, was this ever reported to Nvidia back
then? :/

Given XDP is in the critical path for many in production, we should think about
regular performance reporting for the different vendors for each released kernel,
similar to here [0].

Out of curiosity, @Saeed: Is Nvidia internally regularly assessing XDP perf for mlx5
as part of QA? (Probably not, but I thought I'd ask.)

Thanks,
Daniel

   [0] http://core.dpdk.org/perf-reports/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-20  9:52     ` Daniel Borkmann
@ 2024-06-21 12:35       ` Samuel Dobron
  2024-06-24 11:46         ` Toke Høiland-Jørgensen
  2024-06-30 11:43         ` Tariq Toukan
  0 siblings, 2 replies; 20+ messages in thread
From: Samuel Dobron @ 2024-06-21 12:35 UTC (permalink / raw)
  To: Daniel Borkmann, hawk
  Cc: Toke Høiland-Jørgensen, Sebastiano Miano, bpf, netdev,
	saeedm, tariqt, edumazet, kuba, pabeni

Hey all,

Yeah, we do tests for ELN kernels [1] on a regular basis, since
~January of this year.

As already mentioned, mlx5 is the only driver affected by this regression.
Unfortunately, I think Jesper is actually hitting 2 regressions we noticed:
the one already mentioned by Toke, and another one [0] that was reported
in early February.
Btw, the issue mentioned by Toke has been moved to Jira, see [5].

Not sure all of you are able to see the content of [0]; Jira says it's
RH-confidential.
So, I am not sure how much I can share without being fired :D. Anyway,
the affected kernels were released a while ago, so anyone can find it
on their own.
Basically, we detected a 5% regression on XDP_DROP+mlx5 (currently, we
don't have data for any other XDP mode) in kernel-5.14 compared to
previous builds.

From the test history, I can see (most likely) the same improvement
on 6.10rc2 (from 15Mpps to 17-18Mpps), so I'd say the 20% drop has been
(partially) fixed?

For earlier 6.10 kernels we don't have data due to [3] (there is a
regression on XDP_DROP as well, but I believe it's a turbo-boost issue,
as I mentioned in the issue).
So if you want to run tests on 6.10, please see [3].

Summary XDP_DROP+mlx5@25G:
kernel      pps
<5.14       20.5M     baseline
>=5.14      19M       [0]
<6.4        19-20M    baseline for ELN kernels
>=6.4       15M       [4 and 5] (mentioned by Toke)
>=6.10      ???       [3]
>=6.10rc2   17M-18M


> It looks like this is known since March, was this ever reported to Nvidia back
> then? :/

Not sure if that's a question for me; I was told that filing an issue in
Bugzilla/Jira is where our competence ends. Who is supposed to report it
to them?

> Given XDP is in the critical path for many in production, we should think about
> regular performance reporting for the different vendors for each released kernel,
> similar to here [0].

I think this might be part of the upstream kernel testing with LNST?
Maybe Jesper knows more about that. Until then, I think I can let you
know about new regressions we catch.

Thanks,
Sam.

[0] https://issues.redhat.com/browse/RHEL-24054
[1] https://koji.fedoraproject.org/koji/search?terms=kernel-%5Cd.*eln*&type=build&match=regexp
[2] https://koji.fedoraproject.org/koji/buildinfo?buildID=2469107
[3] https://bugzilla.redhat.com/show_bug.cgi?id=2282969
[4] https://bugzilla.redhat.com/show_bug.cgi?id=2270408
[5] https://issues.redhat.com/browse/RHEL-24054


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-21 12:35       ` Samuel Dobron
@ 2024-06-24 11:46         ` Toke Høiland-Jørgensen
  2024-06-30 10:25           ` Tariq Toukan
  2024-06-30 11:43         ` Tariq Toukan
  1 sibling, 1 reply; 20+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-06-24 11:46 UTC (permalink / raw)
  To: Samuel Dobron, Daniel Borkmann, hawk
  Cc: Sebastiano Miano, bpf, netdev, saeedm, tariqt, edumazet, kuba,
	pabeni

Samuel Dobron <sdobron@redhat.com> writes:

>> It looks like this is known since March, was this ever reported to Nvidia back
>> then? :/
>
> Not sure if that's a question for me, I was told, filling an issue in
> Bugzilla/Jira is where
> our competences end. Who is supposed to report it to them?

I don't think we have a formal reporting procedure, but I was planning
to send this to the list, referencing the Bugzilla entry. Seems I
dropped the ball on that; sorry! :(

Can we set up a better reporting procedure for this going forward? A
mailing list, or just a name we can put in reports? Or something else?
Tariq, any preferences?

-Toke


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-24 11:46         ` Toke Høiland-Jørgensen
@ 2024-06-30 10:25           ` Tariq Toukan
  2024-07-22 10:57             ` Samuel Dobron
  0 siblings, 1 reply; 20+ messages in thread
From: Tariq Toukan @ 2024-06-30 10:25 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Samuel Dobron, Daniel Borkmann,
	hawk
  Cc: Sebastiano Miano, bpf, netdev, saeedm, edumazet, kuba, pabeni,
	Dragos Tatulea



On 24/06/2024 14:46, Toke Høiland-Jørgensen wrote:
> Samuel Dobron <sdobron@redhat.com> writes:
> 
>>> It looks like this is known since March, was this ever reported to Nvidia back
>>> then? :/
>>
>> Not sure if that's a question for me, I was told, filling an issue in
>> Bugzilla/Jira is where
>> our competences end. Who is supposed to report it to them?
> 
> I don't think we have a formal reporting procedure, but I was planning
> to send this to the list, referencing the Bugzilla entry. Seems I
> dropped the ball on that; sorry! :(
> 
> Can we set up a better reporting procedure for this going forward? A
> mailing list, or just a name we can put in reports? Or something else?
> Tariq, any preferences?
> 
> -Toke
> 

Hi,
Please add Dragos and me to XDP mailing list reports.

Regards,
Tariq

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-21 12:35       ` Samuel Dobron
  2024-06-24 11:46         ` Toke Høiland-Jørgensen
@ 2024-06-30 11:43         ` Tariq Toukan
  2024-07-22  9:26           ` Dragos Tatulea
  1 sibling, 1 reply; 20+ messages in thread
From: Tariq Toukan @ 2024-06-30 11:43 UTC (permalink / raw)
  To: Samuel Dobron, Daniel Borkmann, hawk, Dragos Tatulea
  Cc: Toke Høiland-Jørgensen, Sebastiano Miano, bpf, netdev,
	saeedm, edumazet, kuba, pabeni



On 21/06/2024 15:35, Samuel Dobron wrote:
> Hey all,
> 
> Yeah, we do tests for ELN kernels [1] on a regular basis. Since
> ~January of this year.
> 
> As already mentioned, mlx5 is the only driver affected by this regression.
> Unfortunately, I think Jesper is actually hitting 2 regressions we noticed,
> the one already mentioned by Toke, another one [0] has been reported
> in early February.
> Btw. issue mentioned by Toke has been moved to Jira, see [5].
> 
> Not sure all of you are able to see the content of [0], Jira says it's
> RH-confidental.
> So, I am not sure how much I can share without being fired :D. Anyway,
> affected kernels have been released a while ago, so anyone can find it
> on its own.
> Basically, we detected 5% regression on XDP_DROP+mlx5 (currently, we
> don't have data for any other XDP mode) in kernel-5.14 compared to
> previous builds.
> 
>  From tests history, I can see (most likely) the same improvement
> on 6.10rc2 (from 15Mpps to 17-18Mpps), so I'd say 20% drop has been
> (partially) fixed?
> 
> For earlier 6.10. kernels we don't have data due to [3] (there is regression on
> XDP_DROP as well, but I believe it's turbo-boost issue, as I mentioned
> in issue).
> So if you want to run tests on 6.10. please see [3].
> 
> Summary XDP_DROP+mlx5@25G:
> kernel       pps
> <5.14        20.5M        baseline
>> =5.14      19M           [0]
> <6.4          19-20M      baseline for ELN kernels
>> =6.4        15M           [4 and 5] (mentioned by Toke)

+ @Dragos

That's about when we added several changes to the RX datapath.
Most relevant are:
- Fully removing the in-driver RX page-cache.
- Refactoring to support XDP multi-buffer.
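
For context, below is a simplified sketch of the page_pool-based
recycling that replaced the in-driver RX page-cache (this is the path
behind the rx_pp_* / rx0_pp_* counters and the page_pool_* symbols in
the 6.5 output earlier in the thread). It is not the actual mlx5e code;
the helper names and values are made up for illustration.

/* Kernel-internal sketch; function names are hypothetical, values
 * illustrative. The header is <net/page_pool.h> up to v6.5 (moved
 * under net/page_pool/ in later kernels). */
#include <net/page_pool.h>

static struct page_pool *rx_ring_pool_create(struct device *dma_dev,
					     int numa_node, u32 ring_size)
{
	struct page_pool_params pp = {
		.flags     = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
		.order     = 0,                 /* one page per RX buffer */
		.pool_size = ring_size,         /* cache ~one ring worth of pages */
		.nid       = numa_node,
		.dev       = dma_dev,
		.dma_dir   = DMA_BIDIRECTIONAL, /* XDP_TX writes back into the page */
		.max_len   = PAGE_SIZE,         /* sync-for-device length */
	};

	return page_pool_create(&pp);           /* ERR_PTR() on failure */
}

/* RX refill: fast-path hits show up as rx_pp_alloc_fast */
static struct page *rx_ring_alloc_page(struct page_pool *pool)
{
	return page_pool_dev_alloc_pages(pool);
}

/* Completion after XDP_DROP: recycle directly (rx_pp_recycle_cached) */
static void rx_ring_recycle_page(struct page_pool *pool, struct page *page)
{
	page_pool_put_full_page(pool, page, true /* in NAPI context */);
}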

We tested XDP performance before submission; I don't recall noticing
such a degradation.

I'll check with Dragos as he probably has these reports.

>> =6.10      ???            [3]
>> =6.10rc2 17M-18M
> 
> 
>> It looks like this is known since March, was this ever reported to Nvidia back
>> then? :/
> 
> Not sure if that's a question for me, I was told, filling an issue in
> Bugzilla/Jira is where
> our competences end. Who is supposed to report it to them?
> 
>> Given XDP is in the critical path for many in production, we should think about
>> regular performance reporting for the different vendors for each released kernel,
>> similar to here [0].
> 
> I think this might be the part of upstream kernel testing with LNST?
> Maybe Jesper
> knows more about that? Until then, I think, I can let you know about
> new regressions we catch.
> 
> Thanks,
> Sam.
> 
> [0] https://issues.redhat.com/browse/RHEL-24054
> [1] https://koji.fedoraproject.org/koji/search?terms=kernel-%5Cd.*eln*&type=build&match=regexp
> [2] https://koji.fedoraproject.org/koji/buildinfo?buildID=2469107
> [3] https://bugzilla.redhat.com/show_bug.cgi?id=2282969
> [4] https://bugzilla.redhat.com/show_bug.cgi?id=2270408
> [5] https://issues.redhat.com/browse/RHEL-24054
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-30 11:43         ` Tariq Toukan
@ 2024-07-22  9:26           ` Dragos Tatulea
  2024-07-23  9:52             ` Carolina Jubran
  0 siblings, 1 reply; 20+ messages in thread
From: Dragos Tatulea @ 2024-07-22  9:26 UTC (permalink / raw)
  To: Tariq Toukan, daniel@iogearbox.net, Carolina Jubran,
	sdobron@redhat.com, hawk@kernel.org
  Cc: toke@redhat.com, mianosebastiano@gmail.com, pabeni@redhat.com,
	netdev@vger.kernel.org, edumazet@google.com, Saeed Mahameed,
	bpf@vger.kernel.org, kuba@kernel.org

On Sun, 2024-06-30 at 14:43 +0300, Tariq Toukan wrote:
> 
> On 21/06/2024 15:35, Samuel Dobron wrote:
> > Hey all,
> > 
> > Yeah, we do tests for ELN kernels [1] on a regular basis. Since
> > ~January of this year.
> > 
> > As already mentioned, mlx5 is the only driver affected by this regression.
> > Unfortunately, I think Jesper is actually hitting 2 regressions we noticed,
> > the one already mentioned by Toke, another one [0] has been reported
> > in early February.
> > Btw. issue mentioned by Toke has been moved to Jira, see [5].
> > 
> > Not sure all of you are able to see the content of [0], Jira says it's
> > RH-confidental.
> > So, I am not sure how much I can share without being fired :D. Anyway,
> > affected kernels have been released a while ago, so anyone can find it
> > on its own.
> > Basically, we detected 5% regression on XDP_DROP+mlx5 (currently, we
> > don't have data for any other XDP mode) in kernel-5.14 compared to
> > previous builds.
> > 
> >  From tests history, I can see (most likely) the same improvement
> > on 6.10rc2 (from 15Mpps to 17-18Mpps), so I'd say 20% drop has been
> > (partially) fixed?
> > 
> > For earlier 6.10. kernels we don't have data due to [3] (there is regression on
> > XDP_DROP as well, but I believe it's turbo-boost issue, as I mentioned
> > in issue).
> > So if you want to run tests on 6.10. please see [3].
> > 
> > Summary XDP_DROP+mlx5@25G:
> > kernel       pps
> > <5.14        20.5M        baseline
> > > =5.14      19M           [0]
> > <6.4          19-20M      baseline for ELN kernels
> > > =6.4        15M           [4 and 5] (mentioned by Toke)
> 
> + @Dragos
> 
> That's about when we added several changes to the RX datapath.
> Most relevant are:
> - Fully removing the in-driver RX page-cache.
> - Refactoring to support XDP multi-buffer.
> 
> We tested XDP performance before submission, I don't recall we noticed 
> such a degradation.

Adding Carolina to post her analysis on this.

> 
> I'll check with Dragos as he probably has these reports.
> 
We only noticed a 6% degradation for XDP_DROP.

https://lore.kernel.org/netdev/b6fcfa8b-c2b3-8a92-fb6e-0760d5f6f5ff@redhat.com/T/

> > > =6.10      ???            [3]
> > > =6.10rc2 17M-18M
> > 
> > 
> > > It looks like this is known since March, was this ever reported to Nvidia back
> > > then? :/
> > 
> > Not sure if that's a question for me, I was told, filling an issue in
> > Bugzilla/Jira is where
> > our competences end. Who is supposed to report it to them?
> > 
> > > Given XDP is in the critical path for many in production, we should think about
> > > regular performance reporting for the different vendors for each released kernel,
> > > similar to here [0].
> > 
> > I think this might be the part of upstream kernel testing with LNST?
> > Maybe Jesper
> > knows more about that? Until then, I think, I can let you know about
> > new regressions we catch.
> > 
> > Thanks,
> > Sam.
> > 
> > [0] https://issues.redhat.com/browse/RHEL-24054
> > [1] https://koji.fedoraproject.org/koji/search?terms=kernel-%5Cd.*eln*&type=build&match=regexp
> > [2] https://koji.fedoraproject.org/koji/buildinfo?buildID=2469107
> > [3] https://bugzilla.redhat.com/show_bug.cgi?id=2282969
> > [4] https://bugzilla.redhat.com/show_bug.cgi?id=2270408
> > [5] https://issues.redhat.com/browse/RHEL-24054
> > 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-06-30 10:25           ` Tariq Toukan
@ 2024-07-22 10:57             ` Samuel Dobron
  0 siblings, 0 replies; 20+ messages in thread
From: Samuel Dobron @ 2024-07-22 10:57 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: Toke Høiland-Jørgensen, Daniel Borkmann, hawk,
	Sebastiano Miano, bpf, netdev, saeedm, edumazet, kuba, pabeni,
	Dragos Tatulea

Hey,

Sorry for the wait.
I've started a discussion within our team on how to handle this, since
we don't have a reporting process defined. It may take some time; I'll
let you know.

Thanks,
Sam.

On Sun, Jun 30, 2024 at 12:26 PM Tariq Toukan <tariqt@nvidia.com> wrote:
>
>
>
> On 24/06/2024 14:46, Toke Høiland-Jørgensen wrote:
> > Samuel Dobron <sdobron@redhat.com> writes:
> >
> >>> It looks like this is known since March, was this ever reported to Nvidia back
> >>> then? :/
> >>
> >> Not sure if that's a question for me, I was told, filling an issue in
> >> Bugzilla/Jira is where
> >> our competences end. Who is supposed to report it to them?
> >
> > I don't think we have a formal reporting procedure, but I was planning
> > to send this to the list, referencing the Bugzilla entry. Seems I
> > dropped the ball on that; sorry! :(
> >
> > Can we set up a better reporting procedure for this going forward? A
> > mailing list, or just a name we can put in reports? Or something else?
> > Tariq, any preferences?
> >
> > -Toke
> >
>
> Hi,
> Please add Dragos and me on XDP mailing list reports.
>
> Regards,
> Tariq
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-07-22  9:26           ` Dragos Tatulea
@ 2024-07-23  9:52             ` Carolina Jubran
  2024-07-24 15:36               ` Toke Høiland-Jørgensen
  2024-07-30 11:04               ` Samuel Dobron
  0 siblings, 2 replies; 20+ messages in thread
From: Carolina Jubran @ 2024-07-23  9:52 UTC (permalink / raw)
  To: Dragos Tatulea, Tariq Toukan, daniel@iogearbox.net,
	sdobron@redhat.com, hawk@kernel.org, mianosebastiano@gmail.com
  Cc: toke@redhat.com, pabeni@redhat.com, netdev@vger.kernel.org,
	edumazet@google.com, Saeed Mahameed, bpf@vger.kernel.org,
	kuba@kernel.org



On 22/07/2024 12:26, Dragos Tatulea wrote:
> On Sun, 2024-06-30 at 14:43 +0300, Tariq Toukan wrote:
>>
>> On 21/06/2024 15:35, Samuel Dobron wrote:
>>> Hey all,
>>>
>>> Yeah, we do tests for ELN kernels [1] on a regular basis. Since
>>> ~January of this year.
>>>
>>> As already mentioned, mlx5 is the only driver affected by this regression.
>>> Unfortunately, I think Jesper is actually hitting 2 regressions we noticed,
>>> the one already mentioned by Toke, another one [0] has been reported
>>> in early February.
>>> Btw. issue mentioned by Toke has been moved to Jira, see [5].
>>>
>>> Not sure all of you are able to see the content of [0], Jira says it's
>>> RH-confidental.
>>> So, I am not sure how much I can share without being fired :D. Anyway,
>>> affected kernels have been released a while ago, so anyone can find it
>>> on its own.
>>> Basically, we detected 5% regression on XDP_DROP+mlx5 (currently, we
>>> don't have data for any other XDP mode) in kernel-5.14 compared to
>>> previous builds.
>>>
>>>   From tests history, I can see (most likely) the same improvement
>>> on 6.10rc2 (from 15Mpps to 17-18Mpps), so I'd say 20% drop has been
>>> (partially) fixed?
>>>
>>> For earlier 6.10. kernels we don't have data due to [3] (there is regression on
>>> XDP_DROP as well, but I believe it's turbo-boost issue, as I mentioned
>>> in issue).
>>> So if you want to run tests on 6.10. please see [3].
>>>
>>> Summary XDP_DROP+mlx5@25G:
>>> kernel       pps
>>> <5.14        20.5M        baseline
>>>> =5.14      19M           [0]
>>> <6.4          19-20M      baseline for ELN kernels
>>>> =6.4        15M           [4 and 5] (mentioned by Toke)
>>
>> + @Dragos
>>
>> That's about when we added several changes to the RX datapath.
>> Most relevant are:
>> - Fully removing the in-driver RX page-cache.
>> - Refactoring to support XDP multi-buffer.
>>
>> We tested XDP performance before submission, I don't recall we noticed
>> such a degradation.
> 
> Adding Carolina to post her analysis on this.

Hey everyone,

After investigating the issue, it seems the performance degradation is 
linked to the commit "x86/bugs: Report Intel retbleed vulnerability"
(6ad0ad2bf8a67).

This commit addresses the Intel retbleed vulnerability and introduces
mitigation measures that impact performance, especially the Spectre v2
mitigations.


Disabling these mitigations in the kernel arguments
(spectre_v2=off ibrs=off) resolved the degradation in my tests.

Could you try adding the mentioned parameters to your kernel arguments
and check if you still see the degradation?

Thank you,

Carolina.

> 
>>
>> I'll check with Dragos as he probably has these reports.
>>
> We only noticed a 6% degradation for XDP_XDROP.
> 
> https://lore.kernel.org/netdev/b6fcfa8b-c2b3-8a92-fb6e-0760d5f6f5ff@redhat.com/T/
> 
>>>> =6.10      ???            [3]
>>>> =6.10rc2 17M-18M
>>>
>>>
>>>> It looks like this is known since March, was this ever reported to Nvidia back
>>>> then? :/
>>>
>>> Not sure if that's a question for me, I was told, filling an issue in
>>> Bugzilla/Jira is where
>>> our competences end. Who is supposed to report it to them?
>>>
>>>> Given XDP is in the critical path for many in production, we should think about
>>>> regular performance reporting for the different vendors for each released kernel,
>>>> similar to here [0].
>>>
>>> I think this might be the part of upstream kernel testing with LNST?
>>> Maybe Jesper
>>> knows more about that? Until then, I think, I can let you know about
>>> new regressions we catch.
>>>
>>> Thanks,
>>> Sam.
>>>
>>> [0] https://issues.redhat.com/browse/RHEL-24054
>>> [1] https://koji.fedoraproject.org/koji/search?terms=kernel-%5Cd.*eln*&type=build&match=regexp
>>> [2] https://koji.fedoraproject.org/koji/buildinfo?buildID=2469107
>>> [3] https://bugzilla.redhat.com/show_bug.cgi?id=2282969
>>> [4] https://bugzilla.redhat.com/show_bug.cgi?id=2270408
>>> [5] https://issues.redhat.com/browse/RHEL-24054
>>>
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-07-23  9:52             ` Carolina Jubran
@ 2024-07-24 15:36               ` Toke Høiland-Jørgensen
  2024-07-25 12:27                 ` Samuel Dobron
  2024-07-26  8:09                 ` Dragos Tatulea
  2024-07-30 11:04               ` Samuel Dobron
  1 sibling, 2 replies; 20+ messages in thread
From: Toke Høiland-Jørgensen @ 2024-07-24 15:36 UTC (permalink / raw)
  To: Carolina Jubran, Dragos Tatulea, Tariq Toukan,
	daniel@iogearbox.net, sdobron@redhat.com, hawk@kernel.org,
	mianosebastiano@gmail.com
  Cc: pabeni@redhat.com, netdev@vger.kernel.org, edumazet@google.com,
	Saeed Mahameed, bpf@vger.kernel.org, kuba@kernel.org

Carolina Jubran <cjubran@nvidia.com> writes:

> On 22/07/2024 12:26, Dragos Tatulea wrote:
>> On Sun, 2024-06-30 at 14:43 +0300, Tariq Toukan wrote:
>>>
>>> On 21/06/2024 15:35, Samuel Dobron wrote:
>>>> Hey all,
>>>>
>>>> Yeah, we do tests for ELN kernels [1] on a regular basis. Since
>>>> ~January of this year.
>>>>
>>>> As already mentioned, mlx5 is the only driver affected by this regression.
>>>> Unfortunately, I think Jesper is actually hitting 2 regressions we noticed,
>>>> the one already mentioned by Toke, another one [0] has been reported
>>>> in early February.
>>>> Btw. issue mentioned by Toke has been moved to Jira, see [5].
>>>>
>>>> Not sure all of you are able to see the content of [0], Jira says it's
>>>> RH-confidental.
>>>> So, I am not sure how much I can share without being fired :D. Anyway,
>>>> affected kernels have been released a while ago, so anyone can find it
>>>> on its own.
>>>> Basically, we detected 5% regression on XDP_DROP+mlx5 (currently, we
>>>> don't have data for any other XDP mode) in kernel-5.14 compared to
>>>> previous builds.
>>>>
>>>>   From tests history, I can see (most likely) the same improvement
>>>> on 6.10rc2 (from 15Mpps to 17-18Mpps), so I'd say 20% drop has been
>>>> (partially) fixed?
>>>>
>>>> For earlier 6.10. kernels we don't have data due to [3] (there is regression on
>>>> XDP_DROP as well, but I believe it's turbo-boost issue, as I mentioned
>>>> in issue).
>>>> So if you want to run tests on 6.10. please see [3].
>>>>
>>>> Summary XDP_DROP+mlx5@25G:
>>>> kernel       pps
>>>> <5.14        20.5M        baseline
>>>>> =5.14      19M           [0]
>>>> <6.4          19-20M      baseline for ELN kernels
>>>>> =6.4        15M           [4 and 5] (mentioned by Toke)
>>>
>>> + @Dragos
>>>
>>> That's about when we added several changes to the RX datapath.
>>> Most relevant are:
>>> - Fully removing the in-driver RX page-cache.
>>> - Refactoring to support XDP multi-buffer.
>>>
>>> We tested XDP performance before submission, I don't recall we noticed
>>> such a degradation.
>> 
>> Adding Carolina to post her analysis on this.
>
> Hey everyone,
>
> After investigating the issue, it seems the performance degradation is 
> linked to the commit "x86/bugs: Report Intel retbleed vulnerability"
> (6ad0ad2bf8a67).

Hmm, that commit is from June 2022, and according to Samuel's tests,
this issue was introduced sometime between commits b6dad5178cea and
40f71e7cd3c6 (both of which are dated in June 2023). Besides, if it was
a retbleed mitigation issue, that would affect other drivers as well,
no? Our testing only shows this regression on mlx5, not on the intel
drivers.


>>> I'll check with Dragos as he probably has these reports.
>>>
>> We only noticed a 6% degradation for XDP_XDROP.
>> 
>> https://lore.kernel.org/netdev/b6fcfa8b-c2b3-8a92-fb6e-0760d5f6f5ff@redhat.com/T/

That message mentions that "This will be handled in a different patch
series by adding support for multi-packet per page." - did that ever go
in?

-Toke


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-07-24 15:36               ` Toke Høiland-Jørgensen
@ 2024-07-25 12:27                 ` Samuel Dobron
  2024-07-26  8:09                 ` Dragos Tatulea
  1 sibling, 0 replies; 20+ messages in thread
From: Samuel Dobron @ 2024-07-25 12:27 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Carolina Jubran, Dragos Tatulea, Tariq Toukan,
	daniel@iogearbox.net, hawk@kernel.org, mianosebastiano@gmail.com,
	pabeni@redhat.com, netdev@vger.kernel.org, edumazet@google.com,
	Saeed Mahameed, bpf@vger.kernel.org, kuba@kernel.org

Confirming that this is just an mlx5 issue; Intel is fine.

I just did a quick test with Spectre v2 mitigations disabled [0].
The performance remains the same, no difference at all.

Sam.

[0]:
$ cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
Vulnerable; IBPB: disabled; STIBP: disabled; PBRSB-eIBRS: Vulnerable;
BHI: Vulnerable

On Wed, Jul 24, 2024 at 5:48 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Carolina Jubran <cjubran@nvidia.com> writes:
>
> > On 22/07/2024 12:26, Dragos Tatulea wrote:
> >> On Sun, 2024-06-30 at 14:43 +0300, Tariq Toukan wrote:
> >>>
> >>> On 21/06/2024 15:35, Samuel Dobron wrote:
> >>>> Hey all,
> >>>>
> >>>> Yeah, we do tests for ELN kernels [1] on a regular basis. Since
> >>>> ~January of this year.
> >>>>
> >>>> As already mentioned, mlx5 is the only driver affected by this regression.
> >>>> Unfortunately, I think Jesper is actually hitting 2 regressions we noticed,
> >>>> the one already mentioned by Toke, another one [0] has been reported
> >>>> in early February.
> >>>> Btw. issue mentioned by Toke has been moved to Jira, see [5].
> >>>>
> >>>> Not sure all of you are able to see the content of [0], Jira says it's
> >>>> RH-confidental.
> >>>> So, I am not sure how much I can share without being fired :D. Anyway,
> >>>> affected kernels have been released a while ago, so anyone can find it
> >>>> on its own.
> >>>> Basically, we detected 5% regression on XDP_DROP+mlx5 (currently, we
> >>>> don't have data for any other XDP mode) in kernel-5.14 compared to
> >>>> previous builds.
> >>>>
> >>>>   From tests history, I can see (most likely) the same improvement
> >>>> on 6.10rc2 (from 15Mpps to 17-18Mpps), so I'd say 20% drop has been
> >>>> (partially) fixed?
> >>>>
> >>>> For earlier 6.10. kernels we don't have data due to [3] (there is regression on
> >>>> XDP_DROP as well, but I believe it's turbo-boost issue, as I mentioned
> >>>> in issue).
> >>>> So if you want to run tests on 6.10. please see [3].
> >>>>
> >>>> Summary XDP_DROP+mlx5@25G:
> >>>> kernel       pps
> >>>> <5.14        20.5M        baseline
> >>>>> =5.14      19M           [0]
> >>>> <6.4          19-20M      baseline for ELN kernels
> >>>>> =6.4        15M           [4 and 5] (mentioned by Toke)
> >>>
> >>> + @Dragos
> >>>
> >>> That's about when we added several changes to the RX datapath.
> >>> Most relevant are:
> >>> - Fully removing the in-driver RX page-cache.
> >>> - Refactoring to support XDP multi-buffer.
> >>>
> >>> We tested XDP performance before submission, I don't recall we noticed
> >>> such a degradation.
> >>
> >> Adding Carolina to post her analysis on this.
> >
> > Hey everyone,
> >
> > After investigating the issue, it seems the performance degradation is
> > linked to the commit "x86/bugs: Report Intel retbleed vulnerability"
> > (6ad0ad2bf8a67).
>
> Hmm, that commit is from June 2022, and according to Samuel's tests,
> this issue was introduced sometime between commits b6dad5178cea and
> 40f71e7cd3c6 (both of which are dated in June 2023). Besides, if it was
> a retbleed mitigation issue, that would affect other drivers as well,
> no? Our testing only shows this regression on mlx5, not on the intel
> drivers.
>
>
> >>> I'll check with Dragos as he probably has these reports.
> >>>
> >> We only noticed a 6% degradation for XDP_XDROP.
> >>
> >> https://lore.kernel.org/netdev/b6fcfa8b-c2b3-8a92-fb6e-0760d5f6f5ff@redhat.com/T/
>
> That message mentions that "This will be handled in a different patch
> series by adding support for multi-packet per page." - did that ever go
> in?
>
> -Toke
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-07-24 15:36               ` Toke Høiland-Jørgensen
  2024-07-25 12:27                 ` Samuel Dobron
@ 2024-07-26  8:09                 ` Dragos Tatulea
  2024-07-29 18:00                   ` Samuel Dobron
  1 sibling, 1 reply; 20+ messages in thread
From: Dragos Tatulea @ 2024-07-26  8:09 UTC (permalink / raw)
  To: toke@redhat.com, Tariq Toukan, daniel@iogearbox.net,
	Carolina Jubran, sdobron@redhat.com, hawk@kernel.org,
	mianosebastiano@gmail.com
  Cc: Saeed Mahameed, edumazet@google.com, netdev@vger.kernel.org,
	kuba@kernel.org, pabeni@redhat.com, bpf@vger.kernel.org

Hi,

On Wed, 2024-07-24 at 17:36 +0200, Toke Høiland-Jørgensen wrote:
> Carolina Jubran <cjubran@nvidia.com> writes:
> 
> > On 22/07/2024 12:26, Dragos Tatulea wrote:
> > > On Sun, 2024-06-30 at 14:43 +0300, Tariq Toukan wrote:
> > > > 
> > > > On 21/06/2024 15:35, Samuel Dobron wrote:
> > > > > Hey all,
> > > > > 
> > > > > Yeah, we do tests for ELN kernels [1] on a regular basis. Since
> > > > > ~January of this year.
> > > > > 
> > > > > As already mentioned, mlx5 is the only driver affected by this regression.
> > > > > Unfortunately, I think Jesper is actually hitting 2 regressions we noticed,
> > > > > the one already mentioned by Toke, another one [0] has been reported
> > > > > in early February.
> > > > > Btw. issue mentioned by Toke has been moved to Jira, see [5].
> > > > > 
> > > > > Not sure all of you are able to see the content of [0], Jira says it's
> > > > > RH-confidental.
> > > > > So, I am not sure how much I can share without being fired :D. Anyway,
> > > > > affected kernels have been released a while ago, so anyone can find it
> > > > > on its own.
> > > > > Basically, we detected 5% regression on XDP_DROP+mlx5 (currently, we
> > > > > don't have data for any other XDP mode) in kernel-5.14 compared to
> > > > > previous builds.
> > > > > 
> > > > >   From tests history, I can see (most likely) the same improvement
> > > > > on 6.10rc2 (from 15Mpps to 17-18Mpps), so I'd say 20% drop has been
> > > > > (partially) fixed?
> > > > > 
> > > > > For earlier 6.10. kernels we don't have data due to [3] (there is regression on
> > > > > XDP_DROP as well, but I believe it's turbo-boost issue, as I mentioned
> > > > > in issue).
> > > > > So if you want to run tests on 6.10. please see [3].
> > > > > 
> > > > > Summary XDP_DROP+mlx5@25G:
> > > > > kernel       pps
> > > > > <5.14        20.5M        baseline
> > > > > > =5.14      19M           [0]
> > > > > <6.4          19-20M      baseline for ELN kernels
> > > > > > =6.4        15M           [4 and 5] (mentioned by Toke)
> > > > 
> > > > + @Dragos
> > > > 
> > > > That's about when we added several changes to the RX datapath.
> > > > Most relevant are:
> > > > - Fully removing the in-driver RX page-cache.
> > > > - Refactoring to support XDP multi-buffer.
> > > > 
> > > > We tested XDP performance before submission, I don't recall we noticed
> > > > such a degradation.
> > > 
> > > Adding Carolina to post her analysis on this.
> > 
> > Hey everyone,
> > 
> > After investigating the issue, it seems the performance degradation is 
> > linked to the commit "x86/bugs: Report Intel retbleed vulnerability"
> > (6ad0ad2bf8a67).
> 
> Hmm, that commit is from June 2022, [...]
> 
The results in the very first mail in this thread from Sebastiano showed
a 30Mpps -> 21.3Mpps XDP_DROP regression between 5.15 and 6.2. That is
what Carolina focused on. Furthermore, the results from Samuel don't show
this regression. It seems the discussion is now focused on the 6.4 regression?

> [...] and according to Samuel's tests,
> this issue was introduced sometime between commits b6dad5178cea and
> 40f71e7cd3c6 (both of which are dated in June 2023).
> 
Thanks for the commit range (now I know how to decode ELN kernel versions :)).
Strangely, this range doesn't have anything suspicious. I would have expected
the page_pool or XDP multi-buffer changes to show up in this range, but they
are already present in the working version... Anyway, we'll keep on looking.

>  Besides, if it was
> a retbleed mitigation issue, that would affect other drivers as well,
> no? Our testing only shows this regression on mlx5, not on the intel
> drivers.
> 
> 
> > > > I'll check with Dragos as he probably has these reports.
> > > > 
> > > We only noticed a 6% degradation for XDP_XDROP.
> > > 
> > > https://lore.kernel.org/netdev/b6fcfa8b-c2b3-8a92-fb6e-0760d5f6f5ff@redhat.com/T/
> 
> That message mentions that "This will be handled in a different patch
> series by adding support for multi-packet per page." - did that ever go
> in?
> 
Nope, no XDP multi-packet per page yet.

Thanks,
Dragos

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-07-26  8:09                 ` Dragos Tatulea
@ 2024-07-29 18:00                   ` Samuel Dobron
  0 siblings, 0 replies; 20+ messages in thread
From: Samuel Dobron @ 2024-07-29 18:00 UTC (permalink / raw)
  To: Dragos Tatulea
  Cc: toke@redhat.com, Tariq Toukan, daniel@iogearbox.net,
	Carolina Jubran, hawk@kernel.org, mianosebastiano@gmail.com,
	Saeed Mahameed, edumazet@google.com, netdev@vger.kernel.org,
	kuba@kernel.org, pabeni@redhat.com, bpf@vger.kernel.org

Ah, sorry.
Yes, I was talking about the 6.4 regression.

I double-checked the v5.15 regression and I don't see anything as
significant as what Sebastiano reported. I ran a couple of tests for:
* kernel-5.10.0-0.rc6.90.eln105
* kernel-5.14.0-60.eln112
* kernel-5.15.0-0.rc7.53.eln113
* kernel-5.16.0-60.eln114
* kernel-6.11.0-0.rc0.20240724git786c8248dbd3.12.eln141

The XDP_DROP results on the receiving side (the one that is dropping
packets) are more or less the same, ~20.5Mpps (17.5Mpps on 6.11, but
that's due to the 6.4 regression). The CPU is the bottleneck, so CPU
utilization is 100% for all the kernels on both ends - generator and
receiver. We use pktgen as the generator; both the generator and the
receiver machine use mlx5 NICs.
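
For reference, the kind of program these drop tests attach boils down to
the minimal sketch below (a sketch only; the real benchmark programs,
e.g. xdp-bench, also collect statistics):

/* Minimal XDP drop program; a sketch of what a drop test attaches. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int xdp_drop(struct xdp_md *ctx)
{
	/* Drop every frame as early as possible, before any skb is built. */
	return XDP_DROP;
}

char _license[] SEC("license") = "GPL";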

However, I noticed that between 5.10 and 5.14 there is a 30Mpps->22Mpps
regression, BUT at the GENERATOR side; CPU util remains the same on both
ends and the number of dropped packets on the receiver side is the same
as well (since it's CPU bottlenecked). Other drivers seem to be
unaffected.

That's probably something unrelated to Sebastiano's regression,
but I believe it's worth mentioning.

So, no idea where Sebastiano's regression comes from. I can see he uses
ConnectX-6; we don't have those, only ConnectX-5. Could that be the
problem?

Thanks,
Sam.




On Fri, Jul 26, 2024 at 10:09 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> Hi,
>
> On Wed, 2024-07-24 at 17:36 +0200, Toke Høiland-Jørgensen wrote:
> > Carolina Jubran <cjubran@nvidia.com> writes:
> >
> > > On 22/07/2024 12:26, Dragos Tatulea wrote:
> > > > On Sun, 2024-06-30 at 14:43 +0300, Tariq Toukan wrote:
> > > > >
> > > > > On 21/06/2024 15:35, Samuel Dobron wrote:
> > > > > > Hey all,
> > > > > >
> > > > > > Yeah, we do tests for ELN kernels [1] on a regular basis. Since
> > > > > > ~January of this year.
> > > > > >
> > > > > > As already mentioned, mlx5 is the only driver affected by this regression.
> > > > > > Unfortunately, I think Jesper is actually hitting 2 regressions we noticed,
> > > > > > the one already mentioned by Toke, another one [0] has been reported
> > > > > > in early February.
> > > > > > Btw. issue mentioned by Toke has been moved to Jira, see [5].
> > > > > >
> > > > > > Not sure all of you are able to see the content of [0], Jira says it's
> > > > > > RH-confidental.
> > > > > > So, I am not sure how much I can share without being fired :D. Anyway,
> > > > > > affected kernels have been released a while ago, so anyone can find it
> > > > > > on its own.
> > > > > > Basically, we detected 5% regression on XDP_DROP+mlx5 (currently, we
> > > > > > don't have data for any other XDP mode) in kernel-5.14 compared to
> > > > > > previous builds.
> > > > > >
> > > > > >   From tests history, I can see (most likely) the same improvement
> > > > > > on 6.10rc2 (from 15Mpps to 17-18Mpps), so I'd say 20% drop has been
> > > > > > (partially) fixed?
> > > > > >
> > > > > > For earlier 6.10. kernels we don't have data due to [3] (there is regression on
> > > > > > XDP_DROP as well, but I believe it's turbo-boost issue, as I mentioned
> > > > > > in issue).
> > > > > > So if you want to run tests on 6.10. please see [3].
> > > > > >
> > > > > > Summary XDP_DROP+mlx5@25G:
> > > > > > kernel       pps
> > > > > > <5.14        20.5M        baseline
> > > > > > > =5.14      19M           [0]
> > > > > > <6.4          19-20M      baseline for ELN kernels
> > > > > > > =6.4        15M           [4 and 5] (mentioned by Toke)
> > > > >
> > > > > + @Dragos
> > > > >
> > > > > That's about when we added several changes to the RX datapath.
> > > > > Most relevant are:
> > > > > - Fully removing the in-driver RX page-cache.
> > > > > - Refactoring to support XDP multi-buffer.
> > > > >
> > > > > We tested XDP performance before submission, I don't recall we noticed
> > > > > such a degradation.
> > > >
> > > > Adding Carolina to post her analysis on this.
> > >
> > > Hey everyone,
> > >
> > > After investigating the issue, it seems the performance degradation is
> > > linked to the commit "x86/bugs: Report Intel retbleed vulnerability"
> > > (6ad0ad2bf8a67).
> >
> > Hmm, that commit is from June 2022, [...]
> >
> The results from the very first mail in this thread from Sebastiano were
> showing a 30Mpps -> 21.3Mpps XDP_DROP regression between 5.15 and 6.2. This
> is what Carolina was focused on. Furthermore, the results from Samuel don't show
> this regression. Seems like the discussion is now focused on the 6.4 regression?
>
> > [...] and according to Samuel's tests,
> > this issue was introduced sometime between commits b6dad5178cea and
> > 40f71e7cd3c6 (both of which are dated in June 2023).
> >
> Thanks for the commit range (now I know how to decode ELN kernel versions :)).
> Strangely, this range doesn't contain anything suspicious. I would have
> expected the page_pool or the XDP multi-buffer changes to show up in this
> range, but they are already present in the working version... Anyway, we'll
> keep looking.
>
> >  Besides, if it was
> > a retbleed mitigation issue, that would affect other drivers as well,
> > no? Our testing only shows this regression on mlx5, not on the intel
> > drivers.
> >
> >
> > > > > I'll check with Dragos as he probably has these reports.
> > > > >
> > > > We only noticed a 6% degradation for XDP_DROP.
> > > >
> > > > https://lore.kernel.org/netdev/b6fcfa8b-c2b3-8a92-fb6e-0760d5f6f5ff@redhat.com/T/
> >
> > That message mentions that "This will be handled in a different patch
> > series by adding support for multi-packet per page." - did that ever go
> > in?
> >
> Nope, no XDP multi-packet per page yet.
>
> Thanks,
> Dragos


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-07-23  9:52             ` Carolina Jubran
  2024-07-24 15:36               ` Toke Høiland-Jørgensen
@ 2024-07-30 11:04               ` Samuel Dobron
  2024-12-11 13:20                 ` Samuel Dobron
  1 sibling, 1 reply; 20+ messages in thread
From: Samuel Dobron @ 2024-07-30 11:04 UTC (permalink / raw)
  To: Carolina Jubran, Dragos Tatulea, Tariq Toukan,
	daniel@iogearbox.net, hawk@kernel.org, mianosebastiano@gmail.com
  Cc: toke@redhat.com, pabeni@redhat.com, netdev@vger.kernel.org,
	edumazet@google.com, Saeed Mahameed, bpf@vger.kernel.org,
	kuba@kernel.org

> Could you try adding the mentioned parameters to your kernel arguments
> and check if you still see the degradation? 

Hey,
So I tried multiple kernels around v5.15 as well as a couple of previous
v6.xx kernels, and there is no difference with Spectre v2 mitigations
enabled or disabled.

No difference on other drivers either.


Sam.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-07-30 11:04               ` Samuel Dobron
@ 2024-12-11 13:20                 ` Samuel Dobron
  2025-01-08  9:26                   ` Carolina Jubran
  0 siblings, 1 reply; 20+ messages in thread
From: Samuel Dobron @ 2024-12-11 13:20 UTC (permalink / raw)
  To: Carolina Jubran, Dragos Tatulea, Tariq Toukan,
	daniel@iogearbox.net, hawk@kernel.org, mianosebastiano@gmail.com
  Cc: toke@redhat.com, pabeni@redhat.com, netdev@vger.kernel.org,
	edumazet@google.com, Saeed Mahameed, bpf@vger.kernel.org,
	kuba@kernel.org, Benjamin Poirier

Hey all,

We recently enabled tests for XDP_TX, so I was able to test that mode
as well.

The XDP_DROP performance regression is the same as I reported
a while ago. There is about a 20% regression in
kernel-6.4.0-0.rc6.20230616git40f71e7cd3c6.50.eln126 (broken)
compared to the previous kernel,
kernel-6.4.0-0.rc6.20230614gitb6dad5178cea.49.eln126 (baseline).
We don't see such a regression for other drivers.

The regression was partially fixed somewhere between eln126 and
kernel-6.10.0-0.rc2.20240606git2df0193e62cf.27.eln137 (partially
fixed) and the performance since then is -7 to -15% compared to
baseline. So, nothing new.

XDP_TX is, however, more interesting.
When comparing the baseline with the broken kernel there is a 20-25%
performance drop (CPU utilization remains the same) on the mlx5 driver.
There is also a 10% drop on other drivers. HOWEVER, that got fixed
somewhere between the broken and the partially fixed kernel. On the
most recent kernels we don't see that regression on other drivers, but
a 2-10% regression (depending on whether dpa or load-bytes is used)
remains on mlx5.
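
(For context on the two modes: my understanding is that dpa reads the
packet through direct pointer access, while load-bytes copies it out
with the bpf_xdp_load_bytes() helper. A rough sketch of the difference,
with the usual caveat that this is not the actual xdp-bench source:)

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int read_dpa(struct xdp_md *ctx)
{
	void *data = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;
	struct ethhdr *eth = data;

	/* Direct packet access: bounds check, then read the header in place. */
	if ((void *)(eth + 1) > data_end)
		return XDP_ABORTED;
	if (eth->h_proto == 0)
		return XDP_ABORTED;
	return XDP_DROP;
}

SEC("xdp")
int read_load_bytes(struct xdp_md *ctx)
{
	struct ethhdr eth;

	/* Copy the header into a stack buffer instead of touching it in place. */
	if (bpf_xdp_load_bytes(ctx, 0, &eth, sizeof(eth)) < 0)
		return XDP_ABORTED;
	if (eth.h_proto == 0)
		return XDP_ABORTED;
	return XDP_DROP;
}

char _license[] SEC("license") = "GPL";

(My understanding is that the TX test does the same kind of read and
then swaps the MAC addresses before returning XDP_TX.)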

The numbers look a bit similar to the regression seen with
Spectre/Meltdown mitigations enabled, but based on my experiments there
is no difference between mitigations enabled and disabled.

Hope this will help,
Sam.

On Tue, Jul 30, 2024 at 1:04 PM Samuel Dobron <sdobron@redhat.com> wrote:
>
> > Could you try adding the mentioned parameters to your kernel arguments
> > and check if you still see the degradation?
>
> Hey,
> So I tried multiple kernels around v5.15 as well as a couple of previous
> v6.xx kernels, and there is no difference with Spectre v2 mitigations
> enabled or disabled.
>
> No difference on other drivers either.
>
>
> Sam.
>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: XDP Performance Regression in recent kernel versions
  2024-12-11 13:20                 ` Samuel Dobron
@ 2025-01-08  9:26                   ` Carolina Jubran
  0 siblings, 0 replies; 20+ messages in thread
From: Carolina Jubran @ 2025-01-08  9:26 UTC (permalink / raw)
  To: Samuel Dobron, Dragos Tatulea, Tariq Toukan, daniel@iogearbox.net,
	hawk@kernel.org, mianosebastiano@gmail.com
  Cc: toke@redhat.com, pabeni@redhat.com, netdev@vger.kernel.org,
	edumazet@google.com, Saeed Mahameed, bpf@vger.kernel.org,
	kuba@kernel.org, Benjamin Poirier

Hello,

Thank you Sam for the detailed information.

I have identified the specific kernel configuration change responsible 
for the degradation between kernel versions 
6.4.0-0.rc6.20230614gitb6dad5178cea.49.eln126 and 
6.4.0-0.rc6.20230616git40f71e7cd3c6.50.eln126. The introduction of the 
CONFIG_INIT_STACK_ALL_ZERO setting in the latter version has led to a 
noticeable performance impact.

I am currently investigating why this change specifically affects mlx5.
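
For context on why this option can matter in a per-packet hot path:
CONFIG_INIT_STACK_ALL_ZERO builds the kernel with
-ftrivial-auto-var-init=zero, so automatic (on-stack) variables are
implicitly zero-initialized before use. A rough, hypothetical
illustration of the pattern (made-up names, not actual mlx5 code):

/* Hypothetical example only. With CONFIG_INIT_STACK_ALL_ZERO the
 * compiler zero-initializes 'p' on every call, including the 64
 * padding bytes that are never used, even though the fields read
 * below are always written first. At tens of millions of calls per
 * second this extra memset-like work becomes measurable. */
struct rx_wqe_scratch {			/* made-up per-descriptor state */
	unsigned int byte_count;
	unsigned int frag_offset;
	unsigned char pad[64];
};

static unsigned int handle_one_descriptor(unsigned int byte_count)
{
	struct rx_wqe_scratch p;	/* zeroed implicitly when the option is on */

	p.byte_count = byte_count;
	p.frag_offset = 0;
	return p.byte_count + p.frag_offset;
}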

Thanks,

Carolina

On 11/12/2024 15:20, Samuel Dobron wrote:
> Hey all,
> 
> We recently enabled tests for XDP_TX, so I was able to test that mode
> as well.
> 
> The XDP_DROP performance regression is the same as I reported
> a while ago. There is about a 20% regression in
> kernel-6.4.0-0.rc6.20230616git40f71e7cd3c6.50.eln126 (broken)
> compared to the previous kernel,
> kernel-6.4.0-0.rc6.20230614gitb6dad5178cea.49.eln126 (baseline).
> We don't see such a regression for other drivers.
> 
> The regression was partially fixed somewhere between eln126 and
> kernel-6.10.0-0.rc2.20240606git2df0193e62cf.27.eln137 (partially
> fixed) and the performance since then is -7 to -15% compared to
> baseline. So, nothing new.
> 
> XDP_TX is, however, more interesting.
> When comparing the baseline with the broken kernel there is a 20-25%
> performance drop (CPU utilization remains the same) on the mlx5 driver.
> There is also a 10% drop on other drivers. HOWEVER, that got fixed
> somewhere between the broken and the partially fixed kernel. On the
> most recent kernels we don't see that regression on other drivers, but
> a 2-10% regression (depending on whether dpa or load-bytes is used)
> remains on mlx5.
> 
> The numbers look a bit similar to the regression seen with
> Spectre/Meltdown mitigations enabled, but based on my experiments there
> is no difference between mitigations enabled and disabled.
> 
> Hope this will help,
> Sam.
> 
> On Tue, Jul 30, 2024 at 1:04 PM Samuel Dobron <sdobron@redhat.com> wrote:
>>
>>> Could you try adding the mentioned parameters to your kernel arguments
>>> and check if you still see the degradation?
>>
>> Hey,
>> So I tried multiple kernels around v5.15 as well as a couple of previous
>> v6.xx kernels, and there is no difference with Spectre v2 mitigations
>> enabled or disabled.
>>
>> No difference on other drivers either.
>>
>>
>> Sam.
>>
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2025-01-08  9:26 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-18 15:28 XDP Performance Regression in recent kernel versions Sebastiano Miano
2024-06-19  6:00 ` Tariq Toukan
2024-06-19 15:17   ` Sebastiano Miano
2024-06-19 16:27 ` Jesper Dangaard Brouer
2024-06-19 19:17   ` Toke Høiland-Jørgensen
2024-06-20  9:52     ` Daniel Borkmann
2024-06-21 12:35       ` Samuel Dobron
2024-06-24 11:46         ` Toke Høiland-Jørgensen
2024-06-30 10:25           ` Tariq Toukan
2024-07-22 10:57             ` Samuel Dobron
2024-06-30 11:43         ` Tariq Toukan
2024-07-22  9:26           ` Dragos Tatulea
2024-07-23  9:52             ` Carolina Jubran
2024-07-24 15:36               ` Toke Høiland-Jørgensen
2024-07-25 12:27                 ` Samuel Dobron
2024-07-26  8:09                 ` Dragos Tatulea
2024-07-29 18:00                   ` Samuel Dobron
2024-07-30 11:04               ` Samuel Dobron
2024-12-11 13:20                 ` Samuel Dobron
2025-01-08  9:26                   ` Carolina Jubran

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).