* [bytedance:5.15-velinux] [mm]  bf31671edf:  netperf.Throughput_Mbps 11.0% regression
@ 2025-08-25  7:05 kernel test robot
From: kernel test robot @ 2025-08-25  7:05 UTC
  To: jason.zeng; +Cc: oe-lkp, lkp, oliver.sang



Hello,

kernel test robot noticed an 11.0% regression of netperf.Throughput_Mbps on:


commit: bf31671edffe7ea925baa93caff59a860f1ddfa8 ("mm: memcg: make stats flushing threshold per-memcg")
https://github.com/bytedance/kernel.git 5.15-velinux

testcase: netperf
config: x86_64-bytedance-5.15-velinux
compiler: gcc-12
test machine: 256 threads 2 sockets Intel(R) Xeon(R) 6767P CPU @ 2.4GHz (Granite Rapids) with 256G memory
parameters:

	ip: ipv4
	runtime: 300s
	nr_threads: 200%
	cluster: cs-localhost
	send_size: 10K
	test: SCTP_STREAM_MANY
	cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202508251443.3f803480-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250825/202508251443.3f803480-lkp@intel.com

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/send_size/tbox_group/test/testcase:
  cs-localhost/gcc-12/performance/ipv4/x86_64-bytedance-5.15-velinux/200%/debian-12-x86_64-20240206.cgz/300s/10K/lkp-gnr-2sp3/SCTP_STREAM_MANY/netperf

commit: 
  69642f5099 ("mm: memcg: move vmstats structs definition above flushing code")
  bf31671edf ("mm: memcg: make stats flushing threshold per-memcg")

69642f5099423baf bf31671edffe7ea925baa93caff 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  85772547 ±  4%     -10.4%   76821561        cpuidle..usage
    135.00 ±  9%     -16.8%     112.33 ±  8%  perf-c2c.HITM.remote
      7.04 ± 81%     -83.3%       1.18 ±177%  perf-sched.wait_time.avg.ms.do_task_dead.do_exit.kthread.ret_from_fork
    521963 ±  4%     -11.2%     463279        vmstat.system.cs
    309897 ±  3%     -10.7%     276864        vmstat.system.in
      1.65 ±  5%      -0.2        1.42 ±  2%  mpstat.cpu.all.soft%
      5.69 ±  5%      -0.7        4.96        mpstat.cpu.all.sys%
      9.42 ±  4%     -12.8%       8.21 ±  5%  mpstat.max_utilization_pct
    302723 ± 10%     -19.0%     245225 ±  3%  sched_debug.cfs_rq:/.min_vruntime.avg
    266650 ± 14%     -26.2%     196867 ± 18%  sched_debug.cfs_rq:/.spread0.max
    305935 ±  4%     -11.2%     271568        sched_debug.cpu.nr_switches.avg
 4.854e+08 ±  4%     -11.0%  4.318e+08        proc-vmstat.numa_hit
 4.851e+08 ±  4%     -11.0%  4.315e+08        proc-vmstat.numa_local
 2.726e+09 ±  4%     -11.1%  2.424e+09        proc-vmstat.pgalloc_normal
 2.726e+09 ±  4%     -11.1%  2.424e+09        proc-vmstat.pgfree
   4.9e+09 ±  4%     -10.3%  4.394e+09        perf-stat.i.branch-instructions
  19860290 ±  8%     -11.5%   17582329        perf-stat.i.branch-misses
  13095483 ±  4%     -10.0%   11784812 ±  3%  perf-stat.i.cache-misses
 7.297e+08 ±  4%     -10.4%   6.54e+08        perf-stat.i.cache-references
    527531 ±  4%     -11.2%     468461        perf-stat.i.context-switches
 8.112e+10 ±  5%     -11.7%  7.163e+10        perf-stat.i.cpu-cycles
 2.417e+10 ±  4%     -10.5%  2.164e+10        perf-stat.i.instructions
      2.06 ±  4%     -11.3%       1.83        perf-stat.i.metric.K/sec
    202.69 ±  4%     -11.0%     180.30        netperf.ThroughputBoth_Mbps
    103778 ±  4%     -11.0%      92315        netperf.ThroughputBoth_total_Mbps
    202.69 ±  4%     -11.0%     180.30        netperf.Throughput_Mbps
    103778 ±  4%     -11.0%      92315        netperf.Throughput_total_Mbps
     23850 ±  2%      -6.9%      22200        netperf.time.involuntary_context_switches
      1145 ±  5%     -12.0%       1008        netperf.time.percent_of_cpu_this_job_got
      3443 ±  5%     -12.0%       3031        netperf.time.system_time
  11648832 ±  3%      -9.8%   10509763        netperf.time.voluntary_context_switches
   3.8e+08 ±  4%     -11.0%  3.381e+08        netperf.workload
     20.64            -0.7       19.94        perf-profile.calltrace.cycles-pp.sctp_packet_transmit.sctp_outq_flush.sctp_cmd_interpreter.sctp_do_sm.sctp_assoc_bh_rcv
     17.64            -0.5       17.19        perf-profile.calltrace.cycles-pp.sctp_outq_flush.sctp_cmd_interpreter.sctp_do_sm.sctp_assoc_bh_rcv.sctp_backlog_rcv
      1.98 ±  3%      -0.4        1.57 ± 19%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue.get_page_from_freelist.__alloc_pages.kmalloc_large_node
      1.14 ± 22%      -0.3        0.86 ±  8%  perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages.kmalloc_large_node
     48.45            -0.2       48.21        perf-profile.calltrace.cycles-pp.sctp_sendmsg.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg.do_syscall_64
     49.16            -0.2       48.93        perf-profile.calltrace.cycles-pp.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.92 ±  2%      -0.2        2.71 ±  2%  perf-profile.calltrace.cycles-pp.free_unref_page.skb_release_data.kfree_skb_reason.sctp_recvmsg.inet_recvmsg
      2.84 ±  2%      -0.2        2.63 ±  2%  perf-profile.calltrace.cycles-pp.free_unref_page_commit.free_unref_page.skb_release_data.kfree_skb_reason.sctp_recvmsg
      2.71 ±  2%      -0.2        2.51 ±  2%  perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_commit.free_unref_page.skb_release_data.kfree_skb_reason
      9.25            -0.2        9.06        perf-profile.calltrace.cycles-pp.sctp_datamsg_from_user.sctp_sendmsg_to_asoc.sctp_sendmsg.____sys_sendmsg.___sys_sendmsg
      8.87            -0.2        8.69        perf-profile.calltrace.cycles-pp.sctp_packet_pack.sctp_packet_transmit.sctp_outq_flush.sctp_cmd_interpreter.sctp_do_sm
      8.60            -0.2        8.42        perf-profile.calltrace.cycles-pp.memcpy_erms.sctp_packet_pack.sctp_packet_transmit.sctp_outq_flush.sctp_cmd_interpreter
      4.86            -0.1        4.71        perf-profile.calltrace.cycles-pp.sctp_make_datafrag_empty.sctp_datamsg_from_user.sctp_sendmsg_to_asoc.sctp_sendmsg.____sys_sendmsg
      4.75            -0.1        4.61        perf-profile.calltrace.cycles-pp._sctp_make_chunk.sctp_make_datafrag_empty.sctp_datamsg_from_user.sctp_sendmsg_to_asoc.sctp_sendmsg
      4.19            -0.1        4.08        perf-profile.calltrace.cycles-pp.__alloc_skb._sctp_make_chunk.sctp_make_datafrag_empty.sctp_datamsg_from_user.sctp_sendmsg_to_asoc
      3.61            -0.1        3.50        perf-profile.calltrace.cycles-pp.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb._sctp_make_chunk.sctp_make_datafrag_empty
      3.62            -0.1        3.51        perf-profile.calltrace.cycles-pp.kmalloc_reserve.__alloc_skb._sctp_make_chunk.sctp_make_datafrag_empty.sctp_datamsg_from_user
      3.58            -0.1        3.48        perf-profile.calltrace.cycles-pp.kmalloc_large_node.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb._sctp_make_chunk
      0.64            -0.1        0.58 ±  3%  perf-profile.calltrace.cycles-pp.__schedule.schedule_idle.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
      0.66            -0.1        0.60 ±  3%  perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
      1.30            +0.1        1.40        perf-profile.calltrace.cycles-pp.sctp_ulpevent_free.sctp_recvmsg.inet_recvmsg.sock_recvmsg.____sys_recvmsg
      1.56            +0.2        1.76 ±  5%  perf-profile.calltrace.cycles-pp.__sk_mem_raise_allocated.__sk_mem_schedule.sctp_sendmsg_to_asoc.sctp_sendmsg.____sys_sendmsg
      1.58            +0.2        1.77 ±  4%  perf-profile.calltrace.cycles-pp.__sk_mem_schedule.sctp_sendmsg_to_asoc.sctp_sendmsg.____sys_sendmsg.___sys_sendmsg
      0.58 ±  2%      +0.2        0.83 ±  6%  perf-profile.calltrace.cycles-pp.__sk_mem_reduce_allocated.sctp_wfree.skb_release_head_state.consume_skb.sctp_chunk_put
      3.48            +0.3        3.76 ±  3%  perf-profile.calltrace.cycles-pp.sctp_outq_sack.sctp_cmd_interpreter.sctp_do_sm.sctp_assoc_bh_rcv.sctp_backlog_rcv
      1.64            +0.3        1.98        perf-profile.calltrace.cycles-pp.sctp_wfree.skb_release_head_state.consume_skb.sctp_chunk_put.sctp_outq_sack
      1.74            +0.3        2.07        perf-profile.calltrace.cycles-pp.skb_release_head_state.consume_skb.sctp_chunk_put.sctp_outq_sack.sctp_cmd_interpreter
      8.18            +0.5        8.65        perf-profile.calltrace.cycles-pp.kfree_skb_reason.sctp_recvmsg.inet_recvmsg.sock_recvmsg.____sys_recvmsg
      0.88 ±  2%      +0.5        1.41 ±  4%  perf-profile.calltrace.cycles-pp.mem_cgroup_charge_skmem.__sk_mem_raise_allocated.__sk_mem_schedule.sctp_sendmsg_to_asoc.sctp_sendmsg
      0.93 ±  2%      +0.8        1.74 ±  5%  perf-profile.calltrace.cycles-pp.__sk_mem_reduce_allocated.skb_release_head_state.kfree_skb_reason.sctp_recvmsg.inet_recvmsg
      1.55            +0.8        2.38 ±  4%  perf-profile.calltrace.cycles-pp.skb_release_head_state.kfree_skb_reason.sctp_recvmsg.inet_recvmsg.sock_recvmsg
      0.00            +0.9        0.88 ±  5%  perf-profile.calltrace.cycles-pp.__mod_memcg_state.mem_cgroup_charge_skmem.__sk_mem_raise_allocated.__sk_mem_schedule.sctp_sendmsg_to_asoc
      0.00            +1.2        1.22 ±  5%  perf-profile.calltrace.cycles-pp.__mod_memcg_state.mem_cgroup_uncharge_skmem.__sk_mem_reduce_allocated.skb_release_head_state.kfree_skb_reason
      0.00            +1.3        1.28 ±  5%  perf-profile.calltrace.cycles-pp.mem_cgroup_uncharge_skmem.__sk_mem_reduce_allocated.skb_release_head_state.kfree_skb_reason.sctp_recvmsg
     30.64            -0.6       30.00        perf-profile.children.cycles-pp.sctp_packet_transmit
     32.46            -0.6       31.87        perf-profile.children.cycles-pp.sctp_outq_flush
     48.58            -0.2       48.34        perf-profile.children.cycles-pp.sctp_sendmsg
     49.18            -0.2       48.94        perf-profile.children.cycles-pp.____sys_sendmsg
      9.26            -0.2        9.07        perf-profile.children.cycles-pp.sctp_datamsg_from_user
      9.00            -0.2        8.82        perf-profile.children.cycles-pp.memcpy_erms
      9.53            -0.2        9.36        perf-profile.children.cycles-pp.sctp_packet_pack
      5.01            -0.2        4.86        perf-profile.children.cycles-pp._sctp_make_chunk
      4.87            -0.1        4.72        perf-profile.children.cycles-pp.sctp_make_datafrag_empty
      0.52 ±  2%      -0.1        0.46 ±  4%  perf-profile.children.cycles-pp.drain_stock
      0.51 ±  3%      -0.1        0.45 ±  4%  perf-profile.children.cycles-pp.page_counter_uncharge
      0.66 ±  2%      -0.1        0.60 ±  4%  perf-profile.children.cycles-pp.refill_stock
      0.66            -0.1        0.61 ±  3%  perf-profile.children.cycles-pp.schedule_idle
      0.45            -0.0        0.40 ±  4%  perf-profile.children.cycles-pp.psi_group_change
      0.57 ±  2%      -0.0        0.53        perf-profile.children.cycles-pp.__free_one_page
      0.35 ±  2%      -0.0        0.31 ±  4%  perf-profile.children.cycles-pp.psi_task_switch
      0.71 ±  2%      -0.0        0.68        perf-profile.children.cycles-pp.__skb_clone
      0.09 ±  5%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.finish_task_switch
      0.06 ± 11%      +0.0        0.08 ±  6%  perf-profile.children.cycles-pp.netif_rx
      0.10 ±  5%      +0.0        0.12 ±  6%  perf-profile.children.cycles-pp.arch_local_irq_enable
      0.82            +0.0        0.85 ±  2%  perf-profile.children.cycles-pp.ipv4_dst_check
      2.63            +0.1        2.69        perf-profile.children.cycles-pp.sctp_outq_flush_data
      1.31            +0.1        1.41        perf-profile.children.cycles-pp.sctp_ulpevent_free
      2.53            +0.3        2.87 ±  5%  perf-profile.children.cycles-pp.__sk_mem_raise_allocated
      2.55            +0.3        2.89 ±  5%  perf-profile.children.cycles-pp.__sk_mem_schedule
      1.76            +0.4        2.11        perf-profile.children.cycles-pp.sctp_wfree
      8.19            +0.5        8.66        perf-profile.children.cycles-pp.kfree_skb_reason
      1.63            +0.7        2.36 ±  5%  perf-profile.children.cycles-pp.mem_cgroup_charge_skmem
      1.72            +1.2        2.90 ±  4%  perf-profile.children.cycles-pp.__sk_mem_reduce_allocated
      3.62            +1.2        4.81 ±  2%  perf-profile.children.cycles-pp.skb_release_head_state
      0.75            +1.2        1.99 ±  5%  perf-profile.children.cycles-pp.mem_cgroup_uncharge_skmem
      1.30            +2.0        3.28 ±  5%  perf-profile.children.cycles-pp.__mod_memcg_state
      0.84 ±  2%      -0.4        0.46 ±  5%  perf-profile.self.cycles-pp.__sk_mem_raise_allocated
      8.91            -0.2        8.73        perf-profile.self.cycles-pp.memcpy_erms
      0.42 ±  3%      -0.1        0.35 ±  4%  perf-profile.self.cycles-pp.page_counter_uncharge
      1.75            -0.1        1.69        perf-profile.self.cycles-pp.get_page_from_freelist
      0.44 ±  2%      -0.0        0.40 ±  3%  perf-profile.self.cycles-pp.psi_group_change
      0.16 ±  5%      -0.0        0.13 ±  3%  perf-profile.self.cycles-pp.cpuidle_idle_call
      0.48            -0.0        0.45        perf-profile.self.cycles-pp.__free_one_page
      0.46            -0.0        0.43 ±  3%  perf-profile.self.cycles-pp.sctp_chunkify
      0.23 ±  3%      -0.0        0.21 ±  4%  perf-profile.self.cycles-pp.sctp_inq_pop
      0.81            +0.0        0.84 ±  2%  perf-profile.self.cycles-pp.ipv4_dst_check
      0.25 ±  3%      +0.0        0.30 ±  4%  perf-profile.self.cycles-pp.sctp_chunk_put
      1.22            +2.0        3.20 ±  5%  perf-profile.self.cycles-pp.__mod_memcg_state
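
The change column above can be cross-checked from the two mean columns. The
sketch below is not part of the robot's tooling; the helper names pct_change
and point_delta are hypothetical. It assumes that rows shown with a "%" sign
are relative changes against the parent commit, and that rows without one
(the mpstat and perf-profile rows) are absolute differences in percentage
points, which is consistent with the figures in the table. Because the report
computes from unrounded means, a recomputation from the printed values can
differ in the last digit.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent (assumed meaning of '%change')."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference in percentage points (assumed meaning of rows without '%')."""
    return new - base


if __name__ == "__main__":
    # netperf.Throughput_Mbps: 202.69 (69642f5099) vs 180.30 (bf31671edf)
    print(f"netperf.Throughput_Mbps: {pct_change(202.69, 180.30):+.1f}%")
    # perf-profile.self.cycles-pp.__mod_memcg_state: 1.22 vs 3.20
    print(f"__mod_memcg_state self:  {point_delta(1.22, 3.20):+.1f}")
```

On the printed values this gives -11.0% for the throughput row and +2.0 for
the __mod_memcg_state self-cycles row, matching the table.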




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

