From: Aaron Lu <aaron.lu@intel.com>
To: lkp@lists.01.org
Subject: Re: [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression
Date: Fri, 19 Aug 2016 13:29:41 +0800
Message-ID: <20160819052941.GA1179@aaronlu.sh.intel.com>
In-Reply-To: <CADvbK_cF3Mphd5CDqGVHcbbZ-L8Xz=vRPzdKTzE4A_p-ZBiDVA@mail.gmail.com>
On Thu, Aug 18, 2016 at 08:45:42PM +0800, Xin Long wrote:
> >> Hi, Aaron
> >>
> >> 1)
> >> I talked with Marcelo about this one.
> >> He said it might be related with cacheline. the new field distroyed
> >> the prior cacheline. So on top of commit 826d253d57b1, pls only add
> >> + unsigned long prsctp_param;
> >>
> >> to the end of struct sctp_chunk, then try.
> >
> > This doesn't work.
> >
>
> If it's because of cache lines changed, I'm not sure this, either.
> Maybe 2) is a good way to fix it.
A comparison of the good commit 826d253d57b1 and the bad a6c2f792873a:
tests: 8
testcase/path_params/tbox_group/run: netperf/ipv4-300s-200%-cs-localhost-10K-SCTP_STREAM_MANY-performance/lkp-ivb-d02
826d253d57b11f69         a6c2f792873aff332a4689717c
----------------         --------------------------
         %stddev  change          %stddev
      3923          -37%       2461        netperf.Throughput_Mbps
         9          -78%          2        vmstat.procs.r
    112616           19%     133981        vmstat.system.cs
      4053            7%       4350        vmstat.system.in
      8598 ± 4%     957%      90912        softirqs.SCHED
  16466114          -37%   10305467        softirqs.NET_RX
    605899          -46%     329262        softirqs.TIMER
     72067 ± 10%    -63%      26356 ± 3%   softirqs.RCU
      4785 ± 7%      -9%       4352        slabinfo.anon_vma_chain.num_objs
       642 ± 7%      14%        731 ± 6%   slabinfo.kmalloc-512.active_objs
      4993           15%       5735        slabinfo.kmalloc-64.active_objs
      4993           15%       5735        slabinfo.kmalloc-64.num_objs
      2529 ± 4%     -15%       2150        proc-vmstat.nr_alloc_batch
 4.733e+08          -37%  2.999e+08        proc-vmstat.pgalloc_normal
 8.476e+08          -37%   5.36e+08        proc-vmstat.pgfree
 3.742e+08          -37%  2.361e+08        proc-vmstat.pgalloc_dma32
  1.48e+08          -37%   93033641        proc-vmstat.numa_hit
  1.48e+08          -37%   93033640        proc-vmstat.numa_local
      0.05 ± 17%  52102%      24.80        turbostat.CPU%c1
      0.64         3065%      20.10 ± 3%   turbostat.CPU%c6
      0.12 ± 39%   1900%       2.35 ± 3%   turbostat.Pkg%pc2
      0.46 ± 10%   1686%       8.22 ± 6%   turbostat.Pkg%pc6
     37.54          -14%      32.11        turbostat.PkgWatt
     20.20          -25%      15.22        turbostat.CorWatt
     99.31          -45%      54.97        turbostat.%Busy
      3269          -45%       1803        turbostat.Avg_MHz
     76510 ± 46%  3e+05%  1.954e+08        cpuidle.C1-IVB.time
     19769 ± 17%   5534%    1113742 ± 5%   cpuidle.C1E-IVB.time
       151 ± 11%   4175%       6454 ± 7%   cpuidle.C1E-IVB.usage
       114 ± 14%   6216%       7232 ± 5%   cpuidle.C3-IVB.usage
     33074 ± 14%   5159%    1739419 ± 3%   cpuidle.C3-IVB.time
      8874         4203%     381901        cpuidle.C6-IVB.usage
   8006184         4072%   3.34e+08        cpuidle.C6-IVB.time
     12019 ± 35%    303%      48398        perf-stat.cpu-migrations
  34232822           19%   40780053        perf-stat.context-switches
    339045            5%     354573        perf-stat.minor-faults
    339041            5%     354568        perf-stat.page-faults
 2.776e+11          -28%  2.003e+11        perf-stat.branch-instructions
 1.505e+12          -29%  1.065e+12        perf-stat.instructions
 6.421e+11          -30%  4.473e+11        perf-stat.dTLB-loads
  5.32e+11          -34%  3.536e+11        perf-stat.dTLB-stores
 1.173e+11          -38%  7.271e+10        perf-stat.cache-references
 3.735e+08 ± 5%     -48%  1.959e+08 ± 4%   perf-stat.iTLB-load-misses
 3.864e+09          -51%    1.9e+09        perf-stat.branch-misses
 4.069e+09 ± 20%    -56%  1.798e+09 ± 40%  perf-stat.dTLB-load-misses
 5.285e+08 ± 22%    -70%  1.585e+08 ± 16%  perf-stat.dTLB-store-misses
 7.126e+09 ± 16%    -97%   2.27e+08 ± 4%   perf-stat.cache-misses
The obvious changes are:
1. the bad commit has far fewer runnable processes (vmstat.procs.r drops from 9 to 2);
2. context switches are higher in the bad commit (vmstat.system.cs is up 19%).
Together these suggest that the netperf processes go to sleep for some reason
in the bad commit.
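
As a sanity check on the change column, the relative changes can be recomputed from the rounded values in the table. This is only a sketch in Python: the helper name pct_change is made up for illustration, and because the inputs are already rounded, the results can differ slightly from the robot's own figures (such as the -37.2% in the subject line).

```python
def pct_change(good: float, bad: float) -> float:
    """Relative change from the good commit to the bad commit, in percent."""
    return (bad - good) / good * 100.0


# Rounded (good, bad) values copied from the comparison table.
metrics = {
    "netperf.Throughput_Mbps": (3923, 2461),
    "vmstat.procs.r": (9, 2),
    "vmstat.system.cs": (112616, 133981),
}

changes = {name: pct_change(good, bad) for name, (good, bad) in metrics.items()}

for name, change in changes.items():
    print(f"{name:<26} {change:+.1f}%")
```

Throughput and the number of runnable processes both fall while context switches rise, which is the pattern one would expect if the senders spend more time blocked.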
I used "perf record -p one_netperf_pid -e probe:pick_next_task_idle" as
suggested by Tim to see where it went to sleep:
Samples: 78  of event 'probe:pick_next_task_idle', Event count (approx.): 78
  Children      Self  Trace output
-  100.00%   100.00%  (ffffffff810fc750)
     __sendmsg_nocancel
     entry_SYSCALL_64_fastpath
     sys_sendmsg
     __sys_sendmsg
     ___sys_sendmsg
     inet_sendmsg
     sctp_sendmsg
     sctp_wait_for_sndbuf
     schedule_timeout
     schedule
     pick_next_task_idle
The call chain doesn't look insane, and sctp_wait_for_sndbuf may actually have
something to do with a larger sctp_chunk, I suppose?
The same perf record doesn't capture any samples for the good commit,
which suggests the netperf process doesn't sleep in sctp_wait_for_sndbuf there.
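
To make the guess about a larger sctp_chunk a bit more concrete, here is a toy model in Python. It is not the kernel's actual accounting. It assumes that each queued chunk is charged against the socket send buffer for its payload plus a fixed per-chunk overhead that includes the size of struct sctp_chunk, and that the sender has to wait once another chunk no longer fits. The function name and all sizes are hypothetical placeholders; only the 8 extra bytes (one unsigned long on a 64-bit build) relate to the change discussed in this thread.

```python
def chunks_before_wait(sndbuf: int, payload: int, overhead: int) -> int:
    """Number of chunks that fit in the send buffer before the sender must wait."""
    return sndbuf // (payload + overhead)


SNDBUF = 212992                  # hypothetical send buffer size in bytes
PAYLOAD = 1452                   # hypothetical DATA chunk payload in bytes
OLD_OVERHEAD = 512               # hypothetical per-chunk overhead before the change
NEW_OVERHEAD = OLD_OVERHEAD + 8  # one extra unsigned long on a 64-bit build

old_fit = chunks_before_wait(SNDBUF, PAYLOAD, OLD_OVERHEAD)
new_fit = chunks_before_wait(SNDBUF, PAYLOAD, NEW_OVERHEAD)
print(old_fit, new_fit)
```

In this model a few extra bytes per chunk can only lower the count slightly, so on its own it would not account for a 37% drop. It only illustrates why the size of the structure can matter to sctp_wait_for_sndbuf at all.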
Regards,
Aaron
WARNING: multiple messages have this Message-ID (diff)
From: Aaron Lu <aaron.lu@intel.com>
To: Xin Long <lucien.xin@gmail.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>,
kernel test robot <xiaolong.ye@intel.com>,
Stephen Rothwell <sfr@canb.auug.org.au>,
lkp@01.org, "David S. Miller" <davem@davemloft.net>,
LKML <linux-kernel@vger.kernel.org>,
"Chen, Tim C" <tim.c.chen@intel.com>,
Huang Ying <ying.huang@intel.com>
Subject: Re: [LKP] [lkp] [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression
Date: Fri, 19 Aug 2016 13:29:41 +0800
Message-ID: <20160819052941.GA1179@aaronlu.sh.intel.com>
In-Reply-To: <CADvbK_cF3Mphd5CDqGVHcbbZ-L8Xz=vRPzdKTzE4A_p-ZBiDVA@mail.gmail.com>