From: Aaron Lu <aaron.lu@intel.com>
To: lkp@lists.01.org
Subject: Re: [sctp] a6c2f79287: netperf.Throughput_Mbps -37.2% regression
Date: Fri, 19 Aug 2016 13:29:41 +0800	[thread overview]
Message-ID: <20160819052941.GA1179@aaronlu.sh.intel.com> (raw)
In-Reply-To: <CADvbK_cF3Mphd5CDqGVHcbbZ-L8Xz=vRPzdKTzE4A_p-ZBiDVA@mail.gmail.com>

On Thu, Aug 18, 2016 at 08:45:42PM +0800, Xin Long wrote:
> >> Hi, Aaron
> >>
> >> 1)
> >> I talked with Marcelo about this one.
> >> He said it might be related to the cacheline: the new field destroyed
> >> the prior cacheline. So on top of commit 826d253d57b1, pls only add
> >> +       unsigned long prsctp_param;
> >>
> >> to the end of struct sctp_chunk, then try.
> >
> > This doesn't work.
> >
> 
> If it's because the cache lines changed, I'm not sure about this, either.
> Maybe 2) is a good way to fix it.

A comparison of the good commit 826d253d57b1 and the bad a6c2f792873a:

tests: 8
testcase/path_params/tbox_group/run: netperf/ipv4-300s-200%-cs-localhost-10K-SCTP_STREAM_MANY-performance/lkp-ivb-d02

826d253d57b11f69             a6c2f792873aff332a4689717c  
----------------             --------------------------  
         %stddev      change         %stddev
             \          |                \  
      3923             -37%       2461        netperf.Throughput_Mbps
         9             -78%          2        vmstat.procs.r
    112616              19%     133981        vmstat.system.cs
      4053               7%       4350        vmstat.system.in
      8598 ±  4%       957%      90912        softirqs.SCHED
  16466114             -37%   10305467        softirqs.NET_RX
    605899             -46%     329262        softirqs.TIMER
     72067 ± 10%       -63%      26356 ±  3%  softirqs.RCU
      4785 ±  7%        -9%       4352        slabinfo.anon_vma_chain.num_objs
       642 ±  7%        14%        731 ±  6%  slabinfo.kmalloc-512.active_objs
      4993              15%       5735        slabinfo.kmalloc-64.active_objs
      4993              15%       5735        slabinfo.kmalloc-64.num_objs
      2529 ±  4%       -15%       2150        proc-vmstat.nr_alloc_batch
 4.733e+08             -37%  2.999e+08        proc-vmstat.pgalloc_normal
 8.476e+08             -37%   5.36e+08        proc-vmstat.pgfree
 3.742e+08             -37%  2.361e+08        proc-vmstat.pgalloc_dma32
  1.48e+08             -37%   93033641        proc-vmstat.numa_hit
  1.48e+08             -37%   93033640        proc-vmstat.numa_local
      0.05 ± 17%     52102%      24.80        turbostat.CPU%c1
      0.64            3065%      20.10 ±  3%  turbostat.CPU%c6
      0.12 ± 39%      1900%       2.35 ±  3%  turbostat.Pkg%pc2
      0.46 ± 10%      1686%       8.22 ±  6%  turbostat.Pkg%pc6
     37.54             -14%      32.11        turbostat.PkgWatt
     20.20             -25%      15.22        turbostat.CorWatt
     99.31             -45%      54.97        turbostat.%Busy
      3269             -45%       1803        turbostat.Avg_MHz
     76510 ± 46%     3e+05%  1.954e+08        cpuidle.C1-IVB.time
     19769 ± 17%      5534%    1113742 ±  5%  cpuidle.C1E-IVB.time
       151 ± 11%      4175%       6454 ±  7%  cpuidle.C1E-IVB.usage
       114 ± 14%      6216%       7232 ±  5%  cpuidle.C3-IVB.usage
     33074 ± 14%      5159%    1739419 ±  3%  cpuidle.C3-IVB.time
      8874            4203%     381901        cpuidle.C6-IVB.usage
   8006184            4072%   3.34e+08        cpuidle.C6-IVB.time
     12019 ± 35%       303%      48398        perf-stat.cpu-migrations
  34232822              19%   40780053        perf-stat.context-switches
    339045               5%     354573        perf-stat.minor-faults
    339041               5%     354568        perf-stat.page-faults
 2.776e+11             -28%  2.003e+11        perf-stat.branch-instructions
 1.505e+12             -29%  1.065e+12        perf-stat.instructions
 6.421e+11             -30%  4.473e+11        perf-stat.dTLB-loads
  5.32e+11             -34%  3.536e+11        perf-stat.dTLB-stores
 1.173e+11             -38%  7.271e+10        perf-stat.cache-references
 3.735e+08 ±  5%       -48%  1.959e+08 ±  4%  perf-stat.iTLB-load-misses
 3.864e+09             -51%    1.9e+09        perf-stat.branch-misses
 4.069e+09 ± 20%       -56%  1.798e+09 ± 40%  perf-stat.dTLB-load-misses
 5.285e+08 ± 22%       -70%  1.585e+08 ± 16%  perf-stat.dTLB-store-misses
 7.126e+09 ± 16%       -97%   2.27e+08 ±  4%  perf-stat.cache-misses
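
For readers who want to check the "change" column, here is a minimal sketch that recomputes it for a few rows. The helper name pct_change is made up for illustration; the value pairs are copied from the table above, and the rounding to whole percent is an assumption about how the report formats its output.

```python
def pct_change(base: float, new: float) -> int:
    """Relative change from base to new, rounded to a whole percent."""
    return round((new - base) / base * 100)


# (good commit, bad commit) value pairs copied from the comparison table
rows = {
    "netperf.Throughput_Mbps": (3923, 2461),
    "vmstat.procs.r": (9, 2),
    "vmstat.system.cs": (112616, 133981),
    "softirqs.SCHED": (8598, 90912),
}

for name, (good, bad) in rows.items():
    print(f"{pct_change(good, bad):+5d}%  {name}")
```

Note that the displayed values are themselves rounded, so a recomputed percentage can differ slightly from the one the report derived from the raw samples.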

The obvious changes are:
1 the bad commit has far fewer runnable processes - vmstat.procs.r
2 the context switches are higher in the bad commit - vmstat.system.cs

It all suggests the netperf processes go to sleep for some reason in the bad
commit.
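
A rough cross-check of the "going to sleep" reading, using only numbers from the table: if throughput is normalized by turbostat.%Busy, the bad commit does not move less data per unit of busy CPU time, it simply keeps the CPUs busy for a smaller share of the run. This is a sketch, assuming %Busy can be treated as an average busy fraction over the whole test; the function name is hypothetical.

```python
def throughput_per_busy_cpu(mbps: float, busy_pct: float) -> float:
    """Throughput scaled to what it would be at 100% busy (rough normalization)."""
    return mbps / (busy_pct / 100.0)


# netperf.Throughput_Mbps and turbostat.%Busy from the comparison table
good = throughput_per_busy_cpu(3923, 99.31)
bad = throughput_per_busy_cpu(2461, 54.97)

print(f"good commit: {good:.0f} Mbps per fully busy CPU share")
print(f"bad commit:  {bad:.0f} Mbps per fully busy CPU share")
```

Under that assumption the normalized figure is not lower for the bad commit, which is consistent with the loss coming from idle time rather than from more expensive processing per byte.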

I used "perf record -p one_netperf_pid -e probe:pick_next_task_idle" as
suggested by Tim to see where it went to sleep:

Samples: 78  of event 'probe:pick_next_task_idle', Event count(approx.): 78
  Children      Self  Trace output
  ■-  100.00%   100.00%  (ffffffff810fc750)
  ▒     __sendmsg_nocancel
  ▒     entry_SYSCALL_64_fastpath
  ▒     sys_sendmsg
  ▒     __sys_sendmsg
  ▒     ___sys_sendmsg
  ▒     inet_sendmsg
  ▒     sctp_sendmsg
  ▒     sctp_wait_for_sndbuf
  ▒     schedule_timeout
  ▒     schedule
  ▒     pick_next_task_idle

The call chain doesn't look insane, and I suppose sleeping in
sctp_wait_for_sndbuf may actually have something to do with a larger
sctp_chunk?

The same perf record doesn't capture any samples for the good commit,
which suggests the netperf process doesn't sleep in sctp_wait_for_sndbuf.
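
To illustrate why a larger sctp_chunk could plausibly matter here, below is a toy model only. It assumes that every queued chunk is charged against a fixed send buffer as payload plus a per-chunk overhead, and that the sender has to wait once the next chunk no longer fits. The function name and all numbers are hypothetical and are not taken from the kernel's accounting code.

```python
def chunks_that_fit(sndbuf: int, payload: int, overhead: int) -> int:
    """How many chunks can be queued before the sender would have to wait.

    Toy model: each chunk costs payload + overhead bytes of send buffer.
    """
    return sndbuf // (payload + overhead)


SNDBUF = 1000   # hypothetical send buffer size in bytes
PAYLOAD = 92    # hypothetical payload per chunk in bytes

before = chunks_that_fit(SNDBUF, PAYLOAD, overhead=8)    # smaller per-chunk overhead
after = chunks_that_fit(SNDBUF, PAYLOAD, overhead=16)    # overhead grown by 8 bytes

print(f"chunks queued before waiting: {before} -> {after}")
```

In this model a small increase in per-chunk overhead only changes the result when it crosses an integer boundary, so whether 8 extra bytes can explain the regression depends on the real accounting and buffer sizes, which this sketch does not capture.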

Regards,
Aaron
