Re: [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5% perf-stat.LLC-load-misses

From: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To: lkp@lists.01.org
Subject: Re: [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5% perf-stat.LLC-load-misses
Date: Thu, 30 Apr 2015 14:25:23 +0800
Message-ID: <20150430062523.GA25995@yliu-dev.sh.intel.com>
In-Reply-To: <20150424121559.321677ce@notabene.brown>

On Fri, Apr 24, 2015 at 12:15:59PM +1000, NeilBrown wrote:
> On Thu, 23 Apr 2015 14:55:59 +0800 Huang Ying <ying.huang@intel.com> wrote:
> 
> > FYI, we noticed the below changes on
> > 
> > git://neil.brown.name/md for-next
> > commit 878ee6792799e2f88bdcac329845efadb205252f ("RAID5: batch adjacent full stripe write")
> 
> Hi,
>  is there any chance that you could explain what some of this means?
> There is lots of data and some very pretty graphs, but no explanation.

Hi Neil,

(Sorry for the late response: Ying is on vacation.)

I guess you can simply ignore this report, as I already reported to you
a month ago that this patch made fsmark perform better in most cases:

    https://lists.01.org/pipermail/lkp/2015-March/002411.html

> 
> Which numbers are "good", which are "bad"?  Which is "worst".
> What do the graphs really show? and what would we like to see in them?
> 
> I think it is really great that you are doing this testing and reporting the
> results.  It's just so sad that I completely fail to understand them.

Sorry, it's our fault for making them hard to understand, as well as
for reporting a duplicate (well, the commit hash is different ;).

We might need to take some time to make the data easier to understand.
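
As a small aid for reading the table quoted below: the %change column
appears to be the relative difference between the two per-commit means
(left value for a87d7f782b47e030, right value for 878ee679279). The
sketch below is only an illustration under that assumption; the helper
name pct_change is made up here and is not part of lkp-tests. It
recomputes the column for a few rows copied from the table, and says
nothing about whether a given change is good or bad.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (metric, mean on a87d7f782b47e030, mean on 878ee679279),
# values copied from the comparison table in the quoted report.
rows = [
    ("vmstat.io.bo", 305908, 300427),
    ("vmstat.system.cs", 8266, 6968),
    ("perf-stat.context-switches", 2510308, 2115410),
    ("perf-stat.LLC-load-misses", 5.571e8, 7.826e8),
]

for name, before, after in rows:
    # Print in the same "signed percent, one decimal" style as the report.
    print(f"{pct_change(before, after):+7.1f}%  {name}")
```

For these four rows the recomputed values round to the figures in the
report. Rows printed with only three or four significant digits may
differ in the last decimal place, since the report presumably computes
%change from unrounded means.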

	--yliu

> 
> > 
> > 
> > testbox/testcase/testparams: lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd
> > 
> > a87d7f782b47e030  878ee6792799e2f88bdcac3298  
> > ----------------  --------------------------  
> >          %stddev     %change         %stddev
> >              \          |                \  
> >      59035 ±  0%     +18.4%      69913 ±  1%  softirqs.SCHED
> >       1330 ± 10%     +17.4%       1561 ±  4%  slabinfo.kmalloc-512.num_objs
> >       1330 ± 10%     +17.4%       1561 ±  4%  slabinfo.kmalloc-512.active_objs
> >     305908 ±  0%      -1.8%     300427 ±  0%  vmstat.io.bo
> >          1 ±  0%    +100.0%          2 ±  0%  vmstat.procs.r
> >       8266 ±  1%     -15.7%       6968 ±  0%  vmstat.system.cs
> >      14819 ±  0%      -2.1%      14503 ±  0%  vmstat.system.in
> >      18.20 ±  6%     +10.2%      20.05 ±  4%  perf-profile.cpu-cycles.raid_run_ops.handle_stripe.handle_active_stripes.raid5d.md_thread
> >       1.94 ±  9%     +90.6%       3.70 ±  9%  perf-profile.cpu-cycles.async_xor.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
> >       0.00 ±  0%      +Inf%      25.18 ±  3%  perf-profile.cpu-cycles.handle_active_stripes.isra.45.raid5d.md_thread.kthread.ret_from_fork
> >       0.00 ±  0%      +Inf%      14.14 ±  4%  perf-profile.cpu-cycles.async_copy_data.isra.42.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
> >       1.79 ±  7%    +102.9%       3.64 ±  9%  perf-profile.cpu-cycles.xor_blocks.async_xor.raid_run_ops.handle_stripe.handle_active_stripes
> >       3.09 ±  4%     -10.8%       2.76 ±  4%  perf-profile.cpu-cycles.get_active_stripe.make_request.md_make_request.generic_make_request.submit_bio
> >       0.80 ± 14%     +28.1%       1.02 ± 10%  perf-profile.cpu-cycles.mutex_lock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
> >      14.78 ±  6%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.async_copy_data.isra.38.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
> >      25.68 ±  4%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.handle_active_stripes.isra.41.raid5d.md_thread.kthread.ret_from_fork
> >       1.23 ±  5%    +140.0%       2.96 ±  7%  perf-profile.cpu-cycles.xor_sse_5_pf64.xor_blocks.async_xor.raid_run_ops.handle_stripe
> >       2.62 ±  6%     -95.6%       0.12 ± 33%  perf-profile.cpu-cycles.analyse_stripe.handle_stripe.handle_active_stripes.raid5d.md_thread
> >       0.96 ±  9%     +17.5%       1.12 ±  2%  perf-profile.cpu-cycles.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
> >  1.461e+10 ±  0%      -5.3%  1.384e+10 ±  1%  perf-stat.L1-dcache-load-misses
> >  3.688e+11 ±  0%      -2.7%   3.59e+11 ±  0%  perf-stat.L1-dcache-loads
> >  1.124e+09 ±  0%     -27.7%  8.125e+08 ±  0%  perf-stat.L1-dcache-prefetches
> >  2.767e+10 ±  0%      -1.8%  2.717e+10 ±  0%  perf-stat.L1-dcache-store-misses
> >  2.352e+11 ±  0%      -2.8%  2.287e+11 ±  0%  perf-stat.L1-dcache-stores
> >  6.774e+09 ±  0%      -2.3%   6.62e+09 ±  0%  perf-stat.L1-icache-load-misses
> >  5.571e+08 ±  0%     +40.5%  7.826e+08 ±  1%  perf-stat.LLC-load-misses
> >  6.263e+09 ±  0%     -13.7%  5.407e+09 ±  1%  perf-stat.LLC-loads
> >  1.914e+11 ±  0%      -4.2%  1.833e+11 ±  0%  perf-stat.branch-instructions
> >  1.145e+09 ±  2%      -5.6%  1.081e+09 ±  0%  perf-stat.branch-load-misses
> >  1.911e+11 ±  0%      -4.3%  1.829e+11 ±  0%  perf-stat.branch-loads
> >  1.142e+09 ±  2%      -5.1%  1.083e+09 ±  0%  perf-stat.branch-misses
> >  1.218e+09 ±  0%     +19.8%   1.46e+09 ±  0%  perf-stat.cache-misses
> >  2.118e+10 ±  0%      -5.2%  2.007e+10 ±  0%  perf-stat.cache-references
> >    2510308 ±  1%     -15.7%    2115410 ±  0%  perf-stat.context-switches
> >      39623 ±  0%     +22.1%      48370 ±  1%  perf-stat.cpu-migrations
> >  4.179e+08 ± 40%    +165.7%  1.111e+09 ± 35%  perf-stat.dTLB-load-misses
> >  3.684e+11 ±  0%      -2.5%  3.592e+11 ±  0%  perf-stat.dTLB-loads
> >  1.232e+08 ± 15%     +62.5%  2.002e+08 ± 27%  perf-stat.dTLB-store-misses
> >  2.348e+11 ±  0%      -2.5%  2.288e+11 ±  0%  perf-stat.dTLB-stores
> >    3577297 ±  2%      +8.7%    3888986 ±  1%  perf-stat.iTLB-load-misses
> >  1.035e+12 ±  0%      -3.5%  9.988e+11 ±  0%  perf-stat.iTLB-loads
> >  1.036e+12 ±  0%      -3.7%  9.978e+11 ±  0%  perf-stat.instructions
> >        594 ± 30%    +130.3%       1369 ± 13%  sched_debug.cfs_rq[0]:/.blocked_load_avg
> >         17 ± 10%     -28.2%         12 ± 23%  sched_debug.cfs_rq[0]:/.nr_spread_over
> >        210 ± 21%     +42.1%        298 ± 28%  sched_debug.cfs_rq[0]:/.tg_runnable_contrib
> >       9676 ± 21%     +42.1%      13754 ± 28%  sched_debug.cfs_rq[0]:/.avg->runnable_avg_sum
> >        772 ± 25%    +116.5%       1672 ±  9%  sched_debug.cfs_rq[0]:/.tg_load_contrib
> >       8402 ±  9%     +83.3%      15405 ± 11%  sched_debug.cfs_rq[0]:/.tg_load_avg
> >       8356 ±  9%     +82.8%      15272 ± 11%  sched_debug.cfs_rq[1]:/.tg_load_avg
> >        968 ± 25%    +100.8%       1943 ± 14%  sched_debug.cfs_rq[1]:/.blocked_load_avg
> >      16242 ±  9%     -22.2%      12643 ± 14%  sched_debug.cfs_rq[1]:/.avg->runnable_avg_sum
> >        353 ±  9%     -22.1%        275 ± 14%  sched_debug.cfs_rq[1]:/.tg_runnable_contrib
> >       1183 ± 23%     +77.7%       2102 ± 12%  sched_debug.cfs_rq[1]:/.tg_load_contrib
> >        181 ±  8%     -31.4%        124 ± 26%  sched_debug.cfs_rq[2]:/.tg_runnable_contrib
> >       8364 ±  8%     -31.3%       5745 ± 26%  sched_debug.cfs_rq[2]:/.avg->runnable_avg_sum
> >       8297 ±  9%     +81.7%      15079 ± 12%  sched_debug.cfs_rq[2]:/.tg_load_avg
> >      30439 ± 13%     -45.2%      16681 ± 26%  sched_debug.cfs_rq[2]:/.exec_clock
> >      39735 ± 14%     -48.3%      20545 ± 29%  sched_debug.cfs_rq[2]:/.min_vruntime
> >       8231 ± 10%     +82.2%      15000 ± 12%  sched_debug.cfs_rq[3]:/.tg_load_avg
> >       1210 ± 14%    +110.3%       2546 ± 30%  sched_debug.cfs_rq[4]:/.tg_load_contrib
> >       8188 ± 10%     +82.8%      14964 ± 12%  sched_debug.cfs_rq[4]:/.tg_load_avg
> >       8132 ± 10%     +83.1%      14890 ± 12%  sched_debug.cfs_rq[5]:/.tg_load_avg
> >        749 ± 29%    +205.9%       2292 ± 34%  sched_debug.cfs_rq[5]:/.blocked_load_avg
> >        963 ± 30%    +169.9%       2599 ± 33%  sched_debug.cfs_rq[5]:/.tg_load_contrib
> >      37791 ± 32%     -38.6%      23209 ± 13%  sched_debug.cfs_rq[6]:/.min_vruntime
> >        693 ± 25%    +132.2%       1609 ± 29%  sched_debug.cfs_rq[6]:/.blocked_load_avg
> >      10838 ± 13%     -39.2%       6587 ± 13%  sched_debug.cfs_rq[6]:/.avg->runnable_avg_sum
> >      29329 ± 27%     -33.2%      19577 ± 10%  sched_debug.cfs_rq[6]:/.exec_clock
> >        235 ± 14%     -39.7%        142 ± 14%  sched_debug.cfs_rq[6]:/.tg_runnable_contrib
> >       8085 ± 10%     +83.6%      14848 ± 12%  sched_debug.cfs_rq[6]:/.tg_load_avg
> >        839 ± 25%    +128.5%       1917 ± 18%  sched_debug.cfs_rq[6]:/.tg_load_contrib
> >       8051 ± 10%     +83.6%      14779 ± 12%  sched_debug.cfs_rq[7]:/.tg_load_avg
> >        156 ± 34%     +97.9%        309 ± 19%  sched_debug.cpu#0.cpu_load[4]
> >        160 ± 25%     +64.0%        263 ± 16%  sched_debug.cpu#0.cpu_load[2]
> >        156 ± 32%     +83.7%        286 ± 17%  sched_debug.cpu#0.cpu_load[3]
> >        164 ± 20%     -35.1%        106 ± 31%  sched_debug.cpu#2.cpu_load[0]
> >        249 ± 15%     +80.2%        449 ± 10%  sched_debug.cpu#4.cpu_load[3]
> >        231 ± 11%    +101.2%        466 ± 13%  sched_debug.cpu#4.cpu_load[2]
> >        217 ± 14%    +189.9%        630 ± 38%  sched_debug.cpu#4.cpu_load[0]
> >      71951 ±  5%     +21.6%      87526 ±  7%  sched_debug.cpu#4.nr_load_updates
> >        214 ±  8%    +146.1%        527 ± 27%  sched_debug.cpu#4.cpu_load[1]
> >        256 ± 17%     +75.7%        449 ± 13%  sched_debug.cpu#4.cpu_load[4]
> >        209 ± 23%     +98.3%        416 ± 48%  sched_debug.cpu#5.cpu_load[2]
> >      68024 ±  2%     +18.8%      80825 ±  1%  sched_debug.cpu#5.nr_load_updates
> >        217 ± 26%     +74.9%        380 ± 45%  sched_debug.cpu#5.cpu_load[3]
> >        852 ± 21%     -38.3%        526 ± 22%  sched_debug.cpu#6.curr->pid
> > 
> > lkp-st02: Core2
> > Memory: 8G
> > 
> > 
> > 
> > 
> >                                 perf-stat.cache-misses
> > 
> >   1.6e+09 O+-----O--O---O--O---O--------------------------------------------+
> >           |                       O   O  O   O  O   O  O   O  O   O         |
> >   1.4e+09 ++                                                                |
> >   1.2e+09 *+.*...*      *..*      *      *...*..*...*..*...*..*...*..*...*..*
> >           |      :      :  :      :      :                                  |
> >     1e+09 ++      :    :    :    : :    :                                   |
> >           |       :    :    :    : :    :                                   |
> >     8e+08 ++      :    :    :    : :    :                                   |
> >           |       :   :      :   :  :   :                                   |
> >     6e+08 ++       :  :      :  :   :  :                                    |
> >     4e+08 ++       : :        : :    : :                                    |
> >           |        : :        : :    : :                                    |
> >     2e+08 ++       : :        : :    : :                                    |
> >           |         :          :      :                                     |
> >         0 ++-O------*----------*------*-------------------------------------+
> > 
> > 
> >                             perf-stat.L1-dcache-prefetches
> > 
> >   1.2e+09 ++----------------------------------------------------------------+
> >           *..*...*      *..*      *        ..*..  ..*..*...*..*...*..*...*..*
> >     1e+09 ++     :      :  :      :      *.     *.                          |
> >           |      :     :    :     ::     :                                  |
> >           |       :    :    :    : :     :                        O         |
> >     8e+08 O+     O: O  :O  O:  O :O:  O :O   O  O   O  O   O  O             |
> >           |       :   :      :   :  :   :                                   |
> >     6e+08 ++      :   :      :   :  :   :                                   |
> >           |        :  :      :  :   :   :                                   |
> >     4e+08 ++       :  :      :  :   :  :                                    |
> >           |        : :        : :    : :                                    |
> >           |        : :        : :    : :                                    |
> >     2e+08 ++        ::        ::     : :                                    |
> >           |         :          :      :                                     |
> >         0 ++-O------*----------*------*-------------------------------------+
> > 
> > 
> >                               perf-stat.LLC-load-misses
> > 
> >   1e+09 ++------------------------------------------------------------------+
> >   9e+08 O+     O   O  O   O  O                                              |
> >         |                        O   O  O   O                               |
> >   8e+08 ++                                     O   O   O  O   O  O          |
> >   7e+08 ++                                                                  |
> >         |                                                                   |
> >   6e+08 *+..*..*      *...*      *      *...*..*...*...*..*...*..*...*..*...*
> >   5e+08 ++      :     :   :      ::     :                                   |
> >   4e+08 ++      :    :     :    : :    :                                    |
> >         |        :   :     :    :  :   :                                    |
> >   3e+08 ++       :   :      :  :   :   :                                    |
> >   2e+08 ++        : :       :  :    : :                                     |
> >         |         : :       : :     : :                                     |
> >   1e+08 ++         :         ::      :                                      |
> >       0 ++--O------*---------*-------*--------------------------------------+
> > 
> > 
> >                               perf-stat.context-switches
> > 
> >     3e+06 ++----------------------------------------------------------------+
> >           |                              *...*..*...                        |
> >   2.5e+06 *+.*...*      *..*      *      :          *..*...  .*...*..*...  .*
> >           |      :      :  :      :      :                 *.            *. |
> >           O      O: O  :O  O:  O  ::    :       O   O  O   O  O   O         |
> >     2e+06 ++      :    :    :    :O:  O :O   O                              |
> >           |       :    :    :    : :    :                                   |
> >   1.5e+06 ++      :   :      :   :  :   :                                   |
> >           |        :  :      :   :  :  :                                    |
> >     1e+06 ++       :  :      :  :   :  :                                    |
> >           |        : :        : :    : :                                    |
> >           |        : :        : :    : :                                    |
> >    500000 ++        ::        : :    ::                                     |
> >           |         :          :      :                                     |
> >         0 ++-O------*----------*------*-------------------------------------+
> > 
> > 
> >                                   vmstat.system.cs
> > 
> >   10000 ++------------------------------------------------------------------+
> >    9000 ++                              *...*..                             |
> >         *...*..*      *...*      *      :      *...*...*..  ..*..*...*..  ..*
> >    8000 ++     :      :   :      :      :                 *.            *.  |
> >    7000 O+     O:  O  O   O: O  : :    :       O   O   O  O   O  O          |
> >         |       :    :     :    :O:  O :O   O                               |
> >    6000 ++      :    :     :    : :    :                                    |
> >    5000 ++       :   :     :   :   :   :                                    |
> >    4000 ++       :   :      :  :   :  :                                     |
> >         |        :  :       :  :   :  :                                     |
> >    3000 ++        : :       : :     : :                                     |
> >    2000 ++        : :       : :     : :                                     |
> >         |         : :        ::     ::                                      |
> >    1000 ++         :         :       :                                      |
> >       0 ++--O------*---------*-------*--------------------------------------+
> > 
> > 
> > 	[*] bisect-good sample
> > 	[O] bisect-bad  sample
> > 
> > To reproduce:
> > 
> > 	apt-get install ruby
> > 	git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
> > 	cd lkp-tests
> > 	bin/setup-local job.yaml # the job file attached in this email
> > 	bin/run-local   job.yaml
> > 
> > 
> > Disclaimer:
> > Results have been estimated based on internal Intel analysis and are provided
> > for informational purposes only. Any difference in system hardware or software
> > design or configuration may affect actual performance.
> > 
> > 
> > Thanks,
> > Ying Huang
> > 
>
