linux-kernel.vger.kernel.org archive mirror
* [sched] 23f0d2093c: -12.6% regression on sparse file copy
@ 2014-01-05  9:04 fengguang.wu
  2014-01-06  0:30 ` Joonsoo Kim
  0 siblings, 1 reply; 4+ messages in thread
From: fengguang.wu @ 2014-01-05  9:04 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Peter Zijlstra, Ingo Molnar, LKML, lkp

Hi Joonsoo,

We noticed the below changes for commit 23f0d2093c ("sched: Factor out
code to should_we_balance()") in test vm-scalability/300s-lru-file-readtwice

        95a79b805b935f4  23f0d2093c789e612185180c4
        ---------------  -------------------------
==>           4.45 ~ 5%   +1777.7%      83.60 ~ 5%  vm-scalability.stddev
==>       14966511 ~ 0%     -12.6%   13084545 ~ 2%  vm-scalability.throughput
                38 ~ 9%    +406.3%        193 ~ 7%  proc-vmstat.kswapd_low_wmark_hit_quickly
            610823 ~ 0%     -41.4%     357990 ~ 0%  softirqs.SCHED
         5.424e+08 ~ 0%     -38.5%  3.338e+08 ~ 6%  proc-vmstat.pgdeactivate
          4.68e+08 ~ 0%     -37.5%  2.924e+08 ~ 6%  proc-vmstat.pgrefill_normal
         5.549e+08 ~ 0%     -37.1%  3.491e+08 ~ 6%  proc-vmstat.pgactivate
          14938509 ~ 1%     +27.0%   18974176 ~ 1%  vmstat.memory.free
            978771 ~ 1%     +23.9%    1212704 ~ 3%  numa-vmstat.node2.nr_free_pages
           3747434 ~ 0%     +21.7%    4560196 ~ 2%  proc-vmstat.nr_free_pages
==>      1.353e+08 ~ 0%     +18.8%  1.607e+08 ~ 0%  proc-vmstat.numa_foreign
         1.353e+08 ~ 0%     +18.8%  1.607e+08 ~ 0%  proc-vmstat.numa_miss
         1.353e+08 ~ 0%     +18.8%  1.607e+08 ~ 0%  proc-vmstat.numa_other
           3936842 ~ 1%     +22.2%    4812045 ~ 4%  numa-meminfo.node2.MemFree
          21803812 ~ 0%     +17.7%   25661536 ~ 4%  numa-vmstat.node3.numa_foreign
          73701524 ~ 0%     +15.0%   84769542 ~ 0%  proc-vmstat.pgscan_direct_dma32
          73700683 ~ 0%     +15.0%   84768687 ~ 0%  proc-vmstat.pgsteal_direct_dma32
         3.101e+08 ~ 0%     +11.2%  3.448e+08 ~ 0%  proc-vmstat.pgsteal_direct_normal
         3.103e+08 ~ 0%     +11.2%  3.449e+08 ~ 0%  proc-vmstat.pgscan_direct_normal
          45613907 ~ 0%     +12.6%   51342974 ~ 3%  numa-vmstat.node0.numa_other
            795639 ~ 0%     -48.6%     409113 ~13%  time.voluntary_context_switches
               375 ~ 0%      +6.1%        398 ~ 0%  time.elapsed_time
              9427 ~ 0%      -5.8%       8880 ~ 0%  time.percent_of_cpu_this_job_got
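
The percentage column in the table above is the relative change of the second commit's mean against the first. A minimal sketch of that arithmetic, assuming POSIX sh and awk; the helper name `pct_change` is illustrative and not part of the test tooling:

```shell
#!/bin/sh
# pct_change BASE NEW: print the relative change from BASE to NEW in percent,
# rounded to one decimal place, with an explicit sign.
# (Hypothetical helper, not part of the reporting tool.)
pct_change() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%+.1f\n", (new - base) / base * 100 }'
}

pct_change 14966511 13084545   # vm-scalability.throughput -> -12.6
pct_change 610823 357990       # softirqs.SCHED            -> -41.4
pct_change 375 398             # time.elapsed_time         -> +6.1
```

Rows such as vm-scalability.stddev may not reproduce to the last digit this way, presumably because the table prints rounded means.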

The test case basically does

for i in `seq 1 $nr_cpu`
do
        create_sparse_file huge-$i          # one sparse file per CPU
        dd if=huge-$i of=/dev/null &        # first reader
        dd if=huge-$i of=/dev/null &        # second reader of the same file
done

where nr_cpu=120 (the test box is a 4-socket Ivy Bridge system).
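
For readers who want to try the access pattern on a small scale, the following is a scaled-down sketch rather than the benchmark itself. It assumes GNU coreutils; `create_sparse_file` here is a stand-in built on truncate(1), since the real helper is not shown in this thread, and `nr_files`, `file_size` and `workdir` are illustrative values:

```shell
#!/bin/sh
# Sketch only: a small stand-in for the test loop.
# nr_files and file_size are illustrative, not the benchmark's settings.
nr_files=2
file_size=16M
workdir=$(mktemp -d)

# Stand-in for the suite's helper (assumption): truncate(1) extends the file
# to the requested size without writing any data blocks.
create_sparse_file() {
    truncate -s "$file_size" "$1"
}

for i in $(seq 1 "$nr_files"); do
    create_sparse_file "$workdir/huge-$i"
    # Two concurrent readers per file.
    dd if="$workdir/huge-$i" of=/dev/null bs=1M 2>/dev/null &
    dd if="$workdir/huge-$i" of=/dev/null bs=1M 2>/dev/null &
done
wait    # let all background readers finish
```

The files are left in `$workdir` for inspection; remove that directory when done.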

The change looks stable; each point below is a sample run:

                               vm-scalability.stddev

   120 ++-------------------------------------------------------------------+
       |                                                                    |
   100 ++             *            *                                        |
       | *.***        :    **      :     *    *  *     *     *              |
       **     * *.** * :  *  :*.*  :: .* :   : * :*   * :  .* :   .*   * .**|
    80 ++      *    *  *. :  *   *: ** : ::  :  *  :.*  * *   * ** :   :*   *
       |                 *        *     : ***      *     *     *    :**     |
    60 ++                               *                           *       |
       |                                                                    |
    40 ++                                                                   |
       |                                                                    |
       |                                                                    |
    20 ++                                                                   |
       | O  OO OO OOO  O OO  O                                              |
     0 OO--O--O------OO----OO-----------------------------------------------+



* Re: [sched] 23f0d2093c: -12.6% regression on sparse file copy
  2014-01-05  9:04 [sched] 23f0d2093c: -12.6% regression on sparse file copy fengguang.wu
@ 2014-01-06  0:30 ` Joonsoo Kim
  2014-01-06  7:10   ` Fengguang Wu
  0 siblings, 1 reply; 4+ messages in thread
From: Joonsoo Kim @ 2014-01-06  0:30 UTC (permalink / raw)
  To: fengguang.wu; +Cc: Peter Zijlstra, Ingo Molnar, LKML, lkp

On Sun, Jan 05, 2014 at 05:04:56PM +0800, fengguang.wu@intel.com wrote:
> Hi Joonsoo,
> 
> We noticed the below changes for commit 23f0d2093c ("sched: Factor out
> code to should_we_balance()") in test vm-scalability/300s-lru-file-readtwice

Hello, Fengguang.

There was a mistake in this patch; a fix has already been merged
into mainline.

Could you test again with commit b0cff9d ("sched: Fix load balancing
performance regression in should_we_balance()")?
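
One way to check whether a tree under test already contains the fix is to ask git whether b0cff9d is an ancestor of the checked-out commit. A minimal sketch, assuming a local clone of the mainline tree; the helper name `contains_commit` and the `~/linux` path are illustrative and not from this thread:

```shell
#!/bin/sh
# contains_commit REPO COMMIT [REF]: succeed if COMMIT is an ancestor of REF
# (default: HEAD) in the git repository at REPO.
# (Hypothetical helper name; uses `git merge-base --is-ancestor`.)
contains_commit() {
    git -C "$1" merge-base --is-ancestor "$2" "${3:-HEAD}"
}

# Example (the path is an assumption; adjust it to your clone):
#   contains_commit ~/linux b0cff9d && echo "fix present"
```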

Thanks.


* Re: [sched] 23f0d2093c: -12.6% regression on sparse file copy
  2014-01-06  0:30 ` Joonsoo Kim
@ 2014-01-06  7:10   ` Fengguang Wu
  2014-01-06  7:49     ` Joonsoo Kim
  0 siblings, 1 reply; 4+ messages in thread
From: Fengguang Wu @ 2014-01-06  7:10 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Peter Zijlstra, Ingo Molnar, LKML, lkp

Hi Joonsoo,

On Mon, Jan 06, 2014 at 09:30:52AM +0900, Joonsoo Kim wrote:
> On Sun, Jan 05, 2014 at 05:04:56PM +0800, fengguang.wu@intel.com wrote:
> > Hi Joonsoo,
> > 
> > We noticed the below changes for commit 23f0d2093c ("sched: Factor out
> > code to should_we_balance()") in test vm-scalability/300s-lru-file-readtwice
> 
> Hello, Fengguang.
> 
> There was a mistake in this patch; a fix has already been merged
> into mainline.
> 
> Could you test again with commit b0cff9d ("sched: Fix load balancing
> performance regression in should_we_balance()")?

Yes, b0cff9d completely restores the performance. Sorry for the noise!

Thanks,
Fengguang


* Re: [sched] 23f0d2093c: -12.6% regression on sparse file copy
  2014-01-06  7:10   ` Fengguang Wu
@ 2014-01-06  7:49     ` Joonsoo Kim
  0 siblings, 0 replies; 4+ messages in thread
From: Joonsoo Kim @ 2014-01-06  7:49 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: Peter Zijlstra, Ingo Molnar, LKML, lkp

On Mon, Jan 06, 2014 at 03:10:07PM +0800, Fengguang Wu wrote:
> Hi Joonsoo,
> 
> On Mon, Jan 06, 2014 at 09:30:52AM +0900, Joonsoo Kim wrote:
> > On Sun, Jan 05, 2014 at 05:04:56PM +0800, fengguang.wu@intel.com wrote:
> > > Hi Joonsoo,
> > > 
> > > We noticed the below changes for commit 23f0d2093c ("sched: Factor out
> > > code to should_we_balance()") in test vm-scalability/300s-lru-file-readtwice
> > 
> > Hello, Fengguang.
> > 
> > There was a mistake in this patch; a fix has already been merged
> > into mainline.
> > 
> > Could you test again with commit b0cff9d ("sched: Fix load balancing
> > performance regression in should_we_balance()")?
> 
> Yes, b0cff9d completely restores the performance. Sorry for the noise!

Thanks for the quick response. :)

