public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [uprobes/x86] 8ad8e9d3fd6: -7.5% aim7.2000.jobs-per-min, -45.7% turbostat.%c1
@ 2014-05-15  3:56 Jet Chen
  2014-05-15 12:33 ` Oleg Nesterov
  0 siblings, 1 reply; 3+ messages in thread
From: Jet Chen @ 2014-05-15  3:56 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Fengguang Wu, LKML, lkp

[-- Attachment #1: Type: text/plain, Size: 1767 bytes --]

Hi Oleg,

we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit 8ad8e9d3fd64f101eed6652964670672d699e563 ("uprobes/x86: Introduce uprobe_xol_ops and arch_uprobe->ops")

Test case: lkp-snb01/aim7/signal_test

34e7317d6ae8f61  8ad8e9d3fd64f101eed665296
---------------  -------------------------
    230689 ~ 0%      -7.5%     213485 ~ 0%  TOTAL aim7.2000.jobs-per-min
      0.51 ~30%     -45.7%       0.28 ~10%  TOTAL turbostat.%c1
       430 ~17%     +27.8%        549 ~15%  TOTAL vmstat.procs.r
      0.83 ~14%     +38.1%       1.15 ~12%  TOTAL perf-profile.cpu-cycles.copy_pte_range.copy_page_range.copy_process.do_fork.sys_clone
    106076 ~ 4%     +22.4%     129816 ~ 3%  TOTAL softirqs.RCU
     12117 ~ 6%      -9.9%      10914 ~ 6%  TOTAL slabinfo.kmalloc-256.active_objs
     32163 ~17%     +39.4%      44824 ~12%  TOTAL time.voluntary_context_switches
    276487 ~ 1%     +16.5%     322091 ~ 1%  TOTAL time.involuntary_context_switches
     83.14 ~ 0%     +13.3%      94.21 ~ 0%  TOTAL time.user_time
    108800 ~ 2%      +9.4%     119014 ~ 3%  TOTAL time.minor_page_faults
      1255 ~ 0%      +9.9%       1379 ~ 0%  TOTAL time.system_time
      6774 ~ 1%      +8.8%       7373 ~ 1%  TOTAL vmstat.system.cs
     52.15 ~ 0%      +8.0%      56.34 ~ 0%  TOTAL time.elapsed_time
     25185 ~ 0%      +2.4%      25784 ~ 0%  TOTAL vmstat.system.in
     79.23 ~ 0%      +2.0%      80.80 ~ 0%  TOTAL turbostat.%c0
      2567 ~ 0%      +1.9%       2615 ~ 0%  TOTAL time.percent_of_cpu_this_job_got
       112 ~ 0%      +1.3%        113 ~ 0%  TOTAL turbostat.Cor_W
       138 ~ 0%      +1.0%        139 ~ 0%  TOTAL turbostat.Pkg_W


Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent


Thanks,
Jet



[-- Attachment #2: reproduce --]
[-- Type: text/plain, Size: 2360 bytes --]

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [uprobes/x86] 8ad8e9d3fd6: -7.5% aim7.2000.jobs-per-min, -45.7% turbostat.%c1
  2014-05-15  3:56 [uprobes/x86] 8ad8e9d3fd6: -7.5% aim7.2000.jobs-per-min, -45.7% turbostat.%c1 Jet Chen
@ 2014-05-15 12:33 ` Oleg Nesterov
  2014-05-15 12:54   ` Fengguang Wu
  0 siblings, 1 reply; 3+ messages in thread
From: Oleg Nesterov @ 2014-05-15 12:33 UTC (permalink / raw)
  To: Jet Chen; +Cc: Fengguang Wu, LKML, lkp

Hi Jet,

On 05/15, Jet Chen wrote:
>
> we noticed the below changes on
>
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> commit 8ad8e9d3fd64f101eed6652964670672d699e563 ("uprobes/x86: Introduce uprobe_xol_ops and arch_uprobe->ops")
>
> Test case: lkp-snb01/aim7/signal_test
>
> 34e7317d6ae8f61  8ad8e9d3fd64f101eed665296
> ---------------  -------------------------
>     230689 ~ 0%      -7.5%     213485 ~ 0%  TOTAL aim7.2000.jobs-per-min
>       0.51 ~30%     -45.7%       0.28 ~10%  TOTAL turbostat.%c1
>        430 ~17%     +27.8%        549 ~15%  TOTAL vmstat.procs.r
>       0.83 ~14%     +38.1%       1.15 ~12%  TOTAL perf-profile.cpu-cycles.copy_pte_range.copy_page_range.copy_process.do_fork.sys_clone
>     106076 ~ 4%     +22.4%     129816 ~ 3%  TOTAL softirqs.RCU
>      12117 ~ 6%      -9.9%      10914 ~ 6%  TOTAL slabinfo.kmalloc-256.active_objs
>      32163 ~17%     +39.4%      44824 ~12%  TOTAL time.voluntary_context_switches
>     276487 ~ 1%     +16.5%     322091 ~ 1%  TOTAL time.involuntary_context_switches
>      83.14 ~ 0%     +13.3%      94.21 ~ 0%  TOTAL time.user_time
>     108800 ~ 2%      +9.4%     119014 ~ 3%  TOTAL time.minor_page_faults
>       1255 ~ 0%      +9.9%       1379 ~ 0%  TOTAL time.system_time
>       6774 ~ 1%      +8.8%       7373 ~ 1%  TOTAL vmstat.system.cs
>      52.15 ~ 0%      +8.0%      56.34 ~ 0%  TOTAL time.elapsed_time
>      25185 ~ 0%      +2.4%      25784 ~ 0%  TOTAL vmstat.system.in
>      79.23 ~ 0%      +2.0%      80.80 ~ 0%  TOTAL turbostat.%c0
>       2567 ~ 0%      +1.9%       2615 ~ 0%  TOTAL time.percent_of_cpu_this_job_got
>        112 ~ 0%      +1.3%        113 ~ 0%  TOTAL turbostat.Cor_W
>        138 ~ 0%      +1.0%        139 ~ 0%  TOTAL turbostat.Pkg_W

Cough... this looks simply impossible ;)

Not only this patch "obviously can not" cause any noticable difference
performance-wise, the code changed by this patch is not executed unless
you play with uprobes?

Oleg.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [uprobes/x86] 8ad8e9d3fd6: -7.5% aim7.2000.jobs-per-min, -45.7% turbostat.%c1
  2014-05-15 12:33 ` Oleg Nesterov
@ 2014-05-15 12:54   ` Fengguang Wu
  0 siblings, 0 replies; 3+ messages in thread
From: Fengguang Wu @ 2014-05-15 12:54 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Jet Chen, LKML, lkp

Hi Oleg,

On Thu, May 15, 2014 at 02:33:32PM +0200, Oleg Nesterov wrote:
> Hi Jet,
> 
> On 05/15, Jet Chen wrote:
> >
> > we noticed the below changes on
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> > commit 8ad8e9d3fd64f101eed6652964670672d699e563 ("uprobes/x86: Introduce uprobe_xol_ops and arch_uprobe->ops")
> >
> > Test case: lkp-snb01/aim7/signal_test
> >
> > 34e7317d6ae8f61  8ad8e9d3fd64f101eed665296
> > ---------------  -------------------------
> >     230689 ~ 0%      -7.5%     213485 ~ 0%  TOTAL aim7.2000.jobs-per-min
> >       0.51 ~30%     -45.7%       0.28 ~10%  TOTAL turbostat.%c1
> >        430 ~17%     +27.8%        549 ~15%  TOTAL vmstat.procs.r
> >       0.83 ~14%     +38.1%       1.15 ~12%  TOTAL perf-profile.cpu-cycles.copy_pte_range.copy_page_range.copy_process.do_fork.sys_clone
> >     106076 ~ 4%     +22.4%     129816 ~ 3%  TOTAL softirqs.RCU
> >      12117 ~ 6%      -9.9%      10914 ~ 6%  TOTAL slabinfo.kmalloc-256.active_objs
> >      32163 ~17%     +39.4%      44824 ~12%  TOTAL time.voluntary_context_switches
> >     276487 ~ 1%     +16.5%     322091 ~ 1%  TOTAL time.involuntary_context_switches
> >      83.14 ~ 0%     +13.3%      94.21 ~ 0%  TOTAL time.user_time
> >     108800 ~ 2%      +9.4%     119014 ~ 3%  TOTAL time.minor_page_faults
> >       1255 ~ 0%      +9.9%       1379 ~ 0%  TOTAL time.system_time
> >       6774 ~ 1%      +8.8%       7373 ~ 1%  TOTAL vmstat.system.cs
> >      52.15 ~ 0%      +8.0%      56.34 ~ 0%  TOTAL time.elapsed_time
> >      25185 ~ 0%      +2.4%      25784 ~ 0%  TOTAL vmstat.system.in
> >      79.23 ~ 0%      +2.0%      80.80 ~ 0%  TOTAL turbostat.%c0
> >       2567 ~ 0%      +1.9%       2615 ~ 0%  TOTAL time.percent_of_cpu_this_job_got
> >        112 ~ 0%      +1.3%        113 ~ 0%  TOTAL turbostat.Cor_W
> >        138 ~ 0%      +1.0%        139 ~ 0%  TOTAL turbostat.Pkg_W
> 
> Cough... this looks simply impossible ;)
> 
> Not only this patch "obviously can not" cause any noticable difference
> performance-wise, the code changed by this patch is not executed unless
> you play with uprobes?

Yes this is interesting. From bisect POV, we do find kernels
before/after this commit have some clearly different behavior,
as you can see in some of the below graphs. For example, the
"time.user_time" numbers are clearly separable before/after the
first bad commit.

Legend:

        [*] bisect-good sample
        [O] bisect-bad  sample

                                 time.user_time

  98 ++--------------------------------O------------------------------------+
     |     O   O O O O O   O     O   O    O   O                 O           |
  96 O+O O   O           O   O O   O        O                     O         |
  94 ++                                           O O O O O   O       O O   O
     |                                          O           O       O     O |
  92 ++                                                                     |
     |                                                                      |
  90 ++                                                                     |
     |                                                                      |
  88 ++                                                                     |
  86 ++                                                                     |
     |              .*                                                      |
  84 *+ .*.   .*. .*  +                                                     |
     | *   *.*   *     *.*.*.*.*.*.*.*.*..*.*                               |
  82 ++---------------------------------------------------------------------+


                                 time.system_time

  1420 ++-------------------------------------------------------------------+
       |                                                        O           |
  1400 O+O O O O                                                  O         |
  1380 ++                                OO     O O         O       O O O O O
       |         O O   O O O   O O O O O            O O O O   O             |
  1360 ++            O                                                      |
  1340 ++                    O              O O                             |
       |                                                                    |
  1320 ++                                                                   |
  1300 ++                                                                   |
       |                                                                    |
  1280 ++                                                                   |
  1260 ++            *.                                                     |
       *.*.*.*.*.   +  *.*.*.*.*.*.*.*.*.**.*                               |
  1240 ++--------*-*--------------------------------------------------------+


                         time.percent_of_cpu_this_job_got

  2630 ++-------------------------------------------------------------------+
       |                                                        O O         |
  2620 O+O O O O                         OO                           O O   O
  2610 ++        O O     O     O O O O O        O O O O O O O O     O     O |
       |             O O   O O                O                             |
  2600 ++                                   O                               |
       |                                                                    |
  2590 ++                                                                   |
       |                                                                    |
  2580 ++                                                                   |
  2570 ++                                                                   |
       |            .*.*.*.*.*.*. .*.*.*.**.*                               |
  2560 *+*.*.*.*.*.*             *                                          |
       |                                                                    |
  2550 ++-------------------------------------------------------------------+


                                time.elapsed_time

  58 ++---------------------------------------------------------------------+
     |                                                                      |
  57 ++O O O                                                    O O         |
     O       O                         O  O                 O       O O O O O
  56 ++        O O   O O O   O O O O O          O O O O O O   O             |
     |             O       O                O O                             |
  55 ++                                                                     |
     |                                                                      |
  54 ++                                                                     |
     |                                                                      |
  53 ++                                                                     |
     |            .*.*.                                                     |
  52 *+*.*.*.*.*.*     *.*.*.*.*.*.*.*.*..*.*                               |
     |                                                                      |
  51 ++---------------------------------------------------------------------+


                          time.involuntary_context_switches

  340000 ++-----------------------------------------------------------------+
  330000 ++                         O   O OO                     O      O   |
         O         O          O   O   O          O                O O O     |
  320000 ++    O                                   O O   O                O O
  310000 ++  O   O  O   O O     O                      O   O                |
  300000 ++O                                   O               O            |
  290000 ++                 O                O               O              |
         |            O         *      .*                                   |
  280000 ++           *.    *  : :   .*  + *.                               |
  270000 ++          :  *  : + : : .*     *  *                              |
  260000 ++       .* :   + :  *   *                                         |
  250000 ++   .*.*  *     *                                                 |
         |.*.*                                                              |
  240000 *+                                                                 |
  230000 ++-----------------------------------------------------------------+


                                    softirqs.RCU

  140000 ++-------------------------O---O-----------------------------------+
         |                    O                        O   O          O     |
  130000 ++                                      O                  O     O |
         O                                 O                                O
         |   O   O                                 O              O     O   |
  120000 ++    O   O    O       O                    O   O   O              |
         |          O     O       O   O   O    O                 O          |
  110000 ++                 O   *.   .*. .**                                |
         | O          O        :  *.*   *   :                  O            |
  100000 ++             *      :            :O                              |
         |     *.*  *. + + .*.*              *                              |
         |    :   + : *   *                                                 |
   90000 ++*. :    *                                                        |
         |+  *                                                              |
   80000 *+-----------------------------------------------------------------+


                               aim7.2000.jobs-per-min

  235000 ++-----------------------------------------------------------------+
         |                                                                  |
         *.*.*.*.*.**.   .*.*. .*.*.*.*.*. *.*                              |
  230000 ++           *.*     *           *                                 |
         |                                                                  |
         |                                                                  |
  225000 ++                                                                 |
         |                                                                  |
  220000 ++                                                                 |
         |                                                                  |
         |                    O              O O                            |
  215000 ++        OO O O O O   O                                           |
         |                        O O O O        O O O O O O O O    O O O O O
         O O O O O                        OO                      O         |
  210000 ++------------------------------------------------------O----------+


                                  vmstat.system.in

  26200 ++O-----------------------------------------------------------------+
        O   O O O                                                 O         |
  26000 ++          O    O     O         O O                                |
        |         O                                             O           |
  25800 ++                 O       O   O                            O O O O O
        |             O O    O   O   O       O   O O O   O  O               |
  25600 ++                                     O       O   O  O             |
        |                  *        .*                                      |
  25400 ++*. .*.*. .*      :       *  :                                     |
        | : *     *  +    : :      :  :                                     |
  25200 ++            *.* : :     :    :.*.                                 |
        |:               :   :.*. :    *   *.*                              |
  25000 ++               *   *   *                                          |
        *                                                                   |
  24800 ++------------------------------------------------------------------+


                                   turbostat.%c0

  81.5 ++-------------------------------------------------------------------+
       |                                                                    |
    81 ++                                                       O O         |
       O O O O O                 O O O O OO     O O                   O O O O
       |         O O   O O O   O                    O O O O O O     O       |
  80.5 ++            O       O              O O                             |
       |                                                                    |
    80 ++                                                                   |
       |                                                                    |
  79.5 ++                                                                   |
       |             *.       .*.   .*.   *.                                |
       *. .*. .*.*. +  *.*.*.*   *.*   *.*  *                               |
    79 ++*   *     *                                                        |
       |                                                                    |
  78.5 ++-------------------------------------------------------------------+


                                   turbostat.Pkg_W

    141 ++------------------------------------------------------------------+
        |                                    O                              |
  140.5 ++                                             O   O                |
    140 ++    O              O   O O O O O O                O   O           |
        O O O   O O O O        O               O O            O   O     O   |
  139.5 ++              O  O         *             O O   O            O   O O
        |                O          : :                             O       |
    139 ++                          : :                                     |
        |                  *      .*   :                                    |
  138.5 ++                + :  *.*     *.                                   |
    138 ++*.*.*      .*. *  : +          *                                  |
        *      + .*.*   *    *            +                                 |
  137.5 ++      *                          *.*                              |
        |                                                                   |
    137 ++------------------------------------------------------------------+


                                   turbostat.Cor_W

  114.5 ++-----------------------------------O------------------------------+
        |                                              O   O                |
    114 ++                                                      O           |
        | O O O                  O   O   O O     O          O     O     O   |
        O       O O O        O     O   O       O     O   O    O     O     O O
  113.5 ++            O O  O   O                   O                  O     |
        |                O           *                                      |
    113 ++                           ::                                     |
        |                  *        : :                                     |
  112.5 ++                : :    *. :  :                                    |
        |                 : :   +  *   *.                                   |
        |     *          *   :.*         *                                  |
    112 *+   + :     .*. :   *            +                                 |
        | *.*  : .*.*   *                  *.*                              |
  111.5 ++------*-----------------------------------------------------------+


                           proc-vmstat.numa_pte_updates

  7000 ++--O----------------------------------------------------------------+
       |                                                                    |
  6000 ++O                                                                  |
       |       O                                                  O         |
  5000 ++            O                                                      |
       |                   O                          O   O               O |
  4000 ++                                     O             O               |
       O                                 O                    O             |
  3000 ++    O                                                              |
       |                                  O                     O   O       |
  2000 ++                *   * O * O * O                              O     |
       |         * O    :O: : : :O+ +O:     O     O O   O               O   |
  1000 ++       +O+ .*.O: : :O: :  *  : .*      O                           O
       |  .*. .*   *   *   *   *       *  *.                                |
     0 *+*---*------------------------------*-------------------------------+


                           proc-vmstat.numa_hint_faults

  2500 ++-------------------------------------------------------------------+
       |                                                                    |
       |   O                                                                |
  2000 ++O                                                                  |
       |       O                                                  O         |
       |             O                                                      |
  1500 ++                  O                          O   O O             O |
       O                                 O    O                             |
  1000 ++                                                     O             |
       |     O                            O                     O   O       |
       |                 *   * O   O   O                                    |
   500 ++        * O    :O: : :  O. .O      O     O O   O             O O   |
       |        +O+    O: : :O: +  *  + .*      O                           O
       |  .*. .*   *.*.*   *   *       *  *.                                |
     0 *+*---*------------------------------*-------------------------------+

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2014-05-15 12:54 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-15  3:56 [uprobes/x86] 8ad8e9d3fd6: -7.5% aim7.2000.jobs-per-min, -45.7% turbostat.%c1 Jet Chen
2014-05-15 12:33 ` Oleg Nesterov
2014-05-15 12:54   ` Fengguang Wu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox