From: Huang Ying <ying.huang@intel.com>
To: lkp@lists.01.org
Subject: Re: [SUNRPC] c4a7ca77494: +6.0% fsmark.time.involuntary_context_switches, no primary result change
Date: Mon, 16 Feb 2015 08:28:23 +0800
Message-ID: <1424046503.5538.23.camel@intel.com>
In-Reply-To: <CAHQdGtSgAC5+TJPpQ-BiKa8ZTpfw-TjZcjn1CQhCu-xBxfx2PA@mail.gmail.com>

On Sun, 2015-02-15 at 13:02 -0500, Trond Myklebust wrote:
> Hi guys,
> 
> On Sun, Feb 15, 2015 at 2:57 AM, Huang Ying <ying.huang@intel.com> wrote:
> > FYI, we noticed the below changes on
> >
> > commit c4a7ca774949960064dac11b326908f28407e8c3 ("SUNRPC: Allow waiting on memory allocation")
> >
> >
> > testbox/testcase/testparams: nhm4/fsmark/performance-1x-32t-1HDD-f2fs-nfsv4-8K-400M-fsyncBeforeClose-16d-256fpd
> >
> > 127b21b89f9d8ba0  c4a7ca774949960064dac11b32
> > ----------------  --------------------------
> >          %stddev     %change         %stddev
> >              \          |                \
> >      52524 ±  0%      +6.0%      55672 ±  0%  fsmark.time.involuntary_context_switches
> >        436 ± 14%     +54.9%        676 ± 20%  sched_debug.cfs_rq[0]:/.tg_load_contrib
> >        433 ± 15%     +54.7%        670 ± 21%  sched_debug.cfs_rq[0]:/.blocked_load_avg
> >       8348 ±  7%     +27.0%      10602 ±  9%  sched_debug.cfs_rq[0]:/.min_vruntime
> >     190081 ± 13%     +32.7%     252269 ± 13%  sched_debug.cpu#0.sched_goidle
> >     205783 ± 12%     +30.2%     267903 ± 13%  sched_debug.cpu#0.ttwu_local
> >     464065 ± 11%     +26.6%     587524 ± 12%  sched_debug.cpu#0.nr_switches
> >     464278 ± 11%     +26.6%     587734 ± 12%  sched_debug.cpu#0.sched_count
> >      15807 ± 11%     +19.6%      18910 ± 12%  sched_debug.cpu#4.nr_load_updates
> >     300041 ±  8%     +20.3%     360969 ± 10%  sched_debug.cpu#0.ttwu_count
> >       1863 ±  9%     +18.1%       2201 ± 10%  sched_debug.cfs_rq[4]:/.exec_clock
> >
> > testbox/testcase/testparams: nhm4/fsmark/performance-1x-32t-1HDD-btrfs-nfsv4-8K-400M-fsyncBeforeClose-16d-256fpd
> >
> > 127b21b89f9d8ba0  c4a7ca774949960064dac11b32
> > ----------------  --------------------------
> >          %stddev     %change         %stddev
> >              \          |                \
> >      52184 ±  0%      +5.6%      55122 ±  0%  fsmark.time.involuntary_context_switches
> >        557 ± 19%     +21.5%        677 ±  9%  sched_debug.cfs_rq[5]:/.blocked_load_avg
> >        217 ± 19%     -42.9%        124 ± 21%  sched_debug.cfs_rq[2]:/.load
> >      45852 ± 14%     -39.4%      27773 ± 24%  sched_debug.cpu#7.ttwu_local
> >        457 ± 18%     +50.1%        686 ± 20%  sched_debug.cfs_rq[0]:/.tg_load_contrib
> >        455 ± 18%     +46.7%        668 ± 19%  sched_debug.cfs_rq[0]:/.blocked_load_avg
> >      66605 ± 10%     -26.7%      48826 ± 14%  sched_debug.cpu#7.sched_goidle
> >      78249 ±  9%     -22.5%      60678 ± 11%  sched_debug.cpu#7.ttwu_count
> >     153506 ±  9%     -22.7%     118649 ± 12%  sched_debug.cpu#7.nr_switches
> >     153613 ±  9%     -22.7%     118755 ± 12%  sched_debug.cpu#7.sched_count
> >      15806 ±  6%     +19.2%      18833 ± 18%  sched_debug.cpu#4.nr_load_updates
> >       2171 ±  5%     +15.6%       2510 ± 13%  sched_debug.cfs_rq[4]:/.exec_clock
> >       9924 ± 11%     -27.0%       7244 ± 25%  sched_debug.cfs_rq[3]:/.min_vruntime
> >       3156 ±  4%     -13.4%       2734 ±  8%  sched_debug.cfs_rq[7]:/.min_vruntime
> >
> > testbox/testcase/testparams: nhm4/fsmark/performance-1x-32t-1HDD-ext4-nfsv4-9B-400M-fsyncBeforeClose-16d-256fpd
> >
> > 127b21b89f9d8ba0  c4a7ca774949960064dac11b32
> > ----------------  --------------------------
> >          %stddev     %change         %stddev
> >              \          |                \
> >     104802 ±  0%      +7.7%     112883 ±  0%  fsmark.time.involuntary_context_switches
> >     471755 ±  0%      -1.3%     465592 ±  0%  fsmark.time.voluntary_context_switches
> >       1977 ± 36%     +90.8%       3771 ±  8%  sched_debug.cpu#4.curr->pid
> >          2 ± 34%     +80.0%          4 ± 24%  sched_debug.cpu#6.cpu_load[1]
> >          4 ± 33%     +83.3%          8 ± 31%  sched_debug.cpu#6.cpu_load[0]
> >        193 ± 17%     +48.0%        286 ± 19%  sched_debug.cfs_rq[2]:/.blocked_load_avg
> >        196 ± 17%     +47.5%        290 ± 19%  sched_debug.cfs_rq[2]:/.tg_load_contrib
> >         96 ± 18%     +40.6%        135 ± 11%  sched_debug.cfs_rq[7]:/.load
> >         97 ± 18%     +38.5%        135 ± 11%  sched_debug.cpu#7.load
> >       2274 ±  7%     -16.5%       1898 ±  3%  proc-vmstat.pgalloc_dma
> >        319 ±  6%     -29.7%        224 ± 24%  sched_debug.cfs_rq[1]:/.tg_load_contrib
> >        314 ±  5%     -29.4%        222 ± 25%  sched_debug.cfs_rq[1]:/.blocked_load_avg
> >        621 ± 10%     +41.9%        881 ± 37%  sched_debug.cfs_rq[4]:/.avg->runnable_avg_sum
> >
> > nhm4: Nehalem
> > Memory: 4G
> >
> >
> >
> >
> >                       fsmark.time.involuntary_context_switches
> >
> >   114000 ++-----------------------------------------------------------------+
> >   113000 O+    O     O  O  O  O  O  O  O   O  O  O  O  O  O  O     O     O  |
> >          |  O     O                                             O     O     O
> >   112000 ++                                                                 |
> >   111000 ++                                                                 |
> >          |                                                                  |
> >   110000 ++                                                                 |
> >   109000 ++                                                                 |
> >   108000 ++                                                                 |
> >          |                                                                  |
> >   107000 ++                                                                 |
> >   106000 ++                                                                 |
> >          |                                                                  |
> >   105000 *+.*..*..*..*..*..*..*..*..*..*...*..*..*..*..*..*..*..*           |
> >   104000 ++-----------------------------------------------------------------+
> >
> >
> >         [*] bisect-good sample
> >         [O] bisect-bad  sample
> >
> > To reproduce:
> >
> >         apt-get install ruby
> >         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
> >         cd lkp-tests
> >         bin/setup-local job.yaml # the job file attached in this email
> >         bin/run-local   job.yaml
> 
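
As a sanity check on the %change column in the tables quoted above, the
figures can be re-derived from the two reported means. This is only a
minimal sketch: the helper name percent_change is made up for illustration
and is not part of lkp-tests, and the numbers are copied from the
fsmark.time.involuntary_context_switches rows.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# (mean at 127b21b89f9d8ba0, mean at c4a7ca774949) copied from the quoted
# fsmark.time.involuntary_context_switches rows, keyed by filesystem.
samples = {
    "f2fs": (52524, 55672),
    "btrfs": (52184, 55122),
    "ext4": (104802, 112883),
}

for fs, (base, new) in samples.items():
    print(f"{fs}: {percent_change(base, new):+.1f}%")
# prints:
# f2fs: +6.0%
# btrfs: +5.6%
# ext4: +7.7%
```

The results agree with the reported +6.0%, +5.6% and +7.7%.
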
> So this is on a loopback NFS setup (i.e. the server resides on the
> same node as the client, which just mounts from the loopback IP
> address 127.0.0.1)?

Yes.  This is on a loopback NFS setup.

> That's a fairly quirky setup as far as memory management goes. In low
> memory situations, you have a very nasty feedback mechanism whereby
> the NFS server ends up pushing the client to write back more data,
> increasing the memory pressure on the NFS server, etc.
> It is quite possible that allowing the NFS client to block more
> aggressively in low memory situations could worsen that feedback
> mechanism. However, that's not our main target platform; we actively
> discourage people from using loopback NFS in production systems.
> 
> Is there any way you could confirm this performance change using a
> remote NFS server instead of the loopback NFS?

We are working on a remote NFS setup now :)
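
One way to double-check that a test box really mounts from a remote
server rather than from 127.0.0.1 is to scan the mount table for NFS
entries whose server address is a loopback address. The sketch below
assumes the /proc/mounts field layout (device, mount point, fstype, ...);
the function name loopback_nfs_mounts and the sample paths are
hypothetical and are not taken from the lkp-tests job file.

```python
import ipaddress
from typing import List


def loopback_nfs_mounts(mounts_text: str) -> List[str]:
    """Return mount points of NFS mounts whose server is a loopback address."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts layout: device, mount point, fstype, options, ...
        if len(fields) < 3 or fields[2] not in ("nfs", "nfs4"):
            continue
        source, mount_point = fields[0], fields[1]
        # "server:/export" -> "server"; also accept "[::1]:/export"
        host = source.rsplit(":", 1)[0].strip("[]")
        try:
            if ipaddress.ip_address(host).is_loopback:
                found.append(mount_point)
        except ValueError:
            # Not a literal IP address; only the name "localhost" is
            # treated as loopback here (an assumption of this sketch).
            if host == "localhost":
                found.append(mount_point)
    return found


# Hypothetical mount table for illustration only.
sample = """\
/dev/sda1 / ext4 rw,relatime 0 0
127.0.0.1:/export /mnt/loop nfs4 rw,relatime 0 0
192.168.1.10:/export /mnt/remote nfs4 rw,relatime 0 0
"""
print(loopback_nfs_mounts(sample))
```

On a live system the input would be the contents of /proc/mounts; an
empty result suggests that no NFS mount goes through the loopback
interface.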

Best Regards,
Huang, Ying


