From: Andrew Theurer <habanero@linux.vnet.ibm.com>
To: Avi Kivity <avi@redhat.com>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>,
Rik van Riel <riel@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
Srikar <srikar@linux.vnet.ibm.com>,
"Nikunj A. Dadhania" <nikunj@linux.vnet.ibm.com>,
KVM <kvm@vger.kernel.org>, Jiannan Ouyang <ouyang@cs.pitt.edu>,
chegu vinod <chegu_vinod@hp.com>,
LKML <linux-kernel@vger.kernel.org>,
Srivatsa Vaddagiri <srivatsa.vaddagiri@gmail.com>,
Gleb Natapov <gleb@redhat.com>
Subject: Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler
Date: Thu, 04 Oct 2012 09:41:03 -0500
Message-ID: <1349361663.5551.56.camel@oc6622382223.ibm.com>
In-Reply-To: <506D83EE.2020303@redhat.com>
On Thu, 2012-10-04 at 14:41 +0200, Avi Kivity wrote:
> On 10/04/2012 12:49 PM, Raghavendra K T wrote:
> > On 10/03/2012 10:35 PM, Avi Kivity wrote:
> >> On 10/03/2012 02:22 PM, Raghavendra K T wrote:
> >>>> So I think it's worth trying again with ple_window of 20000-40000.
> >>>>
> >>>
> >>> Hi Avi,
> >>>
> >>> I ran different benchmarks with increasing ple_window, and the results
> >>> do not seem encouraging for increasing ple_window.
> >>
> >> Thanks for testing! Comments below.
> >>
> >>> Results:
> >>> 16 core PLE machine with 16 vcpu guest.
> >>>
> >>> base kernel = 3.6-rc5 + ple handler optimization patch
> >>> base_pleopt_8k = base kernel + ple window = 8k
> >>> base_pleopt_16k = base kernel + ple window = 16k
> >>> base_pleopt_32k = base kernel + ple window = 32k
> >>>
> >>>
> >>> Percentage improvements of benchmarks w.r.t base_pleopt with
> >>> ple_window = 4096
> >>>
> >>>                 base_pleopt_8k   base_pleopt_16k   base_pleopt_32k
> >>> -----------------------------------------------------------------
> >>>
> >>> kernbench_1x       -5.54915        -15.94529         -44.31562
> >>> kernbench_2x       -7.89399        -17.75039         -37.73498
> >>
> >> So, 44% degradation even with no overcommit? That's surprising.
> >
> > Yes. Kernbench was run with #threads = #vcpu * 2 as usual. Is
> > spending 8 times the original ple_window cycles on 16 vcpus
> > significant?
>
> A PLE exit when not overcommitted cannot do any good; it is better to
> spin in the guest rather than look for candidates on the host. In fact
> when we benchmark we often disable PLE completely.
Agreed. However, I really do not understand why kernbench regressed
with a bigger ple_window. It should stay the same or improve. Raghu, do
you have perf data for the kernbench runs?
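
When comparing runs like these, it can help to record the PLE settings that were actually in effect on the host next to each result. A minimal sketch, assuming kvm_intel exposes ple_gap and ple_window as module parameters under /sys/module/kvm_intel/parameters (the path and the helper name read_ple_params are my assumptions, not something taken from this thread):

```python
from pathlib import Path

# Assumption: kvm_intel publishes its module parameters in this sysfs
# directory. The path is not quoted from the thread.
PARAM_DIR = Path("/sys/module/kvm_intel/parameters")


def read_ple_params(param_dir=PARAM_DIR):
    """Return the current ple_gap and ple_window values as integers.

    A value is None when the file is missing or unreadable, for example
    on a host without kvm_intel loaded.
    """
    params = {}
    for name in ("ple_gap", "ple_window"):
        try:
            text = (Path(param_dir) / name).read_text()
            params[name] = int(text.strip())
        except (OSError, ValueError):
            params[name] = None
    return params


if __name__ == "__main__":
    # Log this alongside each benchmark result.
    print(read_ple_params())
```

This only reads the values; changing them normally means reloading the module with different parameters, which is outside the scope of the sketch.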
>
> >
> >>
> >>> I also got perf top output to analyse the difference. The difference
> >>> comes from flushtlb (and also spinlock).
> >>
> >> That's in the guest, yes?
> >
> > Yes. Perf is in guest.
> >
> >>
> >>>
> >>> Ebizzy run for 4k ple_window
> >>> -  87.20%  [kernel]  [k] arch_local_irq_restore
> >>>    - arch_local_irq_restore
> >>>       - 100.00% _raw_spin_unlock_irqrestore
> >>>          + 52.89% release_pages
> >>>          + 47.10% pagevec_lru_move_fn
> >>> -   5.71%  [kernel]  [k] arch_local_irq_restore
> >>>    - arch_local_irq_restore
> >>>       + 86.03% default_send_IPI_mask_allbutself_phys
> >>>       + 13.96% default_send_IPI_mask_sequence_phys
> >>> -   3.10%  [kernel]  [k] smp_call_function_many
> >>>      smp_call_function_many
> >>>
> >>>
> >>> Ebizzy run for 32k ple_window
> >>>
> >>> -  91.40%  [kernel]  [k] arch_local_irq_restore
> >>>    - arch_local_irq_restore
> >>>       - 100.00% _raw_spin_unlock_irqrestore
> >>>          + 53.13% release_pages
> >>>          + 46.86% pagevec_lru_move_fn
> >>> -   4.38%  [kernel]  [k] smp_call_function_many
> >>>      smp_call_function_many
> >>> -   2.51%  [kernel]  [k] arch_local_irq_restore
> >>>    - arch_local_irq_restore
> >>>       + 90.76% default_send_IPI_mask_allbutself_phys
> >>>       + 9.24% default_send_IPI_mask_sequence_phys
> >>>
> >>
> >> Both the 4k and the 32k results are crazy. Why is
> >> arch_local_irq_restore() so prominent? Do you have a very high
> >> interrupt rate in the guest?
> >
> > How do I measure whether I have a high interrupt rate in the guest?
> > From the /proc/interrupts numbers I am not able to judge :(
>
> 'vmstat 1'
>
> >
> > I went back and got the results on a 32 core machine with a 32 vcpu guest.
> > Strangely, I got results supporting the claim that increasing ple_window
> > helps in the non-overcommitted scenario.
> >
> > 32 core 32 vcpu guest 1x scenarios.
> >
> > ple_gap = 0
> > kernbench: Elapsed Time 38.61
> > ebizzy: 7463 records/s
> >
> > ple_window = 4k
> > kernbench: Elapsed Time 43.5067
> > ebizzy: 2528 records/s
> >
> > ple_window = 32k
> > kernbench: Elapsed Time 39.4133
> > ebizzy: 7196 records/s
>
> So maybe something was wrong with the first measurement.
OK, this is more in line with what I expected for kernbench. FWIW, in
order to show an improvement for a larger ple_window, we really need a
workload which we know has a longer lock holding time (without factoring
in LHP). We have noticed this on IO based locks mostly. We saw it with
a massive disk IO test (qla2xxx lock), and also with a large web serving
test (some vfs related lock, but I forget what exactly it was).
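
To put the 32 core, 32 vcpu numbers quoted above on one scale, here is a small sketch that expresses each run as a percentage change against the ple_gap = 0 run. The figures are copied from Raghu's mail; the function and key names are mine and purely illustrative. Note that a positive change is worse for kernbench (elapsed time) and better for ebizzy (records/s).

```python
def pct_change(baseline, value):
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# Results quoted in this thread for the 32 core / 32 vcpu 1x scenario.
RUNS = {
    "ple_gap=0":      {"kernbench_elapsed": 38.61,   "ebizzy_records_per_s": 7463},
    "ple_window=4k":  {"kernbench_elapsed": 43.5067, "ebizzy_records_per_s": 2528},
    "ple_window=32k": {"kernbench_elapsed": 39.4133, "ebizzy_records_per_s": 7196},
}


def summarize(runs, baseline_key="ple_gap=0"):
    """Return {run: {metric: percent change vs. the baseline run}}."""
    base = runs[baseline_key]
    return {
        name: {metric: pct_change(base[metric], value)
               for metric, value in metrics.items()}
        for name, metrics in runs.items()
        if name != baseline_key
    }


if __name__ == "__main__":
    for name, changes in summarize(RUNS).items():
        for metric, change in changes.items():
            print(f"{name:15s} {metric:22s} {change:+7.2f}%")
```

Read this way, the 32k run lands within a few percent of PLE disabled on both benchmarks, while the 4k run is roughly 13% slower on kernbench and loses about two thirds of the ebizzy throughput.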
>
> >
> >
> > perf top for ebizzy for above:
> > ple_gap = 0
> > -  84.74%  [kernel]  [k] arch_local_irq_restore
> >    - arch_local_irq_restore
> >       - 100.00% _raw_spin_unlock_irqrestore
> >          + 50.96% release_pages
> >          + 49.02% pagevec_lru_move_fn
> > -   6.57%  [kernel]  [k] arch_local_irq_restore
> >    - arch_local_irq_restore
> >       + 92.54% default_send_IPI_mask_allbutself_phys
> >       + 7.46% default_send_IPI_mask_sequence_phys
> > -   1.54%  [kernel]  [k] smp_call_function_many
> >      smp_call_function_many
>
> Again the numbers are ridiculously high for arch_local_irq_restore.
> Maybe there's a bad perf/kvm interaction when we're injecting an
> interrupt, I can't believe we're spending 84% of the time running the
> popf instruction.
I do have a feeling that ebizzy just has too many variables and LHP is
just one of many problems. However, I am curious what perf kvm from the
host shows, as Avi suggested below.
>
> >
> > ple_window = 32k
> > -  84.47%  [kernel]  [k] arch_local_irq_restore
> >    + arch_local_irq_restore
> > -   6.46%  [kernel]  [k] arch_local_irq_restore
> >    - arch_local_irq_restore
> >       + 93.51% default_send_IPI_mask_allbutself_phys
> >       + 6.49% default_send_IPI_mask_sequence_phys
> > -   1.80%  [kernel]  [k] smp_call_function_many
> >    - smp_call_function_many
> >       + 99.98% native_flush_tlb_others
> >
> >
> > ple_window = 4k
> > -  91.35%  [kernel]  [k] arch_local_irq_restore
> >    - arch_local_irq_restore
> >       - 100.00% _raw_spin_unlock_irqrestore
> >          + 53.19% release_pages
> >          + 46.81% pagevec_lru_move_fn
> > -   3.90%  [kernel]  [k] smp_call_function_many
> >      smp_call_function_many
> > -   2.94%  [kernel]  [k] arch_local_irq_restore
> >    - arch_local_irq_restore
> >       + 93.12% default_send_IPI_mask_allbutself_phys
> >       + 6.88% default_send_IPI_mask_sequence_phys
> >
> > Let me know if I can try something here..
> > /me confused :(
> >
>
> I'm even more confused. Please try 'perf kvm' from the host, it does
> fewer dirty tricks with the PMU and so may be more accurate.
>