From: Avi Kivity <avi@redhat.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>,
peterz@infradead.org, linux-kernel@vger.kernel.org,
vatsa@linux.vnet.ibm.com, bharata@linux.vnet.ibm.com
Subject: Re: [RFC PATCH 0/4] Gang scheduling in CFS
Date: Sun, 25 Dec 2011 12:58:15 +0200
Message-ID: <4EF701C7.9080907@redhat.com>
In-Reply-To: <20111223103620.GD4749@elte.hu>

On 12/23/2011 12:36 PM, Ingo Molnar wrote:
> * Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> wrote:
>
> > Here some interesting perf reports from inside the guest:
> >
> > Baseline:
> > 29.79% ebizzy [kernel.kallsyms] [k] native_flush_tlb_others
> > 18.70% ebizzy libc-2.12.so [.] __GI_memcpy
> > 7.23% ebizzy [kernel.kallsyms] [k] get_page_from_freelist
> > 5.38% ebizzy [kernel.kallsyms] [k] __do_page_fault
> > 4.50% ebizzy [kernel.kallsyms] [k] ____pagevec_lru_add
> > 3.58% ebizzy [kernel.kallsyms] [k] default_send_IPI_mask_logical
> > 3.26% ebizzy [kernel.kallsyms] [k] native_flush_tlb_single
> > 2.82% ebizzy [kernel.kallsyms] [k] handle_pte_fault
> > 2.16% ebizzy [kernel.kallsyms] [k] kunmap_atomic
> > 2.10% ebizzy [kernel.kallsyms] [k] _spin_unlock_irqrestore
> > 1.90% ebizzy [kernel.kallsyms] [k] down_read_trylock
> > 1.65% ebizzy [kernel.kallsyms] [k] __mem_cgroup_commit_charge.clone.4
> > 1.60% ebizzy [kernel.kallsyms] [k] up_read
> > 1.24% ebizzy [kernel.kallsyms] [k] __alloc_pages_nodemask
> >
> > Gang:
> > 22.53% ebizzy libc-2.12.so [.] __GI_memcpy
> > 9.73% ebizzy [kernel.kallsyms] [k] ____pagevec_lru_add
> > 8.22% ebizzy [kernel.kallsyms] [k] get_page_from_freelist
> > 7.80% ebizzy [kernel.kallsyms] [k] default_send_IPI_mask_logical
> > 7.68% ebizzy [kernel.kallsyms] [k] native_flush_tlb_others
> > 6.22% ebizzy [kernel.kallsyms] [k] __do_page_fault
> > 5.54% ebizzy [kernel.kallsyms] [k] native_flush_tlb_single
> > 4.44% ebizzy [kernel.kallsyms] [k] _spin_unlock_irqrestore
> > 2.90% ebizzy [kernel.kallsyms] [k] kunmap_atomic
> > 2.78% ebizzy [kernel.kallsyms] [k] __mem_cgroup_commit_charge.clone.4
> > 2.76% ebizzy [kernel.kallsyms] [k] handle_pte_fault
> > 2.16% ebizzy [kernel.kallsyms] [k] __mem_cgroup_uncharge_common
> > 1.59% ebizzy [kernel.kallsyms] [k] down_read_trylock
> > 1.43% ebizzy [kernel.kallsyms] [k] up_read
> >
> > I see the main difference between both the reports is:
> > native_flush_tlb_others.
>
> So it would be important to figure out why ebizzy gets into so
> many TLB flushes and why gang scheduling makes it go away.

The second part is easy: a remote TLB flush involves sending IPIs to many
other vcpus (possibly waking them up and scheduling them), then
busy-waiting until they acknowledge the flush. Gang scheduling is really
good here since it shortens the busy wait; it would be even better if we
scheduled halted vcpus as well (see the yield_on_hlt module parameter, set
to 0). Directed yield on PLE should provide intermediate results between
doing nothing and gang sched.
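
To make the busy-wait argument concrete, here is a toy model in C (not kernel code). The struct, its fields, and the microsecond figures are all made up for illustration; the only point is that the initiator waits for the slowest target, so a single descheduled vcpu dominates the cost.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-vcpu state for this toy model; not a KVM structure. */
struct vcpu_state {
    unsigned int ack_latency_us;   /* time to handle the flush IPI once running */
    unsigned int resched_delay_us; /* time until the host runs this vcpu; 0 if running now */
};

/*
 * The initiator of a remote flush busy-waits until every target has
 * acknowledged, so the wait equals the slowest target's total delay.
 */
static unsigned int flush_wait_us(const struct vcpu_state *targets, size_t n)
{
    unsigned int worst = 0;

    assert(targets != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        unsigned int t = targets[i].resched_delay_us + targets[i].ack_latency_us;

        if (t > worst)
            worst = t;
    }
    return worst;
}
```

With all targets running (the gang case) the wait is just the largest ack latency; if one target first has to wait for a host timeslice, its resched delay is added on top and the initiator spins for that long.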

The first part appears to be unrelated to ebizzy itself: it's
kunmap_atomic() flushing PTEs. It could be eliminated by switching to a
non-highmem kernel, or by allocating more PTEs for kmap_atomic() and
batching the flush.
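
A rough sketch of what batching would buy, again as a toy model in C rather than the real kmap_atomic() code. The pool structure and function names are hypothetical; it assumes a flush is only needed once every slot in the pool has been used and unmapped, and it just counts flushes.

```c
#include <assert.h>

/* Hypothetical pool of kmap slots; not the kernel's data structure. */
struct kmap_pool {
    unsigned int nslots;  /* PTEs reserved for atomic mappings */
    unsigned int stale;   /* slots unmapped since the last flush */
    unsigned int flushes; /* number of TLB flushes issued so far */
};

/* Unmap one slot; flush only when the whole pool has gone stale. */
static void pool_unmap(struct kmap_pool *p)
{
    p->stale++;
    if (p->stale == p->nslots) {
        p->flushes++;
        p->stale = 0;
    }
}

/* Count the flushes caused by a run of unmaps for a given pool size. */
static unsigned int count_flushes(unsigned int nslots, unsigned int unmaps)
{
    struct kmap_pool p = { nslots, 0, 0 };

    assert(nslots > 0);
    for (unsigned int i = 0; i < unmaps; i++)
        pool_unmap(&p);
    return p.flushes;
}
```

In this model a pool of one slot flushes on every unmap, while a larger pool divides the flush count by the pool size (rounded down, since a partly used pool has not flushed yet).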

Btw, you can get an additional speedup by enabling x2apic, which helps
default_send_IPI_mask_logical().

--
error compiling committee.c: too many arguments to function