From: Mel Gorman <mgorman@suse.de>
To: Richard Davies <richard@arachsys.com>
Cc: KVM <kvm@vger.kernel.org>, QEMU-devel <qemu-devel@nongnu.org>,
LKML <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>, Avi Kivity <avi@redhat.com>,
Shaohua Li <shli@kernel.org>
Subject: Re: [Qemu-devel] [PATCH 0/6] Reduce compaction scanning and lock contention
Date: Fri, 21 Sep 2012 10:55:48 +0100 [thread overview]
Message-ID: <20120921095548.GT11266@suse.de> (raw)
In-Reply-To: <20120921091701.GC32081@alpha.arachsys.com>
On Fri, Sep 21, 2012 at 10:17:01AM +0100, Richard Davies wrote:
> Richard Davies wrote:
> > I did manage to get a couple which were slightly worse, but nothing like as
> > bad as before. Here are the results:
> >
> > # grep -F '[k]' report | head -8
> > 45.60% qemu-kvm [kernel.kallsyms] [k] clear_page_c
> > 11.26% qemu-kvm [kernel.kallsyms] [k] isolate_freepages_block
> > 3.21% qemu-kvm [kernel.kallsyms] [k] _raw_spin_lock
> > 2.27% ksmd [kernel.kallsyms] [k] memcmp
> > 2.02% swapper [kernel.kallsyms] [k] default_idle
> > 1.58% qemu-kvm [kernel.kallsyms] [k] svm_vcpu_run
> > 1.30% qemu-kvm [kernel.kallsyms] [k] _raw_spin_lock_irqsave
> > 1.09% qemu-kvm [kernel.kallsyms] [k] get_page_from_freelist
>
> # ========
> # captured on: Fri Sep 21 08:17:52 2012
> # os release : 3.6.0-rc5-elastic+
> # perf version : 3.5.2
> # arch : x86_64
> # nrcpus online : 16
> # nrcpus avail : 16
> # cpudesc : AMD Opteron(tm) Processor 6128
> # cpuid : AuthenticAMD,16,9,1
> # total memory : 131973276 kB
> # cmdline : /home/root/bin/perf record -g -a
> # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 }
> # HEADER_CPU_TOPOLOGY info available, use -I to display
> # HEADER_NUMA_TOPOLOGY info available, use -I to display
> # ========
> #
> # Samples: 283K of event 'cycles'
> # Event count (approx.): 109057976176
> #
> # Overhead Command Shared Object Symbol
> # ........ ............. .................... ..............................................
> #
> 45.60% qemu-kvm [kernel.kallsyms] [k] clear_page_c
> |
> --- clear_page_c
> |
> |--93.35%-- do_huge_pmd_anonymous_page
This is unavoidable. If THP were disabled, the cost would still be
incurred, just on base pages instead of huge pages.
> <SNIP>
> 11.26% qemu-kvm [kernel.kallsyms] [k] isolate_freepages_block
> |
> --- isolate_freepages_block
> compaction_alloc
> migrate_pages
> compact_zone
> compact_zone_order
> try_to_compact_pages
> __alloc_pages_direct_compact
> __alloc_pages_nodemask
> alloc_pages_vma
> do_huge_pmd_anonymous_page
And this is showing that we're still spending a lot of time scanning
for free pages to isolate. I do not have a great idea on how this can be
reduced further without interfering with the page allocator.
One idea I considered in the past was to use the buddy lists to find
free pages quickly, but there are two problems. First, the buddy lists
themselves may need to be searched, and now that the zone lock is not held
during the scan that would be particularly difficult. The harder problem is
deciding when compaction "finishes". I'll put more thought into it over
the weekend and see if something falls out, but I'm not going to hold up
this series waiting for inspiration.
> 3.21% qemu-kvm [kernel.kallsyms] [k] _raw_spin_lock
> |
> --- _raw_spin_lock
> |
> |--39.96%-- tdp_page_fault
Nothing very interesting here until...
> |--1.69%-- free_pcppages_bulk
> | |
> | |--77.53%-- drain_pages
> | | |
> | | |--95.77%-- drain_local_pages
> | | | |
> | | | |--97.90%-- generic_smp_call_function_interrupt
> | | | | smp_call_function_interrupt
> | | | | call_function_interrupt
> | | | | |
> | | | | |--23.37%-- kvm_vcpu_ioctl
> | | | | | do_vfs_ioctl
> | | | | | sys_ioctl
> | | | | | system_call_fastpath
> | | | | | ioctl
> | | | | | |
> | | | | | |--97.22%-- 0x10100000006
> | | | | | |
> | | | | | --2.78%-- 0x10100000002
> | | | | |
> | | | | |--17.80%-- __remove_mapping
> | | | | | shrink_page_list
> | | | | | shrink_inactive_list
> | | | | | shrink_lruvec
> | | | | | try_to_free_pages
> | | | | | __alloc_pages_nodemask
> | | | | | alloc_pages_vma
> | | | | | do_huge_pmd_anonymous_page
This whole section is interesting simply because it shows the per-cpu
draining cost. It's low enough that I'm not going to put much thought
into it, but it's not often that the per-cpu allocator sticks out like this.
Thanks, Richard.
--
Mel Gorman
SUSE Labs
2012-09-20 14:04 [Qemu-devel] [PATCH 0/6] Reduce compaction scanning and lock contention Mel Gorman
2012-09-20 14:04 ` [Qemu-devel] [PATCH 1/6] mm: compaction: Abort compaction loop if lock is contended or run too long Mel Gorman
2012-09-20 18:53 ` Rik van Riel
2012-09-20 14:04 ` [Qemu-devel] [PATCH 2/6] mm: compaction: Acquire the zone->lru_lock as late as possible Mel Gorman
2012-09-20 18:54 ` Rik van Riel
2012-09-20 14:04 ` [Qemu-devel] [PATCH 3/6] mm: compaction: Acquire the zone->lock " Mel Gorman
2012-09-20 18:54 ` Rik van Riel
2012-09-20 14:04 ` [Qemu-devel] [PATCH 4/6] Revert "mm: have order > 0 compaction start off where it left" Mel Gorman
2012-09-20 18:54 ` Rik van Riel
2012-09-20 14:04 ` [Qemu-devel] [PATCH 5/6] mm: compaction: Cache if a pageblock was scanned and no pages were isolated Mel Gorman
2012-09-20 18:55 ` Rik van Riel
2012-09-20 14:04 ` [Qemu-devel] [PATCH 6/6] mm: compaction: Restart compaction from near where it left off Mel Gorman
2012-09-20 18:57 ` Rik van Riel
2012-09-21 9:13 ` [Qemu-devel] [PATCH 0/6] Reduce compaction scanning and lock contention Richard Davies
2012-09-21 9:15 ` Richard Davies
2012-09-21 9:17 ` Richard Davies
2012-09-21 9:55 ` Mel Gorman [this message]
2012-09-21 9:18 ` Richard Davies
2012-09-21 9:35 ` Mel Gorman
2012-09-21 9:49 ` Richard Davies