From: Andrea Arcangeli <aarcange@redhat.com>
To: "Shi, Yang" <yang.shi@linaro.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@intel.com>,
Vlastimil Babka <vbabka@suse.cz>,
Christoph Lameter <cl@gentwo.org>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Jerome Marchand <jmarchan@redhat.com>,
Sasha Levin <sasha.levin@oracle.com>,
Andres Lagar-Cavilla <andreslc@google.com>,
Ning Qu <quning@gmail.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCHv7 00/29] THP-enabled tmpfs/shmem using compound pages
Date: Tue, 19 Apr 2016 12:50:24 -0400
Message-ID: <20160419165024.GB24312@redhat.com>
In-Reply-To: <571565F0.9070203@linaro.org>
Hello,
On Mon, Apr 18, 2016 at 03:55:44PM -0700, Shi, Yang wrote:
> Hi Kirill,
>
> Finally, I got some time to look into and try yours and Hugh's patches,
> got two problems.

One thing that comes to mind to test is this: qemu with -machine
accel=kvm -mem-path=/dev/shm/,share=on .

The THP compound approach in tmpfs may just happen to work already
with KVM (or at worst it'd require minor adjustments) because it uses
the exact same model KVM is already aware of from THP in anonymous
memory. For example, from arch/x86/kvm/mmu.c:

static void transparent_hugepage_adjust(struct kvm_vcpu *vcpu,
					gfn_t *gfnp, kvm_pfn_t *pfnp,
					int *levelp)
{
	kvm_pfn_t pfn = *pfnp;
	gfn_t gfn = *gfnp;
	int level = *levelp;

	/*
	 * Check if it's a transparent hugepage. If this would be an
	 * hugetlbfs page, level wouldn't be set to
	 * PT_PAGE_TABLE_LEVEL and there would be no adjustment done
	 * here.
	 */
	if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) &&
	    level == PT_PAGE_TABLE_LEVEL &&
	    PageTransCompound(pfn_to_page(pfn)) &&
	    !mmu_gfn_lpage_is_disallowed(vcpu, gfn, PT_DIRECTORY_LEVEL)) {

Not using two different models between THP in tmpfs and THP in anon is
essential not just to significantly reduce the size of the kernel
code, but also because THP knowledge can't be self-contained in the
mm/shmem.c file. Having to support two different models would
complicate things for secondary MMU drivers (i.e. mmu notifier users)
like KVM, which also need to create huge mappings in the shadow
pagetable layer in arch/x86/kvm if the primary MMU allows for it.
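
To illustrate what a huge mapping in a secondary MMU amounts to once a
check like the one quoted above has passed, here is a minimal userspace
sketch. It is not the KVM code: the typedefs, the adjust_to_hugepage()
helper, the backed_by_thp flag, and the constant of 512 base pages per
huge page (4 KiB base pages, 2 MiB huge pages) are all assumptions made
for illustration.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins; not the kernel's gfn_t/kvm_pfn_t. */
typedef uint64_t gfn_t;
typedef uint64_t pfn_t;

/* Assumption: 2 MiB huge page / 4 KiB base page = 512 base pages. */
#define PAGES_PER_HPAGE 512ULL

/*
 * If the host frame is part of a huge page, round both the guest frame
 * number and the host frame number down to the huge page boundary, so
 * one large mapping can cover the whole 512-page range.
 *
 * Returns 1 if a huge mapping can be used, 0 otherwise. In this sketch
 * a mismatch between the two offsets simply declines the huge mapping.
 */
static int adjust_to_hugepage(gfn_t *gfnp, pfn_t *pfnp, int backed_by_thp)
{
	const uint64_t mask = PAGES_PER_HPAGE - 1;

	if (!backed_by_thp)
		return 0;

	/* Both frames must sit at the same offset inside the huge page. */
	if ((*gfnp & mask) != (*pfnp & mask))
		return 0;

	*gfnp &= ~mask;
	*pfnp &= ~mask;
	return 1;
}
```

The point of the sketch is that the secondary MMU only needs a yes/no
answer about the backing page plus some alignment arithmetic, which is
why keeping a single THP model for anon and tmpfs keeps this logic
unchanged.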
> x86-64 and ARM64 with yours and Hugh's patches (linux-next tree), I got
> the program execution time reduced by ~12% on x86-64, it looks very
> impressive.

Agreed, both patchsets are impressive work and achieve amazing
results!

My view is that, for long-lived computation from the userland point
of view, both models are malleable enough and could achieve everything
we need in the end. As far as overall kernel efficiency is concerned,
though, the compound model will always retain a slight performance
advantage by leveraging native THP compound refcounting, which
requires just one atomic_inc/dec per THP mapcount instead of 512 of
them. Other advantages of the compound model are that it's half the
code size despite already including khugepaged (i.e. the same
split_huge_page works for both tmpfs and anon) and, as said above, it
won't introduce many complications for drivers like KVM, as the model
didn't change.
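
The "one atomic op instead of 512" argument can be made concrete with a
toy model. This is a sketch under stated assumptions, not kernel code:
struct toy_hugepage, map_compound() and map_per_subpage() are
hypothetical names, and 512 assumes 4 KiB base pages inside a 2 MiB
huge page.

```c
#include <stdatomic.h>

/* Assumption: 2 MiB huge page / 4 KiB base page = 512 subpages. */
#define HPAGE_PMD_NR 512

/* Hypothetical toy model of a huge page; not the kernel's struct page. */
struct toy_hugepage {
	atomic_int compound_mapcount;              /* one counter per THP */
	atomic_int subpage_mapcount[HPAGE_PMD_NR]; /* one counter per subpage */
};

/*
 * Compound model: mapping the whole THP touches a single counter.
 * Returns the number of atomic operations performed.
 */
static int map_compound(struct toy_hugepage *hp)
{
	atomic_fetch_add(&hp->compound_mapcount, 1);
	return 1;
}

/*
 * Per-subpage accounting: mapping the same range touches one counter
 * per 4 KiB subpage. Returns the number of atomic operations performed.
 */
static int map_per_subpage(struct toy_hugepage *hp)
{
	int ops = 0;

	for (int i = 0; i < HPAGE_PMD_NR; i++) {
		atomic_fetch_add(&hp->subpage_mapcount[i], 1);
		ops++;
	}
	return ops;
}
```

With a zero-initialized struct toy_hugepage, map_compound() performs one
atomic increment while map_per_subpage() performs HPAGE_PMD_NR of them;
the same ratio applies on the unmap side with decrements.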
Thanks,
Andrea