From: Ric Mason <ric.masonn@gmail.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Al Viro <viro@zeniv.linux.org.uk>,
Hugh Dickins <hughd@google.com>,
Wu Fengguang <fengguang.wu@intel.com>, Jan Kara <jack@suse.cz>,
Mel Gorman <mgorman@suse.de>,
linux-mm@kvack.org, Andi Kleen <ak@linux.intel.com>,
Matthew Wilcox <matthew.r.wilcox@intel.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Hillf Danton <dhillf@gmail.com>, Dave Hansen <dave@sr71.net>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCHv3, RFC 00/34] Transparent huge page cache
Date: Sun, 07 Apr 2013 08:40:45 +0800
Message-ID: <5160C08D.9020101@gmail.com>
In-Reply-To: <1365163198-29726-1-git-send-email-kirill.shutemov@linux.intel.com>
Hi Kirill,
On 04/05/2013 07:59 PM, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>
> Here's the third RFC. Thanks, everybody, for the feedback.
Could you answer the questions I asked on version two?
>
> The patchset is pretty big already and I want to stop adding new
> features to keep it reviewable. Next I'll concentrate on benchmarking and
> tuning.
>
> Therefore some features will be outside the initial transparent huge page
> cache implementation:
> - page collapsing;
> - migration;
> - tmpfs/shmem;
>
> There are a few features which are not implemented and could potentially
> block upstreaming:
>
> 1. Currently we allocate a 2M page even if we create only a 1-byte file on
> ramfs. I don't think it's a problem by itself. With anon thp pages we also
> try to allocate huge pages whenever possible.
> The problem is that ramfs pages are unevictable and we can't just split
> them and push them to swap as with anon thp. We (at some point) have to
> have a mechanism to split the last page of the file under memory pressure
> to reclaim some memory.
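To put a number on point 1, here is a minimal sketch of the arithmetic, assuming x86-64 with 4 KiB base pages and 2 MiB huge pages. The macro and function names are made up for illustration and are not taken from the patchset.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 geometry: 4 KiB base pages, 2 MiB PMD-sized huge pages.
 * All names below are illustrative, not identifiers from the patchset. */
#define BASE_PAGE_SHIFT 12
#define HUGE_PAGE_SHIFT 21
#define BASE_PAGE_SIZE  ((uint64_t)1 << BASE_PAGE_SHIFT)
#define HUGE_PAGE_SIZE  ((uint64_t)1 << HUGE_PAGE_SHIFT)
#define SUBPAGES_PER_HUGE_PAGE (HUGE_PAGE_SIZE / BASE_PAGE_SIZE)

/* One huge page covers 512 base pages under the assumed geometry. */
static_assert(SUBPAGES_PER_HUGE_PAGE == 512, "expected 512 subpages");

/* Round size up to a multiple of unit; unit must be a power of two. */
static uint64_t round_up_pow2(uint64_t size, uint64_t unit)
{
    return (size + unit - 1) & ~(unit - 1);
}

/* Bytes of page cache held for a file of file_size bytes when the
 * cache is populated only with pages of unit bytes. */
static uint64_t cache_footprint(uint64_t file_size, uint64_t unit)
{
    return round_up_pow2(file_size, unit);
}

/* Extra bytes held when the file is backed by huge pages instead of
 * base pages. On ramfs this memory cannot be reclaimed by eviction. */
static uint64_t huge_page_overhead(uint64_t file_size)
{
    return cache_footprint(file_size, HUGE_PAGE_SIZE) -
           cache_footprint(file_size, BASE_PAGE_SIZE);
}
```

Under these assumptions huge_page_overhead(1) comes to 2093056 bytes, i.e. a 1-byte file holds 511 more base pages than it would with 4 KiB pages. The overhead is confined to the last huge page of a file, which is why splitting only that page would be enough to get the memory back.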
>
> 2. We don't have knobs for disabling transparent huge page cache per-mount
> or per-file. Should we have a mount option and fadvise flags as part of
> the initial implementation?
>
> Any thoughts?
>
> The patchset is also on git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git thp/pagecache
>
> v3:
> - set RADIX_TREE_PRELOAD_NR to 512 only if we build with THP;
> - rewrite lru_add_page_tail() to address a few bugs;
> - memcg accounting;
> - represent file thp pages in meminfo and friends;
> - dump page order in filemap trace;
> - add missing flush_dcache_page() in zero_huge_user_segment;
> - random cleanups based on feedback.
> v2:
> - mmap();
> - fix add_to_page_cache_locked() and delete_from_page_cache();
> - introduce mapping_can_have_hugepages();
> - call split_huge_page() only for head page in filemap_fault();
> - wait_split_huge_page(): serialize over i_mmap_mutex too;
> - lru_add_page_tail: avoid PageUnevictable on active/inactive lru lists;
> - fix off-by-one in zero_huge_user_segment();
> - THP_WRITE_ALLOC/THP_WRITE_FAILED counters;
>
> Kirill A. Shutemov (34):
> mm: drop actor argument of do_generic_file_read()
> block: implement add_bdi_stat()
> mm: implement zero_huge_user_segment and friends
> radix-tree: implement preload for multiple contiguous elements
> memcg, thp: charge huge cache pages
> thp, mm: avoid PageUnevictable on active/inactive lru lists
> thp, mm: basic defines for transparent huge page cache
> thp, mm: introduce mapping_can_have_hugepages() predicate
> thp: represent file thp pages in meminfo and friends
> thp, mm: rewrite add_to_page_cache_locked() to support huge pages
> mm: trace filemap: dump page order
> thp, mm: rewrite delete_from_page_cache() to support huge pages
> thp, mm: trigger bug in replace_page_cache_page() on THP
> thp, mm: locking tail page is a bug
> thp, mm: handle tail pages in page_cache_get_speculative()
> thp, mm: add event counters for huge page alloc on write to a file
> thp, mm: implement grab_thp_write_begin()
> thp, mm: naive support of thp in generic read/write routines
> thp, libfs: initial support of thp in
> simple_read/write_begin/write_end
> thp: handle file pages in split_huge_page()
> thp: wait_split_huge_page(): serialize over i_mmap_mutex too
> thp, mm: truncate support for transparent huge page cache
> thp, mm: split huge page on mmap file page
> ramfs: enable transparent huge page cache
> x86-64, mm: proper alignment mappings with hugepages
> mm: add huge_fault() callback to vm_operations_struct
> thp: prepare zap_huge_pmd() to uncharge file pages
> thp: move maybe_pmd_mkwrite() out of mk_huge_pmd()
> thp, mm: basic huge_fault implementation for generic_file_vm_ops
> thp: extract fallback path from do_huge_pmd_anonymous_page() to a
> function
> thp: initial implementation of do_huge_linear_fault()
> thp: handle write-protect exception to file-backed huge pages
> thp: call __vma_adjust_trans_huge() for file-backed VMA
> thp: map file-backed huge pages on fault
>
> arch/x86/kernel/sys_x86_64.c | 12 +-
> drivers/base/node.c | 10 +
> fs/libfs.c | 48 +++-
> fs/proc/meminfo.c | 6 +
> fs/ramfs/inode.c | 6 +-
> include/linux/backing-dev.h | 10 +
> include/linux/huge_mm.h | 36 ++-
> include/linux/mm.h | 8 +
> include/linux/mmzone.h | 1 +
> include/linux/pagemap.h | 33 ++-
> include/linux/radix-tree.h | 11 +
> include/linux/vm_event_item.h | 2 +
> include/trace/events/filemap.h | 7 +-
> lib/radix-tree.c | 33 ++-
> mm/filemap.c | 298 ++++++++++++++++++++-----
> mm/huge_memory.c | 474 +++++++++++++++++++++++++++++++++-------
> mm/memcontrol.c | 2 -
> mm/memory.c | 41 +++-
> mm/mmap.c | 3 +
> mm/page_alloc.c | 7 +-
> mm/swap.c | 20 +-
> mm/truncate.c | 13 ++
> mm/vmstat.c | 2 +
> 23 files changed, 902 insertions(+), 181 deletions(-)
>