From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Dave Hansen <dave@sr71.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Al Viro <viro@zeniv.linux.org.uk>,
Hugh Dickins <hughd@google.com>,
Wu Fengguang <fengguang.wu@intel.com>, Jan Kara <jack@suse.cz>,
Mel Gorman <mgorman@suse.de>,
linux-mm@kvack.org, Andi Kleen <ak@linux.intel.com>,
Matthew Wilcox <matthew.r.wilcox@intel.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Hillf Danton <dhillf@gmail.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCHv2, RFC 02/30] mm: implement zero_huge_user_segment and friends
Date: Fri, 22 Mar 2013 11:21:50 +0200 (EET)
Message-ID: <20130322092150.5DD29E0085@blue.fi.intel.com>
In-Reply-To: <514B25F5.7020207@sr71.net>

Dave Hansen wrote:
> On 03/14/2013 10:50 AM, Kirill A. Shutemov wrote:
> > Let's add helpers to clear huge page segment(s). They provide the same
> > functionality as zero_user_segment{,s} and zero_user, but for huge
> > pages
> ...
> > +static inline void zero_huge_user_segments(struct page *page,
> > + unsigned start1, unsigned end1,
> > + unsigned start2, unsigned end2)
> > +{
> > + zero_huge_user_segment(page, start1, end1);
> > + zero_huge_user_segment(page, start2, end2);
> > +}
>
> I'm not sure that this helper saves very much code. The one call later
> in these patches:
>
> + zero_huge_user_segments(page, 0, from,
> + from + len, HPAGE_PMD_SIZE);
>
> really only saves one line over this:
>
> zero_huge_user_segment(page, 0, from);
> zero_huge_user_segment(page, from + len,
> HPAGE_PMD_SIZE);
>
> and I think the second one is much more clear to read.

I've tried to mimic non-huge zero_user*, but, yeah, this is silly.
Will drop.

> I do see that there's a small-page variant of this, but I think that one
> was done to save doing two kmap_atomic() operations when you wanted to
> zero two separate operations. This variant doesn't have that kind of
> optimization, so it makes much less sense.
>
> > +void zero_huge_user_segment(struct page *page, unsigned start, unsigned end)
> > +{
> > + int i;
> > +
> > + BUG_ON(end < start);
> > +
> > + might_sleep();
> > +
> > + if (start == end)
> > + return;
>
> I've really got to wonder how much of an optimization this is in
> practice. Was there a specific reason this was added?

simple_write_begin() is likely to call zero[_huge]_user_segments() with
start == end for one of the two segments.

But, honestly, it was just easier to cut off the corner case first and
not bother about it in the following code. ;)

> > + /* start and end are on the same small page */
> > + if ((start & PAGE_MASK) == ((end - 1) & PAGE_MASK))
> > + return zero_user_segment(page + (start >> PAGE_SHIFT),
> > + start & ~PAGE_MASK,
> > + ((end - 1) & ~PAGE_MASK) + 1);
>
> It wasn't immediately obvious to me why we need to optimize the "on the
> same page" case. I _think_ it's because using zero_user_segments()
> saves us a kmap_atomic() over the code below. Is that right? It might
> be worth a comment.

The code below would call zero_user_segment() twice for the same small
page, but here a single call is enough.
I'll document it.

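
To make the same-small-page check above easier to follow, here is a
minimal userspace sketch of the mask arithmetic. It assumes 4 KiB small
pages; PAGE_SHIFT, PAGE_SIZE and PAGE_MASK are redefined locally as
stand-ins for the kernel constants, and the helper names
same_small_page() and end_offset_in_page() are hypothetical, not part
of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel constants; 4 KiB pages are assumed. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/*
 * Hypothetical helper: true when the half-open byte range [start, end)
 * lies entirely within one small page. Uses end - 1 because end itself
 * is exclusive and may sit exactly on the next page boundary.
 */
static bool same_small_page(unsigned start, unsigned end)
{
	assert(start < end);
	return (start & PAGE_MASK) == ((end - 1) & PAGE_MASK);
}

/*
 * Hypothetical helper: exclusive end offset inside the last small page
 * touched by a range ending at 'end'. A page-aligned end yields
 * PAGE_SIZE rather than 0.
 */
static unsigned end_offset_in_page(unsigned end)
{
	assert(end > 0);
	return ((end - 1) & ~PAGE_MASK) + 1;
}
```

The "- 1 ... + 1" dance is what keeps a page-aligned end from wrapping
to offset 0 of the following small page.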
> > + zero_user_segment(page + (start >> PAGE_SHIFT),
> > + start & ~PAGE_MASK, PAGE_SIZE);
> > + for (i = (start >> PAGE_SHIFT) + 1; i < (end >> PAGE_SHIFT) - 1; i++) {
> > + cond_resched();
> > + clear_highpage(page + i);
>
> zero_user_segments() does a flush_dcache_page(), which wouldn't get done
> on these middle pages. Is that a problem?

I think it is. Will fix.

> > + }
> > + zero_user_segment(page + i, 0, ((end - 1) & ~PAGE_MASK) + 1);
> > +}
>
> This code is dying for some local variables. It could really use a
> 'start_pfn_offset' and 'end_pfn_offset' or something similar. All of
> the shifting and masking is a bit hard to read and it would be nice to
> think of some real names for what it is doing.
>
> It also desperately needs some comments about how it works. Some
> one-liners like:
>
> /* zero the first (possibly partial) page */
> for()..
> /* zero the full pages in the middle */
> /* zero the last (possibly partial) page */
>
> would be pretty sweet.
Okay, will rework it.
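
One possible shape for such a rework, as a userspace sketch only and
not the actual follow-up patch: the huge page is modelled as a flat
byte buffer, 4 KiB small pages are assumed, and
zero_small_page_segment() is a hypothetical stand-in for
zero_user_segment(). All names here are illustrative. The sketch
derives the last small page index from end - 1, which is its own
choice, so that an unaligned end still lands on the right small page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel constants; 4 KiB pages are assumed. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Hypothetical stand-in for zero_user_segment(): zero [from, to) in one small page. */
static void zero_small_page_segment(unsigned char *small_page,
				    unsigned from, unsigned to)
{
	assert(from <= to && to <= PAGE_SIZE);
	memset(small_page + from, 0, to - from);
}

/* Zero bytes [start, end) of a huge page modelled as a flat buffer. */
static void zero_huge_segment_sketch(unsigned char *huge_page,
				     unsigned start, unsigned end)
{
	unsigned first_idx, last_idx, first_off, last_end, i;

	assert(start <= end);
	if (start == end)
		return;

	first_idx = start >> PAGE_SHIFT;		/* first small page touched */
	last_idx = (end - 1) >> PAGE_SHIFT;		/* last small page touched */
	first_off = start & (PAGE_SIZE - 1);		/* offset in first page */
	last_end = ((end - 1) & (PAGE_SIZE - 1)) + 1;	/* exclusive end in last page */

	/* start and end are on the same small page: one call is enough */
	if (first_idx == last_idx) {
		zero_small_page_segment(huge_page + ((size_t)first_idx << PAGE_SHIFT),
					first_off, last_end);
		return;
	}

	/* zero the first (possibly partial) page */
	zero_small_page_segment(huge_page + ((size_t)first_idx << PAGE_SHIFT),
				first_off, PAGE_SIZE);

	/*
	 * zero the full pages in the middle; in the kernel each of these
	 * would presumably also need the flush_dcache_page() raised above
	 */
	for (i = first_idx + 1; i < last_idx; i++)
		memset(huge_page + ((size_t)i << PAGE_SHIFT), 0, PAGE_SIZE);

	/* zero the last (possibly partial) page */
	zero_small_page_segment(huge_page + ((size_t)last_idx << PAGE_SHIFT),
				0, last_end);
}
```

The sketch leaves out might_sleep(), cond_resched() and the highmem
mapping entirely; it is only meant to show the three-step structure
with named locals.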
--
Kirill A. Shutemov