Re: [PATCH mmotm] thp: transparent hugepage core fixlet

From: Andrea Arcangeli <aarcange@redhat.com>
To: Hugh Dickins <hughd@google.com>
Cc: Andi Kleen <andi@firstfloor.org>,
	Jeremy Fitzhardinge <jeremy@goop.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org
Subject: Re: [PATCH mmotm] thp: transparent hugepage core fixlet
Date: Wed, 12 Jan 2011 03:02:43 +0100
Message-ID: <20110112020243.GT9506@random.random>
In-Reply-To: <alpine.LSU.2.00.1101111318190.26539@sister.anvils>

On Tue, Jan 11, 2011 at 02:59:43PM -0800, Hugh Dickins wrote:
> On Tue, 11 Jan 2011, Andrea Arcangeli wrote:
> > On Tue, Jan 11, 2011 at 03:04:21PM +0100, Andrea Arcangeli wrote:
> > > architectural bug to me. Why can't pud_huge simply return 0 for
> > > x86_32? Any other place dealing with hugepages and calling pud_huge on
> > > x86 noPAE would be at risk, otherwise, no?
> > 
> > Isn't this better solution?
> 
> [Better solution than my patch to follow_page() in mmotm, to fix crash
> with Transparent Huge Pages by duplicating Andrea's pmd_huge VM_HUGETLB
> check to the pud_huge line too.]
> 
> The truth is, I'm sure one of the solutions is better than the other,
> but I'm too confused by p?d folding to know which is which ;)
> 
> Certainly I don't oppose your patch as a replacement for mine,
> if you're sure yours is better.
> 
> There are only two places which are using pud_huge() anyway:
> follow_page() and apply_to_pmd_range().  Is the latter's
> BUG_ON(pud_huge) safe?  Safe in the THP world?

The latter BUG_ON should be safe in the THP world: there's a pmd_huge
BUG_ON too, so it can't be a problem for THP.

> And I never quite understood why we have both pmd_huge and pmd_large,
> pud_huge and pud_large.
> 
> There are answers to these questions, but it would take me hours and
> hours of easily-confused research (across several arches) to decide.
> 
> I'm hoping someone else has a surer grasp: Andi introduced pud_huge(),
> and Jeremy is the most active in the pagetable layers nowadays -
> perhaps they can tell us more quickly.

I'd like their opinion too, but for exactly the same reason you asked
yourself whether the latter BUG_ON is safe, I think my patch reduces
the risk from a practical perspective.

When THP uses pmd_mkhuge it's counterintuitive that pud_huge returns
1, and there's no benefit to that at all, only the risk of trouble
like this one. In fact my patch eliminates a branch and a block of
follow_page at compile time, whereas your patch adds one more branch
and can't eliminate the block of code behind the second branch, even
though we know that branch can't be taken.
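
To illustrate the compile-time argument, here is a minimal userspace
sketch, not kernel code. All names (model_pud_huge, model_follow,
MODEL_PUD_FOLDED, MODEL_PSE) are hypothetical stand-ins, and the sketch
assumes that a folded pud level simply re-reads an entry that is really
handled at the pmd level. When the folded variant returns a constant 0,
the compiler can see that the huge-pud block is dead and drop it.

```c
#include <stdbool.h>

/* Hypothetical model of a pud entry; not the kernel's pud_t. */
typedef struct { unsigned long val; } pud_model_t;

/* Models a "large page" bit in the entry (assumption for this sketch). */
#define MODEL_PSE 0x80UL

/* 1 models a configuration where the pud level is folded away. */
#ifndef MODEL_PUD_FOLDED
#define MODEL_PUD_FOLDED 1
#endif

#if MODEL_PUD_FOLDED
/* Folded level: always report "not huge", as a compile-time constant. */
static inline bool model_pud_huge(pud_model_t pud)
{
	(void)pud;
	return false;
}
#else
/* Real pud level: look at the large-page bit. */
static inline bool model_pud_huge(pud_model_t pud)
{
	return (pud.val & MODEL_PSE) != 0;
}
#endif

enum model_path { PATH_HUGE_PUD, PATH_PMD_WALK };

/* Models the pud step of a follow_page-like walk. */
static enum model_path model_follow(pud_model_t pud)
{
	/* Constant false when folded: this block becomes dead code. */
	if (model_pud_huge(pud))
		return PATH_HUGE_PUD;
	return PATH_PMD_WALK;
}
```

With MODEL_PUD_FOLDED set to 1, model_follow() takes the pmd walk even
for an entry that has the large-page bit set, which is the behaviour
argued for above.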

I consider this an arch bug, not a common code issue. The hunk below
is the THP modification to this code, and I didn't expect to have to
alter the pud_huge check in addition to it. I thought this would be
enough if the arch is correct (and optimal).

@@ -1273,11 +1301,32 @@ struct page *follow_page(struct vm_area_
        pmd = pmd_offset(pud, address);
        if (pmd_none(*pmd))
                goto no_page_table;
-       if (pmd_huge(*pmd)) {
+       if (pmd_huge(*pmd) && vma->vm_flags & VM_HUGETLB) {
                BUG_ON(flags & FOLL_GET);
                page = follow_huge_pmd(mm, address, pmd, flags & FOLL_WRITE);
                goto out;
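
The effect of the added VM_HUGETLB test can be sketched in userspace as
follows. This is a simplified model under stated assumptions, not the
kernel implementation: model_vma, pmd_model_t, MODEL_VM_HUGETLB and
MODEL_PMD_PSE are hypothetical names and values chosen for
illustration. The point is only that a huge pmd in a non-hugetlbfs vma
(the THP case) no longer takes the hugetlbfs path.

```c
#include <stdbool.h>

/* Hypothetical flag and bit values, for illustration only. */
#define MODEL_VM_HUGETLB 0x1UL
#define MODEL_PMD_PSE    0x80UL

struct model_vma { unsigned long vm_flags; };
typedef struct { unsigned long val; } pmd_model_t;

/* Models pmd_huge(): true for any huge pmd, hugetlbfs or THP. */
static inline bool model_pmd_huge(pmd_model_t pmd)
{
	return (pmd.val & MODEL_PMD_PSE) != 0;
}

enum model_pmd_path { PMD_PATH_HUGETLB, PMD_PATH_OTHER };

/* Models the pmd step of the walk after the patch hunk above. */
static enum model_pmd_path model_follow_pmd(const struct model_vma *vma,
					    pmd_model_t pmd)
{
	/* Only hugetlbfs vmas take the hugetlbfs path. */
	if (model_pmd_huge(pmd) && (vma->vm_flags & MODEL_VM_HUGETLB))
		return PMD_PATH_HUGETLB;
	/* THP huge pmds and regular pmds continue on the other path. */
	return PMD_PATH_OTHER;
}
```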

I think x86 3level should work OK with follow_huge_pmd (it's
basically identical to follow_huge_pud, so it won't notice the
difference), so I hope it doesn't break anything, and it will speed up
follow_page too (even when THP is off).

If you grep the whole tree for pmd_offset, you'll find all the places
where you have to care about THP. I'd like to keep not having to care
about the result of pud_offset (having to care about pmd_offset is
more than enough ;).

Other archs implementing pud_huge should also return 0 when there are
only 3 levels if they introduce THP; this has the added benefit of
optimizing follow_page for them even when THP is off.

Your patch is OK if this is not going to be considered an arch bug (I
think that, to avoid mistakes, pud_huge should be implemented by
pgtable-nopud.h, but that's a slightly bigger cleanup that I haven't
done yet).
