linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Naoya Horiguchi <nao.horiguchi@gmail.com>
Subject: Re: [PATCH v3 2/6] mm/hugetlb: take page table lock in follow_huge_(addr|pmd|pud)()
Date: Tue, 30 Sep 2014 19:54:53 +0300	[thread overview]
Message-ID: <20140930165453.GA29843@node.dhcp.inet.fi> (raw)
In-Reply-To: <alpine.LSU.2.11.1409072307430.1298@eggly.anvils>

On Mon, Sep 08, 2014 at 12:13:16AM -0700, Hugh Dickins wrote:
> > > One subtlety to take care over: it's a long time since I've had to
> > > worry about pmd folding and pud folding (what happens when you only
> > > have 2 or 3 levels of page table instead of the full 4): macros get
> > > defined to each other, and levels get optimized out (perhaps
> > > differently on different architectures).
> > > 
> > > So although at first sight the lock to take in follow_huge_pud()
> > > would seem to be mm->page_table_lock, I am not at this point certain
> > > that that's necessarily so - sometimes pud_huge might be pmd_huge,
> > > and the size PMD_SIZE, and pmd_lockptr appropriate at what appears
> > > to be the pud level.  Maybe: needs checking through the architectures
> > > and their configs, not obvious to me.
> > 
> > I think that every architecture uses mm->page_table_lock for pud-level
> > locking at least for now, but that could be changed in the future,
> > for example when 1GB hugepages or pud-based hugepages become common and
> > someone are interested in splitting lock for pud level.
> 
> I'm not convinced by your answer, that you understand the (perhaps
> imaginary!) issue I'm referring to.  Try grep for __PAGETABLE_P.D_FOLDED.
> 
> Our infrastructure allows for 4 levels of pagetable, pgd pud pmd pte,
> but many architectures/configurations support only 2 or 3 levels.
> What pud functions and pmd functions work out to be in those
> configs is confusing, and varies from architecture to architecture.
> 
> In particular, pud and pmd may be different expressions of the same
> thing (with 1 pmd per pud, instead of say 512).  In that case PUD_SIZE
> will equal PMD_SIZE: and then at the pud level huge_pte_lockptr()
> will be using split locking instead of mm->page_table_lock.

<sorry for delay -- just back from vacation>

Look like we can't have PMD folded unless PUD is folded too:

include/asm-generic/pgtable-nopmd.h:#include <asm-generic/pgtable-nopud.h>

It means we have three cases:

- Both PMD and PUD are not folded. PUD_SIZE == PMD_SIZE can be true only
  if PUD table consits from one entry which is emm.. strange.
- PUD folded, PMD is not. In this case PUD_SIZE is equal to PGDIR_SIZE
  which is always (I believe) greater than PMD_SIZE.
- Both are folded: PMD_SIZE == PUD_SIZE == PGDIR_SIZE, but we solve it
  with ARCH_ENABLE_SPLIT_PMD_PTLOCK. It only enabled on configuration with
  where PMD is not folded. Without ARCH_ENABLE_SPLIT_PMD_PTLOCK,
  pmd_lockptr() points to mm->page_table_lock.

Does it make sense?

> Many of the hugetlb architectures have a pud_huge() which just returns
> 0, and we need not worry about those, nor the follow_huge_addr() powerpc.
> But arm64, mips, tile, x86 look more interesting.
> 
> Frankly, I find myself too dumb to be sure of the right answer for all:
> and think that when we put the proper locking into follow_huge_pud(),
> we shall have to include a PUD_SIZE == PMD_SIZE test, to let the
> compiler decide for us which is the appropriate locking to match
> huge_pte_lockptr().
> 
> Unless Kirill can illuminate: I may be afraid of complications
> where actually there are none.

I'm more worry about false-negative result of huge_page_size(h) ==
PMD_SIZE check. I can imagine that some architectures (power and ia64, i
guess) allows several page sizes on the same page table level, but only
one of them is PMD_SIZE.

It seems not a problem currently since we enable split PMD lock only on
x86 and s390.

Possible solution is to annotate each hstate with page table level it
corresponds to.

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2014-09-30 16:55 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-29  1:38 [PATCH 0/6] hugepage migration fixes (v3) Naoya Horiguchi
2014-08-29  1:38 ` [PATCH v3 1/6] mm/hugetlb: reduce arch dependent code around follow_huge_* Naoya Horiguchi
2014-09-03 19:40   ` Hugh Dickins
2014-08-29  1:38 ` [PATCH v3 2/6] mm/hugetlb: take page table lock in follow_huge_(addr|pmd|pud)() Naoya Horiguchi
2014-09-03 21:17   ` Hugh Dickins
2014-09-05  5:27     ` Naoya Horiguchi
2014-09-08  7:13       ` Hugh Dickins
2014-09-08 21:37         ` Naoya Horiguchi
2014-09-09 19:03           ` Hugh Dickins
2014-09-30 16:54         ` Kirill A. Shutemov [this message]
2014-08-29  1:38 ` [PATCH v3 3/6] mm/hugetlb: fix getting refcount 0 page in hugetlb_fault() Naoya Horiguchi
2014-09-04  0:20   ` Hugh Dickins
2014-09-05  5:28     ` Naoya Horiguchi
2014-08-29  1:38 ` [PATCH v3 4/6] mm/hugetlb: add migration entry check in hugetlb_change_protection Naoya Horiguchi
2014-09-04  1:06   ` Hugh Dickins
2014-09-05  5:28     ` Naoya Horiguchi
2014-08-29  1:38 ` [PATCH v3 5/6] mm/hugetlb: add migration entry check in __unmap_hugepage_range Naoya Horiguchi
2014-09-04  1:47   ` Hugh Dickins
2014-09-05  5:28     ` Naoya Horiguchi
2014-08-29  1:39 ` [PATCH v3 6/6] mm/hugetlb: remove unused argument of follow_huge_addr() Naoya Horiguchi
2014-09-03 21:26   ` Hugh Dickins
2014-09-05  5:29     ` Naoya Horiguchi
2014-08-31 15:27 ` [PATCH 0/6] hugepage migration fixes (v3) Andi Kleen
2014-09-01  4:08   ` Naoya Horiguchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140930165453.GA29843@node.dhcp.inet.fi \
    --to=kirill@shutemov.name \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=nao.horiguchi@gmail.com \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).