Re: [PATCH v3 2/6] mm/hugetlb: take page table lock in follow_huge_(addr|pmd|pud)()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Naoya Horiguchi <nao.horiguchi@gmail.com>
Subject: Re: [PATCH v3 2/6] mm/hugetlb: take page table lock in follow_huge_(addr|pmd|pud)()
Date: Tue, 30 Sep 2014 19:54:53 +0300	[thread overview]
Message-ID: <20140930165453.GA29843@node.dhcp.inet.fi> (raw)
In-Reply-To: <alpine.LSU.2.11.1409072307430.1298@eggly.anvils>

On Mon, Sep 08, 2014 at 12:13:16AM -0700, Hugh Dickins wrote:
> > > One subtlety to take care over: it's a long time since I've had to
> > > worry about pmd folding and pud folding (what happens when you only
> > > have 2 or 3 levels of page table instead of the full 4): macros get
> > > defined to each other, and levels get optimized out (perhaps
> > > differently on different architectures).
> > > 
> > > So although at first sight the lock to take in follow_huge_pud()
> > > would seem to be mm->page_table_lock, I am not at this point certain
> > > that that's necessarily so - sometimes pud_huge might be pmd_huge,
> > > and the size PMD_SIZE, and pmd_lockptr appropriate at what appears
> > > to be the pud level.  Maybe: needs checking through the architectures
> > > and their configs, not obvious to me.
> > 
> > I think that every architecture uses mm->page_table_lock for pud-level
> > locking at least for now, but that could be changed in the future,
> > for example when 1GB hugepages or pud-based hugepages become common and
> > someone are interested in splitting lock for pud level.
> 
> I'm not convinced by your answer, that you understand the (perhaps
> imaginary!) issue I'm referring to.  Try grep for __PAGETABLE_P.D_FOLDED.
> 
> Our infrastructure allows for 4 levels of pagetable, pgd pud pmd pte,
> but many architectures/configurations support only 2 or 3 levels.
> What pud functions and pmd functions work out to be in those
> configs is confusing, and varies from architecture to architecture.
> 
> In particular, pud and pmd may be different expressions of the same
> thing (with 1 pmd per pud, instead of say 512).  In that case PUD_SIZE
> will equal PMD_SIZE: and then at the pud level huge_pte_lockptr()
> will be using split locking instead of mm->page_table_lock.

<sorry for delay -- just back from vacation>

Look like we can't have PMD folded unless PUD is folded too:

include/asm-generic/pgtable-nopmd.h:#include <asm-generic/pgtable-nopud.h>

It means we have three cases:

- Both PMD and PUD are not folded. PUD_SIZE == PMD_SIZE can be true only
  if PUD table consits from one entry which is emm.. strange.
- PUD folded, PMD is not. In this case PUD_SIZE is equal to PGDIR_SIZE
  which is always (I believe) greater than PMD_SIZE.
- Both are folded: PMD_SIZE == PUD_SIZE == PGDIR_SIZE, but we solve it
  with ARCH_ENABLE_SPLIT_PMD_PTLOCK. It only enabled on configuration with
  where PMD is not folded. Without ARCH_ENABLE_SPLIT_PMD_PTLOCK,
  pmd_lockptr() points to mm->page_table_lock.

Does it make sense?

> Many of the hugetlb architectures have a pud_huge() which just returns
> 0, and we need not worry about those, nor the follow_huge_addr() powerpc.
> But arm64, mips, tile, x86 look more interesting.
> 
> Frankly, I find myself too dumb to be sure of the right answer for all:
> and think that when we put the proper locking into follow_huge_pud(),
> we shall have to include a PUD_SIZE == PMD_SIZE test, to let the
> compiler decide for us which is the appropriate locking to match
> huge_pte_lockptr().
> 
> Unless Kirill can illuminate: I may be afraid of complications
> where actually there are none.

I'm more worry about false-negative result of huge_page_size(h) ==
PMD_SIZE check. I can imagine that some architectures (power and ia64, i
guess) allows several page sizes on the same page table level, but only
one of them is PMD_SIZE.

It seems not a problem currently since we enable split PMD lock only on
x86 and s390.

Possible solution is to annotate each hstate with page table level it
corresponds to.

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)

From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Naoya Horiguchi <nao.horiguchi@gmail.com>
Subject: Re: [PATCH v3 2/6] mm/hugetlb: take page table lock in follow_huge_(addr|pmd|pud)()
Date: Tue, 30 Sep 2014 19:54:53 +0300	[thread overview]
Message-ID: <20140930165453.GA29843@node.dhcp.inet.fi> (raw)
In-Reply-To: <alpine.LSU.2.11.1409072307430.1298@eggly.anvils>

On Mon, Sep 08, 2014 at 12:13:16AM -0700, Hugh Dickins wrote:
> > > One subtlety to take care over: it's a long time since I've had to
> > > worry about pmd folding and pud folding (what happens when you only
> > > have 2 or 3 levels of page table instead of the full 4): macros get
> > > defined to each other, and levels get optimized out (perhaps
> > > differently on different architectures).
> > > 
> > > So although at first sight the lock to take in follow_huge_pud()
> > > would seem to be mm->page_table_lock, I am not at this point certain
> > > that that's necessarily so - sometimes pud_huge might be pmd_huge,
> > > and the size PMD_SIZE, and pmd_lockptr appropriate at what appears
> > > to be the pud level.  Maybe: needs checking through the architectures
> > > and their configs, not obvious to me.
> > 
> > I think that every architecture uses mm->page_table_lock for pud-level
> > locking at least for now, but that could be changed in the future,
> > for example when 1GB hugepages or pud-based hugepages become common and
> > someone are interested in splitting lock for pud level.
> 
> I'm not convinced by your answer, that you understand the (perhaps
> imaginary!) issue I'm referring to.  Try grep for __PAGETABLE_P.D_FOLDED.
> 
> Our infrastructure allows for 4 levels of pagetable, pgd pud pmd pte,
> but many architectures/configurations support only 2 or 3 levels.
> What pud functions and pmd functions work out to be in those
> configs is confusing, and varies from architecture to architecture.
> 
> In particular, pud and pmd may be different expressions of the same
> thing (with 1 pmd per pud, instead of say 512).  In that case PUD_SIZE
> will equal PMD_SIZE: and then at the pud level huge_pte_lockptr()
> will be using split locking instead of mm->page_table_lock.

<sorry for delay -- just back from vacation>

Look like we can't have PMD folded unless PUD is folded too:

include/asm-generic/pgtable-nopmd.h:#include <asm-generic/pgtable-nopud.h>

It means we have three cases:

- Both PMD and PUD are not folded. PUD_SIZE == PMD_SIZE can be true only
  if PUD table consits from one entry which is emm.. strange.
- PUD folded, PMD is not. In this case PUD_SIZE is equal to PGDIR_SIZE
  which is always (I believe) greater than PMD_SIZE.
- Both are folded: PMD_SIZE == PUD_SIZE == PGDIR_SIZE, but we solve it
  with ARCH_ENABLE_SPLIT_PMD_PTLOCK. It only enabled on configuration with
  where PMD is not folded. Without ARCH_ENABLE_SPLIT_PMD_PTLOCK,
  pmd_lockptr() points to mm->page_table_lock.

Does it make sense?

> Many of the hugetlb architectures have a pud_huge() which just returns
> 0, and we need not worry about those, nor the follow_huge_addr() powerpc.
> But arm64, mips, tile, x86 look more interesting.
> 
> Frankly, I find myself too dumb to be sure of the right answer for all:
> and think that when we put the proper locking into follow_huge_pud(),
> we shall have to include a PUD_SIZE == PMD_SIZE test, to let the
> compiler decide for us which is the appropriate locking to match
> huge_pte_lockptr().
> 
> Unless Kirill can illuminate: I may be afraid of complications
> where actually there are none.

I'm more worry about false-negative result of huge_page_size(h) ==
PMD_SIZE check. I can imagine that some architectures (power and ia64, i
guess) allows several page sizes on the same page table level, but only
one of them is PMD_SIZE.

It seems not a problem currently since we enable split PMD lock only on
x86 and s390.

Possible solution is to annotate each hstate with page table level it
corresponds to.

-- 
 Kirill A. Shutemov

next prev parent reply	other threads:[~2014-09-30 16:55 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-29  1:38 [PATCH 0/6] hugepage migration fixes (v3) Naoya Horiguchi
2014-08-29  1:38 ` Naoya Horiguchi
2014-08-29  1:38 ` [PATCH v3 1/6] mm/hugetlb: reduce arch dependent code around follow_huge_* Naoya Horiguchi
2014-08-29  1:38   ` Naoya Horiguchi
2014-09-03 19:40   ` Hugh Dickins
2014-09-03 19:40     ` Hugh Dickins
2014-08-29  1:38 ` [PATCH v3 2/6] mm/hugetlb: take page table lock in follow_huge_(addr|pmd|pud)() Naoya Horiguchi
2014-08-29  1:38   ` Naoya Horiguchi
2014-09-03 21:17   ` Hugh Dickins
2014-09-03 21:17     ` Hugh Dickins
2014-09-05  5:27     ` Naoya Horiguchi
2014-09-05  5:27       ` Naoya Horiguchi
2014-09-08  7:13       ` Hugh Dickins
2014-09-08  7:13         ` Hugh Dickins
2014-09-08 21:37         ` Naoya Horiguchi
2014-09-08 21:37           ` Naoya Horiguchi
2014-09-09 19:03           ` Hugh Dickins
2014-09-09 19:03             ` Hugh Dickins
2014-09-30 16:54         ` Kirill A. Shutemov [this message]
2014-09-30 16:54           ` Kirill A. Shutemov
2014-08-29  1:38 ` [PATCH v3 3/6] mm/hugetlb: fix getting refcount 0 page in hugetlb_fault() Naoya Horiguchi
2014-08-29  1:38   ` Naoya Horiguchi
2014-09-04  0:20   ` Hugh Dickins
2014-09-04  0:20     ` Hugh Dickins
2014-09-05  5:28     ` Naoya Horiguchi
2014-09-05  5:28       ` Naoya Horiguchi
2014-08-29  1:38 ` [PATCH v3 4/6] mm/hugetlb: add migration entry check in hugetlb_change_protection Naoya Horiguchi
2014-08-29  1:38   ` Naoya Horiguchi
2014-09-04  1:06   ` Hugh Dickins
2014-09-04  1:06     ` Hugh Dickins
2014-09-05  5:28     ` Naoya Horiguchi
2014-09-05  5:28       ` Naoya Horiguchi
2014-08-29  1:38 ` [PATCH v3 5/6] mm/hugetlb: add migration entry check in __unmap_hugepage_range Naoya Horiguchi
2014-08-29  1:38   ` Naoya Horiguchi
2014-09-04  1:47   ` Hugh Dickins
2014-09-04  1:47     ` Hugh Dickins
2014-09-05  5:28     ` Naoya Horiguchi
2014-09-05  5:28       ` Naoya Horiguchi
2014-08-29  1:39 ` [PATCH v3 6/6] mm/hugetlb: remove unused argument of follow_huge_addr() Naoya Horiguchi
2014-08-29  1:39   ` Naoya Horiguchi
2014-09-03 21:26   ` Hugh Dickins
2014-09-03 21:26     ` Hugh Dickins
2014-09-05  5:29     ` Naoya Horiguchi
2014-09-05  5:29       ` Naoya Horiguchi
2014-08-31 15:27 ` [PATCH 0/6] hugepage migration fixes (v3) Andi Kleen
2014-08-31 15:27   ` Andi Kleen
2014-09-01  4:08   ` Naoya Horiguchi
2014-09-01  4:08     ` Naoya Horiguchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140930165453.GA29843@node.dhcp.inet.fi \
    --to=kirill@shutemov.name \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=nao.horiguchi@gmail.com \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.