linux-mm.kvack.org archive mirror
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: "Thomas Hellström (VMware)" <thomas_os@shipmail.org>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Matthew Wilcox" <willy@infradead.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Thomas Hellstrom <thellstrom@vmware.com>
Subject: Re: [RFC PATCH] mm: Fix a huge pud insertion race during faulting
Date: Tue, 15 Oct 2019 13:06:53 +0300
Message-ID: <20191015100653.ittq4b2mx7pszky5@box>
In-Reply-To: <20191008093711.3410-1-thomas_os@shipmail.org>

On Tue, Oct 08, 2019 at 11:37:11AM +0200, Thomas Hellström (VMware) wrote:
> From: Thomas Hellstrom <thellstrom@vmware.com>
> 
> A huge pud page can theoretically be faulted in racing with pmd_alloc()
> in __handle_mm_fault(). That will lead to pmd_alloc() returning an
> invalid pmd pointer. Fix this by adding a pud_trans_unstable() function
> similar to pmd_trans_unstable() and check whether the pud is really stable
> before using the pmd pointer.
> 
> Race:
> Thread 1:             Thread 2:                 Comment
> create_huge_pud()                               Fallback - not taken.
> 		      create_huge_pud()         Taken.
> pmd_alloc()                                     Returns an invalid pointer.
> 
> Cc: Matthew Wilcox <willy@infradead.org>
> Fixes: a00cc7d9dd93 ("mm, x86: add support for PUD-sized transparent hugepages")
> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
> ---
> RFC: We include pud_devmap() as an unstable PUD flag. Is this correct?
>      Do the same for pmds?

I *think* it is correct and we should do the same for PMD, but I may be
wrong.

Dan, Matthew, could you comment on this?

> ---
>  include/asm-generic/pgtable.h | 25 +++++++++++++++++++++++++
>  mm/memory.c                   |  6 ++++++
>  2 files changed, 31 insertions(+)
> 
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 818691846c90..70c2058230ba 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -912,6 +912,31 @@ static inline int pud_trans_huge(pud_t pud)
>  }
>  #endif
>  
> +/* See pmd_none_or_trans_huge_or_clear_bad for discussion. */
> +static inline int pud_none_or_trans_huge_or_dev_or_clear_bad(pud_t *pud)
> +{
> +	pud_t pudval = READ_ONCE(*pud);
> +
> +	if (pud_none(pudval) || pud_trans_huge(pudval) || pud_devmap(pudval))
> +		return 1;
> +	if (unlikely(pud_bad(pudval))) {
> +		pud_clear_bad(pud);
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +/* See pmd_trans_unstable for discussion. */
> +static inline int pud_trans_unstable(pud_t *pud)
> +{
> +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) &&			\
> +	defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
> +	return pud_none_or_trans_huge_or_dev_or_clear_bad(pud);
> +#else
> +	return 0;
> +#endif
> +}
> +
>  #ifndef pmd_read_atomic
>  static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
>  {
> diff --git a/mm/memory.c b/mm/memory.c
> index b1ca51a079f2..43ff372f4f07 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3914,6 +3914,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
>  	vmf.pud = pud_alloc(mm, p4d, address);
>  	if (!vmf.pud)
>  		return VM_FAULT_OOM;
> +retry_pud:
>  	if (pud_none(*vmf.pud) && __transparent_hugepage_enabled(vma)) {
>  		ret = create_huge_pud(&vmf);
>  		if (!(ret & VM_FAULT_FALLBACK))
> @@ -3940,6 +3941,11 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
>  	vmf.pmd = pmd_alloc(mm, vmf.pud, address);
>  	if (!vmf.pmd)
>  		return VM_FAULT_OOM;
> +
> +	/* Huge pud page fault raced with pmd_alloc? */
> +	if (pud_trans_unstable(vmf.pud))
> +		goto retry_pud;
> +
>  	if (pmd_none(*vmf.pmd) && __transparent_hugepage_enabled(vma)) {
>  		ret = create_huge_pmd(&vmf);
>  		if (!(ret & VM_FAULT_FALLBACK))
> -- 
> 2.20.1
> 
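For readers following the race table above, here is a minimal userspace
sketch of the "allocate, then re-check, then retry" shape the patch adds.
It is only a model under stated assumptions: the names upper_entry,
entry_unstable(), alloc_lower_table(), install_huge() and handle_fault()
are hypothetical and are not kernel APIs, the second thread is simulated
by a callback fired in the race window, and locking is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pud entry; not a kernel type. */
enum entry_kind { ENTRY_NONE, ENTRY_TABLE, ENTRY_HUGE, ENTRY_DEVMAP };

struct upper_entry {
	enum entry_kind kind;
};

enum fault_result { FAULT_HUGE_MAPPED, FAULT_USE_LOWER_TABLE };

/* Simulated "Thread 2": called once inside the race window. */
typedef void (*racer_fn)(struct upper_entry *);

/* Same shape as pud_trans_unstable(): one snapshot, then classify it. */
static int entry_unstable(const struct upper_entry *e)
{
	/* Volatile read so the entry is sampled exactly once. */
	enum entry_kind snap = *(const volatile enum entry_kind *)&e->kind;

	return snap == ENTRY_NONE || snap == ENTRY_HUGE || snap == ENTRY_DEVMAP;
}

/* Models pmd_alloc(): installs a lower table only if the slot is empty. */
static void alloc_lower_table(struct upper_entry *e)
{
	assert(e != NULL);
	if (e->kind == ENTRY_NONE)
		e->kind = ENTRY_TABLE;
}

/* Simulated racing fault that wins and installs a huge entry. */
static void install_huge(struct upper_entry *e)
{
	e->kind = ENTRY_HUGE;
}

/*
 * Models the patched fault path. This thread's own huge attempt is
 * assumed to fall back, so it is not modelled. *retries counts how
 * often the post-allocation check sent us back to the top.
 */
static enum fault_result handle_fault(struct upper_entry *e, racer_fn racer,
				      int *retries)
{
	for (;;) {				/* plays the role of retry_pud: */
		if (e->kind == ENTRY_HUGE || e->kind == ENTRY_DEVMAP)
			return FAULT_HUGE_MAPPED;

		if (racer) {			/* race window opens here */
			racer(e);
			racer = NULL;
		}

		alloc_lower_table(e);

		/* Huge entry raced in? The lower table must not be used. */
		if (entry_unstable(e)) {
			(*retries)++;
			continue;
		}
		return FAULT_USE_LOWER_TABLE;
	}
}
```

Calling handle_fault() with racer == NULL takes the normal path and
returns FAULT_USE_LOWER_TABLE with no retries. Passing install_huge
reproduces the interleaving from the changelog: the post-allocation
check fires once and the second pass sees the huge entry instead of
walking into a lower table that was never installed.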

-- 
 Kirill A. Shutemov

