From: Mike Kravetz <mike.kravetz@oracle.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>, akpm@linux-foundation.org
Cc: songmuchun@bytedance.com, david@redhat.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] mm/hugetlb: fix races when looking up a CONT-PTE/PMD size hugetlb page
Date: Thu, 1 Sep 2022 14:06:06 -0700
Message-ID: <YxEevpqBy2rIjcrO@monkey>
In-Reply-To: <635f43bdd85ac2615a58405da82b4d33c6e5eb05.1662017562.git.baolin.wang@linux.alibaba.com>

On 09/01/22 18:41, Baolin Wang wrote:
> On some architectures (like ARM64), it can support CONT-PTE/PMD size
> hugetlb, which means it can support not only PMD/PUD size hugetlb
> (2M and 1G), but also CONT-PTE/PMD size (64K and 32M) if a 4K page size
> is specified.
> 
> So when looking up a CONT-PTE size hugetlb page by follow_page(), it
> will use pte_offset_map_lock() to get the pte entry lock for the CONT-PTE
> size hugetlb in follow_page_pte(). However this pte entry lock is incorrect
> for the CONT-PTE size hugetlb, since we should use huge_pte_lock() to
> get the correct lock, which is mm->page_table_lock.
> 
> That means the pte entry of the CONT-PTE size hugetlb is not stable under
> the current pte lock in follow_page_pte(): another thread can still migrate
> or poison the pte entry of the CONT-PTE size hugetlb, which can cause
> potential race issues even though the lookup holds the 'pte lock'.
> 
> For example, suppose thread A is trying to look up a CONT-PTE size
> hugetlb page by the move_pages() syscall under the lock, while another
> thread B migrates the CONT-PTE hugetlb page at the same time. Thread A
> then gets an incorrect page, and if thread A also wants to do page
> migration, a data inconsistency error occurs.
>
> Moreover we have the same issue for CONT-PMD size hugetlb in
> follow_huge_pmd().
> 
> To fix above issues, rename the follow_huge_pmd() as follow_huge_pmd_pte()
> to handle PMD and PTE level size hugetlb, which uses huge_pte_lock() to
> get the correct pte entry lock to make the pte entry stable.
> 
> Cc: <stable@vger.kernel.org>
> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> Changes from v2:
>  - Combine PMD and PTE level hugetlb handling into one function.
>  - Drop unnecessary patches.
>  - Update the commit message.
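
For anyone following the locking argument in the quoted commit message, here is a rough user-space model of which lock each path ends up taking. It is only a sketch, not kernel code: the enum, the function names and the size macros are all made up for illustration, it assumes split page-table locks are enabled, and the "old lookup" mapping is just one reading of the commit message (follow_page_pte() for CONT-PTE, follow_huge_pmd() for PMD and CONT-PMD).

```c
#include <assert.h>
#include <stdbool.h>

/* hugetlb sizes on arm64 with a 4K base page size, per the commit message */
#define HPAGE_64K (64UL << 10)  /* CONT-PTE */
#define HPAGE_2M  (2UL << 20)   /* PMD */
#define HPAGE_32M (32UL << 20)  /* CONT-PMD */

/* Hypothetical labels for the locks involved; these are not kernel types. */
enum lock_kind {
	LOCK_PTE_TABLE_SPLIT,	/* what pte_offset_map_lock() takes */
	LOCK_PMD_SPLIT,		/* split PMD lock */
	LOCK_MM_PAGE_TABLE,	/* mm->page_table_lock */
};

/*
 * Models the choice made by huge_pte_lock(): a PMD-sized hugetlb page
 * uses the split PMD lock, every other hugetlb size falls back to
 * mm->page_table_lock.
 */
static enum lock_kind hugetlb_lock_for(unsigned long hpage_size)
{
	if (hpage_size == HPAGE_2M)
		return LOCK_PMD_SPLIT;
	return LOCK_MM_PAGE_TABLE;
}

/*
 * Models the lock taken by the lookup paths before the patch
 * (assumption: CONT-PTE goes through follow_page_pte(), PMD and
 * CONT-PMD go through follow_huge_pmd()).
 */
static enum lock_kind old_lookup_lock_for(unsigned long hpage_size)
{
	if (hpage_size == HPAGE_64K)
		return LOCK_PTE_TABLE_SPLIT;
	return LOCK_PMD_SPLIT;
}

/* The lookup can race with migration when the two sides disagree. */
static bool lookup_is_racy(unsigned long hpage_size)
{
	return old_lookup_lock_for(hpage_size) != hugetlb_lock_for(hpage_size);
}
```

In this model lookup_is_racy() is true for the 64K and 32M sizes and false for 2M, which matches the two cases the commit message calls out.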

Baolin, were you able to at least exercise the new code paths?  Especially the
path for CONT_PTE.  Code looks fine to me.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

It is a little hackish, but this is only for backports.  So, I think it is OK.
We may want to point out that a code cleanup and simplification that addresses
these issues in a more elegant manner is going upstream.

> 
> Mike, please fold this patch into your series. Thanks.

If I understand Andrew, this can go in as a separate patch for backport to
address potential bugs.  I will provide a cleanup/simplification that will
remove this going forward.

Andrew also asked for a Fixes tag.
Support for CONT_PMD/_PTE was added with bb9dd3df8ee9 ("arm64: hugetlb: refactor
find_num_contig()"), from the patch series "Support for contiguous pte
hugepages", v4.  However, I do not believe these code paths were executed until
migration support was added with 5480280d3f2d ("arm64/mm: enable HugeTLB
migration for contiguous bit HugeTLB pages").
I would go with 5480280d3f2d.
-- 
Mike Kravetz

