linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Naoya Horiguchi <nao.horiguchi@gmail.com>,
	Oscar Salvador <osalvador@suse.de>,
	Muchun Song <songmuchun@bytedance.com>,
	linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Tony Luck <tony.luck@intel.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/2] mm,hwpoison: fix race with compound page allocation
Date: Mon, 17 May 2021 13:11:25 -0700	[thread overview]
Message-ID: <126534de-9679-0e22-bb2c-807f40edca38@oracle.com> (raw)
In-Reply-To: <20210517045401.2506032-2-nao.horiguchi@gmail.com>

On 5/16/21 9:54 PM, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@nec.com>
> 
> When hugetlb page fault (under overcommitting situation) and
> memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following race:
> 
>     CPU0:                           CPU1:
> 
>                                     gather_surplus_pages()
>                                       page = alloc_surplus_huge_page()
>     memory_failure_hugetlb()
>       get_hwpoison_page(page)
>         __get_hwpoison_page(page)
>           get_page_unless_zero(page)
>                                       zero = put_page_testzero(page)
>                                       VM_BUG_ON_PAGE(!zero, page)
>                                       enqueue_huge_page(h, page)
>       put_page(page)
> 
> __get_hwpoison_page() only checks page refcount before taking additional
> one for memory error handling, which is wrong because there's a time
> window where compound pages have non-zero refcount during initialization.
> 
> So makes __get_hwpoison_page() check page status a bit more for a few
> types of compound pages. PageSlab() check is added because otherwise
> "non anonymous thp" path is wrongly chosen.
> 
> Fixes: ead07f6a867b ("mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling")
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Reported-by: Muchun Song <songmuchun@bytedance.com>
> Cc: stable@vger.kernel.org # 5.12+
> ---
> ChangeLog v4:
> - all hugetlb related check in hugetlb_lock,
> - fix build error with #ifdef CONFIG_HUGETLB_PAGE
> ---
>  mm/memory-failure.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git v5.12/mm/memory-failure.c v5.12_patched/mm/memory-failure.c
> index a3659619d293..761f982b6d7b 100644
> --- v5.12/mm/memory-failure.c
> +++ v5.12_patched/mm/memory-failure.c
> @@ -1094,6 +1094,16 @@ static int page_action(struct page_state *ps, struct page *p,
>  static int __get_hwpoison_page(struct page *page)
>  {
>  	struct page *head = compound_head(page);
> +	int ret = 0;
> +
> +#ifdef CONFIG_HUGETLB_PAGE
> +	spin_lock(&hugetlb_lock);
> +	if (PageHuge(head) && (HPageFreed(head) || HPageMigratable(head)))
> +		ret = get_page_unless_zero(head);
> +	spin_unlock(&hugetlb_lock);
> +	if (ret > 0)
> +		return ret;
> +#endif

I must be missing something.

The above code makes sure the page is not in one of these transitive
hugetlb states as mentioned in the commit message.  It only attempts
to take a reference on the page if it is not in one of these states.

However, if it is in such a transitive state (!HPageFreed(head) &&
!HPageMigratable(head)) we will fall through and execute the code:

	if (get_page_unless_zero(head)) {
		if (head == compound_head(page))
			return 1;

So, it seems like we will always do a get_page_unless_zero() for
PageHuge() pages?

Also, due to commit db71ef79b59b ("hugetlb: make free_huge_page irq
safe") you need to make sure interrupts are disabled when taking
hugetlb_lock.
-- 
Mike Kravetz


  parent reply	other threads:[~2021-05-17 20:11 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-17  4:53 [PATCH v4 0/2] hwpoison: fix race with compound page allocation Naoya Horiguchi
2021-05-17  4:54 ` [PATCH v4 1/2] mm,hwpoison: " Naoya Horiguchi
2021-05-17 10:12   ` Oscar Salvador
2021-05-17 11:34     ` Naoya Horiguchi
2021-05-17 20:11   ` Mike Kravetz [this message]
2021-05-18  5:24     ` HORIGUCHI NAOYA(堀口 直也)
2021-05-17  4:54 ` [PATCH v4 2/2] mm,hwpoison: make get_hwpoison_page call get_any_page() Naoya Horiguchi
2021-05-17 12:59   ` Oscar Salvador
2021-05-17 15:13     ` HORIGUCHI NAOYA(堀口 直也)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=126534de-9679-0e22-bb2c-807f40edca38@oracle.com \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=nao.horiguchi@gmail.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=osalvador@suse.de \
    --cc=songmuchun@bytedance.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).