public inbox for linux-mm@kvack.org
From: Matthew Wilcox <willy@infradead.org>
To: Jane Chu <jane.chu@oracle.com>
Cc: akpm@linux-foundation.org, linmiaohe@huawei.com,
	kirill.shutemov@linux.intel.com, hughd@google.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: make page_mapped_in_vma() hugetlb walk aware
Date: Tue, 21 Jan 2025 05:00:10 +0000
Message-ID: <Z48p2oK1AfRLYmDQ@casper.infradead.org>
In-Reply-To: <20250121041849.3393237-1-jane.chu@oracle.com>

On Mon, Jan 20, 2025 at 09:18:49PM -0700, Jane Chu wrote:
> When a process consumes a UE in a page, the memory failure handler
> attempts to collect information for a potential SIGBUS.
> If the page is an anonymous page, page_mapped_in_vma(page, vma) is
> invoked in order to
>   1. retrieve the vaddr from the process' address space,
>   2. verify that the vaddr is indeed mapped to the poisoned page,
> where 'page' is the precise small page with UE.
> 
> It's been observed that when injecting poison to a non-head subpage
> of an anonymous hugetlb page, no SIGBUS shows up, while injecting to
> the head page produces a SIGBUS. The cause is that, although hugetlb_walk()
> returns a valid pmd entry (on x86), check_pte() detects a mismatch
> between the head page per the pmd and the input subpage. Thus the vaddr
> is considered not mapped to the subpage and the process is not collected
> for SIGBUS purposes.  This is the calling stack:
>       collect_procs_anon
>         page_mapped_in_vma
>           page_vma_mapped_walk
>             hugetlb_walk
>             huge_pte_lock
>             check_pte
> 
> It seems that the most obvious place to fix the issue is by making
> page_mapped_in_vma() hugetlb walk aware. The precise subpage in the
> input is useful in providing PAGE_SIZE granularity vaddr.

I don't like this solution because it adds yet another special case for
hugetlb.  If we don't split a PMD-mapped THP, we'd have the same
problem, right?

check_pte() would succeed if we set pvmw->pfn to folio_pfn() and
pvmw->nr_pages to folio_nr_pages(), right?  I just don't know what else
might be affected by that.
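
To make the mismatch concrete, here is a minimal userspace sketch of a PFN
range comparison of the kind check_pte() performs.  It is a simplified model,
not the kernel function: struct walk_model and pfn_in_range() are hypothetical
names, and the real check also deals with migration and device entries and
may differ between kernel versions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two pvmw fields that matter here. */
struct walk_model {
	unsigned long pfn;      /* first PFN the caller is looking for */
	unsigned long nr_pages; /* number of pages in that range */
};

/* True if mapped_pfn lies in [w->pfn, w->pfn + w->nr_pages). */
static bool pfn_in_range(unsigned long mapped_pfn, const struct walk_model *w)
{
	assert(w->nr_pages > 0);
	/* Unsigned wraparound makes mapped_pfn < w->pfn fail as well. */
	return (mapped_pfn - w->pfn) < w->nr_pages;
}
```

With a made-up head PFN of 0x100000 and a 512-page folio, a walk set up as
{ .pfn = head + 3, .nr_pages = 1 } rejects the head PFN found in the huge
entry, while { .pfn = head, .nr_pages = 512 } accepts it.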

I like one of these two options:

@@ -206,6 +206,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
                pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
                if (!pvmw->pte)
                        return false;
+               pvmw->pte += pvmw->address & (size - PAGE_SIZE);

                pvmw->ptl = huge_pte_lock(hstate, mm, pvmw->pte);
                if (!check_pte(pvmw))

(that needs a bit of tidying up; you can't just do that, but I think
you get the basic idea -- correct the pte to point to the precise page
instead of the hugetlb pfn)
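
The arithmetic in that added line can be checked in isolation.  The sketch
below is plain userspace C with hypothetical names (MODEL_PAGE_SHIFT,
subpage_byte_offset(), subpage_index()) and assumes 4KB base pages.  It only
shows what the mask computes: a page-aligned byte offset into the huge page,
which has to be shifted down to get a subpage index.

```c
#include <assert.h>

#define MODEL_PAGE_SHIFT 12UL                    /* assumption: 4KB pages */
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Byte offset, within its huge page, of the small page containing addr. */
static unsigned long subpage_byte_offset(unsigned long addr,
					 unsigned long huge_size)
{
	/* huge_size must be a power of two no smaller than a base page. */
	assert(huge_size >= MODEL_PAGE_SIZE &&
	       (huge_size & (huge_size - 1)) == 0);
	return addr & (huge_size - MODEL_PAGE_SIZE);
}

/* Index of that small page within the huge page. */
static unsigned long subpage_index(unsigned long addr, unsigned long huge_size)
{
	return subpage_byte_offset(addr, huge_size) >> MODEL_PAGE_SHIFT;
}
```

For a 2MB huge page and the made-up address 0x40203abc, the byte offset is
0x3000 and the subpage index is 3.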


The option I really prefer is much more work but matches our preferred
direction of getting rid of hugetlb specific code.  Something like this:

@@ -192,27 +192,6 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
        if (pvmw->pmd && !pvmw->pte)
                return not_found(pvmw);

-       if (unlikely(is_vm_hugetlb_page(vma))) {
-               struct hstate *hstate = hstate_vma(vma);
-               unsigned long size = huge_page_size(hstate);
-               /* The only possible mapping was handled on last iteration */
[...]
-               pvmw->ptl = huge_pte_lock(hstate, mm, pvmw->pte);
-               if (!check_pte(pvmw))
-                       return not_found(pvmw);
-               return true;
-       }
-
        end = vma_address_end(pvmw);
        if (pvmw->pte)
                goto next_pte;
@@ -229,7 +208,19 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw
                        continue;
                }
                pud = pud_offset(p4d, pvmw->address);
-               if (!pud_present(*pud)) {
+               pude = *pud;
+               if (pud_trans_huge(pude) ||
+                   (pud_present(pude) && pud_devmap(pude))) {
+                       pvmw->ptl = pud_lock(mm, pvmw->pud);
+                       ...
+                       if (likely(pud_trans_huge(pude) || pud_devmap(pude))) {
+                               if (pvmw->flags & PVMW_MIGRATION)
+                                       return not_found(pvmw);
+                               if (!check_pud(pud_pfn(pude), pvmw))
+                                       return not_found(pvmw);
+                               return true;
+                       }
+               } else if (!pud_present(pude)) {
                        step_forward(pvmw, PUD_SIZE);
                        continue;
                }

ie get rid of all the hugetlb-specific code, and add support for the
PUD level to the common code.  You'd also need to write check_pud().
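
As a rough illustration of what such a helper might test, here is a userspace
sketch of a PUD-level overlap check.  Everything in it is hypothetical:
check_pud() does not exist yet, the names pud_walk_model and check_pud_model()
are invented here, and MODEL_PUD_NR assumes a 1GB PUD mapping of 4KB pages.
The two comparisons simply ask whether the PFN range covered by the PUD entry
overlaps the range the walk is looking for.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PUD_NR (1UL << 18) /* assumption: 1GB / 4KB pages per PUD */

/* Hypothetical stand-in for the pvmw fields used by the check. */
struct pud_walk_model {
	unsigned long pfn;      /* first PFN the caller is looking for */
	unsigned long nr_pages; /* number of pages in that range */
};

/* True if [pud_pfn, pud_pfn + MODEL_PUD_NR) overlaps the wanted range. */
static bool check_pud_model(unsigned long pud_pfn,
			    const struct pud_walk_model *w)
{
	assert(w->nr_pages > 0);
	if (pud_pfn + MODEL_PUD_NR - 1 < w->pfn)
		return false; /* PUD range ends before the wanted range */
	if (pud_pfn > w->pfn + w->nr_pages - 1)
		return false; /* PUD range starts after the wanted range */
	return true;
}
```

A real version would take pvmw directly and use whatever constant the kernel
provides for the number of pages under a PUD entry.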

I'll understand if you don't want to do all the extra work.  And
thanks for tracking down this bug.


