From: Lance Yang <lance.yang@linux.dev>
To: maobibo@loongson.cn
Cc: lance.yang@linux.dev, akpm@linux-foundation.org,
	david@kernel.org, ljs@kernel.org, ziy@nvidia.com,
	baolin.wang@linux.alibaba.com, Liam.Howlett@oracle.com,
	npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
	baohua@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/huge_memory: skip huge_zero_pmd in zap_huge_pmd_folio()
Date: Thu, 30 Apr 2026 15:02:17 +0800
Message-ID: <20260430070217.39679-1-lance.yang@linux.dev>
In-Reply-To: <e3f542dd-f22b-fc8c-b376-72ab5909b6c8@loongson.cn>


On Thu, Apr 30, 2026 at 02:34:20PM +0800, Bibo Mao wrote:
>
>
>On 2026/4/30 12:28 PM, Lance Yang wrote:
>> 
>> On Thu, Apr 30, 2026 at 12:11:20PM +0800, Bibo Mao wrote:
>>> when executing command "make check" with qemu software, there is
>>> error report like this:
>>> BUG: Bad rss-counter state mm:00000000972846bc type:MM_FILEPAGES val:-4096 Comm:bios-tables-tes Pid:27802
>>> BUG: Bad rss-counter state mm:00000000752180c5 type:MM_FILEPAGES val:-2048 Comm:worker Pid:27815
>>> BUG: Bad rss-counter state mm:000000009c2f6a61 type:MM_FILEPAGES val:-2048 Comm:qom-test Pid:27825
>> 
>> Good catch!
>> 
>>> The problem is that when application exits, rss counter is calculated
>>> with huge_zero_pmd huge page, instead it should be skipped.
>> 
>> Looks like the same problem[1] we discussed recently.
>> 
>> [1] https://lore.kernel.org/linux-mm/74a75b59-2e13-3985-ee99-d5521f39df2a@google.com/
>> 
>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>>> ---
>>> mm/huge_memory.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 970e077019b7..3cbea344d4a2 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -2423,6 +2423,9 @@ static void zap_huge_pmd_folio(struct mm_struct *mm, struct vm_area_struct *vma,
>>> {
>>> 	const bool is_device_private = folio_is_device_private(folio);
>>>
>>> +	if (is_huge_zero_pmd(pmdval))
>>> +		return;
>>> +
>> 
>> The huge zero PMD should not be returned by vm_normal_page_pmd() or
>> vm_normal_folio_pmd() as a normal folio. If it reaches
>> zap_huge_pmd_folio(), we already made the wrong normal-vs-special
>> decision ...
>> 
>> So I don't think we should special-case it in zap_huge_pmd_folio(). That
>> only avoids this RSS decrement :)
>> 
>> Could you please check whether the fix[2] also fixes your QEMU test?
>> 
>> [2] https://lore.kernel.org/linux-mm/ea1453a6-14c9-4334-ac7e-2758586393b2@kernel.org/
>yes, I think it will solve this problem.
>
>Only that I think that there should be tlb flush operation after 
>pmdp_huge_get_and_clear_full() even with huge_zero_pmd page, so 
>tlb_remove_page_size() should be called. Is that right?

Calling tlb_remove_page_size() is not necessary there :)

zap_huge_pmd() already marks the PMD range for TLB invalidation right
after clearing the entry:

	orig_pmd = pmdp_huge_get_and_clear_full(...);
	tlb_remove_pmd_tlb_entry(tlb, pmd, addr);

The later tlb_remove_page_size() is guarded by "is_present && folio",
and only applies to the normal folio case, after
normal_or_softleaf_folio_pmd() returns one :)
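
To make the ordering concrete, here is a rough userspace sketch. It is
not kernel code: every type and model_* helper below is a hypothetical
stand-in that only mirrors the control flow described above, and it
treats the huge zero PMD simply as "present, but no normal folio".

```c
/* Userspace model only. All names here are hypothetical stand-ins,
 * not the kernel's definitions or signatures. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct folio { int id; };

/* Stand-in for the bookkeeping kept in struct mmu_gather. */
struct mmu_gather_model {
	bool pmd_range_flagged;	/* PMD range marked for TLB invalidation */
	int pages_queued;	/* folios queued to be freed after the flush */
};

/* Stand-in for a PMD entry and what the folio lookup would return. */
struct pmd_model {
	bool present;
	struct folio *normal_folio;	/* NULL for special entries */
};

/* Models tlb_remove_pmd_tlb_entry(): only records the range. */
static void model_tlb_remove_pmd_tlb_entry(struct mmu_gather_model *tlb)
{
	tlb->pmd_range_flagged = true;
}

/* Models tlb_remove_page_size(): only counts the queued folio. */
static void model_tlb_remove_page_size(struct mmu_gather_model *tlb,
				       struct folio *folio)
{
	(void)folio;
	tlb->pages_queued++;
}

static void model_zap_huge_pmd(struct mmu_gather_model *tlb,
			       struct pmd_model *pmd)
{
	struct pmd_model orig;

	assert(tlb && pmd);

	/* Models pmdp_huge_get_and_clear_full(): read, then clear. */
	orig = *pmd;
	pmd->present = false;
	pmd->normal_folio = NULL;

	/* Runs for every cleared entry, normal folio or not. */
	model_tlb_remove_pmd_tlb_entry(tlb);

	/* Runs only when a normal folio was found. */
	if (orig.present && orig.normal_folio)
		model_tlb_remove_page_size(tlb, orig.normal_folio);
}
```

In this model, zapping an entry with present set and normal_folio NULL
leaves pages_queued at 0 while pmd_range_flagged is still set, so the
range invalidation does not depend on the folio path.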

Please correct me if I missed something :D

Cheers, Lance

