From: Donet Tom <donettom@linux.ibm.com>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>,
	akpm@linux-foundation.org, Liam.Howlett@oracle.com,
	lorenzo.stoakes@oracle.com, shuah@kernel.org, pfalcato@suse.de,
	david@redhat.com, ziy@nvidia.com, baolin.wang@linux.alibaba.com,
	npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
	baohua@kernel.org, linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	ritesh.list@gmail.com
Subject: Re: [PATCH v3 3/7] selftest/mm: Fix ksm_funtional_test failures
Date: Fri, 8 Aug 2025 19:55:37 +0530
Message-ID: <c237c703-3ed6-4d7d-aaff-bd6291f9220f@linux.ibm.com>
In-Reply-To: <20250808025804.b7cv47gcq2yscka7@master>


On 8/8/25 8:28 AM, Wei Yang wrote:
> On Thu, Aug 07, 2025 at 02:56:28PM +0530, Donet Tom wrote:
>> On 8/6/25 8:24 PM, Wei Yang wrote:
>>> On Wed, Aug 06, 2025 at 06:30:37PM +0530, Donet Tom wrote:
>>> [...]
>>>>> The child process inherits ksm_merging_pages from the parent, which seems
>>>>> reasonable to me. But I am confused about why ksm_unmerge() would reset
>>>>> ksm_merging_pages only for the parent and leave it unchanged in the child.
>>>>>
>>>>> ksm_unmerge() writes to /sys/kernel/mm/ksm/run, which is a system wide sysfs
>>>>> interface. I expect it applies to both parent and child.
>>>> I am not very familiar with the KSM code, but from what I understand:
>>>>
>>>> The ksm_merging_pages counter is maintained per mm_struct. When
>>>> we write to /sys/kernel/mm/ksm/run, unmerging is triggered, and the
>>>> counters are updated for all mm_structs present in the ksm_mm_slot list.
>>>>
>>>> An mm_struct gets added to this list when MADV_MERGEABLE is called.
>>>> In the case of the child process, since MADV_MERGEABLE has not been
>>>> invoked yet, its mm_struct is not part of the list. As a result,
>>>> its ksm_merging_pages counter is not reset.
>>>>
>>> Would this flag be inherited during fork? VM_MERGEABLE is saved in the related
>>> VMA, and I don't see it being dropped during fork. Maybe I missed something.
>>>
>>>>>> value remained unchanged. That’s why get_my_merging_page() in the child was
>>>>>> returning a non-zero value.
>>>>>>
>>>>> I guess you mean that get_my_merging_page() in __mmap_and_merge_range() returns
>>>>> a non-zero value. But there is a ksm_unmerge() before it. Why couldn't this
>>>>> ksm_unmerge() reset the value, when a ksm_unmerge() in the parent could?
>>>>>
>>>>>> Initially, I fixed the issue by calling ksm_unmerge() before the fork(), and
>>>>>> that resolved the problem. Later, I decided it would be cleaner to move the
>>>>>> ksm_unmerge() call to the test cleanup phase.
>>>>>>
>>>>> Also, all the tests before test_prctl_fork(), except test_prctl(), call
>>>>>
>>>>>      ksft_test_result(!range_maps_duplicates());
>>>>>
>>>>> If the previous tests succeed, it means there are no duplicate pages. This
>>>>> means ksm_merging_pages should be 0 before test_prctl_fork() if the other tests
>>>>> pass, and the child process would inherit a ksm_merging_pages of 0. (A quick
>>>>> test proves it.)
>>>> If I understand correctly, all the tests are calling MADV_UNMERGEABLE,
>>>> which internally calls break_ksm() in the kernel. This function replaces the
>>>> KSM page with an exclusive anonymous page. However, the
>>>> ksm_merging_pages counters are not updated at this point.
>>>>
>>>> The function range_maps_duplicates(map, size) checks whether the pages
>>>> have been unmerged. Since break_ksm() does perform the unmerge, this
>>>> function returns false, and the test passes.
>>>>
>>>> The ksm_merging_pages update happens later via the ksm_scan_thread().
>>>> That’s why we observe that ksm_merging_pages values are not reset
>>>> immediately after the test finishes.
>>>>
>>> I am not familiar with KSM internals. But it feels odd to me that the
>>> ksm_merging_pages counter still has a non-zero value after all merged pages
>>> have been unmerged.
>>>
>>>> If we add a sleep(1) after the MADV_UNMERGEABLE call, we can see that
>>>> the ksm_merging_pages values are reset after the sleep.
>>>>
>>>> Once the test completes successfully, we can call ksm_unmerge(), which
>>>> will immediately reset the ksm_merging_pages value. This way, in the fork
>>>> test, the child process will also see the correct value.
>>>>> So which part of the story did I miss?
>>>>>
>>>> So, during the cleanup phase after a successful test, we can call
>>>> ksm_unmerge() to reset the counter. Do you see any issue with
>>>> this approach?
>>>>
>>> It looks like there is no issue with an extra ksm_unmerge().
>>>
>>> But one more question: why would an extra ksm_unmerge() help?
>>>
>>> Here is what we have during test:
>>>
>>>
>>>     test_prot_none()
>>>         !range_maps_duplicates()
>>>         ksm_unmerge()                  1) <--- newly add
>>>     test_prctl_fork()
>>>         >--- in child
>>>         __mmap_and_merge_range()
>>>             ksm_unmerge()              2) <--- already have
>>>
>>> As you mentioned above, ksm_unmerge() would immediately reset
>>> ksm_merging_pages. So why does the ksm_unmerge() at 2) still leave
>>> ksm_merging_pages non-zero, while the one at 1) helps?
>>
>> From the debugging, what I understood is:
>> When we perform fork(), MADV_MERGEABLE, or PR_SET_MEMORY_MERGE, the
>> mm_struct of the process gets added to the ksm_mm_slot list. As a
>> result, both the parent and child processes’ mm_struct structures
>> will be present in ksm_mm_slot.
>>
>> When KSM merges the pages, it creates a ksm_rmap_item for each page,
>> and the ksm_merging_pages counter is incremented accordingly.
>>
>> Since the parent process did the merge, its mm_struct is present in
>> ksm_mm_slot, and ksm_rmap_item entries are created for all the merged
>> pages.
>>
>> When a process is forked, the child’s mm_struct is also added to
>> ksm_mm_slot, and it inherits the ksm_merging_pages count. However,
>> no ksm_rmap_item entries are created for the child process because it
>> did not do any merge.
>>
>> When ksm_unmerge() is called, it iterates over all processes in
>> ksm_mm_slot. In our case, both the parent and child are present. It
>> first processes the parent, which has ksm_rmap_item entries, so it
>> unmerges the pages and resets the ksm_merging_pages counter.
>>
>> For the child, since it did not perform any actual merging, it does not
>> have any ksm_rmap_item entries. Therefore, there are no pages to unmerge,
>> and the counter remains unchanged.
>>
> Thanks for the detailed analysis.
>
> So the key is that the child has no ksm_rmap_item, so ksm_unmerge() will not
> clear its ksm_merging_pages.
>
>> So, only processes that performed KSM merging will have their counters
>> updated during ksm_unmerge(). The child process, having not initiated any
>> merging, retains the inherited counter value without any update.
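
The accounting described above can be mimicked with a small user-space toy model. This is only a sketch of the behaviour as understood from the debugging, not kernel code: struct toy_mm, toy_merge(), toy_fork() and toy_unmerge_all() are hypothetical names invented for illustration, and the real KSM data structures are considerably more involved.

```c
/* Toy user-space model of the accounting described above; NOT kernel code. */
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mm_struct that sits in the ksm_mm_slot list. */
struct toy_mm {
	long merging_pages;	/* models the per-mm ksm_merging_pages counter */
	size_t nr_rmap_items;	/* models how many ksm_rmap_item entries the mm owns */
};

/* Models KSM merging n pages: one rmap item and one counter bump per page. */
static void toy_merge(struct toy_mm *mm, size_t n)
{
	mm->nr_rmap_items += n;
	mm->merging_pages += (long)n;
}

/* Models fork as described: the counter is copied, the rmap items are not. */
static struct toy_mm toy_fork(const struct toy_mm *parent)
{
	struct toy_mm child = {
		.merging_pages = parent->merging_pages,
		.nr_rmap_items = 0,
	};

	return child;
}

/*
 * Models the unmerge pass: walk every mm in the list and drop the counter
 * once per rmap item. An mm without rmap items is left untouched.
 */
static void toy_unmerge_all(struct toy_mm **slots, size_t nr_slots)
{
	for (size_t i = 0; i < nr_slots; i++) {
		struct toy_mm *mm = slots[i];

		while (mm->nr_rmap_items > 0) {
			assert(mm->merging_pages > 0);
			mm->nr_rmap_items--;
			mm->merging_pages--;
		}
	}
}
```

With a parent that merged four pages and a child created by toy_fork(), running toy_unmerge_all() over both leaves the parent's counter at 0 while the child keeps the inherited 4, mirroring the stale non-zero value seen in the child.
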
>>
>> So from a testing point of view, I think it is better to reset the
>> counters as part of the cleanup code to ensure that the next tests do
>> not get incorrect values.
>>
> Hmm... I agree from the test point of view based on current situation.
>
> While maybe this is also a check point for later version.

Are you okay to proceed with the current patch in this series?

>
>> The question I have is: is it correct to keep the inherited
>> ksm_merging_pages value in the child, or should we reset it to 0 during
>> ksm_fork()?
>>
> Very good question. There looks to be something wrong, but I am not sure this
> is the correct way.

ok.

I am going through it and will come up with a fix along with a test for 
this scenario. I will post it as a separate series.
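
For reference, a minimal sketch of how the inherited value could be observed from user space. It is not the test that will be posted: read_my_merging_pages() and child_merging_pages_after_fork() are hypothetical helper names, and the sketch assumes the kernel exposes /proc/self/ksm_merging_pages (both helpers return -1 when it does not).

```c
#include <assert.h>	/* only needed for the usage check shown below */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns this process's ksm_merging_pages value, or -1 if it cannot be read. */
static long read_my_merging_pages(void)
{
	FILE *f = fopen("/proc/self/ksm_merging_pages", "r");
	long val = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

/*
 * Forks, lets the child read its own counter, and passes the value back to
 * the parent through a pipe. Returns -1 on any failure.
 */
static long child_merging_pages_after_fork(void)
{
	int fds[2];
	long val = -1;
	pid_t pid;

	if (pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		long mine = read_my_merging_pages();

		close(fds[0]);
		if (write(fds[1], &mine, sizeof(mine)) != sizeof(mine))
			_exit(1);
		_exit(0);
	}

	close(fds[1]);
	if (read(fds[0], &val, sizeof(val)) != sizeof(val))
		val = -1;
	close(fds[0]);
	waitpid(pid, NULL, 0);
	return val;
}
```

A check such as assert(child_merging_pages_after_fork() == read_my_merging_pages()) would show the child starting with the parent's value. Whether the child should instead start at 0 is exactly the open question above.
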


>
>>> Or there is still some timing issue like sleep(1) you did?
>>>

