linux-mm.kvack.org archive mirror
From: "Mika Penttilä" <mpenttil@redhat.com>
To: zhiguojiang <justinjiang@vivo.com>,
	David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	oe-lkp@lists.linux.dev, oliver.sang@intel.com
Cc: opensource.kernel@vivo.com
Subject: Re: [PATCH v2] vma remove the unneeded avc bound with non-CoWed folio
Date: Wed, 28 Aug 2024 06:51:29 +0300
Message-ID: <25d25633-0bf4-452c-b665-354a5aaa5d0c@redhat.com>
In-Reply-To: <5b56e76b-cf73-4cfd-b4c5-03fdece234fd@vivo.com>

Hi,

On 8/28/24 04:14, zhiguojiang wrote:
>
>
> On 2024/8/28 1:35, David Hildenbrand wrote:
>> On 27.08.24 03:50, zhiguojiang wrote:
>>>
>>>
>>> On 2024/8/27 1:24, David Hildenbrand wrote:
>>>> On 23.08.24 16:01, Zhiguo Jiang wrote:
>>>>> After a CoW in do_wp_page(), the vma maps the CoWed folio instead
>>>>> of the non-CoWed folio. However, when vma->anon_vma and the
>>>>> non-CoWed folio's anon_vma are not the same, the avc binding
>>>>> between them is no longer needed, so it is a problem that this
>>>>> avc binding still exists.
>>>>>
>>>>> This patch removes the avc binding between the vma and the
>>>>> non-CoWed folio's anon_vma, since each has its own independent
>>>>> anon_vma. It also reduces rmap overhead.
>>>>>
>>>>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
>>>>> ---
>>>>> -v2:
>>>>>    * Solve the kernel test robot noticed "WARNING"
>>>>>      Reported-by: kernel test robot <oliver.sang@intel.com>
>>>>>      Closes:
>>>>> https://lore.kernel.org/oe-lkp/202408230938.43f55b4-lkp@intel.com
>>>>>    * Update comments to more accurately describe this patch.
>>>>>
>>>>> -v1:
>>>>> https://lore.kernel.org/linux-mm/20240820143359.199-1-justinjiang@vivo.com/
>>>>>
>>>>>
>>>>>    include/linux/rmap.h |  1 +
>>>>>    mm/memory.c          |  8 +++++++
>>>>>    mm/rmap.c            | 53
>>>>> ++++++++++++++++++++++++++++++++++++++++++++
>>>>>    3 files changed, 62 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>>>>> index 91b5935e8485..8607d28a3146
>>>>> --- a/include/linux/rmap.h
>>>>> +++ b/include/linux/rmap.h
>>>>> @@ -257,6 +257,7 @@ void folio_remove_rmap_ptes(struct folio *,
>>>>> struct page *, int nr_pages,
>>>>>        folio_remove_rmap_ptes(folio, page, 1, vma)
>>>>>    void folio_remove_rmap_pmd(struct folio *, struct page *,
>>>>>            struct vm_area_struct *);
>>>>> +void folio_remove_anon_avc(struct folio *, struct vm_area_struct *);
>>>>>      void hugetlb_add_anon_rmap(struct folio *, struct
>>>>> vm_area_struct *,
>>>>>            unsigned long address, rmap_t flags);
>>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>>> index 93c0c25433d0..4c89cb1cb73e
>>>>> --- a/mm/memory.c
>>>>> +++ b/mm/memory.c
>>>>> @@ -3428,6 +3428,14 @@ static vm_fault_t wp_page_copy(struct vm_fault
>>>>> *vmf)
>>>>>                 * old page will be flushed before it can be reused.
>>>>>                 */
>>>>>                folio_remove_rmap_pte(old_folio, vmf->page, vma);
>>>>> +
>>>>> +            /*
>>>>> +             * If the new_folio's anon_vma is different from the
>>>>> +             * old_folio's anon_vma, the avc binding relationship
>>>>> +             * between vma and the old_folio's anon_vma is removed,
>>>>> +             * avoiding rmap redundant overhead.
>>>>> +             */
>>>>> +            folio_remove_anon_avc(old_folio, vma);
>>>>
>>>> ... by increasing write fault latency, introducing an RMAP walk
>>>> (!)? Hmm?
>>>>
>>>> On the reuse path, we do a folio_move_anon_rmap(), to optimize that.
>>>>
>>> Thanks for your comments. This may not be a good fixup patch. On the
>>> reuse path, folio_move_anon_rmap() seems to apply only to exclusive or
>>> _refcount = 1 folios. The fork() path seems to clear the exclusive flag
>>> in copy_page_range() --> ... --> __folio_try_dup_anon_rmap(). However,
>>> I observed lots of orphan avcs in wp_page_copy() in the debug trace
>>> logs above. After discussing with Mika, though, it seems they may not
>>> be removable.
>>
>> Was this patch ever tested? I cannot even boot a simple VM without an
>> endless stream of
>>
>> [    5.804598] ------------[ cut here ]------------
>> [    5.805494] WARNING: CPU: 11 PID: 595 at mm/rmap.c:443
>> unlink_anon_vmas+0x19b/0x1d0
>> [    5.806962] Modules linked in: qemu_fw_cfg
>> [    5.807762] CPU: 11 UID: 0 PID: 595 Comm: dracut-rootfs-g Tainted:
>> G        W          6.11.0-rc4+ #72
>> [    5.809546] Tainted: [W]=WARN
>> [    5.810127] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
>> BIOS 1.16.3-2.fc40 04/01/2014
>> [    5.811753] RIP: 0010:unlink_anon_vmas+0x19b/0x1d0
>> [    5.812680] Code: b0 00 00 00 00 75 1f f0 ff 8f a0 00 00 00 75 a2
>> e8 8a fd ff ff eb 9b 5b 5d 41 5c 41 5d 41 5e 41 5f e9 d4 82 d0 00 0f
>> 0b eb dd <0f> 0b eb cf 0f 0b 48 83 c7 08 e8 16 40 d7 ff e9 ea fe ff
>> ff 48 8b
>> [    5.816247] RSP: 0018:ffffa19f43bb78d0 EFLAGS: 00010286
>> [    5.817258] RAX: ffff8a71c1bdd2d0 RBX: ffff8a71c1bdd2c0 RCX:
>> ffff8a71c27a86c8
>> [    5.818624] RDX: 0000000000000001 RSI: ffff8a71c2771b28 RDI:
>> ffff8a71c27a9e60
>> [    5.820011] RBP: dead000000000122 R08: 0000000000000000 R09:
>> 0000000000000001
>> [    5.821380] R10: 0000000000000200 R11: 0000000000000001 R12:
>> ffff8a71c2771b28
>> [    5.822748] R13: dead000000000100 R14: ffff8a71c2771b18 R15:
>> ffff8a71c27a9e60
>> [    5.824122] FS:  0000000000000000(0000) GS:ffff8a7337980000(0000)
>> knlGS:0000000000000000
>> [    5.825665] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [    5.826775] CR2: 00007fca7f70ac58 CR3: 00000001027b2004 CR4:
>> 0000000000770ef0
>> [    5.828146] PKRU: 55555554
>> [    5.828686] Call Trace:
>> [    5.829169]  <TASK>
>> [    5.829594]  ? __warn.cold+0xb1/0x13e
>> [    5.830312]  ? unlink_anon_vmas+0x19b/0x1d0
>> [    5.831118]  ? report_bug+0xff/0x140
>> [    5.831840]  ? handle_bug+0x3c/0x80
>> [    5.832524]  ? exc_invalid_op+0x17/0x70
>> [    5.833262]  ? asm_exc_invalid_op+0x1a/0x20
>> [    5.834086]  ? unlink_anon_vmas+0x19b/0x1d0
>> [    5.834908]  free_pgtables+0x130/0x290
>> [    5.835661]  exit_mmap+0x19a/0x460
>> [    5.836351]  __mmput+0x4b/0x120
>> [    5.836965]  do_exit+0x2e1/0xac0
>> [    5.837601]  ? lock_release+0xd5/0x2c0
>> [    5.838343]  do_group_exit+0x36/0xa0
>> [    5.839035]  __x64_sys_exit_group+0x18/0x20
>> [    5.839866]  x64_sys_call+0x14b4/0x14c0
> I tested it on an arm64 machine and saw no crashes. You may try the
> attached modification provided by Lorenzo Stoakes. Could you please
> check whether there is any room for further improvement?


This patch is still wrong in its main logic afaics: you cannot remove
the avc, because the child's non-CoWed folios could then no longer be
reached by an rmap walk.
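
To illustrate the point, here is a toy userspace sketch. It is not kernel
code: all toy_* names and structures are made up for illustration, and the
flat array only stands in for the real avc linkage. It assumes a child vma
that, after fork(), shares two folios with the parent and then CoWs only
one of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_VMAS  4
#define TOY_NR_FOLIOS 2

/* Hypothetical stand-in for a vma: records which folios its PTEs map. */
struct toy_vma {
	bool maps[TOY_NR_FOLIOS];
};

/* Hypothetical stand-in for an anon_vma: linked[] plays the role of the avcs. */
struct toy_anon_vma {
	struct toy_vma *linked[TOY_MAX_VMAS];
	size_t nr;
};

static void toy_avc_link(struct toy_anon_vma *av, struct toy_vma *vma)
{
	assert(av->nr < TOY_MAX_VMAS);
	av->linked[av->nr++] = vma;
}

static void toy_avc_unlink(struct toy_anon_vma *av, struct toy_vma *vma)
{
	for (size_t i = 0; i < av->nr; i++) {
		if (av->linked[i] == vma) {
			av->linked[i] = av->linked[--av->nr];
			return;
		}
	}
}

/* Toy rmap walk: count the mappings of folio idx reachable through av. */
static int toy_rmap_walk(const struct toy_anon_vma *av, int idx)
{
	int found = 0;

	for (size_t i = 0; i < av->nr; i++)
		if (av->linked[i]->maps[idx])
			found++;
	return found;
}

/* Returns how many mappings of the still-shared folio 1 the walk misses. */
static int toy_missed_after_unlink(void)
{
	struct toy_vma parent = { .maps = { true, true } };
	struct toy_vma child = { .maps = { true, true } }; /* state after fork() */
	struct toy_anon_vma parent_av = { .nr = 0 };

	toy_avc_link(&parent_av, &parent);
	toy_avc_link(&parent_av, &child);

	/* Child write-faults on folio 0 and now maps a private copy instead. */
	child.maps[0] = false;
	/* The step under discussion: drop the child's link to parent_av. */
	toy_avc_unlink(&parent_av, &child);

	/* Folio 1 is still mapped by both vmas; compare with what the walk sees. */
	int real = (int)parent.maps[1] + (int)child.maps[1];

	return real - toy_rmap_walk(&parent_av, 1);
}

#ifdef TOY_DEMO_MAIN
int main(void)
{
	/* Exit status 0 if exactly one mapping was missed by the walk. */
	return toy_missed_after_unlink() == 1 ? 0 : 1;
}
#endif
```

In this model the child still maps folio 1, which belongs to the parent's
anon_vma, but the walk starting from that anon_vma no longer visits the
child. Build with -DTOY_DEMO_MAIN to run it standalone.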


>>
>>
>> Andrew, please remove this from mm-unstable.
>
> Thanks
> Zhiguo


Thanks,

Mika



