linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Zhaoyang Huang <huangzhaoyang@gmail.com>
Cc: Hyesoo Yu <hyesoo.yu@samsung.com>,
	jaewon31.kim@samsung.com, John Hubbard <jhubbard@nvidia.com>,
	"zhaoyang.huang@unisoc.com" <zhaoyang.huang@unisoc.com>,
	"surenb@google.com" <surenb@google.com>,
	"Steve.Kang@unisoc.com" <Steve.Kang@unisoc.com>,
	Jaewon Kim <jaewon31.kim@gmail.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Jang-Hyuck Kim <janghyuck.kim@samsung.com>
Subject: Re: reply: [RFC] pin_user_pages_fast failure count increased
Date: Wed, 28 May 2025 14:57:25 +0200	[thread overview]
Message-ID: <c41e3e9a-f033-4a47-b9be-bdd1016a111f@redhat.com> (raw)
In-Reply-To: <CAGWkznHfrcuQOWzVrecBZCate3xRah9QM=BEKdf6_ntzqaasdQ@mail.gmail.com>

On 28.05.25 12:59, Zhaoyang Huang wrote:
> On Wed, May 28, 2025 at 3:55 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 28.05.25 05:36, Hyesoo Yu wrote:
>>> On Wed, May 28, 2025 at 10:49:36AM +0800, Zhaoyang Huang wrote:
>>>> On Wed, May 28, 2025 at 9:25 AM Hyesoo Yu <hyesoo.yu@samsung.com> wrote:
>>>>>
>>>>> On Mon, May 26, 2025 at 07:49:57PM +0800, Zhaoyang Huang wrote:
>>>>>
>>>>> Hello, Zhaoyang.
>>>>>
>>>>> I don't believe commit 1aaf8c was just intended to prevent an infinite loop.
>>>>> The commit was introduced to allow pinning CMA memory in the pKVM on AOSP.
>>>>>
>>>>> That leads me to question whether the assumption that CMA can be long-term pinned is actually valid.
>>>> That depends on the user of CMA, yes for my scenario since it worked
>>>> for the guest os. For common scenario such as the file/anon mapping,
>>>> the page will be judged as unpinnable for long-term and be migrated
>>>> out of CMA area.
>>>
>>> Your scenario and the common scenarios can not be distinguished from the kernel API's perspective.
>>> Even in common cases, the page may be in a non-LRU state temporiarily, and in such situations,
>>> pinning CMA can lead to bugs - we've encountered multiple issues because of this.
>>>
>>
>> Right. We just disallow long-term pinning CMA pages, because we don't
>> know who the real owner is that would be okay with long-term pinning them.
>>
>>>>>
>>>>> In my opinion, it might be more appropriate to revert that commit 1aaf8c and instead ensure
>>>>> that pKVM avoids using CMA for memory that requires long-term pinning through GUP ?
>>>> It is not a pkvm issue but a defect of applying FOLL_LONGTERM over
>>>> non-LRU CMA pages.
>>>
>>> In include/linux/mm_types.h, the CMA should be migrated when FOLL_LONGTERM.
>>>
>>> * In the CMA case: long term pins in a CMA region would unnecessarily fragment
>>> * that region.  And so, CMA attempts to migrate the page before pinning, when
>>> * FOLL_LONGTERM is specified.
>>>
>>> Given this, would it make sense to avoid using FOLL_LONGTERM in this code path ?
>>
>> If something is unbounded in time, FOLL_LONGTERM is the right thing to use.
>>
>>>>>
>>>>> Alternatively, instead of changing the current logic that prevents longterm GUP from pinning CMA,
>>>>> it would be better to propose a new patch that specifically addresses the pKVM scenario like adding new FOLL_flags ?
>>>> I don't think so. pin_user_pages is an exported API which can't make
>>>> assumptions over the caller.
>>>
>>> My point is not to base the patch on assumptions about the caller,
>>> but to define a clear mechanism that ensures safe behavior in the intended scenario.
>>>
>>> For example, you can add FOLL_NO_MIGRATION and skip to migrate unpinnable pages.
>>
>> Not sure which exact semantics you have in mind. But failing if we would
>> have to migrate might be ok. Not sure if the caller should worry about
>> that, though: the caller should not have to worry about page placement
>> in general.
> With going over the whole thread, I think the root cause is
> collect_longterm_unpinnable_folios() hit the race window between
> lru_add_drain_all() and folio_isolate_lru() by chance and returned
> with ret=0 which finally have the CMA page pinned, right? However, I
> find the proposed patch below will fail the PKVM
> scenario(FOLL_LONGTERM set with non-LRU CMA pages) again as the CMA
> pages never go to LRU which will have the __gup_longterm_locked loop
> in do while(ret == -EAGAIN) as it did before 1aaf8c.

For upstream that is the exact right thing to do.

CMA pages must not get longterm pinned.

-- 
Cheers,

David / dhildenb



  reply	other threads:[~2025-05-28 12:57 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CAJrd-UtDD50iN=Yxz4=6kNkAcNAtRFkxhKAbEYiRyyDT-bYPHg@mail.gmail.com>
2025-05-22 10:18 ` reply: [RFC] pin_user_pages_fast failure count increased 黄朝阳 (Zhaoyang Huang)
2025-05-22 12:22   ` David Hildenbrand
     [not found]     ` <CGME20250522130101epcas1p435244c12cfc9bb7895008b8ea98af064@epcms1p3>
2025-05-22 13:09       ` Jaewon Kim
2025-05-22 14:06         ` David Hildenbrand
     [not found]         ` <CGME20250522130101epcas1p435244c12cfc9bb7895008b8ea98af064@epcms1p2>
2025-05-22 14:44           ` 김재원
2025-05-22 15:07             ` David Hildenbrand
2025-05-23  2:48               ` John Hubbard
2025-05-23  2:37           ` 김재원
2025-05-23  2:52             ` John Hubbard
2025-05-26  7:48               ` Hyesoo Yu
2025-05-26  8:05                 ` Zhaoyang Huang
2025-05-26  9:33                   ` Hyesoo Yu
2025-05-26  9:38                     ` David Hildenbrand
     [not found]                     ` <CGME20250522130101epcas1p435244c12cfc9bb7895008b8ea98af064@epcms1p8>
2025-05-26 11:17                       ` Jaewon Kim
2025-05-26 11:49                         ` Zhaoyang Huang
2025-05-28  1:23                           ` Hyesoo Yu
2025-05-28  2:49                             ` Zhaoyang Huang
2025-05-28  3:36                               ` Hyesoo Yu
2025-05-28  7:55                                 ` David Hildenbrand
2025-05-28 10:59                                   ` Zhaoyang Huang
2025-05-28 12:57                                     ` David Hildenbrand [this message]
2025-06-03 13:12                                     ` David Hildenbrand
2025-06-04  1:04                                       ` Zhaoyang Huang
2025-06-04  9:12                                         ` David Hildenbrand
2025-06-04  9:41                                           ` Zhaoyang Huang
2025-06-04  9:48                                             ` David Hildenbrand
     [not found]                                               ` <CGME20250604095542epcas2p3f3d2d6fc17115547981a7173215a09d1@epcas2p3.samsung.com>
2025-06-04  9:53                                                 ` Hyesoo Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c41e3e9a-f033-4a47-b9be-bdd1016a111f@redhat.com \
    --to=david@redhat.com \
    --cc=Steve.Kang@unisoc.com \
    --cc=huangzhaoyang@gmail.com \
    --cc=hyesoo.yu@samsung.com \
    --cc=jaewon31.kim@gmail.com \
    --cc=jaewon31.kim@samsung.com \
    --cc=janghyuck.kim@samsung.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-mm@kvack.org \
    --cc=surenb@google.com \
    --cc=zhaoyang.huang@unisoc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).