All of lore.kernel.org
 help / color / mirror / Atom feed
From: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
To: David Hildenbrand <david@redhat.com>,
	Yang Shi <shy828301@gmail.com>, Peter Xu <peterx@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Jerome Marchand <jmarchan@redhat.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Hugh Dickins <hughd@google.com>, Jason Gunthorpe <jgg@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>
Subject: Re: [PATCH v1] mm/gup: adjust stale comment for RCU GUP-fast
Date: Mon, 5 Sep 2022 14:11:07 +0530	[thread overview]
Message-ID: <64faaca6-bbcf-bbfb-66c8-b8e80fcdc6bd@linux.ibm.com> (raw)
In-Reply-To: <8e1aaa19-0d28-3381-3ec6-920a474f5f3f@redhat.com>

On 9/5/22 2:08 PM, David Hildenbrand wrote:
> On 04.09.22 18:52, Aneesh Kumar K V wrote:
>> On 9/2/22 12:02 PM, David Hildenbrand wrote:
>>> On 01.09.22 20:35, Yang Shi wrote:
>>>> On Thu, Sep 1, 2022 at 11:07 AM Peter Xu <peterx@redhat.com> wrote:
>>>>>
>>>>> On Thu, Sep 01, 2022 at 10:50:48AM -0700, Yang Shi wrote:
>>>>>> Yeah, because THP collapse does copy the data before clearing pte. If
>>>>>> we want to remove pmdp_collapse_flush() by just clearing pmd, we
>>>>>> should clear *AND* flush pte before copying the data IIRC.
>>>>>
>>>>> Yes tlb flush is still needed.  IIUC the generic pmdp_collapse_flush() will
>>>>> still be working (with the pte level flushing there) but it should just
>>>>> start to work for all archs, so potentially we could drop the arch-specific
>>>>> pmdp_collapse_flush()s, mostly the ppc impl.
>>>>
>>>> I'm don't know why powperpc needs to have its specific
>>>> pmdp_collapse_flush() in the first place, not only the mandatory IPI
>>>> broadcast, but also the specific implementation of pmd tlb flush. But
>>>> anyway the IPI broadcast could be removed at least IMO.
>>>>
>>>
>>> pmdp_collapse_flush() is overwritten on book3s only. It either translates
>>> to radix__pmdp_collapse_flush() or hash__pmdp_collapse_flush().
>>>
>>>
>>> radix__pmdp_collapse_flush() has a comment explaining the situation:
>>>
>>>
>>> +       /*
>>> +        * pmdp collapse_flush need to ensure that there are no parallel gup
>>> +        * walk after this call. This is needed so that we can have stable
>>> +        * page ref count when collapsing a page. We don't allow a collapse page
>>> +        * if we have gup taken on the page. We can ensure that by sending IPI
>>> +        * because gup walk happens with IRQ disabled.
>>> +        */
>>>
>>>
>>> The comment for hash__pmdp_collapse_flush() is a bit more involved:
>>>
>>>     /*
>>>      * Wait for all pending hash_page to finish. This is needed
>>>      * in case of subpage collapse. When we collapse normal pages
>>>      * to hugepage, we first clear the pmd, then invalidate all
>>>      * the PTE entries. The assumption here is that any low level
>>>      * page fault will see a none pmd and take the slow path that
>>>      * will wait on mmap_lock. But we could very well be in a
>>>      * hash_page with local ptep pointer value. Such a hash page
>>>      * can result in adding new HPTE entries for normal subpages.
>>>      * That means we could be modifying the page content as we
>>>      * copy them to a huge page. So wait for parallel hash_page
>>>      * to finish before invalidating HPTE entries. We can do this
>>>      * by sending an IPI to all the cpus and executing a dummy
>>>      * function there.
>>>      */
>>>
>>> I'm not sure if that implies that the IPI is needed for some other hash-magic.
>>>
>>> Maybe Aneesh can clarify.
>>>
>>
>> We still need the IPI for the hash. Another reason for architecture to override that
>> function is to help them use the right page size when flushing the TLB.
> 
> Thanks for clarifying. So the radix variant wouldn't need the IPI anymore, once GUP-fast is handled differently, correct?
> 

yes. With this patch https://lkml.kernel.org/r/20220901222707.477402-1-shy828301@gmail.com we can remove the
serialize_against_pte_lookup(vma->vm_mm); in radix__pmdp_collapse_flush()

-aneesh




  reply	other threads:[~2022-09-05  8:41 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-01  7:21 [PATCH v1] mm/gup: adjust stale comment for RCU GUP-fast David Hildenbrand
2022-09-01 14:37 ` Kirill A . Shutemov
2022-09-01 16:12 ` Jason Gunthorpe
2022-09-01 16:30   ` David Hildenbrand
2022-09-01 16:28 ` Peter Xu
2022-09-01 16:34   ` David Hildenbrand
2022-09-01 16:40     ` Peter Xu
2022-09-01 16:46       ` David Hildenbrand
2022-09-01 17:41         ` Peter Xu
2022-09-01 17:44           ` David Hildenbrand
2022-09-01 17:50         ` Yang Shi
2022-09-01 18:07           ` Peter Xu
2022-09-01 18:35             ` Yang Shi
2022-09-02  6:32               ` David Hildenbrand
2022-09-02 13:53                 ` Peter Xu
2022-09-02 15:37                 ` Yang Shi
2022-09-04 16:52                 ` Aneesh Kumar K V
2022-09-05  8:38                   ` David Hildenbrand
2022-09-05  8:41                     ` Aneesh Kumar K V [this message]
2022-09-04 16:49     ` Aneesh Kumar K V
2022-09-05  8:02       ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=64faaca6-bbcf-bbfb-66c8-b8e80fcdc6bd@linux.ibm.com \
    --to=aneesh.kumar@linux.ibm.com \
    --cc=aarcange@redhat.com \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=david@redhat.com \
    --cc=hughd@google.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=jmarchan@redhat.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=peterx@redhat.com \
    --cc=shy828301@gmail.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.