From: Muchun Song <muchun.song@linux.dev>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Linux-MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
Muchun Song <songmuchun@bytedance.com>,
Joao Martins <joao.m.martins@oracle.com>,
Oscar Salvador <osalvador@suse.de>,
David Hildenbrand <david@redhat.com>,
Miaohe Lin <linmiaohe@huawei.com>,
David Rientjes <rientjes@google.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Naoya Horiguchi <naoya.horiguchi@linux.dev>,
Michal Hocko <mhocko@suse.com>,
Matthew Wilcox <willy@infradead.org>,
Xiongchun Duan <duanxiongchun@bytedance.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 09/12] hugetlb_vmemmap: Optimistically set Optimized flag
Date: Thu, 31 Aug 2023 11:27:52 +0800 [thread overview]
Message-ID: <7D01FB21-5182-428D-BCD8-89C679ABEEC8@linux.dev> (raw)
In-Reply-To: <20230830224706.GC55006@monkey>
> On Aug 31, 2023, at 06:47, Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 08/30/23 15:26, Muchun Song wrote:
>>
>>
>> On 2023/8/26 03:04, Mike Kravetz wrote:
>>> At the beginning of hugetlb_vmemmap_optimize, optimistically set
>>> the HPageVmemmapOptimized flag in the head page. Clear the flag
>>> if the operation fails.
>>>
>>> No change in behavior. However, this will become important in
>>> subsequent patches where we batch delay TLB flushing. We need to
>>> make sure the content in the old and new vmemmap pages are the same.
>>
>> Sorry, I didn't get the point here. Could you elaborate it?
>>
>
> Sorry, this really could use a better explanation.
>
>>>
>>> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
>>> ---
>>> mm/hugetlb_vmemmap.c | 8 +++++---
>>> 1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index e390170c0887..500a118915ff 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -566,7 +566,9 @@ static void __hugetlb_vmemmap_optimize(const struct hstate *h,
>>> if (!vmemmap_should_optimize(h, head))
>>> return;
>>> + /* Optimistically assume success */
>>> static_branch_inc(&hugetlb_optimize_vmemmap_key);
>>> + SetHPageVmemmapOptimized(head);
>>> vmemmap_end = vmemmap_start + hugetlb_vmemmap_size(h);
>>> vmemmap_reuse = vmemmap_start;
>>> @@ -577,10 +579,10 @@ static void __hugetlb_vmemmap_optimize(const struct hstate *h,
>>> * to the page which @vmemmap_reuse is mapped to, then free the pages
>>> * which the range [@vmemmap_start, @vmemmap_end] is mapped to.
>>> */
>>> - if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, bulk_pages))
>>> + if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, bulk_pages)) {
>>> static_branch_dec(&hugetlb_optimize_vmemmap_key);
>>> - else
>>> - SetHPageVmemmapOptimized(head);
>>> + ClearHPageVmemmapOptimized(head);
>>> + }
>
> Consider the case where we have successfully remapped vmemmap AND
> - we have replaced the page table page (pte page) containing the struct
> page of the hugetlb head page. Joao's commit 11aad2631bf7
> 'mm/hugetlb_vmemmap: remap head page to newly allocated page'.
> - we have NOT flushed the TLB after remapping due to batching the
> operations before flush.
>
> In this case, it is possible that the old head page is still in the TLB
> and caches and SetHPageVmemmapOptimized(head) will actually set the flag
> in the old pte page. We then have an optimized hugetlb page without the
> HPageVmemmapOptimized flag set. When developing this series, we
> experienced various BUGs as a result of this situation.
Now, I got it. Thanks for your elaboration.
>
> In the case of an error during optimization, we do a TLB flush so if
> we need to clear the flag we will write to the correct pte page.
Right.
>
> Hope that makes sense.
>
> I add an explanation like this to the commit message and perhaps put
> this closer to/or squash with the patch that batches operations before
> flushing TLB.
Yes. But I'd also like to add a big comment to explain what's going on here
instead of a simple "Optimistically assume success". This one really makes
me think it is an optimization not a mandatory premise.
Thanks.
> --
> Mike Kravetz
>
>>> }
>>> /**
next prev parent reply other threads:[~2023-08-31 3:28 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-25 19:04 [PATCH 00/12] Batch hugetlb vmemmap modification operations Mike Kravetz
2023-08-25 19:04 ` [PATCH 01/12] hugetlb: clear flags in tail pages that will be freed individually Mike Kravetz
2023-08-25 19:04 ` [PATCH 02/12] hugetlb: Use a folio in free_hpage_workfn() Mike Kravetz
2023-08-25 19:04 ` [PATCH 03/12] hugetlb: Remove a few calls to page_folio() Mike Kravetz
2023-08-25 19:04 ` [PATCH 04/12] hugetlb: Convert remove_pool_huge_page() to remove_pool_hugetlb_folio() Mike Kravetz
2023-08-25 19:04 ` [PATCH 05/12] hugetlb: restructure pool allocations Mike Kravetz
2023-08-25 19:04 ` [PATCH 06/12] hugetlb: perform vmemmap optimization on a list of pages Mike Kravetz
2023-08-25 19:04 ` [PATCH 07/12] hugetlb: perform vmemmap restoration " Mike Kravetz
2023-08-26 6:58 ` kernel test robot
2023-08-30 8:33 ` Muchun Song
2023-08-30 17:53 ` Mike Kravetz
2023-08-25 19:04 ` [PATCH 08/12] hugetlb: batch freeing of vmemmap pages Mike Kravetz
2023-08-26 4:00 ` kernel test robot
2023-08-30 7:20 ` Muchun Song
2023-08-30 18:36 ` Mike Kravetz
2023-08-25 19:04 ` [PATCH 09/12] hugetlb_vmemmap: Optimistically set Optimized flag Mike Kravetz
2023-08-30 7:26 ` Muchun Song
2023-08-30 22:47 ` Mike Kravetz
2023-08-31 3:27 ` Muchun Song [this message]
2023-08-25 19:04 ` [PATCH 10/12] hugetlb: batch PMD split for bulk vmemmap dedup Mike Kravetz
2023-08-26 5:56 ` kernel test robot
2023-08-28 9:42 ` Joao Martins
2023-08-28 16:44 ` Mike Kravetz
2023-08-29 3:47 ` Muchun Song
2023-08-26 18:14 ` kernel test robot
2023-08-30 8:09 ` Muchun Song
2023-08-30 11:13 ` Joao Martins
2023-08-30 16:03 ` Joao Martins
2023-08-31 3:54 ` Muchun Song
2023-08-31 9:26 ` Joao Martins
2023-08-25 19:04 ` [PATCH 11/12] hugetlb: batch TLB flushes when freeing vmemmap Mike Kravetz
2023-08-30 8:23 ` Muchun Song
2023-08-30 11:17 ` Joao Martins
2023-08-25 19:04 ` [PATCH 12/12] hugetlb: batch TLB flushes when restoring vmemmap Mike Kravetz
2023-08-26 8:01 ` kernel test robot
2023-08-30 8:47 ` Muchun Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7D01FB21-5182-428D-BCD8-89C679ABEEC8@linux.dev \
--to=muchun.song@linux.dev \
--cc=akpm@linux-foundation.org \
--cc=anshuman.khandual@arm.com \
--cc=david@redhat.com \
--cc=duanxiongchun@bytedance.com \
--cc=joao.m.martins@oracle.com \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=mike.kravetz@oracle.com \
--cc=naoya.horiguchi@linux.dev \
--cc=osalvador@suse.de \
--cc=rientjes@google.com \
--cc=songmuchun@bytedance.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.