From: Usama Arif <usamaarif642@gmail.com>
To: Barry Song <baohua@kernel.org>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
ryan.roberts@arm.com, rppt@kernel.org, willy@infradead.org,
cerasuolodomenico@gmail.com, corbet@lwn.net,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
kernel-team@meta.com
Subject: Re: [PATCH v3 4/6] mm: Introduce a pageflag for partially mapped folios
Date: Wed, 14 Aug 2024 12:11:26 +0100 [thread overview]
Message-ID: <925804e4-0b33-45eb-905d-e00f67192828@gmail.com> (raw)
In-Reply-To: <CAGsJ_4x+5fiCoWv4G0NsYq+aJRqZsrCEHO_DF+CnNFdqx0VgMQ@mail.gmail.com>
On 14/08/2024 11:44, Barry Song wrote:
> On Wed, Aug 14, 2024 at 12:03 AM Usama Arif <usamaarif642@gmail.com> wrote:
>>
>> Currently folio->_deferred_list is used to keep track of
>> partially_mapped folios that are going to be split under memory
>> pressure. In the next patch, all THPs that are faulted in and collapsed
>> by khugepaged are also going to be tracked using _deferred_list.
>>
>> This patch introduces a pageflag to be able to distinguish between
>> partially mapped folios and others in the deferred_list at split time in
>> deferred_split_scan. Its needed as __folio_remove_rmap decrements
>> _mapcount, _large_mapcount and _entire_mapcount, hence it won't be
>> possible to distinguish between partially mapped folios and others in
>> deferred_split_scan.
>>
>> Eventhough it introduces an extra flag to track if the folio is
>> partially mapped, there is no functional change intended with this
>> patch and the flag is not useful in this patch itself, it will
>> become useful in the next patch when _deferred_list has non partially
>> mapped folios.
>>
>> Signed-off-by: Usama Arif <usamaarif642@gmail.com>
>> ---
>> include/linux/huge_mm.h | 4 ++--
>> include/linux/page-flags.h | 3 +++
>> mm/huge_memory.c | 21 +++++++++++++--------
>> mm/hugetlb.c | 1 +
>> mm/internal.h | 4 +++-
>> mm/memcontrol.c | 3 ++-
>> mm/migrate.c | 3 ++-
>> mm/page_alloc.c | 5 +++--
>> mm/rmap.c | 3 ++-
>> mm/vmscan.c | 3 ++-
>> 10 files changed, 33 insertions(+), 17 deletions(-)
>>
>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>> index 4c32058cacfe..969f11f360d2 100644
>> --- a/include/linux/huge_mm.h
>> +++ b/include/linux/huge_mm.h
>> @@ -321,7 +321,7 @@ static inline int split_huge_page(struct page *page)
>> {
>> return split_huge_page_to_list_to_order(page, NULL, 0);
>> }
>> -void deferred_split_folio(struct folio *folio);
>> +void deferred_split_folio(struct folio *folio, bool partially_mapped);
>>
>> void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
>> unsigned long address, bool freeze, struct folio *folio);
>> @@ -495,7 +495,7 @@ static inline int split_huge_page(struct page *page)
>> {
>> return 0;
>> }
>> -static inline void deferred_split_folio(struct folio *folio) {}
>> +static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
>> #define split_huge_pmd(__vma, __pmd, __address) \
>> do { } while (0)
>>
>> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
>> index a0a29bd092f8..cecc1bad7910 100644
>> --- a/include/linux/page-flags.h
>> +++ b/include/linux/page-flags.h
>> @@ -182,6 +182,7 @@ enum pageflags {
>> /* At least one page in this folio has the hwpoison flag set */
>> PG_has_hwpoisoned = PG_active,
>> PG_large_rmappable = PG_workingset, /* anon or file-backed */
>> + PG_partially_mapped, /* was identified to be partially mapped */
>> };
>>
>> #define PAGEFLAGS_MASK ((1UL << NR_PAGEFLAGS) - 1)
>> @@ -861,8 +862,10 @@ static inline void ClearPageCompound(struct page *page)
>> ClearPageHead(page);
>> }
>> FOLIO_FLAG(large_rmappable, FOLIO_SECOND_PAGE)
>> +FOLIO_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
>> #else
>> FOLIO_FLAG_FALSE(large_rmappable)
>> +FOLIO_FLAG_FALSE(partially_mapped)
>> #endif
>>
>> #define PG_head_mask ((1UL << PG_head))
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 6df0e9f4f56c..c024ab0f745c 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -3397,6 +3397,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>> * page_deferred_list.
>> */
>> list_del_init(&folio->_deferred_list);
>> + folio_clear_partially_mapped(folio);
>> }
>> spin_unlock(&ds_queue->split_queue_lock);
>> if (mapping) {
>> @@ -3453,11 +3454,12 @@ void __folio_undo_large_rmappable(struct folio *folio)
>> if (!list_empty(&folio->_deferred_list)) {
>> ds_queue->split_queue_len--;
>> list_del_init(&folio->_deferred_list);
>> + folio_clear_partially_mapped(folio);
>> }
>> spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>> }
>>
>> -void deferred_split_folio(struct folio *folio)
>> +void deferred_split_folio(struct folio *folio, bool partially_mapped)
>> {
>> struct deferred_split *ds_queue = get_deferred_split_queue(folio);
>> #ifdef CONFIG_MEMCG
>> @@ -3485,14 +3487,17 @@ void deferred_split_folio(struct folio *folio)
>> if (folio_test_swapcache(folio))
>> return;
>>
>> - if (!list_empty(&folio->_deferred_list))
>> - return;
>> -
>> spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
>> + if (partially_mapped)
>> + folio_set_partially_mapped(folio);
>> + else
>> + folio_clear_partially_mapped(folio);
>
> Hi Usama,
>
> Do we need this? When can a partially_mapped folio on deferred_list become
> non-partially-mapped and need a clear? I understand transferring from
> entirely_map
> to partially_mapped is a one way process? partially_mapped folios can't go back
> to entirely_mapped?
>
Hi Barry,
deferred_split_folio function is called in 3 places after this series, at fault, collapse and partial mapping. partial mapping can only happen after fault/collapse, and we have FOLIO_FLAG_FALSE(partially_mapped), i.e. flag initialized to false, so technically its not needed. A partially_mapped folio on deferred list wont become non-partially mapped.
I just did it as a precaution if someone ever changes the kernel and calls deferred_split_folio with partially_mapped set to false after it had been true. The function arguments of deferred_split_folio make it seem that passing partially_mapped=false as an argument would clear it, which is why I cleared it as well. I could change the patch to something like below if it makes things better? i.e. add a comment at the top of the function:
-void deferred_split_folio(struct folio *folio)
+/* partially_mapped=false won't clear PG_partially_mapped folio flag */
+void deferred_split_folio(struct folio *folio, bool partially_mapped)
{
struct deferred_split *ds_queue = get_deferred_split_queue(folio);
#ifdef CONFIG_MEMCG
@@ -3485,14 +3488,15 @@ void deferred_split_folio(struct folio *folio)
if (folio_test_swapcache(folio))
return;
- if (!list_empty(&folio->_deferred_list))
- return;
-
spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
+ if (partially_mapped)
+ folio_set_partially_mapped(folio);
if (list_empty(&folio->_deferred_list)) {
- if (folio_test_pmd_mappable(folio))
- count_vm_event(THP_DEFERRED_SPLIT_PAGE);
- count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+ if (partially_mapped) {
+ if (folio_test_pmd_mappable(folio))
+ count_vm_event(THP_DEFERRED_SPLIT_PAGE);
+ count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+ }
list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
ds_queue->split_queue_len++;
#ifdef CONFIG_MEMCG
> I am trying to rebase my NR_SPLIT_DEFERRED counter on top of your
> work, but this "clear" makes that job quite tricky. as I am not sure
> if this patch
> is going to clear the partially_mapped flag for folios on deferred_list.
>
> Otherwise, I can simply do the below whenever folio is leaving deferred_list
>
> ds_queue->split_queue_len--;
> list_del_init(&folio->_deferred_list);
> if (folio_test_clear_partially_mapped(folio))
> mod_mthp_stat(folio_order(folio),
> MTHP_STAT_NR_SPLIT_DEFERRED, -1);
>
next prev parent reply other threads:[~2024-08-14 11:11 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-13 12:02 [PATCH v3 0/6] mm: split underutilized THPs Usama Arif
2024-08-13 12:02 ` [PATCH v3 1/6] mm: free zapped tail pages when splitting isolated thp Usama Arif
2024-08-15 18:47 ` Kairui Song
2024-08-15 19:16 ` Usama Arif
2024-08-16 16:55 ` Kairui Song
2024-08-16 17:02 ` Usama Arif
2024-08-16 18:11 ` Kairui Song
2024-08-13 12:02 ` [PATCH v3 2/6] mm: remap unused subpages to shared zeropage " Usama Arif
2024-08-13 12:02 ` [PATCH v3 3/6] mm: selftest to verify zero-filled pages are mapped to zeropage Usama Arif
2024-08-13 12:02 ` [PATCH v3 4/6] mm: Introduce a pageflag for partially mapped folios Usama Arif
2024-08-14 3:30 ` Yu Zhao
2024-08-14 10:20 ` Usama Arif
2024-08-14 10:44 ` Barry Song
2024-08-14 10:52 ` Barry Song
2024-08-14 11:11 ` Usama Arif [this message]
2024-08-14 11:20 ` Barry Song
2024-08-14 11:26 ` Barry Song
2024-08-14 11:30 ` Usama Arif
2024-08-14 11:10 ` Barry Song
2024-08-14 11:20 ` Usama Arif
2024-08-14 11:23 ` Barry Song
2024-08-14 12:36 ` Usama Arif
2024-08-14 23:05 ` Barry Song
2024-08-15 15:25 ` Usama Arif
2024-08-15 23:30 ` Andrew Morton
2024-08-16 2:50 ` Yu Zhao
2024-08-15 16:33 ` David Hildenbrand
2024-08-15 17:10 ` Usama Arif
2024-08-15 21:06 ` Barry Song
2024-08-15 21:08 ` David Hildenbrand
2024-08-16 15:44 ` Matthew Wilcox
2024-08-16 16:08 ` Usama Arif
2024-08-16 16:28 ` Matthew Wilcox
2024-08-16 16:41 ` Usama Arif
2024-08-13 12:02 ` [PATCH v3 5/6] mm: split underutilized THPs Usama Arif
2024-08-13 12:02 ` [PATCH v3 6/6] mm: add sysfs entry to disable splitting " Usama Arif
2024-08-13 17:22 ` [PATCH v3 0/6] mm: split " Andi Kleen
2024-08-14 10:13 ` Usama Arif
2024-08-18 5:13 ` Hugh Dickins
2024-08-18 7:45 ` David Hildenbrand
2024-08-19 2:38 ` Usama Arif
2024-08-19 2:36 ` Usama Arif
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=925804e4-0b33-45eb-905d-e00f67192828@gmail.com \
--to=usamaarif642@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=cerasuolodomenico@gmail.com \
--cc=corbet@lwn.net \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@meta.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=riel@surriel.com \
--cc=roman.gushchin@linux.dev \
--cc=rppt@kernel.org \
--cc=ryan.roberts@arm.com \
--cc=shakeel.butt@linux.dev \
--cc=willy@infradead.org \
--cc=yuzhao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.