[PATCH v2] mm/folio: Avoid special handling for order value 0 in folio_set_order

From: Tarun Sahu <tsahu@linux.ibm.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, muchun.song@linux.dev,
	mike.kravetz@oracle.com, aneesh.kumar@linux.ibm.com,
	willy@infradead.org, sidhartha.kumar@oracle.com,
	gerald.schaefer@linux.ibm.com, linux-kernel@vger.kernel.org,
	jaypatel@linux.ibm.com, tsahu@linux.ibm.com
Subject: [PATCH v2] mm/folio: Avoid special handling for order value 0 in folio_set_order
Date: Mon, 15 May 2023 22:38:09 +0530
Message-ID: <20230515170809.284680-1-tsahu@linux.ibm.com>

folio_set_order(folio, 0) is called in two places in the kernel:
__destroy_compound_gigantic_folio() and __prep_compound_gigantic_folio().
In both, it is called to clear folio->_folio_nr_pages and
folio->_folio_order.

For __destroy_compound_gigantic_folio():
In the past, folio_set_order(folio, 0) was needed because page->mapping used
to overlap with _folio_nr_pages and _folio_order. If these fields were
left uncleared while freeing gigantic hugepages, they caused
"BUG: bad page state" due to a non-zero page->mapping. Now, after
commit a01f43901cfb ("hugetlb: be sure to free demoted CMA pages to
CMA"), page->mapping is explicitly cleared for tail pages. Also,
_folio_order and _folio_nr_pages no longer overlap with page->mapping.

struct page {
...
   struct address_space * mapping;  /* 24     8 */
...
}

struct folio {
...
    union {
        struct {
        	long unsigned int _flags_1;      /* 64    8 */
        	long unsigned int _head_1;       /* 72    8 */
        	unsigned char _folio_dtor;       /* 80    1 */
        	unsigned char _folio_order;      /* 81    1 */

        	/* XXX 2 bytes hole, try to pack */

        	atomic_t   _entire_mapcount;     /* 84    4 */
        	atomic_t   _nr_pages_mapped;     /* 88    4 */
        	atomic_t   _pincount;            /* 92    4 */
        	unsigned int _folio_nr_pages;    /* 96    4 */
        };                                       /* 64   40 */
        struct page __page_1 __attribute__((__aligned__(8))); /* 64   64 */
    }
...
}

So, folio_set_order(folio, 0) can be removed from the gigantic folio
freeing path (__destroy_compound_gigantic_folio()).

The other call to folio_set_order(folio, 0) is in the error path of
__prep_compound_gigantic_folio(). It can also be removed if
folio_set_order(folio, order) is moved after the for loop.

The patch also moves the __folio_set_head() call in
__prep_compound_gigantic_folio() after the for loop, so that neither the
head flag nor the order needs to be cleared in the error path.

Also, as Mike pointed out:
"It would actually be better to move the calls _folio_set_head and
folio_set_order in __prep_compound_gigantic_folio() as suggested here. Why?
In the current code, the ref count on the 'head page' is still 1 (or more)
while those calls are made. So, someone could take a speculative ref on the
page BEFORE the tail pages are set up."

This way, folio_set_order(folio, 0) is no longer needed. It also helps
remove the confusion of the folio order being set to 0 (as the _folio_order
field is part of the first tail page).

Testing: I have run the LTP tests, which all pass. I have also written
an LTP test that checks for the bug caused by compound_nr and page->mapping
overlapping.

https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/hugetlb/hugemmap/hugemmap32.c

On an older kernel (< 5.10-rc7) with the above bug, this test fails, while
on a newer kernel, and also with this patch applied, it passes.

Signed-off-by: Tarun Sahu <tsahu@linux.ibm.com>
---
 mm/hugetlb.c  | 9 +++------
 mm/internal.h | 8 ++------
 2 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f154019e6b84..607553445855 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1489,7 +1489,6 @@ static void __destroy_compound_gigantic_folio(struct folio *folio,
 			set_page_refcounted(p);
 	}

-	folio_set_order(folio, 0);
 	__folio_clear_head(folio);
 }

@@ -1951,9 +1950,6 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
 	struct page *p;

 	__folio_clear_reserved(folio);
-	__folio_set_head(folio);
-	/* we rely on prep_new_hugetlb_folio to set the destructor */
-	folio_set_order(folio, order);
 	for (i = 0; i < nr_pages; i++) {
 		p = folio_page(folio, i);

@@ -1999,6 +1995,9 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
 		if (i != 0)
 			set_compound_head(p, &folio->page);
 	}
+	__folio_set_head(folio);
+	/* we rely on prep_new_hugetlb_folio to set the destructor */
+	folio_set_order(folio, order);
 	atomic_set(&folio->_entire_mapcount, -1);
 	atomic_set(&folio->_nr_pages_mapped, 0);
 	atomic_set(&folio->_pincount, 0);
@@ -2017,8 +2016,6 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
 		p = folio_page(folio, j);
 		__ClearPageReserved(p);
 	}
-	folio_set_order(folio, 0);
-	__folio_clear_head(folio);
 	return false;
 }

diff --git a/mm/internal.h b/mm/internal.h
index 68410c6d97ac..c59fe08c5b39 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -425,16 +425,12 @@ int split_free_page(struct page *free_page,
  */
 static inline void folio_set_order(struct folio *folio, unsigned int order)
 {
-	if (WARN_ON_ONCE(!folio_test_large(folio)))
+	if (WARN_ON_ONCE(!order || !folio_test_large(folio)))
 		return;

 	folio->_folio_order = order;
 #ifdef CONFIG_64BIT
-	/*
-	 * When hugetlb dissolves a folio, we need to clear the tail
-	 * page, rather than setting nr_pages to 1.
-	 */
-	folio->_folio_nr_pages = order ? 1U << order : 0;
+	folio->_folio_nr_pages = 1U << order;
 #endif
 }

-- 
2.31.1
