From: Andrew Morton <akpm@linux-foundation.org>
To: mm-commits@vger.kernel.org, ziy@nvidia.com, vbabka@suse.cz,
robin.murphy@arm.com, robh+dt@kernel.org, paulus@samba.org,
m.szyprowski@samsung.com, mst@redhat.com, mpe@ellerman.id.au,
minchan@kernel.org, iommu@lists.linux-foundation.org, hch@lst.de,
frowand.list@gmail.com, benh@kernel.crashing.org,
aneesh.kumar@linux.ibm.com, david@redhat.com,
akpm@linux-foundation.org
Subject: + mm-enforce-pageblock_order-max_order.patch added to -mm tree
Date: Mon, 14 Feb 2022 13:06:15 -0800
Message-ID: <20220214210616.19CCAC340E9@smtp.kernel.org>
The patch titled
Subject: mm: enforce pageblock_order < MAX_ORDER
has been added to the -mm tree. Its filename is
mm-enforce-pageblock_order-max_order.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-enforce-pageblock_order-max_order.patch
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-enforce-pageblock_order-max_order.patch
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: David Hildenbrand <david@redhat.com>
Subject: mm: enforce pageblock_order < MAX_ORDER
Some places in the kernel don't really expect pageblock_order >=
MAX_ORDER, and it looks like this is only possible in corner cases:
1) With CONFIG_DEFERRED_STRUCT_PAGE_INIT, we'll end up freeing
pageblock_order pages via __free_pages_core(), which cannot possibly work.
2) find_zone_movable_pfns_for_nodes() will roundup the ZONE_MOVABLE
start PFN to MAX_ORDER_NR_PAGES. Consequently with a bigger
pageblock_order, we could have a single pageblock partially managed by
two zones.
3) Compaction code runs into __fragmentation_index() with order
>= MAX_ORDER, triggering WARN_ON_ONCE(order >= MAX_ORDER). [1]
4) mm/page_reporting.c won't be reporting any pages with default
page_reporting_order == pageblock_order, as we'll be skipping the
reporting loop inside page_reporting_process_zone().
5) __rmqueue_fallback() will never be able to steal with
ALLOC_NOFRAGMENT.
pageblock_order >= MAX_ORDER is weird either way: it's a pure optimization
for making alloc_contig_range(), as used for allocation of gigantic pages,
a little more likely to succeed. However, if there is demand for
somewhat reliable allocation of gigantic pages, affected setups should be
using CMA or boot-time allocations instead.
So let's make sure that pageblock_order < MAX_ORDER and simplify.
[1] https://lkml.kernel.org/r/87r189a2ks.fsf@linux.ibm.com
Link: https://lkml.kernel.org/r/20220214174132.219303-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: John Garry via iommu <iommu@lists.linux-foundation.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/virtio/virtio_mem.c | 9 ++------
include/linux/cma.h | 3 --
include/linux/pageblock-flags.h | 7 ++++--
mm/Kconfig | 3 ++
mm/page_alloc.c | 32 +++++++-----------------------
5 files changed, 20 insertions(+), 34 deletions(-)
--- a/drivers/virtio/virtio_mem.c~mm-enforce-pageblock_order-max_order
+++ a/drivers/virtio/virtio_mem.c
@@ -2476,13 +2476,10 @@ static int virtio_mem_init_hotplug(struc
VIRTIO_MEM_DEFAULT_OFFLINE_THRESHOLD);
/*
- * We want subblocks to span at least MAX_ORDER_NR_PAGES and
- * pageblock_nr_pages pages. This:
- * - Is required for now for alloc_contig_range() to work reliably -
- * it doesn't properly handle smaller granularity on ZONE_NORMAL.
+ * TODO: once alloc_contig_range() works reliably with pageblock
+ * granularity on ZONE_NORMAL, use pageblock_nr_pages instead.
*/
- sb_size = max_t(uint64_t, MAX_ORDER_NR_PAGES,
- pageblock_nr_pages) * PAGE_SIZE;
+ sb_size = PAGE_SIZE * MAX_ORDER_NR_PAGES;
sb_size = max_t(uint64_t, vm->device_block_size, sb_size);
if (sb_size < memory_block_size_bytes() && !force_bbm) {
--- a/include/linux/cma.h~mm-enforce-pageblock_order-max_order
+++ a/include/linux/cma.h
@@ -25,8 +25,7 @@
* -- can deal with only some pageblocks of a higher-order page being
* MIGRATE_CMA, we can use pageblock_nr_pages.
*/
-#define CMA_MIN_ALIGNMENT_PAGES max_t(phys_addr_t, MAX_ORDER_NR_PAGES, \
- pageblock_nr_pages)
+#define CMA_MIN_ALIGNMENT_PAGES MAX_ORDER_NR_PAGES
#define CMA_MIN_ALIGNMENT_BYTES (PAGE_SIZE * CMA_MIN_ALIGNMENT_PAGES)
struct cma;
--- a/include/linux/pageblock-flags.h~mm-enforce-pageblock_order-max_order
+++ a/include/linux/pageblock-flags.h
@@ -37,8 +37,11 @@ extern unsigned int pageblock_order;
#else /* CONFIG_HUGETLB_PAGE_SIZE_VARIABLE */
-/* Huge pages are a constant size */
-#define pageblock_order HUGETLB_PAGE_ORDER
+/*
+ * Huge pages are a constant size, but don't exceed the maximum allocation
+ * granularity.
+ */
+#define pageblock_order min_t(unsigned int, HUGETLB_PAGE_ORDER, MAX_ORDER - 1)
#endif /* CONFIG_HUGETLB_PAGE_SIZE_VARIABLE */
--- a/mm/Kconfig~mm-enforce-pageblock_order-max_order
+++ a/mm/Kconfig
@@ -262,6 +262,9 @@ config HUGETLB_PAGE_SIZE_VARIABLE
HUGETLB_PAGE_ORDER when there are multiple HugeTLB page sizes available
on a platform.
+ Note that the pageblock_order cannot exceed MAX_ORDER - 1 and will be
+ clamped down to MAX_ORDER - 1.
+
config CONTIG_ALLOC
def_bool (MEMORY_ISOLATION && COMPACTION) || CMA
--- a/mm/page_alloc.c~mm-enforce-pageblock_order-max_order
+++ a/mm/page_alloc.c
@@ -1054,14 +1054,12 @@ static inline void __free_one_page(struc
int migratetype, fpi_t fpi_flags)
{
struct capture_control *capc = task_capc(zone);
+ unsigned int max_order = pageblock_order;
unsigned long buddy_pfn;
unsigned long combined_pfn;
- unsigned int max_order;
struct page *buddy;
bool to_tail;
- max_order = min_t(unsigned int, MAX_ORDER - 1, pageblock_order);
-
VM_BUG_ON(!zone_is_initialized(zone));
VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP, page);
@@ -2262,19 +2260,8 @@ void __init init_cma_reserved_pageblock(
} while (++p, --i);
set_pageblock_migratetype(page, MIGRATE_CMA);
-
- if (pageblock_order >= MAX_ORDER) {
- i = pageblock_nr_pages;
- p = page;
- do {
- set_page_refcounted(p);
- __free_pages(p, MAX_ORDER - 1);
- p += MAX_ORDER_NR_PAGES;
- } while (i -= MAX_ORDER_NR_PAGES);
- } else {
- set_page_refcounted(page);
- __free_pages(page, pageblock_order);
- }
+ set_page_refcounted(page);
+ __free_pages(page, pageblock_order);
adjust_managed_page_count(page, pageblock_nr_pages);
page_zone(page)->cma_pages += pageblock_nr_pages;
@@ -7429,16 +7416,15 @@ static inline void setup_usemap(struct z
/* Initialise the number of pages represented by NR_PAGEBLOCK_BITS */
void __init set_pageblock_order(void)
{
- unsigned int order;
+ unsigned int order = MAX_ORDER - 1;
/* Check that pageblock_nr_pages has not already been setup */
if (pageblock_order)
return;
- if (HPAGE_SHIFT > PAGE_SHIFT)
+ /* Don't let pageblocks exceed the maximum allocation granularity. */
+ if (HPAGE_SHIFT > PAGE_SHIFT && HUGETLB_PAGE_ORDER < order)
order = HUGETLB_PAGE_ORDER;
- else
- order = MAX_ORDER - 1;
/*
* Assume the largest contiguous order of interest is a huge page.
@@ -9026,14 +9012,12 @@ struct page *has_unmovable_pages(struct
#ifdef CONFIG_CONTIG_ALLOC
static unsigned long pfn_max_align_down(unsigned long pfn)
{
- return pfn & ~(max_t(unsigned long, MAX_ORDER_NR_PAGES,
- pageblock_nr_pages) - 1);
+ return ALIGN_DOWN(pfn, MAX_ORDER_NR_PAGES);
}
static unsigned long pfn_max_align_up(unsigned long pfn)
{
- return ALIGN(pfn, max_t(unsigned long, MAX_ORDER_NR_PAGES,
- pageblock_nr_pages));
+ return ALIGN(pfn, MAX_ORDER_NR_PAGES);
}
#if defined(CONFIG_DYNAMIC_DEBUG) || \
_
Patches currently in -mm which might be from david@redhat.com are
mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch
mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch
mm-slightly-clarify-ksm-logic-in-do_swap_page.patch
mm-streamline-cow-logic-in-do_swap_page.patch
mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch
mm-khugepaged-remove-reuse_swap_page-usage.patch
mm-swapfile-remove-stale-reuse_swap_page.patch
mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch
mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch
cma-factor-out-minimum-alignment-requirement.patch
mm-enforce-pageblock_order-max_order.patch
drivers-base-memory-add-memory-block-to-memory-group-after-registration-succeeded.patch
drivers-base-node-consolidate-node-device-subsystem-initialization-in-node_dev_init.patch
drivers-base-node-rename-link_mem_sections-to-register_memory_block_under_node.patch
drivers-base-memory-determine-and-store-zone-for-single-zone-memory-blocks.patch
proc-vmcore-fix-possible-deadlock-on-concurrent-mmap-and-read.patch