From: Mike Kravetz <mike.kravetz@oracle.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
Mel Gorman <mgorman@techsingularity.net>,
Miaohe Lin <linmiaohe@huawei.com>,
Kefeng Wang <wangkefeng.wang@huawei.com>, Zi Yan <ziy@nvidia.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene
Date: Sat, 16 Sep 2023 12:57:39 -0700
Message-ID: <20230916195739.GB618858@monkey>
In-Reply-To: <20230915141610.GA104956@cmpxchg.org>
On 09/15/23 10:16, Johannes Weiner wrote:
> On Thu, Sep 14, 2023 at 04:52:38PM -0700, Mike Kravetz wrote:
> > In next-20230913, I started hitting the following BUG. Seems related
> > to this series. And, if series is reverted I do not see the BUG.
> >
> > I can easily reproduce on a small 16G VM. kernel command line contains
> > "hugetlb_free_vmemmap=on hugetlb_cma=4G". Then run the script,
> > 	while true; do
> > 		echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
> > 		echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/demote
> > 		echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > 	done
> >
> > For the BUG below I believe it was the first (or second) 1G page creation from
> > CMA that triggered: cma_alloc of 1G.
> >
> > Sorry, have not looked deeper into the issue.
>
> Thanks for the report, and sorry about the breakage!
>
> I was scratching my head at this:
>
> /* MIGRATE_ISOLATE page should not go to pcplists */
> VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
>
> because there is nothing in page isolation that prevents setting
> MIGRATE_ISOLATE on something that's on the pcplist already. So why
> didn't this trigger before already?
>
> Then it clicked: it used to only check the *pcpmigratetype* determined
> by free_unref_page(), which of course mustn't be MIGRATE_ISOLATE.
>
> Pages that get isolated while *already* on the pcplist are fine, and
> are handled properly:
>
> mt = get_pcppage_migratetype(page);
>
> /* MIGRATE_ISOLATE page should not go to pcplists */
> VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
>
> /* Pageblock could have been isolated meanwhile */
> if (unlikely(isolated_pageblocks))
> 	mt = get_pageblock_migratetype(page);
>
> So this was purely a sanity check against the pcpmigratetype cache
> operations. With that gone, we can remove it.
With the patch below applied, a slightly different workload triggers the
following warnings. It seems related, and appears to go away when
reverting the series.
[ 331.595382] ------------[ cut here ]------------
[ 331.596665] page type is 5, passed migratetype is 1 (nr=512)
[ 331.598121] WARNING: CPU: 2 PID: 935 at mm/page_alloc.c:662 expand+0x1c9/0x200
[ 331.600549] Modules linked in: rfkill ip6table_filter ip6_tables sunrpc snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_seq 9p snd_seq_device netfs 9pnet_virtio snd_pcm joydev snd_timer virtio_balloon snd soundcore 9pnet virtio_blk virtio_console virtio_net net_failover failover crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring fuse
[ 331.609530] CPU: 2 PID: 935 Comm: bash Tainted: G W 6.6.0-rc1-next-20230913+ #26
[ 331.611603] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
[ 331.613527] RIP: 0010:expand+0x1c9/0x200
[ 331.614492] Code: 89 ef be 07 00 00 00 c6 05 c9 b1 35 01 01 e8 de f7 ff ff 8b 4c 24 30 8b 54 24 0c 48 c7 c7 68 9f 22 82 48 89 c6 e8 97 b3 df ff <0f> 0b e9 db fe ff ff 48 c7 c6 f8 9f 22 82 48 89 df e8 41 e3 fc ff
[ 331.618540] RSP: 0018:ffffc90003c97a88 EFLAGS: 00010086
[ 331.619801] RAX: 0000000000000000 RBX: ffffea0007ff8000 RCX: 0000000000000000
[ 331.621331] RDX: 0000000000000005 RSI: ffffffff8224dce6 RDI: 00000000ffffffff
[ 331.622914] RBP: 00000000001ffe00 R08: 0000000000009ffb R09: 00000000ffffdfff
[ 331.624712] R10: 00000000ffffdfff R11: ffffffff824660c0 R12: ffff88827fffcd80
[ 331.626317] R13: 0000000000000009 R14: 0000000000000200 R15: 000000000000000a
[ 331.627810] FS: 00007f24b3932740(0000) GS:ffff888477c00000(0000) knlGS:0000000000000000
[ 331.630593] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 331.631865] CR2: 0000560a53875018 CR3: 000000017eee8003 CR4: 0000000000370ee0
[ 331.633382] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 331.634873] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 331.636324] Call Trace:
[ 331.636934] <TASK>
[ 331.637521] ? expand+0x1c9/0x200
[ 331.638320] ? __warn+0x7d/0x130
[ 331.639116] ? expand+0x1c9/0x200
[ 331.639957] ? report_bug+0x18d/0x1c0
[ 331.640832] ? handle_bug+0x41/0x70
[ 331.641635] ? exc_invalid_op+0x13/0x60
[ 331.642522] ? asm_exc_invalid_op+0x16/0x20
[ 331.643494] ? expand+0x1c9/0x200
[ 331.644264] ? expand+0x1c9/0x200
[ 331.645007] rmqueue_bulk+0xf4/0x530
[ 331.645847] get_page_from_freelist+0x3ed/0x1040
[ 331.646837] ? prepare_alloc_pages.constprop.0+0x197/0x1b0
[ 331.647977] __alloc_pages+0xec/0x240
[ 331.648783] alloc_buddy_hugetlb_folio.isra.0+0x6a/0x150
[ 331.649912] __alloc_fresh_hugetlb_folio+0x157/0x230
[ 331.650938] alloc_pool_huge_folio+0xad/0x110
[ 331.651909] set_max_huge_pages+0x17d/0x390
[ 331.652760] nr_hugepages_store_common+0x91/0xf0
[ 331.653825] kernfs_fop_write_iter+0x108/0x1f0
[ 331.654986] vfs_write+0x207/0x400
[ 331.655925] ksys_write+0x63/0xe0
[ 331.656832] do_syscall_64+0x37/0x90
[ 331.657793] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 331.660398] RIP: 0033:0x7f24b3a26e87
[ 331.661342] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 331.665673] RSP: 002b:00007ffccd603de8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 331.667541] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f24b3a26e87
[ 331.669197] RDX: 0000000000000005 RSI: 0000560a5381bb50 RDI: 0000000000000001
[ 331.670883] RBP: 0000560a5381bb50 R08: 000000000000000a R09: 00007f24b3abe0c0
[ 331.672536] R10: 00007f24b3abdfc0 R11: 0000000000000246 R12: 0000000000000005
[ 331.674175] R13: 00007f24b3afa520 R14: 0000000000000005 R15: 00007f24b3afa720
[ 331.675841] </TASK>
[ 331.676450] ---[ end trace 0000000000000000 ]---
[ 331.677659] ------------[ cut here ]------------
[ 331.679109] page type is 5, passed migratetype is 1 (nr=512)
[ 331.680376] WARNING: CPU: 2 PID: 935 at mm/page_alloc.c:699 del_page_from_free_list+0x137/0x170
[ 331.682314] Modules linked in: rfkill ip6table_filter ip6_tables sunrpc snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_seq 9p snd_seq_device netfs 9pnet_virtio snd_pcm joydev snd_timer virtio_balloon snd soundcore 9pnet virtio_blk virtio_console virtio_net net_failover failover crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring fuse
[ 331.691852] CPU: 2 PID: 935 Comm: bash Tainted: G W 6.6.0-rc1-next-20230913+ #26
[ 331.694026] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
[ 331.696162] RIP: 0010:del_page_from_free_list+0x137/0x170
[ 331.697589] Code: c6 05 a0 b5 35 01 01 e8 b7 fb ff ff 44 89 f1 44 89 e2 48 c7 c7 68 9f 22 82 48 89 c6 b8 01 00 00 00 d3 e0 89 c1 e8 69 b7 df ff <0f> 0b e9 03 ff ff ff 48 c7 c6 a0 9f 22 82 48 89 df e8 13 e7 fc ff
[ 331.702060] RSP: 0018:ffffc90003c97ac8 EFLAGS: 00010086
[ 331.703430] RAX: 0000000000000000 RBX: ffffea0007ff8000 RCX: 0000000000000000
[ 331.705284] RDX: 0000000000000005 RSI: ffffffff8224dce6 RDI: 00000000ffffffff
[ 331.707101] RBP: 00000000001ffe00 R08: 0000000000009ffb R09: 00000000ffffdfff
[ 331.708933] R10: 00000000ffffdfff R11: ffffffff824660c0 R12: 0000000000000001
[ 331.710754] R13: ffff88827fffcd80 R14: 0000000000000009 R15: 0000000000000009
[ 331.712637] FS: 00007f24b3932740(0000) GS:ffff888477c00000(0000) knlGS:0000000000000000
[ 331.714861] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 331.716466] CR2: 0000560a53875018 CR3: 000000017eee8003 CR4: 0000000000370ee0
[ 331.718441] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 331.720372] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 331.723583] Call Trace:
[ 331.724351] <TASK>
[ 331.725045] ? del_page_from_free_list+0x137/0x170
[ 331.726370] ? __warn+0x7d/0x130
[ 331.727326] ? del_page_from_free_list+0x137/0x170
[ 331.728637] ? report_bug+0x18d/0x1c0
[ 331.729688] ? handle_bug+0x41/0x70
[ 331.730707] ? exc_invalid_op+0x13/0x60
[ 331.731798] ? asm_exc_invalid_op+0x16/0x20
[ 331.733007] ? del_page_from_free_list+0x137/0x170
[ 331.734317] ? del_page_from_free_list+0x137/0x170
[ 331.735649] rmqueue_bulk+0xdf/0x530
[ 331.736741] get_page_from_freelist+0x3ed/0x1040
[ 331.738069] ? prepare_alloc_pages.constprop.0+0x197/0x1b0
[ 331.739578] __alloc_pages+0xec/0x240
[ 331.740666] alloc_buddy_hugetlb_folio.isra.0+0x6a/0x150
[ 331.742135] __alloc_fresh_hugetlb_folio+0x157/0x230
[ 331.743521] alloc_pool_huge_folio+0xad/0x110
[ 331.744768] set_max_huge_pages+0x17d/0x390
[ 331.745988] nr_hugepages_store_common+0x91/0xf0
[ 331.747306] kernfs_fop_write_iter+0x108/0x1f0
[ 331.748651] vfs_write+0x207/0x400
[ 331.749735] ksys_write+0x63/0xe0
[ 331.750808] do_syscall_64+0x37/0x90
[ 331.753203] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 331.754857] RIP: 0033:0x7f24b3a26e87
[ 331.756184] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 331.760239] RSP: 002b:00007ffccd603de8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 331.761935] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f24b3a26e87
[ 331.763524] RDX: 0000000000000005 RSI: 0000560a5381bb50 RDI: 0000000000000001
[ 331.765102] RBP: 0000560a5381bb50 R08: 000000000000000a R09: 00007f24b3abe0c0
[ 331.766740] R10: 00007f24b3abdfc0 R11: 0000000000000246 R12: 0000000000000005
[ 331.768344] R13: 00007f24b3afa520 R14: 0000000000000005 R15: 00007f24b3afa720
[ 331.769949] </TASK>
[ 331.770559] ---[ end trace 0000000000000000 ]---
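
For anyone decoding the two warnings above, here is a minimal sketch that maps the numbers in the "page type is 5, passed migratetype is 1 (nr=512)" line to migratetype names. It assumes the enum migratetype ordering from include/linux/mmzone.h on a kernel built with both CONFIG_CMA and CONFIG_MEMORY_ISOLATION, and that "page type" is the pageblock's migratetype. The function names migratetype_name and decode_warning are made up for illustration; they are not kernel or distro tools.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the kernel tree.
# ASSUMPTION: enum migratetype ordering from include/linux/mmzone.h with
# CONFIG_CMA=y and CONFIG_MEMORY_ISOLATION=y. Other configs shift the values.
migratetype_name() {
    case "$1" in
        0) echo MIGRATE_UNMOVABLE ;;
        1) echo MIGRATE_MOVABLE ;;
        2) echo MIGRATE_RECLAIMABLE ;;
        3) echo MIGRATE_HIGHATOMIC ;;
        4) echo MIGRATE_CMA ;;
        5) echo MIGRATE_ISOLATE ;;
        *) echo "unknown($1)" ;;
    esac
}

# Decode one "page type is X, passed migratetype is Y (nr=N)" log line.
# Returns non-zero if the line does not contain that message.
decode_warning() {
    nums=$(printf '%s\n' "$1" |
        sed -n 's/.*page type is \([0-9][0-9]*\), passed migratetype is \([0-9][0-9]*\) (nr=\([0-9][0-9]*\)).*/\1 \2 \3/p')
    [ -n "$nums" ] || return 1
    # Split the three extracted numbers into $1 $2 $3.
    set -- $nums
    echo "block=$(migratetype_name "$1") passed=$(migratetype_name "$2") nr=$3"
}

decode_warning '[ 331.596665] page type is 5, passed migratetype is 1 (nr=512)'
```

Under those assumptions, the pageblock is marked MIGRATE_ISOLATE while the caller passed MIGRATE_MOVABLE, and nr=512 is a single order-9 block (2 MB with 4 KiB pages).
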
--
Mike Kravetz
> ---
>
> From b0cb92ed10b40fab0921002effa8b726df245790 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Fri, 15 Sep 2023 09:59:52 -0400
> Subject: [PATCH] mm: page_alloc: remove pcppage migratetype caching fix
>
> Mike reports the following crash in -next:
>
> [ 28.643019] page:ffffea0004fb4280 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x13ed0a
> [ 28.645455] flags: 0x200000000000000(node=0|zone=2)
> [ 28.646835] page_type: 0xffffffff()
> [ 28.647886] raw: 0200000000000000 dead000000000100 dead000000000122 0000000000000000
> [ 28.651170] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [ 28.653124] page dumped because: VM_BUG_ON_PAGE(is_migrate_isolate(mt))
> [ 28.654769] ------------[ cut here ]------------
> [ 28.655972] kernel BUG at mm/page_alloc.c:1231!
>
> This VM_BUG_ON() used to check that the cached pcppage_migratetype set
> by free_unref_page() wasn't MIGRATE_ISOLATE.
>
> When I removed the caching, I erroneously changed the assert to check
> that no isolated pages are on the pcplist. This is quite different,
> because pages can be isolated *after* they had been put on the
> freelist already (which is handled just fine).
>
> IOW, this was purely a sanity check on the migratetype caching. With
> that gone, the check should have been removed as well. Do that now.
>
> Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
> mm/page_alloc.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e3f1c777feed..9469e4660b53 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1207,9 +1207,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> count -= nr_pages;
> pcp->count -= nr_pages;
>
> - /* MIGRATE_ISOLATE page should not go to pcplists */
> - VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
> -
> __free_one_page(page, pfn, zone, order, mt, FPI_NONE);
> trace_mm_page_pcpu_drain(page, order, mt);
> } while (count > 0 && !list_empty(list));
> --
> 2.42.0
>