linux-mm.kvack.org archive mirror
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Zi Yan <ziy@nvidia.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene
Date: Tue, 19 Sep 2023 17:32:39 -0700
Message-ID: <20230920003239.GD112714@monkey>
In-Reply-To: <C416A861-44D3-46E7-B756-63DA3731FC1E@nvidia.com>

On 09/19/23 16:57, Zi Yan wrote:
> On 19 Sep 2023, at 14:47, Mike Kravetz wrote:
> 
> > On 09/19/23 02:49, Johannes Weiner wrote:
> >> On Mon, Sep 18, 2023 at 10:40:37AM -0700, Mike Kravetz wrote:
> >>> On 09/18/23 10:52, Johannes Weiner wrote:
> >>>> On Mon, Sep 18, 2023 at 09:16:58AM +0200, Vlastimil Babka wrote:
> >>>>> On 9/16/23 21:57, Mike Kravetz wrote:
> >>>>>> On 09/15/23 10:16, Johannes Weiner wrote:
> >>>>>>> On Thu, Sep 14, 2023 at 04:52:38PM -0700, Mike Kravetz wrote:
> >
> > Sorry for causing the confusion!
> >
> > When I originally saw the warnings pop up, I was running the above script
> > as well as another that only allocated order 9 hugetlb pages:
> >
> > while true; do
> > 	echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > 	echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > done
> >
> > The warnings were actually triggered by allocations in this second script.
> >
> > However, when reporting the warnings I wanted to include the simplest
> > way to recreate.  And, I noticed that that second script running in
> > parallel was not required.  Again, sorry for the confusion!  Here is a
> > warning triggered via the alloc_contig_range path only running the one
> > script.
> >
> > [  107.275821] ------------[ cut here ]------------
> > [  107.277001] page type is 0, passed migratetype is 1 (nr=512)
> > [  107.278379] WARNING: CPU: 1 PID: 886 at mm/page_alloc.c:699 del_page_from_free_list+0x137/0x170
> > [  107.280514] Modules linked in: rfkill ip6table_filter ip6_tables sunrpc snd_hda_codec_generic joydev 9p snd_hda_intel netfs snd_intel_dspcfg snd_hda_codec snd_hwdep 9pnet_virtio snd_hda_core snd_seq snd_seq_device 9pnet virtio_balloon snd_pcm snd_timer snd soundcore virtio_net net_failover failover virtio_console virtio_blk crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring fuse
> > [  107.291033] CPU: 1 PID: 886 Comm: bash Not tainted 6.6.0-rc2-next-20230919-dirty #35
> > [  107.293000] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
> > [  107.295187] RIP: 0010:del_page_from_free_list+0x137/0x170
> > [  107.296618] Code: c6 05 20 9b 35 01 01 e8 b7 fb ff ff 44 89 f1 44 89 e2 48 c7 c7 d8 ab 22 82 48 89 c6 b8 01 00 00 00 d3 e0 89 c1 e8 e9 99 df ff <0f> 0b e9 03 ff ff ff 48 c7 c6 10 ac 22 82 48 89 df e8 f3 e0 fc ff
> > [  107.301236] RSP: 0018:ffffc90003ba7a70 EFLAGS: 00010086
> > [  107.302535] RAX: 0000000000000000 RBX: ffffea0007ff8000 RCX: 0000000000000000
> > [  107.304467] RDX: 0000000000000004 RSI: ffffffff8224e9de RDI: 00000000ffffffff
> > [  107.306289] RBP: 00000000001ffe00 R08: 0000000000009ffb R09: 00000000ffffdfff
> > [  107.308135] R10: 00000000ffffdfff R11: ffffffff824660e0 R12: 0000000000000001
> > [  107.309956] R13: ffff88827fffcd80 R14: 0000000000000009 R15: 00000000001ffc00
> > [  107.311839] FS:  00007fabb8cba740(0000) GS:ffff888277d00000(0000) knlGS:0000000000000000
> > [  107.314695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  107.316159] CR2: 00007f41ba01acf0 CR3: 0000000282ed4006 CR4: 0000000000370ee0
> > [  107.317971] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  107.319783] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [  107.321575] Call Trace:
> > [  107.322314]  <TASK>
> > [  107.323002]  ? del_page_from_free_list+0x137/0x170
> > [  107.324380]  ? __warn+0x7d/0x130
> > [  107.325341]  ? del_page_from_free_list+0x137/0x170
> > [  107.326627]  ? report_bug+0x18d/0x1c0
> > [  107.327632]  ? prb_read_valid+0x17/0x20
> > [  107.328711]  ? handle_bug+0x41/0x70
> > [  107.329685]  ? exc_invalid_op+0x13/0x60
> > [  107.330787]  ? asm_exc_invalid_op+0x16/0x20
> > [  107.331937]  ? del_page_from_free_list+0x137/0x170
> > [  107.333189]  __free_one_page+0x2ab/0x6f0
> > [  107.334375]  free_pcppages_bulk+0x169/0x210
> > [  107.335575]  drain_pages_zone+0x3f/0x50
> > [  107.336691]  __drain_all_pages+0xe2/0x1e0
> > [  107.337843]  alloc_contig_range+0x143/0x280
> > [  107.339026]  alloc_contig_pages+0x210/0x270
> > [  107.340200]  alloc_fresh_hugetlb_folio+0xa6/0x270
> > [  107.341529]  alloc_pool_huge_page+0x7d/0x100
> > [  107.342745]  set_max_huge_pages+0x162/0x340
> > [  107.345059]  nr_hugepages_store_common+0x91/0xf0
> > [  107.346329]  kernfs_fop_write_iter+0x108/0x1f0
> > [  107.347547]  vfs_write+0x207/0x400
> > [  107.348543]  ksys_write+0x63/0xe0
> > [  107.349511]  do_syscall_64+0x37/0x90
> > [  107.350543]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > [  107.351940] RIP: 0033:0x7fabb8daee87
> > [  107.352819] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
> > [  107.356373] RSP: 002b:00007ffc02737478 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> > [  107.358103] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fabb8daee87
> > [  107.359695] RDX: 0000000000000002 RSI: 000055fe584a1620 RDI: 0000000000000001
> > [  107.361258] RBP: 000055fe584a1620 R08: 000000000000000a R09: 00007fabb8e460c0
> > [  107.362842] R10: 00007fabb8e45fc0 R11: 0000000000000246 R12: 0000000000000002
> > [  107.364385] R13: 00007fabb8e82520 R14: 0000000000000002 R15: 00007fabb8e82720
> > [  107.365968]  </TASK>
> > [  107.366534] ---[ end trace 0000000000000000 ]---
> > [  121.542474] ------------[ cut here ]------------
> >
> > Perhaps that is another piece of information in that the warning can be
> > triggered via both allocation paths.
> >
> > To be perfectly clear, here is what I did today:
> > - built next-20230919.  It does not contain your series
> >   	I could not recreate the issue.
> > - Added your series and the patch to remove
> >   VM_BUG_ON_PAGE(is_migrate_isolate(mt), page) from free_pcppages_bulk
> > 	I could recreate the issue while running only the one script.
> > 	The warning above is from that run.
> > - Added this suggested patch from Zi
> > 	diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > 	index 1400e674ab86..77a4aea31a7f 100644
> > 	--- a/mm/page_alloc.c
> > 	+++ b/mm/page_alloc.c
> > 	@@ -1651,8 +1651,13 @@ static bool prep_move_freepages_block(struct zone *zone, struct page *page,
> >  		end = pageblock_end_pfn(pfn) - 1;
> >
> >  		/* Do not cross zone boundaries */
> > 	+#if 0
> >  		if (!zone_spans_pfn(zone, start))
> > 			start = zone->zone_start_pfn;
> > 	+#else
> > 	+	if (!zone_spans_pfn(zone, start))
> > 	+		start = pfn;
> > 	+#endif
> > 	 	if (!zone_spans_pfn(zone, end))
> > 	 		return false;
> > 	I can still trigger warnings.
> 
> OK. One thing to note is that the page type in the warning changed from
> 5 (MIGRATE_ISOLATE) to 0 (MIGRATE_UNMOVABLE) with my suggested change.
> 

Just to be really clear,
- the 5 (MIGRATE_ISOLATE) warning was from the __alloc_pages call path.
- the 0 (MIGRATE_UNMOVABLE) warning above was from the alloc_contig_range call
  path WITHOUT your change.

I am guessing the difference here has more to do with the allocation path?

I went back and reran focusing on the specific migrate type.
Without your patch, and coming from the alloc_contig_range call path,
I got two warnings of 'page type is 0, passed migratetype is 1' as above.
With your patch I got one 'page type is 0, passed migratetype is 1'
warning and one 'page type is 1, passed migratetype is 0' warning.

I could be wrong, but I do not think your patch changes things.
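
To make the numbers in these warnings easier to compare, here is a minimal
sketch of a bash helper that decodes them. It assumes the enum migratetype
layout with both CONFIG_CMA and CONFIG_MEMORY_ISOLATION enabled (so 4 is
MIGRATE_CMA and 5 is MIGRATE_ISOLATE, matching the "5 (MIGRATE_ISOLATE)"
reading above). The function names mt_name and decode_mt_warnings are made
up for illustration and are not part of any kernel tool.

```shell
#!/bin/bash
# Map a migratetype number to its enum name.
# Assumption: CONFIG_CMA=y and CONFIG_MEMORY_ISOLATION=y, so CMA is 4 and
# ISOLATE is 5. With other configs the higher numbers shift.
mt_name() {
	case "$1" in
	0) echo MIGRATE_UNMOVABLE ;;
	1) echo MIGRATE_MOVABLE ;;
	2) echo MIGRATE_RECLAIMABLE ;;
	3) echo MIGRATE_HIGHATOMIC ;;
	4) echo MIGRATE_CMA ;;
	5) echo MIGRATE_ISOLATE ;;
	*) echo "UNKNOWN($1)" ;;
	esac
}

# Read kernel log lines on stdin and print one decoded line per warning of
# the form "page type is X, passed migratetype is Y".
decode_mt_warnings() {
	local line
	local re='page type is ([0-9]+), passed migratetype is ([0-9]+)'
	while IFS= read -r line; do
		if [[ $line =~ $re ]]; then
			echo "page_type=$(mt_name "${BASH_REMATCH[1]}") passed=$(mt_name "${BASH_REMATCH[2]}")"
		fi
	done
}
```

Something like "dmesg | decode_mt_warnings" would then list each mismatch by
name rather than by number.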

> >
> > One idea about recreating the issue is that it may have to do with size
> > of my VM (16G) and the requested allocation sizes 4G.  However, I tried
> > to really stress the allocations by increasing the number of hugetlb
> > pages requested and that did not help.  I also noticed that I only seem
> > to get two warnings and then they stop, even if I continue to run the
> > script.
> >
> > Zi asked about my config, so it is attached.
> 
> With your config, I still have no luck reproducing the issue. I will keep
> trying. Thanks.
> 

Perhaps try running both scripts in parallel?
Adjust the number of hugetlb pages allocated to equal 25% of memory?
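
A rough sketch of how the 25% figure could be computed on a machine of a
different size. The helper names hugepages_for_percent and mem_total_kb are
hypothetical, and the 2048 kB huge page size is taken from the script quoted
above. Integer division rounds down.

```shell
#!/bin/bash
# Print how many huge pages of hp_kb kB fit in pct percent of mem_kb kB.
hugepages_for_percent() {
	local mem_kb=$1 hp_kb=$2 pct=$3
	echo $(( mem_kb * pct / 100 / hp_kb ))
}

# Read MemTotal (reported in kB) from /proc/meminfo.
mem_total_kb() {
	awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Example use (as root). Left commented out so that sourcing this file
# does not change any system state:
# nr=$(hugepages_for_percent "$(mem_total_kb)" 2048 25)
# echo "$nr" > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
```

For exactly 16G of memory (16777216 kB) this gives 2048 pages, the same
value the script above writes to nr_hugepages. MemTotal on a real 16G VM is
a little lower, so the computed count would be slightly smaller.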
-- 
Mike Kravetz


