All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Zi Yan <ziy@nvidia.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Miaohe Lin <linmiaohe@huawei.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene
Date: Tue, 19 Sep 2023 17:32:39 -0700	[thread overview]
Message-ID: <20230920003239.GD112714@monkey> (raw)
In-Reply-To: <C416A861-44D3-46E7-B756-63DA3731FC1E@nvidia.com>

On 09/19/23 16:57, Zi Yan wrote:
> On 19 Sep 2023, at 14:47, Mike Kravetz wrote:
> 
> > On 09/19/23 02:49, Johannes Weiner wrote:
> >> On Mon, Sep 18, 2023 at 10:40:37AM -0700, Mike Kravetz wrote:
> >>> On 09/18/23 10:52, Johannes Weiner wrote:
> >>>> On Mon, Sep 18, 2023 at 09:16:58AM +0200, Vlastimil Babka wrote:
> >>>>> On 9/16/23 21:57, Mike Kravetz wrote:
> >>>>>> On 09/15/23 10:16, Johannes Weiner wrote:
> >>>>>>> On Thu, Sep 14, 2023 at 04:52:38PM -0700, Mike Kravetz wrote:
> >
> > Sorry for causing the confusion!
> >
> > When I originally saw the warnings pop up, I was running the above script
> > as well as another that only allocated order 9 hugetlb pages:
> >
> > while true; do
> > 	echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > 	echo 0 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > done
> >
> > The warnings were actually triggered by allocations in this second script.
> >
> > However, when reporting the warnings I wanted to include the simplest
> > way to recreate.  And, I noticed that that second script running in
> > parallel was not required.  Again, sorry for the confusion!  Here is a
> > warning triggered via the alloc_contig_range path only running the one
> > script.
> >
> > [  107.275821] ------------[ cut here ]------------
> > [  107.277001] page type is 0, passed migratetype is 1 (nr=512)
> > [  107.278379] WARNING: CPU: 1 PID: 886 at mm/page_alloc.c:699 del_page_from_free_list+0x137/0x170
> > [  107.280514] Modules linked in: rfkill ip6table_filter ip6_tables sunrpc snd_hda_codec_generic joydev 9p snd_hda_intel netfs snd_intel_dspcfg snd_hda_codec snd_hwdep 9pnet_virtio snd_hda_core snd_seq snd_seq_device 9pnet virtio_balloon snd_pcm snd_timer snd soundcore virtio_net net_failover failover virtio_console virtio_blk crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw virtio_pci virtio virtio_pci_legacy_dev virtio_pci_modern_dev virtio_ring fuse
> > [  107.291033] CPU: 1 PID: 886 Comm: bash Not tainted 6.6.0-rc2-next-20230919-dirty #35
> > [  107.293000] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
> > [  107.295187] RIP: 0010:del_page_from_free_list+0x137/0x170
> > [  107.296618] Code: c6 05 20 9b 35 01 01 e8 b7 fb ff ff 44 89 f1 44 89 e2 48 c7 c7 d8 ab 22 82 48 89 c6 b8 01 00 00 00 d3 e0 89 c1 e8 e9 99 df ff <0f> 0b e9 03 ff ff ff 48 c7 c6 10 ac 22 82 48 89 df e8 f3 e0 fc ff
> > [  107.301236] RSP: 0018:ffffc90003ba7a70 EFLAGS: 00010086
> > [  107.302535] RAX: 0000000000000000 RBX: ffffea0007ff8000 RCX: 0000000000000000
> > [  107.304467] RDX: 0000000000000004 RSI: ffffffff8224e9de RDI: 00000000ffffffff
> > [  107.306289] RBP: 00000000001ffe00 R08: 0000000000009ffb R09: 00000000ffffdfff
> > [  107.308135] R10: 00000000ffffdfff R11: ffffffff824660e0 R12: 0000000000000001
> > [  107.309956] R13: ffff88827fffcd80 R14: 0000000000000009 R15: 00000000001ffc00
> > [  107.311839] FS:  00007fabb8cba740(0000) GS:ffff888277d00000(0000) knlGS:0000000000000000
> > [  107.314695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  107.316159] CR2: 00007f41ba01acf0 CR3: 0000000282ed4006 CR4: 0000000000370ee0
> > [  107.317971] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  107.319783] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [  107.321575] Call Trace:
> > [  107.322314]  <TASK>
> > [  107.323002]  ? del_page_from_free_list+0x137/0x170
> > [  107.324380]  ? __warn+0x7d/0x130
> > [  107.325341]  ? del_page_from_free_list+0x137/0x170
> > [  107.326627]  ? report_bug+0x18d/0x1c0
> > [  107.327632]  ? prb_read_valid+0x17/0x20
> > [  107.328711]  ? handle_bug+0x41/0x70
> > [  107.329685]  ? exc_invalid_op+0x13/0x60
> > [  107.330787]  ? asm_exc_invalid_op+0x16/0x20
> > [  107.331937]  ? del_page_from_free_list+0x137/0x170
> > [  107.333189]  __free_one_page+0x2ab/0x6f0
> > [  107.334375]  free_pcppages_bulk+0x169/0x210
> > [  107.335575]  drain_pages_zone+0x3f/0x50
> > [  107.336691]  __drain_all_pages+0xe2/0x1e0
> > [  107.337843]  alloc_contig_range+0x143/0x280
> > [  107.339026]  alloc_contig_pages+0x210/0x270
> > [  107.340200]  alloc_fresh_hugetlb_folio+0xa6/0x270
> > [  107.341529]  alloc_pool_huge_page+0x7d/0x100
> > [  107.342745]  set_max_huge_pages+0x162/0x340
> > [  107.345059]  nr_hugepages_store_common+0x91/0xf0
> > [  107.346329]  kernfs_fop_write_iter+0x108/0x1f0
> > [  107.347547]  vfs_write+0x207/0x400
> > [  107.348543]  ksys_write+0x63/0xe0
> > [  107.349511]  do_syscall_64+0x37/0x90
> > [  107.350543]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > [  107.351940] RIP: 0033:0x7fabb8daee87
> > [  107.352819] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
> > [  107.356373] RSP: 002b:00007ffc02737478 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> > [  107.358103] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fabb8daee87
> > [  107.359695] RDX: 0000000000000002 RSI: 000055fe584a1620 RDI: 0000000000000001
> > [  107.361258] RBP: 000055fe584a1620 R08: 000000000000000a R09: 00007fabb8e460c0
> > [  107.362842] R10: 00007fabb8e45fc0 R11: 0000000000000246 R12: 0000000000000002
> > [  107.364385] R13: 00007fabb8e82520 R14: 0000000000000002 R15: 00007fabb8e82720
> > [  107.365968]  </TASK>
> > [  107.366534] ---[ end trace 0000000000000000 ]---
> > [  121.542474] ------------[ cut here ]------------
> >
> > Perhaps that is another piece of information in that the warning can be
> > triggered via both allocation paths.
> >
> > To be perfectly clear, here is what I did today:
> > - built next-20230919.  It does not contain your series
> >   	I could not recreate the issue.
> > - Added your series and the patch to remove
> >   VM_BUG_ON_PAGE(is_migrate_isolate(mt), page) from free_pcppages_bulk
> > 	I could recreate the issue while running only the one script.
> > 	The warning above is from that run.
> > - Added this suggested patch from Zi
> > 	diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > 	index 1400e674ab86..77a4aea31a7f 100644
> > 	--- a/mm/page_alloc.c
> > 	+++ b/mm/page_alloc.c
> > 	@@ -1651,8 +1651,13 @@ static bool prep_move_freepages_block(struct zone *zone, struct page *page,
> >  		end = pageblock_end_pfn(pfn) - 1;
> >
> >  		/* Do not cross zone boundaries */
> > 	+#if 0
> >  		if (!zone_spans_pfn(zone, start))
> > 			start = zone->zone_start_pfn;
> > 	+#else
> > 	+	if (!zone_spans_pfn(zone, start))
> > 	+		start = pfn;
> > 	+#endif
> > 	 	if (!zone_spans_pfn(zone, end))
> > 	 		return false;
> > 	I can still trigger warnings.
> 
> OK. One thing to note is that the page type in the warning changed from
> 5 (MIGRATE_ISOLATE) to 0 (MIGRATE_UNMOVABLE) with my suggested change.
> 

Just to be really clear,
- the 5 (MIGRATE_ISOLATE) warning was from the __alloc_pages call path.
- the 0 (MIGRATE_UNMOVABLE) as above was from the alloc_contig_range call
  path WITHOUT your change.

I am guessing the difference here has more to do with the allocation path?

I went back and reran focusing on the specific migrate type.
Without your patch, and coming from the alloc_contig_range call path,
I got two warnings of 'page type is 0, passed migratetype is 1' as above.
With your patch I got one 'page type is 0, passed migratetype is 1'
warning and one 'page type is 1, passed migratetype is 0' warning.

I could be wrong, but I do not think your patch changes things.

> >
> > One idea about recreating the issue is that it may have to do with size
> > of my VM (16G) and the requested allocation sizes 4G.  However, I tried
> > to really stress the allocations by increasing the number of hugetlb
> > pages requested and that did not help.  I also noticed that I only seem
> > to get two warnings and then they stop, even if I continue to run the
> > script.
> >
> > Zi asked about my config, so it is attached.
> 
> With your config, I still have no luck reproducing the issue. I will keep
> trying. Thanks.
> 

Perhaps try running both scripts in parallel?
Adjust the number of hugetlb pages allocated to equal 25% of memory?
-- 
Mike Kravetz


  reply	other threads:[~2023-09-20  0:33 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-11 19:41 [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene Johannes Weiner
2023-09-11 19:41 ` [PATCH 1/6] mm: page_alloc: remove pcppage migratetype caching Johannes Weiner
2023-09-11 19:59   ` Zi Yan
2023-09-11 21:09     ` Andrew Morton
2023-09-12 13:47   ` Vlastimil Babka
2023-09-12 14:50     ` Johannes Weiner
2023-09-13  9:33       ` Vlastimil Babka
2023-09-13 13:24         ` Johannes Weiner
2023-09-13 13:34           ` Vlastimil Babka
2023-09-12 15:03     ` Johannes Weiner
2023-09-14  7:29       ` Vlastimil Babka
2023-09-14  9:56   ` Mel Gorman
2023-09-27  5:42   ` Huang, Ying
2023-09-27 14:51     ` Johannes Weiner
2023-09-30  4:26       ` Huang, Ying
2023-10-02 14:58         ` Johannes Weiner
2023-09-11 19:41 ` [PATCH 2/6] mm: page_alloc: fix up block types when merging compatible blocks Johannes Weiner
2023-09-11 20:01   ` Zi Yan
2023-09-13  9:52   ` Vlastimil Babka
2023-09-14 10:00   ` Mel Gorman
2023-09-11 19:41 ` [PATCH 3/6] mm: page_alloc: move free pages when converting block during isolation Johannes Weiner
2023-09-11 20:17   ` Zi Yan
2023-09-11 20:47     ` Johannes Weiner
2023-09-11 20:50       ` Zi Yan
2023-09-13 14:31   ` Vlastimil Babka
2023-09-14 10:03   ` Mel Gorman
2023-09-11 19:41 ` [PATCH 4/6] mm: page_alloc: fix move_freepages_block() range error Johannes Weiner
2023-09-11 20:23   ` Zi Yan
2023-09-13 14:40   ` Vlastimil Babka
2023-09-14 13:37     ` Johannes Weiner
2023-09-14 10:03   ` Mel Gorman
2023-09-11 19:41 ` [PATCH 5/6] mm: page_alloc: fix freelist movement during block conversion Johannes Weiner
2023-09-13 19:52   ` Vlastimil Babka
2023-09-14 14:47     ` Johannes Weiner
2023-09-11 19:41 ` [PATCH 6/6] mm: page_alloc: consolidate free page accounting Johannes Weiner
2023-09-13 20:18   ` Vlastimil Babka
2023-09-14  4:11     ` Johannes Weiner
2023-09-14 23:52 ` [PATCH V2 0/6] mm: page_alloc: freelist migratetype hygiene Mike Kravetz
2023-09-15 14:16   ` Johannes Weiner
2023-09-15 15:05     ` Mike Kravetz
2023-09-16 19:57     ` Mike Kravetz
2023-09-16 20:13       ` Andrew Morton
2023-09-18  7:16       ` Vlastimil Babka
2023-09-18 14:52         ` Johannes Weiner
2023-09-18 17:40           ` Mike Kravetz
2023-09-19  6:49             ` Johannes Weiner
2023-09-19 12:37               ` Zi Yan
2023-09-19 15:22                 ` Zi Yan
2023-09-19 18:47               ` Mike Kravetz
2023-09-19 20:57                 ` Zi Yan
2023-09-20  0:32                   ` Mike Kravetz [this message]
2023-09-20  1:38                     ` Zi Yan
2023-09-20  6:07                       ` Vlastimil Babka
2023-09-20 13:48                         ` Johannes Weiner
2023-09-20 16:04                           ` Johannes Weiner
2023-09-20 17:23                             ` Zi Yan
2023-09-21  2:31                               ` Zi Yan
2023-09-21 10:19                                 ` David Hildenbrand
2023-09-21 14:47                                   ` Zi Yan
2023-09-25 21:12                                     ` Zi Yan
2023-09-26 17:39                                       ` Johannes Weiner
2023-09-28  2:51                                         ` Zi Yan
2023-10-03  2:26                                           ` Zi Yan
2023-10-10 21:12                                             ` Johannes Weiner
2023-10-11 15:25                                               ` Johannes Weiner
2023-10-11 15:45                                                 ` Johannes Weiner
2023-10-11 15:57                                                   ` Zi Yan
2023-10-13  0:06                                               ` Zi Yan
2023-10-13 14:51                                                 ` Zi Yan
2023-10-16 13:35                                                   ` Zi Yan
2023-10-16 14:37                                                     ` Johannes Weiner
2023-10-16 15:00                                                       ` Zi Yan
2023-10-16 18:51                                                         ` Johannes Weiner
2023-10-16 19:49                                                           ` Zi Yan
2023-10-16 20:26                                                             ` Johannes Weiner
2023-10-16 20:39                                                               ` Johannes Weiner
2023-10-16 20:48                                                                 ` Zi Yan
2023-09-26 18:19                                     ` David Hildenbrand
2023-09-28  3:22                                       ` Zi Yan
2023-10-02 11:43                                         ` David Hildenbrand
2023-10-03  2:35                                           ` Zi Yan
2023-09-18  7:07     ` Vlastimil Babka
2023-09-18 14:09       ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230920003239.GD112714@monkey \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linmiaohe@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=vbabka@suse.cz \
    --cc=wangkefeng.wang@huawei.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.