From: Qian Cai <cai@lca.pw>
To: Anshuman Khandual <anshuman.khandual@arm.com>,
Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
Oscar Salvador <osalvador@suse.de>,
Mel Gorman <mgorman@techsingularity.net>,
Mike Rapoport <rppt@linux.ibm.com>,
Dan Williams <dan.j.williams@intel.com>,
Pavel Tatashin <pavel.tatashin@microsoft.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/page_alloc: Add a reason for reserved pages in has_unmovable_pages()
Date: Fri, 04 Oct 2019 08:56:16 -0400 [thread overview]
Message-ID: <1570193776.5576.270.camel@lca.pw> (raw)
In-Reply-To: <91128b73-9a47-100b-d3de-e83f0b941e9f@arm.com>
On Fri, 2019-10-04 at 17:14 +0530, Anshuman Khandual wrote:
>
> On 10/04/2019 04:28 PM, Michal Hocko wrote:
> > On Thu 03-10-19 13:40:57, Anshuman Khandual wrote:
> > > Having unmovable pages on a given pageblock should be reported correctly
> > > when required with REPORT_FAILURE flag. But there can be a scenario where a
> > > reserved page in the page block will get reported as a generic "unmovable"
> > > reason code. Instead this should be changed to a more appropriate reason
> > > code like "Reserved page".
> >
> > Others have already pointed out this is just redundant but I will have a
>
> Sure.
>
> > more generic comment on the changelog. There is essentially no
> > information why the current state is bad/unhelpful and why the chnage is
>
> The current state is not necessarily bad or unhelpful. I just though that it
> could be improved upon. Some how calling out explicitly only the CMA page
> failure case just felt adhoc, where as there are other reasons like HugeTLB
> immovability which might depend on other factors apart from just page flags
> (though I did not propose that originally).
>
> > needed. All you claim is that something is a certain way and then assert
> > that it should be done differently. That is not how changelogs should
> > look like.
> >
>
> Okay, probably I should have explained more on why "unmovable" is less than
> adequate to capture the exact reason for specific failure cases and how
> "Reserved Page" instead would been better. But got the point, will improve.
>
It might be a good time to rethink if it is really a good idea to dump_page()
at all inside has_unmovable_pages(). As it is right now, it is a a potential
deadlock between console vs memory offline. More details are in this thread,
https://lore.kernel.org/lkml/1568817579.5576.172.camel@lca.pw/
01: [ 672.875392] WARNING: possible circular locking dependency detected
01: [ 672.875394] 5.4.0-rc1-next-20191004+ #64 Not tainted
01: [ 672.875396] ------------------------------------------------------
01: [ 672.875398] test.sh/1724 is trying to acquire lock:
01: [ 672.875400] 0000000052059ec0 (console_owner){-...}, at: console_unlock+0x
01: 328/0xa30
01: [ 672.875406]
01: [ 672.875408] but task is already holding lock:
01: [ 672.875409] 000000006ffd89c8 (&(&zone->lock)->rlock){-.-.}, at: start_iso
01: late_page_range+0x216/0x538
01: [ 672.875415]
01: [ 672.875417] which lock already depends on the new lock.
01: [ 672.875418]
01: [ 672.875419]
01: [ 672.875421] the existing dependency chain (in reverse order) is:
01: [ 672.875423]
01: [ 672.875424] -> #2 (&(&zone->lock)->rlock){-.-.}:
01: [ 672.875430] lock_acquire+0x21a/0x468
01: [ 672.875431] _raw_spin_lock+0x54/0x68
01: [ 672.875433] get_page_from_freelist+0x8b6/0x2d28
01: [ 672.875435] __alloc_pages_nodemask+0x246/0x658
01: [ 672.875437] __get_free_pages+0x34/0x78
01: [ 672.875438] sclp_init+0x106/0x690
01: [ 672.875440] sclp_register+0x2e/0x248
01: [ 672.875442] sclp_rw_init+0x4a/0x70
01: [ 672.875443] sclp_console_init+0x4a/0x1b8
01: [ 672.875445] console_init+0x2c8/0x410
01: [ 672.875447] start_kernel+0x530/0x6a0
01: [ 672.875448] startup_continue+0x70/0xd0
01: [ 672.875449]
01: [ 672.875450] -> #1 (sclp_lock){-.-.}:
01: [ 672.875458] lock_acquire+0x21a/0x468
01: [ 672.875460] _raw_spin_lock_irqsave+0xcc/0xe8
01: [ 672.875462] sclp_add_request+0x34/0x308
01: [ 672.875464] sclp_conbuf_emit+0x100/0x138
01: [ 672.875465] sclp_console_write+0x96/0x3b8
01: [ 672.875467] console_unlock+0x6dc/0xa30
01: [ 672.875469] vprintk_emit+0x184/0x3c8
01: [ 672.875470] vprintk_default+0x44/0x50
01: [ 672.875472] printk+0xa8/0xc0
01: [ 672.875473] iommu_debugfs_setup+0xf2/0x108
01: [ 672.875475] iommu_init+0x6c/0x78
01: [ 672.875477] do_one_initcall+0x162/0x680
01: [ 672.875478] kernel_init_freeable+0x4e8/0x5a8
01: [ 672.875480] kernel_init+0x2a/0x188
01: [ 672.875484] ret_from_fork+0x30/0x34
01: [ 672.875486] kernel_thread_starter+0x0/0xc
01: [ 672.875487]
01: [ 672.875488] -> #0 (console_owner){-...}:
01: [ 672.875495] check_noncircular+0x338/0x3e0
01: [ 672.875496] __lock_acquire+0x1e66/0x2d88
01: [ 672.875498] lock_acquire+0x21a/0x468
01: [ 672.875499] console_unlock+0x3a6/0xa30
01: [ 672.875501] vprintk_emit+0x184/0x3c8
01: [ 672.875503] vprintk_default+0x44/0x50
01: [ 672.875504] printk+0xa8/0xc0
01: [ 672.875506] __dump_page+0x1dc/0x710
01: [ 672.875507] dump_page+0x2e/0x58
01: [ 672.875509] has_unmovable_pages+0x2e8/0x470
01: [ 672.875511] start_isolate_page_range+0x404/0x538
01: [ 672.875513] __offline_pages+0x22c/0x1338
01: [ 672.875514] memory_subsys_offline+0xa6/0xe8
01: [ 672.875516] device_offline+0xe6/0x118
01: [ 672.875517] state_store+0xf0/0x110
01: [ 672.875519] kernfs_fop_write+0x1bc/0x270
01: [ 672.875521] vfs_write+0xce/0x220
01: [ 672.875522] ksys_write+0xea/0x190
01: [ 672.875524] system_call+0xd8/0x2b4
01: [ 672.875525]
01: [ 672.875527] other info that might help us debug this:
01: [ 672.875528]
01: [ 672.875529] Chain exists of:
01: [ 672.875530] console_owner --> sclp_lock --> &(&zone->lock)->rlock
01: [ 672.875538]
01: [ 672.875540] Possible unsafe locking scenario:
01: [ 672.875541]
01: [ 672.875543] CPU0 CPU1
01: [ 672.875544] ---- ----
01: [ 672.875545] lock(&(&zone->lock)->rlock);
01: [ 672.875549] lock(sclp_lock);
01: [ 672.875553] lock(&(&zone->lock)->rlock);
01: [ 672.875557] lock(console_owner);
01: [ 672.875560]
01: [ 672.875562] *** DEADLOCK ***
01: [ 672.875563]
01: [ 672.875564] 9 locks held by test.sh/1724:
01: [ 672.875565] #0: 000000000e925408 (sb_writers#4){.+.+}, at: vfs_write+0x2
01: 06/0x220
01: [ 672.875574] #1: 0000000050aa4280 (&of->mutex){+.+.}, at: kernfs_fop_writ
01: e+0x154/0x270
01: [ 672.875581] #2: 0000000062e5c628 (kn->count#198){.+.+}, at: kernfs_fop_w
01: rite+0x16a/0x270
01: [ 672.875590] #3: 00000000523236a0 (device_hotplug_lock){+.+.}, at: lock_d
01: evice_hotplug_sysfs+0x30/0x80
01: [ 672.875598] #4: 0000000062e70990 (&dev->mutex){....}, at: device_offline
01: +0x78/0x118
01: [ 672.875605] #5: 0000000051fd36b0 (cpu_hotplug_lock.rw_sem){++++}, at: __
01: offline_pages+0xec/0x1338
01: [ 672.875613] #6: 00000000521ca470 (mem_hotplug_lock.rw_sem){++++}, at: pe
01: rcpu_down_write+0x38/0x210
01: [ 672.875620] #7: 000000006ffd89c8 (&(&zone->lock)->rlock){-.-.}, at: star
01: t_isolate_page_range+0x216/0x538
01: [ 672.875628] #8: 000000005205a100 (console_lock){+.+.}, at: vprintk_emit+
01: 0x178/0x3c8
01: [ 672.875635]
01: [ 672.875636] stack backtrace:
01: [ 672.875639] CPU: 1 PID: 1724 Comm: test.sh Not tainted 5.4.0-rc1-next-201
01: 91004+ #64
01: [ 672.875640] Hardware name: IBM 2964 N96 400 (z/VM 6.4.0)
01: [ 672.875642] Call Trace:
01: [ 672.875644] ([<00000000512ae218>] show_stack+0x110/0x1b0)
01: [ 672.875645] [<0000000051b6d506>] dump_stack+0x126/0x178
01: [ 672.875648] [<00000000513a4b08>] check_noncircular+0x338/0x3e0
01: [ 672.875650] [<00000000513aaaf6>] __lock_acquire+0x1e66/0x2d88
01: [ 672.875652] [<00000000513a7e12>] lock_acquire+0x21a/0x468
01: [ 672.875654] [<00000000513bb2fe>] console_unlock+0x3a6/0xa30
01: [ 672.875655] [<00000000513bde2c>] vprintk_emit+0x184/0x3c8
01: [ 672.875657] [<00000000513be0b4>] vprintk_default+0x44/0x50
01: [ 672.875659] [<00000000513beb60>] printk+0xa8/0xc0
01: [ 672.875661] [<000000005158c364>] __dump_page+0x1dc/0x710
01: [ 672.875663] [<000000005158c8c6>] dump_page+0x2e/0x58
01: [ 672.875665] [<00000000515d87c8>] has_unmovable_pages+0x2e8/0x470
01: [ 672.875667] [<000000005167072c>] start_isolate_page_range+0x404/0x538
01: [ 672.875669] [<0000000051b96de4>] __offline_pages+0x22c/0x1338
01: [ 672.875671] [<0000000051908586>] memory_subsys_offline+0xa6/0xe8
01: [ 672.875673] [<00000000518e561e>] device_offline+0xe6/0x118
01: [ 672.875675] [<0000000051908170>] state_store+0xf0/0x110
01: [ 672.875677] [<0000000051796384>] kernfs_fop_write+0x1bc/0x270
01: [ 672.875679] [<000000005168972e>] vfs_write+0xce/0x220
01: [ 672.875681] [<0000000051689b9a>] ksys_write+0xea/0x190
01: [ 672.875685] [<0000000051ba9990>] system_call+0xd8/0x2b4
01: [ 672.875687] INFO: lockdep is turned off.
next prev parent reply other threads:[~2019-10-04 12:56 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-03 8:10 [PATCH] mm/page_alloc: Add a reason for reserved pages in has_unmovable_pages() Anshuman Khandual
2019-10-03 9:05 ` Qian Cai
2019-10-03 9:32 ` Anshuman Khandual
2019-10-03 9:53 ` Anshuman Khandual
2019-10-03 11:19 ` Qian Cai
2019-10-03 11:32 ` Anshuman Khandual
2019-10-03 11:50 ` Qian Cai
2019-10-03 12:02 ` Anshuman Khandual
2019-10-03 12:14 ` Qian Cai
2019-10-04 8:25 ` David Hildenbrand
2020-01-14 8:19 ` Anshuman Khandual
2020-01-14 8:30 ` David Hildenbrand
2020-01-14 9:10 ` Michal Hocko
2020-01-14 10:23 ` Vlastimil Babka
2020-01-14 11:03 ` Anshuman Khandual
2020-01-14 11:32 ` Michal Hocko
2020-01-14 12:04 ` [PATCH] mm, debug: always print flags in dump_page() Vlastimil Babka
2020-01-14 13:35 ` Michal Hocko
2020-01-14 18:22 ` [PATCH] mm/page_alloc: Add a reason for reserved pages in has_unmovable_pages() Ralph Campbell
2019-10-04 10:58 ` Michal Hocko
2019-10-04 11:44 ` Anshuman Khandual
2019-10-04 12:56 ` Qian Cai [this message]
2019-10-04 13:07 ` Michal Hocko
2019-10-04 13:30 ` Qian Cai
2019-10-04 13:38 ` Michal Hocko
2019-10-04 13:56 ` Qian Cai
2019-10-04 14:41 ` Michal Hocko
2019-10-05 21:22 ` Andrew Morton
2019-10-05 22:38 ` Qian Cai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1570193776.5576.270.camel@lca.pw \
--to=cai@lca.pw \
--cc=akpm@linux-foundation.org \
--cc=anshuman.khandual@arm.com \
--cc=dan.j.williams@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@techsingularity.net \
--cc=mhocko@kernel.org \
--cc=osalvador@suse.de \
--cc=pavel.tatashin@microsoft.com \
--cc=rppt@linux.ibm.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.