From: SeongJae Park <sj@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: gwhite@kupulau.com, Hailong Tu <tuhailong@gmail.com>,
SeongJae Park <sj@kernel.org>,
bugzilla-daemon@kernel.org, linux-mm@kvack.org,
damon@lists.linux.dev
Subject: Re: [Bug 216072] New: regression: ccccccgcdkgekhjervgbdfbhdjugcjkfdhiegeuugugtHang at boot when DAMON is enabled
Date: Sat, 4 Jun 2022 19:22:22 +0000
Message-ID: <20220604192222.1488-1-sj@kernel.org>
In-Reply-To: <20220604112706.d50208c3c15a748d1c04c584@linux-foundation.org>
Cc-ing damon@lists.linux.dev
Thank you for reporting this, Greg! And thank you for forwarding this, Andrew!
On Sat, 4 Jun 2022 11:27:06 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Sat, 04 Jun 2022 15:49:50 +0000 bugzilla-daemon@kernel.org wrote:
>
> > https://bugzilla.kernel.org/show_bug.cgi?id=216072
> >
> > Bug ID: 216072
> > Summary: regression:
> > ccccccgcdkgekhjervgbdfbhdjugcjkfdhiegeuugugtHang at
> > boot when DAMON is enabled
> > Product: Memory Management
> > Version: 2.5
> > Kernel Version: 5.19 pre-rc1
> > Hardware: All
> > OS: Linux
> > Tree: Mainline
> > Status: NEW
> > Severity: normal
> > Priority: P1
> > Component: Other
> > Assignee: akpm@linux-foundation.org
> > Reporter: gwhite@kupulau.com
> > Regression: No
> >
> > I see a hang on boot whenever DAMON is enabled. The specific commit that
> > causes this is listed below. There is no printk / dmesg output, only the
> > message about an initrd being loaded by EFIStup. Then a hard hang. Removing
> > the commit below - or disabling DAMON entirely - fixes the issue.
> >
> > commit 059342d1dd4e01d634184793fa3f8437e62afaa1
> > Author: Hailong Tu <tuhailong@gmail.com>
> > Date: Fri Apr 29 14:37:00 2022 -0700
> >
> > mm/damon/reclaim: fix the timer always stays active
> >
> > The timer stays active even if the reclaim mechanism is never enabled. It
> > is unnecessary overhead can be completely avoided by using
> > module_param_cb() for enabled flag.
> >
> > Link:
> > https://lkml.kernel.org/r/20220421125910.1052459-1-tuhailong@gmail.com
> > Signed-off-by: Hailong Tu <tuhailong@gmail.com>
> > Reviewed-by: SeongJae Park <sj@kernel.org>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Greg has further mentioned that the issue can be reproduced when the kernel
is booted with the damon_reclaim.enabled=Y parameter, and I was also able to
reproduce it on my test machine.
DAMON_RECLAIM calls 'schedule_delayed_work()', which uses 'system_wq', from a
parameter store callback ('enabled_store()').  When the parameter is given on
the kernel command line, that callback is called from 'parse_args()', which is
in turn called from 'start_kernel()'.
However, 'system_wq' is initialized by 'workqueue_init_early()', which
'start_kernel()' calls only after 'parse_args()'.
Therefore 'schedule_delayed_work()' touches the uninitialized 'system_wq', the
init process hits a kernel NULL pointer dereference, and the system hangs.
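
To make the ordering concrete, below is a minimal user-space sketch of the
call sequence described above.  It is only a model, not kernel code: every
'model_*' name is hypothetical, and the stand-in for
'schedule_delayed_work()' returns -1 when the queue pointer is still NULL
instead of dereferencing it as the real kernel would.

```c
#include <stddef.h>

/* Hypothetical stand-in for 'system_wq'; stays NULL until "initialized". */
struct model_wq {
	int nr_queued;
};

static struct model_wq model_wq_storage;
static struct model_wq *model_system_wq;

/* Models workqueue_init_early(): queueing is safe only after this runs. */
static void model_workqueue_init_early(void)
{
	model_system_wq = &model_wq_storage;
}

/*
 * Models schedule_delayed_work().  The model reports the NULL queue with
 * -1 rather than crashing, so the ordering problem can be observed.
 */
static int model_schedule_delayed_work(void)
{
	if (!model_system_wq)
		return -1;
	model_system_wq->nr_queued++;
	return 0;
}

/* Models the unpatched enabled_store(): schedules unconditionally. */
static int model_enabled_store(void)
{
	return model_schedule_delayed_work();
}

/* Models parse_args() handling 'damon_reclaim.enabled=Y' at boot. */
static int model_parse_args(void)
{
	return model_enabled_store();
}
```

Calling 'model_parse_args()' before 'model_workqueue_init_early()', which is
the order in 'start_kernel()', returns -1 in this model; the same store after
'model_workqueue_init_early()' succeeds.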
I further confirmed that the simple change below fixes this issue.  I will
format it as a patch and send it soon.
diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
index 53c0c084f046..78984c8d1047 100644
--- a/mm/damon/reclaim.c
+++ b/mm/damon/reclaim.c
@@ -374,6 +374,8 @@ static void damon_reclaim_timer_fn(struct work_struct *work)
 }
 static DECLARE_DELAYED_WORK(damon_reclaim_timer, damon_reclaim_timer_fn);
 
+static bool damon_reclaim_initialized;
+
 static int enabled_store(const char *val,
 		const struct kernel_param *kp)
 {
@@ -382,6 +384,9 @@ static int enabled_store(const char *val,
 	if (rc < 0)
 		return rc;
 
+	if (!damon_reclaim_initialized)
+		return rc;
+
 	if (enabled)
 		schedule_delayed_work(&damon_reclaim_timer, 0);
 
@@ -450,6 +455,8 @@ static int __init damon_reclaim_init(void)
 	damon_add_target(ctx, target);
 
 	schedule_delayed_work(&damon_reclaim_timer, 0);
+
+	damon_reclaim_initialized = true;
 	return 0;
 }
 
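
The effect of the guard in the diff can be sketched in user space as follows.
This is an illustration under assumptions, not the patch itself: all
'model_*' names are hypothetical, and a counter stands in for
'schedule_delayed_work()'.

```c
#include <stdbool.h>

static bool model_enabled;	/* models the 'enabled' parameter */
static bool model_initialized;	/* models 'damon_reclaim_initialized' */
static int model_nr_scheduled;	/* counts modelled scheduling requests */

/* Stand-in for schedule_delayed_work(&damon_reclaim_timer, 0). */
static void model_schedule_timer(void)
{
	model_nr_scheduled++;
}

/*
 * Models the patched enabled_store(): the value is always recorded, but
 * nothing is scheduled until the init function has run.
 */
static int model_enabled_store(bool val)
{
	model_enabled = val;

	if (!model_initialized)
		return 0;

	if (model_enabled)
		model_schedule_timer();
	return 0;
}

/*
 * Models damon_reclaim_init(): it schedules the timer itself and only
 * then allows later stores to schedule.
 */
static int model_reclaim_init(void)
{
	model_schedule_timer();
	model_initialized = true;
	return 0;
}
```

In this model a store made before 'model_reclaim_init()' only records the
value; the init function then schedules the timer once, so the value given at
boot is still picked up, and later stores schedule as before.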
Thanks,
SJ
[...]