From: SeongJae Park <sj@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: gwhite@kupulau.com, Hailong Tu <tuhailong@gmail.com>,
	SeongJae Park <sj@kernel.org>,
	bugzilla-daemon@kernel.org, linux-mm@kvack.org,
	damon@lists.linux.dev
Subject: Re: [Bug 216072] New: regression: Hang at boot when DAMON is enabled
Date: Sat,  4 Jun 2022 19:22:22 +0000
Message-ID: <20220604192222.1488-1-sj@kernel.org>
In-Reply-To: <20220604112706.d50208c3c15a748d1c04c584@linux-foundation.org>

Cc-ing damon@lists.linux.dev

Thank you for reporting this, Greg!  And thank you for forwarding this, Andrew!

On Sat, 4 Jun 2022 11:27:06 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:

> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Sat, 04 Jun 2022 15:49:50 +0000 bugzilla-daemon@kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=216072
> > 
> >             Bug ID: 216072
> >            Summary: regression: Hang at boot when DAMON is enabled
> >            Product: Memory Management
> >            Version: 2.5
> >     Kernel Version: 5.19 pre-rc1
> >           Hardware: All
> >                 OS: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: Other
> >           Assignee: akpm@linux-foundation.org
> >           Reporter: gwhite@kupulau.com
> >         Regression: No
> > 
> > I see a hang on boot whenever DAMON is enabled.  The specific commit that
> > causes this is listed below.  There is no printk / dmesg output, only the
> > message about an initrd being loaded by EFI stub.  Then a hard hang.  Removing
> > the commit below - or disabling DAMON entirely - fixes the issue.
> > 
> > commit 059342d1dd4e01d634184793fa3f8437e62afaa1
> > Author: Hailong Tu <tuhailong@gmail.com>
> > Date:   Fri Apr 29 14:37:00 2022 -0700
> > 
> >     mm/damon/reclaim: fix the timer always stays active
> > 
> >     The timer stays active even if the reclaim mechanism is never enabled.  It
> >     is unnecessary overhead can be completely avoided by using
> >     module_param_cb() for enabled flag.
> > 
> >     Link:
> > https://lkml.kernel.org/r/20220421125910.1052459-1-tuhailong@gmail.com
> >     Signed-off-by: Hailong Tu <tuhailong@gmail.com>
> >     Reviewed-by: SeongJae Park <sj@kernel.org>
> >     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Greg has further mentioned that the issue can be reproduced when the kernel is
booted with the 'damon_reclaim.enabled=Y' parameter, and I was also able to
reproduce it on my test machine.

DAMON_RECLAIM calls 'schedule_delayed_work()', which uses 'system_wq', from a
parameter store callback ('enabled_store()'), which is called from
'parse_args()', which is again called from 'start_kernel()'.

And 'system_wq' is initialized from 'workqueue_init_early()', which is called
from 'start_kernel()' after 'parse_args()'.

Therefore 'schedule_delayed_work()' touches the uninitialized 'system_wq', the
init process hits a kernel NULL pointer dereference, and the system hangs.
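
To make the ordering concrete, below is a minimal userspace sketch of the
sequence described above.  It is not kernel code: every 'model_*' name is a
hypothetical stand-in, and where the real kernel would dereference NULL, the
model only returns false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a workqueue; not the kernel's struct. */
struct model_wq {
	int queued;
};

static struct model_wq model_wq_storage;
static struct model_wq *model_system_wq;	/* NULL until initialized */

/* Models schedule_delayed_work(); false marks where the kernel would oops. */
static bool model_schedule_delayed_work(void)
{
	if (!model_system_wq)
		return false;	/* the real code dereferences NULL here */
	model_system_wq->queued++;
	return true;
}

/* Models workqueue_init_early(): 'system_wq' becomes usable from here on. */
static void model_workqueue_init_early(void)
{
	model_system_wq = &model_wq_storage;
}

/* Models parse_args() reaching enabled_store() for damon_reclaim.enabled. */
static bool model_parse_args(bool enabled)
{
	return enabled ? model_schedule_delayed_work() : true;
}

/*
 * Models the relevant order in start_kernel(): parameters are parsed
 * before the workqueues are set up.  Returns false if the parameter
 * callback touched the workqueue too early.
 */
static bool model_start_kernel(bool enabled)
{
	bool ok;

	model_system_wq = NULL;
	ok = model_parse_args(enabled);
	model_workqueue_init_early();
	return ok;
}
```

In this model, 'model_start_kernel(true)' returns false while
'model_start_kernel(false)' returns true, which matches the report that only
boots with 'damon_reclaim.enabled=Y' hang.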

I further confirmed that the simple change below fixes this issue.  I will
format it as a patch and send it soon.

diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
index 53c0c084f046..78984c8d1047 100644
--- a/mm/damon/reclaim.c
+++ b/mm/damon/reclaim.c
@@ -374,6 +374,8 @@ static void damon_reclaim_timer_fn(struct work_struct *work)
 }
 static DECLARE_DELAYED_WORK(damon_reclaim_timer, damon_reclaim_timer_fn);

+static bool damon_reclaim_initialized;
+
 static int enabled_store(const char *val,
                const struct kernel_param *kp)
 {
@@ -382,6 +384,9 @@ static int enabled_store(const char *val,
        if (rc < 0)
                return rc;

+       if (!damon_reclaim_initialized)
+               return rc;
+
        if (enabled)
                schedule_delayed_work(&damon_reclaim_timer, 0);

@@ -450,6 +455,8 @@ static int __init damon_reclaim_init(void)
        damon_add_target(ctx, target);

        schedule_delayed_work(&damon_reclaim_timer, 0);
+
+       damon_reclaim_initialized = true;
        return 0;
 }
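
The effect of the guard can also be sketched in userspace.  Again, this is
only an illustration under assumptions: all 'model_*' names are hypothetical,
the 'use_guard' switch exists only to compare the two behaviors, and a counter
stands in for the NULL pointer dereference.

```c
#include <stdbool.h>

static bool model_enabled;	/* models the 'enabled' parameter */
static bool model_initialized;	/* models 'damon_reclaim_initialized' */
static bool model_use_guard;	/* hypothetical switch: with or without the fix */
static bool model_wq_ready;	/* true once the workqueue is set up */
static int model_scheduled;	/* successful schedule calls */
static int model_oops;		/* schedule calls made before the wq was ready */

/* Models schedule_delayed_work(&damon_reclaim_timer, 0). */
static void model_schedule(void)
{
	if (!model_wq_ready)
		model_oops++;
	else
		model_scheduled++;
}

/* Models enabled_store(): with the guard, early calls only record the value. */
static void model_enabled_store(bool val)
{
	model_enabled = val;

	if (model_use_guard && !model_initialized)
		return;

	if (model_enabled)
		model_schedule();
}

/* Models damon_reclaim_init(): schedules the timer, then sets the flag. */
static void model_reclaim_init(void)
{
	model_schedule();
	model_initialized = true;
}

/* Models one boot: parameter parsing, workqueue setup, then the init call. */
static void model_boot(bool enabled_param, bool use_guard)
{
	model_enabled = false;
	model_initialized = false;
	model_wq_ready = false;
	model_scheduled = 0;
	model_oops = 0;
	model_use_guard = use_guard;

	model_enabled_store(enabled_param);	/* via parse_args() */
	model_wq_ready = true;			/* workqueue_init_early() */
	model_reclaim_init();			/* runs later, as an initcall */
}
```

With 'model_boot(true, false)' the early store is counted in 'model_oops',
while 'model_boot(true, true)' leaves it at zero and the timer is scheduled
only from 'model_reclaim_init()'.  Stores made after initialization still
schedule the timer as before.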



Thanks,
SJ


Thread overview: 3+ messages
     [not found] <bug-216072-27@https.bugzilla.kernel.org/>
2022-06-04 18:27 ` [Bug 216072] New: regression: Hang at boot when DAMON is enabled Andrew Morton
2022-06-04 19:22   ` SeongJae Park [this message]
2022-06-04 19:50     ` [PATCH] mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized SeongJae Park
