linux-pm.vger.kernel.org archive mirror
From: Baoquan He <bhe@redhat.com>
To: Roberto Ricci <io@r-ricci.it>
Cc: Dave Young <dyoung@redhat.com>,
	ebiederm@xmission.com, rafael@kernel.org, pavel@ucw.cz,
	ytcoode@gmail.com, kexec@lists.infradead.org,
	linux-pm@vger.kernel.org, akpm@linux-foundation.org,
	regressions@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [REGRESSION] Kernel booted via kexec fails to resume from hibernation
Date: Sat, 29 Mar 2025 09:44:06 +0800
Message-ID: <Z+dQZozsbdls6yqJ@MiWiFi-R3L-srv>
In-Reply-To: <Z-c7V2hptt9U9UCl@desktop0a>

On 03/29/25 at 01:14am, Roberto Ricci wrote:
> On 2025-01-27 10:42 +0800, Dave Young wrote:
> > On Mon, 27 Jan 2025 at 10:39, Dave Young <dyoung@redhat.com> wrote:
> > > On 01/13/25 at 10:28pm, Roberto Ricci wrote:
> > > > After rebooting the system via kexec, hibernating and rebooting the machine, this oops occurs:
> > > >
> > > [snip]
> > > >
> > > > I will send the kernel config and dmesg in replies to this email.
> > > >
> > >
> > > I tried your config (removed some config driver related which is not useful), but it can not boot on my kvm guest.
> > > Firstly I saw a panic in ftrace path,  then I rebuilt the kernel without ftrace, it panicked again but in kvm related code path.
> > > Both are not related to kexec at all so I suspect your bug is not kexec specific.
> > >
> > > [snip]
> > >
> > > You can find the kernel config here (with the ftrace enabled):
> > > https://people.redhat.com/~ruyang/snakeyear/panic-ftrace.config
> > 
> > BTW, if I disable KASAN then kernel can boot, anyway kexec +
> > hibernation works fine with a few tests, no panics.
> > 
> > >
> > > Thanks
> > > Dave
> 
> Hi,
> 
> sorry for the late reply. I tried your modified config, but I'm getting
> the same oops I originally reported. No idea why the oops is not
> happening for you.

It is not that the oops does not happen on my side; I cannot even boot a
kernel built with the config you provided on Fedora.

> 
> Anyway, I performed yet another bisection, this time with just plain
> defconfig plus CONFIG_KEXEC_FILE=y, and I got different results.
> 
> Updated steps to reproduce:
> 1. Boot kernel >= v6.8 in a virtual machine created with this command:
>    `qemu-system-x86_64 -enable-kvm -smp 1 -m 4.0G -hda disk.qcow2`
> 2. Load the same kernel with:
>    `kexec --kexec-file-syscall -l /boot/vmlinuz-6.14.0 --initrd /boot/initramfs-6.14.0.img --reuse-cmdline`
> 3. Reboot (or call `kexec -e` directly)
> 4. Hibernate and reboot: `printf reboot >/sys/power/disk && printf disk >/sys/power/state`
> 5. Upon resuming, three things could happen, depending on luck:

OK, this is a little complicated. I am wondering why you need to
hibernate and then reboot. Just out of curiosity.

> 5a. A kernel oops:
> ```
> [   42.574201] BUG: kernel NULL pointer dereference, address: 0000000000000000
...snip... 
> I will send config and dmesg in replies to this email.
> 
> The bisection pointed to
> b3ba234171cd kexec_file: load kernel at top of system RAM if required
> 
> #regzbot introduced: b3ba234171cd0d58df0a13c262210ff8b5fd2830
> 
> Now that I think about it, this was the commit I found when I did the
> very first bisection after I found the bug. But I could not get the same
> result with subsequent bisections, so I didn't mention it in my original
> report.
> 
> When reverting b3ba234171cd on top of v6.14, merge conflicts must be
> solved, I hope I did it right:

I am not sure how this commit could cause the failure. I have several
questions; could you help answer them?

1) Can this problem be reliably reproduced with kexec_file_load?

2) If the answer to 1) is yes, does reverting b3ba234171cd reliably fix it?

3) If the answers to 1) and 2) are yes, does kexec_load work for you? I ask
because the kexec_load interface by default puts the kexec kernel at the top
of system RAM, which is equivalent to applying commit b3ba234171cd.

4) Can you add '-d' to 'kexec -l' to print more debugging messages?

5) Can a normal kexec trigger the failure? I mean running kexec without
the hibernation/resume step.

> 
> ```
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 3eedb8c226ad..3014be212afd 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -614,10 +614,7 @@ static int kexec_walk_resources(struct kexec_buf *kbuf,
>                                            crashk_res.start, crashk_res.end,
>                                            kbuf, func);
>  #endif
> -       if (kbuf->top_down)
> -               return walk_system_ram_res_rev(0, ULONG_MAX, kbuf, func);
> -       else
> -               return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
> +       return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
>  }
> 
>  /**
> ```
> 
> Applying this diff solves the problem for v6.14.
> 
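
For readers following the thread, here is a minimal userspace sketch of the
difference the diff above toggles: scanning RAM ranges from the bottom
(walk_system_ram_res) versus from the top (walk_system_ram_res_rev). It is
not kernel code. `struct ram_range` and `place_buffer()` are hypothetical
names made up for illustration, the ranges are assumed to be sorted by
ascending address, and alignment and already-reserved regions are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one system RAM range; both bounds are inclusive. */
struct ram_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Pick a load address for a buffer of `size` bytes.
 *
 * Bottom-up: visit ranges from the lowest address and place the buffer at
 * the start of the first range that is large enough.
 * Top-down: visit ranges from the highest address and place the buffer at
 * the end of the first range that is large enough.
 *
 * Returns true and stores the chosen address in *out on success.
 */
static bool place_buffer(const struct ram_range *ranges, size_t n,
			 uint64_t size, bool top_down, uint64_t *out)
{
	assert(ranges != NULL && out != NULL);

	if (size == 0)
		return false;

	for (size_t i = 0; i < n; i++) {
		const struct ram_range *r = &ranges[top_down ? n - 1 - i : i];

		/* Range length is end - start + 1; compare without overflow. */
		if (r->end - r->start < size - 1)
			continue;

		*out = top_down ? r->end - size + 1 : r->start;
		return true;
	}

	return false;
}
```

With an illustrative layout of three ranges (below 1 MiB, 1 MiB to 2 GiB, and
1 GiB starting at 4 GiB), a 2 MiB buffer lands at 0x100000 bottom-up and at
0x13fe00000 top-down, so the two policies can put the kexec'ed kernel in very
different parts of memory.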


