From: Baoquan He <bhe@redhat.com>
To: Roberto Ricci <io@r-ricci.it>
Cc: Dave Young <dyoung@redhat.com>,
	ebiederm@xmission.com, rafael@kernel.org, pavel@ucw.cz,
	ytcoode@gmail.com, kexec@lists.infradead.org,
	linux-pm@vger.kernel.org, akpm@linux-foundation.org,
	regressions@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [REGRESSION] Kernel booted via kexec fails to resume from hibernation
Date: Sat, 29 Mar 2025 09:44:06 +0800
Message-ID: <Z+dQZozsbdls6yqJ@MiWiFi-R3L-srv>
In-Reply-To: <Z-c7V2hptt9U9UCl@desktop0a>

On 03/29/25 at 01:14am, Roberto Ricci wrote:
> On 2025-01-27 10:42 +0800, Dave Young wrote:
> > On Mon, 27 Jan 2025 at 10:39, Dave Young <dyoung@redhat.com> wrote:
> > > On 01/13/25 at 10:28pm, Roberto Ricci wrote:
> > > > After rebooting the system via kexec, hibernating and rebooting the machine, this oops occurs:
> > > >
> > > [snip]
> > > >
> > > > I will send the kernel config and dmesg in replies to this email.
> > > >
> > >
> > > I tried your config (with some unneeded driver-related options removed), but it cannot boot on my KVM guest.
> > > First I saw a panic in the ftrace path; then I rebuilt the kernel without ftrace, and it panicked again, this time in a KVM-related code path.
> > > Neither is related to kexec at all, so I suspect your bug is not kexec-specific.
> > >
> > > [snip]
> > >
> > > You can find the kernel config here (with the ftrace enabled):
> > > https://people.redhat.com/~ruyang/snakeyear/panic-ftrace.config
> > 
> > BTW, if I disable KASAN then kernel can boot, anyway kexec +
> > hibernation works fine with a few tests, no panics.
> > 
> > >
> > > Thanks
> > > Dave
> 
> Hi,
> 
> sorry for the late reply. I tried your modified config, but I'm getting
> the same oops I originally reported. No idea why the oops is not
> happening for you.

It's not that the oops doesn't happen on my side; I can't even boot a
kernel built with the config you provided on Fedora.

> 
> Anyway, I performed yet another bisection, this time with just plain
> defconfig plus CONFIG_KEXEC_FILE=y, and I got different results.
> 
> Updated steps to reproduce:
> 1. Boot kernel >= v6.8 in a virtual machine created with this command:
>    `qemu-system-x86_64 -enable-kvm -smp 1 -m 4.0G -hda disk.qcow2`
> 2. Load the same kernel with:
>    `kexec --kexec-file-syscall -l /boot/vmlinuz-6.14.0 --initrd /boot/initramfs-6.14.0.img --reuse-cmdline`
> 3. Reboot (or call `kexec -e` directly)
> 4. Hibernate and reboot: `printf reboot >/sys/power/disk && printf disk >/sys/power/state`
> 5. Upon resuming, three things could happen, depending on luck:

OK, this is a little complicated. Just out of curiosity, why do you need
to hibernate and then reboot?

> 5a. A kernel oops:
> ```
> [   42.574201] BUG: kernel NULL pointer dereference, address: 0000000000000000
...snip... 
> I will send config and dmesg in replies to this email.
> 
> The bisection pointed to
> b3ba234171cd kexec_file: load kernel at top of system RAM if required
> 
> #regzbot introduced: b3ba234171cd0d58df0a13c262210ff8b5fd2830
> 
> Now that I think about it, this was the commit I found when I did the
> very first bisection after I found the bug. But I could not get the same
> result with subsequent bisections, so I didn't mention it in my original
> report.
> 
> When reverting b3ba234171cd on top of v6.14, merge conflicts must be
> solved, I hope I did it right:

I'm not sure how this commit could cause the failure. I have several
questions; could you help answer them?

1) Can this problem be reproduced reliably with kexec_file_load?

2) If the answer to 1) is yes, does reverting b3ba234171cd fix it
reliably?

3) If the answers to 1) and 2) are both yes, does kexec_load work for
you? I'm asking because the kexec_load interface by default puts the
kexec kernel at the top of system RAM, which is equivalent to applying
commit b3ba234171cd.

4) Can you add '-d' to 'kexec -l' to print more debugging messages?

5) Can a normal kexec trigger the failure? I mean running kexec without
the hibernation/resume step.
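
For reference, the tests in 3) to 5) could look roughly like the dry-run
sketch below. It only prints the command lines, it does not run them. The
kernel and initrd paths are copied from your reproduction steps, the
helper name kexec_load_cmd is made up for illustration, and I'm assuming
a kexec-tools build that provides --kexec-syscall (to force the legacy
kexec_load syscall) and -d (debug output).

```shell
#!/bin/sh
# Dry-run sketch: prints the kexec command lines for the tests asked above
# instead of executing them. Paths come from the reproduction steps and
# are assumptions; adjust them for your system.
KERNEL=/boot/vmlinuz-6.14.0
INITRD=/boot/initramfs-6.14.0.img

# Hypothetical helper. $1 selects the syscall used for loading:
#   file   -> kexec_file_load (--kexec-file-syscall)
#   legacy -> kexec_load      (--kexec-syscall)
kexec_load_cmd() {
    case "$1" in
        file)   syscall_opt=--kexec-file-syscall ;;
        legacy) syscall_opt=--kexec-syscall ;;
        *)      echo "usage: kexec_load_cmd file|legacy" >&2; return 1 ;;
    esac
    # -d asks kexec-tools for debug output while loading (question 4).
    echo "kexec -d $syscall_opt -l $KERNEL --initrd $INITRD --reuse-cmdline"
}

kexec_load_cmd file     # questions 1), 2) and 4)
kexec_load_cmd legacy   # question 3)
echo "kexec -e"         # question 5): plain kexec reboot, no hibernation
```

Running each printed command as root, with and without the hibernation
step afterwards, should cover the combinations I'm asking about.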

> 
> ```
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 3eedb8c226ad..3014be212afd 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -614,10 +614,7 @@ static int kexec_walk_resources(struct kexec_buf *kbuf,
>                                            crashk_res.start, crashk_res.end,
>                                            kbuf, func);
>  #endif
> -       if (kbuf->top_down)
> -               return walk_system_ram_res_rev(0, ULONG_MAX, kbuf, func);
> -       else
> -               return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
> +       return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
>  }
> 
>  /**
> ```
> 
> Applying this diff solves the problem for v6.14.
> 



