From: Roberto Ricci <io@r-ricci.it>
To: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>,
ebiederm@xmission.com, rafael@kernel.org, pavel@ucw.cz,
ytcoode@gmail.com, kexec@lists.infradead.org,
linux-pm@vger.kernel.org, akpm@linux-foundation.org,
regressions@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [REGRESSION] Kernel booted via kexec fails to resume from hibernation
Date: Sat, 29 Mar 2025 21:30:17 +0100
Message-ID: <Z-hYWc9LtBU1Yhtg@desktop0a>
In-Reply-To: <Z+dQZozsbdls6yqJ@MiWiFi-R3L-srv>
On 2025-03-29 09:44 +0800, Baoquan He wrote:
> On 03/29/25 at 01:14am, Roberto Ricci wrote:
> [snip]
> > Anyway, I performed yet another bisection, this time with just plain
> > defconfig plus CONFIG_KEXEC_FILE=y, and I got different results.
> >
> > Updated steps to reproduce:
> > 1. Boot kernel >= v6.8 in a virtual machine created with this command:
> > `qemu-system-x86_64 -enable-kvm -smp 1 -m 4.0G -hda disk.qcow2`
> > 2. Load the same kernel with:
> > `kexec --kexec-file-syscall -l /boot/vmlinuz-6.14.0 --initrd /boot/initramfs-6.14.0.img --reuse-cmdline`
> > 3. Reboot (or call `kexec -e` directly)
> > 4. Hibernate and reboot: `printf reboot >/sys/power/disk && printf disk >/sys/power/state`
> > 5. Upon resuming, three things could happen, depending on luck:
>
> OK, this is a little complicated. wondering why you need to do the
> hibernation and reboot. Just for curiosity.
The reason I do hibernation and reboot, instead of hibernating and then
manually booting again, is just convenience during tests. The issue occurs
with a manual reboot too.

The reason I want kexec + hibernation to work is to fix a hibernation
issue on a system using ZFSBootMenu, a Linux-based bootloader which
uses kexec to boot the final OS. Other software using the same
mechanism includes Petitboot and LinuxBoot. They might be affected as
well, but I didn't try them.
> > 5a. A kernel oops:
> > ```
> > [ 42.574201] BUG: kernel NULL pointer dereference, address: 0000000000000000
> ...snip...
> > I will send config and dmesg in replies to this email.
> >
> > The bisection pointed to
> > b3ba234171cd kexec_file: load kernel at top of system RAM if required
> [snip]
>
> I doubt how this caused the failure. I have several questions, could you
> help answer:
>
> 1) Can this problem be stably reproduced with kexec_file_load?
Every kernel build I tested which contains that commit is affected.
However, a given build will not always lead to the same one of the three
possible outcomes I described. E.g. first you get an oops (case 5a),
then you repeat the same steps with the same kernel image and the
system may get stuck at a black screen instead (case 5b).
But it never fully works.
> 2) if answer to 1) is yes, can reverting b3ba234171cd fix it stably?
Yes. With the revert, none of the cases 5{a,b,c} I previously described
occurs. It seems to work fine.
> 3) If answer to 1) and 2) is yes, does kexec_load works for you? Asking
> this because kexec_load interface defaults to put kexec kernel on top of
> system RAM which is equivalent to applying commit b3ba234171cd.
No, it doesn't. While hibernation alone works, kexec + hibernation
results in the system just rebooting without resuming the hibernation
image, but no crash or other weird behaviour occurs.

Initially I decided to focus on kexec_file_load in order to narrow
things down, but that was before noticing that the bug could manifest
itself in different forms.

It is possible, indeed, that both syscalls are affected by the same
problem, which is not caused by commit b3ba234171cd.

I tried to test kexec_load with some older kernels, but I got build
errors, so I tested longterm releases where such errors have been fixed.
With v4.9.337, kexec (via kexec_load) + hibernation works.
With v5.4.291 it doesn't.
I'm not sure how bisection could be done in this case.
> 4) Can you add '-d' to 'kexec -l' to print more debugging message?
When using kexec_file_load, just these two lines get printed:
```
Try gzip decompression.
Try LZMA decompression.
```
When using kexec_load on kernel v5.4.291 (which doesn't work):
[the output is in a reply to this email]
When using kexec_load on kernel v4.9.337 (which works):
Identical to above, except for the exact hex value of some addresses.
> 5) Can normal kexec trigger the failure? I mean operating kexec w/o
> the hibernation/resumption.
No, kexec without hibernation seems to work fine, regardless of kernel
version and kexec syscall used.
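
For reference, the reproduction steps quoted above could be wrapped in a
small script along these lines. This is only a dry-run sketch, not
something from the original report: the function names and the RUN, KVER
and SYSCALL_FLAG variables are made up for illustration, and the image
paths are the ones from step 2. By default it only prints the commands;
with RUN=1 it executes them (as root). `--kexec-syscall` is the
kexec-tools switch that selects kexec_load instead of kexec_file_load.

```shell
#!/bin/sh
# Dry-run sketch of the kexec + hibernation reproduction steps.
# Hypothetical helper; variable and function names are illustrative only.

KVER="${KVER:-6.14.0}"                               # kernel version from step 2
SYSCALL_FLAG="${SYSCALL_FLAG:---kexec-file-syscall}" # or --kexec-syscall

# Print the command, or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Step 2: load the same kernel via the selected kexec syscall.
load_kernel() {
    run kexec "$SYSCALL_FLAG" -l "/boot/vmlinuz-$KVER" \
        --initrd "/boot/initramfs-$KVER.img" --reuse-cmdline
}

# Step 3: jump into the loaded kernel.
boot_loaded_kernel() {
    run kexec -e
}

# Step 4: hibernate and reboot (to be run from the kexec'ed kernel).
hibernate_and_reboot() {
    if [ "${RUN:-0}" = 1 ]; then
        printf reboot >/sys/power/disk && printf disk >/sys/power/state
    else
        printf '%s\n' 'printf reboot >/sys/power/disk && printf disk >/sys/power/state'
    fi
}

# Stage selection: "load" before the kexec, "hibernate" after it.
case "${1:-}" in
    load)      load_kernel && boot_loaded_kernel ;;
    hibernate) hibernate_and_reboot ;;
esac
```

Running it with `load` in the first kernel and with `hibernate` in the
kexec'ed one covers steps 2 to 4; setting SYSCALL_FLAG=--kexec-syscall
switches to the kexec_load variant discussed in 3).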