From: Ingo Molnar <mingo@kernel.org>
To: David Woodhouse <dwmw2@infradead.org>
Cc: kexec@lists.infradead.org, Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Kai Huang <kai.huang@intel.com>,
	Nikolay Borisov <nik.borisov@suse.com>,
	linux-kernel@vger.kernel.org, Simon Horman <horms@kernel.org>,
	Dave Young <dyoung@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	jpoimboe@kernel.org
Subject: Re: [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG
Date: Mon, 25 Nov 2024 21:34:54 +0100
Message-ID: <Z0TfblQeVRnDc-S1@gmail.com>
In-Reply-To: <334ae44077315e2b69529b6fef8d85ec55f80ecf.camel@infradead.org>


* David Woodhouse <dwmw2@infradead.org> wrote:

> > Just curious: did you write this code to debug the series, or was 
> > there some original hair-tearing regression that motivated you? Is 
> > there an upstream fix to marvel at and be horrified about in 
> > equal measure?
> 
> https://lore.kernel.org/all/2ab14f6f-2690-056b-cf9e-38a12dafd728@amd.com/t/#u
> is the upstream fix.

Which ended up being the following upstream commit:

  88a921aa3c6b ("x86/sev: Ensure that RMP table fixups are reserved")

Might make sense to add this commit reference to one of the central 
patches of the GDT/IDT code, to document how this feature is able to 
pin down very hard to debug regressions. (Even if the upstream fix was 
done independently in probably luckier circumstances.)

> [...] It's all the more horrifying because it was already *fixed* 
> upstream before I lost weeks of my life to chasing it. And the 
> trigger which actually made it *happen*, and made our production 
> systems allocate memory within that dangerous 1MiB region adjacent to 
> the RMP table, was a tweak to the NMI watchdog period... leading to 
> an assumption that we were getting stray perf NMIs during the kexec, 
> and a *long* wild goose chase based on that false assumption...

:-/

> Once I'd written the debug code, I just wanted to clean it up a bit 
> and push it out for the benefit of others; that *was* the main point 
> of this series. All the rest of the cleanups are just yak shaving.
> 
> The realisation that we never even explicitly mapped the control code 
> page and always just got lucky because it happened to be in the same 
> 2MiB or 1GiB superpage as something else that we did map... was just 
> a bonus :)

I'm amazed and horrified in equal measure ;-)

> (That one is fixed in v3 which I'll post shortly, and is already in 
> https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/kexec-debug
> )
> 
> > I'd argue that this debugging code probably needs a default-off Kconfig 
> > option, even with the obvious hard-coded environmental limitations & 
> > assumptions it has. Could be useful to very early debugging & would 
> > preserve your effort without it bitrotting too obviously.
> 
> Yeah. In v3 I've made it a config option, and made it use the 
> early_printk serial console (as long as that's an I/O based 8250; we 
> can add others too later).

That's lovely!

Thanks,

	Ingo


Thread overview: 21+ messages
2024-11-22 22:38 [RFC PATCH v2 0/16] x86/kexec: Add exception handling for relocate_kernel and further yak-shaving David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 01/16] x86/kexec: Clean up and document register use in relocate_kernel_64.S David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 02/16] x86/kexec: Use named labels in swap_pages " David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 03/16] x86/kexec: Restore GDT on return from preserve_context kexec David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 04/16] x86/kexec: Only swap pages for preserve_context mode David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 05/16] x86/kexec: Invoke copy of relocate_kernel() instead of the original David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 06/16] x86/kexec: Move relocate_kernel to kernel .data section David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 07/16] x86/kexec: Add data section to relocate_kernel David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 08/16] x86/kexec: Copy control page into place in machine_kexec_prepare() David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 09/16] x86/kexec: Drop page_list argument from relocate_kernel() David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 10/16] x86/kexec: Eliminate writes through kernel mapping of relocate_kernel page David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 11/16] x86/kexec: Clean up register usage in relocate_kernel() David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 12/16] x86/kexec: Mark relocate_kernel page as ROX instead of RWX David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 13/16] x86/kexec: Debugging support: load a GDT David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 14/16] x86/kexec: Debugging support: Load an IDT and basic exception entry points David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 15/16] x86/kexec: Debugging support: Dump registers on exception David Woodhouse
2024-11-22 22:38 ` [RFC PATCH v2 16/16] [DO NOT MERGE] x86/kexec: enable DEBUG David Woodhouse
2024-11-25  9:21   ` Ingo Molnar
2024-11-25  9:32     ` David Woodhouse
2024-11-25 20:34       ` Ingo Molnar [this message]
2024-11-25 20:46         ` David Woodhouse
