From: Paul Gortmaker <paulg@kernel.org>
To: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Richard Purdie <richard.purdie@linuxfoundation.org>
Subject: Re: Intermittent Qemu boot hang/regression traced back to INT 0x80 changes
Date: Fri, 26 Apr 2024 08:24:02 -0400
Message-ID: <20240426122402.GA36092@kernel.org>
In-Reply-To: <20240424195157.GGZili3b-AxmUDlipA@fat_crate.local>

[Apologies for repeated info; last mail didn't make it to the list]

[Re: Intermittent Qemu boot hang/regression traced back to INT 0x80 changes] On 24/04/2024 (Wed 21:51) Borislav Petkov wrote:

> On Wed, Apr 24, 2024 at 02:58:06PM -0400, Paul Gortmaker wrote:
> ...
> > pci 0000:00:1d.0: [8086:2934] type 00 class 0x0c0300 conventional PCI endpoint
> > pci 0000:00:1d.0: BAR 4 [io  0xc080-0xc09f]
> > pci 0000:00:1d.1: [8086:2935] type 00 class 0x0c0300 conventional PCI endpoint
> > pci 0000:00:1d.1: BAR 4 [io  0xc0a0-0xc0bf]
> > pci 0000:00:1d.2: [8086:2936] type 00 class 0x0c0300 conventional PCI endpoint
> > <hang - not always exactly here, but always in this block of PCI printk>
> 
> How would those commits have anything to do with such an early hang?!
> 
> Nothing that early is issuing INT80 32-bit syscalls, is it?
> 
> Btw, can you checkout the Linus tree at...
> 
> f35e46631b28 Merge tag 'x86-int80-20231207' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> f4116bfc4462 x86/tdx: Allow 32-bit emulation by default
> 
> 
> <-- here and test that commit as the top one?
> 
> 55617fb991df x86/entry: Do not allow external 0x80 interrupts 

They both show the issue, but that really doesn't matter now.  When you
guys pointed out it really didn't make sense, I did what I should have
done before - tested the crap out of ^1, the trunk just before the
INT80 merge:

commit f35e46631b28a63ca3887d7afef1a65a5544da52
Merge: 55b224d90d44 f4116bfc4462
       ^^^^^^^^^^^^
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Thu Dec 7 11:56:34 2023 -0800

    Merge tag 'x86-int80-20231207' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

...which would be 55b224d90d44 (parisc merge).  So I left that run
for near 24h (almost 2000 runs), and got 8 PCI-hang instances.  :(
Which means INT80 isn't even there yet.
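For readers unfamiliar with the ^1 notation: it selects the first parent of a merge, i.e. the trunk as it was before the merged branch landed. The sketch below only parses the "Merge:" line from the log excerpt quoted above; the helper name merge_parents is hypothetical and not part of git.

```python
import re


def merge_parents(log_text: str) -> list[str]:
    """Return the parent hashes listed on the 'Merge:' line of `git log` output."""
    match = re.search(r"^Merge:\s+(.+)$", log_text, flags=re.MULTILINE)
    if match is None:
        return []  # no Merge: line, so not a merge commit
    return match.group(1).split()


# Excerpt of the commit header quoted in this mail.
log_text = """commit f35e46631b28a63ca3887d7afef1a65a5544da52
Merge: 55b224d90d44 f4116bfc4462
Author: Linus Torvalds <torvalds@linux-foundation.org>
"""

parents = merge_parents(log_text)
first_parent = parents[0]   # what f35e46631b28^1 refers to (trunk before the merge)
second_parent = parents[1]  # what f35e46631b28^2 refers to (tip of the merged branch)
print(first_parent, second_parent)
```

Testing the first parent therefore exercises a tree that contains none of the INT80 commits.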

So I owe you guys an apology for pointing the finger at INT80.  I still
don't understand how the pseudo bisect on v6.6-stable seems so
"concrete".  The v6.6.6 worked "fine" (it seemed) and v6.6.7 died fairly
quickly.  The revert of INT80 on v6.6.7 seemed to "fix" it - but if so,
it was only because it perturbed something else.

I already knew my "good" bisect points were not "proven" good, but only
statistically "good".  Seems I need to revisit some of those "good" data
points (both on v6.6-stable and on mainline) and test longer.
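As a rough guide to how long "longer" might be, here is a minimal sketch of the arithmetic. It assumes every boot is an independent trial with the same hang probability, and it rounds "almost 2000 runs" to 2000; both are assumptions, and the function name clean_runs_needed is made up for illustration.

```python
import math


def clean_runs_needed(failure_rate: float, confidence: float) -> int:
    """Smallest N such that N clean boots in a row would occur with
    probability <= (1 - confidence) if the bug were present."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no hang in N runs | bug present) = (1 - failure_rate) ** N
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


hangs, runs = 8, 2000  # assumption: "almost 2000 runs" rounded up to 2000
rate = hangs / runs

for confidence in (0.95, 0.99):
    needed = clean_runs_needed(rate, confidence)
    print(f"{confidence:.0%} confidence: {needed} consecutive clean boots")
```

At a hang rate this low, a "good" point that only survived a few hundred boots says very little, which would be consistent with the misleading bisect results described above.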

> 
> which reminds me - that hang could be actually that guest kernel
> panicking but the panic not coming out to the console.
> 
> When it hangs, can you connect with gdb to qemu and dump stack and
> registers?
> 
> Make sure you have DEBUG_INFO enabled in the guest kernel.

I want to try some of these things, but I also don't want to
accidentally lose the reproducer I have.  Maybe I'll see if I can
reproduce it at home, since I'll lose use of the current box in a week
anyway...
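For reference, the gdb attach suggested in the quoted text could be scripted roughly as follows. This is a sketch under assumptions not stated in this thread: QEMU was started with -s (its shorthand for a gdbstub on TCP port 1234), and a vmlinux built with DEBUG_INFO is available at the path passed in. The function names are hypothetical.

```python
import subprocess


def gdb_dump_command(vmlinux: str, host: str = "localhost", port: int = 1234) -> list[str]:
    """Build a gdb batch command that attaches to QEMU's gdbstub and
    dumps registers plus a backtrace for every vCPU."""
    return [
        "gdb", "-batch",
        "-ex", f"target remote {host}:{port}",  # attach to the QEMU gdbstub
        "-ex", "info registers",                # registers of the current vCPU
        "-ex", "thread apply all bt",           # each vCPU shows up as a gdb thread
        vmlinux,                                # symbol file with debug info
    ]


def dump_hung_guest(vmlinux: str, timeout_s: int = 60) -> str:
    """Run the command above and return gdb's combined output."""
    result = subprocess.run(
        gdb_dump_command(vmlinux),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return result.stdout + result.stderr


# Example (only meaningful while a hung guest is waiting on port 1234):
#   print(dump_hung_guest("/path/to/vmlinux"))
```

Running gdb in batch mode detaches once the commands finish, so the dump could be collected from a wrapper script that notices the boot has stalled.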

Again, sorry for the false positive.  I let the v6.6-stable testing bias
my mainline conclusions to where I didn't test underneath INT80.  I'll
follow up with more details once (if?) I manage to properly sort this.

Paul.
--

> 
> Is this even a guest?
> 
> I know you had guests last time you reported the alternatives issue.
> 
> Right, and then test the tree checked out at this commit:
> 
> be5341eb0d43 x86/entry: Convert INT 0x80 emulation to IDTENTRY
> 
> The others should be unrelated...
> 
> b82a8dbd3d2f x86/coco: Disable 32-bit emulation by default on TDX and SEV
> 
> Hmm.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette
