From: Kalle Valo <kvalo@kernel.org>
To: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
	 Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	 Ingo Molnar <mingo@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	 "Rafael J. Wysocki" <rafael@kernel.org>,
	 x86@kernel.org,  linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org,  regressions@lists.linux.dev,
	 Jeff Johnson <quic_jjohnson@quicinc.com>,
	 Daniel Sneddon <daniel.sneddon@linux.intel.com>
Subject: Re: [regression] suspend stress test stalls within 30 minutes
Date: Fri, 17 May 2024 20:19:49 +0300
Message-ID: <87ikzcfj9m.fsf@kernel.org>
In-Reply-To: <20240515072231.z3wlyoblyc34ldmr@desk> (Pawan Gupta's message of "Wed, 15 May 2024 00:22:31 -0700")

Pawan Gupta <pawan.kumar.gupta@linux.intel.com> writes:

> On Tue, May 14, 2024 at 09:10:07AM -0700, Dave Hansen wrote:
>
>> On 5/14/24 06:17, Kalle Valo wrote:
>> > The kernel we use in our ath11k testing has almost all kernel debug
>> > features enabled so I decided to disable all of them, which unsurprisingly
>> > also fixed my suspend problems. So maybe this is something which happens
>> > only when MITIGATION_IBRS_ENTRY and some debug option from 'Kernel
>> > hacking' are both enabled?
>> 
>> I had my money on DEBUG_ENTRY, but it doesn't look like you ever had it
>> enabled.
>> 
>> I've got basically two theories:
>> 
>> One, the IBRS value is getting mucked up somewhere, either that %r15
>> value is getting stepped on or the per-cpu value is corrupt and the
>> WRMSR #GP's, causing the hang.
>> 
>> Two, IBRS_{ENTER,EXIT} is called in a "wrong" context somewhere.  Either
>> it is clobbering something it shouldn't or it is assuming something is
>> in place that is not (like a valid stack).
>> 
>> But the whole "'sudo shutdown -h now' then suspend somehow immediately
>> unstalls" thing is really perplexing.  I hope Pawan has some ideas.
>
> Nothing promising yet. I now have a system of the same model, but it
> only boots in recovery mode with the config attached to the report.
>
> Kalle, I wanted to try reverting the below commits:
>
> aa1567a7e644 ("intel_idle: Add ibrs_off module parameter to force-disable IBRS")
> 1e4d3001f59f ("x86/entry: Harden return-to-user")
> c516213726fb ("x86/entry: Optimize common_interrupt_return()")
>
> ... but I haven't reproduced the issue yet.

I can try reverting those, but I haven't managed to do it yet.

> FYI, cmdline "spectre_v2=off" should have the same effect as
> CONFIG_IBRS_ENTRY=n.

Confirmed, I don't see the bug with "spectre_v2=off" and the box
suspended successfully 400 times.

> Another interesting thing to try is cmdline "dis_ucode_ldr".

This didn't help. I tried twice, the first time it failed after 11
suspend loops and the second time after 34 loops.
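
For anyone trying to reproduce this, here is a minimal sketch of a
suspend loop that also checks whether a given parameter is on the kernel
command line. It is not the script used in the tests above: driving
suspend with rtcwake(8), the wake interval, and all function names here
are assumptions for illustration only.

```python
import subprocess
from typing import Callable, List


def kernel_param_present(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole token in a kernel cmdline string."""
    return param in cmdline.split()


def rtcwake_cmd(wake_after_s: int) -> List[str]:
    """Build an rtcwake invocation: suspend to RAM, wake after `wake_after_s` seconds."""
    return ["rtcwake", "-m", "mem", "-s", str(wake_after_s)]


def stress(loops: int, wake_after_s: int,
           run: Callable = subprocess.run) -> int:
    """Run up to `loops` suspend/resume cycles; return how many completed.

    A stalled suspend never returns, so the loop number is printed (and
    flushed) before each attempt; the last number on the console shows
    where the machine got stuck.
    """
    for i in range(loops):
        print(f"suspend loop {i + 1}/{loops}", flush=True)
        result = run(rtcwake_cmd(wake_after_s))
        if result.returncode != 0:
            # rtcwake itself failed (for example, not run as root).
            return i
    return loops
```

To use it, read /proc/cmdline, pass the contents to
kernel_param_present() with "spectre_v2=off" or "dis_ucode_ldr", and
then call something like stress(400, 20) as root.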

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
