linux-input.vger.kernel.org archive mirror
From: Malte Starostik <malte@starostik.de>
To: Benjamin Tissoires <benjamin.tissoires@redhat.com>,
	Linux regressions mailing list <regressions@lists.linux.dev>,
	"Limonciello, Mario" <mario.limonciello@amd.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>,
	basavaraj.natikar@amd.com, linux-input@vger.kernel.org,
	linux@hexchain.org, stable@vger.kernel.org
Subject: Re: amd_sfh driver causes kernel oops during boot
Date: Wed, 07 Jun 2023 00:57:07 +0200
Message-ID: <5980752.YW5z2jdOID@zen>
In-Reply-To: <79bd270e-4a0d-b4be-992b-73c65d085624@amd.com>

On Tuesday, 6 June 2023, 17:25:13 CEST, Limonciello, Mario wrote:
> On 6/6/2023 3:08 AM, Benjamin Tissoires wrote:
> > On Jun 06 2023, Linux regression tracking (Thorsten Leemhuis) wrote:
> >>> On Mon, Jun 05, 2023 at 01:24:25PM +0200, Malte Starostik wrote:
> >>>> Hello,
> >>>> 
> >>>> chiming in here as I'm experiencing what looks like the exact same
> >>>> issue, also on a Lenovo Z13 notebook, also on Arch:
> >>>> Oops during startup in task udev-worker followed by udev-worker
> >>>> blocking all attempts to suspend or cleanly shutdown/reboot the
> >>>> machine

> > I have a suspicion on commit 7bcfdab3f0c6 ("HID: amd_sfh: if no sensors
> > are enabled, clean up") because the stack trace says that there is a bad
> > list_add, which could happen if the object is not correctly initialized.
> > 
> > However, that commit was present in v6.2, so it might not be that one.
> > 
> If I'm not mistaken the Z13 doesn't actually have any
> sensors connected to SFH.  So I think the suspicion on
> 7bcfdab3f0c6 and theory this is triggered by HID init makes
> a lot of sense.
> 
> Can you try this patch?
> 
> diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_client.c b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
> index d9b7b01900b5..fa693a5224c6 100644
> --- a/drivers/hid/amd-sfh-hid/amd_sfh_client.c
> +++ b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
> @@ -324,6 +324,7 @@ int amd_sfh_hid_client_init(struct amd_mp2_dev *privdata)
>                          devm_kfree(dev, cl_data->report_descr[i]);
>                  }
>                  dev_warn(dev, "Failed to discover, sensors not enabled is %d\n", cl_data->is_any_sensor_enabled);
> +               cl_data->num_hid_devices = 0;
>                  return -EOPNOTSUPP;
>          }
>          schedule_delayed_work(&cl_data->work_buffer, msecs_to_jiffies(AMD_SFH_IDLE_LOOP));

I have now applied this on top of 9e87b63ed37e202c77aa17d4112da6ae0c7c097c,
which was the starting point of my whole bisection. After a clean rebuild,
the issue still persists.

Out of 50 boots, I got:

25 clean
22 Oops as posted by the OP
1 same Oops, followed by a panic
1 lockup [1]
1 hanging with just a blank screen

I am not sure whether the lockups are related, but [1] mentions modprobe and
udev-worker as well, and all problems, including the blank screen one, appear
at roughly the same point during boot. As this happens before a graphics mode
switch, I suspect the blank screen case may be the same as [1], just with the
screen already blanked.
To support the timing correlation: the UVC error for the IR cam shown in the
photo (normal boot noise) also appears right before the BUG in the non-lockup
bad case.

I do see the dev_warn in dmesg, so the code path modified in your patch is 
indeed hit:
[   10.897521] pcie_mp2_amd 0000:63:00.7: Failed to discover, sensors not enabled is 1
[   10.897533] pcie_mp2_amd: probe of 0000:63:00.7 failed with error -95

BR Malte

[1] https://photos.app.goo.gl/2FAvQ7DqBsHEF6Bd8


