* [Bug 219051] amd_pstate=active reset computer
2024-07-17 12:52 [Bug 219051] New: amd_pstate=active reset computer bugzilla-daemon
@ 2024-07-17 17:17 ` bugzilla-daemon
2024-07-17 17:29 ` bugzilla-daemon
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2024-07-17 17:17 UTC (permalink / raw)
To: linux-pm
https://bugzilla.kernel.org/show_bug.cgi?id=219051
--- Comment #1 from Artem S. Tashkinov (aros@gmx.com) ---
Are you running the latest BIOS version? If not, please flash.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
* [Bug 219051] amd_pstate=active reset computer
From: bugzilla-daemon @ 2024-07-17 17:29 UTC (permalink / raw)
To: linux-pm
https://bugzilla.kernel.org/show_bug.cgi?id=219051
--- Comment #2 from Artem S. Tashkinov (aros@gmx.com) ---
Also, it's worth noting that if your system resets itself (and you get a
confirmation about that on next boot where dmesg says your partitions have been
recovered), it usually indicates a HW issue, rather than a software one.
Unfortunately I've no clue what could be wrong and how to test it.
Some AMD users have been playing with undervolting their CPUs using e.g.
Smokeless_UMAF or similar things, and I wanna hope it's not what you've done.
Undervolting may make your system unstable.
* [Bug 219051] amd_pstate=active reset computer
From: bugzilla-daemon @ 2024-07-17 17:49 UTC (permalink / raw)
To: linux-pm
https://bugzilla.kernel.org/show_bug.cgi?id=219051
--- Comment #3 from Catalin (catalin@antebit.com) ---
I already have the latest BIOS version.
It's a laptop with very limited BIOS settings; undervolting is not one of them.
Pretty much everything is at its defaults: Secure Boot enabled, disk encrypted.
I got some errors in dmesg:
[ 864.970948] asus_wmi: fan_curve_get_factory_default (0x00110024) failed: -61
[ 864.972365] asus_wmi: fan_curve_get_factory_default (0x00110025) failed: -61
[ 864.973533] asus_wmi: fan_curve_get_factory_default (0x00110032) failed: -19
I see the same errors on another laptop running 24.04, and that one has no issues.
I also got:
[ 864.511925] systemd-journald[638]: File /var/log/journal/9ce66a6c7ebf406aa908624aafc66e86/system.journal corrupted or uncleanly shut down, renaming and replacing.
I agree it could be a HW error, but with amd_pstate disabled I have had no
resets, at least for the first 10 hours.
If I can provide more information, I will.
* [Bug 219051] amd_pstate=active reset computer
From: bugzilla-daemon @ 2024-07-17 19:22 UTC (permalink / raw)
To: linux-pm
https://bugzilla.kernel.org/show_bug.cgi?id=219051
Artem S. Tashkinov (aros@gmx.com) changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mario.limonciello@amd.com
--- Comment #4 from Artem S. Tashkinov (aros@gmx.com) ---
Casting Mario.
* [Bug 219051] amd_pstate=active reset computer
From: bugzilla-daemon @ 2024-07-19 11:16 UTC (permalink / raw)
To: linux-pm
https://bugzilla.kernel.org/show_bug.cgi?id=219051
--- Comment #5 from Artem S. Tashkinov (aros@gmx.com) ---
Mario, any ideas what is going on here and how it can be fixed/debugged?
* [Bug 219051] amd_pstate=active reset computer
From: bugzilla-daemon @ 2024-07-23 7:04 UTC (permalink / raw)
To: linux-pm
https://bugzilla.kernel.org/show_bug.cgi?id=219051
Perry Yuan(AMD) (Perry.Yuan@amd.com) changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Perry.Yuan@amd.com
--- Comment #6 from Perry Yuan(AMD) (Perry.Yuan@amd.com) ---
Mario has been out of office lately, so I am replying instead.
[ 864.970948] asus_wmi: fan_curve_get_factory_default (0x00110024) failed: -61
[ 864.972365] asus_wmi: fan_curve_get_factory_default (0x00110025) failed: -61
[ 864.973533] asus_wmi: fan_curve_get_factory_default (0x00110032) failed: -19
These errors show that the BIOS fails to return the factory default fan curve
data. I would suggest submitting a support case to ASUS for further help.
The system reset is triggered when the system reaches its critical temperature.
"I noticed that the processor runs at lower frequencies than 20.04, so do the
fans, most of the time at 0 rpm. Temps 50-70 degrees."
As you said, the frequencies are lower than on 20.04, which has no amd-pstate
driver loaded, so the system temperature rises more slowly than with the 24.04
kernel, which has the amd-pstate-epp driver loaded.
I would guess the root cause is a system fan error rather than the amd-pstate
driver. Once the fan issue is resolved, the system should no longer reset.
static int fan_curve_get_factory_default(struct asus_wmi *asus, u32 fan_dev)
{
	struct fan_curve_data *curves;
	u8 buf[FAN_CURVE_BUF_LEN];
	int err, fan_idx;
	u8 mode = 0;

	if (asus->throttle_thermal_policy_available)
		mode = asus->throttle_thermal_policy_mode;
	/* DEVID_<C/G>PU_FAN_CURVE is switched for OVERBOOST vs SILENT */
	if (mode == 2)
		mode = 1;
	else if (mode == 1)
		mode = 2;

	// asus_wmi_evaluate_method_buf failed here, it is a broken bios issue.
	err = asus_wmi_evaluate_method_buf(asus->dsts_id, fan_dev, mode, buf,
					   FAN_CURVE_BUF_LEN);
	if (err) {
		pr_warn("%s (0x%08x) failed: %d\n", __func__, fan_dev, err);
		return err;
	}
	.....
}
* [Bug 219051] amd_pstate=active reset computer
From: bugzilla-daemon @ 2024-07-23 12:10 UTC (permalink / raw)
To: linux-pm
https://bugzilla.kernel.org/show_bug.cgi?id=219051
--- Comment #7 from Catalin (catalin@antebit.com) ---
Update:
It crashed with amd_pstate=disable after 15+ hours.
I also tried Windows, which crashed within 4 hours, so it has nothing to do with
amd_pstate. You can close it. Most likely a faulty motherboard. Thanks!
* [Bug 219051] amd_pstate=active reset computer
From: bugzilla-daemon @ 2024-07-24 3:01 UTC (permalink / raw)
To: linux-pm
https://bugzilla.kernel.org/show_bug.cgi?id=219051
--- Comment #8 from Perry Yuan(AMD) (Perry.Yuan@amd.com) ---
(In reply to Catalin from comment #7)
> Update:
>
> It crashed with amd_pstate=disable after 15+ hours.
> Also I tried Windows with crash in 4 hours, so it has nothing to do with
> amd_pstate. You can close it. Most likely faulty motherboard. Thanks!
Thanks for the feedback.
By the way, you can check whether the system fan spins normally when the CPU
die temperature reaches the critical point (+84.8 degrees).
The sensors utility can show the die temperature, as below.
# sensors
k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +37.2°C
On some AMD CPUs, there is a difference between the die temperature (Tdie) and
the reported temperature (Tctl). Tdie is the real measured temperature, and
Tctl is used for fan control. While Tctl is always available as temp1_input,
the driver exports Tdie temperature as temp2_input for those CPUs which support
it. (Documentation/hwmon/k10temp.rst)
It looks like your system's HW/BIOS is broken; the vendor may have a solution
for your case.
Perry.
* [Bug 219051] amd_pstate=active reset computer
To: linux-pm
https://bugzilla.kernel.org/show_bug.cgi?id=219051
Artem S. Tashkinov (aros@gmx.com) changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID
* [Bug 219051] amd_pstate=active reset computer
https://bugzilla.kernel.org/show_bug.cgi?id=219051
--- Comment #9 from Perry Yuan(AMD) (Perry.Yuan@amd.com) ---
Hello, Thank you for your message. I am currently out of the office for Labor
Day Holiday. If you have any urgent issues or need CPPC, Hetero Core related
urgent assistance, please reach out to [Limonciello, Mario]. For non-urgent
matters, I will respond to your email as soon as possible upon my return on
[5/6]. Best wishes, Perry Yuan