From: Mika Westerberg <mika.westerberg@linux.intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Thomas Witt" <kernel@witt.link>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Kuppuswamy Sathyanarayanan"
<sathyanarayanan.kuppuswamy@linux.intel.com>,
"Vidya Sagar" <vidyas@nvidia.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
"Kai-Heng Feng" <kai.heng.feng@canonical.com>,
"David E . Box" <david.e.box@linux.intel.com>,
"Tasev Nikola" <tasev.stefanoska@skynet.be>,
"Mark Enriquez" <enriquezmark36@gmail.com>,
"Koba Ko" <koba.ko@canonical.com>,
"Werner Sembach" <wse@tuxedocomputers.com>,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
"吳昊澄 Ricky" <ricky_wu@realtek.com>,
linux-pci@vger.kernel.org
Subject: Re: [PATCH v4] PCI/ASPM: Add back L1 PM Substate save and restore
Date: Tue, 10 Oct 2023 07:53:52 +0300 [thread overview]
Message-ID: <20231010045352.GH3208943@black.fi.intel.com> (raw)
In-Reply-To: <20231009163421.GA938492@bhelgaas>
On Mon, Oct 09, 2023 at 11:34:21AM -0500, Bjorn Helgaas wrote:
> On Mon, Oct 09, 2023 at 11:34:34AM +0300, Mika Westerberg wrote:
> > On Thu, Oct 05, 2023 at 12:22:26PM -0500, Bjorn Helgaas wrote:
> > > On Thu, Oct 05, 2023 at 05:57:52PM +0200, Thomas Witt wrote:
> > > > On 05/10/2023 17:30, Bjorn Helgaas wrote:
> > > > ...
> > >
> > > > > Right, without the denylist, I expect Thomas' TUXEDO to fail,
> > > > > but I still hope we can figure out why. If we just keep it on
> > > > > the denylist, that system will suffer from more power
> > > > > consumption than necessary, but only after suspend/resume.
> > > > >
> > > > > A denylist seems like the absolute last resort. In this case
> > > > > we don't know about anything *wrong* with those platforms; all
> > > > > we know is that our resume path doesn't work. It's likely
> > > > > that it fails on other platforms we haven't heard about, too.
> > > >
> > > > The best guess from Mika and David was a firmware issue, but I
> > > > run the same Firmware revision as Werner. I even reflashed the
> > > > Firmware, but that did not change anything:
> > >
> > > Possibly a BIOS settings difference?
> > >
> > > > Quoting David Box:
> > > > > I agree that we should pursue an exception for your system.
> > > > > This is looking like a firmware bug. One thing we did notice
> > > > > in the turbostat results is your IRTL (Interrupt Response Time
> > > > > Limit) values are bogus:
> > > > >
> > > > > cpu6: MSR_PKGC3_IRTL: 0x0000884e (valid, 79872 ns)
> > > > > cpu6: MSR_PKGC6_IRTL: 0x00008000 (valid, 0 ns)
> > > > > cpu6: MSR_PKGC7_IRTL: 0x00008000 (valid, 0 ns)
> > > > > cpu6: MSR_PKGC8_IRTL: 0x00008000 (valid, 0 ns)
> > > > > cpu6: MSR_PKGC9_IRTL: 0x00008000 (valid, 0 ns)
> > > > > cpu6: MSR_PKGC10_IRTL: 0x00008000 (valid, 0 ns)
> > > > >
> > > > > This is despite the PKGC configuration register showing that all
> > > > > states are enabled:
> > > > >
> > > > > cpu6: MSR_PKG_CST_CONFIG_CONTROL: 0x1e008008 (UNdemote-C3, UNdemote-C1,
> > > > demote-
> > > > C3, demote-C1, locked, pkg-cstate-limit=8 (unlimited))
> > > > >
> > > > > Firmware sets this.
> > >
> > > I can't find this discussion, but if there's a firmware issue related
> > > to IRTL MSRs, I would want the workaround in intel-idle.c or whatever
> > > code deals with the MSRs, not in the ASPM code.
> >
> > Unfortunately that discussion never ended up on the mailing list. But in
> > summary that particular system seems to run pretty hot (if I understand
> > correctly what David concluded). This is the reason Thomas has the
> > pm_sleep=deep in the command line and this is why the L1 SS restore then
> > causes the failure on resume. Without this it works fine but consumes
> > lot of energy in s2idle.
> >
> > Can you suggest what we should do with this now?
> >
> > We got report from Tasev Nikola on
> > https://bugzilla.kernel.org/show_bug.cgi?id=216782 that even if the Asus
> > system is removed from the denylist it works so that we can do. However,
> > with the Thomas' system I'm not sure. If we leave it like this:
> >
> > static int aspm_l1ss_suspend_via_firmware(const struct dmi_system_id *not_used)
> > {
> > return pm_suspend_via_firmware();
> > }
> >
> > static const struct dmi_system_id aspm_l1ss_denylist[] = {
> > {
> > /*
> > * https://bugzilla.kernel.org/show_bug.cgi?id=216877
> > *
> > * This system needs to use suspend to mem instead of its
> > * default (suspend to idle) to avoid draining the battery.
> > * However, the BIOS gets confused if we try to restore the
> > * L1SS registers so avoid doing that if the user forced
> > * suspend to mem. The default suspend to idle on the other
> > * hand needs restoring L1SS to allow the CPU to enter low
> > * power states. This entry should handle both.
> > */
> > .callback = aspm_l1ss_suspend_via_firmware,
> > .ident = "TUXEDO InfinityBook S 14 v5",
> > .matches = {
> > DMI_MATCH(DMI_BOARD_VENDOR, "TUXEDO"),
> > DMI_MATCH(DMI_BOARD_NAME, "L140CU"),
> > },
> > },
> > { }
> > };
> >
> > Then it should work with Thomas' system (defaults to deep), and TUXEDO
> > with default settings (defaults to s2idle), and with the rest of the
> > world (I hope at least, fingers crossed). Or you want to pursue a
> > solution without the denylist still? I'm out of ideas what could be
> > wrong except that when the pm_sleep=deep it means the BIOS is involved
> > in the suspend/restore of the devices and it may not expect the OS to
> > touch these registers or something along those lines.
>
> [I think this refers to "mem_sleep_default=deep" (not pm_sleep)]
Yes.
> I assume everything was fine before suspend, and it only runs hot
> after resume. And the platform granted control of the PCIe Capability
> and the LTR Capability to OSPM via _OSC?
As David commented already it runs hot even before suspend. The platform
did give LTR capability to the OS:
Jan 02 14:17:52 localhost kernel: acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
Jan 02 14:17:52 localhost kernel: acpi PNP0A08:00: _OSC: OS now controls [PME PCIeCapability LTR]
next prev parent reply other threads:[~2023-10-10 4:54 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-02 7:00 [PATCH v4] PCI/ASPM: Add back L1 PM Substate save and restore Mika Westerberg
2023-10-04 22:23 ` Bjorn Helgaas
2023-10-05 9:02 ` Mika Westerberg
2023-10-05 15:30 ` Bjorn Helgaas
2023-10-05 15:57 ` Thomas Witt
2023-10-05 17:22 ` Bjorn Helgaas
2023-10-09 8:34 ` Mika Westerberg
2023-10-09 16:34 ` Bjorn Helgaas
2023-10-09 22:57 ` David E. Box
2023-10-10 4:53 ` Mika Westerberg [this message]
2023-11-03 19:51 ` David E. Box
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231010045352.GH3208943@black.fi.intel.com \
--to=mika.westerberg@linux.intel.com \
--cc=bhelgaas@google.com \
--cc=david.e.box@linux.intel.com \
--cc=enriquezmark36@gmail.com \
--cc=helgaas@kernel.org \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=kai.heng.feng@canonical.com \
--cc=kernel@witt.link \
--cc=koba.ko@canonical.com \
--cc=linux-pci@vger.kernel.org \
--cc=rafael.j.wysocki@intel.com \
--cc=ricky_wu@realtek.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=tasev.stefanoska@skynet.be \
--cc=vidyas@nvidia.com \
--cc=wse@tuxedocomputers.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).