netdev.vger.kernel.org archive mirror
From: Nix <nix@esperi.org.uk>
To: "Wyborny, Carolyn" <carolyn.wyborny@intel.com>,
	Matthew Garrett <mjg@redhat.com>
Cc: "Kirsher, Jeffrey T" <jeffrey.t.kirsher@intel.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	Chris Boot <bootc@bootc.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"gospo@redhat.com" <gospo@redhat.com>,
	"sassmann@redhat.com" <sassmann@redhat.com>
Subject: Re: [net-next 5/9] e1000e: Disable ASPM L1 on 82574
Date: Sat, 05 May 2012 17:33:45 +0100
Message-ID: <87fwbea8pi.fsf@spindle.srvr.nix>
In-Reply-To: <87sjfhaukf.fsf@spindle.srvr.nix> (nix@esperi.org.uk's message of "Thu, 03 May 2012 21:17:04 +0100")

On 3 May 2012, nix@esperi.org.uk outgrape:

> On 3 May 2012, Carolyn Wyborny told this:
>
>> It would be good to know why/how your system is re-enabling the
>> setting. The problem is not solvable in firmware unfortunately and is
>> somewhat platform dependent. MMIO-tracer might be used to try and see
>
> I entirely forgot about that tool! *Definitely* worth trying.
>
> I'll give it a try this weekend.

Well, mmiotrace was a total flop: massive numbers of unexpected
secondary interrupts and a hard lockup. Still, I've now diagnosed this
bug and it's right up Matthew Garrett's street!

Matthew: the problem here is a server with an 82574L (controlled by the
e1000e driver). This NIC has a hardware bug: unless PCIe ASPM is
disabled during boot, it locks up within an hour or two in a way that
only a reboot can fix (leaving me with my home directory stuck behind a
dead NIC on a headless machine, most annoying). The driver is attempting
to disable ASPM, but failing.

>> when the re-enabling config space is written, but it might be too
>> heavyweight for a live production system.
>
> Given that the re-enabling happens at around the same time as the boot
> scripts finish running (it's done by the time I can log in), that's not
> going to be a problem. Hence my speculation that it's being re-enabled
> when the interface stabilizes (which is, of course, asynchronous) or
> something like that.

That speculation was wrong. The disable never happens. The BIOS has been
told to enable PCIe ASPM. However, the kernel log says:

May  5 17:06:53 spindle info: [    0.629699]  pci0000:00: Requesting ACPI _OSC control (0x1d)
May  5 17:06:53 spindle info: [    0.629941]  pci0000:00: ACPI _OSC request failed (AE_NOT_FOUND), returned control mask: 0x1d
May  5 17:06:53 spindle info: [    0.630373] ACPI _OSC control for PCIe not granted, disabling ASPM

Unless pcie_aspm=force has been specified on the kernel command line,
this flips aspm_disabled to 1.

The e1000e driver then says (with a bit of extra debugging info I
added):

May  5 17:06:53 spindle info: [    1.248153] e1000e 0000:03:00.0: Disabling ASPM L0s L1
May  5 17:06:53 spindle info: [    1.248393] e1000e 0000:03:00.0: Disabling ASPM via pci_disable_link_state_locked()
May  5 17:06:53 spindle info: [    1.248823] e1000e 0000:03:00.0: aspm disabled, not forcing

i.e. because aspm_disabled is set, pci/pcie/aspm.c refuses to make any
changes at all to ASPM link state, not even to turn *off* ASPM on a
device on which the BIOS turned it on at boot. So ASPM remains enabled
and the NIC eventually locks up.
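
To make the sequence above concrete, here is a minimal userspace model
of it. This is only a sketch of the behaviour as described in this
message, not the code in pci/pcie/aspm.c: the function names
osc_control_denied() and disable_link_state() and the LINK_STATE_*
values are made up for illustration; only the aspm_disabled flag and the
pcie_aspm=force exception come from the description above.

```c
#include <stdbool.h>

/* Hypothetical link-state bits, for illustration only. */
#define LINK_STATE_L0S 0x1u
#define LINK_STATE_L1  0x2u

/* Simplified stand-in for the global flag discussed above. */
static bool aspm_disabled;

/*
 * Model of the _OSC request failing: the flag is set unless the user
 * asked for ASPM to be forced on the kernel command line.
 */
static void osc_control_denied(bool aspm_force)
{
    if (!aspm_force)
        aspm_disabled = true;
}

/*
 * Model of a driver asking for some ASPM states to be turned off.
 * Returns the mask of states still enabled on the link afterwards.
 */
static unsigned int disable_link_state(unsigned int enabled,
                                       unsigned int to_disable)
{
    /* Observed behaviour: once the flag is set, nothing is changed,
     * not even a request to turn states off. */
    if (aspm_disabled)
        return enabled;

    return enabled & ~to_disable;
}
```

In this model, a link the BIOS left with both states enabled keeps both
of them after osc_control_denied(false), whatever the driver asks for;
with osc_control_denied(true) the same request clears them.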

The question here is how to fix it. It appears that the motherboard or
BIOS on this machine does not grant _OSC control even (especially?) if
you have turned on PCIe ASPM in the BIOS. But perhaps even if _OSC
control is not granted, drivers should still be permitted to *disable*
PCIe ASPM, just not to enable it? (The BIOS appears to be buggy in this
area: if you turn off ASPM, save, and go back into setup, ASPM has
turned itself back on again!)
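
As a rough illustration of the "disable yes, enable no" idea, the rule
could be modelled like this. It is a userspace sketch under my own
assumptions, not a proposed patch: struct aspm_policy,
request_link_state() and the LINK_STATE_* values are hypothetical names
invented for the example.

```c
#include <stdbool.h>

/* Hypothetical link-state bits, for illustration only. */
#define LINK_STATE_L0S 0x1u
#define LINK_STATE_L1  0x2u

/* Hypothetical policy record: did the OS get control via _OSC? */
struct aspm_policy {
    bool os_has_control;
};

/*
 * Given the states currently enabled and the states a driver wants,
 * return the states that end up enabled.
 */
static unsigned int request_link_state(const struct aspm_policy *policy,
                                       unsigned int current,
                                       unsigned int wanted)
{
    /* With control granted, the request is honoured as-is. */
    if (policy->os_has_control)
        return wanted;

    /* Without control, states may only be cleared: keep just those
     * that are already enabled and still wanted. */
    return current & wanted;
}
```

Under this rule a request to clear L0s and L1 succeeds even without
_OSC control, while a request to set a state the BIOS left off is
quietly ignored.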

I'm not sure what the right thing to do is here: I don't know enough
about this area. But it does seem very strange that the only way I have
to turn off PCIe ASPM reliably on this device is to tell the kernel to
forcibly turn it *on*!
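
For anyone wanting to check whether ASPM really is off on a device, one
option is to decode the ASPM Control bits (bits 1:0 of the PCIe Link
Control register) from a dump of the device's config space. The sketch
below walks the standard capability list in a byte buffer; the function
name aspm_bits_from_config() and the constant names are mine, and it
assumes the buffer holds at least the first 256 bytes of config space.

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_STATUS_OFFSET     0x06  /* Status register (low byte)      */
#define PCI_STATUS_CAP_LIST   0x10  /* "capabilities list present" bit */
#define PCI_CAP_PTR_OFFSET    0x34  /* pointer to first capability     */
#define PCI_CAP_ID_PCIE       0x10  /* PCI Express capability ID       */
#define PCIE_LNKCTL_OFFSET    0x10  /* Link Control within the PCIe cap */
#define PCIE_LNKCTL_ASPM_MASK 0x03  /* bit 0 = L0s, bit 1 = L1         */

/*
 * Returns the ASPM Control bits (0 = disabled, 1 = L0s, 2 = L1,
 * 3 = L0s and L1), or -1 if no PCIe capability could be found.
 */
static int aspm_bits_from_config(const uint8_t *cfg, size_t len)
{
    if (len < 0x40 || !(cfg[PCI_STATUS_OFFSET] & PCI_STATUS_CAP_LIST))
        return -1;

    size_t pos = cfg[PCI_CAP_PTR_OFFSET] & 0xFC;

    /* Bounded walk so a corrupt list cannot loop forever. */
    for (int guard = 0; pos >= 0x40 && guard < 48; guard++) {
        if (pos + 1 >= len)
            return -1;
        if (cfg[pos] == PCI_CAP_ID_PCIE) {
            if (pos + PCIE_LNKCTL_OFFSET >= len)
                return -1;
            return cfg[pos + PCIE_LNKCTL_OFFSET] & PCIE_LNKCTL_ASPM_MASK;
        }
        pos = cfg[pos + 1] & 0xFC;  /* follow the "next" pointer */
    }
    return -1;
}
```

The buffer could be filled by reading the sysfs config file for the
device, e.g. /sys/bus/pci/devices/0000:03:00.0/config for the NIC in the
log above (probably as root, since the capability list lies beyond the
first 64 bytes). A nonzero result there would mean ASPM is still
enabled.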


Thread overview: 16+ messages
2012-05-03  9:56 [net-next 0/9][pull request] Intel Wired LAN Dirver Updates Jeff Kirsher
2012-05-03  9:56 ` [net-next 1/9] e1000e: suggest a possible workaround to a device hang on 82577/8 Jeff Kirsher
2012-05-03  9:56 ` [net-next 2/9] e1000e: cleanup long [read|write]_reg_locked PHY ops function pointers Jeff Kirsher
2012-05-03  9:56 ` [net-next 3/9] e1000e: Resolve intermittent negotiation issue on 82574/82583 Jeff Kirsher
2012-05-03  9:56 ` [net-next 4/9] e1000e: Driver workaround for IPv6 Header Extension Erratum Jeff Kirsher
2012-05-03  9:56 ` [net-next 5/9] e1000e: Disable ASPM L1 on 82574 Jeff Kirsher
2012-05-03 10:08   ` Nix
2012-05-03 20:12     ` Wyborny, Carolyn
2012-05-03 20:17       ` Nix
2012-05-05 16:33         ` Nix [this message]
2012-05-09 14:02           ` Nix
2012-05-03  9:56 ` [net-next 6/9] e1000e: Remove special case for 82573/82574 ASPM L1 disablement Jeff Kirsher
2012-05-03  9:56 ` [net-next 7/9] ixgbevf: Add support to recognize 100mb link speed Jeff Kirsher
2012-05-03  9:56 ` [net-next 8/9] ixgbevf: Make sure jumbo frames are set correctly after PF reset Jeff Kirsher
2012-05-03  9:56 ` [net-next 9/9] ixgbevf: Update version string Jeff Kirsher
2012-05-03 17:30 ` [net-next 0/9][pull request] Intel Wired LAN Dirver Updates David Miller
