From: David Henningsson <david.henningsson@canonical.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: linux-pci@vger.kernel.org, bhelgaas@google.com
Subject: Re: Dmesg filled with "AER: Corrected error received"
Date: Wed, 30 Dec 2015 13:52:59 +0100
Message-ID: <5683D3AB.1000609@canonical.com>
In-Reply-To: <20151229155822.GA17321@localhost>

Hi,

Indeed, booting with pci=noaer (as suggested in the other bug) works
around this issue as well. I'll use that for the time being.

Thanks for working on it!

// David

On 2015-12-29 16:58, Bjorn Helgaas wrote:
> On Fri, Dec 18, 2015 at 11:30:33AM +0100, David Henningsson wrote:
>> Hi Linux PCI maintainers,
>>
>> My dmesg gets filled with a few lines repeated over and over again:
>>
>> pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
>> pcieport 0000:00:1c.0: can't find device of ID00e0
>> pcieport 0000:00:1c.0: AER: Corrected error received: id=00e0
>> pcieport 0000:00:1c.0: PCIe Bus Error: severity=Corrected,
>> type=Physical Layer, id=00e0(Receiver ID)
>> pcieport 0000:00:1c.0:   device [8086:9d14] error
>> status/mask=00000001/00002000
>> pcieport 0000:00:1c.0:    [ 0] Receiver Error
>>
>> This happens 10-30 times per second (!), so dmesg fills up quickly.
>> The bug is present in both vanilla and Ubuntu kernels.
>
> This is a pretty obvious bug in our AER code.  We normally clear
> correctable errors by writing the PCI_ERR_COR_STATUS register in
> handle_error_source().  The execution path looks like this:
>
>    aer_isr_one_error
>      aer_print_port_info
>      if (find_source_device())
>        aer_process_err_devices
>          handle_error_source
>            pci_write_config_dword(dev, PCI_ERR_COR_STATUS, ...)
>
> In this case, find_source_device() printed "can't find device of
> ID00e0" [sic] and returned false, so we don't call
> aer_process_err_devices().  The error is never cleared, so
> we discover it again and again.
>
> I'll work on fixing this.  Incidentally, there's another report
> with similar symptoms here:
>
>    https://bugzilla.kernel.org/show_bug.cgi?id=109691
>
> Bjorn
>
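
For anyone following along, the control flow described in the quoted
explanation can be modelled roughly as below. This is only a toy sketch in
Python, not kernel code: the function names mirror the call trace, but the
Port class, the known_ids set, count_reports() and the always_clear switch
are hypothetical stand-ins added for illustration. In particular,
always_clear is not the fix that was worked on; it only shows the contrast
between clearing and not clearing the status.

```python
# Toy model of the AER correctable-error path described in the quoted mail.
# Names follow the call trace; the bodies are illustrative assumptions,
# not the kernel's implementation.

RECEIVER_ERROR = 0x00000001  # bit 0, reported as "Receiver Error" in the log


class Port:
    """Hypothetical stand-in holding a correctable error status value."""

    def __init__(self, cor_status):
        self.cor_status = cor_status


def find_source_device(known_ids, source_id):
    # Stand-in lookup: succeed only if the reported ID is known.
    return source_id in known_ids


def handle_error_source(port):
    # Model clearing the status by writing back the bits that are set.
    port.cor_status &= ~port.cor_status


def aer_isr_one_error(port, known_ids, source_id, always_clear=False):
    found = find_source_device(known_ids, source_id)
    # As described in the mail: the clear only happens when the source
    # device is found. always_clear is a hypothetical switch for contrast.
    if found or always_clear:
        handle_error_source(port)
    return found


def count_reports(port, known_ids, source_id, always_clear=False, limit=5):
    """Count how often the same error is handled, giving up after `limit`."""
    reports = 0
    while port.cor_status and reports < limit:
        aer_isr_one_error(port, known_ids, source_id, always_clear)
        reports += 1
    return reports
```

With an empty known_ids set the lookup fails, the status bit stays set, and
count_reports() runs until it hits the limit, which corresponds to the
repeated messages in dmesg. When the ID is known, the bit is cleared on the
first pass and the loop ends.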

-- 
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic
