linux-pci.vger.kernel.org archive mirror
From: Bjorn Helgaas <bhelgaas@google.com>
To: Yishai Hadas <yishaih@dev.mellanox.co.il>
Cc: "Pandarathil, Vijaymohan R" <vijaymohan.pandarathil@hp.com>,
	Myron Stowe <myron.stowe@redhat.com>,
	"linux-rdma (linux-rdma@vger.kernel.org)"
	<linux-rdma@vger.kernel.org>,
	"yishaih@mellanox.com" <yishaih@mellanox.com>,
	liranl@mellanox.com,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Don Dutile <ddutile@redhat.com>
Subject: Re: PCI/AER: AER in SRIOV environment
Date: Mon, 23 Jun 2014 13:09:15 -0600
Message-ID: <CAErSpo7xVvqYea8v624T8EyVf6skQccraAe_WYpzcf2tkWFLhg@mail.gmail.com>
In-Reply-To: <53A839C6.5050102@dev.mellanox.co.il>

[+cc linux-pci, Don]

On Mon, Jun 23, 2014 at 8:29 AM, Yishai Hadas
<yishaih@dev.mellanox.co.il> wrote:
> Hi Vijay,
> Trying to add AER support for Mellanox NIC in SRIOV environment, while
> evaluating/testing encountered a problem which led me to your
> patch accepted as part of kernel 3.8, commit ID
> "918b4053184c0ca22236e70e299c5343eea35304".
>
> Have some concerns/questions on:
> When working in SRIOV environment VFs may be un-attached, having no driver
> assigned to, or may be attached to Virtual machine to work in some
> pass-through mode.
> Once working in KVM setup there is pci-stub driver which is loaded in the
> HYP/PF for a given attached VF.
>
> I'm using the aer-inject kernel module and its corresponding aer-inject tool
> to simulate an error in the HYP.
> In both cases your commit will cause the AER recovery to fail as there is no
> driver assigned to PF's VFs that supports AER, comparing the code before
> your change.
>
> How such cases should work ?  my expectation was that the PF will get the
> error detected message then will recognize whether
> issue is its own or one of its VFs

I'm really not an AER expert, so help me understand this question of
recognizing whether an error is associated with a PF or a VF.

In terms of hardware, it looks like the device that detects an error
logs some information and sends an Error Message upstream.  The Root
Complex receives the message, captures the source ID from the Error
Message, and may generate an interrupt.  I expect this source ID can
be either a PF or a VF; there's no requirement that a VF error must be
reported as though it's from the PF, is there?
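
To make the source ID concrete, here is a minimal sketch of how such
an ID can be split into bus/device/function. It assumes the
conventional 16-bit Requester ID layout (bus in bits 15:8, device in
bits 7:3, function in bits 2:0) and a Root Port Error Source
Identification register that holds the ERR_COR source in its low 16
bits and the ERR_FATAL/NONFATAL source in its high 16 bits. The struct
and function names are made up for illustration and are not kernel
APIs. With ARI enabled, which is common for VFs, the low 8 bits would
instead be a single function number, so the device/function split
below would not apply.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: a decoded 16-bit Requester (source) ID. */
struct source_id {
	uint8_t bus;	/* bits 15:8 */
	uint8_t dev;	/* bits 7:3 (non-ARI layout, an assumption here) */
	uint8_t fn;	/* bits 2:0 (non-ARI layout, an assumption here) */
};

/* Split a raw source ID into bus, device, and function. */
static struct source_id decode_source_id(uint16_t id)
{
	struct source_id s = {
		.bus = (uint8_t)(id >> 8),
		.dev = (uint8_t)((id >> 3) & 0x1f),
		.fn  = (uint8_t)(id & 0x07),
	};
	return s;
}

/* Low half of the Error Source Identification register: ERR_COR source. */
static uint16_t cor_source(uint32_t reg)
{
	return (uint16_t)(reg & 0xffff);
}

/* High half of the same register: ERR_FATAL/NONFATAL source. */
static uint16_t uncor_source(uint32_t reg)
{
	return (uint16_t)(reg >> 16);
}

/* Print a source ID in the bus:dev.fn form that lspci uses. */
static void print_source(const char *label, uint16_t id)
{
	struct source_id s = decode_source_id(id);

	printf("%s %02x:%02x.%x\n", label, s.bus, s.dev, s.fn);
}
```

Comparing the decoded ID against the PF's and VFs' addresses in the
lspci output would show which function the Root Complex believes
reported the error.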

> and work accordingly, in current code
> looks like recovery failed as part of "voting" once there is no AER handler
> assigned to the VFs.

The commit you mentioned has to do with PCI_ERS_RESULT_NO_AER_DRIVER.
We use pci_walk_bus() to figure out whether all the devices in a
subtree have a driver.  What subtree is involved here?  I would expect
the VFs to be siblings of the PF, not children of it, so I'm not sure
where things went wrong.
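
The "voting" being discussed can be modeled roughly as follows. This
is a simplified userspace sketch, not the kernel's code: the enum
values, the struct, and vote() are hypothetical stand-ins, and the
only behavior it captures is the one described above, namely that a
walk over every device in the subtree ends in a "no AER driver" result
as soon as any one device lacks an AER-aware driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the recovery results; not the kernel enum. */
enum ers_result {
	ERS_CAN_RECOVER,
	ERS_NO_AER_DRIVER,
};

/* Hypothetical model of one device visited during the walk. */
struct dev_model {
	const char *name;
	bool has_aer_driver;	/* bound to a driver that handles AER */
};

/*
 * Visit every device in the (modeled) subtree. A single device
 * without an AER-aware driver turns the overall result into
 * ERS_NO_AER_DRIVER, regardless of what the other devices report.
 */
static enum ers_result vote(const struct dev_model *devs, size_t n)
{
	enum ers_result result = ERS_CAN_RECOVER;

	for (size_t i = 0; i < n; i++) {
		if (!devs[i].has_aer_driver)
			result = ERS_NO_AER_DRIVER;
	}
	return result;
}
```

Under this model, a PF with an AER-aware driver plus one VF that is
unbound (or bound to a driver without AER support) yields
ERS_NO_AER_DRIVER, but only if that VF is among the devices the walk
actually visits, which is why it matters which subtree is involved.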

Can you collect "lspci -vvv" output and maybe add some debug so we can
see exactly where the error is detected and what devices we're looking
at to conclude that one of them doesn't have a driver?

Bjorn

