From: Yishai Hadas <yishaih@dev.mellanox.co.il>
To: Don Dutile <ddutile@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	"Pandarathil, Vijaymohan R" <vijaymohan.pandarathil@hp.com>,
	Myron Stowe <myron.stowe@redhat.com>,
	"linux-rdma (linux-rdma@vger.kernel.org)"
	<linux-rdma@vger.kernel.org>,
	"yishaih@mellanox.com" <yishaih@mellanox.com>,
	liranl@mellanox.com,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>
Subject: Re: PCI/AER: AER in SRIOV environment
Date: Tue, 24 Jun 2014 01:44:37 +0300
Message-ID: <53A8ADD5.7030207@dev.mellanox.co.il>
In-Reply-To: <53A88A32.4010406@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 5823 bytes --]

On 6/23/2014 11:12 PM, Don Dutile wrote:
> On 06/23/2014 03:09 PM, Bjorn Helgaas wrote:
>> [+cc linux-pci, Don]
>>
> Adding Alex Williamson in case he can add more to this conversation...
>
>> On Mon, Jun 23, 2014 at 8:29 AM, Yishai Hadas
>> <yishaih@dev.mellanox.co.il> wrote:
>>> Hi Vijay,
>>> Trying to add AER support for Mellanox NIC in SRIOV environment, while
>>> evaluating/testing encountered a problem which led me to your
>>> patch accepted as part of kernel 3.8, commit ID
>>> "918b4053184c0ca22236e70e299c5343eea35304".
>>>
>>> Have some concerns/questions on:
>>> When working in SRIOV environment VFs may be un-attached, having no driver
>>> assigned to, or may be attached to Virtual machine to work in some
>>> pass-through mode.
>>> Once working in KVM setup there is pci-stub driver which is loaded in the
>>> HYP/PF for a given attached VF.
> huh? 'loaded in the hyp/pf? .... um, loaded in the host, and a VF is
> detached from its host driver -- a VF can be used in the host w/o any
> virtualization, i.e., that's how guest VM is driving the VF: as if it was
> used by a guest (host) OS directly -- and attached to pci-stub driver, when
> assigned to a KVM guest in pre-VFIO days/ways.
> If VFIO used, then VF is attached to vfio-pci driver.
>
>>>
>>> I'm using the aer-inject kernel module and its corresponding aer-inject
>>> tool to simulate an error in the HYP.
>>> In both cases your commit will cause the AER recovery to fail as there is
>>> no driver assigned to PF's VFs that supports AER, comparing the code
>>> before your change.
>>>
> Without VFIO, I believe that's correct. There was no AER-to-VF support
> pre-VFIO days.
> I believe with the recent VFIO support, and modifications to KVM, an AER
> that is associated with an assigned VF will force the crash/halt of the
> KVM guest -- can't depend on a guest VF driver clearing the AER in the
> hyp/host -- guest isn't privileged enough to clear the error.
> So, crashing the guest is the simple option at the moment, to contain the
> error.
> Alex: do I have that (vfio aer default) correct, or is that still
> site-under-construction?
     What about the case where the VF is not attached to a KVM guest and has
     no driver loaded on the host? From code review and some testing, recovery
     fails in that case because there is no AER-aware driver. What is the
     expected solution here?
     Is any special QEMU setup needed to activate the VFIO support? I would
     like to try it for the case where the VF is attached.
>
>>> How such cases should work ?  my expectation was that the PF will get the
>>> error detected message then will recognize whether
>>> issue is its own or one of its VFs
> The AER packet will have the tag of the VF in if it was the source of the
> error; so the PF will never see it; although one could argue it should be
> 'promoted' to the PF if PF/VF needs to clear some state it has wrt the VF
> (the SRIOV spec is lacking of info in this space); _but_, VFIO resets the
> VF (sets FLR bit) when the device is deassigned and before re-attachment
> to the host, so that should clear out any state btwn PF & VF ('should' ...
> famous last words...).
     In my test I used the aer-inject tool to simulate an error on the bus
     that both the PF and the VF reside on, with the function number set to
     the PF's. It looks like both should be called by the AER driver as part
     of pci_walk_bus(). As mentioned, I got a call only on the PF, and
     recovery failed because the VF has no AER-aware driver; once I removed
     the VF, recovery succeeded.
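
To make the "voting" failure concrete, here is a minimal user-space sketch of the behaviour described above. It is not the kernel code: the enum, struct fake_dev, merge_vote() and walk_bus_votes() are hypothetical names, and the "largest value wins" merge is a simplifying assumption. It only illustrates how a single function without an AER-aware driver can turn the whole walk into a failed recovery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical vote values, loosely named after PCI_ERS_RESULT_*.
 * Ordered so that a larger value is a "worse" outcome; that ordering
 * is an assumption of this sketch, not the kernel's merge rule. */
enum vote {
    VOTE_NONE = 0,
    VOTE_CAN_RECOVER,
    VOTE_NEED_RESET,
    VOTE_DISCONNECT,
    VOTE_NO_AER_DRIVER,
};

/* Hypothetical stand-in for one function found during the bus walk. */
struct fake_dev {
    const char *name;
    bool has_aer_handler;   /* driver bound and AER-aware */
    enum vote handler_vote; /* what that driver would answer */
};

/* Fold one device's vote into the running result: worst vote wins. */
static enum vote merge_vote(enum vote acc, enum vote v)
{
    return v > acc ? v : acc;
}

/* Walk every device on the bus and collect the combined vote.
 * A device without an AER-aware driver votes VOTE_NO_AER_DRIVER,
 * which is the situation described for the driverless VF. */
static enum vote walk_bus_votes(const struct fake_dev *devs, size_t n)
{
    enum vote result = VOTE_CAN_RECOVER;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        enum vote v = devs[i].has_aer_handler ? devs[i].handler_vote
                                              : VOTE_NO_AER_DRIVER;
        result = merge_vote(result, v);
    }
    return result;
}
```

With a PF that votes VOTE_CAN_RECOVER and a driverless VF on the same bus, the combined result is VOTE_NO_AER_DRIVER; with the VF removed from the list, it stays VOTE_CAN_RECOVER, which matches the observation that recovery succeeded once the VF was removed.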
     I believe the packet should include some information about the source
     of the error, shouldn't it?
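
For reference, the source of an error is identified by a 16-bit routing ID, which can be split into bus, device and function. The sketch below shows that split under the conventional layout (bus in bits 15:8, device in bits 7:3, function in bits 2:0; with ARI the low 8 bits are all function number). The helper names are hypothetical and not taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Build a 16-bit routing ID from bus/device/function (non-ARI layout). */
static uint16_t make_rid(uint8_t bus, uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);
    return (uint16_t)((bus << 8) | (dev << 3) | fn);
}

/* Bus number: bits 15:8. */
static uint8_t rid_bus(uint16_t rid) { return (uint8_t)(rid >> 8); }

/* Device number: bits 7:3 (non-ARI layout). */
static uint8_t rid_dev(uint16_t rid) { return (uint8_t)((rid >> 3) & 0x1f); }

/* Function number: bits 2:0 (non-ARI layout). */
static uint8_t rid_fn(uint16_t rid) { return (uint8_t)(rid & 0x07); }

/* With ARI, the device field is dropped and bits 7:0 are the function. */
static uint8_t rid_ari_fn(uint16_t rid) { return (uint8_t)(rid & 0xff); }
```

For the VF at 21:00.1 the routing ID is 0x2101, which decodes to bus 0x21, device 0, function 1, so a handler that receives the source ID can in principle tell the PF (0x2100) from the VF.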
     In addition, looking at the upstream ixgbe source code,
     ixgbe_error_detected() appears to contain code running on the PF that
     checks whether the source was a VF.

     By the way: when I tried to simulate a VF error using its function
     number, I got the error below:
     "Error: Failed to write, Inappropriate ioctl for device"
     Any idea about that error?
>
>>
>> I'm really not an AER expert, so help me understand this question of
>> recognizing whether an error is associated with a PF or a VF.
>>
>> In terms of hardware, it looks like the device that detects an error
>> logs some information and sends an Error Message upstream.  The Root
>> Complex receives the message, captures the source ID from the Error
>> Message, and may generate an interrupt.  I expect this source ID can
>> be either a PF or a VF; there's no requirement that a VF error must be
>> reported as though it's from the PF, is there?
>>
>>> and work accordingly, in current code
>>> looks like recovery failed as part of "voting" once there is no AER
>>> handler assigned to the VFs.
>>
>> The commit you mentioned has to do with PCI_ERS_RESULT_NO_AER_DRIVER.
>> We use pci_walk_bus() to figure out whether all the devices in a
>> subtree have a driver.  What subtree is involved here?  I would expect
>> the VFs to be siblings of the PF, not children of it, so I'm not sure
>> where things went wrong.
> Well, VFs could be on virtual busses (ARI turned on), so not necessarily a
> sibling to PF ... and then we have the problem in PCI code of not being
> able to traverse these virtual busses (in some cases; not sure if
> pci_walk_bus(), which is going down the tree vs up the tree, has any
> problems here w/VFs on virtual busses).
>
>>
>> Can you collect "lspci -vvv" output and maybe add some debug so we can
>> see exactly where the error is detected and what devices we're looking
>> at to conclude that one of them doesn't have a driver?
     lspci -vvv output for both the PF and the VF is attached. We can see
     that the VF (21:00.1) has no driver loaded, unlike the PF (Kernel driver
     in use: mlx4_core).
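
As a cross-check of why the VF shows up as a sibling at 21:00.1, the VF routing ID can be derived from the PF's SR-IOV capability fields in the attached output (VF offset: 1, stride: 1). A minimal sketch, assuming a 0-based VF index and a hypothetical helper name vf_rid():

```c
#include <assert.h>
#include <stdint.h>

/* Routing ID of VF number vf_index (0-based, an assumption of this
 * sketch) from the PF routing ID and the PF's SR-IOV "VF offset" and
 * "stride" fields. num_vfs is only used to sanity-check the index. */
static uint16_t vf_rid(uint16_t pf_rid, uint16_t first_vf_offset,
                       uint16_t vf_stride, uint16_t vf_index,
                       uint16_t num_vfs)
{
    assert(vf_index < num_vfs);
    return (uint16_t)(pf_rid + first_vf_offset +
                      (uint32_t)vf_stride * vf_index);
}
```

With the PF at 21:00.0 (routing ID 0x2100), offset 1 and stride 1, the first VF lands at 0x2101, i.e. 21:00.1, on the same bus as the PF. If all 16 VFs were enabled, the last one would be at 0x2110, still on bus 0x21 but only addressable as function 16 through ARI.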
>>
>> Bjorn
>>
>


[-- Attachment #2: lspci.txt --]
[-- Type: text/plain, Size: 7889 bytes --]

21:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
        Subsystem: Hewlett-Packard Company Device 18cf
        Physical Slot: 2
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 64
        Region 0: Memory at fbf00000 (64-bit, non-prefetchable) [size=1M]
        Region 2: Memory at f8000000 (64-bit, prefetchable) [size=32M]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] Vital Product Data
                Product Name: HP ConnectX-3 Mezz
                Read-only fields:
                        [PN] Part number: 644161-B21
                        [EC] Engineering changes: C4
                        [SN] Serial number: IL224202VW
                        [V0] Vendor specific: HP IB FDR/EN 10/40Gb 2P 544M Adptr
                        [RV] Reserved: checksum good, 0 byte(s) reserved
                Read/write fields:
                        [V1] Vendor specific: N/A
                        [YA] Asset tag: N/A
                        [RW] Read-write area: 102 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 253 byte(s) free
                        [RW] Read-write area: 252 byte(s) free
                End
        Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
                Vector table: BAR=0 offset=0007c000
                PBA: BAR=0 offset=0007d000
        Capabilities: [60] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
                        MaxPayload 256 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #8, Speed 8GT/s, Width x8, ASPM L0s, Latency L0 unlimited, L1 unlimited
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [148 v1] Device Serial Number 24-be-05-ff-ff-8b-6b-d0
        Capabilities: [108 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable+ Migration- Interrupt- MSE+ ARIHierarchy+
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 1, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 1004
                Supported Page Size: 000007ff, System Page Size: 00000001
                Region 2: Memory at 00000000d8000000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [154 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [18c v1] #19
        Kernel driver in use: mlx4_core
        Kernel modules: mlx4_core
		
21:00.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
        Subsystem: Hewlett-Packard Company Device 61b0
        Physical Slot: 2
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Region 2: [virtual] Memory at d8000000 (64-bit, prefetchable) [size=32M]
        Capabilities: [60] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset+
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM unknown, Latency L0 <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed unknown, Width x0, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [9c] MSI-X: Enable- Count=256 Masked-
                Vector table: BAR=2 offset=00002000
                PBA: BAR=2 offset=00003000
        Capabilities: [40] #00 [0000]
        Kernel modules: mlx4_core
		
		
