* RE: [E1000-devel] 82571EB: Detected Hardware Unit Hang [not found] ` <9B4A1B1917080E46B64F07F2989DADD62F2D8070@ORSMSX102.amr.corp.intel.com> @ 2012-11-27 18:10 ` Ben Hutchings 2012-11-27 18:24 ` Fujinaka, Todd 2012-11-28 8:31 ` Joe Jin 0 siblings, 2 replies; 9+ messages in thread From: Ben Hutchings @ 2012-11-27 18:10 UTC (permalink / raw) To: Fujinaka, Todd, Mary Mcgrath Cc: Joe Jin, netdev@vger.kernel.org, e1000-devel@lists.sf.net, linux-kernel@vger.kernel.org, linux-pci On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote: > Forgive me if I'm being too repetitious as I think some of this has > been mentioned in the past. > > We (and by we I mean the Ethernet part and driver) can only change the > advertised availability of a larger MaxPayloadSize. The size is > negotiated by both sides of the link when the link is established. The > driver should not change the size of the link as it would be poking at > registers outside of its scope and is controlled by the upstream > bridge (not us). [...] MaxPayloadSize (MPS) is not negotiated between devices but is programmed by the system firmware (at least for devices present at boot - the kernel may be responsible in case of hotplug). You can use the kernel parameter 'pci=pcie_bus_perf' (or one of several others) to set a policy that overrides this, but no policy will allow setting MPS above the device's MaxPayloadSizeSupported (MPSS). (These parameters are not documented in Documentation/kernel-parameters.txt! Someone ought to fix that.) Ben. -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked. ^ permalink raw reply [flat|nested] 9+ messages in thread
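As an illustration of the MPS/MPSS relationship Ben describes, the two values can be decoded from raw register contents. This is only a minimal sketch. It assumes the standard PCI Express capability layout, where Device Capabilities bits 2:0 hold MPSS and Device Control bits 7:5 hold MPS, each encoded as 128 << n bytes. The function names are hypothetical and the register values would have to be read from the real device.

```shell
#!/bin/sh
# Sketch only: decode payload-size fields from raw PCIe register values.
# Assumed layout (standard PCI Express capability):
#   Device Capabilities bits 2:0 -> Max_Payload_Size Supported (MPSS)
#   Device Control      bits 7:5 -> Max_Payload_Size (MPS)
# Each 3-bit field n encodes 128 << n bytes (encodings 6 and 7 are reserved
# and are not checked here).

# $1 = Device Capabilities register value (hex such as 0x0001 is accepted)
mpss_bytes() {
    echo $(( 128 << ($1 & 0x7) ))
}

# $1 = Device Control register value
mps_bytes() {
    echo $(( 128 << (($1 >> 5) & 0x7) ))
}

# $1 = Device Capabilities, $2 = Device Control
# Succeeds only if the programmed MPS does not exceed the device's MPSS.
mps_is_valid() {
    [ "$(mps_bytes "$2")" -le "$(mpss_bytes "$1")" ]
}
```

With hypothetical values, `mps_is_valid 0x0001 0x2830` succeeds (MPS 256 against MPSS 256), while `mps_is_valid 0x0000 0x2830` fails (MPS 256 against MPSS 128), which is the kind of mismatch this thread is trying to rule out.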
* RE: [E1000-devel] 82571EB: Detected Hardware Unit Hang 2012-11-27 18:10 ` [E1000-devel] 82571EB: Detected Hardware Unit Hang Ben Hutchings @ 2012-11-27 18:24 ` Fujinaka, Todd 2012-11-28 8:31 ` Joe Jin 1 sibling, 0 replies; 9+ messages in thread From: Fujinaka, Todd @ 2012-11-27 18:24 UTC (permalink / raw) To: Ben Hutchings, Mary Mcgrath Cc: Joe Jin, netdev@vger.kernel.org, e1000-devel@lists.sf.net, linux-kernel@vger.kernel.org, linux-pci Thanks for the clarification. I was just going by the PCIe spec, which says the lowest value of both ends is used, and I figured SOMETHING had to be looking at that and doing some sort of negotiation. I'm no BIOS guy, so I'm not sure what's actually going on, whether something walks the PCIe tree or if the BIOS just sets all the values to the minimum. Todd Fujinaka Technical Marketing Engineer LAN Access Division (LAD) Intel Corporation todd.fujinaka@intel.com (503) 712-4565 -----Original Message----- From: Ben Hutchings [mailto:bhutchings@solarflare.com] Sent: Tuesday, November 27, 2012 10:11 AM To: Fujinaka, Todd; Mary Mcgrath Cc: Joe Jin; netdev@vger.kernel.org; e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci Subject: RE: [E1000-devel] 82571EB: Detected Hardware Unit Hang On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote: > Forgive me if I'm being too repetitious as I think some of this has > been mentioned in the past. > > We (and by we I mean the Ethernet part and driver) can only change the > advertised availability of a larger MaxPayloadSize. The size is > negotiated by both sides of the link when the link is established. The > driver should not change the size of the link as it would be poking at > registers outside of its scope and is controlled by the upstream > bridge (not us). [...] MaxPayloadSize (MPS) is not negotiated between devices but is programmed by the system firmware (at least for devices present at boot - the kernel may be responsible in case of hotplug). You can use the kernel parameter 'pci=pcie_bus_perf' (or one of several others) to set a policy that overrides this, but no policy will allow setting MPS above the device's MaxPayloadSizeSupported (MPSS). (These parameters are not documented in Documentation/kernel-parameters.txt! Someone ought to fix that.) Ben. -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked.
* Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang 2012-11-27 18:10 ` [E1000-devel] 82571EB: Detected Hardware Unit Hang Ben Hutchings 2012-11-27 18:24 ` Fujinaka, Todd @ 2012-11-28 8:31 ` Joe Jin 2012-11-28 15:53 ` Fujinaka, Todd 1 sibling, 1 reply; 9+ messages in thread From: Joe Jin @ 2012-11-28 8:31 UTC (permalink / raw) To: Ben Hutchings Cc: Fujinaka, Todd, Mary Mcgrath, netdev@vger.kernel.org, e1000-devel@lists.sf.net, linux-kernel@vger.kernel.org, linux-pci On 11/28/12 02:10, Ben Hutchings wrote: > On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote: >> Forgive me if I'm being too repetitious as I think some of this has >> been mentioned in the past. >> >> We (and by we I mean the Ethernet part and driver) can only change the >> advertised availability of a larger MaxPayloadSize. The size is >> negotiated by both sides of the link when the link is established. The >> driver should not change the size of the link as it would be poking at >> registers outside of its scope and is controlled by the upstream >> bridge (not us). > [...] > > MaxPayloadSize (MPS) is not negotiated between devices but is programmed > by the system firmware (at least for devices present at boot - the > kernel may be responsible in case of hotplug). You can use the kernel > parameter 'pci=pcie_bus_perf' (or one of several others) to set a policy > that overrides this, but no policy will allow setting MPS above the > device's MaxPayloadSizeSupported (MPSS). > Ben, Unfortunately I'm using a 3.0.x kernel, and this parameter is not included in that kernel. So I'm trying to use ethtool to modify the setting in the EEPROM to see whether that helps. Todd, I'll review MaxPayload for all devices. But if there is a mismatch, the customer cannot change it from the BIOS because there is no entry for it there. To test this, we have to find a way to verify whether it is the root cause, so I still need to find the offset in the EEPROM. Thanks in advance, Joe
* RE: [E1000-devel] 82571EB: Detected Hardware Unit Hang 2012-11-28 8:31 ` Joe Jin @ 2012-11-28 15:53 ` Fujinaka, Todd 2012-11-29 3:10 ` Ethan Zhao 0 siblings, 1 reply; 9+ messages in thread From: Fujinaka, Todd @ 2012-11-28 15:53 UTC (permalink / raw) To: Joe Jin, Ben Hutchings Cc: Mary Mcgrath, netdev@vger.kernel.org, e1000-devel@lists.sf.net, linux-kernel@vger.kernel.org, linux-pci The only EEPROM I know about or can speak to is the one attached to the 82571 and it doesn't set the MaxPayloadSize. That's done by the BIOS. Todd Fujinaka Technical Marketing Engineer LAN Access Division (LAD) Intel Corporation todd.fujinaka@intel.com (503) 712-4565 -----Original Message----- From: Joe Jin [mailto:joe.jin@oracle.com] Sent: Wednesday, November 28, 2012 12:31 AM To: Ben Hutchings Cc: Fujinaka, Todd; Mary Mcgrath; netdev@vger.kernel.org; e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang On 11/28/12 02:10, Ben Hutchings wrote: > On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote: >> Forgive me if I'm being too repetitious as I think some of this has >> been mentioned in the past. >> >> We (and by we I mean the Ethernet part and driver) can only change >> the advertised availability of a larger MaxPayloadSize. The size is >> negotiated by both sides of the link when the link is established. >> The driver should not change the size of the link as it would be >> poking at registers outside of its scope and is controlled by the >> upstream bridge (not us). > [...] > > MaxPayloadSize (MPS) is not negotiated between devices but is > programmed by the system firmware (at least for devices present at > boot - the kernel may be responsible in case of hotplug). You can use > the kernel parameter 'pci=pcie_bus_perf' (or one of several others) to > set a policy that overrides this, but no policy will allow setting MPS > above the device's MaxPayloadSizeSupported (MPSS). > Ben, Unfortunately I'm using 3.0.x kernel and this is not included in the kernel. So I'm trying to use ethtool modify it from eeprom to see if help or no. Todd, I'll review all MaxPayload for all devices, but need to say if it mismatch, customer could not modify it from BIOS for there was not entry at there, to test it, we have to find how to verify if this is the root cause, so still need to find the offset in eeprom. Thanks in advance, Joe
* Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang 2012-11-28 15:53 ` Fujinaka, Todd @ 2012-11-29 3:10 ` Ethan Zhao 2012-11-29 15:52 ` Fujinaka, Todd 0 siblings, 1 reply; 9+ messages in thread From: Ethan Zhao @ 2012-11-29 3:10 UTC (permalink / raw) To: Fujinaka, Todd Cc: Joe Jin, Ben Hutchings, Mary Mcgrath, netdev@vger.kernel.org, e1000-devel@lists.sf.net, linux-kernel@vger.kernel.org, linux-pci Joe, Possibly your customer is running a kernel without source code on a platform whose vendor doesn't want to fix the BIOS issue (is that an HP/Dell server?). Anyway, to see whether it is a payload issue or not, you could change the payload size of those devices with the setpci tool and set the link retrain bit to trigger link retraining, in order to debug the issue and identify the root cause. I think that is much easier than modifying the BIOS or the EEPROM of the NIC. e.g. set device control register to 0f 00 (128 bytes payload size) # setpci -v -s 00:02.0 98.w=000f set device link control register to 60h (retrain the link) # setpci -v -s 00:02.0 a0.b=60 Hope it works, Just my 2 cents. Ethan.zhao@oracle.com On Wed, Nov 28, 2012 at 11:53 PM, Fujinaka, Todd <todd.fujinaka@intel.com> wrote: > The only EEPROM I know about or can speak to is the one attached to the 82571 and it doesn't set the MaxPayloadSize. That's done by the BIOS. > > Todd Fujinaka > Technical Marketing Engineer > LAN Access Division (LAD) > Intel Corporation > todd.fujinaka@intel.com > (503) 712-4565 > > > -----Original Message----- > From: Joe Jin [mailto:joe.jin@oracle.com] > Sent: Wednesday, November 28, 2012 12:31 AM > To: Ben Hutchings > Cc: Fujinaka, Todd; Mary Mcgrath; netdev@vger.kernel.org; e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci > Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang > > On 11/28/12 02:10, Ben Hutchings wrote: >> On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote: >>> Forgive me if I'm being too repetitious as I think some of this has >>> been mentioned in the past. 
>>> >>> We (and by we I mean the Ethernet part and driver) can only change >>> the advertised availability of a larger MaxPayloadSize. The size is >>> negotiated by both sides of the link when the link is established. >>> The driver should not change the size of the link as it would be >>> poking at registers outside of its scope and is controlled by the >>> upstream bridge (not us). >> [...] >> >> MaxPayloadSize (MPS) is not negotiated between devices but is >> programmed by the system firmware (at least for devices present at >> boot - the kernel may be responsible in case of hotplug). You can use >> the kernel parameter 'pci=pcie_bus_perf' (or one of several others) to >> set a policy that overrides this, but no policy will allow setting MPS >> above the device's MaxPayloadSizeSupported (MPSS). >> > > Ben, > > Unfortunately I'm using 3.0.x kernel and this is not included in the kernel. > So I'm trying to use ethtool modify it from eeprom to see if help or no. > > > Todd, I'll review all MaxPayload for all devices, but need to say if it mismatch, customer could not modify it from BIOS for there was not entry at there, to test it, we have to find how to verify if this is the root cause, so still need to find the offset in eeprom. > > Thanks in advance, > Joe >
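A note on the setpci example in Ethan's message: writing 000f to the Device Control register replaces the whole 16-bit word, so fields other than the payload size (for example the read request size in bits 14:12) are overwritten too. The offsets 98 and a0 are consistent with a PCIe capability located at 0x90 on that particular device (Device Control at +0x08, Link Control at +0x10) and will differ on other devices. The sketch below is only an illustration of computing a replacement value that changes the MPS field alone. The function name is hypothetical and the current register value is assumed to have been read beforehand, for instance with `setpci -s 00:02.0 98.w`.

```shell
#!/bin/sh
# Sketch only: compute a new Device Control value that changes just the
# Max_Payload_Size field (bits 7:5) and preserves every other bit.

# $1 = current Device Control value (e.g. 0x2830)
# $2 = desired payload size in bytes
# Prints the new value as four hex digits, suitable for "setpci ... .w=".
devctl_with_mps() {
    case "$2" in
        128)  field=0 ;;
        256)  field=1 ;;
        512)  field=2 ;;
        1024) field=3 ;;
        2048) field=4 ;;
        4096) field=5 ;;
        *) echo "unsupported payload size: $2" >&2; return 1 ;;
    esac
    # 0xFF1F clears bits 7:5 of the 16-bit register before the new field is ORed in.
    printf '%04x\n' $(( ($1 & 0xFF1F) | ($field << 5) ))
}
```

For a hypothetical current value of 0x2830 (MPS 256), `devctl_with_mps 0x2830 128` prints `2810`, which lowers MPS to 128 bytes and leaves the remaining bits as they were.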
* RE: [E1000-devel] 82571EB: Detected Hardware Unit Hang 2012-11-29 3:10 ` Ethan Zhao @ 2012-11-29 15:52 ` Fujinaka, Todd 2012-12-19 3:04 ` Joe Jin 0 siblings, 1 reply; 9+ messages in thread From: Fujinaka, Todd @ 2012-11-29 15:52 UTC (permalink / raw) To: Ethan Zhao Cc: Joe Jin, Ben Hutchings, Mary Mcgrath, netdev@vger.kernel.org, e1000-devel@lists.sf.net, linux-kernel@vger.kernel.org, linux-pci Someone else pointed this out to me locally. If you have a non-client BIOS, you should be able to set the MaxPayloadSize using setpci. You have to make sure that you're being consistent throughout all the associated links. Todd Fujinaka Technical Marketing Engineer LAN Access Division (LAD) Intel Corporation todd.fujinaka@intel.com (503) 712-4565 -----Original Message----- From: Ethan Zhao [mailto:ethan.kernel@gmail.com] Sent: Wednesday, November 28, 2012 7:10 PM To: Fujinaka, Todd Cc: Joe Jin; Ben Hutchings; Mary Mcgrath; netdev@vger.kernel.org; e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang Joe, Possibly your customer is running a kernel without source code on a platform whose vendor wouldn't like to fix BIOS issue( Is that a HP/Dell server ?). Anyway, to see if is a payload issue or, you could change the payload size with setpci tool to those devices and set the link retrain bit to trigger the link retraining to debug the issue and identity the root cause. I thinks it is much easier than modify the BIOS or eeprom of NIC. e.g. set device control register to 0f 00 (128 bytes payload size) # setpci -v -s 00:02.0 98.w=000f set device link control register to 60h (retrain the link) # setpci -v -s 00:02.0 a0.b=60 Hope it works, Just my 2 cents. Ethan.zhao@oracle.com On Wed, Nov 28, 2012 at 11:53 PM, Fujinaka, Todd <todd.fujinaka@intel.com> wrote: > The only EEPROM I know about or can speak to is the one attached to the 82571 and it doesn't set the MaxPayloadSize. That's done by the BIOS. 
> > Todd Fujinaka > Technical Marketing Engineer > LAN Access Division (LAD) > Intel Corporation > todd.fujinaka@intel.com > (503) 712-4565 > > > -----Original Message----- > From: Joe Jin [mailto:joe.jin@oracle.com] > Sent: Wednesday, November 28, 2012 12:31 AM > To: Ben Hutchings > Cc: Fujinaka, Todd; Mary Mcgrath; netdev@vger.kernel.org; > e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci > Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang > > On 11/28/12 02:10, Ben Hutchings wrote: >> On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote: >>> Forgive me if I'm being too repetitious as I think some of this has >>> been mentioned in the past. >>> >>> We (and by we I mean the Ethernet part and driver) can only change >>> the advertised availability of a larger MaxPayloadSize. The size is >>> negotiated by both sides of the link when the link is established. >>> The driver should not change the size of the link as it would be >>> poking at registers outside of its scope and is controlled by the >>> upstream bridge (not us). >> [...] >> >> MaxPayloadSize (MPS) is not negotiated between devices but is >> programmed by the system firmware (at least for devices present at >> boot - the kernel may be responsible in case of hotplug). You can >> use the kernel parameter 'pci=pcie_bus_perf' (or one of several >> others) to set a policy that overrides this, but no policy will allow >> setting MPS above the device's MaxPayloadSizeSupported (MPSS). >> > > Ben, > > Unfortunately I'm using 3.0.x kernel and this is not included in the kernel. > So I'm trying to use ethtool modify it from eeprom to see if help or no. > > > Todd, I'll review all MaxPayload for all devices, but need to say if it mismatch, customer could not modify it from BIOS for there was not entry at there, to test it, we have to find how to verify if this is the root cause, so still need to find the offset in eeprom. 
> > Thanks in advance, > Joe >
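Todd's point about staying consistent across all associated links can be pictured as follows: a payload size is safe for a path only if every device on that path supports it, so the usable value is the smallest MPSS among them. A minimal sketch, assuming the MPSS values have already been collected in bytes (for example from the "DevCap: MaxPayload" figures that `lspci -vv` reports); the function name and the sample numbers are hypothetical.

```shell
#!/bin/sh
# Sketch only: pick the largest payload size that every device on a path
# supports, i.e. the minimum of their MPSS values.

# Arguments: MPSS in bytes for each device on the path (root port, any
# switch ports, endpoint).
common_mps() {
    min=$1
    for size in "$@"; do
        if [ "$size" -lt "$min" ]; then
            min=$size
        fi
    done
    echo "$min"
}
```

With hypothetical values for a root port, a switch and the NIC, `common_mps 256 512 128` prints `128`, so all three would need MPS programmed to 128 bytes.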
* Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang 2012-11-29 15:52 ` Fujinaka, Todd @ 2012-12-19 3:04 ` Joe Jin 2012-12-19 5:52 ` Yijing Wang 0 siblings, 1 reply; 9+ messages in thread From: Joe Jin @ 2012-12-19 3:04 UTC (permalink / raw) To: Fujinaka, Todd Cc: Ethan Zhao, Ben Hutchings, Mary Mcgrath, netdev@vger.kernel.org, e1000-devel@lists.sf.net, linux-kernel@vger.kernel.org, linux-pci Hi all, I backported the MPS commits and asked the customer to pass "pci=pcie_bus_peer2peer" to the kernel to limit MPS to 128, and the issue disappeared. It sounds like this is a BIOS bug. Thanks for all of your help. Best Regards, Joe On 11/29/12 23:52, Fujinaka, Todd wrote: > Someone else pointed this out to me locally. If you have a non-client BIOS, you should be able to set the MaxPayloadSize using setpci. You have to make sure that you're being consistent throughout all the associated links. > > Todd Fujinaka > Technical Marketing Engineer > LAN Access Division (LAD) > Intel Corporation > todd.fujinaka@intel.com > (503) 712-4565 > > > -----Original Message----- > From: Ethan Zhao [mailto:ethan.kernel@gmail.com] > Sent: Wednesday, November 28, 2012 7:10 PM > To: Fujinaka, Todd > Cc: Joe Jin; Ben Hutchings; Mary Mcgrath; netdev@vger.kernel.org; e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci > Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang > > Joe, > Possibly your customer is running a kernel without source code on a platform whose vendor wouldn't like to fix BIOS issue( Is that a HP/Dell server ?). > Anyway, to see if is a payload issue or, you could change the payload size with setpci tool to those devices and set the link retrain bit to trigger the link retraining to debug the issue and identity the root cause. I thinks it is much easier than modify the BIOS or eeprom of NIC. > > e.g. 
> set device control register to 0f 00 (128 bytes payload size) > # setpci -v -s 00:02.0 98.w=000f > set device link control register to 60h (retrain the link) > # setpci -v -s 00:02.0 a0.b=60 > > Hope it works, Just my 2 cents. > > Ethan.zhao@oracle.com > > On Wed, Nov 28, 2012 at 11:53 PM, Fujinaka, Todd <todd.fujinaka@intel.com> wrote: >> The only EEPROM I know about or can speak to is the one attached to the 82571 and it doesn't set the MaxPayloadSize. That's done by the BIOS. >> >> Todd Fujinaka >> Technical Marketing Engineer >> LAN Access Division (LAD) >> Intel Corporation >> todd.fujinaka@intel.com >> (503) 712-4565 >> >> >> -----Original Message----- >> From: Joe Jin [mailto:joe.jin@oracle.com] >> Sent: Wednesday, November 28, 2012 12:31 AM >> To: Ben Hutchings >> Cc: Fujinaka, Todd; Mary Mcgrath; netdev@vger.kernel.org; >> e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci >> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang >> >> On 11/28/12 02:10, Ben Hutchings wrote: >>> On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote: >>>> Forgive me if I'm being too repetitious as I think some of this has >>>> been mentioned in the past. >>>> >>>> We (and by we I mean the Ethernet part and driver) can only change >>>> the advertised availability of a larger MaxPayloadSize. The size is >>>> negotiated by both sides of the link when the link is established. >>>> The driver should not change the size of the link as it would be >>>> poking at registers outside of its scope and is controlled by the >>>> upstream bridge (not us). >>> [...] >>> >>> MaxPayloadSize (MPS) is not negotiated between devices but is >>> programmed by the system firmware (at least for devices present at >>> boot - the kernel may be responsible in case of hotplug). 
You can >>> use the kernel parameter 'pci=pcie_bus_perf' (or one of several >>> others) to set a policy that overrides this, but no policy will allow >>> setting MPS above the device's MaxPayloadSizeSupported (MPSS). >>> >> >> Ben, >> >> Unfortunately I'm using 3.0.x kernel and this is not included in the kernel. >> So I'm trying to use ethtool modify it from eeprom to see if help or no. >> >> >> Todd, I'll review all MaxPayload for all devices, but need to say if it mismatch, customer could not modify it from BIOS for there was not entry at there, to test it, we have to find how to verify if this is the root cause, so still need to find the offset in eeprom. >> >> Thanks in advance, >> Joe >> -- Oracle <http://www.oracle.com> Joe Jin | Software Development Senior Manager | +8610.6106.5624 ORACLE | Linux and Virtualization No. 24 Zhongguancun Software Park, Haidian District | 100193 Beijing
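To confirm which MPS policy a system was actually booted with, the kernel command line can be inspected. The sketch below is only an illustration and assumes a kernel that includes the MPS tuning options, which are pcie_bus_tune_off, pcie_bus_safe, pcie_bus_perf and pcie_bus_peer2peer. The function name is hypothetical, and the simple substring match does not validate the rest of the pci= option list.

```shell
#!/bin/sh
# Sketch only: report which pcie_bus_* MPS policy appears in a kernel
# command line string. Prints "none" if no known policy is present.

# $1 = kernel command line, e.g. the contents of /proc/cmdline
mps_policy() {
    case "$1" in
        *pcie_bus_tune_off*)  echo "pcie_bus_tune_off" ;;
        *pcie_bus_safe*)      echo "pcie_bus_safe" ;;
        *pcie_bus_perf*)      echo "pcie_bus_perf" ;;
        *pcie_bus_peer2peer*) echo "pcie_bus_peer2peer" ;;
        *)                    echo "none" ;;
    esac
}
```

On a running system this could be called as `mps_policy "$(cat /proc/cmdline)"`. A result of `none` means the parameter was not passed or was misspelled.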
* Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang 2012-12-19 3:04 ` Joe Jin @ 2012-12-19 5:52 ` Yijing Wang 2012-12-19 6:13 ` Joe Jin 0 siblings, 1 reply; 9+ messages in thread From: Yijing Wang @ 2012-12-19 5:52 UTC (permalink / raw) To: Joe Jin Cc: Fujinaka, Todd, Ethan Zhao, Ben Hutchings, Mary Mcgrath, netdev@vger.kernel.org, e1000-devel@lists.sf.net, linux-kernel@vger.kernel.org, linux-pci, Jon Mason, Bjorn Helgaas On 2012/12/19 11:04, Joe Jin wrote: > Hi all, > > I backported mps commits and ask customer pass "pci=pcie_bus_peer2peer" to kernel > to limited MPS to 128 and issue disappeared, sound like this is a BIOS bug. > Hi Joe, I found a similar problem when doing PCI hotplug; the discussion is at <http://marc.info/?l=linux-pci&m=134810569924220&w=2>. Based on Bjorn's suggestion, we are trying to improve the Linux kernel so that this problem is easier to debug. Jon sent out the first version of the patch at <http://marc.info/?l=linux-pci&m=135002016005274&w=2>. I think we can go further, as discussed at <http://marc.info/?l=linux-pci&m=135115581307869&w=2>. I hope this information helps you. Thanks! Yijing. > Thanks all of your help. > > Best Regards, > Joe > > On 11/29/12 23:52, Fujinaka, Todd wrote: >> Someone else pointed this out to me locally. If you have a non-client BIOS, you should be able to set the MaxPayloadSize using setpci. You have to make sure that you're being consistent throughout all the associated links. 
>> >> Todd Fujinaka >> Technical Marketing Engineer >> LAN Access Division (LAD) >> Intel Corporation >> todd.fujinaka@intel.com >> (503) 712-4565 >> >> >> -----Original Message----- >> From: Ethan Zhao [mailto:ethan.kernel@gmail.com] >> Sent: Wednesday, November 28, 2012 7:10 PM >> To: Fujinaka, Todd >> Cc: Joe Jin; Ben Hutchings; Mary Mcgrath; netdev@vger.kernel.org; e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci >> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang >> >> Joe, >> Possibly your customer is running a kernel without source code on a platform whose vendor wouldn't like to fix BIOS issue( Is that a HP/Dell server ?). >> Anyway, to see if is a payload issue or, you could change the payload size with setpci tool to those devices and set the link retrain bit to trigger the link retraining to debug the issue and identity the root cause. I thinks it is much easier than modify the BIOS or eeprom of NIC. >> >> e.g. >> set device control register to 0f 00 (128 bytes payload size) >> # setpci -v -s 00:02.0 98.w=000f >> set device link control register to 60h (retrain the link) >> # setpci -v -s 00:02.0 a0.b=60 >> >> Hope it works, Just my 2 cents. >> >> Ethan.zhao@oracle.com >> >> On Wed, Nov 28, 2012 at 11:53 PM, Fujinaka, Todd <todd.fujinaka@intel.com> wrote: >>> The only EEPROM I know about or can speak to is the one attached to the 82571 and it doesn't set the MaxPayloadSize. That's done by the BIOS. 
>>> >>> Todd Fujinaka >>> Technical Marketing Engineer >>> LAN Access Division (LAD) >>> Intel Corporation >>> todd.fujinaka@intel.com >>> (503) 712-4565 >>> >>> >>> -----Original Message----- >>> From: Joe Jin [mailto:joe.jin@oracle.com] >>> Sent: Wednesday, November 28, 2012 12:31 AM >>> To: Ben Hutchings >>> Cc: Fujinaka, Todd; Mary Mcgrath; netdev@vger.kernel.org; >>> e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci >>> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang >>> >>> On 11/28/12 02:10, Ben Hutchings wrote: >>>> On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote: >>>>> Forgive me if I'm being too repetitious as I think some of this has >>>>> been mentioned in the past. >>>>> >>>>> We (and by we I mean the Ethernet part and driver) can only change >>>>> the advertised availability of a larger MaxPayloadSize. The size is >>>>> negotiated by both sides of the link when the link is established. >>>>> The driver should not change the size of the link as it would be >>>>> poking at registers outside of its scope and is controlled by the >>>>> upstream bridge (not us). >>>> [...] >>>> >>>> MaxPayloadSize (MPS) is not negotiated between devices but is >>>> programmed by the system firmware (at least for devices present at >>>> boot - the kernel may be responsible in case of hotplug). You can >>>> use the kernel parameter 'pci=pcie_bus_perf' (or one of several >>>> others) to set a policy that overrides this, but no policy will allow >>>> setting MPS above the device's MaxPayloadSizeSupported (MPSS). >>>> >>> >>> Ben, >>> >>> Unfortunately I'm using 3.0.x kernel and this is not included in the kernel. >>> So I'm trying to use ethtool modify it from eeprom to see if help or no. 
>>> >>> >>> Todd, I'll review all MaxPayload for all devices, but need to say if it mismatch, customer could not modify it from BIOS for there was not entry at there, to test it, we have to find how to verify if this is the root cause, so still need to find the offset in eeprom. >>> >>> Thanks in advance, >>> Joe >>> > > -- Thanks! Yijing
* Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang 2012-12-19 5:52 ` Yijing Wang @ 2012-12-19 6:13 ` Joe Jin 0 siblings, 0 replies; 9+ messages in thread From: Joe Jin @ 2012-12-19 6:13 UTC (permalink / raw) To: Yijing Wang Cc: Fujinaka, Todd, Ethan Zhao, Ben Hutchings, Mary Mcgrath, netdev@vger.kernel.org, e1000-devel@lists.sf.net, linux-kernel@vger.kernel.org, linux-pci, Jon Mason, Bjorn Helgaas Hi Yijing, Thanks for the reference. The patch looks good to me, but I have no chance to test it in the customer's environment. Best Regards, Joe On 12/19/12 13:52, Yijing Wang wrote: > On 2012/12/19 11:04, Joe Jin wrote: >> Hi all, >> >> I backported mps commits and ask customer pass "pci=pcie_bus_peer2peer" to kernel >> to limited MPS to 128 and issue disappeared, sound like this is a BIOS bug. >> > > Hi Joe, > I found similar problem when I do pci hotplug, discussion is here:http://marc.info/?l=linux-pci&m=134810569924220&w=2. > We try to improve Linux kernel to debug this problem easily based Bjorn's suggestion. Jon sent out the first version patch http://marc.info/?l=linux-pci&m=135002016005274&w=2. > I think we can do further here, http://marc.info/?l=linux-pci&m=135115581307869&w=2. I hope this information can help you. > > Thanks! > Yijing. > >> Thanks all of your help. >> >> Best Regards, >> Joe >> >> On 11/29/12 23:52, Fujinaka, Todd wrote: >>> Someone else pointed this out to me locally. If you have a non-client BIOS, you should be able to set the MaxPayloadSize using setpci. You have to make sure that you're being consistent throughout all the associated links. 
>>> >>> Todd Fujinaka >>> Technical Marketing Engineer >>> LAN Access Division (LAD) >>> Intel Corporation >>> todd.fujinaka@intel.com >>> (503) 712-4565 >>> >>> >>> -----Original Message----- >>> From: Ethan Zhao [mailto:ethan.kernel@gmail.com] >>> Sent: Wednesday, November 28, 2012 7:10 PM >>> To: Fujinaka, Todd >>> Cc: Joe Jin; Ben Hutchings; Mary Mcgrath; netdev@vger.kernel.org; e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci >>> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang >>> >>> Joe, >>> Possibly your customer is running a kernel without source code on a platform whose vendor wouldn't like to fix BIOS issue( Is that a HP/Dell server ?). >>> Anyway, to see if is a payload issue or, you could change the payload size with setpci tool to those devices and set the link retrain bit to trigger the link retraining to debug the issue and identity the root cause. I thinks it is much easier than modify the BIOS or eeprom of NIC. >>> >>> e.g. >>> set device control register to 0f 00 (128 bytes payload size) >>> # setpci -v -s 00:02.0 98.w=000f >>> set device link control register to 60h (retrain the link) >>> # setpci -v -s 00:02.0 a0.b=60 >>> >>> Hope it works, Just my 2 cents. >>> >>> Ethan.zhao@oracle.com >>> >>> On Wed, Nov 28, 2012 at 11:53 PM, Fujinaka, Todd <todd.fujinaka@intel.com> wrote: >>>> The only EEPROM I know about or can speak to is the one attached to the 82571 and it doesn't set the MaxPayloadSize. That's done by the BIOS. 
>>>> >>>> Todd Fujinaka >>>> Technical Marketing Engineer >>>> LAN Access Division (LAD) >>>> Intel Corporation >>>> todd.fujinaka@intel.com >>>> (503) 712-4565 >>>> >>>> >>>> -----Original Message----- >>>> From: Joe Jin [mailto:joe.jin@oracle.com] >>>> Sent: Wednesday, November 28, 2012 12:31 AM >>>> To: Ben Hutchings >>>> Cc: Fujinaka, Todd; Mary Mcgrath; netdev@vger.kernel.org; >>>> e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci >>>> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang >>>> >>>> On 11/28/12 02:10, Ben Hutchings wrote: >>>>> On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote: >>>>>> Forgive me if I'm being too repetitious as I think some of this has >>>>>> been mentioned in the past. >>>>>> >>>>>> We (and by we I mean the Ethernet part and driver) can only change >>>>>> the advertised availability of a larger MaxPayloadSize. The size is >>>>>> negotiated by both sides of the link when the link is established. >>>>>> The driver should not change the size of the link as it would be >>>>>> poking at registers outside of its scope and is controlled by the >>>>>> upstream bridge (not us). >>>>> [...] >>>>> >>>>> MaxPayloadSize (MPS) is not negotiated between devices but is >>>>> programmed by the system firmware (at least for devices present at >>>>> boot - the kernel may be responsible in case of hotplug). You can >>>>> use the kernel parameter 'pci=pcie_bus_perf' (or one of several >>>>> others) to set a policy that overrides this, but no policy will allow >>>>> setting MPS above the device's MaxPayloadSizeSupported (MPSS). >>>>> >>>> >>>> Ben, >>>> >>>> Unfortunately I'm using 3.0.x kernel and this is not included in the kernel. >>>> So I'm trying to use ethtool modify it from eeprom to see if help or no. 
>>>> >>>> >>>> Todd, I'll review all MaxPayload for all devices, but need to say if it mismatch, customer could not modify it from BIOS for there was not entry at there, to test it, we have to find how to verify if this is the root cause, so still need to find the offset in eeprom. >>>> >>>> Thanks in advance, >>>> Joe >>>> >> >> > > -- Oracle <http://www.oracle.com> Joe Jin | Software Development Senior Manager | +8610.6106.5624 ORACLE | Linux and Virtualization No. 24 Zhongguancun Software Park, Haidian District | 100193 Beijing