From: Joe Jin <joe.jin@oracle.com>
To: Yijing Wang <wangyijing@huawei.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
Jon Mason <jdmason@kudzu.us>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Bjorn Helgaas <bhelgaas@google.com>,
"e1000-devel@lists.sf.net" <e1000-devel@lists.sf.net>,
Mary Mcgrath <mary.mcgrath@oracle.com>,
linux-pci <linux-pci@vger.kernel.org>,
Ben Hutchings <bhutchings@solarflare.com>,
Ethan Zhao <ethan.kernel@gmail.com>
Subject: Re: 82571EB: Detected Hardware Unit Hang
Date: Wed, 19 Dec 2012 14:13:00 +0800
Message-ID: <50D15AEC.5050306@oracle.com>
In-Reply-To: <50D15600.6060001@huawei.com>
Hi Yijing,
Thanks for the references. The patch looks good to me, but I have no chance
to test it in the customer's environment.
Best Regards,
Joe
On 12/19/12 13:52, Yijing Wang wrote:
> On 2012/12/19 11:04, Joe Jin wrote:
>> Hi all,
>>
>> I backported the MPS commits and asked the customer to pass "pci=pcie_bus_peer2peer" to the kernel
>> to limit MPS to 128, and the issue disappeared. It sounds like this is a BIOS bug.
>>
>
> Hi Joe,
> I found a similar problem when doing PCI hotplug; the discussion is here: http://marc.info/?l=linux-pci&m=134810569924220&w=2
> Based on Bjorn's suggestion, we are trying to improve the Linux kernel so that this problem is easier to debug. Jon sent out the first version of the patch: http://marc.info/?l=linux-pci&m=135002016005274&w=2
> I think we can go further here: http://marc.info/?l=linux-pci&m=135115581307869&w=2 I hope this information helps you.
>
> Thanks!
> Yijing.
>
>> Thanks to all of you for your help.
>>
>> Best Regards,
>> Joe
>>
>> On 11/29/12 23:52, Fujinaka, Todd wrote:
>>> Someone else pointed this out to me locally. If you have a non-client BIOS, you should be able to set the MaxPayloadSize using setpci. You have to make sure that you're being consistent throughout all the associated links.
>>>
>>> Todd Fujinaka
>>> Technical Marketing Engineer
>>> LAN Access Division (LAD)
>>> Intel Corporation
>>> todd.fujinaka@intel.com
>>> (503) 712-4565
>>>
>>>
>>> -----Original Message-----
>>> From: Ethan Zhao [mailto:ethan.kernel@gmail.com]
>>> Sent: Wednesday, November 28, 2012 7:10 PM
>>> To: Fujinaka, Todd
>>> Cc: Joe Jin; Ben Hutchings; Mary Mcgrath; netdev@vger.kernel.org; e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci
>>> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
>>>
>>> Joe,
>>> Possibly your customer is running a kernel without source code on a platform whose vendor is unwilling to fix the BIOS issue (is that an HP/Dell server?).
>>> Anyway, to see whether it is a payload issue or not, you could use the setpci tool to change the payload size on those devices and set the link retrain bit to trigger link retraining, so you can debug the issue and identify the root cause. I think that is much easier than modifying the BIOS or the EEPROM of the NIC.
>>>
>>> e.g.
>>> set device control register to 0f 00 (128 bytes payload size)
>>> # setpci -v -s 00:02.0 98.w=000f
>>> set device link control register to 60h (retrain the link)
>>> # setpci -v -s 00:02.0 a0.b=60
>>>
>>> Hope it works, Just my 2 cents.
>>>
>>> Ethan.zhao@oracle.com
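
For reference, the two values written in the setpci example above can be checked with plain shell arithmetic before touching any hardware. This is only a sketch: the function names are made up for illustration, and it assumes the standard PCIe register layout, in which Max_Payload_Size occupies bits 7:5 of Device Control and Retrain Link is bit 5 of Link Control. The offsets 98 and a0 come from the example and depend on where the device's PCIe capability is located.

```shell
# Decode the Max_Payload_Size field (bits 7:5) of a Device Control value.
# Encoding: 0 -> 128 bytes, 1 -> 256, ... 5 -> 4096; 6 and 7 are reserved.
devctl_mps_bytes() {
    field=$(( ($1 >> 5) & 0x7 ))
    if [ "$field" -gt 5 ]; then
        echo "reserved"
        return 1
    fi
    echo $(( 128 << field ))
}

# Succeed if the Retrain Link bit (bit 5) is set in a Link Control value.
lnkctl_retrain_set() {
    [ $(( ($1 >> 5) & 1 )) -eq 1 ]
}

# Values taken from the setpci example above.
devctl_mps_bytes 0x000f
if lnkctl_retrain_set 0x60; then
    echo "retrain bit set"
fi
```

Both checks are pure arithmetic and do not read or write PCI config space, so they can be run on any machine.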
>>>
>>> On Wed, Nov 28, 2012 at 11:53 PM, Fujinaka, Todd <todd.fujinaka@intel.com> wrote:
>>>> The only EEPROM I know about or can speak to is the one attached to the 82571 and it doesn't set the MaxPayloadSize. That's done by the BIOS.
>>>>
>>>> Todd Fujinaka
>>>> Technical Marketing Engineer
>>>> LAN Access Division (LAD)
>>>> Intel Corporation
>>>> todd.fujinaka@intel.com
>>>> (503) 712-4565
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Joe Jin [mailto:joe.jin@oracle.com]
>>>> Sent: Wednesday, November 28, 2012 12:31 AM
>>>> To: Ben Hutchings
>>>> Cc: Fujinaka, Todd; Mary Mcgrath; netdev@vger.kernel.org;
>>>> e1000-devel@lists.sf.net; linux-kernel@vger.kernel.org; linux-pci
>>>> Subject: Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
>>>>
>>>> On 11/28/12 02:10, Ben Hutchings wrote:
>>>>> On Tue, 2012-11-27 at 17:32 +0000, Fujinaka, Todd wrote:
>>>>>> Forgive me if I'm being too repetitious as I think some of this has
>>>>>> been mentioned in the past.
>>>>>>
>>>>>> We (and by we I mean the Ethernet part and driver) can only change
>>>>>> the advertised availability of a larger MaxPayloadSize. The size is
>>>>>> negotiated by both sides of the link when the link is established.
>>>>>> The driver should not change the size of the link as it would be
>>>>>> poking at registers outside of its scope and is controlled by the
>>>>>> upstream bridge (not us).
>>>>> [...]
>>>>>
>>>>> MaxPayloadSize (MPS) is not negotiated between devices but is
>>>>> programmed by the system firmware (at least for devices present at
>>>>> boot - the kernel may be responsible in case of hotplug). You can
>>>>> use the kernel parameter 'pci=pcie_bus_perf' (or one of several
>>>>> others) to set a policy that overrides this, but no policy will allow
>>>>> setting MPS above the device's MaxPayloadSizeSupported (MPSS).
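
To illustrate the constraint described above: since no device may be given an MPS above its MPSS, a setting that is safe for every device on a path is bounded by the smallest MPSS among them. A rough sketch follows; common_mps is a hypothetical helper, the byte values are invented, and this is not the kernel's actual policy code.

```shell
# Print the largest MPS (in bytes) that every listed device can accept,
# i.e. the minimum of the MPSS values passed as arguments.
common_mps() {
    min=4096    # largest payload size the PCIe encoding allows
    for mpss in "$@"; do
        if [ "$mpss" -lt "$min" ]; then
            min=$mpss
        fi
    done
    echo "$min"
}

# Hypothetical path: root port 256, switch 512, NIC 128.
common_mps 256 512 128
```

With these made-up inputs the result is limited by the 128-byte device, which matches the advice elsewhere in the thread to keep the setting consistent across all associated links.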
>>>>>
>>>>
>>>> Ben,
>>>>
>>>> Unfortunately I'm using a 3.0.x kernel, and this option is not included in that kernel.
>>>> So I'm trying to use ethtool to modify it in the EEPROM to see whether that helps.
>>>>
>>>>
>>>> Todd, I'll review the MaxPayloadSize of all devices. But if there is a mismatch, the customer cannot change it from the BIOS because there is no entry for it there. To test this and verify whether it is the root cause, we still need to find the offset in the EEPROM.
>>>>
>>>> Thanks in advance,
>>>> Joe
>>>>
>>
>>
>
>
--
Oracle <http://www.oracle.com>
Joe Jin | Software Development Senior Manager | +8610.6106.5624
ORACLE | Linux and Virtualization
No. 24 Zhongguancun Software Park, Haidian District | 100193 Beijing