From: Paul Johnson <pjay@nwtrail.com>
To: Bjorn Helgaas <bhelgaas@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>, linux-pci <linux-pci@vger.kernel.org>
Subject: Re: [problem] mpt2sas load fails with LSISAS2008
Date: Thu, 19 Feb 2015 15:40:41 -0800
Message-ID: <54E67479.3020906@nwtrail.com>
In-Reply-To: <CAErSpo6z1GmTkLQJDhBZxRU2wjCZiYsFBETi18vsYVNzOBi6VA@mail.gmail.com>
This is a resend of the mail sent 2/11, except that the dmesg attachment is
now on the bug report.
On 02/11/2015 08:57 AM, Bjorn Helgaas wrote:
> On Wed, Feb 11, 2015 at 10:11 AM, Paul Johnson <pjay@nwtrail.com> wrote:
>> On 02/10/2015 08:49 AM, Bjorn Helgaas wrote:
>>>
>>> We need to work out what's going wrong here before we rush into a
>>> band-aid.
>>>
>>> What changed between v3.4 and v3.4.1 that exposed this problem? "git
>>> log --oneline v3.4..v3.4.1" doesn't show any likely culprits. Paul,
>>> are those the versions you tested? Your dmesg logs at
>>> https://bugzilla.kernel.org/show_bug.cgi?id=92351 show
>>> "3.4.0-030400-generic" and "3.4.1-030401-generic" but I don't know
>>> whether those are precisely v3.4 and v3.4.1.
>>>
>>> I assume this system works fine with Windows, and I doubt Windows has
>>> a hack like "never move LSI devices." So it would be useful to know
>>> if we're doing something stupid in Linux that makes us trip over this.
>>> Paul, if you happen to have Windows on this machine as well, a
>>> complete AIDA64 report (free trial version at http://www.aida64.com)
>>> would show what Windows did.
>>>
>>> The resource allocation we're doing is related to SR-IOV, and
>>> unfortunately we don't print enough information in dmesg to figure
>>> everything out. Paul, can you attach the complete "lspci -vv" output
>>> to the bugzilla?
>>>
>>> Bjorn
>>>
>> The system I have had this problem on is in production, though it should be
>> replaced by a real server. Because it is in use, I have used a separate boot
>> disk to test kernels. I also have limited access to take the machine down.
>> The system runs Ubuntu Server, though I have used an Ubuntu desktop to test
>> kernels. There is no Windows system on the machine. Just guessing, though:
>> LSI likely provides the Windows driver, and that driver may well have dealt
>> with a problem that looks to be specific to a firmware/BIOS version on this
>> card.
>
> That might be possible. The issue seems to be related to changing BAR
> addresses, and I expect that would be outside the scope of what the
> driver can influence. So I don't know whether Windows has a mechanism
> for that or not.
>
>> Someone found another of these cards here, so I tried it last night in an
>> unused machine. It worked on the ubuntu 3.13 kernel without realloc. The
>> card that has been the problem has these versions of firmware:
>> [ 9.004647] mpt2sas0: LSISAS2008: FWVersion(17.00.01.00),
>> ChipRevision(0x03), BiosVersion(07.33.00.00)
>>
>> and the card that works has a newer version:
>> [ 15.725011] mpt2sas0: LSISAS2008: FWVersion(18.00.00.00),
>> ChipRevision(0x03), BiosVersion(07.35.00.00)
>
> Without seeing the dmesg log, I can't tell whether this card works
> because (1) the LSI firmware is fixed or (2) the kernel didn't try to
> change the BARs.
>
> And I still don't have any clue about what changed between v3.4 and
> v3.4.1 and triggered the problem.
>
> Applying a fix without figuring out the real root cause of the problem
> is voodoo programming, and I don't like to do that.
>
>> Now, the cards are in very different machines so the difference could be due
>> to the machines and not the firmware, but I would tend to go with the
>> firmware difference. LSI firmware is now beyond both these firmware
>> versions, but if I can find a copy of the older firmware, I'll try it on the
>> card with the newer firmware.
>
> We could tell from the dmesg log whether Linux changed the BARs. I
> wouldn't bother trying different LSI firmware versions until you
> confirm that we changed the BARs.
>
> Bjorn
>
The 3.4.0 and 3.4.1 kernels I used came from here:
http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D
A dmesg from the card with the newer firmware, running a 3.19 kernel from the
same URL, is attached to the bug report
https://bugzilla.kernel.org/show_bug.cgi?id=92351 as
"dmesg with 3.19 and LSI FW 18".
Paul
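
For anyone comparing the two cards from saved dmesg files, the firmware
banner quoted earlier in this mail can be pulled out with a short script.
This is only a minimal sketch: it assumes the mpt2sas banner looks exactly
like the two lines quoted in this thread, and the names
parse_mpt2sas_versions and version_tuple are made up for illustration. It
says nothing about whether the kernel moved the BARs; it only reads the
version strings.

```python
import re

# Banner format as quoted in this thread, e.g.
#   mpt2sas0: LSISAS2008: FWVersion(17.00.01.00),
#   ChipRevision(0x03), BiosVersion(07.33.00.00)
# The \s* after each comma tolerates the line wrap introduced by mail clients.
BANNER = re.compile(
    r"(?P<host>mpt2sas\d+):\s+(?P<chip>\w+):\s+"
    r"FWVersion\((?P<fw>[\d.]+)\),\s*"
    r"ChipRevision\((?P<rev>0x[0-9a-fA-F]+)\),\s*"
    r"BiosVersion\((?P<bios>[\d.]+)\)"
)


def parse_mpt2sas_versions(dmesg_text):
    """Return one dict (host, chip, fw, rev, bios) per banner found."""
    return [m.groupdict() for m in BANNER.finditer(dmesg_text)]


def version_tuple(version):
    """Turn '17.00.01.00' into (17, 0, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    # The two banners reported in this thread.
    failing = ("[ 9.004647] mpt2sas0: LSISAS2008: FWVersion(17.00.01.00),\n"
               "ChipRevision(0x03), BiosVersion(07.33.00.00)")
    working = ("[ 15.725011] mpt2sas0: LSISAS2008: FWVersion(18.00.00.00),\n"
               "ChipRevision(0x03), BiosVersion(07.35.00.00)")
    old = parse_mpt2sas_versions(failing)[0]
    new = parse_mpt2sas_versions(working)[0]
    print("failing card FW:", old["fw"], "BIOS:", old["bios"])
    print("working card FW:", new["fw"], "BIOS:", new["bios"])
    print("working card has newer FW:",
          version_tuple(new["fw"]) > version_tuple(old["fw"]))
```

Comparing the versions as integer tuples rather than as strings avoids
surprises if a field ever gains or loses a leading zero.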