From: Mark Salter <msalter@redhat.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	Geoff Levand <geoff@infradead.org>,
	Riku Voipio <riku.voipio@linaro.org>,
	ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
	James Morse <james.morse@arm.com>,
	Hanjun Guo <hanjun.guo@linaro.org>,
	Sudeep Holla <sudeep.holla@arm.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] arm64/acpi: Add fixup for HPE m400 quirks
Date: Wed, 27 Jun 2018 08:25:31 -0400
Message-ID: <45b96b937687b199bdbd6966491ab23f50bb20e7.camel@redhat.com>
In-Reply-To: <CAKv+Gu9VfK8itWr5EoOGAdSA-OkrXystw-sbooOq2ZOpFwi9vw@mail.gmail.com>

On Wed, 2018-06-27 at 10:48 +0200, Ard Biesheuvel wrote:
> On 26 June 2018 at 22:20, Mark Salter <msalter@redhat.com> wrote:
> > On Tue, 2018-06-26 at 15:51 +0100, James Morse wrote:
> > > Hi Mark,
> > > 
> > > Thanks for shedding some light on what is going on here!
> > > 
> > > On 25/06/18 16:34, Mark Salter wrote:
> > > > On Fri, 2018-06-22 at 11:19 -0400, Mark Salter wrote:
> > > > > I'm going to hack something to get to the ghes info earlier in boot and
> > > > > check the things you mention above wrt Error Status Block and GHES.0.
> > > > 
> > > > So I ended up having to instrument the EFI stub to see where the error came
> > > > from. At the start of the stub, there is no GHES.2 error. The error first
> > > > shows up after the stub's call to ExitBootServices returns.
> > > 
> > > What's the notification type of GHES.2? I'm guessing POLLed or some kind of IRQ.
> > 
> > SCI
> > 
> > Here's the HEST entry:
> > 
> > [028h 0040   2]                Subtable Type : 0009 [Generic Hardware Error Source]
> > [02Ah 0042   2]                    Source Id : 0002
> > [02Ch 0044   2]            Related Source Id : FFFF
> > [02Eh 0046   1]                     Reserved : 00
> > [02Fh 0047   1]                      Enabled : 01
> > [030h 0048   4]       Records To Preallocate : 00000001
> > [034h 0052   4]      Max Sections Per Record : 00000001
> > [038h 0056   4]          Max Raw Data Length : 00000AEC
> > 
> > [03Ch 0060  12]         Error Status Address : [Generic Address Structure]
> > [03Ch 0060   1]                     Space ID : 00 [SystemMemory]
> > [03Dh 0061   1]                    Bit Width : 40
> > [03Eh 0062   1]                   Bit Offset : 00
> > [03Fh 0063   1]         Encoded Access Width : 04 [QWord Access:64]
> > [040h 0064   8]                      Address : 0000004FF7E9F0E0
> > 
> 
> This is a reserved region in the memory map. Does that apply to the
> other occurrences as well?

Yes, they are all in the same reserved region.
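
For anyone following along with the HEST dump, here is a minimal sketch of how the 12-byte Generic Address Structure at offset 0x3C decodes. The struct and function names (gas, gas_decode, gas_access_bits) are made up for illustration and are not the kernel's ACPICA types. Fed the bytes from the dump, it yields address 0x4FF7E9F0E0 in SystemMemory with 64-bit accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of a 12-byte ACPI Generic Address Structure (GAS).
 * Illustrative type only, not the kernel's own definition. */
struct gas {
	uint8_t  space_id;     /* 0x00 = SystemMemory */
	uint8_t  bit_width;    /* register width in bits */
	uint8_t  bit_offset;   /* bit offset within the register */
	uint8_t  access_size;  /* 1 = byte, 2 = word, 3 = dword, 4 = qword */
	uint64_t address;      /* address within the given address space */
};

/* Decode a GAS from raw table bytes; ACPI tables are little-endian. */
static struct gas gas_decode(const uint8_t raw[12])
{
	struct gas g;

	assert(raw != NULL);
	g.space_id    = raw[0];
	g.bit_width   = raw[1];
	g.bit_offset  = raw[2];
	g.access_size = raw[3];
	g.address     = 0;
	/* Assemble the 64-bit address, most significant byte first. */
	for (int i = 7; i >= 0; i--)
		g.address = (g.address << 8) | raw[4 + i];
	return g;
}

/* Access width in bits, or 0 when the encoding is undefined/legacy. */
static unsigned gas_access_bits(const struct gas *g)
{
	if (g->access_size < 1 || g->access_size > 4)
		return 0;
	return 8u << (g->access_size - 1);
}
```

The input for the GHES.2 entry is the byte sequence 00 40 00 04 E0 F0 E9 F7 4F 00 00 00, i.e. offsets 0x3C through 0x47 of the table.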

> 
> > There are 9 others all identical except for Source ID and address.
> > 
> > > These systems don't have EL3, so the CPU must continue running while something
> > > external generates the CPER records. The records being visible is the last point
> > > the faulty-access could have been made, with the window of time depending on how
> > > fast this external-thing receives and processes the error.
> > 
> > There's a System Control Processor (slimpro) on the SoC which can interact with
> > the CPU in various ways and which has access to memory and other hw.
> > 
> > > 
> > > 
> > > > So it looks
> > > > like the firmware itself is causing the error. There's still a chance that
> > > > the stub is doing something wrong with the memory map passed to the
> > > > firmware, so I'll try to eliminate that as well.
> > > 
> > > Adding delay loops will help prove the EFIStub is innocent.
> > 
> > Didn't change anything.
> > 
> > > 
> > > Are there any optional drivers being loaded by UEFI? (can you remove any USB
> > > mass storage drives for instance).
> > 
> > The only storage is PCI-based. There is a USB port, but it doesn't look like
> > anything is attached to it. I don't have physical access to the machine. It is
> > one of many Moonshot cartridges in a chassis several hundred miles away.
> > 
> > > 
> > > Is Red Hat able to rebuild UEFI on these systems? (Can it be fixed?)
> > 
> > No.
> > 
> > > 
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1285107 is about the m400
> > > description of the GIC, comments 15 and 16 show a UEFI patch to something other
> > > than the upstream platforms tree[0], and new firmware being tested.
> > > (although this may be wishful thinking)
> > 
> > HPE responded to bug reports until the m400 reached EOL. They have been pretty
> > clear that there will be no more firmware updates.
> > 
> > > 
> > > It looks like quirking this based on the DMI platform name and UEFI version will
> > > be what we need. We could discard anything in the error status block areas at
> > > ghes_probe() time based on this quirk, but we may have missed other problems
> > > during boot, giving a false sense of security.
> > > 
> > > 
> > > Thanks,
> > > 
> > > James
> > > 
> > > 
> > > [0] Might be wrong, but this is where I look:
> > > https://github.com/tianocore/edk2-platforms.git
> > 
> > 
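
To make the quirk idea concrete, here is a rough userspace sketch of matching on platform name plus firmware version. It is only an illustration under stated assumptions: the struct names, ghes_find_quirk(), and the "EXAMPLE-..." strings are all placeholders, since the real DMI strings for the m400 are not given in this thread. In the kernel this would normally be a struct dmi_system_id table checked with dmi_check_system() rather than hand-rolled string compares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Identity strings as reported through DMI/SMBIOS (hypothetical type). */
struct dmi_ident {
	const char *product_name;
	const char *bios_version;
};

/* One quirk entry; a NULL match field accepts any value. */
struct ghes_quirk {
	struct dmi_ident match;
	bool discard_boot_errors;  /* drop stale error status blocks at probe time */
};

/* Placeholder strings, NOT the real m400 DMI values. */
static const struct ghes_quirk quirks[] = {
	{ { "EXAMPLE-PRODUCT", "EXAMPLE-FW-1.0" }, true },
};

static bool field_matches(const char *want, const char *have)
{
	return want == NULL || (have != NULL && strcmp(want, have) == 0);
}

/* Return the first quirk whose fields all match, or NULL if none does. */
static const struct ghes_quirk *ghes_find_quirk(const struct dmi_ident *sys)
{
	assert(sys != NULL);
	for (size_t i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
		const struct ghes_quirk *q = &quirks[i];

		if (field_matches(q->match.product_name, sys->product_name) &&
		    field_matches(q->match.bios_version, sys->bios_version))
			return q;
	}
	return NULL;
}
```

Matching on the firmware version as well as the product name keeps the quirk from hiding errors on a machine running different firmware, which goes some way toward the false-sense-of-security concern raised above.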
