All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steve Wahl <steve.wahl@hpe.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Steve Wahl <steve.wahl@hpe.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, Juergen Gross <jgross@suse.com>,
	Brijesh Singh <brijesh.singh@amd.com>,
	Jordan Borgner <mail@jordan-borgner.de>,
	Feng Tang <feng.tang@intel.com>,
	linux-kernel@vger.kernel.org, Baoquan He <bhe@redhat.com>,
	russ.anderson@hpe.com, dimitri.sivanich@hpe.com,
	mike.travis@hpe.com
Subject: Re: [PATCH] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area.
Date: Mon, 16 Sep 2019 12:14:55 -0500	[thread overview]
Message-ID: <20190916171455.GJ7834@swahl-linux> (raw)
In-Reply-To: <4de3e0c2-7b59-c325-8d88-58220d721f71@intel.com>

On Mon, Sep 16, 2019 at 07:25:48AM -0700, Dave Hansen wrote:
> On 9/16/19 2:00 AM, Kirill A. Shutemov wrote:
> >>> I think we also need to make it clear that this is workaround for a broken
> >>> hardware: speculative execution must not trigger a halt.
> >> I think the word broken is a bit loaded here.  According to the UEFI
> >> spec (version 2.8, page 167), "Regions that are backed by the physical
> >> hardware, but are not supposed to be accessed by the OS, must be
> >> returned as EfiReservedMemoryType."  Our interpretation is that
> >> includes speculative accesses.
> > +Dave.
> > 
> > I don't think it is. Speculative access is done by hardware, not OS.
> > 
> > BTW, isn't it a BIOS issue?
> > 
> > I believe it should have a way to hide a range of physical address space
> > from OS or force a caching mode on to exclude it from speculative
> > execution. Like setup MTRRs or something.
> 
> Ugh.  I bet that was a fun one to chase down.  Have the hardware
> engineers learned a lesson or are they hiding behind the EFI spec in an
> act of pure cowardice? ;)

Yes, it was fun.  My main BIOS contact has explained to me how they are
stuck between a rock and a hard place on any other options for this.

> The patch is small and fixes a real problem.  The changelog is OK,
> although I'd prefer some differentiation between "occupied by the
> kernel" and the kernel *image*.

OK, is the phrase "kernel image" generally understood to cover
everything from _text to _end, including the bss?  As long as that's
true, I will adopt this phrase.

> The code is also gloriously free of any comments about what it's
> doing or why.

I'm intending to add something like this in the next version:

/*
 * Only the region occupied by the kernel image has so far been checked against
 * the table of usable memory regions provided by the firmware, so
 * invalidate pages outside that region.  A page table entry that maps to
 * a reserved area of memory would allow processor speculation into that
 * area, and on some hardware (particularly the UV platform) speculation
 * into reserved areas can cause a system halt.
 */


> But, I'm left with lots of questions:
> 
> Why do PMD-level changes fix this?  Is it because we 2MB pad the kernel
> image?  Why can't we still get within 2MB of the memory address in
> question?

This fix works for our hardware because the problematic reserved
regions are 64M aligned, and going up to the next 2MB boundary from
_end is not going to cross the next 64M boundary.

One could argue the next step would be going into
boot/compressed/{kaslr.c, misc.c} and rounding the size of the kernel
up to the next 2MB boundary to ensure the chosen random location is
covered by usable RAM up to the next PMD-level boundary.  I did not go
there because for us it is not necessary.

> Is it in the lower 1MB, by chance?

No, this is a reserved range at the top physical addresses for each
NUMA node in a collection of them.  

> If this is all about avoiding EFI reserved ranges, why doesn't the
> patch *LOOK* At EFI reserved ranges?

Because the range the kernel image is located in is already checked
against them in boot/compressed/kaslr.c.  This will now be explained
in the comment I mention above, which you had not yet seen.

--> Steve Wahl


-- 
Steve Wahl, Hewlett Packard Enterprise

  reply	other threads:[~2019-09-16 17:15 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-06 21:29 [PATCH] x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area Steve Wahl
2019-09-09  8:14 ` Kirill A. Shutemov
2019-09-10  6:18   ` Ingo Molnar
2019-09-10 14:38     ` Steve Wahl
2019-09-10 14:28   ` Steve Wahl
2019-09-11  0:28     ` Kirill A. Shutemov
2019-09-11 20:08       ` Steve Wahl
2019-09-12 10:19         ` Kirill A. Shutemov
2019-09-13 15:14           ` Steve Wahl
2019-09-16  9:00             ` Kirill A. Shutemov
2019-09-16 14:25               ` Dave Hansen
2019-09-16 17:14                 ` Steve Wahl [this message]
2019-09-16 22:19                   ` Dave Hansen
2019-09-16 16:17               ` Steve Wahl

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190916171455.GJ7834@swahl-linux \
    --to=steve.wahl@hpe.com \
    --cc=bhe@redhat.com \
    --cc=bp@alien8.de \
    --cc=brijesh.singh@amd.com \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dimitri.sivanich@hpe.com \
    --cc=feng.tang@intel.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mail@jordan-borgner.de \
    --cc=mike.travis@hpe.com \
    --cc=mingo@redhat.com \
    --cc=russ.anderson@hpe.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.