linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matt Fleming <matt@codeblueprint.co.uk>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>,
	"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
	Toshi Kani <toshi.kani@hp.com>, Brian Gerst <brgerst@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Luis Rodriguez <mcgrof@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Denys Vlasenko <dvlasenk@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Borislav Petkov <bp@alien8.de>,
	ricardo.neri@intel.com, Hugh Dickins <hughd@google.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: Re: [tip:efi/core] x86/mm/pat: Use _PAGE_GLOBAL bit for EFI page table mappings
Date: Wed, 24 Feb 2016 14:10:46 +0000	[thread overview]
Message-ID: <20160224141046.GA2603@codeblueprint.co.uk> (raw)
In-Reply-To: <CALCETrWUcCRdbY4teQPdv068XH9UDDRLVV3UKzpM=g5k=JGk_A@mail.gmail.com>

On Tue, 23 Feb, at 06:43:04PM, Andy Lutomirski wrote:
> On Tue, Feb 23, 2016 at 4:50 PM, Sai Praneeth Prakhya
> <sai.praneeth.prakhya@intel.com> wrote:
> >
> > As you rightly said the code is about making a page GLOBAL if PRESENT is
> > set and we do set PRESENT bit before mapping so that page is GLOBAL.
> > This code was taken from the other parts of pageattr.c. The point is
> > that we don't want differences between whether things were mapped in the
> > EFI page tables directly (i.e. using populate_pte()) or later split from
> > large pages via the split_large_page() code path. If this is still
> > confusing could you please elaborate on it further.
> 
> At least the comment should say "Set the GLOBAL flag if and only
> if...".  But why is this code here in the first place?  What is
> passing a pgprot with global unset into this code in the first place?
 
This comes from populate_pgd(),

  static int populate_pgd(struct cpa_data *cpa, unsigned long addr)
  {
	  pgprot_t pgprot = __pgprot(_KERNPG_TABLE);

> > I don't think previous implementation is broken and this is not a bug
> > fix as such. Before this patch some EFI region mappings had GLOBAL bit
> > set (which followed split large page path) and some aren't (which used
> > populate_pte). This patch just aligns the mappings done via two
> > different code paths as mentioned above. As a whole it also maintains
> > consistency with kernel mappings.
> 
> But your code also *clears* GLOBAL if PRESENT is clear, and your
> comment talks about that.  Does this ever actually happen in practice?
 
Not when mapping EFI regions, no (we explicitly set _PAGE_PRESENT in
kernel_map_pages_in_pgd(), but this logic is duplicated from
__split_large_page() which can be called from non-EFI code.

> I'd much rather see WARN_ON_ONCE((pgprot_val(pgprot) & (_PAGE_PRESENT
> | _PAGE_GLOBAL)) == _PAGE_GLOBAL) in here, if that makes sense in the
> context of the callers of the function.
 
I think that'd be a nice cleanup, along with pulling all the pgprot
twiddling out into a single function.

> > We did this for consistency among EFI mappings. This has some advantages
> > as mentioned in the commit message. It also makes it less confusing when
> > starting at the PGT_DUMP traces if _PAGE_GLOBAL is used consistently.
> >
> > We don't actually do anything special with _PAGE_GLOBAL in EFI.
> 
> You're making a choice of whether to set _PAGE_GLOBAL, and I think
> you've made the wrong choice.
> 
> Normally, the only pages with are _PAGE_GLOBAL are those that are in
> the normal kernel mappings (swapper_pg_dir and normal mm_struct pgds).
> By allowing _PAGE_GLOBAL to be set in EFI mappings, you're breaking
> that convention, which forces you to use extra-expensive
> __flush_tlb_all calls in efi_call_virt.
> 
> I think you should explicitly *clear* _PAGE_GLOBAL in the EFI mappings
> instead.  This would allow you to use write_cr3 by itself, which would
> make the code simpler and faster.
 
This is interesting.

When I suggested to Sai that he write this patch my main motivation
was consistency for all mappings. We've got that now, but perhaps it's
the wrong consistency ;)

If we go with the no-PAGE_GLOBAL approach, we need changes to ensure
we never set _PAGE_GLOBAL for the EFI mappings, because before this
patch was applied sometimes we did and sometimes we didn't, depending
on whether a page was split or not.

I'm racking my brain to think of how your suggestion might have
unintended consequences because diagnosing stale TLB entry bugs is
simply the worst job ever. I can't think of anything. The only
scenarios where we'd see problems is if a) we have new global mappings
in the EFI page tables or b) we have different global mappings.

Since we reference swapper_pg_dir from the PMD level downwards b)
shouldn't be a problem, and the only differences between
swapper_pg_dir and efi_pgd should be the EFI mappings, which saves us
from a).

> > This is a valid point. I know that EFI runtime regions persist during
> > and after boot if we have a UEFI firmware and other commits made EFI
> > regions have separate page table but I am not clear about the effect of
> > global flush. I think Matt/Boris could comment on it.
> 
> It's straightfoward on existing kernels.  If _PAGE_GLOBAL is set, TLB
> entries persist across cr3 writes.  If _PAGE_GLOBAL is clear, then TLB
> entries are flushed by cr3 writes.
 
This is safe for EFI right now because of the big __flush_tlb_all() in
efi_call_virt().

> With PCID enabled (which is only in a not-quite-ready patch set I
> have), the story is a bit more complicated, but it works essentially
> the same way unless you explicitly opt out.
 
Hmm... is series that posted somewhere?

> > We touch this code path only when mapping EFI runtime regions to VA
> > space, i.e. we added pgd field in cpa only as a support for mapping efi
> > runtime regions.
> 
> populate_pgd is called from non-EFI code as well though, isn't it?

Nope. The "if (cpa->pgd)" guard ensures that we only call that
function for the EFI mapping code - no one else sets ->pgd.

  reply	other threads:[~2016-02-24 14:10 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-17 12:35 [GIT PULL 00/13] EFI changes for v4.6 part 2 Matt Fleming
2016-02-17 12:35 ` [PATCH 01/13] efi: Reformat GUID tables to follow the format in UEFI spec Matt Fleming
2016-02-22 12:12   ` [tip:efi/core] " =?UTF-8?B?dGlwLWJvdCBmb3IgUGV0ZXIgSm9uZXMgPHRpcGJvdEB6eXRvci5jb20+?=
2016-02-17 12:35 ` [PATCH 02/13] efi/runtime-wrappers: Run UEFI Runtime Services with interrupts enabled Matt Fleming
2016-02-22 12:13   ` [tip:efi/core] " =?UTF-8?B?dGlwLWJvdCBmb3IgQXJkIEJpZXNoZXV2ZWwgPHRpcGJvdEB6eXRvci5jb20+?=
2016-02-17 12:35 ` [PATCH 03/13] x86/mm/pageattr: Use _PAGE_GLOBAL bit for EFI page table mappings Matt Fleming
2016-02-22 12:13   ` [tip:efi/core] x86/mm/pat: " =?UTF-8?B?dGlwLWJvdCBmb3IgU2FpIFByYW5lZXRoIDx0aXBib3RAenl0b3IuY29tPg==?=
2016-02-23 17:47     ` Andy Lutomirski
2016-02-23 18:08       ` Linus Torvalds
2016-02-23 18:12         ` Borislav Petkov
2016-02-24  2:09         ` H. Peter Anvin
2016-02-24  2:13           ` H. Peter Anvin
2016-02-25  8:59         ` Ingo Molnar
2016-02-24  0:50       ` Sai Praneeth Prakhya
2016-02-24  2:43         ` Andy Lutomirski
2016-02-24 14:10           ` Matt Fleming [this message]
2016-02-24 16:20             ` Borislav Petkov
2016-02-24 16:36               ` Andy Lutomirski
2016-02-24 18:56                 ` Linus Torvalds
2016-02-24 19:45                   ` Matt Fleming
2016-02-24 19:50                     ` Andy Lutomirski
2016-02-29 10:56                     ` Sylvain Chouleur
2016-03-02 11:20                       ` Matt Fleming
2016-02-24 19:33                 ` Matt Fleming
2016-02-24 19:49                   ` Andy Lutomirski
2016-02-25  9:06                     ` Ingo Molnar
2016-02-25 15:27                     ` Matt Fleming
2016-02-24 16:41               ` Borislav Petkov
2016-02-24 16:47                 ` Andy Lutomirski
2016-02-24 16:39             ` Andy Lutomirski
2016-02-25 16:00               ` Matt Fleming
2016-02-17 12:35 ` [PATCH 04/13] efi/arm64: Drop __init annotation from handle_kernel_image() Matt Fleming
2016-02-22 12:14   ` [tip:efi/core] " =?UTF-8?B?dGlwLWJvdCBmb3IgQXJkIEJpZXNoZXV2ZWwgPHRpcGJvdEB6eXRvci5jb20+?=
2016-02-17 12:35 ` [PATCH 05/13] arm64: vmlinux.lds.S: Handle .init.rodata.xxx and .init.bss sections Matt Fleming
2016-02-22 12:14   ` [tip:efi/core] arm64/vmlinux.lds.S: " =?UTF-8?B?dGlwLWJvdCBmb3IgQXJkIEJpZXNoZXV2ZWwgPHRpcGJvdEB6eXRvci5jb20+?=
2016-02-17 12:35 ` [PATCH 06/13] efi/efistub: Prevent __init annotations from being used Matt Fleming
2016-02-22 12:14   ` [tip:efi/core] " =?UTF-8?B?dGlwLWJvdCBmb3IgQXJkIEJpZXNoZXV2ZWwgPHRpcGJvdEB6eXRvci5jb20+?=
2016-02-17 12:36 ` [PATCH 07/13] efi/arm-init: Use read-only early mappings Matt Fleming
2016-02-22 12:15   ` [tip:efi/core] " =?UTF-8?B?dGlwLWJvdCBmb3IgQXJkIEJpZXNoZXV2ZWwgPHRpcGJvdEB6eXRvci5jb20+?=
2016-02-17 12:36 ` [PATCH 08/13] efi/arm: Check for LPAE support before booting a LPAE kernel Matt Fleming
2016-02-22 12:15   ` [tip:efi/core] " =?UTF-8?B?dGlwLWJvdCBmb3IgQXJkIEJpZXNoZXV2ZWwgPHRpcGJvdEB6eXRvci5jb20+?=
2016-02-17 12:36 ` [PATCH 09/13] efi/arm64: Check for h/w support before booting a >4 KB granule kernel Matt Fleming
2016-02-22 12:16   ` [tip:efi/core] efi/arm64: Check for h/w support before booting a >4 KB granular kernel =?UTF-8?B?dGlwLWJvdCBmb3IgQXJkIEJpZXNoZXV2ZWwgPHRpcGJvdEB6eXRvci5jb20+?=
2016-03-06  3:35   ` [PATCH 09/13] efi/arm64: Check for h/w support before booting a >4 KB granule kernel Ard Biesheuvel
2016-03-07 11:02     ` Matt Fleming
2016-03-07 11:05       ` Ard Biesheuvel
2016-02-17 12:36 ` [PATCH 10/13] efi/arm*: Perform hardware compatibility check Matt Fleming
2016-02-22 12:16   ` [tip:efi/core] " =?UTF-8?B?dGlwLWJvdCBmb3IgQXJkIEJpZXNoZXV2ZWwgPHRpcGJvdEB6eXRvci5jb20+?=
2016-02-17 12:36 ` [PATCH 11/13] x86/mm/pageattr: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd() Matt Fleming
2016-02-22 12:16   ` [tip:efi/core] x86/mm/pat: " =?UTF-8?B?dGlwLWJvdCBmb3IgU2FpIFByYW5lZXRoIDx0aXBib3RAenl0b3IuY29tPg==?=
2016-02-17 12:36 ` [PATCH 12/13] x86/efi: Map EFI_MEMORY_{XP,RO} memory region bits to EFI page tables Matt Fleming
2016-02-22 12:17   ` [tip:efi/core] " =?UTF-8?B?dGlwLWJvdCBmb3IgU2FpIFByYW5lZXRoIDx0aXBib3RAenl0b3IuY29tPg==?=
2016-02-17 12:36 ` [PATCH 13/13] x86/efi: Only map kernel text for EFI mixed mode Matt Fleming
2016-02-22 12:17   ` [tip:efi/core] " =?UTF-8?B?dGlwLWJvdCBmb3IgU2FpIFByYW5lZXRoIDx0aXBib3RAenl0b3IuY29tPg==?=

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160224141046.GA2603@codeblueprint.co.uk \
    --to=matt@codeblueprint.co.uk \
    --cc=akpm@linux-foundation.org \
    --cc=ard.biesheuvel@linaro.org \
    --cc=bp@alien8.de \
    --cc=brgerst@gmail.com \
    --cc=dvlasenk@redhat.com \
    --cc=hpa@zytor.com \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mcgrof@suse.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=ricardo.neri@intel.com \
    --cc=sai.praneeth.prakhya@intel.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=toshi.kani@hp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).