Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8


From: Tom Lendacky <thomas.lendacky@amd.com>
To: Laura Abbott <labbott@redhat.com>, Ingo Molnar <mingo@kernel.org>
Cc: Gabriel C <nix.or.die@gmail.com>, Borislav Petkov <bp@suse.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Brijesh Singh <brijesh.singh@amd.com>, X86 ML <x86@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8
Date: Sat, 20 Jan 2018 11:34:10 -0600
Message-ID: <b7a8e990-9d1d-39b2-671d-a44d5647dbec@amd.com>
In-Reply-To: <d2b236d9-ccfc-0cd0-f097-9daba70b86ff@redhat.com>

On 1/20/2018 10:52 AM, Laura Abbott wrote:
> On 01/20/2018 05:13 AM, Ingo Molnar wrote:
>>
>> * Ingo Molnar <mingo@kernel.org> wrote:
>>
>>> 2)
>>>
>>> using global variables, which is unsafe in early code if the kernel is
>>> relocatable.
>>>
>>> The bisected-to commit uses a new sme_populate_pgd_data to collect variables
>>> that were already on the stack, which should be position independent and safe.
>>>
>>> But the other commits use sme_active(), which does:
>>>
>>> bool sme_active(void)
>>> {
>>>          return sme_me_mask && !sev_enabled;
>>> }
>>> EXPORT_SYMBOL(sme_active);
>>>
>>> And that looks PIC-unsafe to me, as both are globals:
>>>
>>> u64 sme_me_mask __section(.data) = 0;
>>> EXPORT_SYMBOL(sme_me_mask);
>>>
>>> Does the code start working if you force sme_active() to 0 while keeping the
>>> function call, i.e. something like the hack below?
>>
>> BTW, this aspect of the boot code is really fragile, and depending on the
>> compiler there could be unsafe relocations generated without it being
>> 'obvious' from the patch itself. It's also pretty compiler and code layout
>> dependent ...
>>
>> A good way to check this I think would be to turn off CONFIG_RELOCATABLE=y in
>> the .config - does that make the kernel boot again?
>>
>> If that makes a difference then we need to take a look at the relocations in
>> the two key files, with CONFIG_RELOCATABLE=y turned back on:
>>
>>    objdump -r arch/x86/kernel/head64.o
>>    objdump -r arch/x86/mm/mem_encrypt.o
>>
>> There are three types of relocations that should be there normally:
>>
>> #define R_X86_64_64             1       /* Direct 64 bit  */
>> #define R_X86_64_PC32           2       /* PC relative 32 bit signed */
>> #define R_X86_64_32S            11      /* Direct 32 bit sign extended */
>>
>> Only R_X86_64_PC32 is safe as-is; R_X86_64_32S needs to be used via
>> fixup_pointer().
>>
>> What makes this difficult in the SME context is that the early boot portion of
>> arch/x86/mm/mem_encrypt.c is not separated out, but mixed in with later code.
>>
>> I missed this aspect when reviewing and merging this code :-(
>>
>> Maybe a diff of the list of relocations of the before/after commit points
>> would be nice.
>>
>> I.e. does something like:
>>
>>    git checkout <last_working_commit_sha1>
>>    objdump -r arch/x86/mm/mem_encrypt.o | grep R_X86 | cut -d' ' -f2- > working.relocs
>>
>>    git checkout <first_broken_commit_sha1>
>>    objdump -r arch/x86/mm/mem_encrypt.o | grep R_X86 | cut -d' ' -f2- > broken.relocs
>>
>>    diff -up working.relocs broken.relocs
>>
>> show any changes to the relocations?
>>
>> Side note:
>>
>> Regardless of whether it's the root cause for this regression we definitely
>> need to improve the relocation robustness of early boot code: at minimum we
>> should isolate all critical functionality into a separate section, and then
>> add tooling checks to make sure all relocations are safe.
>>
>> Thanks,
>>
>>     Ingo
>>
> 
> For the previous question, changing it to sme_active _does_ make the
> kernel work. Unfortunately, I can't test without relocations since
> I need to boot with CONFIG_EFI_STUB, but the relocations did show
> something interesting:
> 
> +R_X86_64_PC32     __stack_chk_fail-0x0000000000000004
> 
> There's a new call to __stack_chk_fail and if I dump the end of
> sme_encrypt_kernel I do see that stuck in there. I bet the size
> of struct sme_populate_pgd_data is now large enough to trigger
> a stack check. If I add __nostackprotector to sme_encrypt_kernel
> like sme_enable has, it boots fine. This would explain why that
> particular commit showed as the problem in bisection.

Great find, Laura. It must have something to do with compiler levels,
since my level didn't insert that check.

Thanks,
Tom

> 
> Thanks,
> Laura
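
Two checks come up in the message above: looking for relocation types
other than R_X86_64_PC32, and spotting the new __stack_chk_fail
reference. A minimal shell sketch of both might look like the following.
The function names non_pc32_relocs and refs_stack_chk_fail are
hypothetical helpers, not existing kernel tooling. The filter applies the
quoted rule of thumb literally, so it flags every non-PC32 record in the
object, including code that runs after early boot and other PC-relative
types such as R_X86_64_PLT32; a flagged record is only a candidate for
review, not a proven bug.

```shell
#!/bin/sh
# Sketch only: both helpers read `objdump -r` output on stdin.

# Print "TYPE VALUE" for every relocation record whose type is not
# R_X86_64_PC32 (hypothetical helper; flags candidates, not proven bugs).
non_pc32_relocs() {
    awk '$2 ~ /^R_X86_64_/ && $2 != "R_X86_64_PC32" { print $2, $3 }'
}

# Exit with status 0 if any line mentions __stack_chk_fail, i.e. the
# compiler emitted a stack-protector check somewhere in the object.
refs_stack_chk_fail() {
    grep -q '__stack_chk_fail'
}
```

Assuming a built tree, usage would be along these lines:

   objdump -r arch/x86/mm/mem_encrypt.o | non_pc32_relocs
   objdump -r arch/x86/mm/mem_encrypt.o | refs_stack_chk_fail && echo "stack check present"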

Thread overview: 24+ messages
2018-01-20  1:23 Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8 Laura Abbott
2018-01-20  2:23 ` Gabriel C
2018-01-20  4:02   ` Laura Abbott
2018-01-20  5:25     ` Gabriel C
2018-01-20  6:15       ` Tom Lendacky
2018-01-20  6:57         ` Laura Abbott
2018-01-20  7:03           ` Laura Abbott
2018-01-20 12:08             ` Gabriel C
2018-01-20 12:33           ` Ingo Molnar
2018-01-20 13:13             ` Ingo Molnar
     [not found]               ` <d2b236d9-ccfc-0cd0-f097-9daba70b86ff@redhat.com>
2018-01-20 17:34                 ` Tom Lendacky [this message]
2018-01-21  1:14                   ` [PATCH] x86: Use __nostackprotect for sme_encrypt_kernel Laura Abbott
2018-01-21  1:23                     ` Linus Torvalds
2018-01-21  1:49                       ` Gabriel C
2018-01-21  4:16                         ` Linus Torvalds
2018-01-21  9:37                           ` Greg Kroah-Hartman
2018-01-21  9:50                             ` Ingo Molnar
2018-01-21 10:36                               ` Greg Kroah-Hartman
2018-01-21  8:46                       ` Ingo Molnar
2018-01-20 12:01         ` Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8 Gabriel C
2018-01-20  2:38 ` Linus Torvalds
2018-01-20  4:13   ` Tom Lendacky
2018-01-20 12:12 ` Ingo Molnar
2018-01-20 15:35   ` Laura Abbott
