Boot time trap handling

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

* Boot time trap handling
@ 2014-05-12 10:40 Andrew Cooper
  2014-05-12 12:09 ` Jan Beulich
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Cooper @ 2014-05-12 10:40 UTC (permalink / raw)
  To: Xen-devel List, Jan Beulich, Keir Fraser, Tim Deegan

Hello,

When reviewing the SMAP series, Jan noticed an issue I had overlooked
while developing 7e510a7b874fa "x86/boot: move some __high_start code
and data into init sections".

All interrupts we expect to receive are indeed replaced with real
handlers, but the reserved exceptions are left as-are pointing at
ignore_int().  With ignore_int() in the init section, it will point
somewhere arbitrary after boot.  This is obviously a bad thing, but the
previous behaviour wasn't much more useful in this regard, by locking up
a pcpu without bringing the system down.

While attempting to fix this issue, I discovered some more concerning ones.

On the BSP, we:
* load the empty 'idt_table'
* patch ignore_int() into every entry (and indeed, bottom first which is
rather unsafe)
* enter __start_xen()
* patch in early_page_fault()
* perform large mounts of setup
* enter trap_init()
* patch the real trap handlers into 'idt_table'
* Set up ISTs for #DF, #NMI, #MCE in 'idt_table'
* Set up IST stacks in our local TSS
* Load our local TSS

On APs, we:
* Load mmu_cr4_features, including CR4.MCE
* Load the BSPs 'idt_table', complete with IST information
* Set up IST stacks in our local TSS
* Load our local TSS
* Switch onto our own local memcpy() of the BSPs idt.

In both cases, until we have loaded the TSS, we risk trying to take an
MCE or NMI without a TSS loaded.  I cant spot which contributory
exception would be generated, but I suspect #NP, or possibly #TS.  With
#DF set up in the same way, we will triple fault.

Looking at the real trap handlers, they appear to be safe from almost
the start of __start_xen().  Therefore, I propose:

* The boot critical region with an empty idtr gets extended slightly
into the top of __start_xen() and start_secondary()
* Inside this critical region, set set up and load the TSS.
* Load ourselves onto our local idt.
* Load cr4, after the MCE entry path is valid.

This has the added advantage that we gain full bugframe and extable
support for the earlier parts of setup.

Is there anything I have overlooked, or does this plan look plausible?

~Andrew

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Boot time trap handling
  2014-05-12 10:40 Boot time trap handling Andrew Cooper
@ 2014-05-12 12:09 ` Jan Beulich
  2014-05-12 12:53   ` Andrew Cooper
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Beulich @ 2014-05-12 12:09 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Tim Deegan, Keir Fraser, Xen-devel List

>>> On 12.05.14 at 12:40, <andrew.cooper3@citrix.com> wrote:
> On the BSP, we:
> * load the empty 'idt_table'
> * patch ignore_int() into every entry (and indeed, bottom first which is
> rather unsafe)

What is it that you consider "rather unsafe" here?

> * enter __start_xen()
> * patch in early_page_fault()
> * perform large mounts of setup
> * enter trap_init()
> * patch the real trap handlers into 'idt_table'
> * Set up ISTs for #DF, #NMI, #MCE in 'idt_table'
> * Set up IST stacks in our local TSS
> * Load our local TSS
> 
> On APs, we:
> * Load mmu_cr4_features, including CR4.MCE
> * Load the BSPs 'idt_table', complete with IST information
> * Set up IST stacks in our local TSS
> * Load our local TSS
> * Switch onto our own local memcpy() of the BSPs idt.
> 
> In both cases, until we have loaded the TSS, we risk trying to take an
> MCE or NMI without a TSS loaded.  I cant spot which contributory
> exception would be generated, but I suspect #NP, or possibly #TS.  With
> #DF set up in the same way, we will triple fault.
> 
> 
> Looking at the real trap handlers, they appear to be safe from almost
> the start of __start_xen().

That of course may need closer inspection, and documenting the
results in an eventually evolving patch description.

>  Therefore, I propose:
> 
> * The boot critical region with an empty idtr gets extended slightly
> into the top of __start_xen() and start_secondary()
> * Inside this critical region, set set up and load the TSS.
> * Load ourselves onto our local idt.
> * Load cr4, after the MCE entry path is valid.
> 
> This has the added advantage that we gain full bugframe and extable
> support for the earlier parts of setup.
> 
> Is there anything I have overlooked, or does this plan look plausible?

Looks all reasonable, as long as there's not going to be any window
without (or with a non-working, or with a simplistic) exception handler
that would become larger than what it is now.

Jan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Boot time trap handling
  2014-05-12 12:09 ` Jan Beulich
@ 2014-05-12 12:53   ` Andrew Cooper
  2014-05-12 12:59     ` Jan Beulich
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Cooper @ 2014-05-12 12:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Tim Deegan, Keir Fraser, Xen-devel List

On 12/05/14 13:09, Jan Beulich wrote:
>>>> On 12.05.14 at 12:40, <andrew.cooper3@citrix.com> wrote:
>> On the BSP, we:
>> * load the empty 'idt_table'
>> * patch ignore_int() into every entry (and indeed, bottom first which is
>> rather unsafe)
> What is it that you consider "rather unsafe" here?

We load the lower part first with the present bit set, before loading
the top 32 bits of the entry point, into an active IDT.

As we exceedingly unlikely to actually take an interrupt/exception while
doing this, we get away with it.

>
>> * enter __start_xen()
>> * patch in early_page_fault()
>> * perform large mounts of setup
>> * enter trap_init()
>> * patch the real trap handlers into 'idt_table'
>> * Set up ISTs for #DF, #NMI, #MCE in 'idt_table'
>> * Set up IST stacks in our local TSS
>> * Load our local TSS
>>
>> On APs, we:
>> * Load mmu_cr4_features, including CR4.MCE
>> * Load the BSPs 'idt_table', complete with IST information
>> * Set up IST stacks in our local TSS
>> * Load our local TSS
>> * Switch onto our own local memcpy() of the BSPs idt.
>>
>> In both cases, until we have loaded the TSS, we risk trying to take an
>> MCE or NMI without a TSS loaded.  I cant spot which contributory
>> exception would be generated, but I suspect #NP, or possibly #TS.  With
>> #DF set up in the same way, we will triple fault.
>>
>>
>> Looking at the real trap handlers, they appear to be safe from almost
>> the start of __start_xen().
> That of course may need closer inspection, and documenting the
> results in an eventually evolving patch description.

My meaning here was more "there is nothing which is an obvious problem
to making this work".

I will of course do an analysis of each trap and reason about its early
safety, possibly with patches to tweak the behaviour to ensure safety.

>
>>  Therefore, I propose:
>>
>> * The boot critical region with an empty idtr gets extended slightly
>> into the top of __start_xen() and start_secondary()
>> * Inside this critical region, set set up and load the TSS.
>> * Load ourselves onto our local idt.
>> * Load cr4, after the MCE entry path is valid.
>>
>> This has the added advantage that we gain full bugframe and extable
>> support for the earlier parts of setup.
>>
>> Is there anything I have overlooked, or does this plan look plausible?
> Looks all reasonable, as long as there's not going to be any window
> without (or with a non-working, or with a simplistic) exception handler
> that would become larger than what it is now.
>
> Jan
>

I think that the overall window without exception handling will reduce
substantially, mainly due to the BSP changes.

The AP window isn't changing length much, but is changing from the
current unsafe pseudo handling to no handling until it is all safe to.

One point I forgot to say was that I plan to leave the reserved
exception vectors with non-present descriptors.  This way we get a fatal
GPF indicating the exact vector, rather than a PCPU lockup optionally
followed by an NMI watchdog timeout when another pcpu times out waiting
for the locked up pcpu to rendezvous in the time calibration code.

~Andrew

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Boot time trap handling
  2014-05-12 12:53   ` Andrew Cooper
@ 2014-05-12 12:59     ` Jan Beulich
  2014-05-12 13:01       ` Andrew Cooper
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Beulich @ 2014-05-12 12:59 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: TimDeegan, Keir Fraser, Xen-devel List

>>> On 12.05.14 at 14:53, <andrew.cooper3@citrix.com> wrote:
> One point I forgot to say was that I plan to leave the reserved
> exception vectors with non-present descriptors.  This way we get a fatal
> GPF indicating the exact vector, rather than a PCPU lockup optionally
> followed by an NMI watchdog timeout when another pcpu times out waiting
> for the locked up pcpu to rendezvous in the time calibration code.

This I'm not sure about. Replacing the simplistic handler is a must of
course, but I would tend towards a generic exception handler wired
up to fatal_trap() rather than having to decode the #GP error code
to understand what's going on.

Jan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Boot time trap handling
  2014-05-12 12:59     ` Jan Beulich
@ 2014-05-12 13:01       ` Andrew Cooper
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Cooper @ 2014-05-12 13:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: TimDeegan, Keir Fraser, Xen-devel List

On 12/05/14 13:59, Jan Beulich wrote:
>>>> On 12.05.14 at 14:53, <andrew.cooper3@citrix.com> wrote:
>> One point I forgot to say was that I plan to leave the reserved
>> exception vectors with non-present descriptors.  This way we get a fatal
>> GPF indicating the exact vector, rather than a PCPU lockup optionally
>> followed by an NMI watchdog timeout when another pcpu times out waiting
>> for the locked up pcpu to rendezvous in the time calibration code.
> This I'm not sure about. Replacing the simplistic handler is a must of
> course, but I would tend towards a generic exception handler wired
> up to fatal_trap() rather than having to decode the #GP error code
> to understand what's going on.
>
> Jan
>

Ok - I will make a formal entry point for each of the reserved exceptions.

~Andrew

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-05-12 13:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-12 10:40 Boot time trap handling Andrew Cooper
2014-05-12 12:09 ` Jan Beulich
2014-05-12 12:53   ` Andrew Cooper
2014-05-12 12:59     ` Jan Beulich
2014-05-12 13:01       ` Andrew Cooper

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).