public inbox for kexec@lists.infradead.org
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Daniel Kiper <daniel.kiper@oracle.com>
Cc: "kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	David Vrabel <david.vrabel@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [Xen-devel] [PATCH 5/8] kexec: extend hypercall with improved load/unload ops
Date: Fri, 8 Mar 2013 23:38:03 +0000
Message-ID: <513A765B.8000709@citrix.com>
In-Reply-To: <20130308214547.GC11057@debian70-amd64.local.net-space.pl>

On 08/03/13 21:45, Daniel Kiper wrote:
> On Fri, Mar 08, 2013 at 05:29:05PM +0000, Andrew Cooper wrote:
>> <snip>
>>>> The tools know what mode the image must be called in and can tell the
>>>> hypervisor, and the hypervisor can trivially set up the correct mode.
>>>>
>>>> I propose:
>>>>
>>>> * Tools say: "here's an image, call it in mode X".
>>>>
>>>> You suggest:
>>>>
>>>> * Hypervisor implicitly says through some unspecified side channel: "I
>>>> only call images in mode Y".
>>> Purgatory is clearly defined. Please look into kexec-tools/purgatory.
>>> It is integral part of kexec infrastructure.
>> Purgatory might be well defined, but that is not relevant here.
>>
>> The kexec syscall and hypercall basically amount to "Here is a blob.
>> Its architecture is $X and its entry point is $Y"
> The kexec syscall uses architecture information to check that a given
> image could be executed on a given platform. That is all.

And how is 'could' distinguished?

A basic sanity check at load time of "is $X an operating mode I can get
to at some point in the future" is fine, and useful to eliminate the
case of trying to load something claiming to be an ARM blob on an x86
machine.

However, the entry point given can only possibly work in one operating
mode.  If $X is i386 and Xen jumps to it with long mode enabled, then it
will crash very quickly.  Conversely, if $X is x86_64 and Xen jumps to
it in protected mode, another crash will occur.
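To illustrate the distinction, here is a minimal sketch in C.  The EM_*
values are the standard ELF e_machine numbers; the enum and both function
names are hypothetical and are not taken from Xen or from the patch series.
The load-time check only asks whether the mode is reachable at all, while
the entry mode is derived from the declared architecture and nothing else.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard ELF e_machine values. */
#define EM_386     3
#define EM_ARM    40
#define EM_X86_64 62

/* Hypothetical names for the modes a 64bit-capable x86 host can enter. */
enum entry_mode {
    ENTRY_MODE_INVALID = 0,
    ENTRY_MODE_PROTECTED_32,   /* 32bit protected mode */
    ENTRY_MODE_LONG_64         /* 64bit long mode */
};

/* Load-time sanity check: is $X a mode this host could ever get to? */
static bool arch_is_loadable(uint16_t arch)
{
    return arch == EM_386 || arch == EM_X86_64;
}

/* The entry point only works in one mode, chosen by the declared arch. */
static enum entry_mode entry_mode_for_arch(uint16_t arch)
{
    switch (arch) {
    case EM_386:
        return ENTRY_MODE_PROTECTED_32;
    case EM_X86_64:
        return ENTRY_MODE_LONG_64;
    default:
        return ENTRY_MODE_INVALID;
    }
}
```

An ARM blob fails arch_is_loadable() at load time, whereas an i386 and an
x86_64 blob both pass it but map to different entry modes.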

>
>> (Give or take some reconstruction)
> What does this reconstruction? Hypervisor?

Under the current implementation, the dom0 kernel.  Under the new
planned implementation, Xen.

>
>> Xen should not be making any assumptions about these things.
>>
>> As it currently stands, Xen will assume that KEXEC_load from a pv_32on64
>> domain is an i386 image, while a KEXEC_load from a 64bit PV domain is an
>> x86_64 image.
> I do not understand. First you write that "Xen should not be making any
> assumptions about these things" and in the next sentence you state
> that "Xen will assume that...". What do you mean by that?

Sorry for the confusion.  That is what happens in the current
implementation.

>
> And why do you force users to use image for one architecture (in this case
> subarchitecture)? I (as a user) would like to have a choice.

The image can do whatever it wants once it is running.

>
>> The fact that this currently works in the common case of having the
>> crash kernel with the same architecture as the dom0 kernel is by luck
>> rather than good guidance.
> OK, I agree but in this case following part of patch 5/8:
>
> if ( image->arch == EM_386 )
>   reloc_flags |= KEXEC_RELOC_FLAG_COMPAT;
>
> should be changed to:
>
> if ( is_pv_32on64_domain(dom0) )
>   reloc_flags |= KEXEC_RELOC_FLAG_COMPAT;

No - specifically not.  This is the whole problem we are trying to avoid.

The current running architecture of dom0 has no place trying to
second-guess the intended architecture of the blob.

What happens if I as the user am currently running a 32bit dom0 on 64
bit Xen, and want to load a 64bit blob to jump to?

Under your suggestion, I as the user have to declare it to be a 32bit
blob and write a 32->64 shim at the beginning of it.  Under David's
suggestion, all I as the user have to do is to tell Xen that it is
indeed a 64bit image.
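A small C sketch of the two options, side by side.  KEXEC_RELOC_FLAG_COMPAT
is the flag name from the quoted patch, but the value 0x1 used here is an
assumption, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard ELF e_machine values. */
#define EM_386     3
#define EM_X86_64 62

/* Flag name from the quoted patch; the value is assumed for this sketch. */
#define KEXEC_RELOC_FLAG_COMPAT 0x1u

/* Flags derived from what the caller declared about the image. */
static unsigned int reloc_flags_from_image(uint16_t image_arch)
{
    return image_arch == EM_386 ? KEXEC_RELOC_FLAG_COMPAT : 0u;
}

/* Flags derived from the running dom0, ignoring the image entirely. */
static unsigned int reloc_flags_from_dom0(bool dom0_is_32on64)
{
    return dom0_is_32on64 ? KEXEC_RELOC_FLAG_COMPAT : 0u;
}
```

For a 32bit dom0 loading a 64bit blob, reloc_flags_from_dom0(true) sets the
compat flag even though the image needs long mode, while
reloc_flags_from_image(EM_X86_64) leaves it clear.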

>
>> Furthermore, the design of the interface should not be deliberately
>> crippled because the common user of it "can deal with it like this";
> If something is good and tested in many ways, on many architectures,
> for a very long time, why not use it? What is the difference between Xen
> and other architectures?

argumentum ad antiquitatem

Not that I wish to jibe at kexec-tools, but to point out the fallacy of
an argument on that basis.


About "good and tested", the current kexec handover mechanism is insane,
and it is frankly a miracle it ever worked in the first place.

Let's take the example of a 32bit dom0 on 64bit Xen and a 32bit crash kernel.

(The following is to the best of my understanding, so apologies if I
have misunderstood bits)

1) /sbin/kexec bundles a 32bit kernel and initrd, along with purgatory
etc and makes a kexec system call
2) dom0 copies the segments into regular kalloc()'d chunks
3) dom0 constructs a control page, bundles some control state together
and makes a kexec hypercall
4) Xen saves the control data and overwrites the dom0 provided virtual
addresses

In the case of a crash

1) Xen writes crash notes and shuts down as fast as possible
2) Because dom0 is 32bit, Xen sets up 32bit mode non-pae 1:1 mapped and
3a) might die there and then because the control page living in dom0
kalloc()'d space might now be above the 4GB boundary
3b) be lucky that the control page is below the 4GB boundary and
4) Execute the control page which sets up 32bit mode non-pae 1:1 mapped
(on a different set of pagetables/GDT etc)
5) Works to reconstruct the image in the crash region which
6a) might copy in the wrong block because of 32bit truncation issues
7) Jump to the beginning of purgatory which sets up 32bit mode

And amongst all of that, I am still unsure of whether there are other
issues because of an "unsigned long page_list[]" in the 64bit hypervisor
being different from the "unsigned long page_list[]" used by the 32bit
control page.  In machine_kexec_load() in the hypervisor, we make no
sanity checks against the assertions of the comments.
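To show why the width mismatch worries me, here is a minimal C sketch.  It
does not reproduce the Xen code: the function names are hypothetical, and
fixed-width types stand in for the 64bit and 32bit "unsigned long".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* What a 32bit consumer sees of a machine address: the low 32 bits only. */
static uint32_t truncate_to_32(uint64_t maddr)
{
    return (uint32_t)maddr;
}

/* True if the address is below 4GB and therefore survives truncation. */
static int survives_32bit_truncation(uint64_t maddr)
{
    return (uint64_t)truncate_to_32(maddr) == maddr;
}

/*
 * Read slot idx the way a 32bit consumer would (4-byte stride) from a
 * table that a 64bit producer filled in with an 8-byte stride.
 */
static uint32_t read_as_32bit(const uint64_t *page_list, size_t idx)
{
    uint32_t value;

    memcpy(&value, (const unsigned char *)page_list + idx * sizeof(uint32_t),
           sizeof(value));
    return value;
}
```

With a table holding { 0x1000, 0x2000 }, the 32bit view of slot 1 lands
inside the first 64bit entry rather than on 0x2000, and an address such as
0x100001000 silently becomes 0x1000 once truncated.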


In the proposed new interface, we do not need to set up the correct
state for purgatory, jump into the dom0 control page which re-sets up
different equivalent state, just to reconstruct the image and jump to it.

As for the different architecture of Xen, I hope the above shows exactly
why it is different, and why it is dangerous to use assumptions based on
is_pv_32on64_domain(dom0).

>
>> kexec-tools is not the only potential consumer of this interface.
> Potentially yes, but as far as I know (correct me if I am wrong) kexec-tools
> is the only tool, until now, which uses the kexec syscall/hypercall.
> If we use this tool we should align to widely accepted rules.
> If we do not like them then we should convince maintainers that
> our approach is better or write our own tool with our own rules.
> But then we should not call it kexec.
>
> Daniel

I see no reason why David's proposed interface is incompatible with
kexec-tools.  Do you?

~Andrew

