From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Tim Deegan <tim@xen.org>,
	xen-devel <xen-devel@lists.xenproject.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: [Draft A] Boot ABI for HVM guests without a device-model
Date: Wed, 10 Jun 2015 16:53:23 +0200
Message-ID: <55784F63.4080801@citrix.com>
In-Reply-To: <55785480020000780008312B@mail.emea.novell.com>

On 10/06/15 at 15:15, Jan Beulich wrote:
>>>> On 10.06.15 at 14:34, <roger.pau@citrix.com> wrote:
>>  * XEN_ELFNOTE_PADDR_OFFSET: the offset of the ELF paddr field from the
>>    actual required physical address.
> 
> Why would that be needed? I.e. why would there ever be an offset?

For example, a FreeBSD kernel has the following program headers:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0xffffffff80200040 0xffffffff80200040
                 0x0000000000000150 0x0000000000000150  R E    8
  INTERP         0x0000000000000190 0xffffffff80200190 0xffffffff80200190
                 0x000000000000000d 0x000000000000000d  R      1
      [Requesting program interpreter: /red/herring]
  LOAD           0x0000000000000000 0xffffffff80200000 0xffffffff80200000
                 0x0000000001055b30 0x0000000001055b30  R E    200000
  LOAD           0x0000000001055b30 0xffffffff81455b30 0xffffffff81455b30
                 0x0000000000135c88 0x0000000000532348  RW     200000
  DYNAMIC        0x0000000001055b30 0xffffffff81455b30 0xffffffff81455b30
                 0x00000000000000d0 0x00000000000000d0  RW     8
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RWE    8

Note that PhysAddr is identical to VirtAddr in every header above, so it
cannot be used directly as a load address. I thought the loader needs
XEN_ELFNOTE_PADDR_OFFSET in order to figure out the physical address
where it has to load the kernel, computed as
PhysAddr - XEN_ELFNOTE_PADDR_OFFSET, but maybe that's not the case.
Maybe I can also fix the FreeBSD kernel so that it has the right
PhysAddr, but I'm not sure whether that would break native loading.
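
As a minimal sketch of that arithmetic: the helper name load_paddr and
the offset value 0xffffffff80000000 used in the usage note are
illustrative assumptions, not something defined by the draft or taken
from the FreeBSD sources.

```c
#include <stdint.h>

/*
 * Hypothetical helper: derive the physical load address of a segment
 * from its ELF p_paddr field and the value carried by the proposed
 * XEN_ELFNOTE_PADDR_OFFSET note.
 */
static uint64_t load_paddr(uint64_t p_paddr, uint64_t paddr_offset)
{
    return p_paddr - paddr_offset;
}
```

With the first LOAD segment above (PhysAddr 0xffffffff80200000) and an
assumed offset of 0xffffffff80000000, load_paddr() would yield 0x200000.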

> 
>>  * XEN_ELFNOTE_PADDR_ENTRY: the 32bit entry point into the kernel.
>>  * XEN_ELFNOTE_FEATURES: features required by the guest kernel in order
>>    to run.
>>
>> The presence of the XEN_ELFNOTE_PADDR_ENTRY note indicates that the 
>> kernel supports the boot ABI described in this document.
>>
>> The domain builder will load the kernel into the guest memory space and 
>> jump into the entry point defined at XEN_ELFNOTE_PADDR_ENTRY with the 
>> following machine state:
>>
>>  * esi: contains the physical memory address where the loader has placed
>>    the start_info page.
>>
>>  * eax: contains the magic value 0xFF6BC1E2.
> 
> On what basis was this value chosen?

It's a completely random value.

> For my taste, it's getting too
> close to something that could be a legitimate 32-bit kernel pointer
> (agreed, all values could be valid pointers in 32-bit OSes, but with
> OSes tending to place themselves high in memory, a value numerically
> closer to what multiboot1 uses would seem more desirable).

I don't have a strong opinion here. Does the following seem more
suitable?

0x336ec578 ("xEn3" with the 0x80 bit of the "E" set)

(from xc_dom_binloader.c)

Or we can follow multiboot1 and use:

0x3BADB002

(note the 3 instead of the 2).
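
For reference, the "xEn3" value can be reproduced by packing the four
characters in little-endian order. This is only a sketch; pack_le32 and
xen3_magic are names made up for illustration and do not come from
xc_dom_binloader.c.

```c
#include <stdint.h>

/* Pack four bytes into a 32-bit value, first byte least significant. */
static uint32_t pack_le32(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0 | ((uint32_t)b1 << 8) |
           ((uint32_t)b2 << 16) | ((uint32_t)b3 << 24);
}

/* "xEn3" with the 0x80 bit of the 'E' set. */
static uint32_t xen3_magic(void)
{
    return pack_le32('x', 'E' | 0x80, 'n', '3');
}
```

Here xen3_magic() evaluates to 0x336ec578, matching the value above.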

> 
>>  * cr0: bit 31 (PG) must be cleared. Bit 0 (PE) must be set. Other bits
>>    are all undefined. 
> 
> I see that grub1 documentation says so, but I doubt this is realistic
> (even less so for cr4 bits): Some of the bits (including ones not
> currently defined) may have a meaning even in non-paged protected
> mode, and the environment should be as completely defined as possible.
> I.e. I think most other bits should be defined to be zero upon handoff.
> 
>>  * cs: must be a 32-bit read/execute code segment with an offset of ‘0’
>>    and a limit of ‘0xFFFFFFFF’. The exact value is undefined.
> 
> I guess "exact value" really means "selector value".

I think so, it's a literal copy from the multiboot1 spec.

> 
>>  * ds, es, fs, gs, ss: must be a 32-bit read/write data segment with an
>>    offset of ‘0’ and a limit of ‘0xFFFFFFFF’. The exact values are all
>>    undefined. 
> 
> Same here, plus I don't think fs and gs should be defined to have any
> particular value, base, limit, or attributes (such that handing off with
> them holding null selectors would become acceptable).

This is also copied from the multiboot1 spec. I don't have any issue 
with leaving fs and gs undefined.

> 
>>  * eflags: bit 17 (VM) must be cleared. Bit 9 (IF) must be cleared. 
>>    Other bits are all undefined.
>>
>>  * A20 gate: must be enabled.
> 
> This is irrelevant on other than physical machines.

I had my doubts about this one; glad to know it's not relevant.
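
The cr0 and eflags constraints from the draft can be expressed as a
simple predicate. This is a sketch only: entry_state_ok and the macro
names are assumptions for illustration, and it covers just the bits the
draft defines (segment state and the "other bits" question are left
out).

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_PE    (1u << 0)   /* protected mode enable */
#define X86_CR0_PG    (1u << 31)  /* paging */
#define X86_EFLAGS_IF (1u << 9)   /* interrupt enable */
#define X86_EFLAGS_VM (1u << 17)  /* virtual-8086 mode */

/* Hypothetical check of the handoff state described in the draft. */
static bool entry_state_ok(uint32_t cr0, uint32_t eflags)
{
    /* cr0: PE must be set, PG must be cleared. */
    if (!(cr0 & X86_CR0_PE) || (cr0 & X86_CR0_PG))
        return false;
    /* eflags: IF and VM must both be cleared. */
    if (eflags & (X86_EFLAGS_IF | X86_EFLAGS_VM))
        return false;
    return true;
}
```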

>> Comments for further discussion:
>>
>> Do we want to keep using the start_info page? Most of the fields there 
>> are not relevant for auto-translated guests, but without it we have to 
>> figure out how to pass the following information to the guest:
>>
>>  - Flags: SIF_xxx flags, this could probably be done with cpuid instead.
>>  - cmd_line: ?
>>  - console mfn: ?
>>  - console evtchn: ?
>>  - console_info address: ?
> 
> Yeah, settling on ideally a reasonably arch-independent mechanism
> that doesn't place undue constraints on future ports would be nice.
> And considering a hypothetical variant of x86 Xen not supporting PV
> guests anymore, this would no longer define XEN_HAVE_PV_GUEST_ENTRY
> and hence no longer have a struct start_info. So from a puristic pov
> the information should indeed be conveyed another way.

What about the following layout:

struct hvm_start_info {
    /* THE FOLLOWING ARE FILLED IN BOTH ON INITIAL BOOT AND ON RESUME.    */
    char magic[32];             /* "xen-<version>-<platform>".            */
    union {
        struct {
            xen_pfn_t console_paddr;    /* Physical address of console page.   */
            uint32_t  console_evtchn;   /* Event channel for console page.     */
        } domU;
        struct {
            uint32_t info_off;  /* Offset of console_info struct.         */
            uint32_t info_size; /* Size of console_info struct from start.*/
        } dom0;
    } console;
    unsigned long mod_start;    /* Physical address of pre-loaded module  */
    unsigned long mod_len;      /* Size (bytes) of pre-loaded module.     */
#define MAX_GUEST_CMDLINE 1024
    int8_t cmd_line[MAX_GUEST_CMDLINE];
};

We can even expand MAX_GUEST_CMDLINE if needed.
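
As a sketch of how a guest might consume the command line from such a
structure without trusting it to be NUL-terminated: the xen_pfn_t
typedef below is a stand-in so the example compiles on its own, and
cmdline_len is a hypothetical helper, not part of the proposal.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_GUEST_CMDLINE 1024

typedef uint64_t xen_pfn_t; /* assumption: stand-in for Xen's typedef */

struct hvm_start_info {
    char magic[32];
    union {
        struct {
            xen_pfn_t console_paddr;
            uint32_t  console_evtchn;
        } domU;
        struct {
            uint32_t info_off;
            uint32_t info_size;
        } dom0;
    } console;
    unsigned long mod_start;
    unsigned long mod_len;
    int8_t cmd_line[MAX_GUEST_CMDLINE];
};

/*
 * Return the length of cmd_line, or -1 if no NUL terminator is found
 * within the MAX_GUEST_CMDLINE bytes of the buffer.
 */
static long cmdline_len(const struct hvm_start_info *si)
{
    const void *end = memchr(si->cmd_line, '\0', MAX_GUEST_CMDLINE);

    return end ? (long)((const int8_t *)end - si->cmd_line) : -1;
}
```

Bounding the search by MAX_GUEST_CMDLINE keeps the guest from reading
past the structure if the builder ever fills the whole buffer.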

Roger.
