* [Draft B] Boot ABI for HVM guests without a device-model
@ 2015-08-26 11:48 Roger Pau Monné
2015-08-26 12:00 ` Jan Beulich
2015-08-26 12:18 ` Andrew Cooper
0 siblings, 2 replies; 10+ messages in thread
From: Roger Pau Monné @ 2015-08-26 11:48 UTC (permalink / raw)
To: xen-devel
Cc: Elena Ufimtseva, Andrew Cooper, Tim Deegan, Jan Beulich,
Boris Ostrovsky
Hello,
The discussion in [1] led to an agreement on the pieces missing from PVH
(or HVM without a device-model) that are needed in order to progress with
its implementation.
One of the missing pieces is a new boot ABI that replaces the PV boot
ABI. The aim of this new boot ABI is to remove the limitations of the
PV boot ABI that are no longer present when using auto-translated
guests. The new boot protocol should allow using the same entry point
for both 32bit and 64bit guests, and let the guest choose its bitness
at run time without the domain builder knowing in advance.
Roger.
[1] http://lists.xen.org/archives/html/xen-devel/2015-06/msg00258.html
---
HVM direct boot ABI
===================
Since the Xen entry point into the kernel can be different from the
native entry point, ELFNOTES are used in order to tell the domain
builder how to load and jump into the kernel entry point. At least the
following ELFNOTES are required in order to use this boot ABI:
ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS, .asciz, "FreeBSD")
ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION, .asciz, __XSTRING(__FreeBSD_version))
ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION, .asciz, "xen-3.0")
ELFNOTE(Xen, XEN_ELFNOTE_PHYS32_ENTRY, .quad, xen_start32)
ELFNOTE(Xen, XEN_ELFNOTE_FEATURES, .asciz, "writable_descriptor_tables|auto_translated_physmap|supervisor_mode_kernel")
ELFNOTE(Xen, XEN_ELFNOTE_SUPPORTED_FEATURES, .long, ((1 << XENFEAT_writable_page_tables) | \
                                                     (1 << XENFEAT_auto_translated_physmap) | \
                                                     (1 << XENFEAT_supervisor_mode_kernel) | \
                                                     (1 << XENFEAT_hvm_callback_vector)))
ELFNOTE(Xen, XEN_ELFNOTE_LOADER, .asciz, "generic")
The first three notes contain information about the guest kernel and
the Xen hypercall ABI version. The following notes are of special
interest:
* XEN_ELFNOTE_PHYS32_ENTRY: the 32bit physical entry point into the kernel.
* XEN_ELFNOTE_FEATURES: features required by the guest kernel in order
to run.
The presence of the XEN_ELFNOTE_PHYS32_ENTRY note indicates that the
kernel supports the boot ABI described in this document.
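As an illustration of how a loader could detect that note, here is a minimal sketch of a note walker. It assumes the standard ELF note layout (three 32-bit words followed by the name and the descriptor, each padded to 4 bytes). The names note_hdr, note_align and find_xen_note are made up for this example, and the numeric note type is left as a parameter because this draft does not assign a value to XEN_ELFNOTE_PHYS32_ENTRY.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic ELF note header; the name and descriptor follow it in memory. */
struct note_hdr {
    uint32_t namesz;  /* length of the name, including the trailing NUL */
    uint32_t descsz;  /* length of the descriptor payload in bytes */
    uint32_t type;    /* note type, interpreted per owner name */
};

/* Name and descriptor are each padded to a 4-byte boundary. */
static size_t note_align(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/*
 * Hypothetical helper: search a note segment for a note owned by "Xen"
 * with the given type. Returns a pointer to the descriptor and stores
 * its size in *descsz, or returns NULL if no such note is found or the
 * buffer is truncated.
 */
static const void *find_xen_note(const uint8_t *buf, size_t len,
                                 uint32_t type, uint32_t *descsz)
{
    size_t off = 0;

    while (len - off >= sizeof(struct note_hdr)) {
        struct note_hdr h;
        size_t rem, name_len, desc_len;
        const uint8_t *name, *desc;

        memcpy(&h, buf + off, sizeof(h));
        rem = len - off - sizeof(h);

        if (h.namesz > rem)
            return NULL;
        name_len = note_align(h.namesz);
        if (name_len > rem)
            return NULL;
        rem -= name_len;

        if (h.descsz > rem)
            return NULL;
        desc_len = note_align(h.descsz);
        if (desc_len > rem)
            desc_len = rem;  /* last note may lack trailing padding */

        name = buf + off + sizeof(h);
        desc = name + name_len;

        if (h.namesz == 4 && memcmp(name, "Xen", 4) == 0 && h.type == type) {
            *descsz = h.descsz;
            return desc;
        }
        off += sizeof(h) + name_len + desc_len;
    }
    return NULL;
}
```

A builder following this sketch would treat a non-NULL result for the PHYS32_ENTRY type as the signal to use the boot ABI described here, and read the entry point from the returned descriptor.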
The domain builder will load the kernel into the guest memory space and
jump into the entry point defined at XEN_ELFNOTE_PHYS32_ENTRY with the
following machine state:
* ebx: contains the physical memory address where the loader has placed
the boot start info structure.
* cr0: bit 0 (PE) will be set. All the other writeable bits are cleared.
* cr4: all bits are cleared.
* cs: must be a 32-bit read/execute code segment with a base of '0'
and a limit of '0xFFFFFFFF'. The selector value is unspecified.
* ds, es: must be 32-bit read/write data segments with a base of
'0' and a limit of '0xFFFFFFFF'. The selector values are all unspecified.
* tr: must be a 32-bit TSS (active) with a base of '0' and a limit of '0xFF'.
* eflags: bit 17 (VM) must be cleared. Bit 9 (IF) must be cleared.
Other bits are all unspecified.
All other processor registers and flag bits are unspecified. The OS is in
charge of setting up its own stack, GDT and IDT.
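The control register and flag requirements above can be expressed as a small predicate, which may be handy in a domain builder self-test or an early guest sanity check. This is only a sketch: struct entry_state and entry_state_ok are hypothetical names, and only the bits explicitly named in the list (plus CR0.PG as one example of a writeable CR0 bit) are checked.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE     (UINT32_C(1) << 0)   /* protected mode enable */
#define CR0_PG     (UINT32_C(1) << 31)  /* paging enable */
#define EFLAGS_IF  (UINT32_C(1) << 9)   /* interrupt enable flag */
#define EFLAGS_VM  (UINT32_C(1) << 17)  /* virtual-8086 mode */

/* Hypothetical snapshot of the registers covered by the list above. */
struct entry_state {
    uint32_t cr0;
    uint32_t cr4;
    uint32_t eflags;
};

/* Returns true if the snapshot matches the documented entry state. */
static bool entry_state_ok(const struct entry_state *s)
{
    if (!(s->cr0 & CR0_PE))
        return false;               /* must be in protected mode */
    if (s->cr0 & CR0_PG)
        return false;               /* paging must be disabled */
    if (s->cr4 != 0)
        return false;               /* all cr4 bits cleared */
    if (s->eflags & (EFLAGS_IF | EFLAGS_VM))
        return false;               /* interrupts off, not in VM86 mode */
    return true;
}
```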
The format of the structure passed in the %ebx register is the following:
struct hvm_start_info {
#define HVM_START_MAGIC_VALUE 0x336ec578
uint32_t magic; /* Contains the magic value 0x336ec578 */
/* ("xEn3" with the 0x80 bit of the "E" set).*/
uint32_t flags; /* SIF_xxx flags. */
uint32_t cmdline_paddr; /* Physical address of the command line. */
uint32_t nr_modules; /* Number of modules passed to the kernel. */
uint32_t modlist_paddr; /* Physical address of an array of */
/* hvm_modlist_entry. */
};
struct hvm_modlist_entry {
uint64_t paddr; /* Physical address of the module. */
uint64_t size; /* Size of the module in bytes. */
};
This structure is guaranteed to always be placed in memory after the
loaded kernel and modules. There's no upper bound on the size of the
structure, users should be aware that it might cross a page boundary.
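A guest might consume these structures roughly as follows. The sketch repeats the layout from this draft, shows how the magic value is derived from "xEn3", and computes the highest address covered by any module without assuming anything about ordering. The helpers hvm_magic_from_chars and hvm_modules_end are hypothetical, and the caller is assumed to have already turned modlist_paddr into a usable pointer.

```c
#include <stdint.h>

#define HVM_START_MAGIC_VALUE 0x336ec578

struct hvm_start_info {
    uint32_t magic;          /* HVM_START_MAGIC_VALUE */
    uint32_t flags;          /* SIF_xxx flags */
    uint32_t cmdline_paddr;  /* physical address of the command line */
    uint32_t nr_modules;     /* number of modules passed to the kernel */
    uint32_t modlist_paddr;  /* physical address of the module array */
};

struct hvm_modlist_entry {
    uint64_t paddr;          /* physical address of the module */
    uint64_t size;           /* size of the module in bytes */
};

/* "xEn3" read as a little-endian word, with bit 0x80 of the 'E' set. */
static uint32_t hvm_magic_from_chars(void)
{
    return (uint32_t)'x' |
           ((uint32_t)('E' | 0x80) << 8) |
           ((uint32_t)'n' << 16) |
           ((uint32_t)'3' << 24);
}

/*
 * Hypothetical helper: return the first address past the end of the
 * highest module, or 0 if the magic does not match or no module is
 * present. 'mods' is the array found at si->modlist_paddr.
 */
static uint64_t hvm_modules_end(const struct hvm_start_info *si,
                                const struct hvm_modlist_entry *mods)
{
    uint64_t end = 0;
    uint32_t i;

    if (si->magic != HVM_START_MAGIC_VALUE)
        return 0;

    for (i = 0; i < si->nr_modules; i++) {
        uint64_t mod_end = mods[i].paddr + mods[i].size;

        if (mod_end < mods[i].paddr)
            continue;           /* skip entries whose end wraps around */
        if (mod_end > end)
            end = mod_end;
    }
    return end;
}
```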
Note that the boot protocol resembles the multiboot1 specification,
this is done so OSes with multiboot1 entry points can reuse those if
desired.
Other relevant information needed in order to boot a guest kernel
(console page address, xenstore event channel...) can be obtained
using HVMPARAMS, just like it's done on HVM guests.
The setup of the hypercall page is also performed in the same way
as HVM guests, using a wrmsr.
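For readers unfamiliar with the HVM procedure, the MSR to write is discovered through the hypervisor CPUID leaves. The sketch below only covers the first step, recognising the Xen signature in the registers returned by CPUID; is_xen_signature is a made-up name, and the follow-up steps in the comment describe my understanding of how existing HVM guests behave, not something this draft defines.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* First hypervisor CPUID leaf; guests usually probe upwards from here. */
#define XEN_CPUID_FIRST_LEAF 0x40000000u

/*
 * Hypothetical helper: given ebx, ecx and edx as returned by CPUID on a
 * candidate hypervisor base leaf, report whether they spell the Xen
 * signature "XenVMMXenVMM".
 *
 * Assumed follow-up, as done by HVM guests today: leaf (base + 2)
 * reports the number of hypercall pages in eax and the MSR index in
 * ebx, and the guest then writes the physical address of a page it
 * owns to that MSR with wrmsr.
 */
static bool is_xen_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    const uint32_t regs[3] = { ebx, ecx, edx };
    uint8_t sig[12];
    unsigned int i;

    /* Unpack each register least significant byte first. */
    for (i = 0; i < 12; i++)
        sig[i] = (uint8_t)(regs[i / 4] >> (8 * (i % 4)));

    return memcmp(sig, "XenVMMXenVMM", 12) == 0;
}
```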
AP startup
==========
AP startup is performed using hypercalls. The following VCPU operations
are used in order to bring up secondary vCPUs:
* VCPUOP_initialise is used to set the initial state of the vCPU. The
argument passed to the hypercall must be of the type vcpu_hvm_context
(see public/hvm/hvm_vcpu.h for the layout of the structure). Note that
this hypercall allows starting the vCPU in several modes (16/32/64bits),
regardless of the mode the BSP is currently running on.
* VCPUOP_up is used to launch the vCPU once the initial state has been
set using VCPUOP_initialise.
* VCPUOP_down is used to bring down a vCPU.
* VCPUOP_is_up is used to scan the number of available vCPUs.
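The sequence above could look like this from the BSP. This is a sketch under several assumptions: the hypercall is reached through a function pointer with a hypothetical signature (vcpu_op_fn), the context argument is passed as an opaque pointer because the layout of vcpu_hvm_context is not reproduced here, and mock_vcpu_op is only a stand-in so the flow can be exercised outside Xen. The command numbers are the ones from Xen's public vcpu.h.

```c
#include <stddef.h>

/* Command numbers from Xen's public/vcpu.h. */
#define VCPUOP_initialise 0
#define VCPUOP_up         1
#define VCPUOP_down       2
#define VCPUOP_is_up      3

/* Hypothetical wrapper type around the vcpu_op hypercall. */
typedef long (*vcpu_op_fn)(int cmd, unsigned int vcpuid, void *arg);

/*
 * Bring up one secondary vCPU: set its initial state, launch it, then
 * confirm it is running. Returns 0 on success, non-zero on failure.
 */
static long bring_up_ap(vcpu_op_fn vcpu_op, unsigned int vcpuid, void *ctx)
{
    long rc;

    rc = vcpu_op(VCPUOP_initialise, vcpuid, ctx);  /* ctx: vcpu_hvm_context */
    if (rc)
        return rc;

    rc = vcpu_op(VCPUOP_up, vcpuid, NULL);
    if (rc)
        return rc;

    /* Assumed convention: is_up returns > 0 when the vCPU is running. */
    rc = vcpu_op(VCPUOP_is_up, vcpuid, NULL);
    if (rc < 0)
        return rc;
    return rc > 0 ? 0 : -1;
}

/* Stand-in for the real hypercall: records the commands it receives. */
static int mock_log[8];
static int mock_ncalls;

static long mock_vcpu_op(int cmd, unsigned int vcpuid, void *arg)
{
    (void)vcpuid;
    (void)arg;
    if (mock_ncalls < 8)
        mock_log[mock_ncalls++] = cmd;
    return cmd == VCPUOP_is_up ? 1 : 0;
}
```

In a real guest the mock would be replaced by the kernel's own hypercall wrapper, and ctx would point at a populated vcpu_hvm_context.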
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
* Re: [Draft B] Boot ABI for HVM guests without a device-model
2015-08-26 11:48 [Draft B] Boot ABI for HVM guests without a device-model Roger Pau Monné
@ 2015-08-26 12:00 ` Jan Beulich
2015-08-26 12:12 ` Andrew Cooper
2015-08-26 12:18 ` Andrew Cooper
1 sibling, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2015-08-26 12:00 UTC (permalink / raw)
To: Roger Pau Monné
Cc: Elena Ufimtseva, Andrew Cooper, Tim Deegan, xen-devel,
Boris Ostrovsky
>>> On 26.08.15 at 13:48, <roger.pau@citrix.com> wrote:
> * tr: must be a 32-bit TSS (active) with a base of '0' and a limit of '0xFF'.
Why 0xFF instead of 0x67?
> struct hvm_start_info {
> #define HVM_START_MAGIC_VALUE 0x336ec578
> uint32_t magic; /* Contains the magic value 0x336ec578 */
> /* ("xEn3" with the 0x80 bit of the "E" set).*/
> uint32_t flags; /* SIF_xxx flags. */
> uint32_t cmdline_paddr; /* Physical address of the command line. */
> uint32_t nr_modules; /* Number of modules passed to the kernel. */
> uint32_t modlist_paddr; /* Physical address of an array of */
> /* hvm_modlist_entry. */
> };
>
> struct hvm_modlist_entry {
> uint64_t paddr; /* Physical address of the module. */
> uint64_t size; /* Size of the module in bytes. */
> };
Why is paddr 64-bit here, but 32-bit in both cases above?
> This structure is guaranteed to always be placed in memory after the
DYM "These structures are ..."?
> loaded kernel and modules. There's no upper bound on the size of the
> structure, users should be aware that it might cross a page boundary.
How is there no size limit? It's (currently) 16 bytes, and I don't see
why it would change. And even if - as implied by the previous
comment - this also relates to struct hvm_start_info: Its size is
fixed (and unlikely to change much) too.
> Note that the boot protocol resembles the multiboot1 specification,
> this is done so OSes with multiboot1 entry points can reuse those if
> desired.
Which raises the question why we don't make it multiboot1 as much
as possible.
Jan
* Re: [Draft B] Boot ABI for HVM guests without a device-model
2015-08-26 12:00 ` Jan Beulich
@ 2015-08-26 12:12 ` Andrew Cooper
2015-08-26 14:44 ` Roger Pau Monné
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2015-08-26 12:12 UTC (permalink / raw)
To: Jan Beulich, Roger Pau Monné
Cc: Elena Ufimtseva, xen-devel, Boris Ostrovsky, Tim Deegan
On 26/08/15 13:00, Jan Beulich wrote:
>
>> struct hvm_start_info {
>> #define HVM_START_MAGIC_VALUE 0x336ec578
>> uint32_t magic; /* Contains the magic value 0x336ec578 */
>> /* ("xEn3" with the 0x80 bit of the "E" set).*/
>> uint32_t flags; /* SIF_xxx flags. */
>> uint32_t cmdline_paddr; /* Physical address of the command line. */
>> uint32_t nr_modules; /* Number of modules passed to the kernel. */
>> uint32_t modlist_paddr; /* Physical address of an array of */
>> /* hvm_modlist_entry. */
>> };
>>
>> struct hvm_modlist_entry {
>> uint64_t paddr; /* Physical address of the module. */
>> uint64_t size; /* Size of the module in bytes. */
>> };
> Why is paddr 64-bit here, but 32-bit in both cases above?
This was my fault for suggesting it like this, but on further
consideration, uint32_t's for both fields will be fine. It won't be
interesting to load any modules outside of the 32bit boundary.
Anyone wishing to load more than 4GB of modules this way should go away
and rethink their boot procedure.
>
>> This structure is guaranteed to always be placed in memory after the
> DYM "These structures are ..."?
>
>> loaded kernel and modules.
There is no requirement for the command line/module information to be
after the loaded kernel. All it needs to do is not overlap.
>> There's no upper bound on the size of the
>> structure, users should be aware that it might cross a page boundary.
> How is there no size limit? It's (currently) 16 bytes, and I don't see
> why it would change. And even if - as implied by the previous
> comment - this also relates to struct hvm_start_info: Its size is
> fixed (and unlikely to change much) too.
I agree it is unlikely to change (but there is a flags field just in
case), but we shouldn't impose unnecessary arbitrary restrictions.
~Andrew
* Re: [Draft B] Boot ABI for HVM guests without a device-model
2015-08-26 11:48 [Draft B] Boot ABI for HVM guests without a device-model Roger Pau Monné
2015-08-26 12:00 ` Jan Beulich
@ 2015-08-26 12:18 ` Andrew Cooper
2015-08-26 15:38 ` Roger Pau Monné
1 sibling, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2015-08-26 12:18 UTC (permalink / raw)
To: Roger Pau Monné, xen-devel
Cc: Elena Ufimtseva, Boris Ostrovsky, Tim Deegan, Jan Beulich
On 26/08/15 12:48, Roger Pau Monné wrote:
> Hello,
>
> The discussion in [1] led to an agreement on the pieces missing from PVH
> (or HVM without a device-model) that are needed in order to progress with
> its implementation.
>
> One of the missing pieces is a new boot ABI that replaces the PV boot
> ABI. The aim of this new boot ABI is to remove the limitations of the
> PV boot ABI that are no longer present when using auto-translated
> guests. The new boot protocol should allow using the same entry point
> for both 32bit and 64bit guests, and let the guest choose its bitness
> at run time without the domain builder knowing in advance.
>
> Roger.
>
> [1] http://lists.xen.org/archives/html/xen-devel/2015-06/msg00258.html
>
> ---
> HVM direct boot ABI
> ===================
>
> Since the Xen entry point into the kernel can be different from the
> native entry point, ELFNOTES are used in order to tell the domain
> builder how to load and jump into the kernel entry point. At least the
> following ELFNOTES are required in order to use this boot ABI:
Perhaps note that this includes the example FreeBSD values. It
shouldn't be implied that these are the exact notes which should be used.
>
> ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS, .asciz, "FreeBSD")
> ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION, .asciz, __XSTRING(__FreeBSD_version))
> ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION, .asciz, "xen-3.0")
> ELFNOTE(Xen, XEN_ELFNOTE_PHYS32_ENTRY, .quad, xen_start32)
As this is strictly a 32bit entry, it can be .long rather than .quad
> ELFNOTE(Xen, XEN_ELFNOTE_FEATURES, .asciz, "writable_descriptor_tables|auto_translated_physmap|supervisor_mode_kernel")
> ELFNOTE(Xen, XEN_ELFNOTE_SUPPORTED_FEATURES, .long ((1 << XENFEAT_writable_page_tables) | \
> (1 << XENFEAT_auto_translated_physmap) | \
> (1 << XENFEAT_supervisor_mode_kernel) | \
> (1 << XENFEAT_hvm_callback_vector))
Can we see about fixing the overloading of XENFEAT_supervisor_mode_kernel ?
IMO it should be relegated to history. It was an old,
not-fully-implemented pv feature (subsequently removed completely) which
is not relevant to HVM guests.
> <snip>
>
> Other relevant information needed in order to boot a guest kernel
> (console page address, xenstore event channel...) can be obtained
> using HVMPARAMS, just like it's done on HVM guests.
>
> The setup of the hypercall page is also performed in the same way
> as HVM guests, using a wrmsr.
using the hypervisor cpuid leaves and msr ranges.
~Andrew
* Re: [Draft B] Boot ABI for HVM guests without a device-model
2015-08-26 12:12 ` Andrew Cooper
@ 2015-08-26 14:44 ` Roger Pau Monné
2015-08-27 8:04 ` Jan Beulich
0 siblings, 1 reply; 10+ messages in thread
From: Roger Pau Monné @ 2015-08-26 14:44 UTC (permalink / raw)
To: Andrew Cooper, Jan Beulich
Cc: Elena Ufimtseva, xen-devel, Boris Ostrovsky, Tim Deegan
On 26/08/15 at 14:12, Andrew Cooper wrote:
> On 26/08/15 13:00, Jan Beulich wrote:
>>> This structure is guaranteed to always be placed in memory after the
>> DYM "These structures are ..."?
>>
>>> loaded kernel and modules.
>
> There is no requirement for the command line/module information to be
> after the loaded kernel. All it needs to do is not overlap.
IMHO, this is helpful in order to get last used physical address, after
which free memory starts. Current FreeBSD implementation relies on this,
if we didn't do it that way I would have to calculate where the symtab +
strtab ends, which is more complex.
>>> There's no upper bound on the size of the
>>> structure, users should be aware that it might cross a page boundary.
>> How is there no size limit? It's (currently) 16 bytes, and I don't see
>> why it would change. And even if - as implied by the previous
>> comment - this also relates to struct hvm_start_info: Its size is
>> fixed (and unlikely to change much) too.
>
> I agree it is unlikely to change (but there is a flags field just in
> case), but we shouldn't impose unnecessary arbitrary restrictions.
After reading it again, I realize this is not properly worded. What I
wanted to say is that the cmdline or the list of loaded modules might
cross a page boundary.
Roger.
* Re: [Draft B] Boot ABI for HVM guests without a device-model
2015-08-26 12:18 ` Andrew Cooper
@ 2015-08-26 15:38 ` Roger Pau Monné
0 siblings, 0 replies; 10+ messages in thread
From: Roger Pau Monné @ 2015-08-26 15:38 UTC (permalink / raw)
To: Andrew Cooper, xen-devel
Cc: Elena Ufimtseva, Boris Ostrovsky, Tim Deegan, Jan Beulich
On 26/08/15 at 14:18, Andrew Cooper wrote:
> On 26/08/15 12:48, Roger Pau Monné wrote:
>> ELFNOTE(Xen, XEN_ELFNOTE_FEATURES, .asciz, "writable_descriptor_tables|auto_translated_physmap|supervisor_mode_kernel")
>> ELFNOTE(Xen, XEN_ELFNOTE_SUPPORTED_FEATURES, .long ((1 << XENFEAT_writable_page_tables) | \
>> (1 << XENFEAT_auto_translated_physmap) | \
>> (1 << XENFEAT_supervisor_mode_kernel) | \
>> (1 << XENFEAT_hvm_callback_vector))
>
> Can we see about fixing the overloading of XENFEAT_supervisor_mode_kernel ?
>
> IMO it should be relegated to history. It was an old,
> not-fully-implemented pv feature (subsequently removed completely) which
> is not relevant to HVM guests.
Maybe we can get rid of both XEN_ELFNOTE_FEATURES and
XEN_ELFNOTE_SUPPORTED_FEATURES if the kernel only supports PVH?
Roger.
* Re: [Draft B] Boot ABI for HVM guests without a device-model
2015-08-26 14:44 ` Roger Pau Monné
@ 2015-08-27 8:04 ` Jan Beulich
2015-08-27 9:43 ` Andrew Cooper
0 siblings, 1 reply; 10+ messages in thread
From: Jan Beulich @ 2015-08-27 8:04 UTC (permalink / raw)
To: Roger Pau Monné
Cc: Elena Ufimtseva, Andrew Cooper, Tim Deegan, xen-devel,
Boris Ostrovsky
>>> On 26.08.15 at 16:44, <roger.pau@citrix.com> wrote:
> On 26/08/15 at 14:12, Andrew Cooper wrote:
>> On 26/08/15 13:00, Jan Beulich wrote:
>>>> This structure is guaranteed to always be placed in memory after the
>>> DYM "These structures are ..."?
>>>
>>>> loaded kernel and modules.
>>
>> There is no requirement for the command line/module information to be
>> after the loaded kernel. All it needs to do is not overlap.
>
> IMHO, this is helpful in order to get last used physical address, after
> which free memory starts. Current FreeBSD implementation relies on this,
> if we didn't do it that way I would have to calculate where the symtab +
> strtab ends, which is more complex.
But the statement leaves open whether there is any free memory at
all after those structures, or whether instead all free memory lives
at lower addresses. Nor do I consider it appropriate to take a present
(one might say overly simplistic) implementation as a basis for setting
arbitrary restrictions.
Jan
* Re: [Draft B] Boot ABI for HVM guests without a device-model
2015-08-27 8:04 ` Jan Beulich
@ 2015-08-27 9:43 ` Andrew Cooper
2015-08-27 9:57 ` Roger Pau Monné
0 siblings, 1 reply; 10+ messages in thread
From: Andrew Cooper @ 2015-08-27 9:43 UTC (permalink / raw)
To: Jan Beulich, Roger Pau Monné
Cc: Elena Ufimtseva, xen-devel, Boris Ostrovsky, Tim Deegan
On 27/08/15 09:04, Jan Beulich wrote:
>>>> On 26.08.15 at 16:44, <roger.pau@citrix.com> wrote:
>> On 26/08/15 at 14:12, Andrew Cooper wrote:
>>> On 26/08/15 13:00, Jan Beulich wrote:
>>>>> This structure is guaranteed to always be placed in memory after the
>>>> DYM "These structures are ..."?
>>>>
>>>>> loaded kernel and modules.
>>> There is no requirement for the command line/module information to be
>>> after the loaded kernel. All it needs to do is not overlap.
>> IMHO, this is helpful in order to get last used physical address, after
>> which free memory starts. Current FreeBSD implementation relies on this,
>> if we didn't do it that way I would have to calculate where the symtab +
>> strtab ends, which is more complex.
> But the statement leaves open whether there is any free memory at
> all after those structures, or whether instead all free memory lives
> at lower addresses. Nor do I consider it appropriate to take a present
> (one might say overly simplistic) implementation as a basis for setting
> arbitrary restrictions.
I agree. This sounds like a FreeBSD bug, and absolutely shouldn't be a
written restriction in the boot ABI.
~Andrew
* Re: [Draft B] Boot ABI for HVM guests without a device-model
2015-08-27 9:43 ` Andrew Cooper
@ 2015-08-27 9:57 ` Roger Pau Monné
2015-08-27 11:08 ` Jan Beulich
0 siblings, 1 reply; 10+ messages in thread
From: Roger Pau Monné @ 2015-08-27 9:57 UTC (permalink / raw)
To: Andrew Cooper, Jan Beulich
Cc: Elena Ufimtseva, xen-devel, Boris Ostrovsky, Tim Deegan
On 27/08/15 at 11:43, Andrew Cooper wrote:
> On 27/08/15 09:04, Jan Beulich wrote:
>>>>> On 26.08.15 at 16:44, <roger.pau@citrix.com> wrote:
>>> On 26/08/15 at 14:12, Andrew Cooper wrote:
>>>> On 26/08/15 13:00, Jan Beulich wrote:
>>>>>> This structure is guaranteed to always be placed in memory after the
>>>>> DYM "These structures are ..."?
>>>>>
>>>>>> loaded kernel and modules.
>>>> There is no requirement for the command line/module information to be
>>>> after the loaded kernel. All it needs to do is not overlap.
>>> IMHO, this is helpful in order to get last used physical address, after
>>> which free memory starts. Current FreeBSD implementation relies on this,
>>> if we didn't do it that way I would have to calculate where the symtab +
>>> strtab ends, which is more complex.
>> But the statement leaves open whether there is any free memory at
>> all after those structures, or whether instead all free memory lives
>> at lower addresses. Nor do I consider it appropriate to take a present
>> (one might say overly simplistic) implementation as a basis for setting
>> arbitrary restrictions.
Can we just state that the hvm_start_info structure and associated
metadata is placed after the loaded kernel and modules?
Whether there's free memory or not after this is something that the
kernel has to figure out by itself, and I wasn't planning to add such a
statement to the specification.
> I agree. This sounds like a FreeBSD bug, and absolutely shouldn't be a
> written restriction in the boot ABI.
Bug? The FreeBSD native loader passes to the FreeBSD kernel the last
used address, after which free memory starts. IMHO, it is not a bug,
it's just how FreeBSD boots. I understand that Linux might not pass
such a parameter, and there are other ways I can use to find this, but
they are more complex.
We already did something very similar with PV guests, see the comment
before the start_info structure:
* 3. This the order of bootstrap elements in the initial virtual region:
* a. relocated kernel image
* b. initial ram disk [mod_start, mod_len]
* (may be omitted)
* c. list of allocated page frames [mfn_list, nr_pages]
* (unless relocated due to XEN_ELFNOTE_INIT_P2M)
* d. start_info_t structure [register ESI (x86)]
* in case of dom0 this page contains the console info, too
* e. unless dom0: xenstore ring page
* f. unless dom0: console ring page
* g. bootstrap page tables [pt_base and CR3 (x86)]
* h. bootstrap stack [register ESP (x86)]
IMHO it is important to mention how things are loaded into memory, and
placing the hvm_start_info struct after the loaded kernel and modules
is also the most natural way to do it, I don't foresee this changing in
the future.
Roger.
* Re: [Draft B] Boot ABI for HVM guests without a device-model
2015-08-27 9:57 ` Roger Pau Monné
@ 2015-08-27 11:08 ` Jan Beulich
0 siblings, 0 replies; 10+ messages in thread
From: Jan Beulich @ 2015-08-27 11:08 UTC (permalink / raw)
To: Roger Pau Monné
Cc: Elena Ufimtseva, Andrew Cooper, Tim Deegan, xen-devel,
Boris Ostrovsky
>>> On 27.08.15 at 11:57, <roger.pau@citrix.com> wrote:
> On 27/08/15 at 11:43, Andrew Cooper wrote:
>> On 27/08/15 09:04, Jan Beulich wrote:
>>>>>> On 26.08.15 at 16:44, <roger.pau@citrix.com> wrote:
>>>> On 26/08/15 at 14:12, Andrew Cooper wrote:
>>>>> On 26/08/15 13:00, Jan Beulich wrote:
>>>>>>> This structure is guaranteed to always be placed in memory after the
>>>>>> DYM "These structures are ..."?
>>>>>>
>>>>>>> loaded kernel and modules.
>>>>> There is no requirement for the command line/module information to be
>>>>> after the loaded kernel. All it needs to do is not overlap.
>>>> IMHO, this is helpful in order to get last used physical address, after
>>>> which free memory starts. Current FreeBSD implementation relies on this,
>>>> if we didn't do it that way I would have to calculate where the symtab +
>>>> strtab ends, which is more complex.
>>> But the statement leaves open whether there is any free memory at
>>> all after those structures, or whether instead all free memory lives
>>> at lower addresses. Nor do I consider it appropriate to take a present
>>> (one might say overly simplistic) implementation as a basis for setting
>>> arbitrary restrictions.
>
> Can we just state that the hvm_start_info structure and associated
> metadata is placed after the loaded kernel and modules?
I think we should try to avoid introducing any restrictions that aren't
technically warranted: The more restrictions we add now, the less
flexible we're going to be when we want/need to change some of the
implementation later on.
>> I agree. This sounds like a FreeBSD bug, and absolutely shouldn't be a
>> written restriction in the boot ABI.
>
> Bug? The FreeBSD native loader passes to the FreeBSD kernel the last
> used address, after which free memory starts. IMHO, it is not a bug,
> it's just how FreeBSD boots. I understand that Linux might not pass
> such a parameter, and there are other ways I can use to find this, but
> they are more complex.
Considering that you claimed that after that free memory starts, when
- as said - there might not be any free memory there, it indeed sounds
like a bug to me.
> We already did something very similar with PV guests, see the comment
> before the start_info structure:
I think we'd better avoid leaving basically no room for alterations. It
has been problematic on the PV side in at least one case (the ordering
of page table pages for compat guests).
> IMHO it is important to mention how things are loaded into memory, and
> placing the hvm_start_info struct after the loaded kernel and modules
> is also the most natural way to do it, I don't foresee this changing in
> the future.
The only thing we should make sure is that all pieces can be easily
found, i.e. the kernel doesn't need to do any guessing.
Jan