* [Qemu-devel] QEMU virt board: extending various limits
@ 2018-01-16 15:07 Peter Maydell
  2018-01-16 20:18 ` Laszlo Ersek
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2018-01-16 15:07 UTC (permalink / raw)
  To: QEMU Developers; +Cc: Andrew Jones, Laszlo Ersek

We've had discussions before about the various limits in the virt
board imposed by its current address space layout:
 * number of CPUs limited to 123 (not enough space for more redistributors)
 * number of PCIe devices limited by size of ECAM space
 * max memory size limits
 * (anything else?)
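
As a rough cross-check of the first two items, the following sketch reproduces the arithmetic behind them. The region sizes are assumptions taken from the virt board's memory map at the time of this thread (a 0x00F60000-byte redistributor region and a 16 MiB ECAM window); they are not stated in this message and should be verified against hw/arm/virt.c.

```python
# Back-of-the-envelope check of the CPU and PCIe limits listed above.
# ASSUMPTION: the region sizes below mirror the virt memory map of the time;
# they are not quoted anywhere in this thread.

GIC_REDIST_REGION_SIZE = 0x00F60000    # assumed size of the redistributor region
GIC_REDIST_FRAME_SIZE = 2 * 64 * 1024  # GICv3: two 64 KiB frames per redistributor

PCIE_ECAM_SIZE = 0x01000000            # assumed 16 MiB ECAM window
ECAM_BYTES_PER_FUNCTION = 4096         # 4 KiB of config space per PCIe function
FUNCTIONS_PER_BUS = 32 * 8             # 32 devices x 8 functions per bus


def max_redistributors(region_size, frame_size):
    """Number of per-CPU redistributors that fit in the region."""
    return region_size // frame_size


def max_ecam_buses(ecam_size):
    """Number of PCIe buses addressable through the ECAM window."""
    return ecam_size // (ECAM_BYTES_PER_FUNCTION * FUNCTIONS_PER_BUS)


max_cpus = max_redistributors(GIC_REDIST_REGION_SIZE, GIC_REDIST_FRAME_SIZE)
max_buses = max_ecam_buses(PCIE_ECAM_SIZE)
print(max_cpus, max_buses)
```

Under those assumed sizes the redistributor region holds exactly 123 frames, which matches the CPU limit quoted above.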

If we want to try to fix these this release cycle, now would be a good
point to figure out our approach so that we have plenty of time to do
it in.

(Relatedly, I notice patches on list for kvm that allow userspace to
set the guest physical address size, which may affect how we want
to do this.)

I'm not going to have time to look at this but am happy to provide
my opinions on whatever proposals other people would like to suggest.

Probably the first thing to do is figure out whether we can
raise these limits without having to have a flag day (i.e. just
by changing the device tree we provide to the guest), or if we
really have a hard compat break here. We should also try to
fix all these things at once rather than potentially breaking
guests several times...

thanks
-- PMM

* Re: [Qemu-devel] QEMU virt board: extending various limits
  2018-01-16 15:07 [Qemu-devel] QEMU virt board: extending various limits Peter Maydell
@ 2018-01-16 20:18 ` Laszlo Ersek
  2018-01-16 20:28   ` Ard Biesheuvel
  0 siblings, 1 reply; 7+ messages in thread
From: Laszlo Ersek @ 2018-01-16 20:18 UTC (permalink / raw)
  To: Peter Maydell
  Cc: QEMU Developers, Andrew Jones, Ard Biesheuvel, Igor Mammedov,
	Wei Huang, Leif Lindholm (Linaro address)

(adding Ard, Igor, Wei, Leif)

On 01/16/18 16:07, Peter Maydell wrote:
> We've had discussions before about the various limits in the virt
> board imposed by its current address space layout:
>  * number of CPUs limited to 123 (not enough space for more redistributors)
>  * number of PCIe devices limited by size of ECAM space
>  * max memory size limits
>  * (anything else?)
> 
> If we want to try to fix these this release cycle now would be a good
> point to figure out our approach so that we have plenty of time to do
> it in.
> 
> (Relatedly, I notice patches on list for kvm that allow userspace to
> set the guest physical address size, which may affect how we want
> to do this.)
> 
> I'm not going to have time to look at this but am happy to provide
> my opinions on whatever proposals other people would like to suggest.
> 
> Probably the first thing to do is figure out whether we can
> raise these limits without having to have a flag day (ie just
> with changing the device tree we provide the guest), or if we
> really have a hard compat break here. We should also try to
> fix all these things at once rather than potentially breaking
> guests several times...

I've quite lost the context on this since we last talked about it. :) My
request would be that Drew and Igor please (re)state their preferences,
and Ard and myself should put "firmware price tags" on those ideas.

As far as I remember, the sticking point from last time was whether
guest UEFI remains permitted to rely on the RAM base being fixed at 1GB
(i.e. if UEFI is at liberty to ignore x0 on entry). This decision
provides a framework for all further area movements, and represents a
large difference in firmware difficulty.

(Personally I'd be ready to *accept* a consensus that UEFI should cope
with a dynamic x0 on entry -- I'm neither proposing nor arguing against
the notion. The large additional complexity in the firmware should be
clear up-front however -- it'll take more time, more bugs, more human
resources. My last writeup is at
<http://mid.mail-archive.com/4cce2b8b-a411-bd5d-a06f-b0b80a5fb2f1@redhat.com>,
although I think Ard has modified some of the code since, so parts of
that text are no longer up to date.)

Thanks,
Laszlo

* Re: [Qemu-devel] QEMU virt board: extending various limits
  2018-01-16 20:18 ` Laszlo Ersek
@ 2018-01-16 20:28   ` Ard Biesheuvel
  2018-01-17 16:15     ` Igor Mammedov
  0 siblings, 1 reply; 7+ messages in thread
From: Ard Biesheuvel @ 2018-01-16 20:28 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Peter Maydell, QEMU Developers, Andrew Jones, Igor Mammedov,
	Wei Huang, Leif Lindholm (Linaro address)

On 16 January 2018 at 20:18, Laszlo Ersek <lersek@redhat.com> wrote:
> (adding Ard, Igor, Wei, Leif)
>
> On 01/16/18 16:07, Peter Maydell wrote:
>> We've had discussions before about the various limits in the virt
>> board imposed by its current address space layout:
>>  * number of CPUs limited to 123 (not enough space for more redistributors)
>>  * number of PCIe devices limited by size of ECAM space
>>  * max memory size limits
>>  * (anything else?)
>>
>> If we want to try to fix these this release cycle now would be a good
>> point to figure out our approach so that we have plenty of time to do
>> it in.
>>
>> (Relatedly, I notice patches on list for kvm that allow userspace to
>> set the guest physical address size, which may affect how we want
>> to do this.)
>>
>> I'm not going to have time to look at this but am happy to provide
>> my opinions on whatever proposals other people would like to suggest.
>>
>> Probably the first thing to do is figure out whether we can
>> raise these limits without having to have a flag day (ie just
>> with changing the device tree we provide the guest), or if we
>> really have a hard compat break here. We should also try to
>> fix all these things at once rather than potentially breaking
>> guests several times...
>
> I've quite lost the context on this since we last talked about it. :) My
> request would be that Drew and Igor please (re)state their preferences,
> and Ard and myself should put "firmware price tags" on those ideas.
>
> As far as I remember, the sticking point from last time was whether
> guest UEFI remains permitted to rely on the RAM base being fixed at 1GB
> (i.e. if UEFI is at liberty to ignore x0 on entry). This decision
> provides a framework for all further area movements, and represents a
> large difference in firmware difficulty.
>
> (Personally I'd be ready to *accept* a consensus that UEFI should cope
> with a dynamic x0 on entry -- I'm neither proposing nor arguing against
> the notion. The large additional complexity in the firmware should be
> clear up-front however -- it'll take more time, more bugs, more human
> resources. My last writeup is at
> <http://mid.mail-archive.com/4cce2b8b-a411-bd5d-a06f-b0b80a5fb2f1@redhat.com>,
> although I think Ard has modified some of the code since, so parts of
> that text are no longer up to date.)
>

The 'contract' was 1 MB at 0x40000000 but UEFI never used more than
512 KB of that without checking the DT. With only very minor changes,
we could repurpose this range as 'non-secure SRAM', use it as
temporary PEI memory and use whatever the DT describes for DRAM, PCIe
etc.
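
A minimal sketch of that idea follows, assuming a hypothetical helper that receives the DRAM ranges already parsed from the DT. Only the base address, the 1 MB contract size, and the 512 KB usage figure come from this message; the function names and the data shapes are illustrative, not actual firmware code.

```python
# Sketch: treat the fixed 1 MB window as "non-secure SRAM" for temporary PEI
# memory, and take DRAM exclusively from what the device tree describes.

SRAM_BASE = 0x40000000        # from the thread: the 'contract' address
SRAM_SIZE = 1 * 1024 * 1024   # from the thread: 1 MB contract
PEI_TEMP_SIZE = 512 * 1024    # from the thread: at most 512 KB used before reading the DT


def within(outer_base, outer_size, base, size):
    """True if [base, base+size) lies entirely inside the outer range."""
    return outer_base <= base and base + size <= outer_base + outer_size


def plan_early_memory(dt_memory_ranges):
    """Return the early-boot memory plan.

    dt_memory_ranges is a HYPOTHETICAL list of (base, size) tuples that a DT
    parser would have produced; no real parser is modelled here.
    """
    pei_temp = (SRAM_BASE, PEI_TEMP_SIZE)
    if not within(SRAM_BASE, SRAM_SIZE, *pei_temp):
        raise ValueError("temporary PEI memory does not fit in the SRAM window")
    # DRAM is never assumed; it comes only from the DT.
    return {"pei_temp": pei_temp, "dram": sorted(dt_memory_ranges)}


plan = plan_early_memory([(0x100000000, 4 << 30), (0x80000000, 1 << 30)])
print(plan)
```

The point of the split is that nothing outside the fixed 1 MB window is hard-coded, so DRAM, PCIe and the rest could move without the firmware noticing.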

For the firmware side, this would be a very natural fit with what the
code currently does, and with what many x86 and ARM bare metal
platforms do as well. Of course, I am clueless when it comes to the
QEMU side of these things, so perhaps this is a terrible idea.

* Re: [Qemu-devel] QEMU virt board: extending various limits
  2018-01-16 20:28   ` Ard Biesheuvel
@ 2018-01-17 16:15     ` Igor Mammedov
  2018-01-17 16:18       ` Peter Maydell
  0 siblings, 1 reply; 7+ messages in thread
From: Igor Mammedov @ 2018-01-17 16:15 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Laszlo Ersek, Peter Maydell, QEMU Developers, Andrew Jones,
	Wei Huang, Leif Lindholm (Linaro address)

On Tue, 16 Jan 2018 20:28:49 +0000
Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:

> On 16 January 2018 at 20:18, Laszlo Ersek <lersek@redhat.com> wrote:
> > (adding Ard, Igor, Wei, Leif)
> >
> > On 01/16/18 16:07, Peter Maydell wrote:  
> >> We've had discussions before about the various limits in the virt
> >> board imposed by its current address space layout:
> >>  * number of CPUs limited to 123 (not enough space for more redistributors)
> >>  * number of PCIe devices limited by size of ECAM space
> >>  * max memory size limits
> >>  * (anything else?)
> >>
> >> If we want to try to fix these this release cycle now would be a good
> >> point to figure out our approach so that we have plenty of time to do
> >> it in.
> >>
> >> (Relatedly, I notice patches on list for kvm that allow userspace to
> >> set the guest physical address size, which may affect how we want
> >> to do this.)
> >>
> >> I'm not going to have time to look at this but am happy to provide
> >> my opinions on whatever proposals other people would like to suggest.
> >>
> >> Probably the first thing to do is figure out whether we can
> >> raise these limits without having to have a flag day (ie just
> >> with changing the device tree we provide the guest), or if we
> >> really have a hard compat break here. We should also try to
> >> fix all these things at once rather than potentially breaking
> >> guests several times...  
> >
> > I've quite lost the context on this since we last talked about it. :) My
> > request would be that Drew and Igor please (re)state their preferences,
> > and Ard and myself should put "firmware price tags" on those ideas.
> >
> > As far as I remember, the sticking point from last time was whether
> > guest UEFI remains permitted to rely on the RAM base being fixed at 1GB
> > (i.e. if UEFI is at liberty to ignore x0 on entry). This decision
> > provides a framework for all further area movements, and represents a
> > large difference in firmware difficulty.
> >
> > (Personally I'd be ready to *accept* a consensus that UEFI should cope
> > with a dynamic x0 on entry -- I'm neither proposing nor arguing against
> > the notion. The large additional complexity in the firmware should be
> > clear up-front however -- it'll take more time, more bugs, more human
> > resources. My last writeup is at
> > <http://mid.mail-archive.com/4cce2b8b-a411-bd5d-a06f-b0b80a5fb2f1@redhat.com>,
> > although I think Ard has modified some of the code since, so parts of
> > that text are no longer up to date.)
> >  
> 
> The 'contract' was 1 MB at 0x40000000 but UEFI never used more than
> 512 KB of that without checking the DT. With only very minor changes,
> we could repurpose this range as 'non-secure SRAM', use it as
> temporary PEI memory and use whatever the DT describes for DRAM, PCIe
> etc.
> 
> For the firmware side, this would be a very natural fit with what the
> code currently does, and with what many x86 and ARM bare metal
> platforms do as well. Of course, I am clueless when it comes to the
> QEMU side of these things, so perhaps this is a terrible idea.

My idea was to drop the fixed RAM base for the virt board (at least for
new machine types) so that QEMU could specify it dynamically and
firmware would get the base from x0 instead of a compiled-in constant.

Though it's more complex for firmware, it should benefit both QEMU
and firmware in the long run:
  - we won't have to do lockstep updates if there is ever a need
    to move the base in the future
  - no need to maintain/invent compat machinery to make sure that
    old/new firmware will work fine on old/new QEMU
  - QEMU won't have to introduce a fragmented RAM layout and can keep
    RAM as one contiguous region, which is simpler to maintain and
    harder to break by accident.

So I think that a dynamic memory base would be the better approach for
maintaining the zoo of QEMU and firmware combinations, and it would reduce
the chances of accidentally breaking some combo that used to work.

* Re: [Qemu-devel] QEMU virt board: extending various limits
  2018-01-17 16:15     ` Igor Mammedov
@ 2018-01-17 16:18       ` Peter Maydell
  2018-01-17 16:53         ` Andrew Jones
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Maydell @ 2018-01-17 16:18 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Ard Biesheuvel, Laszlo Ersek, QEMU Developers, Andrew Jones,
	Wei Huang, Leif Lindholm (Linaro address)

On 17 January 2018 at 16:15, Igor Mammedov <imammedo@redhat.com> wrote:
> my idea was to drop fixed RAM base for virt board (at least for
> new machine types) so that QEMU could specify it dynamically and
> firmware would get base from x0 instead of compiled in constant.

"base of ram is fixed" is about the one thing we've told
people they can rely on without fishing it out of the
device tree, so I think I'll just rule changing that out
of consideration now :-)

thanks
-- PMM

* Re: [Qemu-devel] QEMU virt board: extending various limits
  2018-01-17 16:18       ` Peter Maydell
@ 2018-01-17 16:53         ` Andrew Jones
  2018-01-17 18:53           ` Igor Mammedov
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Jones @ 2018-01-17 16:53 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Igor Mammedov, Ard Biesheuvel, Laszlo Ersek, QEMU Developers,
	Wei Huang, Leif Lindholm (Linaro address)

On Wed, Jan 17, 2018 at 04:18:30PM +0000, Peter Maydell wrote:
> On 17 January 2018 at 16:15, Igor Mammedov <imammedo@redhat.com> wrote:
> > my idea was to drop fixed RAM base for virt board (at least for
> > new machine types) so that QEMU could specify it dynamically and
> > firmware would get base from x0 instead of compiled in constant.
> 
> "base of ram is fixed" is about the one thing we've told
> people they can rely on without fishing it out of the
> device tree, so I think I'll just rule changing that out
> of consideration now :-)
>

So that leaves three choices:

1) New machine type that has a different or non-fixed RAM base

(Makes the QEMU/AAVMF zoo even worse.)

2) Implement split memory where one chunk is guaranteed to be at
   the 1G boundary, e.g. 'size <= 1G' at 1G

(The QEMU work will no doubt snowball, especially when considering
 memory hotplug. Although hotplug will likely warrant using DIMMs
 anyway, which means one 'size <= 1G' DIMM at 1G could be a non-removable,
 present-by-default DIMM, and other DIMM(s) would go elsewhere in order to
 implement the split memory.)

3) Leave memory like it is and just put everything else we want to expand
   in high memory, probably above the second PCIe window. I.e. CPU
   redistributor regions 124 and up and an additional PCIe ECAM space
   would go up there.

(Easiest, most backward compatible thing to do. Is there risk with putting
 those things above 4G? Someday we may want to shift those things and the
 second PCIe window even higher, if we ever want to support more than 515
 GB of memory, but I guess that shouldn't be a problem.)
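
To make choices 2 and 3 concrete, here is a rough layout sketch. The position and size of the second PCIe window (512 GiB at 512 GiB), the 128 KiB redistributor frame, the 256 MiB ECAM alignment, and the high RAM base used in the example are all assumptions for illustration; none of them is a proposal from this thread.

```python
# Sketch of choice 2 (split RAM) and choice 3 (extra regions in high memory).
# ASSUMPTIONS: window position/size, frame size and alignment are illustrative.

GiB = 1 << 30
MiB = 1 << 20

RAM_BASE = 1 * GiB                  # fixed RAM base, from the thread
LOW_REDISTS = 123                   # redistributors that fit in the low region
REDIST_FRAME_SIZE = 2 * 64 * 1024   # assumed GICv3 redistributor footprint
HIGH_PCIE_WINDOW_BASE = 512 * GiB   # assumed base of the second PCIe window
HIGH_PCIE_WINDOW_SIZE = 512 * GiB   # assumed size of the second PCIe window
ECAM_BYTES_PER_BUS = 1 * MiB        # 32 devices x 8 functions x 4 KiB
ECAM_ALIGN = 256 * MiB              # assumed alignment: a full 256-bus ECAM


def align_up(value, alignment):
    """Round value up to a power-of-two alignment."""
    return (value + alignment - 1) & ~(alignment - 1)


def split_ram(total_size, high_base):
    """Choice 2: at most 1G stays at the 1G boundary, the rest goes to high_base."""
    low = min(total_size, 1 * GiB)
    chunks = [(RAM_BASE, low)]
    if total_size > low:
        chunks.append((high_base, total_size - low))
    return chunks


def place_high_regions(num_cpus, extra_buses):
    """Choice 3: leave RAM alone, put the overflow above the second PCIe window."""
    cursor = HIGH_PCIE_WINDOW_BASE + HIGH_PCIE_WINDOW_SIZE
    layout = {}
    extra_redists = max(0, num_cpus - LOW_REDISTS)
    if extra_redists:
        size = extra_redists * REDIST_FRAME_SIZE
        layout["high_redist"] = (cursor, size)
        cursor += size
    if extra_buses:
        cursor = align_up(cursor, ECAM_ALIGN)
        layout["high_ecam"] = (cursor, extra_buses * ECAM_BYTES_PER_BUS)
    return layout


print(split_ram(3 * GiB, 4 * GiB))
print(place_high_regions(num_cpus=255, extra_buses=256))
```

With choice 3 a guest that never looks above the existing regions sees exactly the layout it sees today, which is why it is the most backward compatible of the three.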

Thanks,
drew

* Re: [Qemu-devel] QEMU virt board: extending various limits
  2018-01-17 16:53         ` Andrew Jones
@ 2018-01-17 18:53           ` Igor Mammedov
  0 siblings, 0 replies; 7+ messages in thread
From: Igor Mammedov @ 2018-01-17 18:53 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Peter Maydell, Ard Biesheuvel, Laszlo Ersek, QEMU Developers,
	Wei Huang, Leif Lindholm (Linaro address)

On Wed, 17 Jan 2018 17:53:48 +0100
Andrew Jones <drjones@redhat.com> wrote:

> On Wed, Jan 17, 2018 at 04:18:30PM +0000, Peter Maydell wrote:
> > On 17 January 2018 at 16:15, Igor Mammedov <imammedo@redhat.com> wrote:  
> > > my idea was to drop fixed RAM base for virt board (at least for
> > > new machine types) so that QEMU could specify it dynamically and
> > > firmware would get base from x0 instead of compiled in constant.  
> > 
> > "base of ram is fixed" is about the one thing we've told
> > people they can rely on without fishing it out of the
> > device tree, so I think I'll just rule changing that out
> > of consideration now :-)
> >  
> 
> So that leaves three choices:
> 
> 1) New machine type that has a different or non-fixed RAM base
> 
> (Makes the QEMU/AAVMF zoo even worse.)
Maybe it's the way to go: we can drop all the stuff we don't
really need for the virt use case, and new firmware would
pick up the RAM base from x0, so it would be able to work on
both new and old (fixed base put in x0) machine types.
Guests that want to run on the new machine type would have to
be booted by new AAVMF or handle the dynamic RAM base from
x0 themselves.
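
The compatibility matrix this implies can be sketched as follows. It assumes, as this thread does, that x0 carries the RAM base on entry; the base used for the "new" machine type is a made-up value for illustration only.

```python
# Sketch: which firmware/machine-type combinations agree on the RAM base.

GiB = 1 << 30
LEGACY_FIXED_RAM_BASE = 1 * GiB   # fixed base on existing virt machine types
NEW_MACHINE_RAM_BASE = 4 * GiB    # HYPOTHETICAL base chosen by a new machine type


def ram_base_old_firmware(x0):
    """Old firmware ignores x0 and uses its compiled-in constant."""
    return LEGACY_FIXED_RAM_BASE


def ram_base_new_firmware(x0):
    """New firmware trusts whatever base it is handed in x0."""
    return x0


def agrees(firmware, machine_ram_base):
    """True if the firmware's idea of the RAM base matches the machine's."""
    return firmware(machine_ram_base) == machine_ram_base


matrix = {
    ("old fw", "old machine"): agrees(ram_base_old_firmware, LEGACY_FIXED_RAM_BASE),
    ("old fw", "new machine"): agrees(ram_base_old_firmware, NEW_MACHINE_RAM_BASE),
    ("new fw", "old machine"): agrees(ram_base_new_firmware, LEGACY_FIXED_RAM_BASE),
    ("new fw", "new machine"): agrees(ram_base_new_firmware, NEW_MACHINE_RAM_BASE),
}
for combo, ok in matrix.items():
    print(combo, "ok" if ok else "mismatch")
```

Only the old-firmware-on-new-machine combination disagrees, which is the case where the guest would need new AAVMF or its own x0 handling.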

How about virt-enterprise (64-bit only, EFI OS support, booted by AAVMF)?

> 2) Implement split memory where one chunk is guaranteed to be at
>    the 1G boundary, e.g. 'size <= 1G' at 1G
> 
> (The QEMU work will no doubt snowball, especially when considering
>  memory hotplug. Although hotplug will likely warrant using DIMMs
>  anyway, which means one 'size <= 1G' at 1G DIMM could be a non-removable,
>  there by default DIMM, and other DIMM(s) would go elsewhere in order to
>  implement the split memory.)
Look at the PC memory map and the bunch of tweaks that alter it:
it's hard to figure out whether a change to it would break something.
So if we can (i.e. we're not restricted by a spec), then we should go
for a flexible route that doesn't have design issues from the start.


> 3) Leave memory like it is and just put everything else we want to expand
>    in high memory, probably above the second PCIe window. I.e. CPU
>    redistributor regions 124 and up and an additional PCIe ECAM space
>    would go up there.
> 
> (Easiest, most backward compatible thing to do. Is there risk with putting
>  those things above 4G? Someday we may want to shift those things and the
>  second PCIe window even higher, if we ever want to support more than 515
>  GB of memory, but I guess that shouldn't be a problem.)
We can do that, but the platform is still new, so eventually we might
have to change the layout and deal with compat issues then.
By that point it will be too late to change direction (with existing
customers), so we would have to live with self-inflicted pain that could
have been avoided if we had made things flexible.


> 
> Thanks,
> drew
