* [Qemu-devel] QEMU virt board: extending various limits
From: Peter Maydell @ 2018-01-16 15:07 UTC
To: QEMU Developers; +Cc: Andrew Jones, Laszlo Ersek

We've had discussions before about the various limits in the virt
board imposed by its current address space layout:
 * number of CPUs limited to 123 (not enough space for more redistributors)
 * number of PCIe devices limited by size of ECAM space
 * max memory size limits
 * (anything else?)

If we want to try to fix these this release cycle now would be a good
point to figure out our approach so that we have plenty of time to do
it in.

(Relatedly, I notice patches on list for kvm that allow userspace to
set the guest physical address size, which may affect how we want
to do this.)

I'm not going to have time to look at this but am happy to provide
my opinions on whatever proposals other people would like to suggest.

Probably the first thing to do is figure out whether we can
raise these limits without having to have a flag day (ie just
with changing the device tree we provide the guest), or if we
really have a hard compat break here. We should also try to
fix all these things at once rather than potentially breaking
guests several times...

thanks
-- PMM
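For readers who want to see where limits of this kind come from, the arithmetic can be sketched as below. The region sizes are assumptions (they are not stated in the thread) and stand in for the virt board's low memory map of that era: a GICv3 redistributor region of 0x00F60000 bytes with two 64 KiB frames per CPU, and a 16 MiB PCIe ECAM window with 4 KiB of config space per function. The function names are hypothetical.

```python
# Illustrative arithmetic only. All sizes below are assumptions, not
# values quoted in the thread.
GIC_REDIST_REGION_SIZE = 0x00F60000  # bytes reserved for GICv3 redistributors
GIC_REDIST_PER_CPU = 2 * 64 * 1024   # two 64 KiB frames per redistributor
ECAM_REGION_SIZE = 0x01000000        # bytes of PCIe ECAM window (16 MiB)
ECAM_BYTES_PER_FUNCTION = 4096       # PCIe config space per function
FUNCTIONS_PER_DEVICE = 8
DEVICES_PER_BUS = 32


def max_cpus(region_size=GIC_REDIST_REGION_SIZE, per_cpu=GIC_REDIST_PER_CPU):
    """Number of redistributors (one per CPU) that fit in the region."""
    return region_size // per_cpu


def max_pcie_buses(region_size=ECAM_REGION_SIZE):
    """Number of PCIe buses whose config space fits in the ECAM window."""
    bytes_per_bus = DEVICES_PER_BUS * FUNCTIONS_PER_DEVICE * ECAM_BYTES_PER_FUNCTION
    return region_size // bytes_per_bus


if __name__ == "__main__":
    print("CPUs that fit:", max_cpus())
    print("PCIe buses that fit:", max_pcie_buses())
```

Under these assumed sizes the redistributor region holds exactly 123 CPUs, matching the limit in the message, and the ECAM window covers 16 buses.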
* Re: [Qemu-devel] QEMU virt board: extending various limits
From: Laszlo Ersek @ 2018-01-16 20:18 UTC
To: Peter Maydell
Cc: QEMU Developers, Andrew Jones, Ard Biesheuvel, Igor Mammedov, Wei Huang, Leif Lindholm (Linaro address)

(adding Ard, Igor, Wei, Leif)

On 01/16/18 16:07, Peter Maydell wrote:
> We've had discussions before about the various limits in the virt
> board imposed by its current address space layout:
>  * number of CPUs limited to 123 (not enough space for more redistributors)
>  * number of PCIe devices limited by size of ECAM space
>  * max memory size limits
>  * (anything else?)
>
> If we want to try to fix these this release cycle now would be a good
> point to figure out our approach so that we have plenty of time to do
> it in.
>
> (Relatedly, I notice patches on list for kvm that allow userspace to
> set the guest physical address size, which may affect how we want
> to do this.)
>
> I'm not going to have time to look at this but am happy to provide
> my opinions on whatever proposals other people would like to suggest.
>
> Probably the first thing to do is figure out whether we can
> raise these limits without having to have a flag day (ie just
> with changing the device tree we provide the guest), or if we
> really have a hard compat break here. We should also try to
> fix all these things at once rather than potentially breaking
> guests several times...

I've quite lost the context on this since we last talked about it. :) My
request would be that Drew and Igor please (re)state their preferences,
and Ard and myself should put "firmware price tags" on those ideas.

As far as I remember, the sticking point from last time was whether
guest UEFI remains permitted to rely on the RAM base being fixed at 1GB
(i.e. if UEFI is at liberty to ignore x0 on entry). This decision
provides a framework for all further area movements, and represents a
large difference in firmware difficulty.

(Personally I'd be ready to *accept* a consensus that UEFI should cope
with a dynamic x0 on entry -- I'm neither proposing nor arguing against
the notion. The large additional complexity in the firmware should be
clear up-front however -- it'll take more time, more bugs, more human
resources. My last writeup is at
<http://mid.mail-archive.com/4cce2b8b-a411-bd5d-a06f-b0b80a5fb2f1@redhat.com>,
although I think Ard has modified some of the code since, so parts of
that text are no longer up to date.)

Thanks,
Laszlo
* Re: [Qemu-devel] QEMU virt board: extending various limits
From: Ard Biesheuvel @ 2018-01-16 20:28 UTC
To: Laszlo Ersek
Cc: Peter Maydell, QEMU Developers, Andrew Jones, Igor Mammedov, Wei Huang, Leif Lindholm (Linaro address)

On 16 January 2018 at 20:18, Laszlo Ersek <lersek@redhat.com> wrote:
> (adding Ard, Igor, Wei, Leif)
>
> On 01/16/18 16:07, Peter Maydell wrote:
>> We've had discussions before about the various limits in the virt
>> board imposed by its current address space layout:
>>  * number of CPUs limited to 123 (not enough space for more redistributors)
>>  * number of PCIe devices limited by size of ECAM space
>>  * max memory size limits
>>  * (anything else?)
>>
>> If we want to try to fix these this release cycle now would be a good
>> point to figure out our approach so that we have plenty of time to do
>> it in.
>>
>> (Relatedly, I notice patches on list for kvm that allow userspace to
>> set the guest physical address size, which may affect how we want
>> to do this.)
>>
>> I'm not going to have time to look at this but am happy to provide
>> my opinions on whatever proposals other people would like to suggest.
>>
>> Probably the first thing to do is figure out whether we can
>> raise these limits without having to have a flag day (ie just
>> with changing the device tree we provide the guest), or if we
>> really have a hard compat break here. We should also try to
>> fix all these things at once rather than potentially breaking
>> guests several times...
>
> I've quite lost the context on this since we last talked about it. :) My
> request would be that Drew and Igor please (re)state their preferences,
> and Ard and myself should put "firmware price tags" on those ideas.
>
> As far as I remember, the sticking point from last time was whether
> guest UEFI remains permitted to rely on the RAM base being fixed at 1GB
> (i.e. if UEFI is at liberty to ignore x0 on entry). This decision
> provides a framework for all further area movements, and represents a
> large difference in firmware difficulty.
>
> (Personally I'd be ready to *accept* a consensus that UEFI should cope
> with a dynamic x0 on entry -- I'm neither proposing nor arguing against
> the notion. The large additional complexity in the firmware should be
> clear up-front however -- it'll take more time, more bugs, more human
> resources. My last writeup is at
> <http://mid.mail-archive.com/4cce2b8b-a411-bd5d-a06f-b0b80a5fb2f1@redhat.com>,
> although I think Ard has modified some of the code since, so parts of
> that text are no longer up to date.)
>

The 'contract' was 1 MB at 0x40000000 but UEFI never used more than
512 KB of that without checking the DT. With only very minor changes,
we could repurpose this range as 'non-secure SRAM', use it as
temporary PEI memory and use whatever the DT describes for DRAM, PCIe
etc.

For the firmware side, this would be a very natural fit with what the
code currently does, and with what many x86 and ARM bare metal
platforms do as well. Of course, I am clueless when it comes to the
QEMU side of these things, so perhaps this is a terrible idea.
* Re: [Qemu-devel] QEMU virt board: extending various limits
From: Igor Mammedov @ 2018-01-17 16:15 UTC
To: Ard Biesheuvel
Cc: Laszlo Ersek, Peter Maydell, QEMU Developers, Andrew Jones, Wei Huang, Leif Lindholm (Linaro address)

On Tue, 16 Jan 2018 20:28:49 +0000
Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:

> On 16 January 2018 at 20:18, Laszlo Ersek <lersek@redhat.com> wrote:
> > (adding Ard, Igor, Wei, Leif)
> >
> > On 01/16/18 16:07, Peter Maydell wrote:
> >> We've had discussions before about the various limits in the virt
> >> board imposed by its current address space layout:
> >>  * number of CPUs limited to 123 (not enough space for more redistributors)
> >>  * number of PCIe devices limited by size of ECAM space
> >>  * max memory size limits
> >>  * (anything else?)
> >>
> >> If we want to try to fix these this release cycle now would be a good
> >> point to figure out our approach so that we have plenty of time to do
> >> it in.
> >>
> >> (Relatedly, I notice patches on list for kvm that allow userspace to
> >> set the guest physical address size, which may affect how we want
> >> to do this.)
> >>
> >> I'm not going to have time to look at this but am happy to provide
> >> my opinions on whatever proposals other people would like to suggest.
> >>
> >> Probably the first thing to do is figure out whether we can
> >> raise these limits without having to have a flag day (ie just
> >> with changing the device tree we provide the guest), or if we
> >> really have a hard compat break here. We should also try to
> >> fix all these things at once rather than potentially breaking
> >> guests several times...
> >
> > I've quite lost the context on this since we last talked about it. :) My
> > request would be that Drew and Igor please (re)state their preferences,
> > and Ard and myself should put "firmware price tags" on those ideas.
> >
> > As far as I remember, the sticking point from last time was whether
> > guest UEFI remains permitted to rely on the RAM base being fixed at 1GB
> > (i.e. if UEFI is at liberty to ignore x0 on entry). This decision
> > provides a framework for all further area movements, and represents a
> > large difference in firmware difficulty.
> >
> > (Personally I'd be ready to *accept* a consensus that UEFI should cope
> > with a dynamic x0 on entry -- I'm neither proposing nor arguing against
> > the notion. The large additional complexity in the firmware should be
> > clear up-front however -- it'll take more time, more bugs, more human
> > resources. My last writeup is at
> > <http://mid.mail-archive.com/4cce2b8b-a411-bd5d-a06f-b0b80a5fb2f1@redhat.com>,
> > although I think Ard has modified some of the code since, so parts of
> > that text are no longer up to date.)
> >
>
> The 'contract' was 1 MB at 0x40000000 but UEFI never used more than
> 512 KB of that without checking the DT. With only very minor changes,
> we could repurpose this range as 'non-secure SRAM', use it as
> temporary PEI memory and use whatever the DT describes for DRAM, PCIe
> etc.
>
> For the firmware side, this would be a very natural fit with what the
> code currently does, and with what many x86 and ARM bare metal
> platforms do as well. Of course, I am clueless when it comes to the
> QEMU side of these things, so perhaps this is a terrible idea.

My idea was to drop the fixed RAM base for the virt board (at least for
new machine types), so that QEMU could specify it dynamically and
firmware would get the base from x0 instead of a compiled-in constant.

Though it's more complex for firmware, it should benefit both QEMU
and firmware in the long run:
 - we won't need lockstep updates if the base ever has to move
   in the future
 - no need to maintain/invent compat machinery to make sure that
   old/new firmware works fine on old/new QEMU
 - QEMU won't have to introduce a fragmented RAM layout and can keep
   RAM as one contiguous region, which is simpler to maintain and
   harder to break by accident.

So I think a dynamic memory base would be the better approach for
maintaining the zoo of QEMU and firmware versions, and it reduces the
chance of accidentally breaking some combo that used to work.
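To make the "firmware discovers the RAM base instead of hard-coding it" idea concrete, here is a minimal sketch of decoding a device tree `/memory` node's `reg` property into (base, size) pairs. It assumes `#address-cells = 2` and `#size-cells = 2`, and relies on device tree cells being big-endian 32-bit values. The function name and the example values are hypothetical; this is not code from QEMU or from any firmware.

```python
import struct


def decode_memory_reg(reg, address_cells=2, size_cells=2):
    """Decode a devicetree `reg` property (raw bytes) into (base, size) pairs.

    The cell counts are assumptions; real code would read #address-cells
    and #size-cells from the parent node.
    """
    cells_per_entry = address_cells + size_cells
    entry_bytes = 4 * cells_per_entry
    if len(reg) % entry_bytes != 0:
        raise ValueError("reg length is not a multiple of the entry size")

    ranges = []
    for offset in range(0, len(reg), entry_bytes):
        # Each cell is a big-endian unsigned 32-bit integer.
        cells = struct.unpack_from(">%dI" % cells_per_entry, reg, offset)
        base = 0
        for cell in cells[:address_cells]:
            base = (base << 32) | cell
        size = 0
        for cell in cells[address_cells:]:
            size = (size << 32) | cell
        ranges.append((base, size))
    return ranges


# Hypothetical example: a single 4 GiB RAM bank at the 1 GiB boundary.
example_reg = struct.pack(">4I", 0x0, 0x40000000, 0x1, 0x0)
ram_base, ram_size = decode_memory_reg(example_reg)[0]

if __name__ == "__main__":
    print("RAM base: 0x%x, size: 0x%x" % (ram_base, ram_size))
```

With a lookup like this the base becomes data supplied by QEMU at boot rather than a constant compiled into the firmware, which is the trade-off the message describes.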
* Re: [Qemu-devel] QEMU virt board: extending various limits
From: Peter Maydell @ 2018-01-17 16:18 UTC
To: Igor Mammedov
Cc: Ard Biesheuvel, Laszlo Ersek, QEMU Developers, Andrew Jones, Wei Huang, Leif Lindholm (Linaro address)

On 17 January 2018 at 16:15, Igor Mammedov <imammedo@redhat.com> wrote:
> my idea was to drop fixed RAM base for virt board (at least for
> new machine types) so that QEMU could specify it dynamically and
> firmware would get base from x0 instead of compiled in constant.

"base of ram is fixed" is about the one thing we've told
people they can rely on without fishing it out of the
device tree, so I think I'll just rule changing that out
of consideration now :-)

thanks
-- PMM
* Re: [Qemu-devel] QEMU virt board: extending various limits
From: Andrew Jones @ 2018-01-17 16:53 UTC
To: Peter Maydell
Cc: Igor Mammedov, Ard Biesheuvel, Laszlo Ersek, QEMU Developers, Wei Huang, Leif Lindholm (Linaro address)

On Wed, Jan 17, 2018 at 04:18:30PM +0000, Peter Maydell wrote:
> On 17 January 2018 at 16:15, Igor Mammedov <imammedo@redhat.com> wrote:
> > my idea was to drop fixed RAM base for virt board (at least for
> > new machine types) so that QEMU could specify it dynamically and
> > firmware would get base from x0 instead of compiled in constant.
>
> "base of ram is fixed" is about the one thing we've told
> people they can rely on without fishing it out of the
> device tree, so I think I'll just rule changing that out
> of consideration now :-)
>

So that leaves three choices:

1) New machine type that has a different or non-fixed RAM base

   (Makes the QEMU/AAVMF zoo even worse.)

2) Implement split memory where one chunk is guaranteed to be at
   the 1G boundary, e.g. 'size <= 1G' at 1G

   (The QEMU work will no doubt snowball, especially when considering
   memory hotplug. Although hotplug will likely warrant using DIMMs
   anyway, which means one 'size <= 1G' at 1G DIMM could be a
   non-removable, there-by-default DIMM, and other DIMM(s) would go
   elsewhere in order to implement the split memory.)

3) Leave memory like it is and just put everything else we want to expand
   in high memory, probably above the second PCIe window. I.e. CPU
   redistributor regions 124 and up and an additional PCIe ECAM space
   would go up there.

   (Easiest, most backward compatible thing to do. Is there risk with
   putting those things above 4G? Someday we may want to shift those
   things and the second PCIe window even higher, if we ever want to
   support more than 515 GB of memory, but I guess that shouldn't be a
   problem.)

Thanks,
drew
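Option 3 can be illustrated with a small sketch of how redistributors might be split between the existing low region and a new high region. The capacity of 123 comes from the thread; the per-redistributor size (two 64 KiB frames) is an assumption, and the function name is hypothetical. No high-region base address is chosen here, since the thread only says it would probably sit above the second PCIe window.

```python
LOW_REDIST_CAPACITY = 123            # CPUs that fit in the existing region (from the thread)
REDIST_BYTES_PER_CPU = 2 * 64 * 1024  # assumption: two 64 KiB frames per CPU


def split_redistributors(num_cpus):
    """Return (cpus_in_low_region, cpus_in_high_region, high_region_bytes)."""
    if num_cpus < 0:
        raise ValueError("num_cpus must be non-negative")
    low = min(num_cpus, LOW_REDIST_CAPACITY)
    high = num_cpus - low
    return low, high, high * REDIST_BYTES_PER_CPU


if __name__ == "__main__":
    for cpus in (8, 123, 124, 512):
        low, high, high_bytes = split_redistributors(cpus)
        print("%4d CPUs: %3d low, %3d high, high region needs 0x%x bytes"
              % (cpus, low, high, high_bytes))
```

Guests with 123 CPUs or fewer see no change in this scheme, which is why the message calls it the most backward compatible choice; only larger guests need to understand the additional region.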
* Re: [Qemu-devel] QEMU virt board: extending various limits
From: Igor Mammedov @ 2018-01-17 18:53 UTC
To: Andrew Jones
Cc: Peter Maydell, Ard Biesheuvel, Laszlo Ersek, QEMU Developers, Wei Huang, Leif Lindholm (Linaro address)

On Wed, 17 Jan 2018 17:53:48 +0100
Andrew Jones <drjones@redhat.com> wrote:

> On Wed, Jan 17, 2018 at 04:18:30PM +0000, Peter Maydell wrote:
> > On 17 January 2018 at 16:15, Igor Mammedov <imammedo@redhat.com> wrote:
> > > my idea was to drop fixed RAM base for virt board (at least for
> > > new machine types) so that QEMU could specify it dynamically and
> > > firmware would get base from x0 instead of compiled in constant.
> >
> > "base of ram is fixed" is about the one thing we've told
> > people they can rely on without fishing it out of the
> > device tree, so I think I'll just rule changing that out
> > of consideration now :-)
> >
>
> So that leaves three choices:
>
> 1) New machine type that has a different or non-fixed RAM base
>
>    (Makes the QEMU/AAVMF zoo even worse.)

Maybe that's the way to go: we can drop all the stuff we don't really
need for the virt use case, and new firmware would pick up the RAM base
from x0, so it would be able to work on both new and old (fixed base
put in x0) machine types. Guests that want to run on the new machine
type would have to be booted by new AAVMF or handle the dynamic RAM
base from x0 themselves.

How about virt-enterprise (64-bit only EFI OS support booted by AAVMF)?

> 2) Implement split memory where one chunk is guaranteed to be at
>    the 1G boundary, e.g. 'size <= 1G' at 1G
>
>    (The QEMU work will no doubt snowball, especially when considering
>    memory hotplug. Although hotplug will likely warrant using DIMMs
>    anyway, which means one 'size <= 1G' at 1G DIMM could be a non-removable,
>    there by default DIMM, and other DIMM(s) would go elsewhere in order to
>    implement the split memory.)

Look at the PC memory map and the bunch of tweaks that alter it: it's
hard to figure out whether a change to it would break something.
So if we can (i.e. we are not restricted by a spec), then we should go
for a flexible route that doesn't have design issues from the start.

> 3) Leave memory like it is and just put everything else we want to expand
>    in high memory, probably above the second PCIe window. I.e. CPU
>    redistributor regions 124 and up and an additional PCIe ECAM space
>    would go up there.
>
>    (Easiest, most backward compatible thing to do. Is there risk with putting
>    those things above 4G? Someday we may want to shift those things and the
>    second PCIe window even higher, if we ever want to support more than 515
>    GB of memory, but I guess that shouldn't be a problem.)

We can do that, but the platform is still so new that eventually we
might have to change the layout and deal with compat issues then. By
that point it will be too late to change direction (with existing
customers), so we would have to live with self-inflicted pain that
could be avoided if we made things flexible now.

>
> Thanks,
> drew