* gross qemu behavior
@ 2014-03-28 7:48 Jan Beulich
2014-03-28 9:21 ` Jan Beulich
` (2 more replies)
0 siblings, 3 replies; 23+ messages in thread
From: Jan Beulich @ 2014-03-28 7:48 UTC (permalink / raw)
To: anthony.perard, Stefano Stabellini; +Cc: xen-devel
Hi,
so while doing all that EPT work I naturally also happened to look more
closely at the EPT table dumps, spotting an odd range of 16 pages
outside any supposedly populated address range. This range only
exists when guest memory doesn't extend past (by default) 0xf0000000
(the start of MMIO, i.e. normally the frame buffer). After spending quite
a bit of time I finally figured that this must be a leftover of the Cirrus
VGA ROM, and I would have thought that this
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1976,6 +1976,9 @@ static int pci_add_option_rom(PCIDevice
     }

     pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
+    memory_region_add_subregion_overlap(pdev->bus->address_space_mem,
+                                        pdev->rom.ram_addr, &pdev->rom, 1);
+    memory_region_del_subregion(pdev->bus->address_space_mem, &pdev->rom);

     return 0;
 }
should fix it. It does appear to work as far as generic qemu is concerned,
but once looking at the Xen backend I had to conclude that this just
can't work: For one, xen_add_to_physmap() and
xen_remove_from_physmap() are _documented_ (in a comment) to
only be capable of a single region (VRAM). And the latter - even worse -
is implemented with a call to xc_domain_add_to_physmap(), completely
contrary to its name.
Instrumenting xen_region_{add,del}(), I can see that all regions get
properly reported to the Xen backend, just that it doesn't handle them
(this is with above patch in place):
xra(fee00000,100000)
xra(fec00000,1000)
xra(fed00000,400)
xra(80000000,10000)
xrd(80000000,10000)
xra(f0000000,800000)
xra(f1000000,400000)
xra(f2000000,1000000)
xra(f3010000,4000)
xra(f3014000,1000)
xra(f3015000,3000)
xra(f3018000,1000)
xra(f3000000,10000)
xrd(f3000000,10000)
xrd(f0000000,800000)
xra(f0000000,800000)
mapping vram to f0000000 - f0800000
Having wasted enough time getting to this point, I'd like to ask you
to advise a proper fix for this. We definitely shouldn't be leaving
stuff sitting at arbitrary positions in the physical address space of
the guest. And the fact that the range gets removed (from Xen's
perspective, but not from qemu's) when RAM extends beyond
0xf0000000 (due to it being replaced with what is actually
intended to be there) makes me wonder what would happen if the
ROM got enabled by the guest.
Jan
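The add/delete trace in the message above can be replayed mechanically to see which regions qemu still considers registered at the end. The following is only a minimal sketch for readers of the thread: the event table is transcribed from the quoted output, while `region_event`, `region` and `replay_trace` are names made up for this illustration and are not part of qemu or Xen. Note that the two entries that are added and then deleted again both have size 0x10000, i.e. 16 pages of 4 KiB, which is consistent with the 16-page range described above.

```c
#include <stddef.h>
#include <stdint.h>

/* One line of the instrumented trace: xra = add, xrd = delete. */
struct region_event {
    int is_add;
    uint64_t addr;
    uint64_t size;
};

/* A region that is still registered after replaying the trace. */
struct region {
    uint64_t addr;
    uint64_t size;
};

/* The trace from the message, transcribed in order. */
static const struct region_event trace[] = {
    {1, 0xfee00000, 0x100000},
    {1, 0xfec00000, 0x1000},
    {1, 0xfed00000, 0x400},
    {1, 0x80000000, 0x10000},
    {0, 0x80000000, 0x10000},
    {1, 0xf0000000, 0x800000},
    {1, 0xf1000000, 0x400000},
    {1, 0xf2000000, 0x1000000},
    {1, 0xf3010000, 0x4000},
    {1, 0xf3014000, 0x1000},
    {1, 0xf3015000, 0x3000},
    {1, 0xf3018000, 0x1000},
    {1, 0xf3000000, 0x10000},
    {0, 0xf3000000, 0x10000},
    {0, 0xf0000000, 0x800000},
    {1, 0xf0000000, 0x800000},
};
static const size_t trace_len = sizeof(trace) / sizeof(trace[0]);

/*
 * Replay the events into live[] (capacity max) and return how many
 * regions remain registered. A delete removes the entry with the same
 * address and size; adds beyond the capacity are silently dropped.
 */
size_t replay_trace(const struct region_event *ev, size_t n,
                    struct region *live, size_t max)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (ev[i].is_add) {
            if (count < max) {
                live[count].addr = ev[i].addr;
                live[count].size = ev[i].size;
                count++;
            }
            continue;
        }
        for (size_t j = 0; j < count; j++) {
            if (live[j].addr == ev[i].addr && live[j].size == ev[i].size) {
                live[j] = live[count - 1]; /* order is not preserved */
                count--;
                break;
            }
        }
    }
    return count;
}
```

Replaying the table leaves ten regions registered from qemu's point of view, none of them at 0x80000000 or 0xf3000000. The sketch says nothing about what Xen has in the guest physmap, which is exactly the mismatch the message complains about.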
* Re: gross qemu behavior
2014-03-28 7:48 gross qemu behavior Jan Beulich
@ 2014-03-28 9:21 ` Jan Beulich
2014-03-28 9:30 ` Fabio Fantoni
2014-03-28 17:46 ` Stefano Stabellini
2 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2014-03-28 9:21 UTC (permalink / raw)
To: anthony.perard, Stefano Stabellini; +Cc: xen-devel

>>> On 28.03.14 at 08:48, <JBeulich@suse.com> wrote:
> Having wasted enough time getting to this point, I'd like to ask you
> to advise a proper fix for this. We definitely shouldn't be leaving
> stuff sitting at arbitrary positions in the physical address space of
> the guest. And the fact that the range gets removed (from Xen's
> perspective, but not from qemu's) when RAM extends beyond
> 0xf0000000 (due to it being replaced with what is actually
> intended to be there) makes me wonder what would happen if the
> ROM got enabled by the guest.

Fixing of which would, afaict, also address the performance impacting
fact that the emulated MMIO ranges other than the frame buffer get
marked UC in the EPT tables if the domain has any passed through
devices (as then the call to xc_domain_pin_memory_cacheattr() would
get called for all such regions - care would of course need to be taken
to avoid calling it for MMIO regions of passed through devices).

And looking at the cache attribute pinning I see that this is broken
too: The hypervisor doesn't even expose a removal interface, and the
adding one doesn't check whether the new region already exists or
conflicts with already existing ones. What if the guest decided to
relocate the region a couple of times?

Jan
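To make the last complaint concrete, here is a minimal sketch of the bookkeeping a pinning interface would need in order to reject duplicates and conflicts and to support removal. It is not the hypervisor's code or interface: `range_table`, `rt_add`, `rt_remove` and the `RT_*` return codes are hypothetical names invented for this illustration, and ranges are assumed to be inclusive.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 16

/* Hypothetical return codes for this sketch. */
enum { RT_OK = 0, RT_EXISTS = -1, RT_OVERLAP = -2, RT_FULL = -3,
       RT_NOTFOUND = -4, RT_INVALID = -5 };

struct pinned_range {
    uint64_t start;   /* first frame or address, inclusive */
    uint64_t end;     /* last frame or address, inclusive */
    uint32_t type;    /* cache attribute, opaque here */
};

struct range_table {
    struct pinned_range r[MAX_RANGES];
    size_t n;
};

void rt_init(struct range_table *t)
{
    t->n = 0;
}

/* Add a range, refusing exact duplicates and any partial overlap. */
int rt_add(struct range_table *t, uint64_t start, uint64_t end, uint32_t type)
{
    if (start > end)
        return RT_INVALID;

    for (size_t i = 0; i < t->n; i++) {
        const struct pinned_range *p = &t->r[i];

        if (p->start == start && p->end == end)
            return RT_EXISTS;
        if (start <= p->end && p->start <= end)
            return RT_OVERLAP;
    }
    if (t->n == MAX_RANGES)
        return RT_FULL;

    t->r[t->n].start = start;
    t->r[t->n].end = end;
    t->r[t->n].type = type;
    t->n++;
    return RT_OK;
}

/* Remove a previously added range; only exact matches are removed. */
int rt_remove(struct range_table *t, uint64_t start, uint64_t end)
{
    for (size_t i = 0; i < t->n; i++) {
        if (t->r[i].start == start && t->r[i].end == end) {
            t->r[i] = t->r[t->n - 1];
            t->n--;
            return RT_OK;
        }
    }
    return RT_NOTFOUND;
}
```

With a table like this, a guest relocating a region several times would translate into a remove of the old range followed by an add of the new one, so stale entries would not accumulate.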
* Re: gross qemu behavior
2014-03-28 7:48 gross qemu behavior Jan Beulich
2014-03-28 9:21 ` Jan Beulich
@ 2014-03-28 9:30 ` Fabio Fantoni
2014-03-28 10:37 ` Jan Beulich
2014-03-28 17:46 ` Stefano Stabellini
2 siblings, 1 reply; 23+ messages in thread
From: Fabio Fantoni @ 2014-03-28 9:30 UTC (permalink / raw)
To: Jan Beulich, anthony.perard, Stefano Stabellini; +Cc: xen-devel

On 28/03/2014 08:48, Jan Beulich wrote:
> Hi,
>
> so while doing all that EPT work I naturally also happened to look more
> closely at the EPT table dumps, spotting an odd range of 16 pages
> outside any supposedly populated address range. This range only
> exists when guest memory doesn't extend past (by default) 0xf0000000
> (the start of MMIO, i.e. normally the frame buffer). After spending quite
> a bit of time I finally figured that this must be a left over of the Cirrus
> VGA ROM, and I would have thought that this
>
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1976,6 +1976,9 @@ static int pci_add_option_rom(PCIDevice
>      }
>
>      pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
> +    memory_region_add_subregion_overlap(pdev->bus->address_space_mem,
> +                                        pdev->rom.ram_addr, &pdev->rom, 1);
> +    memory_region_del_subregion(pdev->bus->address_space_mem, &pdev->rom);
>
>      return 0;
>  }
>
> should fix it. It does appear to work as far generic qemu is concerned,
> but once looking at the Xen backend I had to conclude that this just
> can't work: For one, xen_add_to_physmap() and
> xen_remove_from_physmap() are _documented_ (in a comment) to
> only be capable of a single region (VRAM). And the latter - even worse -
> is implemented with a call to xc_domain_add_to_physmap(), completely
> contrary to its name.
>
> Instrumenting xen_region_{add,del}(), I can see that all regions get
> properly reported to the Xen backend, just that it doesn't handle them
> (this is with above patch in place):
>
> xra(fee00000,100000)
> xra(fec00000,1000)
> xra(fed00000,400)
> xra(80000000,10000)
> xrd(80000000,10000)
> xra(f0000000,800000)
> xra(f1000000,400000)
> xra(f2000000,1000000)
> xra(f3010000,4000)
> xra(f3014000,1000)
> xra(f3015000,3000)
> xra(f3018000,1000)
> xra(f3000000,10000)
> xrd(f3000000,10000)
> xrd(f0000000,800000)
> xra(f0000000,800000)
> mapping vram to f0000000 - f0800000
>
> Having wasted enough time getting to this point, I'd like to ask you
> to advise a proper fix for this. We definitely shouldn't be leaving
> stuff sitting at arbitrary positions in the physical address space of
> the guest. And the fact that the range gets removed (from Xen's
> perspective, but not from qemu's) when RAM extends beyond
> 0xf0000000 (due to it being replaced with what is actually
> intended to be there) makes me wonder what would happen if the
> ROM got enabled by the guest.

Thanks for your work.
I do not know enough about these things to help you solve it, unfortunately.
It seems to me, however, that this problem may be the actual cause (or
at least one of the causes) that also blocks the correct allocation of
all qxl memory regions, and perhaps also the setting up of more RAM for
stdvga, which apparently does not work even though no errors appear.
Can you tell me whether this is correct or am I wrong?
If it is correct, please put me in Cc on future mails and/or patches and
I will test them with qxl and any other features that they affect.

Thanks for any reply and sorry for my bad English.

>
> Jan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
* Re: gross qemu behavior
2014-03-28 9:30 ` Fabio Fantoni
@ 2014-03-28 10:37 ` Jan Beulich
0 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2014-03-28 10:37 UTC (permalink / raw)
To: Fabio Fantoni; +Cc: anthony.perard, xen-devel, Stefano Stabellini

>>> On 28.03.14 at 10:30, <fabio.fantoni@m2r.biz> wrote:
> Thanks for your work.
> I do not know enough about these things to help you solve it, unfortunately.
> It seems to me, however, that this problem may be the actual cause (or
> at least one of the causes) that also blocks the correct allocation of
> all qxl memory regions, and perhaps also the setting up of more RAM for
> stdvga, which apparently does not work even though no errors appear.
> Can you tell me whether this is correct or am I wrong?

I don't know for sure, but it's certainly possible.

Jan
* Re: gross qemu behavior
2014-03-28 7:48 gross qemu behavior Jan Beulich
2014-03-28 9:21 ` Jan Beulich
2014-03-28 9:30 ` Fabio Fantoni
@ 2014-03-28 17:46 ` Stefano Stabellini
2014-03-28 17:52 ` Stefano Stabellini
2014-03-31 9:07 ` Jan Beulich
2 siblings, 2 replies; 23+ messages in thread
From: Stefano Stabellini @ 2014-03-28 17:46 UTC (permalink / raw)
To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

CC'ing Paolo, hoping that he has a better idea on how to solve this
problem.

On Fri, 28 Mar 2014, Jan Beulich wrote:
> Hi,
>
> so while doing all that EPT work I naturally also happened to look more
> closely at the EPT table dumps, spotting an odd range of 16 pages
> outside any supposedly populated address range. This range only
> exists when guest memory doesn't extend past (by default) 0xf0000000
> (the start of MMIO, i.e. normally the frame buffer). After spending quite
> a bit of time I finally figured that this must be a left over of the Cirrus
> VGA ROM, and I would have thought that this
>
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1976,6 +1976,9 @@ static int pci_add_option_rom(PCIDevice
>      }
>
>      pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
> +    memory_region_add_subregion_overlap(pdev->bus->address_space_mem,
> +                                        pdev->rom.ram_addr, &pdev->rom, 1);
> +    memory_region_del_subregion(pdev->bus->address_space_mem, &pdev->rom);
>
>      return 0;
>  }
>
> should fix it. It does appear to work as far generic qemu is concerned,
> but once looking at the Xen backend I had to conclude that this just
> can't work: For one, xen_add_to_physmap() and
> xen_remove_from_physmap() are _documented_ (in a comment) to
> only be capable of a single region (VRAM). And the latter - even worse -
> is implemented with a call to xc_domain_add_to_physmap(), completely
> contrary to its name.

xen_add_to_physmap and xen_remove_from_physmap are just to deal with the
VRAM in their current implementation.

> Instrumenting xen_region_{add,del}(), I can see that all regions get
> properly reported to the Xen backend, just that it doesn't handle them
> (this is with above patch in place):
>
> xra(fee00000,100000)
> xra(fec00000,1000)
> xra(fed00000,400)
> xra(80000000,10000)
> xrd(80000000,10000)
> xra(f0000000,800000)
> xra(f1000000,400000)
> xra(f2000000,1000000)
> xra(f3010000,4000)
> xra(f3014000,1000)
> xra(f3015000,3000)
> xra(f3018000,1000)
> xra(f3000000,10000)
> xrd(f3000000,10000)
> xrd(f0000000,800000)
> xra(f0000000,800000)
> mapping vram to f0000000 - f0800000
>
> Having wasted enough time getting to this point, I'd like to ask you
> to advise a proper fix for this. We definitely shouldn't be leaving
> stuff sitting at arbitrary positions in the physical address space of
> the guest. And the fact that the range gets removed (from Xen's
> perspective, but not from qemu's) when RAM extends beyond
> 0xf0000000 (due to it being replaced with what is actually
> intended to be there) makes me wonder what would happen if the
> ROM got enabled by the guest.

This is a thorny issue, fixing this behavior is not going to be trivial:

- The hypervisor/libxc does not currently expose a
  xc_domain_remove_from_physmap function.

- QEMU works by allocating memory regions at the end of the guest
  physmap and then moving them at the right place.

- QEMU can destroy a memory region and in that case we could free the
  memory and remove it from the physmap, however that is NOT what QEMU
  does with the vga ROM. In that case it calls
  memory_region_del_subregion, so we can't be sure that the ROM won't be
  mapped again, therefore we cannot free it. We need to move it
  somewhere else, hence the problem.

But fortunately we don't actually need to add the VGA ROM to the guest
physmap for it to work, QEMU can trap and emulate. In fact even today we
are not mapping it at the right place anyway, see xen_set_memory:

    if (add) {
        if (!memory_region_is_rom(section->mr)) {
            xen_add_to_physmap(state, start_addr, size,
                               section->mr, section->offset_within_region);
        } else {

So the only solution I can see right now is:

- avoid allocating guest memory for the VGA ROM
  That means that at the beginning of xen_ram_alloc we need to realize
  that the memory region we are dealing with is the VGA ROM memory region
  and avoid calling xc_domain_populate_physmap_exact for it.

- call g_malloc instead
  Simply use g_malloc to allocate QEMU memory for the VGA ROM,
  keep track of the allocation in a data structure internal to xen-all.c.

- make sure that qemu_get_ram_ptr can deal with the different allocation
  Now that the VGA ROM is QEMU memory, we need to make sure that
  qemu_get_ram_ptr returns the right pointer for it.

This is all very fiddly and hackish, but I can't see a better way of
solving the issue.
* Re: gross qemu behavior
2014-03-28 17:46 ` Stefano Stabellini
@ 2014-03-28 17:52 ` Stefano Stabellini
2014-03-28 18:01 ` Paolo Bonzini
2014-03-31 9:07 ` Jan Beulich
1 sibling, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-03-28 17:52 UTC (permalink / raw)
To: Stefano Stabellini
Cc: anthony.perard, xen-devel, Stefano Stabellini, Jan Beulich, Paolo Bonzini

On Fri, 28 Mar 2014, Stefano Stabellini wrote:
> CC'ing Paolo, hoping that he has a better idea on how to solve this
> problem.
>
>
> On Fri, 28 Mar 2014, Jan Beulich wrote:
> > Hi,
> >
> > so while doing all that EPT work I naturally also happened to look more
> > closely at the EPT table dumps, spotting an odd range of 16 pages
> > outside any supposedly populated address range. This range only
> > exists when guest memory doesn't extend past (by default) 0xf0000000
> > (the start of MMIO, i.e. normally the frame buffer). After spending quite
> > a bit of time I finally figured that this must be a left over of the Cirrus
> > VGA ROM, and I would have thought that this
> >
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -1976,6 +1976,9 @@ static int pci_add_option_rom(PCIDevice
> >      }
> >
> >      pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
> > +    memory_region_add_subregion_overlap(pdev->bus->address_space_mem,
> > +                                        pdev->rom.ram_addr, &pdev->rom, 1);
> > +    memory_region_del_subregion(pdev->bus->address_space_mem, &pdev->rom);
> >
> >      return 0;
> >  }
> >
> > should fix it. It does appear to work as far generic qemu is concerned,
> > but once looking at the Xen backend I had to conclude that this just
> > can't work: For one, xen_add_to_physmap() and
> > xen_remove_from_physmap() are _documented_ (in a comment) to
> > only be capable of a single region (VRAM). And the latter - even worse -
> > is implemented with a call to xc_domain_add_to_physmap(), completely
> > contrary to its name.
>
> xen_add_to_physmap and xen_remove_from_physmap are just to deal with the
> VRAM in their current implementation.
>
>
> > Instrumenting xen_region_{add,del}(), I can see that all regions get
> > properly reported to the Xen backend, just that it doesn't handle them
> > (this is with above patch in place):
> >
> > xra(fee00000,100000)
> > xra(fec00000,1000)
> > xra(fed00000,400)
> > xra(80000000,10000)
> > xrd(80000000,10000)
> > xra(f0000000,800000)
> > xra(f1000000,400000)
> > xra(f2000000,1000000)
> > xra(f3010000,4000)
> > xra(f3014000,1000)
> > xra(f3015000,3000)
> > xra(f3018000,1000)
> > xra(f3000000,10000)
> > xrd(f3000000,10000)
> > xrd(f0000000,800000)
> > xra(f0000000,800000)
> > mapping vram to f0000000 - f0800000
> >
> > Having wasted enough time getting to this point, I'd like to ask you
> > to advise a proper fix for this. We definitely shouldn't be leaving
> > stuff sitting at arbitrary positions in the physical address space of
> > the guest. And the fact that the range gets removed (from Xen's
> > perspective, but not from qemu's) when RAM extends beyond
> > 0xf0000000 (due to it being replaced with what is actually
> > intended to be there) makes me wonder what would happen if the
> > ROM got enabled by the guest.
>
> This is a thorny issue, fixing this behavior is not going to be trivial:
>
> - The hypervisor/libxc does not currently expose a
>   xc_domain_remove_from_physmap function.
>
> - QEMU works by allocating memory regions at the end of the guest
>   physmap and then moving them at the right place.
>
> - QEMU can destroy a memory region and in that case we could free the
>   memory and remove it from the physmap, however that is NOT what QEMU
>   does with the vga ROM. In that case it calls
>   memory_region_del_subregion, so we can't be sure that the ROM won't be
>   mapped again, therefore we cannot free it. We need to move it
>   somewhere else, hence the problem.
>
>
> But fortunately we don't actually need to add the VGA ROM to the guest
> physmap for it to work, QEMU can trap and emulate. In fact even today we
> are not mapping it at the right place anyway, see xen_set_memory:
>
>     if (add) {
>         if (!memory_region_is_rom(section->mr)) {
>             xen_add_to_physmap(state, start_addr, size,
>                                section->mr, section->offset_within_region);
>         } else {
>
>
> So the only solution I can see right now is:
>
> - avoid allocating guest memory for the VGA ROM
>   That means that at the beginning of xen_ram_alloc we need to realize
>   that the memory region we are dealing with is the VGA ROM memory region
>   and avoid calling xc_domain_populate_physmap_exact for it.
>
> - call g_malloc instead
>   Simply use g_malloc to allocate QEMU memory for the VGA ROM,
>   keep track of the allocation in a data structure internal to xen-all.c.
>
> - make sure that qemu_get_ram_ptr can deal with the different allocation
>   Now that the VGA ROM is QEMU memory, we need to make sure that
>   qemu_get_ram_ptr returns the right pointer for it.
>
>
> This is all very fiddly and hackish, but I can't see a better way of
> solving the issue.

Given that I feel that the explanation is not very clear, I am appending
a proof of concept patch. It is obviously horrible, I am by no means
suggesting it should be applied.

diff --git a/exec.c b/exec.c
index 91513c6..bdecc70 100644
--- a/exec.c
+++ b/exec.c
@@ -1453,6 +1453,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
    It should not be used for general purpose DMA.
    Use cpu_physical_memory_map/cpu_physical_memory_rw instead.
  */
+extern uint8_t* vga_rom;
 void *qemu_get_ram_ptr(ram_addr_t addr)
 {
     RAMBlock *block = qemu_get_ram_block(addr);
@@ -1462,7 +1463,9 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
          * because we don't want to map the entire memory in QEMU.
          * In that case just map until the end of the page.
          */
-        if (block->offset == 0) {
+        if (!strcmp(block->mr->name,"cirrus_vga.rom")) {
+            return vga_rom;
+        } else if (block->offset == 0) {
             return xen_map_cache(addr, 0, 0);
         } else if (block->host == NULL) {
             block->host =
diff --git a/xen-all.c b/xen-all.c
index ba34739..6211946 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -101,6 +101,8 @@ typedef struct XenIOState {
     Notifier wakeup;
 } XenIOState;

+uint8_t* vga_rom;
+
 /* Xen specific function for piix pci */

 int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
@@ -217,6 +219,11 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
         return;
     }

+    if (!strcmp(mr->name,"cirrus_vga.rom")) {
+        vga_rom = g_malloc(size);
+        return;
+    }
+
     trace_xen_ram_alloc(ram_addr, size);

     nr_pfn = size >> TARGET_PAGE_BITS;
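The proof of concept above uses a single global pointer, whereas the earlier description mentions keeping track of the allocation in a data structure internal to xen-all.c. As a rough illustration of what such bookkeeping might look like, here is a standalone sketch of a small registry keyed by region name. Everything in it is an assumption made for the example: `host_region_alloc` and `host_region_lookup` are hypothetical helpers, the fixed table size is arbitrary, and plain `calloc` stands in for `g_malloc` so that the snippet has no GLib dependency.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HOST_REGIONS 8
#define MAX_NAME_LEN     64

/* One region backed by emulator memory rather than guest memory. */
struct host_region {
    char name[MAX_NAME_LEN];
    uint8_t *host;
    size_t size;
};

static struct host_region host_regions[MAX_HOST_REGIONS];
static size_t nr_host_regions;

/* Return the host buffer registered under name, or NULL if none. */
uint8_t *host_region_lookup(const char *name, size_t *size_out)
{
    for (size_t i = 0; i < nr_host_regions; i++) {
        if (strcmp(host_regions[i].name, name) == 0) {
            if (size_out) {
                *size_out = host_regions[i].size;
            }
            return host_regions[i].host;
        }
    }
    return NULL;
}

/*
 * Allocate zeroed host memory for a named region and remember it.
 * Calling this again with the same name returns the existing buffer.
 */
uint8_t *host_region_alloc(const char *name, size_t size)
{
    uint8_t *host = host_region_lookup(name, NULL);

    if (host) {
        return host;
    }
    if (nr_host_regions == MAX_HOST_REGIONS ||
        strlen(name) >= MAX_NAME_LEN || size == 0) {
        return NULL;
    }
    host = calloc(1, size);
    if (!host) {
        return NULL;
    }
    strcpy(host_regions[nr_host_regions].name, name);
    host_regions[nr_host_regions].host = host;
    host_regions[nr_host_regions].size = size;
    nr_host_regions++;
    return host;
}
```

In terms of the patch, the allocation path would call something like `host_region_alloc()` where it currently assigns `vga_rom`, and the pointer lookup path would consult `host_region_lookup()` before falling back to the map cache. Matching on the region name remains as fragile here as it is in the proof of concept.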
* Re: gross qemu behavior
2014-03-28 17:52 ` Stefano Stabellini
@ 2014-03-28 18:01 ` Paolo Bonzini
2014-03-28 18:30 ` Stefano Stabellini
0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2014-03-28 18:01 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Jan Beulich

On 28/03/2014 18:52, Stefano Stabellini wrote:
>> This is a thorny issue, fixing this behavior is not going to be trivial:
>>
>> - The hypervisor/libxc does not currently expose a
>>   xc_domain_remove_from_physmap function.
>>
>> - QEMU works by allocating memory regions at the end of the guest
>>   physmap and then moving them at the right place.
>>
>> - QEMU can destroy a memory region and in that case we could free the
>>   memory and remove it from the physmap, however that is NOT what QEMU
>>   does with the vga ROM. In that case it calls
>>   memory_region_del_subregion, so we can't be sure that the ROM won't be
>>   mapped again, therefore we cannot free it. We need to move it
>>   somewhere else, hence the problem.

Right; QEMU cannot know either if the ROM will be mapped again (examples
include "cd /sys/bus/pci/devices/0000:0:03.0 && echo 1 > rom && cat rom"
or a warm reset).

>> But fortunately we don't actually need to add the VGA ROM to the guest
>> physmap for it to work, QEMU can trap and emulate. In fact even today we
>> are not mapping it at the right place anyway, see xen_set_memory:

But how can you execute from the VGA ROM then? Also, how do you migrate
its contents? And how is VGA different from say an iPXE ROM?

It would be nice if QEMU could just special case pc.ram (which has
block->offset == 0), and use the normal method to allocate other RAM
regions. But I'm afraid that would require some changes in the Xen
toolstack as well (for migration, for example) and I'm not sure how you
could execute from PCI ROM BARs.

Paolo
* Re: gross qemu behavior
2014-03-28 18:01 ` Paolo Bonzini
@ 2014-03-28 18:30 ` Stefano Stabellini
2014-03-29 7:31 ` Paolo Bonzini
0 siblings, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-03-28 18:30 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: anthony.perard, xen-devel, Jan Beulich, Stefano Stabellini

On Fri, 28 Mar 2014, Paolo Bonzini wrote:
> On 28/03/2014 18:52, Stefano Stabellini wrote:
> > > This is a thorny issue, fixing this behavior is not going to be trivial:
> > >
> > > - The hypervisor/libxc does not currently expose a
> > >   xc_domain_remove_from_physmap function.
> > >
> > > - QEMU works by allocating memory regions at the end of the guest
> > >   physmap and then moving them at the right place.
> > >
> > > - QEMU can destroy a memory region and in that case we could free the
> > >   memory and remove it from the physmap, however that is NOT what QEMU
> > >   does with the vga ROM. In that case it calls
> > >   memory_region_del_subregion, so we can't be sure that the ROM won't be
> > >   mapped again, therefore we cannot free it. We need to move it
> > >   somewhere else, hence the problem.
>
> Right; QEMU cannot know either if the ROM will be mapped again (examples
> include "cd /sys/bus/pci/devices/0000:0:03.0 && echo 1 > rom && cat rom" or a
> warm reset).
>
> > > But fortunately we don't actually need to add the VGA ROM to the guest
> > > physmap for it to work, QEMU can trap and emulate. In fact even today we
> > > are not mapping it at the right place anyway, see xen_set_memory:
>
> But how can you execute from the VGA ROM then?

I don't know, I guess we don't?
In that case why does it work today?

> Also, how do you migrate its contents?

That would also not work. We would have to re-initialize it in QEMU on
the receiving end.

> And how is VGA different from say an iPXE ROM?

iPXE is read into memory by hvmloader.

> It would be nice if QEMU could just special case pc.ram (which has
> block->offset == 0), and use the normal method to allocate other RAM regions.
> But I'm afraid that would require some changes in the Xen toolstack as well
> (for migration, for example) and I'm not sure how you could execute from PCI
> ROM BARs.
>
> Paolo
>
* Re: gross qemu behavior
2014-03-28 18:30 ` Stefano Stabellini
@ 2014-03-29 7:31 ` Paolo Bonzini
2014-03-30 7:57 ` Fabio Fantoni
0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2014-03-29 7:31 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Jan Beulich

On 28/03/2014 19:30, Stefano Stabellini wrote:
> On Fri, 28 Mar 2014, Paolo Bonzini wrote:
>> On 28/03/2014 18:52, Stefano Stabellini wrote:
>>>> This is a thorny issue, fixing this behavior is not going to be trivial:
>>>>
>>>> - The hypervisor/libxc does not currently expose a
>>>>   xc_domain_remove_from_physmap function.
>>>>
>>>> - QEMU works by allocating memory regions at the end of the guest
>>>>   physmap and then moving them at the right place.
>>>>
>>>> - QEMU can destroy a memory region and in that case we could free the
>>>>   memory and remove it from the physmap, however that is NOT what QEMU
>>>>   does with the vga ROM. In that case it calls
>>>>   memory_region_del_subregion, so we can't be sure that the ROM won't be
>>>>   mapped again, therefore we cannot free it. We need to move it
>>>>   somewhere else, hence the problem.
>>
>> Right; QEMU cannot know either if the ROM will be mapped again (examples
>> include "cd /sys/bus/pci/devices/0000:0:03.0 && echo 1 > rom && cat rom" or a
>> warm reset).
>>
>>>> But fortunately we don't actually need to add the VGA ROM to the guest
>>>> physmap for it to work, QEMU can trap and emulate. In fact even today we
>>>> are not mapping it at the right place anyway, see xen_set_memory:
>>
>> But how can you execute from the VGA ROM then?
>
> I don't know, I guess we don't? In that case why does it work today?

Right, the ROM is copied down to low memory by firmware (hvmloader?).

>> Also, how do you migrate its contents?
>
> That would also not work. We would have to re-initialize it in QEMU on
> the receiving end.

That is problematic. It would mean that a system reset after migration
may auto-upgrade some parts of the firmware.

Paolo
* Re: gross qemu behavior
2014-03-29 7:31 ` Paolo Bonzini
@ 2014-03-30 7:57 ` Fabio Fantoni
0 siblings, 0 replies; 23+ messages in thread
From: Fabio Fantoni @ 2014-03-30 7:57 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Anthony PERARD, xen-devel, Jan Beulich, Stefano Stabellini

2014-03-29 8:31 GMT+01:00 Paolo Bonzini <pbonzini@redhat.com>:
> On 28/03/2014 19:30, Stefano Stabellini wrote:
>> On Fri, 28 Mar 2014, Paolo Bonzini wrote:
>>> On 28/03/2014 18:52, Stefano Stabellini wrote:
>>>>> This is a thorny issue, fixing this behavior is not going to be trivial:
>>>>>
>>>>> - The hypervisor/libxc does not currently expose a
>>>>>   xc_domain_remove_from_physmap function.
>>>>>
>>>>> - QEMU works by allocating memory regions at the end of the guest
>>>>>   physmap and then moving them at the right place.
>>>>>
>>>>> - QEMU can destroy a memory region and in that case we could free the
>>>>>   memory and remove it from the physmap, however that is NOT what QEMU
>>>>>   does with the vga ROM. In that case it calls
>>>>>   memory_region_del_subregion, so we can't be sure that the ROM won't be
>>>>>   mapped again, therefore we cannot free it. We need to move it
>>>>>   somewhere else, hence the problem.
>>>
>>> Right; QEMU cannot know either if the ROM will be mapped again (examples
>>> include "cd /sys/bus/pci/devices/0000:0:03.0 && echo 1 > rom && cat rom" or a
>>> warm reset).
>>>
>>>>> But fortunately we don't actually need to add the VGA ROM to the guest
>>>>> physmap for it to work, QEMU can trap and emulate. In fact even today we
>>>>> are not mapping it at the right place anyway, see xen_set_memory:
>>>
>>> But how can you execute from the VGA ROM then?
>>
>> I don't know, I guess we don't? In that case why does it work today?
>
> Right, the ROM is copied down to low memory by firmware (hvmloader?).

Only the vgabios and the other ROMs of qemu traditional are included in
and loaded by hvmloader.
Some time ago, when I was trying to solve some problems with the emulated
VGAs, I came to suspect that the vgabios of qemu upstream was not being
loaded or used correctly. Someone had told me that it is loaded
automatically by qemu when qemu upstream is used.
Unfortunately I do not have enough knowledge and am not able to pinpoint
the exact problems or the things missing in Xen that would solve the
problems with the emulated VGAs.
I did a lot of tests: comparing against KVM with the same qemu parameters
used with Xen almost always showed higher video performance on KVM, and
qxl was not working on Xen while showing too few errors/details in the
logs, which I posted long ago, unfortunately without answers.
From what little I knew, it seemed to me that not all of the RAM, or one
or more regions, was allocated or used correctly (there were memory
errors in the logs), and/or that the vgabios was not being loaded or used
properly.
It seems that in this thread you are probably trying to solve problems
that include the ones I found.
The last mail about my qxl tests, for example, is this:
http://lists.xen.org/archives/html/xen-devel/2013-12/msg00758.html
And the memory error in the domU logs of this test was:
ioremap error for 0xfc001000-0xfc002000, requested 0x10, got 0x0
There was also another test, maybe 2 years ago, whose data made me doubt
the proper loading or use of the stdvga vgabios with Xen and qemu
upstream, but unfortunately I cannot find it now.
I will try to help with tests and post results/details if I can. For
example, some posts ago I saw Stabellini's patch, which seems to be about
the loading of the vgabios and other ROMs; should it be tested?

Thanks for any reply and sorry for my bad English.

>>> Also, how do you migrate its contents?
>>
>> That would also not work. We would have to re-initialize it in QEMU on
>> the receiving end.
>
> That is problematic. It would mean that a system reset after migration
> may auto-upgrade some parts of the firmware.
>
> Paolo
* Re: gross qemu behavior
2014-03-28 17:46 ` Stefano Stabellini
2014-03-28 17:52 ` Stefano Stabellini
@ 2014-03-31 9:07 ` Jan Beulich
2014-04-03 16:12 ` Stefano Stabellini
1 sibling, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2014-03-31 9:07 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Paolo Bonzini

>>> On 28.03.14 at 18:46, <stefano.stabellini@eu.citrix.com> wrote:
> But fortunately we don't actually need to add the VGA ROM to the guest
> physmap for it to work,

Is that true even when the ROM gets enabled by the guest?

> QEMU can trap and emulate. In fact even today we
> are not mapping it at the right place anyway, see xen_set_memory:
>
>     if (add) {
>         if (!memory_region_is_rom(section->mr)) {
>             xen_add_to_physmap(state, start_addr, size,
>                                section->mr, section->offset_within_region);
>         } else {

Right - that's part of the problem. And it would seem to be better to
map it where it belongs (even if not enabled) than to have it sit at
some arbitrary place. But as that still wouldn't be correct, I'd clearly
prefer a proper solution.

> So the only solution I can see right now is:
>
> - avoid allocating guest memory for the VGA ROM
> That means that at the beginning of xen_ram_alloc we need to realize
> that the memory region we are dealing with is the VGA ROM memory region
> and avoid calling xc_domain_populate_physmap_exact for it.
>
> - call g_malloc instead
> Simply use g_malloc to allocate QEMU memory for the VGA ROM,
> keep track of the allocation in a data structure internal to xen-all.c.
>
> - make sure that qemu_get_ram_ptr can deal with the different allocation
> Now that the VGA ROM is QEMU memory, we need to make sure that
> qemu_get_ram_ptr returns the right pointer for it.
>
>
> This is all very fiddly and hackish, but I can't see a better way of
> solving the issue.

Plus this all reads very VGA-special-casing to me, yet a proper model
would universally cover all PCI ROMs (emulated as well as passed
through).

Jan

^ permalink raw reply [flat|nested] 23+ messages in thread
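For illustration only, the three-step workaround quoted in the message
above (skip the guest allocation, allocate in QEMU instead, and make the
pointer lookup aware of it) could be sketched in plain C roughly as
follows. This is a minimal sketch, not code from QEMU or xen-all.c: the
names host_rom_alloc() and host_rom_ptr(), the fixed-size table, and the
use of calloc() in place of g_malloc() are all assumptions made so the
example is self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical side table recording which ram_addr ranges are backed by
 * host (QEMU) memory rather than by guest memory. Illustrative only.
 */
#define MAX_HOST_ROMS 8

struct host_rom {
    uint64_t ram_addr;  /* start offset in the ram_addr space */
    uint64_t size;      /* length in bytes */
    void *host;         /* host buffer backing the ROM contents */
};

static struct host_rom host_roms[MAX_HOST_ROMS];
static unsigned int nr_host_roms;

/*
 * Steps 1 and 2: instead of populating guest memory for the ROM,
 * allocate a zeroed host buffer and remember it in the side table.
 * Returns NULL if the table is full, size is 0, or allocation fails.
 */
void *host_rom_alloc(uint64_t ram_addr, uint64_t size)
{
    struct host_rom *r;

    if (nr_host_roms == MAX_HOST_ROMS || size == 0)
        return NULL;

    r = &host_roms[nr_host_roms];
    r->host = calloc(1, (size_t)size);
    if (!r->host)
        return NULL;

    r->ram_addr = ram_addr;
    r->size = size;
    nr_host_roms++;
    return r->host;
}

/*
 * Step 3: translate a ram_addr into a host pointer if it falls inside a
 * tracked ROM; return NULL so the caller can fall back to the normal
 * guest-memory mapping path otherwise.
 */
void *host_rom_ptr(uint64_t ram_addr)
{
    for (unsigned int i = 0; i < nr_host_roms; i++) {
        const struct host_rom *r = &host_roms[i];

        if (ram_addr >= r->ram_addr && ram_addr - r->ram_addr < r->size)
            return (uint8_t *)r->host + (ram_addr - r->ram_addr);
    }
    return NULL;
}
```

The open question raised in the thread is unaffected by such a table:
the allocation path still has to recognise that the region being
allocated is a PCI ROM in the first place.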
* Re: gross qemu behavior
2014-03-31 9:07 ` Jan Beulich
@ 2014-04-03 16:12 ` Stefano Stabellini
2014-04-04 6:45 ` Jan Beulich
0 siblings, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-04-03 16:12 UTC (permalink / raw)
To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

On Mon, 31 Mar 2014, Jan Beulich wrote:
> >>> On 28.03.14 at 18:46, <stefano.stabellini@eu.citrix.com> wrote:
> > But fortunately we don't actually need to add the VGA ROM to the guest
> > physmap for it to work,
>
> Is that true even when the ROM gets enabled by the guest?

Yes, I think so.

> > QEMU can trap and emulate. In fact even today we
> > are not mapping it at the right place anyway, see xen_set_memory:
> >
> >     if (add) {
> >         if (!memory_region_is_rom(section->mr)) {
> >             xen_add_to_physmap(state, start_addr, size,
> >                                section->mr, section->offset_within_region);
> >         } else {
>
> Right - that's part of the problem. And it would seem to be better to
> map it where it belongs (even if not enabled) than to have it sit at
> some arbitrary place. But as that still wouldn't be correct, I'd clearly
> prefer a proper solution.

We could go down this route, but then on unmap the rom would just be
moved back to the original place.
Do you think that would be a reasonable solution? From QEMU POV it would
certainly be better than the approach below.

> > So the only solution I can see right now is:
> >
> > - avoid allocating guest memory for the VGA ROM
> > That means that at the beginning of xen_ram_alloc we need to realize
> > that the memory region we are dealing with is the VGA ROM memory region
> > and avoid calling xc_domain_populate_physmap_exact for it.
> >
> > - call g_malloc instead
> > Simply use g_malloc to allocate QEMU memory for the VGA ROM,
> > keep track of the allocation in a data structure internal to xen-all.c.
> >
> > - make sure that qemu_get_ram_ptr can deal with the different allocation
> > Now that the VGA ROM is QEMU memory, we need to make sure that
> > qemu_get_ram_ptr returns the right pointer for it.
> >
> >
> > This is all very fiddly and hackish, but I can't see a better way of
> > solving the issue.
>
> Plus this all reads very VGA-special-casing to me, yet a proper model
> would universally cover all PCI ROMs (emulated as well as passed
> through).

This is the only option to avoid having the rom mapped at high addresses
in the guest memory map. It would work for all PCI ROMs. The problem is
realizing from xen_ram_alloc and qemu_ram_ptr_length that we are dealing
with a PCI ROM rather than something else.

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gross qemu behavior
2014-04-03 16:12 ` Stefano Stabellini
@ 2014-04-04 6:45 ` Jan Beulich
2014-04-04 9:34 ` Paolo Bonzini
2014-04-04 13:53 ` Stefano Stabellini
0 siblings, 2 replies; 23+ messages in thread
From: Jan Beulich @ 2014-04-04 6:45 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Paolo Bonzini

>>> On 03.04.14 at 18:12, <stefano.stabellini@eu.citrix.com> wrote:
> On Mon, 31 Mar 2014, Jan Beulich wrote:
>> >>> On 28.03.14 at 18:46, <stefano.stabellini@eu.citrix.com> wrote:
>> > But fortunately we don't actually need to add the VGA ROM to the guest
>> > physmap for it to work,
>>
>> Is that true even when the ROM gets enabled by the guest?
>
> Yes, I think so.

Implying that any execution of code in the ROM would be fully
emulated. Very odd, but fitting the picture of trying to be as slow
as possible (in the context of the breakage introduced by
ef437690 "x86/HVM: correct CPUID leaf 80000008 handling" I
had to run qemu-traditional and qemu-upstream, and the
performance of the guest was visibly _much_ better with the former,
which I consider rather worrying).

>> > QEMU can trap and emulate. In fact even today we
>> > are not mapping it at the right place anyway, see xen_set_memory:
>> >
>> >     if (add) {
>> >         if (!memory_region_is_rom(section->mr)) {
>> >             xen_add_to_physmap(state, start_addr, size,
>> >                                section->mr, section->offset_within_region);
>> >         } else {
>>
>> Right - that's part of the problem. And it would seem to be better to
>> map it where it belongs (even if not enabled) than to have it sit at
>> some arbitrary place. But as that still wouldn't be correct, I'd clearly
>> prefer a proper solution.
>
> We could go down this route, but then on unmap the rom would just be
> moved back to the original place.
> Do you think that would be a reasonable solution? From QEMU POV it would
> certainly be better than the approach below.

No, it should just never appear at the wrong address. As I said above,
I'd consider it halfway acceptable if it remained mapped despite an
unmap, but I do think that a proper solution (properly unmapping
without de-allocating) can and should be found. The more that this
intermediate approach can't really work as I now realize: When
disabled while the guest is sizing the BARs, the address put there
would be all ones in the writable upper bits of the address, i.e. not
a place where the ROM could be legitimately mapped.

And btw - the current model is inconsistent anyway (and perhaps a
reason why certain things appear to not work right when a domain
has memory extending beyond 4G): Once other things (it was the
frame buffer in all cases I've seen) get mapped into the address
space, the ROM gets (implicitly) unmapped anyway. So relying on
it to stay somewhere in guest address space is broken in any event;
qemu's view just doesn't match reality anymore at that point.

Jan

^ permalink raw reply [flat|nested] 23+ messages in thread
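To make the BAR-sizing remark in the message above concrete, here is a
small self-contained C model of a PCI expansion ROM BAR (bit 0 is the
enable bit, bits 31:11 hold the address). It is a sketch under stated
assumptions, not QEMU or Xen code: struct rom_bar, rom_bar_write() and
rom_bar_probe_size() are hypothetical names, and the 64 KiB size is just
chosen to match the 16-page ROM seen in the logs at the top of the
thread.

```c
#include <assert.h>
#include <stdint.h>

#define ROM_BAR_ENABLE    0x00000001u  /* bit 0: ROM decode enable */
#define ROM_BAR_ADDR_MASK 0xfffff800u  /* bits 31:11: base address */

/* Hypothetical model of one device's expansion ROM BAR. */
struct rom_bar {
    uint32_t size;   /* ROM size in bytes: power of two, >= 2 KiB */
    uint32_t value;  /* current register contents */
};

/*
 * Config-space write: only the address bits at or above the ROM size
 * are writable; lower address bits read back as zero. The enable bit
 * is stored as written.
 */
void rom_bar_write(struct rom_bar *bar, uint32_t val)
{
    uint32_t writable;

    assert(bar->size >= 0x800u && (bar->size & (bar->size - 1)) == 0);
    writable = ROM_BAR_ADDR_MASK & ~(bar->size - 1);
    bar->value = (val & writable) | (val & ROM_BAR_ENABLE);
}

/*
 * Guest-style sizing: write all ones to the address bits with the
 * enable bit clear, read back, restore the old value. While the probe
 * value is in place the register holds all ones in its writable upper
 * bits, which is not an address where the ROM could sensibly be mapped.
 */
uint32_t rom_bar_probe_size(struct rom_bar *bar)
{
    uint32_t saved = bar->value;
    uint32_t probe;

    rom_bar_write(bar, ROM_BAR_ADDR_MASK);
    probe = bar->value & ROM_BAR_ADDR_MASK;
    rom_bar_write(bar, saved);

    return ~probe + 1;  /* lowest writable address bit equals the size */
}
```

For a 64 KiB ROM the register reads back 0xffff0000 during the probe, so
a backend that maps the ROM wherever the (disabled) BAR currently points
would be chasing that value.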
* Re: gross qemu behavior
2014-04-04 6:45 ` Jan Beulich
@ 2014-04-04 9:34 ` Paolo Bonzini
2014-04-04 9:45 ` Jan Beulich
2014-04-04 13:53 ` Stefano Stabellini
1 sibling, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2014-04-04 9:34 UTC (permalink / raw)
To: Jan Beulich, Stefano Stabellini; +Cc: anthony.perard, xen-devel

On 04/04/2014 08:45, Jan Beulich wrote:
> Implying that any execution of code in the ROM would be fully
> emulated.

ROM is never executed in place. It is always copied to low RAM and
executed from there. It might slow down the copy.

In fact, on AMD it is not always possible to execute from ROM; if the
ROM includes page tables, as was the case for example for 64-bit OVMF,
it crashes because NPT expects page tables to be in writable guest
memory.

> Very odd, but fitting the picture of trying to be as slow
> as possible (in the context of the breakage introduced by
> ef437690 "x86/HVM: correct CPUID leaf 80000008 handling" I
> had to run qemu-traditional and qemu-upstream, and the
> performance of the guest was visibly _much_ better with the former,
> which I consider rather worrying).

That's quite unexpected. What was your configuration and workload? And
what was slower exactly? Disk, network or video (as an initial
simplification).

Paolo

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gross qemu behavior
2014-04-04 9:34 ` Paolo Bonzini
@ 2014-04-04 9:45 ` Jan Beulich
0 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2014-04-04 9:45 UTC (permalink / raw)
To: Stefano Stabellini, Paolo Bonzini; +Cc: anthony.perard, xen-devel

>>> On 04.04.14 at 11:34, <pbonzini@redhat.com> wrote:
> On 04/04/2014 08:45, Jan Beulich wrote:
>> Very odd, but fitting the picture of trying to be as slow
>> as possible (in the context of the breakage introduced by
>> ef437690 "x86/HVM: correct CPUID leaf 80000008 handling" I

(For the record - I was off by a line when copy-and-pasting this, it
really was 8bad6c56 "x86/HVM: fix preemption handling in do_hvm_op()").

>> had to run qemu-traditional and qemu-upstream, and the
>> performance of the guest was visibly _much_ better with the former,
>> which I consider rather worrying).
>
> That's quite unexpected. What was your configuration and workload? And
> what was slower exactly? Disk, network or video (as an initial
> simplification).

Video in particular. Disk and network, using PV drivers, obviously
are pretty independent of the qemu version (and don't matter much
during early boot). But even normal execution during early BIOS
initialization seems notably slower (under the assumption that when
nothing changes on the virtual screen, video performance doesn't
matter). But of course that's not comparing apples to apples, as
the two BIOSes also differ...

Jan

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gross qemu behavior
2014-04-04 6:45 ` Jan Beulich
2014-04-04 9:34 ` Paolo Bonzini
@ 2014-04-04 13:53 ` Stefano Stabellini
2014-04-04 14:58 ` Jan Beulich
1 sibling, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-04-04 13:53 UTC (permalink / raw)
To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

On Fri, 4 Apr 2014, Jan Beulich wrote:
> >>> On 03.04.14 at 18:12, <stefano.stabellini@eu.citrix.com> wrote:
> > On Mon, 31 Mar 2014, Jan Beulich wrote:
> >> >>> On 28.03.14 at 18:46, <stefano.stabellini@eu.citrix.com> wrote:
> >> > But fortunately we don't actually need to add the VGA ROM to the guest
> >> > physmap for it to work,
> >>
> >> Is that true even when the ROM gets enabled by the guest?
> >
> > Yes, I think so.
>
> Implying that any execution of code in the ROM would be fully
> emulated. Very odd, but fitting the picture of trying to be as slow
> as possible (in the context of the breakage introduced by
> ef437690 "x86/HVM: correct CPUID leaf 80000008 handling" I
> had to run qemu-traditional and qemu-upstream, and the
> performance of the guest was visibly _much_ better with the former,
> which I consider rather worrying).
>
> >> > QEMU can trap and emulate. In fact even today we
> >> > are not mapping it at the right place anyway, see xen_set_memory:
> >> >
> >> >     if (add) {
> >> >         if (!memory_region_is_rom(section->mr)) {
> >> >             xen_add_to_physmap(state, start_addr, size,
> >> >                                section->mr, section->offset_within_region);
> >> >         } else {
> >>
> >> Right - that's part of the problem. And it would seem to be better to
> >> map it where it belongs (even if not enabled) than to have it sit at
> >> some arbitrary place. But as that still wouldn't be correct, I'd clearly
> >> prefer a proper solution.
> >
> > We could go down this route, but then on unmap the rom would just be
> > moved back to the original place.
> > Do you think that would be a reasonable solution? From QEMU POV it would
> > certainly be better than the approach below.
>
> No, it should just never appear at the wrong address. As I said above,
> I'd consider it halfway acceptable if it remained mapped despite an
> unmap, but I do think that a proper solution (properly unmapping
> without de-allocating) can and should be found.

There is no way to do it today AFAICT.
Would you care to propose a hypercall that would support this
scenario?

> The more that this
> intermediate approach can't really work as I now realize: When
> disabled while the guest is sizing the BARs, the address put there
> would be all ones in the writable upper bits of the address, i.e. not
> a place where the ROM could be legitimately mapped.
>
> And btw - the current model is inconsistent anyway (and perhaps a
> reason why certain things appear to not work right when a domain
> has memory extending beyond 4G): Once other things (it was the
> frame buffer in all cases I've seen) get mapped into the address
> space, the ROM gets (implicitly) unmapped anyway. So relying on
> it to stay somewhere in guest address space is broken in any event;
> qemu's view just doesn't match reality anymore at that point.

The alternative, never mapping any ROMs in the guest address space, has
other issues too:

- inconsistency with QEMU's way of doing things
- firmware update on migration (as Paolo pointed out)

I don't really see this as a huge step forward.

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gross qemu behavior
2014-04-04 13:53 ` Stefano Stabellini
@ 2014-04-04 14:58 ` Jan Beulich
2014-04-04 15:32 ` Stefano Stabellini
0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2014-04-04 14:58 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Paolo Bonzini

>>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
> On Fri, 4 Apr 2014, Jan Beulich wrote:
>> No, it should just never appear at the wrong address. As I said above,
>> I'd consider it halfway acceptable if it remained mapped despite an
>> unmap, but I do think that a proper solution (properly unmapping
>> without de-allocating) can and should be found.
>
> There is no way to do it today AFAICT.
> Would you care to propose a hypercall that would support this
> scenario?

Hypercall? Everything you need is there afaict.

>> The more that this
>> intermediate approach can't really work as I now realize: When
>> disabled while the guest is sizing the BARs, the address put there
>> would be all ones in the writable upper bits of the address, i.e. not
>> a place where the ROM could be legitimately mapped.
>>
>> And btw - the current model is inconsistent anyway (and perhaps a
>> reason why certain things appear to not work right when a domain
>> has memory extending beyond 4G): Once other things (it was the
>> frame buffer in all cases I've seen) get mapped into the address
>> space, the ROM gets (implicitly) unmapped anyway. So relying on
>> it to stay somewhere in guest address space is broken in any event;
>> qemu's view just doesn't match reality anymore at that point.
>
> The alternative, never mapping any ROMs in the guest address space, has
> other issues too:
>
> - inconsistency with QEMU's way of doing things
> - firmware update on migration (as Paolo pointed out)
>
> I don't really see this as a huge step forward.

And I never proposed this as an alternative.

Jan

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gross qemu behavior
2014-04-04 14:58 ` Jan Beulich
@ 2014-04-04 15:32 ` Stefano Stabellini
2014-04-04 16:00 ` Jan Beulich
0 siblings, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-04-04 15:32 UTC (permalink / raw)
To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

On Fri, 4 Apr 2014, Jan Beulich wrote:
> >>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
> > On Fri, 4 Apr 2014, Jan Beulich wrote:
> >> No, it should just never appear at the wrong address. As I said above,
> >> I'd consider it halfway acceptable if it remained mapped despite an
> >> unmap, but I do think that a proper solution (properly unmapping
> >> without de-allocating) can and should be found.
> >
> > There is no way to do it today AFAICT.
> > Would you care to propose a hypercall that would support this
> > scenario?
>
> Hypercall? Everything you need is there afaict.

Maybe I am missing something.

Moving the PCI ROM to the right place in the guest physmap is easy.
However how do you think we could unmap the memory (remove it from the
guest physmap) without deallocating it?
The only hypercall we have is xc_domain_add_to_physmap at the moment.

Unless you are thinking of allocating the ROM in a QEMU buffer, but that
would still go against QEMU's way and has the same bad side effects of
the other approach.

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gross qemu behavior
2014-04-04 15:32 ` Stefano Stabellini
@ 2014-04-04 16:00 ` Jan Beulich
2014-04-04 16:54 ` Stefano Stabellini
0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2014-04-04 16:00 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Paolo Bonzini

>>> On 04.04.14 at 17:32, <stefano.stabellini@eu.citrix.com> wrote:
> On Fri, 4 Apr 2014, Jan Beulich wrote:
>> >>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
>> > On Fri, 4 Apr 2014, Jan Beulich wrote:
>> >> No, it should just never appear at the wrong address. As I said above,
>> >> I'd consider it halfway acceptable if it remained mapped despite an
>> >> unmap, but I do think that a proper solution (properly unmapping
>> >> without de-allocating) can and should be found.
>> >
>> > There is no way to do it today AFAICT.
>> > Would you care to propose a hypercall that would support this
>> > scenario?
>>
>> Hypercall? Everything you need is there afaict.
>
> Maybe I am missing something.
>
> Moving the PCI ROM to the right place in the guest physmap is easy.
> However how do you think we could unmap the memory (remove it from the
> guest physmap) without deallocating it?
> The only hypercall we have is xc_domain_add_to_physmap at the moment.

XENMEM_remove_from_physmap should be quite suitable here, but
that is not your problem. The problem is that XENMEM_add_to_physmap
requires a GFN to be passed in, i.e. assumes that a page to be mapped
is already mapped somewhere in the guest. (The term "add" in this
context is rather confusing, as in the GMFN map space case the page
isn't being added, but moved.) So indeed there is functionality missing
in the hypervisor.

Jan

^ permalink raw reply [flat|nested] 23+ messages in thread
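The gap described in the message above can be shown with a toy model of
a guest physmap as a GFN-to-MFN table. This is purely illustrative and
deliberately simplified: it is not the Xen hypercall interface, and the
names physmap_populate(), physmap_move() and physmap_remove() are
hypothetical stand-ins for "populate", "add to physmap in the GMFN
space" and "remove from physmap" respectively.

```c
#include <assert.h>
#include <stdint.h>

#define NR_GFNS     16
#define INVALID_MFN UINT64_MAX

/* Toy physmap: index is the guest frame number, value the machine frame. */
struct physmap {
    uint64_t mfn[NR_GFNS];
};

void physmap_init(struct physmap *p)
{
    for (unsigned int i = 0; i < NR_GFNS; i++)
        p->mfn[i] = INVALID_MFN;
}

/* Model of populating: a freshly allocated page appears at gfn. */
int physmap_populate(struct physmap *p, unsigned int gfn, uint64_t mfn)
{
    if (gfn >= NR_GFNS || p->mfn[gfn] != INVALID_MFN)
        return -1;
    p->mfn[gfn] = mfn;
    return 0;
}

/*
 * Model of the "add" operation in the GMFN case: the page is named by
 * the GFN it currently occupies, so this is really a move. It fails if
 * nothing is mapped at src_gfn. Anything previously at dst_gfn is
 * simply displaced in this model.
 */
int physmap_move(struct physmap *p, unsigned int src_gfn,
                 unsigned int dst_gfn)
{
    if (src_gfn >= NR_GFNS || dst_gfn >= NR_GFNS ||
        p->mfn[src_gfn] == INVALID_MFN)
        return -1;
    if (src_gfn == dst_gfn)
        return 0;
    p->mfn[dst_gfn] = p->mfn[src_gfn];
    p->mfn[src_gfn] = INVALID_MFN;
    return 0;
}

/* Model of removal: the page stops being reachable through any GFN. */
int physmap_remove(struct physmap *p, unsigned int gfn)
{
    if (gfn >= NR_GFNS || p->mfn[gfn] == INVALID_MFN)
        return -1;
    p->mfn[gfn] = INVALID_MFN;
    return 0;
}
```

In this model, once physmap_remove() has run there is no GFN left with
which to name the page, so physmap_move() can no longer bring it back.
That is the missing piece the thread converges on: some way to re-insert
a page that is allocated to the domain but currently unmapped.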
* Re: gross qemu behavior
2014-04-04 16:00 ` Jan Beulich
@ 2014-04-04 16:54 ` Stefano Stabellini
2014-05-05 10:04 ` Fabio Fantoni
0 siblings, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-04-04 16:54 UTC (permalink / raw)
To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

On Fri, 4 Apr 2014, Jan Beulich wrote:
> >>> On 04.04.14 at 17:32, <stefano.stabellini@eu.citrix.com> wrote:
> > On Fri, 4 Apr 2014, Jan Beulich wrote:
> >> >>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
> >> > On Fri, 4 Apr 2014, Jan Beulich wrote:
> >> >> No, it should just never appear at the wrong address. As I said above,
> >> >> I'd consider it halfway acceptable if it remained mapped despite an
> >> >> unmap, but I do think that a proper solution (properly unmapping
> >> >> without de-allocating) can and should be found.
> >> >
> >> > There is no way to do it today AFAICT.
> >> > Would you care to propose a hypercall that would support this
> >> > scenario?
> >>
> >> Hypercall? Everything you need is there afaict.
> >
> > Maybe I am missing something.
> >
> > Moving the PCI ROM to the right place in the guest physmap is easy.
> > However how do you think we could unmap the memory (remove it from the
> > guest physmap) without deallocating it?
> > The only hypercall we have is xc_domain_add_to_physmap at the moment.
>
> XENMEM_remove_from_physmap should be quite suitable here, but
> that is not your problem. The problem is that XENMEM_add_to_physmap
> requires a GFN to be passed in, i.e. assumes that a page to be mapped
> is already mapped somewhere in the guest. (The term "add" in this
> context is rather confusing, as in the GMFN map space case the page
> isn't being added, but moved.) So indeed there is functionality missing
> in the hypervisor.

Right.

After we call XENMEM_remove_from_physmap, there is no way of adding back
the pages to the physmap without knowing the corresponding mfns. Even
if we knew the mfns of the original allocation, relying on them is
probably not a good idea because the pages could theoretically be
offlined or shared.

This is the reason why I was asking for the hypercall you had in mind to
solve this problem.

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gross qemu behavior
2014-04-04 16:54 ` Stefano Stabellini
@ 2014-05-05 10:04 ` Fabio Fantoni
2014-05-05 10:35 ` Jan Beulich
0 siblings, 1 reply; 23+ messages in thread
From: Fabio Fantoni @ 2014-05-05 10:04 UTC (permalink / raw)
To: Stefano Stabellini, Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini

On 04/04/2014 18:54, Stefano Stabellini wrote:
> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>> On 04.04.14 at 17:32, <stefano.stabellini@eu.citrix.com> wrote:
>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
>>>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>> No, it should just never appear at the wrong address. As I said above,
>>>>>> I'd consider it halfway acceptable if it remained mapped despite an
>>>>>> unmap, but I do think that a proper solution (properly unmapping
>>>>>> without de-allocating) can and should be found.
>>>>> There is no way to do it today AFAICT.
>>>>> Would you care to propose a hypercall that would support this
>>>>> scenario?
>>>> Hypercall? Everything you need is there afaict.
>>> Maybe I am missing something.
>>>
>>> Moving the PCI ROM to the right place in the guest physmap is easy.
>>> However how do you think we could unmap the memory (remove it from the
>>> guest physmap) without deallocating it?
>>> The only hypercall we have is xc_domain_add_to_physmap at the moment.
>> XENMEM_remove_from_physmap should be quite suitable here, but
>> that is not your problem. The problem is that XENMEM_add_to_physmap
>> requires a GFN to be passed in, i.e. assumes that a page to be mapped
>> is already mapped somewhere in the guest. (The term "add" in this
>> context is rather confusing, as in the GMFN map space case the page
>> isn't being added, but moved.) So indeed there is functionality missing
>> in the hypervisor.
> Right.
>
> After we call XENMEM_remove_from_physmap, there is no way of adding back
> the pages to the physmap without knowing the corresponding mfns. Even
> if we knew the mfns of the original allocation, relying on them is
> probably not a good idea because the pages could theoretically be
> offlined or shared.
>
> This is the reason why I was asking for the hypercall you had in mind to
> solve this problem.

Any news about this discussion?

Thanks for any reply.

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gross qemu behavior
2014-05-05 10:04 ` Fabio Fantoni
@ 2014-05-05 10:35 ` Jan Beulich
2014-05-05 11:10 ` Fabio Fantoni
0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2014-05-05 10:35 UTC (permalink / raw)
To: Fabio Fantoni
Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

>>> On 05.05.14 at 12:04, <fabio.fantoni@m2r.biz> wrote:
> On 04/04/2014 18:54, Stefano Stabellini wrote:
>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>> On 04.04.14 at 17:32, <stefano.stabellini@eu.citrix.com> wrote:
>>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
>>>>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>> No, it should just never appear at the wrong address. As I said above,
>>>>>>> I'd consider it halfway acceptable if it remained mapped despite an
>>>>>>> unmap, but I do think that a proper solution (properly unmapping
>>>>>>> without de-allocating) can and should be found.
>>>>>> There is no way to do it today AFAICT.
>>>>>> Would you care to propose a hypercall that would support this
>>>>>> scenario?
>>>>> Hypercall? Everything you need is there afaict.
>>>> Maybe I am missing something.
>>>>
>>>> Moving the PCI ROM to the right place in the guest physmap is easy.
>>>> However how do you think we could unmap the memory (remove it from the
>>>> guest physmap) without deallocating it?
>>>> The only hypercall we have is xc_domain_add_to_physmap at the moment.
>>> XENMEM_remove_from_physmap should be quite suitable here, but
>>> that is not your problem. The problem is that XENMEM_add_to_physmap
>>> requires a GFN to be passed in, i.e. assumes that a page to be mapped
>>> is already mapped somewhere in the guest. (The term "add" in this
>>> context is rather confusing, as in the GMFN map space case the page
>>> isn't being added, but moved.) So indeed there is functionality missing
>>> in the hypervisor.
>> Right.
>>
>> After we call XENMEM_remove_from_physmap, there is no way of adding back
>> the pages to the physmap without knowing the corresponding mfns. Even
>> if we knew the mfns of the original allocation, relying on them is
>> probably not a good idea because the pages could theoretically be
>> offlined or shared.
>>
>> This is the reason why I was asking for the hypercall you had in mind to
>> solve this problem.
>
> Any news about this discussion?

Someone would need to write both hypervisor and qemu side patches
- are you volunteering?

Jan

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gross qemu behavior
2014-05-05 10:35 ` Jan Beulich
@ 2014-05-05 11:10 ` Fabio Fantoni
0 siblings, 0 replies; 23+ messages in thread
From: Fabio Fantoni @ 2014-05-05 11:10 UTC (permalink / raw)
To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

On 05/05/2014 12:35, Jan Beulich wrote:
>>>> On 05.05.14 at 12:04, <fabio.fantoni@m2r.biz> wrote:
>> On 04/04/2014 18:54, Stefano Stabellini wrote:
>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>> On 04.04.14 at 17:32, <stefano.stabellini@eu.citrix.com> wrote:
>>>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>>>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
>>>>>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>>> No, it should just never appear at the wrong address. As I said above,
>>>>>>>> I'd consider it halfway acceptable if it remained mapped despite an
>>>>>>>> unmap, but I do think that a proper solution (properly unmapping
>>>>>>>> without de-allocating) can and should be found.
>>>>>>> There is no way to do it today AFAICT.
>>>>>>> Would you care to propose a hypercall that would support this
>>>>>>> scenario?
>>>>>> Hypercall? Everything you need is there afaict.
>>>>> Maybe I am missing something.
>>>>>
>>>>> Moving the PCI ROM to the right place in the guest physmap is easy.
>>>>> However how do you think we could unmap the memory (remove it from the
>>>>> guest physmap) without deallocating it?
>>>>> The only hypercall we have is xc_domain_add_to_physmap at the moment.
>>>> XENMEM_remove_from_physmap should be quite suitable here, but
>>>> that is not your problem. The problem is that XENMEM_add_to_physmap
>>>> requires a GFN to be passed in, i.e. assumes that a page to be mapped
>>>> is already mapped somewhere in the guest. (The term "add" in this
>>>> context is rather confusing, as in the GMFN map space case the page
>>>> isn't being added, but moved.) So indeed there is functionality missing
>>>> in the hypervisor.
>>> Right.
>>>
>>> After we call XENMEM_remove_from_physmap, there is no way of adding back
>>> the pages to the physmap without knowing the corresponding mfns. Even
>>> if we knew the mfns of the original allocation, relying on them is
>>> probably not a good idea because the pages could theoretically be
>>> offlined or shared.
>>>
>>> This is the reason why I was asking for the hypercall you had in mind to
>>> solve this problem.
>> Any news about this discussion?
> Someone would need to write both hypervisor and qemu side patches
> - are you volunteering?
>
> Jan
>

Thanks for your reply.
Unfortunately I don't have enough knowledge to do such patches. I just
wanted to know if there was any news.

^ permalink raw reply [flat|nested] 23+ messages in thread
Thread overview: 23+ messages
2014-03-28 7:48 gross qemu behavior Jan Beulich
2014-03-28 9:21 ` Jan Beulich
2014-03-28 9:30 ` Fabio Fantoni
2014-03-28 10:37 ` Jan Beulich
2014-03-28 17:46 ` Stefano Stabellini
2014-03-28 17:52 ` Stefano Stabellini
2014-03-28 18:01 ` Paolo Bonzini
2014-03-28 18:30 ` Stefano Stabellini
2014-03-29 7:31 ` Paolo Bonzini
2014-03-30 7:57 ` Fabio Fantoni
2014-03-31 9:07 ` Jan Beulich
2014-04-03 16:12 ` Stefano Stabellini
2014-04-04 6:45 ` Jan Beulich
2014-04-04 9:34 ` Paolo Bonzini
2014-04-04 9:45 ` Jan Beulich
2014-04-04 13:53 ` Stefano Stabellini
2014-04-04 14:58 ` Jan Beulich
2014-04-04 15:32 ` Stefano Stabellini
2014-04-04 16:00 ` Jan Beulich
2014-04-04 16:54 ` Stefano Stabellini
2014-05-05 10:04 ` Fabio Fantoni
2014-05-05 10:35 ` Jan Beulich
2014-05-05 11:10 ` Fabio Fantoni