* gross qemu behavior
@ 2014-03-28  7:48 Jan Beulich
  2014-03-28  9:21 ` Jan Beulich
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Jan Beulich @ 2014-03-28  7:48 UTC (permalink / raw)
  To: anthony.perard, Stefano Stabellini; +Cc: xen-devel

Hi,

so while doing all that EPT work I naturally also happened to look more
closely at the EPT table dumps, spotting an odd range of 16 pages
outside any supposedly populated address range. This range only
exists when guest memory doesn't extend past (by default) 0xf0000000
(the start of MMIO, i.e. normally the frame buffer). After spending quite
a bit of time I finally figured that this must be a leftover of the Cirrus
VGA ROM, and I would have thought that this

--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1976,6 +1976,9 @@ static int pci_add_option_rom(PCIDevice 
     }
 
     pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
+    memory_region_add_subregion_overlap(pdev->bus->address_space_mem,
+                                        pdev->rom.ram_addr, &pdev->rom, 1);
+    memory_region_del_subregion(pdev->bus->address_space_mem, &pdev->rom);
 
     return 0;
 }

should fix it. It does appear to work as far as generic qemu is concerned,
but once looking at the Xen backend I had to conclude that this just
can't work: For one, xen_add_to_physmap() and
xen_remove_from_physmap() are _documented_ (in a comment) to
only be capable of a single region (VRAM). And the latter - even worse -
is implemented with a call to xc_domain_add_to_physmap(), completely
contrary to its name.

Instrumenting xen_region_{add,del}(), I can see that all regions get
properly reported to the Xen backend, just that it doesn't handle them
(this is with above patch in place):

xra(fee00000,100000)
xra(fec00000,1000)
xra(fed00000,400)
xra(80000000,10000)
xrd(80000000,10000)
xra(f0000000,800000)
xra(f1000000,400000)
xra(f2000000,1000000)
xra(f3010000,4000)
xra(f3014000,1000)
xra(f3015000,3000)
xra(f3018000,1000)
xra(f3000000,10000)
xrd(f3000000,10000)
xrd(f0000000,800000)
xra(f0000000,800000)
mapping vram to f0000000 - f0800000
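
The trace above can be replayed mechanically to see which regions the
backend was last told about. What follows is only a minimal standalone
sketch, not QEMU code: it assumes 4 KiB pages and that xra()/xrd() are the
instrumented xen_region_add()/xen_region_del() calls with (start, size) in
hex. The names region_event, live_region, replay() and is_live() are made
up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumption: 4 KiB guest pages */
#define MAX_LIVE  32

/* One instrumented call: add != 0 for xra(), add == 0 for xrd(). */
struct region_event {
    int      add;
    uint64_t start;
    uint64_t size;
};

struct live_region {
    uint64_t start;
    uint64_t size;
};

/* The sequence of calls exactly as listed in the trace. */
static const struct region_event trace[] = {
    {1, 0xfee00000, 0x100000}, {1, 0xfec00000, 0x1000},
    {1, 0xfed00000, 0x400},    {1, 0x80000000, 0x10000},
    {0, 0x80000000, 0x10000},  {1, 0xf0000000, 0x800000},
    {1, 0xf1000000, 0x400000}, {1, 0xf2000000, 0x1000000},
    {1, 0xf3010000, 0x4000},   {1, 0xf3014000, 0x1000},
    {1, 0xf3015000, 0x3000},   {1, 0xf3018000, 0x1000},
    {1, 0xf3000000, 0x10000},  {0, 0xf3000000, 0x10000},
    {0, 0xf0000000, 0x800000}, {1, 0xf0000000, 0x800000},
};

/* Number of pages covered by a region, rounding a partial page up. */
static uint64_t region_pages(uint64_t size)
{
    return (size + PAGE_SIZE - 1) / PAGE_SIZE;
}

/* Replay the events in order; returns how many regions are still live.
 * A delete removes the entry with exactly matching start and size. */
static size_t replay(const struct region_event *ev, size_t n,
                     struct live_region *live, size_t cap)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (ev[i].add) {
            assert(count < cap);
            live[count].start = ev[i].start;
            live[count].size = ev[i].size;
            count++;
            continue;
        }
        for (size_t j = 0; j < count; j++) {
            if (live[j].start == ev[i].start && live[j].size == ev[i].size) {
                live[j] = live[count - 1];   /* order is not preserved */
                count--;
                break;
            }
        }
    }
    return count;
}

/* Returns 1 if a region with exactly this start and size is live. */
static int is_live(const struct live_region *live, size_t count,
                   uint64_t start, uint64_t size)
{
    for (size_t i = 0; i < count; i++) {
        if (live[i].start == start && live[i].size == size) {
            return 1;
        }
    }
    return 0;
}
```

Replaying the 16 calls leaves ten live regions. The 0x10000-byte region at
0x80000000, which matches the 16 pages mentioned above in size, is added
and then deleted again, so from qemu's side nothing should remain there.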

Having wasted enough time getting to this point, I'd like to ask you
to advise a proper fix for this. We definitely shouldn't be leaving
stuff sitting at arbitrary positions in the physical address space of
the guest. And the fact that the range gets removed (from Xen's
perspective, but not from qemu's) when RAM extends beyond
0xf0000000 (due to it being replaced with what is actually
intended to be there) makes me wonder what would happen if the
ROM got enabled by the guest.

Jan


* Re: gross qemu behavior
  2014-03-28  7:48 gross qemu behavior Jan Beulich
@ 2014-03-28  9:21 ` Jan Beulich
  2014-03-28  9:30 ` Fabio Fantoni
  2014-03-28 17:46 ` Stefano Stabellini
  2 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2014-03-28  9:21 UTC (permalink / raw)
  To: anthony.perard, Stefano Stabellini; +Cc: xen-devel

>>> On 28.03.14 at 08:48, <JBeulich@suse.com> wrote:
> Having wasted enough time getting to this point, I'd like to ask you
> to advise a proper fix for this. We definitely shouldn't be leaving
> stuff sitting at arbitrary positions in the physical address space of
> the guest. And the fact that the range gets removed (from Xen's
> perspective, but not from qemu's) when RAM extends beyond
> 0xf0000000 (due to it being replaced with what is actually
> intended to be there) makes me wonder what would happen if the
> ROM got enabled by the guest.

Fixing this would, afaict, also address the performance-impacting
fact that the emulated MMIO ranges other than the frame buffer get
marked UC in the EPT tables if the domain has any passed through
devices (as xc_domain_pin_memory_cacheattr() would then get called
for all such regions - care would of course need to be taken to
avoid calling it for MMIO regions of passed through devices).

And looking at the cache attribute pinning I see that this is broken
too: The hypervisor doesn't even expose a removal interface, and
the adding one doesn't check whether the new region already exists
or conflicts with already existing ones. What if the guest decided to
relocate the region a couple of times?
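
To illustrate the checks that are missing, here is a minimal sketch of a
pinned-range list that rejects duplicates and overlaps and also supports
removal. It is standalone C, not hypervisor code: pin_table, pin_add() and
pin_del() are hypothetical names, the cache type is reduced to a plain
integer, and ranges are inclusive frame numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PIN_MAX 16

struct pin_range {
    uint64_t first;   /* first frame, inclusive */
    uint64_t last;    /* last frame, inclusive */
    int      type;    /* hypothetical cache type, e.g. 0 = UC */
};

struct pin_table {
    struct pin_range r[PIN_MAX];
    size_t n;
};

enum pin_result {
    PIN_OK = 0,
    PIN_EXISTS,      /* identical range with identical type already pinned */
    PIN_CONFLICT,    /* overlaps an existing range, or same range, other type */
    PIN_FULL,
    PIN_INVALID,
    PIN_NOT_FOUND
};

/* Add a range only if it neither duplicates nor overlaps an existing one. */
static enum pin_result pin_add(struct pin_table *t, uint64_t first,
                               uint64_t last, int type)
{
    assert(t != NULL);
    if (first > last) {
        return PIN_INVALID;
    }
    for (size_t i = 0; i < t->n; i++) {
        const struct pin_range *p = &t->r[i];

        if (p->first == first && p->last == last) {
            return p->type == type ? PIN_EXISTS : PIN_CONFLICT;
        }
        if (first <= p->last && p->first <= last) {
            return PIN_CONFLICT;
        }
    }
    if (t->n == PIN_MAX) {
        return PIN_FULL;
    }
    t->r[t->n].first = first;
    t->r[t->n].last = last;
    t->r[t->n].type = type;
    t->n++;
    return PIN_OK;
}

/* Remove an exactly matching range; this is the operation that has no
 * counterpart in the interface discussed above. */
static enum pin_result pin_del(struct pin_table *t, uint64_t first,
                               uint64_t last)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->n; i++) {
        if (t->r[i].first == first && t->r[i].last == last) {
            t->n--;
            t->r[i] = t->r[t->n];   /* fill the hole with the last entry */
            return PIN_OK;
        }
    }
    return PIN_NOT_FOUND;
}
```

With something along these lines, a guest relocating a region would
translate into a pin_del() of the old range followed by a pin_add() of the
new one, instead of stale entries piling up.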

Jan


* Re: gross qemu behavior
  2014-03-28  7:48 gross qemu behavior Jan Beulich
  2014-03-28  9:21 ` Jan Beulich
@ 2014-03-28  9:30 ` Fabio Fantoni
  2014-03-28 10:37   ` Jan Beulich
  2014-03-28 17:46 ` Stefano Stabellini
  2 siblings, 1 reply; 23+ messages in thread
From: Fabio Fantoni @ 2014-03-28  9:30 UTC (permalink / raw)
  To: Jan Beulich, anthony.perard, Stefano Stabellini; +Cc: xen-devel

On 28/03/2014 08:48, Jan Beulich wrote:
> Hi,
>
> so while doing all that EPT work I naturally also happened to look more
> closely at the EPT table dumps, spotting an odd range of 16 pages
> outside any supposedly populated address range. This range only
> exists when guest memory doesn't extend past (by default) 0xf0000000
> (the start of MMIO, i.e. normally the frame buffer). After spending quite
> a bit of time I finally figured that this must be a left over of the Cirrus
> VGA ROM, and I would have thought that this
>
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1976,6 +1976,9 @@ static int pci_add_option_rom(PCIDevice
>       }
>   
>       pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
> +    memory_region_add_subregion_overlap(pdev->bus->address_space_mem,
> +                                        pdev->rom.ram_addr, &pdev->rom, 1);
> +    memory_region_del_subregion(pdev->bus->address_space_mem, &pdev->rom);
>   
>       return 0;
>   }
>
> should fix it. It does appear to work as far as generic qemu is concerned,
> but once looking at the Xen backend I had to conclude that this just
> can't work: For one, xen_add_to_physmap() and
> xen_remove_from_physmap() are _documented_ (in a comment) to
> only be capable of a single region (VRAM). And the latter - even worse -
> is implemented with a call to xc_domain_add_to_physmap(), completely
> contrary to its name.
>
> Instrumenting xen_region_{add,del}(), I can see that all regions get
> properly reported to the Xen backend, just that it doesn't handle them
> (this is with above patch in place):
>
> xra(fee00000,100000)
> xra(fec00000,1000)
> xra(fed00000,400)
> xra(80000000,10000)
> xrd(80000000,10000)
> xra(f0000000,800000)
> xra(f1000000,400000)
> xra(f2000000,1000000)
> xra(f3010000,4000)
> xra(f3014000,1000)
> xra(f3015000,3000)
> xra(f3018000,1000)
> xra(f3000000,10000)
> xrd(f3000000,10000)
> xrd(f0000000,800000)
> xra(f0000000,800000)
> mapping vram to f0000000 - f0800000
>
> Having wasted enough time getting to this point, I'd like to ask you
> to advise a proper fix for this. We definitely shouldn't be leaving
> stuff sitting at arbitrary positions in the physical address space of
> the guest. And the fact that the range gets removed (from Xen's
> perspective, but not from qemu's) when RAM extends beyond
> 0xf0000000 (due to it being replaced with what is actually
> intended to be there) makes me wonder what would happen if the
> ROM got enabled by the guest.

Thanks for your work.
Unfortunately I do not know enough about these things to help you solve it.
It seems to me, however, that this problem may be the actual cause (or at
least one of the causes) that also blocks the correct allocation of all qxl
memory regions, and perhaps also prevents setting up more RAM for stdvga,
which does not seem to work even though no errors appear.
Can you tell me whether this is correct?
If it is, please put me in Cc on future mails and/or patches and I will
test them with qxl and any other features they affect.

Thanks for any reply and sorry for my bad English.

>
> Jan

* Re: gross qemu behavior
  2014-03-28  9:30 ` Fabio Fantoni
@ 2014-03-28 10:37   ` Jan Beulich
  0 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2014-03-28 10:37 UTC (permalink / raw)
  To: Fabio Fantoni; +Cc: anthony.perard, xen-devel, Stefano Stabellini

>>> On 28.03.14 at 10:30, <fabio.fantoni@m2r.biz> wrote:
> Thanks for your work.
> I do not know enough about these things to help you solve it unfortunately.
> It seems to me, however, to understand that this problem may be the 
> actual cause (or at least one) that also blocks the correct allocation 
> of all qxl memory regions and perhaps even setting up more ram for stdvga 
> that although no errors appear apparently not working.
> Can you tell me if it is correct or am I wrong?

I don't know for sure, but it's certainly possible.

Jan


* Re: gross qemu behavior
  2014-03-28  7:48 gross qemu behavior Jan Beulich
  2014-03-28  9:21 ` Jan Beulich
  2014-03-28  9:30 ` Fabio Fantoni
@ 2014-03-28 17:46 ` Stefano Stabellini
  2014-03-28 17:52   ` Stefano Stabellini
  2014-03-31  9:07   ` Jan Beulich
  2 siblings, 2 replies; 23+ messages in thread
From: Stefano Stabellini @ 2014-03-28 17:46 UTC (permalink / raw)
  To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

CC'ing Paolo, hoping that he has a better idea on how to solve this
problem.


On Fri, 28 Mar 2014, Jan Beulich wrote:
> Hi,
> 
> so while doing all that EPT work I naturally also happened to look more
> closely at the EPT table dumps, spotting an odd range of 16 pages
> outside any supposedly populated address range. This range only
> exists when guest memory doesn't extend past (by default) 0xf0000000
> (the start of MMIO, i.e. normally the frame buffer). After spending quite
> a bit of time I finally figured that this must be a left over of the Cirrus
> VGA ROM, and I would have thought that this
> 
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1976,6 +1976,9 @@ static int pci_add_option_rom(PCIDevice 
>      }
>  
>      pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
> +    memory_region_add_subregion_overlap(pdev->bus->address_space_mem,
> +                                        pdev->rom.ram_addr, &pdev->rom, 1);
> +    memory_region_del_subregion(pdev->bus->address_space_mem, &pdev->rom);
>  
>      return 0;
>  }
> 
> should fix it. It does appear to work as far as generic qemu is concerned,
> but once looking at the Xen backend I had to conclude that this just
> can't work: For one, xen_add_to_physmap() and
> xen_remove_from_physmap() are _documented_ (in a comment) to
> only be capable of a single region (VRAM). And the latter - even worse -
> is implemented with a call to xc_domain_add_to_physmap(), completely
> contrary to its name.

xen_add_to_physmap and xen_remove_from_physmap are just to deal with the
VRAM in their current implementation.


> Instrumenting xen_region_{add,del}(), I can see that all regions get
> properly reported to the Xen backend, just that it doesn't handle them
> (this is with above patch in place):
> 
> xra(fee00000,100000)
> xra(fec00000,1000)
> xra(fed00000,400)
> xra(80000000,10000)
> xrd(80000000,10000)
> xra(f0000000,800000)
> xra(f1000000,400000)
> xra(f2000000,1000000)
> xra(f3010000,4000)
> xra(f3014000,1000)
> xra(f3015000,3000)
> xra(f3018000,1000)
> xra(f3000000,10000)
> xrd(f3000000,10000)
> xrd(f0000000,800000)
> xra(f0000000,800000)
> mapping vram to f0000000 - f0800000
> 
> Having wasted enough time getting to this point, I'd like to ask you
> to advise a proper fix for this. We definitely shouldn't be leaving
> stuff sitting at arbitrary positions in the physical address space of
> the guest. And the fact that the range gets removed (from Xen's
> perspective, but not from qemu's) when RAM extends beyond
> 0xf0000000 (due to it being replaced with what is actually
> intended to be there) makes me wonder what would happen if the
> ROM got enabled by the guest.

This is a thorny issue, fixing this behavior is not going to be trivial:

- The hypervisor/libxc does not currently expose a
  xc_domain_remove_from_physmap function.

- QEMU works by allocating memory regions at the end of the guest
  physmap and then moving them at the right place.

- QEMU can destroy a memory region and in that case we could free the
  memory and remove it from the physmap, however that is NOT what QEMU
  does with the vga ROM. In that case it calls
  memory_region_del_subregion, so we can't be sure that the ROM won't be
  mapped again, therefore we cannot free it. We need to move it
  somewhere else, hence the problem.


But fortunately we don't actually need to add the VGA ROM to the guest
physmap for it to work, QEMU can trap and emulate. In fact even today we
are not mapping it at the right place anyway, see xen_set_memory:

    if (add) {
        if (!memory_region_is_rom(section->mr)) {
            xen_add_to_physmap(state, start_addr, size,
                               section->mr, section->offset_within_region);
        } else {


So the only solution I can see right now is:

- avoid allocating guest memory for the VGA ROM
That means that at the beginning of xen_ram_alloc we need to realize
that the memory region we are dealing with is the VGA ROM memory region
and avoid calling xc_domain_populate_physmap_exact for it.

- call g_malloc instead
Simply use g_malloc to allocate QEMU memory for the VGA ROM,
keep track of the allocation in a data structure internal to xen-all.c.

- make sure that qemu_get_ram_ptr can deal with the different allocation
Now that the VGA ROM is QEMU memory, we need to make sure that
qemu_get_ram_ptr returns the right pointer for it.


This is all very fiddly and hackish, but I can't see a better way of
solving the issue.
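
The three steps above can be sketched outside QEMU as a small registry keyed
by RAM address range. This is only an illustration under assumptions:
rom_alloc_host() and rom_host_ptr() are hypothetical helpers, plain calloc()
stands in for g_malloc(), and ram_addr_t is reduced to uint64_t. The real
xen_ram_alloc() and qemu_get_ram_ptr() have different signatures and more to
do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t ram_addr_t;   /* stand-in for QEMU's type */

#define HOST_ROM_MAX 8

/* One ROM whose backing store lives in host memory instead of in
 * guest memory populated through the hypervisor. */
struct host_rom {
    ram_addr_t base;
    ram_addr_t size;
    uint8_t   *host;
};

static struct host_rom host_roms[HOST_ROM_MAX];
static size_t host_rom_count;

/* Steps 1 and 2: skip populating guest memory, allocate host memory
 * and remember it. Returns 0 on success, -1 on failure. */
static int rom_alloc_host(ram_addr_t base, ram_addr_t size)
{
    uint8_t *p;

    assert(host_rom_count <= HOST_ROM_MAX);
    if (size == 0 || host_rom_count == HOST_ROM_MAX) {
        return -1;
    }
    p = calloc(1, (size_t)size);
    if (p == NULL) {
        return -1;
    }
    host_roms[host_rom_count].base = base;
    host_roms[host_rom_count].size = size;
    host_roms[host_rom_count].host = p;
    host_rom_count++;
    return 0;
}

/* Step 3: translate a RAM address into a host pointer. Returns NULL when
 * the address is not inside a host-allocated ROM, in which case a caller
 * would fall back to its usual mapping path. */
static void *rom_host_ptr(ram_addr_t addr)
{
    for (size_t i = 0; i < host_rom_count; i++) {
        const struct host_rom *r = &host_roms[i];

        if (addr >= r->base && addr - r->base < r->size) {
            return r->host + (addr - r->base);
        }
    }
    return NULL;
}
```

The lookup honours the offset into the ROM rather than always returning the
base pointer. Nothing here covers how the ROM contents would be filled in or
carried across a migration.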


* Re: gross qemu behavior
  2014-03-28 17:46 ` Stefano Stabellini
@ 2014-03-28 17:52   ` Stefano Stabellini
  2014-03-28 18:01     ` Paolo Bonzini
  2014-03-31  9:07   ` Jan Beulich
  1 sibling, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-03-28 17:52 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: anthony.perard, xen-devel, Stefano Stabellini, Jan Beulich,
	Paolo Bonzini

On Fri, 28 Mar 2014, Stefano Stabellini wrote:
> CC'ing Paolo, hoping that he has a better idea on how to solve this
> problem.
> 
> 
> On Fri, 28 Mar 2014, Jan Beulich wrote:
> > Hi,
> > 
> > so while doing all that EPT work I naturally also happened to look more
> > closely at the EPT table dumps, spotting an odd range of 16 pages
> > outside any supposedly populated address range. This range only
> > exists when guest memory doesn't extend past (by default) 0xf0000000
> > (the start of MMIO, i.e. normally the frame buffer). After spending quite
> > a bit of time I finally figured that this must be a left over of the Cirrus
> > VGA ROM, and I would have thought that this
> > 
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -1976,6 +1976,9 @@ static int pci_add_option_rom(PCIDevice 
> >      }
> >  
> >      pci_register_bar(pdev, PCI_ROM_SLOT, 0, &pdev->rom);
> > +    memory_region_add_subregion_overlap(pdev->bus->address_space_mem,
> > +                                        pdev->rom.ram_addr, &pdev->rom, 1);
> > +    memory_region_del_subregion(pdev->bus->address_space_mem, &pdev->rom);
> >  
> >      return 0;
> >  }
> > 
> > should fix it. It does appear to work as far as generic qemu is concerned,
> > but once looking at the Xen backend I had to conclude that this just
> > can't work: For one, xen_add_to_physmap() and
> > xen_remove_from_physmap() are _documented_ (in a comment) to
> > only be capable of a single region (VRAM). And the latter - even worse -
> > is implemented with a call to xc_domain_add_to_physmap(), completely
> > contrary to its name.
> 
> xen_add_to_physmap and xen_remove_from_physmap are just to deal with the
> VRAM in their current implementation.
> 
> 
> > Instrumenting xen_region_{add,del}(), I can see that all regions get
> > properly reported to the Xen backend, just that it doesn't handle them
> > (this is with above patch in place):
> > 
> > xra(fee00000,100000)
> > xra(fec00000,1000)
> > xra(fed00000,400)
> > xra(80000000,10000)
> > xrd(80000000,10000)
> > xra(f0000000,800000)
> > xra(f1000000,400000)
> > xra(f2000000,1000000)
> > xra(f3010000,4000)
> > xra(f3014000,1000)
> > xra(f3015000,3000)
> > xra(f3018000,1000)
> > xra(f3000000,10000)
> > xrd(f3000000,10000)
> > xrd(f0000000,800000)
> > xra(f0000000,800000)
> > mapping vram to f0000000 - f0800000
> > 
> > Having wasted enough time getting to this point, I'd like to ask you
> > to advise a proper fix for this. We definitely shouldn't be leaving
> > stuff sitting at arbitrary positions in the physical address space of
> > the guest. And the fact that the range gets removed (from Xen's
> > perspective, but not from qemu's) when RAM extends beyond
> > 0xf0000000 (due to it being replaced with what is actually
> > intended to be there) makes me wonder what would happen if the
> > ROM got enabled by the guest.
> 
> This is a thorny issue, fixing this behavior is not going to be trivial:
> 
> - The hypervisor/libxc does not currently expose a
>   xc_domain_remove_from_physmap function.
> 
> - QEMU works by allocating memory regions at the end of the guest
>   physmap and then moving them at the right place.
> 
> - QEMU can destroy a memory region and in that case we could free the
>   memory and remove it from the physmap, however that is NOT what QEMU
>   does with the vga ROM. In that case it calls
>   memory_region_del_subregion, so we can't be sure that the ROM won't be
>   mapped again, therefore we cannot free it. We need to move it
>   somewhere else, hence the problem.
> 
> 
> But fortunately we don't actually need to add the VGA ROM to the guest
> physmap for it to work, QEMU can trap and emulate. In fact even today we
> are not mapping it at the right place anyway, see xen_set_memory:
> 
>     if (add) {
>         if (!memory_region_is_rom(section->mr)) {
>             xen_add_to_physmap(state, start_addr, size,
>                                section->mr, section->offset_within_region);
>         } else {
> 
> 
> So the only solution I can see right now is:
> 
> - avoid allocating guest memory for the VGA ROM
> That means that at the beginning of xen_ram_alloc we need to realize
> that the memory region we are dealing with is the VGA ROM memory region
> and avoid calling xc_domain_populate_physmap_exact for it.
> 
> - call g_malloc instead
> Simply use g_malloc to allocate QEMU memory for the VGA ROM,
> keep track of the allocation in a data structure internal to xen-all.c.
> 
> - make sure that qemu_get_ram_ptr can deal with the different allocation
> Now that the VGA ROM is QEMU memory, we need to make sure that
> qemu_get_ram_ptr returns the right pointer for it.
> 
> 
> This is all very fiddly and hackish, but I can't see a better way of
> solving the issue.


Given that I feel that the explanation is not very clear, I am appending
a proof of concept patch. It is obviously horrible, I am by no means
suggesting it should be applied. 


diff --git a/exec.c b/exec.c
index 91513c6..bdecc70 100644
--- a/exec.c
+++ b/exec.c
@@ -1453,6 +1453,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
    It should not be used for general purpose DMA.
    Use cpu_physical_memory_map/cpu_physical_memory_rw instead.
  */
+extern uint8_t* vga_rom;
 void *qemu_get_ram_ptr(ram_addr_t addr)
 {
     RAMBlock *block = qemu_get_ram_block(addr);
@@ -1462,7 +1463,9 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
          * because we don't want to map the entire memory in QEMU.
          * In that case just map until the end of the page.
          */
-        if (block->offset == 0) {
+        if (!strcmp(block->mr->name,"cirrus_vga.rom")) {
+            return vga_rom;
+        } else if (block->offset == 0) {
             return xen_map_cache(addr, 0, 0);
         } else if (block->host == NULL) {
             block->host =
diff --git a/xen-all.c b/xen-all.c
index ba34739..6211946 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -101,6 +101,8 @@ typedef struct XenIOState {
     Notifier wakeup;
 } XenIOState;
 
+uint8_t* vga_rom;
+
 /* Xen specific function for piix pci */
 
 int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
@@ -217,6 +219,11 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr)
         return;
     }
 
+    if (!strcmp(mr->name,"cirrus_vga.rom")) {
+        vga_rom = g_malloc(size);
+        return;
+    }
+
     trace_xen_ram_alloc(ram_addr, size);
 
     nr_pfn = size >> TARGET_PAGE_BITS;


* Re: gross qemu behavior
  2014-03-28 17:52   ` Stefano Stabellini
@ 2014-03-28 18:01     ` Paolo Bonzini
  2014-03-28 18:30       ` Stefano Stabellini
  0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2014-03-28 18:01 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Jan Beulich

On 28/03/2014 18:52, Stefano Stabellini wrote:
>> This is a thorny issue, fixing this behavior is not going to be trivial:
>>
>> - The hypervisor/libxc does not currently expose a
>>   xc_domain_remove_from_physmap function.
>>
>> - QEMU works by allocating memory regions at the end of the guest
>>   physmap and then moving them at the right place.
>>
>> - QEMU can destroy a memory region and in that case we could free the
>>   memory and remove it from the physmap, however that is NOT what QEMU
>>   does with the vga ROM. In that case it calls
>>   memory_region_del_subregion, so we can't be sure that the ROM won't be
>>   mapped again, therefore we cannot free it. We need to move it
>>   somewhere else, hence the problem.

Right; QEMU cannot know either if the ROM will be mapped again (examples 
include "cd /sys/bus/pci/devices/0000:0:03.0 && echo 1 > rom && cat rom" 
or a warm reset).

>> But fortunately we don't actually need to add the VGA ROM to the guest
>> physmap for it to work, QEMU can trap and emulate. In fact even today we
>> are not mapping it at the right place anyway, see xen_set_memory:

But how can you execute from the VGA ROM then?  Also, how do you migrate 
its contents?  And how is VGA different from say an iPXE ROM?

It would be nice if QEMU could just special case pc.ram (which has 
block->offset == 0), and use the normal method to allocate other RAM 
regions.  But I'm afraid that would require some changes in the Xen 
toolstack as well (for migration, for example) and I'm not sure how you 
could execute from PCI ROM BARs.

Paolo


* Re: gross qemu behavior
  2014-03-28 18:01     ` Paolo Bonzini
@ 2014-03-28 18:30       ` Stefano Stabellini
  2014-03-29  7:31         ` Paolo Bonzini
  0 siblings, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-03-28 18:30 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: anthony.perard, xen-devel, Jan Beulich, Stefano Stabellini

On Fri, 28 Mar 2014, Paolo Bonzini wrote:
> On 28/03/2014 18:52, Stefano Stabellini wrote:
> > > This is a thorny issue, fixing this behavior is not going to be trivial:
> > > 
> > > - The hypervisor/libxc does not currently expose a
> > >   xc_domain_remove_from_physmap function.
> > > 
> > > - QEMU works by allocating memory regions at the end of the guest
> > >   physmap and then moving them at the right place.
> > > 
> > > - QEMU can destroy a memory region and in that case we could free the
> > >   memory and remove it from the physmap, however that is NOT what QEMU
> > >   does with the vga ROM. In that case it calls
> > >   memory_region_del_subregion, so we can't be sure that the ROM won't be
> > >   mapped again, therefore we cannot free it. We need to move it
> > >   somewhere else, hence the problem.
> 
> Right; QEMU cannot know either if the ROM will be mapped again (examples
> include "cd /sys/bus/pci/devices/0000:0:03.0 && echo 1 > rom && cat rom" or a
> warm reset).
> 
> > > But fortunately we don't actually need to add the VGA ROM to the guest
> > > physmap for it to work, QEMU can trap and emulate. In fact even today we
> > > are not mapping it at the right place anyway, see xen_set_memory:
> 
> But how can you execute from the VGA ROM then?

I don't know, I guess we don't? In that case why does it work today?


> Also, how do you migrate its contents?

That would also not work. We would have to re-initialize it in QEMU on
the receiving end.


> And how is VGA different from say an iPXE ROM?

iPXE is read into memory by hvmloader.


> It would be nice if QEMU could just special case pc.ram (which has
> block->offset == 0), and use the normal method to allocate other RAM regions.
> But I'm afraid that would require some changes in the Xen toolstack as well
> (for migration, for example) and I'm not sure how you could execute from PCI
> ROM BARs.
> 
> Paolo
> 


* Re: gross qemu behavior
  2014-03-28 18:30       ` Stefano Stabellini
@ 2014-03-29  7:31         ` Paolo Bonzini
  2014-03-30  7:57           ` Fabio Fantoni
  0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2014-03-29  7:31 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Jan Beulich

On 28/03/2014 19:30, Stefano Stabellini wrote:
> On Fri, 28 Mar 2014, Paolo Bonzini wrote:
>> On 28/03/2014 18:52, Stefano Stabellini wrote:
>>>> This is a thorny issue, fixing this behavior is not going to be trivial:
>>>>
>>>> - The hypervisor/libxc does not currently expose a
>>>>   xc_domain_remove_from_physmap function.
>>>>
>>>> - QEMU works by allocating memory regions at the end of the guest
>>>>   physmap and then moving them at the right place.
>>>>
>>>> - QEMU can destroy a memory region and in that case we could free the
>>>>   memory and remove it from the physmap, however that is NOT what QEMU
>>>>   does with the vga ROM. In that case it calls
>>>>   memory_region_del_subregion, so we can't be sure that the ROM won't be
>>>>   mapped again, therefore we cannot free it. We need to move it
>>>>   somewhere else, hence the problem.
>>
>> Right; QEMU cannot know either if the ROM will be mapped again (examples
>> include "cd /sys/bus/pci/devices/0000:0:03.0 && echo 1 > rom && cat rom" or a
>> warm reset).
>>
>>>> But fortunately we don't actually need to add the VGA ROM to the guest
>>>> physmap for it to work, QEMU can trap and emulate. In fact even today we
>>>> are not mapping it at the right place anyway, see xen_set_memory:
>>
>> But how can you execute from the VGA ROM then?
>
> I don't know, I guess we don't? In that case why does it work today?

Right, the ROM is copied down to low memory by firmware (hvmloader?).

>> Also, how do you migrate its contents?
>
> That would also not work. We would have to re-initialize it in QEMU on
> the receiving end.

That is problematic.  It would mean that a system reset after migration 
may auto-upgrade some parts of the firmware.

Paolo


* Re: gross qemu behavior
  2014-03-29  7:31         ` Paolo Bonzini
@ 2014-03-30  7:57           ` Fabio Fantoni
  0 siblings, 0 replies; 23+ messages in thread
From: Fabio Fantoni @ 2014-03-30  7:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Anthony PERARD, xen-devel, Jan Beulich, Stefano Stabellini


2014-03-29 8:31 GMT+01:00 Paolo Bonzini <pbonzini@redhat.com>:

> On 28/03/2014 19:30, Stefano Stabellini wrote:
>> On Fri, 28 Mar 2014, Paolo Bonzini wrote:
>>> On 28/03/2014 18:52, Stefano Stabellini wrote:
>>>>> This is a thorny issue, fixing this behavior is not going to be trivial:
>>>>>
>>>>> - The hypervisor/libxc does not currently expose a
>>>>>   xc_domain_remove_from_physmap function.
>>>>>
>>>>> - QEMU works by allocating memory regions at the end of the guest
>>>>>   physmap and then moving them at the right place.
>>>>>
>>>>> - QEMU can destroy a memory region and in that case we could free the
>>>>>   memory and remove it from the physmap, however that is NOT what QEMU
>>>>>   does with the vga ROM. In that case it calls
>>>>>   memory_region_del_subregion, so we can't be sure that the ROM won't be
>>>>>   mapped again, therefore we cannot free it. We need to move it
>>>>>   somewhere else, hence the problem.
>>>
>>> Right; QEMU cannot know either if the ROM will be mapped again (examples
>>> include "cd /sys/bus/pci/devices/0000:0:03.0 && echo 1 > rom && cat rom" or a
>>> warm reset).
>>>
>>>>> But fortunately we don't actually need to add the VGA ROM to the guest
>>>>> physmap for it to work, QEMU can trap and emulate. In fact even today we
>>>>> are not mapping it at the right place anyway, see xen_set_memory:
>>>
>>> But how can you execute from the VGA ROM then?
>>
>> I don't know, I guess we don't? In that case why does it work today?
>
> Right, the ROM is copied down to low memory by firmware (hvmloader?).


Only the vgabios and other ROMs of qemu traditional are included in and
loaded by hvmloader.
Some time ago, when I was trying to solve some problems with the emulated
VGAs, I came to suspect that the vgabios of qemu upstream was not loaded or
used correctly.
Someone had told me that it is loaded automatically by qemu when you use
qemu upstream.
Unfortunately I do not have enough knowledge and am not able to pin down
exactly which problems or missing pieces in xen would need fixing to solve
the problems with the emulated VGAs.
I did a lot of tests: comparing with kvm, using the same qemu parameters as
with xen, almost always showed higher video performance on kvm, and qxl was
not working on xen while showing too few errors/details in the logs. I
posted those long ago, unfortunately with no answers.
From what little I knew, it seemed to me that not all the RAM, or one or
more regions, was allocated or used correctly (there were memory errors in
the logs), and/or that the vgabios was not being loaded or used properly.
It seems that in this thread you are probably trying to solve problems
including the ones I found.

The last mail about my qxl tests, for example, is this:
http://lists.xen.org/archives/html/xen-devel/2013-12/msg00758.html
And the memory error in the domU logs of this test was:

ioremap error for 0xfc001000-0xfc002000, requested 0x10, got 0x0

There was also another test, maybe 2 years ago, whose data made me doubt
the proper loading or use of the stdvga vgabios with xen and qemu upstream,
but unfortunately I cannot find it now.


I will try to help with testing and post results/details if I can.
For example, some posts ago I saw Stabellini's patch, which seems to be
about loading of the vgabios and other ROMs; should it be tested?

Thanks for any reply and sorry for my bad English.


>
>
>  Also, how do you migrate its contents?
>>>
>>
>> That would also not work. We would have to re-initialize it in QEMU on
>> the receiving end.
>>
>
> That is problematic.  It would mean that a system reset after migration
> may auto-upgrade some parts of the firmware.
>
>
> Paolo
>

* Re: gross qemu behavior
  2014-03-28 17:46 ` Stefano Stabellini
  2014-03-28 17:52   ` Stefano Stabellini
@ 2014-03-31  9:07   ` Jan Beulich
  2014-04-03 16:12     ` Stefano Stabellini
  1 sibling, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2014-03-31  9:07 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Paolo Bonzini

>>> On 28.03.14 at 18:46, <stefano.stabellini@eu.citrix.com> wrote:
> But fortunately we don't actually need to add the VGA ROM to the guest
> physmap for it to work,

Is that true even when the ROM gets enabled by the guest?

> QEMU can trap and emulate. In fact even today we
> are not mapping it at the right place anyway, see xen_set_memory:
> 
>     if (add) {
>         if (!memory_region_is_rom(section->mr)) {
>             xen_add_to_physmap(state, start_addr, size,
>                                section->mr, section->offset_within_region);
>         } else {

Right - that's part of the problem. And it would seem to be better to
map it where it belongs (even if not enabled) than to have it sit at
some arbitrary place. But as that still wouldn't be correct, I'd clearly
prefer a proper solution.

> So the only solution I can see right now is:
> 
> - avoid allocating guest memory for the VGA ROM
> That means that at the beginning of xen_ram_alloc we need to realize
> that the memory region we are dealing with is the VGA ROM memory region
> and avoid calling xc_domain_populate_physmap_exact for it.
> 
> - call g_malloc instead
> Simply use g_malloc to allocate QEMU memory for the VGA ROM,
> keep track of the allocation in a data structure internal to xen-all.c.
> 
> - make sure that qemu_get_ram_ptr can deal with the different allocation
> Now that the VGA ROM is QEMU memory, we need to make sure that
> qemu_get_ram_ptr returns the right pointer for it.
> 
> 
> This is all very fiddly and hackish, but I can't see a better way of
> solving the issue.

Plus this all reads very VGA-special-casing to me, yet a proper model
would universally cover all PCI ROMs (emulated as well as passed
through).
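
As a rough illustration of the three quoted steps, here is a minimal sketch
in plain C. It is not the xen-all.c code: toy_ram_alloc(), toy_get_ram_ptr(),
the rom_table array and the is_rom flag are all hypothetical names, and
calloc() stands in for g_malloc(). How the real allocator would learn that
a region is a ROM is exactly the open question, so the flag is simply
passed in here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping entry: one host-side buffer per ROM region. */
struct rom_alloc {
    uint64_t ram_addr;  /* offset assigned to the region */
    size_t size;        /* region size in bytes */
    void *host;         /* host buffer (g_malloc in the proposal) */
};

#define MAX_ROMS 8
static struct rom_alloc rom_table[MAX_ROMS];
static size_t rom_count;

/*
 * Stand-in for the allocation hook. Returns 1 when the caller should
 * populate guest memory as usual, 0 when the region was placed in host
 * memory instead, and -1 on failure.
 */
static int toy_ram_alloc(uint64_t ram_addr, size_t size, int is_rom)
{
    if (!is_rom)
        return 1;
    if (rom_count == MAX_ROMS)
        return -1;
    void *buf = calloc(1, size);
    if (!buf)
        return -1;
    rom_table[rom_count++] = (struct rom_alloc){ ram_addr, size, buf };
    return 0;
}

/* Stand-in for the pointer lookup: resolve addresses inside a tracked ROM. */
static void *toy_get_ram_ptr(uint64_t addr)
{
    for (size_t i = 0; i < rom_count; i++) {
        const struct rom_alloc *r = &rom_table[i];
        if (addr >= r->ram_addr && addr - r->ram_addr < r->size)
            return (uint8_t *)r->host + (addr - r->ram_addr);
    }
    return NULL;  /* not a tracked ROM; the normal path would apply */
}
```

Nothing in this sketch is VGA-specific: any region flagged as a ROM goes
through the same table.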

Jan


* Re: gross qemu behavior
  2014-03-31  9:07   ` Jan Beulich
@ 2014-04-03 16:12     ` Stefano Stabellini
  2014-04-04  6:45       ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-04-03 16:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

On Mon, 31 Mar 2014, Jan Beulich wrote:
> >>> On 28.03.14 at 18:46, <stefano.stabellini@eu.citrix.com> wrote:
> > But fortunately we don't actually need to add the VGA ROM to the guest
> > physmap for it to work,
> 
> Is that true even when the ROM gets enabled by the guest?

Yes, I think so.


> > QEMU can trap and emulate. In fact even today we
> > are not mapping it at the right place anyway, see xen_set_memory:
> > 
> >     if (add) {
> >         if (!memory_region_is_rom(section->mr)) {
> >             xen_add_to_physmap(state, start_addr, size,
> >                                section->mr, section->offset_within_region);
> >         } else {
> 
> Right - that's part of the problem. And it would seem to be better to
> map it where it belongs (even if not enabled) than to have it sit at
> some arbitrary place. But as that still wouldn't be correct, I'd clearly
> prefer a proper solution.

We could go down this route, but then on unmap the rom would just be
moved back to the original place.
Do you think that would be a reasonable solution? From QEMU POV it would
certainly be better than the approach below.


> > So the only solution I can see right now is:
> > 
> > - avoid allocating guest memory for the VGA ROM
> > That means that at the beginning of xen_ram_alloc we need to realize
> > that the memory region we are dealing with is the VGA ROM memory region
> > and avoid calling xc_domain_populate_physmap_exact for it.
> > 
> > - call g_malloc instead
> > Simply use g_malloc to allocate QEMU memory for the VGA ROM,
> > keep track of the allocation in a data structure internal to xen-all.c.
> > 
> > - make sure that qemu_get_ram_ptr can deal with the different allocation
> > Now that the VGA ROM is QEMU memory, we need to make sure that
> > qemu_get_ram_ptr returns the right pointer for it.
> > 
> > 
> > This is all very fiddly and hackish, but I can't see a better way of
> > solving the issue.
> 
> Plus this all reads very VGA-special-casing to me, yet a proper model
> would universally cover all PCI ROMs (emulated as well as passed
> through).

This is the only option to avoid having the rom mapped at high addresses
in the guest memory map. It would work for all PCI ROMs.
The problem is realizing from xen_ram_alloc and qemu_ram_ptr_length that
we are dealing with a PCI ROM rather than something else.


* Re: gross qemu behavior
  2014-04-03 16:12     ` Stefano Stabellini
@ 2014-04-04  6:45       ` Jan Beulich
  2014-04-04  9:34         ` Paolo Bonzini
  2014-04-04 13:53         ` Stefano Stabellini
  0 siblings, 2 replies; 23+ messages in thread
From: Jan Beulich @ 2014-04-04  6:45 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Paolo Bonzini

>>> On 03.04.14 at 18:12, <stefano.stabellini@eu.citrix.com> wrote:
> On Mon, 31 Mar 2014, Jan Beulich wrote:
>> >>> On 28.03.14 at 18:46, <stefano.stabellini@eu.citrix.com> wrote:
>> > But fortunately we don't actually need to add the VGA ROM to the guest
>> > physmap for it to work,
>> 
>> Is that true even when the ROM gets enabled by the guest?
> 
> Yes, I think so.

Implying that any execution of code in the ROM would be fully
emulated. Very odd, but fitting the picture of trying to be as slow
as possible (in the context of the breakage introduced by
ef437690 "x86/HVM: correct CPUID leaf 80000008 handling" I
had to run qemu-traditional and qemu-upstream, and the
performance of the guest was visibly _much_ better with the former,
which I consider rather worrying).

>> > QEMU can trap and emulate. In fact even today we
>> > are not mapping it at the right place anyway, see xen_set_memory:
>> > 
>> >     if (add) {
>> >         if (!memory_region_is_rom(section->mr)) {
>> >             xen_add_to_physmap(state, start_addr, size,
>> >                                section->mr, section->offset_within_region);
>> >         } else {
>> 
>> Right - that's part of the problem. And it would seem to be better to
>> map it where it belongs (even if not enabled) than to have it sit at
>> some arbitrary place. But as that still wouldn't be correct, I'd clearly
>> prefer a proper solution.
> 
> We could go down this route, but then on unmap the rom would just be
> moved back to the original place.
> Do you think that would be a reasonable solution? From QEMU POV it would
> certainly be better than the approach below.

No, it should just never appear at the wrong address. As I said above,
I'd consider it halfway acceptable if it remained mapped despite an
unmap, but I do think that a proper solution (properly unmapping
without de-allocating) can and should be found. All the more so as this
intermediate approach can't really work, as I now realize: When
disabled while the guest is sizing the BARs, the address put there
would be all ones in the writable upper bits of the address, i.e. not
a place where the ROM could be legitimately mapped.
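
To make the sizing point concrete, here is a small sketch of an emulated
expansion ROM BAR in C. The register layout (bit 0 as decode enable, bits
31:11 as base address) follows the PCI specification; struct rom_bar and
the two helpers are illustrative names only, not qemu code.

```c
#include <assert.h>
#include <stdint.h>

#define ROM_BAR_ENABLE    0x00000001u  /* bit 0: ROM decode enable */
#define ROM_BAR_ADDR_MASK 0xfffff800u  /* bits 31:11: base address */

/* Toy expansion ROM BAR for a power-of-two sized ROM. */
struct rom_bar {
    uint32_t value;  /* current register contents */
    uint32_t size;   /* ROM size in bytes, power of two, >= 2 KiB */
};

static void rom_bar_write(struct rom_bar *bar, uint32_t v)
{
    /* Address bits below the ROM size are hard-wired to zero. */
    uint32_t writable = ROM_BAR_ADDR_MASK & ~(bar->size - 1);
    bar->value = (v & writable) | (v & ROM_BAR_ENABLE);
}

/* Size the BAR the way a guest would: write all ones, read back, restore. */
static uint32_t rom_bar_probe_size(struct rom_bar *bar)
{
    uint32_t saved = bar->value;
    rom_bar_write(bar, ROM_BAR_ADDR_MASK);  /* all ones, decode disabled */
    uint32_t readback = bar->value & ROM_BAR_ADDR_MASK;
    rom_bar_write(bar, saved);
    return ~readback + 1;
}
```

For a 64 KiB ROM the register transiently reads 0xffff0000 during the
probe, so a backend that kept the ROM mapped at "whatever the BAR
currently says" would try to place it at the very top of the 32-bit
address space.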

And btw - the current model is inconsistent anyway (and perhaps a
reason why certain things appear to not work right when a domain
has memory extending beyond 4G): Once other things (it was the
frame buffer in all cases I've seen) get mapped into the address
space, the ROM gets (implicitly) unmapped anyway. So relying on
it to stay somewhere in guest address space is broken in any event;
qemu's view just doesn't match reality anymore at that point.

Jan


* Re: gross qemu behavior
  2014-04-04  6:45       ` Jan Beulich
@ 2014-04-04  9:34         ` Paolo Bonzini
  2014-04-04  9:45           ` Jan Beulich
  2014-04-04 13:53         ` Stefano Stabellini
  1 sibling, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2014-04-04  9:34 UTC (permalink / raw)
  To: Jan Beulich, Stefano Stabellini; +Cc: anthony.perard, xen-devel

On 04/04/2014 08:45, Jan Beulich wrote:
> Implying that any execution of code in the ROM would be fully
> emulated.

ROM is never executed in place.  It is always copied to low RAM and
executed from there.  Trapping and emulating might slow down that copy.

In fact, on AMD it is not always possible to execute from ROM; if the 
ROM includes page tables, as was the case for example for 64-bit OVMF, 
it crashes because NPT expects page tables to be in writable guest memory.
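
For illustration, the copy step might look roughly like the C sketch
below. It relies only on the legacy option ROM header layout (signature
bytes 0x55 0xAA, then the image length in 512-byte units at offset 2);
shadow_option_rom() is a made-up name, and real firmware does considerably
more (checksum, PCI data structure, choosing the target address).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ROM_SIG0 0x55
#define ROM_SIG1 0xaa

/*
 * Copy an option ROM image into a RAM buffer so that it can be run from
 * there. Returns the number of bytes copied, or 0 if the image is not
 * valid or does not fit.
 */
static size_t shadow_option_rom(const uint8_t *rom, size_t rom_len,
                                uint8_t *ram, size_t ram_len)
{
    if (rom_len < 3 || rom[0] != ROM_SIG0 || rom[1] != ROM_SIG1)
        return 0;

    /* Byte 2 of the header holds the image length in 512-byte units. */
    size_t image_len = (size_t)rom[2] * 512;
    if (image_len == 0 || image_len > rom_len || image_len > ram_len)
        return 0;

    memcpy(ram, rom, image_len);
    return image_len;
}
```

With the ROM left out of the physmap, every read in that memcpy() would
be a trapped and emulated access, which is where a slowdown would come
from.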

> Very odd, but fitting the picture of trying to be as slow
> as possible (in the context of the breakage introduced by
> ef437690 "x86/HVM: correct CPUID leaf 80000008 handling" I
> had to run qemu-traditional and qemu-upstream, and the
> performance of the guest was visibly _much_ better with the former,
> which I consider rather worrying).

That's quite unexpected.  What was your configuration and workload?  And 
what was slower exactly?  Disk, network or video (as an initial 
simplification).

Paolo


* Re: gross qemu behavior
  2014-04-04  9:34         ` Paolo Bonzini
@ 2014-04-04  9:45           ` Jan Beulich
  0 siblings, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2014-04-04  9:45 UTC (permalink / raw)
  To: Stefano Stabellini, Paolo Bonzini; +Cc: anthony.perard, xen-devel

>>> On 04.04.14 at 11:34, <pbonzini@redhat.com> wrote:
> On 04/04/2014 08:45, Jan Beulich wrote:
>> Very odd, but fitting the picture of trying to be as slow
>> as possible (in the context of the breakage introduced by
>> ef437690 "x86/HVM: correct CPUID leaf 80000008 handling" I

(For the record - I was off by a line when copy-and-pasting this,
it really was 8bad6c56 "x86/HVM: fix preemption handling in
do_hvm_op()").

>> had to run qemu-traditional and qemu-upstream, and the
>> performance of the guest was visibly _much_ better with the former,
>> which I consider rather worrying).
> 
> That's quite unexpected.  What was your configuration and workload?  And 
> what was slower exactly?  Disk, network or video (as an initial 
> simplification).

Video in particular. Disk and network, using PV drivers, obviously are
pretty independent of the qemu version (and don't matter much during
early boot). But even normal execution during early BIOS initialization
seems notably slower (under the assumption that when nothing
changes on the virtual screen, video performance doesn't matter).
But of course that's not comparing apples to apples, as the two
BIOSes also differ...

Jan


* Re: gross qemu behavior
  2014-04-04  6:45       ` Jan Beulich
  2014-04-04  9:34         ` Paolo Bonzini
@ 2014-04-04 13:53         ` Stefano Stabellini
  2014-04-04 14:58           ` Jan Beulich
  1 sibling, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-04-04 13:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

On Fri, 4 Apr 2014, Jan Beulich wrote:
> >>> On 03.04.14 at 18:12, <stefano.stabellini@eu.citrix.com> wrote:
> > On Mon, 31 Mar 2014, Jan Beulich wrote:
> >> >>> On 28.03.14 at 18:46, <stefano.stabellini@eu.citrix.com> wrote:
> >> > But fortunately we don't actually need to add the VGA ROM to the guest
> >> > physmap for it to work,
> >> 
> >> Is that true even when the ROM gets enabled by the guest?
> > 
> > Yes, I think so.
> 
> Implying that any execution of code in the ROM would be fully
> emulated. Very odd, but fitting the picture of trying to be as slow
> as possible (in the context of the breakage introduced by
> ef437690 "x86/HVM: correct CPUID leaf 80000008 handling" I
> had to run qemu-traditional and qemu-upstream, and the
> > performance of the guest was visibly _much_ better with the former,
> which I consider rather worrying).
> 
> >> > QEMU can trap and emulate. In fact even today we
> >> > are not mapping it at the right place anyway, see xen_set_memory:
> >> > 
> >> >     if (add) {
> >> >         if (!memory_region_is_rom(section->mr)) {
> >> >             xen_add_to_physmap(state, start_addr, size,
> >> >                                section->mr, section->offset_within_region);
> >> >         } else {
> >> 
> >> Right - that's part of the problem. And it would seem to be better to
> >> map it where it belongs (even if not enabled) than to have it sit at
> >> some arbitrary place. But as that still wouldn't be correct, I'd clearly
> >> prefer a proper solution.
> > 
> > We could go down this route, but then on unmap the rom would just be
> > moved back to the original place.
> > Do you think that would be a reasonable solution? From QEMU POV it would
> > certainly be better than the approach below.
> 
> No, it should just never appear at the wrong address. As I said above,
> I'd consider it halfway acceptable if it remained mapped despite an
> unmap, but I do think that a proper solution (properly unmapping
> without de-allocating) can and should be found.

There is no way to do it today AFAICT.
Would you care to propose a hypercall that would support this
scenario?


> All the more so as this
> intermediate approach can't really work, as I now realize: When
> disabled while the guest is sizing the BARs, the address put there
> would be all ones in the writable upper bits of the address, i.e. not
> a place where the ROM could be legitimately mapped.
>
> And btw - the current model is inconsistent anyway (and perhaps a
> reason why certain things appear to not work right when a domain
> has memory extending beyond 4G): Once other things (it was the
> frame buffer in all cases I've seen) get mapped into the address
> space, the ROM gets (implicitly) unmapped anyway. So relying on
> it to stay somewhere in guest address space is broken in any event;
> qemu's view just doesn't match reality anymore at that point.

The alternative, never mapping any ROMs in the guest address space, has
other issues too:

- inconsistency with QEMU's way of doing things
- firmware update on migration (as Paolo pointed out)

I don't really see this as a huge step forward.


* Re: gross qemu behavior
  2014-04-04 13:53         ` Stefano Stabellini
@ 2014-04-04 14:58           ` Jan Beulich
  2014-04-04 15:32             ` Stefano Stabellini
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2014-04-04 14:58 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Paolo Bonzini

>>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
> On Fri, 4 Apr 2014, Jan Beulich wrote:
>> No, it should just never appear at the wrong address. As I said above,
>> I'd consider it halfway acceptable if it remained mapped despite an
>> unmap, but I do think that a proper solution (properly unmapping
>> without de-allocating) can and should be found.
> 
> There is no way to do it today AFAICT.
> Would you care to propose a hypercall that would support this
> scenario?

Hypercall? Everything you need is there afaict.

>> All the more so as this
>> intermediate approach can't really work, as I now realize: When
>> disabled while the guest is sizing the BARs, the address put there
>> would be all ones in the writable upper bits of the address, i.e. not
>> a place where the ROM could be legitimately mapped.
>>
>> And btw - the current model is inconsistent anyway (and perhaps a
>> reason why certain things appear to not work right when a domain
>> has memory extending beyond 4G): Once other things (it was the
>> frame buffer in all cases I've seen) get mapped into the address
>> space, the ROM gets (implicitly) unmapped anyway. So relying on
>> it to stay somewhere in guest address space is broken in any event;
>> qemu's view just doesn't match reality anymore at that point.
> 
> The alternative, never mapping any ROMs in the guest address space, has
> other issues too:
> 
> - inconsistency with QEMU's way of doing things
> - firmware update on migration (as Paolo pointed out)
> 
> I don't really see this as a huge step forward.

And I never proposed this as an alternative.

Jan


* Re: gross qemu behavior
  2014-04-04 14:58           ` Jan Beulich
@ 2014-04-04 15:32             ` Stefano Stabellini
  2014-04-04 16:00               ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-04-04 15:32 UTC (permalink / raw)
  To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

On Fri, 4 Apr 2014, Jan Beulich wrote:
> >>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
> > On Fri, 4 Apr 2014, Jan Beulich wrote:
> >> No, it should just never appear at the wrong address. As I said above,
> >> I'd consider it halfway acceptable if it remained mapped despite an
> >> unmap, but I do think that a proper solution (properly unmapping
> >> without de-allocating) can and should be found.
> > 
> > There is no way to do it today AFAICT.
> > Would you care to propose a hypercall that would support this
> > scenario?
> 
> Hypercall? Everything you need is there afaict.

Maybe I am missing something.

Moving the PCI ROM to the right place in the guest physmap is easy.
However how do you think we could unmap the memory (remove it from the
guest physmap) without deallocating it?
The only hypercall we have is xc_domain_add_to_physmap at the moment.

Unless you are thinking of allocating the ROM in a QEMU buffer, but that
would still go against QEMU's way and has the same bad side effects as
the other approach.


* Re: gross qemu behavior
  2014-04-04 15:32             ` Stefano Stabellini
@ 2014-04-04 16:00               ` Jan Beulich
  2014-04-04 16:54                 ` Stefano Stabellini
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2014-04-04 16:00 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: anthony.perard, xen-devel, Paolo Bonzini

>>> On 04.04.14 at 17:32, <stefano.stabellini@eu.citrix.com> wrote:
> On Fri, 4 Apr 2014, Jan Beulich wrote:
>> >>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
>> > On Fri, 4 Apr 2014, Jan Beulich wrote:
>> >> No, it should just never appear at the wrong address. As I said above,
>> >> I'd consider it halfway acceptable if it remained mapped despite an
>> >> unmap, but I do think that a proper solution (properly unmapping
>> >> without de-allocating) can and should be found.
>> > 
>> > There is no way to do it today AFAICT.
>> > Would you care to propose a hypercall that would support this
>> > scenario?
>> 
>> Hypercall? Everything you need is there afaict.
> 
> Maybe I am missing something.
> 
> Moving the PCI ROM to the right place in the guest physmap is easy.
> However how do you think we could unmap the memory (remove it from the
> guest physmap) without deallocating it?
> The only hypercall we have is xc_domain_add_to_physmap at the moment.

XENMEM_remove_from_physmap should be quite suitable here, but
that is not your problem. The problem is that XENMEM_add_to_physmap
requires a GFN to be passed in, i.e. assumes that a page to be mapped
is already mapped somewhere in the guest. (The term "add" in this
context is rather confusing, as in the GMFN map space case the page
isn't being added, but moved.) So indeed there is functionality missing
in the hypervisor.
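
The gap can be seen in a toy model of the physmap, sketched below in C.
This is not the hypervisor's p2m code and the toy_* names are invented;
it only mirrors the semantics described above: in the GMFN map space the
source page is named by a GFN that must currently be mapped, and the
operation moves the page rather than adding it.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_GFNS    16
#define INVALID_MFN UINT64_MAX

/* Toy physmap: the index is the guest frame number, the value the MFN. */
static uint64_t p2m[TOY_GFNS];

static void toy_p2m_init(void)
{
    for (int i = 0; i < TOY_GFNS; i++)
        p2m[i] = INVALID_MFN;
}

/* GMFN map space: move the page currently found at src_gfn to dst_gfn. */
static int toy_add_to_physmap_gmfn(unsigned src_gfn, unsigned dst_gfn)
{
    if (src_gfn >= TOY_GFNS || dst_gfn >= TOY_GFNS)
        return -1;
    if (p2m[src_gfn] == INVALID_MFN)
        return -1;  /* nothing mapped there, so the page cannot be named */
    uint64_t mfn = p2m[src_gfn];
    p2m[src_gfn] = INVALID_MFN;
    p2m[dst_gfn] = mfn;
    return 0;
}

/* Drop the mapping; in this model nothing remembers the MFN afterwards. */
static int toy_remove_from_physmap(unsigned gfn)
{
    if (gfn >= TOY_GFNS || p2m[gfn] == INVALID_MFN)
        return -1;
    p2m[gfn] = INVALID_MFN;
    return 0;
}
```

Once toy_remove_from_physmap() has run, there is no GFN left to hand to
toy_add_to_physmap_gmfn(), which is the missing piece: a way to refer to
a page that is allocated to the domain but currently mapped nowhere.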

Jan


* Re: gross qemu behavior
  2014-04-04 16:00               ` Jan Beulich
@ 2014-04-04 16:54                 ` Stefano Stabellini
  2014-05-05 10:04                   ` Fabio Fantoni
  0 siblings, 1 reply; 23+ messages in thread
From: Stefano Stabellini @ 2014-04-04 16:54 UTC (permalink / raw)
  To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

On Fri, 4 Apr 2014, Jan Beulich wrote:
> >>> On 04.04.14 at 17:32, <stefano.stabellini@eu.citrix.com> wrote:
> > On Fri, 4 Apr 2014, Jan Beulich wrote:
> >> >>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
> >> > On Fri, 4 Apr 2014, Jan Beulich wrote:
> >> >> No, it should just never appear at the wrong address. As I said above,
> >> >> I'd consider it halfway acceptable if it remained mapped despite an
> >> >> unmap, but I do think that a proper solution (properly unmapping
> >> >> without de-allocating) can and should be found.
> >> > 
> >> > There is no way to do it today AFAICT.
> >> > Would you care to propose a hypercall that would support this
> >> > scenario?
> >> 
> >> Hypercall? Everything you need is there afaict.
> > 
> > Maybe I am missing something.
> > 
> > Moving the PCI ROM to the right place in the guest physmap is easy.
> > However how do you think we could unmap the memory (remove it from the
> > guest physmap) without deallocating it?
> > The only hypercall we have is xc_domain_add_to_physmap at the moment.
> 
> XENMEM_remove_from_physmap should be quite suitable here, but
> that is not your problem. The problem is that XENMEM_add_to_physmap
> requires a GFN to be passed in, i.e. assumes that a page to be mapped
> is already mapped somewhere in the guest. (The term "add" in this
> context is rather confusing, as in the GMFN map space case the page
> isn't being added, but moved.) So indeed there is functionality missing
> in the hypervisor.

Right.

After we call XENMEM_remove_from_physmap, there is no way of adding back
the pages to the physmap without knowing the corresponding mfns.  Even
if we knew the mfns of the original allocation, relying on them is
probably not a good idea because the pages could theoretically be
offlined or shared.

This is the reason why I was asking for the hypercall you had in mind to
solve this problem.


* Re: gross qemu behavior
  2014-04-04 16:54                 ` Stefano Stabellini
@ 2014-05-05 10:04                   ` Fabio Fantoni
  2014-05-05 10:35                     ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Fabio Fantoni @ 2014-05-05 10:04 UTC (permalink / raw)
  To: Stefano Stabellini, Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini

On 04/04/2014 18:54, Stefano Stabellini wrote:
> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>> On 04.04.14 at 17:32, <stefano.stabellini@eu.citrix.com> wrote:
>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
>>>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>> No, it should just never appear at the wrong address. As I said above,
>>>>>> I'd consider it halfway acceptable if it remained mapped despite an
>>>>>> unmap, but I do think that a proper solution (properly unmapping
>>>>>> without de-allocating) can and should be found.
>>>>> There is no way to do it today AFAICT.
>>>>> Would you care to propose a hypercall that would support this
>>>>> scenario?
>>>> Hypercall? Everything you need is there afaict.
>>> Maybe I am missing something.
>>>
>>> Moving the PCI ROM to the right place in the guest physmap is easy.
>>> However how do you think we could unmap the memory (remove it from the
>>> guest physmap) without deallocating it?
>>> The only hypercall we have is xc_domain_add_to_physmap at the moment.
>> XENMEM_remove_from_physmap should be quite suitable here, but
>> that is not your problem. The problem is that XENMEM_add_to_physmap
>> requires a GFN to be passed in, i.e. assumes that a page to be mapped
>> is already mapped somewhere in the guest. (The term "add" in this
>> context is rather confusing, as in the GMFN map space case the page
>> isn't being added, but moved.) So indeed there is functionality missing
>> in the hypervisor.
> Right.
>
> After we call XENMEM_remove_from_physmap, there is no way of adding back
> the pages to the physmap without knowing the corresponding mfns.  Even
> if we knew the mfns of the original allocation, relying on them is
> probably not a good idea because the pages could theoretically be
> offlined or shared.
>
> This is the reason why I was asking for the hypercall you had in mind to
> solve this problem.

Any news about this discussion?

Thanks for any reply.


* Re: gross qemu behavior
  2014-05-05 10:04                   ` Fabio Fantoni
@ 2014-05-05 10:35                     ` Jan Beulich
  2014-05-05 11:10                       ` Fabio Fantoni
  0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2014-05-05 10:35 UTC (permalink / raw)
  To: Fabio Fantoni
  Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

>>> On 05.05.14 at 12:04, <fabio.fantoni@m2r.biz> wrote:
> On 04/04/2014 18:54, Stefano Stabellini wrote:
>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>> On 04.04.14 at 17:32, <stefano.stabellini@eu.citrix.com> wrote:
>>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
>>>>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>> No, it should just never appear at the wrong address. As I said above,
>>>>>>> I'd consider it halfway acceptable if it remained mapped despite an
>>>>>>> unmap, but I do think that a proper solution (properly unmapping
>>>>>>> without de-allocating) can and should be found.
>>>>>> There is no way to do it today AFAICT.
>>>>>> Would you care to propose a hypercall that would support this
>>>>>> scenario?
>>>>> Hypercall? Everything you need is there afaict.
>>>> Maybe I am missing something.
>>>>
>>>> Moving the PCI ROM to the right place in the guest physmap is easy.
>>>> However how do you think we could unmap the memory (remove it from the
>>>> guest physmap) without deallocating it?
>>>> The only hypercall we have is xc_domain_add_to_physmap at the moment.
>>> XENMEM_remove_from_physmap should be quite suitable here, but
>>> that is not your problem. The problem is that XENMEM_add_to_physmap
>>> requires a GFN to be passed in, i.e. assumes that a page to be mapped
>>> is already mapped somewhere in the guest. (The term "add" in this
>>> context is rather confusing, as in the GMFN map space case the page
>>> isn't being added, but moved.) So indeed there is functionality missing
>>> in the hypervisor.
>> Right.
>>
>> After we call XENMEM_remove_from_physmap, there is no way of adding back
>> the pages to the physmap without knowing the corresponding mfns.  Even
>> if we knew the mfns of the original allocation, relying on them is
>> probably not a good idea because the pages could theoretically be
>> offlined or shared.
>>
>> This is the reason why I was asking for the hypercall you had in mind to
>> solve this problem.
> 
> Any news about this discussion?

Someone would need to write both hypervisor and qemu side patches
- are you volunteering?

Jan


* Re: gross qemu behavior
  2014-05-05 10:35                     ` Jan Beulich
@ 2014-05-05 11:10                       ` Fabio Fantoni
  0 siblings, 0 replies; 23+ messages in thread
From: Fabio Fantoni @ 2014-05-05 11:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: anthony.perard, xen-devel, Paolo Bonzini, Stefano Stabellini

On 05/05/2014 12:35, Jan Beulich wrote:
>>>> On 05.05.14 at 12:04, <fabio.fantoni@m2r.biz> wrote:
>> On 04/04/2014 18:54, Stefano Stabellini wrote:
>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>> On 04.04.14 at 17:32, <stefano.stabellini@eu.citrix.com> wrote:
>>>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>>>> On 04.04.14 at 15:53, <stefano.stabellini@eu.citrix.com> wrote:
>>>>>>> On Fri, 4 Apr 2014, Jan Beulich wrote:
>>>>>>>> No, it should just never appear at the wrong address. As I said above,
>>>>>>>> I'd consider it halfway acceptable if it remained mapped despite an
>>>>>>>> unmap, but I do think that a proper solution (properly unmapping
>>>>>>>> without de-allocating) can and should be found.
>>>>>>> There is no way to do it today AFAICT.
>>>>>>> Would you care to propose a hypercall that would support this
>>>>>>> scenario?
>>>>>> Hypercall? Everything you need is there afaict.
>>>>> Maybe I am missing something.
>>>>>
>>>>> Moving the PCI ROM to the right place in the guest physmap is easy.
>>>>> However how do you think we could unmap the memory (remove it from the
>>>>> guest physmap) without deallocating it?
>>>>> The only hypercall we have is xc_domain_add_to_physmap at the moment.
>>>> XENMEM_remove_from_physmap should be quite suitable here, but
>>>> that is not your problem. The problem is that XENMEM_add_to_physmap
>>>> requires a GFN to be passed in, i.e. assumes that a page to be mapped
>>>> is already mapped somewhere in the guest. (The term "add" in this
>>>> context is rather confusing, as in the GMFN map space case the page
>>>> isn't being added, but moved.) So indeed there is functionality missing
>>>> in the hypervisor.
>>> Right.
>>>
>>> After we call XENMEM_remove_from_physmap, there is no way of adding back
>>> the pages to the physmap without knowing the corresponding mfns.  Even
>>> if we knew the mfns of the original allocation, relying on them is
>>> probably not a good idea because the pages could theoretically be
>>> offlined or shared.
>>>
>>> This is the reason why I was asking for the hypercall you had in mind to
>>> solve this problem.
>> Any news about this discussion?
> Someone would need to write both hypervisor and qemu side patches
> - are you volunteering?
>
> Jan
>
Thanks for your reply.
Unfortunately I don't have enough knowledge to write such patches.
I just wanted to know if there was any news.


end of thread, last message 2014-05-05 11:10 UTC

Thread overview: 23+ messages
2014-03-28  7:48 gross qemu behavior Jan Beulich
2014-03-28  9:21 ` Jan Beulich
2014-03-28  9:30 ` Fabio Fantoni
2014-03-28 10:37   ` Jan Beulich
2014-03-28 17:46 ` Stefano Stabellini
2014-03-28 17:52   ` Stefano Stabellini
2014-03-28 18:01     ` Paolo Bonzini
2014-03-28 18:30       ` Stefano Stabellini
2014-03-29  7:31         ` Paolo Bonzini
2014-03-30  7:57           ` Fabio Fantoni
2014-03-31  9:07   ` Jan Beulich
2014-04-03 16:12     ` Stefano Stabellini
2014-04-04  6:45       ` Jan Beulich
2014-04-04  9:34         ` Paolo Bonzini
2014-04-04  9:45           ` Jan Beulich
2014-04-04 13:53         ` Stefano Stabellini
2014-04-04 14:58           ` Jan Beulich
2014-04-04 15:32             ` Stefano Stabellini
2014-04-04 16:00               ` Jan Beulich
2014-04-04 16:54                 ` Stefano Stabellini
2014-05-05 10:04                   ` Fabio Fantoni
2014-05-05 10:35                     ` Jan Beulich
2014-05-05 11:10                       ` Fabio Fantoni
