From mboxrd@z Thu Jan 1 00:00:00 1970
From: Gordan Bobic
Subject: Re: HVM support for e820_host (Was: Bug: Limitation of <=2GB RAM in domU persists with 4.3.0)
Date: Tue, 03 Sep 2013 21:35:50 +0100
Message-ID: <52264826.3010402@bobich.net>
References: <8426aecf79e7f55c21bbe259014591a2@mail.shatteredsilicon.net>
 <20130724163102.GA6308@phenom.dumpdata.com>
 <51F051F1.5050806@bobich.net>
 <51F19D11.1090200@bobich.net>
 <51F1A54D.6070906@bobich.net>
 <1374798084.10269.2.camel@hastur.hellion.org.uk>
 <20130729180431.GQ5848@phenom.dumpdata.com>
 <20130903145934.GC1487@konrad-lan.dumpdata.com>
 <52263CBD.1090402@bobich.net>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------040607020509090607070804"
Return-path:
In-Reply-To: <52263CBD.1090402@bobich.net>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Konrad Rzeszutek Wilk
Cc: xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--------------040607020509090607070804
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

First attempt at a test run predictably failed.

I added e820_host=1 to a VM config and tried starting it:

[root@normandy ~]# xl create /etc/xen/edi
Parsing config from /etc/xen/edi
libxl: error: libxl_x86.c:307:libxl__arch_domain_create: Failed while collecting E820 with: -3 (errno:-1)
libxl: error: libxl_create.c:901:domcreate_rebuild_done: cannot (re-)build domain: -3
libxl: error: libxl_dm.c:1300:libxl__destroy_device_model: could not find device-model's pid for dom 1
libxl: error: libxl.c:1415:libxl__destroy_domid: libxl__destroy_device_model failed for 1

xl-edi.log, qemu-dm-edi.log attached. Both actually look identical to
previous logs before the patch.

Is this something that is clearly a consequence of the patch being
incomplete? Or did I break something?
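
For anyone triaging the errors above, here is a minimal sketch (not part of the patch) that splits each libxl error line into source file, line number, function and return code. The code-to-name table is an assumption recalled from libxl's error enum and should be checked against your own tree; the helper name parse_libxl_error is made up for illustration.

```python
import re

# Assumed subset of libxl's error codes (recalled from the libxl error enum;
# verify against your own source tree before relying on it).
LIBXL_ERRORS = {
    -1: "ERROR_NONSPECIFIC",
    -2: "ERROR_VERSION",
    -3: "ERROR_FAIL",
    -4: "ERROR_NI",
    -5: "ERROR_NOMEM",
    -6: "ERROR_INVAL",
}

# Shape of the lines in the report: libxl: error: <file>:<line>:<function>: <message>
LINE_RE = re.compile(
    r"libxl: error: (?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+): (?P<msg>.*)"
)
# A return code is taken to be a negative integer that follows ": " in the message.
RC_RE = re.compile(r": (-\d+)")


def parse_libxl_error(line):
    """Hypothetical helper: return a dict describing one libxl error line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    rc_match = RC_RE.search(m.group("msg"))
    rc = int(rc_match.group(1)) if rc_match else None
    return {
        "file": m.group("file"),
        "line": int(m.group("line")),
        "func": m.group("func"),
        "rc": rc,
        "rc_name": LIBXL_ERRORS.get(rc),  # None when no code or an unlisted code
    }


if __name__ == "__main__":
    first = ("libxl: error: libxl_x86.c:307:libxl__arch_domain_create: "
             "Failed while collecting E820 with: -3 (errno:-1)")
    print(parse_libxl_error(first))
```

If that table is right, the -3 in the first two lines is the generic failure code, so the interesting part is the location (libxl_x86.c:307) rather than the number itself.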

Gordan

On 09/03/2013 08:47 PM, Gordan Bobic wrote:
> On 09/03/2013 03:59 PM, Konrad Rzeszutek Wilk wrote:
>
>>>>> 2) Further, I'm finding myself motivated to write that
>>>>> auto-set (as opposed to hard coded) vBAR=pBAR patch discussed
>>>>> briefly a week or so ago (have an init script read the BAR
>>>>> info from dom0 and put it in xenstore, plus a patch to
>>>>> build pBAR=vBAR reservations dynamically rather than
>>>>> statically, based on this data). Now, I'm quite fluent in C,
>>>>> but my familiarity with Xen source code is nearly non-existent
>>>>> (limited to studying an old unsupported patch every now and then
>>>>> in order to make it apply to a more recent code release).
>>>>> Can anyone help me out with a high level view WRT where
>>>>> this would be best plumbed in (which files and the flow of
>>>>> control between the affected files)?
>>>>
>>>> hvmloader probably and the libxl e820 code. From a
>>>> high-level view, what needs to happen is:
>>>> 1). Need to relax the check in libxl for e820_host
>>>> to also do it for HVM guests. Said code just iterates over the
>>>> host E820 and sanitizes it a bit and makes an E820 hypercall to
>>>> set it for the guest.
> [snip]
>
> OK, I have attached a preliminary patch against 4.3.0 for the libxl
> part. It compiles. I haven't tried running it to see if it actually
> works or does something, but my packages build.
>
> Please let me know if I've missed anything. On its own, I don't think
> this patch will do much (apart from maybe break HVM hosts with
> e820_host=1 set).
>
>>>> 2). Figure out whether the E820 hypercall (which sets the E820
>>>> layout for a guest) can be run on HVM guests. I think it
>>>> could not, and Mukesh in his PVH patches posted a patch
>>>> to enable that - "..Move e820 fields out of pv_domain struct"
>
> Is this already in 4.3.0 or is this an out-of-tree patch? Do you have a
> link to it handy?
>
>>>> 3). Hvmloader should do an E820 get machine memory hypercall
>>>> to see if there is anything there. If there is - that means
>>>> the toolstack has requested a "new" type of E820. Iterate
>>>> over the E820 and make it look like that.
>>>> You can look in the Linux arch/x86/xen/setup.c to see how
>>>> it does that.
>>>>
>>>> The complication there is that hvmloader needs to fit the
>>>> ACPI code (the guest type one) and such.
>>>> Presumably you can just re-use the existing spaces that
>>>> the host has marked as E820_RESERVED or E820_ACPI..
>>>
>>> Yup, I get it. Not only that, but it should also ideally (not
>>> strictly necessary, but it'd be handy) map the IOMEM for devices
>>> it is passed so that pBAR=vBAR (as opposed to just leaving all
>>> the host e820 reserved areas well alone - which would work for
>>> most things).
>>
>> Yes. That is an extra complication that could be done in subsequent
>> patches. But in theory if you have the E820 mirrored from the host the
>> pBAR=vBAR should be easy enough as the values from the host BARs can
>> easily fit in the E820 gaps.
>
> Agreed. Let's leave the pBAR=vBAR part for a separate patch set. I'll
> have to figure out a sensible way to query the IOMEM regions for each of
> the devices passed to the VM and make sure they are in the same hole.
>
>>>> Then the SMBIOS would need to move and the BIOS
>>>> might need to be relocated - but I think those are relocatable
>>>> in some form.
>
> [bit above left for later reference]
>
>>>> Well, I am more than happy to help you with this.
>>>
>>> Thanks, much appreciated. :)
>>
>> Yeeey! Vict^H^H^H^volunteer :-)!
>>
>> I am also reachable on IRC (FreeNode mostly) as either darnok or konrad
>> if that would be more convenient to discuss this.
>
> Thanks. I'll keep that in mind. :)
>
> Gordan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>

--------------040607020509090607070804
Content-Type: text/plain; charset=UTF-8; name="qemu-dm-edi.log"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="qemu-dm-edi.log"

domid: 1
Using file /dev/zvol/ssd/edi in read-write mode
Watching /local/domain/0/device-model/1/logdirty/cmd
Watching /local/domain/0/device-model/1/command
Watching /local/domain/1/cpu
char device redirected to /dev/pts/3
qemu_map_cache_init nr_buckets = 10000 size 4194304
shared page at pfn feffd
buffered io page at pfn feffb
Guest uuid = a57e6840-e9f5-4a14-a822-b2cc662c177f
populating video RAM at ff000000
mapping video RAM from ff000000
Register xen platform.
Done register platform.
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is rw state.
xs_read(/local/domain/0/device-model/1/xen_extended_power_mgmt): read error
xs_read(): vncpasswd get error. /vm/a57e6840-e9f5-4a14-a822-b2cc662c177f/vncpasswd.
Log-dirty: no command yet.
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
vcpu-set: watch node error.
[xenstore_process_vcpu_set_event]: /local/domain/1/cpu has no CPU!
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
xs_read(/local/domain/1/log-throttling): read error
qemu: ignoring not-understood drive `/local/domain/1/log-throttling'
medium change watch on `/local/domain/1/log-throttling' - unknown device, ignored
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
dm-command: hot insert pass-through pci dev
register_real_device: Assigning real physical device 08:00.0 ...
register_real_device: Disable MSI translation via per device option
register_real_device: Enable power management
pt_iomul_init: Error: pt_iomul_init can't open file /dev/xen/pci_iomul: No such file or directory: 0x8:0x0.0x0
pt_register_regions: IO region registered (size=0x02000000 base_addr=0xf8000000)
pt_register_regions: IO region registered (size=0x08000000 base_addr=0xb800000c)
pt_register_regions: IO region registered (size=0x04000000 base_addr=0xb400000c)
pt_register_regions: IO region registered (size=0x00000080 base_addr=0x0000cf81)
pt_register_regions: Expansion ROM registered (size=0x00080000 base_addr=0xfbc00000)
pci_intx: intx=1
register_real_device: Real physical device 08:00.0 registered successfuly!
IRQ type = INTx
dm-command: hot insert pass-through pci dev
register_real_device: Assigning real physical device 08:00.1 ...
register_real_device: Disable MSI translation via per device option
register_real_device: Enable power management
pt_iomul_init: Error: pt_iomul_init can't open file /dev/xen/pci_iomul: No such file or directory: 0x8:0x0.0x1
pt_register_regions: IO region registered (size=0x00004000 base_addr=0xfbcfc000)
pci_intx: intx=2
register_real_device: Real physical device 08:00.1 registered successfuly!
IRQ type = INTx
dm-command: hot insert pass-through pci dev
register_real_device: Assigning real physical device 0c:00.0 ...
register_real_device: Disable MSI translation via per device option
register_real_device: Enable power management
pt_iomul_init: Error: pt_iomul_init can't open file /dev/xen/pci_iomul: No such file or directory: 0xc:0x0.0x0
pt_register_regions: IO region registered (size=0x00004000 base_addr=0xd7efc000)
pci_intx: intx=1
register_real_device: Real physical device 0c:00.0 registered successfuly!
IRQ type = INTx
dm-command: hot insert pass-through pci dev
register_real_device: Assigning real physical device 00:1a.1 ...
register_real_device: Disable MSI translation via per device option
register_real_device: Enable power management
pt_iomul_init: Error: pt_iomul_init can't open file /dev/xen/pci_iomul: No such file or directory: 0x0:0x1a.0x1
pt_register_regions: IO region registered (size=0x00000020 base_addr=0x00008a01)
pci_intx: intx=2
register_real_device: Real physical device 00:1a.1 registered successfuly!
IRQ type = INTx
pt_iomem_map: e_phys=e0000000 maddr=b8000000 type=8 len=134217728 index=1 first_map=1
pt_iomem_map: e_phys=e8000000 maddr=b4000000 type=8 len=67108864 index=3 first_map=1
pt_iomem_map: e_phys=ec000000 maddr=f8000000 type=0 len=33554432 index=0 first_map=1
vga s->lfb_addr = ef000000 s->lfb_end = ef800000
pt_iomem_map: e_phys=ef8a0000 maddr=fbcfc000 type=0 len=16384 index=0 first_map=1
pt_iomem_map: e_phys=ef8a4000 maddr=d7efc000 type=0 len=16384 index=0 first_map=1
pt_ioport_map: e_phys=c100 pio_base=cf80 len=128 index=5 first_map=1
pt_ioport_map: e_phys=c1e0 pio_base=8a00 len=32 index=4 first_map=1
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is rw state.
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is ro state.
Unknown PV product 2 loaded in guest
PV driver build 1
region type 0 at [ef880000,ef8a0000).
squash iomem [ef880000, ef8a0000).
region type 1 at [c180,c1c0).
vga s->lfb_addr = ef000000 s->lfb_end = ef800000
pt_iomem_map: e_phys=ffffffff maddr=f8000000 type=0 len=33554432 index=0 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=b8000000 type=8 len=134217728 index=1 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=b4000000 type=8 len=67108864 index=3 first_map=0
pt_ioport_map: e_phys=ffff pio_base=cf80 len=128 index=5 first_map=0
pt_iomem_map: e_phys=ec000000 maddr=f8000000 type=0 len=33554432 index=0 first_map=0
pt_iomem_map: e_phys=e0000000 maddr=b8000000 type=8 len=134217728 index=1 first_map=0
pt_iomem_map: e_phys=e8000000 maddr=b4000000 type=8 len=67108864 index=3 first_map=0
pt_ioport_map: e_phys=c100 pio_base=cf80 len=128 index=5 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=fbcfc000 type=0 len=16384 index=0 first_map=0
pt_pci_write_config: [00:06:0] Warning: Guest attempt to set address to unused Base Address Register. [Offset:30h][Length:4]
pt_iomem_map: e_phys=ef8a0000 maddr=fbcfc000 type=0 len=16384 index=0 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=d7efc000 type=0 len=16384 index=0 first_map=0
pt_pci_write_config: [00:07:0] Warning: Guest attempt to set address to unused Base Address Register. [Offset:30h][Length:4]
pt_iomem_map: e_phys=ef8a4000 maddr=d7efc000 type=0 len=16384 index=0 first_map=0
pt_ioport_map: e_phys=ffff pio_base=8a00 len=32 index=4 first_map=0
pt_pci_write_config: [00:08:0] Warning: Guest attempt to set address to unused Base Address Register. [Offset:30h][Length:4]
pt_ioport_map: e_phys=c1e0 pio_base=8a00 len=32 index=4 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=f8000000 type=0 len=33554432 index=0 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=b8000000 type=8 len=134217728 index=1 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=b4000000 type=8 len=67108864 index=3 first_map=0
pt_ioport_map: e_phys=ffff pio_base=cf80 len=128 index=5 first_map=0
pt_iomem_map: e_phys=ec000000 maddr=f8000000 type=0 len=33554432 index=0 first_map=0
pt_iomem_map: e_phys=e0000000 maddr=b8000000 type=8 len=134217728 index=1 first_map=0
pt_iomem_map: e_phys=e8000000 maddr=b4000000 type=8 len=67108864 index=3 first_map=0
pt_ioport_map: e_phys=c100 pio_base=cf80 len=128 index=5 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=fbcfc000 type=0 len=16384 index=0 first_map=0
pt_iomem_map: e_phys=ef8a0000 maddr=fbcfc000 type=0 len=16384 index=0 first_map=0
pt_ioport_map: e_phys=ffff pio_base=8a00 len=32 index=4 first_map=0
pt_ioport_map: e_phys=c1e0 pio_base=8a00 len=32 index=4 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=d7efc000 type=0 len=16384 index=0 first_map=0
pt_iomem_map: e_phys=ef8a4000 maddr=d7efc000 type=0 len=16384 index=0 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=fbcfc000 type=0 len=16384 index=0 first_map=0
pt_iomem_map: e_phys=ffffffff maddr=d7efc000 type=0 len=16384 index=0 first_map=0
pt_ioport_map: e_phys=ffff pio_base=8a00 len=32 index=4 first_map=0
shutdown requested in cpu_handle_ioreq
Issued domain 1 poweroff

--------------040607020509090607070804
Content-Type: text/plain; charset=UTF-8; name="xl-edi.log"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="xl-edi.log"

Waiting for domain edi (domid 1) to die [pid 8363]
Domain 1 has shut down, reason code 0 0x0
Action for shutdown reason code 0 is destroy
Domain 1 needs to be cleaned up: destroying the domain
libxl: error: libxl_pci.c:990:libxl__device_pci_reset: The kernel doesn't support reset from sysfs for PCI device 0000:08:00.0
libxl: error: libxl_pci.c:990:libxl__device_pci_reset: The kernel doesn't support reset from sysfs for PCI device 0000:08:00.1
Done. Exiting now

--------------040607020509090607070804
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--------------040607020509090607070804--
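
To make the pBAR=vBAR point from the thread concrete, here is a minimal sketch that reads the pt_iomem_map lines of a qemu-dm log and reports, per passed-through region, the last guest address (e_phys) next to the machine address (maddr). It assumes that e_phys is the guest-physical base, that maddr is the host base, and that e_phys=ffffffff marks a transient unmap rather than a real mapping; the function name compare_bars is made up for illustration.

```python
import re

# Matches lines such as:
# pt_iomem_map: e_phys=e0000000 maddr=b8000000 type=8 len=134217728 index=1 first_map=1
IOMEM_RE = re.compile(
    r"pt_iomem_map: e_phys=(?P<e_phys>[0-9a-f]+) maddr=(?P<maddr>[0-9a-f]+) "
    r"type=\d+ len=(?P<len>\d+) index=\d+ first_map=\d+"
)

# Assumption: this e_phys value is logged while a BAR is unmapped, so skip it.
UNMAPPED = 0xFFFFFFFF


def compare_bars(log_text):
    """Return sorted (maddr, e_phys, length, matches) tuples, one per machine region.

    Only the last real (non-UNMAPPED) guest address seen for each maddr is kept.
    """
    last_mapping = {}
    for m in IOMEM_RE.finditer(log_text):
        e_phys = int(m.group("e_phys"), 16)
        if e_phys == UNMAPPED:
            continue
        maddr = int(m.group("maddr"), 16)
        last_mapping[maddr] = (e_phys, int(m.group("len")))
    return [
        (maddr, e_phys, length, maddr == e_phys)
        for maddr, (e_phys, length) in sorted(last_mapping.items())
    ]


if __name__ == "__main__":
    # Two lines copied from qemu-dm-edi.log above.
    sample = (
        "pt_iomem_map: e_phys=e0000000 maddr=b8000000 type=8 len=134217728 index=1 first_map=1\n"
        "pt_iomem_map: e_phys=ef8a0000 maddr=fbcfc000 type=0 len=16384 index=0 first_map=1\n"
    )
    for maddr, e_phys, length, same in compare_bars(sample):
        print(f"pBAR={maddr:08x} vBAR={e_phys:08x} len={length:#x} match={same}")
```

On the log above, every region should come out with match=False (for example machine b8000000 mapped at guest e0000000), which is the mismatch a host-derived E820 plus pBAR=vBAR placement is meant to remove.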