From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:41151)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1bIZgk-0001De-SM for qemu-devel@nongnu.org;
 Thu, 30 Jun 2016 06:59:20 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1bIZgf-00029l-TM for qemu-devel@nongnu.org;
 Thu, 30 Jun 2016 06:59:18 -0400
Received: from mx1.redhat.com ([209.132.183.28]:47915)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1bIZgf-00029f-LH for qemu-devel@nongnu.org;
 Thu, 30 Jun 2016 06:59:13 -0400
Received: from int-mx13.intmail.prod.int.phx2.redhat.com
 (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id 07E55C049D5A
 for ; Thu, 30 Jun 2016 10:59:13 +0000 (UTC)
Date: Thu, 30 Jun 2016 11:59:09 +0100
From: "Dr. David Alan Gilbert"
Message-ID: <20160630105908.GA2683@work-vm>
References: <20160617154905.GH18662@thinpad.lan.raisama.net>
 <20160621194440.GN17952@thinpad.lan.raisama.net>
 <9b76415a-23e6-3ded-4dbc-42838cc164b0@redhat.com>
 <20160622142414.GI30202@redhat.com>
 <20160623014216-mutt-send-email-mst@redhat.com>
 <20160622232308.GQ30202@redhat.com>
 <20160623024400-mutt-send-email-mst@redhat.com>
 <1466671203.26189.35.camel@redhat.com>
 <20160629164252.GD10488@work-vm>
 <1467267046.15123.94.camel@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1467267046.15123.94.camel@redhat.com>
Subject: Re: [Qemu-devel] Default for phys-addr-bits? (was Re: [PATCH 4/5] x86: Allow physical address bits to be set)
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Gerd Hoffmann
Cc: "Michael S.
 Tsirkin", Andrea Arcangeli, Marcel Apfelbaum, Paolo Bonzini,
 qemu-devel@nongnu.org, Eduardo Habkost

* Gerd Hoffmann (kraxel@redhat.com) wrote:
>   Hi,
>
> > Something somewhere in qemu/ kernel/ firmware is already reading the number
> > of physical bits to determine PCI mapping; if I do:
> >
> > ./x86_64-softmmu/qemu-system-x86_64 -m 4096,slots=16,maxmem=128T
>
> No, it's not the physbits.  You add some memory hotplug slots here.
> Qemu will ask seabios to reserve address space for those, which seabios
> promptly does and maps 64bit pci bars above the reserved address space.

Right, that's what I was trying to do - I wanted to see if I could get
something to use the non-existing address space.

> > -vga none -device qxl-vga,bus=pcie.0,ram_size_mb=2048,vram64_size_mb=2048 -vnc 0.0.0.0:0 /home/vms/7.2a.qcow2 -chardev stdio,mux=on,id=mon -mon chardev=mon,mode=readline -cpu host,phys-bits=48
> >
> > it will happily map the qxl VRAM right up high, but if I lower
> > the phys-bits down to 46 it won't.
>
> I suspect the linux kernel remaps the bar because the seabios mapping is
> unreachable.  Check dmesg.

Right, and that is dependent on physbits; if I run with:

./x86_64-softmmu/qemu-system-x86_64 -machine q35,accel=kvm,usb=off -m 4096,slots=16,maxmem=128T -vga none -device qxl-vga,bus=pcie.0,ram_size_mb=2048,vram64_size_mb=2048 -vnc 0.0.0.0:0 /home/vms/7.2a.qcow2 -chardev stdio,mux=on,id=mon -mon chardev=mon,mode=readline -cpu host,phys-bits=48

(on a 46 bit xeon) it happily maps that 64-bit bar into somewhere that
shouldn't be accessible:

[    0.266183] pci_bus 0000:00: root bus resource [mem 0x800480000000-0x8004ffffffff]
[    0.321611] pci 0000:00:02.0: reg 0x20: [mem 0x800480000000-0x8004ffffffff 64bit pref]
[    0.423257] pci_bus 0000:00: resource 8 [mem 0x800480000000-0x8004ffffffff]

lspci -v:
00:02.0 VGA compatible controller: Red Hat, Inc. QXL paravirtual graphic card (rev 04) (prog-if 00 [VGA controller])
        Subsystem: Red Hat, Inc QEMU Virtual Machine
        Flags: fast devsel, IRQ 22
        Memory at c0000000 (32-bit, non-prefetchable) [size=512M]
        Memory at e0000000 (32-bit, non-prefetchable) [size=64M]
        Memory at e4070000 (32-bit, non-prefetchable) [size=8K]
        I/O ports at c080 [size=32]
        Memory at 800480000000 (64-bit, prefetchable) [size=2G]
        Expansion ROM at e4060000 [disabled] [size=64K]
        Kernel driver in use: qxl

So that's mapped at an address beyond host phys-bits.
And it hasn't failed/crashed etc - but I guess maybe nothing is using that 2G space?

If I change the phys-bits=48 to 46 the kernel avoids it:

[    0.414867] acpi PNP0A08:00: host bridge window [0x800480000000-0x8004ffffffff] (ignored, not CPU addressable)
[    0.683134] pci 0000:00:02.0: can't claim BAR 4 [mem 0x800480000000-0x8004ffffffff 64bit pref]: no compatible bridge window
[    0.703948] pci 0000:00:02.0: BAR 4: [mem size 0x80000000 64bit pref] conflicts with PCI mem [mem 0x00000000-0x3fffffffffff]
[    0.703951] pci 0000:00:02.0: BAR 4: failed to assign [mem size 0x80000000 64bit pref]

lspci shows:
        Memory at (64-bit, prefetchable)

(Although, interestingly, qemu's info pci still shows it.)

The 'ignored, not CPU addressable' message comes from
acpi_pci_root_validate_resources in the kernel's drivers/acpi/pci_root.c,
which uses a value set in arch/x86/kernel/setup.c:

    iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;

So at least the Linux kernel does sanity check using the phys_bits value.

Obviously 128T is a bit silly for maxmem at the moment, however I was worrying
what happens with 36/39/40bit hosts, and it's not unusual to pick a maxmem
that's a few TB even if the VMs you're initially creating are only a handful
of GB. (oVirt/RHEV seems to use a 4TB default for maxmem).

Still, this only hits as a problem if you hit the combination of:
   a) You use large PCI bars
   b) On a 36/39/40bit host
   c) With a large maxmem that forces those PCI bars up to something silly.
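
To make the arithmetic behind that kernel check concrete, here is a minimal
sketch in Python. It is not the kernel code: iomem_end() and
window_addressable() are names made up for illustration, and the only things
taken from above are the (1 << phys_bits) - 1 limit and the BAR window from
the dmesg output.

```python
# Sketch only: mirrors the limit the kernel derives from x86_phys_bits,
#   iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1
# The helper names below are hypothetical, not kernel or QEMU functions.

def iomem_end(phys_bits: int) -> int:
    """Highest physical address reachable with the given number of bits."""
    return (1 << phys_bits) - 1


def window_addressable(start: int, end: int, phys_bits: int) -> bool:
    """True if the whole [start, end] window lies below the phys-bits limit."""
    return 0 <= start <= end <= iomem_end(phys_bits)


# 64-bit prefetchable BAR 4 of the qxl device, copied from the dmesg output.
BAR_START, BAR_END = 0x800480000000, 0x8004ffffffff

if __name__ == "__main__":
    for bits in (36, 39, 40, 46, 48):
        ok = window_addressable(BAR_START, BAR_END, bits)
        state = "reachable" if ok else "not CPU addressable"
        print(f"phys-bits={bits}: limit={iomem_end(bits):#x}, BAR is {state}")
```

With 46 bits the limit works out to 0x3fffffffffff, which is the end of the
'PCI mem' range in the conflict message above, so the 0x8004... window falls
outside it; with 48 bits the same window fits.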

Dave

>
> cheers,
>   Gerd
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK