From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([209.51.188.92]:38945) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gkOLs-0005fE-9P for qemu-devel@nongnu.org; Fri, 18 Jan 2019 02:14:05 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gkOLr-0002di-7v for qemu-devel@nongnu.org; Fri, 18 Jan 2019 02:14:04 -0500 Received: from mga12.intel.com ([192.55.52.136]:29419) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gkOLq-0002V1-Sy for qemu-devel@nongnu.org; Fri, 18 Jan 2019 02:14:03 -0500 Date: Fri, 18 Jan 2019 15:10:09 +0800 From: Yu Zhang Message-ID: <20190118071009.dut7nsvqcxusy4rh@linux.intel.com> References: <20181221120320-mutt-send-email-mst@kernel.org> <20181221173758.4ohku43as4dkb355@linux.intel.com> <20181221140016-mutt-send-email-mst@kernel.org> <20181222011126.fdse6h4pl2n7yacl@linux.intel.com> <20181225114552-mutt-send-email-mst@kernel.org> <20181226053000.kdakfnwhqj5ngaq5@linux.intel.com> <20181227151411.GZ19442@habkost.net> <20181228023259.cdzcpbe32dquyrpa@linux.intel.com> <20181229012941.GC6817@habkost.net> <20190115071314.ukqq4425csqrtldo@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190115071314.ukqq4425csqrtldo@linux.intel.com> Subject: Re: [Qemu-devel] [PATCH v3 1/2] intel-iommu: differentiate host address width from IOVA address width. List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Eduardo Habkost , "Michael S. Tsirkin" , Igor Mammedov , Peter Xu Cc: Paolo Bonzini , qemu-devel@nongnu.org, Richard Henderson On Tue, Jan 15, 2019 at 03:13:14PM +0800, Yu Zhang wrote: > On Fri, Dec 28, 2018 at 11:29:41PM -0200, Eduardo Habkost wrote: > > On Fri, Dec 28, 2018 at 10:32:59AM +0800, Yu Zhang wrote: > > > On Thu, Dec 27, 2018 at 01:14:11PM -0200, Eduardo Habkost wrote: > > > > On Wed, Dec 26, 2018 at 01:30:00PM +0800, Yu Zhang wrote: > > > > > On Tue, Dec 25, 2018 at 11:56:19AM -0500, Michael S. Tsirkin wrote: > > > > > > On Sat, Dec 22, 2018 at 09:11:26AM +0800, Yu Zhang wrote: > > > > > > > On Fri, Dec 21, 2018 at 02:02:28PM -0500, Michael S. Tsirkin wrote: > > > > > > > > On Sat, Dec 22, 2018 at 01:37:58AM +0800, Yu Zhang wrote: > > > > > > > > > On Fri, Dec 21, 2018 at 12:04:49PM -0500, Michael S. Tsirkin wrote: > > > > > > > > > > On Sat, Dec 22, 2018 at 12:09:44AM +0800, Yu Zhang wrote: > > > > > > > > > > > Well, my understanding of the vt-d spec is that the address limitation in > > > > > > > > > > > DMAR are referring to the same concept of CPUID.MAXPHYSADDR. I do not think > > > > > > > > > > > there's any different in the native scenario. :) > > > > > > > > > > > > > > > > > > > > I think native machines exist on which the two values are different. > > > > > > > > > > Is that true? > > > > > > > > > > > > > > > > > > I think the answer is not. My understanding is that HAW(host address wdith) is > > > > > > > > > the maximum physical address width a CPU can detects(by cpuid.0x80000008). > > > > > > > > > > > > > > > > > > I agree there are some addresses the CPU does not touch, but they are still in > > > > > > > > > the physical address space, and there's only one physical address space... > > > > > > > > > > > > > > > > > > B.R. > > > > > > > > > Yu > > > > > > > > > > > > > > > > Ouch I thought we are talking about the virtual address size. > > > > > > > > I think I did have a box where VTD's virtual address size was > > > > > > > > smaller than CPU's. > > > > > > > > For physical one - we just need to make it as big as max supported > > > > > > > > memory right? > > > > > > > > > > > > > > Well, my understanding of the physical one is the maximum physical address > > > > > > > width. Sorry, this explain seems nonsense... I mean, it's not just about > > > > > > > the max supported memory, but also covers MMIO. It shall be detectable > > > > > > > from cpuid, or ACPI's DMAR table, instead of calculated by the max memory > > > > > > > size. One common usage of this value is to tell the paging structure entries( > > > > > > > CPU's or IOMMU's) which bits shall be reserved. There are also some registers > > > > > > > e.g. apic base reg etc, whose contents are physical addresses, therefore also > > > > > > > need to follow the similar requirement for the reserved bits. > > > > > > > > > > > > > > So I think the correct direction might be to define this property in the > > > > > > > machine status level, instead of the CPU level. Is this reasonable to you? > > > > > > > > > > > > At that level yes. But isn't this already specified by "pci-hole64-end"? > > > > > > > > > > But this value is set by guest firmware? Will PCI hotplug change this address? > > > > > > > > > > @Eduardo, do you have any plan to calculate the phys-bits by "pci-hole64-end"? > > > > > Or introduce another property, say "max-phys-bits" in machine status? > > > > > > > > I agree it may make sense to make the machine code control > > > > phys-bits instead of the CPU object. A machine property sounds > > > > like the simplest solution. > > > > > > > > But I don't think we can have a meaningful discussion about > > > > implementation if we don't agree about the command-line > > > > interface. We must decide what will happen to the CPU and iommu > > > > physical address width in cases like: > > > > > > Thanks, Eduardo. > > > > > > What about we just use "-machine phys-bits=52", and remove the > > > "phys-bits" from CPU parameter? > > > > Maybe we can deprecate it, but we can't remove it immediately. > > We still need to decide what to do on the cases below, while the > > option is still available. > > I saw the ACPI DMAR is ininitialized in acpi_build(), which is called > by pc_machine_done(). I guess this is done after the initialization of > vCPU and vIOMMU. > > So I am wondering, instead of moving "phys-bits" from X86CPU into the > MachineState, maybe we could: > > 1> Define a "phys_bits" in MachineState or PCMachineState(not sure which > one is more suitable). > > 2> Set ms->phys_bits in x86_cpu_realizefn(). > > 3> Since DMAR is created after vCPU creation, we can build DMAR table > with ms->phys_bits. > > 4> Also, we can reset the hardware address width for vIOMMU(and the > vtd_paging_entry_rsvd_field array) in pc_machine_done(), based on the value > of ms->phys_bits, or from ACPI DMAR table(from spec point of view, address > width limitation of IOMMU shall come from DMAR, yet I have not figured out > any simple approach to probe the ACPI property). > > This way, we do not need worry about the initialization sequence of vCPU > and vIOMMU, and both DMAR and IOMMU setting are from the machine level which > follows the spec. > > Any comments? :) > Ping... Andy comments on this proposal? Thanks! :) Yu > B.R. > Yu > > > > > > > > > > > > > > $QEMU -device intel-iommu > > > > $QEMU -cpu ...,phys-bits=50 -device intel-iommu > > > > $QEMU -cpu ...,host-phys-bits=on -device intel-iommu > > > > $QEMU -machine phys-bits=50 -device intel-iommu > > > > $QEMU -machine phys-bits=50 -cpu ...,phys-bits=48 -device intel-iommu > > > > > > > > -- > > > > Eduardo > > > > > > > > > > B.R. > > > Yu > > > > -- > > Eduardo > > >