* [Qemu-devel] Flatview rendering scalability issue
From: Sergio Lopez @ 2019-03-11 9:26 UTC
To: qemu-devel; +Cc: pbonzini

Hi,

Thanks to Q35/PCIe, we can now assign a large number of PCI devices to a
single VM, but it seems that Flatview rendering scales poorly (worse than
linear) when it has to deal with a large number of Memory Regions.

I've measured the cost of the pci_default_write_config() call at
virtio_write_config() for 1 PCI device vs. 100 PCI devices:

 - 1 PCI device

write_config: 1879 us
write_config: 1037 us
write_config: 1 us
write_config: 3 us
write_config: 1783 us
write_config: 2652 us
write_config: 1 us
write_config: 2 us
write_config: 1551 us

 - 100 PCI devices

write_config: 503963 us
write_config: 1 us
write_config: 493344 us
write_config: 0 us
write_config: 472946 us
write_config: 1 us
write_config: 495175 us
write_config: 1 us
write_config: 519312 us
write_config: 1 us

I guess this is a consequence of having to reset/rebuild the Flatview
when altering the PCI BAR regions.

Is this a known issue we're already working on?

Thanks,
Sergio (slp).
* Re: [Qemu-devel] Flatview rendering scalability issue
From: Paolo Bonzini @ 2019-03-11 10:19 UTC
To: Sergio Lopez, qemu-devel

On 11/03/19 10:26, Sergio Lopez wrote:
> I guess this is a consequence of having to reset/rebuild the Flatview
> when altering the PCI BAR regions.
>
> Is this a known issue we're already working on?

What version of QEMU is this?

The initialization is O(n^2) because the guest initializes one device at
a time, so you rebuild the FlatView first with 0 devices, then 1, then
2, etc.  This is very hard to fix, if at all possible.

However, each FlatView creation should be O(n) where n is the number of
devices currently configured.  Please check with "info mtree -f" that
you only have a fixed number of FlatViews.  Old versions had one per device.

Paolo
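The O(n^2) figure in the message above can be reproduced with a toy cost model. What follows is only a sketch, not QEMU code: it assumes that rendering one FlatView while k devices are configured costs k abstract units, and the function names rebuild_cost_fixed_views and rebuild_cost_per_device_views are made up for illustration. The second function covers the "one FlatView per device" case, under the further assumption that every per-device view is as expensive to render as a shared one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy cost model, not QEMU code. Assumption: rendering one FlatView while
 * k devices are configured costs k units of work.
 */

/*
 * Total work when a fixed number of FlatViews is rebuilt before each of
 * the n device configurations: views * (0 + 1 + ... + (n - 1)).
 */
uint64_t rebuild_cost_fixed_views(uint64_t n, uint64_t views)
{
    /* Bounds keep the sums well inside uint64_t. */
    assert(n <= 1000000 && views <= 1000000);

    uint64_t total = 0;
    for (uint64_t k = 0; k < n; k++) {
        total += views * k;
    }
    return total;
}

/*
 * Total work when every configured device also owns a FlatView: with k
 * devices there are k views of cost k each, so the sum runs over k * k.
 */
uint64_t rebuild_cost_per_device_views(uint64_t n)
{
    assert(n <= 1000000);

    uint64_t total = 0;
    for (uint64_t k = 0; k < n; k++) {
        total += k * k;
    }
    return total;
}
```

For n = 100, four shared views cost 19800 units in this model, while one view per device costs 328350 units, so the per-device case grows roughly as n^3 instead of n^2. The units are abstract and are not meant to be compared with the measured microseconds.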
* Re: [Qemu-devel] Flatview rendering scalability issue
From: Sergio Lopez @ 2019-03-11 13:48 UTC
To: Paolo Bonzini; +Cc: qemu-devel

Paolo Bonzini writes:

> On 11/03/19 10:26, Sergio Lopez wrote:
>> I guess this is a consequence of having to reset/rebuild the Flatview
>> when altering the PCI BAR regions.
>>
>> Is this a known issue we're already working on?
>
> What version of QEMU is this?

This is upstream as of 6cb4f6db4f4367f (Mar 07 2019).

> The initialization is O(n^2) because the guest initializes one device at
> a time, so you rebuild the FlatView first with 0 devices, then 1, then
> 2, etc.  This is very hard to fix, if at all possible.
>
> However, each FlatView creation should be O(n) where n is the number of
> devices currently configured.  Please check with "info mtree -f" that
> you only have a fixed number of FlatViews.  Old versions had one per device.

I'm seeing 9 FVs with 1 PCI, and 119 with 100 PCIs.

Thanks,
Sergio.
* Re: [Qemu-devel] Flatview rendering scalability issue
From: Paolo Bonzini @ 2019-03-11 14:07 UTC
To: Sergio Lopez; +Cc: qemu-devel, Peter Xu

On 11/03/19 14:48, Sergio Lopez wrote:
>> The initialization is O(n^2) because the guest initializes one device at
>> a time, so you rebuild the FlatView first with 0 devices, then 1, then
>> 2, etc.  This is very hard to fix, if at all possible.
>>
>> However, each FlatView creation should be O(n) where n is the number of
>> devices currently configured.  Please check with "info mtree -f" that
>> you only have a fixed number of FlatViews.  Old versions had one per device.
> I'm seeing 9 FVs with 1 PCI, and 119 with 100 PCIs.

With

  $ eval qemu-system-x86_64 -M q35 \
      -device\ e1000,id=n{1,2,3,4,5,6,7,8}{1,2,3}

I only see 4 flat views ("system", "io", "memory", "(none)").

Probably you are using intel-iommu?  Peter, it should be possible to
reorganize the VT-d memory regions like this:

  intel_iommu_ir (MMIO, not added to any container)

  vtd_root_dmar (container)
      intel_iommu_dmar (IOMMU), priority 0
      alias to intel_iommu_ir, priority 1

  vtd_root_nodmar
      alias to get_system_memory(), priority 0
      alias to intel_iommu_ir, priority 1

  vtd_root_0 memory region (container)
      vtd_root_dmar              # only one of these is enabled
      vtd_root_nodmar

where the vtd_root_dmar and vtd_root_nodmar memory regions are created
in vtd_init once and for all.  Because all vtd_root_* memory regions
have only one child, memory.c will recognize that they represent the
same memory, and create at most two FlatViews (one for vtd_root_dmar,
one for vtd_root_nodmar).

Paolo
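The sharing argument in the message above can be illustrated with a self-contained toy model. This is a sketch under stated assumptions, not QEMU's memory.c: Region, region_add_child, flatview_root and count_flatview_roots are hypothetical names, the root lookup ignores addresses, sizes and priorities, and the per-device alias nodes that point at the two shared roots are an assumption of the sketch about how the sharing might be wired up. The only rule modelled is that a region with exactly one enabled child is treated as describing the same memory as that child.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CHILDREN 2

/* Toy stand-in for a memory region; this is not QEMU's MemoryRegion. */
typedef struct Region {
    bool enabled;
    struct Region *children[MAX_CHILDREN];
    size_t nchildren;
} Region;

void region_add_child(Region *parent, Region *child)
{
    assert(parent->nchildren < MAX_CHILDREN);
    parent->children[parent->nchildren++] = child;
}

/*
 * Simplified root lookup: while a region has exactly one enabled child,
 * descend into that child. Returns NULL if the starting region is disabled.
 */
Region *flatview_root(Region *r)
{
    while (r->enabled) {
        Region *only = NULL;
        size_t found = 0;

        for (size_t i = 0; i < r->nchildren; i++) {
            if (r->children[i]->enabled) {
                only = r->children[i];
                found++;
            }
        }
        if (found != 1) {
            return r;   /* leaf or real container: this is the root */
        }
        r = only;
    }
    return NULL;
}

/*
 * Build ndevices per-device containers. The first dmar_on of them select
 * the shared DMAR root, the rest the shared no-DMAR root. Returns the
 * number of distinct roots, i.e. the number of views that would be needed.
 */
size_t count_flatview_roots(size_t ndevices, size_t dmar_on)
{
    assert(ndevices > 0 && dmar_on <= ndevices);

    /* Two shared roots, each with two enabled children. */
    Region leaves[4] = {
        { .enabled = true }, { .enabled = true },
        { .enabled = true }, { .enabled = true },
    };
    Region dmar = { .enabled = true };
    Region nodmar = { .enabled = true };
    region_add_child(&dmar, &leaves[0]);
    region_add_child(&dmar, &leaves[1]);
    region_add_child(&nodmar, &leaves[2]);
    region_add_child(&nodmar, &leaves[3]);

    /* Per device: one container plus two alias nodes (three regions). */
    Region *dev = calloc(ndevices * 3, sizeof(*dev));
    Region **roots = calloc(ndevices, sizeof(*roots));
    assert(dev != NULL && roots != NULL);

    size_t distinct = 0;
    for (size_t i = 0; i < ndevices; i++) {
        Region *container = &dev[3 * i];
        Region *to_dmar = &dev[3 * i + 1];
        Region *to_nodmar = &dev[3 * i + 2];

        container->enabled = true;
        to_dmar->enabled = i < dmar_on;     /* only one of the two is enabled */
        to_nodmar->enabled = !to_dmar->enabled;
        region_add_child(to_dmar, &dmar);
        region_add_child(to_nodmar, &nodmar);
        region_add_child(container, to_dmar);
        region_add_child(container, to_nodmar);

        roots[i] = flatview_root(container);

        /* Count this root only if no earlier device resolved to it. */
        bool seen = false;
        for (size_t j = 0; j < i && !seen; j++) {
            seen = roots[j] == roots[i];
        }
        if (!seen) {
            distinct++;
        }
    }

    free(roots);
    free(dev);
    return distinct;
}
```

In this model count_flatview_roots(100, 40) yields two distinct roots no matter how many devices are added, which matches the "at most two FlatViews" claim; giving each device its own non-shared root would instead yield one root per device.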
* Re: [Qemu-devel] Flatview rendering scalability issue
From: Sergio Lopez @ 2019-03-11 14:35 UTC
To: Paolo Bonzini; +Cc: qemu-devel, Peter Xu

Paolo Bonzini writes:

> On 11/03/19 14:48, Sergio Lopez wrote:
>>> The initialization is O(n^2) because the guest initializes one device at
>>> a time, so you rebuild the FlatView first with 0 devices, then 1, then
>>> 2, etc.  This is very hard to fix, if at all possible.
>>>
>>> However, each FlatView creation should be O(n) where n is the number of
>>> devices currently configured.  Please check with "info mtree -f" that
>>> you only have a fixed number of FlatViews.  Old versions had one per device.
>> I'm seeing 9 FVs with 1 PCI, and 119 with 100 PCIs.
>
> With
>
>   $ eval qemu-system-x86_64 -M q35 \
>       -device\ e1000,id=n{1,2,3,4,5,6,7,8}{1,2,3}
>
> I only see 4 flat views ("system", "io", "memory", "(none)").
>
> Probably you are using intel-iommu?  Peter, it should be possible to
> reorganize the VT-d memory regions like this:

You're right, the number of FVs goes down drastically after removing
intel-iommu, and the slowness during Guest PCI initialization disappears
with it.

Thanks,
Sergio.
* Re: [Qemu-devel] Flatview rendering scalability issue
From: Peter Xu @ 2019-03-12 3:23 UTC
To: Paolo Bonzini; +Cc: Sergio Lopez, qemu-devel

On Mon, Mar 11, 2019 at 03:07:43PM +0100, Paolo Bonzini wrote:
> On 11/03/19 14:48, Sergio Lopez wrote:
> >> The initialization is O(n^2) because the guest initializes one device at
> >> a time, so you rebuild the FlatView first with 0 devices, then 1, then
> >> 2, etc.  This is very hard to fix, if at all possible.
> >>
> >> However, each FlatView creation should be O(n) where n is the number of
> >> devices currently configured.  Please check with "info mtree -f" that
> >> you only have a fixed number of FlatViews.  Old versions had one per device.
> > I'm seeing 9 FVs with 1 PCI, and 119 with 100 PCIs.
>
> With
>
>   $ eval qemu-system-x86_64 -M q35 \
>       -device\ e1000,id=n{1,2,3,4,5,6,7,8}{1,2,3}
>
> I only see 4 flat views ("system", "io", "memory", "(none)").
>
> Probably you are using intel-iommu?  Peter, it should be possible to
> reorganize the VT-d memory regions like this:
>
>   intel_iommu_ir (MMIO, not added to any container)
>
>   vtd_root_dmar (container)
>       intel_iommu_dmar (IOMMU), priority 0
>       alias to intel_iommu_ir, priority 1
>
>   vtd_root_nodmar
>       alias to get_system_memory(), priority 0
>       alias to intel_iommu_ir, priority 1
>
>   vtd_root_0 memory region (container)
>       vtd_root_dmar              # only one of these is enabled
>       vtd_root_nodmar
>
> where the vtd_root_dmar and vtd_root_nodmar memory regions are created
> in vtd_init once and for all.  Because all vtd_root_* memory regions
> have only one child, memory.c will recognize that they represent the
> same memory, and create at most two FlatViews (one for vtd_root_dmar,
> one for vtd_root_nodmar).

Yes this sounds good.

The only thing I'm still uncertain about is the IOMMU notifiers, which
should be per-device (for real).  That's embedded in IOMMUMemoryRegion
so far and it includes the real MR object:

    struct IOMMUMemoryRegion {
        MemoryRegion parent_obj;

        QLIST_HEAD(, IOMMUNotifier) iommu_notify;
        IOMMUNotifierFlag iommu_notify_flags;
    };

Maybe I should also make parent_obj a pointer to the created MRs
mentioned above, so IOMMUMemoryRegion only contains notification
information rather than real MRs (otherwise we won't have a chance to
share memory regions between devices)?

(But if so then TYPE_INTEL_IOMMU_MEMORY_REGION might not be able to
 inherit TYPE_IOMMU_MEMORY_REGION directly, and I've not thought about
 the details of that, yet)

Regards,

--
Peter Xu
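The idea of keeping notifier state per device while pointing at a shared region can be pictured with a small stand-alone sketch. Everything here is hypothetical and deliberately unrelated to QEMU's real types: SharedRegion, Notifier, DeviceIOMMUState and the helper functions are invented for illustration, a plain singly linked list stands in for QLIST, and the QOM inheritance question raised above is not addressed at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are QEMU types. */
typedef struct SharedRegion {
    const char *name;           /* e.g. a root shared by many devices */
} SharedRegion;

typedef struct Notifier {
    void (*notify)(struct Notifier *n, uint64_t iova);
    struct Notifier *next;
    uint64_t last_iova;         /* last address this notifier was told about */
    unsigned calls;             /* how many times it has fired */
} Notifier;

/* Per-device state: a pointer to the shared region plus its own notifiers. */
typedef struct DeviceIOMMUState {
    SharedRegion *mr;
    Notifier *notifiers;
} DeviceIOMMUState;

/* Prepend a notifier to one device's private list. */
void device_register_notifier(DeviceIOMMUState *dev, Notifier *n)
{
    n->next = dev->notifiers;
    dev->notifiers = n;
}

/* Fire only the notifiers registered for this device. */
void device_notify(DeviceIOMMUState *dev, uint64_t iova)
{
    for (Notifier *n = dev->notifiers; n != NULL; n = n->next) {
        n->notify(n, iova);
    }
}

/* Example callback that just records what it saw. */
void record_call(Notifier *n, uint64_t iova)
{
    n->last_iova = iova;
    n->calls++;
}

/*
 * Two devices share one region. Notifying the first device must not reach
 * the second one; returns the second device's call count.
 */
unsigned demo_calls_on_other_device(void)
{
    SharedRegion shared = { "vtd_root_dmar" };
    Notifier a = { .notify = record_call };
    Notifier b = { .notify = record_call };
    DeviceIOMMUState dev_a = { &shared, NULL };
    DeviceIOMMUState dev_b = { &shared, NULL };

    device_register_notifier(&dev_a, &a);
    device_register_notifier(&dev_b, &b);
    device_notify(&dev_a, 0x1000);

    assert(dev_a.mr == dev_b.mr);               /* the region is shared */
    assert(a.calls == 1 && a.last_iova == 0x1000);
    return b.calls;
}
```

The point of the split is that the region pointer can be identical across devices, which is what sharing needs, while each device still owns a private notifier list.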