From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kiszka
Subject: Re: Enable more than 255 VCPU support without irq remapping function in the guest
Date: Wed, 27 Apr 2016 08:56:56 +0200
Message-ID: <572062B8.5030103@siemens.com>
References: <571F93CA.40200@intel.com> <571F9487.5090009@siemens.com>
 <20160426164939.GA18900@potion> <57203B9D.6020402@gmail.com>
 <57204D28.4070706@siemens.com> <57205B12.6070003@intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: pbonzini@redhat.com, kvm@vger.kernel.org, tglx@linutronix.de,
 gleb@redhat.com, mst@redhat.com, x86@kernel.org, Peter Xu, Igor Mammedov
To: Lan Tianyu, Yang Zhang, Radim Krčmář
Received: from goliath.siemens.de ([192.35.17.28]:55212 "EHLO
 goliath.siemens.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752986AbcD0G5M (ORCPT ); Wed, 27 Apr 2016 02:57:12 -0400
In-Reply-To: <57205B12.6070003@intel.com>
Sender: kvm-owner@vger.kernel.org

On 2016-04-27 08:24, Lan Tianyu wrote:
> On 2016-04-27 13:24, Jan Kiszka wrote:
>>>> If we don't want the interrupt from internal device delivers to
>>>> CPU > 255, do we still need the VT-d interrupt remapping emulation?
>>>> I think firmware is able to send IPI to wakeup APs even without IR
>>>> and OS is able to do it too. So basically, only KVM and Qemu's
>>>> support is enough.
>
> Yes, just starting more than 255 APs doesn't need IR.
>
>> What are "internal devices" for you? And which OS do you know that would
>> handle such artificial setups without prior massive patching?
>>
>> We do need VT-d IR emulation in order to present our guest a well
>> specified and supported architecture for running > 255 CPUs.
>
> Changing guest kernel will be big concern. I found commit ce69a784 did
> optimization to use X2APIC without IR in the guest when APIC id is less
> than 256 and so I proposed my idea to see everyone's feedback.
> Whether it's possible to relax the IR requirement when APIC id > 255 in
> the guest.

You can't do that easily because you can't address those additional CPUs
from *any* device then, only via IPIs. That means Linux would have to be
changed to only set up IRQ affinity masks in the 0-254 range. I suppose
you would even have to patch tools like irqbalance to not issue mask
changes via /proc that include larger CPU IDs. Practically not feasible,
already on Linux. Not to speak of other guest OSes.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux