Message-ID: <557EE9B3.1030606@hs-rm.de>
Date: Mon, 15 Jun 2015 17:05:23 +0200
From: Alex Züpke
References: <557B0B21.2030009@hs-rm.de> <557EE4D2.5010202@hs-rm.de>
Subject: Re: [Qemu-devel] QEMU ARM SMP: IPI delivery delayed until next main loop event // how to improve IPI latency?
To: Peter Maydell
Cc: QEMU Developers

On 15.06.2015 at 16:51, Peter Maydell wrote:
> On 15 June 2015 at 15:44, Alex Züpke wrote:
>> On 12.06.2015 at 20:03, Peter Maydell wrote:
>>> Probably the best approach would be to have something in
>>> arm_cpu_set_irq() which says "if we are CPU X and we've
>>> just caused an interrupt to be set for CPU Y, then we
>>> should ourselves yield back to the main loop".
>>>
>>> Something like this, maybe, though I have done no more testing
>>> than checking it doesn't actively break kernel booting :-)
>>
>>
>> Thanks! One more check for "level" is needed to get it work:
>
> What happens without that? It's reasonable to have it,
> but extra cpu_exit()s shouldn't cause a problem beyond
> being a bit inefficient...

Without the check, the emulation gets stuck, for reasons I don't understand.
I checked whether something similar is done on other architectures and found
the same kind of level check there, see for example cpu_request_exit() in
hw/ppc/prep.c:

static void cpu_request_exit(void *opaque, int irq, int level)
{
    CPUState *cpu = current_cpu;

    if (cpu && level) {
        cpu_exit(cpu);
    }
}

But this is probably used for something completely unrelated.

> It would be interesting to know if this helps Linux as well
> as your custom OS. (I don't know whether a "CPU #0 polls"
> approach is bad on hardware too; the other option would be
> to have CPU #1 IPI back in the other direction if 0 needed
> to wait for a response.)
>
> -- PMM

IIRC, Linux TLB shootdown on x86 once used such a scheme, but I don't know
if they changed it. I'd say that an IPI+poll pattern is used quite often in
the tricky parts of a kernel, like kernel debugging.

Here's a simple IPI tester that sends IPIs from CPU #0 to CPU #1 in an
endless loop. Without the patch, the IPIs are delayed until the timer
interrupt triggers the main loop.

http://www.cs.hs-rm.de/~zuepke/qemu/ipi.elf
3174 bytes, md5sum 8d73890a60cd9b24a4f9139509b580e2

Run testcase:
$ qemu-system-arm -M vexpress-a15 -smp 2 -kernel ipi.elf -nographic

The testcase prints the following on the serial console without the patch:

  +------- CPU 0 came up
  |+------ CPU 0 initialization completed
  || +---- CPU 0 timer interrupt, 1 HZ
  || |
  vv v
  0!1T.T.T.T.T.T.T.
    ^ ^
    | |
    | +-- CPU 1 received an IPI
    +---- CPU 1 came up

Expected testcase output with patch:

  0!1T...............................T...............

So: more dots == more IPIs handled between two timer interrupts "T".

Best regards
Alex
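
P.S. To compare runs with and without the patch, the console output can be reduced to "dots per timer interval". Below is a minimal sketch of such a helper, assuming the output consists only of the marker characters described above. The function name count_ipis_per_tick() and its parameters are hypothetical; it is not part of ipi.elf or QEMU.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: for every interval between two consecutive 'T'
 * markers (CPU 0 timer interrupts) in the console output, count the '.'
 * markers (IPIs received by CPU 1).
 *
 * Boot markers before the first 'T' ('0', '!', '1') are ignored, and the
 * trailing interval after the last 'T' is not reported because it is
 * still incomplete.
 *
 * Stores at most max_counts results in counts[] and returns the number
 * of results stored.
 */
static size_t count_ipis_per_tick(const char *log, size_t *counts,
                                  size_t max_counts)
{
    size_t n = 0;        /* number of complete intervals stored */
    size_t dots = 0;     /* IPIs seen in the current interval */
    int seen_tick = 0;   /* becomes 1 after the first 'T' */

    assert(log != NULL && counts != NULL);

    for (const char *p = log; *p != '\0'; p++) {
        if (*p == 'T') {
            /* A 'T' closes the previous interval, if there was one. */
            if (seen_tick && n < max_counts) {
                counts[n++] = dots;
            }
            seen_tick = 1;
            dots = 0;
        } else if (*p == '.' && seen_tick) {
            dots++;
        }
    }
    return n;
}
```

Fed with the unpatched output "0!1T.T.T.T.T.T.T.", this reports six complete intervals with one IPI each. With the patch, the per-interval counts should be much larger.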