From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:50439)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <Alexander.Zuepke@hs-rm.de>) id 1Z4Vx5-0000VB-4x
	for qemu-devel@nongnu.org; Mon, 15 Jun 2015 11:05:32 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <Alexander.Zuepke@hs-rm.de>) id 1Z4Vx1-0007H8-3t
	for qemu-devel@nongnu.org; Mon, 15 Jun 2015 11:05:31 -0400
Received: from mxout-1k.itc.hs-rm.de ([195.72.102.133]:48910)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <Alexander.Zuepke@hs-rm.de>) id 1Z4Vx0-0007Gd-Qh
	for qemu-devel@nongnu.org; Mon, 15 Jun 2015 11:05:27 -0400
Message-ID: <557EE9B3.1030606@hs-rm.de>
Date: Mon, 15 Jun 2015 17:05:23 +0200
From: =?UTF-8?B?QWxleCBaw7xwa2U=?= <alexander.zuepke@hs-rm.de>
MIME-Version: 1.0
References: <557B0B21.2030009@hs-rm.de>
	<CAFEAcA8iCmgwgmhTkRryTyVBSxS+Cbh6bGSLqt7EPM028F1tUg@mail.gmail.com>
	<557EE4D2.5010202@hs-rm.de>
	<CAFEAcA_W6KtcnEAb+K_wcqXn8QZs968ziNEF2c+qqP-YAKqeYw@mail.gmail.com>
In-Reply-To: <CAFEAcA_W6KtcnEAb+K_wcqXn8QZs968ziNEF2c+qqP-YAKqeYw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] QEMU ARM SMP: IPI delivery delayed until next main
 loop event // how to improve IPI latency?
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: QEMU Developers <qemu-devel@nongnu.org>

On 15.06.2015 at 16:51, Peter Maydell wrote:
> On 15 June 2015 at 15:44, Alex Züpke <alexander.zuepke@hs-rm.de> wrote:
>> On 12.06.2015 at 20:03, Peter Maydell wrote:
>>> Probably the best approach would be to have something in
>>> arm_cpu_set_irq() which says "if we are CPU X and we've
>>> just caused an interrupt to be set for CPU Y, then we
>>> should ourselves yield back to the main loop".
>>>
>>> Something like this, maybe, though I have done no more testing
>>> than checking it doesn't actively break kernel booting :-)
>>
>>
>> Thanks! One more check for "level" is needed to get it work:
>
> What happens without that? It's reasonable to have it,
> but extra cpu_exit()s shouldn't cause a problem beyond
> being a bit inefficient...

The emulation gets stuck, for a reason I don't understand yet.
I checked whether something similar is done on other architectures and
found that they do check the level, see for example cpu_request_exit()
in hw/ppc/prep.c:
  static void cpu_request_exit(void *opaque, int irq, int level)
  {
      CPUState *cpu = current_cpu;

      if (cpu && level) {
          cpu_exit(cpu);
      }
  }

But probably this is used for something completely unrelated.
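
To make clear what I mean by the level check, here is a toy model of the
idea, not the actual patch. ToyCPU, toy_current_cpu, toy_cpu_exit() and
toy_cpu_set_irq() are names I made up for illustration; they only mimic the
shape of the QEMU code and are not QEMU API. The sender is asked to leave its
execution loop only if the line is being raised and the target is a different
CPU:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a virtual CPU; this is NOT QEMU's CPUState. */
typedef struct ToyCPU {
    int index;
    bool irq_pending;   /* state of the interrupt line into this CPU */
    bool exit_request;  /* set when the CPU should return to the main loop */
} ToyCPU;

/* CPU whose code is currently being emulated (NULL outside CPU context). */
static ToyCPU *toy_current_cpu;

/* Hypothetical helper: ask a CPU to leave its execution loop. */
static void toy_cpu_exit(ToyCPU *cpu)
{
    cpu->exit_request = true;
}

/*
 * Raise (level != 0) or lower (level == 0) the interrupt line of `target`.
 * Only when the line is raised by a different CPU does the sender yield,
 * so that the target gets a chance to run and take the interrupt.
 */
static void toy_cpu_set_irq(ToyCPU *target, int level)
{
    assert(target != NULL);
    target->irq_pending = (level != 0);

    if (level && toy_current_cpu && toy_current_cpu != target) {
        toy_cpu_exit(toy_current_cpu);
    }
}
```

With toy_current_cpu pointing at CPU 0, toy_cpu_set_irq(&cpu1, 1) sets
exit_request on CPU 0, while toy_cpu_set_irq(&cpu1, 0) leaves it alone. This
only models the control flow and says nothing about why the real emulation
gets stuck without the check.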

> It would be interesting to know if this helps Linux as well
> as your custom OS. (I don't know whether a "CPU #0 polls"
> approach is bad on hardware too; the other option would be
> to have CPU #1 IPI back in the other direction if 0 needed
> to wait for a response.)
>
> -- PMM

IIRC, Linux TLB shootdown on x86 once used such a scheme, but I don't
know if they changed it.

I'd say that an IPI+poll pattern is used quite often in the tricky parts
of a kernel, like kernel debugging.


Here's a simple IPI tester sending IPIs from CPU #0 to CPU #1 in an
endless loop.
The IPIs are delayed until the timer interrupt triggers the main loop.

http://www.cs.hs-rm.de/~zuepke/qemu/ipi.elf
3174 bytes, md5sum 8d73890a60cd9b24a4f9139509b580e2

Run testcase:
$ qemu-system-arm -M vexpress-a15 -smp 2 -kernel ipi.elf -nographic

The testcase prints the following on the serial console without the patch:

  +------- CPU 0 came up
  |+------ CPU 0 initialization completed
  || +---- CPU 0 timer interrupt, 1 HZ
  || |
  vv v
  0!1T.T.T.T.T.T.T.
    ^ ^
    | |
    | +-- CPU 1 received an IPI
    +---- CPU 1 came up


Expected testcase output with patch:

  0!1T..............<hundreds of dots>.................T...............

So: more dots == more IPIs handled between two timer interrupts "T" ...
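
If you want a number instead of counting dots by eye, something like the
following could be run over a captured console log. This is just a sketch;
ipis_per_tick() is a helper name I made up, and it assumes the output format
shown above ('T' for a timer interrupt, '.' for an IPI received by CPU 1):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the '.' characters (IPIs) between consecutive 'T' characters
 * (timer interrupts) in the console output `log`.
 * Stores up to `max` per-interval counts in counts[] and returns the
 * number of counts stored. Dots before the first 'T' and after the
 * last 'T' are ignored because those intervals are incomplete.
 */
static size_t ipis_per_tick(const char *log, unsigned *counts, size_t max)
{
    size_t n = 0;
    unsigned dots = 0;
    int seen_tick = 0;

    assert(log != NULL && counts != NULL);

    for (; *log != '\0'; log++) {
        if (*log == 'T') {
            if (seen_tick && n < max) {
                counts[n++] = dots;   /* close the previous interval */
            }
            seen_tick = 1;
            dots = 0;
        } else if (*log == '.' && seen_tick) {
            dots++;
        }
    }
    return n;
}
```

For the unpatched output "0!1T.T.T.T.T.T.T." this yields six intervals with
one IPI each; with the patch the per-interval counts should be much larger.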


Best regards
Alex