From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:40599)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <Alexander.Zuepke@hs-rm.de>) id 1Z4psF-0005xB-Ru
	for qemu-devel@nongnu.org; Tue, 16 Jun 2015 08:21:53 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <Alexander.Zuepke@hs-rm.de>) id 1Z4psB-0007tA-NB
	for qemu-devel@nongnu.org; Tue, 16 Jun 2015 08:21:51 -0400
Received: from mxout-1k.itc.hs-rm.de ([195.72.102.133]:58087)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <Alexander.Zuepke@hs-rm.de>) id 1Z4psB-0007sq-Cw
	for qemu-devel@nongnu.org; Tue, 16 Jun 2015 08:21:47 -0400
Message-ID: <558014D8.3080802@hs-rm.de>
Date: Tue, 16 Jun 2015 14:21:44 +0200
From: =?UTF-8?B?QWxleCBaw7xwa2U=?= <alexander.zuepke@hs-rm.de>
MIME-Version: 1.0
References: <557B0B21.2030009@hs-rm.de>
	<CAFEAcA8iCmgwgmhTkRryTyVBSxS+Cbh6bGSLqt7EPM028F1tUg@mail.gmail.com>
	<557EE4D2.5010202@hs-rm.de>
	<CAFEAcA_W6KtcnEAb+K_wcqXn8QZs968ziNEF2c+qqP-YAKqeYw@mail.gmail.com>
	<557EE9B3.1030606@hs-rm.de>
	<CAFEAcA_RXFYeaFTVwauNxRzGZV1gp6h-P8OEj4dhF6vawrwGSA@mail.gmail.com>
	<557F2F8C.8090708@hs-rm.de>
	<CAFEAcA9Xr+fTeLX_=ftNFOOWjMMFduoJEvxWCMLSKuYv++W+KQ@mail.gmail.com>
	<CAFEAcA_hqVfLtxBXKiPKprABMF8ntWOTAeVobi_0xgS2P-_2cQ@mail.gmail.com>
	<5580046E.4070002@hs-rm.de>
	<CAFEAcA_br+LaYj=Ou4s4t1pHwP0pF2h3tCVMwV_r4hE=9BNq8A@mail.gmail.com>
In-Reply-To: <CAFEAcA_br+LaYj=Ou4s4t1pHwP0pF2h3tCVMwV_r4hE=9BNq8A@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] QEMU ARM SMP: IPI delivery delayed until next main
 loop event // how to improve IPI latency?
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: QEMU Developers <qemu-devel@nongnu.org>

On 16.06.2015 at 13:53, Peter Maydell wrote:
> On 16 June 2015 at 12:11, Alex Züpke <alexander.zuepke@hs-rm.de> wrote:
>> But the startup is not my problem, it's the later parts.
>
> But it was my problem because it meant your test case wasn't
> functional :-)
>
>> I added the WFE to the initial lock. Here are two new tests, both
>> are now 3178 bytes in size:
>> http://www.cs.hs-rm.de/~zuepke/qemu/ipi.elf
>> http://www.cs.hs-rm.de/~zuepke/qemu/ipi_yield.elf
>>
>> Both start on my machine. The IPI ping-pong starts after the
>> first timer interrupt after 1s. The problem is that IPIs are
>> delivered only once a second after the timer interrupts QEMU's
>> main loop.
>
> Thanks. These test cases work for me, and I can repro the
> same behaviour you see.

OK, I'm glad that you can trigger my bug.

> I intend to investigate why we're not at least timeslicing
> between the two CPUs at a faster rate than "when there's
> another timer interrupt".

QEMU probably has no other way of time slicing: every OS uses some
kind of timer interrupt, so nothing else was necessary.
And even Linux's tickless kernel doesn't run into this issue because
it uses SEV/WFE properly.
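
To illustrate what I mean, here is a toy model in Python (not QEMU
code). It assumes two vCPUs executed round-robin, one per "tick", with
a switch to the other vCPU only when the running one gives up the CPU
or a periodic timer fires. The function name, the tick granularity and
the flags waiter_yields / kick_on_ipi are made up for this sketch;
kick_on_ipi stands in for the "cpu_exit when one CPU triggers an
interrupt on another" idea from this thread.

```python
def ping_pongs(timer_period, total_ticks, waiter_yields, kick_on_ipi):
    """Count IPI hand-offs between two vCPUs in a toy round-robin model.

    Assumptions (hypothetical, not QEMU code): one vCPU runs per tick,
    and the scheduler only switches vCPUs when a switch is requested.
    """
    pending = [True, False]   # vCPU 0 starts with an IPI pending
    current = 0               # vCPU that is currently executing
    handled = 0

    for tick in range(total_ticks):
        other = 1 - current
        switch = False

        if pending[current]:
            # Take the IPI and immediately send one back.
            pending[current] = False
            pending[other] = True
            handled += 1
            # Models "exit the CPU loop when one CPU interrupts another".
            switch = kick_on_ipi
        elif waiter_yields:
            # Models a wait loop that gives up the CPU (WFE-style).
            switch = True
        # Otherwise: plain busy-wait, the vCPU keeps its time slice.

        # Models a main loop event (here: a periodic timer interrupt).
        if (tick + 1) % timer_period == 0:
            switch = True

        if switch:
            current = other

    return handled


if __name__ == "__main__":
    for name, yields, kick in [("busy-wait", False, False),
                               ("yield", True, False),
                               ("kick", False, True)]:
        print(name, ping_pongs(1000, 10000, yields, kick))
```

In this model the busy-wait variant completes only one hand-off per
timer period, which matches what I see with ipi.elf, while yielding in
the wait loop or kicking the sender out on an IPI makes the number of
hand-offs independent of the timer.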

>> Something else: Existing ARM CPUs so far do not use hyper-threading,
>> but have real physical cores. In contrast, QEMU is an extremely
>> coarse-grained hyper-threading architecture, so existing legacy
>> code that was written with physical cores in mind will trigger
>> timing bugs in synchronization primitives, especially code
>> originally written for ARM11 MPCore like mine, which lacks WFE/SEV.
>> If we consider QEMU as a platform to run legacy code, doesn't it
>> make sense to address these issues?
>
> In general QEMU's approach is more "run correct code reasonably
> fast" rather than "run buggy code the same way the hardware
> would" or "identify bugs in buggy code". There's certainly
> scope for heuristics for making our timeslicing approach less
> obtrusive, but we need to understand the underlying behaviour
> first (and check it doesn't accidentally slow down other
> common workloads in the process). In particular I think the
> 'do cpu_exit if one CPU triggers an interrupt on another'
> approach is probably good, but I need to investigate why
> it isn't working on your test programs without that extra
> 'level &&' condition first...
>
> thanks
> -- PMM

OK, thank you Peter!


Best regards
Alex