References: <1435160084-938-1-git-send-email-alex.bennee@linaro.org>
From: Alex Bennée
Date: Thu, 25 Jun 2015 07:27:05 +0100
Message-ID: <87si9gxpza.fsf@linaro.org>
Subject: Re: [Qemu-devel] [RFC PATCH] target-arm/psci.c: wake up sleeping CPUs (MTTCG)
To: Alexander Spyridakis
Cc: mttcg@listserver.greensocs.com, Peter Maydell, Mark Burton, QEMU Developers, KONRAD Frédéric

Alexander Spyridakis writes:

> On 24 June 2015 at 17:34, Alex Bennée wrote:
>> Testing with Alexander's bare metal syncronisation tests fails in MTTCG
>> leaving one CPU spinning forever waiting for the second CPU to wake up.
>> We simply need to poke the halt_cond once we have processed the PSCI
>> power on call.
>
> Thanks Alex. Works for me, also with qemu_cpu_kick(target_cpu_state)
> as Paolo mentioned.
>
> The test seems to stress the current multi-threaded implementation
> quite a lot. With 8 CPUs running, the resulting errors are in the
> range of 500 per vCPU (10 million iterations).
We need to get to the bottom of this one first, as obviously the
implementation needs to be bullet-proof for all the various
synchronisation patterns the CPU can use.

> Performance is another issue as mentioned before, but even more
> pronounced with 8 cores. Upstream QEMU needs around 10 seconds to
> complete, with multi-threading around 100 seconds for the same test.

I'm not overly surprised, as this is a high-contention test and the
additional locking implies a lot of extra overhead. It's certainly a
useful test for comparing the relative performance of the various
approaches to atomics/exclusives, but I hope that in real-world tasks we
gain a bunch of performance for normal unlocked code running across
multiple cores.

I wonder if the perf tools can give us some insight into where the extra
latency is coming from?

>
> Best regards.

--
Alex Bennée
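
For anyone following along, the wake-up problem discussed above can be modelled in a few lines. This is only a sketch in Python, not QEMU's C code: VCpu, power_on and run_demo are hypothetical names made up for illustration, and only the halt_cond name is borrowed from the thread. It assumes a vCPU thread that sleeps on a condition variable while it is powered off, and shows that clearing the flag alone is not enough; the sleeper also has to be notified.

```python
import threading


class VCpu:
    """Toy stand-in for a vCPU thread (hypothetical, not QEMU code)."""

    def __init__(self):
        self.halt_cond = threading.Condition()
        self.powered_off = True
        self.sleeping = threading.Event()  # set once the thread is about to wait
        self.ran = False

    def thread_fn(self):
        with self.halt_cond:
            # Sleep until powered_off is cleared AND someone notifies us.
            while self.powered_off:
                self.sleeping.set()
                self.halt_cond.wait()
        self.ran = True


def power_on(cpu, kick=True):
    """Model of handling a power-on request for `cpu` (assumed behaviour)."""
    with cpu.halt_cond:
        cpu.powered_off = False
        if kick:
            # Without this notify the sleeper never re-checks powered_off.
            cpu.halt_cond.notify_all()


def run_demo(kick, timeout=1.0):
    """Return True if the sleeping vCPU thread woke up and ran."""
    cpu = VCpu()
    t = threading.Thread(target=cpu.thread_fn, daemon=True)
    t.start()
    cpu.sleeping.wait()   # make sure the thread is asleep before powering on
    power_on(cpu, kick)
    t.join(timeout)       # gives up after `timeout` if the thread never wakes
    return cpu.ran
```

With this model, run_demo(kick=True) lets the sleeping thread proceed, while run_demo(kick=False) leaves it blocked until the join times out, which mirrors the "one CPU spinning forever waiting for the second CPU to wake up" symptom from the patch description.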