From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:54734) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z7gg7-0000iz-3Y for qemu-devel@nongnu.org; Wed, 24 Jun 2015 05:09:08 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Z7gg2-0006cN-RP for qemu-devel@nongnu.org; Wed, 24 Jun 2015 05:09:06 -0400
Received: from mel.v6.act-europe.fr ([2a02:2ab8:224:1::a0a:d2]:46416 helo=smtp.eu.adacore.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Z7gg2-0006bR-IL for qemu-devel@nongnu.org; Wed, 24 Jun 2015 05:09:02 -0400
Message-ID: <558A73AC.303@adacore.com>
Date: Wed, 24 Jun 2015 11:09:00 +0200
From: Fabien Chouteau
MIME-Version: 1.0
References: <1435010055-4584-1-git-send-email-zavadovsky.yan@gmail.com> <5588F689.8050202@weilnetz.de> <55893923.9010304@redhat.com> <55899261.6020705@weilnetz.de>
In-Reply-To: <55899261.6020705@weilnetz.de>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH] thread-win32: fix GetThreadContext() permanently fails
To: Stefan Weil , Paolo Bonzini , Peter Maydell , Ян Заводовский
Cc: Olivier Hainque , QEMU Developers

On 06/23/2015 07:07 PM, Stefan Weil wrote:
> On 23.06.2015 at 12:46, Paolo Bonzini wrote:
>> On 23/06/2015 12:30, Peter Maydell wrote:
>>> On 23 June 2015 at 10:55, Ян Заводовский wrote:
>>>> On Tue, Jun 23, 2015 at 9:02 AM, Stefan Weil wrote:
>>>>> We should add an URL to reliable documentation which supports that
>>>>> claim.
>>>> Unfortunately, MSDN says only "SuspendThread suspends the thread. It's
>>>> designed for debuggers.
>>>> Don't use in applications.":
>>>> https://msdn.microsoft.com/en-us/library/windows/desktop/ms686345(v=vs.85).aspx
>>>> And nothing more useful.
>>>> So when I found this piece of code with Suspend/Resume and failed GetContext
>>>> I did some googling.
>>>> And found this article:
>>>> http://blogs.msdn.com/b/oldnewthing/archive/2015/02/05/10591215.aspx
>>> Personally I am happy to treat a Raymond Chen blog post as "reliable
>>> documentation"...
>> Me too. :)
>
> +1
>
> Fabien, I wonder why nobody noticed that the current
> code did not do what it was written for. As far as I see
> the threads were created with the wrong options, so
> GetThreadContext always failed and therefore was only
> executed once, so there was no waiting for thread
> suspension.
>

I'm surprised as well, but we run several hundred thousand tests every
day (one QEMU instance per test), and before this fix a few instances
froze for no apparent reason. We identified a possible race condition on
SMP hosts, and the bug disappeared after this fix.

Even if the call was erroneous, adding a call to GetThreadContext
probably gave the suspend request more time, or forced it to take
effect; that is the only explanation I have right now. But clearly there
was a bug, and the call to GetThreadContext fixed it.

I found other pieces of code that use this technique but call
GetThreadContext only once (not in a loop like we did), so maybe calling
it once is enough and the loop is superfluous...

> Removing the code would have given identical results.
>

Considering we are talking about thread synchronization on Windows and
SMP hosts, I would not make that assumption :)

> Is that an indicator that the SuspendThread is not
> needed at all, as it was discussed in the other e-mails
> here?

If we completely change the thread synchronization on Windows, maybe
SuspendThread is not needed anymore, but with the current scheme (at
least what I know of it), I don't see how we can remove it.
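
For readers following along, here is a minimal sketch of the
suspend-then-poll pattern discussed above. It is not the QEMU code:
wait_until_suspended, probe_thread_context, suspend_thread_synchronously,
probe_after_n_calls and the retry limit of 1000 are hypothetical names
and values chosen for illustration. The only Win32 calls used are
SuspendThread, GetThreadContext and ResumeThread, and the sketch assumes
the handle was opened with both THREAD_SUSPEND_RESUME and
THREAD_GET_CONTEXT access, since GetThreadContext fails without the
latter.

```c
#include <stdbool.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical callback type: returns true once the probe succeeds. */
typedef bool (*context_probe_fn)(void *opaque);

/*
 * Call the probe up to max_attempts times.
 * Returns the number of attempts used on success, or -1 if the probe
 * never succeeded (for example because the handle lacks access rights).
 */
static int wait_until_suspended(context_probe_fn probe, void *opaque,
                                int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (probe(opaque)) {
            return attempt;
        }
    }
    return -1;
}

/* Portable stand-in probe for experiments: succeeds on the Nth call. */
static bool probe_after_n_calls(void *opaque)
{
    int *remaining = opaque;
    return --(*remaining) <= 0;
}

#ifdef _WIN32
/* Probe that succeeds once the thread's context can be read. */
static bool probe_thread_context(void *opaque)
{
    CONTEXT ctx = {0};
    ctx.ContextFlags = CONTEXT_CONTROL;
    return GetThreadContext((HANDLE)opaque, &ctx) != 0;
}

/*
 * Suspend the thread, then read its context so that the suspend request
 * has taken effect before we continue. On success the caller is
 * responsible for calling ResumeThread later.
 */
static bool suspend_thread_synchronously(HANDLE thread)
{
    if (SuspendThread(thread) == (DWORD)-1) {
        return false;
    }
    /* Assumption: 1000 attempts is an arbitrary illustrative bound. */
    if (wait_until_suspended(probe_thread_context, thread, 1000) < 0) {
        ResumeThread(thread);   /* undo the suspend on failure */
        return false;
    }
    return true;
}
#endif
```

Bounding the retries is a choice made for this sketch only: a handle
without the right access then shows up as a reported failure instead of
a single silently failed call or an endless spin.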

As I said before, we must be very careful with this piece of code.

Regards,