Date: Thu, 23 May 2013 03:57:21 +0200
From: "Edgar E. Iglesias"
Message-ID: <20130523015721.GA10376@smtp.vpn>
References: <5189478C.8090405@greensocs.com> <519667A7.9010902@greensocs.com>
Subject: Re: [Qemu-devel] [RFC] reverse execution.
To: Mark Burton
Cc: Blue Swirl, Peter Maydell, qemu-devel, KONRAD Frédéric

On Fri, May 17, 2013 at 09:16:06PM +0200, Mark Burton wrote:
> I wish I could say I understood it better, but at this point any
> insight would be gratefully received. However, what does seem clear
> is that the intent and purpose of icount is subtly different, and
> possibly orthogonal to what we're trying to achieve.
>
> And - actually, determinism (or the lack of it) is definitely an
> issue, but - for now - we have spent much of this week finding a bit
> of code that avoids any non-deterministic behavior - simply so we can
> make sure the mechanisms work - THEN we will tackle the thorny subject
> of what is causing non-deterministic behavior (by which, I _suspect_
> I mean, what devices or timers are not adhering to the icount
> mechanism).
>
> To recap, as I understand things, setting the icount value on the
> command line is intended to give a rough "instructions per second"
> mechanism. One of the effects of that is to make things more
> deterministic. Our overall intent is to allow a user who has hit a
> bug to step backwards.
>
> After much discussion (!) I'm convinced by the argument that I might
> in the end want both of these things. I might want to set some sort
> of instructions per second value (and change it between runs), and
> if/when I hit a bug, go backwards.
>
> Thus far, so good.
>
> Underneath the hood, icount keeps a counter in the TCG environment
> which is decremented (as Fred says), and the icount mechanism plays
> with it as it sees fit.
> The bottom line is that, orthogonal to this, we need a separate
> 'counter', which is almost identical to the icount counter, in order
> to count instructions for the reverse execution mechanism.
>
> We have looked at re-using the icount counter as Fred said, but that
> soon lands you in a whole heap of pain. Our conclusion - it would be
> much cleaner to have a separate dedicated counter; then you can simply
> use either mechanism independently of the other.
> On this subject - I would like to hear any and all views.
>
> Having said all of that, in BOTH cases, we need determinism.
>
> In our case, determinism is very tightly defined (which - I suspect -
> may not be the case for icount). In our case, having returned to a
> snapshot, the subsequent execution must follow the EXACT SAME path
> that it did last time. No ifs, no buts. No IO, no income tax, no VAT,
> no money back, no guarantee….
>
> Right now, what Fred has found is that sometimes things 'drift'… we
> will (of course) be looking into that. But, for now, our principal
> concern is to take a simple bit of code, with no IO, and nothing that
> causes non-determinism - save a snapshot at the beginning of the
> sequence, run, hit a breakpoint, return to the breakpoint, and be able
> to _exactly_ return to the place we came from.
>
> As Fred said, we imagined that we could do this based on TBs, at
> least at a 'block' level (which actually may be good enough for us).
> However, our mechanism for counting TBs was badly broken. Nonetheless,
> we learnt a lot about TBs - and about some of the non-deterministic
> behavior that will come to haunt us later. We also concluded that
> counting TBs is always going to be second rate, and if we're going to
> do this properly, we need to count instructions. Finally, we have
> concluded that re-using the icount counters is going to be very
> painful; we need to re-use the same mechanism, but we need dedicated
> counters…
>
>
> Again, please, all - pitch in and say what you think. Fred and I have
> been scratching our heads all week on this, and I'm not convinced we
> have come up with the right answers, so any input would be most
> welcome.

Hi,

This was a long time ago, but I recall having issues with determinism
when hacking on TLMu. Ditching the display timer helped. IIRC, I was
getting 100% reproducible runs after that.

Cheers,
Edgar
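
A toy sketch of the two ideas discussed in the quoted text: a dedicated
instruction counter that is kept apart from an icount-style budget, and
stepping backwards by restoring a snapshot and re-executing up to
(count - 1). This is not QEMU code. ToyMachine, CpuState, budget,
run_until and reverse_step are all hypothetical names made up for the
illustration, and the "instruction" is an arbitrary deterministic
update. It only works because the toy has no IO and no timers, which is
exactly the restricted case Mark describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuState:
    """Everything a snapshot must capture (hypothetical, not QEMU's CPUState)."""
    pc: int = 0
    acc: int = 0
    insn_count: int = 0  # dedicated counter: only ever incremented


class ToyMachine:
    """Deterministic toy CPU with two independent counters."""

    def __init__(self, budget):
        self.state = CpuState()
        # icount-like counter (assumption): decremented per instruction and
        # refilled whenever the timing side feels like it. It is deliberately
        # NOT part of the snapshot and never influences execution.
        self.budget = budget
        self.refill = budget

    def step(self):
        s = self.state
        # One "instruction": a pure function of the previous state.
        self.state = CpuState(s.pc + 1, (s.acc * 31 + s.pc) % 65521,
                              s.insn_count + 1)
        self.budget -= 1
        if self.budget == 0:
            self.budget = self.refill

    def snapshot(self):
        return self.state  # frozen dataclass, safe to keep a reference

    def restore(self, snap):
        self.state = snap

    def run_until(self, count):
        while self.state.insn_count < count:
            self.step()


def reverse_step(machine, snap):
    """Go back one instruction by replaying from an earlier snapshot."""
    target = machine.state.insn_count - 1
    if target < snap.insn_count:
        raise ValueError("cannot step back past the snapshot")
    machine.restore(snap)
    machine.run_until(target)
    return machine.state


def demo():
    m = ToyMachine(budget=7)
    m.run_until(100)
    snap = m.snapshot()      # snapshot early in the run
    m.run_until(499)
    before = m.state         # state one instruction before the "breakpoint"
    m.run_until(500)         # the "breakpoint"
    after_back = reverse_step(m, snap)
    return before, after_back
```

Calling demo() returns the state recorded at instruction 499 on the
forward run and the state reached by reverse_step(); in this toy the two
compare equal, and changing the budget between runs does not alter the
path, since the budget is never read by step(). In a real emulator any
device or timer that does not follow the instruction count would break
that equality, which is the 'drift' mentioned above.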