From: KONRAD Frédéric
Date: Wed, 22 May 2013 18:14:56 +0200
Subject: Re: [Qemu-devel] [RFC] reverse execution.
To: Blue Swirl
Cc: Mark Burton, qemu-devel
Message-ID: <519CEF00.6020709@greensocs.com>
References: <5189478C.8090405@greensocs.com> <519667A7.9010902@greensocs.com>

On 18/05/2013 20:52, Blue Swirl wrote:
> On Fri, May 17, 2013 at 5:23 PM, KONRAD Frédéric wrote:
>> On 09/05/2013 19:54, Blue Swirl wrote:
>>> On Tue, May 7, 2013 at 6:27 PM, KONRAD Frédéric wrote:
>>>> Hi,
>>>>
>>>> We are trying to find a way to do reverse execution happen with QEMU.
>>>>
>>>> Actually, it is possible to debug the guest through the gdbstub, we want
>>>> to make the reverse execution possible with GDB as well.
>>>>
>>>> How we are trying to make that working (basically without optimisation):
>>>>
>>>>   -QEMU takes regular snapshot of the VM:
>>>>        that can be done with the save vm code without optimisation first.
>>>>
>>>>   -When the VM is stopped and GDB requests a reverse-step:
>>>>        load the last snapshot and replay to one instruction before the
>>>>        current PC.
>>>>
>>>> There are one issue with that for now (for a basic running reverse
>>>> execution):
>>>>   -How to stop one instruction before the actual PC.
>>> Add a special translation mode for reverse execution where the next PC
>>> is checked after each instruction. Alternatively, you could make
>>> temporary snapshots during this mode (first 1s intervals, then 0.1s
>>> etc) which could be used to find the location. I think this way was
>>> discussed briefly earlier in the list, please check the archives.
>>>
>> Hi, thanks for your answer!
>>
>> I didn't find the discussion in the archive.. Do you have a clue? (Title or
>> sender?)
> Paul Brook (long time QEMU developer) made a paper about this together
> with Daniel Jacobowitz:
> http://www.linuxsymposium.org/archives/GCC/Reprints-2007/jacobowitz-reprint.pdf
>
> IIRC Paul also mentioned some techniques on the list at that time but
> I couldn't find that in the archives.
>
> Other related discussions:
> http://article.gmane.org/gmane.comp.emulators.qemu/88447
> http://article.gmane.org/gmane.comp.emulators.qemu/94861
> http://article.gmane.org/gmane.comp.emulators.qemu/154572
>
> Also this site contains some overview of reverse debugging:
> http://jakob.engbloms.se/archives/1554
>

Thanks for your help :).

>> For now we tried some other things which are not working very well,
>>
>> It appeared that the replay is not deterministic even with icount:
>>     - the whole icount mechanism is not saved with save_vm (which can be
>>       achieved by moving qemu_icount to TimerState according to Paolo)
>>     - replaying two times the same thing and stopping at a specific
>>       breakpoint show two differents vmclock, so replaying the
>>       same amount of time don't work, and we abandoned this idea.
>>
>> We tried to count the amount of time tcg_qemu_tb_exec exited with having
>> executed some TB and we stopped one before for the replay.
>> This is nearly working but:
>>     - tcg_qemu_tb_exec exits a little more time during the first replay,
>>       seems the TB linked list is split dynamically?
>>     - this works with the next replay (reverse-stepi) but we can't stop at
>>       the exact PC instruction with this method.
>>
>> So we will try to add an instruction counter in the CPUState and increments
>> it after each instruction in the translation code,
>> which I think is approximately what you suggest.
>> Then when replaying the code from the snapshot, we will check the amount of
>> executed instruction and stop one instruction before.
>> Maybe we can re-use icount mechanism but this might be a lot more
>> complicated as it is a de-counter?
>>
>> Can this be working?
>>
>> Maybe we will need to trace the PC from the snapshot to the exact location?
> That should be easy, but not the fastest way.
>
>> Or use both mechanism to get the right location?
> Yes, you could load VM from previous snapshot and then use icount or
> just host timer to get approximately halfway. Make a new snapshot and
> then try again, starting from that snapshot. When you get close
> enough, singlestep to the final instruction.

Well, in the end we plan to do it approximately this way:

We added a translation block counter, which is incremented by the TCG code
the same way as icount, and we raise a debug exception when we are at the
right location.

Unfortunately this is not sufficient in terms of precision: the replay can
only stop at TB granularity, so we may jump back a whole TB.

We have two choices:
    a/ keep this "executed translation block" counter and go "step by
       step" to the right location.
    b/ turn this counter into an "executed instruction" counter like
       icount and do "step by step" execution when we are replaying.

The first is a bit difficult, as we don't have the exact PC location
where to stop, and the second can be really slow (I don't have performance
measurements at the moment).
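
To make the precision problem above concrete, here is a minimal Python sketch
of a toy model. It is not QEMU code: the trace layout and the names
replay_tb_granular and replay_hybrid are hypothetical, and it assumes the
replay target is known as an executed-instruction count, which the real
implementation would still have to record.

```python
from typing import List, Tuple

# Toy model (hypothetical): a recorded run is a list of translation
# blocks (TBs), each TB being the list of guest PCs it executes.
Trace = List[List[int]]


def replay_tb_granular(trace: Trace, insn_target: int) -> int:
    """Replay whole TBs only and return the instruction count reached.

    The replay stops at the last TB boundary that does not overshoot
    insn_target, so it can land short of the target.
    """
    executed = 0
    for tb in trace:
        if executed + len(tb) > insn_target:
            break
        executed += len(tb)
    return executed


def replay_hybrid(trace: Trace, insn_target: int) -> Tuple[int, int, int]:
    """Run whole TBs up to the TB holding the target, then single-step.

    Returns (executed_instructions, next_pc, single_steps).
    """
    executed = 0
    steps = 0
    for tb in trace:
        if executed + len(tb) <= insn_target:
            # The whole TB lies before the target: run it at full speed.
            executed += len(tb)
            continue
        # The target lies inside this TB: step one instruction at a time.
        for pc in tb:
            if executed == insn_target:
                return executed, pc, steps
            executed += 1
            steps += 1
    raise ValueError("target is beyond the end of the recorded trace")


if __name__ == "__main__":
    trace = [[0x100, 0x104, 0x108], [0x200, 0x204],
             [0x300, 0x304, 0x308, 0x30C]]
    # The guest stopped after 7 instructions; reverse-step wants 6.
    print(replay_tb_granular(trace, 6))
    print(replay_hybrid(trace, 6))
```

In this toy run the TB-only replay stops after 5 instructions, one short of
the target, while the second function reaches instruction 6 with a single
step inside the last TB.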

Maybe we can try mixing both: replaying to the start of the right TB,
then step by step going to the right PC.

Fred

>> Thanks,
>> Fred
>>
>>
>>>> We though that using "-icount" and stop the guest a little time before
>>>> the actual position would give us the right behavior (We use a
>>>> qemu_timer with vm_clock to stop the vm at the good time), but it seems
>>>> that it is not deterministic, and not reproducable.
>>>>
>>>> Is that normal?
>>>>
>>>> We don't make any input during the replay, and we though that it can be
>>>> caused by some timer interruption but "-icount" is using a virtual timer
>>>> as I understand?
>>>>
>>>> We have two other ideas:
>>>>
>>>>     -Using TCI and count each instruction executed by the processor,
>>>>      then stop one instruction before the actual position. This seems
>>>>      slower.
>>>>
>>>>     -Using single-step to count each instruction, then stop one
>>>>      instruction before the actual position.
>>>>
>>>> Would that be better?
>>>>
>>>> For now we can restore the VM from the last snapshot, when we do a
>>>> reverse-step but we can't stop at the exact position.
>>>>
>>>> Thanks,
>>>> Fred
>>>>