From: Alex Bennée
Date: Fri, 31 Mar 2017 12:21:13 +0100
Message-ID: <87fuht4rpy.fsf@linaro.org>
In-reply-to: <000901d2a9ff$89fcdad0$9df69070$@ru>
Subject: Re: [Qemu-devel] [PATCH v3 00/11] MTTCG fix-ups for 2.9
To: Pavel Dovgalyuk
Cc: peter.maydell@linaro.org, rth@twiddle.net, pbonzini@redhat.com,
    qemu-devel@nongnu.org, mttcg@listserver.greensocs.com,
    fred.konrad@greensocs.com, a.rigo@virtualopensystems.com,
    cota@braap.org, bobby.prani@gmail.com, nikunj@linux.vnet.ibm.com

Pavel Dovgalyuk writes:

>>>>>>> I tested on the vexpress-a9 platform with Debian wheezy.
>>>>>>
>>>>>> Thanks for that. I now have a test case that I can reproduce
>>>>>> failures on without needing graphics.
>>>>>>
>>>>>> I've been investigating whether there are any problems with the
>>>>>> timer processing now it has been moved into the TCG thread. The
>>>>>> record stage seems to work fine but I'm having difficulty figuring
>>>>>> out why playback freezes. It seems we get to a point where we are
>>>>>> stuck waiting for a suspiciously exact timer deadline:
>>>>>
>>>>> I've encountered the following behavior at the replay stage:
>>>>> - replay takes some instructions to execute (qemu_icount += counter)
>>>>> - the virtual timer is fired
>>>>
>>>> This is the virtual timer based on icount, not the virtual RT timer?
>>>
>>> The regular virtual timer. Its value is based on icount.
>>> virtual_rt is used for internal icount purposes.
>>
>> And this is where the clock warps come in? The bias brings the
>> virtual_rt time forward because execution is waiting for some external
>> event to fire (e.g. a timer IRQ)?
>
> I guess so. But the bias is not updated while the vCPU is running.
> The vCPU thread updates only qemu_icount, which is used for the
> virtual clock calculation.
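
For reference, here is my mental model of that calculation as a minimal
standalone C sketch - the state names are stand-ins rather than the
exact cpus.c code, but hopefully close enough that we agree on terms:

#include <stdint.h>

/* Stand-ins for the icount state (illustrative names, not cpus.c) */
static int64_t qemu_icount;        /* instructions accounted so far */
static int64_t qemu_icount_bias;   /* ns added by warps while idle  */
static int icount_time_shift = 3;  /* 2^shift ns per instruction    */

/* virtual clock = instructions scaled to ns plus the warp bias;
 * the vCPU thread only ever moves qemu_icount, while the bias only
 * moves when we warp the clock while the vCPUs are idle */
static int64_t virtual_clock_ns(void)
{
    return qemu_icount_bias + (qemu_icount << icount_time_shift);
}

If that matches your model then the only thing moving while the vCPU
runs is qemu_icount, and any inconsistency in how it is updated shows
up directly in the virtual clock.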

>>>> under the new scheme of being processed in the vCPU loop we should
>>>> only fire this one once all execution is done (although you may exit
>>>> the loop early before icount is exhausted).
>>>
>>> We should stop the vCPU before processing virtual timers because:
>>> - the virtual clock depends on the instruction counter
>>> - virtual timers may change the virtual hardware state
>>
>> If we do the processing at the top of the main vCPU loop this should
>> be equivalent. The instruction counter cannot run because we haven't
>> entered tcg_exec_cpu. We also process virtual timers in this thread
>> outside the loop so nothing else can be poking the hardware state.
>
> This is how QEMU worked in older versions - it switched between
> processing tasks (vCPU and timers) in one thread.

The only real difference is that the sequencing in the old case was
protected by the BQL - now it's by program order.

> But how will we join this behavior with the current design and MTTCG?

Ignore MTTCG for now. We don't even try to run multiple threads when
icount is enabled. However the change in the BQL locking is what has
triggered the failures.

Anyway I think I'm getting closer to narrowing it down. On record I see
replay_current_step jump backwards with this:

/* A common event print, called when reading or saving an event */
static void print_event(uint8_t event)
{
    static int event_count;
    static uint64_t last_step;
    uint64_t this_step = replay_get_current_step();

    fprintf(stderr, "replay: event %d is %d @ step=%#" PRIx64 "\n",
            event_count, event, this_step);

    if (this_step < last_step) {
        fprintf(stderr, "%s: !!! step %d went backwards "
                "%#" PRIx64 "=>%#" PRIx64 " !!!\n",
                __func__, event_count, last_step, this_step);
        abort();
    }
    last_step = this_step;
    event_count++;
}

void replay_put_event(uint8_t event)
{
    assert(event < EVENT_COUNT);
    replay_put_byte(event);
    print_event(event);
}

The jump back is fairly consistent in my case (event 66 in the vexpress
run) but I'm fairly sure that is the bug. I can't see any reason for the
step count to go backwards - indeed that breaks the premise of
deterministic replay. Unfortunately when I start putting breakpoints and
watchpoints in to see how this jump back occurs the problem goes away
until I disable them. So this very much looks like a race condition
corrupting the icount data.
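
To be concrete about the sort of race I suspect, here is a purely
illustrative sketch (made-up helper names, not the actual QEMU code) of
why an unsynchronised budget/hand-back on qemu_icount lets the sampled
step count go backwards:

#include <stdint.h>

/* Illustrative shared state: deliberately no lock or atomics,
 * because that is exactly what I suspect is missing. */
static int64_t qemu_icount;

/* vCPU thread: account for a block's worth of instructions up
 * front... (hypothetical helper names) */
void icount_budget_block(int64_t budget)
{
    qemu_icount += budget;
}

/* ...and hand back whatever was left when the block exited early. */
void icount_return_unexecuted(int64_t unexecuted)
{
    qemu_icount -= unexecuted;
}

/* event recording: a sample taken between the += and the -= is
 * larger than a sample taken just after the -=, so the recorded
 * step count appears to go backwards. */
int64_t sample_current_step(void)
{
    return qemu_icount;
}

That would also fit with the breakpoints making the problem vanish -
anything that slows one thread down can close the window.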
>> >> >> >> How does this work for instructions that sleep (like WFI)? The WFI will >> >> cause us to exit the loop but I assume we need a "real" timer to trigger >> >> a firing of whatever the WFI is waiting for as time according to icount >> >> isn't advancing. >> > >> > This is virtual_rt for. It advances even when vCPU is stopped. >> >> So I'm not seeing this happen: >> >> process_icount_data: icount not increased 117302332749=>117302332749 RT:-1 >> >> I wonder if the advance of virtual_rt has been deadlocked? The -1 >> suggests its not even setup. > > qemu_start_warp_timer says that it is setup only when there are active virtual timers. > > >> > By the way, I made small description of the replay log format >> > to be later included into docs/replay.txt: >> > >> Thanks. That's really useful. >> >> Do you think it would be worth including the dumper script I wrote in >> ./scripts/? I mainly wrote it to aid my understanding of the code flow >> but it might be useful in future investigations? > > Usually I get human-readable logs in record and replay modes to find > divergence between them, but I think your script might be useful in some cases. Cool. I'll include it in the icount series once I have something to post (fingers crossed I'm getting close). -- Alex Bennée