From: Alex Bennée
To: Pavel Dovgalyuk
Cc: 'Stefan Hajnoczi', 'qemu-devel', 'Paolo Bonzini', 'Pavel Dovgalyuk', 'Peter Maydell'
Subject: Re: [Qemu-devel] qemu-2.8-rc4 is broken
Date: Mon, 23 Jan 2017 09:38:35 +0000
Message-ID: <874m0qazfo.fsf@linaro.org>
In-reply-to: <000c01d2754d$59a4cd70$0cee6850$@ru>
References: <000301d259dc$f9d097c0$ed71c740$@ru>
 <000601d25a95$12b1b9f0$38152dd0$@ru>
 <20161220102126.GE5602@stefanha-x1.localdomain>
 <002501d25ab1$af024b00$0d06e100$@ru>
 <000301d25b4f$20018440$60048cc0$@ru>
 <000801d26bd9$dca56db0$95f04910$@ru>
 <87o9zd3jta.fsf@linaro.org>
 <000e01d26caa$dfdb3150$9f9193f0$@ru>
 <87mveleiw8.fsf@linaro.org>
 <000c01d2754d$59a4cd70$0cee6850$@ru>

Pavel Dovgalyuk writes:

>> From: Alex Bennée [mailto:alex.bennee@linaro.org]
>> Pavel Dovgalyuk writes:
>>
>> >> From: Alex Bennée [mailto:alex.bennee@linaro.org]
>> >
>> > Sorry, this is another problem which occurs only in icount replay mode:
>> > 1. cpu_handle_exception tries to force exception when it cannot occur due to
>> >    running out all the planned instructions:
>> >     } else if (replay_has_exception()
>> >                && cpu->icount_decr.u16.low + cpu->icount_extra == 0) {
>> >         /* try to cause an exception pending in the log */
>> >         cpu_exec_nocache(cpu, 1, tb_find(cpu, NULL, 0), true);
>> >         *ret = -1;
>> >         return true;
>> >
>> > 2. tb_find calls tb_gen_code, which cannot allocate new translation block
>> >    and calls tb_flush (which only queues the flushing) and cpu_loop_exit
>> > 3. cpu_loop_exit returns to infinite loop of cpu_exec and the condition
>> >         if (cpu_handle_exception(cpu, &ret)) {
>> >             break;
>> >         }
>> >    is checked again causing an infinite loop.
>> >
>> > TB cache is not flushed because we never execute that break and real work of tb_flush
>> > is made outside this loop.
>>
>> I think what we need is a:
>>
>>
>>   if (cpu->exit_request)
>>       break;
>
> Where this exit_request is supposed to be set?

Ahh my mistake. Currently it is a global exit_request (becoming a
per-cpu exit_request when MTTCG is merged). It's set by qemu_cpu_kick()
when work is queued up, in this case the tb_flush async work.

>> before the cpu_handle_exception() call to ensure any queued work gets
>> processed first. Can you give me your current command line so I can
>> reproduce this and check the fix works?
>
> I solved the problem using following patch:
>
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -451,6 +451,10 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
>  #ifndef CONFIG_USER_ONLY
>      } else if (replay_has_exception()
>                 && cpu->icount_decr.u16.low + cpu->icount_extra == 0) {
> +        /* Break the execution loop in case of running out of TB cache.
> +           This is needed to make flushing of the TB cache, because
> +           real flush is queued to be executed outside the cpu loop. */
> +        cpu->exception_index = EXCP_INTERRUPT;
>          /* try to cause an exception pending in the log */
>          cpu_exec_nocache(cpu, 1, tb_find(cpu, NULL, 0), true);
>          *ret = -1;

I wonder if it's worth renaming EXCP_INTERRUPT? I always get it confused
with a guest interrupt. But the effect is the same as we set it on an
exit_request.

-- 
Alex Bennée
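
For illustration only, the loop behaviour discussed in this thread can be
modelled with a small standalone C program. This is a toy sketch, not QEMU
code: ToyCpu and every toy_* function are hypothetical stand-ins, and the
model only assumes what the thread states, namely that the flush is merely
queued from inside the inner loop and really happens outside it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in QEMU. */
typedef struct {
    bool cache_full;    /* no room left for a new translation block */
    bool flush_queued;  /* a flush was requested but has not run yet */
    bool exit_request;  /* set when work is queued for the outer loop */
} ToyCpu;

/* Models queuing the flush: it records the request and kicks the CPU,
 * but does not free any cache space by itself. */
static void toy_queue_flush(ToyCpu *cpu)
{
    cpu->flush_queued = true;
    cpu->exit_request = true;
}

/* Models the work done outside the inner loop: the queued flush
 * actually runs here and the exit request is cleared. */
static void toy_process_queued_work(ToyCpu *cpu)
{
    if (cpu->flush_queued) {
        cpu->cache_full = false;
        cpu->flush_queued = false;
    }
    cpu->exit_request = false;
}

/* Models forcing the pending exception. It needs a free block; when the
 * cache is full it queues a flush and gives up, which stands in for the
 * jump back to the top of the inner loop. */
static bool toy_handle_exception(ToyCpu *cpu)
{
    if (cpu->cache_full) {
        toy_queue_flush(cpu);
        return false;
    }
    return true;
}

/* Inner execution loop. Returns the iteration on which it left the loop,
 * or -1 if it was still spinning after `limit` iterations. */
static int toy_inner_loop(ToyCpu *cpu, bool check_exit_request, int limit)
{
    assert(limit > 0);
    for (int i = 1; i <= limit; i++) {
        if (check_exit_request && cpu->exit_request) {
            return i;   /* leave so the queued work can be processed */
        }
        if (toy_handle_exception(cpu)) {
            return i;   /* exception handled, leave the loop */
        }
    }
    return -1;          /* stuck: the flush never got a chance to run */
}
```

With check_exit_request set to false and a full cache, toy_inner_loop() hits
its iteration limit, which mirrors the reported hang. With the check enabled,
it leaves on the second iteration, toy_process_queued_work() empties the
cache, and the next call handles the exception on its first pass.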