From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1KM0Bh-0002OI-UA
	for qemu-devel@nongnu.org; Thu, 24 Jul 2008 08:44:53 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1KM0Bg-0002N3-AH
	for qemu-devel@nongnu.org; Thu, 24 Jul 2008 08:44:53 -0400
Received: from [199.232.76.173] (port=48094 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1KM0Bf-0002Mw-S7
	for qemu-devel@nongnu.org; Thu, 24 Jul 2008 08:44:51 -0400
Received: from mail.codesourcery.com ([65.74.133.4]:52345)
	by monty-python.gnu.org with esmtps
	(TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60)
	(envelope-from <paul@codesourcery.com>) id 1KM0Be-000338-FR
	for qemu-devel@nongnu.org; Thu, 24 Jul 2008 08:44:51 -0400
From: Paul Brook <paul@codesourcery.com>
Subject: Re: [Qemu-devel] Weird behavior while using the instruction counter
Date: Thu, 24 Jul 2008 13:44:32 +0100
References: <3e1533500807240342s15e6e508kd1d49152b0892e9f@mail.gmail.com>
In-Reply-To: <3e1533500807240342s15e6e508kd1d49152b0892e9f@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200807241344.35106.paul@codesourcery.com>
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <http://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: qemu-devel@nongnu.org
Cc: Luis Pureza <pureza@student.dei.uc.pt>

On Thursday 24 July 2008, Luis Pureza wrote:
> Hi,
>
> I'm using the instruction counter to execute N instructions at a time.
> With very small values of N (say, N < 10), I observed the following
> behavior:
>
> 1. A new TB is generated and execution starts there;
> 2. The instruction counter timer expires and cpu_exec_nocache() is called;
> 3. cpu_exec_nocache() generates a new TB for the same PC and starts to
> execute it;
> 4. Some instruction inside the TB turns out to be an I/O instruction.
> Thus, cpu_io_recompile() gets called
> 5. cpu_io_recompile() regenerates the TB and longjmps back to the
> beginning of cpu_exec()
> 6. on cpu_exec(), tb_find_fast() returns the first TB, instead of the
> one generated by cpu_io_recompile()
> 7. Endless loop!

I think I can see how this could happen, but only when the IO instruction is 
the first instruction in the block.  For any other TB you probably get 
run+fault first.

> Actually, for some reason beyond my comprehension, the loop is not
> really infinite: after a few seconds it actually executes the block
> and moves on. However, as you can imagine, this is too slow.

You need to figure out what's actually happening. Either it's an infinite loop 
or it's not.

Instruction counter expiry and the first IO trap are both fairly expensive
operations. Having the counter expire every few instructions will make qemu go
extremely slowly.  Are you sure it's not just running very slowly?

> I think I fixed the problem by appending CF_LAST_IO to the cflags of
> the TB generated by cpu_exec_nocache(). This way, cpu_io_recompile()
> won't be called for this TB.

No. You're assuming the IO trap occurs on the last instruction, which is not
true.  The problem is that cpu_exec_nocache introduces a second TB with the
same lookup key (pc+flags). cpu_io_recompile (and possibly other places)
assumes the currently executing TB is the only TB that matches. It needs to
invalidate the original TB (if it exists) as well as the uncached one.
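
To illustrate the key collision, here is a minimal toy model in C. It is a
sketch only, not QEMU code: ToyTB, toy_lookup, toy_io_recompile and the other
toy_* names are hypothetical, and the lookup is assumed to return the oldest
valid entry matching (pc, flags). It shows that invalidating only the
currently executing (uncached) TB leaves the original cached TB as the lookup
result, while also invalidating the matching cached TB lets the regenerated
one be found.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name below is hypothetical, not a QEMU API. */
typedef struct ToyTB {
    unsigned long pc;   /* guest program counter the block starts at */
    unsigned flags;     /* CPU state flags that are part of the key */
    int valid;          /* cleared when the block is invalidated */
} ToyTB;

#define TOY_MAX 8
static ToyTB *toy_list[TOY_MAX];
static size_t toy_count;

/* Return the oldest valid block matching (pc, flags), or NULL. */
ToyTB *toy_lookup(unsigned long pc, unsigned flags)
{
    for (size_t i = 0; i < toy_count; i++) {
        ToyTB *tb = toy_list[i];
        if (tb->valid && tb->pc == pc && tb->flags == flags) {
            return tb;
        }
    }
    return NULL;
}

/* Make a block visible to toy_lookup(). */
void toy_insert(ToyTB *tb)
{
    assert(toy_count < TOY_MAX);
    toy_list[toy_count++] = tb;
}

void toy_invalidate(ToyTB *tb)
{
    tb->valid = 0;
}

/*
 * Model of an IO recompile. 'current' is the block being executed, which
 * may be an uncached one that was never inserted. If also_invalidate_match
 * is nonzero, any other cached block with the same key is invalidated too.
 */
void toy_io_recompile(ToyTB *current, ToyTB *replacement,
                      int also_invalidate_match)
{
    if (also_invalidate_match) {
        ToyTB *other = toy_lookup(current->pc, current->flags);
        if (other != NULL && other != current) {
            toy_invalidate(other);
        }
    }
    toy_invalidate(current);
    toy_insert(replacement);
}

/* Returns 1 if the lookup after recompile still yields the original TB. */
int toy_stale_after_recompile(int also_invalidate_match)
{
    ToyTB original = { 0x1000, 0, 1 };
    ToyTB uncached = { 0x1000, 0, 1 };    /* same key, never inserted */
    ToyTB replacement = { 0x1000, 0, 1 };

    toy_count = 0;
    toy_insert(&original);
    toy_io_recompile(&uncached, &replacement, also_invalidate_match);
    return toy_lookup(0x1000, 0) == &original;
}
```

In this model, toy_stale_after_recompile(0) corresponds to the loop described
in the report (the lookup keeps returning the first TB), and
toy_stale_after_recompile(1) corresponds to the fix suggested above.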

A related issue is that we don't invalidate the cached TB if a fault occurs 
while it is being executed.

Paul