From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=50338 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PcIkW-00050a-8R for qemu-devel@nongnu.org; Mon, 10 Jan 2011 09:29:33 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PcIkU-00084U-Rz for qemu-devel@nongnu.org; Mon, 10 Jan 2011 09:29:32 -0500 Received: from cantor2.suse.de ([195.135.220.15]:51688 helo=mx2.suse.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PcIkU-000849-Hk for qemu-devel@nongnu.org; Mon, 10 Jan 2011 09:29:30 -0500 Mime-Version: 1.0 (Apple Message framework v1082) Content-Type: text/plain; charset=us-ascii From: Alexander Graf In-Reply-To: <20110110142317.GG17026@hall.aurel32.net> Date: Mon, 10 Jan 2011 15:29:28 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <5DB773F2-511F-4810-A7DF-37986BA1F2DB@suse.de> References: <20110106221242.GG21099@hall.aurel32.net> <20110110140039.GE17026@hall.aurel32.net> <20110110141524.GF17026@hall.aurel32.net> <20110110142317.GG17026@hall.aurel32.net> Subject: [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Aurelien Jarno Cc: Blue Swirl , qemu-devel@nongnu.org On 10.01.2011, at 15:23, Aurelien Jarno wrote: > On Mon, Jan 10, 2011 at 03:20:40PM +0100, Alexander Graf wrote: >>=20 >> On 10.01.2011, at 15:15, Aurelien Jarno wrote: >>=20 >>> On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote: >>>>=20 >>>> On 10.01.2011, at 15:00, Aurelien Jarno wrote: >>>>=20 >>>>> On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote: >>>>>>=20 >>>>>> On 06.01.2011, at 23:12, Aurelien Jarno wrote: >>>>>>=20 >>>>>>> Hi, >>>>>>>=20 >>>>>>> I have just sent a tcg/arm patch concerning code retranslation. = You >>>>>>> might want to look at the description (copied below), as from a = first >>>>>>> glance ppc, s390 and sparc TCG targets might be affected. If you = see >>>>>>> guest kernel panics, some segmentation fault of qemu or in the = guest, >>>>>>> strange behaviors, that happen randomly and that looks difficult = to >>>>>>> debug it might be the issue. >>>>>>>=20 >>>>>>> Aurelien >>>>>>>=20 >>>>>>>=20 >>>>>>> | QEMU uses code retranslation to restore the CPU state when an = exception >>>>>>> | happens. For it to work the retranslation must not modify the = generated >>>>>>> | code. This is what is currently implemented in ARM TCG. >>>>>>> | >>>>>>> | However on CPU that don't have icache/dcache/memory = synchronised like >>>>>>> | ARM, this requirement is stronger and code retranslation must = not modify >>>>>>> | the generated code "atomically", as the cache line might be = flushed >>>>>>> | at any moment (interrupt, exception, task switching), even if = not >>>>>>> | triggered by QEMU. The probability for this to happen is very = low, and >>>>>>> | depends on cache size and associativiy, machine load, = interrupts, so the >>>>>>> | symptoms are might happen randomly. >>>>>>> | >>>>>>> | This requirement is currently not followed in tcg/arm, for the >>>>>>> | load/store code, which basically has the following structure: >>>>>>> | 1) tlb access code is written >>>>>>> | 2) conditional fast path code is written >>>>>>> | 3) branch is written with a temporary target >>>>>>> | 4) slow path code is written >>>>>>> | 5) branch target is updated >>>>>>> | The cache lines corresponding to the retranslated code is not = flushed >>>>>>> | after code retranslation as the generated code is supposed to = be the >>>>>>> | same. However if the cache line corresponding to the branch = instruction >>>>>>> | is flushed between step 3 and 5, and is not flushed again = before the >>>>>>> | code is executed again, the branch target is wrong. In the = guest, the >>>>>>> | symptoms are MMU page fault at a random addresses, which leads = to >>>>>>> | kernel page fault or segmentation faults. >>>>>>=20 >>>>>> I don't see the problem. If you have separate icache from dcache, = the code in question doesn't get executed during the rewrite, so all = should be fine. If you have both combined, the data write should = automatically modify the cache line, so all is great too. >>>>>=20 >>>>> As far as I understand, this is what happens to the branch target >>>>> instruction, or at least one possibility: >>>>>=20 >>>>> operation | icache | dcache | = mem/L2 | >>>>> = ------------------------------------------+--------+--------+--------+ >>>>> tlb access code is written | absent | absent | ok = | >>>>> conditional fast path code is written | absent | absent | ok = | >>>>> branch is written with a temporary target | absent | wrong | ok = | >>>>> cache line is flushed to memory | absent | absent | = wrong | >>>>> slow path code is written | absent | absent | = wrong | >>>>> branch target is updated | absent | ok | = wrong | >>>>> TB is re-executed | wrong | ok | = wrong | >>>>>=20 >>>>> Note that the issue is real, the patch really fixes the issue on = ARM and >>>>> MIPS. It's not impossible that I don't fully understand it given I = can't >>>>> analyse the cache values at a given time. However, when QEMU = crashes=20 >>>>> because of that, we have seen that the executed code doesn't match = what >>>>> GDB says, and changing the temporary branch value also changes the = way=20 >>>>> QEMU or the guest crash. >>>>>=20 >>>>> The problem doesn't happens very often though (for sure less than = 1=20 >>>>> every few millions). The other way to fix it is to issue a cache = flush >>>>> after a code retranslation, however, it is something very costly = on some >>>>> architectures. >>>>=20 >>>> I'd guess it only happens when code is overwritten. Only then = icache can still be stale in that code region. Can't we just invalidate = the icache on every tb flush? That should also fix the issue I'd guess. >>>=20 >>> That's a solution that works (tested). However making sure that the >>> retranslation doesn't change the code looks better. >>=20 >> Yeah, I agree. Temporary retranslation really shouldn't modify = existing code. We really do need an icache flush on tb flushes still = though as that's a separate issue? Or are these already in? >>=20 >=20 > We already have an icache flush after the first translation. We don't > have any when the TB is flushed, but I don't think it is something > necessary. The new TB that will replace it will issue an icache flush > after it has been translated. We don't need the flush after the translation if we just flush the whole = block on invalidate. That should be a speedup, no? Flushing the full = thing once is probably faster than lots of small flushes on every TB = write. Also, if there's a flush after the translation, I'm even more puzzled = how the race you describe above can occur. Alex