[Qemu-devel] tcg/{ppc, s390, sparc}: branch target and code retranslation

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] tcg/{ppc, s390, sparc}: branch target and code retranslation
@ 2011-01-06 22:12 Aurelien Jarno
       [not found] ` <ADE7D325-3612-4BD9-A88E-7B88E68449E1@suse.de>
  0 siblings, 1 reply; 10+ messages in thread
From: Aurelien Jarno @ 2011-01-06 22:12 UTC (permalink / raw)
  To: Vassili Karpov, Alexander Graf, Blue Swirl; +Cc: qemu-devel

Hi,

I have just sent a tcg/arm patch concerning code retranslation. You
might want to look at the description (copied below), as from a first
glance ppc, s390 and sparc TCG targets might be affected. If you see
guest kernel panics, some segmentation fault of qemu or in the guest,
strange behaviors, that happen randomly and that looks difficult to
debug it might be the issue.

Aurelien

| QEMU uses code retranslation to restore the CPU state when an exception
| happens. For it to work the retranslation must not modify the generated
| code. This is what is currently implemented in ARM TCG.
|
| However on CPU that don't have icache/dcache/memory synchronised like
| ARM, this requirement is stronger and code retranslation must not modify
| the generated code "atomically", as the cache line might be flushed
| at any moment (interrupt, exception, task switching), even if not
| triggered by QEMU. The probability for this to happen is very low, and
| depends on cache size and associativiy, machine load, interrupts, so the
| symptoms are might happen randomly.
|
| This requirement is currently not followed in tcg/arm, for the
| load/store code, which basically has the following structure:
|   1) tlb access code is written
|   2) conditional fast path code is written
|   3) branch is written with a temporary target
|   4) slow path code is written
|   5) branch target is updated
| The cache lines corresponding to the retranslated code is not flushed
| after code retranslation as the generated code is supposed to be the
| same. However if the cache line corresponding to the branch instruction
| is flushed between step 3 and 5, and is not flushed again before the
| code is executed again, the branch target is wrong. In the guest, the
| symptoms are MMU page fault at a random addresses, which leads to
| kernel page fault or segmentation faults.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation
       [not found] ` <ADE7D325-3612-4BD9-A88E-7B88E68449E1@suse.de>
@ 2011-01-10 14:00   ` Aurelien Jarno
  2011-01-10 14:07     ` Alexander Graf
  0 siblings, 1 reply; 10+ messages in thread
From: Aurelien Jarno @ 2011-01-10 14:00 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Blue Swirl, qemu-devel

On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
> 
> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
> 
> > Hi,
> > 
> > I have just sent a tcg/arm patch concerning code retranslation. You
> > might want to look at the description (copied below), as from a first
> > glance ppc, s390 and sparc TCG targets might be affected. If you see
> > guest kernel panics, some segmentation fault of qemu or in the guest,
> > strange behaviors, that happen randomly and that looks difficult to
> > debug it might be the issue.
> > 
> > Aurelien
> > 
> > 
> > | QEMU uses code retranslation to restore the CPU state when an exception
> > | happens. For it to work the retranslation must not modify the generated
> > | code. This is what is currently implemented in ARM TCG.
> > |
> > | However on CPU that don't have icache/dcache/memory synchronised like
> > | ARM, this requirement is stronger and code retranslation must not modify
> > | the generated code "atomically", as the cache line might be flushed
> > | at any moment (interrupt, exception, task switching), even if not
> > | triggered by QEMU. The probability for this to happen is very low, and
> > | depends on cache size and associativiy, machine load, interrupts, so the
> > | symptoms are might happen randomly.
> > |
> > | This requirement is currently not followed in tcg/arm, for the
> > | load/store code, which basically has the following structure:
> > |   1) tlb access code is written
> > |   2) conditional fast path code is written
> > |   3) branch is written with a temporary target
> > |   4) slow path code is written
> > |   5) branch target is updated
> > | The cache lines corresponding to the retranslated code is not flushed
> > | after code retranslation as the generated code is supposed to be the
> > | same. However if the cache line corresponding to the branch instruction
> > | is flushed between step 3 and 5, and is not flushed again before the
> > | code is executed again, the branch target is wrong. In the guest, the
> > | symptoms are MMU page fault at a random addresses, which leads to
> > | kernel page fault or segmentation faults.
> 
> I don't see the problem. If you have separate icache from dcache, the code in question doesn't get executed during the rewrite, so all should be fine. If you have both combined, the data write should automatically modify the cache line, so all is great too.

As far as I understand, this is what happens to the branch target
instruction, or at least one possibility:

operation                                 | icache | dcache | mem/L2 |
------------------------------------------+--------+--------+--------+
tlb access code is written                | absent | absent |   ok   |
conditional fast path code is written     | absent | absent |   ok   |
branch is written with a temporary target | absent | wrong  |   ok   |
cache line is flushed to memory           | absent | absent |  wrong |
slow path code is written                 | absent | absent |  wrong |
branch target is updated                  | absent |   ok   |  wrong |
TB is re-executed                         | wrong  |   ok   |  wrong |

Note that the issue is real, the patch really fixes the issue on ARM and
MIPS. It's not impossible that I don't fully understand it given I can't
analyse the cache values at a given time. However, when QEMU crashes 
because of that, we have seen that the executed code doesn't match what
GDB says, and changing the temporary branch value also changes the way 
QEMU or the guest crash.

The problem doesn't happens very often though (for sure less than 1 
every few millions). The other way to fix it is to issue a cache flush
after a code retranslation, however, it is something very costly on some
architectures.

> As long as the code in question doesn't get run in between, I don't see where breakage could occur?

The breakage can occur, because the same TB can be re-executed later,
and we don't have an explicit cache flush after the retranslation.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation
  2011-01-10 14:00   ` [Qemu-devel] " Aurelien Jarno
@ 2011-01-10 14:07     ` Alexander Graf
  2011-01-10 14:15       ` Aurelien Jarno
  0 siblings, 1 reply; 10+ messages in thread
From: Alexander Graf @ 2011-01-10 14:07 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Blue Swirl, qemu-devel


On 10.01.2011, at 15:00, Aurelien Jarno wrote:

> On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
>> 
>> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
>> 
>>> Hi,
>>> 
>>> I have just sent a tcg/arm patch concerning code retranslation. You
>>> might want to look at the description (copied below), as from a first
>>> glance ppc, s390 and sparc TCG targets might be affected. If you see
>>> guest kernel panics, some segmentation fault of qemu or in the guest,
>>> strange behaviors, that happen randomly and that looks difficult to
>>> debug it might be the issue.
>>> 
>>> Aurelien
>>> 
>>> 
>>> | QEMU uses code retranslation to restore the CPU state when an exception
>>> | happens. For it to work the retranslation must not modify the generated
>>> | code. This is what is currently implemented in ARM TCG.
>>> |
>>> | However on CPU that don't have icache/dcache/memory synchronised like
>>> | ARM, this requirement is stronger and code retranslation must not modify
>>> | the generated code "atomically", as the cache line might be flushed
>>> | at any moment (interrupt, exception, task switching), even if not
>>> | triggered by QEMU. The probability for this to happen is very low, and
>>> | depends on cache size and associativiy, machine load, interrupts, so the
>>> | symptoms are might happen randomly.
>>> |
>>> | This requirement is currently not followed in tcg/arm, for the
>>> | load/store code, which basically has the following structure:
>>> |   1) tlb access code is written
>>> |   2) conditional fast path code is written
>>> |   3) branch is written with a temporary target
>>> |   4) slow path code is written
>>> |   5) branch target is updated
>>> | The cache lines corresponding to the retranslated code is not flushed
>>> | after code retranslation as the generated code is supposed to be the
>>> | same. However if the cache line corresponding to the branch instruction
>>> | is flushed between step 3 and 5, and is not flushed again before the
>>> | code is executed again, the branch target is wrong. In the guest, the
>>> | symptoms are MMU page fault at a random addresses, which leads to
>>> | kernel page fault or segmentation faults.
>> 
>> I don't see the problem. If you have separate icache from dcache, the code in question doesn't get executed during the rewrite, so all should be fine. If you have both combined, the data write should automatically modify the cache line, so all is great too.
> 
> As far as I understand, this is what happens to the branch target
> instruction, or at least one possibility:
> 
> operation                                 | icache | dcache | mem/L2 |
> ------------------------------------------+--------+--------+--------+
> tlb access code is written                | absent | absent |   ok   |
> conditional fast path code is written     | absent | absent |   ok   |
> branch is written with a temporary target | absent | wrong  |   ok   |
> cache line is flushed to memory           | absent | absent |  wrong |
> slow path code is written                 | absent | absent |  wrong |
> branch target is updated                  | absent |   ok   |  wrong |
> TB is re-executed                         | wrong  |   ok   |  wrong |
> 
> Note that the issue is real, the patch really fixes the issue on ARM and
> MIPS. It's not impossible that I don't fully understand it given I can't
> analyse the cache values at a given time. However, when QEMU crashes 
> because of that, we have seen that the executed code doesn't match what
> GDB says, and changing the temporary branch value also changes the way 
> QEMU or the guest crash.
> 
> The problem doesn't happens very often though (for sure less than 1 
> every few millions). The other way to fix it is to issue a cache flush
> after a code retranslation, however, it is something very costly on some
> architectures.

I'd guess it only happens when code is overwritten. Only then icache can still be stale in that code region. Can't we just invalidate the icache on every tb flush? That should also fix the issue I'd guess.

> 
>> As long as the code in question doesn't get run in between, I don't see where breakage could occur?
> 
> The breakage can occur, because the same TB can be re-executed later,
> and we don't have an explicit cache flush after the retranslation.

Oh, I think I'm starting to understand the issue. Have you seen this with system or user emulation? I'd guess with user it makes sense:

[T1] trap
[T2] run code
[T1] write branch code to find guest IP
[T2] run exactly that branch code
[T1] rewrite branch code
[T2] has invalid code in its cache

If that's the case, a global lock on retranslation should do the trick, no?

If it was during system emulation, I'm still puzzled :).


Alex

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation
  2011-01-10 14:07     ` Alexander Graf
@ 2011-01-10 14:15       ` Aurelien Jarno
  2011-01-10 14:20         ` Alexander Graf
  0 siblings, 1 reply; 10+ messages in thread
From: Aurelien Jarno @ 2011-01-10 14:15 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Blue Swirl, qemu-devel

On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote:
> 
> On 10.01.2011, at 15:00, Aurelien Jarno wrote:
> 
> > On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
> >> 
> >> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
> >> 
> >>> Hi,
> >>> 
> >>> I have just sent a tcg/arm patch concerning code retranslation. You
> >>> might want to look at the description (copied below), as from a first
> >>> glance ppc, s390 and sparc TCG targets might be affected. If you see
> >>> guest kernel panics, some segmentation fault of qemu or in the guest,
> >>> strange behaviors, that happen randomly and that looks difficult to
> >>> debug it might be the issue.
> >>> 
> >>> Aurelien
> >>> 
> >>> 
> >>> | QEMU uses code retranslation to restore the CPU state when an exception
> >>> | happens. For it to work the retranslation must not modify the generated
> >>> | code. This is what is currently implemented in ARM TCG.
> >>> |
> >>> | However on CPU that don't have icache/dcache/memory synchronised like
> >>> | ARM, this requirement is stronger and code retranslation must not modify
> >>> | the generated code "atomically", as the cache line might be flushed
> >>> | at any moment (interrupt, exception, task switching), even if not
> >>> | triggered by QEMU. The probability for this to happen is very low, and
> >>> | depends on cache size and associativiy, machine load, interrupts, so the
> >>> | symptoms are might happen randomly.
> >>> |
> >>> | This requirement is currently not followed in tcg/arm, for the
> >>> | load/store code, which basically has the following structure:
> >>> |   1) tlb access code is written
> >>> |   2) conditional fast path code is written
> >>> |   3) branch is written with a temporary target
> >>> |   4) slow path code is written
> >>> |   5) branch target is updated
> >>> | The cache lines corresponding to the retranslated code is not flushed
> >>> | after code retranslation as the generated code is supposed to be the
> >>> | same. However if the cache line corresponding to the branch instruction
> >>> | is flushed between step 3 and 5, and is not flushed again before the
> >>> | code is executed again, the branch target is wrong. In the guest, the
> >>> | symptoms are MMU page fault at a random addresses, which leads to
> >>> | kernel page fault or segmentation faults.
> >> 
> >> I don't see the problem. If you have separate icache from dcache, the code in question doesn't get executed during the rewrite, so all should be fine. If you have both combined, the data write should automatically modify the cache line, so all is great too.
> > 
> > As far as I understand, this is what happens to the branch target
> > instruction, or at least one possibility:
> > 
> > operation                                 | icache | dcache | mem/L2 |
> > ------------------------------------------+--------+--------+--------+
> > tlb access code is written                | absent | absent |   ok   |
> > conditional fast path code is written     | absent | absent |   ok   |
> > branch is written with a temporary target | absent | wrong  |   ok   |
> > cache line is flushed to memory           | absent | absent |  wrong |
> > slow path code is written                 | absent | absent |  wrong |
> > branch target is updated                  | absent |   ok   |  wrong |
> > TB is re-executed                         | wrong  |   ok   |  wrong |
> > 
> > Note that the issue is real, the patch really fixes the issue on ARM and
> > MIPS. It's not impossible that I don't fully understand it given I can't
> > analyse the cache values at a given time. However, when QEMU crashes 
> > because of that, we have seen that the executed code doesn't match what
> > GDB says, and changing the temporary branch value also changes the way 
> > QEMU or the guest crash.
> > 
> > The problem doesn't happens very often though (for sure less than 1 
> > every few millions). The other way to fix it is to issue a cache flush
> > after a code retranslation, however, it is something very costly on some
> > architectures.
> 
> I'd guess it only happens when code is overwritten. Only then icache can still be stale in that code region. Can't we just invalidate the icache on every tb flush? That should also fix the issue I'd guess.

That's a solution that works (tested). However making sure that the
retranslation doesn't change the code looks better.

> > 
> >> As long as the code in question doesn't get run in between, I don't see where breakage could occur?
> > 
> > The breakage can occur, because the same TB can be re-executed later,
> > and we don't have an explicit cache flush after the retranslation.
> 
> Oh, I think I'm starting to understand the issue. Have you seen this with system or user emulation? I'd guess with user it makes sense:
> 
> [T1] trap
> [T2] run code
> [T1] write branch code to find guest IP
> [T2] run exactly that branch code
> [T1] rewrite branch code
> [T2] has invalid code in its cache
> 
> If that's the case, a global lock on retranslation should do the trick, no?
>

I haven't seen that during user emulation, however your scenario, which
is different is also possible. Note however that they are far less
retranslation on linux-user code. Also far less code being translated.

In any case, a global lock also doesn't seems to be the best solution,
just making sure to not change the code during retranslation seems a
better solution.

> If it was during system emulation, I'm still puzzled :).
> 

Yes, it was during system emulation.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation
  2011-01-10 14:15       ` Aurelien Jarno
@ 2011-01-10 14:20         ` Alexander Graf
  2011-01-10 14:23           ` Aurelien Jarno
  2011-01-10 14:51           ` Edgar E. Iglesias
  0 siblings, 2 replies; 10+ messages in thread
From: Alexander Graf @ 2011-01-10 14:20 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Blue Swirl, qemu-devel


On 10.01.2011, at 15:15, Aurelien Jarno wrote:

> On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote:
>> 
>> On 10.01.2011, at 15:00, Aurelien Jarno wrote:
>> 
>>> On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
>>>> 
>>>> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> I have just sent a tcg/arm patch concerning code retranslation. You
>>>>> might want to look at the description (copied below), as from a first
>>>>> glance ppc, s390 and sparc TCG targets might be affected. If you see
>>>>> guest kernel panics, some segmentation fault of qemu or in the guest,
>>>>> strange behaviors, that happen randomly and that looks difficult to
>>>>> debug it might be the issue.
>>>>> 
>>>>> Aurelien
>>>>> 
>>>>> 
>>>>> | QEMU uses code retranslation to restore the CPU state when an exception
>>>>> | happens. For it to work the retranslation must not modify the generated
>>>>> | code. This is what is currently implemented in ARM TCG.
>>>>> |
>>>>> | However on CPU that don't have icache/dcache/memory synchronised like
>>>>> | ARM, this requirement is stronger and code retranslation must not modify
>>>>> | the generated code "atomically", as the cache line might be flushed
>>>>> | at any moment (interrupt, exception, task switching), even if not
>>>>> | triggered by QEMU. The probability for this to happen is very low, and
>>>>> | depends on cache size and associativiy, machine load, interrupts, so the
>>>>> | symptoms are might happen randomly.
>>>>> |
>>>>> | This requirement is currently not followed in tcg/arm, for the
>>>>> | load/store code, which basically has the following structure:
>>>>> |   1) tlb access code is written
>>>>> |   2) conditional fast path code is written
>>>>> |   3) branch is written with a temporary target
>>>>> |   4) slow path code is written
>>>>> |   5) branch target is updated
>>>>> | The cache lines corresponding to the retranslated code is not flushed
>>>>> | after code retranslation as the generated code is supposed to be the
>>>>> | same. However if the cache line corresponding to the branch instruction
>>>>> | is flushed between step 3 and 5, and is not flushed again before the
>>>>> | code is executed again, the branch target is wrong. In the guest, the
>>>>> | symptoms are MMU page fault at a random addresses, which leads to
>>>>> | kernel page fault or segmentation faults.
>>>> 
>>>> I don't see the problem. If you have separate icache from dcache, the code in question doesn't get executed during the rewrite, so all should be fine. If you have both combined, the data write should automatically modify the cache line, so all is great too.
>>> 
>>> As far as I understand, this is what happens to the branch target
>>> instruction, or at least one possibility:
>>> 
>>> operation                                 | icache | dcache | mem/L2 |
>>> ------------------------------------------+--------+--------+--------+
>>> tlb access code is written                | absent | absent |   ok   |
>>> conditional fast path code is written     | absent | absent |   ok   |
>>> branch is written with a temporary target | absent | wrong  |   ok   |
>>> cache line is flushed to memory           | absent | absent |  wrong |
>>> slow path code is written                 | absent | absent |  wrong |
>>> branch target is updated                  | absent |   ok   |  wrong |
>>> TB is re-executed                         | wrong  |   ok   |  wrong |
>>> 
>>> Note that the issue is real, the patch really fixes the issue on ARM and
>>> MIPS. It's not impossible that I don't fully understand it given I can't
>>> analyse the cache values at a given time. However, when QEMU crashes 
>>> because of that, we have seen that the executed code doesn't match what
>>> GDB says, and changing the temporary branch value also changes the way 
>>> QEMU or the guest crash.
>>> 
>>> The problem doesn't happens very often though (for sure less than 1 
>>> every few millions). The other way to fix it is to issue a cache flush
>>> after a code retranslation, however, it is something very costly on some
>>> architectures.
>> 
>> I'd guess it only happens when code is overwritten. Only then icache can still be stale in that code region. Can't we just invalidate the icache on every tb flush? That should also fix the issue I'd guess.
> 
> That's a solution that works (tested). However making sure that the
> retranslation doesn't change the code looks better.

Yeah, I agree. Temporary retranslation really shouldn't modify existing code. We really do need an icache flush on tb flushes still though as that's a separate issue? Or are these already in?


Alex

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation
  2011-01-10 14:20         ` Alexander Graf
@ 2011-01-10 14:23           ` Aurelien Jarno
  2011-01-10 14:29             ` Alexander Graf
  2011-01-10 14:51           ` Edgar E. Iglesias
  1 sibling, 1 reply; 10+ messages in thread
From: Aurelien Jarno @ 2011-01-10 14:23 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Blue Swirl, qemu-devel

On Mon, Jan 10, 2011 at 03:20:40PM +0100, Alexander Graf wrote:
> 
> On 10.01.2011, at 15:15, Aurelien Jarno wrote:
> 
> > On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote:
> >> 
> >> On 10.01.2011, at 15:00, Aurelien Jarno wrote:
> >> 
> >>> On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
> >>>> 
> >>>> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
> >>>> 
> >>>>> Hi,
> >>>>> 
> >>>>> I have just sent a tcg/arm patch concerning code retranslation. You
> >>>>> might want to look at the description (copied below), as from a first
> >>>>> glance ppc, s390 and sparc TCG targets might be affected. If you see
> >>>>> guest kernel panics, some segmentation fault of qemu or in the guest,
> >>>>> strange behaviors, that happen randomly and that looks difficult to
> >>>>> debug it might be the issue.
> >>>>> 
> >>>>> Aurelien
> >>>>> 
> >>>>> 
> >>>>> | QEMU uses code retranslation to restore the CPU state when an exception
> >>>>> | happens. For it to work the retranslation must not modify the generated
> >>>>> | code. This is what is currently implemented in ARM TCG.
> >>>>> |
> >>>>> | However on CPU that don't have icache/dcache/memory synchronised like
> >>>>> | ARM, this requirement is stronger and code retranslation must not modify
> >>>>> | the generated code "atomically", as the cache line might be flushed
> >>>>> | at any moment (interrupt, exception, task switching), even if not
> >>>>> | triggered by QEMU. The probability for this to happen is very low, and
> >>>>> | depends on cache size and associativiy, machine load, interrupts, so the
> >>>>> | symptoms are might happen randomly.
> >>>>> |
> >>>>> | This requirement is currently not followed in tcg/arm, for the
> >>>>> | load/store code, which basically has the following structure:
> >>>>> |   1) tlb access code is written
> >>>>> |   2) conditional fast path code is written
> >>>>> |   3) branch is written with a temporary target
> >>>>> |   4) slow path code is written
> >>>>> |   5) branch target is updated
> >>>>> | The cache lines corresponding to the retranslated code is not flushed
> >>>>> | after code retranslation as the generated code is supposed to be the
> >>>>> | same. However if the cache line corresponding to the branch instruction
> >>>>> | is flushed between step 3 and 5, and is not flushed again before the
> >>>>> | code is executed again, the branch target is wrong. In the guest, the
> >>>>> | symptoms are MMU page fault at a random addresses, which leads to
> >>>>> | kernel page fault or segmentation faults.
> >>>> 
> >>>> I don't see the problem. If you have separate icache from dcache, the code in question doesn't get executed during the rewrite, so all should be fine. If you have both combined, the data write should automatically modify the cache line, so all is great too.
> >>> 
> >>> As far as I understand, this is what happens to the branch target
> >>> instruction, or at least one possibility:
> >>> 
> >>> operation                                 | icache | dcache | mem/L2 |
> >>> ------------------------------------------+--------+--------+--------+
> >>> tlb access code is written                | absent | absent |   ok   |
> >>> conditional fast path code is written     | absent | absent |   ok   |
> >>> branch is written with a temporary target | absent | wrong  |   ok   |
> >>> cache line is flushed to memory           | absent | absent |  wrong |
> >>> slow path code is written                 | absent | absent |  wrong |
> >>> branch target is updated                  | absent |   ok   |  wrong |
> >>> TB is re-executed                         | wrong  |   ok   |  wrong |
> >>> 
> >>> Note that the issue is real, the patch really fixes the issue on ARM and
> >>> MIPS. It's not impossible that I don't fully understand it given I can't
> >>> analyse the cache values at a given time. However, when QEMU crashes 
> >>> because of that, we have seen that the executed code doesn't match what
> >>> GDB says, and changing the temporary branch value also changes the way 
> >>> QEMU or the guest crash.
> >>> 
> >>> The problem doesn't happens very often though (for sure less than 1 
> >>> every few millions). The other way to fix it is to issue a cache flush
> >>> after a code retranslation, however, it is something very costly on some
> >>> architectures.
> >> 
> >> I'd guess it only happens when code is overwritten. Only then icache can still be stale in that code region. Can't we just invalidate the icache on every tb flush? That should also fix the issue I'd guess.
> > 
> > That's a solution that works (tested). However making sure that the
> > retranslation doesn't change the code looks better.
> 
> Yeah, I agree. Temporary retranslation really shouldn't modify existing code. We really do need an icache flush on tb flushes still though as that's a separate issue? Or are these already in?
> 

We already have an icache flush after the first translation. We don't
have any when the TB is flushed, but I don't think it is something
necessary. The new TB that will replace it will issue an icache flush
after it has been translated.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation
  2011-01-10 14:23           ` Aurelien Jarno
@ 2011-01-10 14:29             ` Alexander Graf
  2011-01-10 14:45               ` Aurelien Jarno
  0 siblings, 1 reply; 10+ messages in thread
From: Alexander Graf @ 2011-01-10 14:29 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: Blue Swirl, qemu-devel


On 10.01.2011, at 15:23, Aurelien Jarno wrote:

> On Mon, Jan 10, 2011 at 03:20:40PM +0100, Alexander Graf wrote:
>> 
>> On 10.01.2011, at 15:15, Aurelien Jarno wrote:
>> 
>>> On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote:
>>>> 
>>>> On 10.01.2011, at 15:00, Aurelien Jarno wrote:
>>>> 
>>>>> On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
>>>>>> 
>>>>>> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I have just sent a tcg/arm patch concerning code retranslation. You
>>>>>>> might want to look at the description (copied below), as from a first
>>>>>>> glance ppc, s390 and sparc TCG targets might be affected. If you see
>>>>>>> guest kernel panics, some segmentation fault of qemu or in the guest,
>>>>>>> strange behaviors, that happen randomly and that looks difficult to
>>>>>>> debug it might be the issue.
>>>>>>> 
>>>>>>> Aurelien
>>>>>>> 
>>>>>>> 
>>>>>>> | QEMU uses code retranslation to restore the CPU state when an exception
>>>>>>> | happens. For it to work the retranslation must not modify the generated
>>>>>>> | code. This is what is currently implemented in ARM TCG.
>>>>>>> |
>>>>>>> | However on CPU that don't have icache/dcache/memory synchronised like
>>>>>>> | ARM, this requirement is stronger and code retranslation must not modify
>>>>>>> | the generated code "atomically", as the cache line might be flushed
>>>>>>> | at any moment (interrupt, exception, task switching), even if not
>>>>>>> | triggered by QEMU. The probability for this to happen is very low, and
>>>>>>> | depends on cache size and associativiy, machine load, interrupts, so the
>>>>>>> | symptoms are might happen randomly.
>>>>>>> |
>>>>>>> | This requirement is currently not followed in tcg/arm, for the
>>>>>>> | load/store code, which basically has the following structure:
>>>>>>> |   1) tlb access code is written
>>>>>>> |   2) conditional fast path code is written
>>>>>>> |   3) branch is written with a temporary target
>>>>>>> |   4) slow path code is written
>>>>>>> |   5) branch target is updated
>>>>>>> | The cache lines corresponding to the retranslated code is not flushed
>>>>>>> | after code retranslation as the generated code is supposed to be the
>>>>>>> | same. However if the cache line corresponding to the branch instruction
>>>>>>> | is flushed between step 3 and 5, and is not flushed again before the
>>>>>>> | code is executed again, the branch target is wrong. In the guest, the
>>>>>>> | symptoms are MMU page fault at a random addresses, which leads to
>>>>>>> | kernel page fault or segmentation faults.
>>>>>> 
>>>>>> I don't see the problem. If you have separate icache from dcache, the code in question doesn't get executed during the rewrite, so all should be fine. If you have both combined, the data write should automatically modify the cache line, so all is great too.
>>>>> 
>>>>> As far as I understand, this is what happens to the branch target
>>>>> instruction, or at least one possibility:
>>>>> 
>>>>> operation                                 | icache | dcache | mem/L2 |
>>>>> ------------------------------------------+--------+--------+--------+
>>>>> tlb access code is written                | absent | absent |   ok   |
>>>>> conditional fast path code is written     | absent | absent |   ok   |
>>>>> branch is written with a temporary target | absent | wrong  |   ok   |
>>>>> cache line is flushed to memory           | absent | absent |  wrong |
>>>>> slow path code is written                 | absent | absent |  wrong |
>>>>> branch target is updated                  | absent |   ok   |  wrong |
>>>>> TB is re-executed                         | wrong  |   ok   |  wrong |
>>>>> 
>>>>> Note that the issue is real, the patch really fixes the issue on ARM and
>>>>> MIPS. It's not impossible that I don't fully understand it given I can't
>>>>> analyse the cache values at a given time. However, when QEMU crashes 
>>>>> because of that, we have seen that the executed code doesn't match what
>>>>> GDB says, and changing the temporary branch value also changes the way 
>>>>> QEMU or the guest crash.
>>>>> 
>>>>> The problem doesn't happens very often though (for sure less than 1 
>>>>> every few millions). The other way to fix it is to issue a cache flush
>>>>> after a code retranslation, however, it is something very costly on some
>>>>> architectures.
>>>> 
>>>> I'd guess it only happens when code is overwritten. Only then icache can still be stale in that code region. Can't we just invalidate the icache on every tb flush? That should also fix the issue I'd guess.
>>> 
>>> That's a solution that works (tested). However making sure that the
>>> retranslation doesn't change the code looks better.
>> 
>> Yeah, I agree. Temporary retranslation really shouldn't modify existing code. We really do need an icache flush on tb flushes still though as that's a separate issue? Or are these already in?
>> 
> 
> We already have an icache flush after the first translation. We don't
> have any when the TB is flushed, but I don't think it is something
> necessary. The new TB that will replace it will issue an icache flush
> after it has been translated.

We don't need the flush after the translation if we just flush the whole block on invalidate. That should be a speedup, no? Flushing the full thing once is probably faster than lots of small flushes on every TB write.

Also, if there's a flush after the translation, I'm even more puzzled how the race you describe above can occur.


Alex

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation
  2011-01-10 14:29             ` Alexander Graf
@ 2011-01-10 14:45               ` Aurelien Jarno
  0 siblings, 0 replies; 10+ messages in thread
From: Aurelien Jarno @ 2011-01-10 14:45 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Blue Swirl, qemu-devel

On Mon, Jan 10, 2011 at 03:29:28PM +0100, Alexander Graf wrote:
> 
> On 10.01.2011, at 15:23, Aurelien Jarno wrote:
> 
> > On Mon, Jan 10, 2011 at 03:20:40PM +0100, Alexander Graf wrote:
> >> 
> >> On 10.01.2011, at 15:15, Aurelien Jarno wrote:
> >> 
> >>> On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote:
> >>>> 
> >>>> On 10.01.2011, at 15:00, Aurelien Jarno wrote:
> >>>> 
> >>>>> On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
> >>>>>> 
> >>>>>> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
> >>>>>> 
> >>>>>>> Hi,
> >>>>>>> 
> >>>>>>> I have just sent a tcg/arm patch concerning code retranslation. You
> >>>>>>> might want to look at the description (copied below), as from a first
> >>>>>>> glance ppc, s390 and sparc TCG targets might be affected. If you see
> >>>>>>> guest kernel panics, some segmentation fault of qemu or in the guest,
> >>>>>>> strange behaviors, that happen randomly and that looks difficult to
> >>>>>>> debug it might be the issue.
> >>>>>>> 
> >>>>>>> Aurelien
> >>>>>>> 
> >>>>>>> 
> >>>>>>> | QEMU uses code retranslation to restore the CPU state when an exception
> >>>>>>> | happens. For it to work the retranslation must not modify the generated
> >>>>>>> | code. This is what is currently implemented in ARM TCG.
> >>>>>>> |
> >>>>>>> | However on CPU that don't have icache/dcache/memory synchronised like
> >>>>>>> | ARM, this requirement is stronger and code retranslation must not modify
> >>>>>>> | the generated code "atomically", as the cache line might be flushed
> >>>>>>> | at any moment (interrupt, exception, task switching), even if not
> >>>>>>> | triggered by QEMU. The probability for this to happen is very low, and
> >>>>>>> | depends on cache size and associativiy, machine load, interrupts, so the
> >>>>>>> | symptoms are might happen randomly.
> >>>>>>> |
> >>>>>>> | This requirement is currently not followed in tcg/arm, for the
> >>>>>>> | load/store code, which basically has the following structure:
> >>>>>>> |   1) tlb access code is written
> >>>>>>> |   2) conditional fast path code is written
> >>>>>>> |   3) branch is written with a temporary target
> >>>>>>> |   4) slow path code is written
> >>>>>>> |   5) branch target is updated
> >>>>>>> | The cache lines corresponding to the retranslated code is not flushed
> >>>>>>> | after code retranslation as the generated code is supposed to be the
> >>>>>>> | same. However if the cache line corresponding to the branch instruction
> >>>>>>> | is flushed between step 3 and 5, and is not flushed again before the
> >>>>>>> | code is executed again, the branch target is wrong. In the guest, the
> >>>>>>> | symptoms are MMU page fault at a random addresses, which leads to
> >>>>>>> | kernel page fault or segmentation faults.
> >>>>>> 
> >>>>>> I don't see the problem. If you have separate icache from dcache, the code in question doesn't get executed during the rewrite, so all should be fine. If you have both combined, the data write should automatically modify the cache line, so all is great too.
> >>>>> 
> >>>>> As far as I understand, this is what happens to the branch target
> >>>>> instruction, or at least one possibility:
> >>>>> 
> >>>>> operation                                 | icache | dcache | mem/L2 |
> >>>>> ------------------------------------------+--------+--------+--------+
> >>>>> tlb access code is written                | absent | absent |   ok   |
> >>>>> conditional fast path code is written     | absent | absent |   ok   |
> >>>>> branch is written with a temporary target | absent | wrong  |   ok   |
> >>>>> cache line is flushed to memory           | absent | absent |  wrong |
> >>>>> slow path code is written                 | absent | absent |  wrong |
> >>>>> branch target is updated                  | absent |   ok   |  wrong |
> >>>>> TB is re-executed                         | wrong  |   ok   |  wrong |
> >>>>> 
> >>>>> Note that the issue is real, the patch really fixes the issue on ARM and
> >>>>> MIPS. It's not impossible that I don't fully understand it given I can't
> >>>>> analyse the cache values at a given time. However, when QEMU crashes 
> >>>>> because of that, we have seen that the executed code doesn't match what
> >>>>> GDB says, and changing the temporary branch value also changes the way 
> >>>>> QEMU or the guest crash.
> >>>>> 
> >>>>> The problem doesn't happens very often though (for sure less than 1 
> >>>>> every few millions). The other way to fix it is to issue a cache flush
> >>>>> after a code retranslation, however, it is something very costly on some
> >>>>> architectures.
> >>>> 
> >>>> I'd guess it only happens when code is overwritten. Only then icache can still be stale in that code region. Can't we just invalidate the icache on every tb flush? That should also fix the issue I'd guess.
> >>> 
> >>> That's a solution that works (tested). However making sure that the
> >>> retranslation doesn't change the code looks better.
> >> 
> >> Yeah, I agree. Temporary retranslation really shouldn't modify existing code. We really do need an icache flush on tb flushes still though as that's a separate issue? Or are these already in?
> >> 
> > 
> > We already have an icache flush after the first translation. We don't
> > have any when the TB is flushed, but I don't think it is something
> > necessary. The new TB that will replace it will issue an icache flush
> > after it has been translated.
> 
> We don't need the flush after the translation if we just flush the whole block on invalidate. That should be a speedup, no? Flushing the full thing once is probably faster than lots of small flushes on every TB write.

Maybe flush is not the correct word to described what it is done. What
is done in syncing the icache with the dcache. It's what is done by the
function flush_icache_range() in tcg/arch/tcg-target.h.

If you flush the TB on invalidate, it doesn't ensure that the data just
written are visible from the icache.

> Also, if there's a flush after the translation, I'm even more puzzled how the race you describe above can occur.
> 

There is one after the translation, but not one after the retranslation.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation
  2011-01-10 14:20         ` Alexander Graf
  2011-01-10 14:23           ` Aurelien Jarno
@ 2011-01-10 14:51           ` Edgar E. Iglesias
  2011-01-10 15:03             ` Alexander Graf
  1 sibling, 1 reply; 10+ messages in thread
From: Edgar E. Iglesias @ 2011-01-10 14:51 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

On Mon, Jan 10, 2011 at 03:20:40PM +0100, Alexander Graf wrote:
> 
> On 10.01.2011, at 15:15, Aurelien Jarno wrote:
> 
> > On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote:
> >> 
> >> On 10.01.2011, at 15:00, Aurelien Jarno wrote:
> >> 
> >>> On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
> >>>> 
> >>>> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
> >>>> 
> >>>>> Hi,
> >>>>> 
> >>>>> I have just sent a tcg/arm patch concerning code retranslation. You
> >>>>> might want to look at the description (copied below), as from a first
> >>>>> glance ppc, s390 and sparc TCG targets might be affected. If you see
> >>>>> guest kernel panics, some segmentation fault of qemu or in the guest,
> >>>>> strange behaviors, that happen randomly and that looks difficult to
> >>>>> debug it might be the issue.
> >>>>> 
> >>>>> Aurelien
> >>>>> 
> >>>>> 
> >>>>> | QEMU uses code retranslation to restore the CPU state when an exception
> >>>>> | happens. For it to work the retranslation must not modify the generated
> >>>>> | code. This is what is currently implemented in ARM TCG.
> >>>>> |
> >>>>> | However on CPU that don't have icache/dcache/memory synchronised like
> >>>>> | ARM, this requirement is stronger and code retranslation must not modify
> >>>>> | the generated code "atomically", as the cache line might be flushed
> >>>>> | at any moment (interrupt, exception, task switching), even if not
> >>>>> | triggered by QEMU. The probability for this to happen is very low, and
> >>>>> | depends on cache size and associativiy, machine load, interrupts, so the
> >>>>> | symptoms are might happen randomly.
> >>>>> |
> >>>>> | This requirement is currently not followed in tcg/arm, for the
> >>>>> | load/store code, which basically has the following structure:
> >>>>> |   1) tlb access code is written
> >>>>> |   2) conditional fast path code is written
> >>>>> |   3) branch is written with a temporary target
> >>>>> |   4) slow path code is written
> >>>>> |   5) branch target is updated
> >>>>> | The cache lines corresponding to the retranslated code is not flushed
> >>>>> | after code retranslation as the generated code is supposed to be the
> >>>>> | same. However if the cache line corresponding to the branch instruction
> >>>>> | is flushed between step 3 and 5, and is not flushed again before the
> >>>>> | code is executed again, the branch target is wrong. In the guest, the
> >>>>> | symptoms are MMU page fault at a random addresses, which leads to
> >>>>> | kernel page fault or segmentation faults.
> >>>> 
> >>>> I don't see the problem. If you have separate icache from dcache, the code in question doesn't get executed during the rewrite, so all should be fine. If you have both combined, the data write should automatically modify the cache line, so all is great too.
> >>> 
> >>> As far as I understand, this is what happens to the branch target
> >>> instruction, or at least one possibility:
> >>> 
> >>> operation                                 | icache | dcache | mem/L2 |
> >>> ------------------------------------------+--------+--------+--------+
> >>> tlb access code is written                | absent | absent |   ok   |
> >>> conditional fast path code is written     | absent | absent |   ok   |
> >>> branch is written with a temporary target | absent | wrong  |   ok   |
> >>> cache line is flushed to memory           | absent | absent |  wrong |
> >>> slow path code is written                 | absent | absent |  wrong |
> >>> branch target is updated                  | absent |   ok   |  wrong |
> >>> TB is re-executed                         | wrong  |   ok   |  wrong |
> >>> 
> >>> Note that the issue is real, the patch really fixes the issue on ARM and
> >>> MIPS. It's not impossible that I don't fully understand it given I can't
> >>> analyse the cache values at a given time. However, when QEMU crashes 
> >>> because of that, we have seen that the executed code doesn't match what
> >>> GDB says, and changing the temporary branch value also changes the way 
> >>> QEMU or the guest crash.
> >>> 
> >>> The problem doesn't happens very often though (for sure less than 1 
> >>> every few millions). The other way to fix it is to issue a cache flush
> >>> after a code retranslation, however, it is something very costly on some
> >>> architectures.
> >> 
> >> I'd guess it only happens when code is overwritten. Only then icache can still be stale in that code region. Can't we just invalidate the icache on every tb flush? That should also fix the issue I'd guess.
> > 
> > That's a solution that works (tested). However making sure that the
> > retranslation doesn't change the code looks better.
> 
> Yeah, I agree. Temporary retranslation really shouldn't modify existing code. We really do need an icache flush on tb flushes still though as that's a separate issue? Or are these already in?

I don't think it's needed. When a tb gets flushed it can't execute any more
until it gets reallocated and a new translation is generated (which will
sync the caches).

IIUC, The problem with the search_pc regeneration shows up because we
don't explicitely sync the caches when done. If something in the host
system (e.g interrupts, exceptions, cache misses) trigs a cache flush
in the middle of the regeneration, we might end up with the temporarily
modified regenerated code in memory and the unmodified code in dcaches
only. For non-cocherent I & D caches, when the tb gets executed the
icache will pick up the possibly incorrect memory contents.

So either we don't modify the TB (even temporarily) while regenerating
or we do a cache sync after regeneration. If the latter doesn't slow
down exception handling to much it might be better because I'm
afraid this bug will come back and hunt us.. Not sure..

Cheers

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation
  2011-01-10 14:51           ` Edgar E. Iglesias
@ 2011-01-10 15:03             ` Alexander Graf
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Graf @ 2011-01-10 15:03 UTC (permalink / raw)
  To: Edgar E. Iglesias; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno


On 10.01.2011, at 15:51, Edgar E. Iglesias wrote:

> On Mon, Jan 10, 2011 at 03:20:40PM +0100, Alexander Graf wrote:
>> 
>> On 10.01.2011, at 15:15, Aurelien Jarno wrote:
>> 
>>> On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote:
>>>> 
>>>> On 10.01.2011, at 15:00, Aurelien Jarno wrote:
>>>> 
>>>>> On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
>>>>>> 
>>>>>> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I have just sent a tcg/arm patch concerning code retranslation. You
>>>>>>> might want to look at the description (copied below), as from a first
>>>>>>> glance ppc, s390 and sparc TCG targets might be affected. If you see
>>>>>>> guest kernel panics, some segmentation fault of qemu or in the guest,
>>>>>>> strange behaviors, that happen randomly and that looks difficult to
>>>>>>> debug it might be the issue.
>>>>>>> 
>>>>>>> Aurelien
>>>>>>> 
>>>>>>> 
>>>>>>> | QEMU uses code retranslation to restore the CPU state when an exception
>>>>>>> | happens. For it to work the retranslation must not modify the generated
>>>>>>> | code. This is what is currently implemented in ARM TCG.
>>>>>>> |
>>>>>>> | However on CPU that don't have icache/dcache/memory synchronised like
>>>>>>> | ARM, this requirement is stronger and code retranslation must not modify
>>>>>>> | the generated code "atomically", as the cache line might be flushed
>>>>>>> | at any moment (interrupt, exception, task switching), even if not
>>>>>>> | triggered by QEMU. The probability for this to happen is very low, and
>>>>>>> | depends on cache size and associativiy, machine load, interrupts, so the
>>>>>>> | symptoms are might happen randomly.
>>>>>>> |
>>>>>>> | This requirement is currently not followed in tcg/arm, for the
>>>>>>> | load/store code, which basically has the following structure:
>>>>>>> |   1) tlb access code is written
>>>>>>> |   2) conditional fast path code is written
>>>>>>> |   3) branch is written with a temporary target
>>>>>>> |   4) slow path code is written
>>>>>>> |   5) branch target is updated
>>>>>>> | The cache lines corresponding to the retranslated code is not flushed
>>>>>>> | after code retranslation as the generated code is supposed to be the
>>>>>>> | same. However if the cache line corresponding to the branch instruction
>>>>>>> | is flushed between step 3 and 5, and is not flushed again before the
>>>>>>> | code is executed again, the branch target is wrong. In the guest, the
>>>>>>> | symptoms are MMU page fault at a random addresses, which leads to
>>>>>>> | kernel page fault or segmentation faults.
>>>>>> 
>>>>>> I don't see the problem. If you have separate icache from dcache, the code in question doesn't get executed during the rewrite, so all should be fine. If you have both combined, the data write should automatically modify the cache line, so all is great too.
>>>>> 
>>>>> As far as I understand, this is what happens to the branch target
>>>>> instruction, or at least one possibility:
>>>>> 
>>>>> operation                                 | icache | dcache | mem/L2 |
>>>>> ------------------------------------------+--------+--------+--------+
>>>>> tlb access code is written                | absent | absent |   ok   |
>>>>> conditional fast path code is written     | absent | absent |   ok   |
>>>>> branch is written with a temporary target | absent | wrong  |   ok   |
>>>>> cache line is flushed to memory           | absent | absent |  wrong |
>>>>> slow path code is written                 | absent | absent |  wrong |
>>>>> branch target is updated                  | absent |   ok   |  wrong |
>>>>> TB is re-executed                         | wrong  |   ok   |  wrong |
>>>>> 
>>>>> Note that the issue is real, the patch really fixes the issue on ARM and
>>>>> MIPS. It's not impossible that I don't fully understand it given I can't
>>>>> analyse the cache values at a given time. However, when QEMU crashes 
>>>>> because of that, we have seen that the executed code doesn't match what
>>>>> GDB says, and changing the temporary branch value also changes the way 
>>>>> QEMU or the guest crash.
>>>>> 
>>>>> The problem doesn't happens very often though (for sure less than 1 
>>>>> every few millions). The other way to fix it is to issue a cache flush
>>>>> after a code retranslation, however, it is something very costly on some
>>>>> architectures.
>>>> 
>>>> I'd guess it only happens when code is overwritten. Only then icache can still be stale in that code region. Can't we just invalidate the icache on every tb flush? That should also fix the issue I'd guess.
>>> 
>>> That's a solution that works (tested). However making sure that the
>>> retranslation doesn't change the code looks better.
>> 
>> Yeah, I agree. Temporary retranslation really shouldn't modify existing code. We really do need an icache flush on tb flushes still though as that's a separate issue? Or are these already in?
> 
> I don't think it's needed. When a tb gets flushed it can't execute any more
> until it gets reallocated and a new translation is generated (which will
> sync the caches).

Oh, I was talking about replacing the individual flushes on every translation with a big flush on tb_flush.

> IIUC, The problem with the search_pc regeneration shows up because we
> don't explicitely sync the caches when done. If something in the host
> system (e.g interrupts, exceptions, cache misses) trigs a cache flush
> in the middle of the regeneration, we might end up with the temporarily
> modified regenerated code in memory and the unmodified code in dcaches
> only. For non-cocherent I & D caches, when the tb gets executed the
> icache will pick up the possibly incorrect memory contents.
> 
> So either we don't modify the TB (even temporarily) while regenerating
> or we do a cache sync after regeneration. If the latter doesn't slow
> down exception handling to much it might be better because I'm
> afraid this bug will come back and hunt us.. Not sure..

See the discussion on IRC about that :).


Alex

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-01-10 15:03 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-01-06 22:12 [Qemu-devel] tcg/{ppc, s390, sparc}: branch target and code retranslation Aurelien Jarno
     [not found] ` <ADE7D325-3612-4BD9-A88E-7B88E68449E1@suse.de>
2011-01-10 14:00   ` [Qemu-devel] " Aurelien Jarno
2011-01-10 14:07     ` Alexander Graf
2011-01-10 14:15       ` Aurelien Jarno
2011-01-10 14:20         ` Alexander Graf
2011-01-10 14:23           ` Aurelien Jarno
2011-01-10 14:29             ` Alexander Graf
2011-01-10 14:45               ` Aurelien Jarno
2011-01-10 14:51           ` Edgar E. Iglesias
2011-01-10 15:03             ` Alexander Graf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).