Need help understanding assertion fail.

qemu-devel.nongnu.org archive mirror

* Need help understanding assertion fail.
@ 2020-02-03 16:37 Wayne Li
  2020-02-03 16:56 ` Peter Maydell
  2020-02-05  9:32 ` Richard Henderson
  0 siblings, 2 replies; 5+ messages in thread
From: Wayne Li @ 2020-02-03 16:37 UTC (permalink / raw)
  To: qemu-devel

Dear QEMU list members,

So we developed a virtual machine using the QEMU code.  This virtual
machine emulates a certain custom-made computer that runs on a certain
military platform.  All I can tell you about this virtual machine is that
it emulates a computer that has a PowerPC 7457 processor.  Anyway, we
developed this virtual machine using QEMU on little endian machines (i.e.
like your typical desktop running Windows).  What I'm working on right now
is getting this virtual machine to work on a T4240-RDB which has a PowerPC
e6500 processor.  The biggest roadblock I face when trying to get this to
work relates to the fact that the T4240-RDB is a big-endian machine and
transferring our code from a little-endian machine to a big-endian machine
created a lot of problems due to a lot of bad practices the team adopted
when they developed the virtual machine.

Anyway that's the background.  The specific problem I'm having right now is
I get the following assertion error during some of the setup stuff our OS
does post boot-up (the OS is also custom-made):

qemu_programs/qemu/tcg/ppc/tcg-target.inc.c:224: reloc_pc14_val: Assertion
`disp == (int16_t) disp' failed.

Looking at the QEMU code, "disp" is the difference between two pointers
named "target" and "pc".  I'm not sure exactly what either of those names
mean.  And it looks like since the assertion is checking if casting "disp"
as a short changes the value, it's checking if the "disp" value is too
big?  I'm just not very sure what this assertion means.

Anyway, the thing is this problem has to be somehow related to the transfer
of the code from a little-endian platform to a big-endian platform as our
project works without any problem on little-endian platforms.  But I think
it would be really helpful if I knew more about this assertion.  What
exactly is it trying to check for here and why?  Do you see how it could
relate to endianness issues in any way?

-Thanks!, Wayne Li


* Re: Need help understanding assertion fail.
  2020-02-03 16:37 Need help understanding assertion fail Wayne Li
@ 2020-02-03 16:56 ` Peter Maydell
  2020-02-03 21:32   ` Wayne Li
  2020-02-05  9:32 ` Richard Henderson
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Maydell @ 2020-02-03 16:56 UTC (permalink / raw)
  To: Wayne Li; +Cc: QEMU Developers

On Mon, 3 Feb 2020 at 16:39, Wayne Li <waynli329@gmail.com> wrote:
> Anyway that's the background.  The specific problem I'm having right now is I get the following assertion error during some of the setup stuff our OS does post boot-up (the OS is also custom-made):
>
> qemu_programs/qemu/tcg/ppc/tcg-target.inc.c:224: reloc_pc14_val: Assertion `disp == (int16_t) disp' failed.
>
> Looking at the QEMU code, "disp" is the difference between two pointers named "target" and "pc".  I'm not sure exactly what either of those names mean.  And it looks like since the assertion is checking if casting "disp" as a short changes the value, it's checking if the "disp" value is too big?  I'm just not very sure what this assertion means.

This assertion is checking that we're not trying to fit too
large a value into the host PPC branch instruction we just emitted.
That is, tcg_out_bc() emits a PPC conditional branch instruction,
which has a 14 bit field for the offset (it's a relative branch),
and we know the bottom 2 bits of the target will be 0 (PPC insns
being 4-aligned), so the distance between the current host PC
and the target of the branch must fit in a signed 16-bit field.

"disp" here stands for "displacement".
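
A minimal sketch of the range check described above, written as
standalone C rather than copied from the QEMU source. The names
disp_fits_bc() and encode_bd_field() are hypothetical, and the 0xfffc
mask assumes the 14-bit offset field sits directly above the two low
flag bits of the instruction word:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if a byte displacement can be encoded in a
 * PPC conditional branch.  The 14-bit field stores disp >> 2, so disp
 * must be a multiple of 4 inside the signed 16-bit range.
 */
static bool disp_fits_bc(ptrdiff_t disp)
{
    return disp >= INT16_MIN && disp <= INT16_MAX && (disp & 3) == 0;
}

/*
 * Hypothetical helper: the bits that would be OR-ed into the branch
 * instruction word.  The assert mirrors the kind of check that fired
 * in the report above.
 */
static uint16_t encode_bd_field(ptrdiff_t disp)
{
    assert(disp_fits_bc(disp));
    return (uint16_t)(disp & 0xfffc);
}
```

With these definitions, a displacement of 32764 bytes is accepted and
32768 is rejected, which is the situation the failing assertion reports.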

The PPC TCG backend only uses this for the TCG 'brcond' and
'brcond2' TCG intermediate-representation ops. It seems likely
that the code for your target is generating TCG ops which have
too large a gap between a brcond/brcond2 and the destination label.
You could try using the various QEMU -d options to print out the
guest instructions and the generated TCG ops to pin down what
part of your target is trying to generate branches over too
much code like this.

> Anyway, the thing is this problem has to be somehow related to
> the transfer of the code from a little-endian platform to a
> big-endian platform as our project works without any problem on
> little-endian platforms.

In this case it isn't necessarily directly an endianness issue.
The x86 instruction set provides conditional branch instructions
which allow a 32-bit displacement value, so you're basically never
going to overflow a conditional-branch there. PPC, being RISC,
has more limited branch insns. You might also run into this
if you tried to use aarch64 (64-bit) arm hosts, which are
little-endian but have a 19-bit branch displacement limit,
depending on just how big you've managed to make your jumps.
On the other hand, a signed 16-bit displacement still allows a jump
over roughly 32K of generated code in either direction, which is huge
for a single TCG generated translation block, so it could well be that
you have an endianness bug in your TCG frontend which is causing
you to generate an enormous TB by accident.
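
To put numbers on the displacement limits mentioned above, here is a
small sketch that computes the byte range reachable by a signed branch
field.  The BranchReach type and branch_reach() function are
hypothetical names for illustration; the field widths used in the note
below (14 bits for PPC, 19 bits for aarch64, 32 bits for x86) are the
ones given in this message, and the shift of 2 assumes 4-byte aligned
instructions on the two RISC hosts:

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive byte range reachable by a relative branch. */
typedef struct {
    int64_t min;
    int64_t max;
} BranchReach;

/*
 * Reach of a signed field of `bits` bits whose value is scaled by
 * 2^shift (the instruction alignment).  Multiplication is used instead
 * of shifting so that no negative value is ever left-shifted.
 */
static BranchReach branch_reach(unsigned bits, unsigned shift)
{
    assert(bits >= 1 && bits + shift <= 62);

    int64_t half = (int64_t)1 << (bits - 1);
    int64_t scale = (int64_t)1 << shift;
    BranchReach r;

    r.min = -half * scale;
    r.max = (half - 1) * scale;
    return r;
}
```

branch_reach(14, 2) gives -32768..32764 bytes for the PPC conditional
branch, branch_reach(19, 2) gives about +/-1 MiB for aarch64, and
branch_reach(32, 0) gives about +/-2 GiB for an x86 32-bit displacement.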

thanks
-- PMM


* Re: Need help understanding assertion fail.
  2020-02-03 16:56 ` Peter Maydell
@ 2020-02-03 21:32   ` Wayne Li
  2020-02-04 10:19     ` Peter Maydell
  0 siblings, 1 reply; 5+ messages in thread
From: Wayne Li @ 2020-02-03 21:32 UTC (permalink / raw)
  To: Peter Maydell; +Cc: QEMU Developers

I see.  So you're saying that it might be possible that my guest could be
generating TCG ops that can't be translated into PPC instructions because
the displacement value is too big.  While the same TCG ops can be translated
into x86 instructions because x86 allows for a bigger displacement value.
But on the other hand it could be some other problem causing me to have a
large displacement value.

In that case, I think it'd be super helpful if I print out this
displacement value in the TCG ops when running on PPC versus x86 because
they should be the same right?  What option in QEMU -d allows me to see
generated TCG ops?  Doing a -d --help shows the following options:

out_asm       show generated host assembly code for each compiled TB
in_asm        show target assembly code for each compiled TB
op            show micro ops for each compiled TB
op_opt        show micro ops (x86 only: before eflags optimization) and
              after liveness analysis
int           show interrupts/exceptions in short format
exec          show trace before each executed TB (lots of logs)
cpu           show CPU state before block translation
mmu           log MMU-related activities
pcall         x86 only: show protected mode far calls/returns/exceptions
cpu_reset     show CPU state before CPU resets
ioport        show all i/o ports accesses
unimp         log unimplemented functionality
guest_errors  log when the guest OS does something invalid (eg accessing a
              non-existent register)

There doesn't seem to be any option to print out the TCG ops specifically?
Maybe I'll have to go into the code to add print statements that print out
the TCG ops?

-Thanks!, Wayne Li


* Re: Need help understanding assertion fail.
  2020-02-03 21:32   ` Wayne Li
@ 2020-02-04 10:19     ` Peter Maydell
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Maydell @ 2020-02-04 10:19 UTC (permalink / raw)
  To: Wayne Li; +Cc: QEMU Developers

On Mon, 3 Feb 2020 at 21:32, Wayne Li <waynli329@gmail.com> wrote:
>
> I see.  So you're saying that it might be possible that my guest could be generating TCG ops that can't be translated into PPC instructions because the displacement value is too big.  While the same TCG ops can be translated into x86 instructions because x86 allows for a bigger displacement value.  But on the other hand it could be some other problem causing me to have a large displacement value.
>
> In that case, I think it'd be super helpful if I print out this displacement value in the TCG ops when running on PPC versus x86 because they should be the same right?  What option in QEMU -d allows me to see generated TCG ops?  Doing a -d --help shows the following options:

> There doesn't seem to be any option to print out the TCG ops specifically?  Maybe I'll have to go into the code to add print statements that print out the TCG ops?

'op' prints out the ops...

Note that in the TCG ops output there won't be a displacement value, because
that is calculated in the TCG backend. At the ops level, the branches are to
labels. But you'll be able to see if you're generating a super-enormous block
really easily, because it'll have lots of ops in it. (See also the advice in
tcg/README about generally preferring to use calls to helper functions
rather than directly generating more than about 20 TCG ops
for any one guest insn, and the overall MAX_OP_PER_INSTR limit).

thanks
-- PMM


* Re: Need help understanding assertion fail.
  2020-02-03 16:37 Need help understanding assertion fail Wayne Li
  2020-02-03 16:56 ` Peter Maydell
@ 2020-02-05  9:32 ` Richard Henderson
  1 sibling, 0 replies; 5+ messages in thread
From: Richard Henderson @ 2020-02-05  9:32 UTC (permalink / raw)
  To: Wayne Li, qemu-devel

On 2/3/20 4:37 PM, Wayne Li wrote:
> Anyway that's the background.  The specific problem I'm having right now is I
> get the following assertion error during some of the setup stuff our OS does
> post boot-up (the OS is also custom-made):
> 
> qemu_programs/qemu/tcg/ppc/tcg-target.inc.c:224: reloc_pc14_val: Assertion
> `disp == (int16_t) disp' failed.

As Peter has already explained, this has to do with *generating* ppc output
for the host, and nothing to do with little vs big endian.

There is only one place from which this ought to be reachable: an extremely
large backward branch, explicitly generated within your tcg ops.

Out of range forward branches are handled gracefully, as they generally occur
due to an internal branch to out-of-line code to handle the slow path of a
memory operation.  Generally this will be "fixed" by restarting generation of
the TB with fewer guest instructions.  E.g.

	insn1
	  memory op, conditional branch to m1
 i2:
	insn2
	insn3
	insn4
	branch to next tb with insn5
 m1:
	slow path for insn1
	goto i2

can be split into

	insn1
	  memory op, conditional branch to m1
  i2:
	insn2
	branch to next tb with insn3
  m1:
	slow path for insn1
	goto i2

However, these forward branches are implicit, part of the expansion of the
INDEX_op_qemu_ld/st tcg opcodes.
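
The retry strategy described above can be sketched with a toy cost
model.  This is not QEMU's code: forward_branches_fit(),
pick_tb_size(), the fixed bytes-per-insn model and the halving step are
all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Largest forward reach of a PPC conditional branch, in bytes. */
#define BRANCH_LIMIT 32764

/*
 * Hypothetical cost model: every guest insn emits a fixed number of
 * bytes of inline host code, and all slow paths are placed after it.
 * The worst case is the first insn branching over all the inline code.
 */
static bool forward_branches_fit(size_t n_insns, size_t bytes_per_insn)
{
    return n_insns * bytes_per_insn <= BRANCH_LIMIT;
}

/*
 * Halve the number of guest insns in the TB until the forward branches
 * fit.  Returns the accepted insn count, or 0 if even a single insn
 * produces too much code for shrinking the TB to help.
 */
static size_t pick_tb_size(size_t max_insns, size_t bytes_per_insn)
{
    assert(bytes_per_insn > 0);

    size_t n = max_insns;
    while (n > 0 && !forward_branches_fit(n, bytes_per_insn)) {
        n /= 2;
    }
    return n;
}
```

In this model pick_tb_size(512, 200) settles on 128 insns, while
pick_tb_size(8, 40000) returns 0: when one insn alone is too large,
splitting the TB cannot fix it.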

Backward branches are *only* generated by explicit tcg ops, generated by
your target/ code.  Since you should not be generating backward branches
*between* insns, there is no expectation that splitting the TB in half will
have any effect.

I can only suggest that there is some insn for which you are generating inline
code which includes a loop.  This insn should probably be implemented with an
out-of-line helper instead.  But since I have no visibility into the actual
architecture being emulated, I cannot be sure.
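
As an illustration of the out-of-line approach, here is a sketch of a
helper for a made-up "fill N words" guest insn.  ToyCPUState,
TOY_MEM_WORDS and helper_fill_words() are hypothetical; a real target
would use its own CPU state and declare the helper through QEMU's
helper machinery, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MEM_WORDS 64

/* Hypothetical guest state, standing in for a real target's CPU state. */
typedef struct {
    uint32_t gpr[32];
    uint32_t mem[TOY_MEM_WORDS];
} ToyCPUState;

/*
 * Hypothetical helper for a looping guest insn.  The loop runs as
 * ordinary compiled C, so the translator would only emit one call
 * instead of an inline loop containing a backward branch.
 */
static void helper_fill_words(ToyCPUState *env, uint32_t start,
                              uint32_t count, uint32_t value)
{
    assert(start <= TOY_MEM_WORDS && count <= TOY_MEM_WORDS - start);

    for (uint32_t i = 0; i < count; i++) {
        env->mem[start + i] = value;
    }
}
```

However many iterations the loop runs, the generated host code for the
insn stays the same size, so no branch displacement grows with it.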

r~
