qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* Re: ldrd implementation issue?
       [not found] <6866d6b6-24af-4743-8407-26ae2998dd21@gmail.com>
@ 2025-02-11 10:46 ` Peter Maydell
  2025-02-17 13:03   ` Peter Maydell
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Maydell @ 2025-02-11 10:46 UTC (permalink / raw)
  To: Stu Grossman; +Cc: qemu-arm, QEMU Developers

(added qemu-devel to the cc list)

On Mon, 10 Feb 2025 at 17:26, Stu Grossman <stu.grossman@gmail.com> wrote:
>
> I've been getting SIGBUS cores with a bunch of user apps running under
> linux 5.15 and qemu-system-aarch64.  These happen to be 32 bit (T32?)
> programs.
>
> All of the cores point at the following instruction:
>
>         ldrd r2,r3,[r2]
>
> where r2 points at four bytes prior to the end of a page.  Eg: 0x10ffc.
>
> The bug appears to be that we get a page fault between the accesses for
> each word.  Prior to the fault, the first register is updated.  Later,
> after the fault is serviced, the instruction is restarted with the index
> register containing the word loaded prior to the fault, which is
> probably not the desired address, resulting in a coredump.
>
> So the conditions for the problem are:
>
>         - the index register must be one of the registers being loaded.
>         - the index register must point at a uint64_t that spans a page
>           boundary.
>
> I understand that the uint64_t spanning a page boundary may not be
> allowed by the arm 32 bit API.  However, the ldrd instruction definition
> allows for four byte alignment.

Yes, you're right; this is a longstanding bug in our LDRD implementation.
The Arm ARM defines in section G1.17.8.3 that if the execution of
an instruction generates a synhcronous Data Abort then the base
register should be restored to its original value if the aborted
instruction is a load and the list of registers to be loaded includes
the base register.

> Note that ldm may have similar issues if the index is one of the
> registers being loaded and the words cross a page boundary.

I believe in LDM we get this correct: in the do_ldm() function
we have a case for 'i == a->rn' which stashes the loaded value
into a TCG temporary, and we don't write that into the actual
register until after we've completed all the loads.

> The fix is to defer the register stores till after both words have been
> read from memory.
>
> Here is my fix:
>
> diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c
> index 9ee761fc64..78ad0ed186 100644
> --- a/target/arm/tcg/translate.c
> +++ b/target/arm/tcg/translate.c
> @@ -5006,7 +5006,7 @@ static bool op_store_rr(DisasContext *s,
> arg_ldst_rr *a,
>   static bool trans_LDRD_rr(DisasContext *s, arg_ldst_rr *a)
>   {
>       int mem_idx = get_mem_index(s);
> -    TCGv_i32 addr, tmp;
> +    TCGv_i32 addr, tmp1, tmp2;
>
>       if (!ENABLE_ARCH_5TE) {
>           return false;
> @@ -5017,15 +5017,16 @@ static bool trans_LDRD_rr(DisasContext *s,
> arg_ldst_rr *a)
>       }
>       addr = op_addr_rr_pre(s, a);
>
> -    tmp = tcg_temp_new_i32();
> -    gen_aa32_ld_i32(s, tmp, addr, mem_idx, MO_UL | MO_ALIGN);
> -    store_reg(s, a->rt, tmp);
> +    tmp1 = tcg_temp_new_i32();
> +    gen_aa32_ld_i32(s, tmp1, addr, mem_idx, MO_UL | MO_ALIGN);
>
>       tcg_gen_addi_i32(addr, addr, 4);
>
> -    tmp = tcg_temp_new_i32();
> -    gen_aa32_ld_i32(s, tmp, addr, mem_idx, MO_UL | MO_ALIGN);
> -    store_reg(s, a->rt + 1, tmp);
> +    tmp2 = tcg_temp_new_i32();
> +    gen_aa32_ld_i32(s, tmp2, addr, mem_idx, MO_UL | MO_ALIGN);
> +
> +    store_reg(s, a->rt, tmp1);
> +    store_reg(s, a->rt + 1, tmp2);
>
>       /* LDRD w/ base writeback is undefined if the registers overlap.  */
>       op_addr_rr_post(s, a, addr, -4);
> @@ -5153,19 +5154,20 @@ static bool op_store_ri(DisasContext *s,
> arg_ldst_ri *a,
>   static bool op_ldrd_ri(DisasContext *s, arg_ldst_ri *a, int rt2)
>   {
>       int mem_idx = get_mem_index(s);
> -    TCGv_i32 addr, tmp;
> +    TCGv_i32 addr, tmp1, tmp2;
>
>       addr = op_addr_ri_pre(s, a);
>
> -    tmp = tcg_temp_new_i32();
> -    gen_aa32_ld_i32(s, tmp, addr, mem_idx, MO_UL | MO_ALIGN);
> -    store_reg(s, a->rt, tmp);
> +    tmp1 = tcg_temp_new_i32();
> +    gen_aa32_ld_i32(s, tmp1, addr, mem_idx, MO_UL | MO_ALIGN);
>
>       tcg_gen_addi_i32(addr, addr, 4);
>
> -    tmp = tcg_temp_new_i32();
> -    gen_aa32_ld_i32(s, tmp, addr, mem_idx, MO_UL | MO_ALIGN);
> -    store_reg(s, rt2, tmp);
> +    tmp2 = tcg_temp_new_i32();
> +    gen_aa32_ld_i32(s, tmp2, addr, mem_idx, MO_UL | MO_ALIGN);
> +
> +    store_reg(s, a->rt, tmp1);
> +    store_reg(s, rt2, tmp2);
>
>       /* LDRD w/ base writeback is undefined if the registers overlap.  */
>       op_addr_ri_post(s, a, addr, -4);

Yes, this fix looks correct to me. Can you provide a
Signed-off-by: tag for it? We can't accept it as a patch
without that. (I can do the other administrative tidying
up of it into a commit, but the signed-off-by is what says
you have the legal right and are happy to submit it to QEMU
under our license (LGPLv2.1+ in this case)).

thanks
-- PMM


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: ldrd implementation issue?
  2025-02-11 10:46 ` ldrd implementation issue? Peter Maydell
@ 2025-02-17 13:03   ` Peter Maydell
  2025-02-27 14:42     ` Peter Maydell
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Maydell @ 2025-02-17 13:03 UTC (permalink / raw)
  To: Stu Grossman; +Cc: qemu-arm, QEMU Developers

On Tue, 11 Feb 2025 at 10:46, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> (added qemu-devel to the cc list)
>
> On Mon, 10 Feb 2025 at 17:26, Stu Grossman <stu.grossman@gmail.com> wrote:
> >
> > I've been getting SIGBUS cores with a bunch of user apps running under
> > linux 5.15 and qemu-system-aarch64.  These happen to be 32 bit (T32?)
> > programs.

> > The fix is to defer the register stores till after both words have been
> > read from memory.
> >
> > Here is my fix:

[snip patch]

> Yes, this fix looks correct to me. Can you provide a
> Signed-off-by: tag for it? We can't accept it as a patch
> without that. (I can do the other administrative tidying
> up of it into a commit, but the signed-off-by is what says
> you have the legal right and are happy to submit it to QEMU
> under our license (LGPLv2.1+ in this case)).

Hi -- this is just a nudge about whether you can provide
a signed-off-by line for this fix. I'd love to take it,
but can't without a signed-off-by. (Otherwise I'll have to
reimplement it from scratch, which I'll do at some point.)

thanks
-- PMM


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: ldrd implementation issue?
  2025-02-17 13:03   ` Peter Maydell
@ 2025-02-27 14:42     ` Peter Maydell
  0 siblings, 0 replies; 3+ messages in thread
From: Peter Maydell @ 2025-02-27 14:42 UTC (permalink / raw)
  To: Stu Grossman; +Cc: qemu-arm, QEMU Developers

On Mon, 17 Feb 2025 at 13:03, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Tue, 11 Feb 2025 at 10:46, Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> > (added qemu-devel to the cc list)
> >
> > On Mon, 10 Feb 2025 at 17:26, Stu Grossman <stu.grossman@gmail.com> wrote:
> > >
> > > I've been getting SIGBUS cores with a bunch of user apps running under
> > > linux 5.15 and qemu-system-aarch64.  These happen to be 32 bit (T32?)
> > > programs.
>
> > > The fix is to defer the register stores till after both words have been
> > > read from memory.
> > >
> > > Here is my fix:
>
> [snip patch]
>
> > Yes, this fix looks correct to me. Can you provide a
> > Signed-off-by: tag for it? We can't accept it as a patch
> > without that. (I can do the other administrative tidying
> > up of it into a commit, but the signed-off-by is what says
> > you have the legal right and are happy to submit it to QEMU
> > under our license (LGPLv2.1+ in this case)).
>
> Hi -- this is just a nudge about whether you can provide
> a signed-off-by line for this fix. I'd love to take it,
> but can't without a signed-off-by. (Otherwise I'll have to
> reimplement it from scratch, which I'll do at some point.)

I had a closer look at our LDRD code, and I noticed that we also
have another bug where we're not implementing the "must be 64-bit
atomic if the address is 8-aligned" requirement. So I wrote some
patches which fix that, and also will fix the page-crossing bug
(because they mean we now ask the memory system to load 64 bits
and then split it out into the two destination registers, rather
than doing two separate 32-bit loads, so we will take the fault
before updating either register).

If you're interested in testing those patches you can find them here:
https://patchew.org/QEMU/20250227142746.1698904-1-peter.maydell@linaro.org/

Thanks again for reporting this bug.

-- PMM


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-02-27 14:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <6866d6b6-24af-4743-8407-26ae2998dd21@gmail.com>
2025-02-11 10:46 ` ldrd implementation issue? Peter Maydell
2025-02-17 13:03   ` Peter Maydell
2025-02-27 14:42     ` Peter Maydell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).