* [PATCH v2 0/1] Fix endless translation loop of riscv
@ 2025-04-14 3:46 Ziqiao Kong
2025-04-14 3:46 ` [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems Ziqiao Kong
0 siblings, 1 reply; 11+ messages in thread
From: Ziqiao Kong @ 2025-04-14 3:46 UTC (permalink / raw)
To: qemu-devel; +Cc: ziqiaokong, qemu-trivial, alistair.francis, richard.henderson
Sorry for sending this again as I found previous patch series does not
work on riscv32 due to target_ulong is not le64. Please ignore my
previous v1 patch. Below is the original cover letter to illustrate the
purpose of the patch:
Hello! I'm Ziqiao Kong, the maintainer of Unicorn Engine, a fork of
QEMU. When I port Unicorn Engine to s390x, I notice there is a bug in
the implementation of RISCV MMU. It uses qemu_map_ram_ptr to get a
pointer and reads it directly, instead of bswap or address_space_ldl,
which causes an endless translation loop on big endian systems like
s390x I'm working on. Therefore, a quick fix to this is to call
cpu_to_le64 for cmpxchg as this patch shows.
This patch passes our unit tests and the error is somewhat obvious
(unhandled endianness discrepancy). Therefore, I'm rather confident
that QEMU aslo needs this patch. Given the changes are small and don't
impact most popular little endian platforms, I think this fits into
trivial patches.
Ziqiao Kong (1):
target/riscv: fix endless translation loop on big endian systems
target/riscv/cpu_helper.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems
2025-04-14 3:46 [PATCH v2 0/1] Fix endless translation loop of riscv Ziqiao Kong
@ 2025-04-14 3:46 ` Ziqiao Kong
2025-04-14 10:41 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 11+ messages in thread
From: Ziqiao Kong @ 2025-04-14 3:46 UTC (permalink / raw)
To: qemu-devel; +Cc: ziqiaokong, qemu-trivial, alistair.francis, richard.henderson
On big endian systems, pte and updated_pte hold big endian host data
while pte_pa points to little endian target data. This means the branch
at cpu_helper.c:1669 will be always satisfied and restart translation,
causing an endless translation loop.
Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
---
target/riscv/cpu_helper.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 6c4391d96b..bc146771c8 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
target_ulong old_pte;
if (riscv_cpu_sxl(env) == MXL_RV32) {
- old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
+ old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
} else {
- old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
+ old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
}
if (old_pte != pte) {
goto restart;
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems
2025-04-14 3:46 ` [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems Ziqiao Kong
@ 2025-04-14 10:41 ` Philippe Mathieu-Daudé
2025-04-14 11:17 ` Ziqiao Kong
0 siblings, 1 reply; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-04-14 10:41 UTC (permalink / raw)
To: Ziqiao Kong, qemu-devel; +Cc: qemu-trivial, alistair.francis, richard.henderson
Hi,
On 14/4/25 05:46, Ziqiao Kong wrote:
> On big endian systems, pte and updated_pte hold big endian host data
> while pte_pa points to little endian target data. This means the branch
> at cpu_helper.c:1669 will be always satisfied and restart translation,
> causing an endless translation loop.
>
Cc: qemu-stable@nongnu.org
Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> ---
> target/riscv/cpu_helper.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 6c4391d96b..bc146771c8 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> target_ulong old_pte;
> if (riscv_cpu_sxl(env) == MXL_RV32) {
> - old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> + old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> } else {
> - old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> + old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> }
> if (old_pte != pte) {
> goto restart;
If PTEs are always stored in LE order, maybe what we want is earlier:
-- >8 --
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 619c76cc001..b6ac2800240 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
*env, hwaddr *physical,
if (riscv_cpu_mxl(env) == MXL_RV32) {
- pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
+ pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
} else {
- pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
+ pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
}
---
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems
2025-04-14 10:41 ` Philippe Mathieu-Daudé
@ 2025-04-14 11:17 ` Ziqiao Kong
2025-04-14 16:59 ` Ziqiao Kong
0 siblings, 1 reply; 11+ messages in thread
From: Ziqiao Kong @ 2025-04-14 11:17 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: qemu-devel, qemu-trivial, alistair.francis, richard.henderson
On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> Hi,
>
> On 14/4/25 05:46, Ziqiao Kong wrote:
> > On big endian systems, pte and updated_pte hold big endian host data
> > while pte_pa points to little endian target data. This means the branch
> > at cpu_helper.c:1669 will be always satisfied and restart translation,
> > causing an endless translation loop.
> >
>
> Cc: qemu-stable@nongnu.org
> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
>
> > Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> > ---
> > target/riscv/cpu_helper.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > index 6c4391d96b..bc146771c8 100644
> > --- a/target/riscv/cpu_helper.c
> > +++ b/target/riscv/cpu_helper.c
> > @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> > target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> > target_ulong old_pte;
> > if (riscv_cpu_sxl(env) == MXL_RV32) {
> > - old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> > + old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> > } else {
> > - old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> > + old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> > }
> > if (old_pte != pte) {
> > goto restart;
>
> If PTEs are always stored in LE order, maybe what we want is earlier:
>
> -- >8 --
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 619c76cc001..b6ac2800240 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> *env, hwaddr *physical,
> if (riscv_cpu_mxl(env) == MXL_RV32) {
> - pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> + pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> } else {
> - pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> + pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
Unfortunately, this doesn't work in two ways:
1. Note pte is used in the following code and that means pte must hold
a correct value from the
view of host endian (in my case, big endian not little endian).
2. address_space_ldq_le will dispatch to ldq_le_p, while
address_space_leq will dispatch to ldq_p.
However, on little endian targets, ldq_p is an alias of ldq_le_p so
making no effects.
Per my testing, this patch doesn't have any effect indeed. To have a
brief view what is happening,
see the logs just before atomic_cmpxchg:
pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
> }
> ---
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems
2025-04-14 11:17 ` Ziqiao Kong
@ 2025-04-14 16:59 ` Ziqiao Kong
2025-04-14 17:38 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 11+ messages in thread
From: Ziqiao Kong @ 2025-04-14 16:59 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: qemu-devel, qemu-trivial, alistair.francis, richard.henderson
Hello Philippe,
Any further concern regarding this series? I certainly would like to investigate
and help =).
Bests,
Ziqiao
On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
>
> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
> <philmd@linaro.org> wrote:
> >
> > Hi,
> >
> > On 14/4/25 05:46, Ziqiao Kong wrote:
> > > On big endian systems, pte and updated_pte hold big endian host data
> > > while pte_pa points to little endian target data. This means the branch
> > > at cpu_helper.c:1669 will be always satisfied and restart translation,
> > > causing an endless translation loop.
> > >
> >
> > Cc: qemu-stable@nongnu.org
> > Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> >
> > > Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> > > ---
> > > target/riscv/cpu_helper.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > > index 6c4391d96b..bc146771c8 100644
> > > --- a/target/riscv/cpu_helper.c
> > > +++ b/target/riscv/cpu_helper.c
> > > @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> > > target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> > > target_ulong old_pte;
> > > if (riscv_cpu_sxl(env) == MXL_RV32) {
> > > - old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> > > + old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> > > } else {
> > > - old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> > > + old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> > > }
> > > if (old_pte != pte) {
> > > goto restart;
> >
> > If PTEs are always stored in LE order, maybe what we want is earlier:
> >
> > -- >8 --
> > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > index 619c76cc001..b6ac2800240 100644
> > --- a/target/riscv/cpu_helper.c
> > +++ b/target/riscv/cpu_helper.c
> > @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> > *env, hwaddr *physical,
> > if (riscv_cpu_mxl(env) == MXL_RV32) {
> > - pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> > + pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> > } else {
> > - pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> > + pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
>
> Unfortunately, this doesn't work in two ways:
>
> 1. Note pte is used in the following code and that means pte must hold
> a correct value from the
> view of host endian (in my case, big endian not little endian).
> 2. address_space_ldq_le will dispatch to ldq_le_p, while
> address_space_leq will dispatch to ldq_p.
> However, on little endian targets, ldq_p is an alias of ldq_le_p so
> making no effects.
>
> Per my testing, this patch doesn't have any effect indeed. To have a
> brief view what is happening,
> see the logs just before atomic_cmpxchg:
>
> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
>
> > }
> > ---
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems
2025-04-14 16:59 ` Ziqiao Kong
@ 2025-04-14 17:38 ` Philippe Mathieu-Daudé
[not found] ` <CAM0BWNBNrjJ6UuF+TRtkuEesLatnY1pzSjyaiPVDeKSMF8no-A@mail.gmail.com>
0 siblings, 1 reply; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-04-14 17:38 UTC (permalink / raw)
To: Ziqiao Kong
Cc: qemu-devel, qemu-trivial, alistair.francis, richard.henderson,
Paolo Bonzini
Hi,
On 14/4/25 18:59, Ziqiao Kong wrote:
> Hello Philippe,
>
> Any further concern regarding this series? I certainly would like to investigate
> and help =).
Short term I can't keep looking because I'm busy with other stuffs and
tagged this patch for another review, because there is some endianness
code smell in get_physical_address(). I understand your change fixes
your issue, but I'm skeptical about it, in part because there are no
such use in the whole code base. My change suggestion is just a starting
point, more is needed.
>
> Bests,
> Ziqiao
>
> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
>>
>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
>> <philmd@linaro.org> wrote:
>>>
>>> Hi,
>>>
>>> On 14/4/25 05:46, Ziqiao Kong wrote:
>>>> On big endian systems, pte and updated_pte hold big endian host data
>>>> while pte_pa points to little endian target data. This means the branch
>>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
>>>> causing an endless translation loop.
>>>>
>>>
>>> Cc: qemu-stable@nongnu.org
>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
>>>
>>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
>>>> ---
>>>> target/riscv/cpu_helper.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>>>> index 6c4391d96b..bc146771c8 100644
>>>> --- a/target/riscv/cpu_helper.c
>>>> +++ b/target/riscv/cpu_helper.c
>>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
>>>> target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
>>>> target_ulong old_pte;
>>>> if (riscv_cpu_sxl(env) == MXL_RV32) {
>>>> - old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
>>>> + old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
>>>> } else {
>>>> - old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
>>>> + old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
>>>> }
>>>> if (old_pte != pte) {
>>>> goto restart;
>>>
>>> If PTEs are always stored in LE order, maybe what we want is earlier:
>>>
>>> -- >8 --
>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>>> index 619c76cc001..b6ac2800240 100644
>>> --- a/target/riscv/cpu_helper.c
>>> +++ b/target/riscv/cpu_helper.c
>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
>>> *env, hwaddr *physical,
>>> if (riscv_cpu_mxl(env) == MXL_RV32) {
>>> - pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
>>> + pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
>>> } else {
>>> - pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
>>> + pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
>>
>> Unfortunately, this doesn't work in two ways:
>>
>> 1. Note pte is used in the following code and that means pte must hold
>> a correct value from the
>> view of host endian (in my case, big endian not little endian).
>> 2. address_space_ldq_le will dispatch to ldq_le_p, while
>> address_space_leq will dispatch to ldq_p.
>> However, on little endian targets, ldq_p is an alias of ldq_le_p so
>> making no effects.
>>
>> Per my testing, this patch doesn't have any effect indeed. To have a
>> brief view what is happening,
>> see the logs just before atomic_cmpxchg:
>>
>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
>>
>>> }
>>> ---
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems
[not found] ` <CAM0BWNBNrjJ6UuF+TRtkuEesLatnY1pzSjyaiPVDeKSMF8no-A@mail.gmail.com>
@ 2025-04-15 7:04 ` Ziqiao Kong
2025-04-15 7:15 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 11+ messages in thread
From: Ziqiao Kong @ 2025-04-15 7:04 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: QEMU Developers, qemu-trivial, alistair.francis,
Richard Henderson, Paolo Bonzini
Accidentally not cc all recipients. Sorry for the confusion. Below is
the duplicated message:
Hello Philippe,
On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> Hi,
>
> On 14/4/25 18:59, Ziqiao Kong wrote:
> > Hello Philippe,
> >
> > Any further concern regarding this series? I certainly would like to investigate
> > and help =).
>
> Short term I can't keep looking because I'm busy with other stuffs and
> tagged this patch for another review, because there is some endianness
> code smell in get_physical_address(). I understand your change fixes
> your issue, but I'm skeptical about it, in part because there are no
> such use in the whole code base. My change suggestion is just a starting
> point, more is needed.
Thanks for responding.
Actually, the pattern of this usage is actually very common in the code base and
that's why I fixed in this way. Sorry I should have put this in the
cover letter to
justify my fix. Below is an incomplete list of the code using this pattern:
- target/i386/tcg/system/excp_helper.c:129
if (likely(in->haddr)) {
old = cpu_to_le32(old);
new = cpu_to_le32(new);
return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old;
}
- target/arm/ptw.c: 840
if (ptw->out_be) {
old_val = cpu_to_be64(old_val);
new_val = cpu_to_be64(new_val);
cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
cur_val = be64_to_cpu(cur_val);
} else {
old_val = cpu_to_le64(old_val);
new_val = cpu_to_le64(new_val);
cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
cur_val = le64_to_cpu(cur_val);
}
You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all matches.
>
> >
> > Bests,
> > Ziqiao
> >
> > On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
> >>
> >> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
> >> <philmd@linaro.org> wrote:
> >>>
> >>> Hi,
> >>>
> >>> On 14/4/25 05:46, Ziqiao Kong wrote:
> >>>> On big endian systems, pte and updated_pte hold big endian host data
> >>>> while pte_pa points to little endian target data. This means the branch
> >>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
> >>>> causing an endless translation loop.
> >>>>
> >>>
> >>> Cc: qemu-stable@nongnu.org
> >>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> >>>
> >>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> >>>> ---
> >>>> target/riscv/cpu_helper.c | 4 ++--
> >>>> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> >>>> index 6c4391d96b..bc146771c8 100644
> >>>> --- a/target/riscv/cpu_helper.c
> >>>> +++ b/target/riscv/cpu_helper.c
> >>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> >>>> target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> >>>> target_ulong old_pte;
> >>>> if (riscv_cpu_sxl(env) == MXL_RV32) {
> >>>> - old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> >>>> + old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> >>>> } else {
> >>>> - old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> >>>> + old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> >>>> }
> >>>> if (old_pte != pte) {
> >>>> goto restart;
> >>>
> >>> If PTEs are always stored in LE order, maybe what we want is earlier:
> >>>
> >>> -- >8 --
> >>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> >>> index 619c76cc001..b6ac2800240 100644
> >>> --- a/target/riscv/cpu_helper.c
> >>> +++ b/target/riscv/cpu_helper.c
> >>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> >>> *env, hwaddr *physical,
> >>> if (riscv_cpu_mxl(env) == MXL_RV32) {
> >>> - pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> >>> + pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> >>> } else {
> >>> - pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> >>> + pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
> >>
> >> Unfortunately, this doesn't work in two ways:
> >>
> >> 1. Note pte is used in the following code and that means pte must hold
> >> a correct value from the
> >> view of host endian (in my case, big endian not little endian).
> >> 2. address_space_ldq_le will dispatch to ldq_le_p, while
> >> address_space_leq will dispatch to ldq_p.
> >> However, on little endian targets, ldq_p is an alias of ldq_le_p so
> >> making no effects.
> >>
> >> Per my testing, this patch doesn't have any effect indeed. To have a
> >> brief view what is happening,
> >> see the logs just before atomic_cmpxchg:
> >>
> >> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
> >>
> >>> }
> >>> ---
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems
2025-04-15 7:04 ` Ziqiao Kong
@ 2025-04-15 7:15 ` Philippe Mathieu-Daudé
2025-04-15 7:19 ` Ziqiao Kong
0 siblings, 1 reply; 11+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-04-15 7:15 UTC (permalink / raw)
To: Ziqiao Kong
Cc: QEMU Developers, qemu-trivial, alistair.francis,
Richard Henderson, Paolo Bonzini
On 15/4/25 09:04, Ziqiao Kong wrote:
> Accidentally not cc all recipients. Sorry for the confusion. Below is
> the duplicated message:
>
> Hello Philippe,
>
> On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé
> <philmd@linaro.org> wrote:
>>
>> Hi,
>>
>> On 14/4/25 18:59, Ziqiao Kong wrote:
>>> Hello Philippe,
>>>
>>> Any further concern regarding this series? I certainly would like to investigate
>>> and help =).
>>
>> Short term I can't keep looking because I'm busy with other stuffs and
>> tagged this patch for another review, because there is some endianness
>> code smell in get_physical_address(). I understand your change fixes
>> your issue, but I'm skeptical about it, in part because there are no
>> such use in the whole code base. My change suggestion is just a starting
>> point, more is needed.
>
> Thanks for responding.
>
> Actually, the pattern of this usage is actually very common in the code base and
> that's why I fixed in this way. Sorry I should have put this in the
> cover letter to
> justify my fix. Below is an incomplete list of the code using this pattern:
>
> - target/i386/tcg/system/excp_helper.c:129
>
> if (likely(in->haddr)) {
> old = cpu_to_le32(old);
> new = cpu_to_le32(new);
> return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old;
> }
>
> - target/arm/ptw.c: 840
>
> if (ptw->out_be) {
> old_val = cpu_to_be64(old_val);
> new_val = cpu_to_be64(new_val);
> cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> cur_val = be64_to_cpu(cur_val);
> } else {
> old_val = cpu_to_le64(old_val);
> new_val = cpu_to_le64(new_val);
> cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> cur_val = le64_to_cpu(cur_val);
> }
Doh OK...
>
> You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all matches.
>
>
>>
>>>
>>> Bests,
>>> Ziqiao
>>>
>>> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
>>>>
>>>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
>>>> <philmd@linaro.org> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 14/4/25 05:46, Ziqiao Kong wrote:
>>>>>> On big endian systems, pte and updated_pte hold big endian host data
>>>>>> while pte_pa points to little endian target data. This means the branch
>>>>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
>>>>>> causing an endless translation loop.
>>>>>>
>>>>>
>>>>> Cc: qemu-stable@nongnu.org
>>>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
>>>>>
>>>>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
>>>>>> ---
>>>>>> target/riscv/cpu_helper.c | 4 ++--
>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>>>>>> index 6c4391d96b..bc146771c8 100644
>>>>>> --- a/target/riscv/cpu_helper.c
>>>>>> +++ b/target/riscv/cpu_helper.c
>>>>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
>>>>>> target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
>>>>>> target_ulong old_pte;
>>>>>> if (riscv_cpu_sxl(env) == MXL_RV32) {
>>>>>> - old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
>>>>>> + old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
Then don't we need:
old_pte = le32_to_cpu(old_pte);
>>>>>> } else {
>>>>>> - old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
>>>>>> + old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
old_pte = le64_to_cpu(old_pte);
?
>>>>>> }
>>>>>> if (old_pte != pte) {
>>>>>> goto restart;
>>>>>
>>>>> If PTEs are always stored in LE order, maybe what we want is earlier:
>>>>>
>>>>> -- >8 --
>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
>>>>> index 619c76cc001..b6ac2800240 100644
>>>>> --- a/target/riscv/cpu_helper.c
>>>>> +++ b/target/riscv/cpu_helper.c
>>>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
>>>>> *env, hwaddr *physical,
>>>>> if (riscv_cpu_mxl(env) == MXL_RV32) {
>>>>> - pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
>>>>> + pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
>>>>> } else {
>>>>> - pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
>>>>> + pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
>>>>
>>>> Unfortunately, this doesn't work in two ways:
>>>>
>>>> 1. Note pte is used in the following code and that means pte must hold
>>>> a correct value from the
>>>> view of host endian (in my case, big endian not little endian).
>>>> 2. address_space_ldq_le will dispatch to ldq_le_p, while
>>>> address_space_leq will dispatch to ldq_p.
>>>> However, on little endian targets, ldq_p is an alias of ldq_le_p so
>>>> making no effects.
>>>>
>>>> Per my testing, this patch doesn't have any effect indeed. To have a
>>>> brief view what is happening,
>>>> see the logs just before atomic_cmpxchg:
>>>>
>>>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
>>>>
>>>>> }
>>>>> ---
>>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems
2025-04-15 7:15 ` Philippe Mathieu-Daudé
@ 2025-04-15 7:19 ` Ziqiao Kong
2025-04-15 7:22 ` Ziqiao Kong
0 siblings, 1 reply; 11+ messages in thread
From: Ziqiao Kong @ 2025-04-15 7:19 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: QEMU Developers, qemu-trivial, alistair.francis,
Richard Henderson, Paolo Bonzini
Hello Philippe,
On Tue, Apr 15, 2025 at 3:15 PM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> On 15/4/25 09:04, Ziqiao Kong wrote:
> > Accidentally not cc all recipients. Sorry for the confusion. Below is
> > the duplicated message:
> >
> > Hello Philippe,
> >
> > On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé
> > <philmd@linaro.org> wrote:
> >>
> >> Hi,
> >>
> >> On 14/4/25 18:59, Ziqiao Kong wrote:
> >>> Hello Philippe,
> >>>
> >>> Any further concern regarding this series? I certainly would like to investigate
> >>> and help =).
> >>
> >> Short term I can't keep looking because I'm busy with other stuffs and
> >> tagged this patch for another review, because there is some endianness
> >> code smell in get_physical_address(). I understand your change fixes
> >> your issue, but I'm skeptical about it, in part because there are no
> >> such use in the whole code base. My change suggestion is just a starting
> >> point, more is needed.
> >
> > Thanks for responding.
> >
> > Actually, the pattern of this usage is actually very common in the code base and
> > that's why I fixed in this way. Sorry I should have put this in the
> > cover letter to
> > justify my fix. Below is an incomplete list of the code using this pattern:
> >
> > - target/i386/tcg/system/excp_helper.c:129
> >
> > if (likely(in->haddr)) {
> > old = cpu_to_le32(old);
> > new = cpu_to_le32(new);
> > return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old;
> > }
> >
> > - target/arm/ptw.c: 840
> >
> > if (ptw->out_be) {
> > old_val = cpu_to_be64(old_val);
> > new_val = cpu_to_be64(new_val);
> > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > cur_val = be64_to_cpu(cur_val);
> > } else {
> > old_val = cpu_to_le64(old_val);
> > new_val = cpu_to_le64(new_val);
> > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > cur_val = le64_to_cpu(cur_val);
> > }
>
> Doh OK...
>
> >
> > You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all matches.
> >
> >
> >>
> >>>
> >>> Bests,
> >>> Ziqiao
> >>>
> >>> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
> >>>>
> >>>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
> >>>> <philmd@linaro.org> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> On 14/4/25 05:46, Ziqiao Kong wrote:
> >>>>>> On big endian systems, pte and updated_pte hold big endian host data
> >>>>>> while pte_pa points to little endian target data. This means the branch
> >>>>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
> >>>>>> causing an endless translation loop.
> >>>>>>
> >>>>>
> >>>>> Cc: qemu-stable@nongnu.org
> >>>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> >>>>>
> >>>>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> >>>>>> ---
> >>>>>> target/riscv/cpu_helper.c | 4 ++--
> >>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> >>>>>> index 6c4391d96b..bc146771c8 100644
> >>>>>> --- a/target/riscv/cpu_helper.c
> >>>>>> +++ b/target/riscv/cpu_helper.c
> >>>>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> >>>>>> target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> >>>>>> target_ulong old_pte;
> >>>>>> if (riscv_cpu_sxl(env) == MXL_RV32) {
> >>>>>> - old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> >>>>>> + old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
>
> Then don't we need:
>
> old_pte = le32_to_cpu(old_pte);
>
> >>>>>> } else {
> >>>>>> - old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> >>>>>> + old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
>
> old_pte = le64_to_cpu(old_pte);
>
> ?
Note old_pte is no longer used later (it is defined within the scope here) and
dropped immediately after qatomic_cmpxchg so we don't need that extra bswap.
>
> >>>>>> }
> >>>>>> if (old_pte != pte) {
> >>>>>> goto restart;
> >>>>>
> >>>>> If PTEs are always stored in LE order, maybe what we want is earlier:
> >>>>>
> >>>>> -- >8 --
> >>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> >>>>> index 619c76cc001..b6ac2800240 100644
> >>>>> --- a/target/riscv/cpu_helper.c
> >>>>> +++ b/target/riscv/cpu_helper.c
> >>>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> >>>>> *env, hwaddr *physical,
> >>>>> if (riscv_cpu_mxl(env) == MXL_RV32) {
> >>>>> - pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> >>>>> + pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> >>>>> } else {
> >>>>> - pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> >>>>> + pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
> >>>>
> >>>> Unfortunately, this doesn't work in two ways:
> >>>>
> >>>> 1. Note pte is used in the following code and that means pte must hold
> >>>> a correct value from the
> >>>> view of host endian (in my case, big endian not little endian).
> >>>> 2. address_space_ldq_le will dispatch to ldq_le_p, while
> >>>> address_space_leq will dispatch to ldq_p.
> >>>> However, on little endian targets, ldq_p is an alias of ldq_le_p so
> >>>> making no effects.
> >>>>
> >>>> Per my testing, this patch doesn't have any effect indeed. To have a
> >>>> brief view what is happening,
> >>>> see the logs just before atomic_cmpxchg:
> >>>>
> >>>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
> >>>>
> >>>>> }
> >>>>> ---
> >>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems
2025-04-15 7:19 ` Ziqiao Kong
@ 2025-04-15 7:22 ` Ziqiao Kong
2025-04-15 7:40 ` Ziqiao Kong
0 siblings, 1 reply; 11+ messages in thread
From: Ziqiao Kong @ 2025-04-15 7:22 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: QEMU Developers, qemu-trivial, alistair.francis,
Richard Henderson, Paolo Bonzini
On Tue, Apr 15, 2025 at 3:19 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
>
> Hello Philippe,
>
> On Tue, Apr 15, 2025 at 3:15 PM Philippe Mathieu-Daudé
> <philmd@linaro.org> wrote:
> >
> > On 15/4/25 09:04, Ziqiao Kong wrote:
> > > Accidentally not cc all recipients. Sorry for the confusion. Below is
> > > the duplicated message:
> > >
> > > Hello Philippe,
> > >
> > > On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé
> > > <philmd@linaro.org> wrote:
> > >>
> > >> Hi,
> > >>
> > >> On 14/4/25 18:59, Ziqiao Kong wrote:
> > >>> Hello Philippe,
> > >>>
> > >>> Any further concern regarding this series? I certainly would like to investigate
> > >>> and help =).
> > >>
> > >> Short term I can't keep looking because I'm busy with other stuffs and
> > >> tagged this patch for another review, because there is some endianness
> > >> code smell in get_physical_address(). I understand your change fixes
> > >> your issue, but I'm skeptical about it, in part because there are no
> > >> such use in the whole code base. My change suggestion is just a starting
> > >> point, more is needed.
> > >
> > > Thanks for responding.
> > >
> > > Actually, the pattern of this usage is actually very common in the code base and
> > > that's why I fixed in this way. Sorry I should have put this in the
> > > cover letter to
> > > justify my fix. Below is an incomplete list of the code using this pattern:
> > >
> > > - target/i386/tcg/system/excp_helper.c:129
> > >
> > > if (likely(in->haddr)) {
> > > old = cpu_to_le32(old);
> > > new = cpu_to_le32(new);
> > > return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old;
> > > }
> > >
> > > - target/arm/ptw.c: 840
> > >
> > > if (ptw->out_be) {
> > > old_val = cpu_to_be64(old_val);
> > > new_val = cpu_to_be64(new_val);
> > > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > > cur_val = be64_to_cpu(cur_val);
> > > } else {
> > > old_val = cpu_to_le64(old_val);
> > > new_val = cpu_to_le64(new_val);
> > > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > > cur_val = le64_to_cpu(cur_val);
> > > }
> >
> > Doh OK...
> >
> > >
> > > You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all matches.
> > >
> > >
> > >>
> > >>>
> > >>> Bests,
> > >>> Ziqiao
> > >>>
> > >>> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
> > >>>>
> > >>>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
> > >>>> <philmd@linaro.org> wrote:
> > >>>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> On 14/4/25 05:46, Ziqiao Kong wrote:
> > >>>>>> On big endian systems, pte and updated_pte hold big endian host data
> > >>>>>> while pte_pa points to little endian target data. This means the branch
> > >>>>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
> > >>>>>> causing an endless translation loop.
> > >>>>>>
> > >>>>>
> > >>>>> Cc: qemu-stable@nongnu.org
> > >>>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> > >>>>>
> > >>>>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> > >>>>>> ---
> > >>>>>> target/riscv/cpu_helper.c | 4 ++--
> > >>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
> > >>>>>>
> > >>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > >>>>>> index 6c4391d96b..bc146771c8 100644
> > >>>>>> --- a/target/riscv/cpu_helper.c
> > >>>>>> +++ b/target/riscv/cpu_helper.c
> > >>>>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> > >>>>>> target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> > >>>>>> target_ulong old_pte;
> > >>>>>> if (riscv_cpu_sxl(env) == MXL_RV32) {
> > >>>>>> - old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> > >>>>>> + old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> >
> > Then don't we need:
> >
> > old_pte = le32_to_cpu(old_pte);
> >
> > >>>>>> } else {
> > >>>>>> - old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> > >>>>>> + old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> >
> > old_pte = le64_to_cpu(old_pte);
> >
> > ?
>
> Note old_pte is no longer used later (it is defined within the scope here) and
> dropped immediately after qatomic_cmpxchg so we don't need that extra bswap.
Also pte is not modified inplace previously, unlike the code samples above.
>
> >
> > >>>>>> }
> > >>>>>> if (old_pte != pte) {
> > >>>>>> goto restart;
> > >>>>>
> > >>>>> If PTEs are always stored in LE order, maybe what we want is earlier:
> > >>>>>
> > >>>>> -- >8 --
> > >>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > >>>>> index 619c76cc001..b6ac2800240 100644
> > >>>>> --- a/target/riscv/cpu_helper.c
> > >>>>> +++ b/target/riscv/cpu_helper.c
> > >>>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> > >>>>> *env, hwaddr *physical,
> > >>>>> if (riscv_cpu_mxl(env) == MXL_RV32) {
> > >>>>> - pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> > >>>>> + pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> > >>>>> } else {
> > >>>>> - pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> > >>>>> + pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
> > >>>>
> > >>>> Unfortunately, this doesn't work in two ways:
> > >>>>
> > >>>> 1. Note pte is used in the following code and that means pte must hold
> > >>>> a correct value from the
> > >>>> view of host endian (in my case, big endian not little endian).
> > >>>> 2. address_space_ldq_le will dispatch to ldq_le_p, while
> > >>>> address_space_leq will dispatch to ldq_p.
> > >>>> However, on little endian targets, ldq_p is an alias of ldq_le_p so
> > >>>> making no effects.
> > >>>>
> > >>>> Per my testing, this patch doesn't have any effect indeed. To have a
> > >>>> brief view what is happening,
> > >>>> see the logs just before atomic_cmpxchg:
> > >>>>
> > >>>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
> > >>>>
> > >>>>> }
> > >>>>> ---
> > >>
> >
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems
2025-04-15 7:22 ` Ziqiao Kong
@ 2025-04-15 7:40 ` Ziqiao Kong
0 siblings, 0 replies; 11+ messages in thread
From: Ziqiao Kong @ 2025-04-15 7:40 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: QEMU Developers, qemu-trivial, alistair.francis,
Richard Henderson, Paolo Bonzini
On Tue, Apr 15, 2025 at 3:22 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
>
> On Tue, Apr 15, 2025 at 3:19 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
> >
> > Hello Philippe,
> >
> > On Tue, Apr 15, 2025 at 3:15 PM Philippe Mathieu-Daudé
> > <philmd@linaro.org> wrote:
> > >
> > > On 15/4/25 09:04, Ziqiao Kong wrote:
> > > > Accidentally not cc all recipients. Sorry for the confusion. Below is
> > > > the duplicated message:
> > > >
> > > > Hello Philippe,
> > > >
> > > > On Tue, Apr 15, 2025 at 1:38 AM Philippe Mathieu-Daudé
> > > > <philmd@linaro.org> wrote:
> > > >>
> > > >> Hi,
> > > >>
> > > >> On 14/4/25 18:59, Ziqiao Kong wrote:
> > > >>> Hello Philippe,
> > > >>>
> > > >>> Any further concern regarding this series? I certainly would like to investigate
> > > >>> and help =).
> > > >>
> > > >> Short term I can't keep looking because I'm busy with other stuffs and
> > > >> tagged this patch for another review, because there is some endianness
> > > >> code smell in get_physical_address(). I understand your change fixes
> > > >> your issue, but I'm skeptical about it, in part because there are no
> > > >> such use in the whole code base. My change suggestion is just a starting
> > > >> point, more is needed.
> > > >
> > > > Thanks for responding.
> > > >
> > > > Actually, the pattern of this usage is actually very common in the code base and
> > > > that's why I fixed in this way. Sorry I should have put this in the
> > > > cover letter to
> > > > justify my fix. Below is an incomplete list of the code using this pattern:
> > > >
> > > > - target/i386/tcg/system/excp_helper.c:129
> > > >
> > > > if (likely(in->haddr)) {
> > > > old = cpu_to_le32(old);
> > > > new = cpu_to_le32(new);
> > > > return qatomic_cmpxchg((uint32_t *)in->haddr, old, new) == old;
> > > > }
> > > >
> > > > - target/arm/ptw.c: 840
> > > >
> > > > if (ptw->out_be) {
> > > > old_val = cpu_to_be64(old_val);
> > > > new_val = cpu_to_be64(new_val);
> > > > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > > > cur_val = be64_to_cpu(cur_val);
> > > > } else {
> > > > old_val = cpu_to_le64(old_val);
> > > > new_val = cpu_to_le64(new_val);
> > > > cur_val = qatomic_cmpxchg__nocheck((uint64_t *)host, old_val, new_val);
> > > > cur_val = le64_to_cpu(cur_val);
> > > > }
> > >
> > > Doh OK...
> > >
> > > >
> > > > You might want to do a `grep -rn "qatomic_cmpxchg" .` to see all matches.
> > > >
> > > >
> > > >>
> > > >>>
> > > >>> Bests,
> > > >>> Ziqiao
> > > >>>
> > > >>> On Mon, Apr 14, 2025 at 7:17 PM Ziqiao Kong <ziqiaokong@gmail.com> wrote:
> > > >>>>
> > > >>>> On Mon, Apr 14, 2025 at 6:41 PM Philippe Mathieu-Daudé
> > > >>>> <philmd@linaro.org> wrote:
> > > >>>>>
> > > >>>>> Hi,
> > > >>>>>
> > > >>>>> On 14/4/25 05:46, Ziqiao Kong wrote:
> > > >>>>>> On big endian systems, pte and updated_pte hold big endian host data
> > > >>>>>> while pte_pa points to little endian target data. This means the branch
> > > >>>>>> at cpu_helper.c:1669 will be always satisfied and restart translation,
> > > >>>>>> causing an endless translation loop.
> > > >>>>>>
> > > >>>>>
> > > >>>>> Cc: qemu-stable@nongnu.org
> > > >>>>> Fixes: 0c3e702aca7 ("RISC-V CPU Helpers")
> > > >>>>>
> > > >>>>>> Signed-off-by: Ziqiao Kong <ziqiaokong@gmail.com>
> > > >>>>>> ---
> > > >>>>>> target/riscv/cpu_helper.c | 4 ++--
> > > >>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >>>>>>
> > > >>>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > > >>>>>> index 6c4391d96b..bc146771c8 100644
> > > >>>>>> --- a/target/riscv/cpu_helper.c
> > > >>>>>> +++ b/target/riscv/cpu_helper.c
> > > >>>>>> @@ -1662,9 +1662,9 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
> > > >>>>>> target_ulong *pte_pa = qemu_map_ram_ptr(mr->ram_block, addr1);
> > > >>>>>> target_ulong old_pte;
> > > >>>>>> if (riscv_cpu_sxl(env) == MXL_RV32) {
> > > >>>>>> - old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, pte, updated_pte);
> > > >>>>>> + old_pte = qatomic_cmpxchg((uint32_t *)pte_pa, cpu_to_le32(pte), cpu_to_le32(updated_pte));
> > >
> > > Then don't we need:
> > >
> > > old_pte = le32_to_cpu(old_pte);
> > >
> > > >>>>>> } else {
> > > >>>>>> - old_pte = qatomic_cmpxchg(pte_pa, pte, updated_pte);
> > > >>>>>> + old_pte = qatomic_cmpxchg(pte_pa, cpu_to_le64(pte), cpu_to_le64(updated_pte));
> > >
> > > old_pte = le64_to_cpu(old_pte);
> > >
> > > ?
Oh you are correct. I will draft a v3.
> >
> > Note old_pte is no longer used later (it is defined within the scope here) and
> > dropped immediately after qatomic_cmpxchg so we don't need that extra bswap.
>
> Also pte is not modified inplace previously, unlike the code samples above.
>
> >
> > >
> > > >>>>>> }
> > > >>>>>> if (old_pte != pte) {
> > > >>>>>> goto restart;
> > > >>>>>
> > > >>>>> If PTEs are always stored in LE order, maybe what we want is earlier:
> > > >>>>>
> > > >>>>> -- >8 --
> > > >>>>> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > > >>>>> index 619c76cc001..b6ac2800240 100644
> > > >>>>> --- a/target/riscv/cpu_helper.c
> > > >>>>> +++ b/target/riscv/cpu_helper.c
> > > >>>>> @@ -1464,5 +1464,5 @@ static int get_physical_address(CPURISCVState
> > > >>>>> *env, hwaddr *physical,
> > > >>>>> if (riscv_cpu_mxl(env) == MXL_RV32) {
> > > >>>>> - pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
> > > >>>>> + pte = address_space_ldl_le(cs->as, pte_addr, attrs, &res);
> > > >>>>> } else {
> > > >>>>> - pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
> > > >>>>> + pte = address_space_ldq_le(cs->as, pte_addr, attrs, &res);
> > > >>>>
> > > >>>> Unfortunately, this doesn't work in two ways:
> > > >>>>
> > > >>>> 1. Note pte is used in the following code and that means pte must hold
> > > >>>> a correct value from the
> > > >>>> view of host endian (in my case, big endian not little endian).
> > > >>>> 2. address_space_ldq_le will dispatch to ldq_le_p, while
> > > >>>> address_space_leq will dispatch to ldq_p.
> > > >>>> However, on little endian targets, ldq_p is an alias of ldq_le_p so
> > > >>>> making no effects.
> > > >>>>
> > > >>>> Per my testing, this patch doesn't have any effect indeed. To have a
> > > >>>> brief view what is happening,
> > > >>>> see the logs just before atomic_cmpxchg:
> > > >>>>
> > > >>>> pte_pa 0xf14000000000000 == pte 0x140f ? updated_pte 0x144f
> > > >>>>
> > > >>>>> }
> > > >>>>> ---
> > > >>
> > >
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-04-15 7:41 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-14 3:46 [PATCH v2 0/1] Fix endless translation loop of riscv Ziqiao Kong
2025-04-14 3:46 ` [PATCH v2 1/1] target/riscv: fix endless translation loop on big endian systems Ziqiao Kong
2025-04-14 10:41 ` Philippe Mathieu-Daudé
2025-04-14 11:17 ` Ziqiao Kong
2025-04-14 16:59 ` Ziqiao Kong
2025-04-14 17:38 ` Philippe Mathieu-Daudé
[not found] ` <CAM0BWNBNrjJ6UuF+TRtkuEesLatnY1pzSjyaiPVDeKSMF8no-A@mail.gmail.com>
2025-04-15 7:04 ` Ziqiao Kong
2025-04-15 7:15 ` Philippe Mathieu-Daudé
2025-04-15 7:19 ` Ziqiao Kong
2025-04-15 7:22 ` Ziqiao Kong
2025-04-15 7:40 ` Ziqiao Kong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).