* [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
From: Christophe Lyon @ 2011-01-18 14:34 UTC
To: qemu-devel@nongnu.org

Fix garbage collection of temporaries in Neon emulation.

Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
---
 target-arm/translate.c |   22 +++++++++++++++++-----
 1 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 57664bc..363351e 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4176,6 +4176,18 @@ static inline void gen_neon_mull(TCGv_i64 dest, TCGv a, TCGv b, int size, int u)
         break;
     default: abort();
     }
+
+    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
+       Don't forget to clean them now.  */
+    switch ((size << 1) | u) {
+    case 0:
+    case 1:
+    case 2:
+    case 3:
+        dead_tmp(a);
+        dead_tmp(b);
+        break;
+    }
 }
 
 /* Translate a NEON data processing instruction.  Return nonzero if the
@@ -4840,7 +4852,7 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                         if (size == 3) {
                             tcg_temp_free_i64(tmp64);
                         } else {
-                            dead_tmp(tmp2);
+                            tcg_temp_free_i32(tmp2);
                         }
                     } else if (op == 10) {
                         /* VSHLL */
@@ -5076,8 +5088,6 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                     case 8: case 9: case 10: case 11: case 12: case 13:
                         /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-                        dead_tmp(tmp2);
-                        dead_tmp(tmp);
                         break;
                     case 14: /* Polynomial VMULL */
                         cpu_abort(env, "Polynomial VMULL not implemented");
@@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                             tmp = neon_load_reg(rn, 0);
                         } else {
                             tmp = tmp3;
+                            /* tmp2 has been discarded in
+                               gen_neon_mull during pass 0, we need to
+                               recreate it.  */
+                            tmp2 = neon_get_scalar(size, rm);
                         }
                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-                        dead_tmp(tmp);
                         if (op == 6 || op == 7) {
                             gen_neon_negl(cpu_V0, size);
                         }
@@ -5264,7 +5277,6 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                             neon_store_reg64(cpu_V0, rd + pass);
                         }
 
-                        dead_tmp(tmp2);
 
                         break;
                     default: /* 14 and 15 are RESERVED */
-- 
1.7.2.3

* Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
From: Peter Maydell @ 2011-01-18 15:26 UTC
To: Christophe Lyon
Cc: qemu-devel@nongnu.org

On 18 January 2011 14:34, Christophe Lyon <christophe.lyon@st.com> wrote:
> +
> +    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
> +       Don't forget to clean them now.  */
> +    switch ((size << 1) | u) {
> +    case 0:
> +    case 1:
> +    case 2:
> +    case 3:
> +        dead_tmp(a);
> +        dead_tmp(b);
> +        break;
> +    }
>  }

This seems a rather convoluted way to write "if (size < 2) { ... }"

> @@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env,
> DisasContext *s, uint32_t insn)
>                              tmp = neon_load_reg(rn, 0);
>                          } else {
>                              tmp = tmp3;
> +                            /* tmp2 has been discarded in
> +                               gen_neon_mull during pass 0, we need to
> +                               recreate it.  */
> +                            tmp2 = neon_get_scalar(size, rm);
>                          }

I think this will give the wrong results for instructions where the
scalar operand is in the same Neon register as the destination
for the first pass, because calling neon_get_scalar() again will
do a reload from the Neon register and it might have changed.
(Also loading it once at the start rather than in every pass is
more efficient as well as being correct :-))

Also your patch has hard-coded tabs in it (please see
CODING_STYLE on the subject of whitespace) and your
mail client or server has line-wrapped long lines in the patch
so it doesn't apply cleanly...

-- PMM

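The equivalence claimed in the review can be checked exhaustively: the selector `(size << 1) | u` is below 4 exactly when `size < 2`, provided `u` is a single bit. The following is a minimal sketch of that check; the function names are made up for illustration and are not part of QEMU.

```c
#include <assert.h>

/* Hypothetical stand-in: returns 1 when the switch from the first patch
 * would free the temporaries, 0 otherwise. */
static int frees_with_switch(int size, int u)
{
    switch ((size << 1) | u) {
    case 0:
    case 1:
    case 2:
    case 3:
        return 1;
    default:
        return 0;
    }
}

/* The simpler condition suggested in the review. */
static int frees_with_if(int size, int u)
{
    (void)u; /* the signedness bit does not affect the decision */
    return size < 2;
}

/* Counts the (size, u) pairs on which the two forms disagree,
 * over a 2-bit size field and a 1-bit u flag. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (int size = 0; size <= 3; size++) {
        for (int u = 0; u <= 1; u++) {
            if (frees_with_switch(size, u) != frees_with_if(size, u)) {
                mismatches++;
            }
        }
    }
    return mismatches;
}

/* Aborts if the two forms ever disagree. */
static void check_equivalence(void)
{
    assert(count_mismatches() == 0);
}
```

Calling `check_equivalence()` from a test program returns silently, since the two predicates agree on every input in that range.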
* Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
From: Christophe Lyon @ 2011-01-18 16:58 UTC
To: Peter Maydell, qemu-devel@nongnu.org

On 18.01.2011 16:26, Peter Maydell wrote:
> On 18 January 2011 14:34, Christophe Lyon <christophe.lyon@st.com> wrote:
>> +
>> +    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
>> +       Don't forget to clean them now.  */
>> +    switch ((size << 1) | u) {
>> +    case 0:
>> +    case 1:
>> +    case 2:
>> +    case 3:
>> +        dead_tmp(a);
>> +        dead_tmp(b);
>> +        break;
>> +    }
>>  }
>
> This seems a rather convoluted way to write "if (size < 2) { ... }"
>

It was for consistency/readability with the preceding paragraph.

>> @@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env,
>> DisasContext *s, uint32_t insn)
>>                              tmp = neon_load_reg(rn, 0);
>>                          } else {
>>                              tmp = tmp3;
>> +                            /* tmp2 has been discarded in
>> +                               gen_neon_mull during pass 0, we need to
>> +                               recreate it.  */
>> +                            tmp2 = neon_get_scalar(size, rm);
>>                          }
>
> I think this will give the wrong results for instructions where the
> scalar operand is in the same Neon register as the destination
> for the first pass, because calling neon_get_scalar() again will
> do a reload from the Neon register and it might have changed.
> (Also loading it once at the start rather than in every pass is
> more efficient as well as being correct :-))

I agree it's more efficient, but as the temporary is freed by
gen_neon_mull, how can I make an efficient copy? If we decide not to
free the temporary in gen_mul[us]_i64_i32, we'll have to make sure
clean up is performed correctly in many places.

> Also your patch has hard-coded tabs in it (please see
> CODING_STYLE on the subject of whitespace) and your
> mail client or server has line-wrapped long lines in the patch
> so it doesn't apply cleanly...

Sorry, I know we have some trouble with the mail client or server. Is
it possible to send patches as attachments on this list?

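The aliasing concern quoted above can be modeled outside of QEMU. The sketch below is a toy model, not QEMU code: the `regs` array stands in for the Neon register file, each pass writes one destination element, and `mul_by_scalar` and `second_result` are hypothetical names. When the destination of pass 0 is the register that holds the scalar, rereading the scalar in pass 1 picks up the product that pass 0 just stored, while reading it once before the loop does not.

```c
#include <assert.h>
#include <stdint.h>

#define NREGS 8

/* Toy two-pass multiply by scalar: regs[rd + pass] = regs[rn + pass] * scalar.
 * All registers live in one array, so the destination may overlap the scalar.
 * reload_each_pass != 0 mimics rereading the scalar register in pass 1. */
static void mul_by_scalar(uint32_t *regs, int rd, int rn, int rm,
                          int reload_each_pass)
{
    assert(rd >= 0 && rd + 1 < NREGS);
    assert(rn >= 0 && rn + 1 < NREGS);
    assert(rm >= 0 && rm < NREGS);

    uint32_t scalar = regs[rm];          /* loaded once, before the loop */
    for (int pass = 0; pass < 2; pass++) {
        if (reload_each_pass && pass == 1) {
            scalar = regs[rm];           /* may observe the result of pass 0 */
        }
        regs[rd + pass] = regs[rn + pass] * scalar;
    }
}

/* Runs the model with the scalar held in the first destination register
 * and returns the second result element. */
static uint32_t second_result(int reload_each_pass)
{
    uint32_t regs[NREGS] = {0};
    regs[0] = 3;   /* scalar, and also the destination of pass 0 */
    regs[2] = 5;   /* source element for pass 0 */
    regs[3] = 7;   /* source element for pass 1 */
    mul_by_scalar(regs, 0, 2, 0, reload_each_pass);
    return regs[1];
}
```

With these inputs the load-once variant computes 7 * 3 for the second element, whereas the reload variant computes 7 * 15, because pass 0 has already overwritten the scalar with 5 * 3.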
* Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
From: Christophe Lyon @ 2011-01-19 14:37 UTC
To: Peter Maydell, qemu-devel@nongnu.org

Here is an updated patch which will hopefully not be mangled by my mailer.

Fix garbage collection of temporaries in Neon emulation.

Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
---
 target-arm/translate.c |   18 +++++++++++++-----
 1 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 57664bc..b3e3d70 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4176,6 +4176,13 @@ static inline void gen_neon_mull(TCGv_i64 dest, TCGv a, TCGv b, int size, int u)
         break;
     default: abort();
     }
+
+    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
+       Don't forget to clean them now.  */
+    if (size < 2) {
+        dead_tmp(a);
+        dead_tmp(b);
+    }
 }
 
 /* Translate a NEON data processing instruction.  Return nonzero if the
@@ -4840,7 +4847,7 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                         if (size == 3) {
                             tcg_temp_free_i64(tmp64);
                         } else {
-                            dead_tmp(tmp2);
+                            tcg_temp_free_i32(tmp2);
                         }
                     } else if (op == 10) {
                         /* VSHLL */
@@ -5076,8 +5083,6 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                     case 8: case 9: case 10: case 11: case 12: case 13:
                         /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-                        dead_tmp(tmp2);
-                        dead_tmp(tmp);
                         break;
                     case 14: /* Polynomial VMULL */
                         cpu_abort(env, "Polynomial VMULL not implemented");
@@ -5228,6 +5233,10 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                         return 1;
 
                     tmp2 = neon_get_scalar(size, rm);
+                    /* We need a copy of tmp2 because gen_neon_mull
+                     * deletes it during pass 0.  */
+                    tmp4 = new_tmp();
+                    tcg_gen_mov_i32(tmp4, tmp2);
                     tmp3 = neon_load_reg(rn, 1);
 
                     for (pass = 0; pass < 2; pass++) {
@@ -5235,9 +5244,9 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                             tmp = neon_load_reg(rn, 0);
                         } else {
                             tmp = tmp3;
+                            tmp2 = tmp4;
                         }
                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-                        dead_tmp(tmp);
                         if (op == 6 || op == 7) {
                             gen_neon_negl(cpu_V0, size);
                         }
@@ -5264,7 +5273,6 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                             neon_store_reg64(cpu_V0, rd + pass);
                         }
 
-                        dead_tmp(tmp2);
 
                         break;
                     default: /* 14 and 15 are RESERVED */
-- 
1.7.2.3

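The ownership convention this version of the patch settles on is that gen_neon_mull consumes both operand temporaries, so a caller that needs the scalar in both passes must hold a second copy made before pass 0. The model below is only a sketch of that bookkeeping: `Temp`, `temp_new`, `temp_free`, `consume_mull` and `two_pass_sum` are hypothetical stand-ins rather than the TCG API, and the `live_temps` counter plays the role of a leak check.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of temporaries currently allocated; nonzero at the end means a leak. */
static int live_temps = 0;

typedef struct {
    int value;
} Temp;

static Temp *temp_new(int value)
{
    Temp *t = malloc(sizeof *t);
    if (t == NULL) {
        abort();
    }
    t->value = value;
    live_temps++;
    return t;
}

static void temp_free(Temp *t)
{
    assert(live_temps > 0);
    free(t);
    live_temps--;
}

/* Consumes both operands: the callee frees them, so the caller must not
 * touch a or b after the call. */
static long long consume_mull(Temp *a, Temp *b)
{
    long long product = (long long)a->value * b->value;
    temp_free(a);
    temp_free(b);
    return product;
}

/* Two-pass multiply by scalar. The scalar is copied once before the loop
 * because pass 0 hands the original to consume_mull, which frees it. */
static long long two_pass_sum(int lo, int hi, int scalar_value)
{
    Temp *scalar = temp_new(scalar_value);
    Temp *scalar_copy = temp_new(scalar->value);  /* copy made before pass 0 */
    Temp *hi_elem = temp_new(hi);
    long long sum = 0;

    for (int pass = 0; pass < 2; pass++) {
        Temp *a;
        if (pass == 0) {
            a = temp_new(lo);
        } else {
            a = hi_elem;
            scalar = scalar_copy;   /* the original was freed in pass 0 */
        }
        sum += consume_mull(a, scalar);
    }
    return sum;
}
```

Every temporary created in `two_pass_sum` is handed to `consume_mull` exactly once, so `live_temps` is back to zero when the function returns, and the scalar value comes from the copy rather than from a second register read.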
* Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
From: Peter Maydell @ 2011-01-19 19:12 UTC
To: Christophe Lyon
Cc: qemu-devel@nongnu.org

On 19 January 2011 14:37, Christophe Lyon <christophe.lyon@st.com> wrote:
> Here is an updated patch which will hopefully not be mangled by my mailer.
>
> Fix garbage collection of temporaries in Neon emulation.

I've tested this patch and it does indeed fix the problems
with VMULL and friends (I was seeing assertions/hangs).
I've tested with random instruction sequence generation and
with this patch the non-scalar forms of VMLAL, VMLSL, VQDMLAL,
VQDMLSL, VMULL, VQDMULL now all pass.

The scalar forms now pass random-sequence testing with the
addition of a patch from the qemu-meego tree. Since I have
effectively just tested that meego patch I'll post it to the list
in a moment.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

I would personally prefer slightly less terse commit messages
(for instance it might be nice to list the affected instructions in
this case). The convention is also to preface the summary line
with the file or directory affected, ie "target-arm: Fix garbage
collection of temporaries in Neon emulation".

-- PMM

* Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
From: Christophe Lyon @ 2011-01-20 16:52 UTC
To: Peter Maydell
Cc: qemu-devel@nongnu.org

> I would personally prefer slightly less terse commit messages
> (for instance it might be nice to list the affected instructions in
> this case). The convention is also to preface the summary line
> with the file or directory affected, ie "target-arm: Fix garbage
> collection of temporaries in Neon emulation".
>

OK, so here is the same patch with an updated description:

target-arm: Fix garbage collection of temporaries in Neon emulation
of MULL and friends (VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL)
as well as (VSHRN, VRSHRN, VQSHRN, VQRSHRN).

Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
---
 target-arm/translate.c |   18 +++++++++++++-----
 1 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 57664bc..b3e3d70 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4176,6 +4176,13 @@ static inline void gen_neon_mull(TCGv_i64 dest, TCGv a, TCGv b, int size, int u)
         break;
     default: abort();
     }
+
+    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
+       Don't forget to clean them now.  */
+    if (size < 2) {
+        dead_tmp(a);
+        dead_tmp(b);
+    }
 }
 
 /* Translate a NEON data processing instruction.  Return nonzero if the
@@ -4840,7 +4847,7 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                         if (size == 3) {
                             tcg_temp_free_i64(tmp64);
                         } else {
-                            dead_tmp(tmp2);
+                            tcg_temp_free_i32(tmp2);
                         }
                     } else if (op == 10) {
                         /* VSHLL */
@@ -5076,8 +5083,6 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                     case 8: case 9: case 10: case 11: case 12: case 13:
                         /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-                        dead_tmp(tmp2);
-                        dead_tmp(tmp);
                         break;
                     case 14: /* Polynomial VMULL */
                         cpu_abort(env, "Polynomial VMULL not implemented");
@@ -5228,6 +5233,10 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                         return 1;
 
                     tmp2 = neon_get_scalar(size, rm);
+                    /* We need a copy of tmp2 because gen_neon_mull
+                     * deletes it during pass 0.  */
+                    tmp4 = new_tmp();
+                    tcg_gen_mov_i32(tmp4, tmp2);
                     tmp3 = neon_load_reg(rn, 1);
 
                     for (pass = 0; pass < 2; pass++) {
@@ -5235,9 +5244,9 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                             tmp = neon_load_reg(rn, 0);
                         } else {
                             tmp = tmp3;
+                            tmp2 = tmp4;
                         }
                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-                        dead_tmp(tmp);
                         if (op == 6 || op == 7) {
                             gen_neon_negl(cpu_V0, size);
                         }
@@ -5264,7 +5273,6 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                             neon_store_reg64(cpu_V0, rd + pass);
                         }
 
-                        dead_tmp(tmp2);
 
                         break;
                     default: /* 14 and 15 are RESERVED */
-- 
1.7.2.3

* Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
From: Aurelien Jarno @ 2011-01-26 13:34 UTC
To: Christophe Lyon
Cc: Peter Maydell, qemu-devel@nongnu.org

On Wed, Jan 19, 2011 at 03:37:58PM +0100, Christophe Lyon wrote:
> Here is an updated patch which will hopefully not be mangled by my mailer.
>
> Fix garbage collection of temporaries in Neon emulation.
>
>
> Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
> ---
>  target-arm/translate.c |   18 +++++++++++++-----
>  1 files changed, 13 insertions(+), 5 deletions(-)

Thanks, applied.

> diff --git a/target-arm/translate.c b/target-arm/translate.c
> index 57664bc..b3e3d70 100644
> --- a/target-arm/translate.c
> +++ b/target-arm/translate.c
> @@ -4176,6 +4176,13 @@ static inline void gen_neon_mull(TCGv_i64 dest, TCGv a, TCGv b, int size, int u)
>          break;
>      default: abort();
>      }
> +
> +    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
> +       Don't forget to clean them now.  */
> +    if (size < 2) {
> +        dead_tmp(a);
> +        dead_tmp(b);
> +    }
>  }
>
>  /* Translate a NEON data processing instruction.  Return nonzero if the
> @@ -4840,7 +4847,7 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
>                          if (size == 3) {
>                              tcg_temp_free_i64(tmp64);
>                          } else {
> -                            dead_tmp(tmp2);
> +                            tcg_temp_free_i32(tmp2);
>                          }
>                      } else if (op == 10) {
>                          /* VSHLL */
> @@ -5076,8 +5083,6 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
>                      case 8: case 9: case 10: case 11: case 12: case 13:
>                          /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL, VQDMULL */
>                          gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
> -                        dead_tmp(tmp2);
> -                        dead_tmp(tmp);
>                          break;
>                      case 14: /* Polynomial VMULL */
>                          cpu_abort(env, "Polynomial VMULL not implemented");
> @@ -5228,6 +5233,10 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
>                          return 1;
>
>                      tmp2 = neon_get_scalar(size, rm);
> +                    /* We need a copy of tmp2 because gen_neon_mull
> +                     * deletes it during pass 0.  */
> +                    tmp4 = new_tmp();
> +                    tcg_gen_mov_i32(tmp4, tmp2);
>                      tmp3 = neon_load_reg(rn, 1);
>
>                      for (pass = 0; pass < 2; pass++) {
> @@ -5235,9 +5244,9 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
>                              tmp = neon_load_reg(rn, 0);
>                          } else {
>                              tmp = tmp3;
> +                            tmp2 = tmp4;
>                          }
>                          gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
> -                        dead_tmp(tmp);
>                          if (op == 6 || op == 7) {
>                              gen_neon_negl(cpu_V0, size);
>                          }
> @@ -5264,7 +5273,6 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
>                              neon_store_reg64(cpu_V0, rd + pass);
>                          }
>
> -                        dead_tmp(tmp2);
>
>                          break;
>                      default: /* 14 and 15 are RESERVED */
> --
> 1.7.2.3
>
>
>

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
From: Peter Maydell @ 2011-01-18 15:36 UTC
To: Christophe Lyon
Cc: qemu-devel@nongnu.org

Incidentally there are some correctness fixes for the multiply-by-scalar
neon insns from the qemu-meego tree which are on my list to push
upstream. So you probably aren't getting the right results even if
you've managed to shut up qemu's warnings :-)

-- PMM

* Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
From: Christophe Lyon @ 2011-01-18 17:00 UTC
To: Peter Maydell
Cc: qemu-devel@nongnu.org

On 18.01.2011 16:36, Peter Maydell wrote:
> Incidentally there are some correctness fixes for the multiply-by-scalar
> neon insns from the qemu-meego tree which are on my list to push
> upstream. So you probably aren't getting the right results even if
> you've managed to shut up qemu's warnings :-)
>

Actually it did not only shut up qemu's warnings. It was asserting.
After fixing the asserts, it did warn a lot about resource leakage
indeed, which I tried to fix with this patch.

And yes I can confirm there are many other wrong results in the Neon
support, which I am currently fixing.

Christophe.

* Re: [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation.
From: Peter Maydell @ 2011-01-18 17:09 UTC
To: Christophe Lyon
Cc: qemu-devel@nongnu.org

On 18 January 2011 17:00, Christophe Lyon <christophe.lyon@st.com> wrote:
> On 18.01.2011 16:36, Peter Maydell wrote:
>> Incidentally there are some correctness fixes for the multiply-by-scalar
>> neon insns from the qemu-meego tree which are on my list to push
>> upstream. So you probably aren't getting the right results even if
>> you've managed to shut up qemu's warnings :-)

> And yes I can confirm there are many other wrong results in the
> Neon support, which I am currently fixing.

Please coordinate this with me! I have a big pile of fixes which
I am working through, testing and submitting upstream, so you
are in significant danger of duplicating work, which would be
unfortunate.

-- PMM

Thread overview: 10+ messages
2011-01-18 14:34 [Qemu-devel] [PATCH] target-arm: Fix garbage collection of temporaries in Neon emulation Christophe Lyon
2011-01-18 15:26 ` Peter Maydell
2011-01-18 16:58   ` Christophe Lyon
2011-01-19 14:37     ` Christophe Lyon
2011-01-19 19:12       ` Peter Maydell
2011-01-20 16:52         ` Christophe Lyon
2011-01-26 13:34       ` Aurelien Jarno
2011-01-18 15:36 ` Peter Maydell
2011-01-18 17:00   ` Christophe Lyon
2011-01-18 17:09     ` Peter Maydell