* Re: [Qemu-devel] TCG assertion with qemu-system-mipsel
       [not found] <51293E4A.1040100@weilnetz.de>
@ 2013-03-04 16:37 ` Aurélien Jarno
  2013-03-04 20:29   ` Stefan Weil
  2013-03-05 14:18   ` Aurélien Jarno
  0 siblings, 2 replies; 9+ messages in thread
From: Aurélien Jarno @ 2013-03-04 16:37 UTC (permalink / raw)
  To: Stefan Weil; +Cc: qemu-devel, Richard Henderson

Hi,

On Sat, Feb 23, 2013 at 11:10:18PM +0100, Stefan Weil wrote:
> This assertion occurred with latest git master:
>
> qemu-system-mipsel: /src/qemu/tcg/tcg-op.h:2589:
>  tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask & (1 << idx))
> == 0' failed.
> Aborted
>
> QEMU was built with --enable-debug and running a Debian MIPS Lenny (NFS
> root).
> The assertion happened when running "apt-get update" in the guest.
>

Is it reproducible, or more or less random? Did you Cc: Richard because
it's related to the latest patches?

On my side I am experiencing random segfaults in various guests (at
least PowerPC, MIPS, SH4 and ARM). I have found a way to bisect it, even
if it takes quite long (building Perl + the testsuite). Currently I know
that 1.3 is affected, while 1.2 is not.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 9+ messages in thread
* Re: [Qemu-devel] TCG assertion with qemu-system-mipsel 2013-03-04 16:37 ` [Qemu-devel] TCG assertion with qemu-system-mipsel Aurélien Jarno @ 2013-03-04 20:29 ` Stefan Weil 2013-03-05 14:18 ` Aurélien Jarno 1 sibling, 0 replies; 9+ messages in thread From: Stefan Weil @ 2013-03-04 20:29 UTC (permalink / raw) To: Aurélien Jarno; +Cc: qemu-devel, Richard Henderson Am 04.03.2013 17:37, schrieb Aurélien Jarno: > Hi, > > On Sat, Feb 23, 2013 at 11:10:18PM +0100, Stefan Weil wrote: >> This assertion occured with latest git master: >> >> qemu-system-mipsel: /src/qemu/tcg/tcg-op.h:2589: >> tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask & (1 << idx)) >> == 0' failed. >> Aborted >> >> QEMU was built with --enable-debug and running a Debian MIPS Lenny (NFS >> root). >> The assertion happened when running "apt-get update" in the guest. >> > Is it something reproductible or more or less random? Have you Cc:ed > Richard because it's related to the latest patches? > > On my side I am experiencing random segfaults in various guests (at > least PowerPC, MIPS, SH4 and ARM). I have found a way to bisect it, even > if it is quite long (building Perl + the testsuite). Currently I know > that 1.3 is affected, while 1.2 is not. Hi, either it is fixed now or it is not reproducible: I tried it twice on Feb. 23 and got the same assertion in both cases, but now I tried it with the current git master and had no problem. Sorry that I cannot run more tests or a bisect - I'm currently rather busy with other work. Stefan ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] TCG assertion with qemu-system-mipsel 2013-03-04 16:37 ` [Qemu-devel] TCG assertion with qemu-system-mipsel Aurélien Jarno 2013-03-04 20:29 ` Stefan Weil @ 2013-03-05 14:18 ` Aurélien Jarno 2013-03-06 2:05 ` Yeongkyoon Lee 1 sibling, 1 reply; 9+ messages in thread From: Aurélien Jarno @ 2013-03-05 14:18 UTC (permalink / raw) To: Stefan Weil; +Cc: Blue Swirl, Yeongkyoon Lee, qemu-devel, Richard Henderson On Mon, Mar 04, 2013 at 05:37:31PM +0100, Aurélien Jarno wrote: > Hi, > > On Sat, Feb 23, 2013 at 11:10:18PM +0100, Stefan Weil wrote: > > This assertion occured with latest git master: > > > > qemu-system-mipsel: /src/qemu/tcg/tcg-op.h:2589: > > tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask & (1 << idx)) > > == 0' failed. > > Aborted > > > > QEMU was built with --enable-debug and running a Debian MIPS Lenny (NFS > > root). > > The assertion happened when running "apt-get update" in the guest. > > > > Is it something reproductible or more or less random? Have you Cc:ed > Richard because it's related to the latest patches? > > On my side I am experiencing random segfaults in various guests (at > least PowerPC, MIPS, SH4 and ARM). I have found a way to bisect it, even > if it is quite long (building Perl + the testsuite). Currently I know > that 1.3 is affected, while 1.2 is not. > I have found that the issue comes from the following commits, which unfortunately are not bisectable one by one (though it won't change the results a lot): commit b76f0d8c2e3eac94bc7fd90a510cb7426b2a2699 Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> Date: Wed Oct 31 16:04:25 2012 +0900 tcg: Optimize qemu_ld/st by generating slow paths at the end of a block Add optimized TCG qemu_ld/st generation which locates the code of TLB miss cases at the end of a block after generating the other IRs. Currently, this optimization supports only i386 and x86_64 hosts. 
Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> commit fdbb84d1332ae0827d60f1a2ca03c7d5678c6edd Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> Date: Wed Oct 31 16:04:24 2012 +0900 tcg: Add extended GETPC mechanism for MMU helpers with ldst optimization Add GETPC_EXT which is used by MMU helpers to selectively calculate the code address of accessing guest memory when called from a qemu_ld/st optimized code or a C function. Currently, it supports only i386 and x86-64 hosts. Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> commit 32761257c0b9fa7ee04d2871a6e48a41f119c469 Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> Date: Wed Oct 31 16:04:23 2012 +0900 configure: Add CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization Enable CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization only when a host is i386 or x86_64. Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> I will try to understand why. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurelien@aurel32.net http://www.aurel32.net ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] TCG assertion with qemu-system-mipsel 2013-03-05 14:18 ` Aurélien Jarno @ 2013-03-06 2:05 ` Yeongkyoon Lee 2013-03-06 6:10 ` Aurélien Jarno 0 siblings, 1 reply; 9+ messages in thread From: Yeongkyoon Lee @ 2013-03-06 2:05 UTC (permalink / raw) To: Aurélien Jarno Cc: Blue Swirl, Stefan Weil, qemu-devel, Richard Henderson On 03/05/2013 11:18 PM, Aurélien Jarno wrote: > On Mon, Mar 04, 2013 at 05:37:31PM +0100, Aurélien Jarno wrote: >> Hi, >> >> On Sat, Feb 23, 2013 at 11:10:18PM +0100, Stefan Weil wrote: >>> This assertion occured with latest git master: >>> >>> qemu-system-mipsel: /src/qemu/tcg/tcg-op.h:2589: >>> tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask & (1 << idx)) >>> == 0' failed. >>> Aborted >>> >>> QEMU was built with --enable-debug and running a Debian MIPS Lenny (NFS >>> root). >>> The assertion happened when running "apt-get update" in the guest. >>> >> Is it something reproductible or more or less random? Have you Cc:ed >> Richard because it's related to the latest patches? >> >> On my side I am experiencing random segfaults in various guests (at >> least PowerPC, MIPS, SH4 and ARM). I have found a way to bisect it, even >> if it is quite long (building Perl + the testsuite). Currently I know >> that 1.3 is affected, while 1.2 is not. >> > I have found that the issue comes from the following commits, which > unfortunately are not bisectable one by one (though it won't change the > results a lot): > > commit b76f0d8c2e3eac94bc7fd90a510cb7426b2a2699 > Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > Date: Wed Oct 31 16:04:25 2012 +0900 > > tcg: Optimize qemu_ld/st by generating slow paths at the end of a block > > Add optimized TCG qemu_ld/st generation which locates the code of TLB miss > cases at the end of a block after generating the other IRs. > Currently, this optimization supports only i386 and x86_64 hosts. 
> > Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > > commit fdbb84d1332ae0827d60f1a2ca03c7d5678c6edd > Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > Date: Wed Oct 31 16:04:24 2012 +0900 > > tcg: Add extended GETPC mechanism for MMU helpers with ldst optimization > > Add GETPC_EXT which is used by MMU helpers to selectively calculate the code > address of accessing guest memory when called from a qemu_ld/st optimized code > or a C function. Currently, it supports only i386 and x86-64 hosts. > > Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > > commit 32761257c0b9fa7ee04d2871a6e48a41f119c469 > Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > Date: Wed Oct 31 16:04:23 2012 +0900 > > configure: Add CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization > > Enable CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization only when > a host is i386 or x86_64. > > Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > > I will try to understand why. > > Hi Aurélien, Do you mean that those random segfaults occurred only when configured with "--enable-debug"? Although I cannot see how my commits affect debug built image at a glance, I'll do double-check. Thanks. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] TCG assertion with qemu-system-mipsel 2013-03-06 2:05 ` Yeongkyoon Lee @ 2013-03-06 6:10 ` Aurélien Jarno 2013-03-17 22:27 ` [Qemu-devel] TCG broken in system mode (was TCG assertion with qemu-system-mipsel) Aurélien Jarno 0 siblings, 1 reply; 9+ messages in thread From: Aurélien Jarno @ 2013-03-06 6:10 UTC (permalink / raw) To: Yeongkyoon Lee; +Cc: Blue Swirl, Stefan Weil, qemu-devel, Richard Henderson On Wed, Mar 06, 2013 at 11:05:15AM +0900, Yeongkyoon Lee wrote: > On 03/05/2013 11:18 PM, Aurélien Jarno wrote: > >On Mon, Mar 04, 2013 at 05:37:31PM +0100, Aurélien Jarno wrote: > >>Hi, > >> > >>On Sat, Feb 23, 2013 at 11:10:18PM +0100, Stefan Weil wrote: > >>>This assertion occured with latest git master: > >>> > >>>qemu-system-mipsel: /src/qemu/tcg/tcg-op.h:2589: > >>> tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask & (1 << idx)) > >>>== 0' failed. > >>>Aborted > >>> > >>>QEMU was built with --enable-debug and running a Debian MIPS Lenny (NFS > >>>root). > >>>The assertion happened when running "apt-get update" in the guest. > >>> > >>Is it something reproductible or more or less random? Have you Cc:ed > >>Richard because it's related to the latest patches? > >> > >>On my side I am experiencing random segfaults in various guests (at > >>least PowerPC, MIPS, SH4 and ARM). I have found a way to bisect it, even > >>if it is quite long (building Perl + the testsuite). Currently I know > >>that 1.3 is affected, while 1.2 is not. 
> >> > >I have found that the issue comes from the following commits, which > >unfortunately are not bisectable one by one (though it won't change the > >results a lot): > > > > commit b76f0d8c2e3eac94bc7fd90a510cb7426b2a2699 > > Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > Date: Wed Oct 31 16:04:25 2012 +0900 > > tcg: Optimize qemu_ld/st by generating slow paths at the end of a block > > Add optimized TCG qemu_ld/st generation which locates the code of TLB miss > > cases at the end of a block after generating the other IRs. > > Currently, this optimization supports only i386 and x86_64 hosts. > > Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > > commit fdbb84d1332ae0827d60f1a2ca03c7d5678c6edd > > Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > Date: Wed Oct 31 16:04:24 2012 +0900 > > tcg: Add extended GETPC mechanism for MMU helpers with ldst optimization > > Add GETPC_EXT which is used by MMU helpers to selectively calculate the code > > address of accessing guest memory when called from a qemu_ld/st optimized code > > or a C function. Currently, it supports only i386 and x86-64 hosts. > > Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > > commit 32761257c0b9fa7ee04d2871a6e48a41f119c469 > > Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > Date: Wed Oct 31 16:04:23 2012 +0900 > > configure: Add CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization > > Enable CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization only when > > a host is i386 or x86_64. > > Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > > > >I will try to understand why. > > > > > > Hi Aurélien, > Do you mean that those random segfaults occurred only when > configured with "--enable-debug"? 
> Although I cannot see how my commits affect debug built image at a
> glance, I'll do double-check.
> Thanks.

The problem is there even without configuring QEMU with --enable-debug.
It just doesn't happen very often, and only at random. The only way to
reproduce it each time is to launch a big task in the guest (for me,
building Perl) and see if it completes or not. It can take up to one
hour until it happens.

I should clarify that the segfault is on the guest side.

I have tried to look at your patches, and so far I haven't found the
issue. It seems the first two patches are fine, i.e. I have verified the
return address is always correctly computed.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 9+ messages in thread
* Re: [Qemu-devel] TCG broken in system mode (was TCG assertion with qemu-system-mipsel) 2013-03-06 6:10 ` Aurélien Jarno @ 2013-03-17 22:27 ` Aurélien Jarno 2013-03-21 7:04 ` Yeongkyoon Lee 0 siblings, 1 reply; 9+ messages in thread From: Aurélien Jarno @ 2013-03-17 22:27 UTC (permalink / raw) To: Yeongkyoon Lee; +Cc: Blue Swirl, Stefan Weil, qemu-devel, Richard Henderson On Wed, Mar 06, 2013 at 07:10:17AM +0100, Aurélien Jarno wrote: > On Wed, Mar 06, 2013 at 11:05:15AM +0900, Yeongkyoon Lee wrote: > > On 03/05/2013 11:18 PM, Aurélien Jarno wrote: > > >On Mon, Mar 04, 2013 at 05:37:31PM +0100, Aurélien Jarno wrote: > > >>Hi, > > >> > > >>On Sat, Feb 23, 2013 at 11:10:18PM +0100, Stefan Weil wrote: > > >>>This assertion occured with latest git master: > > >>> > > >>>qemu-system-mipsel: /src/qemu/tcg/tcg-op.h:2589: > > >>> tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask & (1 << idx)) > > >>>== 0' failed. > > >>>Aborted > > >>> > > >>>QEMU was built with --enable-debug and running a Debian MIPS Lenny (NFS > > >>>root). > > >>>The assertion happened when running "apt-get update" in the guest. > > >>> > > >>Is it something reproductible or more or less random? Have you Cc:ed > > >>Richard because it's related to the latest patches? > > >> > > >>On my side I am experiencing random segfaults in various guests (at > > >>least PowerPC, MIPS, SH4 and ARM). I have found a way to bisect it, even > > >>if it is quite long (building Perl + the testsuite). Currently I know > > >>that 1.3 is affected, while 1.2 is not. 
> > >> > > >I have found that the issue comes from the following commits, which > > >unfortunately are not bisectable one by one (though it won't change the > > >results a lot): > > > > > > commit b76f0d8c2e3eac94bc7fd90a510cb7426b2a2699 > > > Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > > Date: Wed Oct 31 16:04:25 2012 +0900 > > > tcg: Optimize qemu_ld/st by generating slow paths at the end of a block > > > Add optimized TCG qemu_ld/st generation which locates the code of TLB miss > > > cases at the end of a block after generating the other IRs. > > > Currently, this optimization supports only i386 and x86_64 hosts. > > > Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > > Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > > > commit fdbb84d1332ae0827d60f1a2ca03c7d5678c6edd > > > Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > > Date: Wed Oct 31 16:04:24 2012 +0900 > > > tcg: Add extended GETPC mechanism for MMU helpers with ldst optimization > > > Add GETPC_EXT which is used by MMU helpers to selectively calculate the code > > > address of accessing guest memory when called from a qemu_ld/st optimized code > > > or a C function. Currently, it supports only i386 and x86-64 hosts. > > > Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > > Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > > > commit 32761257c0b9fa7ee04d2871a6e48a41f119c469 > > > Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > > Date: Wed Oct 31 16:04:23 2012 +0900 > > > configure: Add CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization > > > Enable CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization only when > > > a host is i386 or x86_64. > > > Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > > > Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > > > > > >I will try to understand why. 
> > >
> > >
> >
> > Hi Aurélien,
> > Do you mean that those random segfaults occurred only when
> > configured with "--enable-debug"?
> > Although I cannot see how my commits affect debug built image at a
> > glance, I'll do double-check.
> > Thanks.
>
> The problem is there even without configuring QEMU with --enable-debug.
> It just doesn't happen very often, and only at random. The only way to
> reproduce it each time is to launch a big task in the guest (for me,
> building Perl) and see if it completes or not. It can take up to one
> hour until it happens.
>
> I should clarify that the segfault is on the guest side.
>
> I have tried to look at your patches, and so far I haven't found the
> issue. It seems the first two patches are fine, i.e. I have verified the
> return address is always correctly computed.
>

I still haven't found the issue, but on the other hand I can't find any
problem in your code, after reading it dozens of times. I also tried
modifying it as little as possible while emitting the slow path back
inside the TB, and that fixes the problem. So it really looks to be due
to the slow path being at the end of the TB, and not to a bug in the code
generating it. After adding various checks, I am also convinced the
address computed in GETPC_EXT() is always correct. I have to say I am
running out of ideas.

One way to reproduce the issue more easily is to reduce the size of the
generated code buffer, for example by setting both
MIN_CODE_GEN_BUFFER_SIZE and MAX_CODE_GEN_BUFFER_SIZE to 512kB in
translate-all.c. That way booting an ARM guest triggers plenty of
segmentation faults or other strange issues with your patch but not
without.

OTOH increasing this size makes the issue almost disappear, even when
building Perl including the testsuite (for that it has to be at least
512MB).

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 9+ messages in thread
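The reproduction trick described above can be sketched as follows. This is only a minimal model, assuming that the requested buffer size is clamped between MIN_CODE_GEN_BUFFER_SIZE and MAX_CODE_GEN_BUFFER_SIZE; the helper name clamp_code_gen_buffer_size() is hypothetical and is not taken from QEMU.

```c
#include <stddef.h>

/* Debugging values from the message above: both limits pinned to 512kB.
 * In a real tree these macros live in translate-all.c. */
#define MIN_CODE_GEN_BUFFER_SIZE (512u * 1024u)
#define MAX_CODE_GEN_BUFFER_SIZE (512u * 1024u)

/* Hypothetical model of the clamping: whatever size is requested,
 * the result ends up between the two limits. */
size_t clamp_code_gen_buffer_size(size_t requested)
{
    if (requested < MIN_CODE_GEN_BUFFER_SIZE) {
        requested = MIN_CODE_GEN_BUFFER_SIZE;
    }
    if (requested > MAX_CODE_GEN_BUFFER_SIZE) {
        requested = MAX_CODE_GEN_BUFFER_SIZE;
    }
    return requested;
}
```

With both limits equal, every request collapses to the same small buffer, so the buffer fills up and is flushed far more often than with the default size.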
* Re: [Qemu-devel] TCG broken in system mode (was TCG assertion with qemu-system-mipsel) 2013-03-17 22:27 ` [Qemu-devel] TCG broken in system mode (was TCG assertion with qemu-system-mipsel) Aurélien Jarno @ 2013-03-21 7:04 ` Yeongkyoon Lee 2013-03-21 22:11 ` Aurélien Jarno 0 siblings, 1 reply; 9+ messages in thread From: Yeongkyoon Lee @ 2013-03-21 7:04 UTC (permalink / raw) To: Aurélien Jarno Cc: Blue Swirl, Stefan Weil, qemu-devel, Richard Henderson On 03/18/2013 07:27 AM, Aurélien Jarno wrote: > On Wed, Mar 06, 2013 at 07:10:17AM +0100, Aurélien Jarno wrote: >> On Wed, Mar 06, 2013 at 11:05:15AM +0900, Yeongkyoon Lee wrote: >>> On 03/05/2013 11:18 PM, Aurélien Jarno wrote: >>>> On Mon, Mar 04, 2013 at 05:37:31PM +0100, Aurélien Jarno wrote: >>>>> Hi, >>>>> >>>>> On Sat, Feb 23, 2013 at 11:10:18PM +0100, Stefan Weil wrote: >>>>>> This assertion occured with latest git master: >>>>>> >>>>>> qemu-system-mipsel: /src/qemu/tcg/tcg-op.h:2589: >>>>>> tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask & (1 << idx)) >>>>>> == 0' failed. >>>>>> Aborted >>>>>> >>>>>> QEMU was built with --enable-debug and running a Debian MIPS Lenny (NFS >>>>>> root). >>>>>> The assertion happened when running "apt-get update" in the guest. >>>>>> >>>>> Is it something reproductible or more or less random? Have you Cc:ed >>>>> Richard because it's related to the latest patches? >>>>> >>>>> On my side I am experiencing random segfaults in various guests (at >>>>> least PowerPC, MIPS, SH4 and ARM). I have found a way to bisect it, even >>>>> if it is quite long (building Perl + the testsuite). Currently I know >>>>> that 1.3 is affected, while 1.2 is not. 
>>>>> >>>> I have found that the issue comes from the following commits, which >>>> unfortunately are not bisectable one by one (though it won't change the >>>> results a lot): >>>> >>>> commit b76f0d8c2e3eac94bc7fd90a510cb7426b2a2699 >>>> Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> >>>> Date: Wed Oct 31 16:04:25 2012 +0900 >>>> tcg: Optimize qemu_ld/st by generating slow paths at the end of a block >>>> Add optimized TCG qemu_ld/st generation which locates the code of TLB miss >>>> cases at the end of a block after generating the other IRs. >>>> Currently, this optimization supports only i386 and x86_64 hosts. >>>> Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> >>>> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> >>>> commit fdbb84d1332ae0827d60f1a2ca03c7d5678c6edd >>>> Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> >>>> Date: Wed Oct 31 16:04:24 2012 +0900 >>>> tcg: Add extended GETPC mechanism for MMU helpers with ldst optimization >>>> Add GETPC_EXT which is used by MMU helpers to selectively calculate the code >>>> address of accessing guest memory when called from a qemu_ld/st optimized code >>>> or a C function. Currently, it supports only i386 and x86-64 hosts. >>>> Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> >>>> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> >>>> commit 32761257c0b9fa7ee04d2871a6e48a41f119c469 >>>> Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> >>>> Date: Wed Oct 31 16:04:23 2012 +0900 >>>> configure: Add CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization >>>> Enable CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization only when >>>> a host is i386 or x86_64. >>>> Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> >>>> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> >>>> >>>> I will try to understand why. >>>> >>>> >>> Hi Aurélien, >>> Do you mean that those random segfaults occurred only when >>> configured with "--enable-debug"? 
>>> Although I cannot see how my commits affect debug built image at a >>> glance, I'll do double-check. >>> Thanks. >> The problem is there even without configuring QEMU with --enable-debug. >> It justs doesn't happens very often, and very randomly. The only way to >> reproduce it each time is to launch a big task in the guest (for me >> building Perl) and see if it completes or now. It can take up to one >> hour until it happens. >> >> I should precise that the segfault is on the guest side. >> >> I have tried to look at your patches, and so far I haven't found the >> issue. It seems the two first patches are fine, ie I have verified the >> return address is always correctly computed. >> > I still haven't found the issue, but on the other hand I can't find any > problem in your code, after reading it dozen of times. I also tried to > modify it as less as possible while issuing the slow path back inside > the TB and it fixes the problem. So it really looks like to be due to > the slow path being at the end of the TB, and not to a bug in the code > generating it. After adding various checks, I am also convinced the > address computed in GETPC_EXT() is always correct. I have to say I am > running out of ideas. > > One way to reproduce the issue more easily is to reduce the size of the > generated code buffer, for example by setting it to 512kB for both > MIN_CODE_GEN_BUFFER_SIZE and MAX_CODE_GEN_BUFFER_SIZE in > translate-all.c. That way booting an ARM guest triggers plenty of > segmentation faults or other strange issues with your patch but not > without. > > OTOH increasing this size make the issue to almost disappear even when > building perl including the testsuite (for that it has to be at least > 512MB). > Although I've not succeeded to reproduce the problem, I've found a suspicious code stub about boundary-checking of generated code (is_tcg_gen_code() in translate-all.c). 
The code should be changed as follows.

Before:
    return (tc_ptr >= (uintptr_t)tcg_ctx.code_gen_buffer &&
            tc_ptr < (uintptr_t)(tcg_ctx.code_gen_buffer +
                    tcg_ctx.code_gen_buffer_max_size));
After:
    return (tc_ptr >= (uintptr_t)tcg_ctx.code_gen_buffer &&
            tc_ptr < (uintptr_t)(tcg_ctx.code_gen_buffer +
                    tcg_ctx.code_gen_buffer_size));

The reason is that the current check can miss generated code that lies
in the last (TCG_MAX_OP_SIZE * OPC_BUF_SIZE) bytes of the buffer.
See code_gen_alloc() in translate-all.c:
    tcg_ctx.code_gen_buffer_max_size = tcg_ctx.code_gen_buffer_size
        - (TCG_MAX_OP_SIZE * OPC_BUF_SIZE);

Aurélien and Stefan,
Could you please test this and report the result? I'm not able to
reproduce this problem, though I followed Aurélien's reproduction steps.

^ permalink raw reply	[flat|nested] 9+ messages in thread
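The difference between the two checks in the message above can be shown with a small standalone model. This is a sketch only: struct code_buf is a reduced stand-in for the three tcg_ctx fields involved, the function names are hypothetical, and the two constant values are illustrative assumptions rather than values taken from the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only (assumption); QEMU defines the real ones. */
#define TCG_MAX_OP_SIZE 192
#define OPC_BUF_SIZE    640

/* Reduced stand-in for the tcg_ctx fields used by the range check. */
struct code_buf {
    uint8_t *code_gen_buffer;
    size_t code_gen_buffer_size;     /* full size of the buffer */
    size_t code_gen_buffer_max_size; /* full size minus one worst-case TB */
};

void code_buf_init(struct code_buf *cb, uint8_t *mem, size_t size)
{
    cb->code_gen_buffer = mem;
    cb->code_gen_buffer_size = size;
    cb->code_gen_buffer_max_size =
        size - (size_t)TCG_MAX_OP_SIZE * OPC_BUF_SIZE;
}

/* Check before the change: the upper bound is the flush threshold. */
bool in_gen_code_before(const struct code_buf *cb, uintptr_t tc_ptr)
{
    uintptr_t start = (uintptr_t)cb->code_gen_buffer;
    return tc_ptr >= start && tc_ptr < start + cb->code_gen_buffer_max_size;
}

/* Check after the change: the upper bound is the real end of the buffer. */
bool in_gen_code_after(const struct code_buf *cb, uintptr_t tc_ptr)
{
    uintptr_t start = (uintptr_t)cb->code_gen_buffer;
    return tc_ptr >= start && tc_ptr < start + cb->code_gen_buffer_size;
}
```

An address just past code_gen_buffer_max_size but still inside the buffer is rejected by the first function and accepted by the second, which is the case the proposed change is meant to cover.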
* Re: [Qemu-devel] TCG broken in system mode (was TCG assertion with qemu-system-mipsel) 2013-03-21 7:04 ` Yeongkyoon Lee @ 2013-03-21 22:11 ` Aurélien Jarno 2013-03-22 1:48 ` Yeongkyoon Lee 0 siblings, 1 reply; 9+ messages in thread From: Aurélien Jarno @ 2013-03-21 22:11 UTC (permalink / raw) To: Yeongkyoon Lee; +Cc: Blue Swirl, Stefan Weil, qemu-devel, Richard Henderson On Thu, Mar 21, 2013 at 04:04:44PM +0900, Yeongkyoon Lee wrote: > On 03/18/2013 07:27 AM, Aurélien Jarno wrote: > >On Wed, Mar 06, 2013 at 07:10:17AM +0100, Aurélien Jarno wrote: > >>On Wed, Mar 06, 2013 at 11:05:15AM +0900, Yeongkyoon Lee wrote: > >>>On 03/05/2013 11:18 PM, Aurélien Jarno wrote: > >>>>On Mon, Mar 04, 2013 at 05:37:31PM +0100, Aurélien Jarno wrote: > >>>>>Hi, > >>>>> > >>>>>On Sat, Feb 23, 2013 at 11:10:18PM +0100, Stefan Weil wrote: > >>>>>>This assertion occured with latest git master: > >>>>>> > >>>>>>qemu-system-mipsel: /src/qemu/tcg/tcg-op.h:2589: > >>>>>> tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask & (1 << idx)) > >>>>>>== 0' failed. > >>>>>>Aborted > >>>>>> > >>>>>>QEMU was built with --enable-debug and running a Debian MIPS Lenny (NFS > >>>>>>root). > >>>>>>The assertion happened when running "apt-get update" in the guest. > >>>>>> > >>>>>Is it something reproductible or more or less random? Have you Cc:ed > >>>>>Richard because it's related to the latest patches? > >>>>> > >>>>>On my side I am experiencing random segfaults in various guests (at > >>>>>least PowerPC, MIPS, SH4 and ARM). I have found a way to bisect it, even > >>>>>if it is quite long (building Perl + the testsuite). Currently I know > >>>>>that 1.3 is affected, while 1.2 is not. 
> >>>>> > >>>>I have found that the issue comes from the following commits, which > >>>>unfortunately are not bisectable one by one (though it won't change the > >>>>results a lot): > >>>> > >>>> commit b76f0d8c2e3eac94bc7fd90a510cb7426b2a2699 > >>>> Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > >>>> Date: Wed Oct 31 16:04:25 2012 +0900 > >>>> tcg: Optimize qemu_ld/st by generating slow paths at the end of a block > >>>> Add optimized TCG qemu_ld/st generation which locates the code of TLB miss > >>>> cases at the end of a block after generating the other IRs. > >>>> Currently, this optimization supports only i386 and x86_64 hosts. > >>>> Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > >>>> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > >>>> commit fdbb84d1332ae0827d60f1a2ca03c7d5678c6edd > >>>> Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > >>>> Date: Wed Oct 31 16:04:24 2012 +0900 > >>>> tcg: Add extended GETPC mechanism for MMU helpers with ldst optimization > >>>> Add GETPC_EXT which is used by MMU helpers to selectively calculate the code > >>>> address of accessing guest memory when called from a qemu_ld/st optimized code > >>>> or a C function. Currently, it supports only i386 and x86-64 hosts. > >>>> Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > >>>> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > >>>> commit 32761257c0b9fa7ee04d2871a6e48a41f119c469 > >>>> Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > >>>> Date: Wed Oct 31 16:04:23 2012 +0900 > >>>> configure: Add CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization > >>>> Enable CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization only when > >>>> a host is i386 or x86_64. > >>>> Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> > >>>> Signed-off-by: Blue Swirl <blauwirbel@gmail.com> > >>>> > >>>>I will try to understand why. 
> >>>> > >>>> > >>>Hi Aurélien, > >>>Do you mean that those random segfaults occurred only when > >>>configured with "--enable-debug"? > >>>Although I cannot see how my commits affect debug built image at a > >>>glance, I'll do double-check. > >>>Thanks. > >>The problem is there even without configuring QEMU with --enable-debug. > >>It justs doesn't happens very often, and very randomly. The only way to > >>reproduce it each time is to launch a big task in the guest (for me > >>building Perl) and see if it completes or now. It can take up to one > >>hour until it happens. > >> > >>I should precise that the segfault is on the guest side. > >> > >>I have tried to look at your patches, and so far I haven't found the > >>issue. It seems the two first patches are fine, ie I have verified the > >>return address is always correctly computed. > >> > >I still haven't found the issue, but on the other hand I can't find any > >problem in your code, after reading it dozen of times. I also tried to > >modify it as less as possible while issuing the slow path back inside > >the TB and it fixes the problem. So it really looks like to be due to > >the slow path being at the end of the TB, and not to a bug in the code > >generating it. After adding various checks, I am also convinced the > >address computed in GETPC_EXT() is always correct. I have to say I am > >running out of ideas. > > > >One way to reproduce the issue more easily is to reduce the size of the > >generated code buffer, for example by setting it to 512kB for both > >MIN_CODE_GEN_BUFFER_SIZE and MAX_CODE_GEN_BUFFER_SIZE in > >translate-all.c. That way booting an ARM guest triggers plenty of > >segmentation faults or other strange issues with your patch but not > >without. > > > >OTOH increasing this size make the issue to almost disappear even when > >building perl including the testsuite (for that it has to be at least > >512MB). 
> >
>
> Although I've not succeeded to reproduce the problem, I've found a
> suspicious code stub about boundary-checking of generated code
> (is_tcg_gen_code() in translate-all.c).
>
> The code should be changed as follows.
> Before:
>     return (tc_ptr >= (uintptr_t)tcg_ctx.code_gen_buffer &&
>             tc_ptr < (uintptr_t)(tcg_ctx.code_gen_buffer +
>                     tcg_ctx.code_gen_buffer_max_size));
> After:
>     return (tc_ptr >= (uintptr_t)tcg_ctx.code_gen_buffer &&
>             tc_ptr < (uintptr_t)(tcg_ctx.code_gen_buffer +
>                     tcg_ctx.code_gen_buffer_size));
>
> The reason is that the current check can miss generated code that lies
> in the last (TCG_MAX_OP_SIZE * OPC_BUF_SIZE) bytes of the buffer.
> See code_gen_alloc() in translate-all.c:
>     tcg_ctx.code_gen_buffer_max_size = tcg_ctx.code_gen_buffer_size
>         - (TCG_MAX_OP_SIZE * OPC_BUF_SIZE);
>

Very good catch! Thanks. This fixes the issue I observed.

To give more details, code_gen_buffer_max_size corresponds to the
threshold at which all TBs are cleared before code generation continues.
This means that it can be exceeded by anywhere from a few bytes up to
(TCG_MAX_OP_SIZE * OPC_BUF_SIZE) bytes, which corresponds to the maximum
size of a generated TB.

Could you please send a proper patch to fix that? I think it should also
be fixed in the next 1.3.x and 1.4.x releases (1.2.x releases are not
affected), so please Cc: qemu-stable (even if the patch will have to be
slightly tweaked).

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 9+ messages in thread
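The threshold behaviour explained above can be illustrated with a toy model. It is a sketch under one stated assumption: the "buffer full" test is made only before a TB is emitted, never while it is being written. The function name and parameters are hypothetical and do not correspond to QEMU code.

```c
#include <stddef.h>

/* Toy model: emit tb_count blocks of tb_size bytes each into a buffer
 * whose flush threshold is max_size. The threshold is tested only
 * before each block (assumption), so the write position can end up
 * past max_size by up to one block. Returns the highest offset reached. */
size_t highest_offset_reached(size_t max_size, size_t tb_size,
                              unsigned tb_count)
{
    size_t ptr = 0;
    size_t highest = 0;

    for (unsigned i = 0; i < tb_count; i++) {
        if (ptr >= max_size) {
            ptr = 0;            /* flush: all blocks are discarded */
        }
        ptr += tb_size;         /* emit one block */
        if (ptr > highest) {
            highest = ptr;
        }
    }
    return highest;
}
```

For example, with a threshold of 1000 bytes and blocks of 300 bytes, the fourth block starts at offset 900, which is still below the threshold, and ends at offset 1200. Any range check that stops at the threshold would not recognise that last block as generated code.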
* Re: [Qemu-devel] TCG broken in system mode (was TCG assertion with qemu-system-mipsel)
  2013-03-21 22:11             ` Aurélien Jarno
@ 2013-03-22  1:48               ` Yeongkyoon Lee
  0 siblings, 0 replies; 9+ messages in thread
From: Yeongkyoon Lee @ 2013-03-22  1:48 UTC
  To: Aurélien Jarno
  Cc: Blue Swirl, Stefan Weil, qemu-devel, Richard Henderson

On 03/22/2013 07:11 AM, Aurélien Jarno wrote:
> On Thu, Mar 21, 2013 at 04:04:44PM +0900, Yeongkyoon Lee wrote:
>> On 03/18/2013 07:27 AM, Aurélien Jarno wrote:
>>> On Wed, Mar 06, 2013 at 07:10:17AM +0100, Aurélien Jarno wrote:
>>>> On Wed, Mar 06, 2013 at 11:05:15AM +0900, Yeongkyoon Lee wrote:
>>>>> On 03/05/2013 11:18 PM, Aurélien Jarno wrote:
>>>>>> On Mon, Mar 04, 2013 at 05:37:31PM +0100, Aurélien Jarno wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On Sat, Feb 23, 2013 at 11:10:18PM +0100, Stefan Weil wrote:
>>>>>>>> This assertion occurred with latest git master:
>>>>>>>>
>>>>>>>> qemu-system-mipsel: /src/qemu/tcg/tcg-op.h:2589:
>>>>>>>> tcg_gen_goto_tb: Assertion `(tcg_ctx.goto_tb_issue_mask & (1 << idx))
>>>>>>>> == 0' failed.
>>>>>>>> Aborted
>>>>>>>>
>>>>>>>> QEMU was built with --enable-debug and running a Debian MIPS Lenny (NFS
>>>>>>>> root).
>>>>>>>> The assertion happened when running "apt-get update" in the guest.
>>>>>>>>
>>>>>>> Is it something reproducible or more or less random? Have you Cc:ed
>>>>>>> Richard because it's related to the latest patches?
>>>>>>>
>>>>>>> On my side I am experiencing random segfaults in various guests (at
>>>>>>> least PowerPC, MIPS, SH4 and ARM). I have found a way to bisect it, even
>>>>>>> if it is quite long (building Perl + the testsuite). Currently I know
>>>>>>> that 1.3 is affected, while 1.2 is not.
>>>>>>>
>>>>>> I have found that the issue comes from the following commits, which
>>>>>> unfortunately are not bisectable one by one (though it won't change the
>>>>>> results a lot):
>>>>>>
>>>>>> commit b76f0d8c2e3eac94bc7fd90a510cb7426b2a2699
>>>>>> Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
>>>>>> Date: Wed Oct 31 16:04:25 2012 +0900
>>>>>>     tcg: Optimize qemu_ld/st by generating slow paths at the end of a block
>>>>>>     Add optimized TCG qemu_ld/st generation which locates the code of TLB miss
>>>>>>     cases at the end of a block after generating the other IRs.
>>>>>>     Currently, this optimization supports only i386 and x86_64 hosts.
>>>>>>     Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
>>>>>>     Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
>>>>>> commit fdbb84d1332ae0827d60f1a2ca03c7d5678c6edd
>>>>>> Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
>>>>>> Date: Wed Oct 31 16:04:24 2012 +0900
>>>>>>     tcg: Add extended GETPC mechanism for MMU helpers with ldst optimization
>>>>>>     Add GETPC_EXT which is used by MMU helpers to selectively calculate the code
>>>>>>     address of accessing guest memory when called from a qemu_ld/st optimized code
>>>>>>     or a C function. Currently, it supports only i386 and x86-64 hosts.
>>>>>>     Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
>>>>>>     Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
>>>>>> commit 32761257c0b9fa7ee04d2871a6e48a41f119c469
>>>>>> Author: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
>>>>>> Date: Wed Oct 31 16:04:23 2012 +0900
>>>>>>     configure: Add CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization
>>>>>>     Enable CONFIG_QEMU_LDST_OPTIMIZATION for TCG qemu_ld/st optimization only when
>>>>>>     a host is i386 or x86_64.
>>>>>>     Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
>>>>>>     Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
>>>>>>
>>>>>> I will try to understand why.
>>>>>>
>>>>>>
>>>>> Hi Aurélien,
>>>>> Do you mean that those random segfaults occurred only when
>>>>> configured with "--enable-debug"?
>>>>> Although I cannot see at a glance how my commits affect a debug
>>>>> build, I'll double-check.
>>>>> Thanks.
>>>> The problem is there even without configuring QEMU with --enable-debug.
>>>> It just doesn't happen very often, and only randomly. The only way to
>>>> reproduce it each time is to launch a big task in the guest (for me
>>>> building Perl) and see if it completes or not. It can take up to one
>>>> hour until it happens.
>>>>
>>>> I should clarify that the segfault is on the guest side.
>>>>
>>>> I have tried to look at your patches, and so far I haven't found the
>>>> issue. It seems the first two patches are fine, i.e. I have verified the
>>>> return address is always correctly computed.
>>>>
>>> I still haven't found the issue, but on the other hand I can't find any
>>> problem in your code, after reading it dozens of times. I also tried to
>>> modify it as little as possible while emitting the slow path back inside
>>> the TB, and that fixes the problem. So it really looks like it is due to
>>> the slow path being at the end of the TB, and not to a bug in the code
>>> generating it. After adding various checks, I am also convinced the
>>> address computed in GETPC_EXT() is always correct. I have to say I am
>>> running out of ideas.
>>>
>>> One way to reproduce the issue more easily is to reduce the size of the
>>> generated code buffer, for example by setting it to 512kB for both
>>> MIN_CODE_GEN_BUFFER_SIZE and MAX_CODE_GEN_BUFFER_SIZE in
>>> translate-all.c. That way booting an ARM guest triggers plenty of
>>> segmentation faults or other strange issues with your patch but not
>>> without.
>>>
>>> OTOH increasing this size makes the issue almost disappear, even when
>>> building perl including the testsuite (for that it has to be at least
>>> 512MB).
>>>
>> Although I have not managed to reproduce the problem, I've found a
>> suspicious piece of code in the boundary check for generated code
>> (is_tcg_gen_code() in translate-all.c).
>>
>> The code is supposed to be changed as follows.
>> Before:
>>     return (tc_ptr >= (uintptr_t)tcg_ctx.code_gen_buffer &&
>>             tc_ptr < (uintptr_t)(tcg_ctx.code_gen_buffer +
>>                     tcg_ctx.code_gen_buffer_max_size));
>> After:
>>     return (tc_ptr >= (uintptr_t)tcg_ctx.code_gen_buffer &&
>>             tc_ptr < (uintptr_t)(tcg_ctx.code_gen_buffer +
>>                     tcg_ctx.code_gen_buffer_size));
>>
>> The reason is that the current check can miss generated code in the
>> last "(TCG_MAX_OP_SIZE * OPC_BUF_SIZE)" bytes of the buffer.
>> See code_gen_alloc() in translate-all.c:
>>     tcg_ctx.code_gen_buffer_max_size = tcg_ctx.code_gen_buffer_size
>>                     - (TCG_MAX_OP_SIZE * OPC_BUF_SIZE);
>>
> Very good catch! Thanks. This fixes the issue I observed.
>
> To give more details, code_gen_buffer_max_size is the threshold beyond
> which all TBs are cleared before code generation continues. This means
> that it can be exceeded by anything from a few bytes up to
> (TCG_MAX_OP_SIZE * OPC_BUF_SIZE) bytes, which is the maximum size of a
> generated TB.
>
> Could you please send a proper patch to fix that? I think it should also
> be fixed in the next 1.3.x and 1.4.x releases (1.2.x releases are not
> affected), so please Cc: qemu-stable (even if the patch will have to be
> slightly tweaked).
>

Sure, I'll send the patch.
Thanks.
Thread overview: 9+ messages
     [not found] <51293E4A.1040100@weilnetz.de>
2013-03-04 16:37 ` [Qemu-devel] TCG assertion with qemu-system-mipsel Aurélien Jarno
2013-03-04 20:29   ` Stefan Weil
2013-03-05 14:18   ` Aurélien Jarno
2013-03-06  2:05     ` Yeongkyoon Lee
2013-03-06  6:10       ` Aurélien Jarno
2013-03-17 22:27         ` [Qemu-devel] TCG broken in system mode (was TCG assertion with qemu-system-mipsel) Aurélien Jarno
2013-03-21  7:04           ` Yeongkyoon Lee
2013-03-21 22:11             ` Aurélien Jarno
2013-03-22  1:48               ` Yeongkyoon Lee