* [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option
@ 2012-11-27 16:21 Paolo Bonzini
2012-11-27 16:24 ` Alexander Graf
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Paolo Bonzini @ 2012-11-27 16:21 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, kraxel
Some versions of GCC require insane (>2GB) amounts of memory to compile
translate.o. As a countermeasure, disable the culprit optimization pass.
This should fix the buildbot failure for default_x86_64_fedora16.
It is a good thing to do anyway, because people will try to compile 1.3 with
less than 2GB of memory and complain.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
Makefile.target | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Makefile.target b/Makefile.target
index 8b658c0..d38bb58 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -143,6 +143,12 @@ GENERATED_HEADERS += hmp-commands.h qmp-commands-old.h
endif # CONFIG_SOFTMMU
+# Workaround for http://gcc.gnu.org/PR55489. Happens with -fPIE/-fPIC
+# and large functions that use global variables. The bug is in all
+# releases of GCC, but it became particularly acute in 4.7.x. We
+# should be able to delete this at the end of 2013.
+%/translate.o: QEMU_CFLAGS += -fno-gcse
+
nested-vars += obj-y
# This resolves all nested paths, so it must come last
--
1.8.0
* Re: [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option
2012-11-27 16:21 [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option Paolo Bonzini
@ 2012-11-27 16:24 ` Alexander Graf
2012-11-27 16:30 ` Paolo Bonzini
2012-11-28 2:01 ` 陳韋任 (Wei-Ren Chen)
2012-11-28 10:47 ` Andreas Färber
2 siblings, 1 reply; 8+ messages in thread
From: Alexander Graf @ 2012-11-27 16:24 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: peter.maydell, qemu-devel, kraxel
On 11/27/2012 05:21 PM, Paolo Bonzini wrote:
> Some versions of GCC require insane (>2GB) amounts of memory to compile
> translate.o. As a countermeasure, disable the culprit optimization pass.
> This should fix the buildbot failure for default_x86_64_fedora16.
> It is a good thing to do anyway, because people will try to compile 1.3 with
> less than 2GB of memory and complain.
>
> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
> ---
> Makefile.target | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/Makefile.target b/Makefile.target
> index 8b658c0..d38bb58 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -143,6 +143,12 @@ GENERATED_HEADERS += hmp-commands.h qmp-commands-old.h
>
> endif # CONFIG_SOFTMMU
>
> +# Workaround for http://gcc.gnu.org/PR55489. Happens with -fPIE/-fPIC
> +# and large functions that use global variables. The bug is in all
> +# releases of GCC, but it became particularly acute in 4.7.x. We
> +# should be able to delete this at the end of 2013.
Can we add a version check for gcc here?
Alex
> +%/translate.o: QEMU_CFLAGS += -fno-gcse
> +
> nested-vars += obj-y
>
> # This resolves all nested paths, so it must come last
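A version check of the kind asked about above could look roughly like the
sketch below. It is only an illustration, not part of the patch: GCC_VERSION
is a hypothetical variable name, and restricting the workaround to 4.7.x is
an assumption based on the comment's remark that the bug is most acute there.
Since the same comment says the bug is in all GCC releases, such a narrow
check would leave other versions without the workaround.

```make
# Hypothetical: ask the compiler for its version string, e.g. "4.7.2".
GCC_VERSION := $(shell $(CC) -dumpversion)

# Apply the workaround only when the version string starts with "4.7.".
ifneq ($(filter 4.7.%,$(GCC_VERSION)),)
%/translate.o: QEMU_CFLAGS += -fno-gcse
endif
```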
* Re: [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option
2012-11-27 16:24 ` Alexander Graf
@ 2012-11-27 16:30 ` Paolo Bonzini
2012-11-27 18:17 ` Stefan Weil
0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2012-11-27 16:30 UTC (permalink / raw)
To: Alexander Graf; +Cc: peter.maydell, qemu-devel, kraxel
On 27/11/2012 17:24, Alexander Graf wrote:
>>
>> +# Workaround for http://gcc.gnu.org/PR55489. Happens with -fPIE/-fPIC
>> +# and large functions that use global variables. The bug is in all
>> +# releases of GCC, but it became particularly acute in 4.7.x. We
>> +# should be able to delete this at the end of 2013.
>
> Can we add a version check for gcc here?
I don't think it is useful unless somebody finds that the patch gives
substantially worse TCG performance.
Paolo
* Re: [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option
2012-11-27 16:30 ` Paolo Bonzini
@ 2012-11-27 18:17 ` Stefan Weil
2012-11-28 7:29 ` Paolo Bonzini
0 siblings, 1 reply; 8+ messages in thread
From: Stefan Weil @ 2012-11-27 18:17 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: peter.maydell, kraxel, Alexander Graf, qemu-devel
On 27.11.2012 17:30, Paolo Bonzini wrote:
> On 27/11/2012 17:24, Alexander Graf wrote:
>>> +# Workaround for http://gcc.gnu.org/PR55489. Happens with -fPIE/-fPIC
>>> +# and large functions that use global variables. The bug is in all
>>> +# releases of GCC, but it became particularly acute in 4.7.x. We
>>> +# should be able to delete this at the end of 2013.
>> Can we add a version check for gcc here?
> I don't think it is useful unless somebody finds that the patch gives
> substantially worse TCG performance.
>
> Paolo
>
Hi Paolo,
The latest native MinGW-w64 uses gcc 4.7.2 for w64. It compiles */translate.c
without needing too much RAM.
In a short test, I compiled target-ppc/translate.c without and with -fno-gcse.
This compiler option increases compilation speed a little and creates a smaller
binary (tested with Debian amd64-mingw32msvc-gcc 4.4.4):
standard
  time: user 0m31.966s
  size:    text    data     bss     dec     hex filename
        1113760   70216    1376 1185352  121648 target-ppc/translate.o
with -fno-gcse
  time: user 0m30.542s
  size:    text    data     bss     dec     hex filename
        1111056   70216    1376 1182648  120bb8 target-ppc/translate.o
To summarize, -fno-gcse is not needed for MinGW, but I don't expect
that it would do any harm there.
A real problem could arise from compilers which don't support -fno-gcse.
As this option is not checked for compatibility in configure, such compilers
would no longer work with unmodified QEMU sources. clang obviously
supports -fno-gcse, so maybe we don't have a real problem currently.
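If compilers lacking the option ever became a concern, the flag could be
probed before it is used. A minimal sketch, assuming a hypothetical helper
named try-cflag (not an existing QEMU macro), a hypothetical variable
NO_GCSE_FLAG, and a compiler driver that accepts the GCC-style options -S,
-x c and -o:

```make
# Hypothetical helper: expands to the flag $(1) if $(CC) accepts it,
# and to nothing otherwise. -Werror turns "unknown option" warnings
# into failures; /dev/null serves as an empty C source file.
try-cflag = $(shell $(CC) -Werror $(1) -S -x c /dev/null -o /dev/null >/dev/null 2>&1 && echo $(1))

# Probe once at parse time, then reuse the cached result.
NO_GCSE_FLAG := $(call try-cflag,-fno-gcse)

%/translate.o: QEMU_CFLAGS += $(NO_GCSE_FLAG)
```

With a compiler that rejects the flag, NO_GCSE_FLAG is empty and the build
proceeds without the workaround instead of failing.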
For the buildbot machines, "configure --enable-debug" would solve the
OOM problem, dramatically reduce compilation time, add some
compile time checks for TCG, reduce CO2 emission, ... For most buildbots,
--enable-debug would be a good choice. There are some kinds of errors
which compilers only detect during their optimization pass, so some
buildbots should still run without --enable-debug.
Do we need -fno-gcse for all */translate.c or only for some of them?
The problem with gcc using large quantities of RAM for those files
is not new (it was a good RAM tester on a defective PC some time
ago for me). I think it is caused by huge switch statements
in those files. Splitting those switch statements might also help.
If the memory needed grows with n * n (n = number of case statements
in one switch statement), then splitting a switch statement in two
would reduce the memory needed from 2 GiB to 0.5 GiB.
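Spelling out that estimate, under the assumption stated above that memory
grows as c * n^2 for a switch with n cases (the constant c is introduced here
only for illustration):

```latex
M(n) = c\,n^2, \qquad
M\!\left(\tfrac{n}{2}\right) = c\,\frac{n^2}{4} = \frac{M(n)}{4}
\quad\Longrightarrow\quad
2\,\mathrm{GiB} \;\to\; 0.5\,\mathrm{GiB} \text{ per half.}
```

The 0.5 GiB figure is the peak for one half at a time. If the memory for both
halves were live at once, the total would be 2 * M(n)/4 = M(n)/2, that is
1 GiB.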
Regards
Stefan
* Re: [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option
2012-11-27 16:21 [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option Paolo Bonzini
2012-11-27 16:24 ` Alexander Graf
@ 2012-11-28 2:01 ` 陳韋任 (Wei-Ren Chen)
2012-11-28 2:34 ` Wenchao Xia
2012-11-28 10:47 ` Andreas Färber
2 siblings, 1 reply; 8+ messages in thread
From: 陳韋任 (Wei-Ren Chen) @ 2012-11-28 2:01 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: peter.maydell, qemu-devel, kraxel
On Tue, Nov 27, 2012 at 05:21:03PM +0100, Paolo Bonzini wrote:
> Some versions of GCC require insane (>2GB) amounts of memory to compile
> translate.o. As a countermeasure, disable the culprit optimization pass.
> This should fix the buildbot failure for default_x86_64_fedora16.
> It is a good thing to do anyway, because people will try to compile 1.3 with
> less than 2GB of memory and complain.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> Makefile.target | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/Makefile.target b/Makefile.target
> index 8b658c0..d38bb58 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -143,6 +143,12 @@ GENERATED_HEADERS += hmp-commands.h qmp-commands-old.h
>
> endif # CONFIG_SOFTMMU
>
> +# Workaround for http://gcc.gnu.org/PR55489. Happens with -fPIE/-fPIC
> +# and large functions that use global variables. The bug is in all
> +# releases of GCC, but it became particularly acute in 4.7.x. We
> +# should be able to delete this at the end of 2013.
> +%/translate.o: QEMU_CFLAGS += -fno-gcse
> +
> nested-vars += obj-y
>
> # This resolves all nested paths, so it must come last
No objection here. But will we remove this option once GCC fixes this PR,
or just leave it there? If we're going to remove it in the future, we had
better keep a note in the release changelog or somewhere else.
Regards,
chenwj
--
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj
* Re: [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option
2012-11-28 2:01 ` 陳韋任 (Wei-Ren Chen)
@ 2012-11-28 2:34 ` Wenchao Xia
0 siblings, 0 replies; 8+ messages in thread
From: Wenchao Xia @ 2012-11-28 2:34 UTC (permalink / raw)
To: "陳韋任 (Wei-Ren Chen)"
Cc: Paolo Bonzini, kraxel, qemu-devel, peter.maydell
On 2012-11-28 10:01, 陳韋任 (Wei-Ren Chen) wrote:
> On Tue, Nov 27, 2012 at 05:21:03PM +0100, Paolo Bonzini wrote:
>> Some versions of GCC require insane (>2GB) amounts of memory to compile
>> translate.o. As a countermeasure, disable the culprit optimization pass.
>> This should fix the buildbot failure for default_x86_64_fedora16.
>> It is a good thing to do anyway, because people will try to compile 1.3 with
>> less than 2GB of memory and complain.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>> Makefile.target | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/Makefile.target b/Makefile.target
>> index 8b658c0..d38bb58 100644
>> --- a/Makefile.target
>> +++ b/Makefile.target
>> @@ -143,6 +143,12 @@ GENERATED_HEADERS += hmp-commands.h qmp-commands-old.h
>>
>> endif # CONFIG_SOFTMMU
>>
>> +# Workaround for http://gcc.gnu.org/PR55489. Happens with -fPIE/-fPIC
>> +# and large functions that use global variables. The bug is in all
>> +# releases of GCC, but it became particularly acute in 4.7.x. We
>> +# should be able to delete this at the end of 2013.
>> +%/translate.o: QEMU_CFLAGS += -fno-gcse
>> +
>> nested-vars += obj-y
>>
>> # This resolves all nested paths, so it must come last
>
> No objection here. But will we remove this option once GCC fixes this PR,
> or just leave it there? If we're going to remove it in the future, we had
> better keep a note in the release changelog or somewhere else.
>
> Regards,
> chenwj
>
+1, I think -fno-gcse is good enough to fix the problem quickly,
and a note would be nice to remind people that translate.o is now
compiled with a special flag. If someone runs into a problem with a
particular compiler, or into a performance issue, they can find the
reason quickly.
--
Best Regards
Wenchao Xia
* Re: [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option
2012-11-27 18:17 ` Stefan Weil
@ 2012-11-28 7:29 ` Paolo Bonzini
0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2012-11-28 7:29 UTC (permalink / raw)
To: Stefan Weil; +Cc: qemu-devel, peter.maydell, kraxel, Alexander Graf
On 27/11/2012 19:17, Stefan Weil wrote:
> A real problem could arise from compilers which don't support -fno-gcse.
It was introduced in GCC 3.0.
> As this option is not checked for compatibility in configure, such
> compilers would no longer work with unmodified QEMU sources. clang
> obviously supports -fno-gcse, so maybe we don't have a real problem
> currently.
Yes.
> For the buildbot machines, "configure --enable-debug" would solve the
> OOM problem, dramatically reduce compilation time, add some
> compile time checks for TCG, reduce CO2 emission, ... For most buildbots,
> --enable-debug would be a good choice. There are some kinds of errors
> which compilers only detect during their optimization pass, so some
> buildbots should still run without --enable-debug.
No, --enable-debug is not a solution. Fixing GCC bugs, or working
around them if possible/useful, is.
> Do we need -fno-gcse for all */translate.c or only for some of them?
Intel is an order of magnitude worse than the others; however, all of
the translate.c are potentially susceptible to this problem. It happens
when you have -fPIE or -fPIC, and largish functions that access a lot of
globals. translate.c tends to use tcg_ctx, and to inline almost
everything into disas_insn... hence the problem.
Intel is the worst, but SPARC also requires 300MB for GCSE. PPC is
special: it "only" needs 55MB for GCSE, but 150MB for inlining and
similarly for other passes---more than other targets. I put "only" in
quotes because even Intel with a patched GCC requires only 1.5MB for
GCSE, and without sacrificing any optimization.
> I think it is caused by huge switch statements
> in those files.
GCC can handle much worse control flow. Over the years, the developers
got really fiendish testcases, mostly template-heavy C++ code or
computer-generated. These testcases have a single huge program in a
single function, and are "interesting" to say the least.
In this case the memory needed is indeed quadratic, but (roughly) in the
number of globals that are accessed in the function. GCC uses a garbage
collector, but it runs it only between optimization passes in general;
usually it doesn't find that much garbage. In this case, GCSE produces
hundreds of MB of garbage. Fixing the bug is just a matter of moving
some invariant stuff out of an inner loop (interestingly it doesn't save
much computation time, only memory).
Paolo
> Splitting those switch statements might also help.
> If the memory needed grows with n * n (n = number of case statements
> in one switch statement), then splitting a switch statement in two
> would reduce the memory needed from 2 GiB to 0.5 GiB.
>
> Regards
> Stefan
>
>
>
* Re: [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option
2012-11-27 16:21 [Qemu-devel] [PATCH v2 1.3] build: compile translate.o with -fno-gcse option Paolo Bonzini
2012-11-27 16:24 ` Alexander Graf
2012-11-28 2:01 ` 陳韋任 (Wei-Ren Chen)
@ 2012-11-28 10:47 ` Andreas Färber
2 siblings, 0 replies; 8+ messages in thread
From: Andreas Färber @ 2012-11-28 10:47 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: peter.maydell, Bruce Rogers, qemu-devel, kraxel
On 27.11.2012 17:21, Paolo Bonzini wrote:
> Some versions of GCC require insane (>2GB) amounts of memory to compile
> translate.o. As a countermeasure, disable the culprit optimization pass.
> This should fix the buildbot failure for default_x86_64_fedora16.
> It is a good thing to do anyway, because people will try to compile 1.3 with
> less than 2GB of memory and complain.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Andreas Färber <afaerber@suse.de>
Our builds survived the night now, will test v3 next.
Andreas
--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg