* [Linux-ia64] gcc won't inline function returning struct?
@ 2001-10-31 21:39 Bdale Garbee
2001-10-31 22:23 ` Jim Wilson
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Bdale Garbee @ 2001-10-31 21:39 UTC (permalink / raw)
To: linux-ia64
I'm playing around with Thomas Sailer's user space soundcard modems for use
on amateur radio links. One of the things he includes is a small library of
complex math functions intended to be used inline. The way this is coded is
to create a .h file with the gcc "magic" use of extern and inline, like so:
    typedef struct {
        float re, im;
    } complex;

    /*
     * Complex multiplication.
     */
    extern __inline__ complex cmul(complex x, complex y)
    {
        complex z;
        z.re = x.re * y.re - x.im * y.im;
        z.im = x.re * y.im + x.im * y.re;
        return z;
    }
A couple of C source files in the same directory include the header containing
this code, and are aggregated into a .a that an application in an adjacent
directory links against later in the build process.
On my Pentium laptop running Debian with a 2.95.4 compiler, this builds and
runs fine. On my Itanium system running Debian with everyone's favorite 2.96
plus lots of patches, the above function fails to inline with the complaint
    complex.h: In function `cmul':
    complex.h:14: warning: inline functions not supported for this return value type
when I add a -Winline. I get the same complaint with gcc-3.0. The effect of
this is that since the function isn't inlined it's left as an extern, and the
linker can't find it when it tries to link the application against this lib
later in the build.
So, my question. Why does the ia64 gcc not handle this when the i386 gcc
likes it just fine? That's way beyond my toolchain knowledge right now...
Bdale
* Re: [Linux-ia64] gcc won't inline function returning struct?
From: Jim Wilson @ 2001-10-31 22:23 UTC (permalink / raw)
To: linux-ia64
The IA-64 ABI says that structures of floats are passed/returned decomposed
into floating point registers. The ABI calls them homogeneous floating-point
aggregates, or HFA for short. This also applies to complex types. Thus your
structure
    typedef struct {
        float re, im;
    } complex;
is handled by putting RE in one FP register, and IM in the next. This is
not normal practice, since the structure is 8 bytes, but ends up using 16
bytes worth of register (ignoring long double to simplify the discussion).
This requires special code to decompose/compose HFA arguments and return
values on IA-64 when loading/storing them. IA-32 does not use this convention,
and thus does not need special code for HFAs.
Because of the old design of the C front end, this special code is problematic.
The C front end generates low level code first, including code to compose/
decompose HFAs, and then tries to do function inlining. When we inline a
function, we have to optimize away the code that composes/decomposes HFAs,
and this is so difficult that in practice it isn't worthwhile to try. Thus
we cannot inline a function that uses an HFA argument or return value.
The C++ front end uses a more recent design that inlines first, and then generates
low level code including the HFA compose/decompose code. If you compile your
example as C++ code, it will work.
Work is underway to rewrite the C front end to make it work more like the C++
front end, or perhaps even just use the C++ front end for C. When this work
gets far enough, inlining of HFA functions will work in C. I just tried your
example with the current FSF development sources, and it did work, so I think
this is fixed as of Alexandre Oliva's 2001-10-05 gcc changes to the C front
end. I don't know how well it is working at the moment though. However,
I would expect it to be working fine by the time gcc 3.1 comes out in spring of
2002.
Another consideration here is that the IL (Intermediate Language) used by
gcc has no support for representing decomposed structures. If we did,
then we could get much better optimization of structures by separately
optimizing every structure field as if it was a scalar. But we don't,
so the only way we can handle decomposed structures as arguments is to
decompose them before the call, and then recompose them in the function
prologue. This is pretty inefficient, but it does work. Fixing this will
be a lot of work, and it will likely be a while before anyone tries.
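The decompose-before-the-call, recompose-in-the-prologue scheme described above can be modeled in plain C. This is only a conceptual sketch: the names cmul_decomposed and cmul_via_hfa are hypothetical, and the scalar parameters merely stand in for FP registers. It is not what gcc actually emits.

```c
typedef struct {
    float re, im;
} complex;

/* Hypothetical callee as the ABI "sees" it: each field of each HFA
 * argument arrives as its own scalar (one FP register apiece). */
static complex cmul_decomposed(float x_re, float x_im, float y_re, float y_im)
{
    /* "Prologue": recompose the structures from the scalars. */
    complex x = { x_re, x_im };
    complex y = { y_re, y_im };
    complex z;

    z.re = x.re * y.re - x.im * y.im;
    z.im = x.re * y.im + x.im * y.re;
    return z;
}

/* Hypothetical caller side: decompose the structures before the call. */
static complex cmul_via_hfa(complex x, complex y)
{
    return cmul_decomposed(x.re, x.im, y.re, y.im);
}
```

When such a call is inlined after this code has already been generated, the decompose and recompose steps on either side of the call boundary are what would have to be optimized away.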
Jim
* Re: [Linux-ia64] gcc won't inline function returning struct?
From: n0ano @ 2001-10-31 23:02 UTC (permalink / raw)
To: linux-ia64
Jim-
Just out of idle curiosity, what would happen if the `complex'
structure were changed to something like:
    typedef struct {
        float re, im;
        int dummy;
    } complex;
Since this is no longer an HFA, would this kick the compiler into
a mode where the code would at least work, albeit not in the most
efficient manner?
--
Don Dugger
"Censeo Toto nos in Kansa esse decisse." - D. Gale
n0ano@indstorage.com
Ph: 303/652-0870x117
* Re: [Linux-ia64] gcc won't inline function returning struct?
From: Jim Wilson @ 2001-11-01 0:27 UTC (permalink / raw)
To: linux-ia64
    typedef struct {
        float re, im;
        int dummy;
    } complex;
I tried this, and it works.
>Since this is no longer an HFA, would this kick the compiler into
>a mode where the code would at least work, albeit not in the most
>efficient manner?
Just to clarify, the code that gcc emits is correct. The testcase can fail
only if the "extern inline" feature is misused. I assumed that was the
case without explicitly mentioning it. Thus my recommended solution would be
to stop using extern inline, or else use it correctly. I can understand that
this might be inconvenient, and that you might want to keep the current
unsafe uses of extern inline.
extern inline means emit this function inline if you can, otherwise emit
nothing. Since gcc makes no promise that it will inline any function, it
is inherently unsafe to put extern inline in a C file. There is no guarantee
that it will work.
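A minimal sketch of one way to stop using extern inline, assuming static __inline__ is an acceptable replacement (the thread does not name one). With static, the compiler falls back to a file-local out-of-line copy whenever it declines to inline a call, so the link cannot fail for lack of a definition. The cost is a possible duplicate copy in each object file that includes the header.

```c
/* complex.h (sketch) */
#ifndef COMPLEX_H
#define COMPLEX_H

typedef struct {
    float re, im;
} complex;

/*
 * Complex multiplication.
 * static: if gcc does not inline a call, it uses a file-local copy of
 * the function instead of leaving an unresolved external reference.
 */
static __inline__ complex cmul(complex x, complex y)
{
    complex z;
    z.re = x.re * y.re - x.im * y.im;
    z.im = x.re * y.im + x.im * y.re;
    return z;
}

#endif /* COMPLEX_H */
```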
There are some programs that do this for functions that the IA-32 compiler
happens to inline, but which the IA-64 compiler does not happen to inline.
This always gets reported as an IA-64 gcc "bug", but really it isn't. It
is programmer error; extern inline has been used incorrectly.
A correct use of extern inline is how glibc uses it. It puts extern inline
in header files, and static functions in C files linked into libc/libm.
If gcc can inline the function, then you get the fast inline version from
the header file. If gcc cannot inline the function, then you get the slow
static version from libc/libm.
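The header-plus-library arrangement described above might look roughly like this for cmul. It is a sketch under assumptions, not glibc's actual code: the two "files" are shown in one listing, and the gnu_inline attribute (which postdates this thread) is used so that newer gcc versions, which default to C99 inline semantics, still treat extern inline in the traditional GNU way. The out-of-line copy is written without the static keyword so that the linker can resolve calls to it from other object files.

```c
/* ---- complex.h (sketch): inline-only definition ---- */
typedef struct {
    float re, im;
} complex;

/*
 * Traditional GNU extern inline: use this body for inlining if possible,
 * emit nothing for it otherwise.
 */
extern __inline__ __attribute__((__gnu_inline__))
complex cmul(complex x, complex y)
{
    complex z;
    z.re = x.re * y.re - x.im * y.im;
    z.im = x.re * y.im + x.im * y.re;
    return z;
}

/* ---- complex.c (sketch): out-of-line definition built into the .a ---- */
/* In a real source tree this file would #include "complex.h". */

/* Same body without extern/inline: this is the copy the linker finds
 * whenever a call to cmul was not inlined. */
complex cmul(complex x, complex y)
{
    complex z;
    z.re = x.re * y.re - x.im * y.im;
    z.im = x.re * y.im + x.im * y.re;
    return z;
}
```

With this layout the program links whether or not the compiler inlines cmul, which is what the IA-32 and IA-64 builds disagreed about.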
Jim
* Re: [Linux-ia64] gcc won't inline function returning struct?
From: Matthew Wilcox @ 2001-11-01 0:40 UTC (permalink / raw)
To: linux-ia64
On Wed, Oct 31, 2001 at 04:27:17PM -0800, Jim Wilson wrote:
> extern inline means emit this function inline if you can, otherwise emit
> nothing. Since gcc makes no promise that it will inline any function, it
> is inherently unsafe to put extern inline in a C file. There is no guarantee
> that it will work.
You're overlooking an important reason to mark functions as extern inline.
That is the case where the author knows that having the functions out
of line will cause the software to not work. I believe this is true in
this instance -- the software is a softmodem and is therefore real-time.
If it makes function calls when the author was expecting it to do two
computations, its performance may well be insufficient to function.
It is better for this program to fail to compile than run too slowly.
--
Revolutions do not require corporate support.
* Re: [Linux-ia64] gcc won't inline function returning struct?
From: Jim Wilson @ 2001-11-01 0:49 UTC (permalink / raw)
To: linux-ia64
>If it makes function calls when the author was expecting it to do two
>computations, its performance may well be insufficient to function.
I've never heard this argument before. It does make some sense.
It does have a flaw though. If the code was originally written for IA-32, and
proven to meet timing constraints on the IA-32 host, then there is no guarantee
that it will work on an IA-64 host. The timing analysis will all have to be
redone. So perhaps the program has a valid reason for using extern inline,
but it is still non-portable, which was the point I was trying to make.
Jim