public inbox for linux-kernel@vger.kernel.org
* Inlining can be _very_bad...
@ 2007-03-28 23:18 J.A. Magallón
  2007-03-29  1:29 ` Benjamin LaHaise
  2007-03-29 17:52 ` Adrian Bunk
  0 siblings, 2 replies; 5+ messages in thread
From: J.A. Magallón @ 2007-03-28 23:18 UTC (permalink / raw)
  To: Linux-Kernel

[-- Attachment #1: Type: text/plain, Size: 1941 bytes --]

Hi all...

I post this here as it may be of direct interest for kernel development
(I recall many discussions about whether to inline or not...).

While testing other problems, I finally ran into this issue: the same short
and stupid loop took 3 to 5 times longer if it was in main() than
if it was in an out-of-line function. The same (bad thing) happens if
the function is inlined.

The basic code is like this:

float	*data;

[inline] double one()
{
    double sum;
    sum = 0;
    for (i=0; i<SIZE; i++) sum += data[i];
    return sum;
}

int main()
{
    gettimeofday(&tv0,0);
    for (i=0; i<SIZE; i++)
        s0 += data[i];
    gettimeofday(&tv1,0);
    printf("T0: %6.2f ms\n",elap(tv0,tv1));
    gettimeofday(&tv0,0);
        s1 = one();
    gettimeofday(&tv1,0);
    printf("T1: %6.2f ms\n",elap(tv0,tv1));
}

The times if one() is not inlined (EM64T, 2.33GHz):

apolo:~/e4> tst
T0: 1145.12 ms
S0: 268435456.00
T1: 457.19 ms
S1: 268435456.00

With one() inlined:

apolo:~/e4> tst
T0: 1200.52 ms
S0: 268435456.00
T1: 1200.14 ms
S1: 268435456.00

Looking at the assembler, the non-inlined version does:

.L2:
    cvtss2sd    (%rdx,%rax,4), %xmm0
    incq    %rax
    cmpq    $268435456, %rax
    addsd   %xmm0, %xmm1
    jne .L2

and the inlined version does:

.L13:
    cvtss2sd    (%rdx,%rax,4), %xmm0
    incq    %rax
    cmpq    $268435456, %rax
    addsd   8(%rsp), %xmm0
    movsd   %xmm0, 8(%rsp)
    jne .L13

It looks like it is updating the stack on each iteration... This is
-march=opteron code; the -march=pentium4 code is similar. Same behaviour
with gcc3 and gcc4.

tst.c and Makefile attached.

Nice, isn't it? Please show me where my mistake is...

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2007.1 (Cooker) for i586
Linux 2.6.20-jam06 (gcc 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)) #1 SMP PREEMPT

[-- Attachment #2: Makefile --]
[-- Type: application/octet-stream, Size: 307 bytes --]

PROG=tst
SRCS=tst.c
CC=gcc4 -m64 -march=opteron -O2
#CC=gcc4 -m32 -march=pentium4 -O2
#CC+=-DINLINE
LIBS=

OBJS=$(SRCS:.c=.o)
ASMS=$(SRCS:.c=.s)

all: $(PROG) $(ASMS)

$(PROG): $(OBJS)
	$(CC) -o $@ $(OBJS) $(LIBS)

.c.o:
	$(CC) -c $<

.c.s:
	$(CC) -c -S $<

clean:
	@rm -f $(PROG) $(OBJS) $(ASMS) core tags

[-- Attachment #3: tst.c --]
[-- Type: text/x-csrc; name=tst.c, Size: 958 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

#define SIZE (256*1024*1024)

/* Elapsed time in milliseconds between two struct timeval values. */
#define elap(t0,t1) \
	((1000*(t1).tv_sec+0.001*(t1).tv_usec) - (1000*(t0).tv_sec+0.001*(t0).tv_usec))

double  one();

float	*data;

#ifdef INLINE
inline
#endif
double one()
{
	int i;
	double sum;

	sum = 0;
	asm("#FBGN");
	for (i=0; i<SIZE; i++)
		sum += data[i];
	asm("#FEND");

	return sum;
}

int main(int argc,char** argv)
{
	struct timeval	tv0,tv1;
	double			s0,s1;
	int				i;

	data = malloc(SIZE*sizeof(float));
	if (!data) {
		perror("malloc");
		return 1;
	}
	for (i=0; i<SIZE; i++)
		data[i] = 1;

	gettimeofday(&tv0,0);
	s0 = 0;
	asm("#MBGN");
	for (i=0; i<SIZE; i++)
		s0 += data[i];
	asm("#MEND");
	gettimeofday(&tv1,0);
	printf("T0: %6.2f ms\n",elap(tv0,tv1));
	printf("S0: %0.2lf\n",s0);

	gettimeofday(&tv0,0);
		s1 = one();
	gettimeofday(&tv1,0);
	printf("T1: %6.2f ms\n",elap(tv0,tv1));
	printf("S1: %0.2lf\n",s1);

	free(data);

	return 0;
}

