From: Andi Kleen <andi@firstfloor.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <andi@firstfloor.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jeff Garzik <jeff@garzik.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org
Subject: Re: fresh data was Re: [PATCH] X86-32: Let gcc decide whether to inline memcpy was Re: New x86 warning
Date: Thu, 23 Apr 2009 09:37:09 +0200
Message-ID: <20090423073709.GH13896@one.firstfloor.org>
In-Reply-To: <20090423063625.GB9833@elte.hu>

On Thu, Apr 23, 2009 at 08:36:25AM +0200, Ingo Molnar wrote:
> 
> * Andi Kleen <andi@firstfloor.org> wrote:
> 
> > Andi Kleen <andi@firstfloor.org> writes:
> > 
> > >> > Quick test here:
> > >> 
> > >> How about you just compile the kernel with gcc-3.2 and compare the number 
> > >> of calls to memcpy before-and-after instead? That's the real test.
> > >
> > > I waited over 10 minutes for the full vmlinux objdumps to finish. sorry lost
> > > patience. If someone has a fast disassembler we can try it. I'll leave
> > > them running over night, maybe there are exact numbers tomorrow.
> > >
> > > But from a quick check (find -name '*.o' | xargs nm | grep memcpy) there are
> > > very few files which call it with the patch, so there's some
> > > evidence that there isn't a dramatic increase.
> > 
> > I let the objdumps finish over night. [...]
> 
> objdump -d never took me more than a minute - let alone a full 

I use objdump -S. Maybe that's slower than -d.

Hmm, a quick test confirms it: -S is much slower than -d. Thanks for
the hint. I guess I should switch to -d for these cases. Unfortunately
-S seems to be hardcoded in my fingers, and of course it gives much
nicer output if you have debug info.

> night. You must be doing something really wrong there. Looking at 
> objdump -d is an essential, unavoidable component of my workflow 
> with x86 architecture patches, you need to find a way to do it 

I do it all the time too, but only for specific functions, not
for full kernels. I have an objdump-symbol script for that: it
looks up a symbol in the symbol table and disassembles only
the function I'm interested in
(ftp://firstfloor.org/pub/ak/perl/objdump-symbol).
I normally don't look at full listings of the complete kernel.
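
For anyone who doesn't have the script at hand, the idea can be roughly
sketched as below. This is not the Perl script itself, just a minimal
Python sketch under assumptions: the helper names (find_symbol,
objdump_command, disassemble_symbol) are made up for illustration, and
it assumes GNU binutils, i.e. that `nm -S` prints "<addr> <size> <type>
<name>" for sized symbols and that objdump accepts --start-address and
--stop-address.

```python
import subprocess


def find_symbol(nm_output, name):
    """Return (start, size) for `name` from `nm -S` output, or None."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Sized, defined symbols look like: <addr> <size> <type> <name>
        if len(fields) == 4 and fields[3] == name:
            return int(fields[0], 16), int(fields[1], 16)
    return None


def objdump_command(obj, start, size):
    """Build an objdump command that disassembles only [start, start+size)."""
    return [
        "objdump", "-d",
        f"--start-address={start:#x}",
        f"--stop-address={start + size:#x}",
        obj,
    ]


def disassemble_symbol(obj, name):
    """Look `name` up in the symbol table of `obj` and disassemble just it."""
    nm_out = subprocess.run(
        ["nm", "-S", obj], capture_output=True, text=True, check=True
    ).stdout
    found = find_symbol(nm_out, name)
    if found is None:
        raise KeyError(name)
    return subprocess.run(
        objdump_command(obj, *found), capture_output=True, text=True, check=True
    ).stdout
```

Something like disassemble_symbol("vmlinux", "memcpy") would then avoid
dumping the whole kernel. Symbols without a recorded size are simply
not found by this sketch.
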

> > [...] On my setup (defconfig + some additions) there are actually 
> > fewer calls to out-of-line memcpy/__memcpy with the patch. I see 
> > only one for my defconfig, while there are ~10 without the patch. 
> > So it makes very little difference. The code size savings must 
> > come from more efficient code generation for the inline case. I 
> > haven't investigated that in detail though.
> > 
> > So the patch seems like an overall win.
> 
> It's a clear loss here with GCC 3.4, and it took me less than 5 
> minutes to figure that out.

Loss in what way?

> 
> With what precise compiler version did you test (please paste the 
> gcc -v output), and could you send me the precise .config you used, 

See the mail before my previous one: 3.2.3

I didn't test later versions, on the assumption that they have no
regressions.

> and describe the method you used to determine the number of 
> out-of-line memcpy calls? I'd like to double-check your numbers.

objdump -S ... | grep call.*memcpy

This gives some false positives, which you have to weed out by hand.

In addition I did a quick find -name '*.o' | xargs nm | grep 'U.*memcpy$'
to get a rough estimate (an underestimate) of the calls.
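
If you want to redo the count without weeding by hand, a stricter
filter could look roughly like the sketch below. It is only an
illustration, not what I ran: the function names are hypothetical, and
it assumes objdump output of the usual "call <addr> <symbol>" form from
a linked vmlinux, plus nm lines of the form "U memcpy" for undefined
references. Like the nm pipeline above, the second function counts one
reference per object file, not one per call site.

```python
import re

# A call whose target is exactly memcpy or __memcpy, so that symbols such as
# memcpy_fromio or foo_memcpy are not counted as false positives.
CALL_RE = re.compile(r"\bcall\w*\s+[0-9a-f]+\s+<(?:__)?memcpy>")

# An undefined ("U") reference to exactly memcpy or __memcpy in nm output.
UNDEF_RE = re.compile(r"^\s*U\s+(?:__)?memcpy$")


def count_memcpy_calls(disassembly):
    """Count call instructions that target out-of-line memcpy/__memcpy."""
    return sum(1 for line in disassembly.splitlines() if CALL_RE.search(line))


def count_undefined_memcpy(nm_output):
    """Count undefined memcpy/__memcpy references in (concatenated) nm output."""
    return sum(1 for line in nm_output.splitlines() if UNDEF_RE.match(line))
```

Feeding it the output of objdump -d on vmlinux, and of the find | xargs
nm pipeline, gives the two numbers to compare before and after the
patch.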

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.

Thread overview: 19+ messages
2009-04-22  6:46 New x86 warning Jeff Garzik
2009-04-22  7:01 ` Ingo Molnar
2009-04-22  8:45   ` [PATCH] X86-32: Let gcc decide whether to inline memcpy was " Andi Kleen
2009-04-22 18:00     ` [tip:x86/asm] x86: use __builtin_memcpy() on 32 bits tip-bot for Andi Kleen
2009-04-22 20:56     ` [PATCH] X86-32: Let gcc decide whether to inline memcpy was Re: New x86 warning Linus Torvalds
2009-04-22 21:15       ` Andi Kleen
2009-04-22 21:19         ` Linus Torvalds
2009-04-22 22:04           ` Andi Kleen
2009-04-23  6:08             ` fresh data was " Andi Kleen
2009-04-23  6:36               ` Ingo Molnar
2009-04-23  7:37                 ` Andi Kleen [this message]
2009-04-23  6:30             ` Ingo Molnar
2009-04-23  7:43               ` Andi Kleen
2009-04-22 23:49     ` Joe Damato
2009-04-23  1:48       ` H. Peter Anvin
2009-04-23 21:22         ` Joe Damato
2009-04-23 22:09           ` H. Peter Anvin
2009-04-24  8:44           ` Andi Kleen
2009-04-23  6:09       ` Andi Kleen
