From: Eric Dumazet <dada1@cosmosbay.com>
To: Andi Kleen <ak@suse.de>
Cc: David Miller <davem@davemloft.net>,
mchan@broadcom.com, netdev@vger.kernel.org
Subject: Re: [PATCH] tg3 : avoid an expensive divide
Date: Wed, 7 Feb 2007 10:56:04 +0100
Message-ID: <200702071056.04879.dada1@cosmosbay.com>
In-Reply-To: <p73ireemgwm.fsf@bingen.suse.de>
On Wednesday 07 February 2007 10:54, Andi Kleen wrote:
> David Miller <davem@davemloft.net> writes:
> > Because I've seen gcc optimize this properly before (at least on
> > sparc64), it means that either:
> >
> > 1) There is a GCC bug where the properties of the constants
> > do not propagate.
> >
> > 2) GCC really thinks the divide is cheaper (code density vs.
> > cycle count tradeoffs etc.)
>
> Probably Eric compiled with the now default
> CONFIG_CC_OPTIMIZE_FOR_SIZE/-Os. With that gcc decides to use the shorter
> hardware divide instruction, even though it is significantly slower than an
> expanded optimized sequence for a constant divisor.
>
> We've seen this in a few other cases during performance regression
> testing between kernels that still used -O2 vs the newer -Os.
>
> No good solution found unfortunately.
Well, that could explain it, but unfortunately I don't have this option set:
# grep OPTIMIZE .config
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
# gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.1.1/configure --enable-languages=c,c++
Thread model: posix
gcc version 4.1.1
Then I generated the assembly listing:
# make drivers/net/tg3.s
CHK include/linux/version.h
CHK include/linux/utsrelease.h
CC drivers/net/tg3.s
# more drivers/net/tg3.s
.file "tg3.c"
# GNU C version 4.1.1 (x86_64-unknown-linux-gnu)
# compiled by GNU C version 4.1.0.
# GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
# options passed: -nostdinc -Iinclude -D__KERNEL__ -DCONFIG_AS_CFI=1
# -DKBUILD_STR(s)=#s -DKBUILD_BASENAME=KBUILD_STR(tg3)
# -DKBUILD_MODNAME=KBUILD_STR(tg3) -isystem -include -MD -march=k8 -m64
# -mno-red-zone -mcmodel=kernel -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
# -maccumulate-outgoing-args -auxbase-strip -O2 -Wall -Wundef
# -Wstrict-prototypes -Wno-trigraphs -Wno-sign-compare
# -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-aliasing
# -fno-common -fno-reorder-blocks -fno-asynchronous-unwind-tables
# -funit-at-a-time -fno-omit-frame-pointer -fno-optimize-sibling-calls
# -fno-stack-protector -fverbose-asm
# options enabled: -falign-loops -fargument-alias -fbranch-count-reg
# -fcaller-saves -fcprop-registers -fcrossjumping -fcse-follow-jumps
# -fcse-skip-blocks -fdefer-pop -fdelete-null-pointer-checks
# -fearly-inlining -feliminate-unused-debug-types -fexpensive-optimizations
# -ffunction-cse -fgcse -fgcse-lm -fguess-branch-probability -fident
# -fif-conversion -fif-conversion2 -finline-functions-called-once
# -fipa-pure-const -fipa-reference -fipa-type-escape -fivopts
# -fkeep-static-consts -fleading-underscore -floop-optimize
# -floop-optimize2 -fmath-errno -fmerge-constants -foptimize-register-move
# -fpeephole -fpeephole2 -freg-struct-return -fregmove -freorder-functions
# -frerun-cse-after-loop -frerun-loop-opt -fsched-interblock -fsched-spec
# -fsched-stalled-insns-dep -fschedule-insns2 -fshow-column
# -fsplit-ivs-in-unroller -fstrength-reduce -fthread-jumps -ftrapping-math
# -ftree-ccp -ftree-ch -ftree-copy-prop -ftree-copyrename -ftree-dce
# -ftree-dominator-opts -ftree-dse -ftree-fre -ftree-loop-im
# -ftree-loop-ivcanon -ftree-loop-optimize -ftree-lrs -ftree-pre
# -ftree-salias -ftree-sink -ftree-sra -ftree-store-ccp
# -ftree-store-copy-prop -ftree-ter -ftree-vect-loop-version -ftree-vrp
# -funit-at-a-time -fverbose-asm -fzero-initialized-in-bss
# -m128bit-long-double -m64 -m80387 -maccumulate-outgoing-args
# -malign-stringops -mfancy-math-387 -mfp-ret-in-387 -mieee-fp
# -mno-red-zone -mpush-args -mtls-direct-seg-refs
# Compiler executable checksum: a068cb1f6a9c2e4d8616444230e91dfc
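For readers unfamiliar with the "expanded optimized sequence" Andi mentions, here is a minimal sketch of the idea in C. It assumes an unsigned 32-bit dividend and an illustrative divisor of 10; neither the divisor nor the function names come from tg3.c. The hand-written variant multiplies by a rounded-up reciprocal and shifts, which is roughly what gcc can emit at -O2 in place of a hardware divide when the divisor is a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative divisor only; this is an assumption, not a value from tg3.c. */
#define DIVISOR 10u

/* Plain form: the compiler decides whether to emit a hardware divide
 * or a multiply-and-shift sequence. */
static uint32_t div10_plain(uint32_t x)
{
	return x / DIVISOR;
}

/* Hand-expanded form: multiply by ceil(2^35 / 10) = 0xCCCCCCCD in
 * 64-bit arithmetic, then shift right by 35. This is exact for every
 * 32-bit unsigned x. */
static uint32_t div10_by_multiply(uint32_t x)
{
	return (uint32_t)(((uint64_t)x * 0xCCCCCCCDull) >> 35);
}

/* Spot-check both forms against each other on a few edge values.
 * Returns the number of mismatches found (expected to be zero). */
static unsigned int div10_count_mismatches(void)
{
	static const uint32_t samples[] = {
		0u, 1u, 9u, 10u, 11u, 99u, 100u, 1234567u,
		0x7fffffffu, 0x80000000u, 0xfffffff9u, 0xffffffffu,
	};
	unsigned int mismatches = 0;
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		if (div10_plain(samples[i]) != div10_by_multiply(samples[i]))
			mismatches++;
	}
	assert(mismatches == 0);
	return mismatches;
}
```

Compiling the plain form with -O2 and again with -Os, then comparing the two .s files, is one way to see which sequence a given gcc version picks for a given divisor.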
Eric