From: Russell King <rmk@arm.linux.org.uk>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Thunder from the hill <thunder@ngforever.de>,
Lightweight patch manager <patch@luckynet.dynu.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH][2.5] Double quote patches part one: drivers 1/2
Date: Tue, 11 Jun 2002 17:54:46 +0100 [thread overview]
Message-ID: <20020611175446.E3665@flint.arm.linux.org.uk> (raw)
In-Reply-To: <20020611114344.B3081@flint.arm.linux.org.uk> <20020611084758.B1346@flint.arm.linux.org.uk> <Pine.LNX.4.44.0206110422110.24261-100000@hawkeye.luckynet.adm> <20020611114344.B3081@flint.arm.linux.org.uk> <5812.1023808616@redhat.com>
On Tue, Jun 11, 2002 at 04:16:56PM +0100, David Woodhouse wrote:
> Or were you _really_ advocating the kind of development methodology which
Not at all. You can sometimes sanely re-organise and optimise your C
code to produce better assembly without affecting its readability too
much.
> Tweaking your code and sacrificing chickens until you happen to get the
> output you want is no substitute for fixing the compiler.
"Fixing the compiler" for ARM would entail getting a degree in compiler
design, and rewriting GCC from scratch. GCC *really* isn't suited to
ARM CPUs at all. It sucks. The fact that earlier GCC versions sucked
less is the really interesting thing here...
GCC 3.1 is a hard compiler to find stuff wrong with (unlike previous
versions - well done gcc people!) However, here's a good example of a
bit of madvise_fixup_start, built for Xscale with a 5-stage instruction
pipeline with branch prediction.
It's from mm/filemap.c, setup_read_behaviour (inline function inside
madvise_fixup_start).
1. The C code:
VM_ClearReadHint(vma);
switch(behavior) {
case MADV_SEQUENTIAL:
vma->vm_flags |= VM_SEQ_READ;
break;
case MADV_RANDOM:
vma->vm_flags |= VM_RAND_READ;
break;
default:
break;
}
2. GCC 3.1 output with -O2 on ARM:
ldr r3, [r4, #20]
cmp r6, #1
bic r3, r3, #98304
str r7, [r4, #8]
str r3, [r4, #20]
beq .L803 <=== branch if equal
cmp r6, #2
orreq r3, r3, #32768
beq .L811 <=== branch if equal
.L806:
ldr r0, [r4, #56]
...
b .L809
.L811:
str r3, [r4, #20]
b .L806 <=== unconditional branch
.L803:
orr r3, r3, #65536
b .L811 <=== unconditional branch
This gives the following instruction path lengths:
neither: 10 (no branches)
first: 11 (3 branches)
second: 12 (2 branches)
-Os doesn't make much difference.
3. My human-based optimised output for ARM is:
ldr r3, [r4, #20]
cmp r6, #1
bic r3, r3, #98304
str r7, [r4, #8]
orreq r3, r3, #65536
beq .L806 <=== branch if equal
cmp r6, #2
orreq r3, r3, #32768
.L806:
str r3, [r4, #20]
ldr r0, [r4, #56]
...
b .L809
This gives the following instruction path lengths:
neither: 10 (no branches)
first: 8 (3 branches)
second: 10 (2 branches)
Any way you look at it, the above has got to be better.
We can probably get something very close to (3) out of gcc by doing
something like:
unsigned long flags;
flags = vma->vm_flags & ~(VM_SEQ_READ|VM_RAND_READ);
switch(behavior) {
case MADV_SEQUENTIAL:
flags |= VM_SEQ_READ;
break;
case MADV_RANDOM:
flags |= VM_RAND_READ;
break;
default:
break;
}
vma->vm_flags = flags;
No gotos required. So, what about VM_ClearReadHint()? It turns out
that its only used in once place - setup_read_behaviour(). Now, with
the above, we end up with the following output from the same version
of gcc with -O2:
ldr r3, [r4, #20]
cmp r6, #1
bic r3, r3, #98304
str r7, [r4, #8]
orreq r3, r3, #65536
beq .L801
cmp r6, #2
orreq r3, r3, #32768
.L801:
ldr r0, [r4, #56]
str r3, [r4, #20]
All I've shown above is how GCC can miss some optimisations its easy
for a human to see, and there are some simple ways to rewrite C code
to allow GCC to make those optimisations. But hey - that's nothing
new.
--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
prev parent reply other threads:[~2002-06-11 16:54 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-06-11 3:07 [PATCH][2.5] Double quote patches part one: drivers 1/2 Lightweight patch manager
2002-06-11 7:47 ` Russell King
2002-06-11 10:24 ` Thunder from the hill
2002-06-11 10:43 ` Russell King
2002-06-11 15:16 ` David Woodhouse
2002-06-11 16:54 ` Russell King [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20020611175446.E3665@flint.arm.linux.org.uk \
--to=rmk@arm.linux.org.uk \
--cc=dwmw2@infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=patch@luckynet.dynu.com \
--cc=thunder@ngforever.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.