From: "Joakim Tjernlund" <Joakim.Tjernlund@lumentis.se>
To: <linuxppc-dev@lists.linuxppc.org>
Subject: csum_partial() and csum_partial_copy_generic() in badly optimized?
Date: Sat, 16 Nov 2002 00:01:06 +0100
Message-ID: <000701c28cfa$e14e1a60$0200a8c0@telia.com>
Hi
Looking over the different checksum routines, I came across csum_partial() and
csum_partial_copy_generic(), which live in arch/ppc/lib/checksum.S.
This comment in csum_partial:
/* the bdnz has zero overhead, so it should */
/* be unnecessary to unroll this loop */
got me wondering (code included last). An instruction cannot have zero cost/overhead.
This instruction must be eating cycles. I think this function needs unrolling, but I am pretty
useless at assembler, so I need help.
Can any PPC/assembler guy comment on this and, if needed, do the
unrolling? I think an unroll factor of 6 or 8 will be enough.
The same goes for csum_partial_copy_generic().
These functions are used to checksum every IP/TCP/UDP packet, so it
would be a good thing if they were properly optimized.
It would be really nice if there were more comments (and names instead of numbers on the
jump labels; numbers are very uninformative). The code is hard enough to understand as is.
Jocke
/*
 * computes the checksum of a memory block at buff, length len,
 * and adds in "sum" (32-bit)
 *
 * csum_partial(buff, len, sum)
 */
_GLOBAL(csum_partial)
        addic   r0,r5,0
        subi    r3,r3,4
        srwi.   r6,r4,2
        beq     3f              /* if we're doing < 4 bytes */
        andi.   r5,r3,2         /* Align buffer to longword boundary */
        beq+    1f
        lhz     r5,4(r3)        /* do 2 bytes to get aligned */
        addi    r3,r3,2
        subi    r4,r4,2
        addc    r0,r0,r5
        srwi.   r6,r4,2         /* # words to do */
        beq     3f
1:      mtctr   r6
2:      lwzu    r5,4(r3)        /* the bdnz has zero overhead, so it should */
        adde    r0,r0,r5        /* be unnecessary to unroll this loop */
        bdnz    2b
        andi.   r4,r4,3
3:      cmpi    0,r4,2
        blt+    4f
        lhz     r5,4(r3)
        addi    r3,r3,2
        subi    r4,r4,2
        adde    r0,r0,r5
4:      cmpi    0,r4,1
        bne+    5f
        lbz     r5,4(r3)
        slwi    r5,r5,8         /* Upper byte of word */
        adde    r0,r0,r5
5:      addze   r3,r0           /* add in final carry */
        blr
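
For readers who, like me, find the assembler hard to follow, here is a rough C model of what the routine appears to compute, next to a 4x unrolled variant of the word loop. It is only a sketch under stated assumptions, not the kernel's implementation: the names csum_partial_simple, csum_partial_unrolled, csum_fold16, load_be32, fold64 and add_tail are made up for illustration, words are read as big-endian to mirror PowerPC, and a 64-bit accumulator stands in for the adde carry chain. It also skips the 2-byte alignment step, so its 32-bit result can differ from the assembler's when that path is taken; the two should still agree once folded to 16 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Load a 32-bit big-endian word one byte at a time, so the sketch
 * behaves the same on any host (PowerPC itself is big-endian). */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Fold a 64-bit accumulator to 32 bits with end-around carry.
 * This plays the role of the adde/addze carry chain. */
static uint32_t fold64(uint64_t acc)
{
    acc = (acc & 0xffffffffu) + (acc >> 32);
    acc = (acc & 0xffffffffu) + (acc >> 32);
    return (uint32_t)acc;
}

/* Add the 0..3 bytes left over after the word loop. */
static uint64_t add_tail(uint64_t acc, const unsigned char *p, size_t len)
{
    if (len >= 2) {
        acc += ((uint32_t)p[0] << 8) | p[1];   /* trailing halfword */
        p += 2;
        len -= 2;
    }
    if (len == 1)
        acc += (uint32_t)p[0] << 8;            /* upper byte of a halfword */
    return acc;
}

/* Straightforward version: one word per loop iteration. */
uint32_t csum_partial_simple(const unsigned char *buf, size_t len, uint32_t sum)
{
    uint64_t acc = sum;
    size_t words = len / 4;

    for (size_t i = 0; i < words; i++)
        acc += load_be32(buf + 4 * i);
    acc = add_tail(acc, buf + 4 * words, len % 4);
    return fold64(acc);
}

/* Unrolled version: four words per iteration, then the leftover words. */
uint32_t csum_partial_unrolled(const unsigned char *buf, size_t len, uint32_t sum)
{
    uint64_t acc = sum;
    size_t words = len / 4;
    size_t tail = len % 4;

    while (words >= 4) {
        acc += load_be32(buf);
        acc += load_be32(buf + 4);
        acc += load_be32(buf + 8);
        acc += load_be32(buf + 12);
        buf += 16;
        words -= 4;
    }
    while (words > 0) {
        acc += load_be32(buf);
        buf += 4;
        words--;
    }
    acc = add_tail(acc, buf, tail);
    return fold64(acc);
}

/* Fold the 32-bit partial sum to 16 bits (no final complement). */
uint16_t csum_fold16(uint32_t sum)
{
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

Both functions should return the same value for any buffer and length, which makes the simple one a handy reference when checking an unrolled assembler version. Whether unrolling actually helps on a given PPC core is a separate question that only measurement can answer; the C above says nothing about cycle counts.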
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/