* Making the NS83820 usable on IA64
@ 2004-03-17 3:57 Peter Chubb
[not found] ` <4057D1B8.9090508@pobox.com>
2004-03-17 5:49 ` David S. Miller
0 siblings, 2 replies; 4+ messages in thread
From: Peter Chubb @ 2004-03-17 3:57 UTC (permalink / raw)
To: linux-ia64; +Cc: linux-net, netdev
Hi Folks,
With the current NS83820 driver and IP stack implementation,
the IA64 kernel spends 99.9% of its time in the unaligned access trap
handler when the network starts getting busy. When I raised this
issue before, the idea of realigning the skbuf data in the driver was
scouted; therefore I submit this patch for your approval. It makes
the driver usable, and doesn't seem to affect anything else.
The idea is to tell gcc that the IP header is 2-byte aligned,
so it can generate the right code to access it. Otherwise, it tries
to do a 4-byte load when trying to extract the header length bitfield,
which traps. As far as I read the C standard, gcc can do almost
whatever it wants regarding the alignment and underlying storage
size of a bitfield, so it's free to assume 32-bit alignment if it
wants.
Tested on McKinley with gcc 3.3.x
===== linus-2.6.4/include/linux/ip.h 1.12 vs edited =====
--- 1.12/include/linux/ip.h Fri Jan 2 07:28:33 2004
+++ edited/include/linux/ip.h Wed Mar 17 11:58:09 2004
@@ -186,7 +186,7 @@
__u32 saddr;
__u32 daddr;
/*The options start here. */
-};
+} __attribute__((packed,aligned(2)));
struct ip_auth_hdr {
__u8 nexthdr;
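
A minimal sketch of what the attribute changes, assuming GCC (or a compatible compiler) on a little-endian target. The two structs below are hypothetical stand-ins for struct iphdr, not the kernel's definition; they only show that `packed,aligned(2)` lowers the alignment the compiler may assume while leaving the 20-byte layout unchanged.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the header as it is declared today:
 * the 32-bit address fields give the struct 4-byte alignment, so the
 * compiler may use a 4-byte load to reach the ihl/version bitfields. */
struct hdr_natural {
    uint8_t  ihl:4, version:4;   /* little-endian bitfield order */
    uint8_t  tos;
    uint16_t tot_len;
    uint16_t id;
    uint16_t frag_off;
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t check;
    uint32_t saddr;
    uint32_t daddr;
};

/* Same fields with the attribute from the patch: the compiler may now
 * assume only 2-byte alignment and must split wider accesses. */
struct hdr_align2 {
    uint8_t  ihl:4, version:4;
    uint8_t  tos;
    uint16_t tot_len;
    uint16_t id;
    uint16_t frag_off;
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t check;
    uint32_t saddr;
    uint32_t daddr;
} __attribute__((packed, aligned(2)));

/* Alignment the compiler assumes for each variant (GCC extension). */
size_t natural_alignment(void)  { return __alignof__(struct hdr_natural); }
size_t declared_alignment(void) { return __alignof__(struct hdr_align2); }
```

Because the attribute is attached to the type, every user of the struct pays for the weaker assumption, not only the code path fed by this driver.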
--
Dr Peter Chubb http://www.gelato.unsw.edu.au peterc@gelato.unsw.edu.au
You are lost in a maze of BitKeeper repositories, all slightly different.
* Re: Making the NS83820 usable on IA64
[not found] ` <4057D1B8.9090508@pobox.com>
@ 2004-03-17 4:33 ` Peter Chubb
0 siblings, 0 replies; 4+ messages in thread
From: Peter Chubb @ 2004-03-17 4:33 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Peter Chubb, linux-ia64, linux-net, netdev
>>>>> "Jeff" == Jeff Garzik <jgarzik@pobox.com> writes:
Jeff> Peter Chubb wrote:
>> With the current NS83820 driver and IP stack implementation, the
>> IA64 kernel spends 99.9% of its time in the unaligned access trap
>> handler when the network starts getting busy. When I raised this
>> issue before, the idea of realigning the skbuf data in the driver
>> was scouted; therefore I submit this patch for your approval. It
>> makes the driver usable, and doesn't seem to affect anything else.
Jeff> More likely, you need to play around with ns83820.c's
Jeff> skb_reserve() code, in rx_refill():
Unfortunately the DP83820 won't DMA into a receive buffer that is not
64-bit aligned. See table 3-2:
  offset  tag     description
  0004h   bufptr  32- or 64-bit pointer to the first fragment
                  or buffer. In transmit descriptors the buffer
                  can begin on any byte boundary. In receive
                  descriptors, the buffer must be aligned on
                  a 64-bit boundary.
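
The conflict can be worked through as arithmetic. This is only a sketch: it assumes a plain 14-byte Ethernet header with no VLAN tag, and the function names are made up for illustration. A receive buffer that satisfies the 64-bit rule puts the IP header at an address that is 2 modulo 4, while reserving 2 bytes to fix that breaks the 64-bit rule.

```c
#include <stdint.h>

#define ETH_HEADER_LEN 14u   /* assumption: untagged Ethernet frame */

/* Hypothetical helper: does this address satisfy the receive
 * descriptor's 64-bit (8-byte) alignment requirement? */
int rx_buffer_ok(uintptr_t dma_addr)
{
    return dma_addr % 8 == 0;
}

/* Hypothetical helper: offset of the IP header from a 4-byte boundary
 * when the frame is written at buf_addr + reserve. */
unsigned ip_header_misalignment(uintptr_t buf_addr, unsigned reserve)
{
    return (unsigned)((buf_addr + reserve + ETH_HEADER_LEN) % 4);
}
```

With an 8-byte-aligned buffer and no reserve the result is 2, so 4-byte loads from the header trap. A reserve of 2 gives 0, but the DMA address is then no longer acceptable to the chip.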
--
Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au
The technical we do immediately, the political takes *forever*
* Re: Making the NS83820 usable on IA64
2004-03-17 3:57 Making the NS83820 usable on IA64 Peter Chubb
[not found] ` <4057D1B8.9090508@pobox.com>
@ 2004-03-17 5:49 ` David S. Miller
2004-03-17 8:54 ` Peter Chubb
1 sibling, 1 reply; 4+ messages in thread
From: David S. Miller @ 2004-03-17 5:49 UTC (permalink / raw)
To: Peter Chubb; +Cc: linux-ia64, linux-net, netdev
On Wed, 17 Mar 2004 14:57:36 +1100
Peter Chubb <peterc@gelato.unsw.edu.au> wrote:
> The idea is to tell gcc that the IP header is 2-byte aligned,
> so it can generate the right code to access it. Otherwise, it tries
> to do a 4-byte load when trying to extract the header length bitfield,
> which traps. As far as I read the C standard, gcc can do almost
> whatever it wants regarding the alignment and underlying storage
> size of a bitfield, so it's free to assume 32-bit alignment if it
> wants.
This makes every piece of code only able to assume 2-byte
alignment. I don't think this will get accepted :)
* Re: Making the NS83820 usable on IA64
2004-03-17 5:49 ` David S. Miller
@ 2004-03-17 8:54 ` Peter Chubb
0 siblings, 0 replies; 4+ messages in thread
From: Peter Chubb @ 2004-03-17 8:54 UTC (permalink / raw)
To: David S. Miller; +Cc: Peter Chubb, linux-ia64, linux-net, netdev
>>>>> "David" == David S Miller <davem@redhat.com> writes:
David> On Wed, 17 Mar 2004 14:57:36 +1100 Peter Chubb
David> <peterc@gelato.unsw.edu.au> wrote:
>> The idea is to tell gcc that the IP header is 2-byte aligned, so it
>> can generate the right code to access it. Otherwise, it tries to
>> do a 4-byte load when trying to extract the header length bitfield,
>> which traps. As far as I read the C standard, gcc can do almost
>> whatever it wants regarding the alignment and underlying storage
>> size of a bitfield, so it's free to assume 32-bit alignment if it
>> wants.
David> This makes every piece of code only able to assume 2-byte
David> alignment. I don't think this will get accepted :)
Well, there are at least two other alternatives.
One is to copy the buffer to force iphdr to be
4-byte aligned. If you want to access a bitfield, it should be
aligned at whatever the compiler expects bitfields to be aligned at --
in this case, 4 bytes. Last time I brought up this solution it was
shouted down.
The other is to get rid of the bitfields and to do explicit masking
and shifting, every time ihl and version are accessed.
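
The masking-and-shifting alternative might look roughly like the sketch below. The accessor names are hypothetical, not existing kernel functions. In the IPv4 header the version occupies the high nibble of the first byte and the header length the low nibble, so a single byte load is enough and no alignment assumption is involved.

```c
#include <stdint.h>

/* Hypothetical accessor: IP version, high nibble of the first byte. */
static inline unsigned ip_version(const uint8_t *hdr)
{
    return hdr[0] >> 4;
}

/* Hypothetical accessor: header length in 32-bit words, low nibble. */
static inline unsigned ip_ihl(const uint8_t *hdr)
{
    return hdr[0] & 0x0f;
}
```

For a typical header whose first byte is 0x45 these return 4 and 5. The accessors behave the same regardless of host byte order, which the bitfield declaration does not.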
And something may have to be done if we ever port to a big-endian
machine with struct alignment constraints. We're saved at present by
ntohl and friends on the non-aligned saddr and daddr etc.
Something has to be done, however. The standard driver plus stack
won't even do 100Mb/s; with any kind of network load the whole machine
becomes unusable. With the patch I sent before, it gets to
315Mb/s (UDP echo server, 1024 byte packets). If I enable interrupt
holdoff code (currently disabled) I can push it to 340Mb/s (and it's
spending most of its time in rx_interrupt() -- system time close to
99%). Even this is pretty appalling. (Numbers derived using ipbench.sf.net)