* how to tune a pair of e1000 cards on intel e7501-based system?
From: Ray Lehtiniemi @ 2004-12-06 2:44 UTC
To: netdev
hi all
i'm trying to understand how to tune a pair of e1000 cards in
a server box. the box is a dual xeon 3.06 with hyperthreading,
using the intel e7501 chipset, with both cards on hub interface
D. the application involves small UDP packets, generally under
300 bytes. i need to maximize the number of packets per second
transferred between the two cards.
at the moment, i'm looking at the PCI bus in this box to see
what might be tweakable. lspci output for the relevant parts is
attached below.
could anyone give me an idea:
- what kind of packets per second i could expect to achieve
from this particular system (for small packets)
- what parameters i can tweak at the PCI level (or any other
level, for that matter...) to achieve that level of performance
for example, how could i get my e1000 cards to say '64bit+ 133MHz+'
to match the secondary side of the 82870P2 bridge?
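(to give a concrete idea of the sort of knob i mean: i believe the PCI
latency timer of each card can be read and written with setpci, using
the bus addresses from the lspci output below; the value 40 here is
just an illustration, not a recommendation)
setpci -s 02:01.0 latency_timer
setpci -s 02:02.0 latency_timer
setpci -s 02:01.0 latency_timer=40
setpci -s 02:02.0 latency_timer=40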
thank you
-------------------------------------------------------------------------
lspci -t
-------------------------------------------------------------------------
-[00]-+-00.0
+-00.1
+-04.0-[01-03]--+-1c.0
| +-1d.0-[02]--+-01.0
| | \-02.0
| +-1e.0
| \-1f.0-[03]--
+-1d.0
+-1d.1
+-1d.2
+-1e.0-[04]--+-03.0
| \-06.0
+-1f.0
+-1f.1
\-1f.3
-------------------------------------------------------------------------
lspci -vv (selected items)
-------------------------------------------------------------------------
0000:00:00.0 Host bridge: Intel Corp. E7501 Memory Controller Hub (rev 01)
Subsystem: Intel Corp. E7501 Memory Controller Hub
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0
Capabilities: [40] #09 [1105]
0000:00:04.0 PCI bridge: Intel Corp. E7000 Series Hub Interface D PCI-to-PCI Bridge (rev 01) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32
Bus: primary=00, secondary=01, subordinate=03, sec-latency=0
I/O behind bridge: 0000b000-0000bfff
Memory behind bridge: f9000000-fbffffff
BridgeCtl: Parity- SERR- NoISA+ VGA- MAbort- >Reset- FastB2B-
0000:01:1c.0 PIC: Intel Corp. 82870P2 P64H2 I/OxAPIC (rev 04) (prog-if 20 [IO(X)-APIC])
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0
Region 0: Memory at fb000000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [50] PCI-X non-bridge device.
Command: DPERE- ERO- RBC=0 OST=0
Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM-
0000:01:1d.0 PCI bridge: Intel Corp. 82870P2 P64H2 Hub PCI Bridge (rev 04) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32, cache line size 10
Bus: primary=01, secondary=02, subordinate=02, sec-latency=64
I/O behind bridge: 0000b000-0000bfff
Memory behind bridge: f9000000-faffffff
BridgeCtl: Parity+ SERR+ NoISA- VGA- MAbort+ >Reset- FastB2B-
Capabilities: [50] PCI-X bridge device.
Secondary Status: 64bit+, 133MHz+, SCD-, USC-, SCO-, SRD- Freq=3
Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, SCO-, SRD-
: Upstream: Capacity=0, Commitment Limit=0
: Downstream: Capacity=0, Commitment Limit=0
0000:02:01.0 Ethernet controller: Intel Corp. 82544EI Gigabit Ethernet Controller (Copper) (rev 02)
Subsystem: Intel Corp. PRO/1000 XT Server Adapter
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (63750ns min), cache line size 08
Interrupt: pin A routed to IRQ 10
Region 0: Memory at fa020000 (64-bit, non-prefetchable) [size=128K]
Region 2: Memory at fa000000 (64-bit, non-prefetchable) [size=128K]
Region 4: I/O ports at b000 [size=32]
Expansion ROM at <unassigned> [disabled] [size=128K]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [e4] PCI-X non-bridge device.
Command: DPERE- ERO+ RBC=0 OST=0
Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM-
Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
Address: 0000000000000000 Data: 0000
0000:02:02.0 Ethernet controller: Intel Corp. 82544EI Gigabit Ethernet Controller (Copper) (rev 02)
Subsystem: Intel Corp. PRO/1000 XT Server Adapter
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 32 (63750ns min), cache line size 08
Interrupt: pin A routed to IRQ 10
Region 0: Memory at fa040000 (64-bit, non-prefetchable) [size=128K]
Region 2: Memory at fa060000 (64-bit, non-prefetchable) [size=128K]
Region 4: I/O ports at b400 [size=32]
Expansion ROM at <unassigned> [disabled] [size=128K]
Capabilities: [dc] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [e4] PCI-X non-bridge device.
Command: DPERE- ERO+ RBC=0 OST=0
Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM-
Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
Address: 0000000000000000 Data: 0000
--
----------------------------------------------------------------------
Ray L <rayl@mail.com>
* Re: how to tune a pair of e1000 cards on intel e7501-based system?
From: Scott Feldman @ 2004-12-06 3:34 UTC
To: Ray Lehtiniemi; +Cc: netdev
On Sun, 2004-12-05 at 18:44, Ray Lehtiniemi wrote:
> i'm trying to understand how to tune a pair of e1000 cards in
> a server box. the box is a dual xeon 3.06 with hyperthreading,
> using the intel e7501 chipset, with both cards on hub interface
> D. the application involves small UDP packets, generally under
These are 82544 cards hanging off the P64H2, so they're running PCI-X, but
at what speed? Run ethtool -d eth<x> | grep Bus. The cards support
133MHz, but having them adjacent on the same P64H2 probably bumps them
down to 100MHz. Can you put one on D and the other on another bus?
> 300 bytes. i need to maximize the number of packets per second
> transferred between the two cards.
There is a lot of current traffic on netdev about this topic. netdev is
the official e1000 mailing list this weekend. :-)
> at the moment, i'm looking at the PCI bus in this box to see
> what might be tweakable. lspci output for the relevant parts is
> attached below.
>
> could anyone give me an idea:
>
> - what kind of packets per second i could expect to achieve
> from this particular system (for small packets)
What kind of numbers are you getting?
What kernel are you using?
What driver tweaks have you made, if any?
-scott
* Re: how to tune a pair of e1000 cards on intel e7501-based system?
From: Ray Lehtiniemi @ 2004-12-06 4:10 UTC
To: Scott Feldman; +Cc: netdev
On Sun, Dec 05, 2004 at 07:34:18PM -0800, Scott Feldman wrote:
> On Sun, 2004-12-05 at 18:44, Ray Lehtiniemi wrote:
> > i'm trying to understand how to tune a pair of e1000 cards in
> > a server box. the box is a dual xeon 3.06 with hyperthreading,
> > using the intel e7501 chipset, with both cards on hub interface
> > D. the application involves small UDP packets, generally under
>
> These are 82544 cards hanging off the P64H2, so they're running PCI-X, but
> at what speed? Run ethtool -d eth<x> | grep Bus.
:-) just found ethtool and compiled it before i read this email. any other
useful tools i should get?
> The cards support
> 133MHz, but having them adjacent on the same P64H2 probably bumps them
> down to 100MHz.
# ethtool -d eth0 | grep Bus
Bus type: PCI-X
Bus speed: 133MHz
Bus width: 64-bit
# ethtool -d eth1 | grep Bus
Bus type: PCI-X
Bus speed: 133MHz
Bus width: 64-bit
so that looks okay...
any idea why lspci -vv shows non-64bit, non-133 MHz? (i am assuming
that is what the minus sign means)
Capabilities: [e4] PCI-X non-bridge device.
Command: DPERE- ERO+ RBC=0 OST=0
Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM-
> Can you put one on D and the other on another bus?
not sure... have to look at the chassis tomorrow morning. a co-worker
actually built the box, i've not seen it in person yet.
> There is a lot of current traffic on netdev about this topic. netdev is
> the official e1000 mailing list this weekend. :-)
i noticed that almost half of the messages in the e1000-devel archives
since oct 2002 were sent in the last 10 days :-)
> What kind of numbers are you getting?
will start testing tomorrow, just starting my background research tonight
> What kernel are you using?
2.4.20 with driver 4.4.12-k1, and 2.6.bkcurr with driver 5.bkcurr
> What driver tweaks have you made, if any?
none yet. based on mailing list searches, i plan to:
- InterruptThrottleRate 15000
- TxIntDelay 0
i also plan to set rp_filter to zero for all interfaces.
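roughly, i expect that to look something like this (option names are
from the e1000 driver documentation, comma-separated per interface;
the exact values are just my planned starting point, nothing tested yet):
modprobe e1000 InterruptThrottleRate=15000,15000 TxIntDelay=0,0
for f in /proc/sys/net/ipv4/conf/*/rp_filter; do echo 0 > $f; done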
thanks
--
----------------------------------------------------------------------
Ray L <rayl@mail.com>
* Re: how to tune a pair of e1000 cards on intel e7501-based system?
From: Ray Lehtiniemi @ 2004-12-06 23:12 UTC
To: Scott Feldman; +Cc: netdev
On Sun, Dec 05, 2004 at 09:10:03PM -0700, Ray Lehtiniemi wrote:
>
> any idea why lspci -vv shows non-64bit, non-133 MHz? (i am assuming
> that is what the minus sign means)
>
> Capabilities: [e4] PCI-X non-bridge device.
> Command: DPERE- ERO+ RBC=0 OST=0
> Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM-
turns out this is a bug in pciutils-2.1.11. it was not correctly
fetching the config information from the card, and was displaying
zeroed data instead. i've included a patch at the end of this email
and have also forwarded it to martin mares.
> > Can you put one on D and the other on another bus?
>
> not sure... have to look at the chassis tomorrow morning. a co-worker
> actually built the box, i've not seen it in person yet.
nope, can't move things around. this is a NexGate NSA 2040G, and everything
is built into the motherboard.
> > What kind of numbers are you getting?
i'm seeing about 100Kpps, with all settings at their defaults on the 2.4.20
kernel.
basically, i have a couple of desktop PCs generating 480 streams
of UDP data at 50 packets per second. Packet size on the wire, including
96 bits of IFG, is 128 bytes. these packets are forwarded through a user
process on the NexGen box to an echoer process which is also running on the
traffic generator boxes. the echoer sends them back to the NexGen user process,
which forwards them back to the generator process. timestamps are logged
for each packet at send, loop and recv points.
anything over 480 streams, and i start to get large latencies and packet drops,
as measured by the timestamps in the sender and echoer process.
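(sanity check on my own numbers: 480 streams x 50 pps = 24,000 pps offered
by the generators. each packet crosses this box twice -- once out to the
echoer and once on the way back -- and each crossing is a receive on one
card plus a transmit on the other, so the two NICs together see roughly
4 x 24,000 = 96,000 packets/sec, which is about where things fall over.)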
does 100Kpps sound reasonable for an untweaked 2.4.20 kernel?
thanks
diff -ur pciutils-2.1.11/lspci.c pciutils-2.1.11-rayl/lspci.c
--- pciutils-2.1.11/lspci.c 2002-12-26 13:24:50.000000000 -0700
+++ pciutils-2.1.11-rayl/lspci.c 2004-12-06 15:54:33.573313973 -0700
@@ -476,16 +476,19 @@
static void
show_pcix_nobridge(struct device *d, int where)
{
- u16 command = get_conf_word(d, where + PCI_PCIX_COMMAND);
- u32 status = get_conf_long(d, where + PCI_PCIX_STATUS);
+ u16 command;
+ u32 status;
printf("PCI-X non-bridge device.\n");
if (verbose < 2)
return;
+ config_fetch(d, where, 8);
+ command = get_conf_word(d, where + PCI_PCIX_COMMAND);
printf("\t\tCommand: DPERE%c ERO%c RBC=%d OST=%d\n",
FLAG(command, PCI_PCIX_COMMAND_DPERE),
FLAG(command, PCI_PCIX_COMMAND_ERO),
((command & PCI_PCIX_COMMAND_MAX_MEM_READ_BYTE_COUNT) >> 2U),
((command & PCI_PCIX_COMMAND_MAX_OUTSTANDING_SPLIT_TRANS) >> 4U));
+ status = get_conf_long(d, where + PCI_PCIX_STATUS);
printf("\t\tStatus: Bus=%u Dev=%u Func=%u 64bit%c 133MHz%c SCD%c USC%c, DC=%s, DMMRBC=%u, DMOST=%u, DMCRS=%u, RSCEM%c",
((status >> 8) & 0xffU), // bus
((status >> 3) & 0x1fU), // dev
@@ -509,6 +512,7 @@
printf("PCI-X bridge device.\n");
if (verbose < 2)
return;
+ config_fetch(d, where, 8);
secstatus = get_conf_word(d, where + PCI_PCIX_BRIDGE_SEC_STATUS);
printf("\t\tSecondary Status: 64bit%c, 133MHz%c, SCD%c, USC%c, SCO%c, SRD%c Freq=%d\n",
FLAG(secstatus, PCI_PCIX_BRIDGE_SEC_STATUS_64BIT),
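(to try it: save this mail, run "patch -p1 < mailfile" from inside the
pciutils-2.1.11 source tree, rebuild with make, and re-run lspci -vv.)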
--
----------------------------------------------------------------------
Ray L <rayl@mail.com>
* RE: how to tune a pair of e1000 cards on intel e7501-based system?
From: Brandeburg, Jesse @ 2004-12-08 1:10 UTC
To: Ray Lehtiniemi, Scott Feldman; +Cc: netdev
> > > What kind of numbers are you getting?
>
> i'm seeing about 100Kpps, with all settings at their defaults on the
> 2.4.20 kernel.
>
> basically, i have a couple of desktop PCs generating 480 streams of
> UDP data at 50 packets per second. Packet size on the wire, including
> 96 bits of IFG, is 128 bytes. these packets are forwarded through a
> user process on the NexGen box to an echoer process which is also
> running on the traffic generator boxes. the echoer sends them back to
> the NexGen user process, which forwards them back to the generator
> process. timestamps are logged for each packet at send, loop and recv
> points.
>
I'm not much of an expert, but one easy thing to try is to up your
receive stack resources, as they were particularly low on 2.4 series
kernels, leading to udp getting overrun pretty easily with gig nics. I
think if you make the value go too high it just ignores it, so if you
see no change, try 256kB instead.
cat /proc/sys/net/core/rmem_default
cat /proc/sys/net/core/rmem_max
echo -n 512000 > /proc/sys/net/core/rmem_default
echo -n 512000 > /proc/sys/net/core/rmem_max
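(if that helps, the same settings can be made persistent across reboots
via /etc/sysctl.conf, e.g.:)
net.core.rmem_default = 512000
net.core.rmem_max = 512000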
Jesse