* [e1000 2.6 10/11] TxDescriptors -> 1024 default
@ 2003-09-09 3:14 Feldman, Scott
2003-09-11 19:18 ` Jeff Garzik
0 siblings, 1 reply; 38+ messages in thread
From: Feldman, Scott @ 2003-09-09 3:14 UTC (permalink / raw)
To: Jeff Garzik; +Cc: netdev, ricardoz
* Change the default number of Tx descriptors from 256 to 1024.
Data from [ricardoz@us.ibm.com] shows it's easy to overrun
the Tx desc queue.
-------------
diff -Nuarp linux-2.6.0-test4/drivers/net/e1000/e1000_param.c linux-2.6.0-test4/drivers/net/e1000.new/e1000_param.c
--- linux-2.6.0-test4/drivers/net/e1000/e1000_param.c 2003-08-22 16:57:59.000000000 -0700
+++ linux-2.6.0-test4/drivers/net/e1000.new/e1000_param.c 2003-09-08 09:13:12.000000000 -0700
@@ -63,9 +63,10 @@ MODULE_PARM_DESC(X, S);
/* Transmit Descriptor Count
*
* Valid Range: 80-256 for 82542 and 82543 gigabit ethernet controllers
- * Valid Range: 80-4096 for 82544
+ * Valid Range: 80-4096 for 82544 and newer
*
- * Default Value: 256
+ * Default Value: 256 for 82542 and 82543 gigabit ethernet controllers
+ * Default Value: 1024 for 82544 and newer
*/
E1000_PARAM(TxDescriptors, "Number of transmit descriptors");
@@ -73,7 +74,7 @@ E1000_PARAM(TxDescriptors, "Number of tr
/* Receive Descriptor Count
*
* Valid Range: 80-256 for 82542 and 82543 gigabit ethernet controllers
- * Valid Range: 80-4096 for 82544
+ * Valid Range: 80-4096 for 82544 and newer
*
* Default Value: 256
*/
@@ -200,6 +201,7 @@ E1000_PARAM(InterruptThrottleRate, "Inte
#define MAX_TXD 256
#define MIN_TXD 80
#define MAX_82544_TXD 4096
+#define DEFAULT_82544_TXD 1024
#define DEFAULT_RXD 256
#define MAX_RXD 256
@@ -320,12 +322,15 @@ e1000_check_options(struct e1000_adapter
struct e1000_option opt = {
.type = range_option,
.name = "Transmit Descriptors",
- .err = "using default of " __MODULE_STRING(DEFAULT_TXD),
- .def = DEFAULT_TXD,
.arg = { .r = { .min = MIN_TXD }}
};
struct e1000_desc_ring *tx_ring = &adapter->tx_ring;
e1000_mac_type mac_type = adapter->hw.mac_type;
+ opt.err = mac_type < e1000_82544 ?
+ "using default of " __MODULE_STRING(DEFAULT_TXD) :
+ "using default of " __MODULE_STRING(DEFAULT_82544_TXD);
+ opt.def = mac_type < e1000_82544 ?
+ DEFAULT_TXD : DEFAULT_82544_TXD;
opt.arg.r.max = mac_type < e1000_82544 ?
MAX_TXD : MAX_82544_TXD;
^ permalink raw reply [flat|nested] 38+ messages in thread* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-09 3:14 [e1000 2.6 10/11] TxDescriptors -> 1024 default Feldman, Scott @ 2003-09-11 19:18 ` Jeff Garzik 2003-09-11 19:45 ` Ben Greear 0 siblings, 1 reply; 38+ messages in thread From: Jeff Garzik @ 2003-09-11 19:18 UTC (permalink / raw) To: Feldman, Scott; +Cc: netdev, ricardoz Feldman, Scott wrote: > * Change the default number of Tx descriptors from 256 to 1024. > Data from [ricardoz@us.ibm.com] shows it's easy to overrun > the Tx desc queue. All e1000 patches applied except this one. Of _course_ it's easy to overrun the Tx desc queue. That's why we have a TX queue sitting on top of the NIC's hardware queue. And TCP socket buffers on top of that. And similar things. Descriptor increases like this are usually the result of some sillyhead blasting out UDP packets, and then wondering why he sees packet loss on the local computer (the "blast out packets" side). You're just wasting memory. Jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 19:18 ` Jeff Garzik @ 2003-09-11 19:45 ` Ben Greear 2003-09-11 19:59 ` Jeff Garzik 2003-09-11 20:12 ` David S. Miller 0 siblings, 2 replies; 38+ messages in thread From: Ben Greear @ 2003-09-11 19:45 UTC (permalink / raw) To: Jeff Garzik; +Cc: Feldman, Scott, netdev, ricardoz Jeff Garzik wrote: > Feldman, Scott wrote: > >> * Change the default number of Tx descriptors from 256 to 1024. >> Data from [ricardoz@us.ibm.com] shows it's easy to overrun >> the Tx desc queue. > > > > All e1000 patches applied except this one. > > Of _course_ it's easy to overrun the Tx desc queue. That's why we have > a TX queue sitting on top of the NIC's hardware queue. And TCP socket > buffers on top of that. And similar things. > > Descriptor increases like this are usually the result of some sillyhead > blasting out UDP packets, and then wondering why he sees packet loss on > the local computer (the "blast out packets" side). Erm, shouldn't the local machine back itself off if the various queues are full? Some time back I looked through the code and it appeared to. If not, I think it should. > > You're just wasting memory. > > Jeff > > > -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 19:45 ` Ben Greear @ 2003-09-11 19:59 ` Jeff Garzik 2003-09-11 20:12 ` David S. Miller 1 sibling, 0 replies; 38+ messages in thread From: Jeff Garzik @ 2003-09-11 19:59 UTC (permalink / raw) To: Ben Greear; +Cc: Feldman, Scott, netdev, ricardoz Ben Greear wrote: > Jeff Garzik wrote: > >> Feldman, Scott wrote: >> >>> * Change the default number of Tx descriptors from 256 to 1024. >>> Data from [ricardoz@us.ibm.com] shows it's easy to overrun >>> the Tx desc queue. >> >> >> >> >> All e1000 patches applied except this one. >> >> Of _course_ it's easy to overrun the Tx desc queue. That's why we >> have a TX queue sitting on top of the NIC's hardware queue. And TCP >> socket buffers on top of that. And similar things. >> >> Descriptor increases like this are usually the result of some >> sillyhead blasting out UDP packets, and then wondering why he sees >> packet loss on the local computer (the "blast out packets" side). > > > Erm, shouldn't the local machine back itself off if the various > queues are full? Some time back I looked through the code and it > appeared to. If not, I think it should. Given the guarantees of the protocol, the net stack has the freedom to drop UDP packets, for example at times when (for TCP) one would otherwise queue a packet for retransmit. Jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default
  2003-09-11 19:45         ` Ben Greear
  2003-09-11 19:59           ` Jeff Garzik
@ 2003-09-11 20:12           ` David S. Miller
  2003-09-11 20:40             ` Ben Greear
  1 sibling, 1 reply; 38+ messages in thread
From: David S. Miller @ 2003-09-11 20:12 UTC (permalink / raw)
  To: Ben Greear; +Cc: jgarzik, scott.feldman, netdev, ricardoz

On Thu, 11 Sep 2003 12:45:55 -0700
Ben Greear <greearb@candelatech.com> wrote:

> Erm, shouldn't the local machine back itself off if the various
> queues are full?  Some time back I looked through the code and it
> appeared to.  If not, I think it should.

Generic networking device queues drop when they overflow.

Whatever dev->tx_queue_len is set to, the device driver needs
to be prepared to be able to queue successfully.

Most people run into problems when they run stupid UDP applications
that send a stream of tinygrams (<~64 bytes).  The solutions are to
either fix the UDP app or restrict its socket send buffer size.

^ permalink raw reply [flat|nested] 38+ messages in thread
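The workaround suggested above for a blasting UDP application, restricting its socket send buffer, amounts to a single setsockopt() call. A minimal user-space sketch; the 64 KB value is only an illustration, not a figure recommended anywhere in this thread:

#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	int sndbuf = 64 * 1024;		/* illustrative cap, tune as needed */

	if (fd < 0) {
		perror("socket");
		return 1;
	}
	/* Limit how much unsent data this socket may have in flight; once
	 * the limit is reached, a blocking sendto() sleeps until earlier
	 * datagrams have been freed (transmitted by the NIC or dropped),
	 * instead of flooding the qdisc and the Tx descriptor ring. */
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf)) < 0)
		perror("setsockopt(SO_SNDBUF)");

	/* ... the application's sendto() loop goes here ... */
	return 0;
}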
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default
  2003-09-11 20:12           ` David S. Miller
@ 2003-09-11 20:40             ` Ben Greear
  2003-09-11 21:07               ` David S. Miller
  0 siblings, 1 reply; 38+ messages in thread
From: Ben Greear @ 2003-09-11 20:40 UTC (permalink / raw)
  To: David S. Miller; +Cc: jgarzik, scott.feldman, netdev, ricardoz

David S. Miller wrote:

> Generic networking device queues drop when they overflow.
>
> Whatever dev->tx_queue_len is set to, the device driver needs
> to be prepared to be able to queue successfully.
>
> Most people run into problems when they run stupid UDP applications
> that send a stream of tinygrams (<~64 bytes).  The solutions are to
> either fix the UDP app or restrict its socket send buffer size.

Is this close to how it works?

So, assume we configure a 10MB socket send queue on our UDP socket...

Select says it's writable up to at least 5MB.

We write 5MB of 64-byte packets "right now".

Did we just drop a large number of packets?

I would expect that the packets, up to 10MB, are buffered in some
list/fifo in the socket code, and that as the underlying device queue
empties itself, the socket will feed it more packets.  The device queue,
in turn, is emptied as the driver is able to fill its TxDescriptors,
and the hardware empties the TxDescriptors.

Obviously, I'm confused somewhere....

Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 20:40 ` Ben Greear @ 2003-09-11 21:07 ` David S. Miller 2003-09-11 21:29 ` Ben Greear 0 siblings, 1 reply; 38+ messages in thread From: David S. Miller @ 2003-09-11 21:07 UTC (permalink / raw) To: Ben Greear; +Cc: jgarzik, scott.feldman, netdev, ricardoz On Thu, 11 Sep 2003 13:40:44 -0700 Ben Greear <greearb@candelatech.com> wrote: > So, assume we configure a 10MB socket send queue on our UDP socket... > > Select says its writable up to at least 5MB. > > We write 5MB of 64byte packets "righ now". > > Did we just drop a large number of packets? Yes, we did _iff_ dev->tx_queue_len is less than or equal to (5MB / (64 + sizeof(udp_id_headers))). ^ permalink raw reply [flat|nested] 38+ messages in thread
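To put rough numbers on that bound, here is a throwaway calculation assuming 64-byte payloads and 28 bytes of UDP/IPv4 headers; real per-packet overhead is higher, which only makes the mismatch worse:

#include <stdio.h>

int main(void)
{
	const long burst = 5L * 1024 * 1024;	/* the 5MB written "right now" */
	const long payload = 64;		/* tinygram payload */
	const long headers = 8 + 20;		/* UDP + IPv4 headers */

	/* ~57,000 datagrams, against a queue of only 100 or 1000 packets */
	printf("%ld datagrams in the burst\n", burst / (payload + headers));
	return 0;
}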
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 21:07 ` David S. Miller @ 2003-09-11 21:29 ` Ben Greear 2003-09-11 21:29 ` David S. Miller 0 siblings, 1 reply; 38+ messages in thread From: Ben Greear @ 2003-09-11 21:29 UTC (permalink / raw) To: David S. Miller; +Cc: jgarzik, scott.feldman, netdev, ricardoz David S. Miller wrote: > On Thu, 11 Sep 2003 13:40:44 -0700 > Ben Greear <greearb@candelatech.com> wrote: > > >>So, assume we configure a 10MB socket send queue on our UDP socket... >> >>Select says its writable up to at least 5MB. >> >>We write 5MB of 64byte packets "righ now". >> >>Did we just drop a large number of packets? > > > Yes, we did _iff_ dev->tx_queue_len is less than or equal > to (5MB / (64 + sizeof(udp_id_headers))). Thanks for that clarification. Is there no way to tell at 'sendto' time that the buffers are over-full, and either block or return -EBUSY or something like that? Perhaps the poll logic should also take the underlying buffer into account and not show the socket as writable in this case? Supposing in the above example, I set tx_queue_len to (5MB / (64 + sizeof(udp_id_headers))), will the packets now be dropped in the driver instead, or will there be no more (local) drops? Ben -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 21:29 ` Ben Greear @ 2003-09-11 21:29 ` David S. Miller 2003-09-11 21:47 ` Ricardo C Gonzalez 2003-09-11 22:15 ` Ben Greear 0 siblings, 2 replies; 38+ messages in thread From: David S. Miller @ 2003-09-11 21:29 UTC (permalink / raw) To: Ben Greear; +Cc: jgarzik, scott.feldman, netdev, ricardoz On Thu, 11 Sep 2003 14:29:43 -0700 Ben Greear <greearb@candelatech.com> wrote: > Thanks for that clarification. Is there no way to tell > at 'sendto' time that the buffers are over-full, and either > block or return -EBUSY or something like that? The TX queue state can change by hundreds of packets by the time we are finished making the "decision", also how would you like to "wake" up sockets when the TX queue is liberated. That extra overhead and logic would be wonderful for performance. No, this is all nonsense. Packet scheduling and queueing is an opaque layer to all the upper layers. It is the only sensible design. IP transmit is black hole that may drop packets at any moment, any datagram application not prepared for this should be prepared for troubles or choose to move over to something like TCP. I listed even a workaround for such stupid UDP apps, simply limit their socket send queue limits. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 21:29 ` David S. Miller @ 2003-09-11 21:47 ` Ricardo C Gonzalez 2003-09-11 22:00 ` Jeff Garzik 2003-09-11 22:15 ` Ben Greear 1 sibling, 1 reply; 38+ messages in thread From: Ricardo C Gonzalez @ 2003-09-11 21:47 UTC (permalink / raw) To: David S. Miller; +Cc: greearb, jgarzik, scott.feldman, netdev >IP transmit is black hole that may drop packets at any moment, >any datagram application not prepared for this should be prepared >for troubles or choose to move over to something like TCP. As I said before, please do not make this a UDP issue. The data I sent out was taken using a TCP_STREAM test case. Please review it. regards, ---------------------------------------------------------------------------------- *** ALWAYS THINK POSITIVE *** Rick Gonzalez IBM Linux Performance Group Building: 905 Office: 7G019 Phone: (512) 838-0623 "David S. Miller" <davem@redhat.com> on 09/11/2003 04:29:06 PM To: Ben Greear <greearb@candelatech.com> cc: jgarzik@pobox.com, scott.feldman@intel.com, netdev@oss.sgi.com, Ricardo C Gonzalez/Austin/IBM@ibmus Subject: Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default On Thu, 11 Sep 2003 14:29:43 -0700 Ben Greear <greearb@candelatech.com> wrote: > Thanks for that clarification. Is there no way to tell > at 'sendto' time that the buffers are over-full, and either > block or return -EBUSY or something like that? The TX queue state can change by hundreds of packets by the time we are finished making the "decision", also how would you like to "wake" up sockets when the TX queue is liberated. That extra overhead and logic would be wonderful for performance. No, this is all nonsense. Packet scheduling and queueing is an opaque layer to all the upper layers. It is the only sensible design. IP transmit is black hole that may drop packets at any moment, any datagram application not prepared for this should be prepared for troubles or choose to move over to something like TCP. I listed even a workaround for such stupid UDP apps, simply limit their socket send queue limits. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 21:47 ` Ricardo C Gonzalez @ 2003-09-11 22:00 ` Jeff Garzik 0 siblings, 0 replies; 38+ messages in thread From: Jeff Garzik @ 2003-09-11 22:00 UTC (permalink / raw) To: Ricardo C Gonzalez; +Cc: David S. Miller, greearb, scott.feldman, netdev Ricardo C Gonzalez wrote: > > >>IP transmit is black hole that may drop packets at any moment, >>any datagram application not prepared for this should be prepared >>for troubles or choose to move over to something like TCP. > > > > As I said before, please do not make this a UDP issue. The data I sent out > was taken using a TCP_STREAM test case. Please review it. Your own words say "CPUs can fill TX queue". We already know this. CPUs have been doing wire speed for ages. Jeff ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 21:29 ` David S. Miller 2003-09-11 21:47 ` Ricardo C Gonzalez @ 2003-09-11 22:15 ` Ben Greear 2003-09-11 23:02 ` David S. Miller 1 sibling, 1 reply; 38+ messages in thread From: Ben Greear @ 2003-09-11 22:15 UTC (permalink / raw) To: David S. Miller; +Cc: jgarzik, scott.feldman, netdev, ricardoz David S. Miller wrote: > On Thu, 11 Sep 2003 14:29:43 -0700 > Ben Greear <greearb@candelatech.com> wrote: > > >>Thanks for that clarification. Is there no way to tell >>at 'sendto' time that the buffers are over-full, and either >>block or return -EBUSY or something like that? > > > The TX queue state can change by hundreds of packets by > the time we are finished making the "decision", also how would > you like to "wake" up sockets when the TX queue is liberated. So, at some point the decision is already made that we must drop the packet, or that we can enqueue it. This is where I would propose we block the thing trying to enqueue, or at least propagate a failure code back up the stack(s) so that the packet can be retried by the calling layer. Preferably, one would propagate the error all the way to userspace and let them deal with it, just like we currently deal with socket queue full issues. > That extra overhead and logic would be wonderful for performance. The cost of a retransmit is also expensive, whether it is some hacked up UDP protocol or for TCP. Even if one had to implement callbacks from the device queue to the interested sockets, this should not be a large performance hit. > > No, this is all nonsense. Packet scheduling and queueing is > an opaque layer to all the upper layers. It is the only sensible > design. This is possible, but it does not seem cut and dried to me. If there is any documentation or research that support this assertion, please do let us know. > > IP transmit is black hole that may drop packets at any moment, > any datagram application not prepared for this should be prepared > for troubles or choose to move over to something like TCP. > > I listed even a workaround for such stupid UDP apps, simply limit > their socket send queue limits. And the original poster shows how a similar problem slows down TCP as well due to local dropped packets. Don't you think we'd get better TCP throughput if we instead had the calling code wait 1us for the buffers to clear? -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 22:15 ` Ben Greear @ 2003-09-11 23:02 ` David S. Miller 2003-09-11 23:22 ` Ben Greear 0 siblings, 1 reply; 38+ messages in thread From: David S. Miller @ 2003-09-11 23:02 UTC (permalink / raw) To: Ben Greear; +Cc: jgarzik, scott.feldman, netdev, ricardoz On Thu, 11 Sep 2003 15:15:19 -0700 Ben Greear <greearb@candelatech.com> wrote: > And the original poster shows how a similar problem slows down TCP > as well due to local dropped packets. So, again, dampen the per-socket send queue sizes. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 23:02 ` David S. Miller @ 2003-09-11 23:22 ` Ben Greear 2003-09-11 23:29 ` David S. Miller 2003-09-12 1:34 ` jamal 0 siblings, 2 replies; 38+ messages in thread From: Ben Greear @ 2003-09-11 23:22 UTC (permalink / raw) Cc: jgarzik, scott.feldman, netdev, ricardoz David S. Miller wrote: > On Thu, 11 Sep 2003 15:15:19 -0700 > Ben Greear <greearb@candelatech.com> wrote: > > >>And the original poster shows how a similar problem slows down TCP >>as well due to local dropped packets. > > > So, again, dampen the per-socket send queue sizes. That's just a band-aid to cover up the flaw with the lack of queue-pressure feedback to the higher stacks, as would be increasing the TxDescriptors for that matter. -- Ben Greear <greearb@candelatech.com> Candela Technologies Inc http://www.candelatech.com ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-11 23:22 ` Ben Greear @ 2003-09-11 23:29 ` David S. Miller 2003-09-12 1:34 ` jamal 1 sibling, 0 replies; 38+ messages in thread From: David S. Miller @ 2003-09-11 23:29 UTC (permalink / raw) To: Ben Greear; +Cc: jgarzik, scott.feldman, netdev, ricardoz On Thu, 11 Sep 2003 16:22:35 -0700 Ben Greear <greearb@candelatech.com> wrote: > David S. Miller wrote: > > So, again, dampen the per-socket send queue sizes. > > That's just a band-aid to cover up the flaw with the lack > of queue-pressure feedback to the higher stacks, as would be increasing the > TxDescriptors for that matter. The whole point of the various packet scheduler algorithms are foregone if we're just going to queue up and send the crap again. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default
  2003-09-11 23:22                     ` Ben Greear
  2003-09-11 23:29                       ` David S. Miller
@ 2003-09-12 1:34                       ` jamal
  2003-09-12 2:20                         ` Ricardo C Gonzalez
  2003-09-13 3:49                         ` David S. Miller
  1 sibling, 2 replies; 38+ messages in thread
From: jamal @ 2003-09-12 1:34 UTC (permalink / raw)
  To: Ben Greear; +Cc: jgarzik, scott.feldman, netdev, ricardoz

Scott,

Don't increase the tx descriptor ring size - that would truly be wasting
memory; 256 is pretty adequate.
* Increase instead the txqueuelen (as suggested by Davem); user space
tools like ip or ifconfig could do it. The standard size has been around
100 for 100Mbps; I suppose it is fair to say that GigE can move data out
at 10x that; so set it to 1000. Maybe you can do this from the driver
based on what negotiated speed is detected?

--------
[root@jzny root]# ip link ls eth0
4: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:b0:d0:05:ae:81 brd ff:ff:ff:ff:ff:ff
[root@jzny root]# ip link set eth0 txqueuelen 1000
[root@jzny root]# ip link ls eth0
4: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:b0:d0:05:ae:81 brd ff:ff:ff:ff:ff:ff
-------

TCP already reacts to packets dropped at the scheduler level. UDP would
be too hard to enforce since the logic is typically in an app above UDP,
so just control it via the socket queue size.

cheers,
jamal

On Thu, 2003-09-11 at 19:22, Ben Greear wrote:
> David S. Miller wrote:
> > On Thu, 11 Sep 2003 15:15:19 -0700
> > Ben Greear <greearb@candelatech.com> wrote:
> >
> >
> >>And the original poster shows how a similar problem slows down TCP
> >>as well due to local dropped packets.
> >
> >
> > So, again, dampen the per-socket send queue sizes.
>
> That's just a band-aid to cover up the flaw with the lack
> of queue-pressure feedback to the higher stacks, as would be increasing the
> TxDescriptors for that matter.

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default
  2003-09-12 1:34                       ` jamal
@ 2003-09-12 2:20                         ` Ricardo C Gonzalez
  2003-09-12 3:05                           ` jamal
  1 sibling, 1 reply; 38+ messages in thread
From: Ricardo C Gonzalez @ 2003-09-12 2:20 UTC (permalink / raw)
  To: hadi; +Cc: greearb, jgarzik, scott.feldman, netdev

Jamal wrote:

>* Increase instead the txqueuelen (as suggested by Davem); user space
>tools like ip or ifconfig could do it. The standard size has been around
>100 for 100Mbps; I suppose it is fair to say that GigE can move data out
>at 10x that; so set it to 1000. Maybe you can do this from the driver
>based on what negotiated speed is detected?

This is also another way to do it, as long as we make it harder for users
to drop packets and get up to date with Gigabit speeds. We would also have
to think about the upcoming 10 GigE adapters and their queue sizes, but
that is a separate issue. Anyway, the driver can easily set the txqueuelen
to 1000.

We should care about counting the packets being dropped on the transmit
side. Would it be the responsibility of the driver to account for these
drops? Because each driver has a dedicated software queue, in my opinion
the driver should account for these packets.

regards,

----------------------------------------------------------------------------------
*** ALWAYS THINK POSITIVE ***

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default
  2003-09-12 2:20                         ` Ricardo C Gonzalez
@ 2003-09-12 3:05                           ` jamal
  0 siblings, 0 replies; 38+ messages in thread
From: jamal @ 2003-09-12 3:05 UTC (permalink / raw)
  To: Ricardo C Gonzalez; +Cc: greearb, jgarzik, scott.feldman, netdev

On Thu, 2003-09-11 at 22:20, Ricardo C Gonzalez wrote:
> Jamal wrote:
> We should care about counting the packets being dropped on the transmit
> side. Would it be the responsibility of the driver to account for these
> drops? Because each driver has a dedicated software queue, in my opinion
> the driver should account for these packets.

This is really the scheduler's responsibility. It's hard for the driver
to keep track of why a packet was dropped. For example, a packet could be
dropped to make room for a higher priority packet that's anticipated to
show up soon.

The simple default 3-band scheduler unfortunately doesn't quite show its
stats ... so a simple way to see drops is:

- install the prio qdisc

------
[root@jzny root]# tc qdisc add dev eth0 root prio
[root@jzny root]# tc -s qdisc
qdisc prio 8001: dev eth0 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 42 bytes 1 pkts (dropped 0, overlimits 0)
-----

or you may want to install a single pfifo queue with a size of 1000
(although this is a little too medieval), example:

#tc qdisc add dev eth0 root pfifo limit 1000
#tc -s qdisc
qdisc pfifo 8002: dev eth0 limit 1000p
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)

etc

cheers,
jamal

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-12 1:34 ` jamal 2003-09-12 2:20 ` Ricardo C Gonzalez @ 2003-09-13 3:49 ` David S. Miller 2003-09-13 11:52 ` Robert Olsson 2003-09-14 19:08 ` Ricardo C Gonzalez 1 sibling, 2 replies; 38+ messages in thread From: David S. Miller @ 2003-09-13 3:49 UTC (permalink / raw) To: hadi; +Cc: greearb, jgarzik, scott.feldman, netdev, ricardoz On 11 Sep 2003 21:34:23 -0400 jamal <hadi@cyberus.ca> wrote: > dont increase the tx descriptor ring size - that would truly wasting > memory; 256 is pretty adequate. > * increase instead the txquelen (as suggested by Davem); user space > tools like ip or ifconfig could do it. The standard size has been around > 100 for 100Mbps; i suppose it is fair to say that Gige can move data out > at 10x that; so set it to 1000. Maybe you can do this from the driver > based on what negotiated speed is detected? I spoke with Alexey once about this, actually tx_queue_len can be arbitrarily large but it should be reasonable nonetheless. Our preliminary conclusions were that values of 1000 for 100Mbit and faster were probably appropriate. Maybe something larger for 1Gbit, who knows. We also determined that the only connection between TX descriptor ring size and dev->tx_queue_len was that the latter should be large enough to handle, at a minimum, the amount of pending TX descriptor ACKs that can be pending considering mitigation et al. So if TX irq mitigation can defer up to N TX descriptor completions then dev->tx_queue_len must be at least that large. Back to the main topic, maybe we should set dev->tx_queue_len to 1000 by default for all ethernet devices. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default
  2003-09-13 3:49                         ` David S. Miller
@ 2003-09-13 11:52                           ` Robert Olsson
  2003-09-15 12:12                             ` jamal
  0 siblings, 1 reply; 38+ messages in thread
From: Robert Olsson @ 2003-09-13 11:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: hadi, greearb, jgarzik, scott.feldman, netdev, ricardoz

David S. Miller writes:

> On 11 Sep 2003 21:34:23 -0400
> jamal <hadi@cyberus.ca> wrote:
>
> > dont increase the tx descriptor ring size - that would truly wasting
> > memory; 256 is pretty adequate.
> > * increase instead the txquelen (as suggested by Davem); user space
> > tools like ip or ifconfig could do it. The standard size has been around
> > 100 for 100Mbps; i suppose it is fair to say that Gige can move data out
> > at 10x that; so set it to 1000. Maybe you can do this from the driver
> > based on what negotiated speed is detected?
>
> I spoke with Alexey once about this, actually tx_queue_len can
> be arbitrarily large but it should be reasonable nonetheless.
>
> Our preliminary conclusions were that values of 1000 for 100Mbit and
> faster were probably appropriate. Maybe something larger for 1Gbit,
> who knows.
>
> We also determined that the only connection between TX descriptor
> ring size and dev->tx_queue_len was that the latter should be large
> enough to handle, at a minimum, the amount of pending TX descriptor
> ACKs that can be pending considering mitigation et al.
>
> So if TX irq mitigation can defer up to N TX descriptor completions
> then dev->tx_queue_len must be at least that large.
>
> Back to the main topic, maybe we should set dev->tx_queue_len to
> 1000 by default for all ethernet devices.

Hello!

Yes, that sounds like an adequate setting for GigE. This is what we use
for production and lab, but rather than increasing dev->tx_queue_len to
1000 we replace the pfifo_fast with the pfifo qdisc, setting a qlen of
1000. And with this we have a tx_descriptor_ring_size of 256, which is
tuned to the NIC's "TX service interval" with respect to interrupt
mitigation etc. This seems good enough even for small packets.

For routers this setting is even more crucial as we need to serialize
several flows and we know the flows are bursty.

Cheers.

--ro

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-13 11:52 ` Robert Olsson @ 2003-09-15 12:12 ` jamal 2003-09-15 13:45 ` Robert Olsson 0 siblings, 1 reply; 38+ messages in thread From: jamal @ 2003-09-15 12:12 UTC (permalink / raw) To: Robert Olsson Cc: David S. Miller, greearb, jgarzik, scott.feldman, netdev, ricardoz On Sat, 2003-09-13 at 07:52, Robert Olsson wrote: > > > > I spoke with Alexey once about this, actually tx_queue_len can > > be arbitrarily large but it should be reasonable nonetheless. > > > > Our preliminary conclusions were that values of 1000 for 100Mbit and > > faster were probably appropriate. Maybe something larger for 1Gbit, > > who knows. If you recall we saw that even for the gent who was trying to do 100K TCP sockets on a 4 way SMP, 1000 was sufficient and no packets were dropped. > > > > We also determined that the only connection between TX descriptor > > ring size and dev->tx_queue_len was that the latter should be large > > enough to handle, at a minimum, the amount of pending TX descriptor > > ACKs that can be pending considering mitigation et al. > > > > So if TX irq mitigation can defer up to N TX descriptor completions > > then dev->tx_queue_len must be at least that large. > > > > Back to the main topic, maybe we should set dev->tx_queue_len to > > 1000 by default for all ethernet devices. > > Hello! > > Yes sounds like adequate setting for GIGE. This is what use for production > and lab but rather than increasing dev->tx_queue_len to 1000 we replace the > pfifo_fast with the pfifo qdisc w. setting a qlen of 1000. > I think this may not be good for the reason of QoS. You want BGP packets to be given priority over ftp. A single queue kills that. The current default 3 band queue is good enough, the only challenge being noone sees stats for it. I have a patch for the kernel at: http://www.cyberus.ca/~hadi/patches/restore.pfifo.kernel and for tc at: http://www.cyberus.ca/~hadi/patches/restore.pfifo.tc cheers, jamal ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-15 12:12 ` jamal @ 2003-09-15 13:45 ` Robert Olsson 2003-09-15 23:15 ` David S. Miller 0 siblings, 1 reply; 38+ messages in thread From: Robert Olsson @ 2003-09-15 13:45 UTC (permalink / raw) To: hadi Cc: Robert Olsson, David S. Miller, greearb, jgarzik, scott.feldman, netdev, ricardoz jamal writes: > I think this may not be good for the reason of QoS. You want BGP packets > to be given priority over ftp. A single queue kills that. Well so far single queue has been robust enough for BGP-sessions. Talking from own experiences... > The current default 3 band queue is good enough, the only challenge > being noone sees stats for it. I have a patch for the kernel at: > http://www.cyberus.ca/~hadi/patches/restore.pfifo.kernel > and for tc at: > http://www.cyberus.ca/~hadi/patches/restore.pfifo.tc Yes. I've missed this. Our lazy work-around for the missing stats is to install pfifo qdisc as said. IMO it should be included. Cheers. --ro ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-15 13:45 ` Robert Olsson @ 2003-09-15 23:15 ` David S. Miller 2003-09-16 9:28 ` Robert Olsson 0 siblings, 1 reply; 38+ messages in thread From: David S. Miller @ 2003-09-15 23:15 UTC (permalink / raw) To: Robert Olsson Cc: hadi, Robert.Olsson, greearb, jgarzik, scott.feldman, netdev, ricardoz On Mon, 15 Sep 2003 15:45:42 +0200 Robert Olsson <Robert.Olsson@data.slu.se> wrote: > > The current default 3 band queue is good enough, the only challenge > > being noone sees stats for it. I have a patch for the kernel at: > > http://www.cyberus.ca/~hadi/patches/restore.pfifo.kernel > > and for tc at: > > http://www.cyberus.ca/~hadi/patches/restore.pfifo.tc > > Yes. > I've missed this. Our lazy work-around for the missing stats is to install > pfifo qdisc as said. IMO it should be included. I've included Jamal's pfifo_fast statistic patch, and the change to increase ethernet's tx_queue_len to 1000 in all of my trees. Thanks. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-15 23:15 ` David S. Miller @ 2003-09-16 9:28 ` Robert Olsson 0 siblings, 0 replies; 38+ messages in thread From: Robert Olsson @ 2003-09-16 9:28 UTC (permalink / raw) To: David S. Miller Cc: kuznet, Robert Olsson, hadi, greearb, jgarzik, scott.feldman, netdev, ricardoz David S. Miller writes: > > > http://www.cyberus.ca/~hadi/patches/restore.pfifo.kernel > > > and for tc at: > > > http://www.cyberus.ca/~hadi/patches/restore.pfifo.tc > > I've included Jamal's pfifo_fast statistic patch, and the > change to increase ethernet's tx_queue_len to 1000 in all > of my trees. Thanks. We ask Alexey to include the tc part too. Cheers. --ro ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-13 3:49 ` David S. Miller 2003-09-13 11:52 ` Robert Olsson @ 2003-09-14 19:08 ` Ricardo C Gonzalez 2003-09-15 2:50 ` David Brownell 2004-05-15 12:14 ` TxDescriptors -> 1024 default. Please not for every NIC! Marc Herbert 1 sibling, 2 replies; 38+ messages in thread From: Ricardo C Gonzalez @ 2003-09-14 19:08 UTC (permalink / raw) To: David S. Miller; +Cc: hadi, greearb, jgarzik, scott.feldman, netdev David Miller wrote: >Back to the main topic, maybe we should set dev->tx_queue_len to >1000 by default for all ethernet devices. I definately agree with setting the dev->tx_queue_len to 1000 as a default for all ethernet adapters. All adapters will benefit from this change. regards, ---------------------------------------------------------------------------------- *** ALWAYS THINK POSITIVE *** ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-14 19:08 ` Ricardo C Gonzalez @ 2003-09-15 2:50 ` David Brownell 2003-09-15 8:17 ` David S. Miller 2004-05-15 12:14 ` TxDescriptors -> 1024 default. Please not for every NIC! Marc Herbert 1 sibling, 1 reply; 38+ messages in thread From: David Brownell @ 2003-09-15 2:50 UTC (permalink / raw) To: Ricardo C Gonzalez, David S. Miller Cc: hadi, greearb, jgarzik, scott.feldman, netdev Ricardo C Gonzalez wrote: > > David Miller wrote: > > >>Back to the main topic, maybe we should set dev->tx_queue_len to >>1000 by default for all ethernet devices. > > > > I definately agree with setting the dev->tx_queue_len to 1000 as a default > for all ethernet adapters. All adapters will benefit from this change. Except ones where CONFIG_EMBEDDED, maybe? Not everyone wants to spend that much memory, even when it's available... ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [e1000 2.6 10/11] TxDescriptors -> 1024 default 2003-09-15 2:50 ` David Brownell @ 2003-09-15 8:17 ` David S. Miller 0 siblings, 0 replies; 38+ messages in thread From: David S. Miller @ 2003-09-15 8:17 UTC (permalink / raw) To: David Brownell; +Cc: ricardoz, hadi, greearb, jgarzik, scott.feldman, netdev On Sun, 14 Sep 2003 19:50:56 -0700 David Brownell <david-b@pacbell.net> wrote: > Except ones where CONFIG_EMBEDDED, maybe? Not everyone wants > to spend that much memory, even when it's available... Dropping the packet between the network stack and the driver does waste memory for _LONGER_ periods of time. When we drop, TCP still hangs onto the buffer, and we'll send it again and again until it makes it and we get an ACK back or the connection completely times out. ^ permalink raw reply [flat|nested] 38+ messages in thread
* TxDescriptors -> 1024 default. Please not for every NIC! 2003-09-14 19:08 ` Ricardo C Gonzalez 2003-09-15 2:50 ` David Brownell @ 2004-05-15 12:14 ` Marc Herbert 2004-05-19 9:30 ` Marc Herbert 1 sibling, 1 reply; 38+ messages in thread From: Marc Herbert @ 2004-05-15 12:14 UTC (permalink / raw) To: netdev On Sun, 14 Sep 2003, Ricardo C Gonzalez wrote: > David Miller wrote: > > >Back to the main topic, maybe we should set dev->tx_queue_len to > >1000 by default for all ethernet devices. > > > I definately agree with setting the dev->tx_queue_len to 1000 as a default > for all ethernet adapters. All adapters will benefit from this change. > <http://oss.sgi.com/projects/netdev/archive/2003-09/threads.html#00247> Sorry to exhume this discussion but I only recently discovered this change, the hard way. I carefully read this old thread and did not grasp _every_ detail, but there is one thing that I am sure of: 1000 packets @ 1 Gb/s looks good, but on the other hand, 1000 full-size Ethernet packets @ 10 Mb/s are about 1.2 seconds long! Too little buffering means not enough dampering effect, which is very important for performance in asynchronous systems, granted. However, _too much_ buffering means too big and too variable latencies. When discussing buffers, duration is very often more important than size. Applications, TCP's dynamic (and kernel dynamics too?) do not care much about buffer sizes, they more often care about latencies (and throughput, of course). Buffers sizes is often "just a small matter of implementation" :-) For instance people designing routers talk about buffers in _milliseconds_ much more often than in _bytes_ (despite the fact that their memories cost more than in hosts, considering the throughputs involved). 100 packets @ 100 Mb/s was 12 ms. 1000 packets @ 1 Gb/s is still 12 ms. 12 ms is great. It's a "good" latency because it is the order of magnitude of real-world constants like: comfortable interactive applications, operating system sheduler granularity or propagation time in 2000 km of cable. But 1000 packets @ 100 Mb/s is 120 ms and is neither very good nor very useful anymore. 1000 packets @ 10 Mb/s is 1.2 s, which is ridiculous. It does mean that, when joe user is uploading some big file through his cheap Ethernet card, and that there are no other bottleneck/drops further in the network, every concurrent application will have to wait 1.2 s before accessing the network! It this hard to believe for you, just make the test yourself, it's very easy: force one of you NICs to 10Mb/s full duplex, txqueuelen 1000 and send a continuous flow to a nearby machine. Then try to ping anything. Imagine now that some packet is lost for whatever reason on some _other_ TCP connection going through this terrible 1.2 s queue. Then you need one SACK/RTX extra round trip time to recover from it: so it's now _2.4 s_ to deliver the data sent just after the dropped packet... Assuming of course TCP timers do not become confused by this huge latency and probably huge jitter. And I don't think you want to make fiddling with "tc" mandatory for joe user. Or tell him: "oh, please just 'ifconfig txqueuelen 10', or buy a new Ethernet card". I am unfortunately not familiar with this part of the linux kernel, but I really think that, if possible, txqueuelen should be initialized at some "constant 12 ms" and not at the "1000 packets" highly variable latency setting. 
I can imagine there are some corner cases, like for instance when some GEth NIC is hot-plugged into a 100 Mb/s, or jumbo frames, but hey, those are corner cases : as a first step, even a simple constant-per-model txqueuelen initialization would be already great. Cheers, Marc. PS: one workaround for joe user against this 1.2s latency would be to keep his SND_BUF and number of sockets small. But this is poor. -- "Je n'ai fait cette lettre-ci plus longue que parce que je n'ai pas eu le loisir de la faire plus courte." -- Blaise Pascal ^ permalink raw reply [flat|nested] 38+ messages in thread
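The figures above follow directly from queue depth times per-packet serialization time; a throwaway program to reproduce them, assuming full-size 1500-byte frames and ignoring framing overhead (real numbers are slightly worse):

#include <stdio.h>

/* Worst-case drain time of a full tx queue: qlen packets of 1500 bytes
 * serialized at the given link rate. */
static double queue_ms(int qlen, double mbit_per_s)
{
	const double pkt_bits = 1500.0 * 8.0;

	return qlen * pkt_bits / (mbit_per_s * 1e6) * 1000.0;
}

int main(void)
{
	printf("100 pkts  @ 100 Mb/s: %6.0f ms\n", queue_ms(100, 100));   /*   12 ms */
	printf("1000 pkts @ 1 Gb/s:   %6.0f ms\n", queue_ms(1000, 1000)); /*   12 ms */
	printf("1000 pkts @ 100 Mb/s: %6.0f ms\n", queue_ms(1000, 100));  /*  120 ms */
	printf("1000 pkts @ 10 Mb/s:  %6.0f ms\n", queue_ms(1000, 10));   /* 1200 ms */
	return 0;
}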
* Re: TxDescriptors -> 1024 default. Please not for every NIC! 2004-05-15 12:14 ` TxDescriptors -> 1024 default. Please not for every NIC! Marc Herbert @ 2004-05-19 9:30 ` Marc Herbert 2004-05-19 10:27 ` Pekka Pietikainen 2004-05-19 11:54 ` Andi Kleen 0 siblings, 2 replies; 38+ messages in thread From: Marc Herbert @ 2004-05-19 9:30 UTC (permalink / raw) To: netdev On Sat, 15 May 2004, Marc Herbert wrote: > <http://oss.sgi.com/projects/netdev/archive/2003-09/threads.html#00247> > > Sorry to exhume this discussion but I only recently discovered this > change, the hard way. > > I am unfortunately not familiar with this part of the linux kernel, > but I really think that, if possible, txqueuelen should be initialized > at some "constant 12 ms" and not at the "1000 packets" highly variable > latency setting. I can imagine there are some corner cases, like for > instance when some GEth NIC is hot-plugged into a 100 Mb/s, or jumbo > frames, but hey, those are corner cases : as a first step, even a > simple constant-per-model txqueuelen initialization would be already > great. After some further study, I was glad to discover my suggestion above both easy and short to implement. See patch below. Trying to sum-it up: - Ricardo asks (among others) for a new 1000 packets default txqueuelen for Intel's e1000, based on some data (couldn't not find this data, please send me the pointer if you have it, thanks). - Me argues that we all lived happy for ages with this default setting of 100 packets @ 100 Mb/s (and lived approximately happy @ 10 Mb/s), but we'll soon see doom and gloom with this new and brutal change to 1000 packets for all this _legacy_ 10-100 Mb/s hardware. e1000 data only is not enough to justify this radical shift. If you are convinced by _both_ items above, then the patch below content _both_, and we're done. If you are not, then... wait for further discussion, including answers to latest Ricardo's post. PS: several people seem to think TCP "drops" packets when the qdisc is full. My analysis of the code _and_ my experiments makes me think they are wrong: TCP rather "blocks" when the qdisc is full. See explanation here: <http://oss.sgi.com/archives/netdev/2004-05/msg00151.html> (Subject: Re: TcpOutSegs way too optimistic (netstat -s)) ===== drivers/net/net_init.c 1.11 vs edited ===== --- 1.11/drivers/net/net_init.c Tue Sep 16 01:12:25 2003 +++ edited/drivers/net/net_init.c Wed May 19 11:05:34 2004 @@ -420,7 +420,10 @@ dev->hard_header_len = ETH_HLEN; dev->mtu = 1500; /* eth_mtu */ dev->addr_len = ETH_ALEN; - dev->tx_queue_len = 1000; /* Ethernet wants good queues */ + dev->tx_queue_len = 100; /* This is a sensible generic default for + 100 Mb/s: about 12ms with 1500 full size packets. + Drivers should tune this depending on interface + specificities and settings */ memset(dev->broadcast,0xFF, ETH_ALEN); ===== drivers/net/e1000/e1000_main.c 1.56 vs edited ===== --- 1.56/drivers/net/e1000/e1000_main.c Tue Feb 3 01:43:42 2004 +++ edited/drivers/net/e1000/e1000_main.c Wed May 19 03:14:32 2004 @@ -400,6 +400,8 @@ err = -ENOMEM; goto err_alloc_etherdev; } + + netdev->tx_queue_len = 1000; SET_MODULE_OWNER(netdev); ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: TxDescriptors -> 1024 default. Please not for every NIC! 2004-05-19 9:30 ` Marc Herbert @ 2004-05-19 10:27 ` Pekka Pietikainen 2004-05-20 14:11 ` Luis R. Rodriguez 2004-05-19 11:54 ` Andi Kleen 1 sibling, 1 reply; 38+ messages in thread From: Pekka Pietikainen @ 2004-05-19 10:27 UTC (permalink / raw) To: Marc Herbert; +Cc: netdev, prism54-devel On Wed, May 19, 2004 at 11:30:28AM +0200, Marc Herbert wrote: > - Me argues that we all lived happy for ages with this default > setting of 100 packets @ 100 Mb/s (and lived approximately happy @ > 10 Mb/s), but we'll soon see doom and gloom with this new and > brutal change to 1000 packets for all this _legacy_ 10-100 Mb/s > hardware. e1000 data only is not enough to justify this radical > shift. > > If you are convinced by _both_ items above, then the patch below > content _both_, and we're done. > > If you are not, then... wait for further discussion, including answers > to latest Ricardo's post. Not to mention that not all modern hardware is gigabit, current 2.6 seems to be setting txqueuelen of 1000 for 802.11 devices too (at least my prism54), which might be causing major problems for me. Well, I'm still trying to figure out whether it's txqueue or WEP that causes all traffic to stop (with rx invalid crypt packets showing up in iwconfig afterwards, AP is a linksys wrt54g in case it makes a difference) every now and then until a ifdown / ifup. Tried both vanilla 2.6 prism54 and CVS (which seems to have a reset on tx timeout thing added), but if txqueue is 1000 that won't easily get triggered will it? It's been running for a few days just fine with txqueue = 100 and no WEP, if it stays like that i'll start tweaking to find what exactly triggers it. ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Re: TxDescriptors -> 1024 default. Please not for every NIC!
  2004-05-19 10:27 ` Pekka Pietikainen
@ 2004-05-20 14:11   ` Luis R. Rodriguez
  2004-05-20 16:38     ` [Prism54-devel] " Jean Tourrilhes
  0 siblings, 1 reply; 38+ messages in thread
From: Luis R. Rodriguez @ 2004-05-20 14:11 UTC (permalink / raw)
  To: Pekka Pietikainen; +Cc: Marc Herbert, netdev, prism54-devel, Jean Tourrilhes

[-- Attachment #1: Type: text/plain, Size: 1239 bytes --]

On Wed, May 19, 2004 at 01:27:00PM +0300, Pekka Pietikainen wrote:
> On Wed, May 19, 2004 at 11:30:28AM +0200, Marc Herbert wrote:
> > - Me argues that we all lived happy for ages with this default
> > setting of 100 packets @ 100 Mb/s (and lived approximately happy @
> > 10 Mb/s), but we'll soon see doom and gloom with this new and
> > brutal change to 1000 packets for all this _legacy_ 10-100 Mb/s
> > hardware. e1000 data only is not enough to justify this radical
> > shift.
> >
> > If you are convinced by _both_ items above, then the patch below
> > content _both_, and we're done.
> >
> > If you are not, then... wait for further discussion, including answers
> > to latest Ricardo's post.
>
> Not to mention that not all modern hardware is gigabit, current
> 2.6 seems to be setting txqueuelen of 1000 for 802.11 devices too (at least
> my prism54), which might be causing major problems for me.

Considering 802.11b's peak is at 11Mbit and standard 802.11g is at 54Mbit
(some manufacturers are using two channels and getting 108Mbit now) I'd
think we should stick at 100, as the patch proposes. Jean?

	Luis

--
GnuPG Key fingerprint = 113F B290 C6D2 0251 4D84 A34A 6ADD 4937 E20A 525E

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Prism54-devel] Re: TxDescriptors -> 1024 default. Please not for every NIC! 2004-05-20 14:11 ` Luis R. Rodriguez @ 2004-05-20 16:38 ` Jean Tourrilhes 2004-05-20 16:45 ` Tomasz Torcz 0 siblings, 1 reply; 38+ messages in thread From: Jean Tourrilhes @ 2004-05-20 16:38 UTC (permalink / raw) To: Pekka Pietikainen, Marc Herbert, netdev, prism54-devel On Thu, May 20, 2004 at 10:11:11AM -0400, Luis R. Rodriguez wrote: > On Wed, May 19, 2004 at 01:27:00PM +0300, Pekka Pietikainen wrote: > > On Wed, May 19, 2004 at 11:30:28AM +0200, Marc Herbert wrote: > > > - Me argues that we all lived happy for ages with this default > > > setting of 100?packets @?100?Mb/s (and lived approximately happy @ > > > 10 Mb/s), but we'll soon see doom and gloom with this new and > > > brutal change to 1000?packets for all this _legacy_ 10-100 Mb/s > > > hardware. e1000 data only is not enough to justify this radical > > > shift. > > > > > > If you are convinced by _both_ items above, then the patch below > > > content _both_, and we're done. > > > > > > If you are not, then... wait for further discussion, including answers > > > to latest Ricardo's post. > > > > Not to mention that not all modern hardware is gigabit, current > > 2.6 seems to be setting txqueuelen of 1000 for 802.11 devices too (at least > > my prism54), which might be causing major problems for me. > > Considering 802.11b's peak is at 11Mbit and standard 802.11g is at 54Mbit > (some manufacturers are using two channels and getting 108Mbit now) I'd > think we should stick at 100, as the patch proposes. Jean? > > Luis I never like to have huge queues of buffers. It waste memory, and degrade the latency, especially with competing sockets. In a theoritical stable system, you don't need buffers (you run everything synchronously), buffer are only needed to take care of the jitter in real networks. The real throughouput of 802.11g is more around 30Mb/s (at TCP/IP level). However, wireless networks tend to have more jitter (interference and contention). But, wireless cards tend to have a fair number of buffers in the hardware. I personally would stick with 100. The IrDA stack runs perfectly fine with 15 buffers at 4 Mb/s. If 100 is not enough, I think the problem is not the number of buffers, but somewhere else. For example, we might want to think about explicit socket callbacks (like I did in IrDA). But that's only personal opinions ;-) Have fun... Jean ^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [Prism54-devel] Re: TxDescriptors -> 1024 default. Please not for every NIC! 2004-05-20 16:38 ` [Prism54-devel] " Jean Tourrilhes @ 2004-05-20 16:45 ` Tomasz Torcz 2004-05-20 17:13 ` zero copy TX in benchmarks was " Andi Kleen 0 siblings, 1 reply; 38+ messages in thread From: Tomasz Torcz @ 2004-05-20 16:45 UTC (permalink / raw) To: netdev On Thu, May 20, 2004 at 09:38:11AM -0700, Jean Tourrilhes wrote: > I personally would stick with 100. The IrDA stack runs > perfectly fine with 15 buffers at 4 Mb/s. If 100 is not enough, I > think the problem is not the number of buffers, but somewhere else. I don't know how much trollish or true is that comment: http://bsd.slashdot.org/comments.pl?sid=106258&cid=9049422 but it suggest, that Linux' stack having no BSD like mbuf functionality, is not perfect for fast transmission. Maybe some network guru cna comment ? -- Tomasz Torcz ,,(...) today's high-end is tomorrow's embedded processor.'' zdzichu@irc.-nie.spam-.pl -- Mitchell Blank on LKML ^ permalink raw reply [flat|nested] 38+ messages in thread
* zero copy TX in benchmarks was Re: [Prism54-devel] Re: TxDescriptors -> 1024 default. Please not for every NIC! 2004-05-20 16:45 ` Tomasz Torcz @ 2004-05-20 17:13 ` Andi Kleen 0 siblings, 0 replies; 38+ messages in thread From: Andi Kleen @ 2004-05-20 17:13 UTC (permalink / raw) To: Tomasz Torcz; +Cc: netdev On Thu, May 20, 2004 at 06:45:16PM +0200, Tomasz Torcz wrote: > On Thu, May 20, 2004 at 09:38:11AM -0700, Jean Tourrilhes wrote: > > I personally would stick with 100. The IrDA stack runs > > perfectly fine with 15 buffers at 4 Mb/s. If 100 is not enough, I > > think the problem is not the number of buffers, but somewhere else. Not sure why you post this to this thread? It has nothing to do with the previous message. > > I don't know how much trollish or true is that comment: > http://bsd.slashdot.org/comments.pl?sid=106258&cid=9049422 Linux sk_buffs and BSD mbufs are not very different anymore today. The BSD mbufs have been getting more sk_buff'ish over time, and sk_buffs have grown some properties of mbufs. They both have changed to optionally pass references of memory around instead of copying always, which is what counts here. > but it suggest, that Linux' stack having no BSD like mbuf functionality, > is not perfect for fast transmission. Maybe some network guru > cna comment ? I have not read all the details, but I suppose they used sendmsg() instead of sendfile() for this test. NetBSD can use zero copy TX in this case; Linux can only with sendfile and sendmsg will copy. Obvious linux will be slower then because a copy can cost quite a lot of CPU. Or rather it is not really the CPU cost that is the problem here, but the bandwidth usage - very high speed networking i s essentially memory bandwidth limited and copying over the CPU adds additional bandwidth requirements to the memory subsystem. There was an implementation of zero copy sendmsg() for linux long ago, but it was removed because it was fundamentally incompatible with good SMP scaling, because it would require remote TLB flushes over possible many CPUs (if you search the archives of this list you will find long threads about it). It would not be very hard to readd (Linux has all the low level infrastructure needed for it), but it doesn't make sense. NetBSD may have the luxury to not care about MP scaling, but Linux doesn't. The disadvantage of sendfile is that you can only transmit files directly; if you want to transmit data directly out of an process' address space you have to put them into a file mmap and sendfile from there. This may be a bit inconvenient if the basic unit of data in your program isn't files. There was an plan suggested to fix that (implement zero copy TX for POSIX AIO instead of BSD sockets), which would not have this problem. POSIX AIO has all the infrastructure to do zero copy IO without problematic and slow TLB flushes. Just so far nobody implemented that. In practice it is not a too big issue because many tuned servers (your typical ftpd, httpd or samba server) use sendfile already. -Andi ^ permalink raw reply [flat|nested] 38+ messages in thread
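For readers who have not used it, the sendfile() transmit path mentioned above looks roughly like this from user space. A minimal sketch with trimmed error handling, where `sock` is assumed to be an already-connected TCP socket; this is an illustration, not code from any of the servers mentioned:

#include <sys/sendfile.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Transmit a whole file over an already-connected socket without copying
 * the data through user space; pages go from the page cache to the NIC
 * (given hardware that can do scatter-gather and checksum offload). */
static int send_whole_file(int sock, const char *path)
{
	struct stat st;
	off_t off = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	while (off < st.st_size) {
		ssize_t n = sendfile(sock, fd, &off, st.st_size - off);
		if (n <= 0)
			break;	/* error or peer went away */
	}
	close(fd);
	return off == st.st_size ? 0 : -1;
}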
* Re: TxDescriptors -> 1024 default. Please not for every NIC! 2004-05-19 9:30 ` Marc Herbert 2004-05-19 10:27 ` Pekka Pietikainen @ 2004-05-19 11:54 ` Andi Kleen 1 sibling, 0 replies; 38+ messages in thread From: Andi Kleen @ 2004-05-19 11:54 UTC (permalink / raw) To: Marc Herbert; +Cc: netdev Marc Herbert <marc.herbert@free.fr> writes: > > PS: several people seem to think TCP "drops" packets when the qdisc is > full. My analysis of the code _and_ my experiments makes me think they > are wrong: TCP rather "blocks" when the qdisc is full. See explanation > here: <http://oss.sgi.com/archives/netdev/2004-05/msg00151.html> > (Subject: Re: TcpOutSegs way too optimistic (netstat -s)) This behaviour was only added relatively recently (in late 2.3.x timeframe) I believe all the default queue lengths tunings were done before that. So it would probably make sense to reevaluate/rebenchmark the default queue lengths for various devices with the newer code. -Andi ^ permalink raw reply [flat|nested] 38+ messages in thread
[parent not found: <C925F8B43D79CC49ACD0601FB68FF50CDB13D3@orsmsx408>]
* RE: TxDescriptors -> 1024 default. Please not for every NIC! [not found] <C925F8B43D79CC49ACD0601FB68FF50CDB13D3@orsmsx408> @ 2004-06-02 19:14 ` Marc Herbert 2004-06-02 19:49 ` Cheng Jin 0 siblings, 1 reply; 38+ messages in thread From: Marc Herbert @ 2004-06-02 19:14 UTC (permalink / raw) To: Brandeburg, Jesse; +Cc: netdev On Wed, 26 May 2004, Brandeburg, Jesse wrote: > I'm not sure that you could actually get the problem to occur on 100 > or 10Mb/s hardware however because of TCP window size limitation and > such. What I'm getting at is that even if you have a device that can > queue lots of packets, it probably won't unless you're using an > unreliable protocol like UDP. Theoretically it's a problem, but I'm > not convinced that in real world scenarios it actually creates any > issues. Do you have a test that demonstrates the problem? OK, let's go for it. The following message details a simple experiment demonstrating how a too big (1000-packet) txqueuelen creates a dreaded latency at the IP level inside the sender for under-gigabit Ethernet interfaces, and so finally advocates a default 1000-packet txqueuelen defined _only_ by/for the _GigE_ drivers, leaving the previous (before sep 2003 in 2.4) 100-packet default untouched for slower interfaces. The experiment should be very quick and easy to reproduce in your lab, even in your home. Sorry, this message is way too long because it's... detailed, and tries to anticipate questions, hopefully avoiding the need to come back on this issue. The counter-part is that it's not dense and thus hopefully quick to read for anyone in the field. And please pardon my english mistakes. Detailed Experiment ------------------- You need at least 2 hosts, but ideally 3. The sender "S", the receiver "R1", and some witness host "R2". R1 and R2 can probably be collapsed together if you don't have enough hosts but I am afraid of unknown nasty side effects in this case. Host S is using a simple TCP connection to upload an infinite file to host R. The bottleneck is S's own 100Mb/s (or worst, 10Mb/s) network interface. This is very important: no packet drop must occur elsewhere between S and R, else TCP congestion avoidance algorithm will interpret this as a congestion sign and throttle down, and the txqueue will stay empty. If your TCP connection is under-performing your sending wire for any reason, you will obviously never fill your txqueue. The ACK-clocking property of TCP has for consequence that only the queue of the bottleneck of the path may fill up (except in very dynamic environnements where the bottleneck may be fast-changing, but let's stay simple). Actually almost everything below is still true when the bottleneck is elsewhere in some router further on the path instead of local in the sender. It's still true, just... elsewhere. For instance if you use a linux box as a router, and if it happens to be the bottleneck, I suspect this latency issue will appear more or less the same. But again, let's stay simple for the moment, forget those further routers and get back to this _local_ IP bottleneck and its too big txqueuelen. By the way, forcing your GigE interface to 100 or 10 is ok. I used iperf <http://dast.nlanr.net/Projects/Iperf/> to upload the infinite file, but any equivalent tool should do it. Since your TCP connection will suffer this artificial txqueue latency, you also need to increase SND_BUF and RCV_BUF, else the number of TCP packets sent (and thus the txqueue filling) will be capped (wait below for more about this). 
- So just run:

    host_R1 $ iperf --server --interval 1 --window 1M
    host_S  $ iperf --client R1 --time 1000 --window 1M

  Check that you get the full 94.1Mb/s (resp. 9.4Mb/s) wire rate. If
  not, investigate why and don't bother going on.

- Now watch the latency between S and some other host "R2", for
  instance using mtr:

    host_S $ mtr -l R2

  As the txqueue fills up, you will see the perceived latency increase
  every round-trip time, up to 120ms (worse with 10Mb/s: up to 1.2s!).
  When the txqueue is full, TCP detects it and reduces its congestion
  window, providing some temporary relief; then the artificial latency
  quickly ramps up again.

- You can also start another simultaneous upload to R2:

    host_R2 $ iperf -s -i 1
    host_S  $ iperf -c R2 -t 1000

  ... and watch how the artificial latency harms the start of the
  other TCP connection, which needs ages to ramp up its throughput.

I also learned from "A Map of the Networking Code in Linux Kernel
2.4.20", Technical Report DataTAG-2004-1, section 4.4
<http://datatag.web.cern.ch/datatag/publications.html> that you can get
interesting qdisc stats using commands such as:

    # tc qdisc add dev eth1 root pfifo limit 100
    # tc -s -d qdisc show dev eth1

but I did not try them. Warning: the interface's tx_ring size has to be
added to the qdisc's txqueuelen to get the total sender-side queue
length perceived by TCP, and some drivers set that big as well.

Solution
--------

Now reduce the txqueuelen to its previous value:

    ifconfig eth1 txqueuelen 100

For 10Mb/s you can even try:

    ifconfig eth1 txqueuelen 10

Your latency is now back to a sensible value (12ms max), and everything
works fine. It's as simple as that. Throughput is not harmed at all:
you still get the full 94.1Mb/s wire rate. If this (previous) setting
had been harmful to throughput on 100Mb/s interfaces, people would have
complained long ago.

The more complex truth
----------------------

If there is a real, distance-caused latency between S and R1, then
having an equivalent amount of buffering in the txqueue helps average
performance, because the interface then has a backlog of packets to
send while TCP takes time to ramp its congestion window back up after a
decrease, the former compensating for the latter. (This may be what the
e1000 people observed in the first place, motivating the increase to
1000? After all, 1.2ms of buffering was small.) The txqueue may smooth
the sawtooth evolution of the TCP congestion window, minimizing the
interface's idle time. But increased perceived latency is the price to
pay for this nice damper. There is a latency-versus-TCP-throughput
tradeoff to tune here _on wide-area_ routes, but pushing it as far as
storing _multiple_ times any real-world latency in the txqueue (did I
say "1.2s" already?) brings no benefit at all for throughput; it is
just terribly harmful for perceived latency. No IP router does so much
buffering. Besides Linux :-> I don't think IP queues should be sized to
cope with Earth-Moon latency by default.

Conclusion
----------
(aka: let's harass the maintainers)

Of course I have demonstrated the worst case here. In many other cases
TCP will throttle down for some reason (packet losses, too-small socket
buffers, ...), it will fill neither the pipe nor the txqueue, and this
dreaded latency will not appear. You could argue that my test case is
rare in the real world / not representative (and I would _not_ agree),
so that the txqueue will never be full in practice, since there will
always be some other reason making TCP under-perform the sending wire.
OK.
Even then, why define it uselessly high for every NIC? Why take this
risk, just as a small convenience for e1000 users? Mmmmm... So now
every interface has this 1000-packet queue (and soon 10,000, because of
the upcoming 10Gb/s interfaces). To ensure no one ever falls into this
too-big-txqueue trap, I suggest the following user documentation:

  "If you have a 100Mb/s interface, be warned that your txqueuelen is
  too high (it was tuned for real, gigabit men), so please reduce it
  using ifconfig. Alternatively, if you are not root, please tune your
  socket buffers finely: too small and you will under-perform, too big
  and you will fill up your txqueue and create artificial latency.
  Good luck. Of course, you can forget all of the above when your
  interface is not the bottleneck."

On the other hand, having a default maximum txqueuelen defined in
_milliseconds_ (just as most other routers do) is quite easy to
implement and covers all cases correctly, without complex tuning
instructions for the end user. The ideal implementation is for every
driver to define txqueuelen by itself, depending on the actual link
speed. That is unrealistic in the short term, but the incremental
implementation path is very easy: define a "sensible, generic default"
of 100 packets, perfect for the 100Mb/s masses and not too bad for the
10Mb/s masses, and let the (only a few so far, let's hurry up) gigabit
drivers override/optimize this to 1000, or whatever else is even more
finely tuned, for the cheap price of a few lines of code per driver.

Thanks in advance for agreeing OR for proving that I am wrong. I mean:
thanks in advance for anything besides remaining silent. And thanks for
reading all this gossiping, an impressive effort indeed.

^ permalink raw reply	[flat|nested] 38+ messages in thread
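A minimal sketch of the incremental path proposed above - keep a
conservative core default and let a gigabit driver override it in a few
lines - assuming a 2.6-era PCI driver. The driver name "mygige", its
private struct and its probe routine are invented for illustration;
only alloc_etherdev(), register_netdev() and the dev->tx_queue_len
field are real kernel interfaces, and the usual PCI/MAC setup is
elided.

    #include <linux/pci.h>
    #include <linux/netdevice.h>
    #include <linux/etherdevice.h>

    struct mygige_priv {
            /* device-private state would live here */
    };

    static int mygige_probe(struct pci_dev *pdev,
                            const struct pci_device_id *ent)
    {
            struct net_device *dev;

            dev = alloc_etherdev(sizeof(struct mygige_priv));
            if (!dev)
                    return -ENOMEM;

            /*
             * Assume the core/ether_setup() default stayed at the old,
             * conservative 100 packets; only a gigabit-capable device
             * asks for a deeper soft tx queue.
             */
            dev->tx_queue_len = 1000;

            /* ... the usual PCI, register and MAC setup, then ... */
            return register_netdev(dev);
    }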
* RE: TxDescriptors -> 1024 default. Please not for every NIC!
  2004-06-02 19:14 ` Marc Herbert
@ 2004-06-02 19:49   ` Cheng Jin
  2004-06-05 14:37     ` jamal
  0 siblings, 1 reply; 38+ messages in thread
From: Cheng Jin @ 2004-06-02 19:49 UTC (permalink / raw)
  To: Marc Herbert; +Cc: netdev@oss.sgi.com

Marc,

In general, I very much agree with what you have stated about not
having a large txqueuelen. The txqueuelen should be something that
temporarily alleviates the mismatch between CPU speed and NIC
transmission speed. As long as the txqueuelen is greater than zero, say
10 just to be safe, the NIC will run at full speed (unless there are
inefficiencies in scheduling), so there is no incentive to set it to an
excessively large value like 1000.

> > I'm not sure that you could actually get the problem to occur on 100
> > or 10Mb/s hardware, however, because of TCP window size limitations

With today's CPUs, I think you will be able to fill up the txqueue on a
10 or 100Mbps NIC, assuming a large file transfer, a large window size,
and so on.

> If there is a real, distance-caused latency between S and R1, then
> having an equivalent amount of buffering in the txqueue helps average
> performance, because the interface then has a backlog of packets to
> send while TCP takes time to ramp its congestion window back up after
> a decrease, the former compensating for the latter. (This may be what
> the e1000 people observed in the first place, motivating the increase
> to 1000? After all, 1.2ms of buffering was small.) The txqueue may
> smooth the sawtooth evolution of the TCP congestion window, minimizing
> the interface's idle time. But increased perceived latency is the
> price to pay for this nice damper. There is a
> latency-versus-TCP-throughput tradeoff to tune here _on wide-area_
> routes, but pushing it as far as storing _multiple_ times any
> real-world latency in the txqueue (did I say "1.2s" already?) brings
> no benefit at all for throughput; it is just terribly harmful for
> perceived latency. No IP router does so much buffering. Besides Linux
> :-> I don't think IP queues should be sized to cope with Earth-Moon
> latency by default.

Very much agree with this paragraph. As long as the buffer holds more
than one bandwidth-delay product, a single TCP flow halving its window
after each loss will still sustain a window large enough to keep
packets in the buffer and achieve full utilization. The downside is
exactly what Marc said: a very, very large queueing delay for a long
time.

Going back to what Marc said in an earlier e-mail about keeping
txqueuelen in bytes rather than packets so as to provide a fixed
queueing delay: maintaining txqueuelen in milliseconds would be the
ideal solution, but it is probably hard to achieve in practice. Keeping
txqueuelen in bytes may be a problem for senders that want to send many
small packets: while the byte count may be small, the per-packet
overhead of sending small packets may introduce large delays.

Cheng

^ permalink raw reply	[flat|nested] 38+ messages in thread
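To make the packets-versus-bytes-versus-milliseconds comparison above
concrete, here is a small hypothetical helper (the name is invented and
nothing like it existed in the kernel at the time) that turns a target
queueing delay and a link speed into a byte budget:

    /*
     * bytes = (speed_mbps * 10^6 / 8 bits per byte) * (target_ms / 1000)
     *       = speed_mbps * 125 * target_ms
     *
     * 100 Mb/s at a 12 ms target gives ~150 kB, i.e. about 100
     * full-size Ethernet frames; 1000 Mb/s gives ~1.5 MB, about 1000
     * frames - the two values argued over in this thread.
     */
    static unsigned long tx_budget_bytes(unsigned int speed_mbps,
                                         unsigned int target_ms)
    {
            return (unsigned long)speed_mbps * 125UL * target_ms;
    }

Dividing the budget by the frame size gives Marc's packet counts back;
a pure byte limit, on the other hand, would admit far more small
packets than large ones, which is exactly the per-packet-overhead
concern raised above.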
* RE: TxDescriptors -> 1024 default. Please not for every NIC!
  2004-06-02 19:49 ` Cheng Jin
@ 2004-06-05 14:37   ` jamal
  0 siblings, 0 replies; 38+ messages in thread
From: jamal @ 2004-06-05 14:37 UTC (permalink / raw)
  To: Cheng Jin; +Cc: Marc Herbert, netdev@oss.sgi.com

On Wed, 2004-06-02 at 15:49, Cheng Jin wrote:
> Marc,
>
> In general, I very much agree with what you have stated about not
> having a large txqueuelen. The txqueuelen should be something that
> temporarily alleviates the mismatch between CPU speed and NIC
> transmission speed.

That's the theory. More interesting, of course, are bus speeds,
arbitration schemes, RAM latencies and throughput, and other dynamic
bottlenecks like system load.

> As long as the txqueuelen is greater than zero, say 10 just to be
> safe, the NIC will run at full speed (unless there are inefficiencies
> in scheduling), so there is no incentive to set it to an excessively
> large value like 1000.

In theory as well, the only time you even need to queue is when there's
congestion. In reality, it's a totally different ballgame. In other
words, it is not a simple system that you can throw Little's theorem
at.

Marc, good email; at least you didn't hand-wave and declare that the
wind was blowing towards the south today.

My opinion: I agree that a qlen of 1000 is excessive for 10/100 - in
fact I think the value should dynamically adjust itself even for
GigE-capable NICs (for example, if a GigE NIC negotiates 10Mbps with
its link partner, then you should adjust the qlen)[1]. That won't be
trivial to do - but more importantly, the motivation is lacking,
because I don't think the situation we have right now is devastating.

To clarify: a single TCP flow will fill any pipe you give it under
proper conditions (proper congestion-control algorithms, buffers,
etc.)[2]. Most apps using TCP don't care very much about latency; the
only exception would be some scientific clustering technologies using
TCP for control messaging. And for those kinds of apps, you should be
able to tune the qlen to your liking using the tc or ip utilities (I
claim they shouldn't be using TCP to begin with, but that's another
discussion). If you don't want to take that extra tuning step, then you
don't care, and IMO you shouldn't complain.

Having said all that, I still think there's value in maybe issuing a
warning, or in making the default qlen selection a compile-time config
option.

cheers,
jamal

[1] I think it would make a nice project for someone with time. I can
    consult for anyone interested.
[2] Looking at the recent patches on BIC, it does seem pretty
    aggressive and should have no problem filling a 10GigE pipe given
    proper processing power.

^ permalink raw reply	[flat|nested] 38+ messages in thread
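For what jamal's footnote [1] hints at - resizing the soft queue when a
GigE NIC negotiates a slower speed - a rough sketch might look like the
following. The hook name mydrv_link_change() is invented (it stands for
whatever a driver's link-state handler calls); only dev->tx_queue_len
is a real field, and as jamal notes a complete implementation is not
trivial, since it would also have to deal with packets already queued
and with non-default qdiscs that copied their limit at setup time.

    /* Scale the soft tx queue with the negotiated link speed. */
    static void mydrv_link_change(struct net_device *dev,
                                  unsigned int speed_mbps)
    {
            if (speed_mbps >= 1000)
                    dev->tx_queue_len = 1000;
            else if (speed_mbps >= 100)
                    dev->tx_queue_len = 100;
            else
                    dev->tx_queue_len = 10;
    }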
Thread overview: 38+ messages
2003-09-09 3:14 [e1000 2.6 10/11] TxDescriptors -> 1024 default Feldman, Scott
2003-09-11 19:18 ` Jeff Garzik
2003-09-11 19:45 ` Ben Greear
2003-09-11 19:59 ` Jeff Garzik
2003-09-11 20:12 ` David S. Miller
2003-09-11 20:40 ` Ben Greear
2003-09-11 21:07 ` David S. Miller
2003-09-11 21:29 ` Ben Greear
2003-09-11 21:29 ` David S. Miller
2003-09-11 21:47 ` Ricardo C Gonzalez
2003-09-11 22:00 ` Jeff Garzik
2003-09-11 22:15 ` Ben Greear
2003-09-11 23:02 ` David S. Miller
2003-09-11 23:22 ` Ben Greear
2003-09-11 23:29 ` David S. Miller
2003-09-12 1:34 ` jamal
2003-09-12 2:20 ` Ricardo C Gonzalez
2003-09-12 3:05 ` jamal
2003-09-13 3:49 ` David S. Miller
2003-09-13 11:52 ` Robert Olsson
2003-09-15 12:12 ` jamal
2003-09-15 13:45 ` Robert Olsson
2003-09-15 23:15 ` David S. Miller
2003-09-16 9:28 ` Robert Olsson
2003-09-14 19:08 ` Ricardo C Gonzalez
2003-09-15 2:50 ` David Brownell
2003-09-15 8:17 ` David S. Miller
2004-05-15 12:14 ` TxDescriptors -> 1024 default. Please not for every NIC! Marc Herbert
2004-05-19 9:30 ` Marc Herbert
2004-05-19 10:27 ` Pekka Pietikainen
2004-05-20 14:11 ` Luis R. Rodriguez
2004-05-20 16:38 ` [Prism54-devel] " Jean Tourrilhes
2004-05-20 16:45 ` Tomasz Torcz
2004-05-20 17:13 ` zero copy TX in benchmarks was " Andi Kleen
2004-05-19 11:54 ` Andi Kleen
[not found] <C925F8B43D79CC49ACD0601FB68FF50CDB13D3@orsmsx408>
2004-06-02 19:14 ` Marc Herbert
2004-06-02 19:49 ` Cheng Jin
2004-06-05 14:37 ` jamal