linux-kernel.vger.kernel.org archive mirror
From: Jeff Garzik <jgarzik@pobox.com>
To: Matti Aarnio <matti.aarnio@zmailer.org>
Cc: Willy Tarreau <willy@w.ods.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	greearb@candelatech.com, akpm@osdl.org, alan@redhat.com,
	jgarzik@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: PATCH: VLAN support for 3c59x/3c90x
Date: Sat, 31 Jul 2004 12:18:29 -0400
Message-ID: <410BC655.4040509@pobox.com>
In-Reply-To: <20040731141222.GJ2429@mea-ext.zmailer.org>

Matti Aarnio wrote:
> On Sat, Jul 31, 2004 at 12:11:52PM +0200, Willy Tarreau wrote:
>>Ok, sorry, I've just checked, there are 6. But I incidentally used the feature
>>on 2 of them (dl2k and starfire). But more drivers still have the
>>'static int mtu=1500' preceded by a comment stating "allow the user to change
>>the mtu". Why is it not a #define then, if nobody can change it anymore?
> 
> 
> In older kernels that was sufficient for loading it as a module
> parameter, I recall.  Now a couple of additional macros are needed
> to publish such parameters.  APIs do change in the Linux kernel.
> 
> I have been pondering the usefulness of ->change_mtu
> for this use.   One of the bigger issues is, as Willy notes, that
> the MTU change information is given to the driver after it is already
> up and about, which then requires running setup magic that usually
> needs a reset...

First, MTU change need not occur while the interface is up.

Second, modern hardware deals a lot better with MTU changes.  Some only 
need a write into a register.  Some need no reset at all, as long as you 
don't exceed the hardware limit.
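
To illustrate the second point, here is a minimal user-space sketch of a change_mtu-style handler that only range-checks the request and stores it, with no reset. The struct mock_netdev type, the hw_max_mtu field and the function name are hypothetical stand-ins rather than the kernel's own structures, and the lower bound of 68 is assumed to match the minimum IPv4 MTU.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's struct net_device. */
struct mock_netdev {
	int mtu;         /* currently configured MTU */
	int hw_max_mtu;  /* assumed per-chip hardware limit */
};

#define MOCK_MIN_MTU 68  /* assumed lower bound: minimum IPv4 MTU */

/*
 * Accept any MTU within the hardware limit without touching the
 * hardware; reject everything else and leave the old value in place.
 */
static int mock_change_mtu(struct mock_netdev *dev, int new_mtu)
{
	if (new_mtu < MOCK_MIN_MTU || new_mtu > dev->hw_max_mtu)
		return -EINVAL;

	dev->mtu = new_mtu;
	return 0;
}
```

A chip that needs a register write would do it just before storing the new value; a chip that needs a reset would do that only if the interface is running.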


> Also the Linux kernel isn't very happy with multi-path stacking
> of layer-2 driver modules.  A side-effect from all of these things might

Elaboration?  The 2.6.x net drivers do proper refcounting, unlike a lot 
of other drivers.


> To prevent that from happening, it is sufficient in the eth driver
> not to shrink its MTU except via card shutdown -- but then, IFCONFIG
> data for e.g. IP layer needs separation from the hardware driver layer.

In general ifconfig data should definitely -not- be separate from the 
driver.  In particular changing MTU definitely needs to be tightly 
integrated with the driver.


> For this IFCONFIG MTU issue, I would rather have the VLAN code ask
> the underlying driver what MTU can be supported, than just blindly
> presume that 1500 will be functional for e.g. eth0.2  (like it does now)

The VLAN code could certainly be updated to poke at the lower level 
driver MTU.
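
A rough sketch of what "poking at the lower level driver" could compute, written as plain user-space C. It assumes the lower driver can report the largest payload it carries after the Ethernet header; that reporting interface and the function name are hypothetical. The 4-byte figure is the size of the 802.1Q tag.

```c
#define VLAN_TAG_LEN 4  /* 802.1Q tag inserted after the source MAC */

/*
 * Given the largest payload the lower device can carry (hypothetical
 * value reported by the driver) and the MTU the user wants on the VLAN
 * device, return the MTU the VLAN device can actually use.
 */
static int vlan_usable_mtu(int lower_max_payload, int wanted_mtu)
{
	int room = lower_max_payload - VLAN_TAG_LEN;

	if (room < 0)
		room = 0;
	return wanted_mtu < room ? wanted_mtu : room;
}
```

With a lower device limited to 1500 this gives 1496 for eth0.2, and a full 1500 only once the lower device can carry 1504.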


>>>For VLAN support you definitely want to let the user increase the size 
>>>above 1500, and for that you need ->change_mtu
>>
>>I agree, but my point was that adding MODULE_PARM was only a one liner and
>>would have done the job too. But since everyone prefers a change_mtu(), I'll
>>do it.
>>
>>Jeff, do you know the absolute hardware limit on the tulip? I've seen the
>>limitation to PKT_BUF_SZ (1536), but I don't know for example if the
>>hardware stores the FCS in the buffer or not, nor if the IP headers risk
>>being aligned or not (which would consume 2 more bytes).
>>Or does 1536 - 14 (ethernet) - 2 (iphdr alignment) - 4 (FCS) = 1516 seem a
>>reasonably conservative upper bound?
> 
> 
> The Tulip (21143 at least) can do chained block receive; if first memory
> block is too short, it can continue to next one.  This way maximum frame

Yes, but receiving frames not wholly contained in a single buffer is SO 
sub-optimal that it is to be avoided at all costs.

Maybe when receive scatter-gather is fully supported this can change, 
but for now the driver should not be returning multi-frag frames to the 
kernel.
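
If every frame has to fit in one receive buffer, the buffer size bounds the MTU. Willy's conservative estimate from earlier in the thread can be written out as below; the macro and function names are made up for the illustration, and whether the FCS and the 2-byte alignment pad really consume buffer space on a given chip is exactly the open question, so both are assumed here as the worst case.

```c
#define PKT_BUF_SZ     1536  /* receive buffer size quoted in the thread */
#define ETH_HEADER_LEN   14  /* destination MAC, source MAC, ethertype */
#define IP_ALIGN_PAD      2  /* assumed: pad so the IP header is aligned */
#define FCS_LEN           4  /* assumed: hardware stores the FCS in the buffer */

/* Largest MTU for which a whole frame still fits in one buffer. */
static int conservative_max_mtu(int buf_sz)
{
	return buf_sz - ETH_HEADER_LEN - IP_ALIGN_PAD - FCS_LEN;
}
```

For the 1536-byte buffer this yields 1516, the figure proposed above.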


> size is at least 2560 bytes.  For transmit the Jabber timer seems to
> trip at 2560, including preambles and CRCs.  Also, there is a receive 
> watchdog, that is guaranteed to pass 2048 byte frames, and timeout at
> 2560 byte frames.  (When the watchdog is not disabled, that is.) See 
> CSR15<4>.  For transmit the Jabber-Clock bites at 2048-2560 bytes,
> OR at a timer of 2.6-3.3 ms (at 100 Mbps) which means at least 32 000 bytes.
> ( CSR15<2> )
> 
> In the receive descriptors there might appear a TL bit (Frame Too Long),
> which is just telling that frame size exceeds 1518 bytes.
> If RW (Receive Watchdog; RDES0<4>) has tripped, then the frame is at
> least 2048 bytes long, most likely longer than 2560 bytes.
> 
> Based on my reading of  ds21143hrm.pdf  (copy of which I have), I do
> think it is safe to just receive larger frames with Tulip, and IGNORE
> the "TL" bit.

That covers one of seven or eight tulip chips driven by the driver.

Once you exceed the ethernet norm there are tons of chip-specific quirks 
and details to deal with.  In addition to the details you mention, the 
on-chip FIFO sizes and behaviors become important.  As does the 
multi-fragment frame issue.  Some chips with checksumming features only 
work when the MTU is less than an unknown magic number (less than you 
would think, but higher than 1500).

All these reasons are why I want to dive into the 3c59x documentation, 
and also do some testing on older models, before we merge Alan's patch 
from $subject.

	Jeff


