netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Garzik <jgarzik@pobox.com>
To: Andy Fleming <afleming@freescale.com>
Cc: Kumar Gala <kumar.gala@freescale.com>,
	netdev@oss.sgi.com, dwmw2@infradead.org, hadi@cyberus.ca
Subject: Re: [RFR] gianfar ethernet driver
Date: Wed, 07 Jul 2004 01:27:40 -0400	[thread overview]
Message-ID: <40EB89CC.2040100@pobox.com> (raw)
In-Reply-To: <89563A5C-CFAE-11D8-BA44-000393C30512@freescale.com>

Andy Fleming wrote:
>>> 7) ethtool_ops should not be kmalloc'd.
>>> +       dev_ethtool_ops =
>>> +               (struct ethtool_ops *)kmalloc(sizeof(struct 
>>> ethtool_ops),
>>> +                                             GFP_KERNEL);
> 
> 
> Is there a particular reason?  The reason I did it this way is because 
> the driver supports a 10/100 controller which does not have the RMON 
> statistics, and therefore should not read from that register space.  So 
> I needed to use different functions to fill in the statistics lists.

Just switch out one static pointer for another, once you know the 
hardware you're dealing with.  You _must_ have that knowledge anyway, 
before calling register_netdev(), otherwise you're got a race in 
initialization versus interface-up.


>>> 8) I recommend enabling _both_ hardware interrupt coalescing and NAPI,
>>> at the same time.  Several other Linux net drivers need to be changed to
>>> do this, as well
> 
> 
> Ok... but this is possible with the driver as it is.  Interrupt 
> Coalescing is runtime configurable, and NAPI is a compile-time option.  
> NAPI can sometimes hurt performance, and so we'd like to have the 
> ability to disable it for some deployments.  I guess I'm not sure what 
> change this suggestion was meant to cause.

Default hardware mitigation to on in all cases, and preferably NAPI as well.

If you see cases that hurt performance, that wants investigating.


>>> 14) surely gfar_set_mac_address() needs some sort of synchronization?
> 
> 
> I'm not sure why.  It only gets called during gfar_open(), and 
> startup_gfar() gets called after it.

Incorrect.  Set-mac-address can be called when the interface is up and 
active, such as by the bonding driver.


>>> 15) gfar_change_mtu() synchronization appears absent?
> 
> 
> I'm not exactly sure what kind of synchronization is expected here.  
> stop_gfar() and startup_gfar() do modify the appropriate hardware values.

Same conditions (and response) as set-mac-address.  These can be called 
while you're actively DMAing packets.


>>> 23) gfar_set_multi() does not appear to take into account huge
>>> multi-cast lists.  Drivers usually handle large dev->mc_count by falling
>>> back to ALLMULTI behavior.
> 
> 
> Actually, it does.  The bits which are set represent hash table values. 
>  Essentially, the MAC addr is converted to an 8-bit CRC.  This value 
> then serves as an index to the 256-bit hash table.  If the list is 
> large, then certain bits may be written twice, but if all the bits are 
> set, then the behavior is the same as in the ALLMULTI behavior -- every 
> multicast packet arrives.  It would be a mistake, IMHO, to cut it off 
> after N entries, as several of those entries could overlap in the 
> bitmap.  Clearly, the less bits which are set, the more effective the 
> hardware filter, so checking each address will, I think, always be a 
> performance win or tie (on the filtering side) over ALLMULTI

ok


>>> 24) setting IFF_MULTICAST is suspicious at best, possibly wrong:
>>> +       dev->mtu = 1500;
>>> +       dev->set_multicast_list = gfar_set_multi;
>>> +       dev->flags |= IFF_MULTICAST;
>>> +
> 
> 
> I'll believe you, but is there a reason for this?  I guess it's already 
> set, by default, so easy fix.

follow the other drivers, and behave predictably :)


>>> 28) I see you support register dump (good!), now please send the
>>> register-pretty-print patch to the ethtool package :)
> 
> 
> Heh.  Well, I haven't written that, yet.  There's actually a slight 
> problem, in that the 85xx registers are 32-bit, and refuse to be 
> accessed as bytes.  I'll be sure to look at that in the near...ish future.

Nothing wrong with accessing registers as 32-bit quantities.  That 
attribute is common to a lot of NICs.


>>> 31) infinite loops, particularly on 1-bits, are discouraged:
>>> +       /* Wait for the transaction to finish */
>>> +       while (gfar_read(&regbase->miimind) & (MIIMIND_NOTVALID |
>>> MIIMIND_BUSY))
>>> +               cpu_relax();
> 
> 
> What is the suggested method for waiting for the bus to be free?  Should 
> I timeout after some time, and bring the driver down?

it's really up to you, as it depends on the implementation platforms.

The most simple is to include a counter that counts down to zero, and 
starts some absurdly large number like 100000.


>>> 33) merge-stopper:  mii_parse_sr().  never wait for autonegotiation to
>>> complete.  it is an asynchronous operation that could exceed 30 seconds.
> 
> 
> Hmm...
> 
>>> 34) merge-stopper:  dm9161_wait().  same thing as #33.
> 
> 
> This may be a problem.  That function is there to work around an issue 
> in the PHY, wherein if you try to configure it before it has come up all 
> the way, it refuses to bring the link up.  We've sworn at this code many 
> times, but there has, as of yet, not been a good suggestion as to how we 
> can ensure that the 9161 is  ready before we configure it.

I interpret that as a driver bug :)

As common sense, regardless of phy bugs, you should not be trying to 
configure the MAC _or_ the phy in the middle of autonegotiation. 
Presumeably you are using a combination of netif_stop_queue(), 
netif_carrier_off(), and netif_device_detach() to achieve this.


>>> 35) liberally borrow code and ideas from Ben H's sungem_phy.c.
>>> Eventually we want to move to a generic phy module.
> 
> 
> Heh.  ironically, I stole liberally from the ibm_emac driver, which now 
> looks exactly like the sungem_phy code for phy handling.  A generic phy 
> module would be a good thing, and I'm even interested in helping with 
> code and/or suggestions.  If nothing else, I'll jump on the new generic 
> phy module bandwagon!

Cool.  This item #35 is more of a long term "think about it" type of 
request.  Please do not hesitate to think of ways that you could share 
code with sungem_phy.c, and/or make them both use the same API, and 
submit patches along those lines :)

It's not enough to just write a driver, help change Linux for the better :)


>  From Jamal:
> 
>>
>> 1) The check (in gfar_start_xmit()):
>>
>> Should happen much sooner - i.e before the skb is touched.
> 
> 
> I'm not sure I agree here.  What I am doing is detecting the full state 
> before an skb needs to be rejected.  I am testing the NEXT descriptor to 
> see if it is ready (if not, dirty_tx will be pointing to it).  This way, 
> I always handle any skb that is passed to gfar_start_xmit()

See my comments to jamal as well:  if you guarantee that you always have 
room on the DMA ring for an additional skb, that check should never be 
needed.

Some extremely cautious/paranoid programmers will add a check, with a 
printk noting it's a BUG (i.e. impossible) condition, such as

	if (unlikely(... no more descriptors ...)) {
		printk(KERN_ERR "%s: BUG: no room for TX\n", dev->name);
		netif_stop_queue();
		spin_unlock_irqrestore();
		return 1;
	}

	... queue TX to hardware ...

	if (no more descriptors)
		netif_stop_queue()

	spin_unlock_irqrestore()


>> 2) Also its pretty scary if you are doing:
>> txbdp->status |= TXBD_INTERRUPT for every packet.
>> Look at other drivers, they try to do this every few packets;
>> or enable tx mitigation to slow the rate  of those interupts.
> 
> 
> I don't believe this is as dire as you think.  This bit only indicates 
> that, if the conditions are right, an interrupt will be generated once 
> that descriptor is processed, and its data sent.  Conditions which 
> mitigate that are:
> 1) coalescing is on, and the timer and counter have not triggered the 
> interrupt yet
> 2) NAPI is enabled, and so interrupts are disabled after the first 
> packet arrives
> 3) NAPI is disabled, but the driver is currently handling a previous 
> interrupt, so the interrupts are disabled for now.

Even at 10/100 speeds, you really don't want to be generating one 
interrupt per Tx...

	Jeff

  parent reply	other threads:[~2004-07-07  5:27 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <C681B01E-CEA9-11D8-931F-000393DBC2E8@freescale.com>
     [not found] ` <89563A5C-CFAE-11D8-BA44-000393C30512@freescale.com>
2004-07-07  3:18   ` [RFR] gianfar ethernet driver jamal
2004-07-07  3:29     ` Jeff Garzik
2004-07-07  3:41       ` jamal
2004-07-07  5:35         ` Jeff Garzik
2004-07-07 18:29           ` jamal
     [not found]             ` <40EDC7A7.8060906@pobox.com>
2004-07-08 23:08               ` jamal
2004-07-21 19:51       ` Andy Fleming
2004-07-21 20:14         ` David Woodhouse
2004-08-02 22:19       ` Andy Fleming
2004-08-02 23:11         ` Jeff Garzik
2004-08-02 23:25           ` Andy Fleming
2004-08-04 23:02             ` Andy Fleming
2004-08-16 16:31               ` Kumar Gala
2004-08-22 21:03               ` Jeff Garzik
2004-07-07  5:27   ` Jeff Garzik [this message]
     [not found]     ` <D3458628-D05D-11D8-BA44-000393C30512@freescale.com>
2004-07-08 22:29       ` Jeff Garzik
     [not found]         ` <944A2374-D137-11D8-8835-000393C30512@freescale.com>
2004-07-09  1:32           ` jamal
2004-07-09  1:42             ` jamal
2004-07-26 22:06     ` Andy Fleming
2004-07-26 22:10       ` Jeff Garzik
2004-08-02 13:59         ` Kumar Gala
2004-08-02 14:09           ` Christoph Hellwig
2004-08-02 14:11             ` Kumar Gala
     [not found] ` <2A724364-D53A-11D8-8835-000393C30512@freescale.com>
     [not found]   ` <40F4A6E5.4060000@pobox.com>
2004-07-19 23:29     ` Andy Fleming
2004-07-20  1:13       ` David Woodhouse
     [not found] <8F52CF1D-C916-11D8-BB6A-000393DBC2E8@freescale.com>
2004-07-05 17:28 ` Jeff Garzik
2004-07-06  2:38   ` jamal
     [not found]   ` <20040708231131.GA20305@infradead.org>
2004-07-08 23:25     ` Jeff Garzik
2004-07-08 23:35       ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=40EB89CC.2040100@pobox.com \
    --to=jgarzik@pobox.com \
    --cc=afleming@freescale.com \
    --cc=dwmw2@infradead.org \
    --cc=hadi@cyberus.ca \
    --cc=kumar.gala@freescale.com \
    --cc=netdev@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).