All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Garzik <jgarzik@pobox.com>
To: Andy Fleming <afleming@freescale.com>
Cc: Kumar Gala <kumar.gala@freescale.com>,
	netdev@oss.sgi.com, dwmw2@infradead.org, hadi@cyberus.ca
Subject: Re: [RFR] gianfar ethernet driver
Date: Wed, 07 Jul 2004 01:27:40 -0400	[thread overview]
Message-ID: <40EB89CC.2040100@pobox.com> (raw)
In-Reply-To: <89563A5C-CFAE-11D8-BA44-000393C30512@freescale.com>

Andy Fleming wrote:
>>> 7) ethtool_ops should not be kmalloc'd.
>>> +       dev_ethtool_ops =
>>> +               (struct ethtool_ops *)kmalloc(sizeof(struct 
>>> ethtool_ops),
>>> +                                             GFP_KERNEL);
> 
> 
> Is there a particular reason?  The reason I did it this way is because 
> the driver supports a 10/100 controller which does not have the RMON 
> statistics, and therefore should not read from that register space.  So 
> I needed to use different functions to fill in the statistics lists.

Just switch out one static pointer for another, once you know the 
hardware you're dealing with.  You _must_ have that knowledge anyway, 
before calling register_netdev(), otherwise you're got a race in 
initialization versus interface-up.


>>> 8) I recommend enabling _both_ hardware interrupt coalescing and NAPI,
>>> at the same time.  Several other Linux net drivers need to be changed to
>>> do this, as well
> 
> 
> Ok... but this is possible with the driver as it is.  Interrupt 
> Coalescing is runtime configurable, and NAPI is a compile-time option.  
> NAPI can sometimes hurt performance, and so we'd like to have the 
> ability to disable it for some deployments.  I guess I'm not sure what 
> change this suggestion was meant to cause.

Default hardware mitigation to on in all cases, and preferably NAPI as well.

If you see cases that hurt performance, that wants investigating.


>>> 14) surely gfar_set_mac_address() needs some sort of synchronization?
> 
> 
> I'm not sure why.  It only gets called during gfar_open(), and 
> startup_gfar() gets called after it.

Incorrect.  Set-mac-address can be called when the interface is up and 
active, such as by the bonding driver.


>>> 15) gfar_change_mtu() synchronization appears absent?
> 
> 
> I'm not exactly sure what kind of synchronization is expected here.  
> stop_gfar() and startup_gfar() do modify the appropriate hardware values.

Same conditions (and response) as set-mac-address.  These can be called 
while you're actively DMAing packets.


>>> 23) gfar_set_multi() does not appear to take into account huge
>>> multi-cast lists.  Drivers usually handle large dev->mc_count by falling
>>> back to ALLMULTI behavior.
> 
> 
> Actually, it does.  The bits which are set represent hash table values. 
>  Essentially, the MAC addr is converted to an 8-bit CRC.  This value 
> then serves as an index to the 256-bit hash table.  If the list is 
> large, then certain bits may be written twice, but if all the bits are 
> set, then the behavior is the same as in the ALLMULTI behavior -- every 
> multicast packet arrives.  It would be a mistake, IMHO, to cut it off 
> after N entries, as several of those entries could overlap in the 
> bitmap.  Clearly, the less bits which are set, the more effective the 
> hardware filter, so checking each address will, I think, always be a 
> performance win or tie (on the filtering side) over ALLMULTI

ok


>>> 24) setting IFF_MULTICAST is suspicious at best, possibly wrong:
>>> +       dev->mtu = 1500;
>>> +       dev->set_multicast_list = gfar_set_multi;
>>> +       dev->flags |= IFF_MULTICAST;
>>> +
> 
> 
> I'll believe you, but is there a reason for this?  I guess it's already 
> set, by default, so easy fix.

follow the other drivers, and behave predictably :)


>>> 28) I see you support register dump (good!), now please send the
>>> register-pretty-print patch to the ethtool package :)
> 
> 
> Heh.  Well, I haven't written that, yet.  There's actually a slight 
> problem, in that the 85xx registers are 32-bit, and refuse to be 
> accessed as bytes.  I'll be sure to look at that in the near...ish future.

Nothing wrong with accessing registers as 32-bit quantities.  That 
attribute is common to a lot of NICs.


>>> 31) infinite loops, particularly on 1-bits, are discouraged:
>>> +       /* Wait for the transaction to finish */
>>> +       while (gfar_read(&regbase->miimind) & (MIIMIND_NOTVALID |
>>> MIIMIND_BUSY))
>>> +               cpu_relax();
> 
> 
> What is the suggested method for waiting for the bus to be free?  Should 
> I timeout after some time, and bring the driver down?

it's really up to you, as it depends on the implementation platforms.

The most simple is to include a counter that counts down to zero, and 
starts some absurdly large number like 100000.


>>> 33) merge-stopper:  mii_parse_sr().  never wait for autonegotiation to
>>> complete.  it is an asynchronous operation that could exceed 30 seconds.
> 
> 
> Hmm...
> 
>>> 34) merge-stopper:  dm9161_wait().  same thing as #33.
> 
> 
> This may be a problem.  That function is there to work around an issue 
> in the PHY, wherein if you try to configure it before it has come up all 
> the way, it refuses to bring the link up.  We've sworn at this code many 
> times, but there has, as of yet, not been a good suggestion as to how we 
> can ensure that the 9161 is  ready before we configure it.

I interpret that as a driver bug :)

As common sense, regardless of phy bugs, you should not be trying to 
configure the MAC _or_ the phy in the middle of autonegotiation. 
Presumeably you are using a combination of netif_stop_queue(), 
netif_carrier_off(), and netif_device_detach() to achieve this.


>>> 35) liberally borrow code and ideas from Ben H's sungem_phy.c.
>>> Eventually we want to move to a generic phy module.
> 
> 
> Heh.  ironically, I stole liberally from the ibm_emac driver, which now 
> looks exactly like the sungem_phy code for phy handling.  A generic phy 
> module would be a good thing, and I'm even interested in helping with 
> code and/or suggestions.  If nothing else, I'll jump on the new generic 
> phy module bandwagon!

Cool.  This item #35 is more of a long term "think about it" type of 
request.  Please do not hesitate to think of ways that you could share 
code with sungem_phy.c, and/or make them both use the same API, and 
submit patches along those lines :)

It's not enough to just write a driver, help change Linux for the better :)


>  From Jamal:
> 
>>
>> 1) The check (in gfar_start_xmit()):
>>
>> Should happen much sooner - i.e before the skb is touched.
> 
> 
> I'm not sure I agree here.  What I am doing is detecting the full state 
> before an skb needs to be rejected.  I am testing the NEXT descriptor to 
> see if it is ready (if not, dirty_tx will be pointing to it).  This way, 
> I always handle any skb that is passed to gfar_start_xmit()

See my comments to jamal as well:  if you guarantee that you always have 
room on the DMA ring for an additional skb, that check should never be 
needed.

Some extremely cautious/paranoid programmers will add a check, with a 
printk noting it's a BUG (i.e. impossible) condition, such as

	if (unlikely(... no more descriptors ...)) {
		printk(KERN_ERR "%s: BUG: no room for TX\n", dev->name);
		netif_stop_queue();
		spin_unlock_irqrestore();
		return 1;
	}

	... queue TX to hardware ...

	if (no more descriptors)
		netif_stop_queue()

	spin_unlock_irqrestore()


>> 2) Also its pretty scary if you are doing:
>> txbdp->status |= TXBD_INTERRUPT for every packet.
>> Look at other drivers, they try to do this every few packets;
>> or enable tx mitigation to slow the rate  of those interupts.
> 
> 
> I don't believe this is as dire as you think.  This bit only indicates 
> that, if the conditions are right, an interrupt will be generated once 
> that descriptor is processed, and its data sent.  Conditions which 
> mitigate that are:
> 1) coalescing is on, and the timer and counter have not triggered the 
> interrupt yet
> 2) NAPI is enabled, and so interrupts are disabled after the first 
> packet arrives
> 3) NAPI is disabled, but the driver is currently handling a previous 
> interrupt, so the interrupts are disabled for now.

Even at 10/100 speeds, you really don't want to be generating one 
interrupt per Tx...

	Jeff

  parent reply	other threads:[~2004-07-07  5:27 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <C681B01E-CEA9-11D8-931F-000393DBC2E8@freescale.com>
     [not found] ` <89563A5C-CFAE-11D8-BA44-000393C30512@freescale.com>
2004-07-07  3:18   ` [RFR] gianfar ethernet driver jamal
2004-07-07  3:29     ` Jeff Garzik
2004-07-07  3:41       ` jamal
2004-07-07  5:35         ` Jeff Garzik
2004-07-07 18:29           ` jamal
     [not found]             ` <40EDC7A7.8060906@pobox.com>
2004-07-08 23:08               ` jamal
2004-07-21 19:51       ` Andy Fleming
2004-07-21 20:14         ` David Woodhouse
2004-08-02 22:19       ` Andy Fleming
2004-08-02 23:11         ` Jeff Garzik
2004-08-02 23:25           ` Andy Fleming
2004-08-04 23:02             ` Andy Fleming
2004-08-16 16:31               ` Kumar Gala
2004-08-22 21:03               ` Jeff Garzik
2004-07-07  5:27   ` Jeff Garzik [this message]
     [not found]     ` <D3458628-D05D-11D8-BA44-000393C30512@freescale.com>
2004-07-08 22:29       ` Jeff Garzik
     [not found]         ` <944A2374-D137-11D8-8835-000393C30512@freescale.com>
2004-07-09  1:32           ` jamal
2004-07-09  1:42             ` jamal
2004-07-26 22:06     ` Andy Fleming
2004-07-26 22:10       ` Jeff Garzik
2004-08-02 13:59         ` Kumar Gala
2004-08-02 14:09           ` Christoph Hellwig
2004-08-02 14:11             ` Kumar Gala
     [not found] ` <2A724364-D53A-11D8-8835-000393C30512@freescale.com>
     [not found]   ` <40F4A6E5.4060000@pobox.com>
2004-07-19 23:29     ` Andy Fleming
2004-07-20  1:13       ` David Woodhouse
     [not found] <8F52CF1D-C916-11D8-BB6A-000393DBC2E8@freescale.com>
2004-07-05 17:28 ` Jeff Garzik
2004-07-06  2:38   ` jamal
     [not found]   ` <20040708231131.GA20305@infradead.org>
2004-07-08 23:25     ` Jeff Garzik
2004-07-08 23:35       ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=40EB89CC.2040100@pobox.com \
    --to=jgarzik@pobox.com \
    --cc=afleming@freescale.com \
    --cc=dwmw2@infradead.org \
    --cc=hadi@cyberus.ca \
    --cc=kumar.gala@freescale.com \
    --cc=netdev@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.