From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeff Garzik Subject: Re: [PATCH] [2.4] forcedeth network driver Date: Sat, 24 Jan 2004 15:21:26 -0500 Sender: netdev-bounce@oss.sgi.com Message-ID: <4012D3C6.1050805@pobox.com> References: <4012BF44.9@colorfullife.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Cc: Carl-Daniel Hailfinger , linux-kernel@vger.kernel.org, Netdev Return-path: To: Manfred Spraul In-Reply-To: <4012BF44.9@colorfullife.com> Errors-to: netdev-bounce@oss.sgi.com List-Id: netdev.vger.kernel.org Manfred Spraul wrote: > Jeff wrote: > >> * Interrupt handler is SCARY. You can potentially take and release >> the spinlock -many times- during a single interrupt. >> > I think that can happen only in theory: A new packet is completed while > the driver processes rx packets. I all normal cases there should be one > spinlock operation per tx irq, and 0 per rx irq. > And error handling IMHO doesn't count: it should be rare. > Or do I overlook a common case? Any amount of load at all will lead to multiple interations of the master loop in the interrupt handler. >>> +#define NV_MIIPHY_DELAY 10 >>> +#define NV_MIIPHY_DELAYMAX 10000 >> >> >> Style: it's fairly silly to mix enums and constants. >> >> > Right now: enum for the nic registers, #define for the rest. If you > don't like it I can change it. enums are definitely preferred... communicates more type/symbol information to the compiler and more symbol info to the debugger. >>> +/* General driver defaults */ >>> +#define NV_WATCHDOG_TIMEO (2*HZ) >> >> >> this seems too short, and might trigger on normal events? >> >> > I think I copied it from another driver - which value would you recommend? 5 seconds is the norm, but it also depends on whether your link interrupt is 100% reliable. If you don't have to synchronize the link watchdog timeout with other driver-private timers, the task is easier. Tangent -- you may wish to check for link in ->tx_timeout(), before resetting the NIC. Again, this depends on link interrupt/timer setup as well. The basic point is that TX timeout -may- occur because link went away. One way to handle this is to ensure that link-down events are always noticed before the watchdog timer kicks. Another way to handle this is to simply check for link when ->tx_timeout() is called. >>> +static inline struct fe_priv *get_nvpriv(struct net_device *dev) >>> +{ >>> + return (struct fe_priv *) dev->priv; >>> +} >> >> >> What's the point of this wrapper? >> >> You don't need to cast from a void pointer, either. >> >> > I usually try to write code that compiles as cpp - is that a forbidden > in kernel modules? It's pointless in C, and so I've been stripping such casts out of all net drivers when I find them. >>> +/* >>> + * alloc_rx: fill rx ring entries. >>> + * Return 1 if the allocations for the skbs failed and the >>> + * rx engine is without Available descriptors >>> + */ >>> +static int alloc_rx(struct net_device *dev) >>> +{ >> >> > [snip] > >>> + return 0; >>> +} >> >> >> skb_reserve() seems to be missing >> >> > Do you have specs that show that all nForce versions support unaligned > buffers? skb_reserve is a performance feature, I don't want to add it > yet. Testing that it works is on our TODO list. hmmmm, is nForce ever found on non-x86 boxes? I would think that skb_reserve might be -required- for some platforms. >> I wonder about calling dev_kfree_skb() from dev->tx_timeout() with >> dev->xmit_lock held... >> >> > Is that bug in the networking core still not fixed? I am not aware of a bug in this area. >>> + /* 2) check that the packets were not sent already: */ >>> + tx_done(dev); >> >> >> bug: tx_done unconditionally calls dev_kfree_skb_irq(), but here we >> are not in an interrupt. >> >> > What is the xxx_kfree_skb_xxx function that just works? dev_kfree_skb_any >>> + /* >>> + * the packet is for us - immediately tear down the pci >>> mapping, and >>> + * prefetch the first cacheline of the packet. >>> + */ >>> + pci_unmap_single(np->pci_dev, np->rx_dma[i], >>> + np->rx_skbuff[i]->len, >>> + PCI_DMA_FROMDEVICE); >>> + prefetch(np->rx_skbuff[i]->data); >> >> >> is this just guessing? or has this actually shown some value? >> >> I would prefer not to put stuff like this in unless it shows a >> measureable CPU usage or cache miss impact. >> >> > Just guessing - it shouldn't hurt. CPU usage won't be important until > nForce supports GigE. Should I remove it for now? I would rather remove it. "premature optimization" and all that. Otherwise this guess will be cut-n-pasted into other drivers, I guarantee, all without any verification of the guess... :) >>> +/* >>> + * change_mtu: dev->change_mtu function >>> + * Called with dev_base_lock held for read. >>> + */ >>> +static int change_mtu(struct net_device *dev, int new_mtu) >>> +{ >>> + if (new_mtu > DEFAULT_MTU) >>> + return -EINVAL; >>> + dev->mtu = new_mtu; >>> + return 0; >>> +} >> >> >> bug #1: have you tested changing the MTU while the NIC is actually >> running? >> > What should the nic do? I'll continue to allocate 1.8 kB buffers because > I don't know how to reconfigure the nic hardware to reject large packets. Fair enough. You may wish to (after testing!) increase DEFAULT_MTU by 4 bytes, to support VLAN. >> bug #2: need a minimum bound for the MTU as well >> >> > What is the minimum MTU? I remember a flamewar lkml about 200 byte MTU > for noisy radio links. Usually the ethernet standard 60 is fine. >>> + for (i=0; ; i++) { >>> + events = readl(base + NvRegIrqStatus) & NVREG_IRQSTAT_MASK; >>> + writel(NVREG_IRQSTAT_MASK, base + NvRegIrqStatus); >>> + pci_push(base); >>> + dprintk(KERN_DEBUG "%s: irq: %08x\n", dev->name, events); >>> + if (!(events & np->irqmask)) >>> + break; >> >> >> bug: check for 0xffffffff >> >> > What causes 0xfffffff? Hotplug? I think the irq handler could leave > immediately if a reserved bit is set. I'll add that. Yes, hot unplug or hardware fault. >>> + if (i == 32) { >>> + printk(KERN_INFO "%s: open: failing due to lack of suitable >>> PHY.\n", >>> + dev->name); >>> + ret = -EINVAL; >>> + goto out_drain; >>> + } >> >> >> bug: check #0 after checking #1, before giving up >> >> >> > MII id 0 a valid mii address? Or is that broadcast to all? It's usually something akin to an alias of one of the other phy id's, but if you found -no- phys at all, it wouldn't hurt to try zero. Thanks, Jeff