* [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only) @ 2007-08-06 21:57 Kok, Auke 2007-08-07 3:07 ` Andi Kleen 2007-08-07 16:26 ` Jeff Garzik 0 siblings, 2 replies; 6+ messages in thread From: Kok, Auke @ 2007-08-06 21:57 UTC (permalink / raw) To: Jeff Garzik, NetDev Cc: Andrew Morton, Arjan van de Ven, Ronciak, John, Auke Kok All, Another update on e1000e. Many thanks to Jeff for helping out and getting this going forward. The driver is unfortunately still too large to post, so please use the URL's below to review: http://foo-projects.org/~sofar/e1000e-20070806.patch [525k] http://foo-projects.org/~sofar/e1000e-20070806.patch.bz2 [90k] git://lost.foo-projects.org/~ahkok/git/linux-2.6 e1000e-20070806 Please note that the non-ICH9 device IDs are disabled, but this driver will (once you put them back in) properly work with all the current pci-e adapters that e1000 supports. Once this driver gets merged we'll move those devices over one by one. Cheers, Auke --- From: Auke Kok <auke-jan.h.kok@intel.com> Date: Mon, 6 Aug 2007 14:14:44 -0700 Subject: [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only) This driver implements support for the ICH9 on-board LAN ethernet device. The device is similar to ICH8. The driver encompasses code to support 82571/2/3, es2lan and ICH8 devices as well, but those device IDs are disabled and will be "lifted" from the e1000 driver over one at a time once this driver receives some more live time. Changes to the last snapshot posted are exclusively in the internal hardware API organization. Many thanks to Jeff Garzik for jumping in and getting this organized with a keen eye on the future layout. Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com> --- drivers/net/Kconfig | 23 + drivers/net/Makefile | 1 + drivers/net/e1000e/82571.c | 1382 +++++++++++++ drivers/net/e1000e/Makefile | 37 + drivers/net/e1000e/defines.h | 738 +++++++ drivers/net/e1000e/e1000.h | 518 +++++ drivers/net/e1000e/es2lan.c | 1255 ++++++++++++ drivers/net/e1000e/ethtool.c | 1763 +++++++++++++++++ drivers/net/e1000e/hw.h | 862 +++++++++ drivers/net/e1000e/ich8lan.c | 2297 ++++++++++++++++++++++ drivers/net/e1000e/lib.c | 2528 ++++++++++++++++++++++++ drivers/net/e1000e/netdev.c | 4413 ++++++++++++++++++++++++++++++++++++++++++ drivers/net/e1000e/param.c | 382 ++++ drivers/net/e1000e/phy.c | 1821 +++++++++++++++++ 14 files changed, 18020 insertions(+), 0 deletions(-) create mode 100644 drivers/net/e1000e/82571.c create mode 100644 drivers/net/e1000e/Makefile create mode 100644 drivers/net/e1000e/defines.h create mode 100644 drivers/net/e1000e/e1000.h create mode 100644 drivers/net/e1000e/es2lan.c create mode 100644 drivers/net/e1000e/ethtool.c create mode 100644 drivers/net/e1000e/hw.h create mode 100644 drivers/net/e1000e/ich8lan.c create mode 100644 drivers/net/e1000e/lib.c create mode 100644 drivers/net/e1000e/netdev.c create mode 100644 drivers/net/e1000e/param.c create mode 100644 drivers/net/e1000e/phy.c ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only) 2007-08-06 21:57 [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only) Kok, Auke @ 2007-08-07 3:07 ` Andi Kleen 2007-08-07 4:25 ` Kok, Auke 2007-08-10 20:17 ` Kok, Auke 2007-08-07 16:26 ` Jeff Garzik 1 sibling, 2 replies; 6+ messages in thread From: Andi Kleen @ 2007-08-07 3:07 UTC (permalink / raw) To: Kok, Auke Cc: Jeff Garzik, NetDev, Andrew Morton, Arjan van de Ven, Ronciak, John "Kok, Auke" <auke-jan.h.kok@intel.com> writes: > All, > > Another update on e1000e. Many thanks to Jeff for helping out and > getting this going forward. The driver is unfortunately still too > large to post, so please use the URL's below to review: Just some things I noticed; no comprehensive review +static void e1000_clear_hw_cntrs_82571(struct e1000_hw *hw) +{ + u32 temp; + + e1000_clear_hw_cntrs_base(hw); Would be much nicer with a table and a loop. Same in similar functions. + tx_ring->buffer_info[i].dma = + pci_map_single(pdev, skb->data, skb->len, + PCI_DMA_TODEVICE); Misses error handling. Multiple occurrences. + rx_ring->desc = pci_alloc_consistent(pdev, rx_ring->size, + &rx_ring->dma); If you use dma_alloc_coherent and don't hold a lock (I think you do not) you could specify GFP_KERNEL and be more reliable. p_a_c() unfortunately defaults to flakey GFP_ATOMIC for historical reasons Multiple occurrences. + skb = alloc_skb(2048 + NET_IP_ALIGN, GFP_KERNEL); alloc_skb already aligns to the next cache line, more might be not needed. The allocation is quite wasteful because you'll get a full 4K page with most of it unused. I remember this being discussed some time ago; it's sad even newer e1000 problems still have the same issue It's unclear why you clear the skbs here. + } while (good_cnt < 64 && jiffies < (time + 20)); Doesn't handle jiffies wrap; use time_* More occurrences all over. + mod_timer(&adapter->blink_timer, jiffies + E1000_ID_INTERVAL); Should use round_jiffies to avoid wakeups +s32 e1000_get_bus_info_pcie(struct e1000_hw *hw) A couple of drivers have similar functions. Should be really put into a generic function into the PCI layer instead of reinventing the wheel. + if (ret_val) + goto out; ... +out: + return ret_val; Totally unnecessary goto. Lots of occurrences. /* Force memory writes to complete before letting h/w + * know there are new descriptors to fetch. (Only + * applicable for weak-ordered memory model archs, + * such as IA-64). */ + wmb(); That is not what a memory barrier does. It just orders the writes, but doesn't actually flush them. + /* Make buffer alignment 2 beyond a 16 byte boundary + * this will result in a 16 byte aligned IP header after + * the 14 byte MAC header is removed + */ + skb_reserve(skb, NET_IP_ALIGN); At least on x86 (or other architectures with cheap unalignment support) it seems like a bad trade off :- it forces the NIC to do R-M-W to get these 14 bytes and it doesn't help the CPU too much. Have you tried if performance improves if the beginning is just cache line aligned? + /* It must be a TCP or UDP packet with a valid checksum */ You could set skb->protocol then if you know. If the hw also tells you if the packet was unicast for you then you could also set skb->pkt_type and avoid an early cache miss. In general you don't seem to care about PCI posting too much. I guess it's ok on Intel chipsets, but other chipsets do buffer a lot. >E1000_SUCCESS everywhere It is weird to have an own define for this. How about just 0 as the rest of the kernel? + prefetch(skb->data - NET_IP_ALIGN); The minus is useless. + prefetch(next_rxd); Should be probably prefetchw + skb_reserve(new_skb, NET_IP_ALIGN); + memcpy(new_skb->data - NET_IP_ALIGN, + skb->data - NET_IP_ALIGN, + length + NET_IP_ALIGN); Lots of effort to copy useless padding. -Andi ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only) 2007-08-07 3:07 ` Andi Kleen @ 2007-08-07 4:25 ` Kok, Auke 2007-08-07 5:37 ` Arjan van de Ven 2007-08-10 20:17 ` Kok, Auke 1 sibling, 1 reply; 6+ messages in thread From: Kok, Auke @ 2007-08-07 4:25 UTC (permalink / raw) To: Andi Kleen Cc: Kok, Auke, Jeff Garzik, NetDev, Andrew Morton, Arjan van de Ven, Ronciak, John Andi Kleen wrote: > "Kok, Auke" <auke-jan.h.kok@intel.com> writes: > >> All, >> >> Another update on e1000e. Many thanks to Jeff for helping out and >> getting this going forward. The driver is unfortunately still too >> large to post, so please use the URL's below to review: > > Just some things I noticed; no comprehensive review thanks, quick reply to one of the issues below, others I'll take into account and look into deeper. > + mod_timer(&adapter->blink_timer, jiffies + E1000_ID_INTERVAL); > Should use round_jiffies to avoid wakeups actually, not here - we don't want the led to blink unreliably. If the timer gets stalled beyond 1/2 a second and is irregular, you'll never be able to identify the proper adapter port in you data center. remember, this is only used once the user invokes 'ethtool -p' Auke ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only) 2007-08-07 4:25 ` Kok, Auke @ 2007-08-07 5:37 ` Arjan van de Ven 0 siblings, 0 replies; 6+ messages in thread From: Arjan van de Ven @ 2007-08-07 5:37 UTC (permalink / raw) To: Kok, Auke; +Cc: Andi Kleen, Jeff Garzik, NetDev, Andrew Morton, Ronciak, John Kok, Auke wrote: > Andi Kleen wrote: >> "Kok, Auke" <auke-jan.h.kok@intel.com> writes: >> >>> All, >>> >>> Another update on e1000e. Many thanks to Jeff for helping out and >>> getting this going forward. The driver is unfortunately still too >>> large to post, so please use the URL's below to review: >> >> Just some things I noticed; no comprehensive review > > thanks, quick reply to one of the issues below, others I'll take into > account and look into deeper. > > >> + mod_timer(&adapter->blink_timer, jiffies + E1000_ID_INTERVAL); >> Should use round_jiffies to avoid wakeups > > actually, not here - we don't want the led to blink unreliably. If the > timer gets stalled beyond 1/2 a second and is irregular, you'll never be > able to identify the proper adapter port in you data center. > > remember, this is only used once the user invokes 'ethtool -p' round_jiffies is regular, except for the first time.. (it's a whole-second only though) ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only) 2007-08-07 3:07 ` Andi Kleen 2007-08-07 4:25 ` Kok, Auke @ 2007-08-10 20:17 ` Kok, Auke 1 sibling, 0 replies; 6+ messages in thread From: Kok, Auke @ 2007-08-10 20:17 UTC (permalink / raw) To: Andi Kleen Cc: Kok, Auke, Jeff Garzik, NetDev, Andrew Morton, Arjan van de Ven, Ronciak, John Andi Kleen wrote: > "Kok, Auke" <auke-jan.h.kok@intel.com> writes: > >> All, >> >> Another update on e1000e. Many thanks to Jeff for helping out and >> getting this going forward. The driver is unfortunately still too >> large to post, so please use the URL's below to review: > > Just some things I noticed; no comprehensive review OK, I just pushed a bunch of patches to Jeff to address some of this... > +static void e1000_clear_hw_cntrs_82571(struct e1000_hw *hw) > +{ > + u32 temp; > + > + e1000_clear_hw_cntrs_base(hw); > > Would be much nicer with a table and a loop. Same in similar functions. I'm leaving this for now. We've always done it this way and while I tend to agree it could be more organized, it's a low priority for me to do stuff like this for now :) > + tx_ring->buffer_info[i].dma = > + pci_map_single(pdev, skb->data, skb->len, > + PCI_DMA_TODEVICE); > > Misses error handling. Multiple occurrences. I tried to address all of these properly. It's interesting to see how few drivers do this properly. I wonder how much it impacts performance and if it is really worth it. > + rx_ring->desc = pci_alloc_consistent(pdev, rx_ring->size, > + &rx_ring->dma); > > If you use dma_alloc_coherent and don't hold a lock (I think you do > not) you could specify GFP_KERNEL and be more reliable. p_a_c() > unfortunately defaults to flakey GFP_ATOMIC for historical reasons > Multiple occurrences. done, seems like a very good idea indead. > + skb = alloc_skb(2048 + NET_IP_ALIGN, GFP_KERNEL); > > alloc_skb already aligns to the next cache line, more might be not needed. > The allocation is quite wasteful because you'll get a full 4K page with > most of it unused. > > I remember this being discussed some time ago; it's sad even newer e1000 > problems still have the same issue not an e1000 problem. All drivers do this as it's a significant gain. See original thread from 2004 here: http://lwn.net/Articles/89770/ > It's unclear why you clear the skbs here. > > + } while (good_cnt < 64 && jiffies < (time + 20)); > Doesn't handle jiffies wrap; use time_* > More occurrences all over. made it use time_after() > + mod_timer(&adapter->blink_timer, jiffies + E1000_ID_INTERVAL); > Should use round_jiffies to avoid wakeups I don't expect people to have their leds blinking 24/7 while the rtnl lock is held, so those 2 extra wakeups per second are nice to have at proper intervals. > +s32 e1000_get_bus_info_pcie(struct e1000_hw *hw) > A couple of drivers have similar functions. Should be really put > into a generic function into the PCI layer instead of reinventing the wheel. > > + if (ret_val) > + goto out; > ... > +out: > + return ret_val; > > Totally unnecessary goto. Lots of occurrences. cleaned up. > /* Force memory writes to complete before letting h/w > + * know there are new descriptors to fetch. (Only > + * applicable for weak-ordered memory model archs, > + * such as IA-64). */ > + wmb(); > > That is not what a memory barrier does. It just orders the writes, > but doesn't actually flush them. > > + /* Make buffer alignment 2 beyond a 16 byte boundary > + * this will result in a 16 byte aligned IP header after > + * the 14 byte MAC header is removed > + */ > + skb_reserve(skb, NET_IP_ALIGN); > At least on x86 (or other architectures with cheap unalignment > support) it seems like a bad trade off :- it forces the NIC to > do R-M-W to get these 14 bytes and it doesn't help the CPU > too much. > > Have you tried if performance improves if the beginning is just > cache line aligned? I'll need to look into that, but as said above and as can be seen in the kernel code, on PPC (e.g.) NET_IP_ALIGN is zero due to the expensiveness of this alignment, and so we have that problem covered. > + /* It must be a TCP or UDP packet with a valid checksum */ > You could set skb->protocol then if you know. > If the hw also tells you if the packet was unicast for you then > you could also set skb->pkt_type and avoid an early cache miss. > > In general you don't seem to care about PCI posting too much. > I guess it's ok on Intel chipsets, but other chipsets do buffer > a lot. we've added (over time) some write flushes in problematic areas, I think we're OK currently. >> E1000_SUCCESS everywhere > It is weird to have an own define for this. How about just 0 as > the rest of the kernel? all gone. thanks for the comments, Auke ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only) 2007-08-06 21:57 [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only) Kok, Auke 2007-08-07 3:07 ` Andi Kleen @ 2007-08-07 16:26 ` Jeff Garzik 1 sibling, 0 replies; 6+ messages in thread From: Jeff Garzik @ 2007-08-07 16:26 UTC (permalink / raw) To: Kok, Auke Cc: NetDev, Andrew Morton, Arjan van de Ven, Ronciak, John, Andi Kleen Kok, Auke wrote: > From: Auke Kok <auke-jan.h.kok@intel.com> > Date: Mon, 6 Aug 2007 14:14:44 -0700 > Subject: [PATCH] e1000e: New pci-express e1000 driver (currently for > ICH9 devices only) > > This driver implements support for the ICH9 on-board LAN ethernet > device. The device is similar to ICH8. > > The driver encompasses code to support 82571/2/3, es2lan and ICH8 > devices as well, but those device IDs are disabled and will be > "lifted" from the e1000 driver over one at a time once this driver > receives some more live time. > > Changes to the last snapshot posted are exclusively in the internal > hardware API organization. Many thanks to Jeff Garzik for jumping in > and getting this organized with a keen eye on the future layout. > > Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com> Thanks for posting the patch in a git-am friendly format :) I merged this into netdev-2.6.git#e1000e just now, and pulled it into netdev-2.6.git#ALL so that Andrew's -mm tree will automatically pick up this driver. Please submit e1000e in the form of follow-up patches to #e1000e, rather than reposting the entire driver. We'll leave it on this side branch for a little while, to give others a chance to review and test, and give you (auke) a chance to update for Andi's comments etc. Jeff ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2007-08-10 20:17 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-08-06 21:57 [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only) Kok, Auke 2007-08-07 3:07 ` Andi Kleen 2007-08-07 4:25 ` Kok, Auke 2007-08-07 5:37 ` Arjan van de Ven 2007-08-10 20:17 ` Kok, Auke 2007-08-07 16:26 ` Jeff Garzik
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).