* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. [not found] ` <471E2AD0.1000500@intel.com> @ 2007-10-23 20:40 ` Jeff Garzik 2007-10-23 21:01 ` Kok, Auke ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: Jeff Garzik @ 2007-10-23 20:40 UTC (permalink / raw) To: Kok, Auke; +Cc: Adam Jackson, linux-kernel, David Miller, netdev Kok, Auke wrote: > Adam Jackson wrote: >> On Tue, 2007-10-23 at 09:18 -0700, Kok, Auke wrote: >>> Adam Jackson wrote: >>>> When the EEPROM gets corrupted, you can fix it with ethtool, but only if >>>> the module loads and creates a network device. But, without this option, >>>> if the EEPROM is corrupted, the driver will not create a network device. >>>> >>>> Signed-off-by: Adam Jackson <ajax@redhat.com> >>> NAK >>> >>> wrong list, not sent to me, and while for e100 I was OK with this patch, for e1000 >>> it really does not make sense to 'just allow' a bad checksum - if your eeprom is >>> randomly messed up then you cannot just fix it like this anyway. >> That's strange, I managed to recover an otherwise horked e1000 with it. >> What should I have done instead? > > > Dump the eeprom and send us a copy, plus any and all information to the card, > system etc.. I realize that you need the patch to actually create it but the > danger is that people will start using it *without* troubleshooting the real > issue. In various systems the eeprom checksum failure is actually due to a > misconfigured powersavings feature and the checksum is really not bad at all, but > the card just reports random values. > > In any case, this patch should not be merged. We often send it around to users to > debug their issue in case it involves eeproms, but merging it will just conceal > the real issue and all of a sudden a flood of people stop reporting *real* issues > to us. Sorry, I disagree. Just as with e100, if there is a clear way the user can recover their setup -- and Adam says his was effective -- I don't see why we should be denying users the ability to use their own hardware. Jeff ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. 2007-10-23 20:40 ` [PATCH] Add eeprom_bad_csum_allow module option to e1000 Jeff Garzik @ 2007-10-23 21:01 ` Kok, Auke 2007-10-23 21:51 ` David Miller 2007-10-23 21:20 ` Dave Jones 2007-10-23 21:48 ` David Miller 2 siblings, 1 reply; 21+ messages in thread From: Kok, Auke @ 2007-10-23 21:01 UTC (permalink / raw) To: Jeff Garzik; +Cc: Adam Jackson, linux-kernel, David Miller, netdev Jeff Garzik wrote: > Kok, Auke wrote: >> Adam Jackson wrote: >>> On Tue, 2007-10-23 at 09:18 -0700, Kok, Auke wrote: >>>> Adam Jackson wrote: >>>>> When the EEPROM gets corrupted, you can fix it with ethtool, but >>>>> only if >>>>> the module loads and creates a network device. But, without this >>>>> option, >>>>> if the EEPROM is corrupted, the driver will not create a network >>>>> device. >>>>> >>>>> Signed-off-by: Adam Jackson <ajax@redhat.com> >>>> NAK >>>> >>>> wrong list, not sent to me, and while for e100 I was OK with this >>>> patch, for e1000 >>>> it really does not make sense to 'just allow' a bad checksum - if >>>> your eeprom is >>>> randomly messed up then you cannot just fix it like this anyway. >>> That's strange, I managed to recover an otherwise horked e1000 with it. >>> What should I have done instead? >> >> >> Dump the eeprom and send us a copy, plus any and all information to >> the card, >> system etc.. I realize that you need the patch to actually create it >> but the >> danger is that people will start using it *without* troubleshooting >> the real >> issue. In various systems the eeprom checksum failure is actually due >> to a >> misconfigured powersavings feature and the checksum is really not bad >> at all, but >> the card just reports random values. >> >> In any case, this patch should not be merged. We often send it around >> to users to >> debug their issue in case it involves eeproms, but merging it will >> just conceal >> the real issue and all of a sudden a flood of people stop reporting >> *real* issues >> to us. > > > Sorry, I disagree. Just as with e100, if there is a clear way the user > can recover their setup -- and Adam says his was effective -- I don't > see why we should be denying users the ability to use their own hardware. That's not even relevant, I already offer the same patch offline to people who *really* only have a wrong checksum, AFTER we check the contents of their eeprom for them. We help everyone out, and if you merge this patch you will prevent users from getting to us for support in the first place. I realize that we should probably document the "bad eeprom checksum" case more decently but I think merging this patch is a bad idea for the *user* in all cases. You completely bypass the fact that e100 eeproms and e1000 eeproms are completely different beasts as well, one can be practically empty in all cases (e100), the other one every bit counts (most e1000's), which is an unfair representation and falsely tells the user that he can just do this without any information or disclaimer as to what he may expect afterwards. Auke ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. 2007-10-23 21:01 ` Kok, Auke @ 2007-10-23 21:51 ` David Miller 0 siblings, 0 replies; 21+ messages in thread From: David Miller @ 2007-10-23 21:51 UTC (permalink / raw) To: auke-jan.h.kok; +Cc: jeff, ajax, linux-kernel, netdev From: "Kok, Auke" <auke-jan.h.kok@intel.com> Date: Tue, 23 Oct 2007 14:01:21 -0700 > We help everyone out, and if you merge this patch you will prevent > users from getting to us for support in the first place. If using the bad eeprom has to be explicitly enabled by the user, your argument holds no water. We just need to make sure the patch does that. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. 2007-10-23 20:40 ` [PATCH] Add eeprom_bad_csum_allow module option to e1000 Jeff Garzik 2007-10-23 21:01 ` Kok, Auke @ 2007-10-23 21:20 ` Dave Jones 2007-10-23 21:38 ` Alan Cox ` (2 more replies) 2007-10-23 21:48 ` David Miller 2 siblings, 3 replies; 21+ messages in thread From: Dave Jones @ 2007-10-23 21:20 UTC (permalink / raw) To: Jeff Garzik; +Cc: Kok, Auke, Adam Jackson, linux-kernel, David Miller, netdev On Tue, Oct 23, 2007 at 04:40:01PM -0400, Jeff Garzik wrote: > > In any case, this patch should not be merged. We often send it around to users to > > debug their issue in case it involves eeproms, but merging it will just conceal > > the real issue and all of a sudden a flood of people stop reporting *real* issues > > to us. > > Sorry, I disagree. Just as with e100, if there is a clear way the user > can recover their setup -- and Adam says his was effective -- I don't > see why we should be denying users the ability to use their own hardware. Indeed. This is a common enough problem that not including it causes more pain than its worth. I have two affected boxes myself that I actually thought the hardware was dead before I tried ajax's patch. People aren't going to report this as a bug. They aren't going to try out patches, they're going to do what I did and stick another network card in the box and go on with life. Our users deserve better than this. Dave -- http://www.codemonkey.org.uk ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. 2007-10-23 21:20 ` Dave Jones @ 2007-10-23 21:38 ` Alan Cox 2007-10-23 21:53 ` David Miller 2007-10-23 23:03 ` [PATCH] Add eeprom_bad_csum_allow module option to e1000 Kok, Auke 2 siblings, 0 replies; 21+ messages in thread From: Alan Cox @ 2007-10-23 21:38 UTC (permalink / raw) To: Dave Jones Cc: Jeff Garzik, Kok, Auke, Adam Jackson, linux-kernel, David Miller, netdev > People aren't going to report this as a bug. They aren't going to try out patches, > they're going to do what I did and stick another network card in the box and > go on with life. > > Our users deserve better than this. Agreed. By all means warn people, or give them a 1-800 Intel number to phone, but they should be able to continue as well. Alan ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. 2007-10-23 21:20 ` Dave Jones 2007-10-23 21:38 ` Alan Cox @ 2007-10-23 21:53 ` David Miller 2007-10-23 23:19 ` Kok, Auke 2007-10-24 0:55 ` [PATCH] e1000, e1000e valid-addr fixes Jeff Garzik 2007-10-23 23:03 ` [PATCH] Add eeprom_bad_csum_allow module option to e1000 Kok, Auke 2 siblings, 2 replies; 21+ messages in thread From: David Miller @ 2007-10-23 21:53 UTC (permalink / raw) To: davej; +Cc: jeff, auke-jan.h.kok, ajax, linux-kernel, netdev From: Dave Jones <davej@redhat.com> Date: Tue, 23 Oct 2007 17:20:26 -0400 > Indeed. This is a common enough problem that not including it causes > more pain than its worth. I have two affected boxes myself that I > actually thought the hardware was dead before I tried ajax's patch. > > People aren't going to report this as a bug. They aren't going to > try out patches, they're going to do what I did and stick another > network card in the box and go on with life. > > Our users deserve better than this. Seconded. The resistence to this patch is just flat-out rediculious, just like it was in the e100 case. And I think all of this "e1000 is different!" talk is merely a scarecrow for the fact that Intel simply doesn't want this patch merged for some other reason. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. 2007-10-23 21:53 ` David Miller @ 2007-10-23 23:19 ` Kok, Auke 2007-10-24 0:55 ` [PATCH] e1000, e1000e valid-addr fixes Jeff Garzik 1 sibling, 0 replies; 21+ messages in thread From: Kok, Auke @ 2007-10-23 23:19 UTC (permalink / raw) To: David Miller; +Cc: davej, jeff, ajax, linux-kernel, netdev David Miller wrote: > From: Dave Jones <davej@redhat.com> > Date: Tue, 23 Oct 2007 17:20:26 -0400 > >> Indeed. This is a common enough problem that not including it causes >> more pain than its worth. I have two affected boxes myself that I >> actually thought the hardware was dead before I tried ajax's patch. >> >> People aren't going to report this as a bug. They aren't going to >> try out patches, they're going to do what I did and stick another >> network card in the box and go on with life. >> >> Our users deserve better than this. > > Seconded. The resistence to this patch is just flat-out rediculious, > just like it was in the e100 case. > > And I think all of this "e1000 is different!" talk is merely a > scarecrow for the fact that Intel simply doesn't want this patch > merged for some other reason. no, e1000 eeproms contain many timing information and bits crucial to getting the adapter working in the first place. All of these are documented in our PUBLICALLY available SDM which is downloadable from our e1000.sf.net website. (e.g. 8254x sdm, section 5.6, page 98+). For pci-e silicon this gets much more complex. we haven't even heard from the user what hardware he has nor gotten an eeprom dump from him. I'm not hiding anything and you're deliberately creating a negative atmosphere here. The people who do have eeprom checksum issues have come to us in the past. The fact that you didn't see them means that they *properly* made it to us. As a matter of fact I am still working on a permanent solution for bad eeprom checksums on lenovo T60 laptops. Should I just drop that issue and leave the real problem unsolved? This patch only affirms *YOUR* point of view, not that of many customers who have come to us and received help with many issues. You're completely ignoring that and that is unfair. If we want this patch in the kernel in some form that actually shows in a decent way what a user *should* do and more importantly should *know*, then maybe we can talk about that. The patch in question does not add any extra information nor does it do some sanity checking on the eeprom values or turn off any of the problematic features that we really should disable. We even have code that allows some of the hardware to run without a properly setup eeprom in a few hardware cases. And we definately should print out a lot more warnings to the user that running with garbage eeprom data is _not_ a good idea. Auke ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH] e1000, e1000e valid-addr fixes 2007-10-23 21:53 ` David Miller 2007-10-23 23:19 ` Kok, Auke @ 2007-10-24 0:55 ` Jeff Garzik 2007-10-24 1:03 ` Jeff Garzik 2007-10-24 1:15 ` Adrian Bunk 1 sibling, 2 replies; 21+ messages in thread From: Jeff Garzik @ 2007-10-24 0:55 UTC (permalink / raw) To: David Miller; +Cc: davej, auke-jan.h.kok, ajax, linux-kernel, netdev [-- Attachment #1: Type: text/plain, Size: 579 bytes --] Actually, looking over the code I see obvious bugs in the logic: An invalid ethernet address should not cause device loading to fail, because the user is given the opportunity to supply a MAC address via userspace (ifconfig or whatever) before the interface goes up. I just created the attached -bug fix- patch as illustration, though I have not committed it, waiting for comment. This patch will make no difference for users hitting invalid-eep-csum rather than invalid-MAC-addr condition, but it's a problem I noticed while reviewing Adam's patch in detail. Jeff [-- Attachment #2: patch.e1000-addr --] [-- Type: text/plain, Size: 3740 bytes --] diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index f1ce348..8936d48 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -1022,11 +1022,6 @@ e1000_probe(struct pci_dev *pdev, memcpy(netdev->dev_addr, adapter->hw.mac_addr, netdev->addr_len); memcpy(netdev->perm_addr, adapter->hw.mac_addr, netdev->addr_len); - if (!is_valid_ether_addr(netdev->perm_addr)) { - DPRINTK(PROBE, ERR, "Invalid MAC Address\n"); - goto err_eeprom; - } - e1000_get_bus_info(&adapter->hw); init_timer(&adapter->tx_fifo_stall_timer); @@ -1134,7 +1129,9 @@ e1000_probe(struct pci_dev *pdev, "32-bit")); } - printk("%s\n", print_mac(mac, netdev->dev_addr)); + printk("%s%s\n", + print_mac(mac, netdev->dev_addr), + is_valid_ether_addr(netdev->dev_addr) ? "" : " (INVALID)"); /* reset the hardware with the new settings */ e1000_reset(adapter); @@ -1396,6 +1393,9 @@ e1000_open(struct net_device *netdev) struct e1000_adapter *adapter = netdev_priv(netdev); int err; + if (!is_valid_ether_addr(netdev->dev_addr)) + return -EINVAL; + /* disallow open during test */ if (test_bit(__E1000_TESTING, &adapter->flags)) return -EBUSY; @@ -2377,7 +2377,7 @@ e1000_set_mac(struct net_device *netdev, void *p) struct sockaddr *addr = p; if (!is_valid_ether_addr(addr->sa_data)) - return -EADDRNOTAVAIL; + return -EINVAL; /* 82542 2.0 needs to be in reset to write receive address registers */ diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c index 033e124..0e3216c 100644 --- a/drivers/net/e1000e/netdev.c +++ b/drivers/net/e1000e/netdev.c @@ -2557,6 +2557,9 @@ static int e1000_open(struct net_device *netdev) struct e1000_hw *hw = &adapter->hw; int err; + if (!is_valid_ether_addr(netdev->dev_addr)) + return -EINVAL; + /* disallow open during test */ if (test_bit(__E1000_TESTING, &adapter->state)) return -EBUSY; @@ -2670,7 +2673,7 @@ static int e1000_set_mac(struct net_device *netdev, void *p) struct sockaddr *addr = p; if (!is_valid_ether_addr(addr->sa_data)) - return -EADDRNOTAVAIL; + return -EINVAL; memcpy(netdev->dev_addr, addr->sa_data, netdev->addr_len); memcpy(adapter->hw.mac.addr, addr->sa_data, netdev->addr_len); @@ -3960,14 +3963,16 @@ static void e1000_print_device_info(struct e1000_adapter *adapter) /* print bus type/speed/width info */ ndev_info(netdev, "(PCI Express:2.5GB/s:%s) " - "%02x:%02x:%02x:%02x:%02x:%02x\n", + "%02x:%02x:%02x:%02x:%02x:%02x%s\n", /* bus width */ ((hw->bus.width == e1000_bus_width_pcie_x4) ? "Width x4" : "Width x1"), /* MAC address */ netdev->dev_addr[0], netdev->dev_addr[1], netdev->dev_addr[2], netdev->dev_addr[3], - netdev->dev_addr[4], netdev->dev_addr[5]); + netdev->dev_addr[4], netdev->dev_addr[5], + is_valid_ether_addr(netdev->dev_addr) ? + "" : " (INVALID)"); ndev_info(netdev, "Intel(R) PRO/%s Network Connection\n", (hw->phy.type == e1000_phy_ife) ? "10/100" : "1000"); @@ -4170,16 +4175,6 @@ static int __devinit e1000_probe(struct pci_dev *pdev, memcpy(netdev->dev_addr, adapter->hw.mac.addr, netdev->addr_len); memcpy(netdev->perm_addr, adapter->hw.mac.addr, netdev->addr_len); - if (!is_valid_ether_addr(netdev->perm_addr)) { - ndev_err(netdev, "Invalid MAC Address: " - "%02x:%02x:%02x:%02x:%02x:%02x\n", - netdev->perm_addr[0], netdev->perm_addr[1], - netdev->perm_addr[2], netdev->perm_addr[3], - netdev->perm_addr[4], netdev->perm_addr[5]); - err = -EIO; - goto err_eeprom; - } - init_timer(&adapter->watchdog_timer); adapter->watchdog_timer.function = &e1000_watchdog; adapter->watchdog_timer.data = (unsigned long) adapter; ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] e1000, e1000e valid-addr fixes 2007-10-24 0:55 ` [PATCH] e1000, e1000e valid-addr fixes Jeff Garzik @ 2007-10-24 1:03 ` Jeff Garzik 2007-10-24 1:07 ` David Miller 2007-10-24 1:15 ` Adrian Bunk 1 sibling, 1 reply; 21+ messages in thread From: Jeff Garzik @ 2007-10-24 1:03 UTC (permalink / raw) To: netdev; +Cc: David Miller, davej, auke-jan.h.kok, ajax, linux-kernel Jeff Garzik wrote: > Actually, looking over the code I see obvious bugs in the logic: > > An invalid ethernet address should not cause device loading to fail, > because the user is given the opportunity to supply a MAC address via > userspace (ifconfig or whatever) before the interface goes up. > > I just created the attached -bug fix- patch as illustration, though I > have not committed it, waiting for comment. > > This patch will make no difference for users hitting invalid-eep-csum > rather than invalid-MAC-addr condition, but it's a problem I noticed > while reviewing Adam's patch in detail. Adding my own comment :) Does the ethernet stack check is_valid_ether_addr() before permitting interface-up? I'm wondering if there is a way to avoid adding if (!is_valid_ether_addr(dev->dev_addr)) return -EINVAL; to every ethernet driver's ->open() hook. Jeff ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] e1000, e1000e valid-addr fixes 2007-10-24 1:03 ` Jeff Garzik @ 2007-10-24 1:07 ` David Miller 2007-10-24 2:20 ` Jeff Garzik 0 siblings, 1 reply; 21+ messages in thread From: David Miller @ 2007-10-24 1:07 UTC (permalink / raw) To: jeff; +Cc: netdev, davej, auke-jan.h.kok, ajax, linux-kernel From: Jeff Garzik <jeff@garzik.org> Date: Tue, 23 Oct 2007 21:03:36 -0400 > I'm wondering if there is a way to avoid adding > > if (!is_valid_ether_addr(dev->dev_addr)) > return -EINVAL; > > to every ethernet driver's ->open() hook. The first idea I get is: 1) Create netdev->validate_dev_addr(). 2) If it exists, invoke it before ->open(), abort and return if any errors signaled. etherdev init hooks up a function that does the above check, which allows us to avoid editing every ethernet driver What do you think? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] e1000, e1000e valid-addr fixes 2007-10-24 1:07 ` David Miller @ 2007-10-24 2:20 ` Jeff Garzik 2007-10-24 2:23 ` David Miller 2007-11-01 18:11 ` Stephen Hemminger 0 siblings, 2 replies; 21+ messages in thread From: Jeff Garzik @ 2007-10-24 2:20 UTC (permalink / raw) To: David Miller; +Cc: netdev, davej, auke-jan.h.kok, ajax, linux-kernel [-- Attachment #1: Type: text/plain, Size: 649 bytes --] David Miller wrote: > From: Jeff Garzik <jeff@garzik.org> > Date: Tue, 23 Oct 2007 21:03:36 -0400 > >> I'm wondering if there is a way to avoid adding >> >> if (!is_valid_ether_addr(dev->dev_addr)) >> return -EINVAL; >> >> to every ethernet driver's ->open() hook. > > The first idea I get is: > > 1) Create netdev->validate_dev_addr(). > > 2) If it exists, invoke it before ->open(), abort > and return if any errors signaled. > > etherdev init hooks up a function that does the above > check, which allows us to avoid editing every ethernet > driver > > What do you think? Seems sane to me. Something like this (attached)? Jeff [-- Attachment #2: patch.validate-addr --] [-- Type: text/plain, Size: 2000 bytes --] diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 4a3f54e..962d1de 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -669,6 +669,8 @@ struct net_device #define HAVE_SET_MAC_ADDR int (*set_mac_address)(struct net_device *dev, void *addr); +#define HAVE_VALIDATE_ADDR + int (*validate_addr)(struct net_device *dev); #define HAVE_PRIVATE_IOCTL int (*do_ioctl)(struct net_device *dev, struct ifreq *ifr, int cmd); diff --git a/net/core/dev.c b/net/core/dev.c index 8726589..f861555 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1007,17 +1007,20 @@ int dev_open(struct net_device *dev) * Call device private open method */ set_bit(__LINK_STATE_START, &dev->state); - if (dev->open) { + + if (dev->validate_addr) + ret = dev->validate_addr(dev); + + if (!ret && dev->open) ret = dev->open(dev); - if (ret) - clear_bit(__LINK_STATE_START, &dev->state); - } /* * If it went open OK then: */ - if (!ret) { + if (ret) + clear_bit(__LINK_STATE_START, &dev->state); + else { /* * Set the flags. */ @@ -1038,6 +1041,7 @@ int dev_open(struct net_device *dev) */ call_netdevice_notifiers(NETDEV_UP, dev); } + return ret; } diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c index ed8a3d4..5471cd2 100644 --- a/net/ethernet/eth.c +++ b/net/ethernet/eth.c @@ -298,6 +298,14 @@ static int eth_change_mtu(struct net_device *dev, int new_mtu) return 0; } +static int eth_validate_addr(struct net_device *dev) +{ + if (!is_valid_ether_addr(dev->dev_addr)) + return -EINVAL; + + return 0; +} + const struct header_ops eth_header_ops ____cacheline_aligned = { .create = eth_header, .parse = eth_header_parse, @@ -317,6 +325,7 @@ void ether_setup(struct net_device *dev) dev->change_mtu = eth_change_mtu; dev->set_mac_address = eth_mac_addr; + dev->validate_addr = eth_validate_addr; dev->type = ARPHRD_ETHER; dev->hard_header_len = ETH_HLEN; ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] e1000, e1000e valid-addr fixes 2007-10-24 2:20 ` Jeff Garzik @ 2007-10-24 2:23 ` David Miller 2007-11-01 18:04 ` Kok, Auke 2007-11-01 18:11 ` Stephen Hemminger 1 sibling, 1 reply; 21+ messages in thread From: David Miller @ 2007-10-24 2:23 UTC (permalink / raw) To: jeff; +Cc: netdev, davej, auke-jan.h.kok, ajax, linux-kernel From: Jeff Garzik <jeff@garzik.org> Date: Tue, 23 Oct 2007 22:20:30 -0400 > David Miller wrote: > > From: Jeff Garzik <jeff@garzik.org> > > Date: Tue, 23 Oct 2007 21:03:36 -0400 > > > >> I'm wondering if there is a way to avoid adding > >> > >> if (!is_valid_ether_addr(dev->dev_addr)) > >> return -EINVAL; > >> > >> to every ethernet driver's ->open() hook. > > > > The first idea I get is: > > > > 1) Create netdev->validate_dev_addr(). > > > > 2) If it exists, invoke it before ->open(), abort > > and return if any errors signaled. > > > > etherdev init hooks up a function that does the above > > check, which allows us to avoid editing every ethernet > > driver > > > > What do you think? > > Seems sane to me. Something like this (attached)? Looks great: Acked-by: David S. Miller <davem@davemloft.net> ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] e1000, e1000e valid-addr fixes 2007-10-24 2:23 ` David Miller @ 2007-11-01 18:04 ` Kok, Auke 2007-11-01 18:47 ` Jeff Garzik 0 siblings, 1 reply; 21+ messages in thread From: Kok, Auke @ 2007-11-01 18:04 UTC (permalink / raw) To: David Miller; +Cc: jeff, netdev, davej, ajax, linux-kernel David Miller wrote: > From: Jeff Garzik <jeff@garzik.org> > Date: Tue, 23 Oct 2007 22:20:30 -0400 > >> David Miller wrote: >>> From: Jeff Garzik <jeff@garzik.org> >>> Date: Tue, 23 Oct 2007 21:03:36 -0400 >>> >>>> I'm wondering if there is a way to avoid adding >>>> >>>> if (!is_valid_ether_addr(dev->dev_addr)) >>>> return -EINVAL; >>>> >>>> to every ethernet driver's ->open() hook. >>> The first idea I get is: >>> >>> 1) Create netdev->validate_dev_addr(). >>> >>> 2) If it exists, invoke it before ->open(), abort >>> and return if any errors signaled. >>> >>> etherdev init hooks up a function that does the above >>> check, which allows us to avoid editing every ethernet >>> driver >>> >>> What do you think? >> Seems sane to me. Something like this (attached)? > > Looks great: > > Acked-by: David S. Miller <davem@davemloft.net> I like it. Should I start sending patches to remove the checks from e1000/e1000e/ixgb/ixgbe already (to David, I assume?)? Auke ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] e1000, e1000e valid-addr fixes 2007-11-01 18:04 ` Kok, Auke @ 2007-11-01 18:47 ` Jeff Garzik 0 siblings, 0 replies; 21+ messages in thread From: Jeff Garzik @ 2007-11-01 18:47 UTC (permalink / raw) To: Kok, Auke; +Cc: David Miller, netdev, davej, ajax, linux-kernel Kok, Auke wrote: > David Miller wrote: >> From: Jeff Garzik <jeff@garzik.org> >> Date: Tue, 23 Oct 2007 22:20:30 -0400 >> >>> David Miller wrote: >>>> From: Jeff Garzik <jeff@garzik.org> >>>> Date: Tue, 23 Oct 2007 21:03:36 -0400 >>>> >>>>> I'm wondering if there is a way to avoid adding >>>>> >>>>> if (!is_valid_ether_addr(dev->dev_addr)) >>>>> return -EINVAL; >>>>> >>>>> to every ethernet driver's ->open() hook. >>>> The first idea I get is: >>>> >>>> 1) Create netdev->validate_dev_addr(). >>>> >>>> 2) If it exists, invoke it before ->open(), abort >>>> and return if any errors signaled. >>>> >>>> etherdev init hooks up a function that does the above >>>> check, which allows us to avoid editing every ethernet >>>> driver >>>> >>>> What do you think? >>> Seems sane to me. Something like this (attached)? >> Looks great: >> >> Acked-by: David S. Miller <davem@davemloft.net> > > I like it. > > Should I start sending patches to remove the checks from e1000/e1000e/ixgb/ixgbe > already (to David, I assume?)? Send the patches to me like normal... Jeff ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] e1000, e1000e valid-addr fixes 2007-10-24 2:20 ` Jeff Garzik 2007-10-24 2:23 ` David Miller @ 2007-11-01 18:11 ` Stephen Hemminger 2007-11-01 19:31 ` Jeff Garzik 1 sibling, 1 reply; 21+ messages in thread From: Stephen Hemminger @ 2007-11-01 18:11 UTC (permalink / raw) To: Jeff Garzik Cc: David Miller, netdev, davej, auke-jan.h.kok, ajax, linux-kernel How about: static int eth_validate_addr(const struct net_device *dev) { return is_valid_ether_addr(dev->dev_addr) ? 0 : -EINVAL; } -- Stephen Hemminger <shemminger@linux-foundation.org> ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] e1000, e1000e valid-addr fixes 2007-11-01 18:11 ` Stephen Hemminger @ 2007-11-01 19:31 ` Jeff Garzik 0 siblings, 0 replies; 21+ messages in thread From: Jeff Garzik @ 2007-11-01 19:31 UTC (permalink / raw) To: Stephen Hemminger Cc: David Miller, netdev, davej, auke-jan.h.kok, ajax, linux-kernel Stephen Hemminger wrote: > How about: > > static int eth_validate_addr(const struct net_device *dev) > { > return is_valid_ether_addr(dev->dev_addr) ? 0 : -EINVAL; > } hmmm -- its a slow path, so I don't see the value of marking the argument 'const' -- right now this implementation merely reads the dev->dev_addr[], but that need not always be the case. And I don't see the value of squashing everything onto one line, IMO the current version is more readable. Jeff ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] e1000, e1000e valid-addr fixes 2007-10-24 0:55 ` [PATCH] e1000, e1000e valid-addr fixes Jeff Garzik 2007-10-24 1:03 ` Jeff Garzik @ 2007-10-24 1:15 ` Adrian Bunk 1 sibling, 0 replies; 21+ messages in thread From: Adrian Bunk @ 2007-10-24 1:15 UTC (permalink / raw) To: Jeff Garzik Cc: David Miller, davej, auke-jan.h.kok, ajax, linux-kernel, netdev On Tue, Oct 23, 2007 at 08:55:29PM -0400, Jeff Garzik wrote: > Actually, looking over the code I see obvious bugs in the logic: > > An invalid ethernet address should not cause device loading to fail, > because the user is given the opportunity to supply a MAC address via > userspace (ifconfig or whatever) before the interface goes up. > > I just created the attached -bug fix- patch as illustration, though I have > not committed it, waiting for comment. >... Is there a good reason why we have such checks duplicated in the drivers (with every driver doing it differently...) instead of doing it in net/core/dev.c? > Jeff cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. 2007-10-23 21:20 ` Dave Jones 2007-10-23 21:38 ` Alan Cox 2007-10-23 21:53 ` David Miller @ 2007-10-23 23:03 ` Kok, Auke 2007-10-23 23:53 ` Stephen Hemminger 2007-10-24 5:38 ` Dave Jones 2 siblings, 2 replies; 21+ messages in thread From: Kok, Auke @ 2007-10-23 23:03 UTC (permalink / raw) To: Dave Jones, Jeff Garzik, Kok, Auke, Adam Jackson, linux-kernel, David Miller <davem Dave Jones wrote: > On Tue, Oct 23, 2007 at 04:40:01PM -0400, Jeff Garzik wrote: > > > > In any case, this patch should not be merged. We often send it around to users to > > > debug their issue in case it involves eeproms, but merging it will just conceal > > > the real issue and all of a sudden a flood of people stop reporting *real* issues > > > to us. > > > > Sorry, I disagree. Just as with e100, if there is a clear way the user > > can recover their setup -- and Adam says his was effective -- I don't > > see why we should be denying users the ability to use their own hardware. > > Indeed. This is a common enough problem that not including it causes more pain > than its worth. I have two affected boxes myself that I actually thought > the hardware was dead before I tried ajax's patch. look: You should have reported this to us and you didn't. Now you are using the fact that you did not report it as an argument which is out of place. why do you say it is common? how often have you seen this and not reported it back to our support? are you willingly trying to frustrate this issue? Auke ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. 2007-10-23 23:03 ` [PATCH] Add eeprom_bad_csum_allow module option to e1000 Kok, Auke @ 2007-10-23 23:53 ` Stephen Hemminger 2007-10-24 5:38 ` Dave Jones 1 sibling, 0 replies; 21+ messages in thread From: Stephen Hemminger @ 2007-10-23 23:53 UTC (permalink / raw) To: Kok, Auke Cc: Dave Jones, Jeff Garzik, Kok, Auke, Adam Jackson, linux-kernel, David Miller, netdev On Tue, 23 Oct 2007 16:03:38 -0700 "Kok, Auke" <auke-jan.h.kok@intel.com> wrote: > Dave Jones wrote: > > On Tue, Oct 23, 2007 at 04:40:01PM -0400, Jeff Garzik wrote: > > > > > > In any case, this patch should not be merged. We often send it around to users to > > > > debug their issue in case it involves eeproms, but merging it will just conceal > > > > the real issue and all of a sudden a flood of people stop reporting *real* issues > > > > to us. > > > > > > Sorry, I disagree. Just as with e100, if there is a clear way the user > > > can recover their setup -- and Adam says his was effective -- I don't > > > see why we should be denying users the ability to use their own hardware. > > > > Indeed. This is a common enough problem that not including it causes more pain > > than its worth. I have two affected boxes myself that I actually thought > > the hardware was dead before I tried ajax's patch. > > > look: You should have reported this to us and you didn't. Now you are using the > fact that you did not report it as an argument which is out of place. > > why do you say it is common? how often have you seen this and not reported it back > to our support? are you willingly trying to frustrate this issue? > > > Auke What about a compromise like "ignore_checksum" module option? That way users with bad checksums wouldn't just ignore the problem (no one reads console logs), but would have a way to correct the checksum. There are many reasons would want the ability to fix the problem themselves without asking Intel. -- Stephen Hemminger <shemminger@linux-foundation.org> ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. 2007-10-23 23:03 ` [PATCH] Add eeprom_bad_csum_allow module option to e1000 Kok, Auke 2007-10-23 23:53 ` Stephen Hemminger @ 2007-10-24 5:38 ` Dave Jones 1 sibling, 0 replies; 21+ messages in thread From: Dave Jones @ 2007-10-24 5:38 UTC (permalink / raw) To: Kok, Auke; +Cc: Jeff Garzik, Adam Jackson, linux-kernel, David Miller, netdev On Tue, Oct 23, 2007 at 04:03:38PM -0700, Kok, Auke wrote: > Dave Jones wrote: > > On Tue, Oct 23, 2007 at 04:40:01PM -0400, Jeff Garzik wrote: > > > > > > In any case, this patch should not be merged. We often send it around to users to > > > > debug their issue in case it involves eeproms, but merging it will just conceal > > > > the real issue and all of a sudden a flood of people stop reporting *real* issues > > > > to us. > > > > > > Sorry, I disagree. Just as with e100, if there is a clear way the user > > > can recover their setup -- and Adam says his was effective -- I don't > > > see why we should be denying users the ability to use their own hardware. > > > > Indeed. This is a common enough problem that not including it causes more pain > > than its worth. I have two affected boxes myself that I actually thought > > the hardware was dead before I tried ajax's patch. > > > look: You should have reported this to us and you didn't. Now you are using the > fact that you did not report it as an argument which is out of place. you're missing the point. It looks like a hardware failure. Why would I report this? > why do you say it is common? how often have you seen this and not reported it back > to our support? are you willingly trying to frustrate this issue? Not at all. The only frustration here is that I used to have a kernel that worked, upgraded, and thought that my hardware was broken. How many other users thought the same ? Dave -- http://www.codemonkey.org.uk ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] Add eeprom_bad_csum_allow module option to e1000. 2007-10-23 20:40 ` [PATCH] Add eeprom_bad_csum_allow module option to e1000 Jeff Garzik 2007-10-23 21:01 ` Kok, Auke 2007-10-23 21:20 ` Dave Jones @ 2007-10-23 21:48 ` David Miller 2 siblings, 0 replies; 21+ messages in thread From: David Miller @ 2007-10-23 21:48 UTC (permalink / raw) To: jeff; +Cc: auke-jan.h.kok, ajax, linux-kernel, netdev From: Jeff Garzik <jeff@garzik.org> Date: Tue, 23 Oct 2007 16:40:01 -0400 > Sorry, I disagree. Just as with e100, if there is a clear way the user > can recover their setup -- and Adam says his was effective -- I don't > see why we should be denying users the ability to use their own hardware. I'd like to second these sentiments. Just because you can come up with cases where using a wrong eeprom would fail, does not mean the facility should not be provided at all. ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2007-11-01 19:31 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <11931515302013-git-send-email-ajax@redhat.com>
[not found] ` <471E1ECD.80002@intel.com>
[not found] ` <1193156487.26974.39.camel@localhost.localdomain>
[not found] ` <471E2AD0.1000500@intel.com>
2007-10-23 20:40 ` [PATCH] Add eeprom_bad_csum_allow module option to e1000 Jeff Garzik
2007-10-23 21:01 ` Kok, Auke
2007-10-23 21:51 ` David Miller
2007-10-23 21:20 ` Dave Jones
2007-10-23 21:38 ` Alan Cox
2007-10-23 21:53 ` David Miller
2007-10-23 23:19 ` Kok, Auke
2007-10-24 0:55 ` [PATCH] e1000, e1000e valid-addr fixes Jeff Garzik
2007-10-24 1:03 ` Jeff Garzik
2007-10-24 1:07 ` David Miller
2007-10-24 2:20 ` Jeff Garzik
2007-10-24 2:23 ` David Miller
2007-11-01 18:04 ` Kok, Auke
2007-11-01 18:47 ` Jeff Garzik
2007-11-01 18:11 ` Stephen Hemminger
2007-11-01 19:31 ` Jeff Garzik
2007-10-24 1:15 ` Adrian Bunk
2007-10-23 23:03 ` [PATCH] Add eeprom_bad_csum_allow module option to e1000 Kok, Auke
2007-10-23 23:53 ` Stephen Hemminger
2007-10-24 5:38 ` Dave Jones
2007-10-23 21:48 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).