* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified [not found] <15091.23655.488243.650394@pizda.ninka.net> @ 2001-05-13 18:19 ` kuznet 2001-05-14 1:43 ` Andrew Morton 0 siblings, 1 reply; 12+ messages in thread From: kuznet @ 2001-05-13 18:19 UTC (permalink / raw) To: David S. Miller; +Cc: linux-kernel Hello! > I believe these events get sent to the cardmgr daemon and it does > all the ifconf magic to change the device state. Compare this also to the situation with netif_present(). After Linus said that it is called from thread context, I prepared corresponding code for netif_present (and for carrier detection in assumption it is called from thread context or softirq) BUT... this happened to be not true. So, these macros still do not assume anything on context. As result netif_carrier* is unreliable, netif_present is still straight bug. Should be fixed, of course. BTW what did happen with Andrew's netdev registration patch? By some strange reason I believed it is already applied... Grrr. Alexey ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-13 18:19 ` NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified kuznet @ 2001-05-14 1:43 ` Andrew Morton 2001-05-14 17:47 ` kuznet 0 siblings, 1 reply; 12+ messages in thread From: Andrew Morton @ 2001-05-14 1:43 UTC (permalink / raw) To: kuznet; +Cc: David S. Miller, linux-kernel kuznet@ms2.inr.ac.ru wrote: > > BTW what did happen with Andrew's netdev registration patch? > By some strange reason I believed it is already applied... Grrr. Alan had it applied in his tree for a while. It was a damn huge patch though, and Jeff came up with a simpler version which is now present in both 2.4 streams. Jeff has introduced `alloc_etherdev()' which allocates storage for a netdev but doesn't register it. The one quirk with this approach (and why it's vastly simpler than my thing) is that the name of the interface is not known during probe, so the drivers need to take care that the identifier which they emit in their detection messages make sense. Old code: xxx_probe() { dev = init_etherdev(); < initialise stuff > printk("%s", dev->name); /* * If we sleep here, we drop BKL and other tasks/CPUs * can open an unready device. dev_probe_lock() plugs * this for PCI/Cardbus devices only */ if (failed) { unregister_netdev(dev); return -EWHATEVER; } return 0; } New code: xxx_probe(struct pci_dev *pdev, ...) { dev = alloc_etherdev(); <initialise stuff> /* * Note that dev->name doesn't make sense at this time. * So we use PCI locations. Drivers which support * EISA as well as PCI (eg 3c59x) will have a NULL * pdev here and need to make something up. */ printk("%s", pdev->slot_name); /* * If we sleep here, we're fine: the device isn't * registered in the namespace yet */ ... if (failed) { kfree(dev); return -EBUMMER; } return register_netdev(dev); } Not many drivers have been converted to the new interface yet. The situation with old-style drivers which use net_device.init() is a bit foggy. ISTR that the init() method was inherently immune to this race. - ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-14 1:43 ` Andrew Morton @ 2001-05-14 17:47 ` kuznet 2001-05-14 18:12 ` Jeff Garzik 0 siblings, 1 reply; 12+ messages in thread From: kuznet @ 2001-05-14 17:47 UTC (permalink / raw) To: Andrew Morton; +Cc: davem, linux-kernel Hello! > Jeff has introduced `alloc_etherdev()' which allocates storage > for a netdev but doesn't register it. The one quirk with this > approach (and why it's vastly simpler than my thing) I do not see where it is simpler. The only difference is that name is unknown. 8) > Not many drivers have been converted to the new interface yet. Paaardon! It is the only place where it takes sense to tell: "simpler" and it was sense of your patch! Of course, it is much "simpler" to leave all the devices in buggy state, no doubts. 8) What's about dev_probe_lock, I again do not understand why it is not deleted. Please, shed some light. > is a bit foggy. ISTR that the init() method was inherently > immune to this race. 8) Imagine, I believed that all the devices use this method for years. The discovery that init_etherdev does some shit was real catharsis. 8) Alexey ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-14 17:47 ` kuznet @ 2001-05-14 18:12 ` Jeff Garzik 2001-05-14 18:40 ` kuznet 0 siblings, 1 reply; 12+ messages in thread From: Jeff Garzik @ 2001-05-14 18:12 UTC (permalink / raw) To: kuznet; +Cc: Andrew Morton, davem, linux-kernel kuznet@ms2.inr.ac.ru wrote: > > Hello! > > > Jeff has introduced `alloc_etherdev()' which allocates storage > > for a netdev but doesn't register it. The one quirk with this > > approach (and why it's vastly simpler than my thing) > > I do not see where it is simpler. The only difference is that > name is unknown. 8) Note that using dev->name during probe was always incorrect. Think about the error case: device 0: dev = init_etherdev(...); /* gets if eth0 */ printk(... dev->name ...) /* prints "eth0: error foo, aborting" */ failure! exit and unregister_netdev device 1: dev = init_etherdev(...); /* gets if eth0 */ printk(... dev->name ...) /* prints "eth0: error foo, aborting" */ failure! exit and unregister_netdev device 2: dev = init_etherdev(...); /* gets if eth0 */ printk(... dev->name ...) /* prints "eth0: error foo, aborting" */ failure! exit and unregister_netdev So, using interface name in this manner was always buggy because it conveys no useful information to the user. > What's about dev_probe_lock, I again do not understand why it is not deleted. > Please, shed some light. I'm all for removing it... I do not like removing it in a so-called "stable" series, though. alloc_etherdev() was enough to solve the race and flush out buggy drivers using dev->name during probe. Notice I did not remove init_etherdev and fix it properly -- IMHO that is 2.5 material. Jeff -- Jeff Garzik | Game called on account of naked chick Building 1024 | MandrakeSoft | ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-14 18:12 ` Jeff Garzik @ 2001-05-14 18:40 ` kuznet 2001-05-14 19:27 ` Jeff Garzik 2001-05-14 23:51 ` Andrew Morton 0 siblings, 2 replies; 12+ messages in thread From: kuznet @ 2001-05-14 18:40 UTC (permalink / raw) To: Jeff Garzik; +Cc: andrewm, davem, linux-kernel Hello! > Note that using dev->name during probe was always incorrect. Think > about the error case: ... > So, using interface name in this manner was always buggy because it > conveys no useful information to the user. I used to think about cases of success. 8) In any case the question follows: do we have some another generic unique human-readable identifier? Only if device is PCI? Actually, I am puzzled mostly with Andrew's note about "simplicity". Andrew's patch was evidently much __simpler__ than yours, at least, it required one liner for each device and surely was not a "2.5 material". > I'm all for removing it... I do not like removing it in a so-called > "stable" series, though. alloc_etherdev() was enough to solve the race > and flush out buggy drivers using dev->name during probe. Notice I did > not remove init_etherdev and fix it properly -- IMHO that is 2.5 > material. Nope, guy. Fixing fatal bug is always material of released kernel. In any case the question remains: what is the sense of dev_probe_lock now? Alexey ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-14 18:40 ` kuznet @ 2001-05-14 19:27 ` Jeff Garzik 2001-05-14 19:42 ` kuznet 2001-05-14 23:51 ` Andrew Morton 1 sibling, 1 reply; 12+ messages in thread From: Jeff Garzik @ 2001-05-14 19:27 UTC (permalink / raw) To: kuznet; +Cc: andrewm, davem, linux-kernel kuznet@ms2.inr.ac.ru wrote: > > Hello! > > > Note that using dev->name during probe was always incorrect. Think > > about the error case: > ... > > So, using interface name in this manner was always buggy because it > > conveys no useful information to the user. > > I used to think about cases of success. 8) > In any case the question follows: do we have some another generic > unique human-readable identifier? Only if device is PCI? Each bus should come up with its own way to uniquely identify devices... such is required for SCSI and EthTool ioctls which request bus information (as a text string) for a given device. PCI is simply the one example I know well... pci_dev->slot_name. Note, however, I have come to think that "tulip0: ...", "tulip1: ...", etc. is more user-friendly than "00:f0.1: ...". > Actually, I am puzzled mostly with Andrew's note about "simplicity". > Andrew's patch was evidently much __simpler__ than yours, at least, > it required one liner for each device and surely was not a "2.5 material". Are you talking about his 140k patch? I think a key point of my patch is that drivers now follow the method of other kernel drivers: perform all setup necessary, and then register the device in a single operation. After register_foo(dev), all members of 'dev' are assumed to be filled in and ready for use. This is not the case with init_etherdev() normal usage, nor using dev->init()... Tangent - IMHO having register_netdev call dev->init is ugly and unusual compared to other driver APIs in the kernel. Your register function should not call out to driver functions, it should just register a new, already-set-up device in the subsystem and return. > > I'm all for removing it... I do not like removing it in a so-called > > "stable" series, though. alloc_etherdev() was enough to solve the race > > and flush out buggy drivers using dev->name during probe. Notice I did > > not remove init_etherdev and fix it properly -- IMHO that is 2.5 > > material. > > Nope, guy. Fixing fatal bug is always material of released kernel. So you say a fatal bug remains in 2.4.5-pre1? If so please elaborate... > In any case the question remains: what is the sense of dev_probe_lock now? I dunno. Andrew? I just looked at all users and it looks like it can be removed. Jeff -- Jeff Garzik | Game called on account of naked chick Building 1024 | MandrakeSoft | ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-14 19:27 ` Jeff Garzik @ 2001-05-14 19:42 ` kuznet 2001-05-14 20:34 ` Jeff Garzik 0 siblings, 1 reply; 12+ messages in thread From: kuznet @ 2001-05-14 19:42 UTC (permalink / raw) To: Jeff Garzik; +Cc: andrewm, davem, linux-kernel Hello! > Each bus should Not all the device are bound to some "bus". > Are you talking about his 140k patch? Yes! Size of patch and "simplicity" are orthogonal things. It was simple like potatoe. > I think a key point of my patch is that drivers now follow the method of > other kernel drivers: perform all setup necessary, and then register the > device in a single operation. Nice. I agreed. I talk about other thing: after applying Andrew's patch I saw good correct code. After you will fix all the devices, your patch will be the same 140K or more due to killing refs t dev->name announced to be illegal. 8) > After register_foo(dev), all members of > 'dev' are assumed to be filled in and ready for use. This is not the > case ....................... using dev->init()... Sorry? Why? > Tangent - IMHO having register_netdev call dev->init is ugly and unusual > compared to other driver APIs in the kernel. Your register function > should not call out to driver functions, it should just register a new, > already-set-up device in the subsystem and return. Provided you teach me some way to generate unique identifiers, different of device names. > So you say a fatal bug remains in 2.4.5-pre1? If so please elaborate... Probably, I am looking into different code, but I found only 15 references to new interface. Alexey ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-14 19:42 ` kuznet @ 2001-05-14 20:34 ` Jeff Garzik 0 siblings, 0 replies; 12+ messages in thread From: Jeff Garzik @ 2001-05-14 20:34 UTC (permalink / raw) To: kuznet; +Cc: andrewm, davem, linux-kernel kuznet@ms2.inr.ac.ru wrote: > > Each bus should > > Not all the device are bound to some "bus". True. Each driver author would make a decision, for what's best to appear in their probe time printk's... > > Are you talking about his 140k patch? > > Yes! > > Size of patch and "simplicity" are orthogonal things. > It was simple like potatoe. It was simple for existing code, I agree, but IMHO not correct WRT the dev->name error case mentioned earlier, and also different from the rest of the kernel driver APIs. > > I think a key point of my patch is that drivers now follow the method of > > other kernel drivers: perform all setup necessary, and then register the > > device in a single operation. > > Nice. I agreed. I talk about other thing: after applying Andrew's patch > I saw good correct code. After you will fix all the devices, your patch will > be the same 140K or more due to killing refs t dev->name announced > to be illegal. 8) true enough... > > After register_foo(dev), all members of > > 'dev' are assumed to be filled in and ready for use. This is not the > > case ....................... using dev->init()... > > Sorry? Why? Sorry -- I just rechecked the code, and I was mistaken. dev-init() is called earlier than I had thought.. > > Tangent - IMHO having register_netdev call dev->init is ugly and unusual > > compared to other driver APIs in the kernel. Your register function > > should not call out to driver functions, it should just register a new, > > already-set-up device in the subsystem and return. > > Provided you teach me some way to generate unique identifiers, different > of device names. I don't understand your point here. init_etherdev and alloc_etherdev users get by just fine without dev->init(). My thought is that dev->init is not needed at all -- simply require initialization before register_netdev[ice] is called. > > So you say a fatal bug remains in 2.4.5-pre1? If so please elaborate... > > Probably, I am looking into different code, but I found only 15 references > to new interface. Please let me know what bugs you find in the vanilla kernel... Regards, Jeff -- Jeff Garzik | Game called on account of naked chick Building 1024 | MandrakeSoft | ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-14 18:40 ` kuznet 2001-05-14 19:27 ` Jeff Garzik @ 2001-05-14 23:51 ` Andrew Morton 2001-05-15 9:02 ` kuznet 1 sibling, 1 reply; 12+ messages in thread From: Andrew Morton @ 2001-05-14 23:51 UTC (permalink / raw) To: kuznet; +Cc: Jeff Garzik, davem, linux-kernel kuznet@ms2.inr.ac.ru wrote: > > Hello! > > > Note that using dev->name during probe was always incorrect. Think > > about the error case: > ... > > So, using interface name in this manner was always buggy because it > > conveys no useful information to the user. > > I used to think about cases of success. 8) > In any case the question follows: do we have some another generic > unique human-readable identifier? Only if device is PCI? > > Actually, I am puzzled mostly with Andrew's note about "simplicity". > Andrew's patch was evidently much __simpler__ than yours, at least, > it required one liner for each device and surely was not a "2.5 material". mmm... I had to do a bit of mangling in the net core to be able to "reserve" the netdevice name at init_etherdev() time, but make it unavailable in namespace lookups. As Jeff points out, it was an unusual interface - a two-phase registration where we first reserve the name and then later either commit it or withdraw it. That's not the way kernel normally does things, and it was done that way for back-compatibility, and to make the device naming prettier. And yes, it was a 140k patch because it modified about eighty drivers, pulled out the old interfaces and pulled out dev_probe_lock(). > > I'm all for removing it... I do not like removing it in a so-called > > "stable" series, though. alloc_etherdev() was enough to solve the race > > and flush out buggy drivers using dev->name during probe. Notice I did > > not remove init_etherdev and fix it properly -- IMHO that is 2.5 > > material. > > Nope, guy. Fixing fatal bug is always material of released kernel. > > In any case the question remains: what is the sense of dev_probe_lock now? It protects the as-yet-unchanged PCI and Cardbus drivers from a fatal race. This is not some theoretical race, BTW: as an experiment I put a simple "schedule()" into vortex_probe1() and it killed the driver. Because: sys_init_module() ->vortex_probe1() ->init_etherdev() ->register_netdev() ->/sbin/hotplug (an asynchronous execve) -> ifconfig eth0 up ifconfig is executed before vortex_probe1() had fully initialised the device and the data structures. The probe takes tens of milliseconds. The only thing which prevents this race is the fact that sys_init_module() and sys_ioctl() both do lock_kernel(). If xxx_probe() drops the BKL, ifconfig gets in there and fails. Usually, ifconfig finds the driver has a null ->open() method, so the interface open "succeeds" but the hardware wasn't opened. A subsequent ->close() will probably crash the machine. That is what dev_probe_lock() is doing today. It blocks ifconfig while ->probe() is in progress. Drivers which do not use alloc_etherdev() and which can drop the BKL in probe may fail to work with /sbin/hotplug initialisation if dev_probe_lock() is removed. At present that seems to include acenic, eepro100, pcnet32 and everyone's 3c59x except mine :) - ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-14 23:51 ` Andrew Morton @ 2001-05-15 9:02 ` kuznet 2001-05-15 11:00 ` Andrew Morton 0 siblings, 1 reply; 12+ messages in thread From: kuznet @ 2001-05-15 9:02 UTC (permalink / raw) To: Andrew Morton; +Cc: jgarzik, davem, linux-kernel Hello! > It protects the as-yet-unchanged PCI and Cardbus drivers from a > fatal race. Fatal race remained. Andrew, you start again the story about white bull. 8) We have already discussed this. Device cannot stay in device list uninitialzied. Period. I am sorry, but no compromise is possible. With Jeff's approach all the references to init_etherdev and dev_probe_lock must be eliminated in 2.4. > and sys_ioctl() both do lock_kernel(). If xxx_probe() drops the BKL, Again, BKL has nothing to do with this (and ioctl does not hold it) It looks like you forgot all the discussion around your own patch. 8) If you want I can retransmit the mails which resulted in your patch? Alexey ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-15 9:02 ` kuznet @ 2001-05-15 11:00 ` Andrew Morton 2001-05-15 11:15 ` David S. Miller 0 siblings, 1 reply; 12+ messages in thread From: Andrew Morton @ 2001-05-15 11:00 UTC (permalink / raw) To: kuznet; +Cc: jgarzik, davem, linux-kernel kuznet@ms2.inr.ac.ru wrote: > > Hello! > > > It protects the as-yet-unchanged PCI and Cardbus drivers from a > > fatal race. > > Fatal race remained. Don't think so. We have exclusion against all netdevice ioctls across probe. Still. It doesn't matter. > Andrew, you start again the story about white bull. 8) > We have already discussed this. Device cannot stay in device list > uninitialzied. Period. > > I am sorry, but no compromise is possible. With Jeff's approach all > the references to init_etherdev and dev_probe_lock must be eliminated > in 2.4. Once init_etherdev() has gone, yes, dev_probe_lock() can go. > > and sys_ioctl() both do lock_kernel(). If xxx_probe() drops the BKL, > > Again, BKL has nothing to do with this (and ioctl does not hold it) asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg) { struct file * filp; unsigned int flag; int on, error = -EBADF; filp = fget(fd); if (!filp) goto out; error = 0; lock_kernel(); The CPU running ifconfig spins here. > It looks like you forgot all the discussion around your own patch. 8) > > If you want I can retransmit the mails which resulted in your patch? It doesn't matter... I think we agree that init_etherdev() must die, and dev_probe_lock() with it, and that Jeff's alloc_etherdev() is an appropriate way of doing it? Actually, yes. Please tell me what problem you think we still have in current kernels, which dev_probe_lock() does not prevent? - ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified 2001-05-15 11:00 ` Andrew Morton @ 2001-05-15 11:15 ` David S. Miller 0 siblings, 0 replies; 12+ messages in thread From: David S. Miller @ 2001-05-15 11:15 UTC (permalink / raw) To: Andrew Morton; +Cc: kuznet, jgarzik, linux-kernel Andrew Morton writes: > > Again, BKL has nothing to do with this (and ioctl does not hold it) > > asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg) > { > struct file * filp; > unsigned int flag; > int on, error = -EBADF; > > filp = fget(fd); > if (!filp) > goto out; > error = 0; > lock_kernel(); > > The CPU running ifconfig spins here. Alexey's brain is going through net/socket.c:sock_ioctl() :-) There we drop the kernel lock... Later, David S. Miller davem@redhat.com ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2001-05-15 11:15 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <15091.23655.488243.650394@pizda.ninka.net>
2001-05-13 18:19 ` NETDEV_CHANGE events when __LINK_STATE_NOCARRIER is modified kuznet
2001-05-14 1:43 ` Andrew Morton
2001-05-14 17:47 ` kuznet
2001-05-14 18:12 ` Jeff Garzik
2001-05-14 18:40 ` kuznet
2001-05-14 19:27 ` Jeff Garzik
2001-05-14 19:42 ` kuznet
2001-05-14 20:34 ` Jeff Garzik
2001-05-14 23:51 ` Andrew Morton
2001-05-15 9:02 ` kuznet
2001-05-15 11:00 ` Andrew Morton
2001-05-15 11:15 ` David S. Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox