netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Network device driver probe issues
@ 2004-07-15 13:52 Anton Blanchard
  2004-07-15 23:34 ` Jeff Garzik
  0 siblings, 1 reply; 7+ messages in thread
From: Anton Blanchard @ 2004-07-15 13:52 UTC (permalink / raw)
  To: netdev; +Cc: cramerj, john.ronciak, ganesh.venkatesan, jonmason, jkenisto


Hi,

With the advent of hotplug and udev we see network device driver opens
called as soon as possible, ie after register_netdev. In a lot of cases
drivers currently call register_netdev too early, before they are ready
to handle an open. I think the e1000 had this problem (which has since
been fixed), and the s2io appears to have this same issue.

Once the register_netdev call is delayed until late in probe (it really
should be the last thing the probe call does) we have an issue with
device printks. One of the reasons register_netdev is often early in the
probe routine of network drivers is that it provides a device name to
append to printks.

The e1000 gets around this by replicating register_netdev:

        rtnl_lock();
        /* we need to set the name early since the DPRINTK macro needs it set */
        if (dev_alloc_name(netdev, netdev->name) < 0)
                goto err_free_unlock;

...

        /* since we are holding the rtnl lock already, call the no-lock version */
        if((err = register_netdevice(netdev)))
                goto err_register;

        cards_found++;
        rtnl_unlock();

The problem I have with this method has to do with how failures appear
to the user. If you have two network cards and the first one fails
during probe you will see:

eth0: Intel(R) PRO/1000 Network Connection
eth0: The EEPROM Checksum Is Not Valid
eth0: Intel(R) PRO/1000 Network Connection

This makes debugging the issue difficult. Where is the card that failed?
We have machines with 100s of interfaces and there it becomes a real
pain.

We should instead use something stable to attach to printks during
probe. pci_name() is the obvious choice, perhaps using dev_printk().
The failure then becomes:

0000:01:01.0 Intel(R) PRO/1000 Network Connection
0000:01:01.0 The EEPROM Checksum Is Not Valid
0000:02:01.0 Intel(R) PRO/1000 Network Connection

Also it would be useful if somewhere during probe the network driver
associated the network name with the pci id, eg:

0000:02:01.0 Intel(R) PRO/1000 Network Connection (eth0)

I know we can get at it via other means but it would be nice to be able
to look through the dmesg and work out what names were given to the
cards.

Anton

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Network device driver probe issues
  2004-07-15 13:52 Anton Blanchard
@ 2004-07-15 23:34 ` Jeff Garzik
  2004-07-16  1:04   ` Jim Keniston
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Garzik @ 2004-07-15 23:34 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: netdev, cramerj, john.ronciak, ganesh.venkatesan, jonmason,
	jkenisto

Anton Blanchard wrote:
> The e1000 gets around this by replicating register_netdev:
> 
>         rtnl_lock();
>         /* we need to set the name early since the DPRINTK macro needs it set */
>         if (dev_alloc_name(netdev, netdev->name) < 0)
>                 goto err_free_unlock;
> 
> ...
> 
>         /* since we are holding the rtnl lock already, call the no-lock version */
>         if((err = register_netdevice(netdev)))
>                 goto err_register;
> 
>         cards_found++;
>         rtnl_unlock();
> 
> The problem I have with this method has to do with how failures appear
> to the user. If you have two network cards and the first one fails
> during probe you will see:

Agreed, this is a hack and strongly discouraged.

Hopefully Intel will hear this and send me a patch to fix... :)


> We should instead use something stable to attach to printks during
> probe. pci_name() is the obvious choice, perhaps using dev_printk().
> The failure then becomes:
> 
> 0000:01:01.0 Intel(R) PRO/1000 Network Connection
> 0000:01:01.0 The EEPROM Checksum Is Not Valid
> 0000:02:01.0 Intel(R) PRO/1000 Network Connection

pci_name() or a simple counter of devices found.  I prefer pci_name()

	Jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Network device driver probe issues
  2004-07-15 23:34 ` Jeff Garzik
@ 2004-07-16  1:04   ` Jim Keniston
  2004-07-16  4:49     ` David Dillow
  0 siblings, 1 reply; 7+ messages in thread
From: Jim Keniston @ 2004-07-16  1:04 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Anton Blanchard, netdev, cramerj, john.ronciak, ganesh.venkatesan,
	jonmason, Jim Keniston

On Thu, 2004-07-15 at 16:34, Jeff Garzik wrote:
> Anton Blanchard wrote:
> > The e1000 gets around this by replicating register_netdev:
> > 
> > [snipped]
> > 
> > The problem I have with this method has to do with how failures appear
> > to the user. If you have two network cards and the first one fails
> > during probe you will see:
> 
> Agreed, this is a hack and strongly discouraged.
> 
> Hopefully Intel will hear this and send me a patch to fix... :)
> 
> 
> > We should instead use something stable to attach to printks during
> > probe. pci_name() is the obvious choice, perhaps using dev_printk().
> > The failure then becomes:
> > 
> > 0000:01:01.0 Intel(R) PRO/1000 Network Connection
> > 0000:01:01.0 The EEPROM Checksum Is Not Valid
> > 0000:02:01.0 Intel(R) PRO/1000 Network Connection
> 
> pci_name() or a simple counter of devices found.  I prefer pci_name()
> 
> 	Jeff

Delay registration until the end of the probe, and make DPRINTK smarter.
DPRINTK should check
	adapter->netdev->reg_state == NETREG_REGISTERED
and use the interface name (eth0) if the device is registered, or
pci_name() if it's not.

Jim

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Network device driver probe issues
  2004-07-16  1:04   ` Jim Keniston
@ 2004-07-16  4:49     ` David Dillow
  2004-07-16 18:26       ` Jim Keniston
  0 siblings, 1 reply; 7+ messages in thread
From: David Dillow @ 2004-07-16  4:49 UTC (permalink / raw)
  To: Jim Keniston
  Cc: Jeff Garzik, Anton Blanchard, Netdev, cramerj, john.ronciak,
	ganesh.venkatesan, jonmason

On Thu, 2004-07-15 at 21:04, Jim Keniston wrote:
> On Thu, 2004-07-15 at 16:34, Jeff Garzik wrote:
[snip]
> > > We should instead use something stable to attach to printks during
> > > probe. pci_name() is the obvious choice, perhaps using dev_printk().
> > > The failure then becomes:
> > > 
> > > 0000:01:01.0 Intel(R) PRO/1000 Network Connection
> > > 0000:01:01.0 The EEPROM Checksum Is Not Valid
> > > 0000:02:01.0 Intel(R) PRO/1000 Network Connection
> > 
> > pci_name() or a simple counter of devices found.  I prefer pci_name()
> > 
> > 	Jeff
> 
> Delay registration until the end of the probe, and make DPRINTK smarter.
> DPRINTK should check
> 	adapter->netdev->reg_state == NETREG_REGISTERED
> and use the interface name (eth0) if the device is registered, or
> pci_name() if it's not.
> 
> Jim

In the typhoon driver, I have tp->name, and most everything that prints
info/errors about the card use it. During probing, it is initially set
to point to pci_name(), and once registered, it gets pointed to
netdev->name.

This seems fairly clean, though maybe a bit redundant.

Dave

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Network device driver probe issues
  2004-07-16  4:49     ` David Dillow
@ 2004-07-16 18:26       ` Jim Keniston
  0 siblings, 0 replies; 7+ messages in thread
From: Jim Keniston @ 2004-07-16 18:26 UTC (permalink / raw)
  To: David Dillow
  Cc: Jeff Garzik, Anton Blanchard, Netdev, cramerj, john.ronciak,
	ganesh.venkatesan, jonmason, Jim Keniston

On Thu, 2004-07-15 at 21:49, David Dillow wrote:
> On Thu, 2004-07-15 at 21:04, Jim Keniston wrote:
> > On Thu, 2004-07-15 at 16:34, Jeff Garzik wrote:
> [snip]
> > > > We should instead use something stable to attach to printks during
> > > > probe. pci_name() is the obvious choice, perhaps using dev_printk().
> > > > The failure then becomes:
> > > > 
> > > > 0000:01:01.0 Intel(R) PRO/1000 Network Connection
> > > > 0000:01:01.0 The EEPROM Checksum Is Not Valid
> > > > 0000:02:01.0 Intel(R) PRO/1000 Network Connection
> > > 
> > > pci_name() or a simple counter of devices found.  I prefer pci_name()
> > > 
> > > 	Jeff
> > 
> > Delay registration until the end of the probe, and make DPRINTK smarter.
> > DPRINTK should check
> > 	adapter->netdev->reg_state == NETREG_REGISTERED
> > and use the interface name (eth0) if the device is registered, or
> > pci_name() if it's not.
> > 
> > Jim
> 
> In the typhoon driver, I have tp->name, and most everything that prints
> info/errors about the card use it. During probing, it is initially set
> to point to pci_name(), and once registered, it gets pointed to
> netdev->name.
> 
> This seems fairly clean, though maybe a bit redundant.
> 
> Dave
> 

Works for me.  While we're talking about DPRINTK, e100.c could use a
similar adjustment.  e100_probe() correctly defers registration until
the probe has succeeded, but as a result, probe-time failure messages
refer to "eth%d", right?

Jim

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: Network device driver probe issues
@ 2004-07-16 19:15 Venkatesan, Ganesh
  2004-07-16 19:52 ` Jeff Garzik
  0 siblings, 1 reply; 7+ messages in thread
From: Venkatesan, Ganesh @ 2004-07-16 19:15 UTC (permalink / raw)
  To: Jim Keniston, David Dillow
  Cc: Jeff Garzik, Anton Blanchard, Netdev, cramerj, Ronciak, John,
	jonmason

We've fixed the e100 to have the same logic as e1000. I like the typhoon
driver idea.

Thanks,
ganesh 
 
-------------------------------------------------
Ganesh Venkatesan
Network/Storage Division, Hillsboro, OR

-----Original Message-----
From: Jim Keniston [mailto:jkenisto@us.ibm.com] 
Sent: Friday, July 16, 2004 11:27 AM
To: David Dillow
Cc: Jeff Garzik; Anton Blanchard; Netdev; cramerj; Ronciak, John;
Venkatesan, Ganesh; jonmason@us.ibm.com; Jim Keniston
Subject: Re: Network device driver probe issues

On Thu, 2004-07-15 at 21:49, David Dillow wrote:
> On Thu, 2004-07-15 at 21:04, Jim Keniston wrote:
> > On Thu, 2004-07-15 at 16:34, Jeff Garzik wrote:
> [snip]
> > > > We should instead use something stable to attach to printks
during
> > > > probe. pci_name() is the obvious choice, perhaps using
dev_printk().
> > > > The failure then becomes:
> > > > 
> > > > 0000:01:01.0 Intel(R) PRO/1000 Network Connection
> > > > 0000:01:01.0 The EEPROM Checksum Is Not Valid
> > > > 0000:02:01.0 Intel(R) PRO/1000 Network Connection
> > > 
> > > pci_name() or a simple counter of devices found.  I prefer
pci_name()
> > > 
> > > 	Jeff
> > 
> > Delay registration until the end of the probe, and make DPRINTK
smarter.
> > DPRINTK should check
> > 	adapter->netdev->reg_state == NETREG_REGISTERED
> > and use the interface name (eth0) if the device is registered, or
> > pci_name() if it's not.
> > 
> > Jim
> 
> In the typhoon driver, I have tp->name, and most everything that
prints
> info/errors about the card use it. During probing, it is initially set
> to point to pci_name(), and once registered, it gets pointed to
> netdev->name.
> 
> This seems fairly clean, though maybe a bit redundant.
> 
> Dave
> 

Works for me.  While we're talking about DPRINTK, e100.c could use a
similar adjustment.  e100_probe() correctly defers registration until
the probe has succeeded, but as a result, probe-time failure messages
refer to "eth%d", right?

Jim

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Network device driver probe issues
  2004-07-16 19:15 Network device driver probe issues Venkatesan, Ganesh
@ 2004-07-16 19:52 ` Jeff Garzik
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff Garzik @ 2004-07-16 19:52 UTC (permalink / raw)
  To: Venkatesan, Ganesh
  Cc: Jim Keniston, David Dillow, Anton Blanchard, Netdev, cramerj,
	Ronciak, John, jonmason

Venkatesan, Ganesh wrote:
> We've fixed the e100 to have the same logic as e1000. I like the typhoon
> driver idea.


Unfortunately both are wrong.

e1000 should not be using dev_alloc_name(), and patches to add that to 
e100 will not be accepted.

e100 should not be referencing netdev->name until after 
register_netdev() completes.

The typhoon solution is fine with me.  Other drivers don't bother and 
simply use pci_name() during probe.  It's up to you as maintainer.

	Jeff

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2004-07-16 19:52 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-07-16 19:15 Network device driver probe issues Venkatesan, Ganesh
2004-07-16 19:52 ` Jeff Garzik
  -- strict thread matches above, loose matches on Subject: below --
2004-07-15 13:52 Anton Blanchard
2004-07-15 23:34 ` Jeff Garzik
2004-07-16  1:04   ` Jim Keniston
2004-07-16  4:49     ` David Dillow
2004-07-16 18:26       ` Jim Keniston

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).