netdev.vger.kernel.org archive mirror
* Re: mal_probe crash
       [not found]   ` <1231368610.2142.27.camel@pasglop>
  2009-01-09 14:42     ` mal_probe crash Geert Uytterhoeven
@ 2009-01-09 14:42     ` Geert Uytterhoeven
       [not found]     ` <Pine.LNX.4.64.0901091533000.13230@vixen.sonytel.be>
  2 siblings, 0 replies; 4+ messages in thread
From: Geert Uytterhoeven @ 2009-01-09 14:42 UTC
  To: Benjamin Herrenschmidt, Herbert Xu, David S. Miller
  Cc: Josh Boyer, linuxppc-dev, Sean MacLennan, netdev

On Thu, 8 Jan 2009, Benjamin Herrenschmidt wrote:
> On Thu, 2009-01-08 at 15:46 -0500, Josh Boyer wrote:
> > On Wed, Jan 07, 2009 at 03:44:34PM -0500, Sean MacLennan wrote:
> > >With Linus' latest git, mal_probe crashes. It calls netif_napi_add with
> > >the first parameter NULL. This was ok since the parameter, a net
> > >device, was only used if CONFIG_NETPOLL was set.
> > >
> > >Now it is always de-referenced. A quick check shows that ibm_newemac is
> > >the only driver that passed NULL as the first parameter to this call in
> > >2.6.28.
> > >
> > >I don't really follow ibm_newemac changes, so the patch may be waiting
> > >to be applied. This is really just a heads up.
> > 
> > I haven't heard of that, so I doubt there's a patch pending.  *Sigh*
> 
> There isn't that I know of. The EMAC code creates a single NAPI instance
> for all EMACs and I think used to completely disconnect things. The old
> code created a fake netdev just for NAPI, that became unnecessary with
> the new NAPI stuff.... but it looks like the way we do things now
> displeases some changes in the network stack. I'll have to dig.

Verified on my Sequoia (which now lost its network :-(

The regression/problem (requiring a valid net_device in netif_napi_add(), even
if CONFIG_NETPOLL=n) seems to be introduced by commit
d565b0a1a9b6ee7dff46e1f68b26b526ac11ae50 ("net: Add Generic Receive Offload
infrastructure").

However, it was broken before, in case CONFIG_NETPOLL=y.

So mal_probe() (triggered by mal_init()) needs to know about the net_device
before it has been allocated by emac_probe() (triggered by
of_register_platform_driver(&emac_driver)):

| static int __init emac_init(void)
| {
| 	int rc;
| 
| 	printk(KERN_INFO DRV_DESC ", version " DRV_VERSION "\n");
| 
| 	/* Init debug stuff */
| 	emac_init_debug();
| 
| 	/* Build EMAC boot list */
| 	emac_make_bootlist();
| 
| 	/* Init submodules */
| 	rc = mal_init();
| 	if (rc)
| 		goto err;
| 	rc = zmii_init();
| 	if (rc)
| 		goto err_mal;
| 	rc = rgmii_init();
| 	if (rc)
| 		goto err_zmii;
| 	rc = tah_init();
| 	if (rc)
| 		goto err_rgmii;
| 	rc = of_register_platform_driver(&emac_driver);
| 	if (rc)
| 		goto err_tah;
| 
| 	return 0;
| 
|  err_tah:
| 	tah_exit();
|  err_rgmii:
| 	rgmii_exit();
|  err_zmii:
| 	zmii_exit();
|  err_mal:
| 	mal_exit();
|  err:
| 	return rc;
| }

Can the order of mal_init() and of_register_platform_driver(&emac_driver) be
reversed? If so, some link still has to be made between the mal and the
emac devices.
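
To make the failure mode concrete, here is a minimal userspace sketch of the
behaviour described above. It is not the kernel code: model_net_device,
model_napi, model_napi_add() and both boolean flags are hypothetical stand-ins,
and the function returns -1 at the point where the real netif_napi_add() would
dereference a NULL net_device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct model_net_device {
	int napi_count;			/* number of NAPI contexts attached */
};

struct model_napi {
	struct model_net_device *dev;
	int weight;
};

/*
 * Model of registering a NAPI context.
 *   always_uses_dev: models the situation after the GRO commit, where the
 *                    device is used unconditionally.
 *   netpoll:         models CONFIG_NETPOLL=y.
 * Returns 0 on success, -1 where the real code would dereference NULL.
 */
static int model_napi_add(struct model_net_device *dev,
			  struct model_napi *napi, int weight,
			  bool always_uses_dev, bool netpoll)
{
	assert(napi != NULL);

	napi->weight = weight;
	napi->dev = NULL;

	if (always_uses_dev || netpoll) {
		if (dev == NULL)
			return -1;	/* the kernel would crash here */
		napi->dev = dev;
		dev->napi_count++;
	}
	return 0;
}
```

With dev == NULL the call only succeeds when both flags are false, which
corresponds to CONFIG_NETPOLL=n before the GRO commit.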

With kind regards,

Geert Uytterhoeven
Software Architect

Sony Techsoft Centre Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium

Phone:    +32 (0)2 700 8453
Fax:      +32 (0)2 700 8622
E-mail:   Geert.Uytterhoeven@sonycom.com
Internet: http://www.sony-europe.com/

A division of Sony Europe (Belgium) N.V.
VAT BE 0413.825.160 · RPR Brussels
Fortis · BIC GEBABEBB · IBAN BE41293037680010


* Re: mal_probe crash
       [not found]     ` <Pine.LNX.4.64.0901091533000.13230@vixen.sonytel.be>
@ 2009-01-09 22:34       ` Herbert Xu
  2009-01-09 23:13         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 4+ messages in thread
From: Herbert Xu @ 2009-01-09 22:34 UTC
  To: Geert Uytterhoeven
  Cc: Benjamin Herrenschmidt, David S. Miller, Josh Boyer, linuxppc-dev,
	Sean MacLennan, netdev

On Fri, Jan 09, 2009 at 03:42:25PM +0100, Geert Uytterhoeven wrote:
> On Thu, 8 Jan 2009, Benjamin Herrenschmidt wrote:
>
> > There isn't that I know of. The EMAC code creates a single NAPI instance
> > for all EMACs and I think used to completely disconnect things. The old
> > code created a fake netdev just for NAPI, that became unnecessary with
> > the new NAPI stuff.... but it looks like the way we do things now
> > displeases some changes in the network stack. I'll have to dig.
> 
> Verified on my Sequoia (which now lost its network :-(
> 
> The regression/problem (requiring a valid net_device in netif_napi_add(), even
> if CONFIG_NETPOLL=n) seems to be introduced by commit
> d565b0a1a9b6ee7dff46e1f68b26b526ac11ae50 ("net: Add Generic Receive Offload
> infrastructure").

Yes EMAC just needs to go back to the old fake dev setup.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: mal_probe crash
  2009-01-09 22:34       ` Herbert Xu
@ 2009-01-09 23:13         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 4+ messages in thread
From: Benjamin Herrenschmidt @ 2009-01-09 23:13 UTC
  To: Herbert Xu
  Cc: Geert Uytterhoeven, David S. Miller, Josh Boyer, linuxppc-dev,
	Sean MacLennan, netdev

On Sat, 2009-01-10 at 09:34 +1100, Herbert Xu wrote:
> On Fri, Jan 09, 2009 at 03:42:25PM +0100, Geert Uytterhoeven wrote:
> > On Thu, 8 Jan 2009, Benjamin Herrenschmidt wrote:
> >
> > > There isn't that I know of. The EMAC code creates a single NAPI instance
> > > for all EMACs and I think used to completely disconnect things. The old
> > > code created a fake netdev just for NAPI, that became unnecessary with
> > > the new NAPI stuff.... but it looks like the way we do things now
> > > displeases some changes in the network stack. I'll have to dig.
> > 
> > Verified on my Sequoia (which now lost its network :-(
> > 
> > The regression/problem (requiring a valid net_device in netif_napi_add(), even
> > if CONFIG_NETPOLL=n) seems to be introduced by commit
> > d565b0a1a9b6ee7dff46e1f68b26b526ac11ae50 ("net: Add Generic Receive Offload
> > infrastructure").
> 
> Yes EMAC just needs to go back to the old fake dev setup.

One thing I wanted to do back then... which triggered the discussion
with Stephen just before he broke NAPI up from netdev, was to add a core
function that creates such dummy netdev so that drivers don't have to
break every time some new internal field changes or such...

I'll give that a spin ASAP, though it might have to wait for Monday.
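
A rough userspace sketch of that idea, assuming a core helper that puts a
caller-owned device structure into a minimal state that is valid for NAPI
only. All names here (sketch_init_dummy_netdev(), sketch_napi_add(),
sketch_mal_instance and so on) and the weight of 64 are made up for
illustration; none of this is existing kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_net_device {
	bool dummy;		/* never registered, only anchors NAPI */
	int napi_count;		/* number of NAPI contexts attached */
};

struct sketch_napi {
	struct sketch_net_device *dev;
	int weight;
};

/*
 * Hypothetical core helper: initialise a caller-owned device so that it can
 * be handed to the NAPI registration call. Keeping this in one place means
 * drivers do not have to track new internal fields themselves.
 */
static void sketch_init_dummy_netdev(struct sketch_net_device *dev)
{
	assert(dev != NULL);
	memset(dev, 0, sizeof(*dev));
	dev->dummy = true;
}

/* Model of a NAPI registration that always uses the device. */
static void sketch_napi_add(struct sketch_net_device *dev,
			    struct sketch_napi *napi, int weight)
{
	assert(dev != NULL && napi != NULL);
	napi->dev = dev;
	napi->weight = weight;
	dev->napi_count++;
}

/* One shared NAPI context per MAL, anchored on an embedded dummy device. */
struct sketch_mal_instance {
	struct sketch_net_device dummy_dev;
	struct sketch_napi napi;
};

static void sketch_mal_probe(struct sketch_mal_instance *mal)
{
	sketch_init_dummy_netdev(&mal->dummy_dev);
	sketch_napi_add(&mal->dummy_dev, &mal->napi, 64);
}
```

Embedding the dummy device in the MAL instance means the probe no longer
depends on any EMAC net_device having been allocated first.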

Cheers,
Ben.



Thread overview: 4+ messages
     [not found] <20090107154434.0c9437ef@lappy.seanm.ca>
     [not found] ` <20090108204634.GB2337@yoda.jdub.homelinux.org>
     [not found]   ` <1231368610.2142.27.camel@pasglop>
2009-01-09 14:42     ` mal_probe crash Geert Uytterhoeven
2009-01-09 14:42     ` Geert Uytterhoeven
     [not found]     ` <Pine.LNX.4.64.0901091533000.13230@vixen.sonytel.be>
2009-01-09 22:34       ` Herbert Xu
2009-01-09 23:13         ` Benjamin Herrenschmidt
