* mal_probe crash @ 2009-01-07 20:44 Sean MacLennan 2009-01-08 20:46 ` Josh Boyer 0 siblings, 1 reply; 18+ messages in thread From: Sean MacLennan @ 2009-01-07 20:44 UTC (permalink / raw) To: linuxppc-dev With Linus' latest git, mal_probe crashes. It calls netif_napi_add with the first parameter NULL. This was ok since the parameter, a net device, was only used if CONFIG_NETPOLL was set. Now it is always de-referenced. A quick check shows that ibm_newemac is the only driver that passed NULL as the first parameter to this call in 2.6.28. I don't really follow ibm_newemac changes, so the patch may be waiting to be applied. This is really just a heads up. Cheers, Sean ^ permalink raw reply [flat|nested] 18+ messages in thread
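A minimal user-space sketch of the failure Sean describes, assuming a heavily simplified model of the NAPI registration path. The types and the names napi_add_old() and napi_add_new() are hypothetical stand-ins, not the kernel's definitions, and a -1 return stands in for the NULL dereference that crashes mal_probe.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for kernel types; NOT the real definitions. */
struct list_node { struct list_node *prev, *next; };

struct net_device_model { struct list_node napi_list; };

struct napi_model {
	struct list_node dev_list;
	struct net_device_model *dev;
	int weight;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add_front(struct list_node *node, struct list_node *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/*
 * Old behaviour (assumed): dev is only touched when netpoll support is
 * built in, modelled here as a runtime flag instead of CONFIG_NETPOLL.
 */
static int napi_add_old(struct net_device_model *dev, struct napi_model *napi,
			int weight, int netpoll)
{
	napi->weight = weight;
	if (netpoll) {
		if (!dev)
			return -1;	/* the kernel would oops here */
		napi->dev = dev;
		list_add_front(&napi->dev_list, &dev->napi_list);
	}
	return 0;
}

/* New behaviour (assumed): dev is always used, so NULL always fails. */
static int napi_add_new(struct net_device_model *dev, struct napi_model *napi,
			int weight)
{
	napi->weight = weight;
	if (!dev)
		return -1;		/* models the reported mal_probe crash */
	napi->dev = dev;
	list_add_front(&napi->dev_list, &dev->napi_list);
	return 0;
}
```

With dev == NULL, napi_add_old() only fails when the netpoll flag is set, while napi_add_new() fails unconditionally, which mirrors the before and after behaviour reported above.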
* Re: mal_probe crash 2009-01-07 20:44 mal_probe crash Sean MacLennan @ 2009-01-08 20:46 ` Josh Boyer 2009-01-07 22:50 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 18+ messages in thread From: Josh Boyer @ 2009-01-08 20:46 UTC (permalink / raw) To: Sean MacLennan; +Cc: linuxppc-dev On Wed, Jan 07, 2009 at 03:44:34PM -0500, Sean MacLennan wrote: >With Linus' latest git, mal_probe crashes. It calls netif_napi_add with >the first parameter NULL. This was ok since the parameter, a net >device, was only used if CONFIG_NETPOLL was set. > >Now it is always de-referenced. A quick check shows that ibm_newemac is >the only driver that passed NULL as the first parameter to this call in >2.6.28. > >I don't really follow ibm_newemac changes, so the patch may be waiting >to be applied. This is really just a heads up. I haven't heard of that, so I doubt there's a patch pending. *Sigh* josh ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: mal_probe crash 2009-01-08 20:46 ` Josh Boyer @ 2009-01-07 22:50 ` Benjamin Herrenschmidt 2009-01-09 14:42 ` Geert Uytterhoeven 2009-01-09 14:49 ` Matthias Fuchs 0 siblings, 2 replies; 18+ messages in thread From: Benjamin Herrenschmidt @ 2009-01-07 22:50 UTC (permalink / raw) To: Josh Boyer; +Cc: linuxppc-dev, Sean MacLennan On Thu, 2009-01-08 at 15:46 -0500, Josh Boyer wrote: > On Wed, Jan 07, 2009 at 03:44:34PM -0500, Sean MacLennan wrote: > >With Linus' latest git, mal_probe crashes. It calls netif_napi_add with > >the first parameter NULL. This was ok since the parameter, a net > >device, was only used if CONFIG_NETPOLL was set. > > > >Now it is always de-referenced. A quick check shows that ibm_newemac is > >the only driver that passed NULL as the first parameter to this call in > >2.6.28. > > > >I don't really follow ibm_newemac changes, so the patch may be waiting > >to be applied. This is really just a heads up. > > I haven't heard of that, so I doubt there's a patch pending. *Sigh* There isn't that I know of. The EMAC code creates a single NAPI instance for all EMACs and I think used to completely disconnect things. The old code created a fake netdev just for NAPI, that became unnecessary with the new NAPI stuff.... but it looks like the way we do things now displeases some changes in the network stack. I'll have to dig. Cheers, Ben. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: mal_probe crash 2009-01-07 22:50 ` Benjamin Herrenschmidt @ 2009-01-09 14:42 ` Geert Uytterhoeven 2009-01-09 22:34 ` Herbert Xu 2009-01-09 14:49 ` Matthias Fuchs 1 sibling, 1 reply; 18+ messages in thread From: Geert Uytterhoeven @ 2009-01-09 14:42 UTC (permalink / raw) To: Benjamin Herrenschmidt, Herbert Xu, David S. Miller Cc: linuxppc-dev, netdev, Sean MacLennan On Thu, 8 Jan 2009, Benjamin Herrenschmidt wrote: > On Thu, 2009-01-08 at 15:46 -0500, Josh Boyer wrote: > > On Wed, Jan 07, 2009 at 03:44:34PM -0500, Sean MacLennan wrote: > > >With Linus' latest git, mal_probe crashes. It calls netif_napi_add with > > >the first parameter NULL. This was ok since the parameter, a net > > >device, was only used if CONFIG_NETPOLL was set. > > > > > >Now it is always de-referenced. A quick check shows that ibm_newemac is > > >the only driver that passed NULL as the first parameter to this call in > > >2.6.28. > > > > > >I don't really follow ibm_newemac changes, so the patch may be waiting > > >to be applied. This is really just a heads up. > > > > I haven't heard of that, so I doubt there's a patch pending. *Sigh* > > There isn't that I know of. The EMAC code creates a single NAPI instance > for all EMACs and I think used to completely disconnect things. The old > code created a fake netdev just for NAPI, that became unnecessary with > the new NAPI stuff.... but it looks like the way we do things now > displeases some changes in the network stack. I'll have to dig. Verified on my Sequoia (which now lost its network :-( The regression/problem (requiring a valid net_device in netif_napi_add(), even if CONFIG_NETPOLL=n) seems to be introduced by commit d565b0a1a9b6ee7dff46e1f68b26b526ac11ae50 ("net: Add Generic Receive Offload infrastructure"). However, it was broken before, in case CONFIG_NETPOLL=y. 
So mal_probe() (triggered by mal_init()) needs to know about the net_device before it has been allocated by emac_probe() (triggered by of_register_platform_driver(&emac_driver)):

| static int __init emac_init(void)
| {
| 	int rc;
|
| 	printk(KERN_INFO DRV_DESC ", version " DRV_VERSION "\n");
|
| 	/* Init debug stuff */
| 	emac_init_debug();
|
| 	/* Build EMAC boot list */
| 	emac_make_bootlist();
|
| 	/* Init submodules */
| 	rc = mal_init();
| 	if (rc)
| 		goto err;
| 	rc = zmii_init();
| 	if (rc)
| 		goto err_mal;
| 	rc = rgmii_init();
| 	if (rc)
| 		goto err_zmii;
| 	rc = tah_init();
| 	if (rc)
| 		goto err_rgmii;
| 	rc = of_register_platform_driver(&emac_driver);
| 	if (rc)
| 		goto err_tah;
|
| 	return 0;
|
|  err_tah:
| 	tah_exit();
|  err_rgmii:
| 	rgmii_exit();
|  err_zmii:
| 	zmii_exit();
|  err_mal:
| 	mal_exit();
|  err:
| 	return rc;
| }

Can the order of mal_init() and of_register_platform_driver(&emac_driver) be reversed? If so, some link still has to be made between the mal and the emac devices. With kind regards, Geert Uytterhoeven Software Architect Sony Techsoft Centre Europe The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium Phone: +32 (0)2 700 8453 Fax: +32 (0)2 700 8622 E-mail: Geert.Uytterhoeven@sonycom.com Internet: http://www.sony-europe.com/ A division of Sony Europe (Belgium) N.V. VAT BE 0413.825.160 · RPR Brussels Fortis · BIC GEBABEBB · IBAN BE41293037680010 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: mal_probe crash 2009-01-09 14:42 ` Geert Uytterhoeven @ 2009-01-09 22:34 ` Herbert Xu 2009-01-09 23:13 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 18+ messages in thread From: Herbert Xu @ 2009-01-09 22:34 UTC (permalink / raw) To: Geert Uytterhoeven; +Cc: linuxppc-dev, netdev, Sean MacLennan, David S. Miller On Fri, Jan 09, 2009 at 03:42:25PM +0100, Geert Uytterhoeven wrote: > On Thu, 8 Jan 2009, Benjamin Herrenschmidt wrote: > > > There isn't that I know of. The EMAC code creates a single NAPI instance > > for all EMACs and I think used to completely disconnect things. The old > > code created a fake netdev just for NAPI, that became unnecessary with > > the new NAPI stuff.... but it looks like the way we do things now > > displeases some changes in the network stack. I'll have to dig. > > Verified on my Sequoia (which now lost its network :-( > > The regression/problem (requiring a valid net_device in netif_napi_add(), even > if CONFIG_NETPOLL=n) seems to be introduced by commit > d565b0a1a9b6ee7dff46e1f68b26b526ac11ae50 ("net: Add Generic Receive Offload > infrastructure"). Yes EMAC just needs to go back to the old fake dev setup. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: mal_probe crash 2009-01-09 22:34 ` Herbert Xu @ 2009-01-09 23:13 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 18+ messages in thread From: Benjamin Herrenschmidt @ 2009-01-09 23:13 UTC (permalink / raw) To: Herbert Xu Cc: netdev, linuxppc-dev, Sean MacLennan, Geert Uytterhoeven, David S. Miller On Sat, 2009-01-10 at 09:34 +1100, Herbert Xu wrote: > On Fri, Jan 09, 2009 at 03:42:25PM +0100, Geert Uytterhoeven wrote: > > On Thu, 8 Jan 2009, Benjamin Herrenschmidt wrote: > > > > > There isn't that I know of. The EMAC code creates a single NAPI instance > > > for all EMACs and I think used to completely disconnect things. The old > > > code created a fake netdev just for NAPI, that became unnecessary with > > > the new NAPI stuff.... but it looks like the way we do things now > > > displeases some changes in the network stack. I'll have to dig. > > > > Verified on my Sequoia (which now lost its network :-( > > > > The regression/problem (requiring a valid net_device in netif_napi_add(), even > > if CONFIG_NETPOLL=n) seems to be introduced by commit > > d565b0a1a9b6ee7dff46e1f68b26b526ac11ae50 ("net: Add Generic Receive Offload > > infrastructure"). > > Yes EMAC just needs to go back to the old fake dev setup. One thing I wanted to do back then... which triggered the discussion with Stephen just before he broke NAPI up from netdev, was to add a core function that creates such dummy netdev so that drivers don't have to break every time some new internal field changes or such... I'll give that a spin asap, though it might have to wait for monday. Cheers, Ben. ^ permalink raw reply [flat|nested] 18+ messages in thread
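A rough user-space sketch of the dummy netdev idea Ben outlines, assuming the MAL instance embeds a placeholder device that is never registered and exists only to host the shared NAPI context. All type and function names here (mal_model, init_dummy_dev(), napi_add(), mal_probe_model()) are hypothetical illustrations, not the kernel API or the eventual fix.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-ins for kernel types; NOT the real definitions. */
struct list_node { struct list_node *prev, *next; };

struct net_device_model {
	struct list_node napi_list;
	int is_dummy;			/* marks a device that is never registered */
};

struct napi_model {
	struct list_node dev_list;
	struct net_device_model *dev;
};

struct mal_model {
	struct net_device_model dummy_dev;	/* exists only to host NAPI */
	struct napi_model napi;			/* one context shared by all EMACs */
};

/*
 * Hypothetical core helper: centralising the initialisation means a
 * driver does not have to track new internal fields by hand.
 */
static void init_dummy_dev(struct net_device_model *dev)
{
	memset(dev, 0, sizeof(*dev));
	dev->napi_list.prev = dev->napi_list.next = &dev->napi_list;
	dev->is_dummy = 1;
}

/* Registration that always needs a valid device, as after the GRO change. */
static int napi_add(struct net_device_model *dev, struct napi_model *napi)
{
	if (!dev)
		return -1;
	napi->dev = dev;
	napi->dev_list.next = dev->napi_list.next;
	napi->dev_list.prev = &dev->napi_list;
	dev->napi_list.next->prev = &napi->dev_list;
	dev->napi_list.next = &napi->dev_list;
	return 0;
}

/* Probe-time setup: no real net_device is needed yet. */
static int mal_probe_model(struct mal_model *mal)
{
	init_dummy_dev(&mal->dummy_dev);
	return napi_add(&mal->dummy_dev, &mal->napi);
}
```

The point of the design is that the MAL can register its NAPI context at probe time without waiting for any EMAC net_device to be allocated.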
* Re: mal_probe crash 2009-01-07 22:50 ` Benjamin Herrenschmidt 2009-01-09 14:42 ` Geert Uytterhoeven @ 2009-01-09 14:49 ` Matthias Fuchs 2009-01-09 15:02 ` Matthias Fuchs 2009-01-09 21:09 ` Benjamin Herrenschmidt 1 sibling, 2 replies; 18+ messages in thread From: Matthias Fuchs @ 2009-01-09 14:49 UTC (permalink / raw) To: linuxppc-dev; +Cc: Sean MacLennan On Wednesday 07 January 2009 23:50, Benjamin Herrenschmidt wrote: > On Thu, 2009-01-08 at 15:46 -0500, Josh Boyer wrote: > > On Wed, Jan 07, 2009 at 03:44:34PM -0500, Sean MacLennan wrote: > > >With Linus' latest git, mal_probe crashes. It calls netif_napi_add with > > >the first parameter NULL. This was ok since the parameter, a net > > >device, was only used if CONFIG_NETPOLL was set. > > > > > >Now it is always de-referenced. A quick check shows that ibm_newemac is > > >the only driver that passed NULL as the first parameter to this call in > > >2.6.28. > > > > > >I don't really follow ibm_newemac changes, so the patch may be waiting > > >to be applied. This is really just a heads up. > > > > I haven't heard of that, so I doubt there's a patch pending. *Sigh* > > There isn't that I know of. The EMAC code creates a single NAPI instance > for all EMACs and I think used to completely disconnect things. The old > code created a fake netdev just for NAPI, that became unnecessary with > the new NAPI stuff.... but it looks like the way we do things now > displeases some changes in the network stack. I'll have to dig. > Could it be that simple? Probably not. It works at first glance on a 405EP and GPr board. But it might cause problems when having more than one EMAC up at the same time. Matthias [PATCH] powerpc: Fix ibm_newemac driver Since commit d565b0a1a9b6ee7d netif_napi_add must be called with a proper net_device pointer (!= NULL).
Signed-off-by: Matthias Fuchs <matthias.fuchs@esd-electronics.com>
---
 drivers/net/ibm_newemac/core.c |    3 +++
 drivers/net/ibm_newemac/mal.c  |    5 +----
 drivers/net/ibm_newemac/mal.h  |    1 +
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ibm_newemac/core.c b/drivers/net/ibm_newemac/core.c
index 87a7066..9bd4d6d 100644
--- a/drivers/net/ibm_newemac/core.c
+++ b/drivers/net/ibm_newemac/core.c
@@ -2767,6 +2767,9 @@ static int __devinit emac_probe(struct of_device *ofdev,
 	if (dev->mdio_dev != NULL)
 		dev->mdio_instance = dev_get_drvdata(&dev->mdio_dev->dev);
 
+	netif_napi_add(ndev, &dev->mal->napi, mal_poll,
+		       CONFIG_IBM_NEW_EMAC_POLL_WEIGHT);
+
 	/* Register with MAL */
 	dev->commac.ops = &emac_commac_ops;
 	dev->commac.dev = dev;
diff --git a/drivers/net/ibm_newemac/mal.c b/drivers/net/ibm_newemac/mal.c
index ecf9798..d5306ae 100644
--- a/drivers/net/ibm_newemac/mal.c
+++ b/drivers/net/ibm_newemac/mal.c
@@ -391,7 +391,7 @@ void mal_poll_enable(struct mal_instance *mal, struct mal_commac *commac)
 	napi_schedule(&mal->napi);
 }
 
-static int mal_poll(struct napi_struct *napi, int budget)
+int mal_poll(struct napi_struct *napi, int budget)
 {
 	struct mal_instance *mal = container_of(napi, struct mal_instance, napi);
 	struct list_head *l;
@@ -613,9 +613,6 @@ static int __devinit mal_probe(struct of_device *ofdev,
 	INIT_LIST_HEAD(&mal->list);
 	spin_lock_init(&mal->lock);
 
-	netif_napi_add(NULL, &mal->napi, mal_poll,
-		       CONFIG_IBM_NEW_EMAC_POLL_WEIGHT);
-
 	/* Load power-on reset defaults */
 	mal_reset(mal);
 
diff --git a/drivers/net/ibm_newemac/mal.h b/drivers/net/ibm_newemac/mal.h
index 2f0a873..51597bd 100644
--- a/drivers/net/ibm_newemac/mal.h
+++ b/drivers/net/ibm_newemac/mal.h
@@ -282,6 +282,7 @@ void mal_disable_rx_channel(struct mal_instance *mal, int channel);
 
 void mal_poll_disable(struct mal_instance *mal, struct mal_commac *commac);
 void mal_poll_enable(struct mal_instance *mal, struct mal_commac *commac);
+int mal_poll(struct napi_struct *napi, int budget);
 
 /* Add/remove EMAC to/from MAL polling list */
 void mal_poll_add(struct mal_instance *mal, struct mal_commac *commac);
-- 
1.5.6.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: mal_probe crash 2009-01-09 14:49 ` Matthias Fuchs @ 2009-01-09 15:02 ` Matthias Fuchs 2009-01-09 15:24 ` Geert Uytterhoeven 2009-01-09 21:09 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 18+ messages in thread From: Matthias Fuchs @ 2009-01-09 15:02 UTC (permalink / raw) To: linuxppc-dev; +Cc: Sean MacLennan Forget my last posting! It's just a dirty work around when having a single EMAC. It does not work with two EMACs like on sequoia. Matthias ^ permalink raw reply [flat|nested] 18+ messages in thread
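Matthias does not say why the workaround fails with two EMACs. One plausible reading, offered as an assumption rather than a diagnosis: the patch calls netif_napi_add() from emac_probe() on the shared dev->mal->napi, so two EMACs behind one MAL would register the same NAPI context twice. The user-space sketch below uses hypothetical stand-in types to show how linking one node into two device lists leaves the first list inconsistent.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for kernel types; NOT the real definitions. */
struct list_node { struct list_node *prev, *next; };

struct net_device_model { struct list_node napi_list; };

struct napi_model {
	struct list_node dev_list;
	struct net_device_model *dev;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

/* A list head is consistent when both neighbours point back at it. */
static int list_is_consistent(const struct list_node *head)
{
	return head->next->prev == head && head->prev->next == head;
}

/* Models a registration that links napi into dev's list unconditionally. */
static void napi_add(struct net_device_model *dev, struct napi_model *napi)
{
	napi->dev = dev;
	napi->dev_list.next = dev->napi_list.next;
	napi->dev_list.prev = &dev->napi_list;
	dev->napi_list.next->prev = &napi->dev_list;
	dev->napi_list.next = &napi->dev_list;
}
```

Registering the shared context first against one device and then against a second leaves the context owned by the second device while the first device's list still points at it, which is the kind of state a per-MAL (rather than per-EMAC) registration avoids.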
* Re: mal_probe crash 2009-01-09 15:02 ` Matthias Fuchs @ 2009-01-09 15:24 ` Geert Uytterhoeven 2009-01-09 21:30 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 18+ messages in thread From: Geert Uytterhoeven @ 2009-01-09 15:24 UTC (permalink / raw) To: Matthias Fuchs; +Cc: linuxppc-dev, Sean MacLennan On Fri, 9 Jan 2009, Matthias Fuchs wrote: > Forget my last posting! It's just a dirty work around when having a single EMAC. > It does not work with two EMACs like on sequoia. Indeed. It doesn't on my sequoia :-( I also tried reviving connectivity by adding an Intel PRO/1000 GT network card, but I got a machine check exception. Don't know if this is a problem with the PPC44x PCI code or with the e1000 driver. U-Boot 1.2.0-gc0c292b2 (Jun 5 2007 - 07:16:12) CPU: AMCC PowerPC 440EPx Rev. A at 666.666 MHz (PLB=166, OPB=83, EBC=55 MHz) Security/Kasumi support I2C boot EEPROM enabled Bootstrap Option H - Boot ROM Location I2C (Addr 0x52) Internal PCI arbiter enabled, PCI async ext clock used 32 kB I-Cache 32 kB D-Cache Board: Sequoia - AMCC PPC440EPx Evaluation Board, Rev. F, PCI=33 MHz I2C: ready DTT: 1 is 223 C DRAM: 256 MB FLASH: 64 MB NAND: 32 MiB PCI: Bus Dev VenId DevId Class Int 00 0c 8086 107c 0200 00 In: serial Out: serial Err: serial USB: Host(int phy) Device(ext phy) Net: ppc_4xx_eth0, ppc_4xx_eth1 Type "run flash_nfs" to mount root filesystem over NFS Hit any key to stop autoboot: 0 Waiting for PHY auto negotiation to complete.. done ENET Speed is 100 Mbps - FULL duplex connection (EMAC0) BOOTP broadcast 1 DHCP client bound to address 192.168.106.188 Using ppc_4xx_eth0 device TFTP from server 192.168.106.200; our IP address is 192.168.106.188 Filename '/sequoia/cuImage.sequoia'. 
Load address: 0x100000 Loading: ################################################################# ################################################################# ################################################################# ################################################################# ############################################# done Bytes transferred = 1556529 (17c031 hex) ## Booting image at 00100000 ... Image Name: Linux-2.6.28-07939-g2150edc-dirt Image Type: PowerPC Linux Kernel Image (gzip compressed) Data Size: 1556465 Bytes = 1.5 MB Load Address: 00400000 Entry Point: 00400458 Verifying Checksum ... OK Uncompressing Kernel Image ... OK CPU clock-frequency <- 0x27bc86a4 (667MHz) CPU timebase-frequency <- 0x27bc86a4 (667MHz) /plb: clock-frequency <- 9ef21a9 (167MHz) /plb/opb: clock-frequency <- 4f790d4 (83MHz) /plb/opb/ebc: clock-frequency <- 34fb5e3 (56MHz) /plb/opb/serial@ef600300: clock-frequency <- a8c000 (11MHz) /plb/opb/serial@ef600400: clock-frequency <- a8c000 (11MHz) /plb/opb/serial@ef600500: clock-frequency <- 42ecac (4MHz) /plb/opb/serial@ef600600: clock-frequency <- 42ecac (4MHz) Memory <- <0x0 0x0 0xffff000> (255MB) ethernet0: local-mac-address <- 00:10:ec:00:f1:df ethernet1: local-mac-address <- 00:10:ec:80:f1:df zImage starting: loaded at 0x00400000 (sp: 0x0ff2ba18) Allocating 0x333834 bytes for kernel ... gunzipping (0x00000000 <- 0x0040e000:0x00735820)...done 0x31417c bytes Linux/PowerPC load: ip=on root=/dev/nfs Finalizing device tree... 
flat tree at 0x742300 Using PowerPC 44x Platform machine description Linux version 2.6.28-07939-g2150edc-dirty (geert@vixen) (gcc version 4.3.2 (GCC) ) #4 Fri Jan 9 16:05:53 CET 2009 console [udbg0] enabled setup_arch: bootmem arch: exit Zone PFN ranges: DMA 0x00000000 -> 0x0000ffff Normal 0x0000ffff -> 0x0000ffff Movable zone start PFN for each node early_node_map[1] active PFN ranges 0: 0x00000000 -> 0x0000ffff MMU: Allocated 1088 bytes of context maps for 255 contexts Built 1 zonelists in Zone order, mobility grouping on. Total pages: 65023 Kernel command line: ip=on root=/dev/nfs UIC0 (32 IRQ sources) at DCR 0xc0 UIC1 (32 IRQ sources) at DCR 0xd0 UIC2 (32 IRQ sources) at DCR 0xe0 PID hash table entries: 1024 (order: 10, 4096 bytes) clocksource: timebase mult[600000] shift[22] registered Dentry cache hash table entries: 32768 (order: 5, 131072 bytes) Inode-cache hash table entries: 16384 (order: 4, 65536 bytes) Memory: 256256k/262140k available (2996k kernel code, 5572k reserved, 128k data, 122k bss, 156k init) SLUB: Genslabs=10, HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1 Calibrating delay loop... 1331.20 BogoMIPS (lpj=2662400) Mount-cache hash table entries: 512 net_namespace: 716 bytes NET: Registered protocol family 16 PCI host bridge /plb/pci@1ec000000 (primary) ranges: MEM 0x0000000180000000..0x00000001bfffffff -> 0x0000000080000000 IO 0x00000001e8000000..0x00000001e800ffff -> 0x0000000000000000 IO 0x00000001e8800000..0x00000001ebffffff -> 0x0000000000000000 \--> Skipped (too many) ! 
4xx PCI DMA offset set to 0x00000000 /plb/pci@1ec000000: Resource out of range PCI: Probing PCI hardware PCI: Hiding 4xx host bridge resources 0000:00:00.0 pci 0000:00:0c.0: PME# supported from D0 D3hot D3cold pci 0000:00:0c.0: PME# disabled bio: create slab <bio-0> at 0 NET: Registered protocol family 2 IP route cache hash table entries: 2048 (order: 1, 8192 bytes) TCP established hash table entries: 8192 (order: 4, 65536 bytes) TCP bind hash table entries: 8192 (order: 3, 32768 bytes) TCP: Hash tables configured (established 8192 bind 8192) TCP reno registered NET: Registered protocol family 1 JFFS2 version 2.2. (NAND) © 2001-2006 Red Hat, Inc. msgmni has been set to 501 alg: No test for stdrng (krng) io scheduler noop registered io scheduler anticipatory registered (default) io scheduler deadline registered io scheduler cfq registered Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled serial8250.0: ttyS0 at MMIO 0x1ef600300 (irq = 17) is a 16550A console handover: boot [udbg0] -> real [ttyS0] serial8250.0: ttyS1 at MMIO 0x1ef600400 (irq = 18) is a 16550A serial8250.0: ttyS2 at MMIO 0x1ef600500 (irq = 19) is a 16550A serial8250.0: ttyS3 at MMIO 0x1ef600600 (irq = 20) is a 16550A 1ef600300.serial: ttyS0 at MMIO 0x1ef600300 (irq = 17) is a 16550A 1ef600400.serial: ttyS1 at MMIO 0x1ef600400 (irq = 18) is a 16550A 1ef600500.serial: ttyS2 at MMIO 0x1ef600500 (irq = 19) is a 16550A 1ef600600.serial: ttyS3 at MMIO 0x1ef600600 (irq = 20) is a 16550A brd: module loaded Intel(R) PRO/1000 Network Driver - version 7.3.20-k3-NAPI Copyright (c) 1999-2006 Intel Corporation. e1000 0000:00:0c.0: enabling device (0000 -> 0003) Machine check in kernel mode. 
Data Read PLB Error Oops: Machine check, sig: 7 [#1] PowerPC 44x Platform Modules linked in: NIP: c0187cb8 LR: c0236300 CTR: c0187bb0 REGS: cfff7f10 TRAP: 0214 Not tainted (2.6.28-07939-g2150edc-dirty) MSR: 00029000 <EE,ME,CE> CR: 28d6cb24 XER: 20000000 TASK = cf818400[1] 'swapper' THREAD: cf828000 GPR00: 00000000 cf829db0 cf818400 cf8114fc 00000004 00000000 00000002 cf829d88 GPR08: 00000000 d10c0008 00000000 0000000b 00001000 00108000 0ffb2400 00000001 GPR16: 007fff13 00400458 00800000 c032d69c c024bfc4 c0330000 cf8114fc 00000001 GPR24: 00000000 00000001 00000047 cf811000 cf811320 cf811000 00000001 cf83d400 NIP [c0187cb8] e1000_set_media_type+0x64/0xe4 LR [c0236300] e1000_probe+0x334/0xd5c Call Trace: [cf829db0] [c02362b4] e1000_probe+0x2e8/0xd5c (unreliable) [cf829e10] [c015c018] local_pci_probe+0x24/0x34 [cf829e20] [c015c240] pci_device_probe+0x84/0xa8 [cf829e50] [c017b948] driver_probe_device+0xb4/0x1e8 [cf829e70] [c017bb20] __driver_attach+0xa4/0xa8 [cf829e90] [c017b0fc] bus_for_each_dev+0x70/0xac [cf829ec0] [c017b760] driver_attach+0x24/0x34 [cf829ed0] [c017aa04] bus_add_driver+0x1d0/0x244 [cf829ef0] [c017bd40] driver_register+0x70/0x160 [cf829f10] [c015c4e8] __pci_register_driver+0x4c/0xac [cf829f30] [c02dfb30] e1000_init_module+0x58/0xa8 [cf829f50] [c00013d8] do_one_initcall+0x34/0x1b0 [cf829fc0] [c02c6178] kernel_init+0x94/0x100 [cf829ff0] [c000da64] kernel_thread+0x50/0x6c Instruction dump: 409c0080 2f8b0010 419e006c 2b8b0010 419d005c 380bffff 2b800001 409d0074 81230000 39290008 7c0004ac 7c004c2c <0c000000> 4c00012c 70000020 40820060 ---[ end trace 85643a8ae0783f0b ]--- Kernel panic - not syncing: Attempted to kill init! Rebooting in 180 seconds.. 
With kind regards, Geert Uytterhoeven ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: mal_probe crash 2009-01-09 15:24 ` Geert Uytterhoeven @ 2009-01-09 21:30 ` Benjamin Herrenschmidt 2009-01-09 22:01 ` Roland Dreier 0 siblings, 1 reply; 18+ messages in thread From: Benjamin Herrenschmidt @ 2009-01-09 21:30 UTC (permalink / raw) To: Geert Uytterhoeven; +Cc: Sean MacLennan, linuxppc-dev On Fri, 2009-01-09 at 16:24 +0100, Geert Uytterhoeven wrote: > On Fri, 9 Jan 2009, Matthias Fuchs wrote: > > Forget my last posting! It's just a dirty work around when having a single EMAC. > > It does not work with two EMACs like on sequoia. > > Indeed. It doesn't on my sequoia :-( > > I also tried reviving connectivity by adding an Intel PRO/1000 GT network card, > but I got a machine check exception. Don't know if this is a problem with the > PPC44x PCI code or with the e1000 driver. Can you double check that the e1000 isn't copying the PCI resources into a unsigned long before ioremap'ing the result, thus cropping the top bits ? It had a bug like that for which I sent a fix a while ago but maybe that crept back in... Cheers, Ben. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: mal_probe crash 2009-01-09 21:30 ` Benjamin Herrenschmidt @ 2009-01-09 22:01 ` Roland Dreier 2009-01-12 13:37 ` Geert Uytterhoeven 0 siblings, 1 reply; 18+ messages in thread From: Roland Dreier @ 2009-01-09 22:01 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Geert Uytterhoeven, linuxppc-dev, Sean MacLennan > Can you double check that the e1000 isn't copying the PCI resources into > a unsigned long before ioremap'ing the result, thus cropping the top > bits ? as far as I can see, e1000 is using pci_ioremap_bar(), which should do the right thing as long as resource_size_t is the right type (which it looks like it is on PowerPC 44x). - R. ^ permalink raw reply [flat|nested] 18+ messages in thread
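A small sketch of the truncation Ben is asking about, assuming a 32-bit unsigned long (as on PowerPC 44x) and a 64-bit resource type. The helper names store_in_ulong32() and survives_32bit() are hypothetical; 0x180000000 is the start of the PCI memory window shown in Geert's boot log.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: models copying a 64-bit resource start into a
 * 32-bit unsigned long, which silently drops the bits above bit 31.
 */
static uint32_t store_in_ulong32(uint64_t resource_start)
{
	return (uint32_t)resource_start;
}

/* Returns nonzero when the address fits in 32 bits and so is unchanged. */
static int survives_32bit(uint64_t resource_start)
{
	return (uint64_t)store_in_ulong32(resource_start) == resource_start;
}
```

Under these assumptions store_in_ulong32(0x180000000ULL) yields 0x80000000, so an ioremap of the truncated value would map the wrong physical address. Keeping the value in a 64-bit type all the way down, which is what Roland describes for pci_ioremap_bar(), avoids the cropping.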
* Re: mal_probe crash 2009-01-09 22:01 ` Roland Dreier @ 2009-01-12 13:37 ` Geert Uytterhoeven 2009-01-12 21:36 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 18+ messages in thread From: Geert Uytterhoeven @ 2009-01-12 13:37 UTC (permalink / raw) To: Roland Dreier; +Cc: Linux/PPC Development, Sean MacLennan On Fri, 9 Jan 2009, Roland Dreier wrote: > > Can you double check that the e1000 isn't copying the PCI resources into > > a unsigned long before ioremap'ing the result, thus cropping the top > > bits ? > > as far as I can see, e1000 is using pci_ioremap_bar(), which should do > the right thing as long as resource_size_t is the right type (which it > looks like it is on PowerPC 44x). Indeed, the full 36-bit address is passed to __ioremap() via pci_ioremap_bar(), as evidenced from the additional debug output below (see [1]). As I don't have any other 3.3V PCI Ethernet cards, I plugged in a 3.3V PCI USB 2.0 card in the second PCI slot, and got a similar crash (see [2]). Are the PCI slots on the Sequoia known broken under recent Linux kernels? I've never used them before... [1] E1000 probe with more debug info: | Intel(R) PRO/1000 Network Driver - version 7.3.20-k3-NAPI | Copyright (c) 1999-2006 Intel Corporation. | e1000 0000:00:0a.0: enabling device (0000 -> 0003) | resource 0: [0x180000000-0x18001ffff] | resource 1: [0x180020000-0x18003ffff] | resource 2: [0x1000-0x103f] | resource 3: [0x0-0x0] | resource 4: [0x0-0x0] | resource 5: [0x0-0x0] | __ioremap: addr 0x180000000, size 131072, flags 0x500 | v = 0xd10c0000 | ((unsigned long)addr & ~PAGE_MASK) = 0x0 | return d10c0000 | hw->hw_addr = d10c0000 | e1000_set_media_type:502: hw = cf8114fc | e1000_set_media_type:503: hw->hw_addr = d10c0000 | e1000_set_media_type:509: | e1000_set_media_type:534: er32(STATUS) will do a readl() on d10c0008 | Machine check in kernel mode. 
| Data Read PLB Error | Oops: Machine check, sig: 7 [#1] | PowerPC 44x Platform | Modules linked in: | NIP: c0188b48 LR: c0188b38 CTR: c01732f8 | REGS: cfff7f10 TRAP: 0214 Not tainted (2.6.28-07939-g2150edc-dirty) | MSR: 00029000 <EE,ME,CE> CR: 28f60b22 XER: 20000000 | TASK = cf818400[1] 'swapper' THREAD: cf828000 | GPR00: c0188b38 cf829d90 cf818400 00000048 00001a88 ffffffff c0173d7c 00003fff | GPR08: 00000000 d10c0008 00003fff 00001a88 28f60b22 01000030 0ffb2400 00000001 | GPR16: 007fff13 00400458 00800000 c032f69c c024cfc4 c0330000 cf8114fc 00000001 | GPR24: cf811000 cf811320 00000001 00000047 c02a27dc 00000000 c024d524 cf8114fc | NIP [c0188b48] e1000_set_media_type+0xf4/0x1ec | LR [c0188b38] e1000_set_media_type+0xe4/0x1ec | Call Trace: | [cf829d90] [c0188b38] e1000_set_media_type+0xe4/0x1ec (unreliable) | [cf829db0] [c0236c84] e1000_probe+0x3c0/0xdf4 | [cf829e10] [c015c0bc] local_pci_probe+0x24/0x34 | [cf829e20] [c015c2e4] pci_device_probe+0x84/0xa8 | [cf829e50] [c017b9ec] driver_probe_device+0xb4/0x1e8 | [cf829e70] [c017bbc4] __driver_attach+0xa4/0xa8 | [cf829e90] [c017b1a0] bus_for_each_dev+0x70/0xac | [cf829ec0] [c017b804] driver_attach+0x24/0x34 | [cf829ed0] [c017aaa8] bus_add_driver+0x1d0/0x244 | [cf829ef0] [c017bde4] driver_register+0x70/0x160 | [cf829f10] [c015c58c] __pci_register_driver+0x4c/0xac | [cf829f30] [c02e1b30] e1000_init_module+0x58/0xa8 | [cf829f50] [c00013d8] do_one_initcall+0x34/0x1b0 | [cf829fc0] [c02c8178] kernel_init+0x94/0x100 | [cf829ff0] [c000da64] kernel_thread+0x50/0x6c | Instruction dump: | 409d00e8 80df0000 3c60c02a 7fc4f378 38a00216 386327e8 38c60008 480aaa19 | 813f0000 39290008 7c0004ac 7fa04c2c <0c1d0000> 4c00012c 3c60c02a 7fa6eb78 | ---[ end trace 6c682238ca36f67d ]--- [2] EHCI probe: | ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver | ehci_hcd 0000:00:0c.2: enabling device (0000 -> 0002) | ehci_hcd 0000:00:0c.2: EHCI Host Controller | ehci_hcd 0000:00:0c.2: new USB bus registered, assigned bus number 1 | Machine 
check in kernel mode. | Data Read PLB Error | Oops: Machine check, sig: 7 [#1] | PowerPC 44x Platform | Modules linked in: | NIP: c01baebc LR: c01a8264 CTR: c01badcc | REGS: cfff7f10 TRAP: 0214 Not tainted (2.6.28-07939-g2150edc-dirty) | MSR: 00029000 <EE,ME,CE> CR: 28d60324 XER: 20000000 | TASK = cf818400[1] 'swapper' THREAD: cf828000 | GPR00: 00000000 cf829d80 cf818400 cfa0f800 00000001 00000000 c026dd0c fffffffd | GPR08: 00000000 d1098000 00000000 d1098000 48d60342 01208030 0ffb2400 00000001 | GPR16: 007fff13 00400458 00800000 ffffffff 007fff00 0ffadd68 00000000 00000001 | GPR24: 00000000 c032de18 00000010 000000a0 cf8c6400 cf84c000 cfa0f8b8 cfa0f800 | NIP [c01baebc] ehci_pci_setup+0xf0/0x600 | LR [c01a8264] usb_add_hcd+0x1a8/0x5e8 | Call Trace: | [cf829d80] [00000001] 0x1 (unreliable) | [cf829db0] [c01a8264] usb_add_hcd+0x1a8/0x5e8 | [cf829de0] [c01b395c] usb_hcd_pci_probe+0x158/0x2e4 | [cf829e10] [c015c0bc] local_pci_probe+0x24/0x34 | [cf829e20] [c015c2e4] pci_device_probe+0x84/0xa8 | [cf829e50] [c017b9ec] driver_probe_device+0xb4/0x1e8 | [cf829e70] [c017bbc4] __driver_attach+0xa4/0xa8 | [cf829e90] [c017b1a0] bus_for_each_dev+0x70/0xac | [cf829ec0] [c017b804] driver_attach+0x24/0x34 | [cf829ed0] [c017aaa8] bus_add_driver+0x1d0/0x244 | [cf829ef0] [c017bde4] driver_register+0x70/0x160 | [cf829f10] [c015c58c] __pci_register_driver+0x4c/0xac | [cf829f30] [c03061a0] ehci_hcd_init+0xb0/0xf0 | [cf829f50] [c00013d8] do_one_initcall+0x34/0x1b0 | [cf829fc0] [c02ec178] kernel_init+0x94/0x100 | [cf829ff0] [c000da64] kernel_thread+0x50/0x6c | Instruction dump: | 2f8001b5 409eff68 801e00e0 64002000 901e00e0 74082000 813f008c 7d2b4b78 | 913f00b8 40a2ff60 7c0004ac 7c004c2c <0c000000> 4c00012c 5400063e 7c0b0214 | ---[ end trace 8e7aeede5368187f ]--- With kind regards, Geert Uytterhoeven Software Architect Sony Techsoft Centre Europe The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium Phone: +32 (0)2 700 8453 Fax: +32 (0)2 700 8622 E-mail: 
Geert.Uytterhoeven@sonycom.com Internet: http://www.sony-europe.com/ A division of Sony Europe (Belgium) N.V. VAT BE 0413.825.160 · RPR Brussels Fortis · BIC GEBABEBB · IBAN BE41293037680010 ^ permalink raw reply [flat|nested] 18+ messages in thread
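The numbers in debug output [1] can be cross-checked with a little arithmetic. The sketch below is a user-space illustration only. The helper names and the SKETCH_ macros are hypothetical, and a 4 KiB page size is an assumption. It recomputes the BAR 0 length, the in-page offset that __ioremap adds back, and the address the STATUS read ends up touching (hw_addr plus 8, as the log line implies).

```c
#include <stdint.h>

/* Assumed 4 KiB pages; the mask follows the kernel convention
 * where PAGE_MASK clears the in-page offset bits. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Length of an inclusive [start, end] resource range. */
static uint64_t sketch_resource_len(uint64_t start, uint64_t end)
{
    return end - start + 1;
}

/* Offset of a physical address within its page (addr & ~PAGE_MASK). */
static uint64_t sketch_page_offset(uint64_t addr)
{
    return addr & ~SKETCH_PAGE_MASK;
}

/* Virtual address of a register at reg_off from the mapped base. */
static uint32_t sketch_reg_address(uint32_t mapped_base, uint32_t reg_off)
{
    return mapped_base + reg_off;
}
```

The results agree with the log: the range [0x180000000-0x18001ffff] is 131072 bytes long, the page offset is 0, and 0xd10c0000 + 8 gives 0xd10c0008. So the mapping arithmetic looks consistent, and the machine check happens on the access itself.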
* Re: mal_probe crash 2009-01-12 13:37 ` Geert Uytterhoeven @ 2009-01-12 21:36 ` Benjamin Herrenschmidt 2009-01-12 22:48 ` Josh Boyer 2009-01-12 22:51 ` Re[2]: " Yuri Tikhonov 0 siblings, 2 replies; 18+ messages in thread From: Benjamin Herrenschmidt @ 2009-01-12 21:36 UTC (permalink / raw) To: Geert Uytterhoeven; +Cc: Linux/PPC Development, Roland Dreier, Sean MacLennan On Mon, 2009-01-12 at 14:37 +0100, Geert Uytterhoeven wrote: > On Fri, 9 Jan 2009, Roland Dreier wrote: > > > Can you double check that the e1000 isn't copying the PCI resources into > > > a unsigned long before ioremap'ing the result, thus cropping the top > > > bits ? > > > > as far as I can see, e1000 is using pci_ioremap_bar(), which should do > > the right thing as long as resource_size_t is the right type (which it > > looks like it is on PowerPC 44x). > > Indeed, the full 36-bit address is passed to __ioremap() via pci_ioremap_bar(), > as evidenced from the additional debug output below (see [1]). > > As I don't have any other 3.3V PCI Ethernet cards, I plugged in a 3.3V PCI USB > 2.0 card in the second PCI slot, and got a similar crash (see [2]). > > Are the PCI slots on the Sequoia known broken under recent Linux kernels? I've > never used them before... Hrm, something is indeed wrong, hard to say what tho. My canyonlands works fine (460EPx) and I can try a Taishan one of these days (440GX iirc). What is in sequoia ? I think it's a GX no ? Could be something in the device-tree ? Cheers, Ben. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: mal_probe crash 2009-01-12 21:36 ` Benjamin Herrenschmidt @ 2009-01-12 22:48 ` Josh Boyer 2009-01-12 22:51 ` Re[2]: " Yuri Tikhonov 1 sibling, 0 replies; 18+ messages in thread From: Josh Boyer @ 2009-01-12 22:48 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Geert Uytterhoeven, Linux/PPC Development, Roland Dreier, Sean MacLennan On Tue, Jan 13, 2009 at 08:36:32AM +1100, Benjamin Herrenschmidt wrote: >On Mon, 2009-01-12 at 14:37 +0100, Geert Uytterhoeven wrote: >> On Fri, 9 Jan 2009, Roland Dreier wrote: >> > > Can you double check that the e1000 isn't copying the PCI resources into >> > > a unsigned long before ioremap'ing the result, thus cropping the top >> > > bits ? >> > >> > as far as I can see, e1000 is using pci_ioremap_bar(), which should do >> > the right thing as long as resource_size_t is the right type (which it >> > looks like it is on PowerPC 44x). >> >> Indeed, the full 36-bit address is passed to __ioremap() via pci_ioremap_bar(), >> as evidenced from the additional debug output below (see [1]). >> >> As I don't have any other 3.3V PCI Ethernet cards, I plugged in a 3.3V PCI USB >> 2.0 card in the second PCI slot, and got a similar crash (see [2]). >> >> Are the PCI slots on the Sequoia known broken under recent Linux kernels? I've >> never used them before... > >Hrm, something is indeed wrong, hard to say what tho. My canyonlands >works fine (460EPx) and I can try a Taishan one of these days (440GX >iirc). What is in sequoia ? I think it's a GX no ? 440EPx. josh ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re[2]: mal_probe crash 2009-01-12 21:36 ` Benjamin Herrenschmidt 2009-01-12 22:48 ` Josh Boyer @ 2009-01-12 22:51 ` Yuri Tikhonov 2009-01-13 2:52 ` Benjamin Herrenschmidt 2009-01-13 16:19 ` Geert Uytterhoeven 1 sibling, 2 replies; 18+ messages in thread From: Yuri Tikhonov @ 2009-01-12 22:51 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Geert Uytterhoeven, Linux/PPC Development, Roland Dreier, Sean MacLennan On Tuesday, January 13, 2009 you wrote: > On Mon, 2009-01-12 at 14:37 +0100, Geert Uytterhoeven wrote: >> On Fri, 9 Jan 2009, Roland Dreier wrote: >> > > Can you double check that the e1000 isn't copying the PCI resources into >> > > a unsigned long before ioremap'ing the result, thus cropping the top >> > > bits ? >> > >> > as far as I can see, e1000 is using pci_ioremap_bar(), which should do >> > the right thing as long as resource_size_t is the right type (which it >> > looks like it is on PowerPC 44x). >> >> Indeed, the full 36-bit address is passed to __ioremap() via pci_ioremap_bar(), >> as evidenced from the additional debug output below (see [1]). >> >> As I don't have any other 3.3V PCI Ethernet cards, I plugged in a 3.3V PCI USB >> 2.0 card in the second PCI slot, and got a similar crash (see [2]). >> >> Are the PCI slots on the Sequoia known broken under recent Linux kernels? I've >> never used them before... > Hrm, something is indeed wrong, hard to say what tho. My canyonlands > works fine (460EPx) and I can try a Taishan one of these days (440GX > iirc). What is in sequoia ? I think it's a GX no ? Sequoia is equipped with 440EPx.
I observe the 'mal_probe' crash on the Katmai board too (based on 440SPe):

PPC 4xx OCP EMAC driver, version 3.54
Unable to handle kernel paging request for data at address 0x0000003c
Faulting instruction address: 0xc01becb8
Oops: Kernel access of bad area, sig: 11 [#1]
PowerPC 44x Platform
Modules linked in:
NIP: c01becb8 LR: c0232200 CTR: c0014d68
REGS: cfe47d30 TRAP: 0300 Not tainted (2.6.29-rc1-00014-g58a813f-dirty)
MSR: 00029000 <EE,ME,CE> CR: 42144084 XER: 20000000
DEAR: 0000003c, ESR: 00000000
TASK = cfe00000[1] 'swapper' THREAD: cfe40000
GPR00: c08ce244 cfe47de0 cfe00000 00000000 c08ce22c c0158864 00000020 00029000
GPR08: 000000d0 c08ce254 000000d0 00000744 82144082 89003000 7ffe4300 00000000
GPR16: 7ffd901c 7ffde640 00000000 00000000 00000000 00000000 00000000 0000000d
GPR24: c0751e80 c0415e0c dfec1e00 c0751e64 c04161c8 00000000 dff46a00 c08ce200
NIP [c01becb8] netif_napi_add+0x1c/0x58
LR [c0232200] mal_probe+0x1cc/0x668
Call Trace:
[cfe47de0] [c0232194] mal_probe+0x160/0x668 (unreliable)
[cfe47e10] [c01abe38] of_platform_device_probe+0x5c/0x36c
[cfe47e30] [c0152c24] driver_probe_device+0xb8/0x1e8
[cfe47e50] [c0152df8] __driver_attach+0xa4/0xa8
[cfe47e70] [c0151fa8] bus_for_each_dev+0x5c/0x98
[cfe47ea0] [c0152a2c] driver_attach+0x24/0x34
[cfe47eb0] [c0152788] bus_add_driver+0x1d8/0x258
[cfe47ee0] [c0153008] driver_register+0x5c/0x158
[cfe47f00] [c01abd00] of_register_driver+0x54/0x70
[cfe47f10] [c031a160] mal_init+0x20/0x30
[cfe47f20] [c031a2c8] emac_init+0x158/0x1b4
[cfe47f60] [c0001170] do_one_initcall+0x34/0x1a0
[cfe47fd0] [c0300168] kernel_init+0x88/0xf4
[cfe47ff0] [c000d8c4] kernel_thread+0x4c/0x68
Instruction dump:
7fe3fb78 7c0803a6 bb210014 38210030 4e800020 38000000 90040024 90040020
90a40010 90c4000c 90840000 38040018 <8123003c> 3963003c 91240018 90840004
---[ end trace 22428c4f73106ff5 ]---

This is with Linus's tree, head ae0..e10bb. The work-around from Matthias Fuchs (Jan 09, 2009) helps though.
Regards, Yuri -- Yuri Tikhonov, Senior Software Engineer Emcraft Systems, www.emcraft.com ^ permalink raw reply [flat|nested] 18+ messages in thread
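The faulting address in the Katmai oops can be tied back to the NULL first argument by decoding the marked word in the instruction dump, <8123003c>. The sketch below is a user-space illustration with hypothetical names. It decodes a PowerPC D-form instruction (primary opcode in the top 6 bits, then RT, RA, and a signed 16-bit displacement) and computes the effective address for a given base register value.

```c
#include <stdint.h>

/* Decoded fields of a PowerPC D-form instruction (hypothetical struct). */
struct sketch_dform {
    unsigned opcode;  /* primary opcode, top 6 bits; 32 is lwz */
    unsigned rt;      /* target register */
    unsigned ra;      /* base register */
    int32_t disp;     /* sign-extended 16-bit displacement */
};

static struct sketch_dform sketch_decode_dform(uint32_t insn)
{
    struct sketch_dform d;
    d.opcode = insn >> 26;
    d.rt = (insn >> 21) & 0x1f;
    d.ra = (insn >> 16) & 0x1f;
    d.disp = (int32_t)(insn & 0xffff);
    if (d.disp & 0x8000)          /* sign-extend the 16-bit field */
        d.disp -= 0x10000;
    return d;
}

/* Effective address of a D-form load: (RA) + disp, where RA == 0
 * means a literal zero base rather than the contents of r0. */
static uint32_t sketch_effective_address(struct sketch_dform d, uint32_t ra_value)
{
    uint32_t base = (d.ra == 0) ? 0 : ra_value;
    return base + (uint32_t)d.disp;
}
```

Decoding 0x8123003c gives lwz r9,0x3c(r3). In the register dump r3 (the first function argument in the PowerPC calling convention) is 00000000, so the load targets 0x0000003c, which matches the DEAR value and the "data at address 0x0000003c" line. That is consistent with netif_napi_add reading a field at offset 0x3c of the NULL net device pointer passed in by mal_probe.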
* Re: Re[2]: mal_probe crash 2009-01-12 22:51 ` Re[2]: " Yuri Tikhonov @ 2009-01-13 2:52 ` Benjamin Herrenschmidt 2009-01-13 16:19 ` Geert Uytterhoeven 1 sibling, 0 replies; 18+ messages in thread From: Benjamin Herrenschmidt @ 2009-01-13 2:52 UTC (permalink / raw) To: Yuri Tikhonov Cc: Geert Uytterhoeven, Linux/PPC Development, Roland Dreier, Sean MacLennan On Tue, 2009-01-13 at 01:51 +0300, Yuri Tikhonov wrote: > On Tuesday, January 13, 2009 you wrote: > > > On Mon, 2009-01-12 at 14:37 +0100, Geert Uytterhoeven wrote: > >> On Fri, 9 Jan 2009, Roland Dreier wrote: > >> > > Can you double check that the e1000 isn't copying the PCI resources into > >> > > a unsigned long before ioremap'ing the result, thus cropping the top > >> > > bits ? > >> > > >> > as far as I can see, e1000 is using pci_ioremap_bar(), which should do > >> > the right thing as long as resource_size_t is the right type (which it > >> > looks like it is on PowerPC 44x). > >> > >> Indeed, the full 36-bit address is passed to __ioremap() via pci_ioremap_bar(), > >> as evidenced from the additional debug output below (see [1]). > >> > >> As I don't have any other 3.3V PCI Ethernet cards, I plugged in a 3.3V PCI USB > >> 2.0 card in the second PCI slot, and got a similar crash (see [2]). > >> > >> Are the PCI slots on the Sequoia known broken under recent Linux kernels? I've > >> never used them before... > > > Hrm, something is indeed wrong, hard to say what tho. My canyonlands > > works fine (460EPx) and I can try a Taishan one of these days (440GX > > iirc). What is in sequoia ? I think it's a GX no ? > > Sequoia is equipped with 440EPx. > > I observe the 'mal_probe' crash on the Katmai board too (based on > 440SPe): Yes, EMAC is currently busted. We'll fix it asap. Cheers, Ben. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re[2]: mal_probe crash 2009-01-12 22:51 ` Re[2]: " Yuri Tikhonov 2009-01-13 2:52 ` Benjamin Herrenschmidt @ 2009-01-13 16:19 ` Geert Uytterhoeven 1 sibling, 0 replies; 18+ messages in thread From: Geert Uytterhoeven @ 2009-01-13 16:19 UTC (permalink / raw) To: Yuri Tikhonov; +Cc: Roland Dreier, Sean MacLennan, Linux/PPC Development On Tue, 13 Jan 2009, Yuri Tikhonov wrote: > On Tuesday, January 13, 2009 you wrote: > > On Mon, 2009-01-12 at 14:37 +0100, Geert Uytterhoeven wrote: > >> On Fri, 9 Jan 2009, Roland Dreier wrote: > >> > > Can you double check that the e1000 isn't copying the PCI resources into > >> > > a unsigned long before ioremap'ing the result, thus cropping the top > >> > > bits ? > >> > > >> > as far as I can see, e1000 is using pci_ioremap_bar(), which should do > >> > the right thing as long as resource_size_t is the right type (which it > >> > looks like it is on PowerPC 44x). > >> > >> Indeed, the full 36-bit address is passed to __ioremap() via pci_ioremap_bar(), > >> as evidenced from the additional debug output below (see [1]). > >> > >> As I don't have any other 3.3V PCI Ethernet cards, I plugged in a 3.3V PCI USB > >> 2.0 card in the second PCI slot, and got a similar crash (see [2]). > >> > >> Are the PCI slots on the Sequoia known broken under recent Linux kernels? I've > >> never used them before... > > > Hrm, something is indeed wrong, hard to say what tho. My canyonlands > > works fine (460EPx) and I can try a Taishan one of these days (440GX > > iirc). What is in sequoia ? I think it's a GX no ? > > Sequoia is equipped with 440EPx. > > I observe the 'mal_probe' crash on the Katmai board too (based on > 440SPe): Do PCI cards in the Katmai's PCI-X slot work? 
With kind regards, Geert Uytterhoeven Software Architect Sony Techsoft Centre Europe The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium Phone: +32 (0)2 700 8453 Fax: +32 (0)2 700 8622 E-mail: Geert.Uytterhoeven@sonycom.com Internet: http://www.sony-europe.com/ A division of Sony Europe (Belgium) N.V. VAT BE 0413.825.160 · RPR Brussels Fortis · BIC GEBABEBB · IBAN BE41293037680010 ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: mal_probe crash 2009-01-09 14:49 ` Matthias Fuchs 2009-01-09 15:02 ` Matthias Fuchs @ 2009-01-09 21:09 ` Benjamin Herrenschmidt 1 sibling, 0 replies; 18+ messages in thread From: Benjamin Herrenschmidt @ 2009-01-09 21:09 UTC (permalink / raw) To: Matthias Fuchs; +Cc: linuxppc-dev, Sean MacLennan > Could it be that simple? Probably not. It works at first glance on > a 405EP and GPr board. But it might cause problems when having more than > one EMAC up at the same time. I talked with the network folks and that should be ok. We only need to be a bit careful in case for some reason the EMAC we linked to NAPI gets removed/destroyed... We only do all at once for now but heh.. Ben. > Matthias > > [PATCH] powerpc: Fix ibm_newemac driver > > Since commit d565b0a1a9b6ee7d netif_napi_add must be called > with a proper net_device pointer != NULL. > > Signed-off-by: Matthias Fuchs <matthias.fuchs@esd-electronics.com> > --- > drivers/net/ibm_newemac/core.c | 3 +++ > drivers/net/ibm_newemac/mal.c | 5 +---- > drivers/net/ibm_newemac/mal.h | 1 + > 3 files changed, 5 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/ibm_newemac/core.c b/drivers/net/ibm_newemac/core.c > index 87a7066..9bd4d6d 100644 > --- a/drivers/net/ibm_newemac/core.c > +++ b/drivers/net/ibm_newemac/core.c > @@ -2767,6 +2767,9 @@ static int __devinit emac_probe(struct of_device *ofdev, > if (dev->mdio_dev != NULL) > dev->mdio_instance = dev_get_drvdata(&dev->mdio_dev->dev); > > + netif_napi_add(ndev, &dev->mal->napi, mal_poll, > + CONFIG_IBM_NEW_EMAC_POLL_WEIGHT); > + > /* Register with MAL */ > dev->commac.ops = &emac_commac_ops; > dev->commac.dev = dev; > diff --git a/drivers/net/ibm_newemac/mal.c b/drivers/net/ibm_newemac/mal.c > index ecf9798..d5306ae 100644 > --- a/drivers/net/ibm_newemac/mal.c > +++ b/drivers/net/ibm_newemac/mal.c > @@ -391,7 +391,7 @@ void mal_poll_enable(struct mal_instance *mal, struct mal_commac *commac) > napi_schedule(&mal->napi); > } > > -static int mal_poll(struct
napi_struct *napi, int budget) > +int mal_poll(struct napi_struct *napi, int budget) > { > struct mal_instance *mal = container_of(napi, struct mal_instance, napi); > struct list_head *l; > @@ -613,9 +613,6 @@ static int __devinit mal_probe(struct of_device *ofdev, > INIT_LIST_HEAD(&mal->list); > spin_lock_init(&mal->lock); > > - netif_napi_add(NULL, &mal->napi, mal_poll, > - CONFIG_IBM_NEW_EMAC_POLL_WEIGHT); > - > /* Load power-on reset defaults */ > mal_reset(mal); > > diff --git a/drivers/net/ibm_newemac/mal.h b/drivers/net/ibm_newemac/mal.h > index 2f0a873..51597bd 100644 > --- a/drivers/net/ibm_newemac/mal.h > +++ b/drivers/net/ibm_newemac/mal.h > @@ -282,6 +282,7 @@ void mal_disable_rx_channel(struct mal_instance *mal, int channel); > > void mal_poll_disable(struct mal_instance *mal, struct mal_commac *commac); > void mal_poll_enable(struct mal_instance *mal, struct mal_commac *commac); > +int mal_poll(struct napi_struct *napi, int budget); > > /* Add/remove EMAC to/from MAL polling list */ > void mal_poll_add(struct mal_instance *mal, struct mal_commac *commac); ^ permalink raw reply [flat|nested] 18+ messages in thread
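The patch exports mal_poll, which recovers its mal_instance from the embedded napi member with container_of. Because the NAPI context lives inside the MAL instance, the poll callback does not need the net device to find its state, whichever EMAC the context was registered against. A minimal user-space sketch of that idiom follows. The mock_ structs and the sketch_container_of macro are hypothetical stand-ins, not the kernel definitions.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(): step back from
 * a member pointer to the start of the enclosing structure. */
#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical mock of a NAPI context. */
struct mock_napi {
    int weight;
};

/* Hypothetical mock of a MAL instance embedding one NAPI context. */
struct mock_mal {
    int index;
    struct mock_napi napi;
};

/* Mirrors the first line of mal_poll(): given only the napi pointer,
 * find the MAL instance that contains it. */
static struct mock_mal *mock_mal_from_napi(struct mock_napi *napi)
{
    return sketch_container_of(napi, struct mock_mal, napi);
}
```

Given a struct mock_mal m, calling mock_mal_from_napi(&m.napi) returns &m, which is the same pointer arithmetic the quoted mal_poll relies on.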