* Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
@ 2017-06-14 19:26 Guenter Roeck
  2017-06-14 21:31 ` Frank Rowand
  2017-06-15 6:48 ` Michael Ellerman
  0 siblings, 2 replies; 9+ messages in thread
From: Guenter Roeck @ 2017-06-14 19:26 UTC (permalink / raw)
  To: Frank Rowand; +Cc: linux-kernel, Rob Herring

Hi Frank,

your commit 'of: remove *phandle properties from expanded device tree' in
-next causes several of my ppc qemu tests to crash. Looking into qemu, it
sets "linux,phandle" properties for the mpic and for other devices.

The crashes are along the line of

------------[ cut here ]------------
kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=32
NUMA
CoreNet Generic
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc5-next-20170614 #1
task: c000000000ad8cc0 task.stack: c000000000bec000
NIP: c000000000a8ca7c LR: c000000000a8ca6c CTR: c000000000a8ca20
REGS: c000000000befb90 TRAP: 0700 Not tainted (4.12.0-rc5-next-20170614)
MSR: 0000000080021000 <CE,ME>
CR: 22000042 XER: 00000000
SOFTE: 0
GPR00: c000000000a8ca6c c000000000befe10 c000000000befa00 0000000000000000
GPR04: 0000000000000000 c000000000ac8458 c000000000ac8438 c000000000830658
GPR08: 0000000000000001 0000000000000001 0000000000000000 0000000000009531
GPR12: 0000000022000022 c00000003fff1000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR28: c000000000000300 c00000003fff2cc0 c000000000ac06e0 c000000000ac06e0
NIP [c000000000a8ca7c] .corenet_gen_pic_init+0x5c/0x90
LR [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90
Call Trace:
[c000000000befe10] [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90 (unreliable)
[c000000000befe80] [c000000000a832f8] .init_IRQ+0x34/0x4c
[c000000000befef0] [c000000000a7fc88] .start_kernel+0x2fc/0x500
[c000000000beff90] [c000000000000554] start_here_common+0x1c/0x48
Instruction dump:
e8aa0068 39088268 39407002 38600000 7fa54800 39205002 7caa4f9e 4bffe9e9
60000000 2c230000 7d200026 55291ffe <0b090000> 4bfff335 60000000 3ca2ffdd
random: 0x600000003d220004 get_random_bytes called with crng_init=0
---[ end trace 0000000000000000 ]---

and are caused by the kernel not finding the mpic node anymore.

Any idea how to solve the problem ?

Bisect log is attached.

Thanks,
Guenter

---
# bad: [b14746170b0684005bab3e07893e6b91baf7dbf6] Add linux-next specific files for 20170614
# good: [32c1431eea4881a6b17bd7c639315010aeefa452] Linux 4.12-rc5
git bisect start 'HEAD' 'v4.12-rc5'
# good: [0500b956eedb4686b0420308ae01a74b00f9ab64] Merge remote-tracking branch 'crypto/master'
git bisect good 0500b956eedb4686b0420308ae01a74b00f9ab64
# bad: [4717c17660509cee9d3596eb19b99f3e26d57c36] Merge remote-tracking branch 'tip/auto-latest'
git bisect bad 4717c17660509cee9d3596eb19b99f3e26d57c36
# good: [f32807fd889514af115c32f597f59763d44ffae4] next-20170613/sound-asoc
git bisect good f32807fd889514af115c32f597f59763d44ffae4
# good: [8bf3df94bf566c7294b6f972cb5afa2d9a3a83f5] Merge remote-tracking branch 'iommu/next'
git bisect good 8bf3df94bf566c7294b6f972cb5afa2d9a3a83f5
# good: [e5c91c3569136b20783bd0799f026b89e4a2752a] Merge branch 'sched/core'
git bisect good e5c91c3569136b20783bd0799f026b89e4a2752a
# good: [3ff2be7e0e543ed1fbdd1a9f5ca49417be7b2a66] Merge branch 'x86/boot'
git bisect good 3ff2be7e0e543ed1fbdd1a9f5ca49417be7b2a66
# good: [2b37bbbc6291132aa8b08088ec31652eaf66ce6a] Merge remote-tracking branches 'spi/topic/rockchip', 'spi/topic/sh-msiof', 'spi/topic/spidev' and 'spi/topic/st-ssc4' into spi-next
git bisect good 2b37bbbc6291132aa8b08088ec31652eaf66ce6a
# good: [82a28f6c16030d04f5719889999f4fa9a35bcfc7] Merge branch 'x86/timers'
git bisect good 82a28f6c16030d04f5719889999f4fa9a35bcfc7
# bad: [d19a4961ac001b1284013ecff3deb6456a09abda] of: make __of_attach_node() static
git bisect bad d19a4961ac001b1284013ecff3deb6456a09abda
# good: [e5e9b5fae7e7d1fad87e4abb52f5f3d55c9f4e25] iio: proximity: as3935: add missing required spi-max-frequency
git bisect good e5e9b5fae7e7d1fad87e4abb52f5f3d55c9f4e25
# good: [d20dc1493db438fbbfb7733adc82f472dd8a0789] of: Support const and non-const use for to_of_node()
git bisect good d20dc1493db438fbbfb7733adc82f472dd8a0789
# good: [4811a1a7800bc59074e640a4fe9befdb668ae56f] Merge branch 'dt/property-move' into dt/next
git bisect good 4811a1a7800bc59074e640a4fe9befdb668ae56f
# bad: [f847192ce4061dc7e9087eb9136a38e3bf582efb] of: remove *phandle properties from expanded device tree
git bisect bad f847192ce4061dc7e9087eb9136a38e3bf582efb
# good: [6fedb069def034a4738584920fe94535ab29637a] of: Provide dummy of_device_compatible_match() for compile-testing
git bisect good 6fedb069def034a4738584920fe94535ab29637a
# first bad commit: [f847192ce4061dc7e9087eb9136a38e3bf582efb] of: remove *phandle properties from expanded device tree

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-14 19:26 Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree' Guenter Roeck
@ 2017-06-14 21:31 ` Frank Rowand
  2017-06-14 22:35 ` Guenter Roeck
  2017-06-15 6:48 ` Michael Ellerman
  1 sibling, 1 reply; 9+ messages in thread
From: Frank Rowand @ 2017-06-14 21:31 UTC (permalink / raw)
  To: Guenter Roeck, Frank Rowand; +Cc: linux-kernel, Rob Herring

Hi Guenter,

Thanks for reporting this.


On 06/14/17 12:26, Guenter Roeck wrote:
> Hi Frank,
>
> your commit 'of: remove *phandle properties from expanded device tree' in
> -next causes several of my ppc qemu tests to crash. Looking into qemu, it
> sets "linux,phandle" properties for the mpic and for other devices.
>
> The crashes are along the line of
>
> ------------[ cut here ]------------
> kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!
> Oops: Exception in kernel mode, sig: 5 [#1]
> SMP NR_CPUS=32
> NUMA
> CoreNet Generic
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc5-next-20170614 #1
> task: c000000000ad8cc0 task.stack: c000000000bec000
> NIP: c000000000a8ca7c LR: c000000000a8ca6c CTR: c000000000a8ca20
> REGS: c000000000befb90 TRAP: 0700 Not tainted (4.12.0-rc5-next-20170614)
> MSR: 0000000080021000 <CE,ME>
> CR: 22000042 XER: 00000000
> SOFTE: 0
> GPR00: c000000000a8ca6c c000000000befe10 c000000000befa00 0000000000000000
> GPR04: 0000000000000000 c000000000ac8458 c000000000ac8438 c000000000830658
> GPR08: 0000000000000001 0000000000000001 0000000000000000 0000000000009531
> GPR12: 0000000022000022 c00000003fff1000 0000000000000000 0000000000000000
> GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> GPR28: c000000000000300 c00000003fff2cc0 c000000000ac06e0 c000000000ac06e0
> NIP [c000000000a8ca7c] .corenet_gen_pic_init+0x5c/0x90
> LR [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90
> Call Trace:
> [c000000000befe10] [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90
> (unreliable)
> [c000000000befe80] [c000000000a832f8] .init_IRQ+0x34/0x4c
> [c000000000befef0] [c000000000a7fc88] .start_kernel+0x2fc/0x500
> [c000000000beff90] [c000000000000554] start_here_common+0x1c/0x48
> Instruction dump:
> e8aa0068 39088268 39407002 38600000 7fa54800 39205002 7caa4f9e 4bffe9e9
> 60000000 2c230000 7d200026 55291ffe <0b090000> 4bfff335 60000000 3ca2ffdd
> random: 0x600000003d220004 get_random_bytes called with crng_init=0
> ---[ end trace 0000000000000000 ]---
>
> and are caused by the kernel not finding the mpic node anymore.
>
> Any idea how to solve the problem ?

The BUG() is triggered if mpic_alloc() returns NULL.

I looked through mpic_alloc(), and the functions that it calls, and nothing
is jumping out as being related to phandles.

Can you add some printks to mpic_alloc() to determine what problem is
causing it to return NULL?

Can you also include the console messages before the "[ cut here ]" line?

-Frank

>
> Bisect log is attached.
>
> Thanks,
> Guenter

< snip >

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-14 21:31 ` Frank Rowand
@ 2017-06-14 22:35 ` Guenter Roeck
  2017-06-15 0:45 ` Frank Rowand
  0 siblings, 1 reply; 9+ messages in thread
From: Guenter Roeck @ 2017-06-14 22:35 UTC (permalink / raw)
  To: Frank Rowand; +Cc: Frank Rowand, linux-kernel, Rob Herring

On Wed, Jun 14, 2017 at 02:31:58PM -0700, Frank Rowand wrote:
> Hi Guenter,
>
> Thanks for reporting this.
>
>
> On 06/14/17 12:26, Guenter Roeck wrote:
> > Hi Frank,
> >
> > your commit 'of: remove *phandle properties from expanded device tree' in
> > -next causes several of my ppc qemu tests to crash. Looking into qemu, it
> > sets "linux,phandle" properties for the mpic and for other devices.
> >
> > The crashes are along the line of
> >
> > ------------[ cut here ]------------
> > kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!
> > Oops: Exception in kernel mode, sig: 5 [#1]
> > SMP NR_CPUS=32
> > NUMA
> > CoreNet Generic
> > Modules linked in:
> > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc5-next-20170614 #1
> > task: c000000000ad8cc0 task.stack: c000000000bec000
> > NIP: c000000000a8ca7c LR: c000000000a8ca6c CTR: c000000000a8ca20
> > REGS: c000000000befb90 TRAP: 0700 Not tainted (4.12.0-rc5-next-20170614)
> > MSR: 0000000080021000 <CE,ME>
> > CR: 22000042 XER: 00000000
> > SOFTE: 0
> > GPR00: c000000000a8ca6c c000000000befe10 c000000000befa00 0000000000000000
> > GPR04: 0000000000000000 c000000000ac8458 c000000000ac8438 c000000000830658
> > GPR08: 0000000000000001 0000000000000001 0000000000000000 0000000000009531
> > GPR12: 0000000022000022 c00000003fff1000 0000000000000000 0000000000000000
> > GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > GPR28: c000000000000300 c00000003fff2cc0 c000000000ac06e0 c000000000ac06e0
> > NIP [c000000000a8ca7c] .corenet_gen_pic_init+0x5c/0x90
> > LR [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90
> > Call Trace:
> > [c000000000befe10] [c000000000a8ca6c] .corenet_gen_pic_init+0x4c/0x90
> > (unreliable)
> > [c000000000befe80] [c000000000a832f8] .init_IRQ+0x34/0x4c
> > [c000000000befef0] [c000000000a7fc88] .start_kernel+0x2fc/0x500
> > [c000000000beff90] [c000000000000554] start_here_common+0x1c/0x48
> > Instruction dump:
> > e8aa0068 39088268 39407002 38600000 7fa54800 39205002 7caa4f9e 4bffe9e9
> > 60000000 2c230000 7d200026 55291ffe <0b090000> 4bfff335 60000000 3ca2ffdd
> > random: 0x600000003d220004 get_random_bytes called with crng_init=0
> > ---[ end trace 0000000000000000 ]---
> >
> > and are caused by the kernel not finding the mpic node anymore.
> >
> > Any idea how to solve the problem ?
>
> The BUG() is triggered if mpic_alloc() returns NULL.
>
Yes, I got that far as well ...

> I looked through mpic_alloc(), and the functions that it calls, and nothing
> is jumping out as being related to phandles.
>
> Can you add some printks to mpic_alloc() to determine what problem is
> causing it to return NULL?
>
I'll try later tonight.

> Can you also include the console messages before the "[ cut here ]" line?
>
http://kerneltests.org/builders

Check qemu test results in the 'next' column. ppc and ppc64 show related console
messages.

Guenter

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-14 22:35 ` Guenter Roeck
@ 2017-06-15 0:45 ` Frank Rowand
  2017-06-15 2:10 ` Guenter Roeck
  2017-06-15 4:12 ` Guenter Roeck
  0 siblings, 2 replies; 9+ messages in thread
From: Frank Rowand @ 2017-06-15 0:45 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Frank Rowand, linux-kernel, Rob Herring

On 06/14/17 15:35, Guenter Roeck wrote:
> On Wed, Jun 14, 2017 at 02:31:58PM -0700, Frank Rowand wrote:
>> Hi Guenter,

< snip >

>> Can you also include the console messages before the "[ cut here ]" line?
>>
> http://kerneltests.org/builders
>
> Check qemu test results in the 'next' column. ppc and ppc64 show related console
> messages.

Thanks for the pointer. Unfortunately I did not see any additional clues (yet)
in the full log.

I tried to compare the failed boot to a good boot, but did not find a console
log for a good boot. I started at the qemu-ppc-next builder page:

   http://kerneltests.org/builders/qemu-ppc64-next

and looked at recent tests that were successful (like #645). But the log file
link from that test does not show the contents of the console for tests that
pass. Is there some way to see what the console for a successful test looks
like?

-Frank

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-15 0:45 ` Frank Rowand
@ 2017-06-15 2:10 ` Guenter Roeck
  2017-06-15 4:12 ` Guenter Roeck
  1 sibling, 0 replies; 9+ messages in thread
From: Guenter Roeck @ 2017-06-15 2:10 UTC (permalink / raw)
  To: Frank Rowand; +Cc: Frank Rowand, linux-kernel, Rob Herring

[-- Attachment #1: Type: text/plain, Size: 1082 bytes --]

On Wed, Jun 14, 2017 at 05:45:52PM -0700, Frank Rowand wrote:
> On 06/14/17 15:35, Guenter Roeck wrote:
> > On Wed, Jun 14, 2017 at 02:31:58PM -0700, Frank Rowand wrote:
> >> Hi Guenter,
>
> < snip >
>
> >> Can you also include the console messages before the "[ cut here ]" line?
> >>
> > http://kerneltests.org/builders
> >
> > Check qemu test results in the 'next' column. ppc and ppc64 show related console
> > messages.
>
> Thanks for the pointer. Unfortunately I did not see any additional clues (yet)
> in the full log.
>
> I tried to compare the failed boot to a good boot, but did not find a console
> log for a good boot. I started at the qemu-ppc-next builder page:
>
> http://kerneltests.org/builders/qemu-ppc64-next
>
> and looked at recent tests that were successful (like #645). But the log file
> link from that test does not show the contents of the console for tests that
> pass. Is there some way to see what the console for a successful test looks
> like?
>

See attached. I am on the road; I'll try to do some debugging later from home.

Guenter

[-- Attachment #2: ppc64.log --]
[-- Type: text/plain, Size: 6407 bytes --]

MMU: Supported page sizes
  4 KB as direct
  4096 KB as direct
  16384 KB as direct
  65536 KB as direct
  262144 KB as direct
  1048576 KB as direct
MMU: Book3E HW tablewalk not supported
Linux version 4.12.0-rc4 (groeck@mars) (gcc version 4.8.1 (GCC) ) #1 SMP Wed Jun 14 19:06:31 PDT 2017
Found initrd at 0xc000000004000000:0xc000000004200c00
Using CoreNet Generic machine description
bootconsole [udbg0] enabled
CPU maps initialized for 1 thread per core
-----------------------------------------------------
phys_mem_size = 0x40000000
dcache_bsize = 0x40
icache_bsize = 0x40
cpu_features = 0x00180400181802c0
possible = 0x00180480581802c0
always = 0x00180400581802c0
cpu_user_features = 0xcc008000 0x08000000
mmu_features = 0x000a0010
firmware_features = 0x0000000000000000
-----------------------------------------------------
numa: NODE_DATA [mem 0x3ffd6740-0x3ffdffff]
CoreNet Generic board
Zone ranges:
  DMA [mem 0x0000000000000000-0x000000003fffffff]
  DMA32 empty
  Normal empty
Movable zone start for each node
Early memory node ranges
  node 0: [mem 0x0000000000000000-0x000000003fffffff]
Initmem setup node 0 [mem 0x0000000000000000-0x000000003fffffff]
MMU: Allocated 2112 bytes of context maps for 255 contexts
percpu: Embedded 18 pages/cpu @c00000003fe00000 s35544 r0 d38184 u1048576
Built 1 zonelists in Node order, mobility grouping on. Total pages: 258560
Policy zone: DMA
Kernel command line: rdinit=/sbin/init console=tty console=ttyS0 doreboot
PID hash table entries: 4096 (order: 3, 32768 bytes)
Memory: 952996K/1048576K available (8224K kernel code, 1280K rwdata, 2448K rodata, 368K init, 441K bss, 95580K reserved, 0K cma-reserved)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Hierarchical RCU implementation.
RCU debugfs-based tracing is enabled.
RCU restricting CPUs from NR_CPUS=32 to nr_cpu_ids=1.
RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
NR_IRQS:512 nr_irqs:512 16
mpic: Setting up MPIC " OpenPIC " version 1.2 at e0040000, max 1 CPUs
mpic: ISU size: 512, shift: 9, mask: 1ff
mpic: Initializing for 512 sources
clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x5c4093a7d1, max_idle_ns: 440795210635 ns
clocksource: timebase mult[2800000] shift[24] registered
Console: colour dummy device 80x25
console [tty0] enabled
pid_max: default: 32768 minimum: 301
Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
Mount-cache hash table entries: 2048 (order: 2, 16384 bytes)
Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes)
smp: Bringing up secondary CPUs ...
smp: Brought up 1 node, 1 CPU
devtmpfs: initialized
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
futex hash table entries: 256 (order: 2, 16384 bytes)
NET: Registered protocol family 16
Machine: MPC8544DS
SoC family: QorIQ
SoC ID: svr:0x00000000, Revision: 0.0
PCI: Probing PCI hardware
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
PTP clock support registered
Advanced Linux Sound Architecture Driver Initialized.
clocksource: Switched to clocksource timebase
NET: Registered protocol family 2
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 5, 131072 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
UDP hash table entries: 512 (order: 2, 16384 bytes)
UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Trying to unpack rootfs image as initramfs...
Freeing initrd memory: 2048K
audit: initializing netlink subsys (disabled)
audit: type=2000 audit(0.224:1): state=initialized audit_enabled=0 res=1
workingset: timestamp_bits=54 max_order=18 bucket_order=0
NFS: Registering the id_resolver key type
Key type id_resolver registered
Key type id_legacy registered
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
ntfs: driver 2.1.32 [Flags: R/O].
jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
io scheduler mq-deadline registered
io scheduler kyber registered
Serial: 8250/16550 driver, 2 ports, IRQ sharing enabled
console [ttyS0] disabled
serial8250.0: ttyS0 at MMIO 0xe0004500 (irq = 42, base_baud = 115200) is a 16550A
console [ttyS0] enabled
console [ttyS0] enabled
bootconsole [udbg0] disabled
bootconsole [udbg0] disabled
brd: module loaded
loop: module loaded
st: Version 20160209, fixed bufsize 32768, s/g segs 256
libphy: Fixed MDIO Bus: probed
e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci-pci: OHCI PCI platform driver
usbcore: registered new interface driver usb-storage
i2c /dev entries driver
sdhci: Secure Digital Host Controller Interface driver
sdhci: Copyright(c) Pierre Ossman
sdhci-pltfm: SDHCI platform and OF driver helper
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
ipip: IPv4 and MPLS over IPv4 tunneling driver
Initializing XFRM netlink socket
NET: Registered protocol family 10
Segment Routing with IPv6
sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
NET: Registered protocol family 17
NET: Registered protocol family 15
Key type dns_resolver registered
hctosys: unable to open rtc device (rtc0)
ALSA device list:
  No soundcards found.
Freeing unused kernel memory: 368K
This architecture does not have kernel memory protection.
Boot successful.
Rebooting.
swapoff: can't open '/etc/fstab': No such file or directory
umount: can't umount /: Invalid argument
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system reboot
reboot: Restarting system

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-15 0:45 ` Frank Rowand
  2017-06-15 2:10 ` Guenter Roeck
@ 2017-06-15 4:12 ` Guenter Roeck
  2017-06-15 7:58 ` Frank Rowand
  1 sibling, 1 reply; 9+ messages in thread
From: Guenter Roeck @ 2017-06-15 4:12 UTC (permalink / raw)
  To: Frank Rowand; +Cc: Frank Rowand, linux-kernel, Rob Herring

On 06/14/2017 05:45 PM, Frank Rowand wrote:
> On 06/14/17 15:35, Guenter Roeck wrote:
>> On Wed, Jun 14, 2017 at 02:31:58PM -0700, Frank Rowand wrote:
>>> Hi Guenter,
>
> < snip >
>
>>> Can you also include the console messages before the "[ cut here ]" line?
>>>
>> http://kerneltests.org/builders
>>
>> Check qemu test results in the 'next' column. ppc and ppc64 show related console
>> messages.
>
> Thanks for the pointer. Unfortunately I did not see any additional clues (yet)
> in the full log.
>
> I tried to compare the failed boot to a good boot, but did not find a console
> log for a good boot. I started at the qemu-ppc-next builder page:
>
> http://kerneltests.org/builders/qemu-ppc64-next
>
> and looked at recent tests that were successful (like #645). But the log file
> link from that test does not show the contents of the console for tests that
> pass. Is there some way to see what the console for a successful test looks
> like?
>
> -Frank
>

Good (v4.12-rc4):

...
NR_IRQS:512 nr_irqs:512 16
OF: Checking node /
OF: node '/' compatible '' type 'open-pic' name '' score 0
OF: node '/' compatible 'open-pic' type '' name '' score 0
OF: Checking node /pci@e0008000
OF: node '/pci@e0008000' compatible '' type 'open-pic' name '' score 0
OF: node '/pci@e0008000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000
OF: node '/soc@e0000000' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/msi@41600
OF: node '/soc@e0000000/msi@41600' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000/msi@41600' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/global-utilities@e0000
OF: node '/soc@e0000000/global-utilities@e0000' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000/global-utilities@e0000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/serial@4500
OF: node '/soc@e0000000/serial@4500' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000/serial@4500' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/pic@40000
OF: type match
OF: node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 2
OF: node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
mpic: Setting up MPIC " OpenPIC " version 1.2 at e0040000, max 1 CPUs
mpic: ISU size: 512, shift: 9, mask: 1ff
mpic: Initializing for 512 sources

bad:

NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
OF: Checking node /
OF: node '/' compatible '' type 'open-pic' name '' score 0
OF: node '/' compatible 'open-pic' type '' name '' score 0
OF: Checking node /pci@e0008000
OF: node '/pci@e0008000' compatible '' type 'open-pic' name '' score 0
OF: node '/pci@e0008000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000
OF: node '/soc@e0000000' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/msi@41600
OF: node '/soc@e0000000/msi@41600' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000/msi@41600' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/global-utilities@e0000
OF: node '/soc@e0000000/global-utilities@e0000' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000/global-utilities@e0000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/serial@4500
OF: node '/soc@e0000000/serial@4500' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000/serial@4500' compatible 'open-pic' type '' name '' score 0
OF: Checking node /soc@e0000000/pic@40000
OF: node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
OF: Checking node /aliases
OF: node '/aliases' compatible '' type 'open-pic' name '' score 0
OF: node '/aliases' compatible 'open-pic' type '' name '' score 0
OF: Checking node /cpus
OF: node '/cpus' compatible '' type 'open-pic' name '' score 0
OF: node '/cpus' compatible 'open-pic' type '' name '' score 0
OF: Checking node /cpus/PowerPC,8544@0
OF: node '/cpus/PowerPC,8544@0' compatible '' type 'open-pic' name '' score 0
OF: node '/cpus/PowerPC,8544@0' compatible 'open-pic' type '' name '' score 0
OF: Checking node /chosen
OF: node '/chosen' compatible '' type 'open-pic' name '' score 0
OF: node '/chosen' compatible 'open-pic' type '' name '' score 0
OF: Checking node /memory
OF: node '/memory' compatible '' type 'open-pic' name '' score 0
OF: node '/memory' compatible 'open-pic' type '' name '' score 0
No matching open-pic node
------------[ cut here ]------------
kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!

So, in __of_device_is_compatible(), the difference is in
__of_device_is_compatible() after

	/* Matching type is better than matching name */

Further debugging shows that device->type is NULL in the bad case.

OF: Checking node /soc@e0000000/pic@40000
OF: trying type match open-pic - <NULL>
OF: node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0

Do you need more information ?

Thanks,
Guenter

^ permalink raw reply [flat|nested] 9+ messages in thread
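[Editor's sketch, not part of the thread] The score difference in the two
logs above (2 in the good boot, 0 in the bad one) is what one would expect if
the type step bails out whenever the node has no type string. The user-space
model below is only an illustration of that step: struct node_model and
match_score_model() are hypothetical names, plain strcmp() is an assumption
standing in for the kernel's node comparison, and the compatible-string step
is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node: only the fields used here. */
struct node_model {
	const char *name;	/* value of the "name" property, or NULL */
	const char *type;	/* value of "device_type", or NULL */
};

/*
 * Simplified model of the type and name steps of the match score.
 * Returns 0 for "no match"; a type match adds 2, a name match adds 1.
 * Assumption: strcmp() replaces the kernel's comparison helper.
 */
static int match_score_model(const struct node_model *np,
			     const char *type, const char *name)
{
	int score = 0;

	/* Matching type is better than matching name */
	if (type && type[0]) {
		if (!np->type || strcmp(type, np->type) != 0)
			return 0;	/* NULL or different type: no match */
		score += 2;
	}

	/* Matching name is a bit better than not */
	if (name && name[0]) {
		if (!np->name || strcmp(name, np->name) != 0)
			return 0;
		score++;
	}

	return score;
}
```

With a node whose type is "open-pic" the model scores 2 for a type-only
lookup, and with type left NULL it scores 0, which mirrors the good and bad
lines for /soc@e0000000/pic@40000.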
* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-15 4:12 ` Guenter Roeck
@ 2017-06-15 7:58 ` Frank Rowand
  2017-06-15 9:53 ` Guenter Roeck
  0 siblings, 1 reply; 9+ messages in thread
From: Frank Rowand @ 2017-06-15 7:58 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Frank Rowand, linux-kernel, Rob Herring

On 06/14/17 21:12, Guenter Roeck wrote:

< snip >

> Good (v4.12-rc4):
>

< snip >

> OF: Checking node /soc@e0000000/pic@40000
> OF: type match
> OF: node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 2
> OF: node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0

< snip >

>
> bad:

< snip >

> OF: Checking node /soc@e0000000/pic@40000
> OF: node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
> OF: node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0

< snip >

> No matching open-pic node
> ------------[ cut here ]------------
> kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!
>
> So, in __of_device_is_compatible(), the difference is in
> __of_device_is_compatible() after
>
> /* Matching type is better than matching name */
>
> Further debugging shows that device->type is NULL in the bad case.
>
> OF: Checking node /soc@e0000000/pic@40000
> OF: trying type match open-pic - <NULL>
> OF: node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
> OF: node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
>
> Do you need more information ?

I think I know what part of my patch is causing the problem.

Can you try the following patch to see if it fixes the failure in
__of_device_is_compatible()?

If this fixes the failure, then I know what is going on. If it works
then I will have to rework my original patch in a different way than
this quick hack.

-Frank

---
 drivers/of/dynamic.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

Index: b/drivers/of/dynamic.c
===================================================================
--- a/drivers/of/dynamic.c
+++ b/drivers/of/dynamic.c
@@ -218,6 +218,20 @@ int of_property_notify(int action, struc
 
 static void __of_attach_node(struct device_node *np)
 {
+	const __be32 *phandle;
+	int sz;
+
+	/* use "<NULL>" to be consistent with populate_node() */
+	np->name = __of_get_property(np, "name", NULL) ? : "<NULL>";
+	np->type = __of_get_property(np, "device_type", NULL) ? : "<NULL>";
+
+	phandle = __of_get_property(np, "phandle", &sz);
+	if (!phandle)
+		phandle = __of_get_property(np, "linux,phandle", &sz);
+	if (IS_ENABLED(CONFIG_PPC_PSERIES) && !phandle)
+		phandle = __of_get_property(np, "ibm,phandle", &sz);
+	np->phandle = (phandle && (sz >= 4)) ? be32_to_cpup(phandle) : 0;
+
 	np->child = NULL;
 	np->sibling = np->parent->child;
 	np->parent->child = np;

^ permalink raw reply [flat|nested] 9+ messages in thread
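[Editor's sketch, not part of the thread] The phandle part of the patch above
tries three property names in order and decodes the first one found as a
32-bit big-endian value. The user-space model below illustrates only that
lookup order and decoding. struct prop_model, find_prop() and phandle_model()
are hypothetical names, and unlike the patch the model tries "ibm,phandle"
unconditionally rather than only when CONFIG_PPC_PSERIES is enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat property record; not the kernel's struct property. */
struct prop_model {
	const char *name;
	const unsigned char *value;	/* raw bytes, big-endian for cells */
	int length;			/* number of bytes in value */
};

/* Return the first property called 'name', or NULL if there is none. */
static const struct prop_model *find_prop(const struct prop_model *props,
					  size_t count, const char *name)
{
	for (size_t i = 0; i < count; i++)
		if (strcmp(props[i].name, name) == 0)
			return &props[i];
	return NULL;
}

/*
 * Try "phandle", then "linux,phandle", then "ibm,phandle", and decode the
 * first one found. Returns 0 if none exists or it is shorter than 4 bytes.
 */
static uint32_t phandle_model(const struct prop_model *props, size_t count)
{
	static const char *const names[] = {
		"phandle", "linux,phandle", "ibm,phandle"
	};
	const struct prop_model *p = NULL;

	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]) && !p; i++)
		p = find_prop(props, count, names[i]);

	if (!p || p->length < 4)
		return 0;

	/* Assemble the big-endian cell byte by byte, independent of host order. */
	return ((uint32_t)p->value[0] << 24) | ((uint32_t)p->value[1] << 16) |
	       ((uint32_t)p->value[2] << 8) | (uint32_t)p->value[3];
}
```

A node that carries only "linux,phandle", as qemu is reported to set for the
mpic, still yields its phandle value through the second lookup.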
* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-15 7:58 ` Frank Rowand
@ 2017-06-15 9:53 ` Guenter Roeck
  0 siblings, 0 replies; 9+ messages in thread
From: Guenter Roeck @ 2017-06-15 9:53 UTC (permalink / raw)
  To: Frank Rowand; +Cc: Frank Rowand, linux-kernel, Rob Herring

On 06/15/2017 12:58 AM, Frank Rowand wrote:
> On 06/14/17 21:12, Guenter Roeck wrote:
>
> < snip >
>
>> Good (v4.12-rc4):
>>
>
> < snip >
>
>> OF: Checking node /soc@e0000000/pic@40000
>> OF: type match
>> OF: node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 2
>> OF: node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
>
> < snip >
>
>>
>> bad:
>
> < snip >
>
>> OF: Checking node /soc@e0000000/pic@40000
>> OF: node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
>> OF: node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
>
> < snip >
>
>> No matching open-pic node
>> ------------[ cut here ]------------
>> kernel BUG at arch/powerpc/platforms/85xx/corenet_generic.c:50!
>>
>> So, in __of_device_is_compatible(), the difference is in
>> __of_device_is_compatible() after
>>
>> /* Matching type is better than matching name */
>>
>> Further debugging shows that device->type is NULL in the bad case.
>>
>> OF: Checking node /soc@e0000000/pic@40000
>> OF: trying type match open-pic - <NULL>
>> OF: node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
>> OF: node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0
>>
>> Do you need more information ?
>
> I think I know what part of my patch is causing the problem.
>
> Can you try the following patch to see if if fixes the failure in
> __of_device_is_compatible()?
>
> If this fixes the failure, then I know what is going on. If it works
> then I will have to rework my original patch in a different way than
> this quick hack.
>

Sorry, doesn't make a difference.

OF: Checking node /soc@e0000000/pic@40000
OF: trying type match open-pic - <NULL>
OF: node '/soc@e0000000/pic@40000' compatible '' type 'open-pic' name '' score 0
OF: node '/soc@e0000000/pic@40000' compatible 'open-pic' type '' name '' score 0

I added a log message into __of_attach_node(); it is not called.

Guenter

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree'
  2017-06-14 19:26 Qemu crashes in -next due to 'of: remove *phandle properties from expanded device tree' Guenter Roeck
  2017-06-14 21:31 ` Frank Rowand
@ 2017-06-15 6:48 ` Michael Ellerman
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2017-06-15 6:48 UTC (permalink / raw)
  To: Guenter Roeck, Frank Rowand; +Cc: linux-kernel, Rob Herring

Guenter Roeck <linux@roeck-us.net> writes:

> Hi Frank,
>
> your commit 'of: remove *phandle properties from expanded device tree' in
> -next causes several of my ppc qemu tests to crash. Looking into qemu, it
> sets "linux,phandle" properties for the mpic and for other devices.

Yeah this broke ~50% of my machines. Various back traces, or in some
cases nothing at all.

cheers

eg:

XICS: Cannot find a Source Controller !
------------[ cut here ]------------
kernel BUG at arch/powerpc/sysdev/xics/xics-common.c:58!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=2048
NUMA
pSeries
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.12.0-rc5-gcc5-next-20170614-gb147461 #1
task: c000000000eb1180 task.stack: c000000001084000
NIP: c00000000008d780 LR: c00000000008d770 CTR: 0000000000000000
REGS: c000000001087a40 TRAP: 0700 Tainted: G W (4.12.0-rc5-gcc5-next-20170614-gb147461)
MSR: 8000000000021032 <SF,ME,IR,DR,RI>
CR: 24000422 XER: 00000001
CFAR: c0000000008dd280 SOFTE: 0
GPR00: c00000000008d770 c000000001087cc0 c000000001086400 0000000000000000
GPR04: 0000000000000000 0000000000000000 c000000000ad14c8 0000000000000002
GPR08: 0000000000000002 0000000000000001 0000000000000002 0000000000000000
GPR12: 0000000022000424 c000000006af0000 00000000054dd288 00000000054b5618
GPR16: 00000000054b5320 00000000054b59e8 000000000554dd20 0000000000000060
GPR20: 000000000462eea0 0000000001b56c80 0000000000000040 0000000000000000
GPR24: 0000000004814000 0000000005aa0028 0000000004814000 0000000005ab158e
GPR28: ffffffffd00dfeed c000000000e115e0 0000000000000000 c000000000eb54f4
NIP [c00000000008d780] .xics_update_irq_servers+0x40/0x140
LR [c00000000008d770] .xics_update_irq_servers+0x30/0x140
Call Trace:
[c000000001087cc0] [c00000000008d770] .xics_update_irq_servers+0x30/0x140 (unreliable)
[c000000001087d50] [c000000000db85f0] .xics_init+0x134/0x188
[c000000001087dd0] [c000000000dbdc64] .pseries_init_irq+0x48/0x230
[c000000001087e80] [c000000000da8dcc] .init_IRQ+0x3c/0x50
[c000000001087ef0] [c000000000da44e4] .start_kernel+0x31c/0x528
[c000000001087f90] [c00000000000b070] start_here_common+0x1c/0x4ac
Instruction dump:
f821ff71 60000000 60000000 3d02ffe3 38800000 3be8f0f4 e87f0002 4884fa85
60000000 7c690074 7c7e1b78 7929d182 <0b090000> e93f0002 3d02000b 3c82ffc2
---[ end trace 523b05d3a02887f6 ]---

^ permalink raw reply [flat|nested] 9+ messages in thread