From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e36.co.us.ibm.com (e36.co.us.ibm.com [32.97.110.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "e36.co.us.ibm.com", Issuer "Equifax" (not verified)) by ozlabs.org (Postfix) with ESMTPS id C299AB7074 for ; Tue, 19 Jun 2012 03:07:17 +1000 (EST) Received: from /spool/local by e36.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 18 Jun 2012 11:07:07 -0600 Received: from d03relay01.boulder.ibm.com (d03relay01.boulder.ibm.com [9.17.195.226]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id A4A1D19D804A for ; Mon, 18 Jun 2012 17:05:55 +0000 (WET) Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167]) by d03relay01.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q5IH5OdY161234 for ; Mon, 18 Jun 2012 11:05:24 -0600 Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1]) by d03av01.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q5IH5NQP005647 for ; Mon, 18 Jun 2012 11:05:24 -0600 Date: Mon, 18 Jun 2012 10:05:21 -0700 From: "Paul E. McKenney" To: Benjamin Herrenschmidt Subject: Re: RCU stalls on 32-bit pmac SMP Message-ID: <20120618170521.GD2400@linux.vnet.ibm.com> References: <1339979746.2372.6.camel@pasglop> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <1339979746.2372.6.camel@pasglop> Cc: linuxppc-dev@lists.ozlabs.org Reply-To: paulmck@linux.vnet.ibm.com List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , On Mon, Jun 18, 2012 at 10:35:46AM +1000, Benjamin Herrenschmidt wrote: > Hi Paul ! > > I was about to go debug something else when I hit that with -rc3 (plus > the patch to fix the current bug.h breakage) on a 32-bit PowerMac G4 SMP > (2 CPUs): About 1mn pause at boot followed by a bunch of RCU stall > messages. Log below. Any idea where to start looking ? > > pci 0002:20:0f.0: reg 30: [mem 0xf5100000-0xf51fffff pref] > PCI 0000:00 Cannot reserve Legacy IO [io 0x802000-0x802fff] > PCI 0001:10 Cannot reserve Legacy IO [io 0x0000-0x0fff] > pci_bus 0000:00: resource 4 [io 0x802000-0x1001fff] > pci_bus 0000:00: resource 5 [mem 0xf1000000-0xf1ffffff] > pci_bus 0000:00: resource 6 [mem 0x90000000-0x9fffffff] > pci_bus 0001:10: resource 4 [io 0x0000-0x7fffff] > pci_bus 0001:10: resource 5 [mem 0xf3000000-0xf3ffffff] > pci_bus 0001:10: resource 6 [mem 0x80000000-0x8fffffff] > pci_bus 0002:20: resource 4 [io 0xff7fe000-0xffffdfff] > pci_bus 0002:20: resource 5 [mem 0xf5000000-0xf5ffffff] > bio: create slab at 0 > vgaarb: device added: PCI:0000:00:10.0,decodes=io+mem,owns=io+mem,locks=none > vgaarb: loaded > vgaarb: bridge control possible 0000:00:10.0 > SCSI subsystem initialized > libata version 3.00 loaded. > usbcore: registered new interface driver usbfs > usbcore: registered new interface driver hub > usbcore: registered new device driver usb > Switching to clocksource timebase > NET: Registered protocol family 2 > IP route cache hash table entries: 32768 (order: 5, 131072 bytes) > TCP established hash table entries: 131072 (order: 8, 1048576 bytes) > TCP bind hash table entries: 65536 (order: 7, 524288 bytes) > TCP: Hash tables configured (established 131072 bind 65536) > TCP: reno registered > UDP hash table entries: 512 (order: 2, 16384 bytes) > UDP-Lite hash table entries: 512 (order: 2, 16384 bytes) > NET: Registered protocol family 1 > RPC: Registered named UNIX socket transport module. > RPC: Registered udp transport module. > RPC: Registered tcp transport module. > RPC: Registered tcp NFSv4.1 backchannel transport module. > pci 0001:10:18.0: enabling device (0000 -> 0002) > pci 0001:10:19.0: enabling device (0000 -> 0002) > PCI: CLS mismatch (32 != 1020), using 32 bytes > highmem bounce pool size: 64 pages > NFS: Registering the id_resolver key type > Key type id_resolver registered > msgmni has been set to 1500 > Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254) > io scheduler noop registered > io scheduler deadline registered > io scheduler cfq registered (default) > Using unsupported 1680x1050 ATY,BlueStone_A at 9c008000, depth=8, pitch=1792 > Console: switching to colour frame buffer device 210x65 > fb0: Open Firmware frame buffer device on /pci@f0000000/ATY,BlueStoneParent@10/ATY,BlueStone_A > Using unsupported 640x480 ATY,BlueStone_B at 99008000, depth=8, pitch=768 > checking generic (9c008000 1cb600) vs hw (99008000 5a000) > fb1: Open Firmware frame buffer device on /pci@f0000000/ATY,BlueStoneParent@10/ATY,BlueStone_B > Generic RTC Driver v1.07 > Generic non-volatile memory driver v1.1 > brd: module loaded > loop: module loaded > MacIO PCI driver attached to Keylargo chipset > adb: starting probe task... > adb: finished probe task... > mesh: configured for synchronous 5 MB/s > st: Version 20101219, fixed bufsize 32768, s/g segs 256 > pata-macio 0.0001f000:ata-4: Activating pata-macio chipset KeyLargo ATA-4, Apple bus ID 2 > scsi0 : pata_macio > ata1: PATA max UDMA/66 irq 19 > ata1.00: ATA-5: ST380021A, 5.05, max UDMA/100 > ata1.00: 156301488 sectors, multi 16: LBA > ata1.01: ATA-5: ST380021A, 3.75, max UDMA/100 > ata1.01: 156301488 sectors, multi 16: LBA > ata1.00: configured for UDMA/66 > ata1.01: configured for UDMA/66 > scsi 0:0:0:0: Direct-Access ATA ST380021A 5.05 PQ: 0 ANSI: 5 > sd 0:0:0:0: [sda] 156301488 512-byte logical blocks: (80.0 GB/74.5 GiB) > sd 0:0:0:0: [sda] Write Protect is off > sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 > sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > sd 0:0:0:0: Attached scsi generic sg0 type 0 > sda: [mac] sda1 sda2 sda3 sda4 sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12 sda13 sda14 sda15 > sd 0:0:0:0: [sda] Attached SCSI disk > scsi 0:0:1:0: Direct-Access ATA ST380021A 3.75 PQ: 0 ANSI: 5 > sd 0:0:1:0: [sdb] 156301488 512-byte logical blocks: (80.0 GB/74.5 GiB) > sd 0:0:1:0: [sdb] Write Protect is off > sd 0:0:1:0: Attached scsi generic sg1 type 0 > sd 0:0:1:0: [sdb] Mode Sense: 00 3a 00 00 > sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > INFO: rcu_sched self-detected stall on CPU { 0} (t=16163 jiffies) > Call Trace: > INFO: rcu_sched self-detected stall on CPU { 1} (t=16163 jiffies) > Call Trace: > [ef877d30] [c0008d04] show_stack+0x50/0x158 (unreliable) > [ef877d70] [c0097fe4] __rcu_pending+0x184/0x46c > [ef877da0] [c00991a0] rcu_check_callbacks+0x7c/0x168 > [ef877dc0] [c0044a40] update_process_times+0x3c/0x70 > [ef877de0] [c0083a3c] tick_sched_timer+0x88/0x100 > [ef877e10] [c005b11c] __run_hrtimer.clone.29+0x54/0x104 > [ef877e30] [c005bf44] hrtimer_interrupt+0x158/0x3f8 > [ef877ea0] [c000b5c4] timer_interrupt+0x1cc/0x204 > [ef877ed0] [c0011b88] ret_from_except+0x0/0x1c > --- Exception: 901 at cpu_idle+0xe4/0x188 Am I reading this correctly? Did the system really take a scheduling-clock interrupt from within another scheduling-clock interrupt? If so, my first question is "Why did the scheduling-clock interrupt run for so long?" ;-) Thanx, Paul > LR = cpu_idle+0xc8/0x188 > [ef877f90] [c00097e8] cpu_idle+0x60/0x188 (unreliable) > [ef877fc0] [c046531c] start_secondary+0x2c8/0x2cc > [ef877ff0] [00003278] 0x3278 > [ef873b60] [c0008d04] show_stack+0x50/0x158 (unreliable) > [ef873ba0] [c0097fe4] __rcu_pending+0x184/0x46c > [ef873bd0] [c0099240] rcu_check_callbacks+0x11c/0x168 > [ef873bf0] [c0044a40] update_process_times+0x3c/0x70 > [ef873c10] [c0083a3c] tick_sched_timer+0x88/0x100 > [ef873c40] [c005b11c] __run_hrtimer.clone.29+0x54/0x104 > [ef873c60] [c005bf44] hrtimer_interrupt+0x158/0x3f8 > [ef873cd0] [c000b5c4] timer_interrupt+0x1cc/0x204 > [ef873d00] [c0011b88] ret_from_except+0x0/0x1c > --- Exception: 901 at wake_up_new_task+0x134/0x16c > LR = wake_up_new_task+0x134/0x16c > [ef873dc0] [c0065f08] wake_up_new_task+0xfc/0x16c (unreliable) > [ef873df0] [c0035530] do_fork+0xe8/0x2bc > [ef873e30] [c0008a4c] sys_clone+0x50/0x90 > [ef873e50] [c00114b8] ret_from_syscall+0x0/0x40 > --- Exception: c00 at kernel_thread+0x28/0x68 > LR = __call_usermodehelper+0x40/0xdc > [ef873f10] [c005f5c8] async_run_entry_fn+0x128/0x1e4 (unreliable) > [ef873f20] [ef878c00] 0xef878c00 > [ef873f40] [c004e668] process_one_work+0x150/0x3f0 > [ef873f70] [c00514f0] worker_thread+0x18c/0x37c > [ef873fb0] [c00567bc] kthread+0x84/0x88 > [ef873ff0] [c000f514] kernel_thread+0x4c/0x68 > pata-macio 0.00020000:ata-3: Activating pata-macio chipset KeyLargo ATA-3, Apple bus ID 0 > scsi1 : pata_macio > ata2: PATA max MWDMA2 irq 20 > sdb: [mac] sdb1 sdb2 sdb3 sdb4 sdb5 sdb6 sdb7 sdb8 sdb9 sdb10 sdb11 sdb12 sdb13 > sd 0:0:1:0: [sdb] Attached SCSI disk > ata2.00: ATAPI: PIONEER DVD-RW DVR-104, A227, max UDMA/33 > ata2.00: configured for MWDMA2 > scsi 1:0:0:0: CD-ROM PIONEER DVD-RW DVR-104 A227 PQ: 0 ANSI: 5 > pata-macio 0.00021000:ata-3: Activating pata-macio chipset KeyLargo ATA-3, Apple bus ID 1 > scsi2 : pata_macio > ata3: PATA max MWDMA2 irq 21 > sr0: scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray > cdrom: Uniform CD-ROM driver Revision: 3.20 > sr 1:0:0:0: Attached scsi CD-ROM sr0 > pcnet32: pcnet32.c:v1.35 21.Apr.2008 tsbogend@alpha.franken.de > sr 1:0:0:0: Attached scsi generic sg2 type 5 > sungem.c:v1.0 David S. Miller > gem 0002:20:0f.0: eth0: Sun GEM (PCI) 10/100/1000BaseT Ethernet 00:03:93:6f:04:b2 > PPP generic driver version 2.4.2 > PPP Deflate Compression module registered > ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver > ohci_hcd 0001:10:18.0: OHCI Host Controller > ohci_hcd 0001:10:18.0: new USB bus registered, assigned bus number 1 > ohci_hcd 0001:10:18.0: irq 27, io mem 0x80081000 > hub 1-0:1.0: USB hub found > hub 1-0:1.0: 2 ports detected > ohci_hcd 0001:10:19.0: OHCI Host Controller > ohci_hcd 0001:10:19.0: new USB bus registered, assigned bus number 2 > ohci_hcd 0001:10:19.0: irq 28, io mem 0x80080000 > hub 2-0:1.0: USB hub found > hub 2-0:1.0: 2 ports detected > mousedev: PS/2 mouse device common for all mice > usbcore: registered new interface driver appletouch > PowerMac i2c bus pmu 2 registered > i2c i2c-0: i2c-powermac: invalid reg on /pci@f2000000/mac-io@17/via-pmu@16000/rtc > i2c i2c-0: i2c-powermac: invalid reg on /pci@f2000000/mac-io@17/via-pmu@16000/power-mgt > PowerMac i2c bus pmu 1 registered > i2c i2c-1: i2c-powermac: invalid reg on /pci@f2000000/mac-io@17/via-pmu@16000/rtc > i2c i2c-1: i2c-powermac: invalid reg on /pci@f2000000/mac-io@17/via-pmu@16000/power-mgt > PowerMac i2c bus mac-io 0 registered > i2c i2c-2: i2c-powermac: invalid reg on /pci@f2000000/mac-io@17/i2c@18000/cereal > i2c i2c-2: i2c-powermac: invalid reg on /pci@f2000000/mac-io@17/i2c@18000/deq > i2c i2c-2: i2c-powermac: invalid reg on /pci@f2000000/mac-io@17/i2c@18000/i2c-modem > PowerMac i2c bus uni-n 1 registered > i2c i2c-3: i2c-powermac: invalid reg on /uni-n@f8000000/i2c@f8001000/cereal > PowerMac i2c bus uni-n 0 registered > i2c i2c-4: i2c-powermac: invalid reg on /uni-n@f8000000/i2c@f8001000/cereal > APM Battery Driver > usbcore: registered new interface driver usbhid > usbhid: USB HID core driver > oprofile: using ppc/7450 performance monitoring. > TCP: cubic registered > Initializing XFRM netlink socket > NET: Registered protocol family 17 > NET: Registered protocol family 15 > Key type dns_resolver registered > PM: Hibernation image not present or could not be loaded. > input: PMU as /devices/virtual/input/input0 > kjournald starting. Commit interval 5 seconds > EXT3-fs (sda15): mounted filesystem with writeback data mode > VFS: Mounted root (ext3 filesystem) readonly on device 8:15. > usb 1-1: new full-speed USB device number 2 using ohci_hcd > devtmpfs: mounted > Freeing unused kernel memory: 240k freed > hub 1-1:1.0: USB hub found > hub 1-1:1.0: 3 ports detected > usb 1-2: new full-speed USB device number 3 using ohci_hcd > hub 1-2:1.0: USB hub found > hub 1-2:1.0: 3 ports detected > usb 1-1.3: new full-speed USB device number 4 using ohci_hcd > input: Mitsumi Electric Apple Extended USB Keyboard as /devices/pci0001:10/0001:10:18.0/usb1/1-1/1-1.3/1-1.3:1.0/input/input1 > hid-generic 0003:05AC:020B.0001: input: USB HID v1.10 Keyboard [Mitsumi Electric Apple Extended USB Keyboard] on usb-0001:10:18.0-1.3/input0 > input: Mitsumi Electric Apple Extended USB Keyboard as /devices/pci0001:10/0001:10:18.0/usb1/1-1/1-1.3/1-1.3:1.1/input/input2 > hid-generic 0003:05AC:020B.0002: input: USB HID v1.10 Device [Mitsumi Electric Apple Extended USB Keyboard] on usb-0001:10:18.0-1.3/input1 > usb 1-2.3: new low-speed USB device number 5 using ohci_hcd > hid-generic 0003:05AC:9219.0003: claimed by neither input, hiddev nor hidraw > udevd[1209]: starting version 175 > EXT3-fs (sda15): using internal journal > sungem_phy: PHY ID: 2060e1, addr: 0 > gem 0002:20:0f.0: eth0: Found BCM5421 PHY > gem 0002:20:0f.0: eth0: Link is up at 100 Mbps, full-duplex > gem 0002:20:0f.0: eth0: Pause is enabled (rxfifo: 10240 off: 7168 on: 5632) > sshd (2042): /proc/2042/oom_adj is deprecated, please use /proc/2042/oom_score_adj instead. > >