From: Guenter Roeck
To: David Miller
Cc: pasha.tatashin@oracle.com, linux-kernel@vger.kernel.org, bob.picco@oracle.com, steven.sistare@oracle.com
Subject: Re: qemu sparc64 runtime crashes in -next
Date: Wed, 14 Jun 2017 13:42:31 -0700
Message-ID: <20170614204231.GA3783@roeck-us.net>
References: <157ed65a-b676-960c-5293-5964b7110f90@roeck-us.net> <20170614.153108.433249474812834884.davem@davemloft.net>
In-Reply-To: <20170614.153108.433249474812834884.davem@davemloft.net>

On Wed, Jun 14, 2017 at 03:31:08PM -0400, David Miller wrote:
> From: Guenter Roeck
> Date: Wed, 14 Jun 2017 03:13:54 -0700
>
> > Hi,
> >
> > my sparc qemu tests started failing with next-20170613.
> > Log output is not very helpful:
> >
> > Unhandled Exception 0x0000000000000028
> > PC = 0x00000000004620f4 NPC = 0x00000000004620f8
> > Stopping execution
> >
> > It looks like 0x00000000004620f4 is in init_tick_ops().
> >
> > Bisect points to commit 'sparc64: improve modularity tick options'.
> > Bisect log is attached.
> >
> > No idea if this is a qemu problem. If you think it is, anything to help
> > tracking it down would be appreciated.
>
> Pavel, please look into this.
>
> It looks weird that the commit it bisects to would cause a problem.
> Maybe the change from __read_mostly to __cachelin_aligned causes the
> issue?
>
> Really weird...

Turns out tick_get_frequency() returns 0. The value is used as divisor
in clocksource_hz2mult(). Looking into it further, clock_tick is
initialized much later.

[    0.000000] clock_tick is 0 -> tick_get_frequency()
[    0.039361] PROMLIB: Sun IEEE Boot Prom 'OBP 3.10.24 1999/01/01 01:01'
[    0.041646] PROMLIB: Root node compatible: sun4u
[    0.060500] Linux version 4.12.0-rc5-next-20170614+ (groeck@mars) (gcc version 4.6.3 (GCC) ) #5 SMP Wed Jun 14 13:40:01 PDT 2017
[    0.893475] bootconsole [earlyprom0] enabled
[    0.958658] ARCH: SUN4U
[    1.265007] Ethernet address: 52:54:00:12:34:56
[    1.340458] MM: PAGE_OFFSET is 0xfffff80000000000 (max_phys_bits == 40)
[    1.405302] MM: VMALLOC [0x0000000100000000 --> 0x0000060000000000]
[    1.468992] MM: VMEMMAP [0x0000060000000000 --> 0x00000c0000000000]
[    3.349070] Kernel: Using 5 locked TLB entries for main kernel image.
[    3.422093] Remapping the kernel...
[    4.342159] done.
[  136.231664] OF stdout device is: /pci@1fe,0/ebus@3/su
[  136.298896] PROM: Built device tree with 60466 bytes of memory.
[  136.458520] Top of RAM: 0x1fe80000, Total RAM: 0x1fe80000
[  136.520487] Memory hole size: 0MB
[  143.705871] Allocated 16384 bytes for kernel page tables.
[  143.972916] Zone ranges:
[  144.039046]   Normal   [mem 0x0000000000000000-0x000000001fe7ffff]
[  144.118654] Movable zone start for each node
[  144.180797] Early memory node ranges
[  144.240870]   node   0: [mem 0x0000000000000000-0x000000001fe7ffff]
[  144.333686] Initmem setup node 0 [mem 0x0000000000000000-0x000000001fe7ffff]
[  144.943918] Booting Linux...
[  145.010966] CPU CAPS: [flush,stbar,swap,muldiv,v9,mul32,div32,v8plus]
[  145.082225] CPU CAPS: [vis]
[  145.581394] percpu: Embedded 12 pages/cpu @fffff8001f800000 s57024 r8192 d33088 u4194304
[  145.949412] ###################### fill_in_one_cpu(): CPU 0 clock tick set to 100000000

That doesn't really take 145 seconds, though :-).

Guenter
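
For illustration, the failure mode described above (a zero frequency used as the divisor in clocksource_hz2mult()) can be sketched in userspace C. This is a minimal sketch, not the kernel code or a proposed fix: the arithmetic follows clocksource_hz2mult(), while the name hz2mult_checked(), the -1 return value and the hz == 0 guard are assumptions made for this example. A division by zero at that point would be consistent with the 0x28 exception in the log, which is division_by_zero in the SPARC V9 trap table.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Userspace mirror of the clocksource_hz2mult() arithmetic:
 *   mult = round((NSEC_PER_SEC << shift) / hz)
 *
 * hz2mult_checked() is a hypothetical helper for this sketch. The
 * hz == 0 check is not in the kernel function; it only marks the
 * spot where a zero frequency would otherwise be divided by.
 */
static int hz2mult_checked(uint32_t hz, uint32_t shift, uint32_t *mult)
{
	uint64_t tmp;

	if (hz == 0)
		return -1;	/* frequency not initialized yet */

	tmp = NSEC_PER_SEC << shift;
	tmp += hz / 2;		/* round to nearest */
	tmp /= hz;		/* would trap here if hz were 0 */
	*mult = (uint32_t)tmp;
	return 0;
}
```

With the 100000000 tick value that fill_in_one_cpu() reports later in the log, the call succeeds; with the 0 seen at time 0.000000 it returns -1 instead of dividing.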