Subject: Re: [Qemu-devel] ppc64 target broken
From: Laurent Vivier
To: Alexander Graf
Cc: Blue Swirl, qemu-devel@nongnu.org
Date: Wed, 11 Nov 2009 14:17:36 +0100
Message-Id: <1257945456.9716.2.camel@Quad>
In-Reply-To: <78FF0801-D8AA-4A6B-9238-3035301AA9C1@suse.de>
References: <78FF0801-D8AA-4A6B-9238-3035301AA9C1@suse.de>
List-Id: qemu-devel.nongnu.org

qemu-system-ppc is also broken.

A bisect gives me:

c169998802505c244b8bcad562633f29de7d74a4 is first bad commit
commit c169998802505c244b8bcad562633f29de7d74a4
Author: Glauber Costa
Date:   Thu Nov 5 16:05:15 2009 -0200

    v3: don't call reset functions on cpu initialization

    There is absolutely no need to call reset functions when initializing
    devices. Since we are already registering them, calling qemu_system_reset()
    should suffice. Actually, it is what happens when we reboot the machine,
    and using the same process instead of a special case semantics will even
    allow us to find bugs easier.

    Furthermore, the fact that we initialize things like the cpu quite early,
    leads to the need to introduce synchronization stuff like qemu_system_cond.
    This patch removes it entirely. All we need to do is call qemu_system_reset()
    only when we're already sure the system is up and running

    I tested it with qemu (with and without io-thread) and qemu-kvm, and it
    seems to be doing okay - although qemu-kvm uses a slightly different patch.

    [ v2: user mode still needs cpu_reset, so put it in ifdef. ]
    [ v3: leave qemu_system_cond for now. ]

    Signed-off-by: Glauber Costa
    Signed-off-by: Blue Swirl

:040000 040000 0eb9d44c103057883acd6dcd76c306795bec4cca 8bd8c30125188ad9d346c953ccf2e56b1c289ebc M hw
:040000 040000 24f0e2cb42ae95c248fb6874797fa53fbc2b1cba 6dffa9f36e0350fd48fc5f4a5d4191c5a5cc6e26 M target-i386
:100644 100644 e57f58fea0df3ab4ade42fce4cf86d9d2a60a1e2 9031911c513eb2fc15564ec3cd37e8b274c65cc8 M vl.c

On Tuesday, 10 November 2009 at 19:04 +0100, Alexander Graf wrote:
> Hi list,
>
> For quite some time the PPC64 target (-M mac99 -cpu 970fx) is broken
> in early init code:
>
> <6>OF: ** translation for device /pci@f2000000/pci@d/mac-io@10/interrupt-controller@40000 **
> <6>OF: bus is default (na=1, ns=1) on /pci@f2000000/pci@d/mac-io@10
> <4>OF: translating address: 00040000
> <6>OF: parent bus is pci (na=3, ns=2) on /pci@f2000000/pci@d
> <6>OF: walking ranges...
> <6>OF: default map, cp=0, s=80000, da=40000
> <4>OF: parent translation for: 82008010 00000000 c0000000
> <6>OF: with offset: 40000
> <4>OF: one level translation: 82008010 00000000 c0040000
> <6>OF: parent bus is pci (na=3, ns=2) on /pci@f2000000
> <6>OF: no ranges, 1:1 translation
> <4>OF: parent translation for: 00000000 00000000 00000000
> <6>OF: with offset: c0040000
> <4>OF: one level translation: 00000000 00000000 c0040000
> <6>OF: parent bus is default (na=1, ns=1) on /
> <6>OF: walking ranges...
> <6>OF: not found !
> <0>------------[ cut here ]------------
> <2>kernel BUG at arch/powerpc/platforms/powermac/pic.c:530!
> <4>Oops: Exception in kernel mode, sig: 5 [#1]
> <4>SMP NR_CPUS=1024 NUMA PowerMac
> <4>Modules linked in:
> <4>Supported: Yes
> <4>NIP: c0000000007449a8 LR: c0000000007449a0 CTR: 0000000000000000
> <4>REGS: c0000000009a3b40 TRAP: 0700 Not tainted (2.6.27.7-kvm)
> <4>MSR: 8000000000021032 CR: 22000088 XER: 20000000
> <4>TASK = c0000000008e83c0[0] 'swapper' THREAD: c0000000009a0000 CPU: 0
> <6>GPR00: c0000000007449a0 c0000000009a3dc0 c0000000009952c0 0000000000000001
> <6>GPR04: c00000000092fd20 ffffffffffffffff 0000000000000010 d000080080107230
> <6>GPR08: c0000000008c4488 c00000000fffc400 0000000000000000 0000000000000f72
> <6>GPR12: 0000000022000082 c000000000a62c80 c000000000773638 c00000000068b9b0
> <6>GPR16: 0000000001773570 0000000000000000 c000000000773570 000000000f7fff20
> <6>GPR20: c000000000773588 c00000000068d02a c0000000007787d4 000000000f7fff20
> <6>GPR24: 0000000005483224 00000000000000bb c000000000ae77a8 c000000000694bef
> <6>GPR28: c00000000fffebd0 0000000000000000 c000000000914868 0000000000000000
> <4>NIP [c0000000007449a8] .pmac_pic_init+0xec/0x1a8
> <4>LR [c0000000007449a0] .pmac_pic_init+0xe4/0x1a8
> <4>Call Trace:
> <4>[c0000000009a3dc0] [c0000000007449a0] .pmac_pic_init+0xe4/0x1a8 (unreliable)
> <4>[c0000000009a3e60] [c00000000073503c] .init_IRQ+0x3c/0x54
> <4>[c0000000009a3ee0] [c000000000730a00] .start_kernel+0x254/0x554
> <4>[c0000000009a3f90] [c000000000008568] .start_here_common+0x3c/0x54
>
>
> So the problem seems to be the "ranges" property or the address of the
> MPIC device. I'm not sure.
> One previously working revision (9d479c119b42b8a548f8d79a8e5a1c1ce2932d91)
> gives the following guest trace:
>
> <6>OF: ** translation for device /pci@5800/mac-io@f/interrupt-controller@40000 **
> <6>OF: bus is default (na=1, ns=1) on /pci@5800/mac-io@f
> <4>OF: translating address: 00040000
> <6>OF: parent bus is pci (na=3, ns=2) on /pci@5800
> <6>OF: walking ranges...
> <6>OF: default map, cp=0, s=80000, da=40000
> <4>OF: parent translation for: 82007810 00000000 80880000
> <6>OF: with offset: 40000
> <4>OF: one level translation: 82007810 00000000 808c0000
> <6>OF: parent bus is default (na=1, ns=1) on /
> <6>OF: no ranges, 1:1 translation
> <4>OF: parent translation for: 00000000
> <6>OF: with offset: 808c0000
> <4>OF: one level translation: 808c0000
> <6>OF: reached root node
>
> As you can see there is only one pci host device.
> I don't see how the old offset would have matched the new "ranges"
> parameters of the pci@f2000000 device though:
>
> http://imagebin.org/71215
>
> So I'm really puzzled by this. When removing the "ranges" property of
> the pci@f2000000 (so we're on 1:1 translation) Linux breaks in the
> PCI detection code.
>
> The first commit where the mac99 worked again at all is Blue
> Swirl's qdev conversion, so maybe he's got an idea?
>
> Thanks!
>
> Alex
>

-- 
--------------------- laurent@vivier.eu ----------------------
"Everything that is impossible remains to be accomplished" Jules Verne
"Things are only impossible until they're not" Jean-Luc Picard