From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1MBZLI-00073M-Ag
	for qemu-devel@nongnu.org; Tue, 02 Jun 2009 15:08:12 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1MBZLC-0006x2-Nr
	for qemu-devel@nongnu.org; Tue, 02 Jun 2009 15:08:12 -0400
Received: from [199.232.76.173] (port=55120 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1MBZLC-0006wt-K8
	for qemu-devel@nongnu.org; Tue, 02 Jun 2009 15:08:06 -0400
Received: from mx2.redhat.com ([66.187.237.31]:42345)
	by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from )
	id 1MBZLB-0007XK-Rk
	for qemu-devel@nongnu.org; Tue, 02 Jun 2009 15:08:06 -0400
Received: from int-mx2.corp.redhat.com (int-mx2.corp.redhat.com [172.16.27.26])
	by mx2.redhat.com (8.13.8/8.13.8) with ESMTP id n52J852b016415
	for ; Tue, 2 Jun 2009 15:08:05 -0400
Message-ID: <4A257890.3000706@redhat.com>
Date: Tue, 02 Jun 2009 22:08:00 +0300
From: Avi Kivity
MIME-Version: 1.0
Subject: Re: [Qemu-devel] i586 TCG: boot hangs intermittently on cryptomgr_test at doublefault_fn
References: <20090602175833.GA26882@amd.home.annexia.org>
In-Reply-To: <20090602175833.GA26882@amd.home.annexia.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Richard W.M. Jones"
Cc: qemu-devel@nongnu.org

Richard W.M. Jones wrote:
> I have this bug[1] apparently in qemu which I'm trying to track down:
>
> ----------------------------------------------------------------------
> apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
> apm: overridden by ACPI.
> audit: initializing netlink socket (disabled)
> type=2000 audit(1243614582.002:1): initialized
> HugeTLB registered 4 MB page size, pre-allocated 0 pages
> VFS: Disk quotas dquot_6.5.2
> Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
> msgmni has been set to 680
> BUG: unable to handle kernel NULL pointer dereference at 00000014
> IP: [] doublefault_fn+0xd/0x108
> *pde = 00000000
> Oops: 0000 [#1] SMP
> last sysfs file:
> Modules linked in:
>
> Pid: 26, comm: cryptomgr_test Not tainted (2.6.30-0.91.rc7.git1.fc12.i586 #1)
> EIP: 0060:[] EFLAGS: f8d8409e CPU: 0
> EIP is at doublefault_fn+0xd/0x108
> EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
> ESI: 00000000 EDI: 00000000 EBP: c0be1e2c ESP: c0be1e18
> DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Process cryptomgr_test (pid: 26, ti=c0be0000 task=d5418000 task.ti=d5b88000)
> Stack:
> 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> Call Trace:
> Code: c2 eb 00 ba b8 dd 41 c0 ff e2 8d 15 e4 61 99 c0 8b 0a 51 8d 15 e0 61 99
> c0 8b 0a 51 c3 90 55 89 e5 56 53 83 ec 0c 0f 1f 44 00 00 <65> a1 14 00 00 00 89
> 45 f4 31 c0 8d 45 ee 66 c7 45 ee 00 00 c7
> EIP: [] doublefault_fn+0xd/0x108 SS:ESP 0068:c0be1e18
> CR2: 0000000000000014
> ---[ end trace 6d450e935ee1897c ]---
> cryptomgr_test used greatest stack depth: 7348 bytes left
> ----------------------------------------------------------------------
>
> It seems to be: i386 architecture only, software emulation, and
> intermittent, quite hard to reproduce reliably.
>
> So my questions are: Has anyone seen anything like this before?
> Is there anything I can set or enable to track down which instructions
> are failing?
>

The faulting instruction accesses gs:0x14.
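
For reference, that reading comes from the bytes at EIP in the quoted "Code:" line (the byte marked <65> and the five that follow). Below is a minimal sketch in Python that decodes just this one instruction form. It is not a general disassembler, and the names decode_mov_eax_moffs32 and FAULTING_BYTES are made up for the illustration.

```python
# Bytes copied from the oops "Code:" line, starting at the byte marked <65>.
FAULTING_BYTES = bytes.fromhex("65 a1 14 00 00 00")

# x86 segment-override prefix bytes and the segment each one selects.
SEGMENT_PREFIXES = {
    0x26: "es", 0x2E: "cs", 0x36: "ss",
    0x3E: "ds", 0x64: "fs", 0x65: "gs",
}

def decode_mov_eax_moffs32(code):
    """Decode '[segment prefix] A1 moffs32' (MOV EAX, seg:[offset]) only.

    Returns (segment, offset, instruction_length) and raises ValueError
    for anything else. Assumes 32-bit operand and address size.
    """
    pos = 0
    segment = "ds"  # default segment when no override prefix is present
    if pos < len(code) and code[pos] in SEGMENT_PREFIXES:
        segment = SEGMENT_PREFIXES[code[pos]]
        pos += 1
    if pos >= len(code) or code[pos] != 0xA1:
        raise ValueError("not a MOV EAX, moffs32 instruction")
    operand = code[pos + 1:pos + 5]
    if len(operand) != 4:
        raise ValueError("truncated 32-bit offset")
    offset = int.from_bytes(operand, "little")
    return segment, offset, pos + 5

if __name__ == "__main__":
    seg, off, length = decode_mov_eax_moffs32(FAULTING_BYTES)
    # AT&T syntax, matching what objdump would show for kernel code.
    print(f"mov %{seg}:{off:#x},%eax  ({length} bytes)")
```

The oops also reports CR2 = 0x14, the same value as the offset. That would be consistent with the GS base being zero when the access happened, but only the segment cache contents can confirm it.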
Can you expand the register printout code to include the full
information for the segment cache (base, limit, type, etc.)?

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.