From: Alexander Graf <agraf@suse.de>
To: qemu-devel@nongnu.org
Cc: blauwirbel@gmail.com, Alexander Graf <alex@csgraf.de>,
Paul Brook <paul@codesourcery.com>
Subject: Re: [Qemu-devel] [PATCH 7/7] PPC64: Don't fault at lwsync
Date: Thu, 05 Mar 2009 17:09:38 +0100
Message-ID: <49AFF942.6000708@suse.de>
In-Reply-To: <49AFF663.6020006@suse.de>

Alexander Graf wrote:
> Paul Brook wrote:
>
>> On Thursday 05 March 2009, Alexander Graf wrote:
>>
>>
>>> Right now we can throw a fault on lwsync, even though the fault is
>>> actually caused by the instruction after lwsync.
>>>
>>> I haven't found the magic that messed this up, but for now we can
>>> just end the TB on lwsync, forcing the next command to issue faults
>>> itself.
>>>
>>> If anyone knows how to really fix this, please step forward and do
>>> so. This only makes things work at all for me :-).
>>>
>>>
>> Where is the subsequent fault coming from? I suspect the real bug is nothing
>> to do with lwsync, and the subsequent fault is actually just corrupting the
>> CPU state. As discussed recently this is the same bug SPARC has with its
>> unassigned access handlers.
>>
>> Paul
>>
>>
>
> Without the patch I get:
>
> Unable to handle kernel paging request for data at address 0x00000000
> Faulting instruction address: 0xc0000000000ba524
> Oops: Kernel access of bad area, sig: 11 [#1]
> SMP NR_CPUS=1024 NUMA PowerMac
> Modules linked in:
> Supported: Yes
> NIP: c0000000000ba524 LR: c000000000775a0c CTR: c0000000007759e8
> REGS: c0000000061afb10 TRAP: 0300 Not tainted (2.6.27.7-9-ppc64)
> MSR: 8000000000009032 <EE,ME,IR,DR> CR: 84000044 XER: 20000000
> DAR: 0000000000000000, DSISR: 0000000040000000
> TASK = c00000000619d560[1] 'swapper' THREAD: c0000000061ac000 CPU: 0
> GPR00: ffffffffffffffff c0000000061afd90 c0000000009bbce8 0000000000000000
> GPR04: 0000000000000000 0000000000000000 0000000000000000 c000000000a82c80
> GPR08: 0000000000000613 c00000000619d560 c0000000070704c0 c0000000061ac000
> GPR12: 0000000088000044 c000000000a82c80 0000000000051b63 0000000000051a41
> GPR16: 0000000000051b5b 000000000004003c 0000000000053958 000000000005345e
> GPR20: 0000000000052fd4 0000000000063dc8 0000000000063db4 00000000fff0245c
> GPR24: 4000000002110000 c0000000007932f8 c000000000b077a8 0000000000000000
> GPR28: c0000000009621f0 c0000000007b01c8 c000000000938f18 c0000000007afce0
> NIP [c0000000000ba524] .cmpxchg_futex_value_locked+0x38/0x78
> LR [c000000000775a0c] .futex_init+0x24/0xac
> Call Trace:
> [c0000000061afd90] [c0000000007759c0] .init_tstats_procfs+0x2c/0x54 (unreliable)
> [c0000000061afe10] [c00000000000944c] .do_one_initcall+0x78/0x194
> [c0000000061aff00] [c000000000750440] .kernel_init+0xd0/0x148
> [c0000000061aff90] [c00000000002ad84] .kernel_thread+0x4c/0x68
> Instruction dump:
> 39290001 912b0014 7c8407b4 7ca507b4 e92d01b0 e8090520 7fa30040 419d0038
> e92d01b0 e8090520 2ba00003 409d0028 <7c2004ac> 7c001828 7c002000 40c20010
> ---[ end trace 561bb236c800851f ]---
> note: swapper[1] exited with preempt_count 1
> swapper used greatest stack depth: 9296 bytes left
> Kernel panic - not syncing: Attempted to kill init!
>
>
> Which is this translation block:
>
> NIP c0000000000ba524 LR c000000000775a0c CTR c0000000007759e8 XER 20000000
> MSR 8000000000009032 HID0 0000000060000000 HF 8000000000000000 idx 1
> TB 00000000 d8b159bb DECR 0007c417
> GPR00 ffffffffffffffff c0000000061afd90 c0000000009bbce8 0000000000000000
> GPR04 0000000000000000 0000000000000000 0000000000000000 c000000000a82c80
> GPR08 0000000000000613 c00000000619d560 c0000000070704c0 c0000000061ac000
> GPR12 0000000088000044 c000000000a82c80 0000000000051b63 0000000000051a41
> GPR16 0000000000051b5b 000000000004003c 0000000000053958 000000000005345e
> GPR20 0000000000052fd4 0000000000063dc8 0000000000063db4 00000000fff0245c
> GPR24 4000000002110000 c0000000007932f8 c000000000b077a8 0000000000000000
> GPR28 c0000000009621f0 c0000000007b01c8 c000000000938f18 c0000000007afce0
> CR 84000044 [ L G - - - - G G ] RES ffffffffffffffff
> FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> FPSCR 00000000
> SRR0 c000000000774950 SRR1 8000000000009032 SDR1 0000000007c00003
> IN:
> 0xc0000000000ba524: lwsync
> 0xc0000000000ba528: lwarx r0,0,r3
> 0xc0000000000ba52c: cmpw r0,r4
> 0xc0000000000ba530: bne- 0xc0000000000ba540
>
>
> And I seriously have trouble understanding how a data storage exception
> could happen on the lwsync opcode. It looks like R3 became 0 from the
> guest's point of view after lwsync though - hum.
>
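For reference, the word the oops marks in angle brackets (<7c2004ac>) and the one right after it (7c001828) can be decoded by hand to confirm that they are the lwsync/lwarx pair shown in the IN: listing. The following is only a minimal sketch, not QEMU or kernel code; the helper names decode_x_form and describe are made up for illustration, and it only knows the two opcodes relevant here (primary opcode 31 with extended opcode 598 for sync/lwsync and 20 for lwarx).

```python
def decode_x_form(word):
    """Split a 32-bit PowerPC X-form instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # primary opcode, top 6 bits
        "rt": (word >> 21) & 0x1F,       # target register (holds L for sync)
        "ra": (word >> 16) & 0x1F,
        "rb": (word >> 11) & 0x1F,
        "xo": (word >> 1) & 0x3FF,       # 10-bit extended opcode
    }


def describe(word):
    """Name the two instructions of interest; anything else is 'unknown'."""
    f = decode_x_form(word)
    if f["opcode"] == 31 and f["xo"] == 598:
        # The L field is the low two bits of the RT slot: 0 = sync, 1 = lwsync.
        l_field = f["rt"] & 0x3
        return {0: "sync", 1: "lwsync"}.get(l_field, "sync L=%d" % l_field)
    if f["opcode"] == 31 and f["xo"] == 20:
        # RA = 0 means "use literal zero as the base", printed as a bare 0.
        base = "0" if f["ra"] == 0 else "r%d" % f["ra"]
        return "lwarx r%d,%s,r%d" % (f["rt"], base, f["rb"])
    return "unknown"


if __name__ == "__main__":
    for w in (0x7C2004AC, 0x7C001828):
        print("%08x  %s" % (w, describe(w)))
```

So the word at the reported NIP really is the barrier, and the load-reserve through r3 is the very next instruction.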
Ah, I remember that one now :-). The futex_init function tests whether
cmpxchg works on a NULL address, and that's why R3 is 0. It's actually
_supposed_ to fault here. But something gets messed up when the fault is
reported with IP=lwsync instead of IP=lwarx, and I haven't really looked
into why.
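As a sanity check that the data fault belongs to lwarx and not to lwsync, the effective address of "lwarx r0,0,r3" can be worked out from the register dump and compared with DAR. This is just a sketch of the arithmetic under the usual indexed-addressing rule (base is 0 when RA is 0, otherwise GPR[RA], plus GPR[RB]); the function name indexed_ea is hypothetical and the values are copied from the GPR00 line and the DAR field of the oops above.

```python
# GPR00..GPR03 as printed in the oops / TB register dump.
GPR = [
    0xFFFFFFFFFFFFFFFF,
    0xC0000000061AFD90,
    0xC0000000009BBCE8,
    0x0000000000000000,
]

# DAR (faulting data address) as printed in the oops.
DAR = 0x0000000000000000


def indexed_ea(gpr, ra, rb):
    """Effective address for an indexed access such as lwarx RT,RA,RB."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & 0xFFFFFFFFFFFFFFFF


if __name__ == "__main__":
    ea = indexed_ea(GPR, ra=0, rb=3)   # lwarx r0,0,r3
    print("EA = 0x%016x, DAR = 0x%016x" % (ea, DAR))
```

The computed address matches DAR, which fits the NULL test in futex_init: the access that faults is the lwarx, only the reported instruction pointer is one instruction too early.
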
Alex