From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:44809)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1ZG60Y-0005gj-Hi for qemu-devel@nongnu.org; Fri, 17 Jul 2015 09:48:59 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1ZG60V-0001mb-C2 for qemu-devel@nongnu.org;
 Fri, 17 Jul 2015 09:48:58 -0400
Received: from foss.arm.com ([217.140.101.70]:39770)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1ZG60V-0001m6-4C for qemu-devel@nongnu.org; Fri, 17 Jul 2015 09:48:55 -0400
Date: Fri, 17 Jul 2015 14:48:40 +0100
From: Marc Zyngier
Message-ID: <20150717144840.6dfafc3f@arm.com>
In-Reply-To: <55A905AB.4000603@redhat.com>
References: <1437040609-9878-1-git-send-email-pbonzini@redhat.com>
 <20150716190546.GI29283@redhat.com>
 <55A8883D.1010207@redhat.com>
 <55A8CB3E.5090404@redhat.com>
 <20150717142826.6da03734@arm.com>
 <55A905AB.4000603@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead,
 all hail ctx->notify_me
To: Laszlo Ersek
Cc: "kwolf@redhat.com", Paolo Bonzini, "Richard W.M. Jones",
 "stefanha@redhat.com", "qemu-devel@nongnu.org"

On Fri, 17 Jul 2015 14:39:55 +0100
Laszlo Ersek wrote:

> On 07/17/15 15:28, Marc Zyngier wrote:
> > On Fri, 17 Jul 2015 10:30:38 +0100
> > Paolo Bonzini wrote:
> >
> >>
> >>
> >> On 17/07/2015 06:44, Paolo Bonzini wrote:
> >>>
> >>>
> >>> On 16/07/2015 21:05, Richard W.M. Jones wrote:
> >>>>
> >>>> Sorry to spoil things, but I'm still seeing this bug, although it is
> >>>> now a lot less frequent with your patch. I would estimate it happens
> >>>> more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs
> >>>> with qemu.git + the v2 patch series.
> >>>>
> >>>> It's the exact same hang in both cases.
> >>>>
> >>>> Is it possible that this patch doesn't completely close the race?
> >>>>
> >>>> Still, it is an improvement, so there is that.
> >>>
> >>> It would seem at first glance like a different bug.
> >>>
> >>> Interestingly, adding some "tracing" (qemu_clock_get_ns) makes the bug
> >>> more likely: now it reproduces in about 10 tries. Of course :) adding
> >>> other kinds of tracing instead makes it go away again (>50 tries).
> >>>
> >>> Perhaps this:
> >>>
> >>>    i/o thread         vcpu thread                   worker thread
> >>>    ---------------------------------------------------------------------
> >>>    lock_iothread
> >>>    notify_me = 1
> >>>    ...
> >>>    unlock_iothread
> >>>                       lock_iothread
> >>>                       notify_me = 3
> >>>                       ppoll
> >>>                       notify_me = 1
> >>>                                                     bh->scheduled = 1
> >>>                                                     event_notifier_set
> >>>                       event_notifier_test_and_clear
> >>>                       ppoll
> >>>                       ^^ hang
> >>>
> >>> In the exact shape above, it doesn't seem too likely to happen, but
> >>> perhaps there's another simpler case. Still, the bug exists.
> >>>
> >>> The above is not really related to notify_me. Here the notification is
> >>> not being optimized away! So I wonder if this one has been there forever.
> >>>
> >>> Fam suggested putting the event_notifier_test_and_clear before
> >>> aio_bh_poll(), but it does not work. I'll look more closely.
> >>>
> >>> However, an unconditional event_notifier_test_and_clear is pretty
> >>> expensive. On one hand, obviously correctness comes first. On the
> >>> other hand, an expensive operation at the wrong place can mask the race
> >>> very easily; I'll let the fix run for a while, but I'm not sure if a
> >>> successful test really says anything useful.
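The interleaving in the diagram can be reproduced outside QEMU. Below is a
self-contained toy in C that mimics the names from the diagram with a bare
eventfd (an illustration of the lost-wakeup shape only, not the actual
qemu.git code); it hangs in the final poll() exactly like the "^^ hang"
row:

    /* lost_wakeup.c - toy reproduction of the interleaving above.
     * Mimics bh->scheduled / event_notifier_set /
     * event_notifier_test_and_clear with a bare eventfd; an
     * illustration only, not the actual qemu.git code.
     * Build: gcc -pthread lost_wakeup.c */
    #include <poll.h>
    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    static int notifier;              /* stands in for ctx->notifier */
    static volatile int bh_scheduled; /* stands in for bh->scheduled */

    static void event_notifier_set(void)
    {
        uint64_t one = 1;
        (void)write(notifier, &one, sizeof(one)); /* wakes a sleeping poll() */
    }

    static void event_notifier_test_and_clear(void)
    {
        uint64_t cnt;
        (void)read(notifier, &cnt, sizeof(cnt));  /* non-blocking drain */
    }

    static void *worker_thread(void *arg)
    {
        (void)arg;
        sleep(1);              /* land after the bh scan, before the clear */
        bh_scheduled = 1;      /* "bh->scheduled = 1"  */
        event_notifier_set();  /* "event_notifier_set" */
        return NULL;
    }

    int main(void)
    {
        pthread_t worker;
        notifier = eventfd(0, EFD_NONBLOCK);
        pthread_create(&worker, NULL, worker_thread, NULL);

        /* the vcpu thread column of the diagram: */
        if (bh_scheduled) {       /* the bh scan runs too early:     */
            puts("bh runs");      /* sees 0 here, so it does nothing */
        }
        sleep(2);                 /* the worker fires in this window */
        event_notifier_test_and_clear();  /* eats the worker's wakeup */

        struct pollfd pfd = { .fd = notifier, .events = POLLIN };
        poll(&pfd, 1, -1);        /* "ppoll  ^^ hang": blocks forever,
                                   * even though bh_scheduled == 1   */
        puts("never reached");
        return 0;
    }

The point of the toy is the window between scanning bh_scheduled and
draining the notifier: a notification that lands in that window is
consumed without the scheduled work ever being looked at again.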
> >>
> >> So it may not be useful, but still, a successful test is a successful
> >> test. :) The following patch, which also includes the delta between v2
> >> and v3 of this series, survived 674 reboots before hitting a definitely
> >> unrelated problem:
> >>
> >> error: kvm run failed Function not implemented
> >> PC=00000000bf671210 SP=00000000c00001f0
> >> X00=000000000a003e70 X01=0000000000000000 X02=00000000bf680548 X03=0000000000000030
> >> X04=00000000bbb5fc18 X05=00000000004b7770 X06=00000000bf721930 X07=000000000000009a
> >> X08=00000000bf716858 X09=0000000000000090 X10=0000000000000000 X11=0000000000000046
> >> X12=00000000a007e03a X13=0000000000000000 X14=0000000000000000 X15=0000000000000000
> >> X16=00000000bf716df0 X17=0000000000000000 X18=0000000000000000 X19=00000000bf6f5f18
> >> X20=0000000000000000 X21=0000000000000000 X22=0000000000000000 X23=0000000000000000
> >> X24=0000000000000000 X25=0000000000000000 X26=0000000000000000 X27=0000000000000000
> >> X28=0000000000000000 X29=0000000000000000 X30=0000000000000000 PSTATE=60000305 (flags -ZC-)
> >>
> >> For the record, this is the kvm_run struct:
> >>
> >> $6 = {request_interrupt_window = 0 '\000', padding1 = "\000\000\000\000\000\000", exit_reason = 0,
> >>   ready_for_interrupt_injection = 0 '\000', if_flag = 0 '\000', flags = 0, cr8 = 0, apic_base = 0, {hw = {
> >>       hardware_exit_reason = 150994968}, fail_entry = {hardware_entry_failure_reason = 150994968}, ex = {
> >>       exception = 150994968, error_code = 0}, io = {direction = 24 '\030', size = 0 '\000', port = 2304,
> >>       count = 0, data_offset = 144}, debug = {arch = {}}, mmio = {phys_addr = 150994968,
> >>       data = "\220\000\000\000\000\000\000", len = 4, is_write = 0 '\000'}, hypercall = {nr = 150994968,
> >>       args = {144, 4, 0, 0, 0, 0}, ret = 0, longmode = 0, pad = 0}, tpr_access = {rip = 150994968,
> >>       is_write = 144, pad = 0}, s390_sieic = {icptcode = 24 '\030', ipa = 2304, ipb = 0},
> >>     s390_reset_flags = 150994968, s390_ucontrol = {trans_exc_code = 150994968, pgm_code = 144}, dcr = {
> >>       dcrn = 150994968, data = 0, is_write = 144 '\220'}, internal = {suberror = 150994968, ndata = 0,
> >>       data = {144, 4, 0 }}, osi = {gprs = {150994968, 144, 4, 0 }},
> >>     papr_hcall = {nr = 150994968, ret = 144, args = {4, 0, 0, 0, 0, 0, 0, 0, 0}}, s390_tsch = {
> >>       subchannel_id = 24, subchannel_nr = 2304, io_int_parm = 0, io_int_word = 144, ipb = 0,
> >>       dequeued = 4 '\004'}, epr = {epr = 150994968}, system_event = {type = 150994968, flags = 144},
> >>     s390_stsi = {addr = 150994968, ar = 144 '\220', reserved = 0 '\000', fc = 0 '\000', sel1 = 0 '\000',
> >>       sel2 = 0},
> >>     padding = "\030\000\000\t\000\000\000\000\220\000\000\000\000\000\000\000\004", '\000' }, kvm_valid_regs = 0, kvm_dirty_regs = 0, s = {regs = {},
> >>     padding = '\000' }}
> >>
> >> Marc, does it ring a bell?
> >
> > Well, this is an example of a guest accessing non-memory using an
> > instruction that we cannot safely emulate - not an I/O accessor (a
> > load multiple, for example).
> >
> > In this case, we kill the guest (we could just as well inject a fault).
> >
> > This vcpu seems to be accessing 0x9000018 (the mmio structure points
> > there), but I can't immediately relate it to the content of the
> > registers.
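All the members of the union in that dump overlay the same bytes, so the
repeated 150994968 is a single value seen through different views; read
through the mmio member it is the faulting physical address Marc mentions.
A tiny standalone C check, with the constants copied from the dump:

    #include <stdio.h>

    int main(void)
    {
        /* constants copied from the kvm_run dump above */
        unsigned long phys_addr = 150994968;  /* mmio.phys_addr          */
        int len = 4, is_write = 0;            /* mmio.len, mmio.is_write */

        printf("phys_addr = 0x%lx\n", phys_addr);  /* prints 0x9000018 */
        printf("%d-byte %s\n", len, is_write ? "write" : "read");
        return 0;
    }

This prints "phys_addr = 0x9000018" and "4-byte read": the guest read four
bytes at 0x09000018.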
>
>     [VIRT_UART] = { 0x09000000, 0x00001000 },

Still: there is nothing in the registers that remotely points to that
area. X0 is the closest, but it'd take a big negative offset to get
there. Is that a Linux kernel, or something else?

	M.

-- 
Jazz is not dead. It just smells funny.
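For reference, the arithmetic behind those two observations, as a small
standalone C check (the window comes from the quoted [VIRT_UART] entry,
X0 from the register dump earlier in the thread):

    #include <stdio.h>

    int main(void)
    {
        unsigned long fault = 0x09000018; /* mmio.phys_addr from the dump */
        unsigned long base  = 0x09000000; /* [VIRT_UART] base             */
        unsigned long size  = 0x00001000; /* [VIRT_UART] size             */
        unsigned long x0    = 0x0a003e70; /* X00 from the register dump   */

        /* the fault lands inside the UART window... */
        printf("in UART window: %d\n", fault >= base && fault < base + size);
        /* ...but reaching it from X0 needs a large negative offset */
        printf("fault - X0 = %ld\n", (long)(fault - x0));
        return 0;
    }

On a 64-bit build this prints "in UART window: 1" and an offset of about
-16 MB, matching both remarks: the access hits the virt board's UART
region, yet no general-purpose register in the dump sits anywhere near it.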