From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6A593C32792 for ; Mon, 30 Sep 2019 16:39:11 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 35EE02168B for ; Mon, 30 Sep 2019 16:39:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="fetE1U2K" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 35EE02168B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:54970 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iEyha-0007C9-Ad for qemu-devel@archiver.kernel.org; Mon, 30 Sep 2019 12:39:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42875) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1iEygn-0006mv-Md for qemu-devel@nongnu.org; Mon, 30 Sep 2019 12:38:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1iEygl-0006f3-Kp for qemu-devel@nongnu.org; Mon, 30 Sep 2019 12:38:21 -0400 Received: from mail-wm1-x344.google.com ([2a00:1450:4864:20::344]:33179) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1iEygl-0006eX-CE for qemu-devel@nongnu.org; Mon, 30 Sep 2019 12:38:19 -0400 Received: by mail-wm1-x344.google.com with SMTP id r17so442011wme.0 for ; Mon, 30 Sep 2019 09:38:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=references:user-agent:from:to:cc:subject:in-reply-to:date :message-id:mime-version:content-transfer-encoding; bh=MqcDsqob8zdpSQ9O51HTy+pHc2xxQyUuSFnvAW8fGV0=; b=fetE1U2KlH19vBbBpv5sF9P1BNWRIFG/tj0f2SN866o5MBap/it29apegYEZJG/y0u G9DLDM4rHVmUlhjP+Gz7ZaOUbxaaw9OmII/7YjV3Qj46KezaQ6JNVN4c7R2Qp8WJfrvd fC7qqHj1vFk3r+kdm+zIwGmVJ2b83NexNlsxNbc4DzHd6rX8smvrw5Ewbxi5FTtWI8uz fNP+cjQgdIRHJJuUycS3Htc/8nhnl8Fj4GjjFbCzvcieNM0XT4KVbDUcK2bfkoKQuQpx pRl7xdhIQp5J+2sAl0H9tNmiNq08+PtyouEaJWI09AkAPyVvtRIiquS6iwM9z5PERFd+ AAAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:references:user-agent:from:to:cc:subject :in-reply-to:date:message-id:mime-version:content-transfer-encoding; bh=MqcDsqob8zdpSQ9O51HTy+pHc2xxQyUuSFnvAW8fGV0=; b=Pe4rKWC4T54VInkokAUgdGzNOyFd/RjmOI8UJmAJGjltJlJIxgmgKrdEWcWDkxrNPF c6iMRFTqK0BEjkbTe6uK9Xx7FrPKmOkFa9yFNPhkGd5hp1Tk0551kFFamr+r1zXtWFRy sbZXp8vgGJ5N/cA6uvjSeTg5B/ERuujutIP8f9QsJ/HfkqExnuWShEr3rDhJAKMYLP82 2204hTh1CEE+lq7QQ6AENjh7OzLxjk3+kjAK9hd1lGtYJfOD68TiuR1ZalIZ3qlCJYyH KA7jXfpeKNAAXexRC0y6Uo3d22jv34LtThSLBcqkuTowSeQ21fcvv7jW0t9Z2Gxbw1eF YD2Q== X-Gm-Message-State: APjAAAU/ls3zKKrW9B5nT4MaeKfxlDFXCw1jOe72blWLMybTmS20s6X3 mAsCLBTwCgKld+N5kIYWmuJkPw== X-Google-Smtp-Source: APXvYqzLUl/oZNxIMGof6pbHOs0K9ieW4bc2YopLxKzwC0NvcHkKHAvJLLMlo0qZmenI/CqsJ3E6cg== X-Received: by 2002:a1c:1bcf:: with SMTP id b198mr128630wmb.0.1569861497693; Mon, 30 Sep 2019 09:38:17 -0700 (PDT) Received: from zen.linaroharston ([51.148.130.216]) by smtp.gmail.com with ESMTPSA id q19sm33426874wra.89.2019.09.30.09.38.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Sep 2019 09:38:16 -0700 (PDT) Received: from zen (localhost [127.0.0.1]) by zen.linaroharston (Postfix) with ESMTP id 1946F1FF87; Mon, 30 Sep 2019 17:38:16 +0100 (BST) References: User-agent: mu4e 1.3.4; emacs 27.0.50 From: Alex =?utf-8?Q?Benn=C3=A9e?= To: Peter Maydell Subject: Re: Lockup with --accel tcg,thread=single In-reply-to: Date: Mon, 30 Sep 2019 17:38:15 +0100 Message-ID: <87h84tloy0.fsf@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2a00:1450:4864:20::344 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Paolo Bonzini , Doug Gale , qemu-devel Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Peter Maydell writes: > On Mon, 30 Sep 2019 at 14:17, Doug Gale wrote: >> >> I found a lockup in single threaded TCG, with OVMF bios, 16 CPUs. >> >> qemu-system-x86_64 --accel tcg,thread=3Dsingle -smp cpus=3D16 -bios >> /usr/share/ovmf/OVMF.fd >> >> Using Ubuntu 18.04 LTS, default gnome desktop. There is no guest OS, >> there is no hard drive, just the OVMF firmware locks it up. (I >> narrowed it down to this from a much larger repro) > >> Peter Maydell helped me bisect it in IRC. >> Works fine at commit 1e8a98b53867f61 >> Fails at commit 9458a9a1df1a4c7 >> >> MTTCG works fine all the way up to master. > > Thanks for this bug report. I've reproduced it and think > I have figured out what is going on here. > > Commit 9458a9a1df1a4c719e245 is Paolo's commit that fixes the > TCG-vs-dirty-bitmap race by having the thread which is > doing a memory access wait for the vCPU thread to finish > its current TB using a no-op run_on_cpu() operation. > > In the case of the hang the thread doing the memory access > is the iothread, like this: > > #14 0x000055c150c0a98c in run_on_cpu (cpu=3D0x55c153801c60, > func=3D0x55c150bbb542 , data=3D...) > at /home/petmay01/linaro/qemu-from-laptop/qemu/cpus.c:1205 > #15 0x000055c150bbb58c in tcg_log_global_after_sync > (listener=3D0x55c1538410c8) at > /home/petmay01/linaro/qemu-from-laptop/qemu/exec.c:2963 > #16 0x000055c150c1fe18 in memory_global_after_dirty_log_sync () at > /home/petmay01/linaro/qemu-from-laptop/qemu/memory.c:2598 > #17 0x000055c150c1e6b8 in memory_region_snapshot_and_clear_dirty > (mr=3D0x55c1541e82b0, addr=3D0, size=3D1920000, client=3D0) > at /home/petmay01/linaro/qemu-from-laptop/qemu/memory.c:2106 > #18 0x000055c150c76c05 in vga_draw_graphic (s=3D0x55c1541e82a0, full_upda= te=3D0) > at /home/petmay01/linaro/qemu-from-laptop/qemu/hw/display/vga.c:1661 > #19 0x000055c150c771c4 in vga_update_display (opaque=3D0x55c1541e82a0) > at /home/petmay01/linaro/qemu-from-laptop/qemu/hw/display/vga.c:1785 > #20 0x000055c151052a83 in graphic_hw_update (con=3D0x55c1536dfaa0) at > /home/petmay01/linaro/qemu-from-laptop/qemu/ui/console.c:268 > #21 0x000055c151091490 in gd_refresh (dcl=3D0x55c1549af090) at > /home/petmay01/linaro/qemu-from-laptop/qemu/ui/gtk.c:484 > #22 0x000055c151056571 in dpy_refresh (s=3D0x55c1542f9d90) at > /home/petmay01/linaro/qemu-from-laptop/qemu/ui/console.c:1629 > #23 0x000055c1510527f0 in gui_update (opaque=3D0x55c1542f9d90) at > /home/petmay01/linaro/qemu-from-laptop/qemu/ui/console.c:206 > #24 0x000055c1511ee67c in timerlist_run_timers > (timer_list=3D0x55c15370c280) at > /home/petmay01/linaro/qemu-from-laptop/qemu/util/qemu-timer.c:592 > #25 0x000055c1511ee726 in qemu_clock_run_timers > (type=3DQEMU_CLOCK_REALTIME) at > /home/petmay01/linaro/qemu-from-laptop/qemu/util/qemu-timer.c:606 > #26 0x000055c1511ee9e5 in qemu_clock_run_all_timers () at > /home/petmay01/linaro/qemu-from-laptop/qemu/util/qemu-timer.c:692 > #27 0x000055c1511ef181 in main_loop_wait (nonblocking=3D0) at > /home/petmay01/linaro/qemu-from-laptop/qemu/util/main-loop.c:524 > #28 0x000055c150db03fe in main_loop () at > /home/petmay01/linaro/qemu-from-laptop/qemu/vl.c:1806 > > run_on_cpu() adds the do_nothing function to a cpu queued-work list. > > However, the single-threaded TCG runloop qemu_tcg_rr_cpu_thread_fn() > has the basic structure: > > while (1) { > for (each vCPU) { > run this vCPU for a timeslice; > } > qemu_tcg_rr_wait_io_event(); > } > > and it processes cpu work queues only in qemu_tcg_rr_wait_io_event(). > This is a problem, because the thing which causes us to stop running > one vCPU when its timeslice ends and move on to the next is the > tcg_kick_vcpu_timer -- and the iothread will never process that timer > and kick the vcpu because it is currently blocked in run_on_cpu() ! > > Not sure currently what the best fix is here. We seem to be repeating ourselves because: a8efa60633575a2ee4dbf807a71cb44d44b0e0f8 Author: Paolo Bonzini AuthorDate: Wed Nov 14 12:36:57 2018 +0100 cpus: run work items for all vCPUs if single-threaded However looking at the commit I can still see we have the problem of not advancing to the next vCPU if the kick timer (or some other event) doesn't bring the execution to an exit. I suspect you could get this in Linux but it's probably sufficiently busy to ensure vCPUs are always exiting for some reason or another. So options I can think of so far are: 1. Kick 'em all when not inter-vCPU Something like this untested patch... --8<---------------cut here---------------start------------->8--- 1 file changed, 17 insertions(+), 5 deletions(-) cpus.c | 22 +++++++++++++++++----- modified cpus.c @@ -949,8 +949,8 @@ static inline int64_t qemu_tcg_next_kick(void) return qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + TCG_KICK_PERIOD; } -/* Kick the currently round-robin scheduled vCPU */ -static void qemu_cpu_kick_rr_cpu(void) +/* Kick the currently round-robin scheduled vCPU to next */ +static void qemu_cpu_kick_rr_next_cpu(void) { CPUState *cpu; do { @@ -961,6 +961,16 @@ static void qemu_cpu_kick_rr_cpu(void) } while (cpu !=3D atomic_mb_read(&tcg_current_rr_cpu)); } +/* Kick all RR vCPUs */ +static void qemu_cpu_kick_rr_cpus(void) +{ + CPUState *cpu; + + CPU_FOREACH(cpu) { + cpu_exit(cpu); + }; +} + static void do_nothing(CPUState *cpu, run_on_cpu_data unused) { } @@ -993,7 +1003,7 @@ void qemu_timer_notify_cb(void *opaque, QEMUClockType = type) static void kick_tcg_thread(void *opaque) { timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick()); - qemu_cpu_kick_rr_cpu(); + qemu_cpu_kick_rr_next_cpu(); } static void start_tcg_kick_timer(void) @@ -1828,9 +1838,11 @@ void qemu_cpu_kick(CPUState *cpu) { qemu_cond_broadcast(cpu->halt_cond); if (tcg_enabled()) { + if (qemu_tcg_mttcg_enabled()) { cpu_exit(cpu); - /* NOP unless doing single-thread RR */ - qemu_cpu_kick_rr_cpu(); + } else { + qemu_cpu_kick_rr_cpus(); + } } else { if (hax_enabled()) { /* --8<---------------cut here---------------end--------------->8--- 2. Add handling of kicking all VCPUs to do_run_on_cpu when current_cpu =3D=3D NULL Which would basically kick all vCPUs when events come from outside the emulation. 3. Figure out multi-threaded icount and record/replay and drop the special RR case. This might take a while. > Side note -- this use of run_on_cpu() means that we now drop > the iothread lock within memory_region_snapshot_and_clear_dirty(), > which we didn't before. This means that a vCPU thread can now > get in and execute an access to the device registers of > hw/display/vga.c, updating its state in a way I suspect that the > device model code is not expecting... So maybe the right answer > here should be to come up with a fix for the race that 9458a9a1 > addresses that doesn't require us to drop the iothread lock in > memory_region_snapshot_and_clear_dirty() ? Alternatively we need > to audit the callers and flag in the documentation that this > function has the unexpected side effect of briefly dropping the > iothread lock. There was a series Emilio posted to get rid of more of the BQL in place of per-CPU locks which IIRC also stopped some of the bouncing we do in the *_on_cpu functions. Each time we have to do a lock shuffle to get things moving is a bit of a red flag. > > thanks > -- PMM -- Alex Benn=C3=A9e