qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: "Laurent Vivier" <lvivier@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Thomas Huth" <thuth@redhat.com>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	"QEMU Developers" <qemu-devel@nongnu.org>,
	"qemu-ppc@nongnu.org" <qemu-ppc@nongnu.org>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [PATCH] cpu: Add starts_halted() method
Date: Sat, 11 Jul 2020 18:55:42 +0100	[thread overview]
Message-ID: <87o8omq6ld.fsf@linaro.org> (raw)
In-Reply-To: <87imev15xt.fsf@morokweng.localdomain>


Thiago Jung Bauermann <bauerman@linux.ibm.com> writes:

> Alex Bennée <alex.bennee@linaro.org> writes:
>
>> Thiago Jung Bauermann <bauerman@linux.ibm.com> writes:
>>
>>> Eduardo Habkost <ehabkost@redhat.com> writes:
>>>
>>>> On Wed, Jul 08, 2020 at 09:11:55PM +0100, Peter Maydell wrote:
>>>>> On Wed, 8 Jul 2020 at 18:36, Eduardo Habkost <ehabkost@redhat.com> wrote:
>>>>> >
>>>>> > On Wed, Jul 08, 2020 at 06:09:49PM +0100, Peter Maydell wrote:
>>>>> > > Exactly. It appears that there's a bug in our mechanisms,
>>>>> > > which is why I'm suggesting that the right thing is
>>>>> > > to fix that bug rather than marking the CPU as halted
>>>>> > > earlier in the reset process so that the KVM_RUN happens
>>>>> > > to do nothing...
>>>>> >
>>>>> > I agree this is necessary, but it doesn't seem sufficient.
>>>>> >
>>>>> > Having cpu_reset() set halted=0 on spapr (and probably other
>>>>> > machines) is also a bug, as it could still trigger unwanted
>>>>> > KVM_RUN when cpu_reset() returns (and before machine code sets
>>>>> > halted=1).
>>>>>
>>>>> The Arm handling of starting-halted sets halted=1 within cpu_reset,
>>>>> based on whether the CPU object was created with a
>>>>> "start-powered-off" property.
>>>>
>>>> Making this mechanism generic sounds like a good idea.
>>>
>>> I'll take a stab at doing that and using it for the spapr machine.
>>>
>>>>> I'm not sure in practice that anything can get in asynchronously
>>>>> and cause a KVM_RUN in between spapr_reset_vcpu() calling
>>>>> cpu_reset() and it setting cs->halted (and the other stuff),
>>>>> though. This function ought to be called with the iothread
>>>>> lock held, so KVM_RUN will only happen if it calls some
>>>>> other function which incorrectly lets the CPU run.
>>>>
>>>> Yeah, maybe it won't happen in practice.  It just seems fragile.
>>>> The same way ppc_cpu_reset() kicked the CPU by accident, code
>>>> outside cpu_reset() might one day kick the CPU by accident before
>>>> setting halted=1.
>>>
>>> I'm seeing the vcpu being KVM_RUN'd too early twice during hotplug.
>>> Both of them are before cpu_reset() and ppc_cpu_reset().
>>>
>>> Here's the backtrace for the first of them (redacted for clarity):
>>>
>>> #0  in cpu_resume ()
>>> #1  in cpu_common_realizefn ()
>>> #2  in ppc_cpu_realize ()
>>> #3  in device_set_realized ()
>>> #4  in property_set_bool ()
>>> #5  in object_property_set ()
>>> #6  in object_property_set_qobject ()
>>> #7  in object_property_set_bool ()
>>> #8  in qdev_realize ()
>> <snip>
>>> #18 in qmp_device_add ()
>>
>> Is this a hotplug event?
>
> Yes, the way I reproduce the problem is starting a pseries guest with
> `-smp 2,maxcpus=32,sockets=1,cores=16,threads=2` and then use qmp-shell to
> send the command:
>
> device_add id=device-2 driver=host-spapr-cpu-core core-id=2 node-id=0
>
>>> Here's the second:
>>>
>>> #0  in qemu_cpu_kick_thread ()
>>> #1  in qemu_cpu_kick ()
>>> #2  in queue_work_on_cpu ()
>>> #3  in async_run_on_cpu ()
>>> #4  in tlb_flush_by_mmuidx ()
>>> #5  in tlb_flush ()
>>> #6  in ppc_tlb_invalidate_all ()
>>
>> FWIW tcg_flush_softmmu_tlb handles a tlb_flush in the common reset code.
>
> Ok, maybe KVM should be doing that too? Or maybe it does but pseries
> isn't relying on it. I'll dig further.

No tlb flush is a softmmu only thing.


-- 
Alex Bennée


  reply	other threads:[~2020-07-11 17:56 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-07 20:43 [PATCH] cpu: Add starts_halted() method Thiago Jung Bauermann
2020-07-07 21:49 ` Eduardo Habkost
2020-07-07 23:28   ` Thiago Jung Bauermann
2020-07-08  8:38     ` Philippe Mathieu-Daudé
2020-07-08 10:00       ` David Gibson
2020-07-08 13:14         ` Peter Maydell
2020-07-08 15:25           ` Eduardo Habkost
2020-07-08 15:32             ` Peter Maydell
2020-07-08 16:03               ` Eduardo Habkost
2020-07-08 17:09                 ` Peter Maydell
2020-07-08 17:36                   ` Eduardo Habkost
2020-07-08 20:11                     ` Peter Maydell
2020-07-08 21:32                       ` Eduardo Habkost
2020-07-09  3:05                         ` Thiago Jung Bauermann
2020-07-09  3:26                           ` Thiago Jung Bauermann
2020-07-09 10:24                             ` Philippe Mathieu-Daudé
2020-07-10 20:02                               ` Thiago Jung Bauermann
2020-07-10 20:17                                 ` Eduardo Habkost
     [not found]                           ` <87k0zdm63s.fsf@linaro.org>
2020-07-10 20:16                             ` Thiago Jung Bauermann
2020-07-11 17:55                               ` Alex Bennée [this message]
2020-07-08 16:45             ` Philippe Mathieu-Daudé
2020-07-08 21:39               ` Eduardo Habkost
2020-07-09  5:11                 ` Philippe Mathieu-Daudé
2020-07-09  9:54                   ` Greg Kurz
2020-07-09 10:18                     ` Philippe Mathieu-Daudé
2020-07-09 10:55                       ` Greg Kurz
2020-07-09 12:21                         ` Philippe Mathieu-Daudé
2020-07-09 13:13                           ` Greg Kurz
2020-07-09 13:19                             ` Philippe Mathieu-Daudé
2020-07-09 13:40                             ` Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87o8omq6ld.fsf@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=bauerman@linux.ibm.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=ehabkost@redhat.com \
    --cc=lvivier@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).