qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Phil Dennis-Jordan <phil@philjordan.eu>
To: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: qemu-devel@nongnu.org, agraf@csgraf.de, peter.maydell@linaro.org,
	 pbonzini@redhat.com, rad@semihalf.com,
	quic_llindhol@quicinc.com,  marcin.juszkiewicz@linaro.org,
	stefanha@redhat.com, mst@redhat.com,  slp@redhat.com,
	richard.henderson@linaro.org, eduardo@habkost.net,
	 marcel.apfelbaum@gmail.com, gaosong@loongson.cn,
	jiaxun.yang@flygoat.com,  chenhuacai@kernel.org,
	kwolf@redhat.com, hreitz@redhat.com, philmd@linaro.org,
	 shorne@gmail.com, palmer@dabbelt.com, alistair.francis@wdc.com,
	 bmeng.cn@gmail.com, liwei1518@gmail.com,
	dbarboza@ventanamicro.com,  zhiwei_liu@linux.alibaba.com,
	jcmvbkbc@gmail.com, marcandre.lureau@redhat.com,
	 berrange@redhat.com, qemu-arm@nongnu.org, qemu-block@nongnu.org,
	 qemu-riscv@nongnu.org, Alexander Graf <graf@amazon.com>
Subject: Re: [PATCH v4 02/15] hw/display/apple-gfx: Introduce ParavirtualizedGraphics.Framework support
Date: Mon, 28 Oct 2024 14:31:30 +0100	[thread overview]
Message-ID: <CAAibmn3HZeDeK8FrYhHa1GGwc+N8rBuB2VvMRm7LCt0mUGmsYQ@mail.gmail.com> (raw)
In-Reply-To: <CAAibmn3YEOT0O55-bwJkpi_oEGkA1WwvhC0w3jGbgXOZLTVa0w@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4954 bytes --]

On Mon, 28 Oct 2024 at 10:00, Phil Dennis-Jordan <phil@philjordan.eu> wrote:

>
> >      >
>> >      > Hmm. I think if we were to use that, we would need to create a
>> new
>> >      > QemuEvent for every job and destroy it afterward, which seems
>> >     expensive.
>> >      > We can't rule out multiple concurrent jobs being submitted, and
>> the
>> >      > QemuEvent system only supports a single producer as far as I can
>> >     tell.
>> >      >
>> >      > You can probably sort of hack around it with just one QemuEvent
>> by
>> >      > putting the qemu_event_wait into a loop and turning the job.done
>> >     flag
>> >      > into an atomic (because it would now need to be checked outside
>> the
>> >      > lock) but this all seems unnecessarily complicated considering
>> the
>> >      > QemuEvent uses the same mechanism QemuCond/QemuMutex internally
>> >     on macOS
>> >      > (the only platform relevant here), except we can use it as
>> >     intended with
>> >      > QemuCond/QemuMutex rather than having to work against the
>> >     abstraction.
>> >
>> >     I don't think it's going to be used concurrently. It would be
>> difficult
>> >     to reason even for the framework if it performs memory
>> >     unmapping/mapping/reading operations concurrently.
>> >
>> >
>> > I've just performed a very quick test by wrapping the job submission/
>> > wait in the 2 mapMemory callbacks and the 1 readMemory callback with
>> > atomic counters and logging whenever a counter went above 1.
>> >
>> >   * Overall, concurrent callbacks across all types were common (many
>> per
>> > second when the VM is busy). It's not exactly a "thundering herd" (I
>> > never saw >2) but it's probably not a bad idea to use a separate
>> > condition variable for each job type. (task map, surface map, memory
>> read)
>> >   * While I did not observe any concurrent memory mapping operations
>> > *within* a type of memory map (2 task mappings or 2 surface mappings) I
>> > did see very occasional concurrent memory *read* callbacks. These
>> would,
>> > as far as I can tell, not be safe with QemuEvents, unless we placed the
>> > event inside the job struct and init/destroyed it on every callback
>> > (which seems like excessive overhead).
>>
>> I think we can tolerate that overhead. init/destroy essentially sets the
>> fields in the data structure and I estimate its total size is about 100
>> bytes. It is probably better than waking an irrelevant thread up. I also
>> hope that keeps the code simple; it's not worthwhile adding code to
>> optimize this.
>>
>
> At least pthread_cond_{init,destroy} and pthread_mutex_{init,destroy}
> don't make any syscalls, so yeah it's probably an acceptable overhead.
>

I've just experimented with QemuEvents created on-demand and ran into some
weird deadlocks, which then made me sit down and think about it some more.
I've come to the conclusion that creating (and crucially, destroying)
QemuEvents on demand in this way is not safe.

Specifically, you must not call qemu_event_destroy() - which transitively
destroys the mutex and condition variable - unless you can guarantee that
the qemu_event_set() call on that event object has completed.

In qemu_event_set, the event object's value is atomically set to EV_SET. If
the previous value was EV_BUSY, qemu_futex_wake() is called. All of this is
outside any mutex, however, so apart from memory coherence (there are
barriers) this can race with the waiting thread. qemu_event_wait() reads
the event's value. If EV_FREE, it's atomically set to EV_BUSY. Then the
mutex is locked, the value is checked again, and if it's still EV_BUSY, it
waits for the condition variable, otherwise the mutex is immediately
unlocked again. If the trigger thread's qemu_event_set() flip to EV_SET
occurs between the waiting thread's two atomic reads of the value, the
waiting thread will never wait for the condition variable, but the trigger
thread WILL try to acquire the mutex and signal the condition variable in
qemu_futex_wake(), by which  time the waiting thread may have advanced
outside of qemu_event_wait().

This is all fine usually, BUT if you destroy the QemuEvent immediately
after the qemu_event_wait() call, qemu_futex_wake() may try to lock a mutex
that has been destroyed, or signal a condition variable which has been
destroyed. I don't see a reasonable way of making this safe other than
using long-lived mutexes and condition variables. And anyway, we have much,
MUCH bigger contention/performance issues coming from almost everything
being covered by the BQL. (If waking these callbacks can even be considered
an issue: I haven't seen it show up in profiling, whereas BQL contention
very much does.)

I'll submit v5 of this patch set with separate condition variables for each
job type. This should make the occurrence of waking the wrong thread quite
rare, while reasoning about correctness is pretty straightforward. I think
that's good enough.

[-- Attachment #2: Type: text/html, Size: 6058 bytes --]

  reply	other threads:[~2024-10-28 13:32 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-24 10:27 [PATCH v4 00/15] macOS PV Graphics and new vmapple machine type Phil Dennis-Jordan
2024-10-24 10:27 ` [PATCH v4 01/15] ui & main loop: Redesign of system-specific main thread event handling Phil Dennis-Jordan
2024-10-25  4:34   ` Akihiko Odaki
2024-10-24 10:28 ` [PATCH v4 02/15] hw/display/apple-gfx: Introduce ParavirtualizedGraphics.Framework support Phil Dennis-Jordan
2024-10-25  6:03   ` Akihiko Odaki
2024-10-25 19:43     ` Phil Dennis-Jordan
2024-10-26  4:40       ` Akihiko Odaki
2024-10-26 10:24         ` Phil Dennis-Jordan
2024-10-28  7:42           ` Akihiko Odaki
2024-10-28  9:00             ` Phil Dennis-Jordan
2024-10-28 13:31               ` Phil Dennis-Jordan [this message]
2024-10-28 14:02                 ` Akihiko Odaki
2024-10-28 14:13                   ` Phil Dennis-Jordan
2024-10-28 16:06                     ` Akihiko Odaki
2024-10-28 21:06                       ` Phil Dennis-Jordan
2024-10-29  7:42                         ` Akihiko Odaki
2024-10-29 21:16                           ` Phil Dennis-Jordan
2024-10-31  6:52                             ` Akihiko Odaki
2024-11-03 15:08                               ` Phil Dennis-Jordan
2024-10-24 10:28 ` [PATCH v4 03/15] hw/display/apple-gfx: Adds PCI implementation Phil Dennis-Jordan
2024-10-26  4:45   ` Akihiko Odaki
2024-10-24 10:28 ` [PATCH v4 04/15] hw/display/apple-gfx: Adds configurable mode list Phil Dennis-Jordan
2024-10-26  5:15   ` Akihiko Odaki
2024-10-24 10:28 ` [PATCH v4 05/15] MAINTAINERS: Add myself as maintainer for apple-gfx, reviewer for HVF Phil Dennis-Jordan
2024-11-05 15:36   ` Roman Bolshakov
2024-10-24 10:28 ` [PATCH v4 06/15] hw: Add vmapple subdir Phil Dennis-Jordan
2024-10-24 10:28 ` [PATCH v4 07/15] hw/misc/pvpanic: Add MMIO interface Phil Dennis-Jordan
2024-10-24 10:28 ` [PATCH v4 08/15] hvf: arm: Ignore writes to CNTP_CTL_EL0 Phil Dennis-Jordan
2024-10-24 10:28 ` [PATCH v4 09/15] gpex: Allow more than 4 legacy IRQs Phil Dennis-Jordan
2024-10-26  5:21   ` Akihiko Odaki
2024-10-24 10:28 ` [PATCH v4 10/15] hw/vmapple/aes: Introduce aes engine Phil Dennis-Jordan
2024-10-26  5:40   ` Akihiko Odaki
2024-10-24 10:28 ` [PATCH v4 11/15] hw/vmapple/bdif: Introduce vmapple backdoor interface Phil Dennis-Jordan
2024-10-24 10:28 ` [PATCH v4 12/15] hw/vmapple/cfg: Introduce vmapple cfg region Phil Dennis-Jordan
2024-10-26  5:48   ` Akihiko Odaki
2024-10-24 10:28 ` [PATCH v4 13/15] hw/vmapple/virtio-blk: Add support for apple virtio-blk Phil Dennis-Jordan
2024-10-26  6:02   ` Akihiko Odaki
2024-10-24 10:28 ` [PATCH v4 14/15] hw/block/virtio-blk: Replaces request free function with g_free Phil Dennis-Jordan
2024-10-26  6:03   ` Akihiko Odaki
2024-10-24 10:28 ` [PATCH v4 15/15] hw/vmapple/vmapple: Add vmapple machine type Phil Dennis-Jordan
2024-10-26  6:20   ` Akihiko Odaki
2024-10-26 11:58     ` Phil Dennis-Jordan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAAibmn3HZeDeK8FrYhHa1GGwc+N8rBuB2VvMRm7LCt0mUGmsYQ@mail.gmail.com \
    --to=phil@philjordan.eu \
    --cc=agraf@csgraf.de \
    --cc=akihiko.odaki@daynix.com \
    --cc=alistair.francis@wdc.com \
    --cc=berrange@redhat.com \
    --cc=bmeng.cn@gmail.com \
    --cc=chenhuacai@kernel.org \
    --cc=dbarboza@ventanamicro.com \
    --cc=eduardo@habkost.net \
    --cc=gaosong@loongson.cn \
    --cc=graf@amazon.com \
    --cc=hreitz@redhat.com \
    --cc=jcmvbkbc@gmail.com \
    --cc=jiaxun.yang@flygoat.com \
    --cc=kwolf@redhat.com \
    --cc=liwei1518@gmail.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=marcin.juszkiewicz@linaro.org \
    --cc=mst@redhat.com \
    --cc=palmer@dabbelt.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=philmd@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-riscv@nongnu.org \
    --cc=quic_llindhol@quicinc.com \
    --cc=rad@semihalf.com \
    --cc=richard.henderson@linaro.org \
    --cc=shorne@gmail.com \
    --cc=slp@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=zhiwei_liu@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).