From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jan Kiszka <jan.kiszka@web.de>
Subject: Re: [PATCH] kvm: First step to push iothread lock out
 of inner run loop
Date: Sat, 23 Jun 2012 11:22:07 +0200
Message-ID: <4FE58ABF.6000100@web.de>
References: <4FE4F56D.1020201@web.de> <4FE4F7F5.7030400@web.de>
	<20120623002259.GA13440@amt.cnet>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enig193A1E06AE61B909AF51AB52"
Cc: Liu Ping Fan <pingfank@linux.vnet.ibm.com>, kvm <kvm@vger.kernel.org>,
	qemu-devel <qemu-devel@nongnu.org>,
	Alexander Graf <agraf@suse.de>, Avi Kivity <avi@redhat.com>,
	Anthony Liguori <anthony@codemonkey.ws>
To: Marcelo Tosatti <mtosatti@redhat.com>
Return-path: <qemu-devel-bounces+gceq-qemu-devel=gmane.org@nongnu.org>
In-Reply-To: <20120623002259.GA13440@amt.cnet>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+gceq-qemu-devel=gmane.org@nongnu.org
Sender: qemu-devel-bounces+gceq-qemu-devel=gmane.org@nongnu.org
List-Id: kvm.vger.kernel.org

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig193A1E06AE61B909AF51AB52
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On 2012-06-23 02:22, Marcelo Tosatti wrote:
> On Sat, Jun 23, 2012 at 12:55:49AM +0200, Jan Kiszka wrote:
>> Should have declared this [RFC] in the subject and CC'ed kvm...
>>
>> On 2012-06-23 00:45, Jan Kiszka wrote:
>>> This sketches a possible path to get rid of the iothread lock on vmex=
its
>>> in KVM mode. On x86, the the in-kernel irqchips has to be used becaus=
e
>>> we otherwise need to synchronize APIC and other per-cpu state accesse=
s
>>> that could be changed concurrently. Not yet fully analyzed is the NMI=

>>> injection path in the absence of an APIC.
>>>
>>> s390x should be fine without specific locking as their pre/post-run
>>> callbacks are empty. Power requires locking for the pre-run callback.=

>>>
>>> This patch is untested, but a similar version was successfully used i=
n
>>> a x86 setup with a network I/O path that needed no central iothread
>>> locking anymore (required special MMIO exit handling).
>>> ---
>>>  kvm-all.c         |   18 ++++++++++++++++--
>>>  target-i386/kvm.c |    7 +++++++
>>>  target-ppc/kvm.c  |    4 ++++
>>>  3 files changed, 27 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kvm-all.c b/kvm-all.c
>>> index f8e4328..9c3e26f 100644
>>> --- a/kvm-all.c
>>> +++ b/kvm-all.c
>>> @@ -1460,6 +1460,8 @@ int kvm_cpu_exec(CPUArchState *env)
>>>          return EXCP_HLT;
>>>      }
>>> =20
>>> +    qemu_mutex_unlock_iothread();
>>> +
>>>      do {
>>>          if (env->kvm_vcpu_dirty) {
>>>              kvm_arch_put_registers(env, KVM_PUT_RUNTIME_STATE);
>>> @@ -1476,14 +1478,16 @@ int kvm_cpu_exec(CPUArchState *env)
>>>               */
>>>              qemu_cpu_kick_self();
>>>          }
>>> -        qemu_mutex_unlock_iothread();
>>> =20
>>>          run_ret =3D kvm_vcpu_ioctl(env, KVM_RUN, 0);
>>> =20
>>> -        qemu_mutex_lock_iothread();
>>>          kvm_arch_post_run(env, run);
>=20
> target-i386/kvm.c
>=20
> void kvm_arch_post_run(CPUX86State *env, struct kvm_run *run)
> {      =20
>     if (run->if_flag) {
>         env->eflags |=3D IF_MASK;
>     } else {
>         env->eflags &=3D ~IF_MASK;
>     }
>     cpu_set_apic_tpr(env->apic_state, run->cr8);
>     cpu_set_apic_base(env->apic_state, run->apic_base);
> }
>=20
> Clearly there is no structure to any of the writes around the writes
> in x86's kvm_arch_post_run, so it is unsafe.

Can't parse this yet.

None of the fields touched above should be modified outside of the vcpu
thread context (as long as that thread is inside the inner loop).
Therefore, it should be safe to run that functions without any lock. Am
I missing something?

>=20
> In kvm_flush_coalesced_mmio_buffer, however, the first and last pointer=
s=20
> can be read safely without the global lock (so you could move the lock
> after reading run->exit_reason, in that case).
>=20
>>> +        /* TODO: push coalesced mmio flushing to the point where we =
access
>>> +         * devices that are using it (currently VGA and E1000). */
>>> +        qemu_mutex_lock_iothread();
>>>          kvm_flush_coalesced_mmio_buffer();
>>> +        qemu_mutex_unlock_iothread();
>=20
> But you have to flush first to then figure out which device the
> coalesced mmio belongs to (don't get that comment).

kvm_flush must not be called unconditionally on vmexit, that is my
point. I'm playing with the idea to tag memory regions that require
flushing (as they are coalescing themselves or logically depend on
coalesced regions). Then we would flush in the memory layer once a read
or write is about to be performed on such a region.

BTW, two more users arrived in the meantime: the G364 framebuffer and
the i82378 PCI-ISA bridge (not sure yet what requests that bridge
coalesces, if it's only VGA, but it looks a bit fishy).

Jan


--------------enig193A1E06AE61B909AF51AB52
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk/lir8ACgkQitSsb3rl5xRjnQCgh9bUpq3j00Hn2CtHY2JTe9x/
mg8An0UdKconTZadsSW6shs6HG2+b7RU
=OIzV
-----END PGP SIGNATURE-----

--------------enig193A1E06AE61B909AF51AB52--