From mboxrd@z Thu Jan  1 00:00:00 1970
From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Subject: Re: [PATCH 1/6] kvm-s390: Fix memory slot versus run
Date: Mon, 11 May 2009 16:42:30 +0200
Message-ID: <4A083956.2000200@linux.vnet.ibm.com>
References: <1241534358-32172-1-git-send-email-ehrhardt@linux.vnet.ibm.com>    <1241534358-32172-2-git-send-email-ehrhardt@linux.vnet.ibm.com>   <4A017C2E.6060306@redhat.com> <4A08217E.6000102@linux.vnet.ibm.com>  <4A0824EF.5010201@redhat.com> <4A082C4E.60501@linux.vnet.ibm.com> <4A082FF3.4060908@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8;
	format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: kvm@vger.kernel.org,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Carsten Otte <cotte@de.ibm.com>
To: Avi Kivity <avi@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mtagate3.de.ibm.com ([195.212.29.152]:59988 "EHLO
	mtagate3.de.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755149AbZEKOmh (ORCPT <rfc822;kvm@vger.kernel.org>);
	Mon, 11 May 2009 10:42:37 -0400
Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49])
	by mtagate3.de.ibm.com (8.14.3/8.13.8) with ESMTP id n4BEgc5T171194
	for <kvm@vger.kernel.org>; Mon, 11 May 2009 14:42:38 GMT
Received: from d12av04.megacenter.de.ibm.com (d12av04.megacenter.de.ibm.com [9.149.165.229])
	by d12nrmr1607.megacenter.de.ibm.com (8.13.8/8.13.8/NCO v9.2) with ESMTP id n4BEgbge1753152
	for <kvm@vger.kernel.org>; Mon, 11 May 2009 16:42:37 +0200
Received: from d12av04.megacenter.de.ibm.com (loopback [127.0.0.1])
	by d12av04.megacenter.de.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id n4BEgbdQ028340
	for <kvm@vger.kernel.org>; Mon, 11 May 2009 16:42:37 +0200
In-Reply-To: <4A082FF3.4060908@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Avi Kivity wrote:
> Christian Ehrhardt wrote:
>>>> I thought about implementing it with slots_lock, vcpu->request, et=
c=20
>>>> but it really looks like overkill for s390.
>>>
>>> We could make (some of) it common code, so it won't look so bad. =20
>>> There's value in having all kvm ports do things similarly; though o=
f=20
>>> course we shouldn't force the solution when it isn't really needed.
>>>
>>> vcpu->requests is useful whenever we modify global VM state that=20
>>> needs to be seen by all vcpus in host mode; see =20
>>> kvm_reload_remote_mmus().
>> yeah I read that code after your first hint in that thread, and I=20
>> agree that merging some of this into common code might be good.
>> But in my opinion not now for this bugfix patch (the intention is=20
>> just to prevent a user being able to crash the host via vcpu=20
>> create,set mem& and vcpu run in that order).
>> It might be a good point to further streamline this once we use the=20
>> same userspace code, but I think it doesn't make sense yet.
>
> Sure, don't mix bugfixes with infrastructure changes, when possible.
>
>>>> At least today we can assume that we only have one memslot.=20
>>>> Therefore a set_memslot with already created vcpu's will still not=
=20
>>>> interfere with running vcpus (they can't run without memslot and=20
>>>> since we have only one they won't run).
>>>> Anyway I the code is prepared to "meet" running vcpus, because it=20
>>>> might be different in future. To prevent the livelock issue I=20
>>>> changed the code using mutex_trylock and in case I can't get the=20
>>>> lock I explicitly let the vcpu exit from guest.
>>>
>>> Why not do it unconditionally?
>>>
>> hmm I might have written that misleading - eventually it's a loop=20
>> until it got the lock
>>  while !trylock
>>    kick vcpu out of guest
>>    schedule
>>
>> There is no reason to kick out guests where I got the lock cleanly a=
s=20
>> far as I see.
>> Especially as I expect the vcpus not running in the common case as i=
=20
>> explained above (can't run without memslot + we only have one =3D> n=
o=20
>> vcpu will run).
>
> Still livelockable, unless you stop the vcpu from entering the guest=20
> immediately.
>
> That's why vcpu->requests is so powerful.  Not only you kick the vcpu=
=20
> out of guest mode, you force it to synchronize when it tries to enter=
=20
> again.
>

The bad thing on vcpu->request in that case is that I don't want the=20
async behaviour of vcpu->requests in that case, I want the memory slot=20
updated in all vcpu's when the ioctl is returning.
Looking at vcpu->request based solution I don't find the synchronizatio=
n=20
I need. The changes to  vcpu->arch.guest_origin/guest_memsize and the=20
changes to vcpu->arch.sie_block->gmsor/gmslm need to happen without the=
=20
vcpu running.
Therefor i want the vcpu lock _before_ I update the both structs,=20
otherwise it could be racy (at least on s390).

On the other hand while it is very++ unlikely to happen you are still=20
right that it could theoretically livelock there.
I might use vcpu->request in to not enter vcpu run again after such a=20
"kick" out of guest state.
It would be checked on vcpu_run enter and could then drop the lock, cal=
l=20
schedule, relock and check the flag again until it is cleared.
I'm not yet happy with this solution as I expect it to end up in=20
something like a reference count which then would not fit into the=20
existing vcpu->request flags :-/

As I mentioned above the changes to vcpu->arch and vcpu->arch->sie_bloc=
k=20
have to be exclusive with the vcpu not running.
If I would find something as "transport" for the information I have on=20
set_memory_slot (origin/size) until the next vcpu_run entry I could do=20
both changes there synchronously.
In that case I could really use your suggested solution with=20
vcpu->request, kick out unconditionally and set values on next (re-)ent=
ry.

Hmmm .. Maybe I can find all I need on reentry in vcpu->kvm->memslots[]=
=2E
If I can change it that way it will definitely require some testing.
=2E.. to be continued :-)

--=20

Gr=C3=BCsse / regards,=20
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization