From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60952)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <agraf@suse.de>) id 1Y1CcL-0000ON-VS
	for qemu-devel@nongnu.org; Wed, 17 Dec 2014 06:18:15 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <agraf@suse.de>) id 1Y1CcG-0006NG-Pd
	for qemu-devel@nongnu.org; Wed, 17 Dec 2014 06:18:09 -0500
Received: from cantor2.suse.de ([195.135.220.15]:39407 helo=mx2.suse.de)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <agraf@suse.de>) id 1Y1CcG-0006N8-Fb
	for qemu-devel@nongnu.org; Wed, 17 Dec 2014 06:18:04 -0500
Message-ID: <5491666A.7060001@suse.de>
Date: Wed, 17 Dec 2014 12:18:02 +0100
From: Alexander Graf <agraf@suse.de>
MIME-Version: 1.0
References: <1418721234-9588-1-git-send-email-fred.konrad@greensocs.com>
	<CAFEAcA-v_ObUU7aw-e3UFew-SQ=GyfY8aVhUkyJpX2w_TEB-qw@mail.gmail.com>
	<54915A76.3000408@greensocs.com> <54915AE8.3010809@suse.de>
	<DC28F09E-E330-4E78-A80B-CFDEDA643E06@greensocs.com>
	<54915EC6.2050708@suse.de>
	<8B6B4BF9-3400-4125-8571-F4EF9F12AA89@greensocs.com>
In-Reply-To: <8B6B4BF9-3400-4125-8571-F4EF9F12AA89@greensocs.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [RFC PATCH] target-arm: protect cpu_exclusive_*.
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Mark Burton <mark.burton@greensocs.com>
Cc: mttcg@listserver.greensocs.com, Peter Maydell <peter.maydell@linaro.org>, QEMU Developers <qemu-devel@nongnu.org>, Paolo Bonzini <pbonzini@redhat.com>, =?windows-1252?Q?Llu=EDs_Vilanova?= <vilanova@ac.upc.edu>, =?windows-1252?Q?KONRAD_Fr=E9d=E9ric?= <fred.konrad@greensocs.com>


On 17.12.14 12:12, Mark Burton wrote:
>
>> On 17 Dec 2014, at 11:45, Alexander Graf <agraf@suse.de> wrote:
>>
>>
>>
>> On 17.12.14 11:31, Mark Burton wrote:
>>>
>>>> On 17 Dec 2014, at 11:28, Alexander Graf <agraf@suse.de> wrote:
>>>>
>>>>
>>>>
>>>> On 17.12.14 11:27, Frederic Konrad wrote:
>>>>> On 16/12/2014 17:37, Peter Maydell wrote:
>>>>>> On 16 December 2014 at 09:13,  <fred.konrad@greensocs.com> wrote:
>>>>>>> From: KONRAD Frederic <fred.konrad@greensocs.com>
>>>>>>>
>>>>>>> This adds a lock to avoid multiple exclusive access at the same time
>>>>>>> in case of
>>>>>>> TCG multithread.
>>>>> Hi Peter,
>>>>>
>>>>>> This feels to me like it's not really possible to review on
>>>>>> its own, since you can't see how it fits into the design of
>>>>>> the rest of the multithreading support.
>>>>> true the only thing we observe is that it didn't change anything right now.
>>>>>
>>>>>> The other approach here rather than having a pile of mutexes
>>>>>> in the target-* code would be to have TCG IR support for
>>>>>> "begin critical section"/"end critical section". Then you
>>>>>> could have the main loop ensure that no other CPU is running
>>>>>> at the same time as the critical-section code. (linux-user
>>>>>> already has an ad-hoc implementation of this for the
>>>>>> exclusives.)
>>>>>>
>>>>>> -- PMM
>>>>>>
>>>>> What do you mean by TCG IR?
>>>>
>>>> TCG ops. The nice thing is that TCG could translate those into
>>>> transactions if the host supports them as well.
>>>>
>>>
>>> How's that different in reality from what we have now?
>>> Cheers
>>> Mark.
>>
>> The current code can't optimize things in TCG. There's a good chance
>> your TCG host implementation can have an optimization pass that creates
>> host cmpxchg instructions or maybe even transaction blocks out of the
>> critical sections.
>>
>>
>
>
> Ok - I get it - I see the value - so long as it's possible to do. It
> would solve a lot of problems...
>
> We were not (yet) trying to fix that, we were simply asking the
> question: if we add these mutexes, do we have any detrimental impact
> on anything?
> Seems like the answer is that adding the mutexes is fine - it doesn't
> seem to have a performance impact or anything. Good.
>
> But - I see what you mean - if we implemented this as an op, then it
> would be much simpler to optimise/fix properly afterwards - and - that
> "fix" might not even need to deal with the whole memory chain issue -
> maybe...

Yes, especially given that transactional memory is getting pretty common
these days (Haswell had it, POWER8 has it), I think it makes a lot of
sense to base the design here on that concept. It's the easiest way to
make parallel memory accesses fast without taking big locks all the
time.

So I think the best way to go forward would be to add transaction_start
and transaction_end opcodes to TCG and implement them as mutex locks
today. When you get the chance to get yourself a machine that supports
actual TM, try to replace them with transaction start/end blocks and
have the normal mutex code as fallback if the transaction fails.
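
To make that a bit more concrete, here is a rough sketch in plain C with
pthreads. It is not TCG code and none of these names exist in QEMU:
transaction_start(), transaction_end(), hw_transaction_begin(),
hw_transaction_commit() and the emulated_*_exclusive() helpers are all
made up for illustration. The hardware path is a stub that always
fails, so what actually runs is the "implement them as mutex locks
today" variant.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* One global lock standing in for "no other CPU runs the critical section". */
static pthread_mutex_t transaction_lock = PTHREAD_MUTEX_INITIALIZER;
static uint32_t guest_word;            /* pretend guest memory location */

/* Hypothetical hardware-TM hooks. The stub always reports failure, so
 * the mutex fallback is taken. A real TM path would also have to check
 * inside the transaction that the fallback lock is free. */
static bool hw_transaction_begin(void) { return false; }
static void hw_transaction_commit(void) { }

/* Returns true if running as a hardware transaction, false if locked. */
static bool transaction_start(void)
{
    if (hw_transaction_begin()) {
        return true;
    }
    pthread_mutex_lock(&transaction_lock);
    return false;
}

static void transaction_end(bool in_hw)
{
    if (in_hw) {
        hw_transaction_commit();
    } else {
        pthread_mutex_unlock(&transaction_lock);
    }
}

/* Load half of an emulated exclusive pair. */
static uint32_t emulated_load_exclusive(uint32_t *addr)
{
    bool hw = transaction_start();
    uint32_t val = *addr;
    transaction_end(hw);
    return val;
}

/* Store half: 0 on success, 1 on failure (same sense as ARM STREX). */
static int emulated_store_exclusive(uint32_t *addr, uint32_t expected,
                                    uint32_t newval)
{
    int fail = 1;
    bool hw = transaction_start();
    if (*addr == expected) {
        *addr = newval;
        fail = 0;
    }
    transaction_end(hw);
    return fail;
}

/* Each "vCPU" does 1000 atomic increments via a load/store-exclusive loop. */
static void *vcpu_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        uint32_t old;
        do {
            old = emulated_load_exclusive(&guest_word);
        } while (emulated_store_exclusive(&guest_word, old, old + 1));
    }
    return NULL;
}

/* Runs four vCPU threads and returns the final value of guest_word. */
uint32_t run_demo(void)
{
    pthread_t threads[4];
    guest_word = 0;
    for (int i = 0; i < 4; i++) {
        pthread_create(&threads[i], NULL, vcpu_thread, NULL);
    }
    for (int i = 0; i < 4; i++) {
        pthread_join(threads[i], NULL);
    }
    return guest_word;
}
```

Calling run_demo() from a main() should give 4000 if no increment is
lost. The point of routing everything through the start/end pair is
that only those two functions would need to change once a host with
real TM is available.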


Alex