From mboxrd@z Thu Jan  1 00:00:00 1970
From: David Vrabel <david.vrabel@citrix.com>
Subject: Re: [PATCHv3 3/4] xen: use ticket locks for spin locks
Date: Wed, 29 Apr 2015 18:00:00 +0100
Message-ID: <55410E10.3060200@citrix.com>
References: <1429611088-23950-1-git-send-email-david.vrabel@citrix.com>
	<1429611088-23950-4-git-send-email-david.vrabel@citrix.com>
	<20150423120302.GG10810@deinos.phlegethon.org>
	<553915BA0200007800075314@mail.emea.novell.com>
	<20150423144342.GC10824@deinos.phlegethon.org>
	<553924C20200007800075518@mail.emea.novell.com>
	<5540FA62.4090900@citrix.com>
	<20150429165643.GB6279@deinos.phlegethon.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta14.messagelabs.com ([193.109.254.103])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <david.vrabel@citrix.com>) id 1YnVLC-0000GS-KM
	for xen-devel@lists.xenproject.org; Wed, 29 Apr 2015 17:00:06 +0000
In-Reply-To: <20150429165643.GB6279@deinos.phlegethon.org>
To: Tim Deegan <tim@xen.org>
Cc: Keir Fraser <keir@xen.org>, Ian Campbell <ian.campbell@citrix.com>, Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Andrew Cooper <andrew.cooper3@citrix.com>, Jennifer Herbert <jennifer.herbert@citrix.com>, Jan Beulich <JBeulich@suse.com>, xen-devel@lists.xenproject.org
List-Id: xen-devel@lists.xenproject.org

On 29/04/15 17:56, Tim Deegan wrote:
> At 16:36 +0100 on 29 Apr (1430325362), David Vrabel wrote:
>> On 23/04/15 15:58, Jan Beulich wrote:
>>>>>> On 23.04.15 at 16:43, <tim@xen.org> wrote:
>>>> At 14:54 +0100 on 23 Apr (1429800874), Jan Beulich wrote:
>>>>>>>> On 23.04.15 at 14:03, <tim@xen.org> wrote:
>>>>>> At 11:11 +0100 on 21 Apr (1429614687), David Vrabel wrote:
>>>>>>>  void _spin_unlock(spinlock_t *lock)
>>>>>>>  {
>>>>>>> +    smp_mb();
>>>>>>>      preempt_enable();
>>>>>>>      LOCK_PROFILE_REL;
>>>>>>> -    _raw_spin_unlock(&lock->raw);
>>>>>>> +    lock->tickets.head++;
>>>>>>
>>>>>> This needs to be done with an explicit atomic (though not locked)
>>>>>> write; otherwise the compiler might use some unsuitable operation that
>>>>>> clobbers .tail as well.
>>>>>
>>>>> How do you imagine that to happen? An increment of one
>>>>> structure member surely won't modify any others.
>>>>
>>>> AIUI, the '++' could end up as a word-size read, modify, and word-size
>>>> write.  If another CPU updates .tail in parallel, that update could get
>>>> lost.
>>>
>>> Ah, right, compilers are allowed to do that, albeit normally wouldn't
>>> unless the architecture has no suitable loads/stores.
>>
>> lock->tickets.head++;
>>
>>   7b:   66 83 07 01             addw   $0x1,(%rdi)
>>
>> write_atomic(&lock->tickets.head, lock->tickets.head + 1);
>>
>>   7b:   0f b7 07                movzwl (%rdi),%eax
>>   7e:   83 c0 01                add    $0x1,%eax
>>   81:   66 89 07                mov    %ax,(%rdi)
> 
> :(
> 
>> Do you want a new add_atomic() operation? e.g.,
>>
>> #define add_atomic(ptr, inc) \
>>         asm volatile ("addw %1,%0" \
>>             : "+m" (*(ptr)) : "ri" (inc) : "memory")
>>
>> (but obviously handling all the different sizes.)
> 
> I guess so.  An equivalent 'inc' operation would be even shorter,
> but maybe GCC has its reasons for using addw + immediate?
> (Ah, it's in the optimization manual: addw $1 is preferred because it
> sets all the flags, whereas inc sets only some, so the inc has a
> dependence on the previous instruction to set flags.)
> 
> It needs some careful naming -- this series will add two
> new add operations, currently xadd() and add_atomic(), where xadd() is
> the more atomic of the two, IYSWIM.

Should I rename xadd() to arch_fetch_and_add() to match the GCC builtin
name?
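
For concreteness, a minimal sketch (not the Xen code) of how the two
operations differ.  The demo_* names and the two-field ticket layout
are made up for illustration.  The fetch-and-add is built on GCC's
__sync_fetch_and_add(), which returns the value from before the add;
the add_atomic-style helper returns nothing and is only safe when no
other CPU writes the same field, as with .head while the lock is held.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ticket pair for illustration only. */
typedef struct {
    uint16_t head;   /* ticket currently being served */
    uint16_t tail;   /* next ticket to hand out */
} demo_tickets_t;

/*
 * Fetch-and-add: a full atomic read-modify-write that returns the
 * old value.  Same semantics as the GCC builtin it wraps.
 */
static inline uint16_t demo_fetch_and_add(uint16_t *ptr, uint16_t inc)
{
    return __sync_fetch_and_add(ptr, inc);
}

/*
 * add_atomic-style helper: a 16-bit load followed by a 16-bit store.
 * The accesses are exactly field-sized, so the neighbouring field is
 * never rewritten, but the load/add/store sequence as a whole is not
 * an atomic read-modify-write.
 */
static inline void demo_add_atomic(uint16_t *ptr, uint16_t inc)
{
    uint16_t v = __atomic_load_n(ptr, __ATOMIC_RELAXED);

    __atomic_store_n(ptr, (uint16_t)(v + inc), __ATOMIC_RELAXED);
}

/* Take a ticket: many CPUs may race here, so use fetch-and-add. */
static inline uint16_t demo_take_ticket(demo_tickets_t *t)
{
    return demo_fetch_and_add(&t->tail, 1);
}

/* Release: only the lock holder writes head, so a sized add is enough. */
static inline void demo_release(demo_tickets_t *t)
{
    demo_add_atomic(&t->head, 1);
}
```

The sketch leaves out the memory barriers and the spin loop a real
lock needs; it only shows why one helper has to return the old value
and the other does not.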

David