From: Juergen Gross <jgross@suse.com>
To: Stefano Stabellini <sstabellini@kernel.org>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel@lists.xenproject.org
Subject: Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
Date: Thu, 16 Nov 2017 06:28:20 +0100
Message-ID: <5ba96f9c-6ea4-30c2-4d5a-4bf94e82517c@suse.com>
In-Reply-To: <alpine.DEB.2.10.1711151319250.12676@sstabellini-ThinkPad-X260>
On 15/11/17 22:20, Stefano Stabellini wrote:
> On Wed, 15 Nov 2017, Boris Ostrovsky wrote:
>> On 11/15/2017 02:09 PM, Stefano Stabellini wrote:
>>> On Wed, 15 Nov 2017, Juergen Gross wrote:
>>>>>>> while (mutex_is_locked(&map->active.in_mutex) ||
>>>>>>>        mutex_is_locked(&map->active.out_mutex))
>>>>>>>         cpu_relax();
>>>>>>>
>>>>>>> ?
>>>>>> I'm not convinced there isn't a race.
>>>>>>
>>>>>> In pvcalls_front_recvmsg(), sock->sk->sk_send_head is read and only
>>>>>> then is in_mutex taken. What happens if pvcalls_front_release() resets
>>>>>> sk_send_head and manages to test in_mutex before recvmsg has locked it?
>>>>>>
>>>>>> Even if this turns out to be impossible: the whole construct seems
>>>>>> rather fragile.
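
For reference, the rough shape of the two paths in question (simplified,
not verbatim from pvcalls-front.c; the error value and the exact
surrounding checks here are only illustrative):

        /* pvcalls_front_recvmsg(), roughly: */
        if (!sock->sk->sk_send_head)
                return -ENOTCONN;
        /* <-- pvcalls_front_release() can run to completion in this window */
        mutex_lock(&map->active.in_mutex);

        /* pvcalls_front_release(), roughly: */
        sock->sk->sk_send_head = NULL;
        /* ... tell the backend to release the socket ... */
        while (mutex_is_locked(&map->active.in_mutex) ||
               mutex_is_locked(&map->active.out_mutex))
                cpu_relax();
        /* map is freed afterwards, so the recvmsg above may lock freed memory */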
>>> I agree it looks fragile, and I agree that it might be best to avoid
>>> using in_mutex and out_mutex as refcounts. More comments on this
>>> below.
>>>
>>>
>>>>> I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
>>>>> not rely on mutex state.
>>>> Yes, this would work.
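
A minimal sketch of that wait, assuming pvcalls_refcount stays the
driver-wide atomic_t that every entry point bumps on entry and drops on
exit (the attached patch is authoritative, this only shows the idea):

        /* in pvcalls_front_release(), before freeing the active map: */
        sock->sk->sk_send_head = NULL;

        /*
         * Once the global refcount drops to 1 we are the only caller left
         * inside the driver, so nobody can still be touching map.
         */
        while (atomic_read(&pvcalls_refcount) > 1)
                cpu_relax();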
>>> Yes, I agree it would work and for the sake of getting something in
>>> shape for the merge window I am attaching a patch for it. Please go
>>> ahead with it. Let me know if you need anything else immediately, and
>>> I'll work on it ASAP.
>>>
>>>
>>>
>>> However, I should note that this is a pretty big hammer we are using:
>>> the refcount is global, while we only need to wait until it's only us
>>> _on this specific socket_.
>>
>> Can you explain why the socket is important?
>
> Yes, of course: there are going to be many open sockets on a given
> pvcalls connection. pvcalls_refcount is global: waiting on
> pvcalls_refcount means waiting until operations on all sockets, even
> unrelated ones, have stopped, while we only need to wait until the
> operations on the one socket we want to close have stopped.
>
>
>>>
>>> We really need a per-socket refcount. If we don't want to use the mutex
>>> internal counters, then we need another one.
>>>
>>> See the appended patch that introduces a per-socket refcount. However,
>>> for the merge window, also using pvcalls_refcount is fine.
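
Purely to illustrate the shape of a per-socket count (the field and
helper names here are made up for the sketch; the appended patch is the
real proposal):

        /* hypothetical per-socket refcount living next to the two mutexes */
        struct pvcalls_active_sketch {
                struct mutex in_mutex;
                struct mutex out_mutex;
                atomic_t users;                 /* 1 == only the socket owner */
        };

        /* taken/dropped around every recvmsg/sendmsg-style operation */
        static void pvcalls_active_get(struct pvcalls_active_sketch *a)
        {
                atomic_inc(&a->users);
        }

        static void pvcalls_active_put(struct pvcalls_active_sketch *a)
        {
                atomic_dec(&a->users);
        }

        /* release path: wait until we are the last user of *this* socket */
        static void pvcalls_active_wait_last(struct pvcalls_active_sketch *a)
        {
                while (atomic_read(&a->users) > 1)
                        cpu_relax();
        }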
>>>
>>> The race Juergen is concerned about is only theoretically possible:
>>>
>>> recvmsg:                            release:
>>>
>>> test sk_send_head                   clear sk_send_head
>>> <only few instructions>             <prepare a message>
>>> grab in_mutex                       <send a message to the server>
>>>                                     <wait for an answer>
>>>                                     test in_mutex
>>>
>>> Without kernel preemption it is not possible for release to clear
>>> sk_send_head and test in_mutex in the window after recvmsg tests
>>> sk_send_head and before recvmsg grabs in_mutex.
>>
>> Sorry, I don't follow --- what does preemption have to do with this? If
>> recvmsg and release happen on different processors, the order of
>> operations can be:
>>
>> CPU0                                CPU1
>>
>> test sk_send_head
>> <interrupt>
>>                                     clear sk_send_head
>>                                     <send a message to the server>
>>                                     <wait for an answer>
>>                                     test in_mutex
>>                                     free everything
>> grab in_mutex
>>
>> I actually think RCU should take care of all of this.
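
For reference, the generic RCU lifetime pattern this alludes to, sketched
against a hypothetical priv->map pointer (not the current pvcalls-front
structure): release() unpublishes the pointer and waits one grace period
before freeing, while readers only dereference it under rcu_read_lock().
Note that since recvmsg/sendmsg sleep on in_mutex/out_mutex they could
not stay inside the read-side section, so they would still have to pin
the map first, which brings back the per-socket refcount idea.

        struct pvcalls_priv_sketch {
                struct sock_mapping __rcu *map;         /* hypothetical field */
        };

        /* reader: may inspect the map, but must not sleep in here */
        struct sock_mapping *map;

        rcu_read_lock();
        map = rcu_dereference(priv->map);
        if (map) {
                /* e.g. take a per-socket reference before dropping the lock */
        }
        rcu_read_unlock();

        /* release: unpublish, wait for all readers, then free */
        rcu_assign_pointer(priv->map, NULL);
        synchronize_rcu();      /* every reader that could still see map is done */
        kfree(map);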
>
> Preemption could cause something very similar to happen, but your
> example is even better, because it could trigger the issue even with
> preemption disabled. I'll think more about this and submit a separate
> patch on top of the simple pvcalls_refcount patch below.
We are running as a guest. Even with interrupts off the vcpu could be
off the pcpu for several milliseconds!
Don't count on code length to avoid races!
Juergen