* [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
@ 2017-11-06 22:17 Stefano Stabellini
2017-11-06 22:34 ` Boris Ostrovsky
2017-11-07 5:44 ` Juergen Gross
0 siblings, 2 replies; 20+ messages in thread
From: Stefano Stabellini @ 2017-11-06 22:17 UTC (permalink / raw)
To: xen-devel; +Cc: jgross, boris.ostrovsky, sstabellini
mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
take in_mutex on the first try, but you can't take out_mutex. The next
time you call mutex_trylock(), in_mutex is going to fail: it's an endless
loop.
Solve the problem by moving the two mutex_trylock calls to two separate
loops.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
---
drivers/xen/pvcalls-front.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 0c1ec68..047dce7 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
* is set to NULL -- we only need to wait for the existing
* waiters to return.
*/
- while (!mutex_trylock(&map->active.in_mutex) ||
- !mutex_trylock(&map->active.out_mutex))
+ while (!mutex_trylock(&map->active.in_mutex))
+ cpu_relax();
+ while (!mutex_trylock(&map->active.out_mutex))
cpu_relax();
pvcalls_front_free_map(bedata, map);
--
1.9.1
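For context, the livelock is easy to reproduce outside the kernel. Below is a
small userspace analogue using pthreads (not the driver code itself); note
that pthread_mutex_trylock() returns 0 on success, the inverse convention of
mutex_trylock(), so the condition is written accordingly:

    /* Userspace sketch of the original livelock: once "in" is taken on the
     * first pass, every later pass fails on "in" and never releases it. */
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>

    static pthread_mutex_t in  = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t out = PTHREAD_MUTEX_INITIALIZER;

    int main(void)
    {
        int spins = 0;

        pthread_mutex_lock(&out);  /* simulate an outstanding out_mutex holder */

        /* same shape as the original condition in pvcalls_front_release() */
        while (pthread_mutex_trylock(&in) != 0 ||
               pthread_mutex_trylock(&out) != 0) {
            /* pass 1: trylock(in) succeeds, trylock(out) fails;
             * pass 2 onwards: trylock(in) fails because we already hold it */
            if (++spins > 1000) {
                printf("livelocked after %d spins\n", spins);
                return 1;
            }
            sched_yield();
        }
        return 0;
    }

With the two-loop fix, the first loop waits only for in_mutex and the second
only for out_mutex, so neither loop can starve itself on a lock it already
holds.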
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-06 22:17 [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c Stefano Stabellini
@ 2017-11-06 22:34 ` Boris Ostrovsky
2017-11-06 22:38 ` Stefano Stabellini
2017-11-07 5:44 ` Juergen Gross
1 sibling, 1 reply; 20+ messages in thread
From: Boris Ostrovsky @ 2017-11-06 22:34 UTC (permalink / raw)
To: Stefano Stabellini, xen-devel; +Cc: jgross
On 11/06/2017 05:17 PM, Stefano Stabellini wrote:
> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> take in_mutex on the first try, but you can't take out_mutex. Next times
> you call mutex_trylock() in_mutex is going to fail. It's an endless
> loop.
>
> Solve the problem by moving the two mutex_trylock calls to two separate
> loops.
>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> CC: boris.ostrovsky@oracle.com
> CC: jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
BTW, I assume that when recvmsg or sendmsg is called no other locks are
held, right? (otherwise we'd get into a deadlock.)
-boris
> ---
> drivers/xen/pvcalls-front.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index 0c1ec68..047dce7 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
> * is set to NULL -- we only need to wait for the existing
> * waiters to return.
> */
> - while (!mutex_trylock(&map->active.in_mutex) ||
> - !mutex_trylock(&map->active.out_mutex))
> + while (!mutex_trylock(&map->active.in_mutex))
> + cpu_relax();
> + while (!mutex_trylock(&map->active.out_mutex))
> cpu_relax();
>
> pvcalls_front_free_map(bedata, map);
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-06 22:34 ` Boris Ostrovsky
@ 2017-11-06 22:38 ` Stefano Stabellini
0 siblings, 0 replies; 20+ messages in thread
From: Stefano Stabellini @ 2017-11-06 22:38 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: jgross, xen-devel, Stefano Stabellini
On Mon, 6 Nov 2017, Boris Ostrovsky wrote:
> On 11/06/2017 05:17 PM, Stefano Stabellini wrote:
> > mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> > take in_mutex on the first try, but you can't take out_mutex. Next times
> > you call mutex_trylock() in_mutex is going to fail. It's an endless
> > loop.
> >
> > Solve the problem by moving the two mutex_trylock calls to two separate
> > loops.
> >
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> > CC: boris.ostrovsky@oracle.com
> > CC: jgross@suse.com
>
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>
> BTW, I assume that when recvmsg or sendmsg is called no other locks are
> held, right? (otherwise we'd get into a deadlock.)
Yes, but more importantly, the other assumption is that recvmsg and
sendmsg are already running; the partially visible comment explains it:
* Wait until there are no more waiters on the mutexes.
* We know that no new waiters can be added because sk_send_head
* is set to NULL -- we only need to wait for the existing
* waiters to return.
> > ---
> > drivers/xen/pvcalls-front.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > index 0c1ec68..047dce7 100644
> > --- a/drivers/xen/pvcalls-front.c
> > +++ b/drivers/xen/pvcalls-front.c
> > @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
> > * is set to NULL -- we only need to wait for the existing
> > * waiters to return.
> > */
> > - while (!mutex_trylock(&map->active.in_mutex) ||
> > - !mutex_trylock(&map->active.out_mutex))
> > + while (!mutex_trylock(&map->active.in_mutex))
> > + cpu_relax();
> > + while (!mutex_trylock(&map->active.out_mutex))
> > cpu_relax();
> >
> > pvcalls_front_free_map(bedata, map);
>
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-06 22:17 [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c Stefano Stabellini
2017-11-06 22:34 ` Boris Ostrovsky
@ 2017-11-07 5:44 ` Juergen Gross
2017-11-10 23:57 ` Stefano Stabellini
1 sibling, 1 reply; 20+ messages in thread
From: Juergen Gross @ 2017-11-07 5:44 UTC (permalink / raw)
To: Stefano Stabellini, xen-devel; +Cc: boris.ostrovsky
On 06/11/17 23:17, Stefano Stabellini wrote:
> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> take in_mutex on the first try, but you can't take out_mutex. Next times
> you call mutex_trylock() in_mutex is going to fail. It's an endless
> loop.
>
> Solve the problem by moving the two mutex_trylock calls to two separate
> loops.
>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> CC: boris.ostrovsky@oracle.com
> CC: jgross@suse.com
> ---
> drivers/xen/pvcalls-front.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index 0c1ec68..047dce7 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
> * is set to NULL -- we only need to wait for the existing
> * waiters to return.
> */
> - while (!mutex_trylock(&map->active.in_mutex) ||
> - !mutex_trylock(&map->active.out_mutex))
> + while (!mutex_trylock(&map->active.in_mutex))
> + cpu_relax();
> + while (!mutex_trylock(&map->active.out_mutex))
> cpu_relax();
Any reason you don't just use mutex_lock()?
Juergen
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-07 5:44 ` Juergen Gross
@ 2017-11-10 23:57 ` Stefano Stabellini
2017-11-11 2:02 ` Boris Ostrovsky
2017-11-13 15:15 ` Juergen Gross
0 siblings, 2 replies; 20+ messages in thread
From: Stefano Stabellini @ 2017-11-10 23:57 UTC (permalink / raw)
To: Juergen Gross; +Cc: xen-devel, boris.ostrovsky, Stefano Stabellini
On Tue, 7 Nov 2017, Juergen Gross wrote:
> On 06/11/17 23:17, Stefano Stabellini wrote:
> > mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> > take in_mutex on the first try, but you can't take out_mutex. Next times
> > you call mutex_trylock() in_mutex is going to fail. It's an endless
> > loop.
> >
> > Solve the problem by moving the two mutex_trylock calls to two separate
> > loops.
> >
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> > CC: boris.ostrovsky@oracle.com
> > CC: jgross@suse.com
> > ---
> > drivers/xen/pvcalls-front.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > index 0c1ec68..047dce7 100644
> > --- a/drivers/xen/pvcalls-front.c
> > +++ b/drivers/xen/pvcalls-front.c
> > @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
> > * is set to NULL -- we only need to wait for the existing
> > * waiters to return.
> > */
> > - while (!mutex_trylock(&map->active.in_mutex) ||
> > - !mutex_trylock(&map->active.out_mutex))
> > + while (!mutex_trylock(&map->active.in_mutex))
> > + cpu_relax();
> > + while (!mutex_trylock(&map->active.out_mutex))
> > cpu_relax();
>
> Any reason you don't just use mutex_lock()?
Hi Juergen, sorry for the late reply.
Yes, you are right. Given the patch, it would be just the same to use
mutex_lock.
This is where I realized that actually we have a problem: no matter if
we use mutex_lock or mutex_trylock, there are no guarantees that we'll
be the last to take the in/out_mutex. Other waiters could be still
outstanding.
We solved the same problem using a refcount in pvcalls_front_remove. In
this case, I was thinking of reusing the mutex internal counter for
efficiency, instead of adding one more refcount.
For using the mutex as a refcount, there is really no need to call
mutex_trylock or mutex_lock. I suggest checking on the mutex counter
directly:
while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
atomic_long_read(&map->active.out_mutex.owner) != 0UL)
cpu_relax();
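For readers not familiar with the mutex internals being referenced here: in
this kernel series, struct mutex keeps the holder in an atomic_long_t named
owner (the owning task_struct pointer, with a few low bits used as
waiter/handoff flags), so a value of 0 means unlocked with nothing queued.
A simplified sketch of the layout, with debug and optimistic-spin fields
omitted:

    struct mutex {
        atomic_long_t    owner;      /* task_struct pointer | low flag bits; 0 == free */
        spinlock_t       wait_lock;
        struct list_head wait_list;  /* blocked waiters */
    };

As the replies below point out, this is exactly the kind of implementation
detail that code outside the locking core should not rely on.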
Cheers,
Stefano
---
xen/pvcalls: fix potential endless loop in pvcalls-front.c
mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
take in_mutex on the first try, but you can't take out_mutex. The next
time you call mutex_trylock(), in_mutex is going to fail: it's an endless
loop.
Actually, we don't want to use mutex_trylock at all: we don't need to
take the mutex, we only need to wait until the last mutex waiter/holder
releases it.
Instead of calling mutex_trylock or mutex_lock, just check the mutex's
internal counter directly.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 0c1ec68..9f33cb8 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
* is set to NULL -- we only need to wait for the existing
* waiters to return.
*/
- while (!mutex_trylock(&map->active.in_mutex) ||
- !mutex_trylock(&map->active.out_mutex))
+ while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
+ atomic_long_read(&map->active.out_mutex.owner) != 0UL)
cpu_relax();
pvcalls_front_free_map(bedata, map);
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-10 23:57 ` Stefano Stabellini
@ 2017-11-11 2:02 ` Boris Ostrovsky
2017-11-13 18:30 ` Stefano Stabellini
2017-11-13 15:15 ` Juergen Gross
1 sibling, 1 reply; 20+ messages in thread
From: Boris Ostrovsky @ 2017-11-11 2:02 UTC (permalink / raw)
To: Stefano Stabellini, Juergen Gross; +Cc: xen-devel
On 11/10/2017 06:57 PM, Stefano Stabellini wrote:
> On Tue, 7 Nov 2017, Juergen Gross wrote:
>> On 06/11/17 23:17, Stefano Stabellini wrote:
>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>> take in_mutex on the first try, but you can't take out_mutex. Next times
>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>> loop.
>>>
>>> Solve the problem by moving the two mutex_trylock calls to two separate
>>> loops.
>>>
>>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>>> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
>>> CC: boris.ostrovsky@oracle.com
>>> CC: jgross@suse.com
>>> ---
>>> drivers/xen/pvcalls-front.c | 5 +++--
>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>> index 0c1ec68..047dce7 100644
>>> --- a/drivers/xen/pvcalls-front.c
>>> +++ b/drivers/xen/pvcalls-front.c
>>> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
>>> * is set to NULL -- we only need to wait for the existing
>>> * waiters to return.
>>> */
>>> - while (!mutex_trylock(&map->active.in_mutex) ||
>>> - !mutex_trylock(&map->active.out_mutex))
>>> + while (!mutex_trylock(&map->active.in_mutex))
>>> + cpu_relax();
>>> + while (!mutex_trylock(&map->active.out_mutex))
>>> cpu_relax();
>>
>> Any reason you don't just use mutex_lock()?
>
> Hi Juergen, sorry for the late reply.
>
> Yes, you are right. Given the patch, it would be just the same to use
> mutex_lock.
>
> This is where I realized that actually we have a problem: no matter if
> we use mutex_lock or mutex_trylock, there are no guarantees that we'll
> be the last to take the in/out_mutex. Other waiters could be still
> outstanding.
>
> We solved the same problem using a refcount in pvcalls_front_remove. In
> this case, I was thinking of reusing the mutex internal counter for
> efficiency, instead of adding one more refcount.
>
> For using the mutex as a refcount, there is really no need to call
> mutex_trylock or mutex_lock. I suggest checking on the mutex counter
> directly:
>
I think you want to say
while(mutex_locked(&map->active.in_mutex.owner) ||
mutex_locked(&map->active.out_mutex.owner))
cpu_relax();
since owner being NULL is not a documented value of a free mutex.
-boris
>
> while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> cpu_relax();
>
> Cheers,
>
> Stefano
>
>
> ---
>
> xen/pvcalls: fix potential endless loop in pvcalls-front.c
>
> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> take in_mutex on the first try, but you can't take out_mutex. Next time
> you call mutex_trylock() in_mutex is going to fail. It's an endless
> loop.
>
> Actually, we don't want to use mutex_trylock at all: we don't need to
> take the mutex, we only need to wait until the last mutex waiter/holder
> releases it.
>
> Instead of calling mutex_trylock or mutex_lock, just check on the mutex
> refcount instead.
>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> CC: boris.ostrovsky@oracle.com
> CC: jgross@suse.com
>
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index 0c1ec68..9f33cb8 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
> * is set to NULL -- we only need to wait for the existing
> * waiters to return.
> */
> - while (!mutex_trylock(&map->active.in_mutex) ||
> - !mutex_trylock(&map->active.out_mutex))
> + while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> + atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> cpu_relax();
>
> pvcalls_front_free_map(bedata, map);
>
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-10 23:57 ` Stefano Stabellini
2017-11-11 2:02 ` Boris Ostrovsky
@ 2017-11-13 15:15 ` Juergen Gross
2017-11-13 18:33 ` Stefano Stabellini
1 sibling, 1 reply; 20+ messages in thread
From: Juergen Gross @ 2017-11-13 15:15 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: xen-devel, boris.ostrovsky
On 11/11/17 00:57, Stefano Stabellini wrote:
> On Tue, 7 Nov 2017, Juergen Gross wrote:
>> On 06/11/17 23:17, Stefano Stabellini wrote:
>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>> take in_mutex on the first try, but you can't take out_mutex. Next times
>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>> loop.
>>>
>>> Solve the problem by moving the two mutex_trylock calls to two separate
>>> loops.
>>>
>>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>>> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
>>> CC: boris.ostrovsky@oracle.com
>>> CC: jgross@suse.com
>>> ---
>>> drivers/xen/pvcalls-front.c | 5 +++--
>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>> index 0c1ec68..047dce7 100644
>>> --- a/drivers/xen/pvcalls-front.c
>>> +++ b/drivers/xen/pvcalls-front.c
>>> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
>>> * is set to NULL -- we only need to wait for the existing
>>> * waiters to return.
>>> */
>>> - while (!mutex_trylock(&map->active.in_mutex) ||
>>> - !mutex_trylock(&map->active.out_mutex))
>>> + while (!mutex_trylock(&map->active.in_mutex))
>>> + cpu_relax();
>>> + while (!mutex_trylock(&map->active.out_mutex))
>>> cpu_relax();
>>
>> Any reason you don't just use mutex_lock()?
>
> Hi Juergen, sorry for the late reply.
>
> Yes, you are right. Given the patch, it would be just the same to use
> mutex_lock.
>
> This is where I realized that actually we have a problem: no matter if
> we use mutex_lock or mutex_trylock, there are no guarantees that we'll
> be the last to take the in/out_mutex. Other waiters could be still
> outstanding.
>
> We solved the same problem using a refcount in pvcalls_front_remove. In
> this case, I was thinking of reusing the mutex internal counter for
> efficiency, instead of adding one more refcount.
>
> For using the mutex as a refcount, there is really no need to call
> mutex_trylock or mutex_lock. I suggest checking on the mutex counter
> directly:
>
>
> while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> cpu_relax();
>
> Cheers,
>
> Stefano
>
>
> ---
>
> xen/pvcalls: fix potential endless loop in pvcalls-front.c
>
> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> take in_mutex on the first try, but you can't take out_mutex. Next time
> you call mutex_trylock() in_mutex is going to fail. It's an endless
> loop.
>
> Actually, we don't want to use mutex_trylock at all: we don't need to
> take the mutex, we only need to wait until the last mutex waiter/holder
> releases it.
>
> Instead of calling mutex_trylock or mutex_lock, just check on the mutex
> refcount instead.
>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> CC: boris.ostrovsky@oracle.com
> CC: jgross@suse.com
>
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index 0c1ec68..9f33cb8 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
> * is set to NULL -- we only need to wait for the existing
> * waiters to return.
> */
> - while (!mutex_trylock(&map->active.in_mutex) ||
> - !mutex_trylock(&map->active.out_mutex))
> + while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> + atomic_long_read(&map->active.out_mutex.owner) != 0UL)
I don't like this.
Can't you use a kref here? Even if it looks like more overhead it is
much cleaner. There will be no questions regarding possible races,
while an approach like yours will always smell racy (can't someone
take the mutex just after the above test?).
In no case you should make use of the mutex internals.
Juergen
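For illustration, a kref-based scheme along the lines Juergen suggests might
look like the sketch below. The "users" field, the destroy callback and the
*_sketch function names are made up for this example and are not part of the
driver; a real version would also need bedata reachable from the map so the
release callback can call pvcalls_front_free_map(), and looking up the map
from sk_send_head would still have to be made safe against release (the race
discussed further down in the thread):

    #include <linux/kref.h>
    #include <linux/mutex.h>
    #include <linux/kernel.h>

    /* minimal stand-in for the relevant parts of struct sock_mapping */
    struct sock_mapping_sketch {
        struct kref users;      /* kref_init()ed when the active socket is created */
        struct mutex in_mutex;
        struct mutex out_mutex;
    };

    static void destroy_active_sketch(struct kref *ref)
    {
        struct sock_mapping_sketch *map =
            container_of(ref, struct sock_mapping_sketch, users);

        /* last user is gone: this is where pvcalls_front_free_map()
         * would run; nothing can race with the teardown any more */
        (void)map;
    }

    static void recvmsg_sketch(struct sock_mapping_sketch *map)
    {
        kref_get(&map->users);          /* pin the map for this call */
        mutex_lock(&map->in_mutex);
        /* ... copy data from the ring ... */
        mutex_unlock(&map->in_mutex);
        kref_put(&map->users, destroy_active_sketch);
    }

    static void release_sketch(struct sock_mapping_sketch *map)
    {
        /* drop the initial reference; whoever puts the last one frees
         * the map, so no spinning on mutex state is needed */
        kref_put(&map->users, destroy_active_sketch);
    }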
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-11 2:02 ` Boris Ostrovsky
@ 2017-11-13 18:30 ` Stefano Stabellini
0 siblings, 0 replies; 20+ messages in thread
From: Stefano Stabellini @ 2017-11-13 18:30 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: Juergen Gross, xen-devel, Stefano Stabellini
On Fri, 10 Nov 2017, Boris Ostrovsky wrote:
> On 11/10/2017 06:57 PM, Stefano Stabellini wrote:
> > On Tue, 7 Nov 2017, Juergen Gross wrote:
> > > On 06/11/17 23:17, Stefano Stabellini wrote:
> > > > mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> > > > take in_mutex on the first try, but you can't take out_mutex. Next times
> > > > you call mutex_trylock() in_mutex is going to fail. It's an endless
> > > > loop.
> > > >
> > > > Solve the problem by moving the two mutex_trylock calls to two separate
> > > > loops.
> > > >
> > > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > > > Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> > > > CC: boris.ostrovsky@oracle.com
> > > > CC: jgross@suse.com
> > > > ---
> > > > drivers/xen/pvcalls-front.c | 5 +++--
> > > > 1 file changed, 3 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > > > index 0c1ec68..047dce7 100644
> > > > --- a/drivers/xen/pvcalls-front.c
> > > > +++ b/drivers/xen/pvcalls-front.c
> > > > @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
> > > > * is set to NULL -- we only need to wait for the
> > > > existing
> > > > * waiters to return.
> > > > */
> > > > - while (!mutex_trylock(&map->active.in_mutex) ||
> > > > - !mutex_trylock(&map->active.out_mutex))
> > > > + while (!mutex_trylock(&map->active.in_mutex))
> > > > + cpu_relax();
> > > > + while (!mutex_trylock(&map->active.out_mutex))
> > > > cpu_relax();
> > >
> > > Any reason you don't just use mutex_lock()?
> >
> > Hi Juergen, sorry for the late reply.
> >
> > Yes, you are right. Given the patch, it would be just the same to use
> > mutex_lock.
> >
> > This is where I realized that actually we have a problem: no matter if
> > we use mutex_lock or mutex_trylock, there are no guarantees that we'll
> > be the last to take the in/out_mutex. Other waiters could be still
> > outstanding.
> >
> > We solved the same problem using a refcount in pvcalls_front_remove. In
> > this case, I was thinking of reusing the mutex internal counter for
> > efficiency, instead of adding one more refcount.
> >
> > For using the mutex as a refcount, there is really no need to call
> > mutex_trylock or mutex_lock. I suggest checking on the mutex counter
> > directly:
> >
>
>
> I think you want to say
>
> while(mutex_locked(&map->active.in_mutex.owner) ||
> mutex_locked(&map->active.out_mutex.owner))
> cpu_relax();
>
> since owner being NULL is not a documented value of a free mutex.
>
You are right, and the function name is "mutex_is_locked", so it would
be:
while(mutex_is_locked(&map->active.in_mutex.owner) ||
mutex_is_locked(&map->active.out_mutex.owner))
cpu_relax();
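Note that mutex_is_locked() takes a struct mutex *, so the ".owner" argument
in these snippets would not compile as written; presumably the intended check
is:

    while (mutex_is_locked(&map->active.in_mutex) ||
           mutex_is_locked(&map->active.out_mutex))
        cpu_relax();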
> >
> > while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> > atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> > cpu_relax();
> >
> > Cheers,
> >
> > Stefano
> >
> >
> > ---
> >
> > xen/pvcalls: fix potential endless loop in pvcalls-front.c
> >
> > mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> > take in_mutex on the first try, but you can't take out_mutex. Next time
> > you call mutex_trylock() in_mutex is going to fail. It's an endless
> > loop.
> >
> > Actually, we don't want to use mutex_trylock at all: we don't need to
> > take the mutex, we only need to wait until the last mutex waiter/holder
> > releases it.
> >
> > Instead of calling mutex_trylock or mutex_lock, just check on the mutex
> > refcount instead.
> >
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> > CC: boris.ostrovsky@oracle.com
> > CC: jgross@suse.com
> >
> > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > index 0c1ec68..9f33cb8 100644
> > --- a/drivers/xen/pvcalls-front.c
> > +++ b/drivers/xen/pvcalls-front.c
> > @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
> > * is set to NULL -- we only need to wait for the existing
> > * waiters to return.
> > */
> > - while (!mutex_trylock(&map->active.in_mutex) ||
> > - !mutex_trylock(&map->active.out_mutex))
> > + while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> > + atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> > cpu_relax();
> > pvcalls_front_free_map(bedata, map);
> >
>
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-13 15:15 ` Juergen Gross
@ 2017-11-13 18:33 ` Stefano Stabellini
2017-11-14 9:11 ` Juergen Gross
0 siblings, 1 reply; 20+ messages in thread
From: Stefano Stabellini @ 2017-11-13 18:33 UTC (permalink / raw)
To: Juergen Gross; +Cc: xen-devel, boris.ostrovsky, Stefano Stabellini
On Mon, 13 Nov 2017, Juergen Gross wrote:
> On 11/11/17 00:57, Stefano Stabellini wrote:
> > On Tue, 7 Nov 2017, Juergen Gross wrote:
> >> On 06/11/17 23:17, Stefano Stabellini wrote:
> >>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> >>> take in_mutex on the first try, but you can't take out_mutex. Next times
> >>> you call mutex_trylock() in_mutex is going to fail. It's an endless
> >>> loop.
> >>>
> >>> Solve the problem by moving the two mutex_trylock calls to two separate
> >>> loops.
> >>>
> >>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> >>> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> >>> CC: boris.ostrovsky@oracle.com
> >>> CC: jgross@suse.com
> >>> ---
> >>> drivers/xen/pvcalls-front.c | 5 +++--
> >>> 1 file changed, 3 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> >>> index 0c1ec68..047dce7 100644
> >>> --- a/drivers/xen/pvcalls-front.c
> >>> +++ b/drivers/xen/pvcalls-front.c
> >>> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
> >>> * is set to NULL -- we only need to wait for the existing
> >>> * waiters to return.
> >>> */
> >>> - while (!mutex_trylock(&map->active.in_mutex) ||
> >>> - !mutex_trylock(&map->active.out_mutex))
> >>> + while (!mutex_trylock(&map->active.in_mutex))
> >>> + cpu_relax();
> >>> + while (!mutex_trylock(&map->active.out_mutex))
> >>> cpu_relax();
> >>
> >> Any reason you don't just use mutex_lock()?
> >
> > Hi Juergen, sorry for the late reply.
> >
> > Yes, you are right. Given the patch, it would be just the same to use
> > mutex_lock.
> >
> > This is where I realized that actually we have a problem: no matter if
> > we use mutex_lock or mutex_trylock, there are no guarantees that we'll
> > be the last to take the in/out_mutex. Other waiters could be still
> > outstanding.
> >
> > We solved the same problem using a refcount in pvcalls_front_remove. In
> > this case, I was thinking of reusing the mutex internal counter for
> > efficiency, instead of adding one more refcount.
> >
> > For using the mutex as a refcount, there is really no need to call
> > mutex_trylock or mutex_lock. I suggest checking on the mutex counter
> > directly:
> >
> >
> > while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> > atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> > cpu_relax();
> >
> > Cheers,
> >
> > Stefano
> >
> >
> > ---
> >
> > xen/pvcalls: fix potential endless loop in pvcalls-front.c
> >
> > mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> > take in_mutex on the first try, but you can't take out_mutex. Next time
> > you call mutex_trylock() in_mutex is going to fail. It's an endless
> > loop.
> >
> > Actually, we don't want to use mutex_trylock at all: we don't need to
> > take the mutex, we only need to wait until the last mutex waiter/holder
> > releases it.
> >
> > Instead of calling mutex_trylock or mutex_lock, just check on the mutex
> > refcount instead.
> >
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> > CC: boris.ostrovsky@oracle.com
> > CC: jgross@suse.com
> >
> > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > index 0c1ec68..9f33cb8 100644
> > --- a/drivers/xen/pvcalls-front.c
> > +++ b/drivers/xen/pvcalls-front.c
> > @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
> > * is set to NULL -- we only need to wait for the existing
> > * waiters to return.
> > */
> > - while (!mutex_trylock(&map->active.in_mutex) ||
> > - !mutex_trylock(&map->active.out_mutex))
> > + while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> > + atomic_long_read(&map->active.out_mutex.owner) != 0UL)
>
> I don't like this.
>
> Can't you use a kref here? Even if it looks like more overhead it is
> much cleaner. There will be no questions regarding possible races,
> while with an approach like yours will always smell racy (can't there
> be someone taking the mutex just after above test?).
>
> In no case you should make use of the mutex internals.
Boris' suggestion solves that problem well. Would you be OK with the
proposed
while(mutex_is_locked(&map->active.in_mutex.owner) ||
mutex_is_locked(&map->active.out_mutex.owner))
cpu_relax();
?
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-13 18:33 ` Stefano Stabellini
@ 2017-11-14 9:11 ` Juergen Gross
2017-11-14 21:46 ` Boris Ostrovsky
0 siblings, 1 reply; 20+ messages in thread
From: Juergen Gross @ 2017-11-14 9:11 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: xen-devel, boris.ostrovsky
On 13/11/17 19:33, Stefano Stabellini wrote:
> On Mon, 13 Nov 2017, Juergen Gross wrote:
>> On 11/11/17 00:57, Stefano Stabellini wrote:
>>> On Tue, 7 Nov 2017, Juergen Gross wrote:
>>>> On 06/11/17 23:17, Stefano Stabellini wrote:
>>>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>>>> take in_mutex on the first try, but you can't take out_mutex. Next times
>>>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>>>> loop.
>>>>>
>>>>> Solve the problem by moving the two mutex_trylock calls to two separate
>>>>> loops.
>>>>>
>>>>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>>>>> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>> CC: boris.ostrovsky@oracle.com
>>>>> CC: jgross@suse.com
>>>>> ---
>>>>> drivers/xen/pvcalls-front.c | 5 +++--
>>>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>>>> index 0c1ec68..047dce7 100644
>>>>> --- a/drivers/xen/pvcalls-front.c
>>>>> +++ b/drivers/xen/pvcalls-front.c
>>>>> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
>>>>> * is set to NULL -- we only need to wait for the existing
>>>>> * waiters to return.
>>>>> */
>>>>> - while (!mutex_trylock(&map->active.in_mutex) ||
>>>>> - !mutex_trylock(&map->active.out_mutex))
>>>>> + while (!mutex_trylock(&map->active.in_mutex))
>>>>> + cpu_relax();
>>>>> + while (!mutex_trylock(&map->active.out_mutex))
>>>>> cpu_relax();
>>>>
>>>> Any reason you don't just use mutex_lock()?
>>>
>>> Hi Juergen, sorry for the late reply.
>>>
>>> Yes, you are right. Given the patch, it would be just the same to use
>>> mutex_lock.
>>>
>>> This is where I realized that actually we have a problem: no matter if
>>> we use mutex_lock or mutex_trylock, there are no guarantees that we'll
>>> be the last to take the in/out_mutex. Other waiters could be still
>>> outstanding.
>>>
>>> We solved the same problem using a refcount in pvcalls_front_remove. In
>>> this case, I was thinking of reusing the mutex internal counter for
>>> efficiency, instead of adding one more refcount.
>>>
>>> For using the mutex as a refcount, there is really no need to call
>>> mutex_trylock or mutex_lock. I suggest checking on the mutex counter
>>> directly:
>>>
>>>
>>> while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
>>> atomic_long_read(&map->active.out_mutex.owner) != 0UL)
>>> cpu_relax();
>>>
>>> Cheers,
>>>
>>> Stefano
>>>
>>>
>>> ---
>>>
>>> xen/pvcalls: fix potential endless loop in pvcalls-front.c
>>>
>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>> take in_mutex on the first try, but you can't take out_mutex. Next time
>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>> loop.
>>>
>>> Actually, we don't want to use mutex_trylock at all: we don't need to
>>> take the mutex, we only need to wait until the last mutex waiter/holder
>>> releases it.
>>>
>>> Instead of calling mutex_trylock or mutex_lock, just check on the mutex
>>> refcount instead.
>>>
>>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>>> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
>>> CC: boris.ostrovsky@oracle.com
>>> CC: jgross@suse.com
>>>
>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>> index 0c1ec68..9f33cb8 100644
>>> --- a/drivers/xen/pvcalls-front.c
>>> +++ b/drivers/xen/pvcalls-front.c
>>> @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
>>> * is set to NULL -- we only need to wait for the existing
>>> * waiters to return.
>>> */
>>> - while (!mutex_trylock(&map->active.in_mutex) ||
>>> - !mutex_trylock(&map->active.out_mutex))
>>> + while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
>>> + atomic_long_read(&map->active.out_mutex.owner) != 0UL)
>>
>> I don't like this.
>>
>> Can't you use a kref here? Even if it looks like more overhead it is
>> much cleaner. There will be no questions regarding possible races,
>> while with an approach like yours will always smell racy (can't there
>> be someone taking the mutex just after above test?).
>>
>> In no case you should make use of the mutex internals.
>
> Boris' suggestion solves that problem well. Would you be OK with the
> proposed
>
> while(mutex_is_locked(&map->active.in_mutex.owner) ||
> mutex_is_locked(&map->active.out_mutex.owner))
> cpu_relax();
>
> ?
I'm not convinced there isn't a race.
In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
then in_mutex is taken. What happens if pvcalls_front_release() resets
sk_send_head and manages to test the mutex before the mutex is locked?
Even if this turns out to be impossible, the whole construct seems
rather fragile.
Juergen
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-14 9:11 ` Juergen Gross
@ 2017-11-14 21:46 ` Boris Ostrovsky
2017-11-15 8:14 ` Juergen Gross
0 siblings, 1 reply; 20+ messages in thread
From: Boris Ostrovsky @ 2017-11-14 21:46 UTC (permalink / raw)
To: Juergen Gross, Stefano Stabellini; +Cc: xen-devel
On 11/14/2017 04:11 AM, Juergen Gross wrote:
> On 13/11/17 19:33, Stefano Stabellini wrote:
>> On Mon, 13 Nov 2017, Juergen Gross wrote:
>>> On 11/11/17 00:57, Stefano Stabellini wrote:
>>>> On Tue, 7 Nov 2017, Juergen Gross wrote:
>>>>> On 06/11/17 23:17, Stefano Stabellini wrote:
>>>>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>>>>> take in_mutex on the first try, but you can't take out_mutex. Next times
>>>>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>>>>> loop.
>>>>>>
>>>>>> Solve the problem by moving the two mutex_trylock calls to two separate
>>>>>> loops.
>>>>>>
>>>>>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>>>>>> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>> CC: boris.ostrovsky@oracle.com
>>>>>> CC: jgross@suse.com
>>>>>> ---
>>>>>> drivers/xen/pvcalls-front.c | 5 +++--
>>>>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>>>>> index 0c1ec68..047dce7 100644
>>>>>> --- a/drivers/xen/pvcalls-front.c
>>>>>> +++ b/drivers/xen/pvcalls-front.c
>>>>>> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
>>>>>> * is set to NULL -- we only need to wait for the existing
>>>>>> * waiters to return.
>>>>>> */
>>>>>> - while (!mutex_trylock(&map->active.in_mutex) ||
>>>>>> - !mutex_trylock(&map->active.out_mutex))
>>>>>> + while (!mutex_trylock(&map->active.in_mutex))
>>>>>> + cpu_relax();
>>>>>> + while (!mutex_trylock(&map->active.out_mutex))
>>>>>> cpu_relax();
>>>>> Any reason you don't just use mutex_lock()?
>>>> Hi Juergen, sorry for the late reply.
>>>>
>>>> Yes, you are right. Given the patch, it would be just the same to use
>>>> mutex_lock.
>>>>
>>>> This is where I realized that actually we have a problem: no matter if
>>>> we use mutex_lock or mutex_trylock, there are no guarantees that we'll
>>>> be the last to take the in/out_mutex. Other waiters could be still
>>>> outstanding.
>>>>
>>>> We solved the same problem using a refcount in pvcalls_front_remove. In
>>>> this case, I was thinking of reusing the mutex internal counter for
>>>> efficiency, instead of adding one more refcount.
>>>>
>>>> For using the mutex as a refcount, there is really no need to call
>>>> mutex_trylock or mutex_lock. I suggest checking on the mutex counter
>>>> directly:
>>>>
>>>>
>>>> while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
>>>> atomic_long_read(&map->active.out_mutex.owner) != 0UL)
>>>> cpu_relax();
>>>>
>>>> Cheers,
>>>>
>>>> Stefano
>>>>
>>>>
>>>> ---
>>>>
>>>> xen/pvcalls: fix potential endless loop in pvcalls-front.c
>>>>
>>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>>> take in_mutex on the first try, but you can't take out_mutex. Next time
>>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>>> loop.
>>>>
>>>> Actually, we don't want to use mutex_trylock at all: we don't need to
>>>> take the mutex, we only need to wait until the last mutex waiter/holder
>>>> releases it.
>>>>
>>>> Instead of calling mutex_trylock or mutex_lock, just check on the mutex
>>>> refcount instead.
>>>>
>>>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>>>> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
>>>> CC: boris.ostrovsky@oracle.com
>>>> CC: jgross@suse.com
>>>>
>>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>>> index 0c1ec68..9f33cb8 100644
>>>> --- a/drivers/xen/pvcalls-front.c
>>>> +++ b/drivers/xen/pvcalls-front.c
>>>> @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
>>>> * is set to NULL -- we only need to wait for the existing
>>>> * waiters to return.
>>>> */
>>>> - while (!mutex_trylock(&map->active.in_mutex) ||
>>>> - !mutex_trylock(&map->active.out_mutex))
>>>> + while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
>>>> + atomic_long_read(&map->active.out_mutex.owner) != 0UL)
>>> I don't like this.
>>>
>>> Can't you use a kref here? Even if it looks like more overhead it is
>>> much cleaner. There will be no questions regarding possible races,
>>> while with an approach like yours will always smell racy (can't there
>>> be someone taking the mutex just after above test?).
>>>
>>> In no case you should make use of the mutex internals.
>> Boris' suggestion solves that problem well. Would you be OK with the
>> proposed
>>
>> while(mutex_is_locked(&map->active.in_mutex.owner) ||
>> mutex_is_locked(&map->active.out_mutex.owner))
>> cpu_relax();
>>
>> ?
> I'm not convinced there isn't a race.
>
> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
> then in_mutex is taken. What happens if pvcalls_front_release() resets
> sk_send_head and manages to test the mutex before the mutex is locked?
>
> Even in case this is impossible: the whole construct seems to be rather
> fragile.
I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
not rely on mutex state.
In any case, we are in the merge window now and I don't want to send a
pull request with a known bug, so we need to have this resolved in the
next couple of days.
-boris
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-14 21:46 ` Boris Ostrovsky
@ 2017-11-15 8:14 ` Juergen Gross
2017-11-15 19:09 ` Stefano Stabellini
0 siblings, 1 reply; 20+ messages in thread
From: Juergen Gross @ 2017-11-15 8:14 UTC (permalink / raw)
To: Boris Ostrovsky, Stefano Stabellini; +Cc: xen-devel
On 14/11/17 22:46, Boris Ostrovsky wrote:
> On 11/14/2017 04:11 AM, Juergen Gross wrote:
>> On 13/11/17 19:33, Stefano Stabellini wrote:
>>> On Mon, 13 Nov 2017, Juergen Gross wrote:
>>>> On 11/11/17 00:57, Stefano Stabellini wrote:
>>>>> On Tue, 7 Nov 2017, Juergen Gross wrote:
>>>>>> On 06/11/17 23:17, Stefano Stabellini wrote:
>>>>>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>>>>>> take in_mutex on the first try, but you can't take out_mutex. Next times
>>>>>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>>>>>> loop.
>>>>>>>
>>>>>>> Solve the problem by moving the two mutex_trylock calls to two separate
>>>>>>> loops.
>>>>>>>
>>>>>>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>>>>>>> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>>>> CC: boris.ostrovsky@oracle.com
>>>>>>> CC: jgross@suse.com
>>>>>>> ---
>>>>>>> drivers/xen/pvcalls-front.c | 5 +++--
>>>>>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>>>>>> index 0c1ec68..047dce7 100644
>>>>>>> --- a/drivers/xen/pvcalls-front.c
>>>>>>> +++ b/drivers/xen/pvcalls-front.c
>>>>>>> @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
>>>>>>> * is set to NULL -- we only need to wait for the existing
>>>>>>> * waiters to return.
>>>>>>> */
>>>>>>> - while (!mutex_trylock(&map->active.in_mutex) ||
>>>>>>> - !mutex_trylock(&map->active.out_mutex))
>>>>>>> + while (!mutex_trylock(&map->active.in_mutex))
>>>>>>> + cpu_relax();
>>>>>>> + while (!mutex_trylock(&map->active.out_mutex))
>>>>>>> cpu_relax();
>>>>>> Any reason you don't just use mutex_lock()?
>>>>> Hi Juergen, sorry for the late reply.
>>>>>
>>>>> Yes, you are right. Given the patch, it would be just the same to use
>>>>> mutex_lock.
>>>>>
>>>>> This is where I realized that actually we have a problem: no matter if
>>>>> we use mutex_lock or mutex_trylock, there are no guarantees that we'll
>>>>> be the last to take the in/out_mutex. Other waiters could be still
>>>>> outstanding.
>>>>>
>>>>> We solved the same problem using a refcount in pvcalls_front_remove. In
>>>>> this case, I was thinking of reusing the mutex internal counter for
>>>>> efficiency, instead of adding one more refcount.
>>>>>
>>>>> For using the mutex as a refcount, there is really no need to call
>>>>> mutex_trylock or mutex_lock. I suggest checking on the mutex counter
>>>>> directly:
>>>>>
>>>>>
>>>>> while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
>>>>> atomic_long_read(&map->active.out_mutex.owner) != 0UL)
>>>>> cpu_relax();
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Stefano
>>>>>
>>>>>
>>>>> ---
>>>>>
>>>>> xen/pvcalls: fix potential endless loop in pvcalls-front.c
>>>>>
>>>>> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
>>>>> take in_mutex on the first try, but you can't take out_mutex. Next time
>>>>> you call mutex_trylock() in_mutex is going to fail. It's an endless
>>>>> loop.
>>>>>
>>>>> Actually, we don't want to use mutex_trylock at all: we don't need to
>>>>> take the mutex, we only need to wait until the last mutex waiter/holder
>>>>> releases it.
>>>>>
>>>>> Instead of calling mutex_trylock or mutex_lock, just check on the mutex
>>>>> refcount instead.
>>>>>
>>>>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>>>>> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
>>>>> CC: boris.ostrovsky@oracle.com
>>>>> CC: jgross@suse.com
>>>>>
>>>>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
>>>>> index 0c1ec68..9f33cb8 100644
>>>>> --- a/drivers/xen/pvcalls-front.c
>>>>> +++ b/drivers/xen/pvcalls-front.c
>>>>> @@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
>>>>> * is set to NULL -- we only need to wait for the existing
>>>>> * waiters to return.
>>>>> */
>>>>> - while (!mutex_trylock(&map->active.in_mutex) ||
>>>>> - !mutex_trylock(&map->active.out_mutex))
>>>>> + while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
>>>>> + atomic_long_read(&map->active.out_mutex.owner) != 0UL)
>>>> I don't like this.
>>>>
>>>> Can't you use a kref here? Even if it looks like more overhead it is
>>>> much cleaner. There will be no questions regarding possible races,
>>>> while with an approach like yours will always smell racy (can't there
>>>> be someone taking the mutex just after above test?).
>>>>
>>>> In no case you should make use of the mutex internals.
>>> Boris' suggestion solves that problem well. Would you be OK with the
>>> proposed
>>>
>>> while(mutex_is_locked(&map->active.in_mutex.owner) ||
>>> mutex_is_locked(&map->active.out_mutex.owner))
>>> cpu_relax();
>>>
>>> ?
>> I'm not convinced there isn't a race.
>>
>> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
>> then in_mutex is taken. What happens if pvcalls_front_release() resets
>> sk_send_head and manages to test the mutex before the mutex is locked?
>>
>> Even in case this is impossible: the whole construct seems to be rather
>> fragile.
>
>
> I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
> not rely on mutex state.
Yes, this would work.
Juergen
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-15 8:14 ` Juergen Gross
@ 2017-11-15 19:09 ` Stefano Stabellini
2017-11-15 19:50 ` Boris Ostrovsky
0 siblings, 1 reply; 20+ messages in thread
From: Stefano Stabellini @ 2017-11-15 19:09 UTC (permalink / raw)
To: Juergen Gross; +Cc: xen-devel, Boris Ostrovsky, Stefano Stabellini
[-- Attachment #1: Type: TEXT/PLAIN, Size: 4871 bytes --]
On Wed, 15 Nov 2017, Juergen Gross wrote:
> >>> while(mutex_is_locked(&map->active.in_mutex.owner) ||
> >>> mutex_is_locked(&map->active.out_mutex.owner))
> >>> cpu_relax();
> >>>
> >>> ?
> >> I'm not convinced there isn't a race.
> >>
> >> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
> >> then in_mutex is taken. What happens if pvcalls_front_release() resets
> >> sk_send_head and manages to test the mutex before the mutex is locked?
> >>
> >> Even in case this is impossible: the whole construct seems to be rather
> >> fragile.
I agree it looks fragile, and I agree that it might be best to avoid the
usage of in_mutex and out_mutex as refcounts. More comments on this
below.
> > I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
> > not rely on mutex state.
>
> Yes, this would work.
Yes, I agree it would work and for the sake of getting something in
shape for the merge window I am attaching a patch for it. Please go
ahead with it. Let me know if you need anything else immediately, and
I'll work on it ASAP.
However, I should note that this is a pretty big hammer we are using:
the refcount is global, while we only need to wait until it's only us
_on this specific socket_.
We really need a per socket refcount. If we don't want to use the mutex
internal counters, then we need another one.
See the appended patch that introduces a per socket refcount. However,
for the merge window, also using pvcalls_refcount is fine.
The race Juergen is concerned about is only theoretically possible:
  recvmsg:                                release:

  test sk_send_head                       clear sk_send_head
  <only few instructions>                 <prepare a message>
  grab in_mutex                           <send a message to the server>
                                          <wait for an answer>
                                          test in_mutex
Without kernel preemption it is not possible for release to clear
sk_send_head and test in_mutex after recvmsg tests sk_send_head and
before recvmsg grabs in_mutex.
But maybe we need to disable kernel preemption in recvmsg and sendmsg to
stay on the safe side?
The patch below introduces a per active socket refcount, so that we
don't have to rely on in_mutex and out_mutex for refcounting. It also
disables preemption in sendmsg and recvmsg in the region described
above.
I don't think this patch should go in immediately. We can take our time
to figure out the best fix.
diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 0c1ec68..8c1030b 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -68,6 +68,7 @@ struct sock_mapping {
struct pvcalls_data data;
struct mutex in_mutex;
struct mutex out_mutex;
+ atomic_t sock_refcount;
wait_queue_head_t inflight_conn_req;
} active;
@@ -497,15 +498,20 @@ int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
}
bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+ preempt_disable();
map = (struct sock_mapping *) sock->sk->sk_send_head;
if (!map) {
+ preempt_enable();
pvcalls_exit();
return -ENOTSOCK;
}
+ atomic_inc(&map->active.sock_refcount);
mutex_lock(&map->active.out_mutex);
+ preempt_enable();
if ((flags & MSG_DONTWAIT) && !pvcalls_front_write_todo(map)) {
mutex_unlock(&map->active.out_mutex);
+ atomic_dec(&map->active.sock_refcount);
pvcalls_exit();
return -EAGAIN;
}
@@ -528,6 +534,7 @@ int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
tot_sent = sent;
mutex_unlock(&map->active.out_mutex);
+ atomic_dec(&map->active.sock_refcount);
pvcalls_exit();
return tot_sent;
}
@@ -600,13 +607,17 @@ int pvcalls_front_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
}
bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
+ preempt_disable();
map = (struct sock_mapping *) sock->sk->sk_send_head;
if (!map) {
+ preempt_enable();
pvcalls_exit();
return -ENOTSOCK;
}
+ atomic_inc(&map->active.sock_refcount);
mutex_lock(&map->active.in_mutex);
+ preempt_enable();
if (len > XEN_FLEX_RING_SIZE(PVCALLS_RING_ORDER))
len = XEN_FLEX_RING_SIZE(PVCALLS_RING_ORDER);
@@ -625,6 +636,7 @@ int pvcalls_front_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
ret = 0;
mutex_unlock(&map->active.in_mutex);
+ atomic_dec(&map->active.sock_refcount);
pvcalls_exit();
return ret;
}
@@ -1048,8 +1060,7 @@ int pvcalls_front_release(struct socket *sock)
* is set to NULL -- we only need to wait for the existing
* waiters to return.
*/
- while (!mutex_trylock(&map->active.in_mutex) ||
- !mutex_trylock(&map->active.out_mutex))
+ while (atomic_read(&map->active.sock_refcount) > 0)
cpu_relax();
pvcalls_front_free_map(bedata, map);
[-- Attachment #2: Type: TEXT/X-DIFF; NAME=front.patch, Size: 1174 bytes --]
xen/pvcalls: fix potential endless loop in pvcalls-front.c
mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
take in_mutex on the first try, but you can't take out_mutex. The next
time you call mutex_trylock(), in_mutex is going to fail: it's an endless
loop.
Solve the problem by waiting until the global refcount is 1 instead (the
refcount is 1 when the only active pvcalls frontend function is
pvcalls_front_release).
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 0c1ec68..72a74c3 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -1048,8 +1048,7 @@ int pvcalls_front_release(struct socket *sock)
* is set to NULL -- we only need to wait for the existing
* waiters to return.
*/
- while (!mutex_trylock(&map->active.in_mutex) ||
- !mutex_trylock(&map->active.out_mutex))
+ while (atomic_read(&pvcalls_refcount) > 1)
cpu_relax();
pvcalls_front_free_map(bedata, map);
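The "> 1" works because pvcalls_front_release() itself runs inside the
driver's global enter/exit pair, so a count of exactly 1 means no other
frontend call (on any socket) is still in flight. For readers without
pvcalls-front.c at hand, the assumed shape of that global counting is
sketched below; the enter helper's name is inferred from the pvcalls_exit()
calls visible in the patches above and may differ in detail:

    #include <linux/atomic.h>

    static atomic_t pvcalls_refcount;

    static void pvcalls_enter(void)
    {
        atomic_inc(&pvcalls_refcount);  /* one per in-flight frontend call */
    }

    static void pvcalls_exit(void)
    {
        atomic_dec(&pvcalls_refcount);
    }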
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-15 19:09 ` Stefano Stabellini
@ 2017-11-15 19:50 ` Boris Ostrovsky
2017-11-15 19:53 ` Boris Ostrovsky
2017-11-15 21:20 ` Stefano Stabellini
0 siblings, 2 replies; 20+ messages in thread
From: Boris Ostrovsky @ 2017-11-15 19:50 UTC (permalink / raw)
To: Stefano Stabellini, Juergen Gross; +Cc: xen-devel
On 11/15/2017 02:09 PM, Stefano Stabellini wrote:
> On Wed, 15 Nov 2017, Juergen Gross wrote:
>>>>> while(mutex_is_locked(&map->active.in_mutex.owner) ||
>>>>> mutex_is_locked(&map->active.out_mutex.owner))
>>>>> cpu_relax();
>>>>>
>>>>> ?
>>>> I'm not convinced there isn't a race.
>>>>
>>>> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
>>>> then in_mutex is taken. What happens if pvcalls_front_release() resets
>>>> sk_send_head and manages to test the mutex before the mutex is locked?
>>>>
>>>> Even in case this is impossible: the whole construct seems to be rather
>>>> fragile.
> I agree it looks fragile, and I agree that it might be best to avoid the
> usage of in_mutex and out_mutex as refcounts. More comments on this
> below.
>
>
>>> I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
>>> not rely on mutex state.
>> Yes, this would work.
> Yes, I agree it would work and for the sake of getting something in
> shape for the merge window I am attaching a patch for it. Please go
> ahead with it. Let me know if you need anything else immediately, and
> I'll work on it ASAP.
>
>
>
> However, I should note that this is a pretty big hammer we are using:
> the refcount is global, while we only need to wait until it's only us
> _on this specific socket_.
Can you explain why the socket is important?
>
> We really need a per socket refcount. If we don't want to use the mutex
> internal counters, then we need another one.
>
> See the appended patch that introduces a per socket refcount. However,
> for the merge window, also using pvcalls_refcount is fine.
>
> The race Juergen is concerned about is only theoretically possible:
>
> recvmsg: release:
>
> test sk_send_head clear sk_send_head
> <only few instructions> <prepare a message>
> grab in_mutex <send a message to the server>
> <wait for an answer>
> test in_mutex
>
> Without kernel preemption is not possible for release to clear
> sk_send_head and test in_mutex after recvmsg tests sk_send_head and
> before recvmsg grabs in_mutex.
Sorry, I don't follow --- what does preemption have to do with this? If
recvmsg and release happen on different processors the order of
operations can be
  CPU0                                    CPU1

  test sk_send_head
  <interrupt>
                                          clear sk_send_head
                                          <send a message to the server>
                                          <wait for an answer>
                                          test in_mutex
                                          free everything
  grab in_mutex
I actually think RCU should take care of all of this.
But for now I will take your refcount-based patch. However, it also
needs a comment update.
How about
/*
 * We need to make sure that send/rcvmsg on this socket has not started
 * before we've cleared sk_send_head here. The easiest (though not optimal)
 * way to guarantee this is to see that no pvcall (other than us) is in
 * progress.
 */
-boris
>
> But maybe we need to disable kernel preemption in recvmsg and sendmsg to
> stay on the safe side?
>
> The patch below introduces a per active socket refcount, so that we
> don't have to rely on in_mutex and out_mutex for refcounting. It also
> disables preemption in sendmsg and recvmsg in the region described
> above.
>
> I don't think this patch should go in immediately. We can take our time
> to figure out the best fix.
>
>
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index 0c1ec68..8c1030b 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -68,6 +68,7 @@ struct sock_mapping {
> struct pvcalls_data data;
> struct mutex in_mutex;
> struct mutex out_mutex;
> + atomic_t sock_refcount;
>
> wait_queue_head_t inflight_conn_req;
> } active;
> @@ -497,15 +498,20 @@ int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
> }
> bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
>
> + preempt_disable();
> map = (struct sock_mapping *) sock->sk->sk_send_head;
> if (!map) {
> + preempt_enable();
> pvcalls_exit();
> return -ENOTSOCK;
> }
>
> + atomic_inc(&map->active.sock_refcount);
> mutex_lock(&map->active.out_mutex);
> + preempt_enable();
> if ((flags & MSG_DONTWAIT) && !pvcalls_front_write_todo(map)) {
> mutex_unlock(&map->active.out_mutex);
> + atomic_dec(&map->active.sock_refcount);
> pvcalls_exit();
> return -EAGAIN;
> }
> @@ -528,6 +534,7 @@ int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
> tot_sent = sent;
>
> mutex_unlock(&map->active.out_mutex);
> + atomic_dec(&map->active.sock_refcount);
> pvcalls_exit();
> return tot_sent;
> }
> @@ -600,13 +607,17 @@ int pvcalls_front_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> }
> bedata = dev_get_drvdata(&pvcalls_front_dev->dev);
>
> + preempt_disable();
> map = (struct sock_mapping *) sock->sk->sk_send_head;
> if (!map) {
> + preempt_enable();
> pvcalls_exit();
> return -ENOTSOCK;
> }
>
> + atomic_inc(&map->active.sock_refcount);
> mutex_lock(&map->active.in_mutex);
> + preempt_enable();
> if (len > XEN_FLEX_RING_SIZE(PVCALLS_RING_ORDER))
> len = XEN_FLEX_RING_SIZE(PVCALLS_RING_ORDER);
>
> @@ -625,6 +636,7 @@ int pvcalls_front_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> ret = 0;
>
> mutex_unlock(&map->active.in_mutex);
> + atomic_dec(&map->active.sock_refcount);
> pvcalls_exit();
> return ret;
> }
> @@ -1048,8 +1060,7 @@ int pvcalls_front_release(struct socket *sock)
> * is set to NULL -- we only need to wait for the existing
> * waiters to return.
> */
> - while (!mutex_trylock(&map->active.in_mutex) ||
> - !mutex_trylock(&map->active.out_mutex))
> + while (atomic_read(&map->active.sock_refcount) > 0)
> cpu_relax();
>
> pvcalls_front_free_map(bedata, map);
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-15 19:50 ` Boris Ostrovsky
@ 2017-11-15 19:53 ` Boris Ostrovsky
2017-11-15 21:20 ` Stefano Stabellini
1 sibling, 0 replies; 20+ messages in thread
From: Boris Ostrovsky @ 2017-11-15 19:53 UTC (permalink / raw)
To: Stefano Stabellini, Juergen Gross; +Cc: xen-devel
On 11/15/2017 02:50 PM, Boris Ostrovsky wrote:
> On 11/15/2017 02:09 PM, Stefano Stabellini wrote:
>>
>> However, I should note that this is a pretty big hammer we are using:
>> the refcount is global, while we only need to wait until it's only us
>> _on this specific socket_.
> Can you explain why socket is important?
Never mind. I was thinking about *processor* sockets (as in cores,
threads, packages, etc.; I am right now looking at a bug that deals
with core behavior ;-))
-boris
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-15 19:50 ` Boris Ostrovsky
2017-11-15 19:53 ` Boris Ostrovsky
@ 2017-11-15 21:20 ` Stefano Stabellini
2017-11-15 21:50 ` Stefano Stabellini
2017-11-16 5:28 ` Juergen Gross
1 sibling, 2 replies; 20+ messages in thread
From: Stefano Stabellini @ 2017-11-15 21:20 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: Juergen Gross, xen-devel, Stefano Stabellini
On Wed, 15 Nov 2017, Boris Ostrovsky wrote:
> On 11/15/2017 02:09 PM, Stefano Stabellini wrote:
> > On Wed, 15 Nov 2017, Juergen Gross wrote:
> >>>>> while (mutex_is_locked(&map->active.in_mutex) ||
> >>>>>        mutex_is_locked(&map->active.out_mutex))
> >>>>> cpu_relax();
> >>>>>
> >>>>> ?
> >>>> I'm not convinced there isn't a race.
> >>>>
> >>>> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
> >>>> then in_mutex is taken. What happens if pvcalls_front_release() resets
> >>>> sk_send_head and manages to test the mutex before the mutex is locked?
> >>>>
> >>>> Even in case this is impossible: the whole construct seems to be rather
> >>>> fragile.
> > I agree it looks fragile, and I agree that it might be best to avoid the
> > usage of in_mutex and out_mutex as refcounts. More comments on this
> > below.
> >
> >
> >>> I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
> >>> not rely on mutex state.
> >> Yes, this would work.
> > Yes, I agree it would work and for the sake of getting something in
> > shape for the merge window I am attaching a patch for it. Please go
> > ahead with it. Let me know if you need anything else immediately, and
> > I'll work on it ASAP.
> >
> >
> >
> > However, I should note that this is a pretty big hammer we are using:
> > the refcount is global, while we only need to wait until it's only us
> > _on this specific socket_.
>
> Can you explain why socket is important?
Yes, of course: there are going to be many open sockets on a given
pvcalls connection. pvcalls_refcount is global: waiting on
pvcalls_refcount means waiting until any operations on any unrelated
sockets stop. While we only need to wait until the operations on the one
socket we want to close stop.
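To make the contrast concrete, the two wait loops discussed here look like this (the per-socket field is the one proposed in the appended patch, so treat it as a proposal rather than merged code):

        /* Global: release() waits until no pvcall at all is in flight on
         * this frontend, including calls on completely unrelated sockets. */
        while (atomic_read(&pvcalls_refcount) > 1)
                cpu_relax();

        /* Per-socket (proposed): release() only waits for sendmsg/recvmsg
         * still running on the socket being closed. */
        while (atomic_read(&map->active.sock_refcount) > 0)
                cpu_relax();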
> >
> > We really need a per socket refcount. If we don't want to use the mutex
> > internal counters, then we need another one.
> >
> > See the appended patch that introduces a per socket refcount. However,
> > for the merge window, also using pvcalls_refcount is fine.
> >
> > The race Juergen is concerned about is only theoretically possible:
> >
> > recvmsg:                     release:
> >
> > test sk_send_head            clear sk_send_head
> > <only few instructions>      <prepare a message>
> > grab in_mutex                <send a message to the server>
> >                              <wait for an answer>
> >                              test in_mutex
> >
> > Without kernel preemption, it is not possible for release to clear
> > sk_send_head and test in_mutex after recvmsg tests sk_send_head and
> > before recvmsg grabs in_mutex.
>
> Sorry, I don't follow --- what does preemption have to do with this? If
> recvmsg and release happen on different processors the order of
> operations can be
>
> CPU0                          CPU1
>
> test sk_send_head
> <interrupt>
>                               clear sk_send_head
>                               <send a message to the server>
>                               <wait for an answer>
>                               test in_mutex
>                               free everything
> grab in_mutex
>
> I actually think RCU should take care of all of this.
Preemption could cause something very similar to happen, but your
example is very good too, and in fact even better, because it could
trigger the issue even with preemption disabled. I'll think more about
this and submit a separate patch on top of the simple pvcalls_refcount
patch below.
> But for now I will take your refcount-based patch. However, it also
> needs comment update.
>
> How about
>
> /*
>  * We need to make sure that send/rcvmsg on this socket has not started
>  * before we've cleared sk_send_head here. The easiest (though not optimal)
>  * way to guarantee this is to see that no pvcall (other than us) is in
>  * progress.
>  */
Yes, this is the patch:
---
xen/pvcalls: fix potential endless loop in pvcalls-front.c
mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
take in_mutex on the first try, but you can't take out_mutex. Next times
you call mutex_trylock() in_mutex is going to fail. It's an endless
loop.
Solve the problem by waiting until the global refcount is 1 instead (the
refcount is 1 when the only active pvcalls frontend function is
pvcalls_front_release).
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 0c1ec68..54c0fda 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -1043,13 +1043,12 @@ int pvcalls_front_release(struct socket *sock)
wake_up_interruptible(&map->active.inflight_conn_req);
/*
- * Wait until there are no more waiters on the mutexes.
- * We know that no new waiters can be added because sk_send_head
- * is set to NULL -- we only need to wait for the existing
- * waiters to return.
- */
- while (!mutex_trylock(&map->active.in_mutex) ||
- !mutex_trylock(&map->active.out_mutex))
+ * We need to make sure that sendmsg/recvmsg on this socket have
+ * not started before we've cleared sk_send_head here. The
+ * easiest (though not optimal) way to guarantee this is to see
+ * that no pvcall (other than us) is in progress.
+ */
+ while (atomic_read(&pvcalls_refcount) > 1)
cpu_relax();
pvcalls_front_free_map(bedata, map);
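For readers wondering why the loop waits for the count to drop to 1 rather than 0: every frontend entry point, pvcalls_front_release() included, holds a reference on the global counter for its whole duration via an enter/exit pairing (pvcalls_exit() is visible in the diffs above; the matching increment on entry is named pvcalls_enter() here, with the exact helper name assumed). Conceptually the pairing amounts to the following sketch, not the literal driver source:

#include <linux/atomic.h>

/* Sketch of the global refcount bracketing: every pvcalls frontend call
 * increments on entry and decrements on exit, so while release() itself
 * runs the counter can never drop below 1. */
static atomic_t pvcalls_refcount;

static inline void pvcalls_enter(void)
{
        atomic_inc(&pvcalls_refcount);
}

static inline void pvcalls_exit(void)
{
        atomic_dec(&pvcalls_refcount);
}

With that in place, atomic_read(&pvcalls_refcount) > 1 means some call other than this release() is still in progress, which is exactly the condition the loop above waits out.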
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-15 21:20 ` Stefano Stabellini
@ 2017-11-15 21:50 ` Stefano Stabellini
2017-11-15 21:51 ` Boris Ostrovsky
2017-11-16 5:28 ` Juergen Gross
1 sibling, 1 reply; 20+ messages in thread
From: Stefano Stabellini @ 2017-11-15 21:50 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Juergen Gross, xen-devel, Boris Ostrovsky
On Wed, 15 Nov 2017, Stefano Stabellini wrote:
> On Wed, 15 Nov 2017, Boris Ostrovsky wrote:
> > On 11/15/2017 02:09 PM, Stefano Stabellini wrote:
> > > On Wed, 15 Nov 2017, Juergen Gross wrote:
> > >>>>> while (mutex_is_locked(&map->active.in_mutex) ||
> > >>>>>        mutex_is_locked(&map->active.out_mutex))
> > >>>>> cpu_relax();
> > >>>>>
> > >>>>> ?
> > >>>> I'm not convinced there isn't a race.
> > >>>>
> > >>>> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
> > >>>> then in_mutex is taken. What happens if pvcalls_front_release() resets
> > >>>> sk_send_head and manages to test the mutex before the mutex is locked?
> > >>>>
> > >>>> Even in case this is impossible: the whole construct seems to be rather
> > >>>> fragile.
> > > I agree it looks fragile, and I agree that it might be best to avoid the
> > > usage of in_mutex and out_mutex as refcounts. More comments on this
> > > below.
> > >
> > >
> > >>> I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
> > >>> not rely on mutex state.
> > >> Yes, this would work.
> > > Yes, I agree it would work and for the sake of getting something in
> > > shape for the merge window I am attaching a patch for it. Please go
> > > ahead with it. Let me know if you need anything else immediately, and
> > > I'll work on it ASAP.
> > >
> > >
> > >
> > > However, I should note that this is a pretty big hammer we are using:
> > > the refcount is global, while we only need to wait until it's only us
> > > _on this specific socket_.
> >
> > Can you explain why socket is important?
>
> Yes, of course: there are going to be many open sockets on a given
> pvcalls connection. pvcalls_refcount is global: waiting on
> pvcalls_refcount means waiting until any operations on any unrelated
> sockets stop. While we only need to wait until the operations on the one
> socket we want to close stop.
>
>
> > >
> > > We really need a per socket refcount. If we don't want to use the mutex
> > > internal counters, then we need another one.
> > >
> > > See the appended patch that introduces a per socket refcount. However,
> > > for the merge window, also using pvcalls_refcount is fine.
> > >
> > > The race Juergen is concerned about is only theoretically possible:
> > >
> > > recvmsg:                     release:
> > >
> > > test sk_send_head            clear sk_send_head
> > > <only few instructions>      <prepare a message>
> > > grab in_mutex                <send a message to the server>
> > >                              <wait for an answer>
> > >                              test in_mutex
> > >
> > > Without kernel preemption, it is not possible for release to clear
> > > sk_send_head and test in_mutex after recvmsg tests sk_send_head and
> > > before recvmsg grabs in_mutex.
> >
> > Sorry, I don't follow --- what does preemption have to do with this? If
> > recvmsg and release happen on different processors the order of
> > operations can be
> >
> > CPU0                          CPU1
> >
> > test sk_send_head
> > <interrupt>
> >                               clear sk_send_head
> >                               <send a message to the server>
> >                               <wait for an answer>
> >                               test in_mutex
> >                               free everything
> > grab in_mutex
> >
> > I actually think RCU should take care of all of this.
>
> Preemption could cause something very similar to happen, but your
> example is very good too, and in fact even better, because it could
> trigger the issue even with preemption disabled. I'll think more about
> this and submit a separate patch on top of the simple pvcalls_refcount
> patch below.
>
>
>
> > But for now I will take your refcount-based patch. However, it also
> > needs comment update.
> >
> > How about
> >
> > /*
> >  * We need to make sure that send/rcvmsg on this socket has not started
> >  * before we've cleared sk_send_head here. The easiest (though not optimal)
> >  * way to guarantee this is to see that no pvcall (other than us) is in
> >  * progress.
> >  */
>
> Yes, this is the patch:
>
> ---
>
>
> xen/pvcalls: fix potential endless loop in pvcalls-front.c
>
> mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> take in_mutex on the first try, but you can't take out_mutex. Next times
> you call mutex_trylock() in_mutex is going to fail. It's an endless
> loop.
>
> Solve the problem by waiting until the global refcount is 1 instead (the
> refcount is 1 when the only active pvcalls frontend function is
> pvcalls_front_release).
>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> CC: boris.ostrovsky@oracle.com
> CC: jgross@suse.com
>
>
> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> index 0c1ec68..54c0fda 100644
> --- a/drivers/xen/pvcalls-front.c
> +++ b/drivers/xen/pvcalls-front.c
> @@ -1043,13 +1043,12 @@ int pvcalls_front_release(struct socket *sock)
> wake_up_interruptible(&map->active.inflight_conn_req);
>
> /*
> - * Wait until there are no more waiters on the mutexes.
> - * We know that no new waiters can be added because sk_send_head
> - * is set to NULL -- we only need to wait for the existing
> - * waiters to return.
> - */
> - while (!mutex_trylock(&map->active.in_mutex) ||
> - !mutex_trylock(&map->active.out_mutex))
> + * We need to make sure that sendmsg/recvmsg on this socket have
> + * not started before we've cleared sk_send_head here. The
> + * easiest (though not optimal) way to guarantee this is to see
> + * that no pvcall (other than us) is in progress.
> + */
> + while (atomic_read(&pvcalls_refcount) > 1)
> cpu_relax();
>
> pvcalls_front_free_map(bedata, map);
Sorry, code style issue: one missing space in the comment. I'll send it
again separately.
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-15 21:50 ` Stefano Stabellini
@ 2017-11-15 21:51 ` Boris Ostrovsky
2017-11-15 21:54 ` Stefano Stabellini
0 siblings, 1 reply; 20+ messages in thread
From: Boris Ostrovsky @ 2017-11-15 21:51 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Juergen Gross, xen-devel
On 11/15/2017 04:50 PM, Stefano Stabellini wrote:
>
> Sorry, code style issue: one missing space in the comment. I'll send it
> again separately
I've already fixed this, no worries.
-boris
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-15 21:51 ` Boris Ostrovsky
@ 2017-11-15 21:54 ` Stefano Stabellini
0 siblings, 0 replies; 20+ messages in thread
From: Stefano Stabellini @ 2017-11-15 21:54 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: Juergen Gross, xen-devel, Stefano Stabellini
On Wed, 15 Nov 2017, Boris Ostrovsky wrote:
> On 11/15/2017 04:50 PM, Stefano Stabellini wrote:
> >
> > Sorry, code style issue: one missing space in the comment. I'll send it
> > again separately
>
>
> I've already fixed this, no worries.
Thank you!!
* Re: [PATCH] xen/pvcalls: fix potential endless loop in pvcalls-front.c
2017-11-15 21:20 ` Stefano Stabellini
2017-11-15 21:50 ` Stefano Stabellini
@ 2017-11-16 5:28 ` Juergen Gross
1 sibling, 0 replies; 20+ messages in thread
From: Juergen Gross @ 2017-11-16 5:28 UTC (permalink / raw)
To: Stefano Stabellini, Boris Ostrovsky; +Cc: xen-devel
On 15/11/17 22:20, Stefano Stabellini wrote:
> On Wed, 15 Nov 2017, Boris Ostrovsky wrote:
>> On 11/15/2017 02:09 PM, Stefano Stabellini wrote:
>>> On Wed, 15 Nov 2017, Juergen Gross wrote:
>>>>>>> while (mutex_is_locked(&map->active.in_mutex) ||
>>>>>>>        mutex_is_locked(&map->active.out_mutex))
>>>>>>> cpu_relax();
>>>>>>>
>>>>>>> ?
>>>>>> I'm not convinced there isn't a race.
>>>>>>
>>>>>> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
>>>>>> then in_mutex is taken. What happens if pvcalls_front_release() resets
>>>>>> sk_send_head and manages to test the mutex before the mutex is locked?
>>>>>>
>>>>>> Even in case this is impossible: the whole construct seems to be rather
>>>>>> fragile.
>>> I agree it looks fragile, and I agree that it might be best to avoid the
>>> usage of in_mutex and out_mutex as refcounts. More comments on this
>>> below.
>>>
>>>
>>>>> I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
>>>>> not rely on mutex state.
>>>> Yes, this would work.
>>> Yes, I agree it would work and for the sake of getting something in
>>> shape for the merge window I am attaching a patch for it. Please go
>>> ahead with it. Let me know if you need anything else immediately, and
>>> I'll work on it ASAP.
>>>
>>>
>>>
>>> However, I should note that this is a pretty big hammer we are using:
>>> the refcount is global, while we only need to wait until it's only us
>>> _on this specific socket_.
>>
>> Can you explain why socket is important?
>
> Yes, of course: there are going to be many open sockets on a given
> pvcalls connection. pvcalls_refcount is global: waiting on
> pvcalls_refcount means waiting until any operations on any unrelated
> sockets stop. While we only need to wait until the operations on the one
> socket we want to close stop.
>
>
>>>
>>> We really need a per socket refcount. If we don't want to use the mutex
>>> internal counters, then we need another one.
>>>
>>> See the appended patch that introduces a per socket refcount. However,
>>> for the merge window, also using pvcalls_refcount is fine.
>>>
>>> The race Juergen is concerned about is only theoretically possible:
>>>
>>> recvmsg:                     release:
>>>
>>> test sk_send_head            clear sk_send_head
>>> <only few instructions>      <prepare a message>
>>> grab in_mutex                <send a message to the server>
>>>                              <wait for an answer>
>>>                              test in_mutex
>>>
>>> Without kernel preemption, it is not possible for release to clear
>>> sk_send_head and test in_mutex after recvmsg tests sk_send_head and
>>> before recvmsg grabs in_mutex.
>>
>> Sorry, I don't follow --- what does preemption have to do with this? If
>> recvmsg and release happen on different processors the order of
>> operations can be
>>
>> CPU0                          CPU1
>>
>> test sk_send_head
>> <interrupt>
>>                               clear sk_send_head
>>                               <send a message to the server>
>>                               <wait for an answer>
>>                               test in_mutex
>>                               free everything
>> grab in_mutex
>>
>> I actually think RCU should take care of all of this.
>
> Preemption could cause something very similar to happen, but your
> example is very good too, and in fact even better, because it could
> trigger the issue even with preemption disabled. I'll think more about
> this and submit a separate patch on top of the simple pvcalls_refcount
> patch below.
We are running as a guest. Even with interrupts off the vcpu could be
off the pcpu for several milliseconds!
Don't count on code length to avoid races!
Juergen