* [PATCH V2 0/3] tcm_vhost lock and flush fix
From: Asias He @ 2013-03-13 7:34 UTC
To: Nicholas Bellinger
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi, Paolo Bonzini
Changes in v2:
- Use vq->private_data instead of vs->vs_endpoint
- Add a label in vhost_scsi_clear_endpoint() to simplify the unlock code
Asias He (3):
tcm_vhost: Add missed lock in vhost_scsi_clear_endpoint()
tcm_vhost: Flush vhost_work in vhost_scsi_flush()
tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
drivers/vhost/tcm_vhost.c | 56 ++++++++++++++++++++++++++++++++++++++---------
1 file changed, 46 insertions(+), 10 deletions(-)
--
1.8.1.4
* [PATCH V2 1/3] tcm_vhost: Add missed lock in vhost_scsi_clear_endpoint()
From: Asias He @ 2013-03-13 7:34 UTC
To: Nicholas Bellinger
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi, Paolo Bonzini
tv_tpg->tv_tpg_vhost_count should be protected by tv_tpg->tv_tpg_mutex.
Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
drivers/vhost/tcm_vhost.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
index 9951297..79d3aea 100644
--- a/drivers/vhost/tcm_vhost.c
+++ b/drivers/vhost/tcm_vhost.c
@@ -850,7 +850,7 @@ static int vhost_scsi_clear_endpoint(
for (index = 0; index < vs->dev.nvqs; ++index) {
if (!vhost_vq_access_ok(&vs->vqs[index])) {
ret = -EFAULT;
- goto err;
+ goto err_dev;
}
}
for (i = 0; i < VHOST_SCSI_MAX_TARGET; i++) {
@@ -860,10 +860,11 @@ static int vhost_scsi_clear_endpoint(
if (!tv_tpg)
continue;
+ mutex_lock(&tv_tpg->tv_tpg_mutex);
tv_tport = tv_tpg->tport;
if (!tv_tport) {
ret = -ENODEV;
- goto err;
+ goto err_tpg;
}
if (strcmp(tv_tport->tport_name, t->vhost_wwpn)) {
@@ -872,16 +873,19 @@ static int vhost_scsi_clear_endpoint(
tv_tport->tport_name, tv_tpg->tport_tpgt,
t->vhost_wwpn, t->vhost_tpgt);
ret = -EINVAL;
- goto err;
+ goto err_tpg;
}
tv_tpg->tv_tpg_vhost_count--;
vs->vs_tpg[target] = NULL;
vs->vs_endpoint = false;
+ mutex_unlock(&tv_tpg->tv_tpg_mutex);
}
mutex_unlock(&vs->dev.mutex);
return 0;
-err:
+err_tpg:
+ mutex_unlock(&tv_tpg->tv_tpg_mutex);
+err_dev:
mutex_unlock(&vs->dev.mutex);
return ret;
}
--
1.8.1.4
* [PATCH V2 2/3] tcm_vhost: Flush vhost_work in vhost_scsi_flush()
From: Asias He @ 2013-03-13 7:34 UTC
To: Nicholas Bellinger
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi, Paolo Bonzini
We also need to flush the pending vhost_work items, not only the
virtqueues. Currently the only such work is the completion vhost_work.
Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
drivers/vhost/tcm_vhost.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
index 79d3aea..43fb11e 100644
--- a/drivers/vhost/tcm_vhost.c
+++ b/drivers/vhost/tcm_vhost.c
@@ -941,6 +941,7 @@ static void vhost_scsi_flush(struct vhost_scsi *vs)
for (i = 0; i < VHOST_SCSI_MAX_VQ; i++)
vhost_scsi_flush_vq(vs, i);
+ vhost_work_flush(&vs->dev, &vs->vs_completion_work);
}
static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features)
--
1.8.1.4
* [PATCH V2 3/3] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
From: Asias He @ 2013-03-13 7:34 UTC
To: Nicholas Bellinger
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi, Paolo Bonzini
Currently, vs->vs_endpoint is used to indicate whether the endpoint is
set up or not. It is set or cleared in vhost_scsi_set_endpoint() or
vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
we check it in vhost_scsi_handle_vq() we ignore the lock, which is
wrong.

Instead of using vs->vs_endpoint and the vs->dev.mutex lock to indicate
the status of the endpoint, use the per-virtqueue vq->private_data
pointer. This way only the per-queue vq->mutex needs to be taken, so
concurrent multiqueue processing sees less lock contention. Further, on
the read side no lock is needed at all when vq->private_data is
accessed from the vhost worker thread, because it is protected by the
"vhost rcu" convention.
Signed-off-by: Asias He <asias@redhat.com>
---
drivers/vhost/tcm_vhost.c | 43 +++++++++++++++++++++++++++++++++++++------
1 file changed, 37 insertions(+), 6 deletions(-)
diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
index 43fb11e..094fb10 100644
--- a/drivers/vhost/tcm_vhost.c
+++ b/drivers/vhost/tcm_vhost.c
@@ -67,7 +67,6 @@ struct vhost_scsi {
/* Protected by vhost_scsi->dev.mutex */
struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
- bool vs_endpoint;
struct vhost_dev dev;
struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
@@ -91,6 +90,22 @@ static int iov_num_pages(struct iovec *iov)
((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
}
+static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
+{
+ bool ret = false;
+
+ /*
+ * We can handle the vq only after the endpoint is setup by calling the
+ * VHOST_SCSI_SET_ENDPOINT ioctl.
+ *
+ * TODO: check that we are running from vhost_worker?
+ */
+ if (rcu_dereference_check(vq->private_data, 1))
+ ret = true;
+
+ return ret;
+}
+
static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
{
return 1;
@@ -581,8 +596,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
int head, ret;
u8 target;
- /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
- if (unlikely(!vs->vs_endpoint))
+ if (!tcm_vhost_check_endpoint(vq))
return;
mutex_lock(&vq->mutex);
@@ -781,8 +795,9 @@ static int vhost_scsi_set_endpoint(
{
struct tcm_vhost_tport *tv_tport;
struct tcm_vhost_tpg *tv_tpg;
+ struct vhost_virtqueue *vq;
bool match = false;
- int index, ret;
+ int index, ret, i;
mutex_lock(&vs->dev.mutex);
/* Verify that ring has been setup correctly. */
@@ -826,7 +841,13 @@ static int vhost_scsi_set_endpoint(
if (match) {
memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
sizeof(vs->vs_vhost_wwpn));
- vs->vs_endpoint = true;
+ for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
+ vq = &vs->vqs[i];
+ mutex_lock(&vq->mutex);
+ rcu_assign_pointer(vq->private_data, vs);
+ vhost_init_used(vq);
+ mutex_unlock(&vq->mutex);
+ }
ret = 0;
} else {
ret = -EEXIST;
@@ -842,6 +863,8 @@ static int vhost_scsi_clear_endpoint(
{
struct tcm_vhost_tport *tv_tport;
struct tcm_vhost_tpg *tv_tpg;
+ struct vhost_virtqueue *vq;
+ bool match = false;
int index, ret, i;
u8 target;
@@ -877,9 +900,17 @@ static int vhost_scsi_clear_endpoint(
}
tv_tpg->tv_tpg_vhost_count--;
vs->vs_tpg[target] = NULL;
- vs->vs_endpoint = false;
+ match = true;
mutex_unlock(&tv_tpg->tv_tpg_mutex);
}
+ if (match) {
+ for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
+ vq = &vs->vqs[i];
+ mutex_lock(&vq->mutex);
+ rcu_assign_pointer(vq->private_data, NULL);
+ mutex_unlock(&vq->mutex);
+ }
+ }
mutex_unlock(&vs->dev.mutex);
return 0;
--
1.8.1.4
* Re: [PATCH V2 3/3] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
From: Paolo Bonzini @ 2013-03-13 8:56 UTC
To: Asias He
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi
On 13/03/2013 08:34, Asias He wrote:
> Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> not. It is set or cleared in vhost_scsi_set_endpoint() or
> vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> we check it in vhost_scsi_handle_vq(), we ignored the lock, this is
> wrong.
>
> Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> indicate the status of the endpoint, we use per virtqueue
> vq->private_data to indicate it. In this way, we can only take the
> vq->mutex lock which is per queue and make the concurrent multiqueue
> process having less lock contention. Further, in the read side of
> vq->private_data, we can even do not take only lock if it is accessed in
> the vhost worker thread, because it is protected by "vhost rcu".
>
> Signed-off-by: Asias He <asias@redhat.com>
> ---
> drivers/vhost/tcm_vhost.c | 43 +++++++++++++++++++++++++++++++++++++------
> 1 file changed, 37 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> index 43fb11e..094fb10 100644
> --- a/drivers/vhost/tcm_vhost.c
> +++ b/drivers/vhost/tcm_vhost.c
> @@ -67,7 +67,6 @@ struct vhost_scsi {
> /* Protected by vhost_scsi->dev.mutex */
> struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> - bool vs_endpoint;
>
> struct vhost_dev dev;
> struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> @@ -91,6 +90,22 @@ static int iov_num_pages(struct iovec *iov)
> ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> }
>
> +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> +{
> + bool ret = false;
> +
> + /*
> + * We can handle the vq only after the endpoint is setup by calling the
> + * VHOST_SCSI_SET_ENDPOINT ioctl.
> + *
> + * TODO: check that we are running from vhost_worker?
> + */
> + if (rcu_dereference_check(vq->private_data, 1))
> + ret = true;
> +
> + return ret;
> +}
> +
> static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> {
> return 1;
> @@ -581,8 +596,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> int head, ret;
> u8 target;
>
> - /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> - if (unlikely(!vs->vs_endpoint))
> + if (!tcm_vhost_check_endpoint(vq))
> return;
You would still need at least a rcu_read_lock/unlock (actually srcu,
since vhost_scsi_handle_vq can sleep)...
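Roughly, just to illustrate (vs_srcu would be a hypothetical
srcu_struct member, it is not in this patch):

	int idx;

	idx = srcu_read_lock(&vs->vs_srcu);
	if (!srcu_dereference(vq->private_data, &vs->vs_srcu)) {
		srcu_read_unlock(&vs->vs_srcu, idx);
		return;
	}
	/* ... process the virtqueue ... */
	srcu_read_unlock(&vs->vs_srcu, idx);

with set_endpoint/clear_endpoint calling synchronize_srcu(&vs->vs_srcu)
after updating vq->private_data.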
> mutex_lock(&vq->mutex);
> @@ -781,8 +795,9 @@ static int vhost_scsi_set_endpoint(
> {
> struct tcm_vhost_tport *tv_tport;
> struct tcm_vhost_tpg *tv_tpg;
> + struct vhost_virtqueue *vq;
> bool match = false;
> - int index, ret;
> + int index, ret, i;
>
> mutex_lock(&vs->dev.mutex);
> /* Verify that ring has been setup correctly. */
> @@ -826,7 +841,13 @@ static int vhost_scsi_set_endpoint(
> if (match) {
> memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
> sizeof(vs->vs_vhost_wwpn));
> - vs->vs_endpoint = true;
> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> + vq = &vs->vqs[i];
> + mutex_lock(&vq->mutex);
> + rcu_assign_pointer(vq->private_data, vs);
> + vhost_init_used(vq);
> + mutex_unlock(&vq->mutex);
... and a synchronize_srcu here. But this is not correct use of RCU.
To use RCU correctly, you need to _copy_ (that's the "C" in RCU) the
whole vs structure on every set_endpoint or clear_endpoint operation,
and free it after synchronize_srcu returns.
What you're trying to do is really an rwlock, just use that. :)
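For example, something along these lines (vs_endpoint_rwsem is only a
sketch, and it would have to be an rw_semaphore rather than a spinning
rwlock, because the handler can sleep):

	/* reader, in vhost_scsi_handle_vq() */
	down_read(&vs->vs_endpoint_rwsem);
	if (!vs->vs_endpoint) {
		up_read(&vs->vs_endpoint_rwsem);
		return;
	}
	/* ... process the virtqueue ... */
	up_read(&vs->vs_endpoint_rwsem);

	/* writer, in vhost_scsi_set_endpoint()/clear_endpoint() */
	down_write(&vs->vs_endpoint_rwsem);
	vs->vs_endpoint = true;		/* or false */
	up_write(&vs->vs_endpoint_rwsem);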
Paolo
> + }
> ret = 0;
> } else {
> ret = -EEXIST;
> @@ -842,6 +863,8 @@ static int vhost_scsi_clear_endpoint(
> {
> struct tcm_vhost_tport *tv_tport;
> struct tcm_vhost_tpg *tv_tpg;
> + struct vhost_virtqueue *vq;
> + bool match = false;
> int index, ret, i;
> u8 target;
>
> @@ -877,9 +900,17 @@ static int vhost_scsi_clear_endpoint(
> }
> tv_tpg->tv_tpg_vhost_count--;
> vs->vs_tpg[target] = NULL;
> - vs->vs_endpoint = false;
> + match = true;
> mutex_unlock(&tv_tpg->tv_tpg_mutex);
> }
> + if (match) {
> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> + vq = &vs->vqs[i];
> + mutex_lock(&vq->mutex);
> + rcu_assign_pointer(vq->private_data, NULL);
> + mutex_unlock(&vq->mutex);
> + }
> + }
> mutex_unlock(&vs->dev.mutex);
> return 0;
>
>
* Re: [PATCH V2 3/3] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
From: Asias He @ 2013-03-14 2:07 UTC
To: Paolo Bonzini
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi
On Wed, Mar 13, 2013 at 09:56:41AM +0100, Paolo Bonzini wrote:
> On 13/03/2013 08:34, Asias He wrote:
> > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > we check it in vhost_scsi_handle_vq(), we ignored the lock, this is
> > wrong.
> >
> > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > indicate the status of the endpoint, we use per virtqueue
> > vq->private_data to indicate it. In this way, we can only take the
> > vq->mutex lock which is per queue and make the concurrent multiqueue
> > process having less lock contention. Further, in the read side of
> > vq->private_data, we can even do not take only lock if it is accessed in
> > the vhost worker thread, because it is protected by "vhost rcu".
> >
> > Signed-off-by: Asias He <asias@redhat.com>
> > ---
> > drivers/vhost/tcm_vhost.c | 43 +++++++++++++++++++++++++++++++++++++------
> > 1 file changed, 37 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > index 43fb11e..094fb10 100644
> > --- a/drivers/vhost/tcm_vhost.c
> > +++ b/drivers/vhost/tcm_vhost.c
> > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > /* Protected by vhost_scsi->dev.mutex */
> > struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > - bool vs_endpoint;
> >
> > struct vhost_dev dev;
> > struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > @@ -91,6 +90,22 @@ static int iov_num_pages(struct iovec *iov)
> > ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > }
> >
> > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > +{
> > + bool ret = false;
> > +
> > + /*
> > + * We can handle the vq only after the endpoint is setup by calling the
> > + * VHOST_SCSI_SET_ENDPOINT ioctl.
> > + *
> > + * TODO: check that we are running from vhost_worker?
> > + */
> > + if (rcu_dereference_check(vq->private_data, 1))
> > + ret = true;
> > +
> > + return ret;
> > +}
> > +
> > static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > {
> > return 1;
> > @@ -581,8 +596,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > int head, ret;
> > u8 target;
> >
> > - /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > - if (unlikely(!vs->vs_endpoint))
> > + if (!tcm_vhost_check_endpoint(vq))
> > return;
>
> You would still need at least a rcu_read_lock/unlock (actually srcu,
> since vhost_scsi_handle_vq can sleep)...
See handle_tx() and handle_rx() in drivers/vhost/net.c:
/* Expects to be always run from workqueue - which acts as
* read-size critical section for our kind of RCU. */
This is how vhost works, no?
But, personally, I would prefer to use explicit locking instead of this
trick.
> > mutex_lock(&vq->mutex);
> > @@ -781,8 +795,9 @@ static int vhost_scsi_set_endpoint(
> > {
> > struct tcm_vhost_tport *tv_tport;
> > struct tcm_vhost_tpg *tv_tpg;
> > + struct vhost_virtqueue *vq;
> > bool match = false;
> > - int index, ret;
> > + int index, ret, i;
> >
> > mutex_lock(&vs->dev.mutex);
> > /* Verify that ring has been setup correctly. */
> > @@ -826,7 +841,13 @@ static int vhost_scsi_set_endpoint(
> > if (match) {
> > memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
> > sizeof(vs->vs_vhost_wwpn));
> > - vs->vs_endpoint = true;
> > + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > + vq = &vs->vqs[i];
> > + mutex_lock(&vq->mutex);
> > + rcu_assign_pointer(vq->private_data, vs);
> > + vhost_init_used(vq);
> > + mutex_unlock(&vq->mutex);
>
> ... and a synchronize_srcu here. But this is not correct use of RCU.
> To use RCU correctly, you need to _copy_ (that's the "C" in RCU) the
> whole vs structure on every set_endpoint or clear_endpoint operation,
> and free it after synchronize_srcu returns.
See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
/* We use a kind of RCU to access private pointer.
* All readers access it from worker, which makes it possible to
* flush the vhost_work instead of synchronize_rcu. Therefore readers do
* not need to call rcu_read_lock/rcu_read_unlock: the beginning of
* vhost_work execution acts instead of rcu_read_lock() and the end of
* vhost_work execution acts instead of rcu_read_unlock().
* Writers use virtqueue mutex. */
void __rcu *private_data;
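In other words, the pattern this patch relies on is roughly (a
simplified sketch of the code above, nothing new):

	/* reader: only ever runs in the vhost worker thread */
	if (!rcu_dereference_check(vq->private_data, 1))
		return;
	/* ... process requests ... */

	/* writer: publish or clear under vq->mutex */
	mutex_lock(&vq->mutex);
	rcu_assign_pointer(vq->private_data, vs);	/* or NULL */
	mutex_unlock(&vq->mutex);
	/* vhost_scsi_flush() then acts as synchronize_rcu() */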
> What you're trying to do is really an rwlock, just use that. :)
Yes, but the downside is that it introduces another lock.
> Paolo
>
> > + }
> > ret = 0;
> > } else {
> > ret = -EEXIST;
> > @@ -842,6 +863,8 @@ static int vhost_scsi_clear_endpoint(
> > {
> > struct tcm_vhost_tport *tv_tport;
> > struct tcm_vhost_tpg *tv_tpg;
> > + struct vhost_virtqueue *vq;
> > + bool match = false;
> > int index, ret, i;
> > u8 target;
> >
> > @@ -877,9 +900,17 @@ static int vhost_scsi_clear_endpoint(
> > }
> > tv_tpg->tv_tpg_vhost_count--;
> > vs->vs_tpg[target] = NULL;
> > - vs->vs_endpoint = false;
> > + match = true;
> > mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > }
> > + if (match) {
> > + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > + vq = &vs->vqs[i];
> > + mutex_lock(&vq->mutex);
> > + rcu_assign_pointer(vq->private_data, NULL);
> > + mutex_unlock(&vq->mutex);
> > + }
> > + }
> > mutex_unlock(&vs->dev.mutex);
> > return 0;
> >
> >
>
--
Asias
* Re: [PATCH V2 3/3] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
From: Paolo Bonzini @ 2013-03-14 8:37 UTC
To: Asias He
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi
On 14/03/2013 03:07, Asias He wrote:
> On Wed, Mar 13, 2013 at 09:56:41AM +0100, Paolo Bonzini wrote:
>> On 13/03/2013 08:34, Asias He wrote:
>>> Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
>>> not. It is set or cleared in vhost_scsi_set_endpoint() or
>>> vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
>>> we check it in vhost_scsi_handle_vq(), we ignored the lock, this is
>>> wrong.
>>>
>>> Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
>>> indicate the status of the endpoint, we use per virtqueue
>>> vq->private_data to indicate it. In this way, we can only take the
>>> vq->mutex lock which is per queue and make the concurrent multiqueue
>>> process having less lock contention. Further, in the read side of
>>> vq->private_data, we can even do not take only lock if it is accessed in
>>> the vhost worker thread, because it is protected by "vhost rcu".
>>>
>>> Signed-off-by: Asias He <asias@redhat.com>
>>> ---
>>> drivers/vhost/tcm_vhost.c | 43 +++++++++++++++++++++++++++++++++++++------
>>> 1 file changed, 37 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
>>> index 43fb11e..094fb10 100644
>>> --- a/drivers/vhost/tcm_vhost.c
>>> +++ b/drivers/vhost/tcm_vhost.c
>>> @@ -67,7 +67,6 @@ struct vhost_scsi {
>>> /* Protected by vhost_scsi->dev.mutex */
>>> struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
>>> char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
>>> - bool vs_endpoint;
>>>
>>> struct vhost_dev dev;
>>> struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
>>> @@ -91,6 +90,22 @@ static int iov_num_pages(struct iovec *iov)
>>> ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
>>> }
>>>
>>> +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
>>> +{
>>> + bool ret = false;
>>> +
>>> + /*
>>> + * We can handle the vq only after the endpoint is setup by calling the
>>> + * VHOST_SCSI_SET_ENDPOINT ioctl.
>>> + *
>>> + * TODO: check that we are running from vhost_worker?
>>> + */
>>> + if (rcu_dereference_check(vq->private_data, 1))
>>> + ret = true;
>>> +
>>> + return ret;
>>> +}
>>> +
>>> static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
>>> {
>>> return 1;
>>> @@ -581,8 +596,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
>>> int head, ret;
>>> u8 target;
>>>
>>> - /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
>>> - if (unlikely(!vs->vs_endpoint))
>>> + if (!tcm_vhost_check_endpoint(vq))
>>> return;
>>
>> You would still need at least a rcu_read_lock/unlock (actually srcu,
>> since vhost_scsi_handle_vq can sleep)...
>
> See handle_rx() and handle_rx() in drivers/vhost/net.c
>
> /* Expects to be always run from workqueue - which acts as
> * read-size critical section for our kind of RCU. */
>
> This is how vhost works, no?
>
> But, personally, I would prefer to use explicit locking instead of this
> trick.
>
>>> mutex_lock(&vq->mutex);
>>> @@ -781,8 +795,9 @@ static int vhost_scsi_set_endpoint(
>>> {
>>> struct tcm_vhost_tport *tv_tport;
>>> struct tcm_vhost_tpg *tv_tpg;
>>> + struct vhost_virtqueue *vq;
>>> bool match = false;
>>> - int index, ret;
>>> + int index, ret, i;
>>>
>>> mutex_lock(&vs->dev.mutex);
>>> /* Verify that ring has been setup correctly. */
>>> @@ -826,7 +841,13 @@ static int vhost_scsi_set_endpoint(
>>> if (match) {
>>> memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
>>> sizeof(vs->vs_vhost_wwpn));
>>> - vs->vs_endpoint = true;
>>> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
>>> + vq = &vs->vqs[i];
>>> + mutex_lock(&vq->mutex);
>>> + rcu_assign_pointer(vq->private_data, vs);
>>> + vhost_init_used(vq);
>>> + mutex_unlock(&vq->mutex);
>>
>> ... and a synchronize_srcu here. But this is not correct use of RCU.
>> To use RCU correctly, you need to _copy_ (that's the "C" in RCU) the
>> whole vs structure on every set_endpoint or clear_endpoint operation,
>> and free it after synchronize_srcu returns.
>
> See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
>
> /* We use a kind of RCU to access private pointer.
> * All readers access it from worker, which makes it possible to
> * flush the vhost_work instead of synchronize_rcu. Therefore readers do
> * not need to call rcu_read_lock/rcu_read_unlock: the beginning of
> * vhost_work execution acts instead of rcu_read_lock() and the end of
> * vhost_work execution acts instead of rcu_read_unlock().
> * Writers use virtqueue mutex. */
> void __rcu *private_data;
Aha, cool! But please add a comment.
>> What you're trying to do is really an rwlock, just use that. :)
>
> Yes, but the downside is that it introduces another lock.
Can't it replace the existing mutex?
Paolo
>
>> Paolo
>>
>>> + }
>>> ret = 0;
>>> } else {
>>> ret = -EEXIST;
>>> @@ -842,6 +863,8 @@ static int vhost_scsi_clear_endpoint(
>>> {
>>> struct tcm_vhost_tport *tv_tport;
>>> struct tcm_vhost_tpg *tv_tpg;
>>> + struct vhost_virtqueue *vq;
>>> + bool match = false;
>>> int index, ret, i;
>>> u8 target;
>>>
>>> @@ -877,9 +900,17 @@ static int vhost_scsi_clear_endpoint(
>>> }
>>> tv_tpg->tv_tpg_vhost_count--;
>>> vs->vs_tpg[target] = NULL;
>>> - vs->vs_endpoint = false;
>>> + match = true;
>>> mutex_unlock(&tv_tpg->tv_tpg_mutex);
>>> }
>>> + if (match) {
>>> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
>>> + vq = &vs->vqs[i];
>>> + mutex_lock(&vq->mutex);
>>> + rcu_assign_pointer(vq->private_data, NULL);
>>> + mutex_unlock(&vq->mutex);
>>> + }
>>> + }
>>> mutex_unlock(&vs->dev.mutex);
>>> return 0;
>>>
>>>
>>
>
* Re: [PATCH V2 3/3] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
From: Asias He @ 2013-03-14 9:15 UTC
To: Paolo Bonzini
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi
On Thu, Mar 14, 2013 at 09:37:23AM +0100, Paolo Bonzini wrote:
> On 14/03/2013 03:07, Asias He wrote:
> > On Wed, Mar 13, 2013 at 09:56:41AM +0100, Paolo Bonzini wrote:
> >> On 13/03/2013 08:34, Asias He wrote:
> >>> Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> >>> not. It is set or cleared in vhost_scsi_set_endpoint() or
> >>> vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> >>> we check it in vhost_scsi_handle_vq(), we ignored the lock, this is
> >>> wrong.
> >>>
> >>> Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> >>> indicate the status of the endpoint, we use per virtqueue
> >>> vq->private_data to indicate it. In this way, we can only take the
> >>> vq->mutex lock which is per queue and make the concurrent multiqueue
> >>> process having less lock contention. Further, in the read side of
> >>> vq->private_data, we can even do not take only lock if it is accessed in
> >>> the vhost worker thread, because it is protected by "vhost rcu".
> >>>
> >>> Signed-off-by: Asias He <asias@redhat.com>
> >>> ---
> >>> drivers/vhost/tcm_vhost.c | 43 +++++++++++++++++++++++++++++++++++++------
> >>> 1 file changed, 37 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> >>> index 43fb11e..094fb10 100644
> >>> --- a/drivers/vhost/tcm_vhost.c
> >>> +++ b/drivers/vhost/tcm_vhost.c
> >>> @@ -67,7 +67,6 @@ struct vhost_scsi {
> >>> /* Protected by vhost_scsi->dev.mutex */
> >>> struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> >>> char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> >>> - bool vs_endpoint;
> >>>
> >>> struct vhost_dev dev;
> >>> struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> >>> @@ -91,6 +90,22 @@ static int iov_num_pages(struct iovec *iov)
> >>> ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> >>> }
> >>>
> >>> +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> >>> +{
> >>> + bool ret = false;
> >>> +
> >>> + /*
> >>> + * We can handle the vq only after the endpoint is setup by calling the
> >>> + * VHOST_SCSI_SET_ENDPOINT ioctl.
> >>> + *
> >>> + * TODO: check that we are running from vhost_worker?
> >>> + */
> >>> + if (rcu_dereference_check(vq->private_data, 1))
> >>> + ret = true;
> >>> +
> >>> + return ret;
> >>> +}
> >>> +
> >>> static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> >>> {
> >>> return 1;
> >>> @@ -581,8 +596,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> >>> int head, ret;
> >>> u8 target;
> >>>
> >>> - /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> >>> - if (unlikely(!vs->vs_endpoint))
> >>> + if (!tcm_vhost_check_endpoint(vq))
> >>> return;
> >>
> >> You would still need at least a rcu_read_lock/unlock (actually srcu,
> >> since vhost_scsi_handle_vq can sleep)...
> >
> > See handle_rx() and handle_rx() in drivers/vhost/net.c
> >
> > /* Expects to be always run from workqueue - which acts as
> > * read-size critical section for our kind of RCU. */
> >
> > This is how vhost works, no?
> >
> > But, personally, I would prefer to use explicit locking instead of this
> > trick.
> >
> >>> mutex_lock(&vq->mutex);
> >>> @@ -781,8 +795,9 @@ static int vhost_scsi_set_endpoint(
> >>> {
> >>> struct tcm_vhost_tport *tv_tport;
> >>> struct tcm_vhost_tpg *tv_tpg;
> >>> + struct vhost_virtqueue *vq;
> >>> bool match = false;
> >>> - int index, ret;
> >>> + int index, ret, i;
> >>>
> >>> mutex_lock(&vs->dev.mutex);
> >>> /* Verify that ring has been setup correctly. */
> >>> @@ -826,7 +841,13 @@ static int vhost_scsi_set_endpoint(
> >>> if (match) {
> >>> memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
> >>> sizeof(vs->vs_vhost_wwpn));
> >>> - vs->vs_endpoint = true;
> >>> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> >>> + vq = &vs->vqs[i];
> >>> + mutex_lock(&vq->mutex);
> >>> + rcu_assign_pointer(vq->private_data, vs);
> >>> + vhost_init_used(vq);
> >>> + mutex_unlock(&vq->mutex);
> >>
> >> ... and a synchronize_srcu here. But this is not correct use of RCU.
> >> To use RCU correctly, you need to _copy_ (that's the "C" in RCU) the
> >> whole vs structure on every set_endpoint or clear_endpoint operation,
> >> and free it after synchronize_srcu returns.
> >
> > See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> >
> > /* We use a kind of RCU to access private pointer.
> > * All readers access it from worker, which makes it possible to
> > * flush the vhost_work instead of synchronize_rcu. Therefore readers do
> > * not need to call rcu_read_lock/rcu_read_unlock: the beginning of
> > * vhost_work execution acts instead of rcu_read_lock() and the end of
> > * vhost_work execution acts instead of rcu_read_unlock().
> > * Writers use virtqueue mutex. */
> > void __rcu *private_data;
>
> Aha, cool! But please add a comment.
Okay.
> >> What you're trying to do is really an rwlock, just use that. :)
> >
> > Yes, but the downside is that it introduces another lock.
>
> Can't it can replace the existing mutex?
Do you mean vs->dev.mutex or vq->mutex? In both cases, we still need
them.
Anyway, if the current model works, we do not need the rwlock.
> Paolo
>
> >
> >> Paolo
> >>
> >>> + }
> >>> ret = 0;
> >>> } else {
> >>> ret = -EEXIST;
> >>> @@ -842,6 +863,8 @@ static int vhost_scsi_clear_endpoint(
> >>> {
> >>> struct tcm_vhost_tport *tv_tport;
> >>> struct tcm_vhost_tpg *tv_tpg;
> >>> + struct vhost_virtqueue *vq;
> >>> + bool match = false;
> >>> int index, ret, i;
> >>> u8 target;
> >>>
> >>> @@ -877,9 +900,17 @@ static int vhost_scsi_clear_endpoint(
> >>> }
> >>> tv_tpg->tv_tpg_vhost_count--;
> >>> vs->vs_tpg[target] = NULL;
> >>> - vs->vs_endpoint = false;
> >>> + match = true;
> >>> mutex_unlock(&tv_tpg->tv_tpg_mutex);
> >>> }
> >>> + if (match) {
> >>> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> >>> + vq = &vs->vqs[i];
> >>> + mutex_lock(&vq->mutex);
> >>> + rcu_assign_pointer(vq->private_data, NULL);
> >>> + mutex_unlock(&vq->mutex);
> >>> + }
> >>> + }
> >>> mutex_unlock(&vs->dev.mutex);
> >>> return 0;
> >>>
> >>>
> >>
> >
>
--
Asias