* [PATCH V2 0/3] tcm_vhost lock and flush fix
From: Asias He @ 2013-03-13 7:34 UTC
To: Nicholas Bellinger
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi, Paolo Bonzini
Changes in v2:
- Use vq->private_data instead of vs->vs_endpoint
- Add a label in vhost_scsi_clear_endpoint() to simplify the unlock code
Asias He (3):
tcm_vhost: Add missed lock in vhost_scsi_clear_endpoint()
tcm_vhost: Flush vhost_work in vhost_scsi_flush()
tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
drivers/vhost/tcm_vhost.c | 56 ++++++++++++++++++++++++++++++++++++++---------
1 file changed, 46 insertions(+), 10 deletions(-)
--
1.8.1.4
* [PATCH V2 1/3] tcm_vhost: Add missed lock in vhost_scsi_clear_endpoint()
From: Asias He @ 2013-03-13 7:34 UTC
To: Nicholas Bellinger
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi, Paolo Bonzini
tv_tpg->tv_tpg_vhost_count should be protected by tv_tpg->tv_tpg_mutex.
Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
drivers/vhost/tcm_vhost.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
index 9951297..79d3aea 100644
--- a/drivers/vhost/tcm_vhost.c
+++ b/drivers/vhost/tcm_vhost.c
@@ -850,7 +850,7 @@ static int vhost_scsi_clear_endpoint(
for (index = 0; index < vs->dev.nvqs; ++index) {
if (!vhost_vq_access_ok(&vs->vqs[index])) {
ret = -EFAULT;
- goto err;
+ goto err_dev;
}
}
for (i = 0; i < VHOST_SCSI_MAX_TARGET; i++) {
@@ -860,10 +860,11 @@ static int vhost_scsi_clear_endpoint(
if (!tv_tpg)
continue;
+ mutex_lock(&tv_tpg->tv_tpg_mutex);
tv_tport = tv_tpg->tport;
if (!tv_tport) {
ret = -ENODEV;
- goto err;
+ goto err_tpg;
}
if (strcmp(tv_tport->tport_name, t->vhost_wwpn)) {
@@ -872,16 +873,19 @@ static int vhost_scsi_clear_endpoint(
tv_tport->tport_name, tv_tpg->tport_tpgt,
t->vhost_wwpn, t->vhost_tpgt);
ret = -EINVAL;
- goto err;
+ goto err_tpg;
}
tv_tpg->tv_tpg_vhost_count--;
vs->vs_tpg[target] = NULL;
vs->vs_endpoint = false;
+ mutex_unlock(&tv_tpg->tv_tpg_mutex);
}
mutex_unlock(&vs->dev.mutex);
return 0;
-err:
+err_tpg:
+ mutex_unlock(&tv_tpg->tv_tpg_mutex);
+err_dev:
mutex_unlock(&vs->dev.mutex);
return ret;
}
--
1.8.1.4
* [PATCH V2 2/3] tcm_vhost: Flush vhost_work in vhost_scsi_flush()
From: Asias He @ 2013-03-13 7:34 UTC
To: Nicholas Bellinger
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi, Paolo Bonzini
We also need to flush the pending vhost_work items, not only the
virtqueues. Currently the only such work is the completion vhost_work.
Signed-off-by: Asias He <asias@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
drivers/vhost/tcm_vhost.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
index 79d3aea..43fb11e 100644
--- a/drivers/vhost/tcm_vhost.c
+++ b/drivers/vhost/tcm_vhost.c
@@ -941,6 +941,7 @@ static void vhost_scsi_flush(struct vhost_scsi *vs)
for (i = 0; i < VHOST_SCSI_MAX_VQ; i++)
vhost_scsi_flush_vq(vs, i);
+ vhost_work_flush(&vs->dev, &vs->vs_completion_work);
}
static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features)
--
1.8.1.4
* [PATCH V2 3/3] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
From: Asias He @ 2013-03-13 7:34 UTC
To: Nicholas Bellinger
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi, Paolo Bonzini
Currently, vs->vs_endpoint is used to indicate whether the endpoint is
set up or not. It is set or cleared in vhost_scsi_set_endpoint() or
vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
we check it in vhost_scsi_handle_vq() we ignore the lock, which is
wrong.

Instead of using vs->vs_endpoint and the vs->dev.mutex lock to indicate
the status of the endpoint, use the per-virtqueue vq->private_data
pointer. This way only the per-queue vq->mutex needs to be taken, so
concurrent multiqueue processing sees less lock contention. Further, on
the read side no lock is needed at all when vq->private_data is
accessed from the vhost worker thread, because it is protected by the
"vhost rcu" convention.
Signed-off-by: Asias He <asias@redhat.com>
---
drivers/vhost/tcm_vhost.c | 43 +++++++++++++++++++++++++++++++++++++------
1 file changed, 37 insertions(+), 6 deletions(-)
diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
index 43fb11e..094fb10 100644
--- a/drivers/vhost/tcm_vhost.c
+++ b/drivers/vhost/tcm_vhost.c
@@ -67,7 +67,6 @@ struct vhost_scsi {
/* Protected by vhost_scsi->dev.mutex */
struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
- bool vs_endpoint;
struct vhost_dev dev;
struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
@@ -91,6 +90,22 @@ static int iov_num_pages(struct iovec *iov)
((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
}
+static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
+{
+ bool ret = false;
+
+ /*
+ * We can handle the vq only after the endpoint is setup by calling the
+ * VHOST_SCSI_SET_ENDPOINT ioctl.
+ *
+ * TODO: check that we are running from vhost_worker?
+ */
+ if (rcu_dereference_check(vq->private_data, 1))
+ ret = true;
+
+ return ret;
+}
+
static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
{
return 1;
@@ -581,8 +596,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
int head, ret;
u8 target;
- /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
- if (unlikely(!vs->vs_endpoint))
+ if (!tcm_vhost_check_endpoint(vq))
return;
mutex_lock(&vq->mutex);
@@ -781,8 +795,9 @@ static int vhost_scsi_set_endpoint(
{
struct tcm_vhost_tport *tv_tport;
struct tcm_vhost_tpg *tv_tpg;
+ struct vhost_virtqueue *vq;
bool match = false;
- int index, ret;
+ int index, ret, i;
mutex_lock(&vs->dev.mutex);
/* Verify that ring has been setup correctly. */
@@ -826,7 +841,13 @@ static int vhost_scsi_set_endpoint(
if (match) {
memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
sizeof(vs->vs_vhost_wwpn));
- vs->vs_endpoint = true;
+ for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
+ vq = &vs->vqs[i];
+ mutex_lock(&vq->mutex);
+ rcu_assign_pointer(vq->private_data, vs);
+ vhost_init_used(vq);
+ mutex_unlock(&vq->mutex);
+ }
ret = 0;
} else {
ret = -EEXIST;
@@ -842,6 +863,8 @@ static int vhost_scsi_clear_endpoint(
{
struct tcm_vhost_tport *tv_tport;
struct tcm_vhost_tpg *tv_tpg;
+ struct vhost_virtqueue *vq;
+ bool match = false;
int index, ret, i;
u8 target;
@@ -877,9 +900,17 @@ static int vhost_scsi_clear_endpoint(
}
tv_tpg->tv_tpg_vhost_count--;
vs->vs_tpg[target] = NULL;
- vs->vs_endpoint = false;
+ match = true;
mutex_unlock(&tv_tpg->tv_tpg_mutex);
}
+ if (match) {
+ for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
+ vq = &vs->vqs[i];
+ mutex_lock(&vq->mutex);
+ rcu_assign_pointer(vq->private_data, NULL);
+ mutex_unlock(&vq->mutex);
+ }
+ }
mutex_unlock(&vs->dev.mutex);
return 0;
--
1.8.1.4
* Re: [PATCH V2 3/3] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
From: Paolo Bonzini @ 2013-03-13 8:56 UTC
To: Asias He
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi
On 13/03/2013 08:34, Asias He wrote:
> Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> not. It is set or cleared in vhost_scsi_set_endpoint() or
> vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> we check it in vhost_scsi_handle_vq(), we ignored the lock, this is
> wrong.
>
> Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> indicate the status of the endpoint, we use per virtqueue
> vq->private_data to indicate it. In this way, we can only take the
> vq->mutex lock which is per queue and make the concurrent multiqueue
> process having less lock contention. Further, in the read side of
> vq->private_data, we can even do not take only lock if it is accessed in
> the vhost worker thread, because it is protected by "vhost rcu".
>
> Signed-off-by: Asias He <asias@redhat.com>
> ---
> drivers/vhost/tcm_vhost.c | 43 +++++++++++++++++++++++++++++++++++++------
> 1 file changed, 37 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> index 43fb11e..094fb10 100644
> --- a/drivers/vhost/tcm_vhost.c
> +++ b/drivers/vhost/tcm_vhost.c
> @@ -67,7 +67,6 @@ struct vhost_scsi {
> /* Protected by vhost_scsi->dev.mutex */
> struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> - bool vs_endpoint;
>
> struct vhost_dev dev;
> struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> @@ -91,6 +90,22 @@ static int iov_num_pages(struct iovec *iov)
> ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> }
>
> +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> +{
> + bool ret = false;
> +
> + /*
> + * We can handle the vq only after the endpoint is setup by calling the
> + * VHOST_SCSI_SET_ENDPOINT ioctl.
> + *
> + * TODO: check that we are running from vhost_worker?
> + */
> + if (rcu_dereference_check(vq->private_data, 1))
> + ret = true;
> +
> + return ret;
> +}
> +
> static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> {
> return 1;
> @@ -581,8 +596,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> int head, ret;
> u8 target;
>
> - /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> - if (unlikely(!vs->vs_endpoint))
> + if (!tcm_vhost_check_endpoint(vq))
> return;
You would still need at least a rcu_read_lock/unlock (actually srcu,
since vhost_scsi_handle_vq can sleep)...
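Roughly, just to illustrate (vs_srcu would be a hypothetical
srcu_struct member, it is not in this patch):

	int idx;

	idx = srcu_read_lock(&vs->vs_srcu);
	if (!srcu_dereference(vq->private_data, &vs->vs_srcu)) {
		srcu_read_unlock(&vs->vs_srcu, idx);
		return;
	}
	/* ... process the virtqueue ... */
	srcu_read_unlock(&vs->vs_srcu, idx);

with set_endpoint/clear_endpoint calling synchronize_srcu(&vs->vs_srcu)
after updating vq->private_data.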
> mutex_lock(&vq->mutex);
> @@ -781,8 +795,9 @@ static int vhost_scsi_set_endpoint(
> {
> struct tcm_vhost_tport *tv_tport;
> struct tcm_vhost_tpg *tv_tpg;
> + struct vhost_virtqueue *vq;
> bool match = false;
> - int index, ret;
> + int index, ret, i;
>
> mutex_lock(&vs->dev.mutex);
> /* Verify that ring has been setup correctly. */
> @@ -826,7 +841,13 @@ static int vhost_scsi_set_endpoint(
> if (match) {
> memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
> sizeof(vs->vs_vhost_wwpn));
> - vs->vs_endpoint = true;
> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> + vq = &vs->vqs[i];
> + mutex_lock(&vq->mutex);
> + rcu_assign_pointer(vq->private_data, vs);
> + vhost_init_used(vq);
> + mutex_unlock(&vq->mutex);
... and a synchronize_srcu here. But this is not correct use of RCU.
To use RCU correctly, you need to _copy_ (that's the "C" in RCU) the
whole vs structure on every set_endpoint or clear_endpoint operation,
and free it after synchronize_srcu returns.
What you're trying to do is really an rwlock, just use that. :)
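For example, something along these lines (vs_endpoint_rwsem is only a
sketch, and it would have to be an rw_semaphore rather than a spinning
rwlock, because the handler can sleep):

	/* reader, in vhost_scsi_handle_vq() */
	down_read(&vs->vs_endpoint_rwsem);
	if (!vs->vs_endpoint) {
		up_read(&vs->vs_endpoint_rwsem);
		return;
	}
	/* ... process the virtqueue ... */
	up_read(&vs->vs_endpoint_rwsem);

	/* writer, in vhost_scsi_set_endpoint()/clear_endpoint() */
	down_write(&vs->vs_endpoint_rwsem);
	vs->vs_endpoint = true;		/* or false */
	up_write(&vs->vs_endpoint_rwsem);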
Paolo
> + }
> ret = 0;
> } else {
> ret = -EEXIST;
> @@ -842,6 +863,8 @@ static int vhost_scsi_clear_endpoint(
> {
> struct tcm_vhost_tport *tv_tport;
> struct tcm_vhost_tpg *tv_tpg;
> + struct vhost_virtqueue *vq;
> + bool match = false;
> int index, ret, i;
> u8 target;
>
> @@ -877,9 +900,17 @@ static int vhost_scsi_clear_endpoint(
> }
> tv_tpg->tv_tpg_vhost_count--;
> vs->vs_tpg[target] = NULL;
> - vs->vs_endpoint = false;
> + match = true;
> mutex_unlock(&tv_tpg->tv_tpg_mutex);
> }
> + if (match) {
> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> + vq = &vs->vqs[i];
> + mutex_lock(&vq->mutex);
> + rcu_assign_pointer(vq->private_data, NULL);
> + mutex_unlock(&vq->mutex);
> + }
> + }
> mutex_unlock(&vs->dev.mutex);
> return 0;
>
>
* Re: [PATCH V2 3/3] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
From: Asias He @ 2013-03-14 2:07 UTC
To: Paolo Bonzini
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi
On Wed, Mar 13, 2013 at 09:56:41AM +0100, Paolo Bonzini wrote:
> On 13/03/2013 08:34, Asias He wrote:
> > Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> > not. It is set or cleared in vhost_scsi_set_endpoint() or
> > vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> > we check it in vhost_scsi_handle_vq(), we ignored the lock, this is
> > wrong.
> >
> > Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> > indicate the status of the endpoint, we use per virtqueue
> > vq->private_data to indicate it. In this way, we can only take the
> > vq->mutex lock which is per queue and make the concurrent multiqueue
> > process having less lock contention. Further, in the read side of
> > vq->private_data, we can even do not take only lock if it is accessed in
> > the vhost worker thread, because it is protected by "vhost rcu".
> >
> > Signed-off-by: Asias He <asias@redhat.com>
> > ---
> > drivers/vhost/tcm_vhost.c | 43 +++++++++++++++++++++++++++++++++++++------
> > 1 file changed, 37 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> > index 43fb11e..094fb10 100644
> > --- a/drivers/vhost/tcm_vhost.c
> > +++ b/drivers/vhost/tcm_vhost.c
> > @@ -67,7 +67,6 @@ struct vhost_scsi {
> > /* Protected by vhost_scsi->dev.mutex */
> > struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> > char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> > - bool vs_endpoint;
> >
> > struct vhost_dev dev;
> > struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> > @@ -91,6 +90,22 @@ static int iov_num_pages(struct iovec *iov)
> > ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> > }
> >
> > +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> > +{
> > + bool ret = false;
> > +
> > + /*
> > + * We can handle the vq only after the endpoint is setup by calling the
> > + * VHOST_SCSI_SET_ENDPOINT ioctl.
> > + *
> > + * TODO: check that we are running from vhost_worker?
> > + */
> > + if (rcu_dereference_check(vq->private_data, 1))
> > + ret = true;
> > +
> > + return ret;
> > +}
> > +
> > static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> > {
> > return 1;
> > @@ -581,8 +596,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> > int head, ret;
> > u8 target;
> >
> > - /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> > - if (unlikely(!vs->vs_endpoint))
> > + if (!tcm_vhost_check_endpoint(vq))
> > return;
>
> You would still need at least a rcu_read_lock/unlock (actually srcu,
> since vhost_scsi_handle_vq can sleep)...
See handle_tx() and handle_rx() in drivers/vhost/net.c:
/* Expects to be always run from workqueue - which acts as
* read-size critical section for our kind of RCU. */
This is how vhost works, no?
But, personally, I would prefer to use explicit locking instead of this
trick.
> > mutex_lock(&vq->mutex);
> > @@ -781,8 +795,9 @@ static int vhost_scsi_set_endpoint(
> > {
> > struct tcm_vhost_tport *tv_tport;
> > struct tcm_vhost_tpg *tv_tpg;
> > + struct vhost_virtqueue *vq;
> > bool match = false;
> > - int index, ret;
> > + int index, ret, i;
> >
> > mutex_lock(&vs->dev.mutex);
> > /* Verify that ring has been setup correctly. */
> > @@ -826,7 +841,13 @@ static int vhost_scsi_set_endpoint(
> > if (match) {
> > memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
> > sizeof(vs->vs_vhost_wwpn));
> > - vs->vs_endpoint = true;
> > + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > + vq = &vs->vqs[i];
> > + mutex_lock(&vq->mutex);
> > + rcu_assign_pointer(vq->private_data, vs);
> > + vhost_init_used(vq);
> > + mutex_unlock(&vq->mutex);
>
> ... and a synchronize_srcu here. But this is not correct use of RCU.
> To use RCU correctly, you need to _copy_ (that's the "C" in RCU) the
> whole vs structure on every set_endpoint or clear_endpoint operation,
> and free it after synchronize_srcu returns.
See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
/* We use a kind of RCU to access private pointer.
* All readers access it from worker, which makes it possible to
* flush the vhost_work instead of synchronize_rcu. Therefore readers do
* not need to call rcu_read_lock/rcu_read_unlock: the beginning of
* vhost_work execution acts instead of rcu_read_lock() and the end of
* vhost_work execution acts instead of rcu_read_unlock().
* Writers use virtqueue mutex. */
void __rcu *private_data;
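In other words, the pattern this patch relies on is roughly (a
simplified sketch of the code above, nothing new):

	/* reader: only ever runs in the vhost worker thread */
	if (!rcu_dereference_check(vq->private_data, 1))
		return;
	/* ... process requests ... */

	/* writer: publish or clear under vq->mutex */
	mutex_lock(&vq->mutex);
	rcu_assign_pointer(vq->private_data, vs);	/* or NULL */
	mutex_unlock(&vq->mutex);
	/* vhost_scsi_flush() then acts as synchronize_rcu() */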
> What you're trying to do is really an rwlock, just use that. :)
Yes, but the downside is that it introduces another lock.
> Paolo
>
> > + }
> > ret = 0;
> > } else {
> > ret = -EEXIST;
> > @@ -842,6 +863,8 @@ static int vhost_scsi_clear_endpoint(
> > {
> > struct tcm_vhost_tport *tv_tport;
> > struct tcm_vhost_tpg *tv_tpg;
> > + struct vhost_virtqueue *vq;
> > + bool match = false;
> > int index, ret, i;
> > u8 target;
> >
> > @@ -877,9 +900,17 @@ static int vhost_scsi_clear_endpoint(
> > }
> > tv_tpg->tv_tpg_vhost_count--;
> > vs->vs_tpg[target] = NULL;
> > - vs->vs_endpoint = false;
> > + match = true;
> > mutex_unlock(&tv_tpg->tv_tpg_mutex);
> > }
> > + if (match) {
> > + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> > + vq = &vs->vqs[i];
> > + mutex_lock(&vq->mutex);
> > + rcu_assign_pointer(vq->private_data, NULL);
> > + mutex_unlock(&vq->mutex);
> > + }
> > + }
> > mutex_unlock(&vs->dev.mutex);
> > return 0;
> >
> >
>
--
Asias
* Re: [PATCH V2 3/3] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
From: Paolo Bonzini @ 2013-03-14 8:37 UTC
To: Asias He
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi
On 14/03/2013 03:07, Asias He wrote:
> On Wed, Mar 13, 2013 at 09:56:41AM +0100, Paolo Bonzini wrote:
>> On 13/03/2013 08:34, Asias He wrote:
>>> Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
>>> not. It is set or cleared in vhost_scsi_set_endpoint() or
>>> vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
>>> we check it in vhost_scsi_handle_vq(), we ignored the lock, this is
>>> wrong.
>>>
>>> Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
>>> indicate the status of the endpoint, we use per virtqueue
>>> vq->private_data to indicate it. In this way, we can only take the
>>> vq->mutex lock which is per queue and make the concurrent multiqueue
>>> process having less lock contention. Further, in the read side of
>>> vq->private_data, we can even do not take only lock if it is accessed in
>>> the vhost worker thread, because it is protected by "vhost rcu".
>>>
>>> Signed-off-by: Asias He <asias@redhat.com>
>>> ---
>>> drivers/vhost/tcm_vhost.c | 43 +++++++++++++++++++++++++++++++++++++------
>>> 1 file changed, 37 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
>>> index 43fb11e..094fb10 100644
>>> --- a/drivers/vhost/tcm_vhost.c
>>> +++ b/drivers/vhost/tcm_vhost.c
>>> @@ -67,7 +67,6 @@ struct vhost_scsi {
>>> /* Protected by vhost_scsi->dev.mutex */
>>> struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
>>> char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
>>> - bool vs_endpoint;
>>>
>>> struct vhost_dev dev;
>>> struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
>>> @@ -91,6 +90,22 @@ static int iov_num_pages(struct iovec *iov)
>>> ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
>>> }
>>>
>>> +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
>>> +{
>>> + bool ret = false;
>>> +
>>> + /*
>>> + * We can handle the vq only after the endpoint is setup by calling the
>>> + * VHOST_SCSI_SET_ENDPOINT ioctl.
>>> + *
>>> + * TODO: check that we are running from vhost_worker?
>>> + */
>>> + if (rcu_dereference_check(vq->private_data, 1))
>>> + ret = true;
>>> +
>>> + return ret;
>>> +}
>>> +
>>> static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
>>> {
>>> return 1;
>>> @@ -581,8 +596,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
>>> int head, ret;
>>> u8 target;
>>>
>>> - /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
>>> - if (unlikely(!vs->vs_endpoint))
>>> + if (!tcm_vhost_check_endpoint(vq))
>>> return;
>>
>> You would still need at least a rcu_read_lock/unlock (actually srcu,
>> since vhost_scsi_handle_vq can sleep)...
>
> See handle_rx() and handle_rx() in drivers/vhost/net.c
>
> /* Expects to be always run from workqueue - which acts as
> * read-size critical section for our kind of RCU. */
>
> This is how vhost works, no?
>
> But, personally, I would prefer to use explicit locking instead of this
> trick.
>
>>> mutex_lock(&vq->mutex);
>>> @@ -781,8 +795,9 @@ static int vhost_scsi_set_endpoint(
>>> {
>>> struct tcm_vhost_tport *tv_tport;
>>> struct tcm_vhost_tpg *tv_tpg;
>>> + struct vhost_virtqueue *vq;
>>> bool match = false;
>>> - int index, ret;
>>> + int index, ret, i;
>>>
>>> mutex_lock(&vs->dev.mutex);
>>> /* Verify that ring has been setup correctly. */
>>> @@ -826,7 +841,13 @@ static int vhost_scsi_set_endpoint(
>>> if (match) {
>>> memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
>>> sizeof(vs->vs_vhost_wwpn));
>>> - vs->vs_endpoint = true;
>>> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
>>> + vq = &vs->vqs[i];
>>> + mutex_lock(&vq->mutex);
>>> + rcu_assign_pointer(vq->private_data, vs);
>>> + vhost_init_used(vq);
>>> + mutex_unlock(&vq->mutex);
>>
>> ... and a synchronize_srcu here. But this is not correct use of RCU.
>> To use RCU correctly, you need to _copy_ (that's the "C" in RCU) the
>> whole vs structure on every set_endpoint or clear_endpoint operation,
>> and free it after synchronize_srcu returns.
>
> See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
>
> /* We use a kind of RCU to access private pointer.
> * All readers access it from worker, which makes it possible to
> * flush the vhost_work instead of synchronize_rcu. Therefore readers do
> * not need to call rcu_read_lock/rcu_read_unlock: the beginning of
> * vhost_work execution acts instead of rcu_read_lock() and the end of
> * vhost_work execution acts instead of rcu_read_unlock().
> * Writers use virtqueue mutex. */
> void __rcu *private_data;
Aha, cool! But please add a comment.
>> What you're trying to do is really an rwlock, just use that. :)
>
> Yes, but the downside is that it introduces another lock.
Can't it replace the existing mutex?
Paolo
>
>> Paolo
>>
>>> + }
>>> ret = 0;
>>> } else {
>>> ret = -EEXIST;
>>> @@ -842,6 +863,8 @@ static int vhost_scsi_clear_endpoint(
>>> {
>>> struct tcm_vhost_tport *tv_tport;
>>> struct tcm_vhost_tpg *tv_tpg;
>>> + struct vhost_virtqueue *vq;
>>> + bool match = false;
>>> int index, ret, i;
>>> u8 target;
>>>
>>> @@ -877,9 +900,17 @@ static int vhost_scsi_clear_endpoint(
>>> }
>>> tv_tpg->tv_tpg_vhost_count--;
>>> vs->vs_tpg[target] = NULL;
>>> - vs->vs_endpoint = false;
>>> + match = true;
>>> mutex_unlock(&tv_tpg->tv_tpg_mutex);
>>> }
>>> + if (match) {
>>> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
>>> + vq = &vs->vqs[i];
>>> + mutex_lock(&vq->mutex);
>>> + rcu_assign_pointer(vq->private_data, NULL);
>>> + mutex_unlock(&vq->mutex);
>>> + }
>>> + }
>>> mutex_unlock(&vs->dev.mutex);
>>> return 0;
>>>
>>>
>>
>
* Re: [PATCH V2 3/3] tcm_vhost: Use vq->private_data to indicate if the endpoint is setup
From: Asias He @ 2013-03-14 9:15 UTC
To: Paolo Bonzini
Cc: kvm, Michael S. Tsirkin, virtualization, target-devel,
Stefan Hajnoczi
On Thu, Mar 14, 2013 at 09:37:23AM +0100, Paolo Bonzini wrote:
> On 14/03/2013 03:07, Asias He wrote:
> > On Wed, Mar 13, 2013 at 09:56:41AM +0100, Paolo Bonzini wrote:
> >> On 13/03/2013 08:34, Asias He wrote:
> >>> Currently, vs->vs_endpoint is used indicate if the endpoint is setup or
> >>> not. It is set or cleared in vhost_scsi_set_endpoint() or
> >>> vhost_scsi_clear_endpoint() under the vs->dev.mutex lock. However, when
> >>> we check it in vhost_scsi_handle_vq(), we ignored the lock, this is
> >>> wrong.
> >>>
> >>> Instead of using the vs->vs_endpoint and the vs->dev.mutex lock to
> >>> indicate the status of the endpoint, we use per virtqueue
> >>> vq->private_data to indicate it. In this way, we can only take the
> >>> vq->mutex lock which is per queue and make the concurrent multiqueue
> >>> process having less lock contention. Further, in the read side of
> >>> vq->private_data, we can even do not take only lock if it is accessed in
> >>> the vhost worker thread, because it is protected by "vhost rcu".
> >>>
> >>> Signed-off-by: Asias He <asias@redhat.com>
> >>> ---
> >>> drivers/vhost/tcm_vhost.c | 43 +++++++++++++++++++++++++++++++++++++------
> >>> 1 file changed, 37 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/drivers/vhost/tcm_vhost.c b/drivers/vhost/tcm_vhost.c
> >>> index 43fb11e..094fb10 100644
> >>> --- a/drivers/vhost/tcm_vhost.c
> >>> +++ b/drivers/vhost/tcm_vhost.c
> >>> @@ -67,7 +67,6 @@ struct vhost_scsi {
> >>> /* Protected by vhost_scsi->dev.mutex */
> >>> struct tcm_vhost_tpg *vs_tpg[VHOST_SCSI_MAX_TARGET];
> >>> char vs_vhost_wwpn[TRANSPORT_IQN_LEN];
> >>> - bool vs_endpoint;
> >>>
> >>> struct vhost_dev dev;
> >>> struct vhost_virtqueue vqs[VHOST_SCSI_MAX_VQ];
> >>> @@ -91,6 +90,22 @@ static int iov_num_pages(struct iovec *iov)
> >>> ((unsigned long)iov->iov_base & PAGE_MASK)) >> PAGE_SHIFT;
> >>> }
> >>>
> >>> +static bool tcm_vhost_check_endpoint(struct vhost_virtqueue *vq)
> >>> +{
> >>> + bool ret = false;
> >>> +
> >>> + /*
> >>> + * We can handle the vq only after the endpoint is setup by calling the
> >>> + * VHOST_SCSI_SET_ENDPOINT ioctl.
> >>> + *
> >>> + * TODO: check that we are running from vhost_worker?
> >>> + */
> >>> + if (rcu_dereference_check(vq->private_data, 1))
> >>> + ret = true;
> >>> +
> >>> + return ret;
> >>> +}
> >>> +
> >>> static int tcm_vhost_check_true(struct se_portal_group *se_tpg)
> >>> {
> >>> return 1;
> >>> @@ -581,8 +596,7 @@ static void vhost_scsi_handle_vq(struct vhost_scsi *vs,
> >>> int head, ret;
> >>> u8 target;
> >>>
> >>> - /* Must use ioctl VHOST_SCSI_SET_ENDPOINT */
> >>> - if (unlikely(!vs->vs_endpoint))
> >>> + if (!tcm_vhost_check_endpoint(vq))
> >>> return;
> >>
> >> You would still need at least a rcu_read_lock/unlock (actually srcu,
> >> since vhost_scsi_handle_vq can sleep)...
> >
> > See handle_rx() and handle_rx() in drivers/vhost/net.c
> >
> > /* Expects to be always run from workqueue - which acts as
> > * read-size critical section for our kind of RCU. */
> >
> > This is how vhost works, no?
> >
> > But, personally, I would prefer to use explicit locking instead of this
> > trick.
> >
> >>> mutex_lock(&vq->mutex);
> >>> @@ -781,8 +795,9 @@ static int vhost_scsi_set_endpoint(
> >>> {
> >>> struct tcm_vhost_tport *tv_tport;
> >>> struct tcm_vhost_tpg *tv_tpg;
> >>> + struct vhost_virtqueue *vq;
> >>> bool match = false;
> >>> - int index, ret;
> >>> + int index, ret, i;
> >>>
> >>> mutex_lock(&vs->dev.mutex);
> >>> /* Verify that ring has been setup correctly. */
> >>> @@ -826,7 +841,13 @@ static int vhost_scsi_set_endpoint(
> >>> if (match) {
> >>> memcpy(vs->vs_vhost_wwpn, t->vhost_wwpn,
> >>> sizeof(vs->vs_vhost_wwpn));
> >>> - vs->vs_endpoint = true;
> >>> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> >>> + vq = &vs->vqs[i];
> >>> + mutex_lock(&vq->mutex);
> >>> + rcu_assign_pointer(vq->private_data, vs);
> >>> + vhost_init_used(vq);
> >>> + mutex_unlock(&vq->mutex);
> >>
> >> ... and a synchronize_srcu here. But this is not correct use of RCU.
> >> To use RCU correctly, you need to _copy_ (that's the "C" in RCU) the
> >> whole vs structure on every set_endpoint or clear_endpoint operation,
> >> and free it after synchronize_srcu returns.
> >
> > See the comments in struct vhost_virtqueue in drivers/vhost/vhost.h
> >
> > /* We use a kind of RCU to access private pointer.
> > * All readers access it from worker, which makes it possible to
> > * flush the vhost_work instead of synchronize_rcu. Therefore readers do
> > * not need to call rcu_read_lock/rcu_read_unlock: the beginning of
> > * vhost_work execution acts instead of rcu_read_lock() and the end of
> > * vhost_work execution acts instead of rcu_read_unlock().
> > * Writers use virtqueue mutex. */
> > void __rcu *private_data;
>
> Aha, cool! But please add a comment.
Okay.
> >> What you're trying to do is really an rwlock, just use that. :)
> >
> > Yes, but the downside is that it introduces another lock.
>
> Can't it can replace the existing mutex?
Do you mean vs->dev.mutex or vq->mutex? In both cases, we still need
them.
Anyway, if the current model works, we do not need the rwlock.
> Paolo
>
> >
> >> Paolo
> >>
> >>> + }
> >>> ret = 0;
> >>> } else {
> >>> ret = -EEXIST;
> >>> @@ -842,6 +863,8 @@ static int vhost_scsi_clear_endpoint(
> >>> {
> >>> struct tcm_vhost_tport *tv_tport;
> >>> struct tcm_vhost_tpg *tv_tpg;
> >>> + struct vhost_virtqueue *vq;
> >>> + bool match = false;
> >>> int index, ret, i;
> >>> u8 target;
> >>>
> >>> @@ -877,9 +900,17 @@ static int vhost_scsi_clear_endpoint(
> >>> }
> >>> tv_tpg->tv_tpg_vhost_count--;
> >>> vs->vs_tpg[target] = NULL;
> >>> - vs->vs_endpoint = false;
> >>> + match = true;
> >>> mutex_unlock(&tv_tpg->tv_tpg_mutex);
> >>> }
> >>> + if (match) {
> >>> + for (i = 0; i < VHOST_SCSI_MAX_VQ; i++) {
> >>> + vq = &vs->vqs[i];
> >>> + mutex_lock(&vq->mutex);
> >>> + rcu_assign_pointer(vq->private_data, NULL);
> >>> + mutex_unlock(&vq->mutex);
> >>> + }
> >>> + }
> >>> mutex_unlock(&vs->dev.mutex);
> >>> return 0;
> >>>
> >>>
> >>
> >
>
--
Asias