* [PATCH] ib/core: Protect QP mcast list
@ 2011-12-19 7:09 Eli Cohen
2011-12-19 21:54 ` Roland Dreier
2011-12-20 23:28 ` Hefty, Sean
0 siblings, 2 replies; 6+ messages in thread
From: Eli Cohen @ 2011-12-19 7:09 UTC (permalink / raw)
To: roland-BHEL68pLQRGGvPXPguhicg; +Cc: RDMA list, miked-VPRAkNaXOzVWk0Htik3J/w
Multicast attach/detach operations on a QP are carried out under the QP's
readers/writer lock. That protection is not sufficient, since a read lock
allows more than one reader to hold the lock at once, while manipulating the
list requires exclusive access. Use a spinlock to provide the needed
protection.
We have hit a kernel oops when running applications that perform multicast
joins/leaves. This patch solves the issue.
Reported-by: Mike Dubman <miked-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
drivers/infiniband/core/uverbs_cmd.c | 11 ++++++++++-
include/rdma/ib_verbs.h | 1 +
2 files changed, 11 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index c426992..0209292 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1132,6 +1132,7 @@ ssize_t ib_uverbs_create_qp(struct ib_uverbs_file *file,
if (ret)
goto err_destroy;
+ spin_lock_init(&qp->mcast_lock);
memset(&resp, 0, sizeof resp);
resp.qpn = qp->qp_num;
resp.qp_handle = obj->uevent.uobject.id;
@@ -1910,12 +1911,15 @@ ssize_t ib_uverbs_attach_mcast(struct ib_uverbs_file *file,
obj = container_of(qp->uobject, struct ib_uqp_object, uevent.uobject);
+ spin_lock(&qp->mcast_lock);
list_for_each_entry(mcast, &obj->mcast_list, list)
if (cmd.mlid == mcast->lid &&
!memcmp(cmd.gid, mcast->gid.raw, sizeof mcast->gid.raw)) {
ret = 0;
+ spin_unlock(&qp->mcast_lock);
goto out_put;
}
+ spin_unlock(&qp->mcast_lock);
mcast = kmalloc(sizeof *mcast, GFP_KERNEL);
if (!mcast) {
@@ -1927,8 +1931,11 @@ ssize_t ib_uverbs_attach_mcast(struct ib_uverbs_file *file,
memcpy(mcast->gid.raw, cmd.gid, sizeof mcast->gid.raw);
ret = ib_attach_mcast(qp, &mcast->gid, cmd.mlid);
- if (!ret)
+ if (!ret) {
+ spin_lock(&qp->mcast_lock);
list_add_tail(&mcast->list, &obj->mcast_list);
+ spin_unlock(&qp->mcast_lock);
+ }
else
kfree(mcast);
@@ -1961,6 +1968,7 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
obj = container_of(qp->uobject, struct ib_uqp_object, uevent.uobject);
+ spin_lock(&qp->mcast_lock);
list_for_each_entry(mcast, &obj->mcast_list, list)
if (cmd.mlid == mcast->lid &&
!memcmp(cmd.gid, mcast->gid.raw, sizeof mcast->gid.raw)) {
@@ -1968,6 +1976,7 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file *file,
kfree(mcast);
break;
}
+ spin_unlock(&qp->mcast_lock);
out_put:
put_qp_read(qp);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 55cd0a0..bf86750 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -895,6 +895,7 @@ struct ib_qp {
void *qp_context;
u32 qp_num;
enum ib_qp_type qp_type;
+ spinlock_t mcast_lock;
};
struct ib_mr {
--
1.7.8
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH] ib/core: Protect QP mcast list
2011-12-19 7:09 [PATCH] ib/core: Protect QP mcast list Eli Cohen
@ 2011-12-19 21:54 ` Roland Dreier
[not found] ` <CAL1RGDUpqz+26fKcHj=0NbNqU_8vYV6zOresyYzFZTasGwD_oA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-12-20 23:28 ` Hefty, Sean
1 sibling, 1 reply; 6+ messages in thread
From: Roland Dreier @ 2011-12-19 21:54 UTC (permalink / raw)
To: Eli Cohen; +Cc: RDMA list, miked-VPRAkNaXOzVWk0Htik3J/w
On Sun, Dec 18, 2011 at 11:09 PM, Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> Multicast attach/detach operations on a QP are carried under the
> readers'/writers' lock of the QP. Such protection is not sufficient since a
> reader's lock allows more than one reader to acquire the lock. However, list
> manipulation implies write operations on the list variable. Use a spinlock to
> provide protection.
Did you consider taking the QP's rwsem for writing across these operations
instead? How did that approach compare?
Assuming there's some problem with that, would it make sense to put
the mcast_lock next to the mcast_list in struct ib_uqp_object, instead of
hiding it in struct ib_qp?
- R.
* Re: [PATCH] ib/core: Protect QP mcast list
[not found] ` <CAL1RGDUpqz+26fKcHj=0NbNqU_8vYV6zOresyYzFZTasGwD_oA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-12-20 7:19 ` Eli Cohen
[not found] ` <CAL3tnx7u9_Yrqr0QZ=c2tKnJtyFe97XR2QKh0=DQuP77XYn9Kg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Eli Cohen @ 2011-12-20 7:19 UTC (permalink / raw)
To: Roland Dreier; +Cc: RDMA list, miked-VPRAkNaXOzVWk0Htik3J/w
On Mon, Dec 19, 2011 at 11:54 PM, Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org> wrote:
>
> Did you consider taking the QP's rwsem for writing across these operations
> instead? How did that approach compare?
>
I considered this, but that would serialize attach/detach
operations in the IB core. Using a spinlock to protect the list allows
more concurrency. After all, we hit this bug because such operations do
run concurrently in real-life applications.
> Assuming there's some problem with that, would it make sense to put
> the mcast_lock next to the mcast_list in struct ib_uqp_object, instead of
> hiding it in struct ib_qp?
>
Yes, that makes more sense to me.
If it is agreed to use a spinlock and put it next to mcast_list, I will
prepare another patch.
* RE: [PATCH] ib/core: Protect QP mcast list
2011-12-19 7:09 [PATCH] ib/core: Protect QP mcast list Eli Cohen
2011-12-19 21:54 ` Roland Dreier
@ 2011-12-20 23:28 ` Hefty, Sean
1 sibling, 0 replies; 6+ messages in thread
From: Hefty, Sean @ 2011-12-20 23:28 UTC (permalink / raw)
To: Eli Cohen, roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org
Cc: RDMA list, miked-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
> @@ -1910,12 +1911,15 @@ ssize_t ib_uverbs_attach_mcast(struct ib_uverbs_file
> *file,
>
> obj = container_of(qp->uobject, struct ib_uqp_object, uevent.uobject);
>
> + spin_lock(&qp->mcast_lock);
> list_for_each_entry(mcast, &obj->mcast_list, list)
> if (cmd.mlid == mcast->lid &&
> !memcmp(cmd.gid, mcast->gid.raw, sizeof mcast->gid.raw)) {
> ret = 0;
> + spin_unlock(&qp->mcast_lock);
> goto out_put;
> }
> + spin_unlock(&qp->mcast_lock);
Does the lower-level code protect against trying to join the same group twice? The locking ends up looking racy here, with the lock being dropped between checking the list and adding the new entry.
> @@ -1961,6 +1968,7 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file
> *file,
>
> obj = container_of(qp->uobject, struct ib_uqp_object, uevent.uobject);
>
> + spin_lock(&qp->mcast_lock);
> list_for_each_entry(mcast, &obj->mcast_list, list)
> if (cmd.mlid == mcast->lid &&
> !memcmp(cmd.gid, mcast->gid.raw, sizeof mcast->gid.raw)) {
> @@ -1968,6 +1976,7 @@ ssize_t ib_uverbs_detach_mcast(struct ib_uverbs_file
> *file,
> kfree(mcast);
> break;
> }
> + spin_unlock(&qp->mcast_lock);
I wonder if the uverbs code should prevent the user from trying to join a group twice and from leaving a group that it hasn't joined. This is a separate question from these patches, but it would affect how the locking is held.
- Sean
* Re: [PATCH] ib/core: Protect QP mcast list
[not found] ` <CAL3tnx7u9_Yrqr0QZ=c2tKnJtyFe97XR2QKh0=DQuP77XYn9Kg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-12-21 0:36 ` Roland Dreier
[not found] ` <CAL1RGDUdrAK20fTOQUTgBHw-PnPjkojC8hE7oAePd=NCLSnGrg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 6+ messages in thread
From: Roland Dreier @ 2011-12-21 0:36 UTC (permalink / raw)
To: Eli Cohen; +Cc: RDMA list, miked-VPRAkNaXOzVWk0Htik3J/w
On Mon, Dec 19, 2011 at 11:19 PM, Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> I considered this but that means that you serialize attach/detach
> operations at ib core. Using a spinlock to protect the list allows
> more concurrency. After all, we hit this bug since concurrency of such
> operations occur in real life applications.
Looking closer at it, I don't see how we can avoid serializing multicast
operations.
At least, I don't see how your proposed patch can work for all the crazy
things userspace might do.
For example, suppose two threads join the same QP to the same MCG at
the same time. In that case it might happen that both threads check the
list and don't find the group there, since you drop the lock before the
actual low-level verbs attach (and you don't and can't hold the lock across
that sleeping call).
In that case both calls to attach will succeed (the underlying verb is
idempotent) and the membership info will be added to the list twice.
Then all sorts of problems ensue, e.g. a detach will succeed but only
remove one list entry, and so a subsequent re-attach will silently do
nothing.
Also, I'm sure racing attach/detach can get into trouble too, e.g. a
detach succeeds but the list entry ends up still on the list.
I don't see any obvious way to close all these holes except to
make sure that adding/removing a list entry is done atomically
along with the actual attach/detach operation -- i.e. hold some
sort of sleepable lock (like an rwsem) across checking the list,
doing the attach/detach, and doing the list add/remove.
- R.
* Re: [PATCH] ib/core: Protect QP mcast list
[not found] ` <CAL1RGDUdrAK20fTOQUTgBHw-PnPjkojC8hE7oAePd=NCLSnGrg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-12-21 7:10 ` Eli Cohen
0 siblings, 0 replies; 6+ messages in thread
From: Eli Cohen @ 2011-12-21 7:10 UTC (permalink / raw)
To: Roland Dreier; +Cc: RDMA list, miked-VPRAkNaXOzVWk0Htik3J/w
On Tue, Dec 20, 2011 at 04:36:35PM -0800, Roland Dreier wrote:
> On Mon, Dec 19, 2011 at 11:19 PM, Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> > I considered this but that means that you serialize attach/detach
> > operations at ib core. Using a spinlock to protect the list allows
> > more concurrency. After all, we hit this bug since concurrency of such
> > operations occur in real life applications.
>
> Looking closer at it, I don't see how we can avoid serializing multicast
> operations.
>
> At least, I don't see how your proposed patch can work for all the crazy
> things userspace might do.
>
> For example suppose two threads join the same QP to the same MCG at
> the same time. In that case it might happen that both threads check the
> list and don't find the group there, since you drop the lock before the
> actual low-level verbs attach (and you don't and can't hold the lock across
> that sleeping call).
>
> In that case both calls to attach will succeed (underlying verb is
> idempotent) and the membership info will be added to the list twice.
>
> Then all sorts of problems ensue, eg a detach will succeed but only
> remove one list entry, and so a subsequent re-attach will silently do
> nothing.
>
> Also I'm sure racing attach/detach can get into trouble too, eg
> detach succeeds but list entry ends up still on the list.
>
> I don't see any obvious way to close all these holes except to
> make sure that adding/removing a list entry is done atomically
> along with the actual attach/detach operation -- ie hold some
> sort of sleepable lock (like an rwsem) across checking the list,
> doing the attach/detach and doing the list add/remove.
>
I see...
I think there is not much point in introducing a new semaphore; we can
simply use the existing rw_semaphore. This requires defining
idr_write_qp and put_qp_write. I will prepare a patch and run some
testing.