* [PATCH 1/2] rdma/cm: fix handling of ipv6 addressing in cma_use_port
@ 2011-04-16 6:42 Hefty, Sean
[not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF25DCC49B2B-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Hefty, Sean @ 2011-04-16 6:42 UTC (permalink / raw)
To: linux-rdma
cma_use_port is coded assuming that the sockaddr is an IPv4 address.
Since IPv6 addressing is supported, and to prepare for other address
families, make the address handling more generic.
Signed-off-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
This patch was originally submitted as patch 2 of the AF_IB patch set;
however, it is also needed by the REUSEADDR support in patch 2/2.
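As an illustration (userspace sketch only, not part of the patch), the
family-agnostic comparison that cma_addr_cmp() provides can be mimicked with
standard sockaddr handling; memcmp() stands in here for the kernel's
ipv6_addr_cmp():

```c
/* Userspace analogue of cma_addr_cmp(): returns 0 when the two
 * addresses match, non-zero otherwise.  memcmp() stands in for the
 * kernel's ipv6_addr_cmp(); unknown families are treated as a
 * mismatch. */
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

static int addr_cmp(const struct sockaddr *src, const struct sockaddr *dst)
{
	if (src->sa_family != dst->sa_family)
		return -1;

	switch (src->sa_family) {
	case AF_INET:
		return ((const struct sockaddr_in *) src)->sin_addr.s_addr !=
		       ((const struct sockaddr_in *) dst)->sin_addr.s_addr;
	case AF_INET6:
		return memcmp(&((const struct sockaddr_in6 *) src)->sin6_addr,
			      &((const struct sockaddr_in6 *) dst)->sin6_addr,
			      sizeof(struct in6_addr));
	default:
		return -1;
	}
}
```

Note that the kernel version falls through to the IPv6 comparison in its
default case; the sketch above is stricter about unknown families.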
drivers/infiniband/core/cma.c | 29 ++++++++++++++++++++++-------
1 files changed, 22 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 6884da2..a1b1e27 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -708,6 +708,21 @@ static inline int cma_any_addr(struct sockaddr *addr)
return cma_zero_addr(addr) || cma_loopback_addr(addr);
}
+static int cma_addr_cmp(struct sockaddr *src, struct sockaddr *dst)
+{
+ if (src->sa_family != dst->sa_family)
+ return -1;
+
+ switch (src->sa_family) {
+ case AF_INET:
+ return ((struct sockaddr_in *) src)->sin_addr.s_addr !=
+ ((struct sockaddr_in *) dst)->sin_addr.s_addr;
+ default:
+ return ipv6_addr_cmp(&((struct sockaddr_in6 *) src)->sin6_addr,
+ &((struct sockaddr_in6 *) dst)->sin6_addr);
+ }
+}
+
static inline __be16 cma_port(struct sockaddr *addr)
{
if (addr->sa_family == AF_INET)
@@ -2159,13 +2174,13 @@ retry:
static int cma_use_port(struct idr *ps, struct rdma_id_private *id_priv)
{
struct rdma_id_private *cur_id;
- struct sockaddr_in *sin, *cur_sin;
+ struct sockaddr *addr, *cur_addr;
struct rdma_bind_list *bind_list;
struct hlist_node *node;
unsigned short snum;
- sin = (struct sockaddr_in *) &id_priv->id.route.addr.src_addr;
- snum = ntohs(sin->sin_port);
+ addr = (struct sockaddr *) &id_priv->id.route.addr.src_addr;
+ snum = ntohs(cma_port(addr));
if (snum < PROT_SOCK && !capable(CAP_NET_BIND_SERVICE))
return -EACCES;
@@ -2177,15 +2192,15 @@ static int cma_use_port(struct idr *ps, struct rdma_id_private *id_priv)
* We don't support binding to any address if anyone is bound to
* a specific address on the same port.
*/
- if (cma_any_addr((struct sockaddr *) &id_priv->id.route.addr.src_addr))
+ if (cma_any_addr(addr))
return -EADDRNOTAVAIL;
hlist_for_each_entry(cur_id, node, &bind_list->owners, node) {
- if (cma_any_addr((struct sockaddr *) &cur_id->id.route.addr.src_addr))
+ cur_addr = (struct sockaddr *) &cur_id->id.route.addr.src_addr;
+ if (cma_any_addr(cur_addr))
return -EADDRNOTAVAIL;
- cur_sin = (struct sockaddr_in *) &cur_id->id.route.addr.src_addr;
- if (sin->sin_addr.s_addr == cur_sin->sin_addr.s_addr)
+ if (!cma_addr_cmp(addr, cur_addr))
return -EADDRINUSE;
}
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* [PATCH 2/2] rdma/cm: Support REUSEADDR
[not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF25DCC49B2B-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2011-04-16 6:46 ` Hefty, Sean
[not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF25DCC49B2C-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2011-05-10 21:30 ` [PATCH 1/2] rdma/cm: fix handling of ipv6 addressing in cma_use_port Ira Weiny
1 sibling, 1 reply; 5+ messages in thread
From: Hefty, Sean @ 2011-04-16 6:46 UTC (permalink / raw)
To: linux-rdma; +Cc: Ira Weiny
Lustre requires that clients bind to a privileged port number before
connecting to a remote server. On larger clusters (typically more than
about 1000 nodes), the number of privileged ports is exhausted,
resulting in Lustre being unusable.
To address this limitation, we add support for reusable addresses
to the rdma_cm. This mimics the behavior of the socket option
SO_REUSEADDR. A user may set an rdma_cm_id to reuse an address
before calling rdma_bind_addr (explicitly or implicitly). If set,
other rdma_cm_id's may be bound to the same address, provided that
they all have reuse enabled, and there are no active listens.
If rdma_listen is called on an rdma_cm_id that has reuse enabled,
it will only succeed if there are no other ids bound to that same
address. The reuse option is exported to user space. The behavior
of the kernel reuse implementation was verified against that of
sockets.
This patch is derived from a patch by: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
Signed-off-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
Ira, can you please verify that these patches work for you?
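For reference, the socket behavior being mimicked can be demonstrated with a
small userspace sketch (illustrative only; the helper and port numbers below
are mine, not part of the patch): two TCP sockets may bind the same
address/port only when both set SO_REUSEADDR before bind() and neither is
listening.

```c
/* Userspace demonstration of the SO_REUSEADDR semantics this patch
 * brings to the rdma_cm.  Binds a TCP socket to 127.0.0.1:port,
 * optionally setting SO_REUSEADDR first; returns the fd, or -1 on
 * failure (e.g. EADDRINUSE). */
#include <netinet/in.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <unistd.h>

static int bind_port(unsigned short port, int reuse)
{
	struct sockaddr_in addr = {
		.sin_family = AF_INET,
		.sin_port = htons(port),
		.sin_addr.s_addr = htonl(INADDR_LOOPBACK),
	};
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR,
		       &reuse, sizeof(reuse)) < 0 ||
	    bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

With reuse=1 on both sockets a second bind_port() on the same port succeeds;
with reuse=0 it fails with EADDRINUSE, matching the rules cma_check_port()
enforces.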
drivers/infiniband/core/cma.c | 190 ++++++++++++++++++++++++++--------------
drivers/infiniband/core/ucma.c | 7 +
include/rdma/rdma_cm.h | 11 ++
include/rdma/rdma_user_cm.h | 5 +
4 files changed, 144 insertions(+), 69 deletions(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index a1b1e27..0590a4d 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -148,6 +148,7 @@ struct rdma_id_private {
u32 qp_num;
u8 srq;
u8 tos;
+ u8 reuseaddr;
};
struct cma_multicast {
@@ -1561,50 +1562,6 @@ static void cma_listen_on_all(struct rdma_id_private *id_priv)
mutex_unlock(&lock);
}
-int rdma_listen(struct rdma_cm_id *id, int backlog)
-{
- struct rdma_id_private *id_priv;
- int ret;
-
- id_priv = container_of(id, struct rdma_id_private, id);
- if (id_priv->state == CMA_IDLE) {
- ((struct sockaddr *) &id->route.addr.src_addr)->sa_family = AF_INET;
- ret = rdma_bind_addr(id, (struct sockaddr *) &id->route.addr.src_addr);
- if (ret)
- return ret;
- }
-
- if (!cma_comp_exch(id_priv, CMA_ADDR_BOUND, CMA_LISTEN))
- return -EINVAL;
-
- id_priv->backlog = backlog;
- if (id->device) {
- switch (rdma_node_get_transport(id->device->node_type)) {
- case RDMA_TRANSPORT_IB:
- ret = cma_ib_listen(id_priv);
- if (ret)
- goto err;
- break;
- case RDMA_TRANSPORT_IWARP:
- ret = cma_iw_listen(id_priv, backlog);
- if (ret)
- goto err;
- break;
- default:
- ret = -ENOSYS;
- goto err;
- }
- } else
- cma_listen_on_all(id_priv);
-
- return 0;
-err:
- id_priv->backlog = 0;
- cma_comp_exch(id_priv, CMA_LISTEN, CMA_ADDR_BOUND);
- return ret;
-}
-EXPORT_SYMBOL(rdma_listen);
-
void rdma_set_service_type(struct rdma_cm_id *id, int tos)
{
struct rdma_id_private *id_priv;
@@ -2096,6 +2053,25 @@ err:
}
EXPORT_SYMBOL(rdma_resolve_addr);
+int rdma_set_reuseaddr(struct rdma_cm_id *id, int reuse)
+{
+ struct rdma_id_private *id_priv;
+ unsigned long flags;
+ int ret;
+
+ id_priv = container_of(id, struct rdma_id_private, id);
+ spin_lock_irqsave(&id_priv->lock, flags);
+ if (id_priv->state == CMA_IDLE) {
+ id_priv->reuseaddr = reuse;
+ ret = 0;
+ } else {
+ ret = -EINVAL;
+ }
+ spin_unlock_irqrestore(&id_priv->lock, flags);
+ return ret;
+}
+EXPORT_SYMBOL(rdma_set_reuseaddr);
+
static void cma_bind_port(struct rdma_bind_list *bind_list,
struct rdma_id_private *id_priv)
{
@@ -2171,43 +2147,73 @@ retry:
return -EADDRNOTAVAIL;
}
-static int cma_use_port(struct idr *ps, struct rdma_id_private *id_priv)
+/*
+ * Check that the requested port is available. This is called when trying to
+ * bind to a specific port, or when trying to listen on a bound port. In
+ * the latter case, the provided id_priv may already be on the bind_list, but
+ * we still need to check that it's okay to start listening.
+ */
+static int cma_check_port(struct rdma_bind_list *bind_list,
+ struct rdma_id_private *id_priv, uint8_t reuseaddr)
{
struct rdma_id_private *cur_id;
struct sockaddr *addr, *cur_addr;
- struct rdma_bind_list *bind_list;
struct hlist_node *node;
- unsigned short snum;
addr = (struct sockaddr *) &id_priv->id.route.addr.src_addr;
- snum = ntohs(cma_port(addr));
- if (snum < PROT_SOCK && !capable(CAP_NET_BIND_SERVICE))
- return -EACCES;
-
- bind_list = idr_find(ps, snum);
- if (!bind_list)
- return cma_alloc_port(ps, id_priv, snum);
-
- /*
- * We don't support binding to any address if anyone is bound to
- * a specific address on the same port.
- */
- if (cma_any_addr(addr))
+ if (cma_any_addr(addr) && !reuseaddr)
return -EADDRNOTAVAIL;
hlist_for_each_entry(cur_id, node, &bind_list->owners, node) {
- cur_addr = (struct sockaddr *) &cur_id->id.route.addr.src_addr;
- if (cma_any_addr(cur_addr))
- return -EADDRNOTAVAIL;
+ if (id_priv == cur_id)
+ continue;
- if (!cma_addr_cmp(addr, cur_addr))
- return -EADDRINUSE;
- }
+ if ((cur_id->state == CMA_LISTEN) ||
+ !reuseaddr || !cur_id->reuseaddr) {
+ cur_addr = (struct sockaddr *) &cur_id->id.route.addr.src_addr;
+ if (cma_any_addr(cur_addr))
+ return -EADDRNOTAVAIL;
- cma_bind_port(bind_list, id_priv);
+ if (!cma_addr_cmp(addr, cur_addr))
+ return -EADDRINUSE;
+ }
+ }
return 0;
}
+static int cma_use_port(struct idr *ps, struct rdma_id_private *id_priv)
+{
+ struct rdma_bind_list *bind_list;
+ unsigned short snum;
+ int ret;
+
+ snum = ntohs(cma_port((struct sockaddr *) &id_priv->id.route.addr.src_addr));
+ if (snum < PROT_SOCK && !capable(CAP_NET_BIND_SERVICE))
+ return -EACCES;
+
+ bind_list = idr_find(ps, snum);
+ if (!bind_list) {
+ ret = cma_alloc_port(ps, id_priv, snum);
+ } else {
+ ret = cma_check_port(bind_list, id_priv, id_priv->reuseaddr);
+ if (!ret)
+ cma_bind_port(bind_list, id_priv);
+ }
+ return ret;
+}
+
+static int cma_bind_listen(struct rdma_id_private *id_priv)
+{
+ struct rdma_bind_list *bind_list = id_priv->bind_list;
+ int ret = 0;
+
+ mutex_lock(&lock);
+ if (bind_list->owners.first->next)
+ ret = cma_check_port(bind_list, id_priv, 0);
+ mutex_unlock(&lock);
+ return ret;
+}
+
static int cma_get_port(struct rdma_id_private *id_priv)
{
struct idr *ps;
@@ -2259,6 +2265,56 @@ static int cma_check_linklocal(struct rdma_dev_addr *dev_addr,
return 0;
}
+int rdma_listen(struct rdma_cm_id *id, int backlog)
+{
+ struct rdma_id_private *id_priv;
+ int ret;
+
+ id_priv = container_of(id, struct rdma_id_private, id);
+ if (id_priv->state == CMA_IDLE) {
+ ((struct sockaddr *) &id->route.addr.src_addr)->sa_family = AF_INET;
+ ret = rdma_bind_addr(id, (struct sockaddr *) &id->route.addr.src_addr);
+ if (ret)
+ return ret;
+ }
+
+ if (!cma_comp_exch(id_priv, CMA_ADDR_BOUND, CMA_LISTEN))
+ return -EINVAL;
+
+ if (id_priv->reuseaddr) {
+ ret = cma_bind_listen(id_priv);
+ if (ret)
+ goto err;
+ }
+
+ id_priv->backlog = backlog;
+ if (id->device) {
+ switch (rdma_node_get_transport(id->device->node_type)) {
+ case RDMA_TRANSPORT_IB:
+ ret = cma_ib_listen(id_priv);
+ if (ret)
+ goto err;
+ break;
+ case RDMA_TRANSPORT_IWARP:
+ ret = cma_iw_listen(id_priv, backlog);
+ if (ret)
+ goto err;
+ break;
+ default:
+ ret = -ENOSYS;
+ goto err;
+ }
+ } else
+ cma_listen_on_all(id_priv);
+
+ return 0;
+err:
+ id_priv->backlog = 0;
+ cma_comp_exch(id_priv, CMA_LISTEN, CMA_ADDR_BOUND);
+ return ret;
+}
+EXPORT_SYMBOL(rdma_listen);
+
int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr)
{
struct rdma_id_private *id_priv;
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index ec1e9da..b3fa798 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -883,6 +883,13 @@ static int ucma_set_option_id(struct ucma_context *ctx, int optname,
}
rdma_set_service_type(ctx->cm_id, *((u8 *) optval));
break;
+ case RDMA_OPTION_ID_REUSEADDR:
+ if (optlen != sizeof(int)) {
+ ret = -EINVAL;
+ break;
+ }
+ ret = rdma_set_reuseaddr(ctx->cm_id, *((int *) optval) ? 1 : 0);
+ break;
default:
ret = -ENOSYS;
}
diff --git a/include/rdma/rdma_cm.h b/include/rdma/rdma_cm.h
index 4fae903..2cb5e7f 100644
--- a/include/rdma/rdma_cm.h
+++ b/include/rdma/rdma_cm.h
@@ -329,4 +329,15 @@ void rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr);
*/
void rdma_set_service_type(struct rdma_cm_id *id, int tos);
+/**
+ * rdma_set_reuseaddr - Allow the reuse of local addresses when binding
+ * the rdma_cm_id.
+ * @id: Communication identifier to configure.
+ * @reuse: Value indicating if the bound address is reusable.
+ *
+ * Reuse must be set before an address is bound to the id.
+ */
+int rdma_set_reuseaddr(struct rdma_cm_id *id, int reuse);
+
+
#endif /* RDMA_CM_H */
diff --git a/include/rdma/rdma_user_cm.h b/include/rdma/rdma_user_cm.h
index 1d16502..fc82c18 100644
--- a/include/rdma/rdma_user_cm.h
+++ b/include/rdma/rdma_user_cm.h
@@ -221,8 +221,9 @@ enum {
/* Option details */
enum {
- RDMA_OPTION_ID_TOS = 0,
- RDMA_OPTION_IB_PATH = 1
+ RDMA_OPTION_ID_TOS = 0,
+ RDMA_OPTION_ID_REUSEADDR = 1,
+ RDMA_OPTION_IB_PATH = 1
};
struct rdma_ucm_set_option {
--
* Re: [PATCH 2/2] rdma/cm: Support REUSEADDR
[not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF25DCC49B2C-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2011-04-29 16:12 ` Roland Dreier
2011-05-10 21:30 ` Ira Weiny
1 sibling, 0 replies; 5+ messages in thread
From: Roland Dreier @ 2011-04-29 16:12 UTC (permalink / raw)
To: Hefty, Sean; +Cc: linux-rdma, Ira Weiny
Thanks, I queued up both the ipv6 patch and this one.
* Re: [PATCH 1/2] rdma/cm: fix handling of ipv6 addressing in cma_use_port
[not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF25DCC49B2B-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2011-04-16 6:46 ` [PATCH 2/2] rdma/cm: Support REUSEADDR Hefty, Sean
@ 2011-05-10 21:30 ` Ira Weiny
1 sibling, 0 replies; 5+ messages in thread
From: Ira Weiny @ 2011-05-10 21:30 UTC (permalink / raw)
To: Hefty, Sean; +Cc: linux-rdma
On Fri, 15 Apr 2011 23:42:48 -0700
"Hefty, Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> cma_use_port is coded assuming that the sockaddr is an ipv4 address.
> Since ipv6 addressing is supported, and also to support other address
> families, make the code more generic in its address handling.
>
> Signed-off-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Acked-by: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
* Re: [PATCH 2/2] rdma/cm: Support REUSEADDR
[not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF25DCC49B2C-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2011-04-29 16:12 ` Roland Dreier
@ 2011-05-10 21:30 ` Ira Weiny
1 sibling, 0 replies; 5+ messages in thread
From: Ira Weiny @ 2011-05-10 21:30 UTC (permalink / raw)
To: Hefty, Sean; +Cc: linux-rdma, Christopher J. Morrone
On Fri, 15 Apr 2011 23:46:09 -0700
"Hefty, Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> Lustre requires that clients bind to a privileged port number before
> connecting to a remote server. On larger clusters (typically more than
> about 1000 nodes), the number of privileged ports is exhausted,
> resulting in lustre being unusable.
>
> To address this limitation, we add support for reusable addresses
> to the rdma_cm. This mimics the behavior of the socket option
> SO_REUSEADDR. A user may set an rdma_cm_id to reuse an address
> before calling rdma_bind_addr (explicitly or implicitly). If set,
> other rdma_cm_id's may be bound to the same address, provided that
> they all have reuse enabled, and there are no active listens.
>
> If rdma_listen is called on an rdma_cm_id that has reuse enabled,
> it will only succeed if there are no other id's bound to that same
> address. The reuse option is exported to user space. The behavior
> of the kernel reuse implementation was verified against that given
> by sockets.
>
> This patch is derived from a patch by: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
>
> Signed-off-by: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
end of thread, other threads:[~2011-05-10 21:30 UTC | newest]
Thread overview: 5+ messages
2011-04-16 6:42 [PATCH 1/2] rdma/cm: fix handling of ipv6 addressing in cma_use_port Hefty, Sean
[not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF25DCC49B2B-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2011-04-16 6:46 ` [PATCH 2/2] rdma/cm: Support REUSEADDR Hefty, Sean
[not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF25DCC49B2C-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2011-04-29 16:12 ` Roland Dreier
2011-05-10 21:30 ` Ira Weiny
2011-05-10 21:30 ` [PATCH 1/2] rdma/cm: fix handling of ipv6 addressing in cma_use_port Ira Weiny