netdev.vger.kernel.org archive mirror
* [PATCH 0/15] RDS updates for net-next
@ 2009-07-17 23:13 Andy Grover
  2009-07-17 23:13 ` [PATCH 01/15] RDS: Set retry_count to 2 and make modifiable via modparam Andy Grover
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

Hi,

These are some assorted RDS updates to net-next, please review and
apply if they look ok.

Thanks -- Regards -- Andy




* [PATCH 01/15] RDS: Set retry_count to 2 and make modifiable via modparam
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 02/15] RDS/IB: Improve RDS protocol version checking Andy Grover
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

This will by default cause IB connections to fail over faster,
but allow a longer retry count to be used if desired.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/ib.c    |    3 +++
 net/rds/ib.h    |    3 +++
 net/rds/ib_cm.c |    2 +-
 3 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/net/rds/ib.c b/net/rds/ib.c
index b9bcd32..27abdd3 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -43,11 +43,14 @@
 
 unsigned int fmr_pool_size = RDS_FMR_POOL_SIZE;
 unsigned int fmr_message_size = RDS_FMR_SIZE + 1; /* +1 allows for unaligned MRs */
+unsigned int rds_ib_retry_count = RDS_IB_DEFAULT_RETRY_COUNT;
 
 module_param(fmr_pool_size, int, 0444);
 MODULE_PARM_DESC(fmr_pool_size, " Max number of fmr per HCA");
 module_param(fmr_message_size, int, 0444);
 MODULE_PARM_DESC(fmr_message_size, " Max size of a RDMA transfer");
+module_param(rds_ib_retry_count, int, 0444);
+MODULE_PARM_DESC(rds_ib_retry_count, " Number of hw retries before reporting an error");
 
 struct list_head rds_ib_devices;
 
diff --git a/net/rds/ib.h b/net/rds/ib.h
index 455ae73..420afb9 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -15,6 +15,8 @@
 #define RDS_IB_DEFAULT_RECV_WR		1024
 #define RDS_IB_DEFAULT_SEND_WR		256
 
+#define RDS_IB_DEFAULT_RETRY_COUNT	2
+
 #define RDS_IB_SUPPORTED_PROTOCOLS	0x00000003	/* minor versions supported */
 
 extern struct list_head rds_ib_devices;
@@ -247,6 +249,7 @@ extern struct ib_client rds_ib_client;
 
 extern unsigned int fmr_pool_size;
 extern unsigned int fmr_message_size;
+extern unsigned int rds_ib_retry_count;
 
 extern spinlock_t ib_nodev_conns_lock;
 extern struct list_head ib_nodev_conns;
diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index f8e40e1..605c032 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -145,7 +145,7 @@ static void rds_ib_cm_fill_conn_param(struct rds_connection *conn,
 	/* XXX tune these? */
 	conn_param->responder_resources = 1;
 	conn_param->initiator_depth = 1;
-	conn_param->retry_count = 7;
+	conn_param->retry_count = min_t(unsigned int, rds_ib_retry_count, 7);
 	conn_param->rnr_retry_count = 7;
 
 	if (dp) {
-- 
1.6.0.4
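
Note on the hunk in ib_cm.c: the module parameter is not used directly. It is capped at 7, the value that was previously hard-coded, so a larger setting cannot raise the retry count beyond the old behaviour. A minimal userspace sketch of that clamp follows. The names clamp_retry_count and SKETCH_MAX_RETRY_COUNT are hypothetical and only stand in for the min_t(unsigned int, rds_ib_retry_count, 7) expression in the patch.

```c
/* Hypothetical name; the patch hard-codes the literal 7. */
#define SKETCH_MAX_RETRY_COUNT 7u

/*
 * Userspace stand-in for min_t(unsigned int, rds_ib_retry_count, 7):
 * return the requested count, but never more than the cap.
 */
static inline unsigned int clamp_retry_count(unsigned int requested)
{
	return requested < SKETCH_MAX_RETRY_COUNT ? requested
						  : SKETCH_MAX_RETRY_COUNT;
}
```

With the new default of 2 the value passes through unchanged; anything above 7 is reduced to 7.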



* [PATCH 02/15] RDS/IB: Improve RDS protocol version checking
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
  2009-07-17 23:13 ` [PATCH 01/15] RDS: Set retry_count to 2 and make modifiable via modparam Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 03/15] RDS/IB: Handle connections using RDS 3.0 wire protocol Andy Grover
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

RDS on IB uses privdata to do protocol version negotiation. Apparently
the IB stack will return a larger privdata buffer than the struct we were
expecting. Just to be extra-sure, this patch adds some checks in this area.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/ib_cm.c |   25 +++++++++++++++++++------
 1 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index 605c032..0964ac5 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -98,7 +98,7 @@ void rds_ib_cm_connect_complete(struct rds_connection *conn, struct rdma_cm_even
 	struct ib_qp_attr qp_attr;
 	int err;
 
-	if (event->param.conn.private_data_len) {
+	if (event->param.conn.private_data_len >= sizeof(*dp)) {
 		dp = event->param.conn.private_data;
 
 		rds_ib_set_protocol(conn,
@@ -344,19 +344,32 @@ out:
 	return ret;
 }
 
-static u32 rds_ib_protocol_compatible(const struct rds_ib_connect_private *dp)
+static u32 rds_ib_protocol_compatible(struct rdma_cm_event *event)
 {
+	const struct rds_ib_connect_private *dp = event->param.conn.private_data;
 	u16 common;
 	u32 version = 0;
 
-	/* rdma_cm private data is odd - when there is any private data in the
+	/*
+	 * rdma_cm private data is odd - when there is any private data in the
 	 * request, we will be given a pretty large buffer without telling us the
 	 * original size. The only way to tell the difference is by looking at
 	 * the contents, which are initialized to zero.
 	 * If the protocol version fields aren't set, this is a connection attempt
 	 * from an older version. This could could be 3.0 or 2.0 - we can't tell.
-	 * We really should have changed this for OFED 1.3 :-( */
-	if (dp->dp_protocol_major == 0)
+	 * We really should have changed this for OFED 1.3 :-(
+	 */
+
+	/* Be paranoid. RDS always has privdata */
+	if (!event->param.conn.private_data_len) {
+		printk(KERN_NOTICE "RDS incoming connection has no private data, "
+			"rejecting\n");
+		return 0;
+	}
+
+	/* Even if len is crap *now* I still want to check it. -ASG */
+	if (event->param.conn.private_data_len < sizeof (*dp)
+	    || dp->dp_protocol_major == 0)
 		return RDS_PROTOCOL_3_0;
 
 	common = be16_to_cpu(dp->dp_protocol_minor_mask) & RDS_IB_SUPPORTED_PROTOCOLS;
@@ -388,7 +401,7 @@ int rds_ib_cm_handle_connect(struct rdma_cm_id *cm_id,
 	int err, destroy = 1;
 
 	/* Check whether the remote protocol version matches ours. */
-	version = rds_ib_protocol_compatible(dp);
+	version = rds_ib_protocol_compatible(event);
 	if (!version)
 		goto out;
 
-- 
1.6.0.4
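
Note on the checks added to rds_ib_protocol_compatible(): there are three outcomes, decided in order. No private data at all means reject; a buffer shorter than the struct, or a zero major version, is treated as a 3.0 peer; anything else goes on to minor-version negotiation. The sketch below shows only that ordering, in userspace. The struct is a simplified, hypothetical stand-in for struct rds_ib_connect_private, and the enum and function names are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for struct rds_ib_connect_private. */
struct sketch_connect_private {
	uint8_t  major;
	uint8_t  minor;
	uint16_t minor_mask;
};

enum sketch_verdict {
	SKETCH_REJECT,		/* no private data: not an RDS peer */
	SKETCH_ASSUME_3_0,	/* short buffer or unset version fields */
	SKETCH_NEGOTIATE	/* version fields present, compare masks */
};

static inline enum sketch_verdict
sketch_classify_privdata(const void *data, size_t len)
{
	const struct sketch_connect_private *dp = data;

	if (len == 0)
		return SKETCH_REJECT;

	/* The length test comes first, so dp is never read when too short. */
	if (len < sizeof(*dp) || dp->major == 0)
		return SKETCH_ASSUME_3_0;

	return SKETCH_NEGOTIATE;
}
```

The point of the ordering is that the length comparison short-circuits the field access, which is the same shape as the combined condition in the patch.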



* [PATCH 03/15] RDS/IB: Handle connections using RDS 3.0 wire protocol
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
  2009-07-17 23:13 ` [PATCH 01/15] RDS: Set retry_count to 2 and make modifiable via modparam Andy Grover
  2009-07-17 23:13 ` [PATCH 02/15] RDS/IB: Improve RDS protocol version checking Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 04/15] RDS/IB: Fix printk to indicate remote IP, not local Andy Grover
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

The big differences between RDS 3.0 and 3.1 are protocol-level
flow control, and with 3.1 the header is in front of the data. The header
always ends up in the header buffer, and the data goes in the data page.

In 3.0 our "header" is a trailer, and will end up either in the data
page, the header buffer, or split across the two. Since 3.1 is backwards-
compatible with 3.0, we need to continue to support these cases. This
patch does that -- if using RDS 3.0 wire protocol, it will copy the header
from wherever it ended up into the header buffer.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/ib.h      |   12 ++++++++++--
 net/rds/ib_cm.c   |    9 ++++++---
 net/rds/ib_recv.c |   43 ++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/net/rds/ib.h b/net/rds/ib.h
index 420afb9..c0de7af 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -358,17 +358,25 @@ extern ctl_table rds_ib_sysctl_table[];
 /*
  * Helper functions for getting/setting the header and data SGEs in
  * RDS packets (not RDMA)
+ *
+ * From version 3.1 onwards, header is in front of data in the sge.
  */
 static inline struct ib_sge *
 rds_ib_header_sge(struct rds_ib_connection *ic, struct ib_sge *sge)
 {
-	return &sge[0];
+	if (ic->conn->c_version > RDS_PROTOCOL_3_0)
+		return &sge[0];
+	else
+		return &sge[1];
 }
 
 static inline struct ib_sge *
 rds_ib_data_sge(struct rds_ib_connection *ic, struct ib_sge *sge)
 {
-	return &sge[1];
+	if (ic->conn->c_version > RDS_PROTOCOL_3_0)
+		return &sge[1];
+	else
+		return &sge[0];
 }
 
 #endif
diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index 0964ac5..1eb0c29 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -101,10 +101,13 @@ void rds_ib_cm_connect_complete(struct rds_connection *conn, struct rdma_cm_even
 	if (event->param.conn.private_data_len >= sizeof(*dp)) {
 		dp = event->param.conn.private_data;
 
-		rds_ib_set_protocol(conn,
+		/* make sure it isn't empty data */
+		if (dp->dp_protocol_major) {
+			rds_ib_set_protocol(conn,
 				RDS_PROTOCOL(dp->dp_protocol_major,
-					dp->dp_protocol_minor));
-		rds_ib_set_flow_control(conn, be32_to_cpu(dp->dp_credit));
+				dp->dp_protocol_minor));
+			rds_ib_set_flow_control(conn, be32_to_cpu(dp->dp_credit));
+		}
 	}
 
 	printk(KERN_NOTICE "RDS/IB: connected to %pI4 version %u.%u%s\n",
diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index 5709bad..28bdcdc 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -555,6 +555,47 @@ u64 rds_ib_piggyb_ack(struct rds_ib_connection *ic)
 	return rds_ib_get_ack(ic);
 }
 
+static struct rds_header *rds_ib_get_header(struct rds_connection *conn,
+					    struct rds_ib_recv_work *recv,
+					    u32 data_len)
+{
+	struct rds_ib_connection *ic = conn->c_transport_data;
+	void *hdr_buff = &ic->i_recv_hdrs[recv - ic->i_recvs];
+	void *addr;
+	u32 misplaced_hdr_bytes;
+
+	/*
+	 * Support header at the front (RDS 3.1+) as well as header-at-end.
+	 *
+	 * Cases:
+	 * 1) header all in header buff (great!)
+	 * 2) header all in data page (copy all to header buff)
+	 * 3) header split across hdr buf + data page
+	 *    (move bit in hdr buff to end before copying other bit from data page)
+	 */
+	if (conn->c_version > RDS_PROTOCOL_3_0 || data_len == RDS_FRAG_SIZE)
+	        return hdr_buff;
+
+	if (data_len <= (RDS_FRAG_SIZE - sizeof(struct rds_header))) {
+		addr = kmap_atomic(recv->r_frag->f_page, KM_SOFTIRQ0);
+		memcpy(hdr_buff,
+		       addr + recv->r_frag->f_offset + data_len,
+		       sizeof(struct rds_header));
+		kunmap_atomic(addr, KM_SOFTIRQ0);
+		return hdr_buff;
+	}
+
+	misplaced_hdr_bytes = (sizeof(struct rds_header) - (RDS_FRAG_SIZE - data_len));
+
+	memmove(hdr_buff + misplaced_hdr_bytes, hdr_buff, misplaced_hdr_bytes);
+
+	addr = kmap_atomic(recv->r_frag->f_page, KM_SOFTIRQ0);
+	memcpy(hdr_buff, addr + recv->r_frag->f_offset + data_len,
+	       sizeof(struct rds_header) - misplaced_hdr_bytes);
+	kunmap_atomic(addr, KM_SOFTIRQ0);
+	return hdr_buff;
+}
+
 /*
  * It's kind of lame that we're copying from the posted receive pages into
  * long-lived bitmaps.  We could have posted the bitmaps and rdma written into
@@ -667,7 +708,7 @@ static void rds_ib_process_recv(struct rds_connection *conn,
 	}
 	byte_len -= sizeof(struct rds_header);
 
-	ihdr = &ic->i_recv_hdrs[recv - ic->i_recvs];
+	ihdr = rds_ib_get_header(conn, recv, byte_len);
 
 	/* Validate the checksum. */
 	if (!rds_message_verify_checksum(ihdr)) {
-- 
1.6.0.4
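
Note on the three cases handled by rds_ib_get_header() for a 3.0 peer: they can be tried out in userspace with plain byte buffers. The sketch below assumes a 4096-byte fragment and a 48-byte header; both constants and the function name are stand-ins chosen for illustration, not taken from this patch. The page argument plays the role of the mapped data page and hdr_buf the per-receive header buffer.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_FRAG_SIZE 4096u	/* assumed stand-in for RDS_FRAG_SIZE */
#define SKETCH_HDR_SIZE    48u	/* assumed stand-in for sizeof(struct rds_header) */

/*
 * Gather a trailing (3.0-style) header into hdr_buf.
 * The data page is filled first, then any overflow lands at the start
 * of hdr_buf.
 */
static inline void sketch_gather_header(unsigned char *hdr_buf,
					const unsigned char *page,
					size_t data_len)
{
	size_t in_page, misplaced;

	/* Case 1: data filled the page, header is already in hdr_buf. */
	if (data_len == SKETCH_FRAG_SIZE)
		return;

	/* Case 2: header fits entirely in the page, right after the data. */
	if (data_len <= SKETCH_FRAG_SIZE - SKETCH_HDR_SIZE) {
		memcpy(hdr_buf, page + data_len, SKETCH_HDR_SIZE);
		return;
	}

	/* Case 3: head of the header is in the page, tail is in hdr_buf. */
	in_page = SKETCH_FRAG_SIZE - data_len;
	misplaced = SKETCH_HDR_SIZE - in_page;

	/* Slide the tail to the end of hdr_buf, then copy the head in front. */
	memmove(hdr_buf + in_page, hdr_buf, misplaced);
	memcpy(hdr_buf, page + data_len, in_page);
}
```

In this sketch the tail is moved to offset SKETCH_HDR_SIZE - misplaced, which is where "move bit in hdr buff to end" in the patch comment would put it. The memmove() in the patch as posted uses misplaced_hdr_bytes as the destination offset; the two agree only when the header is split exactly in half, so that line may deserve a second look.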



* [PATCH 04/15] RDS/IB: Fix printk to indicate remote IP, not local
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (2 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 03/15] RDS/IB: Handle connections using RDS 3.0 wire protocol Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 05/15] RDS/RDMA: Fix cut-n-paste errors in printks in rdma_transport.c Andy Grover
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/ib_cm.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index 1eb0c29..f621086 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -111,7 +111,7 @@ void rds_ib_cm_connect_complete(struct rds_connection *conn, struct rdma_cm_even
 	}
 
 	printk(KERN_NOTICE "RDS/IB: connected to %pI4 version %u.%u%s\n",
-			&conn->c_laddr,
+			&conn->c_faddr,
 			RDS_PROTOCOL_MAJOR(conn->c_version),
 			RDS_PROTOCOL_MINOR(conn->c_version),
 			ic->i_flowctl ? ", flow control" : "");
-- 
1.6.0.4



* [PATCH 05/15] RDS/RDMA: Fix cut-n-paste errors in printks in rdma_transport.c
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (3 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 04/15] RDS/IB: Fix printk to indicate remote IP, not local Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 06/15] RDS/IB: Rename byte_len to data_len to enhance readability Andy Grover
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/rdma_transport.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/rds/rdma_transport.c b/net/rds/rdma_transport.c
index 7d0f901..981a5e6 100644
--- a/net/rds/rdma_transport.c
+++ b/net/rds/rdma_transport.c
@@ -101,7 +101,7 @@ int rds_rdma_cm_event_handler(struct rdma_cm_id *cm_id,
 		break;
 
 	case RDMA_CM_EVENT_DISCONNECTED:
-		printk(KERN_WARNING "RDS/IW: DISCONNECT event - dropping connection "
+		printk(KERN_WARNING "RDS/RDMA: DISCONNECT event - dropping connection "
 			"%pI4->%pI4\n", &conn->c_laddr,
 			 &conn->c_faddr);
 		rds_conn_drop(conn);
@@ -132,7 +132,7 @@ static int __init rds_rdma_listen_init(void)
 	cm_id = rdma_create_id(rds_rdma_cm_event_handler, NULL, RDMA_PS_TCP);
 	if (IS_ERR(cm_id)) {
 		ret = PTR_ERR(cm_id);
-		printk(KERN_ERR "RDS/IW: failed to setup listener, "
+		printk(KERN_ERR "RDS/RDMA: failed to setup listener, "
 		       "rdma_create_id() returned %d\n", ret);
 		goto out;
 	}
@@ -147,14 +147,14 @@ static int __init rds_rdma_listen_init(void)
 	 */
 	ret = rdma_bind_addr(cm_id, (struct sockaddr *)&sin);
 	if (ret) {
-		printk(KERN_ERR "RDS/IW: failed to setup listener, "
+		printk(KERN_ERR "RDS/RDMA: failed to setup listener, "
 		       "rdma_bind_addr() returned %d\n", ret);
 		goto out;
 	}
 
 	ret = rdma_listen(cm_id, 128);
 	if (ret) {
-		printk(KERN_ERR "RDS/IW: failed to setup listener, "
+		printk(KERN_ERR "RDS/RDMA: failed to setup listener, "
 		       "rdma_listen() returned %d\n", ret);
 		goto out;
 	}
-- 
1.6.0.4



* [PATCH 06/15] RDS/IB: Rename byte_len to data_len to enhance readability
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (4 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 05/15] RDS/RDMA: Fix cut-n-paste errors in printks in rdma_transport.c Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 07/15] RDS: Don't set c_version in __rds_conn_create() Andy Grover
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

Of course len is in bytes. Calling it data_len hopefully indicates
a little better what the variable is actually for.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/ib_recv.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index 28bdcdc..cd7a6cf 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -686,7 +686,7 @@ struct rds_ib_ack_state {
 };
 
 static void rds_ib_process_recv(struct rds_connection *conn,
-				struct rds_ib_recv_work *recv, u32 byte_len,
+				struct rds_ib_recv_work *recv, u32 data_len,
 				struct rds_ib_ack_state *state)
 {
 	struct rds_ib_connection *ic = conn->c_transport_data;
@@ -696,9 +696,9 @@ static void rds_ib_process_recv(struct rds_connection *conn,
 	/* XXX shut down the connection if port 0,0 are seen? */
 
 	rdsdebug("ic %p ibinc %p recv %p byte len %u\n", ic, ibinc, recv,
-		 byte_len);
+		 data_len);
 
-	if (byte_len < sizeof(struct rds_header)) {
+	if (data_len < sizeof(struct rds_header)) {
 		rds_ib_conn_error(conn, "incoming message "
 		       "from %pI4 didn't inclue a "
 		       "header, disconnecting and "
@@ -706,9 +706,9 @@ static void rds_ib_process_recv(struct rds_connection *conn,
 		       &conn->c_faddr);
 		return;
 	}
-	byte_len -= sizeof(struct rds_header);
+	data_len -= sizeof(struct rds_header);
 
-	ihdr = rds_ib_get_header(conn, recv, byte_len);
+	ihdr = rds_ib_get_header(conn, recv, data_len);
 
 	/* Validate the checksum. */
 	if (!rds_message_verify_checksum(ihdr)) {
@@ -728,7 +728,7 @@ static void rds_ib_process_recv(struct rds_connection *conn,
 	if (ihdr->h_credit)
 		rds_ib_send_add_credits(conn, ihdr->h_credit);
 
-	if (ihdr->h_sport == 0 && ihdr->h_dport == 0 && byte_len == 0) {
+	if (ihdr->h_sport == 0 && ihdr->h_dport == 0 && data_len == 0) {
 		/* This is an ACK-only packet. The fact that it gets
 		 * special treatment here is that historically, ACKs
 		 * were rather special beasts.
-- 
1.6.0.4



* [PATCH 07/15] RDS: Don't set c_version in __rds_conn_create()
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (5 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 06/15] RDS/IB: Rename byte_len to data_len to enhance readability Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 08/15] RDS/IB: Move tx/rx ring init and refill to later Andy Grover
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

Protocol negotiation is logically a property of the
transports, so rds core need not set it.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/connection.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/net/rds/connection.c b/net/rds/connection.c
index d14445c..605fe3f 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -155,7 +155,6 @@ static struct rds_connection *__rds_conn_create(__be32 laddr, __be32 faddr,
 	}
 
 	INIT_HLIST_NODE(&conn->c_hash_node);
-	conn->c_version = RDS_PROTOCOL_3_0;
 	conn->c_laddr = laddr;
 	conn->c_faddr = faddr;
 	spin_lock_init(&conn->c_lock);
-- 
1.6.0.4



* [PATCH 08/15] RDS/IB: Move tx/rx ring init and refill to later
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (6 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 07/15] RDS: Don't set c_version in __rds_conn_create() Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 09/15] RDS/IB: Disable flow control in sysctl and explain why Andy Grover
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

Since RDS 3.0 and 3.1 have different packet formats,
we need to wait until after protocol negotiation
is complete to lay out the rx buffers.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/ib_cm.c |   18 ++++++++++++------
 1 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index f621086..0ad749c 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -116,6 +116,16 @@ void rds_ib_cm_connect_complete(struct rds_connection *conn, struct rdma_cm_even
 			RDS_PROTOCOL_MINOR(conn->c_version),
 			ic->i_flowctl ? ", flow control" : "");
 
+	/*
+	 * Init rings and fill recv. this needs to wait until protocol negotiation
+	 * is complete, since ring layout is different from 3.0 to 3.1.
+	 */
+	rds_ib_send_init_ring(ic);
+	rds_ib_recv_init_ring(ic);
+	/* Post receive buffers - as a side effect, this will update
+	 * the posted credit count. */
+	rds_ib_recv_refill(conn, GFP_KERNEL, GFP_HIGHUSER, 1);
+
 	/* Tune RNR behavior */
 	rds_ib_tune_rnr(ic, &qp_attr);
 
@@ -324,7 +334,7 @@ static int rds_ib_setup_qp(struct rds_connection *conn)
 		rdsdebug("send allocation failed\n");
 		goto out;
 	}
-	rds_ib_send_init_ring(ic);
+	memset(ic->i_sends, 0, ic->i_send_ring.w_nr * sizeof(struct rds_ib_send_work));
 
 	ic->i_recvs = vmalloc(ic->i_recv_ring.w_nr * sizeof(struct rds_ib_recv_work));
 	if (ic->i_recvs == NULL) {
@@ -332,14 +342,10 @@ static int rds_ib_setup_qp(struct rds_connection *conn)
 		rdsdebug("recv allocation failed\n");
 		goto out;
 	}
+	memset(ic->i_recvs, 0, ic->i_recv_ring.w_nr * sizeof(struct rds_ib_recv_work));
 
-	rds_ib_recv_init_ring(ic);
 	rds_ib_recv_init_ack(ic);
 
-	/* Post receive buffers - as a side effect, this will update
-	 * the posted credit count. */
-	rds_ib_recv_refill(conn, GFP_KERNEL, GFP_HIGHUSER, 1);
-
 	rdsdebug("conn %p pd %p mr %p cq %p %p\n", conn, ic->i_pd, ic->i_mr,
 		 ic->i_send_cq, ic->i_recv_cq);
 
-- 
1.6.0.4



* [PATCH 09/15] RDS/IB: Disable flow control in sysctl and explain why
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (7 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 08/15] RDS/IB: Move tx/rx ring init and refill to later Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 10/15] RDS/IB: Drop connection when a fatal QP event is received Andy Grover
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

Backwards compatibility with RDS 3.0 causes protocol-based
flow control to be disabled as a side effect.

I don't want to pull out FC support from the IB transport
but I do want to document and keep the sysctl consistent
if possible.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/ib_sysctl.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/net/rds/ib_sysctl.c b/net/rds/ib_sysctl.c
index d87830d..84b5ffc 100644
--- a/net/rds/ib_sysctl.c
+++ b/net/rds/ib_sysctl.c
@@ -53,7 +53,17 @@ unsigned long rds_ib_sysctl_max_unsig_bytes = (16 << 20);
 static unsigned long rds_ib_sysctl_max_unsig_bytes_min = 1;
 static unsigned long rds_ib_sysctl_max_unsig_bytes_max = ~0UL;
 
-unsigned int rds_ib_sysctl_flow_control = 1;
+/*
+ * This sysctl does nothing.
+ *
+ * Backwards compatibility with RDS 3.0 wire protocol
+ * disables initial FC credit exchange.
+ * If it's ever possible to drop 3.0 support,
+ * setting this to 1 and moving init/refill of send/recv
+ * rings from ib_cm_connect_complete() back into ib_setup_qp()
+ * will cause credits to be added before protocol negotiation.
+ */
+unsigned int rds_ib_sysctl_flow_control = 0;
 
 ctl_table rds_ib_sysctl_table[] = {
 	{
-- 
1.6.0.4



* [PATCH 10/15] RDS/IB: Drop connection when a fatal QP event is received
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (8 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 09/15] RDS/IB: Disable flow control in sysctl and explain why Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 11/15] RDS: Fix completion notifications on blocking sockets Andy Grover
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/ib_cm.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index 0ad749c..c2d372f 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -203,9 +203,9 @@ static void rds_ib_qp_event_handler(struct ib_event *event, void *data)
 		rdma_notify(ic->i_cm_id, IB_EVENT_COMM_EST);
 		break;
 	default:
-		printk(KERN_WARNING "RDS/ib: unhandled QP event %u "
-		       "on connection to %pI4\n", event->event,
-		       &conn->c_faddr);
+		rds_ib_conn_error(conn, "RDS/IB: Fatal QP Event %u "
+			"- connection %pI4->%pI4, reconnecting\n",
+			event->event, &conn->c_laddr, &conn->c_faddr);
 		break;
 	}
 }
-- 
1.6.0.4



* [PATCH 11/15] RDS: Fix completion notifications on blocking sockets
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (9 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 10/15] RDS/IB: Drop connection when a fatal QP event is received Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 12/15] RDS/IB: Always use PAGE_SIZE for FMR page size Andy Grover
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

Completion or congestion notifications were not being checked
if the socket went to sleep. This patch fixes that.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/recv.c |   24 +++++++++++++-----------
 1 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/net/rds/recv.c b/net/rds/recv.c
index f2118c5..86bc1a0 100644
--- a/net/rds/recv.c
+++ b/net/rds/recv.c
@@ -409,18 +409,18 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	if (msg_flags & MSG_OOB)
 		goto out;
 
-	/* If there are pending notifications, do those - and nothing else */
-	if (!list_empty(&rs->rs_notify_queue)) {
-		ret = rds_notify_queue_get(rs, msg);
-		goto out;
-	}
+	while (1) {
+		/* If there are pending notifications, do those - and nothing else */
+		if (!list_empty(&rs->rs_notify_queue)) {
+			ret = rds_notify_queue_get(rs, msg);
+			break;
+		}
 
-	if (rs->rs_cong_notify) {
-		ret = rds_notify_cong(rs, msg);
-		goto out;
-	}
+		if (rs->rs_cong_notify) {
+			ret = rds_notify_cong(rs, msg);
+			break;
+		}
 
-	while (1) {
 		if (!rds_next_incoming(rs, &inc)) {
 			if (nonblock) {
 				ret = -EAGAIN;
@@ -428,7 +428,9 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 			}
 
 			timeo = wait_event_interruptible_timeout(*sk->sk_sleep,
-						rds_next_incoming(rs, &inc),
+						(!list_empty(&rs->rs_notify_queue)
+						|| rs->rs_cong_notify
+						|| rds_next_incoming(rs, &inc)),
 						timeo);
 			rdsdebug("recvmsg woke inc %p timeo %ld\n", inc,
 				 timeo);
-- 
1.6.0.4
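
Note on the control flow after this change: the notification checks now sit inside the loop, and the sleep condition covers all three wake-up reasons, so a notification that arrives while the socket is asleep is seen on the next pass. The userspace sketch below models only that shape. Every name in it is hypothetical; the callback stands in for wait_event_interruptible_timeout(), and a NULL callback models a non-blocking call.

```c
#include <stddef.h>

enum sketch_recv_result {
	SKETCH_GOT_NOTIFY,
	SKETCH_GOT_CONG,
	SKETCH_GOT_DATA,
	SKETCH_WOULD_BLOCK
};

/* Hypothetical stand-in for the three bits of socket state involved. */
struct sketch_sock {
	int notify_queued;	/* !list_empty(&rs->rs_notify_queue) */
	int cong_notify;	/* rs->rs_cong_notify */
	int incoming;		/* rds_next_incoming() found a message */
};

/* The condition a sleeper must wait on after this patch. */
static inline int sketch_wake_condition(const struct sketch_sock *s)
{
	return s->notify_queued || s->cong_notify || s->incoming;
}

/* Example waker: a completion notification arrives during the sleep. */
static inline void sketch_deliver_notification(struct sketch_sock *s)
{
	s->notify_queued = 1;
}

static inline enum sketch_recv_result
sketch_recv(struct sketch_sock *s, void (*sleep_fn)(struct sketch_sock *))
{
	for (;;) {
		/* Notifications are re-checked on every pass, not just once. */
		if (s->notify_queued)
			return SKETCH_GOT_NOTIFY;
		if (s->cong_notify)
			return SKETCH_GOT_CONG;
		if (s->incoming)
			return SKETCH_GOT_DATA;

		if (sleep_fn == NULL)
			return SKETCH_WOULD_BLOCK;	/* non-blocking */

		sleep_fn(s);
		if (!sketch_wake_condition(s))
			return SKETCH_WOULD_BLOCK;	/* timed out */
	}
}
```

Before the patch, the equivalent of the first two checks ran once ahead of the loop, so a sleeper could only ever be woken for incoming data.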



* [PATCH 12/15] RDS/IB: Always use PAGE_SIZE for FMR page size
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (10 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 11/15] RDS: Fix completion notifications on blocking sockets Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 13/15] RDS/IW: Remove page_shift variable from iwarp transport Andy Grover
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

While FMRs allow significant flexibility in what size of pages they can use,
we really just want FMR pages to match CPU page size. Roland says we can
count on this always being supported, so this simplifies things.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/ib.c      |    3 ---
 net/rds/ib.h      |    3 ---
 net/rds/ib_rdma.c |   12 ++++++------
 3 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/net/rds/ib.c b/net/rds/ib.c
index 27abdd3..868559a 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -85,9 +85,6 @@ void rds_ib_add_one(struct ib_device *device)
 	rds_ibdev->max_wrs = dev_attr->max_qp_wr;
 	rds_ibdev->max_sge = min(dev_attr->max_sge, RDS_IB_MAX_SGE);
 
-	rds_ibdev->fmr_page_shift = max(9, ffs(dev_attr->page_size_cap) - 1);
-	rds_ibdev->fmr_page_size  = 1 << rds_ibdev->fmr_page_shift;
-	rds_ibdev->fmr_page_mask  = ~((u64) rds_ibdev->fmr_page_size - 1);
 	rds_ibdev->fmr_max_remaps = dev_attr->max_map_per_fmr?: 32;
 	rds_ibdev->max_fmrs = dev_attr->max_fmr ?
 			min_t(unsigned int, dev_attr->max_fmr, fmr_pool_size) :
diff --git a/net/rds/ib.h b/net/rds/ib.h
index c0de7af..1378b85 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -159,9 +159,6 @@ struct rds_ib_device {
 	struct ib_pd		*pd;
 	struct ib_mr		*mr;
 	struct rds_ib_mr_pool	*mr_pool;
-	int			fmr_page_shift;
-	int			fmr_page_size;
-	u64			fmr_page_mask;
 	unsigned int		fmr_max_remaps;
 	unsigned int		max_fmrs;
 	int			max_sge;
diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
index 81033af..ef3ab5b 100644
--- a/net/rds/ib_rdma.c
+++ b/net/rds/ib_rdma.c
@@ -211,7 +211,7 @@ struct rds_ib_mr_pool *rds_ib_create_mr_pool(struct rds_ib_device *rds_ibdev)
 
 	pool->fmr_attr.max_pages = fmr_message_size;
 	pool->fmr_attr.max_maps = rds_ibdev->fmr_max_remaps;
-	pool->fmr_attr.page_shift = rds_ibdev->fmr_page_shift;
+	pool->fmr_attr.page_shift = PAGE_SHIFT;
 	pool->max_free_pinned = rds_ibdev->max_fmrs * fmr_message_size / 4;
 
 	/* We never allow more than max_items MRs to be allocated.
@@ -349,13 +349,13 @@ static int rds_ib_map_fmr(struct rds_ib_device *rds_ibdev, struct rds_ib_mr *ibm
 		unsigned int dma_len = ib_sg_dma_len(dev, &scat[i]);
 		u64 dma_addr = ib_sg_dma_address(dev, &scat[i]);
 
-		if (dma_addr & ~rds_ibdev->fmr_page_mask) {
+		if (dma_addr & ~PAGE_MASK) {
 			if (i > 0)
 				return -EINVAL;
 			else
 				++page_cnt;
 		}
-		if ((dma_addr + dma_len) & ~rds_ibdev->fmr_page_mask) {
+		if ((dma_addr + dma_len) & ~PAGE_MASK) {
 			if (i < sg_dma_len - 1)
 				return -EINVAL;
 			else
@@ -365,7 +365,7 @@ static int rds_ib_map_fmr(struct rds_ib_device *rds_ibdev, struct rds_ib_mr *ibm
 		len += dma_len;
 	}
 
-	page_cnt += len >> rds_ibdev->fmr_page_shift;
+	page_cnt += len >> PAGE_SHIFT;
 	if (page_cnt > fmr_message_size)
 		return -EINVAL;
 
@@ -378,9 +378,9 @@ static int rds_ib_map_fmr(struct rds_ib_device *rds_ibdev, struct rds_ib_mr *ibm
 		unsigned int dma_len = ib_sg_dma_len(dev, &scat[i]);
 		u64 dma_addr = ib_sg_dma_address(dev, &scat[i]);
 
-		for (j = 0; j < dma_len; j += rds_ibdev->fmr_page_size)
+		for (j = 0; j < dma_len; j += PAGE_SIZE)
 			dma_pages[page_cnt++] =
-				(dma_addr & rds_ibdev->fmr_page_mask) + j;
+				(dma_addr & PAGE_MASK) + j;
 	}
 
 	ret = ib_map_phys_fmr(ibmr->fmr,
-- 
1.6.0.4
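
Note on the alignment rules in rds_ib_map_fmr(), which are easier to see with the device fields replaced by constants: only the first segment may start off a page boundary, only the last may end off one, and each such ragged edge adds one page to the count. The userspace sketch below mirrors that counting under an assumed 4 KiB page size. The SKETCH_ macros, the struct, and the function name are hypothetical, and it returns -1 where the kernel code returns -EINVAL.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed page size: 4 KiB */
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct sketch_dma_seg {
	uint64_t addr;
	uint64_t len;
};

/* Returns the page count as computed in the patch, or -1 if unmappable. */
static inline long sketch_count_fmr_pages(const struct sketch_dma_seg *seg,
					  size_t n)
{
	uint64_t total = 0;
	long pages = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* ~MASK keeps the offset-within-page bits. */
		if (seg[i].addr & ~SKETCH_PAGE_MASK) {
			if (i > 0)
				return -1;	/* only the first may start ragged */
			pages++;
		}
		if ((seg[i].addr + seg[i].len) & ~SKETCH_PAGE_MASK) {
			if (i + 1 < n)
				return -1;	/* only the last may end ragged */
			pages++;
		}
		total += seg[i].len;
	}

	return pages + (long)(total >> SKETCH_PAGE_SHIFT);
}
```

Note the inverted mask in both tests: PAGE_MASK selects the page-number bits, so the in-page offset is obtained with ~PAGE_MASK, as the hunks above do.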



* [PATCH 13/15] RDS/IW: Remove page_shift variable from iwarp transport
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (11 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 12/15] RDS/IB: Always use PAGE_SIZE for FMR page size Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 14/15] RDS/IW: Remove dead code Andy Grover
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

The existing code treated page_shift as a variable, when in fact we
always want to have the fastreg page size be the same as the arch's
page size -- and it is, so this doesn't need to be a variable.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
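
A note on the mask arithmetic, as a minimal userspace sketch rather than
the kernel code. In the kernel, PAGE_MASK is ~(PAGE_SIZE - 1), so it
selects the page-aligned high bits of an address. The removed dma_mask
(dma_page_size - 1) selected the low offset bits, so the two are
complements of each other. The DEMO_* macros and demo_* helpers below are
hypothetical stand-ins, and the 4 KiB page size is an assumption.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel macros; 4 KiB pages are assumed. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uint64_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))	/* high bits set */

/* Offset within the page: nonzero means the address is not page aligned. */
static uint64_t demo_offset_in_page(uint64_t addr)
{
	return addr & ~DEMO_PAGE_MASK;
}

/* Round an address down to the start of its page. */
static uint64_t demo_round_down(uint64_t addr)
{
	return addr & DEMO_PAGE_MASK;
}

/* Round an address up to the next page boundary (no-op if aligned). */
static uint64_t demo_round_up(uint64_t addr)
{
	return (addr + DEMO_PAGE_SIZE - 1) & DEMO_PAGE_MASK;
}

/* Number of pages touched by the byte range [addr, addr + len). */
static uint64_t demo_npages(uint64_t addr, uint64_t len)
{
	return (demo_round_up(addr + len) - demo_round_down(addr))
		>> DEMO_PAGE_SHIFT;
}
```

With these definitions, demo_npages(0x1800, 0x1000) is 2: the range is
rounded out to [0x1000, 0x3000), which spans two 4 KiB pages.
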
 net/rds/iw.c      |    2 --
 net/rds/iw.h      |    1 -
 net/rds/iw_rdma.c |   28 ++++++++++------------------
 net/rds/iw_send.c |    2 +-
 4 files changed, 11 insertions(+), 22 deletions(-)

diff --git a/net/rds/iw.c b/net/rds/iw.c
index d16e1cb..2e6b495 100644
--- a/net/rds/iw.c
+++ b/net/rds/iw.c
@@ -83,8 +83,6 @@ void rds_iw_add_one(struct ib_device *device)
 	rds_iwdev->max_wrs = dev_attr->max_qp_wr;
 	rds_iwdev->max_sge = min(dev_attr->max_sge, RDS_IW_MAX_SGE);
 
-	rds_iwdev->page_shift = max(PAGE_SHIFT, ffs(dev_attr->page_size_cap) - 1);
-
 	rds_iwdev->dev = device;
 	rds_iwdev->pd = ib_alloc_pd(device);
 	if (IS_ERR(rds_iwdev->pd))
diff --git a/net/rds/iw.h b/net/rds/iw.h
index 0715dde..dd72b62 100644
--- a/net/rds/iw.h
+++ b/net/rds/iw.h
@@ -181,7 +181,6 @@ struct rds_iw_device {
 	struct ib_pd		*pd;
 	struct ib_mr		*mr;
 	struct rds_iw_mr_pool	*mr_pool;
-	int			page_shift;
 	int			max_sge;
 	unsigned int		max_wrs;
 	unsigned int		dma_local_lkey:1;
diff --git a/net/rds/iw_rdma.c b/net/rds/iw_rdma.c
index dcdb37d..de4a1b1 100644
--- a/net/rds/iw_rdma.c
+++ b/net/rds/iw_rdma.c
@@ -263,18 +263,12 @@ static void rds_iw_set_scatterlist(struct rds_iw_scatterlist *sg,
 }
 
 static u64 *rds_iw_map_scatterlist(struct rds_iw_device *rds_iwdev,
-			struct rds_iw_scatterlist *sg,
-			unsigned int dma_page_shift)
+			struct rds_iw_scatterlist *sg)
 {
 	struct ib_device *dev = rds_iwdev->dev;
 	u64 *dma_pages = NULL;
-	u64 dma_mask;
-	unsigned int dma_page_size;
 	int i, j, ret;
 
-	dma_page_size = 1 << dma_page_shift;
-	dma_mask = dma_page_size - 1;
-
 	WARN_ON(sg->dma_len);
 
 	sg->dma_len = ib_dma_map_sg(dev, sg->list, sg->len, DMA_BIDIRECTIONAL);
@@ -295,18 +289,18 @@ static u64 *rds_iw_map_scatterlist(struct rds_iw_device *rds_iwdev,
 		sg->bytes += dma_len;
 
 		end_addr = dma_addr + dma_len;
-		if (dma_addr & dma_mask) {
+		if (dma_addr & ~PAGE_MASK) {
 			if (i > 0)
 				goto out_unmap;
-			dma_addr &= ~dma_mask;
+			dma_addr &= PAGE_MASK;
 		}
-		if (end_addr & dma_mask) {
+		if (end_addr & ~PAGE_MASK) {
 			if (i < sg->dma_len - 1)
 				goto out_unmap;
-			end_addr = (end_addr + dma_mask) & ~dma_mask;
+			end_addr = (end_addr + ~PAGE_MASK) & PAGE_MASK;
 		}
 		}
 
-		sg->dma_npages += (end_addr - dma_addr) >> dma_page_shift;
+		sg->dma_npages += (end_addr - dma_addr) >> PAGE_SHIFT;
 	}
 
 	/* Now gather the dma addrs into one list */
@@ -325,8 +319,8 @@ static u64 *rds_iw_map_scatterlist(struct rds_iw_device *rds_iwdev,
 		u64 end_addr;
 
 		end_addr = dma_addr + dma_len;
-		dma_addr &= ~dma_mask;
-		for (; dma_addr < end_addr; dma_addr += dma_page_size)
+		dma_addr &= PAGE_MASK;
+		for (; dma_addr < end_addr; dma_addr += PAGE_SIZE)
 			dma_pages[j++] = dma_addr;
 		BUG_ON(j > sg->dma_npages);
 	}
@@ -727,7 +721,7 @@ static int rds_iw_rdma_build_fastreg(struct rds_iw_mapping *mapping)
 	f_wr.wr.fast_reg.rkey = mapping->m_rkey;
 	f_wr.wr.fast_reg.page_list = ibmr->page_list;
 	f_wr.wr.fast_reg.page_list_len = mapping->m_sg.dma_len;
-	f_wr.wr.fast_reg.page_shift = ibmr->device->page_shift;
+	f_wr.wr.fast_reg.page_shift = PAGE_SHIFT;
 	f_wr.wr.fast_reg.access_flags = IB_ACCESS_LOCAL_WRITE |
 				IB_ACCESS_REMOTE_READ |
 				IB_ACCESS_REMOTE_WRITE;
@@ -780,9 +774,7 @@ static int rds_iw_map_fastreg(struct rds_iw_mr_pool *pool,
 
 	rds_iw_set_scatterlist(&mapping->m_sg, sg, sg_len);
 
-	dma_pages = rds_iw_map_scatterlist(rds_iwdev,
-				&mapping->m_sg,
-				rds_iwdev->page_shift);
+	dma_pages = rds_iw_map_scatterlist(rds_iwdev, &mapping->m_sg);
 	if (IS_ERR(dma_pages)) {
 		ret = PTR_ERR(dma_pages);
 		dma_pages = NULL;
diff --git a/net/rds/iw_send.c b/net/rds/iw_send.c
index 44a6a05..1f5abe3 100644
--- a/net/rds/iw_send.c
+++ b/net/rds/iw_send.c
@@ -779,7 +779,7 @@ static void rds_iw_build_send_fastreg(struct rds_iw_device *rds_iwdev, struct rd
 	send->s_wr.wr.fast_reg.rkey = send->s_mr->rkey;
 	send->s_wr.wr.fast_reg.page_list = send->s_page_list;
 	send->s_wr.wr.fast_reg.page_list_len = nent;
-	send->s_wr.wr.fast_reg.page_shift = rds_iwdev->page_shift;
+	send->s_wr.wr.fast_reg.page_shift = PAGE_SHIFT;
 	send->s_wr.wr.fast_reg.access_flags = IB_ACCESS_REMOTE_WRITE;
 	send->s_wr.wr.fast_reg.iova_start = sg_addr;
 
-- 
1.6.0.4



* [PATCH 14/15] RDS/IW: Remove dead code
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (12 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 13/15] RDS/IW: Remove page_shift variable from iwarp transport Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-17 23:13 ` [PATCH 15/15] RDS: Refactor end of __conn_create for readability Andy Grover
  2009-07-20 15:04 ` [PATCH 0/15] RDS updates for net-next David Miller
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

In the iWARP code, node_type is always RDMA_NODE_RNIC, so the
non-RNIC branch is dead code. Remove it.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
 net/rds/iw.c |   13 ++++---------
 1 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/net/rds/iw.c b/net/rds/iw.c
index 2e6b495..f5e9a29 100644
--- a/net/rds/iw.c
+++ b/net/rds/iw.c
@@ -89,15 +89,10 @@ void rds_iw_add_one(struct ib_device *device)
 		goto free_dev;
 
 	if (!rds_iwdev->dma_local_lkey) {
-		if (device->node_type != RDMA_NODE_RNIC) {
-			rds_iwdev->mr = ib_get_dma_mr(rds_iwdev->pd,
-						IB_ACCESS_LOCAL_WRITE);
-		} else {
-			rds_iwdev->mr = ib_get_dma_mr(rds_iwdev->pd,
-						IB_ACCESS_REMOTE_READ |
-						IB_ACCESS_REMOTE_WRITE |
-						IB_ACCESS_LOCAL_WRITE);
-		}
+		rds_iwdev->mr = ib_get_dma_mr(rds_iwdev->pd,
+					IB_ACCESS_REMOTE_READ |
+					IB_ACCESS_REMOTE_WRITE |
+					IB_ACCESS_LOCAL_WRITE);
 		if (IS_ERR(rds_iwdev->mr))
 			goto err_pd;
 	} else
-- 
1.6.0.4



* [PATCH 15/15] RDS: Refactor end of __conn_create for readability
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (13 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 14/15] RDS/IW: Remove dead code Andy Grover
@ 2009-07-17 23:13 ` Andy Grover
  2009-07-20 15:04 ` [PATCH 0/15] RDS updates for net-next David Miller
  15 siblings, 0 replies; 17+ messages in thread
From: Andy Grover @ 2009-07-17 23:13 UTC (permalink / raw)
  To: netdev; +Cc: rds-devel

Add a comment explaining what is going on, and remove the negative
logic. I find this much easier to understand quickly, although a few
lines are duplicated.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
---
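
The create-then-check pattern described in the new comment can be
sketched in userspace as follows. This is only an illustration under
assumptions: a pthread mutex stands in for rds_conn_lock, a singly linked
list stands in for the hash bucket, and every demo_* name is
hypothetical, not part of RDS.

```c
#include <pthread.h>
#include <stdlib.h>

struct demo_conn {
	int key;			/* stands in for (laddr, faddr, trans) */
	struct demo_conn *next;
};

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static struct demo_conn *demo_head;
static unsigned int demo_count;

/* Caller must hold demo_lock. */
static struct demo_conn *demo_lookup_locked(int key)
{
	struct demo_conn *c;

	for (c = demo_head; c; c = c->next)
		if (c->key == key)
			return c;
	return NULL;
}

struct demo_conn *demo_conn_create(int key)
{
	struct demo_conn *conn, *found;

	/* Allocate and initialise without holding the lock. */
	conn = calloc(1, sizeof(*conn));
	if (!conn)
		return NULL;
	conn->key = key;

	/* Someone may have created the same conn in the interim. */
	pthread_mutex_lock(&demo_lock);
	found = demo_lookup_locked(key);
	if (found) {
		/* We lost the race: roll back and return the other one. */
		free(conn);
		conn = found;
	} else {
		/* We won: publish our conn and account for it. */
		conn->next = demo_head;
		demo_head = conn;
		demo_count++;
	}
	pthread_mutex_unlock(&demo_lock);

	return conn;
}
```

Calling demo_conn_create() twice with the same key returns the same
pointer and leaves demo_count at 1. Each branch carries its own rollback
or publish step, which is the duplication the patch accepts in exchange
for positive logic.
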
 net/rds/connection.c |   48 +++++++++++++++++++++++++++++++-----------------
 1 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/net/rds/connection.c b/net/rds/connection.c
index 605fe3f..b420a20 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -126,7 +126,7 @@ static struct rds_connection *__rds_conn_create(__be32 laddr, __be32 faddr,
 				       struct rds_transport *trans, gfp_t gfp,
 				       int is_outgoing)
 {
-	struct rds_connection *conn, *tmp, *parent = NULL;
+	struct rds_connection *conn, *parent = NULL;
 	struct hlist_head *head = rds_conn_bucket(laddr, faddr);
 	unsigned long flags;
 	int ret;
@@ -210,26 +210,40 @@ static struct rds_connection *__rds_conn_create(__be32 laddr, __be32 faddr,
 	  trans->t_name ? trans->t_name : "[unknown]",
 	  is_outgoing ? "(outgoing)" : "");
 
+	/*
+	 * Since we ran without holding the conn lock, someone could
+	 * have created the same conn (either normal or passive) in the
+	 * interim. We check while holding the lock. If we won, we complete
+	 * init and return our conn. If we lost, we rollback and return the
+	 * other one.
+	 */
 	spin_lock_irqsave(&rds_conn_lock, flags);
-	if (parent == NULL) {
-		tmp = rds_conn_lookup(head, laddr, faddr, trans);
-		if (tmp == NULL)
-			hlist_add_head(&conn->c_hash_node, head);
-	} else {
-		tmp = parent->c_passive;
-		if (!tmp)
+	if (parent) {
+		/* Creating passive conn */
+		if (parent->c_passive) {
+			trans->conn_free(conn->c_transport_data);
+			kmem_cache_free(rds_conn_slab, conn);
+			conn = parent->c_passive;
+		} else {
 			parent->c_passive = conn;
-	}
-
-	if (tmp) {
-		trans->conn_free(conn->c_transport_data);
-		kmem_cache_free(rds_conn_slab, conn);
-		conn = tmp;
+			rds_cong_add_conn(conn);
+			rds_conn_count++;
+		}
 	} else {
-		rds_cong_add_conn(conn);
-		rds_conn_count++;
+		/* Creating normal conn */
+		struct rds_connection *found;
+
+		found = rds_conn_lookup(head, laddr, faddr, trans);
+		if (found) {
+			trans->conn_free(conn->c_transport_data);
+			kmem_cache_free(rds_conn_slab, conn);
+			conn = found;
+		} else {
+			hlist_add_head(&conn->c_hash_node, head);
+			rds_cong_add_conn(conn);
+			rds_conn_count++;
+		}
 	}
-
 	spin_unlock_irqrestore(&rds_conn_lock, flags);
 
 out:
-- 
1.6.0.4



* Re: [PATCH 0/15] RDS updates for net-next
  2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
                   ` (14 preceding siblings ...)
  2009-07-17 23:13 ` [PATCH 15/15] RDS: Refactor end of __conn_create for readability Andy Grover
@ 2009-07-20 15:04 ` David Miller
  15 siblings, 0 replies; 17+ messages in thread
From: David Miller @ 2009-07-20 15:04 UTC (permalink / raw)
  To: andy.grover; +Cc: netdev, rds-devel

From: Andy Grover <andy.grover@oracle.com>
Date: Fri, 17 Jul 2009 16:13:21 -0700

> These are some assorted RDS updates to net-next, please review and
> apply if they look ok.

All applied to net-next-2.6, thanks!


Thread overview: 17+ messages
2009-07-17 23:13 [PATCH 0/15] RDS updates for net-next Andy Grover
2009-07-17 23:13 ` [PATCH 01/15] RDS: Set retry_count to 2 and make modifiable via modparam Andy Grover
2009-07-17 23:13 ` [PATCH 02/15] RDS/IB: Improve RDS protocol version checking Andy Grover
2009-07-17 23:13 ` [PATCH 03/15] RDS/IB: Handle connections using RDS 3.0 wire protocol Andy Grover
2009-07-17 23:13 ` [PATCH 04/15] RDS/IB: Fix printk to indicate remote IP, not local Andy Grover
2009-07-17 23:13 ` [PATCH 05/15] RDS/RDMA: Fix cut-n-paste errors in printks in rdma_transport.c Andy Grover
2009-07-17 23:13 ` [PATCH 06/15] RDS/IB: Rename byte_len to data_len to enhance readability Andy Grover
2009-07-17 23:13 ` [PATCH 07/15] RDS: Don't set c_version in __rds_conn_create() Andy Grover
2009-07-17 23:13 ` [PATCH 08/15] RDS/IB: Move tx/rx ring init and refill to later Andy Grover
2009-07-17 23:13 ` [PATCH 09/15] RDS/IB: Disable flow control in sysctl and explain why Andy Grover
2009-07-17 23:13 ` [PATCH 10/15] RDS/IB: Drop connection when a fatal QP event is received Andy Grover
2009-07-17 23:13 ` [PATCH 11/15] RDS: Fix completion notifications on blocking sockets Andy Grover
2009-07-17 23:13 ` [PATCH 12/15] RDS/IB: Always use PAGE_SIZE for FMR page size Andy Grover
2009-07-17 23:13 ` [PATCH 13/15] RDS/IW: Remove page_shift variable from iwarp transport Andy Grover
2009-07-17 23:13 ` [PATCH 14/15] RDS/IW: Remove dead code Andy Grover
2009-07-17 23:13 ` [PATCH 15/15] RDS: Refactor end of __conn_create for readability Andy Grover
2009-07-20 15:04 ` [PATCH 0/15] RDS updates for net-next David Miller