* [PATCH 1/8] libceph: be less chatty about stray replies
2012-07-30 15:59 [PATCH 0/8] Last batch of messenger fixes, misc Sage Weil
@ 2012-07-30 15:59 ` Sage Weil
2012-07-30 23:00 ` Alex Elder
2012-07-30 15:59 ` [PATCH 2/8] ceph: update MAINTAINERS file Sage Weil
` (6 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Sage Weil @ 2012-07-30 15:59 UTC (permalink / raw)
To: ceph-devel; +Cc: Sage Weil
There are many (normal) conditions that can lead to us getting
unexpected replies, include cluster topology changes, osd failures,
and timeouts. There's no need to spam the console about it.
Signed-off-by: Sage Weil <sage@inktank.com>
---
net/ceph/osd_client.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index ad427e6..42119c0 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -2037,8 +2037,8 @@ static struct ceph_msg *get_reply(struct ceph_connection *con,
if (!req) {
*skip = 1;
m = NULL;
- pr_info("get_reply unknown tid %llu from osd%d\n", tid,
- osd->o_osd);
+ dout("get_reply unknown tid %llu from osd%d\n", tid,
+ osd->o_osd);
goto out;
}
--
1.7.9.5
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH 1/8] libceph: be less chatty about stray replies
2012-07-30 15:59 ` [PATCH 1/8] libceph: be less chatty about stray replies Sage Weil
@ 2012-07-30 23:00 ` Alex Elder
0 siblings, 0 replies; 17+ messages in thread
From: Alex Elder @ 2012-07-30 23:00 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On 07/30/2012 10:59 AM, Sage Weil wrote:
> There are many (normal) conditions that can lead to us getting
> unexpected replies, include cluster topology changes, osd failures,
> and timeouts. There's no need to spam the console about it.
>
> Signed-off-by: Sage Weil <sage@inktank.com>
Looks good.
Reviewed-by: Alex Elder <elder@inktank.com>
> ---
> net/ceph/osd_client.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index ad427e6..42119c0 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -2037,8 +2037,8 @@ static struct ceph_msg *get_reply(struct ceph_connection *con,
> if (!req) {
> *skip = 1;
> m = NULL;
> - pr_info("get_reply unknown tid %llu from osd%d\n", tid,
> - osd->o_osd);
> + dout("get_reply unknown tid %llu from osd%d\n", tid,
> + osd->o_osd);
> goto out;
> }
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/8] ceph: update MAINTAINERS file
2012-07-30 15:59 [PATCH 0/8] Last batch of messenger fixes, misc Sage Weil
2012-07-30 15:59 ` [PATCH 1/8] libceph: be less chatty about stray replies Sage Weil
@ 2012-07-30 15:59 ` Sage Weil
2012-07-30 23:09 ` Alex Elder
2012-07-30 15:59 ` [PATCH 3/8] libceph: fix handling of immediate socket connect failure Sage Weil
` (5 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Sage Weil @ 2012-07-30 15:59 UTC (permalink / raw)
To: ceph-devel; +Cc: Sage Weil
* shiny new inktank.com email addresses
* add include/linux/crush directory (previous oversight)
Signed-off-by: Sage Weil <sage@inktank.com>
---
MAINTAINERS | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index fe643e7..b8af81a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1761,15 +1761,16 @@ F: arch/powerpc/oprofile/*cell*
F: arch/powerpc/platforms/cell/
CEPH DISTRIBUTED FILE SYSTEM CLIENT
-M: Sage Weil <sage@newdream.net>
+M: Sage Weil <sage@inktank.com>
L: ceph-devel@vger.kernel.org
-W: http://ceph.newdream.net/
+W: http://ceph.ceph/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git
S: Supported
F: Documentation/filesystems/ceph.txt
F: fs/ceph
F: net/ceph
F: include/linux/ceph
+F: include/linux/crush
CERTIFIED WIRELESS USB (WUSB) SUBSYSTEM:
L: linux-usb@vger.kernel.org
@@ -5596,8 +5597,8 @@ F: arch/hexagon/
RADOS BLOCK DEVICE (RBD)
F: include/linux/qnxtypes.h
-M: Yehuda Sadeh <yehuda@hq.newdream.net>
-M: Sage Weil <sage@newdream.net>
+M: Yehuda Sadeh <yehuda@inktank.com>
+M: Sage Weil <sage@inktank.com>
M: ceph-devel@vger.kernel.org
S: Supported
F: drivers/block/rbd.c
--
1.7.9.5
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH 2/8] ceph: update MAINTAINERS file
2012-07-30 15:59 ` [PATCH 2/8] ceph: update MAINTAINERS file Sage Weil
@ 2012-07-30 23:09 ` Alex Elder
0 siblings, 0 replies; 17+ messages in thread
From: Alex Elder @ 2012-07-30 23:09 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On 07/30/2012 10:59 AM, Sage Weil wrote:
> * shiny new inktank.com email addresses
> * add include/linux/crush directory (previous oversight)
>
> Signed-off-by: Sage Weil <sage@inktank.com>
If you implement my suggestions below, then:
Reviewed-by: Alex Elder <elder@inktank.com>
> ---
> MAINTAINERS | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index fe643e7..b8af81a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1761,15 +1761,16 @@ F: arch/powerpc/oprofile/*cell*
> F: arch/powerpc/platforms/cell/
>
> CEPH DISTRIBUTED FILE SYSTEM CLIENT
> -M: Sage Weil <sage@newdream.net>
> +M: Sage Weil <sage@inktank.com>
> L: ceph-devel@vger.kernel.org
> -W: http://ceph.newdream.net/
> +W: http://ceph.ceph/
http://ceph.com/ ?
> T: git git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git
> S: Supported
> F: Documentation/filesystems/ceph.txt
> F: fs/ceph
> F: net/ceph
> F: include/linux/ceph
> +F: include/linux/crush
>
> CERTIFIED WIRELESS USB (WUSB) SUBSYSTEM:
> L: linux-usb@vger.kernel.org
> @@ -5596,8 +5597,8 @@ F: arch/hexagon/
>
> RADOS BLOCK DEVICE (RBD)
> F: include/linux/qnxtypes.h
Delete this ^^^.
> -M: Yehuda Sadeh <yehuda@hq.newdream.net>
> -M: Sage Weil <sage@newdream.net>
> +M: Yehuda Sadeh <yehuda@inktank.com>
> +M: Sage Weil <sage@inktank.com>
> M: ceph-devel@vger.kernel.org
Add this:
T: git git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client.git
> S: Supported
> F: drivers/block/rbd.c
Add this:
F: drivers/block/rbd_types.h
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 3/8] libceph: fix handling of immediate socket connect failure
2012-07-30 15:59 [PATCH 0/8] Last batch of messenger fixes, misc Sage Weil
2012-07-30 15:59 ` [PATCH 1/8] libceph: be less chatty about stray replies Sage Weil
2012-07-30 15:59 ` [PATCH 2/8] ceph: update MAINTAINERS file Sage Weil
@ 2012-07-30 15:59 ` Sage Weil
2012-07-31 0:08 ` Alex Elder
2012-07-30 15:59 ` [PATCH 4/8] libceph: revoke mon_client messages on session restart Sage Weil
` (4 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Sage Weil @ 2012-07-30 15:59 UTC (permalink / raw)
To: ceph-devel; +Cc: Sage Weil
If the connect() call immediately fails such that sock == NULL, we
still need con_close_socket() to reset our socket state to CLOSED.
Signed-off-by: Sage Weil <sage@inktank.com>
---
net/ceph/messenger.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index fa16f2c..a6a0c7a 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -224,6 +224,8 @@ static void con_sock_state_init(struct ceph_connection *con)
old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
if (WARN_ON(old_state != CON_SOCK_STATE_NEW))
printk("%s: unexpected old state %d\n", __func__, old_state);
+ dout("%s con %p sock %d -> %d\n", __func__, con, old_state,
+ CON_SOCK_STATE_CLOSED);
}
static void con_sock_state_connecting(struct ceph_connection *con)
@@ -233,6 +235,8 @@ static void con_sock_state_connecting(struct ceph_connection *con)
old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTING);
if (WARN_ON(old_state != CON_SOCK_STATE_CLOSED))
printk("%s: unexpected old state %d\n", __func__, old_state);
+ dout("%s con %p sock %d -> %d\n", __func__, con, old_state,
+ CON_SOCK_STATE_CONNECTING);
}
static void con_sock_state_connected(struct ceph_connection *con)
@@ -242,6 +246,8 @@ static void con_sock_state_connected(struct ceph_connection *con)
old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTED);
if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING))
printk("%s: unexpected old state %d\n", __func__, old_state);
+ dout("%s con %p sock %d -> %d\n", __func__, con, old_state,
+ CON_SOCK_STATE_CONNECTED);
}
static void con_sock_state_closing(struct ceph_connection *con)
@@ -253,6 +259,8 @@ static void con_sock_state_closing(struct ceph_connection *con)
old_state != CON_SOCK_STATE_CONNECTED &&
old_state != CON_SOCK_STATE_CLOSING))
printk("%s: unexpected old state %d\n", __func__, old_state);
+ dout("%s con %p sock %d -> %d\n", __func__, con, old_state,
+ CON_SOCK_STATE_CLOSING);
}
static void con_sock_state_closed(struct ceph_connection *con)
@@ -262,8 +270,11 @@ static void con_sock_state_closed(struct ceph_connection *con)
old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTED &&
old_state != CON_SOCK_STATE_CLOSING &&
- old_state != CON_SOCK_STATE_CONNECTING))
+ old_state != CON_SOCK_STATE_CONNECTING &&
+ old_state != CON_SOCK_STATE_CLOSED))
printk("%s: unexpected old state %d\n", __func__, old_state);
+ dout("%s con %p sock %d -> %d\n", __func__, con, old_state,
+ CON_SOCK_STATE_CLOSED);
}
/*
@@ -448,14 +459,14 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
*/
static int con_close_socket(struct ceph_connection *con)
{
- int rc;
+ int rc = 0;
dout("con_close_socket on %p sock %p\n", con, con->sock);
- if (!con->sock)
- return 0;
- rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
- sock_release(con->sock);
- con->sock = NULL;
+ if (con->sock) {
+ rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
+ sock_release(con->sock);
+ con->sock = NULL;
+ }
/*
* Forcibly clear the SOCK_CLOSED flag. It gets set
@@ -464,6 +475,7 @@ static int con_close_socket(struct ceph_connection *con)
* shut the socket down.
*/
clear_bit(CON_FLAG_SOCK_CLOSED, &con->flags);
+
con_sock_state_closed(con);
return rc;
}
--
1.7.9.5
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH 3/8] libceph: fix handling of immediate socket connect failure
2012-07-30 15:59 ` [PATCH 3/8] libceph: fix handling of immediate socket connect failure Sage Weil
@ 2012-07-31 0:08 ` Alex Elder
0 siblings, 0 replies; 17+ messages in thread
From: Alex Elder @ 2012-07-31 0:08 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On 07/30/2012 10:59 AM, Sage Weil wrote:
> If the connect() call immediately fails such that sock == NULL, we
> still need con_close_socket() to reset our socket state to CLOSED.
>
> Signed-off-by: Sage Weil <sage@inktank.com>
Looks good.
I'm glad you're adding in dout() calls... I hadn't been doing
that as I was working on the messenger code.
Reviewed-by: Alex Elder <elder@inktank.com>
> ---
> net/ceph/messenger.c | 26 +++++++++++++++++++-------
> 1 file changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index fa16f2c..a6a0c7a 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -224,6 +224,8 @@ static void con_sock_state_init(struct ceph_connection *con)
> old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
> if (WARN_ON(old_state != CON_SOCK_STATE_NEW))
> printk("%s: unexpected old state %d\n", __func__, old_state);
> + dout("%s con %p sock %d -> %d\n", __func__, con, old_state,
> + CON_SOCK_STATE_CLOSED);
> }
>
> static void con_sock_state_connecting(struct ceph_connection *con)
> @@ -233,6 +235,8 @@ static void con_sock_state_connecting(struct ceph_connection *con)
> old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTING);
> if (WARN_ON(old_state != CON_SOCK_STATE_CLOSED))
> printk("%s: unexpected old state %d\n", __func__, old_state);
> + dout("%s con %p sock %d -> %d\n", __func__, con, old_state,
> + CON_SOCK_STATE_CONNECTING);
> }
>
> static void con_sock_state_connected(struct ceph_connection *con)
> @@ -242,6 +246,8 @@ static void con_sock_state_connected(struct ceph_connection *con)
> old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CONNECTED);
> if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTING))
> printk("%s: unexpected old state %d\n", __func__, old_state);
> + dout("%s con %p sock %d -> %d\n", __func__, con, old_state,
> + CON_SOCK_STATE_CONNECTED);
> }
>
> static void con_sock_state_closing(struct ceph_connection *con)
> @@ -253,6 +259,8 @@ static void con_sock_state_closing(struct ceph_connection *con)
> old_state != CON_SOCK_STATE_CONNECTED &&
> old_state != CON_SOCK_STATE_CLOSING))
> printk("%s: unexpected old state %d\n", __func__, old_state);
> + dout("%s con %p sock %d -> %d\n", __func__, con, old_state,
> + CON_SOCK_STATE_CLOSING);
> }
>
> static void con_sock_state_closed(struct ceph_connection *con)
> @@ -262,8 +270,11 @@ static void con_sock_state_closed(struct ceph_connection *con)
> old_state = atomic_xchg(&con->sock_state, CON_SOCK_STATE_CLOSED);
> if (WARN_ON(old_state != CON_SOCK_STATE_CONNECTED &&
> old_state != CON_SOCK_STATE_CLOSING &&
> - old_state != CON_SOCK_STATE_CONNECTING))
> + old_state != CON_SOCK_STATE_CONNECTING &&
> + old_state != CON_SOCK_STATE_CLOSED))
> printk("%s: unexpected old state %d\n", __func__, old_state);
> + dout("%s con %p sock %d -> %d\n", __func__, con, old_state,
> + CON_SOCK_STATE_CLOSED);
> }
>
> /*
> @@ -448,14 +459,14 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
> */
> static int con_close_socket(struct ceph_connection *con)
> {
> - int rc;
> + int rc = 0;
>
> dout("con_close_socket on %p sock %p\n", con, con->sock);
> - if (!con->sock)
> - return 0;
> - rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
> - sock_release(con->sock);
> - con->sock = NULL;
> + if (con->sock) {
> + rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
> + sock_release(con->sock);
> + con->sock = NULL;
> + }
>
> /*
> * Forcibly clear the SOCK_CLOSED flag. It gets set
> @@ -464,6 +475,7 @@ static int con_close_socket(struct ceph_connection *con)
> * shut the socket down.
> */
> clear_bit(CON_FLAG_SOCK_CLOSED, &con->flags);
> +
> con_sock_state_closed(con);
> return rc;
> }
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 4/8] libceph: revoke mon_client messages on session restart
2012-07-30 15:59 [PATCH 0/8] Last batch of messenger fixes, misc Sage Weil
` (2 preceding siblings ...)
2012-07-30 15:59 ` [PATCH 3/8] libceph: fix handling of immediate socket connect failure Sage Weil
@ 2012-07-30 15:59 ` Sage Weil
2012-07-31 0:09 ` Alex Elder
2012-07-30 15:59 ` [PATCH 5/8] libceph: verify state after retaking con lock after dispatch Sage Weil
` (3 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Sage Weil @ 2012-07-30 15:59 UTC (permalink / raw)
To: ceph-devel; +Cc: Sage Weil
Revoke all mon_client messages when we shut down the old connection.
This is mostly moot since we are re-using the same ceph_connection,
but it is cleaner.
Signed-off-by: Sage Weil <sage@inktank.com>
---
net/ceph/mon_client.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
index bfd21a8..105d533 100644
--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -118,6 +118,9 @@ static void __close_session(struct ceph_mon_client *monc)
{
dout("__close_session closing mon%d\n", monc->cur_mon);
ceph_msg_revoke(monc->m_auth);
+ ceph_msg_revoke_incoming(monc->m_auth_reply);
+ ceph_msg_revoke(monc->m_subscribe);
+ ceph_msg_revoke_incoming(monc->m_subscribe_ack);
ceph_con_close(&monc->con);
monc->cur_mon = -1;
monc->pending_auth = 0;
@@ -685,6 +688,7 @@ static void __resend_generic_request(struct ceph_mon_client *monc)
for (p = rb_first(&monc->generic_request_tree); p; p = rb_next(p)) {
req = rb_entry(p, struct ceph_mon_generic_request, node);
ceph_msg_revoke(req->request);
+ ceph_msg_revoke_incoming(req->reply);
ceph_con_send(&monc->con, ceph_msg_get(req->request));
}
}
--
1.7.9.5
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH 4/8] libceph: revoke mon_client messages on session restart
2012-07-30 15:59 ` [PATCH 4/8] libceph: revoke mon_client messages on session restart Sage Weil
@ 2012-07-31 0:09 ` Alex Elder
0 siblings, 0 replies; 17+ messages in thread
From: Alex Elder @ 2012-07-31 0:09 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On 07/30/2012 10:59 AM, Sage Weil wrote:
> Revoke all mon_client messages when we shut down the old connection.
> This is mostly moot since we are re-using the same ceph_connection,
> but it is cleaner.
>
> Signed-off-by: Sage Weil <sage@inktank.com>
Looks good.
Reviewed-by: Alex Elder <elder@inktank.com>
> ---
> net/ceph/mon_client.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
> index bfd21a8..105d533 100644
> --- a/net/ceph/mon_client.c
> +++ b/net/ceph/mon_client.c
> @@ -118,6 +118,9 @@ static void __close_session(struct ceph_mon_client *monc)
> {
> dout("__close_session closing mon%d\n", monc->cur_mon);
> ceph_msg_revoke(monc->m_auth);
> + ceph_msg_revoke_incoming(monc->m_auth_reply);
> + ceph_msg_revoke(monc->m_subscribe);
> + ceph_msg_revoke_incoming(monc->m_subscribe_ack);
> ceph_con_close(&monc->con);
> monc->cur_mon = -1;
> monc->pending_auth = 0;
> @@ -685,6 +688,7 @@ static void __resend_generic_request(struct ceph_mon_client *monc)
> for (p = rb_first(&monc->generic_request_tree); p; p = rb_next(p)) {
> req = rb_entry(p, struct ceph_mon_generic_request, node);
> ceph_msg_revoke(req->request);
> + ceph_msg_revoke_incoming(req->reply);
> ceph_con_send(&monc->con, ceph_msg_get(req->request));
> }
> }
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 5/8] libceph: verify state after retaking con lock after dispatch
2012-07-30 15:59 [PATCH 0/8] Last batch of messenger fixes, misc Sage Weil
` (3 preceding siblings ...)
2012-07-30 15:59 ` [PATCH 4/8] libceph: revoke mon_client messages on session restart Sage Weil
@ 2012-07-30 15:59 ` Sage Weil
2012-07-31 0:11 ` Alex Elder
2012-07-30 15:59 ` [PATCH 6/8] libceph: avoid dropping con mutex before fault Sage Weil
` (2 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Sage Weil @ 2012-07-30 15:59 UTC (permalink / raw)
To: ceph-devel; +Cc: Sage Weil
We drop the con mutex when delivering a message. When we retake the
lock, we need to verify we are still in the OPEN state before
preparing to read the next tag, or else we risk stepping on a
connection that has been closed.
Signed-off-by: Sage Weil <sage@inktank.com>
---
net/ceph/messenger.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index a6a0c7a..feb5a2a 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2003,7 +2003,6 @@ static void process_message(struct ceph_connection *con)
con->ops->dispatch(con, msg);
mutex_lock(&con->mutex);
- prepare_read_tag(con);
}
@@ -2213,6 +2212,8 @@ more:
if (con->in_tag == CEPH_MSGR_TAG_READY)
goto more;
process_message(con);
+ if (con->state == CON_STATE_OPEN)
+ prepare_read_tag(con);
goto more;
}
if (con->in_tag == CEPH_MSGR_TAG_ACK) {
--
1.7.9.5
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH 5/8] libceph: verify state after retaking con lock after dispatch
2012-07-30 15:59 ` [PATCH 5/8] libceph: verify state after retaking con lock after dispatch Sage Weil
@ 2012-07-31 0:11 ` Alex Elder
0 siblings, 0 replies; 17+ messages in thread
From: Alex Elder @ 2012-07-31 0:11 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On 07/30/2012 10:59 AM, Sage Weil wrote:
> We drop the con mutex when delivering a message. When we retake the
> lock, we need to verify we are still in the OPEN state before
> preparing to read the next tag, or else we risk stepping on a
> connection that has been closed.
>
> Signed-off-by: Sage Weil <sage@inktank.com>
Looks good. And good to move the prepare_read_tag() out too.
Reviewed-by: Alex Elder <elder@inktank.com>
> ---
> net/ceph/messenger.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index a6a0c7a..feb5a2a 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2003,7 +2003,6 @@ static void process_message(struct ceph_connection *con)
> con->ops->dispatch(con, msg);
>
> mutex_lock(&con->mutex);
> - prepare_read_tag(con);
> }
>
>
> @@ -2213,6 +2212,8 @@ more:
> if (con->in_tag == CEPH_MSGR_TAG_READY)
> goto more;
> process_message(con);
> + if (con->state == CON_STATE_OPEN)
> + prepare_read_tag(con);
> goto more;
> }
> if (con->in_tag == CEPH_MSGR_TAG_ACK) {
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 6/8] libceph: avoid dropping con mutex before fault
2012-07-30 15:59 [PATCH 0/8] Last batch of messenger fixes, misc Sage Weil
` (4 preceding siblings ...)
2012-07-30 15:59 ` [PATCH 5/8] libceph: verify state after retaking con lock after dispatch Sage Weil
@ 2012-07-30 15:59 ` Sage Weil
2012-07-31 0:12 ` Alex Elder
2012-07-30 15:59 ` [PATCH 7/8] libceph: change ceph_con_in_msg_alloc convention to be less weird Sage Weil
2012-07-30 15:59 ` [PATCH 8/8] libceph: recheck con state after allocating incoming message Sage Weil
7 siblings, 1 reply; 17+ messages in thread
From: Sage Weil @ 2012-07-30 15:59 UTC (permalink / raw)
To: ceph-devel; +Cc: Sage Weil
The ceph_fault() function takes the con mutex, so we should avoid
dropping it before calling it. This fixes a potential race with
another thread calling ceph_con_close(), or _open(), or similar (we
don't reverify con->state after retaking the lock).
Add annotation so that lockdep realizes we will drop the mutex before
returning.
Signed-off-by: Sage Weil <sage@inktank.com>
---
net/ceph/messenger.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index feb5a2a..c3b628c 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2336,7 +2336,6 @@ done_unlocked:
return;
fault:
- mutex_unlock(&con->mutex);
ceph_fault(con); /* error/fault path */
goto done_unlocked;
}
@@ -2347,9 +2346,8 @@ fault:
* exponential backoff
*/
static void ceph_fault(struct ceph_connection *con)
+ __releases(con->mutex)
{
- mutex_lock(&con->mutex);
-
pr_err("%s%lld %s %s\n", ENTITY_NAME(con->peer_name),
ceph_pr_addr(&con->peer_addr.in_addr), con->error_msg);
dout("fault %p state %lu to peer %s\n",
--
1.7.9.5
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH 6/8] libceph: avoid dropping con mutex before fault
2012-07-30 15:59 ` [PATCH 6/8] libceph: avoid dropping con mutex before fault Sage Weil
@ 2012-07-31 0:12 ` Alex Elder
0 siblings, 0 replies; 17+ messages in thread
From: Alex Elder @ 2012-07-31 0:12 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On 07/30/2012 10:59 AM, Sage Weil wrote:
> The ceph_fault() function takes the con mutex, so we should avoid
> dropping it before calling it. This fixes a potential race with
> another thread calling ceph_con_close(), or _open(), or similar (we
> don't reverify con->state after retaking the lock).
>
> Add annotation so that lockdep realizes we will drop the mutex before
> returning.
>
> Signed-off-by: Sage Weil <sage@inktank.com>
Looks good.
Reviewed-by: Alex Elder <elder@inktank.com>
> ---
> net/ceph/messenger.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index feb5a2a..c3b628c 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -2336,7 +2336,6 @@ done_unlocked:
> return;
>
> fault:
> - mutex_unlock(&con->mutex);
> ceph_fault(con); /* error/fault path */
> goto done_unlocked;
> }
> @@ -2347,9 +2346,8 @@ fault:
> * exponential backoff
> */
> static void ceph_fault(struct ceph_connection *con)
> + __releases(con->mutex)
> {
> - mutex_lock(&con->mutex);
> -
> pr_err("%s%lld %s %s\n", ENTITY_NAME(con->peer_name),
> ceph_pr_addr(&con->peer_addr.in_addr), con->error_msg);
> dout("fault %p state %lu to peer %s\n",
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 7/8] libceph: change ceph_con_in_msg_alloc convention to be less weird
2012-07-30 15:59 [PATCH 0/8] Last batch of messenger fixes, misc Sage Weil
` (5 preceding siblings ...)
2012-07-30 15:59 ` [PATCH 6/8] libceph: avoid dropping con mutex before fault Sage Weil
@ 2012-07-30 15:59 ` Sage Weil
2012-07-31 0:25 ` Alex Elder
2012-07-30 15:59 ` [PATCH 8/8] libceph: recheck con state after allocating incoming message Sage Weil
7 siblings, 1 reply; 17+ messages in thread
From: Sage Weil @ 2012-07-30 15:59 UTC (permalink / raw)
To: ceph-devel; +Cc: Sage Weil
This function's calling convention is very limiting. In particular,
we can't return any error other than ENOMEM (and only implicitly),
which is a problem (see next patch).
Instead, return an normal 0 or error code, and make the skip a pointer
output parameter. Drop the useless in_hdr argument (we have the con
pointer).
Signed-off-by: Sage Weil <sage@inktank.com>
---
net/ceph/messenger.c | 56 ++++++++++++++++++++++++++++----------------------
1 file changed, 31 insertions(+), 25 deletions(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index c3b628c..5177af0 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1733,9 +1733,7 @@ static int read_partial_message_section(struct ceph_connection *con,
return 1;
}
-static bool ceph_con_in_msg_alloc(struct ceph_connection *con,
- struct ceph_msg_header *hdr);
-
+static int ceph_con_in_msg_alloc(struct ceph_connection *con, int *skip);
static int read_partial_message_pages(struct ceph_connection *con,
struct page **pages,
@@ -1864,9 +1862,14 @@ static int read_partial_message(struct ceph_connection *con)
/* allocate message? */
if (!con->in_msg) {
+ int skip = 0;
+
dout("got hdr type %d front %d data %d\n", con->in_hdr.type,
con->in_hdr.front_len, con->in_hdr.data_len);
- if (ceph_con_in_msg_alloc(con, &con->in_hdr)) {
+ ret = ceph_con_in_msg_alloc(con, &skip);
+ if (ret < 0)
+ return ret;
+ if (skip) {
/* skip this message */
dout("alloc_msg said skip message\n");
BUG_ON(con->in_msg);
@@ -1876,12 +1879,8 @@ static int read_partial_message(struct ceph_connection *con)
con->in_seq++;
return 0;
}
- if (!con->in_msg) {
- con->error_msg =
- "error allocating memory for incoming message";
- return -ENOMEM;
- }
+ BUG_ON(!con->in_msg);
BUG_ON(con->in_msg->con != con);
m = con->in_msg;
m->front.iov_len = 0; /* haven't read it yet */
@@ -2715,43 +2714,50 @@ static int ceph_alloc_middle(struct ceph_connection *con, struct ceph_msg *msg)
* connection, and save the result in con->in_msg. Uses the
* connection's private alloc_msg op if available.
*
- * Returns true if the message should be skipped, false otherwise.
- * If true is returned (skip message), con->in_msg will be NULL.
- * If false is returned, con->in_msg will contain a pointer to the
- * newly-allocated message, or NULL in case of memory exhaustion.
+ * Returns 0 on success, or a negative error code.
+ *
+ * On success, *skip may be set to 1:
+ * - the next message should be skipped and ignored.
+ * - con->in_msg == NULL.
+ * or *skip may be 0:
+ * - con->in_msg is non-zero.
+ * On error (ENOMEM, EAGAIN, ...),
+ * - con->in_msg == NULL
*/
-static bool ceph_con_in_msg_alloc(struct ceph_connection *con,
- struct ceph_msg_header *hdr)
+static int ceph_con_in_msg_alloc(struct ceph_connection *con, int *skip)
{
+ struct ceph_msg_header *hdr = &con->in_hdr;
int type = le16_to_cpu(hdr->type);
int front_len = le32_to_cpu(hdr->front_len);
int middle_len = le32_to_cpu(hdr->middle_len);
- int ret;
+ int ret = 0;
BUG_ON(con->in_msg != NULL);
if (con->ops->alloc_msg) {
- int skip = 0;
-
mutex_unlock(&con->mutex);
- con->in_msg = con->ops->alloc_msg(con, hdr, &skip);
+ con->in_msg = con->ops->alloc_msg(con, hdr, skip);
mutex_lock(&con->mutex);
if (con->in_msg) {
con->in_msg->con = con->ops->get(con);
BUG_ON(con->in_msg->con == NULL);
}
- if (skip)
+ if (*skip) {
con->in_msg = NULL;
-
- if (!con->in_msg)
- return skip != 0;
+ return 0;
+ }
+ if (!con->in_msg) {
+ con->error_msg =
+ "error allocating memory for incoming message";
+ return -ENOMEM;
+ }
}
if (!con->in_msg) {
con->in_msg = ceph_msg_new(type, front_len, GFP_NOFS, false);
if (!con->in_msg) {
pr_err("unable to allocate msg type %d len %d\n",
type, front_len);
- return false;
+ return -ENOMEM;
}
con->in_msg->con = con->ops->get(con);
BUG_ON(con->in_msg->con == NULL);
@@ -2767,7 +2773,7 @@ static bool ceph_con_in_msg_alloc(struct ceph_connection *con,
}
}
- return false;
+ return ret;
}
--
1.7.9.5
^ permalink raw reply related [flat|nested] 17+ messages in thread* Re: [PATCH 7/8] libceph: change ceph_con_in_msg_alloc convention to be less weird
2012-07-30 15:59 ` [PATCH 7/8] libceph: change ceph_con_in_msg_alloc convention to be less weird Sage Weil
@ 2012-07-31 0:25 ` Alex Elder
2012-07-31 1:09 ` Sage Weil
0 siblings, 1 reply; 17+ messages in thread
From: Alex Elder @ 2012-07-31 0:25 UTC (permalink / raw)
To: Sage Weil; +Cc: ceph-devel
On 07/30/2012 10:59 AM, Sage Weil wrote:
> This function's calling convention is very limiting. In particular,
> we can't return any error other than ENOMEM (and only implicitly),
> which is a problem (see next patch).
>
> Instead, return an normal 0 or error code, and make the skip a pointer
> output parameter. Drop the useless in_hdr argument (we have the con
> pointer).
>
> Signed-off-by: Sage Weil <sage@inktank.com>
I think this is a good change. I have a few minor nits below
but it looks good.
Reviewed-by: Alex Elder <elder@inktank.com>
> ---
> net/ceph/messenger.c | 56 ++++++++++++++++++++++++++++----------------------
> 1 file changed, 31 insertions(+), 25 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index c3b628c..5177af0 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
. . .
> @@ -2715,43 +2714,50 @@ static int ceph_alloc_middle(struct ceph_connection *con, struct ceph_msg *msg)
> * connection, and save the result in con->in_msg. Uses the
> * connection's private alloc_msg op if available.
> *
> - * Returns true if the message should be skipped, false otherwise.
> - * If true is returned (skip message), con->in_msg will be NULL.
> - * If false is returned, con->in_msg will contain a pointer to the
> - * newly-allocated message, or NULL in case of memory exhaustion.
> + * Returns 0 on success, or a negative error code.
> + *
> + * On success, *skip may be set to 1:
> + * - the next message should be skipped and ignored.
> + * - con->in_msg == NULL.
I'm pretty sure you mean that con->in_msg will be null if *skip
is 1, but it isn't crystal clear the way it's written here.
> + * or *skip may be 0:
> + * - con->in_msg is non-zero.
Use "non-null" since it's a pointer, or "a valid message pointer".
> + * On error (ENOMEM, EAGAIN, ...),
> + * - con->in_msg == NULL
> */
> -static bool ceph_con_in_msg_alloc(struct ceph_connection *con,
> - struct ceph_msg_header *hdr)
> +static int ceph_con_in_msg_alloc(struct ceph_connection *con, int *skip)
> {
Getting rid of the hdr argument here was a good idea anyway, since
only &con->in_hdr is ever used.
> + struct ceph_msg_header *hdr = &con->in_hdr;
> int type = le16_to_cpu(hdr->type);
> int front_len = le32_to_cpu(hdr->front_len);
> int middle_len = le32_to_cpu(hdr->middle_len);
> - int ret;
> + int ret = 0;
>
> BUG_ON(con->in_msg != NULL);
>
> if (con->ops->alloc_msg) {
> - int skip = 0;
> -
> mutex_unlock(&con->mutex);
> - con->in_msg = con->ops->alloc_msg(con, hdr, &skip);
> + con->in_msg = con->ops->alloc_msg(con, hdr, skip);
> mutex_lock(&con->mutex);
> if (con->in_msg) {
> con->in_msg->con = con->ops->get(con);
> BUG_ON(con->in_msg->con == NULL);
> }
> - if (skip)
> + if (*skip) {
> con->in_msg = NULL;
> -
> - if (!con->in_msg)
> - return skip != 0;
> + return 0;
> + }
> + if (!con->in_msg) {
> + con->error_msg =
> + "error allocating memory for incoming message";
> + return -ENOMEM;
> + }
> }
> if (!con->in_msg) {
> con->in_msg = ceph_msg_new(type, front_len, GFP_NOFS, false);
> if (!con->in_msg) {
> pr_err("unable to allocate msg type %d len %d\n",
> type, front_len);
> - return false;
> + return -ENOMEM;
> }
> con->in_msg->con = con->ops->get(con);
> BUG_ON(con->in_msg->con == NULL);
> @@ -2767,7 +2773,7 @@ static bool ceph_con_in_msg_alloc(struct ceph_connection *con,
> }
> }
>
> - return false;
> + return ret;
I believe ret will always be 0 here, and if you agree, this
should just be more direct about it:
return 0;
> }
>
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH 7/8] libceph: change ceph_con_in_msg_alloc convention to be less weird
2012-07-31 0:25 ` Alex Elder
@ 2012-07-31 1:09 ` Sage Weil
0 siblings, 0 replies; 17+ messages in thread
From: Sage Weil @ 2012-07-31 1:09 UTC (permalink / raw)
To: Alex Elder; +Cc: ceph-devel
On Mon, 30 Jul 2012, Alex Elder wrote:
> On 07/30/2012 10:59 AM, Sage Weil wrote:
> > This function's calling convention is very limiting. In particular,
> > we can't return any error other than ENOMEM (and only implicitly),
> > which is a problem (see next patch).
> >
> > Instead, return an normal 0 or error code, and make the skip a pointer
> > output parameter. Drop the useless in_hdr argument (we have the con
> > pointer).
> >
> > Signed-off-by: Sage Weil <sage@inktank.com>
>
> I think this is a good change. I have a few minor nits below
> but it looks good.
>
> Reviewed-by: Alex Elder <elder@inktank.com>
>
> > ---
> > net/ceph/messenger.c | 56 ++++++++++++++++++++++++++++----------------------
> > 1 file changed, 31 insertions(+), 25 deletions(-)
> >
> > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> > index c3b628c..5177af0 100644
> > --- a/net/ceph/messenger.c
> > +++ b/net/ceph/messenger.c
>
> . . .
>
> > @@ -2715,43 +2714,50 @@ static int ceph_alloc_middle(struct ceph_connection *con, struct ceph_msg *msg)
> > * connection, and save the result in con->in_msg. Uses the
> > * connection's private alloc_msg op if available.
> > *
> > - * Returns true if the message should be skipped, false otherwise.
> > - * If true is returned (skip message), con->in_msg will be NULL.
> > - * If false is returned, con->in_msg will contain a pointer to the
> > - * newly-allocated message, or NULL in case of memory exhaustion.
> > + * Returns 0 on success, or a negative error code.
> > + *
> > + * On success, *skip may be set to 1:
> > + * - the next message should be skipped and ignored.
> > + * - con->in_msg == NULL.
>
> I'm pretty sure you mean that con->in_msg will be null if *skip
> is 1, but it isn't crystal clear the way it's written here.
>
> > + * or *skip may be 0:
> > + * - con->in_msg is non-zero.
>
> Use "non-null" since it's a pointer, or "a valid message pointer".
fixed those
>
> > + * On error (ENOMEM, EAGAIN, ...),
> > + * - con->in_msg == NULL
> > */
> > -static bool ceph_con_in_msg_alloc(struct ceph_connection *con,
> > - struct ceph_msg_header *hdr)
> > +static int ceph_con_in_msg_alloc(struct ceph_connection *con, int *skip)
> > {
>
> Getting rid of the hdr argument here was a good idea anyway, since
> only &con->in_hdr is ever used.
>
> > + struct ceph_msg_header *hdr = &con->in_hdr;
> > int type = le16_to_cpu(hdr->type);
> > int front_len = le32_to_cpu(hdr->front_len);
> > int middle_len = le32_to_cpu(hdr->middle_len);
> > - int ret;
> > + int ret = 0;
> >
> > BUG_ON(con->in_msg != NULL);
> >
> > if (con->ops->alloc_msg) {
> > - int skip = 0;
> > -
> > mutex_unlock(&con->mutex);
> > - con->in_msg = con->ops->alloc_msg(con, hdr, &skip);
> > + con->in_msg = con->ops->alloc_msg(con, hdr, skip);
> > mutex_lock(&con->mutex);
> > if (con->in_msg) {
> > con->in_msg->con = con->ops->get(con);
> > BUG_ON(con->in_msg->con == NULL);
> > }
> > - if (skip)
> > + if (*skip) {
> > con->in_msg = NULL;
> > -
> > - if (!con->in_msg)
> > - return skip != 0;
> > + return 0;
> > + }
> > + if (!con->in_msg) {
> > + con->error_msg =
> > + "error allocating memory for incoming message";
> > + return -ENOMEM;
> > + }
> > }
> > if (!con->in_msg) {
> > con->in_msg = ceph_msg_new(type, front_len, GFP_NOFS, false);
> > if (!con->in_msg) {
> > pr_err("unable to allocate msg type %d len %d\n",
> > type, front_len);
> > - return false;
> > + return -ENOMEM;
> > }
> > con->in_msg->con = con->ops->get(con);
> > BUG_ON(con->in_msg->con == NULL);
> > @@ -2767,7 +2773,7 @@ static bool ceph_con_in_msg_alloc(struct ceph_connection *con,
> > }
> > }
> >
> > - return false;
> > + return ret;
>
> I believe ret will always be 0 here, and if you agree, this
> should just be more direct about it:
I can be non-zero from the alloc_middle case above.
thanks
>
> return 0;
>
> > }
> >
> >
> >
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 8/8] libceph: recheck con state after allocating incoming message
2012-07-30 15:59 [PATCH 0/8] Last batch of messenger fixes, misc Sage Weil
` (6 preceding siblings ...)
2012-07-30 15:59 ` [PATCH 7/8] libceph: change ceph_con_in_msg_alloc convention to be less weird Sage Weil
@ 2012-07-30 15:59 ` Sage Weil
7 siblings, 0 replies; 17+ messages in thread
From: Sage Weil @ 2012-07-30 15:59 UTC (permalink / raw)
To: ceph-devel; +Cc: Sage Weil
We drop the lock when calling the ->alloc_msg() con op, which means
we need to (a) not clobber con->in_msg without the mutex held, and (b)
we need to verify that we are still in the OPEN state when we retake
it to avoid causing any mayhem. If the state does change, -EAGAIN
will get us back to con_work() and loop.
Signed-off-by: Sage Weil <sage@inktank.com>
---
net/ceph/messenger.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 5177af0..91ad692 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2735,9 +2735,16 @@ static int ceph_con_in_msg_alloc(struct ceph_connection *con, int *skip)
BUG_ON(con->in_msg != NULL);
if (con->ops->alloc_msg) {
+ struct ceph_msg *msg;
+
mutex_unlock(&con->mutex);
- con->in_msg = con->ops->alloc_msg(con, hdr, skip);
+ msg = con->ops->alloc_msg(con, hdr, skip);
mutex_lock(&con->mutex);
+ if (con->state != CON_STATE_OPEN) {
+ ceph_msg_put(msg);
+ return -EAGAIN;
+ }
+ con->in_msg = msg;
if (con->in_msg) {
con->in_msg->con = con->ops->get(con);
BUG_ON(con->in_msg->con == NULL);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 17+ messages in thread