All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] libceph: three bug fixes
@ 2012-12-27 23:09 Alex Elder
  2012-12-27 23:17 ` [PATCH 1/3] libceph: move linger requests sooner in kick_requests() Alex Elder
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Alex Elder @ 2012-12-27 23:09 UTC (permalink / raw)
  To: ceph-devel

This series includes three bug fixes that should be included
in Linux 3.8.

					-Alex

[PATCH 1/3] libceph: move linger requests sooner in kick_requests()
[PATCH 2/3] libceph: always reset osds when kicking
[PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] libceph: move linger requests sooner in kick_requests()
  2012-12-27 23:09 [PATCH 0/3] libceph: three bug fixes Alex Elder
@ 2012-12-27 23:17 ` Alex Elder
  2012-12-28  0:54   ` Sage Weil
  2012-12-27 23:17 ` [PATCH 2/3] libceph: always reset osds when kicking Alex Elder
  2012-12-27 23:17 ` [PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states Alex Elder
  2 siblings, 1 reply; 8+ messages in thread
From: Alex Elder @ 2012-12-27 23:17 UTC (permalink / raw)
  To: ceph-devel

The kick_requests() function is called by ceph_osdc_handle_map()
when an osd map change has been indicated.  Its purpose is to
re-queue any request whose target osd is different from what it
was when it was originally sent.

It is structured as two loops, one for incomplete but registered
requests, and a second for handling completed linger requests.
As a special case, in the first loop if a request marked to linger
has not yet completed, it is moved from the request list to the
linger list.  This is as a quick and dirty way to have the second
loop handle sending the request along with all the other linger
requests.

Because of the way it's done now, however, this quick and dirty
solution can result in these incomplete linger requests never
getting re-sent as desired.  The problem lies in the fact that
the second loop only arranges for a linger request to be sent
if it appears its target osd has changed.  This is the proper
handling for *completed* linger requests (it avoids issuing
the same linger request twice to the same osd).

But although the linger requests added to the list in the first loop
may have been sent, they have not yet completed, so they need to be
re-sent regardless of whether their target osd has changed.

The first required fix is we need to avoid calling __map_request()
on any incomplete linger request.  Otherwise the subsequent
__map_request() call in the second loop will find the target osd
has not changed and will therefore not re-send the request.

Second, we need to be sure that a sent but incomplete linger request
gets re-sent.  If the target osd is the same with the new osd map as
it was when the request was originally sent, this won't happen.
This can be fixed through careful handling when we move these
requests from the request list to the linger list, by unregistering
the request *before* it is registered as a linger request.  This
works because a side-effect of unregistering the request is to make
the request's r_osd pointer be NULL, and *that* will ensure the
second loop actually re-sends the linger request.

Processing of such a request is done at that point, so continue with
the next one once it's been moved.

Signed-off-by: Alex Elder <elder@inktank.com>
---
 net/ceph/osd_client.c |   31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 780caf6..616c6cf 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1284,6 +1284,25 @@ static void kick_requests(struct ceph_osd_client
*osdc, int force_resend)
 	for (p = rb_first(&osdc->requests); p; ) {
 		req = rb_entry(p, struct ceph_osd_request, r_node);
 		p = rb_next(p);
+
+		/*
+		 * For linger requests that have not yet been
+		 * registered, move them to the linger list; they'll
+		 * be sent to the osd in the loop below.  Unregister
+		 * the request before re-registering it as a linger
+		 * request to ensure the __map_request() below
+		 * will decide it needs to be sent.
+		 */
+		if (req->r_linger && list_empty(&req->r_linger_item)) {
+			dout("%p tid %llu restart on osd%d\n",
+			     req, req->r_tid,
+			     req->r_osd ? req->r_osd->o_osd : -1);
+			__unregister_request(osdc, req);
+			__register_linger_request(osdc, req);
+
+			continue;
+		}
+
 		err = __map_request(osdc, req, force_resend);
 		if (err < 0)
 			continue;  /* error */
@@ -1298,17 +1317,6 @@ static void kick_requests(struct ceph_osd_client
*osdc, int force_resend)
 				req->r_flags |= CEPH_OSD_FLAG_RETRY;
 			}
 		}
-		if (req->r_linger && list_empty(&req->r_linger_item)) {
-			/*
-			 * register as a linger so that we will
-			 * re-submit below and get a new tid
-			 */
-			dout("%p tid %llu restart on osd%d\n",
-			     req, req->r_tid,
-			     req->r_osd ? req->r_osd->o_osd : -1);
-			__register_linger_request(osdc, req);
-			__unregister_request(osdc, req);
-		}
 	}

 	list_for_each_entry_safe(req, nreq, &osdc->req_linger,
@@ -1316,6 +1324,7 @@ static void kick_requests(struct ceph_osd_client
*osdc, int force_resend)
 		dout("linger req=%p req->r_osd=%p\n", req, req->r_osd);

 		err = __map_request(osdc, req, force_resend);
+		dout("__map_request returned %d\n", err);
 		if (err == 0)
 			continue;  /* no change and no osd was specified */
 		if (err < 0)
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] libceph: always reset osds when kicking
  2012-12-27 23:09 [PATCH 0/3] libceph: three bug fixes Alex Elder
  2012-12-27 23:17 ` [PATCH 1/3] libceph: move linger requests sooner in kick_requests() Alex Elder
@ 2012-12-27 23:17 ` Alex Elder
  2012-12-28  0:55   ` Sage Weil
  2012-12-27 23:17 ` [PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states Alex Elder
  2 siblings, 1 reply; 8+ messages in thread
From: Alex Elder @ 2012-12-27 23:17 UTC (permalink / raw)
  To: ceph-devel

When ceph_osdc_handle_map() is called to process a new osd map,
kick_requests() is called to ensure all affected requests are
updated if necessary to reflect changes in the osd map.  This
happens in two cases:  whenever an incremental map update is
processed; and when a full map update (or the last one if there is
more than one) gets processed.

In the former case, the kick_requests() call is followed immediately
by a call to reset_changed_osds() to ensure any connections to osds
affected by the map change are reset.  But for full map updates
this isn't done.

Both cases should be doint this osd reset.

Rather than duplicating the reset_changed_osds() call, move it into
the end of kick_requests().

Signed-off-by: Alex Elder <elder@inktank.com>
---
 net/ceph/osd_client.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 616c6cf..0be8f0d 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1270,7 +1270,7 @@ static void reset_changed_osds(struct
ceph_osd_client *osdc)
  * Requeue requests whose mapping to an OSD has changed.  If requests
map to
  * no osd, request a new map.
  *
- * Caller should hold map_sem for read and request_mutex.
+ * Caller should hold map_sem for read.
  */
 static void kick_requests(struct ceph_osd_client *osdc, int force_resend)
 {
@@ -1346,6 +1346,7 @@ static void kick_requests(struct ceph_osd_client
*osdc, int force_resend)
 		dout("%d requests for down osds, need new map\n", needmap);
 		ceph_monc_request_next_osdmap(&osdc->client->monc);
 	}
+	reset_changed_osds(osdc);
 }


@@ -1402,7 +1403,6 @@ void ceph_osdc_handle_map(struct ceph_osd_client
*osdc, struct ceph_msg *msg)
 				osdc->osdmap = newmap;
 			}
 			kick_requests(osdc, 0);
-			reset_changed_osds(osdc);
 		} else {
 			dout("ignoring incremental map %u len %d\n",
 			     epoch, maplen);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states
  2012-12-27 23:09 [PATCH 0/3] libceph: three bug fixes Alex Elder
  2012-12-27 23:17 ` [PATCH 1/3] libceph: move linger requests sooner in kick_requests() Alex Elder
  2012-12-27 23:17 ` [PATCH 2/3] libceph: always reset osds when kicking Alex Elder
@ 2012-12-27 23:17 ` Alex Elder
  2012-12-28  0:57   ` Sage Weil
  2 siblings, 1 reply; 8+ messages in thread
From: Alex Elder @ 2012-12-27 23:17 UTC (permalink / raw)
  To: ceph-devel

A number of assertions in the ceph messenger are implemented with
BUG_ON(), killing the system if connection's state doesn't match
what's expected.  At this point our state model is (evidently) not
well understood enough for these assertions to trigger a BUG().
Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
so we learn about these issues without killing the machine.

We now recognize that a connection fault can occur due to a socket
closure at any time, regardless of the state of the connection.  So
there is really nothing we can assert about the state of the
connection at that point so eliminate that assertion.

Reported-by: Ugis <ugis22@gmail.com>
Tested-by: Ugis <ugis22@gmail.com>
Signed-off-by: Alex Elder <elder@inktank.com>
---
 net/ceph/messenger.c |   13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 4d111fd..075b9fd 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -561,7 +561,7 @@ void ceph_con_open(struct ceph_connection *con,
 	mutex_lock(&con->mutex);
 	dout("con_open %p %s\n", con, ceph_pr_addr(&addr->in_addr));

-	BUG_ON(con->state != CON_STATE_CLOSED);
+	WARN_ON(con->state != CON_STATE_CLOSED);
 	con->state = CON_STATE_PREOPEN;

 	con->peer_name.type = (__u8) entity_type;
@@ -1509,7 +1509,7 @@ static int process_banner(struct ceph_connection *con)
 static void fail_protocol(struct ceph_connection *con)
 {
 	reset_connection(con);
-	BUG_ON(con->state != CON_STATE_NEGOTIATING);
+	WARN_ON(con->state != CON_STATE_NEGOTIATING);
 	con->state = CON_STATE_CLOSED;
 }

@@ -1635,7 +1635,7 @@ static int process_connect(struct ceph_connection
*con)
 			return -1;
 		}

-		BUG_ON(con->state != CON_STATE_NEGOTIATING);
+		WARN_ON(con->state != CON_STATE_NEGOTIATING);
 		con->state = CON_STATE_OPEN;

 		con->peer_global_seq = le32_to_cpu(con->in_reply.global_seq);
@@ -2132,7 +2132,6 @@ more:
 		if (ret < 0)
 			goto out;

-		BUG_ON(con->state != CON_STATE_CONNECTING);
 		con->state = CON_STATE_NEGOTIATING;

 		/*
@@ -2160,7 +2159,7 @@ more:
 		goto more;
 	}

-	BUG_ON(con->state != CON_STATE_OPEN);
+	WARN_ON(con->state != CON_STATE_OPEN);

 	if (con->in_base_pos < 0) {
 		/*
@@ -2382,10 +2381,6 @@ static void ceph_fault(struct ceph_connection *con)
 	dout("fault %p state %lu to peer %s\n",
 	     con, con->state, ceph_pr_addr(&con->peer_addr.in_addr));

-	BUG_ON(con->state != CON_STATE_CONNECTING &&
-	       con->state != CON_STATE_NEGOTIATING &&
-	       con->state != CON_STATE_OPEN);
-
 	con_close_socket(con);

 	if (test_bit(CON_FLAG_LOSSYTX, &con->flags)) {
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/3] libceph: move linger requests sooner in kick_requests()
  2012-12-27 23:17 ` [PATCH 1/3] libceph: move linger requests sooner in kick_requests() Alex Elder
@ 2012-12-28  0:54   ` Sage Weil
  0 siblings, 0 replies; 8+ messages in thread
From: Sage Weil @ 2012-12-28  0:54 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

On Thu, 27 Dec 2012, Alex Elder wrote:
> The kick_requests() function is called by ceph_osdc_handle_map()
> when an osd map change has been indicated.  Its purpose is to
> re-queue any request whose target osd is different from what it
> was when it was originally sent.
> 
> It is structured as two loops, one for incomplete but registered
> requests, and a second for handling completed linger requests.
> As a special case, in the first loop if a request marked to linger
> has not yet completed, it is moved from the request list to the
> linger list.  This is as a quick and dirty way to have the second
> loop handle sending the request along with all the other linger
> requests.
> 
> Because of the way it's done now, however, this quick and dirty
> solution can result in these incomplete linger requests never
> getting re-sent as desired.  The problem lies in the fact that
> the second loop only arranges for a linger request to be sent
> if it appears its target osd has changed.  This is the proper
> handling for *completed* linger requests (it avoids issuing
> the same linger request twice to the same osd).
> 
> But although the linger requests added to the list in the first loop
> may have been sent, they have not yet completed, so they need to be
> re-sent regardless of whether their target osd has changed.
> 
> The first required fix is we need to avoid calling __map_request()
> on any incomplete linger request.  Otherwise the subsequent
> __map_request() call in the second loop will find the target osd
> has not changed and will therefore not re-send the request.
> 
> Second, we need to be sure that a sent but incomplete linger request
> gets re-sent.  If the target osd is the same with the new osd map as
> it was when the request was originally sent, this won't happen.
> This can be fixed through careful handling when we move these
> requests from the request list to the linger list, by unregistering
> the request *before* it is registered as a linger request.  This
> works because a side-effect of unregistering the request is to make
> the request's r_osd pointer be NULL, and *that* will ensure the
> second loop actually re-sends the linger request.
> 
> Processing of such a request is done at that point, so continue with
> the next one once it's been moved.
> 
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/osd_client.c |   31 ++++++++++++++++++++-----------
>  1 file changed, 20 insertions(+), 11 deletions(-)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 780caf6..616c6cf 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1284,6 +1284,25 @@ static void kick_requests(struct ceph_osd_client
> *osdc, int force_resend)
>  	for (p = rb_first(&osdc->requests); p; ) {
>  		req = rb_entry(p, struct ceph_osd_request, r_node);
>  		p = rb_next(p);
> +
> +		/*
> +		 * For linger requests that have not yet been
> +		 * registered, move them to the linger list; they'll
> +		 * be sent to the osd in the loop below.  Unregister
> +		 * the request before re-registering it as a linger
> +		 * request to ensure the __map_request() below
> +		 * will decide it needs to be sent.
> +		 */
> +		if (req->r_linger && list_empty(&req->r_linger_item)) {
> +			dout("%p tid %llu restart on osd%d\n",
> +			     req, req->r_tid,
> +			     req->r_osd ? req->r_osd->o_osd : -1);
> +			__unregister_request(osdc, req);
> +			__register_linger_request(osdc, req);
> +

Drop the newline?

Reviewed-by: Sage Weil <sage@inktank.com>


> +			continue;
> +		}
> +
>  		err = __map_request(osdc, req, force_resend);
>  		if (err < 0)
>  			continue;  /* error */
> @@ -1298,17 +1317,6 @@ static void kick_requests(struct ceph_osd_client
> *osdc, int force_resend)
>  				req->r_flags |= CEPH_OSD_FLAG_RETRY;
>  			}
>  		}
> -		if (req->r_linger && list_empty(&req->r_linger_item)) {
> -			/*
> -			 * register as a linger so that we will
> -			 * re-submit below and get a new tid
> -			 */
> -			dout("%p tid %llu restart on osd%d\n",
> -			     req, req->r_tid,
> -			     req->r_osd ? req->r_osd->o_osd : -1);
> -			__register_linger_request(osdc, req);
> -			__unregister_request(osdc, req);
> -		}
>  	}
> 
>  	list_for_each_entry_safe(req, nreq, &osdc->req_linger,
> @@ -1316,6 +1324,7 @@ static void kick_requests(struct ceph_osd_client
> *osdc, int force_resend)
>  		dout("linger req=%p req->r_osd=%p\n", req, req->r_osd);
> 
>  		err = __map_request(osdc, req, force_resend);
> +		dout("__map_request returned %d\n", err);
>  		if (err == 0)
>  			continue;  /* no change and no osd was specified */
>  		if (err < 0)
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/3] libceph: always reset osds when kicking
  2012-12-27 23:17 ` [PATCH 2/3] libceph: always reset osds when kicking Alex Elder
@ 2012-12-28  0:55   ` Sage Weil
  0 siblings, 0 replies; 8+ messages in thread
From: Sage Weil @ 2012-12-28  0:55 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

On Thu, 27 Dec 2012, Alex Elder wrote:
> When ceph_osdc_handle_map() is called to process a new osd map,
> kick_requests() is called to ensure all affected requests are
> updated if necessary to reflect changes in the osd map.  This
> happens in two cases:  whenever an incremental map update is
> processed; and when a full map update (or the last one if there is
> more than one) gets processed.
> 
> In the former case, the kick_requests() call is followed immediately
> by a call to reset_changed_osds() to ensure any connections to osds
> affected by the map change are reset.  But for full map updates
> this isn't done.
> 
> Both cases should be doint this osd reset.
> 
> Rather than duplicating the reset_changed_osds() call, move it into
> the end of kick_requests().
> 
> Signed-off-by: Alex Elder <elder@inktank.com>

Reviewed-by: Sage Weil <sage@inktank.com>

> ---
>  net/ceph/osd_client.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 616c6cf..0be8f0d 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -1270,7 +1270,7 @@ static void reset_changed_osds(struct
> ceph_osd_client *osdc)
>   * Requeue requests whose mapping to an OSD has changed.  If requests
> map to
>   * no osd, request a new map.
>   *
> - * Caller should hold map_sem for read and request_mutex.
> + * Caller should hold map_sem for read.
>   */
>  static void kick_requests(struct ceph_osd_client *osdc, int force_resend)
>  {
> @@ -1346,6 +1346,7 @@ static void kick_requests(struct ceph_osd_client
> *osdc, int force_resend)
>  		dout("%d requests for down osds, need new map\n", needmap);
>  		ceph_monc_request_next_osdmap(&osdc->client->monc);
>  	}
> +	reset_changed_osds(osdc);
>  }
> 
> 
> @@ -1402,7 +1403,6 @@ void ceph_osdc_handle_map(struct ceph_osd_client
> *osdc, struct ceph_msg *msg)
>  				osdc->osdmap = newmap;
>  			}
>  			kick_requests(osdc, 0);
> -			reset_changed_osds(osdc);
>  		} else {
>  			dout("ignoring incremental map %u len %d\n",
>  			     epoch, maplen);
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states
  2012-12-27 23:17 ` [PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states Alex Elder
@ 2012-12-28  0:57   ` Sage Weil
  2012-12-28  2:01     ` Alex Elder
  0 siblings, 1 reply; 8+ messages in thread
From: Sage Weil @ 2012-12-28  0:57 UTC (permalink / raw)
  To: Alex Elder; +Cc: ceph-devel

I agree that we should do BUG -> WARN on con->state everywhere.

But I don't think we should drop any of them, yet.  For Ugis's particular 
crash, it was fail_protocol()'s fault... see my other patch.  The rest of 
the time, a socket close should be caught at the top of con_work().

For any other cases where we see con->state changing when we don't expect 
it to, let's look at them on a case-by-case basis and address them in 
separate patches?

sage


On Thu, 27 Dec 2012, Alex Elder wrote:

> A number of assertions in the ceph messenger are implemented with
> BUG_ON(), killing the system if connection's state doesn't match
> what's expected.  At this point our state model is (evidently) not
> well understood enough for these assertions to trigger a BUG().
> Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
> so we learn about these issues without killing the machine.
> 
> We now recognize that a connection fault can occur due to a socket
> closure at any time, regardless of the state of the connection.  So
> there is really nothing we can assert about the state of the
> connection at that point so eliminate that assertion.
> 
> Reported-by: Ugis <ugis22@gmail.com>
> Tested-by: Ugis <ugis22@gmail.com>
> Signed-off-by: Alex Elder <elder@inktank.com>
> ---
>  net/ceph/messenger.c |   13 ++++---------
>  1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 4d111fd..075b9fd 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -561,7 +561,7 @@ void ceph_con_open(struct ceph_connection *con,
>  	mutex_lock(&con->mutex);
>  	dout("con_open %p %s\n", con, ceph_pr_addr(&addr->in_addr));
> 
> -	BUG_ON(con->state != CON_STATE_CLOSED);
> +	WARN_ON(con->state != CON_STATE_CLOSED);
>  	con->state = CON_STATE_PREOPEN;
> 
>  	con->peer_name.type = (__u8) entity_type;
> @@ -1509,7 +1509,7 @@ static int process_banner(struct ceph_connection *con)
>  static void fail_protocol(struct ceph_connection *con)
>  {
>  	reset_connection(con);
> -	BUG_ON(con->state != CON_STATE_NEGOTIATING);
> +	WARN_ON(con->state != CON_STATE_NEGOTIATING);
>  	con->state = CON_STATE_CLOSED;
>  }
> 
> @@ -1635,7 +1635,7 @@ static int process_connect(struct ceph_connection
> *con)
>  			return -1;
>  		}
> 
> -		BUG_ON(con->state != CON_STATE_NEGOTIATING);
> +		WARN_ON(con->state != CON_STATE_NEGOTIATING);
>  		con->state = CON_STATE_OPEN;
> 
>  		con->peer_global_seq = le32_to_cpu(con->in_reply.global_seq);
> @@ -2132,7 +2132,6 @@ more:
>  		if (ret < 0)
>  			goto out;
> 
> -		BUG_ON(con->state != CON_STATE_CONNECTING);
>  		con->state = CON_STATE_NEGOTIATING;
> 
>  		/*
> @@ -2160,7 +2159,7 @@ more:
>  		goto more;
>  	}
> 
> -	BUG_ON(con->state != CON_STATE_OPEN);
> +	WARN_ON(con->state != CON_STATE_OPEN);
> 
>  	if (con->in_base_pos < 0) {
>  		/*
> @@ -2382,10 +2381,6 @@ static void ceph_fault(struct ceph_connection *con)
>  	dout("fault %p state %lu to peer %s\n",
>  	     con, con->state, ceph_pr_addr(&con->peer_addr.in_addr));
> 
> -	BUG_ON(con->state != CON_STATE_CONNECTING &&
> -	       con->state != CON_STATE_NEGOTIATING &&
> -	       con->state != CON_STATE_OPEN);
> -
>  	con_close_socket(con);
> 
>  	if (test_bit(CON_FLAG_LOSSYTX, &con->flags)) {
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states
  2012-12-28  0:57   ` Sage Weil
@ 2012-12-28  2:01     ` Alex Elder
  0 siblings, 0 replies; 8+ messages in thread
From: Alex Elder @ 2012-12-28  2:01 UTC (permalink / raw)
  To: Sage Weil; +Cc: ceph-devel

On 12/27/2012 06:57 PM, Sage Weil wrote:
> I agree that we should do BUG -> WARN on con->state everywhere.
> 
> But I don't think we should drop any of them, yet.  For Ugis's particular 
> crash, it was fail_protocol()'s fault... see my other patch.  The rest of 
> the time, a socket close should be caught at the top of con_work().

I looked at it again, and I think you're right.  I'd rather keep
the constraints in anyway.  So I will remove that last hunk from
this patch.

> For any other cases where we see con->state changing when we don't expect 
> it to, let's look at them on a case-by-case basis and address them in 
> separate patches?

Good plan.  With WARN_ON() rather than BUG_ON() if we find something
we got wrong it won't be a serious problem.

					-Alex

> sage
> 
> 
> On Thu, 27 Dec 2012, Alex Elder wrote:
> 
>> A number of assertions in the ceph messenger are implemented with
>> BUG_ON(), killing the system if connection's state doesn't match
>> what's expected.  At this point our state model is (evidently) not
>> well understood enough for these assertions to trigger a BUG().
>> Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
>> so we learn about these issues without killing the machine.
>>
>> We now recognize that a connection fault can occur due to a socket
>> closure at any time, regardless of the state of the connection.  So
>> there is really nothing we can assert about the state of the
>> connection at that point so eliminate that assertion.
>>
>> Reported-by: Ugis <ugis22@gmail.com>
>> Tested-by: Ugis <ugis22@gmail.com>
>> Signed-off-by: Alex Elder <elder@inktank.com>
>> ---
>>  net/ceph/messenger.c |   13 ++++---------
>>  1 file changed, 4 insertions(+), 9 deletions(-)
>>
>> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
>> index 4d111fd..075b9fd 100644
>> --- a/net/ceph/messenger.c
>> +++ b/net/ceph/messenger.c
>> @@ -561,7 +561,7 @@ void ceph_con_open(struct ceph_connection *con,
>>  	mutex_lock(&con->mutex);
>>  	dout("con_open %p %s\n", con, ceph_pr_addr(&addr->in_addr));
>>
>> -	BUG_ON(con->state != CON_STATE_CLOSED);
>> +	WARN_ON(con->state != CON_STATE_CLOSED);
>>  	con->state = CON_STATE_PREOPEN;
>>
>>  	con->peer_name.type = (__u8) entity_type;
>> @@ -1509,7 +1509,7 @@ static int process_banner(struct ceph_connection *con)
>>  static void fail_protocol(struct ceph_connection *con)
>>  {
>>  	reset_connection(con);
>> -	BUG_ON(con->state != CON_STATE_NEGOTIATING);
>> +	WARN_ON(con->state != CON_STATE_NEGOTIATING);
>>  	con->state = CON_STATE_CLOSED;
>>  }
>>
>> @@ -1635,7 +1635,7 @@ static int process_connect(struct ceph_connection
>> *con)
>>  			return -1;
>>  		}
>>
>> -		BUG_ON(con->state != CON_STATE_NEGOTIATING);
>> +		WARN_ON(con->state != CON_STATE_NEGOTIATING);
>>  		con->state = CON_STATE_OPEN;
>>
>>  		con->peer_global_seq = le32_to_cpu(con->in_reply.global_seq);
>> @@ -2132,7 +2132,6 @@ more:
>>  		if (ret < 0)
>>  			goto out;
>>
>> -		BUG_ON(con->state != CON_STATE_CONNECTING);
>>  		con->state = CON_STATE_NEGOTIATING;
>>
>>  		/*
>> @@ -2160,7 +2159,7 @@ more:
>>  		goto more;
>>  	}
>>
>> -	BUG_ON(con->state != CON_STATE_OPEN);
>> +	WARN_ON(con->state != CON_STATE_OPEN);
>>
>>  	if (con->in_base_pos < 0) {
>>  		/*
>> @@ -2382,10 +2381,6 @@ static void ceph_fault(struct ceph_connection *con)
>>  	dout("fault %p state %lu to peer %s\n",
>>  	     con, con->state, ceph_pr_addr(&con->peer_addr.in_addr));
>>
>> -	BUG_ON(con->state != CON_STATE_CONNECTING &&
>> -	       con->state != CON_STATE_NEGOTIATING &&
>> -	       con->state != CON_STATE_OPEN);
>> -
>>  	con_close_socket(con);
>>
>>  	if (test_bit(CON_FLAG_LOSSYTX, &con->flags)) {
>> -- 
>> 1.7.9.5
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-12-28  2:01 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-27 23:09 [PATCH 0/3] libceph: three bug fixes Alex Elder
2012-12-27 23:17 ` [PATCH 1/3] libceph: move linger requests sooner in kick_requests() Alex Elder
2012-12-28  0:54   ` Sage Weil
2012-12-27 23:17 ` [PATCH 2/3] libceph: always reset osds when kicking Alex Elder
2012-12-28  0:55   ` Sage Weil
2012-12-27 23:17 ` [PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states Alex Elder
2012-12-28  0:57   ` Sage Weil
2012-12-28  2:01     ` Alex Elder

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.