From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alex Elder Subject: [PATCH] libceph: WARN, don't BUG on unexpected connection states Date: Wed, 26 Dec 2012 10:46:01 -0600 Message-ID: <50DB29C9.4020703@inktank.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Return-path: Received: from mail-ia0-f182.google.com ([209.85.210.182]:55213 "EHLO mail-ia0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752819Ab2LZQpz (ORCPT ); Wed, 26 Dec 2012 11:45:55 -0500 Received: by mail-ia0-f182.google.com with SMTP id x2so7331111iad.27 for ; Wed, 26 Dec 2012 08:45:55 -0800 (PST) Sender: ceph-devel-owner@vger.kernel.org List-ID: To: ceph-devel@vger.kernel.org A number of assertions in the ceph messenger are implemented with BUG_ON(), killing the system if connection's state doesn't match what's expected. At this point our state model is (evidently) not well understood enough for these assertions to trigger a BUG(). Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...) so we learn about these issues without killing the machine. We now recognize that a connection fault can occur due to a socket closure at any time, regardless of the state of the connection. So there is really nothing we can assert about the state of the connection at that point so eliminate that assertion. Reported-by: Ugis Signed-off-by: Alex Elder --- net/ceph/messenger.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index 4b04ccc..dd71b0e 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -561,7 +561,7 @@ void ceph_con_open(struct ceph_connection *con, mutex_lock(&con->mutex); dout("con_open %p %s\n", con, ceph_pr_addr(&addr->in_addr)); - BUG_ON(con->state != CON_STATE_CLOSED); + WARN_ON(con->state != CON_STATE_CLOSED); con->state = CON_STATE_PREOPEN; con->peer_name.type = (__u8) entity_type; @@ -1509,7 +1509,7 @@ static int process_banner(struct ceph_connection *con) static void fail_protocol(struct ceph_connection *con) { reset_connection(con); - BUG_ON(con->state != CON_STATE_NEGOTIATING); + WARN_ON(con->state != CON_STATE_NEGOTIATING); con->state = CON_STATE_CLOSED; } @@ -1635,7 +1635,7 @@ static int process_connect(struct ceph_connection *con) return -1; } - BUG_ON(con->state != CON_STATE_NEGOTIATING); + WARN_ON(con->state != CON_STATE_NEGOTIATING); con->state = CON_STATE_OPEN; con->peer_global_seq = le32_to_cpu(con->in_reply.global_seq); @@ -2132,7 +2132,6 @@ more: if (ret < 0) goto out; - BUG_ON(con->state != CON_STATE_CONNECTING); con->state = CON_STATE_NEGOTIATING; /* @@ -2160,7 +2159,7 @@ more: goto more; } - BUG_ON(con->state != CON_STATE_OPEN); + WARN_ON(con->state != CON_STATE_OPEN); if (con->in_base_pos < 0) { /* @@ -2382,10 +2381,6 @@ static void ceph_fault(struct ceph_connection *con) dout("fault %p state %lu to peer %s\n", con, con->state, ceph_pr_addr(&con->peer_addr.in_addr)); - BUG_ON(con->state != CON_STATE_CONNECTING && - con->state != CON_STATE_NEGOTIATING && - con->state != CON_STATE_OPEN); - con_close_socket(con); if (test_bit(CON_FLAG_LOSSYTX, &con->flags)) { -- 1.7.9.5