From: Alex Elder <elder@inktank.com>
To: Sage Weil <sage@inktank.com>
Cc: ceph-devel@vger.kernel.org
Subject: Re: [PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states
Date: Thu, 27 Dec 2012 20:01:47 -0600 [thread overview]
Message-ID: <50DCFD8B.7000603@inktank.com> (raw)
In-Reply-To: <alpine.DEB.2.00.1212271655190.22449@cobra.newdream.net>
On 12/27/2012 06:57 PM, Sage Weil wrote:
> I agree that we should do BUG -> WARN on con->state everywhere.
>
> But I don't think we should drop any of them, yet. For Ugis's particular
> crash, it was fail_protocol()'s fault... see my other patch. The rest of
> the time, a socket close should be caught at the top of con_work().
I looked at it again, and I think you're right. I'd rather keep
the constraints in anyway. So I will remove that last hunk from
this patch.
> For any other cases where we see con->state changing when we don't expect
> it to, let's look at them on a case-by-case basis and address them in
> separate patches?
Good plan. With WARN_ON() rather than BUG_ON() if we find something
we got wrong it won't be a serious problem.
-Alex
> sage
>
>
> On Thu, 27 Dec 2012, Alex Elder wrote:
>
>> A number of assertions in the ceph messenger are implemented with
>> BUG_ON(), killing the system if connection's state doesn't match
>> what's expected. At this point our state model is (evidently) not
>> well understood enough for these assertions to trigger a BUG().
>> Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
>> so we learn about these issues without killing the machine.
>>
>> We now recognize that a connection fault can occur due to a socket
>> closure at any time, regardless of the state of the connection. So
>> there is really nothing we can assert about the state of the
>> connection at that point so eliminate that assertion.
>>
>> Reported-by: Ugis <ugis22@gmail.com>
>> Tested-by: Ugis <ugis22@gmail.com>
>> Signed-off-by: Alex Elder <elder@inktank.com>
>> ---
>> net/ceph/messenger.c | 13 ++++---------
>> 1 file changed, 4 insertions(+), 9 deletions(-)
>>
>> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
>> index 4d111fd..075b9fd 100644
>> --- a/net/ceph/messenger.c
>> +++ b/net/ceph/messenger.c
>> @@ -561,7 +561,7 @@ void ceph_con_open(struct ceph_connection *con,
>> mutex_lock(&con->mutex);
>> dout("con_open %p %s\n", con, ceph_pr_addr(&addr->in_addr));
>>
>> - BUG_ON(con->state != CON_STATE_CLOSED);
>> + WARN_ON(con->state != CON_STATE_CLOSED);
>> con->state = CON_STATE_PREOPEN;
>>
>> con->peer_name.type = (__u8) entity_type;
>> @@ -1509,7 +1509,7 @@ static int process_banner(struct ceph_connection *con)
>> static void fail_protocol(struct ceph_connection *con)
>> {
>> reset_connection(con);
>> - BUG_ON(con->state != CON_STATE_NEGOTIATING);
>> + WARN_ON(con->state != CON_STATE_NEGOTIATING);
>> con->state = CON_STATE_CLOSED;
>> }
>>
>> @@ -1635,7 +1635,7 @@ static int process_connect(struct ceph_connection
>> *con)
>> return -1;
>> }
>>
>> - BUG_ON(con->state != CON_STATE_NEGOTIATING);
>> + WARN_ON(con->state != CON_STATE_NEGOTIATING);
>> con->state = CON_STATE_OPEN;
>>
>> con->peer_global_seq = le32_to_cpu(con->in_reply.global_seq);
>> @@ -2132,7 +2132,6 @@ more:
>> if (ret < 0)
>> goto out;
>>
>> - BUG_ON(con->state != CON_STATE_CONNECTING);
>> con->state = CON_STATE_NEGOTIATING;
>>
>> /*
>> @@ -2160,7 +2159,7 @@ more:
>> goto more;
>> }
>>
>> - BUG_ON(con->state != CON_STATE_OPEN);
>> + WARN_ON(con->state != CON_STATE_OPEN);
>>
>> if (con->in_base_pos < 0) {
>> /*
>> @@ -2382,10 +2381,6 @@ static void ceph_fault(struct ceph_connection *con)
>> dout("fault %p state %lu to peer %s\n",
>> con, con->state, ceph_pr_addr(&con->peer_addr.in_addr));
>>
>> - BUG_ON(con->state != CON_STATE_CONNECTING &&
>> - con->state != CON_STATE_NEGOTIATING &&
>> - con->state != CON_STATE_OPEN);
>> -
>> con_close_socket(con);
>>
>> if (test_bit(CON_FLAG_LOSSYTX, &con->flags)) {
>> --
>> 1.7.9.5
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>
prev parent reply other threads:[~2012-12-28 2:01 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-27 23:09 [PATCH 0/3] libceph: three bug fixes Alex Elder
2012-12-27 23:17 ` [PATCH 1/3] libceph: move linger requests sooner in kick_requests() Alex Elder
2012-12-28 0:54 ` Sage Weil
2012-12-27 23:17 ` [PATCH 2/3] libceph: always reset osds when kicking Alex Elder
2012-12-28 0:55 ` Sage Weil
2012-12-27 23:17 ` [PATCH 3/3] libceph: WARN, don't BUG on unexpected connection states Alex Elder
2012-12-28 0:57 ` Sage Weil
2012-12-28 2:01 ` Alex Elder [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=50DCFD8B.7000603@inktank.com \
--to=elder@inktank.com \
--cc=ceph-devel@vger.kernel.org \
--cc=sage@inktank.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox