From: David Teigland <teigland@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] Kernel crash at DLM: kernel BUG at /usr/src/packages/BUILD/dlm-1.6.fio/obj/default/lowcomms.c:715!
Date: Thu, 29 Jan 2015 09:57:46 -0600
Message-ID: <20150129155746.GA22988@redhat.com>
In-Reply-To: <FEE6881C16279747A9F8AF438B84D745DDFA07@SACMBXIP01.sdcorp.global.sandisk.com>

On Thu, Jan 29, 2015 at 03:50:58AM +0000, Pralay Dakua wrote:
> 645 static int receive_from_sock(struct connection *con)
> 646 {
> ....
> ....
> 704
> 705         /* Process SCTP notifications */
> 706         if (msg.msg_flags & MSG_NOTIFICATION) {
> 707                 msg.msg_control = incmsg;
> 708                 msg.msg_controllen = sizeof(incmsg);
> 709
> 710                 process_sctp_notification(con, &msg,
> 711                                 page_address(con->rx_page) + con->cb.base);
> 712                 mutex_unlock(&con->sock_mutex);
> 713                 return 0;
> 714         }
> 715         BUG_ON(con->nodeid == 0);
> 
> 
> I am fairly new when it comes to understanding DLM code. We are using
> the SCTP protocol. If I understood correctly, nodeid = 0 points to the base
> connection (associated with the listener socket). The function
> receive_from_sock() assumes that if the MSG_NOTIFICATION flag is
> not set, it has to be a peeled socket (which has an associated nodeid > 0).
> And vice versa: if the MSG_NOTIFICATION flag is set, it is the listener
> socket with nodeid = 0.

> But when process_sctp_notification() rejects an SCTP event message due
> to an addr-to-nodeid mismatch (i.e. the dlm_addr_to_nodeid function returns
> non-zero), the function returns without peeling off a new socket.  The
> code is shown below, where the function is returning from line number
> 579. And the socket is peeled off at line number 588.  As the socket
> peel-off is not done, it is possible for the listener socket to receive
> ordinary data (which was meant for the peeled socket) from a connection
> where the client has already sent some data (I am assuming the client sent
> this data before the socket was shut down at the server end). And if the
> listener socket receives ordinary data, DLM is going to hit the "BUG_ON()"
> at lowcomms.c:715.
> 
> Please let me know if my analysis is correct.

I think you probably understand this code as well as anyone else at this
point, and I suspect you're correct.

As Chrissie suggested, removing the BUG_ON and ignoring the data is
probably the best option, but I'm not sure exactly how it should be
ignored.  Could it just return, or does it need to set some length to
zero first?
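
To make the "just return" option concrete, here is a rough user-space
sketch rather than a patch.  The structures and handle_received() are
hypothetical stand-ins, not the real lowcomms.c definitions.  It assumes
the receive buffer accounting is only advanced after the nodeid check
(that part of receive_from_sock() is elided in the excerpt above); if
that holds, dropping the data would need no length to be reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct cbuf {
	unsigned int base;	/* offset of the first unread byte */
	unsigned int len;	/* unread bytes currently buffered */
};

struct connection {
	int nodeid;		/* 0 means the base (listener) connection */
	struct cbuf cb;
};

enum rx_result { RX_NOTIFICATION, RX_DROPPED, RX_QUEUED };

/*
 * Models the decision taken after a successful receive of 'nbytes'.
 * Assumption: cb.len is only advanced after the nodeid check, so
 * dropping the data needs no extra cleanup of the buffer state.
 */
static enum rx_result handle_received(struct connection *con,
				      bool is_notification,
				      unsigned int nbytes)
{
	assert(con != NULL);

	if (is_notification)
		return RX_NOTIFICATION;	/* process_sctp_notification() path */

	if (con->nodeid == 0)
		return RX_DROPPED;	/* was BUG_ON(); cb is left untouched */

	con->cb.len += nbytes;		/* normal path: account for the data */
	return RX_QUEUED;
}
```

In the real function the early return would still have to drop
con->sock_mutex first, as the notification branch does at line 712.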

Dave


