From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [DLM PATCH 0/6] Misc DLM Improvements Regarding Socket Errors
Date: Mon, 15 Feb 2016 16:16:17 -0500 (EST)
Message-ID: <538645327.23339354.1455570977299.JavaMail.zimbra@redhat.com>
In-Reply-To: <1845924157.21879838.1455215948999.JavaMail.zimbra@redhat.com>
----- Original Message -----
> ----- Original Message -----
> > On Wed, Feb 10, 2016 at 01:55:26PM -0500, Bob Peterson wrote:
> > > I've been doing a bunch of recovery testing with DLM and discovered some
> > > issues. This collection of 6 patches addresses those issues. Some of them
> > > are of my own making, introduced by the recent patches that made DLM
> > > print socket connection errors, and recovery from those errors.
> >
> > Thanks Bob, perhaps I've not been paying close enough attention, but it's
> > unclear to me how this patch set relates to the most acute issue we have
> > at the moment, which are the problems introduced here:
> >
> > From b3a5bbfd780d9e9291f5f257be06e9ad6db11657 Mon Sep 17 00:00:00 2001
> > From: Bob Peterson <rpeterso@redhat.com>
> > Date: Thu, 27 Aug 2015 09:34:47 -0500
> > Subject: [PATCH] dlm: print error from kernel_sendpage
> >
> > Print a dlm-specific error when a socket error occurs
> > when sending a dlm message.
> >
> > Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> > Signed-off-by: David Teigland <teigland@redhat.com>
> >
> > Could we begin with one patch that's easy to track that directly resolves
> > the issues with that commit (perhaps even a revert if it's not simple to
> > fix directly)? That brings us back to a known-good place, from which we
> > can look at cleanups and changes.
> >
> Hi Dave,
>
> My goal has always been to attain stability, which I think I've finally
> achieved.
>
> The problem is: While testing the dlm in multiple recovery situations,
> Nate and I discovered multiple problems. Until recently, no one has tried
> to run recovery tests on an upstream DLM, so I think we're finding some
> old bugs that have been there for a while, as well as bugs with b3a5bbfd,
> which you mentioned.
>
> I agree that some of these patches might be unnecessary improvements.
> I'll try to pare them down to what is absolutely necessary and what
> is not. I'll also document exactly why the necessary ones are needed.
Hi Dave,
Here is some more information on the set of DLM patches I recently posted,
and where things stand:
1. Patch: dlm: print error from kernel_sendpage
Commit: b3a5bbfd780d9e9291f5f257be06e9ad6db11657
Advantages: It allows dlm to report socket errors
Disadvantages: It caused some major problems:
Problem #1: nodeid_to_addr ends up occasionally being called from softirq
context, which is a problem because it takes a spinlock.
Problem #2: The first condition also does "return;" rather than calling
the original error report. This is a problem because the
original error report needs to be called to do socket cleanup.
The sunrpc implementation avoids this by doing that socket
cleanup manually inside its own error_report function.
Problem #3: It saves off the sk_error_report callback, but it never
restores the callback to its original value.
Problem #4: It only saves off the sk_error_report callback, but not any
of the other three callbacks. All four really ought to be saved
and restored once dlm is done with the socket, like sunrpc does.
Problem #5: If two competing socket errors occur, lowcomms_error_report
could, in theory, be called twice, causing socket cleanup
(from the original error_report function) to happen twice,
which results in a kernel panic (the details of which escape me,
but I could maybe recreate it).
2. Patch: DLM: Replace nodeid_to_addr with kernel_getpeername
Advantages: It fixes problem #1 above.
Disadvantages: It doesn't fix any of the other problems.
3. Patch: DLM: Call original error report when socket is NULL
Advantages: It fixes problem #2 above.
Disadvantages: It introduces a new problem below.
Problem: Error report recursion. Depending on timing, if/when
add_sock is called multiple times for the same socket, it saves
off the original sk_error_report multiple times. The first time,
it saves off the proper one and replaces it with lowcomms_error_report.
The second time, it saves lowcomms_error_report itself, which means
the next time lowcomms_error_report is called, it calls itself
and recurses without bound until the system crashes and is fenced.
NOTE #1: This problem is, in fact, already in the code today, for the other
two paths through lowcomms_error_report. This patch only makes the first
path do the same thing. In other words, the problem is already there; this
patch just makes it a lot more likely to happen.
NOTE #2: There are two ways to fix it. The first is to make dlm do the
socket cleanup, like sunrpc does. I don't like that because any cleanup
introduced in the calling code needs to be echoed to dlm, and whoever
makes that kind of change won't know to do it.
The second is to clean up the socket code so it doesn't save itself as the
original error_report callback, which is what subsequent patches do.
4. Patch: DLM: save / restore all socket callbacks
Advantages: This tries to fix problems 3 and 4.
Disadvantages: It has some sock-level locking, but not sk_callback_lock
locking like sunrpc has, which means it does not fix
problem #5 above.
5. Patch: DLM: Add locking to protect save callback assignments
Advantages: This tries to fix problem #5 above.
Disadvantages: None.
6. Patch: DLM: Don't create kernel socket until we have valid node address
This is a cleanup, unrelated to the others. This makes the TCP code path
similar to the SCTP code path.
7. Patch: DLM: Make consistent error path through tcp_create_listen_sock
This is a cleanup, unrelated to the others.
8. Patch: DLM: Eliminate useless goto
This is a cleanup, unrelated to the others.
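The error report recursion described under patch 3 can be illustrated with a
small userspace model. This is only a sketch, not the dlm code: struct
fake_sock, struct fake_conn, add_sock_naive, add_sock_guarded and saved_self
are hypothetical names made up for the illustration, and the real add_sock
and lowcomms_error_report operate on struct sock in the kernel. The model
assumes the handler finds its connection through sk_user_data.

```c
#include <assert.h>
#include <stddef.h>

struct fake_sock;
typedef void (*error_report_fn)(struct fake_sock *sk);

/* Userspace stand-in for the two struct sock fields that matter here. */
struct fake_sock {
    error_report_fn sk_error_report;
    void *sk_user_data;             /* points back at the connection */
};

/* Hypothetical per-connection state; the field name is illustrative. */
struct fake_conn {
    error_report_fn orig_error_report;
};

static int cleanup_calls;           /* how often the stack's handler ran */

/* Plays the role of the network stack's own error-report function. */
void stack_error_report(struct fake_sock *sk)
{
    (void)sk;
    cleanup_calls++;
}

/* Model of the dlm handler: chain to whatever callback was saved. */
void lowcomms_error_report(struct fake_sock *sk)
{
    struct fake_conn *con = sk->sk_user_data;

    assert(con != NULL && con->orig_error_report != NULL);
    /* If the saved pointer is this very function, this call never returns. */
    con->orig_error_report(sk);
}

/* Unguarded save: a second call captures lowcomms_error_report. */
void add_sock_naive(struct fake_conn *con, struct fake_sock *sk)
{
    sk->sk_user_data = con;
    con->orig_error_report = sk->sk_error_report;
    sk->sk_error_report = lowcomms_error_report;
}

/* Guarded save: never record our own handler as the "original". */
void add_sock_guarded(struct fake_conn *con, struct fake_sock *sk)
{
    sk->sk_user_data = con;
    if (sk->sk_error_report != lowcomms_error_report)
        con->orig_error_report = sk->sk_error_report;
    sk->sk_error_report = lowcomms_error_report;
}

/* True when invoking the handler would recurse into itself. */
int saved_self(const struct fake_conn *con)
{
    return con->orig_error_report == lowcomms_error_report;
}
```

Calling add_sock_naive twice on the same socket leaves saved_self() true,
which is the state that recurses until the stack overflows. With
add_sock_guarded the saved pointer stays at stack_error_report, so the
handler chains to the original exactly once per error.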
I think the "right thing to do" at this point is this:
1. Patch #1 is already upstream
2. Patch #2 stands on its own, so I think this should go forward.
3. Combine patches 3, 4 and 5, which ought to provide a comprehensive fix
for the other problems listed in #1.
4. The rest of the patches, I can post as separate patches because they are
code cleanups, not related to the original problems of #1.
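As a rough picture of what the combined patches 3, 4 and 5 are aiming for,
here is a userspace sketch of saving and restoring all four callbacks under
one lock. It is not the patch itself. The callback field names mirror
struct sock, but struct model_sock, struct saved_callbacks, the helper
functions and the atomic_flag spin lock (a stand-in for sk_callback_lock)
are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct model_sock;
typedef void (*sock_cb)(struct model_sock *sk);

/* Userspace stand-in for struct sock: a lock plus the four callbacks. */
struct model_sock {
    atomic_flag callback_lock;      /* stand-in for sk_callback_lock */
    sock_cb sk_state_change;
    sock_cb sk_data_ready;
    sock_cb sk_write_space;
    sock_cb sk_error_report;
};

/* Hypothetical holder for the originals; valid marks "saved, not restored". */
struct saved_callbacks {
    bool valid;
    sock_cb state_change;
    sock_cb data_ready;
    sock_cb write_space;
    sock_cb error_report;
};

static void cb_lock(struct model_sock *sk)
{
    while (atomic_flag_test_and_set(&sk->callback_lock))
        ;                           /* spin until the flag is released */
}

static void cb_unlock(struct model_sock *sk)
{
    atomic_flag_clear(&sk->callback_lock);
}

void stack_cb(struct model_sock *sk) { (void)sk; }  /* stack's default */
void dlm_cb(struct model_sock *sk) { (void)sk; }    /* dlm's replacement */

/* Point all four callbacks at one handler, under the lock. */
void install_callbacks(struct model_sock *sk, sock_cb cb)
{
    cb_lock(sk);
    sk->sk_state_change = cb;
    sk->sk_data_ready = cb;
    sk->sk_write_space = cb;
    sk->sk_error_report = cb;
    cb_unlock(sk);
}

void model_sock_init(struct model_sock *sk, sock_cb cb)
{
    atomic_flag_clear(&sk->callback_lock);
    install_callbacks(sk, cb);
}

/* Save the originals once; a repeated save leaves them untouched. */
void save_callbacks(struct model_sock *sk, struct saved_callbacks *s)
{
    assert(sk != NULL && s != NULL);
    cb_lock(sk);
    if (!s->valid) {
        s->state_change = sk->sk_state_change;
        s->data_ready = sk->sk_data_ready;
        s->write_space = sk->sk_write_space;
        s->error_report = sk->sk_error_report;
        s->valid = true;
    }
    cb_unlock(sk);
}

/* Restore the originals once; returns false if nothing was saved. */
bool restore_callbacks(struct model_sock *sk, struct saved_callbacks *s)
{
    bool restored = false;

    assert(sk != NULL && s != NULL);
    cb_lock(sk);
    if (s->valid) {
        sk->sk_state_change = s->state_change;
        sk->sk_data_ready = s->data_ready;
        sk->sk_write_space = s->write_space;
        sk->sk_error_report = s->error_report;
        s->valid = false;
        restored = true;
    }
    cb_unlock(sk);
    return restored;
}
```

In this model a second save_callbacks after install_callbacks(sk, dlm_cb)
cannot capture dlm_cb as the "original", and a second restore_callbacks is
a no-op, so two competing callers cannot both run the restore.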
Let me know your thoughts on the subject. If you like this plan, I can
re-test and post replacement patches tomorrow (hopefully).
Regards,
Bob Peterson
Red Hat File Systems