From: Steve Dickson <SteveD@redhat.com>
To: Jeff Layton <jlayton@poochiereds.net>
Cc: Jeff Layton <jlayton@redhat.com>, linux-nfs@vger.kernel.org
Subject: Re: [PATCH v4 09/11] nfsdcld: reopen pipe if it's deleted and recreated
Date: Thu, 26 Jan 2012 10:31:01 -0500
Message-ID: <4F2171B5.2030103@RedHat.com>
In-Reply-To: <20120126093059.5c732d75@corrin.poochiereds.net>
On 01/26/2012 09:30 AM, Jeff Layton wrote:
> On Thu, 26 Jan 2012 08:28:30 -0500
> Jeff Layton <jlayton@redhat.com> wrote:
>
>> On Thu, 26 Jan 2012 07:47:51 -0500
>> Steve Dickson <SteveD@redhat.com> wrote:
>>
>>>
>>>
>>> On 01/25/2012 06:32 PM, Jeff Layton wrote:
>>>> On Wed, 25 Jan 2012 17:04:44 -0500
>>>> Steve Dickson <SteveD@redhat.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On 01/25/2012 03:28 PM, Jeff Layton wrote:
>>>>>> On Wed, 25 Jan 2012 14:31:10 -0500
>>>>>> Steve Dickson <SteveD@redhat.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 01/25/2012 02:09 PM, Jeff Layton wrote:
>>>>>>>> On Wed, 25 Jan 2012 13:16:24 -0500
>>>>>>>> Steve Dickson <SteveD@redhat.com> wrote:
>>>>>>>>
>>>>>>>>> Hey Jeff,
>>>>>>>>>
>>>>>>>>> Comments inline...
>>>>>>>>>
>>>>>>>>> On 01/23/2012 03:02 PM, Jeff Layton wrote:
>>>>>>>>>> This can happen if nfsd is shut down and restarted. If that occurs,
>>>>>>>>>> then reopen the pipe so we're not waiting for data on the defunct
>>>>>>>>>> pipe.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Jeff Layton <jlayton@redhat.com>
>>>>>>>>>> ---
>>>>>>>>>> utils/nfsdcld/nfsdcld.c | 84 +++++++++++++++++++++++++++++++++++++++++-----
>>>>>>>>>> 1 files changed, 74 insertions(+), 10 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/utils/nfsdcld/nfsdcld.c b/utils/nfsdcld/nfsdcld.c
>>>>>>>>>> index b0c08e2..0dc5b37 100644
>>>>>>>>>> --- a/utils/nfsdcld/nfsdcld.c
>>>>>>>>>> +++ b/utils/nfsdcld/nfsdcld.c
>>>>>>>>>> @@ -57,6 +57,8 @@ struct cld_client {
>>>>>>>>>>
>>>>>>>>>> /* global variables */
>>>>>>>>>> static char *pipepath = DEFAULT_CLD_PATH;
>>>>>>>>>> +static int inotify_fd = -1;
>>>>>>>>>> +static struct event pipedir_event;
>>>>>>>>>>
>>>>>>>>>> static struct option longopts[] =
>>>>>>>>>> {
>>>>>>>>>> @@ -68,8 +70,10 @@ static struct option longopts[] =
>>>>>>>>>> { NULL, 0, 0, 0 },
>>>>>>>>>> };
>>>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>> /* forward declarations */
>>>>>>>>>> static void cldcb(int UNUSED(fd), short which, void *data);
>>>>>>>>>> +static void cld_pipe_reopen(struct cld_client *clnt);
>>>>>>>>>>
>>>>>>>>>> static void
>>>>>>>>>> usage(char *progname)
>>>>>>>>>> @@ -80,10 +84,62 @@ usage(char *progname)
>>>>>>>>>>
>>>>>>>>>> #define INOTIFY_EVENT_MAX (sizeof(struct inotify_event) + NAME_MAX)
>>>>>>>>>>
>>>>>>>>>> +static void
>>>>>>>>>> +cld_inotify_cb(int UNUSED(fd), short which, void *data)
>>>>>>>>>> +{
>>>>>>>>>> + int ret, oldfd;
>>>>>>>>>> + char evbuf[INOTIFY_EVENT_MAX];
>>>>>>>>>> + char *dirc = NULL, *pname;
>>>>>>>>>> + struct inotify_event *event = (struct inotify_event *)evbuf;
>>>>>>>>>> + struct cld_client *clnt = data;
>>>>>>>>>> +
>>>>>>>>>> + if (which != EV_READ)
>>>>>>>>>> + return;
>>>>>>>>>> +
>>>>>>>>>> + dirc = strndup(pipepath, PATH_MAX);
>>>>>>>>>> + if (!dirc) {
>>>>>>>>>> + xlog_err("%s: unable to allocate memory", __func__);
>>>>>>>>>> + goto out;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + ret = read(inotify_fd, evbuf, INOTIFY_EVENT_MAX);
>>>>>>>>>> + if (ret < 0) {
>>>>>>>>>> + xlog_err("%s: read from inotify fd failed: %m", __func__);
>>>>>>>>>> + goto out;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + /* check to see if we have a filename in the evbuf */
>>>>>>>>>> + if (!event->len)
>>>>>>>>>> + goto out;
>>>>>>>>>> +
>>>>>>>>>> + pname = basename(dirc);
>>>>>>>>>> +
>>>>>>>>>> + /* does the filename match our pipe? */
>>>>>>>>>> + if (strncmp(pname, event->name, event->len))
>>>>>>>>>> + goto out;
>>>>>>>>>> +
>>>>>>>>>> + /*
>>>>>>>>>> + * reopen the pipe. The old fd is not closed until the new one is
>>>>>>>>>> + * opened, so we know they should be different if the reopen is
>>>>>>>>>> + * successful.
>>>>>>>>>> + */
>>>>>>>>>> + oldfd = clnt->cl_fd;
>>>>>>>>>> + do {
>>>>>>>>>> + cld_pipe_reopen(clnt);
>>>>>>>>>> + } while (oldfd == clnt->cl_fd);
>>>>>>>>> Doesn't this have a potential for an infinite loop?
>>>>>>>>>
>>>>>>>>> steved.
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes. If reopening the new pipe continually fails then it will loop
>>>>>>>> forever.
>>>>>>> Would it be more accurate to say it would be spinning forever?
>>>>>>> Since there is no sleep or delay in cld_pipe_reopen, what's
>>>>>>> going to stop the daemon from spinning in a CPU bound loop?
>>>>>>>
>>>>>>
>>>>>> Well, not spinning in a userspace loop...it'll continually be cycling on
>>>>>> an open() call that's not working for whatever reason. We sort of have
>>>>>> to loop on that though. I think the best we can do is add a sleep(1) in
>>>>>> there or something. Would that be sufficient?
>>>>>>
>>>>> I still think it's going to needlessly suck up CPU cycles...
>>>>>
>>>>> The way I handled this in the rpc.idmapd daemon was to do the
>>>>> reopen on a SIGHUP signal. Then in NFS server initscript
>>>>> I did the following:
>>>>> /usr/bin/pkill -HUP rpc.idmapd
>>>>>
>>>>> Thoughts?
>>>>>
>>>>
>>>> Ugh, that requires manual intervention if the pipe is removed and
>>>> recreated. If someone restarts nfsd and doesn't send the signal, then
>>>> they won't get the upcalls. I'd prefer something that "just works".
>>> I have not seen any bz open saying rpc.idmapd doesn't just work...
>>>
>>>>
>>>> Seriously, is it that big a deal to just loop here? One open(2) call
>>>> every second doesn't seem that bad, and honestly if a new pipe pops up
>>>> and the daemon can't open it then a few CPU cycles is the least of your
>>>> worries.
>>>>
>>> Put the daemon in that loop and then run the top command in another
>>> window. If the daemon is at the top of the list then it is a big
>>> deal, because that daemon will be at the top forever for no reason in
>>> the case of the NFS server not coming back.
>>>
>>
>> This situation is really unlikely. The daemon does not reopen the pipe
>> when the old one goes away. It reopens it when a new one with the same
>> name is recreated in the directory.
>>
>> That's an important distinction because in order to get into this loop,
>> you'd need to:
>>
>> 1/ remove the old pipe -- this happens when the daemon is shut down
Just to be clear, when this happens, that while loop is *not* executed
>>
>> 2/ create a new pipe -- this happens when the daemon is restarted
Then when this happens that while loop is *not* executed.
>>
>
> To clarify, the above happen when knfsd is stopped and started...
>
>> 3/ not be able to open the new pipe for some reason, even though you
>> were able to open the old one
Only when 1, 2, and 3 happen synchronously will that while loop be executed, correct?
More to the point, stopping the server will *not* cause this while
loop to be executed until the server is restarted, correct?
>>
>> The reason I put this in a loop is because it's possible (though not
>> likely) that you'd hit condition #3 temporarily. In that event, looping
>> and retrying an open(2) call every second seems entirely reasonable and
>> is more fault tolerant than just dying here. The open of a pipe takes
>> much less than 1s, so there's plenty of time between open attempts for
>> the machine to get other things done
By no means am I saying not to make it fault tolerant... Please do! I'm
just worried about the daemon spinning out of control.. :-)
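
For illustration only, here is a rough sketch of a retry that stays fault
tolerant without being able to spin forever. It is not code from the patch:
reopen_with_backoff(), REOPEN_MAX_TRIES, and the use of O_RDWR are all
assumptions made for the example, and the real change would presumably live
in or around cld_pipe_reopen().

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define REOPEN_MAX_TRIES 5	/* hypothetical cap, not from the patch */

/*
 * Try to open 'path' up to max_tries times. After each failed attempt
 * (except the last) sleep for 'delay' seconds, then double the delay.
 * Returns the new fd on success. On failure returns -1 with errno set
 * to the error from the last open() attempt, or EINVAL if max_tries
 * is not positive.
 */
static int
reopen_with_backoff(const char *path, int max_tries, unsigned int delay)
{
	int fd, saved_errno, try;

	if (max_tries <= 0) {
		errno = EINVAL;
		return -1;
	}

	for (try = 0; try < max_tries; try++) {
		fd = open(path, O_RDWR);	/* open mode is an assumption */
		if (fd >= 0)
			return fd;

		/* save errno before stdio has a chance to clobber it */
		saved_errno = errno;
		fprintf(stderr, "open of %s failed (try %d of %d): %s\n",
			path, try + 1, max_tries, strerror(saved_errno));

		if (try + 1 < max_tries) {
			sleep(delay);
			delay *= 2;
		}
	}

	errno = saved_errno;
	return -1;
}
```

The caller would keep the old fd until this returns a new one, e.g.
reopen_with_backoff(pipepath, REOPEN_MAX_TRIES, 1), and log and give up
if it returns -1. Note that sleep() blocks the whole event loop; a
libevent timer would probably be the better fit in the daemon itself,
sleep() just keeps the sketch short.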
>>
>> If it turns out that there's a problem, the admin can shut down the
>> daemon at that point. They may need to do so anyway in order to resolve
>> the situation if the thing preventing the opening of the pipe isn't
>> temporary.
I guess I would rather figure this out now, during the design, than
after the bits hit the street...
steved.