From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Steve French"
Subject: Re: [PATCH] do not attempt to close cifs files which are already closed due to session reconnect
Date: Thu, 20 Nov 2008 10:43:24 -0600
Message-ID: <524f69650811200843v4ab856f1r45591871b79a3cc@mail.gmail.com>
References: <524f69650811181946s79fdba88w11c8c4c6677df1db@mail.gmail.com>
 <20081119070429.1d977f72@tleilax.poochiereds.net>
 <524f69650811192124w2677e939l74846ed709335efa@mail.gmail.com>
 <20081120080241.24e926f4@barsoom.rdu.redhat.com>
 <524f69650811200604x2e1a5529k5bd1075ca5e53ed0@mail.gmail.com>
 <20081120093900.44c967d2@barsoom.rdu.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: linux-fsdevel , "linux-cifs-client@lists.samba.org"
To: "Jeff Layton"
Return-path:
Received: from ey-out-2122.google.com ([74.125.78.25]:39123 "EHLO
 ey-out-2122.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753129AbYKTQn1 (ORCPT );
 Thu, 20 Nov 2008 11:43:27 -0500
Received: by ey-out-2122.google.com with SMTP id 6so223940eyi.37
 for ; Thu, 20 Nov 2008 08:43:24 -0800 (PST)
In-Reply-To: <20081120093900.44c967d2@barsoom.rdu.redhat.com>
Content-Disposition: inline
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Thu, Nov 20, 2008 at 8:39 AM, Jeff Layton wrote:
> On Thu, 20 Nov 2008 08:04:08 -0600
> "Steve French" wrote:
>
>> On Thu, Nov 20, 2008 at 7:02 AM, Jeff Layton wrote:
>> > On Wed, 19 Nov 2008 23:24:47 -0600
>> > "Steve French" wrote:
>> >
>> >> On Wed, Nov 19, 2008 at 6:04 AM, Jeff Layton wrote:
>> >> > On Tue, 18 Nov 2008 21:46:59 -0600
>> >> > "Steve French" wrote:
>> >> >
>> >> >> In hunting down why we could get EBADF returned on close in
>> >> >> some cases after reconnect, I found out that cifs_close was
>> >> >> checking whether the share (mounted server export) was valid
>> >> >> (i.e., did not need reconnect due to session crash/timeout),
>> >> >> but we were not checking whether the handle was valid (i.e.,
>> >> >> the share was reconnected, but the file handle was not yet
>> >> >> reopened). The patch also adds some locking around the
>> >> >> updates/checks of the cifs_file->invalidHandle flag.
>> >> >>
>> >> >
>> >> > Do we need a lock around this check for invalidHandle? Could
>> >> > this race with mark_open_files_invalid()?
>> >>
>> >> The attached patch may reduce the window of opportunity for the
>> >> race you describe. Do you think we need another flag? (One to
>> >> keep requests other than a write retry from using this handle,
>> >> and one to prevent reopen when the handle is about to be closed
>> >> after we have given up on write retries getting through?)
>> >>
>> >
>> >
>> > So that I make sure I understand the problem...
>> >
>> > We have a file that is getting ready to be closed (closePend is
>> > set), but the tcon has been reconnected and the filehandle is now
>> > invalid. You only want to reopen the file in order to flush data
>> > out of the cache, but only if there are actually dirty pages to
>> > be flushed.
>>
>> I don't think we have to worry about the normal case of flushing
>> dirty pages; that happens already before we get to cifs_close
>> (fput calls flush/fsync). The case I was thinking about is a write
>> on this handle that hung, the session reconnected, and we are
>> waiting for that pending write to complete.
>>
>> > If closePend is set then the userspace filehandle is already
>> > dead? No further pages can be dirtied, right?
>>
>> They could be dirtied from other handles, and writepages picks the
>> first handle that it can, since writepages does not specify which
>> handle to use. (writepages won't pick a handle that is close
>> pending, and it may be ok on retry because we look for a valid
>> handle each time we retry, so we shouldn't pick this one.)
>>
>
> Right, I was assuming that the inode has no other open filehandles...
>
> Even if there are other open filehandles though, we still want to
> flush whatever dirty pages we have, correct? Or at least start
> writeback on them...
>
>> > Rather than a new flag, I suggest checking for whether there are
>> > dirty pages attached to the inode. If so, then we'll want to
>> > reopen the file and flush it before finally closing it.
>>
>> There shouldn't be dirty pages if this is the last handle on the
>> inode being closed.
>>
>
> At the time that the "release" op is called (which is cifs_close in
> this case), there may still be dirty pages, even if this is the last
> filehandle, right?

I don't see how we could have dirty pages on that inode:
filemap_fdatawrite was called (by cifs_flush) before we got to
release, and writes on different handles would not have an oplock (if
there are any other handles), so we would call filemap_fdatawrite on
each of those (non-cached) writes on another handle.

> If so then it seems reasonable to just check to see if there are any
> dirty pages, reopen the file and start writeback if so.
>
> Alternately, I suppose we could consider skipping the
> reopen/writeback if there are other open filehandles for the inode.
> The idea would be that we could assume that the pages would get
> flushed when the last fh is closed. I'm not sure if this violates
> any close-to-open attribute semantics though.

I don't think it matters much. We only have the write-pending flag set
when we are actually using the file handle for a write
(find_writable_file increments it) ... if a write to that handle
failed by timing out, we would use a different handle or fail.

-- 
Thanks,

Steve