* [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
@ 2010-09-24 12:59 Jeff Layton
[not found] ` <1285333153-9113-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2010-09-24 12:59 UTC (permalink / raw)
To: smfrench-Re5JQEeQqe8AvxtiuMwx3w; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
It's possible that the kernel will call cifs_close on a filp while there
is still dirty data attached to the inode. If it's the last writeable
filp for an inode, then we may have no way to flush this data to the
server.
Thus, simply doing a filemap_fdatawrite in the .flush operation isn't
sufficient. The client needs to wait for it to complete before
returning. OTOH, there's no real need to perform a flush on a
filehandle that's read-only.
What currently saves us on this is the fact that cifs_writepages is
basically synchronous. If and when we ever switch to an async writeback
model, this will need to be fixed, so we might as well go ahead and
fix it now.
When flushing a writable filp, wait for the writeback to complete before
returning to ensure that the filehandle stays around until the writeback
is complete.
Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
fs/cifs/file.c | 17 +++++++----------
1 files changed, 7 insertions(+), 10 deletions(-)
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 80856f1..8084a8d 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1737,16 +1737,13 @@ int cifs_flush(struct file *file, fl_owner_t id)
struct inode *inode = file->f_path.dentry->d_inode;
int rc = 0;
- /* Rather than do the steps manually:
- lock the inode for writing
- loop through pages looking for write behind data (dirty pages)
- coalesce into contiguous 16K (or smaller) chunks to write to server
- send to server (prefer in parallel)
- deal with writebehind errors
- unlock inode for writing
- filemapfdatawrite appears easier for the time being */
-
- rc = filemap_fdatawrite(inode->i_mapping);
+ /*
+ * if the file is writable, then flush any dirty data to ensure that
+ * the filehandle isn't closed out before writeback completes.
+ */
+ if (file->f_mode & FMODE_WRITE)
+ rc = filemap_write_and_wait(inode->i_mapping);
+
/* reset wb rc if we were able to write out dirty pages */
if (!rc) {
rc = CIFS_I(inode)->write_behind_rc;
--
1.7.2.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <1285333153-9113-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2010-09-24 17:58 ` Steve French
[not found] ` <AANLkTimdoOPyAmhWw0kTtnZan6+gudJZn=vgqTNpEL+A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Steve French @ 2010-09-24 17:58 UTC (permalink / raw)
To: Jeff Layton; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
We need to see the performance impact. As you say cifs_writepages is
synchronous so we should be ok without it. Any test results
before/after?
On Fri, Sep 24, 2010 at 7:59 AM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> It's possible that the kernel will call cifs_close on a filp while there
> is still dirty data attached to the inode. If it's the last writeable
> filp for an inode, then we may have no way to flush this data to the
> server.
>
> Thus, simply doing a filemap_fdatawrite in the .flush operation isn't
> sufficient. The client needs to wait for it to complete before
> returning. OTOH, there's no real need to perform a flush on a
> filehandle that's read-only.
>
> What currently saves us on this is the fact that cifs_writepages is
> basically synchronous. If and when we ever switch to an async writeback
> model, this will need to be fixed, so we might as well go ahead and
> fix it now.
>
> When flushing a writable filp, wait for the writeback to complete before
> returning to ensure that the filehandle stays around until the writeback
> is complete.
>
> Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
> fs/cifs/file.c | 17 +++++++----------
> 1 files changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index 80856f1..8084a8d 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -1737,16 +1737,13 @@ int cifs_flush(struct file *file, fl_owner_t id)
> struct inode *inode = file->f_path.dentry->d_inode;
> int rc = 0;
>
> - /* Rather than do the steps manually:
> - lock the inode for writing
> - loop through pages looking for write behind data (dirty pages)
> - coalesce into contiguous 16K (or smaller) chunks to write to server
> - send to server (prefer in parallel)
> - deal with writebehind errors
> - unlock inode for writing
> - filemapfdatawrite appears easier for the time being */
> -
> - rc = filemap_fdatawrite(inode->i_mapping);
> + /*
> + * if the file is writable, then flush any dirty data to ensure that
> + * the filehandle isn't closed out before writeback completes.
> + */
> + if (file->f_mode & FMODE_WRITE)
> + rc = filemap_write_and_wait(inode->i_mapping);
> +
> /* reset wb rc if we were able to write out dirty pages */
> if (!rc) {
> rc = CIFS_I(inode)->write_behind_rc;
> --
> 1.7.2.3
>
>
--
Thanks,
Steve
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <AANLkTimdoOPyAmhWw0kTtnZan6+gudJZn=vgqTNpEL+A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-09-24 18:11 ` Jeff Layton
[not found] ` <20100924141137.4cb03197-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2010-09-24 18:11 UTC (permalink / raw)
To: Steve French; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Fri, 24 Sep 2010 12:58:55 -0500
Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> We need to see the performance impact. As you say cifs_writepages is
> synchronous so we should be ok without it. Any test results
> before/after?
>
No, I haven't tested this for performance. It is a correctness issue
though. We absolutely can't put the last reference to the last open
filehandle without flushing all of the data first.
My expectation here though is that this may help performance in some
cases since this patch also has it skip the flush on files open
read-only.
> On Fri, Sep 24, 2010 at 7:59 AM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > It's possible that the kernel will call cifs_close on a filp while there
> > is still dirty data attached to the inode. If it's the last writeable
> > filp for an inode, then we may have no way to flush this data to the
> > server.
> >
> > Thus, simply doing a filemap_fdatawrite in the .flush operation isn't
> > sufficient. The client needs to wait for it to complete before
> > returning. OTOH, there's no real need to perform a flush on a
> > filehandle that's read-only.
> >
> > What currently saves us on this is the fact that cifs_writepages is
> > basically synchronous. If and when we ever switch to an async writeback
> > model, this will need to be fixed, so we might as well go ahead and
> > fix it now.
> >
> > When flushing a writable filp, wait for the writeback to complete before
> > returning to ensure that the filehandle stays around until the writeback
> > is complete.
> >
> > Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> > ---
> > fs/cifs/file.c | 17 +++++++----------
> > 1 files changed, 7 insertions(+), 10 deletions(-)
> >
> > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> > index 80856f1..8084a8d 100644
> > --- a/fs/cifs/file.c
> > +++ b/fs/cifs/file.c
> > @@ -1737,16 +1737,13 @@ int cifs_flush(struct file *file, fl_owner_t id)
> > struct inode *inode = file->f_path.dentry->d_inode;
> > int rc = 0;
> >
> > - /* Rather than do the steps manually:
> > - lock the inode for writing
> > - loop through pages looking for write behind data (dirty pages)
> > - coalesce into contiguous 16K (or smaller) chunks to write to server
> > - send to server (prefer in parallel)
> > - deal with writebehind errors
> > - unlock inode for writing
> > - filemapfdatawrite appears easier for the time being */
> > -
> > - rc = filemap_fdatawrite(inode->i_mapping);
> > + /*
> > + * if the file is writable, then flush any dirty data to ensure that
> > + * the filehandle isn't closed out before writeback completes.
> > + */
> > + if (file->f_mode & FMODE_WRITE)
> > + rc = filemap_write_and_wait(inode->i_mapping);
> > +
> > /* reset wb rc if we were able to write out dirty pages */
> > if (!rc) {
> > rc = CIFS_I(inode)->write_behind_rc;
> > --
> > 1.7.2.3
> >
> >
>
>
>
--
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20100924141137.4cb03197-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
@ 2010-09-24 18:43 ` Steve French
[not found] ` <AANLkTinunozFdg9S+pBrNLpcgZKoAzVg2a4QXBOBi--u-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-09-25 4:23 ` Christoph Hellwig
1 sibling, 1 reply; 17+ messages in thread
From: Steve French @ 2010-09-24 18:43 UTC (permalink / raw)
To: Jeff Layton; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
It don't think it is a correctness issue - if close wants to do an
fsync why do we have a flush routine at all? close is the only place
flush is called. This seems very wrong to require additional
semantics beyond Unix semantics here (and slows close performance way
down unnecessarily). Even if we go async we would initiate i/o on
these before we return close to the user - and we are not going to
close the network handle of course until all network writes complete.
At a minimum, we don't need to do an fsync (flush with wait) on close
if there is more than one handle to that inode open - and should be
able to just do flush
On Fri, Sep 24, 2010 at 1:11 PM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Fri, 24 Sep 2010 12:58:55 -0500
> Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> We need to see the performance impact. As you say cifs_writepages is
>> synchronous so we should be ok without it. Any test results
>> before/after?
>>
>
> No, I haven't tested this for performance. It is a correctness issue
> though. We absolutely can't put the last reference to the last open
> filehandle without flushing all of the data first.
>
> My expectation here though is that this may help performance in some
> cases since this patch also has it skip the flush on files open
> read-only.
>
>> On Fri, Sep 24, 2010 at 7:59 AM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> > It's possible that the kernel will call cifs_close on a filp while there
>> > is still dirty data attached to the inode. If it's the last writeable
>> > filp for an inode, then we may have no way to flush this data to the
>> > server.
>> >
>> > Thus, simply doing a filemap_fdatawrite in the .flush operation isn't
>> > sufficient. The client needs to wait for it to complete before
>> > returning. OTOH, there's no real need to perform a flush on a
>> > filehandle that's read-only.
>> >
>> > What currently saves us on this is the fact that cifs_writepages is
>> > basically synchronous. If and when we ever switch to an async writeback
>> > model, this will need to be fixed, so we might as well go ahead and
>> > fix it now.
>> >
>> > When flushing a writable filp, wait for the writeback to complete before
>> > returning to ensure that the filehandle stays around until the writeback
>> > is complete.
>> >
>> > Signed-off-by: Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>> > ---
>> > fs/cifs/file.c | 17 +++++++----------
>> > 1 files changed, 7 insertions(+), 10 deletions(-)
>> >
>> > diff --git a/fs/cifs/file.c b/fs/cifs/file.c
>> > index 80856f1..8084a8d 100644
>> > --- a/fs/cifs/file.c
>> > +++ b/fs/cifs/file.c
>> > @@ -1737,16 +1737,13 @@ int cifs_flush(struct file *file, fl_owner_t id)
>> > struct inode *inode = file->f_path.dentry->d_inode;
>> > int rc = 0;
>> >
>> > - /* Rather than do the steps manually:
>> > - lock the inode for writing
>> > - loop through pages looking for write behind data (dirty pages)
>> > - coalesce into contiguous 16K (or smaller) chunks to write to server
>> > - send to server (prefer in parallel)
>> > - deal with writebehind errors
>> > - unlock inode for writing
>> > - filemapfdatawrite appears easier for the time being */
>> > -
>> > - rc = filemap_fdatawrite(inode->i_mapping);
>> > + /*
>> > + * if the file is writable, then flush any dirty data to ensure that
>> > + * the filehandle isn't closed out before writeback completes.
>> > + */
>> > + if (file->f_mode & FMODE_WRITE)
>> > + rc = filemap_write_and_wait(inode->i_mapping);
>> > +
>> > /* reset wb rc if we were able to write out dirty pages */
>> > if (!rc) {
>> > rc = CIFS_I(inode)->write_behind_rc;
>> > --
>> > 1.7.2.3
>> >
>> >
>>
>>
>>
>
>
> --
> Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
>
--
Thanks,
Steve
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <AANLkTinunozFdg9S+pBrNLpcgZKoAzVg2a4QXBOBi--u-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-09-24 19:08 ` Jeff Layton
[not found] ` <20100924150826.1819f930-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2010-09-24 19:08 UTC (permalink / raw)
To: Steve French; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Fri, 24 Sep 2010 13:43:40 -0500
Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> It don't think it is a correctness issue - if close wants to do an
> fsync why do we have a flush routine at all? close is the only place
> flush is called. This seems very wrong to require additional
> semantics beyond Unix semantics here (and slows close performance way
> down unnecessarily). Even if we go async we would initiate i/o on
> these before we return close to the user - and we are not going to
> close the network handle of course until all network writes complete.
>
> At a minimum, we don't need to do an fsync (flush with wait) on close
> if there is more than one handle to that inode open - and should be
> able to just do flush
>
What does this have to do with fsync? The flush operation is to flush
out data to the server prior to close. CIFS is not like a local fs or
even NFS. We have to have an open filehandle in order to write out
data.
So, we have no choice but to ensure that all of the data is written out
to the server prior to allowing the filehandle to be closed. Waiting
for writeback to complete is mandatory here. If not, then there is no
place for that data to go but into the bit-bucket.
You also can't reasonably try to get clever and only wait for the
"last" filehandle to be closed. Any check you do there will be racy.
Suppose we have 2 filps being closed simultaneously and they're both
in .flush at the same time? How do you ensure that one of them will
stick around and wait for the flush to complete?
--
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20100924150826.1819f930-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
@ 2010-09-24 19:24 ` Steve French
[not found] ` <AANLkTikX=cLcVK3FOFyiBeaYBQOSARHE1bnaKQcvXhX+-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Steve French @ 2010-09-24 19:24 UTC (permalink / raw)
To: Jeff Layton; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Fri, Sep 24, 2010 at 2:08 PM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Fri, 24 Sep 2010 13:43:40 -0500
> Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> It don't think it is a correctness issue - if close wants to do an
>> fsync why do we have a flush routine at all? close is the only place
>> flush is called. This seems very wrong to require additional
>> semantics beyond Unix semantics here (and slows close performance way
>> down unnecessarily). Even if we go async we would initiate i/o on
>> these before we return close to the user - and we are not going to
>> close the network handle of course until all network writes complete.
>>
>> At a minimum, we don't need to do an fsync (flush with wait) on close
>> if there is more than one handle to that inode open - and should be
>> able to just do flush
>>
>
> What does this have to do with fsync? The flush operation is to flush
> out data to the server prior to close. CIFS is not like a local fs or
> even NFS. We have to have an open filehandle in order to write out
> data.
fsync is an fs file operation, handle based - so why do we need a distinct
flush call if it has identical semantics?
--
Thanks,
Steve
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <AANLkTikX=cLcVK3FOFyiBeaYBQOSARHE1bnaKQcvXhX+-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-09-24 19:32 ` Jeff Layton
[not found] ` <20100924153206.48c202f4-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2010-09-24 19:32 UTC (permalink / raw)
To: Steve French; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Fri, 24 Sep 2010 14:24:31 -0500
Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, Sep 24, 2010 at 2:08 PM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > On Fri, 24 Sep 2010 13:43:40 -0500
> > Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> >> It don't think it is a correctness issue - if close wants to do an
> >> fsync why do we have a flush routine at all? close is the only place
> >> flush is called. This seems very wrong to require additional
> >> semantics beyond Unix semantics here (and slows close performance way
> >> down unnecessarily). Even if we go async we would initiate i/o on
> >> these before we return close to the user - and we are not going to
> >> close the network handle of course until all network writes complete.
> >>
> >> At a minimum, we don't need to do an fsync (flush with wait) on close
> >> if there is more than one handle to that inode open - and should be
> >> able to just do flush
> >>
> >
> > What does this have to do with fsync? The flush operation is to flush
> > out data to the server prior to close. CIFS is not like a local fs or
> > even NFS. We have to have an open filehandle in order to write out
> > data.
>
> fsync is an fs file operation, handle based - so why do we need a distinct
> flush call if it has identical semantics?
>
>
fsync has much more strict semantics than close. For many local
filesystems that implies a barrier or something equivalent to ensure
that it actually made it to the media. For the CIFS case, we also issue
a SMB_COM_FLUSH in the fsync case, but not for close.
flush is for the close(2) call. Not all filesystems need to flush data
to disk on close. A local filesystem may be perfectly able to keep
the data in pagecache and write it out whenever it gets to it even
long after all file descriptors are closed.
CIFS is not like that -- no filehandle == no writeback. Thus, we need to
wait for writeback to complete before allowing writable filehandles to
be closed.
--
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20100924153206.48c202f4-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
@ 2010-09-24 19:36 ` Steve French
[not found] ` <AANLkTinECWNE9B56fYhp7iahp3QN8S_f0BHV9XS4SBhX-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Steve French @ 2010-09-24 19:36 UTC (permalink / raw)
To: Jeff Layton; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Fri, Sep 24, 2010 at 2:32 PM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Fri, 24 Sep 2010 14:24:31 -0500
> Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> On Fri, Sep 24, 2010 at 2:08 PM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> > On Fri, 24 Sep 2010 13:43:40 -0500
>> > Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> >
>> >> It don't think it is a correctness issue - if close wants to do an
>> >> fsync why do we have a flush routine at all? close is the only place
>> >> flush is called. This seems very wrong to require additional
>> >> semantics beyond Unix semantics here (and slows close performance way
>> >> down unnecessarily). Even if we go async we would initiate i/o on
>> >> these before we return close to the user - and we are not going to
>> >> close the network handle of course until all network writes complete.
>> >>
>> >> At a minimum, we don't need to do an fsync (flush with wait) on close
>> >> if there is more than one handle to that inode open - and should be
>> >> able to just do flush
>> >>
>> >
>> > What does this have to do with fsync? The flush operation is to flush
>> > out data to the server prior to close. CIFS is not like a local fs or
>> > even NFS. We have to have an open filehandle in order to write out
>> > data.
>>
>> fsync is an fs file operation, handle based - so why do we need a distinct
>> flush call if it has identical semantics?
>>
>>
>
> fsync has much more strict semantics than close. For many local
> filesystems that implies a barrier or something equivalent to ensure
> that it actually made it to the media. For the CIFS case, we also issue
> a SMB_COM_FLUSH in the fsync case, but not for close.
>
> flush is for the close(2) call. Not all filesystems need to flush data
> to disk on close. A local filesystem may be perfectly able to keep
> the data in pagecache and write it out whenever it gets to it even
> long after all file descriptors are closed.
>
> CIFS is not like that -- no filehandle == no writeback. Thus, we need to
> wait for writeback to complete before allowing writable filehandles to
> be closed.
? We do that already - we don't close the last writeable filehandle until
all writes complete. If we had too - we could hold up the close of the
user's last writeable file handle until all writes complete and
we can close the last writeable cifs network handle - but we end up
doing that already.
--
Thanks,
Steve
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <AANLkTinECWNE9B56fYhp7iahp3QN8S_f0BHV9XS4SBhX-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-09-24 19:45 ` Jeff Layton
0 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2010-09-24 19:45 UTC (permalink / raw)
To: Steve French; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Fri, 24 Sep 2010 14:36:19 -0500
Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, Sep 24, 2010 at 2:32 PM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> > On Fri, 24 Sep 2010 14:24:31 -0500
> > Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> >> On Fri, Sep 24, 2010 at 2:08 PM, Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> >> > On Fri, 24 Sep 2010 13:43:40 -0500
> >> > Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >> >
> >> >> It don't think it is a correctness issue - if close wants to do an
> >> >> fsync why do we have a flush routine at all? close is the only place
> >> >> flush is called. This seems very wrong to require additional
> >> >> semantics beyond Unix semantics here (and slows close performance way
> >> >> down unnecessarily). Even if we go async we would initiate i/o on
> >> >> these before we return close to the user - and we are not going to
> >> >> close the network handle of course until all network writes complete.
> >> >>
> >> >> At a minimum, we don't need to do an fsync (flush with wait) on close
> >> >> if there is more than one handle to that inode open - and should be
> >> >> able to just do flush
> >> >>
> >> >
> >> > What does this have to do with fsync? The flush operation is to flush
> >> > out data to the server prior to close. CIFS is not like a local fs or
> >> > even NFS. We have to have an open filehandle in order to write out
> >> > data.
> >>
> >> fsync is an fs file operation, handle based - so why do we need a distinct
> >> flush call if it has identical semantics?
> >>
> >>
> >
> > fsync has much more strict semantics than close. For many local
> > filesystems that implies a barrier or something equivalent to ensure
> > that it actually made it to the media. For the CIFS case, we also issue
> > a SMB_COM_FLUSH in the fsync case, but not for close.
> >
> > flush is for the close(2) call. Not all filesystems need to flush data
> > to disk on close. A local filesystem may be perfectly able to keep
> > the data in pagecache and write it out whenever it gets to it even
> > long after all file descriptors are closed.
> >
> > CIFS is not like that -- no filehandle == no writeback. Thus, we need to
> > wait for writeback to complete before allowing writable filehandles to
> > be closed.
>
> ? We do that already - we don't close the last writeable filehandle until
> all writes complete. If we had too - we could hold up the close of the
> user's last writeable file handle until all writes complete and
> we can close the last writeable cifs network handle - but we end up
> doing that already.
>
>
Can you point out where the code waits for writeback to complete before
the last writable filehandle is closed? I'm not seeing it...
--
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20100924141137.4cb03197-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2010-09-24 18:43 ` Steve French
@ 2010-09-25 4:23 ` Christoph Hellwig
[not found] ` <20100925042330.GA11930-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2010-09-25 4:23 UTC (permalink / raw)
To: Jeff Layton; +Cc: Steve French, linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Fri, Sep 24, 2010 at 02:11:37PM -0400, Jeff Layton wrote:
> On Fri, 24 Sep 2010 12:58:55 -0500
> Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
> > We need to see the performance impact. As you say cifs_writepages is
> > synchronous so we should be ok without it. Any test results
> > before/after?
> >
>
> No, I haven't tested this for performance. It is a correctness issue
> though. We absolutely can't put the last reference to the last open
> filehandle without flushing all of the data first.
>
> My expectation here though is that this may help performance in some
> cases since this patch also has it skip the flush on files open
> read-only.
->flush is called on every close call, ->release on the last close for a
given file pointer. Maybe you want a filemap_flush in ->flush and
filemap_write_and_wait in ->release?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20100925042330.GA11930-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2010-09-25 5:31 ` Steve French
2010-09-25 11:33 ` Jeff Layton
2010-09-25 12:16 ` Jeff Layton
2 siblings, 0 replies; 17+ messages in thread
From: Steve French @ 2010-09-25 5:31 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jeff Layton, linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Fri, Sep 24, 2010 at 11:23 PM, Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> wrote:
> On Fri, Sep 24, 2010 at 02:11:37PM -0400, Jeff Layton wrote:
>> On Fri, 24 Sep 2010 12:58:55 -0500
>> Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>>
>> > We need to see the performance impact. As you say cifs_writepages is
>> > synchronous so we should be ok without it. Any test results
>> > before/after?
>> >
>>
>> No, I haven't tested this for performance. It is a correctness issue
>> though. We absolutely can't put the last reference to the last open
>> filehandle without flushing all of the data first.
>>
>> My expectation here though is that this may help performance in some
>> cases since this patch also has it skip the flush on files open
>> read-only.
>
> ->flush is called on every close call, ->release on the last close for a
> given file pointer. Maybe you want a filemap_flush in ->flush and
> filemap_write_and_wait in ->release?
seems reasonable
--
Thanks,
Steve
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20100925042330.GA11930-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2010-09-25 5:31 ` Steve French
@ 2010-09-25 11:33 ` Jeff Layton
2010-09-25 12:16 ` Jeff Layton
2 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2010-09-25 11:33 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Steve French, linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Sat, 25 Sep 2010 00:23:30 -0400
Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> wrote:
> On Fri, Sep 24, 2010 at 02:11:37PM -0400, Jeff Layton wrote:
> > On Fri, 24 Sep 2010 12:58:55 -0500
> > Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> > > We need to see the performance impact. As you say cifs_writepages is
> > > synchronous so we should be ok without it. Any test results
> > > before/after?
> > >
> >
> > No, I haven't tested this for performance. It is a correctness issue
> > though. We absolutely can't put the last reference to the last open
> > filehandle without flushing all of the data first.
> >
> > My expectation here though is that this may help performance in some
> > cases since this patch also has it skip the flush on files open
> > read-only.
>
> ->flush is called on every close call, ->release on the last close for a
> given file pointer. Maybe you want a filemap_flush in ->flush and
> filemap_write_and_wait in ->release?
>
Good point. By doing that the right way we should be able to get rid of
that nasty wait loop in the cifs release op too. I'll respin and
resend...
Thanks,
--
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20100925042330.GA11930-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2010-09-25 5:31 ` Steve French
2010-09-25 11:33 ` Jeff Layton
@ 2010-09-25 12:16 ` Jeff Layton
[not found] ` <20100925081644.777952fe-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2 siblings, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2010-09-25 12:16 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Steve French, linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Sat, 25 Sep 2010 00:23:30 -0400
Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> wrote:
> On Fri, Sep 24, 2010 at 02:11:37PM -0400, Jeff Layton wrote:
> > On Fri, 24 Sep 2010 12:58:55 -0500
> > Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> > > We need to see the performance impact. As you say cifs_writepages is
> > > synchronous so we should be ok without it. Any test results
> > > before/after?
> > >
> >
> > No, I haven't tested this for performance. It is a correctness issue
> > though. We absolutely can't put the last reference to the last open
> > filehandle without flushing all of the data first.
> >
> > My expectation here though is that this may help performance in some
> > cases since this patch also has it skip the flush on files open
> > read-only.
>
> ->flush is called on every close call, ->release on the last close for a
> given file pointer. Maybe you want a filemap_flush in ->flush and
> filemap_write_and_wait in ->release?
>
Hmm...there is one problem with this scheme. __fput ignores the error
return from ->release. Only the errors from ->flush will be returned to
userspace. So if we only filemap_fdatawait in the ->release op, then we
have the potential to miss returning writeback-related errors on a
close call.
On a side note, why does f_op->release return an int? Are there places
in the kernel besides __fput that call it? If not, maybe we should
consider changing it to a void function to make this more clear.
--
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20100925081644.777952fe-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
@ 2010-09-29 11:48 ` Jeff Layton
2010-10-04 17:05 ` Christoph Hellwig
1 sibling, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2010-09-29 11:48 UTC (permalink / raw)
To: Christoph Hellwig, Steve French; +Cc: linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Sat, 25 Sep 2010 08:16:44 -0400
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Sat, 25 Sep 2010 00:23:30 -0400
> Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> wrote:
>
> > On Fri, Sep 24, 2010 at 02:11:37PM -0400, Jeff Layton wrote:
> > > On Fri, 24 Sep 2010 12:58:55 -0500
> > > Steve French <smfrench-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> > >
> > > > We need to see the performance impact. As you say cifs_writepages is
> > > > synchronous so we should be ok without it. Any test results
> > > > before/after?
> > > >
> > >
> > > No, I haven't tested this for performance. It is a correctness issue
> > > though. We absolutely can't put the last reference to the last open
> > > filehandle without flushing all of the data first.
> > >
> > > My expectation here though is that this may help performance in some
> > > cases since this patch also has it skip the flush on files open
> > > read-only.
> >
> > ->flush is called on every close call, ->release on the last close for a
> > given file pointer. Maybe you want a filemap_flush in ->flush and
> > filemap_write_and_wait in ->release?
> >
>
> Hmm...there is one problem with this scheme. __fput ignores the error
> return from ->release. Only the errors from ->flush will be returned to
> userspace. So if we only filemap_fdatawait in the ->release op, then we
> have the potential to miss returning writeback-related errors on a
> close call.
>
> On a side note, why does f_op->release return an int? Are there places
> in the kernel besides __fput that call it? If not, maybe we should
> consider changing it to a void function to make this more clear.
>
Now that I've had a chance to look over this code, I'm seeing some
other problems with it. I think the patch that fixes this problem is
going to have to be made part of a patchset to overhaul how open files
are managed in cifs. That's work that's long overdue, but it'll probably
take me some time to sort it all out.
Stay tuned...
--
Jeff Layton <jlayton-eUNUBHrolfbYtjvyW6yDsg@public.gmane.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20100925081644.777952fe-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2010-09-29 11:48 ` Jeff Layton
@ 2010-10-04 17:05 ` Christoph Hellwig
[not found] ` <20101004170534.GA25799-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2010-10-04 17:05 UTC (permalink / raw)
To: Jeff Layton
Cc: Christoph Hellwig, Steve French,
linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Sat, Sep 25, 2010 at 08:16:44AM -0400, Jeff Layton wrote:
> Hmm...there is one problem with this scheme. __fput ignores the error
> return from ->release. Only the errors from ->flush will be returned to
> userspace. So if we only filemap_fdatawait in the ->release op, then we
> have the potential to miss returning writeback-related errors on a
> close call.
>
> On a side note, why does f_op->release return an int? Are there places
> in the kernel besides __fput that call it? If not, maybe we should
> consider changing it to a void function to make this more clear.
If I remember correctly actually returning errors from ->release
actually caused problems with various bits of userspace software. I'd
need to check the commit logs for details. No idea why these problems
don't appear for ->flush or why we've never bothered to change the
->release prototype.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20101004170534.GA25799-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2010-10-04 17:20 ` Jeff Layton
[not found] ` <20101004132057.5f76b550-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Jeff Layton @ 2010-10-04 17:20 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Steve French, linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Mon, 4 Oct 2010 13:05:34 -0400
Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> wrote:
> On Sat, Sep 25, 2010 at 08:16:44AM -0400, Jeff Layton wrote:
> > Hmm...there is one problem with this scheme. __fput ignores the error
> > return from ->release. Only the errors from ->flush will be returned to
> > userspace. So if we only filemap_fdatawait in the ->release op, then we
> > have the potential to miss returning writeback-related errors on a
> > close call.
> >
> > On a side note, why does f_op->release return an int? Are there places
> > in the kernel besides __fput that call it? If not, maybe we should
> > consider changing it to a void function to make this more clear.
>
> If I remember correctly actually returning errors from ->release
> actually caused problems with various bits of userspace software. I'd
> need to check the commit logs for details. No idea why these problems
> don't appear for ->flush or why we've never bothered to change the
> ->release prototype.
I asked Al about it, and he basically said that it had been an int
return for a long, long time. By the time we get to ->release it's far
too late to really be able to do anything about an error, so there's
really no benefit to keeping it an int return.
The only obstacle is code churn. I also suspect that a lot of ->release
ops return an error, thinking it'll do something too. What Al suggested
for a first step was to add a WARN() or something whenever a ->release
op returns non-zero. I'm not looking to tackle that myself at the
moment however. ;)
--
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding
[not found] ` <20101004132057.5f76b550-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
@ 2010-10-04 17:22 ` Jeff Layton
0 siblings, 0 replies; 17+ messages in thread
From: Jeff Layton @ 2010-10-04 17:22 UTC (permalink / raw)
To: Jeff Layton
Cc: Christoph Hellwig, Steve French,
linux-cifs-u79uwXL29TY76Z2rM5mHXA
On Mon, 4 Oct 2010 13:20:57 -0400
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Mon, 4 Oct 2010 13:05:34 -0400
> Christoph Hellwig <hch-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org> wrote:
>
> > On Sat, Sep 25, 2010 at 08:16:44AM -0400, Jeff Layton wrote:
> > > Hmm...there is one problem with this scheme. __fput ignores the error
> > > return from ->release. Only the errors from ->flush will be returned to
> > > userspace. So if we only filemap_fdatawait in the ->release op, then we
> > > have the potential to miss returning writeback-related errors on a
> > > close call.
> > >
> > > On a side note, why does f_op->release return an int? Are there places
> > > in the kernel besides __fput that call it? If not, maybe we should
> > > consider changing it to a void function to make this more clear.
> >
> > If I remember correctly actually returning errors from ->release
> > actually caused problems with various bits of userspace software. I'd
> > need to check the commit logs for details. No idea why these problems
> > don't appear for ->flush or why we've never bothered to change the
> > ->release prototype.
>
> I asked Al about it, and he basically said that it had been an int
> return for a long, long time. By the time we get to ->release it's far
> too late to really be able to do anything about an error, so there's
> really no benefit to keeping it an int return.
>
> The only obstacle is code churn. I also suspect that a lot of ->release
> ops return an error, thinking it'll do something too. What Al suggested
> for a first step was to add a WARN() or something whenever a ->release
> op returns non-zero. I'm not looking to tackle that myself at the
> moment however. ;)
>
For CIFS though, this means that we really do need to wait for
writeback to complete in the ->flush operation.
--
Jeff Layton <jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2010-10-04 17:22 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-24 12:59 [PATCH] cifs: cifs_flush should wait for writeback to complete before proceeding Jeff Layton
[not found] ` <1285333153-9113-1-git-send-email-jlayton-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2010-09-24 17:58 ` Steve French
[not found] ` <AANLkTimdoOPyAmhWw0kTtnZan6+gudJZn=vgqTNpEL+A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-09-24 18:11 ` Jeff Layton
[not found] ` <20100924141137.4cb03197-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2010-09-24 18:43 ` Steve French
[not found] ` <AANLkTinunozFdg9S+pBrNLpcgZKoAzVg2a4QXBOBi--u-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-09-24 19:08 ` Jeff Layton
[not found] ` <20100924150826.1819f930-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2010-09-24 19:24 ` Steve French
[not found] ` <AANLkTikX=cLcVK3FOFyiBeaYBQOSARHE1bnaKQcvXhX+-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-09-24 19:32 ` Jeff Layton
[not found] ` <20100924153206.48c202f4-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2010-09-24 19:36 ` Steve French
[not found] ` <AANLkTinECWNE9B56fYhp7iahp3QN8S_f0BHV9XS4SBhX-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-09-24 19:45 ` Jeff Layton
2010-09-25 4:23 ` Christoph Hellwig
[not found] ` <20100925042330.GA11930-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2010-09-25 5:31 ` Steve French
2010-09-25 11:33 ` Jeff Layton
2010-09-25 12:16 ` Jeff Layton
[not found] ` <20100925081644.777952fe-4QP7MXygkU+dMjc06nkz3ljfA9RmPOcC@public.gmane.org>
2010-09-29 11:48 ` Jeff Layton
2010-10-04 17:05 ` Christoph Hellwig
[not found] ` <20101004170534.GA25799-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2010-10-04 17:20 ` Jeff Layton
[not found] ` <20101004132057.5f76b550-9yPaYZwiELC+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
2010-10-04 17:22 ` Jeff Layton
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.