public inbox for linux-fsdevel@vger.kernel.org
* What triggers fsync of file on last close of the open inode?
@ 2006-10-03 19:53 Steve French
  2006-10-03 20:05 ` Dave Kleikamp
  0 siblings, 1 reply; 9+ messages in thread
From: Steve French @ 2006-10-03 19:53 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Shirish S Pargaonkar

What triggers flush/fsync of dirty pages on last (file) close of inode?  
I was hunting through the sys_close code and did not see a call
to fsync or filemap_write_and_wait there.  Is it something done in libc 
above the vfs?

Someone had reported a problem with a writepages call coming in with
no open files (so presumably the file was closed, with dirty pages not
written).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: What triggers fsync of file on last close of the open inode?
  2006-10-03 19:53 What triggers fsync of file on last close of the open inode? Steve French
@ 2006-10-03 20:05 ` Dave Kleikamp
  2006-10-03 20:45   ` Steve French
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Kleikamp @ 2006-10-03 20:05 UTC (permalink / raw)
  To: Steve French; +Cc: linux-fsdevel, Shirish S Pargaonkar

On Tue, 2006-10-03 at 14:53 -0500, Steve French wrote:
> What triggers flush/fsync of dirty pages on last (file) close of inode?  

Nothing.  A close doesn't imply fsync.  Dirty data is eventually written
by pdflush, if something else (memory pressure, maybe) doesn't do it
first.

> I was hunting through the sys_close code and did not see a call
> to fsync or filemap_write_and_wait there.  Is it something done in libc 
> above the vfs?

No.

> Someone had reported a problem with a writepages call coming in with
> no open files (so presumably the file was closed, with dirty pages not
> written).

This is normal behavior for most file systems.  I thought cifs protected
this by flushing dirty data in cifs_close.  I don't think any data
should be dirtied after cifs_close is called (on the last open file
handle).
-- 
David Kleikamp
IBM Linux Technology Center



* Re: What triggers fsync of file on last close of the open inode?
  2006-10-03 20:05 ` Dave Kleikamp
@ 2006-10-03 20:45   ` Steve French
  2006-10-03 21:15     ` Zach Brown
  2006-10-03 23:13     ` Jeremy Allison
  0 siblings, 2 replies; 9+ messages in thread
From: Steve French @ 2006-10-03 20:45 UTC (permalink / raw)
  To: Dave Kleikamp; +Cc: linux-fsdevel, Shirish S Pargaonkar

Dave Kleikamp wrote:
>> Someone had reported a problem with a writepages call coming in with
>> no open files (so presumably the file was closed, with dirty pages not
>> written).
>>     
>
> This is normal behavior for most file systems.  I thought cifs protected
> this by flushing dirty data in cifs_close.  I don't think any data
> should be dirtied after cifs_close is called (on the last open file
> handle).
>   
I found it ...

cifs exports flush, filp_close calls flush (before calling close)

cifs_flush calls filemap_fdatawrite

It may be a case in which filemap_fdatawrite returns before the write(s)
are sent to the vfs and a write races with close (although cifs will defer
a file close if a write is pending on that handle)?


* Re: What triggers fsync of file on last close of the open inode?
  2006-10-03 20:45   ` Steve French
@ 2006-10-03 21:15     ` Zach Brown
  2006-10-03 21:40       ` Dave Kleikamp
  2006-10-04 15:13       ` Steve French
  2006-10-03 23:13     ` Jeremy Allison
  1 sibling, 2 replies; 9+ messages in thread
From: Zach Brown @ 2006-10-03 21:15 UTC (permalink / raw)
  To: Steve French; +Cc: Dave Kleikamp, linux-fsdevel, Shirish S Pargaonkar

Steve French wrote:
> Dave Kleikamp wrote:
>>> Someone had reported a problem with a writepages call coming in
>>> with no open files (so presumably the file was closed, with dirty
>>> pages not written).

So is the problem that you're getting a cifs_writepages() call after
cifs_close() returns?

fwiw, 9 out of 10 brains would be less confused if cifs_close() was
called cifs_release().

> It may be a case in which filemap_fdatawrite returns before the write(s)
> are sent to the vfs and a write races with close (although cifs will defer
> a file close if a write is pending on that handle)?

Are writes to mmap()ed regions involved at all?  They lead to pages
being dirtied at unmapping and eventually hitting ->writepage,
potentially after ->flush and ->release have been called.

I imagine you could force writeback of dirty pages in ->release so that
you don't wait for writeback to come around and hit them.  Heck, it
might be doing this already.  I didn't look very hard :).

- z


* Re: What triggers fsync of file on last close of the open inode?
  2006-10-03 21:15     ` Zach Brown
@ 2006-10-03 21:40       ` Dave Kleikamp
  2006-10-03 23:43         ` Zach Brown
  2006-10-04 15:13       ` Steve French
  1 sibling, 1 reply; 9+ messages in thread
From: Dave Kleikamp @ 2006-10-03 21:40 UTC (permalink / raw)
  To: Zach Brown; +Cc: Steve French, linux-fsdevel, Shirish S Pargaonkar

On Tue, 2006-10-03 at 14:15 -0700, Zach Brown wrote:
> Steve French wrote:
> > Dave Kleikamp wrote:
> >>> Someone had reported a problem with a writepages call coming in
> >>> with no open files (so presumably the file was closed, with dirty
> >>> pages not written).
> 
> So is the problem that you're getting a cifs_writepages() call after
> cifs_close() returns?

If I understand Steve, the problem may be that cifs_writepages() is
called after cifs_flush().  cifs_flush() is called earlier than
cifs_close().

> fwiw, 9 out of 10 brains would be less confused if cifs_close() was
> called cifs_release().

That confused me a bit, but I still would have missed the cifs_flush
bit.

> > It may be a case in which filemap_fdatawrite returns before the write(s)
> > are sent to the vfs and a write races with close (although cifs will defer
> > a file close if a write is pending on that handle)?
> 
> Are writes to mmap()ed regions involved at all?  They lead to pages
> being dirtied at unmapping and eventually hitting ->writepage,
> potentially after ->flush and ->release have been called.

->flush does filemap_fdatawrite(), which should take care of any dirty
pages (I believe).

> I imagine you could force writeback of dirty pages in ->release so that
> you don't wait for writeback to come around and hit them.  Heck, it
> might be doing this already.  I didn't look very hard :).

->flush does.  I'm not sure about ->release.  I'm not sure if a write
could squeeze in between ->flush and ->release.
-- 
David Kleikamp
IBM Linux Technology Center



* Re: What triggers fsync of file on last close of the open inode?
  2006-10-03 20:45   ` Steve French
  2006-10-03 21:15     ` Zach Brown
@ 2006-10-03 23:13     ` Jeremy Allison
  2006-10-04 20:46       ` Trond Myklebust
  1 sibling, 1 reply; 9+ messages in thread
From: Jeremy Allison @ 2006-10-03 23:13 UTC (permalink / raw)
  To: Steve French; +Cc: Dave Kleikamp, linux-fsdevel, Shirish S Pargaonkar

On Tue, Oct 03, 2006 at 03:45:49PM -0500, Steve French wrote:
> Dave Kleikamp wrote:
> >>Someone had reported a problem with a writepages call coming in with
> >>no open files (so presumably the file was closed, with dirty pages not
> >>written).
> >>    
> >
> >This is normal behavior for most file systems.  I thought cifs protected
> >this by flushing dirty data in cifs_close.  I don't think any data
> >should be dirtied after cifs_close is called (on the last open file
> >handle).
> >  
> I found it ...
> 
> cifs exports flush, filp_close calls flush (before calling close)
> 
> cifs_flush calls filemap_fdatawrite
> 
> It may be a case in which filemap_fdatawrite returns before the write(s)
> are sent to the vfs and a write races with close (although cifs will defer
> a file close if a write is pending on that handle)?

Steve,

	Here's a comment I found in the NFSv4 code.... might be relevant.

From /usr/src/linux/fs/nfs/nfs4proc.c

/*
 * It is possible for data to be read/written from a mem-mapped file
 * after the sys_close call (which hits the vfs layer as a flush).
 * This means that we can't safely call nfsv4 close on a file until
 * the inode is cleared.




* Re: What triggers fsync of file on last close of the open inode?
  2006-10-03 21:40       ` Dave Kleikamp
@ 2006-10-03 23:43         ` Zach Brown
  0 siblings, 0 replies; 9+ messages in thread
From: Zach Brown @ 2006-10-03 23:43 UTC (permalink / raw)
  To: Dave Kleikamp; +Cc: Steve French, linux-fsdevel, Shirish S Pargaonkar


>> Are writes to mmap()ed regions involved at all?  They lead to pages
>> being dirtied at unmapping and eventually hitting ->writepage,
>> potentially after ->flush and ->release have been called.
> 
> ->flush does filemap_fdatawrite(), which should take care of any dirty
> pages (I believe).

Not if they come from writes to mmap()ed pages.  The page cache won't
see them as dirty until they're unmapped and the page table dirty bits
are transferred to the page cache page flags.

Unmapping to transfer dirty tracking from page tables to the page cache
is done in a few places.  Look for calls to unmap_mapping_range(), like
this one from the O_DIRECT write path:

  /*
   * If it's a write, unmap all mmappings of the file up-front.  This
   * will cause any pte dirty bits to be propagated into the pageframes
   * for the subsequent filemap_write_and_wait().
   */
  if (rw == WRITE) {
          write_len = iov_length(iov, nr_segs);
          if (mapping_mapped(mapping))
                  unmap_mapping_range(mapping, offset, write_len, 0);
  }

I'd hope that dirty tracking is transferred from the page tables to the
page cache *before* ->release is called.  (put_vma() makes me think this
is the case.)  So I think you could use unmap_mapping_range() and
filemap_fdatawrite() to at least initiate writeback on all dirty pages
in ->release.

That is, you should be able to stop ->writepages calls after ->release.
But it's a really bad idea to try to assert that ->writepages can't be
called after ->flush.

But I still don't know if mmap() is involved in his current problem.

- z


* Re: What triggers fsync of file on last close of the open inode?
  2006-10-03 21:15     ` Zach Brown
  2006-10-03 21:40       ` Dave Kleikamp
@ 2006-10-04 15:13       ` Steve French
  1 sibling, 0 replies; 9+ messages in thread
From: Steve French @ 2006-10-04 15:13 UTC (permalink / raw)
  To: Zach Brown; +Cc: Dave Kleikamp, linux-fsdevel, Shirish S Pargaonkar

Zach Brown wrote:
> Steve French wrote:
>   
>> Dave Kleikamp wrote:
>>     
>>>> Someone had reported a problem with a writepages call coming in
>>>> with no open files (so presumably the file was closed, with dirty
>>>> pages not written).
>>>>         
>
> So is the problem that you're getting a cifs_writepages() call after
> cifs_close() returns?
>
>
>   
Probably - I could not see any other path that could cause that cifs
error to be logged to dmesg.

>> It may be a case in which filemap_fdatawrite returns before the write(s)
>> are sent to the vfs and a write races with close (although cifs will defer
>> a file close if a write is pending on that handle)?
>>     
>
> Are writes to mmap()ed regions involved at all?  They lead to pages
> being dirtied at unmapping and eventually hitting ->writepage,
> potentially after ->flush and ->release have been called.
>
>   
I don't think so - but the person reporting it did not give much
information, and the error is not something that I have been seeing, so
I can't draw conclusions about that.
> I imagine you could force writeback of dirty pages in ->release so that
> you don't wait for writeback to come around and hit them.  Heck, it
> might be doing this already.  I didn't look very hard :).
>   
filemap_fdatawrite is called in cifs_flush (which I verified is called
just before close), and close will be delayed (not sent over the network)
if a write is pending on that file handle, but there may be two
possibilities:

1) the page gets dirtied while the file is being released (between
filp_close->cifs_flush and the final fput->cifs_close)
2) filemap_fdatawrite did not flush the pages fast enough, and needs to
be changed to the filemap_write_and_wait variant of the call in this
location.


* Re: What triggers fsync of file on last close of the open inode?
  2006-10-03 23:13     ` Jeremy Allison
@ 2006-10-04 20:46       ` Trond Myklebust
  0 siblings, 0 replies; 9+ messages in thread
From: Trond Myklebust @ 2006-10-04 20:46 UTC (permalink / raw)
  To: Jeremy Allison
  Cc: Steve French, Dave Kleikamp, linux-fsdevel, Shirish S Pargaonkar

On Tue, 2006-10-03 at 16:13 -0700, Jeremy Allison wrote:
> On Tue, Oct 03, 2006 at 03:45:49PM -0500, Steve French wrote:
> > Dave Kleikamp wrote:
> > >>Someone had reported a problem with a writepages call coming in with
> > >>no open files (so presumably the file was closed, with dirty pages not
> > >>written).
> > >>    
> > >
> > >This is normal behavior for most file systems.  I thought cifs protected
> > >this by flushing dirty data in cifs_close.  I don't think any data
> > >should be dirtied after cifs_close is called (on the last open file
> > >handle).
> > >  
> > I found it ...
> > 
> > cifs exports flush, filp_close calls flush (before calling close)
> > 
> > cifs_flush calls filemap_fdatawrite
> > 
> > It may be a case in which filemap_fdatawrite returns before the write(s)
> > are sent to the vfs and a write races with close (although cifs will defer
> > a file close if a write is pending on that handle)?
> 
> Steve,
> 
> 	Here's a comment I found in the NFSv4 code.... might be relevant.
> 
> From /usr/src/linux/fs/nfs/nfs4proc.c
> 
> /*
>  * It is possible for data to be read/written from a mem-mapped file
>  * after the sys_close call (which hits the vfs layer as a flush).
>  * This means that we can't safely call nfsv4 close on a file until
>  * the inode is cleared.

That comment needs updating.

In a nutshell the rules are:

          - The vm_area_struct (vma) that describes the mmapped area
        holds a reference to the struct file that was used in the call
        to do_mmap().

          - Once the vma that referenced your struct file has been
        destroyed (usually via a call to munmap() or via the call to
        mmput() when the task is destroyed), the reference to the struct
        file is released in the usual way, via a call to fput().

Note that once the last reference to the struct file disappears, the
filesystem is notified by a call to filp->f_op->release().

Once all the struct files that refer to any given inode have been
released, you should be able to assume that no one is reading or writing
its pages (other than perhaps the CIFS client itself?).

Cheers,
  Trond



end of thread, other threads:[~2006-10-04 20:46 UTC | newest]
