From: Niels de Vos <ndevos@redhat.com>
To: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
David Howells <dhowells@redhat.com>,
Jeffrey Layton <jlayton@redhat.com>
Subject: Re: [RFC] Dataloss on NFS-clients who modify an mmap()'d area after closing the file-descriptor
Date: Thu, 03 May 2012 19:07:52 +0200 [thread overview]
Message-ID: <4FA2BB68.7070304@redhat.com> (raw)
In-Reply-To: <1336059797.5385.22.camel@lade.trondhjem.org>
On 05/03/2012 05:43 PM, Myklebust, Trond wrote:
> On Thu, 2012-05-03 at 17:34 +0200, Niels de Vos wrote:
>> When an application on an NFS-client (tested with NFSv3) executes the
>> following steps, data written after the close() is never flushed to the
>> server:
>>
>> 1. open()
>> 2. mmap()
>> 3. close()
>> 4.<modify data in the mmap'ed area>
>> 5. munmap()
>>
>> Dropping the caches (via /proc/sys/vm/drop_caches) or unmounting does not
>> result in the data being sent to the server.
>>
>> The man-page for mmap (man 2 mmap) does mention that closing the file-
>> descriptor does not munmap() the area. Using the mmap'ed area after a
>> close() sound valid to me (even if it may be bad practice).
>>
>> Investigation and checking showed that the NFS-client does not handle
>> munmap(), and only flushes on close(). To solve this problem, least two
>> solutions can be proposed:
>>
>> a. f_ops->release() is called on munmap() as well as on close(),
>> therefore release() can be used to flush data as well.
>> b. In the 'struct vm_operations_struct' add a .close to the
>> 'struct vm_area_struct' on calling mmap()/nfs_file_mmap() and flush
>> the data in the new close() function.
>>
>> Solution a. contains currently very few code changes:
>>
>> --- a/fs/nfs/inode.c
>> +++ b/fs/nfs/inode.c
>> @@ -713,6 +713,8 @@ int nfs_open(struct inode *inode, struct file *filp)
>>
>> int nfs_release(struct inode *inode, struct file *filp)
>> {
>> + if (S_ISREG(inode->i_mode)&& inode->i_mapping->nrpages != 0) {
>> + nfs_sync_mapping(inode->i_mapping);
>> nfs_file_clear_open_context(filp);
>> return 0;
>> }
>>
>> The disadvantage is, that nfs_release() is called on close() too. That
>> means this causes a flushing of dirty pages, and just after that the
>> nfs_file_clear_open_context() might flush again. The advantage is that
>> it is possible (though not done at the moment) to return an error in
>> case flushing failed.
>>
>> Solution b. does not provide an option to return an error, but does not
>> get called on each close():
>>
>> --- a/fs/nfs/file.c
>> +++ b/fs/nfs/file.c
>> @@ -547,9 +547,17 @@ out:
>> return ret;
>> }
>>
>> +static void nfs_vm_close(struct vm_area_struct * vma)
>> +{
>> + struct file *filp = vma->vm_file;
>> +
>> + nfs_file_flush(filp, (fl_owner_t)filp);
>> +}
>> +
>> static const struct vm_operations_struct nfs_file_vm_ops = {
>> .fault = filemap_fault,
>> .page_mkwrite = nfs_vm_page_mkwrite,
>> + .close = nfs_vm_close,
>> };
>>
>> static int nfs_need_sync_write(struct file *filp, struct inode *inode)
>>
>> I would like some feedback on what solution is most acceptable, or any
>> other suggestions.
>
> Neither solution is acceptable. This isn't a close-to-open cache
> consistency issue.
>
> The syntax of mmap() for both block and NFS mounts is the same: writes
> are not guaranteed to hit the disk until your application explicitly
> calls msync().
>
Okay, that makes sense. But if the application never calls msync(), and
just munmap()'s the area, when should the changes be written? I did not
expect that unmounting just disregards the data.
Any suggestions on correcting this (if it should be corrected) are
welcome.
Thanks again,
Niels
next prev parent reply other threads:[~2012-05-03 17:07 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-03 15:34 [RFC] Dataloss on NFS-clients who modify an mmap()'d area after closing the file-descriptor Niels de Vos
2012-05-03 15:34 ` Niels de Vos
2012-05-03 15:43 ` Myklebust, Trond
2012-05-03 15:43 ` Myklebust, Trond
2012-05-03 17:07 ` Niels de Vos [this message]
2012-05-03 17:26 ` Myklebust, Trond
2012-05-03 17:26 ` Myklebust, Trond
2012-05-04 16:03 ` Niels de Vos
2012-05-04 16:03 ` Niels de Vos
2012-05-04 18:29 ` Myklebust, Trond
2012-05-04 18:29 ` Myklebust, Trond
2012-05-07 8:49 ` Niels de Vos
2012-05-07 8:49 ` Niels de Vos
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4FA2BB68.7070304@redhat.com \
--to=ndevos@redhat.com \
--cc=Trond.Myklebust@netapp.com \
--cc=dhowells@redhat.com \
--cc=jlayton@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.