public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] memory mapped files not updating timestamps
@ 2006-05-17 15:16 Peter Staubach
  2006-05-17 19:24 ` Hugh Dickins
  2006-05-31 17:53 ` Peter Staubach
  0 siblings, 2 replies; 8+ messages in thread
From: Peter Staubach @ 2006-05-17 15:16 UTC (permalink / raw)
  To: Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 1872 bytes --]

Hi.

Attached are some changes to address the problem that modifications to
the contents of a file, made via an mmap'd region, do not cause the
modification time on the file to be updated.  This lack can cause corruption
by allowing backup software to not detect files which should be backed up.
This also represents a potential security hole because it allows a file 
to be
modified with no corresponding change in the file modification or change
time fields.

The changes add support to detect when the modification time needs to be
updated by placing a hook in __set_pages_dirty_buffers and
__set_pages_dirty_nobuffers.  One of these two routines will be invoked
when the dirty bit is detected in the pte.  The hook sets a new bit in the
address_space mapping struct indicating that the file which is associated
with that part of the address space needs to have its modification and
change time attributes updated.

The new bit described above is used in various system calls to cause the
modification and change times to be updated.  These are msync, munmap, 
fsync,
and exit system calls.  Additionally, these two timestamps will be updated
if a sync or other inode flushing operation occurs as part of normal system
operations.

These changes were tested in two ways.  One was to simply create a file,
write(2) to it, close it, and then test to ensure that the file modification
time does not change after the file is closed.  Another was a program which
creates a file, mmap's it, modifies the mapped pages, and then either 
msync's
the region, fsync's the file, sync's the system, abruptly exits, or simply
munmap's the files.  This program shows the file mtime and ctime fields at
various times and these times were used to ensure that they did change and
did change in expected ways.

    Thanx...

       ps

Signed-off-by: Peter Staubach <staubach@redhat.com>

[-- Attachment #2: mctime.devel --]
[-- Type: text/plain, Size: 5139 bytes --]

--- linux-2.6.16.i686/fs/inode.c.org
+++ linux-2.6.16.i686/fs/inode.c
@@ -1211,8 +1211,8 @@ void touch_atime(struct vfsmount *mnt, s
 EXPORT_SYMBOL(touch_atime);
 
 /**
- *	file_update_time	-	update mtime and ctime time
- *	@file: file accessed
+ *	inode_update_time	-	update mtime and ctime time
+ *	@inode: file accessed
  *
  *	Update the mtime and ctime members of an inode and mark the inode
  *	for writeback.  Note that this function is meant exclusively for
@@ -1222,9 +1222,8 @@ EXPORT_SYMBOL(touch_atime);
  *	timestamps are handled by the server.
  */
 
-void file_update_time(struct file *file)
+void inode_update_time(struct inode *inode)
 {
-	struct inode *inode = file->f_dentry->d_inode;
 	struct timespec now;
 	int sync_it = 0;
 
@@ -1246,7 +1245,7 @@ void file_update_time(struct file *file)
 		mark_inode_dirty_sync(inode);
 }
 
-EXPORT_SYMBOL(file_update_time);
+EXPORT_SYMBOL(inode_update_time);
 
 int inode_needs_sync(struct inode *inode)
 {
--- linux-2.6.16.i686/fs/fs-writeback.c.org
+++ linux-2.6.16.i686/fs/fs-writeback.c
@@ -168,6 +168,9 @@ __sync_single_inode(struct inode *inode,
 
 	spin_unlock(&inode_lock);
 
+	if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+		inode_update_time(inode);
+
 	ret = do_writepages(mapping, wbc);
 
 	/* Don't write the inode if only I_DIRTY_PAGES was set */
--- linux-2.6.16.i686/fs/buffer.c.org
+++ linux-2.6.16.i686/fs/buffer.c
@@ -347,6 +347,10 @@ long do_fsync(struct file *file, int dat
 	if (!ret)
 		ret = err;
 	current->flags &= ~PF_SYNCWRITE;
+
+	if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+		inode_update_time(mapping->host);
+
 out:
 	return ret;
 }
@@ -837,6 +841,7 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode);
 int __set_page_dirty_buffers(struct page *page)
 {
 	struct address_space * const mapping = page->mapping;
+	int ret = 0;
 
 	spin_lock(&mapping->private_lock);
 	if (page_has_buffers(page)) {
@@ -861,9 +866,13 @@ int __set_page_dirty_buffers(struct page
 		}
 		write_unlock_irq(&mapping->tree_lock);
 		__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
-		return 1;
+		ret = 1;
 	}
-	return 0;
+
+	if (page_mapped(page))
+		set_bit(AS_MCTIME, &mapping->flags);
+
+	return ret;
 }
 EXPORT_SYMBOL(__set_page_dirty_buffers);
 
--- linux-2.6.16.i686/include/linux/fs.h.org
+++ linux-2.6.16.i686/include/linux/fs.h
@@ -1777,7 +1777,12 @@ extern int buffer_migrate_page(struct pa
 extern int inode_change_ok(struct inode *, struct iattr *);
 extern int __must_check inode_setattr(struct inode *, struct iattr *);
 
-extern void file_update_time(struct file *file);
+extern void inode_update_time(struct inode *);
+
+static inline void file_update_time(struct file *file)
+{
+	inode_update_time(file->f_dentry->d_inode);
+}
 
 static inline ino_t parent_ino(struct dentry *dentry)
 {
--- linux-2.6.16.i686/include/linux/pagemap.h.org
+++ linux-2.6.16.i686/include/linux/pagemap.h
@@ -16,8 +16,9 @@
  * Bits in mapping->flags.  The lower __GFP_BITS_SHIFT bits are the page
  * allocation mode flags.
  */
-#define	AS_EIO		(__GFP_BITS_SHIFT + 0)	/* IO error on async write */
+#define AS_EIO		(__GFP_BITS_SHIFT + 0)	/* IO error on async write */
 #define AS_ENOSPC	(__GFP_BITS_SHIFT + 1)	/* ENOSPC on async write */
+#define AS_MCTIME	(__GFP_BITS_SHIFT + 2)	/* need m/ctime change */
 
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
 {
--- linux-2.6.16.i686/mm/page-writeback.c.org
+++ linux-2.6.16.i686/mm/page-writeback.c
@@ -627,8 +627,10 @@ EXPORT_SYMBOL(write_one_page);
  */
 int __set_page_dirty_nobuffers(struct page *page)
 {
+	struct address_space *mapping = page_mapping(page);
+	int ret = 0;
+
 	if (!TestSetPageDirty(page)) {
-		struct address_space *mapping = page_mapping(page);
 		struct address_space *mapping2;
 
 		if (mapping) {
@@ -648,9 +650,11 @@ int __set_page_dirty_nobuffers(struct pa
 							I_DIRTY_PAGES);
 			}
 		}
-		return 1;
+		ret = 1;
 	}
-	return 0;
+	if (page_mapped(page))
+		set_bit(AS_MCTIME, &mapping->flags);
+	return ret;
 }
 EXPORT_SYMBOL(__set_page_dirty_nobuffers);
 
--- linux-2.6.16.i686/mm/msync.c.org
+++ linux-2.6.16.i686/mm/msync.c
@@ -206,12 +206,16 @@ asmlinkage long sys_msync(unsigned long 
 		file = vma->vm_file;
 		start = vma->vm_end;
 		if ((flags & MS_ASYNC) && file && nr_pages_dirtied) {
+			struct address_space *mapping = file->f_mapping;
+
 			get_file(file);
 			up_read(&current->mm->mmap_sem);
-			balance_dirty_pages_ratelimited_nr(file->f_mapping,
+			balance_dirty_pages_ratelimited_nr(mapping,
 							nr_pages_dirtied);
 			fput(file);
 			down_read(&current->mm->mmap_sem);
+			if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+				inode_update_time(mapping->host);
 			vma = find_vma(current->mm, start);
 		} else if ((flags & MS_SYNC) && file &&
 				(vma->vm_flags & VM_SHARED)) {
--- linux-2.6.16.i686/mm/mmap.c.org
+++ linux-2.6.16.i686/mm/mmap.c
@@ -203,6 +203,8 @@ void unlink_file_vma(struct vm_area_stru
 		spin_lock(&mapping->i_mmap_lock);
 		__remove_shared_vm_struct(vma, file, mapping);
 		spin_unlock(&mapping->i_mmap_lock);
+		if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+			inode_update_time(mapping->host);
 	}
 }
 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memory mapped files not updating timestamps
  2006-05-17 15:16 [PATCH] memory mapped files not updating timestamps Peter Staubach
@ 2006-05-17 19:24 ` Hugh Dickins
  2006-05-18 10:52   ` Peter Zijlstra
  2006-05-31 17:53 ` Peter Staubach
  1 sibling, 1 reply; 8+ messages in thread
From: Hugh Dickins @ 2006-05-17 19:24 UTC (permalink / raw)
  To: Peter Staubach; +Cc: Linux Kernel Mailing List

On Wed, 17 May 2006, Peter Staubach wrote:
> 
> Attached are some changes to address the problem that modifications to
> the contents of a file, made via an mmap'd region, do not cause the
> modification time on the file to be updated.  This lack can cause corruption
> by allowing backup software to not detect files which should be backed up.
> This also represents a potential security hole because it allows a file to be
> modified with no corresponding change in the file modification or change
> time fields.

It would be grand to fix this (it's, umm, three years since I promised to
do so a few days later), but I'm not quite satisfied by your patch as is.

> The changes add support to detect when the modification time needs to be
> updated by placing a hook in __set_pages_dirty_buffers and
> __set_pages_dirty_nobuffers.  One of these two routines will be invoked
> when the dirty bit is detected in the pte.  The hook sets a new bit in the
> address_space mapping struct indicating that the file which is associated
> with that part of the address space needs to have its modification and
> change time attributes updated.

You're adding a little overhead to every set_page_dirty, when the vast
majority (ordinary writes) don't need it: their mctime update is already
well taken care of.  (Or should we be deleting the code that does that?
I think I'd rather not dare.)

Perhaps it's so little overhead that it's not worth worrying about.
But writes to a file which happens to be mapped readonly at that time
are liable to end up with too late a last mctime on the file, aren't
they (if the readonly mapping is unmapped later)?

I think you'd do better to target those places where set_page_dirty is
called on a mapped page - and do the file_update_time at that point -
or as near to that point as is sensible/permitted given the locking
(vma->vm_file gives you the file without needing inode_update_time).

Calling your inode_update_time from unmap_file_vma is probably unwise:
at present we may have preemption disabled there, and it's not clear
what ->dirty_inode might get up to.  Calling it from the vm_file case
of remove_vma would be safer.

Peter Zijlstra has patches relating to dirty mmaps in the -mm tree
at present: I need to take a look at those, and I'll see if it would
make sense to factor in this mctime issue on top of those - you may
want to do the same.

In the course of trying that, I'm likely to discover exactly why you
made the decisions you did, and arrive at commending your solution.

Hugh

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memory mapped files not updating timestamps
  2006-05-17 19:24 ` Hugh Dickins
@ 2006-05-18 10:52   ` Peter Zijlstra
  2006-05-18 22:13     ` Peter Staubach
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2006-05-18 10:52 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Peter Staubach, Linux Kernel Mailing List

On Wed, 2006-05-17 at 20:24 +0100, Hugh Dickins wrote:
> On Wed, 17 May 2006, Peter Staubach wrote:

> > The changes add support to detect when the modification time needs to be
> > updated by placing a hook in __set_pages_dirty_buffers and
> > __set_pages_dirty_nobuffers.  One of these two routines will be invoked
> > when the dirty bit is detected in the pte.  The hook sets a new bit in the
> > address_space mapping struct indicating that the file which is associated
> > with that part of the address space needs to have its modification and
> > change time attributes updated.
> 
> You're adding a little overhead to every set_page_dirty, when the vast
> majority (ordinary writes) don't need it: their mctime update is already
> well taken care of.  (Or should we be deleting the code that does that?
> I think I'd rather not dare.)

It would make the code more symetric.

> I think you'd do better to target those places where set_page_dirty is
> called on a mapped page - and do the file_update_time at that point -
> or as near to that point as is sensible/permitted given the locking
> (vma->vm_file gives you the file without needing inode_update_time).

> Peter Zijlstra has patches relating to dirty mmaps in the -mm tree
> at present: I need to take a look at those, and I'll see if it would
> make sense to factor in this mctime issue on top of those - you may
> want to do the same.

Look for the callsites of set_page_dirty_balance(), those two points are
where writable file pages are dirtied.

PeterZ


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memory mapped files not updating timestamps
  2006-05-18 10:52   ` Peter Zijlstra
@ 2006-05-18 22:13     ` Peter Staubach
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Staubach @ 2006-05-18 22:13 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Hugh Dickins, Linux Kernel Mailing List

Peter Zijlstra wrote:

>On Wed, 2006-05-17 at 20:24 +0100, Hugh Dickins wrote:
>  
>
>>On Wed, 17 May 2006, Peter Staubach wrote:
>>    
>>
>
>  
>
>>>The changes add support to detect when the modification time needs to be
>>>updated by placing a hook in __set_pages_dirty_buffers and
>>>__set_pages_dirty_nobuffers.  One of these two routines will be invoked
>>>when the dirty bit is detected in the pte.  The hook sets a new bit in the
>>>address_space mapping struct indicating that the file which is associated
>>>with that part of the address space needs to have its modification and
>>>change time attributes updated.
>>>      
>>>
>>You're adding a little overhead to every set_page_dirty, when the vast
>>majority (ordinary writes) don't need it: their mctime update is already
>>well taken care of.  (Or should we be deleting the code that does that?
>>I think I'd rather not dare.)
>>    
>>
>
>It would make the code more symetric.
>
>  
>

It might, but I think that it might upset the guarantees that the system
already makes about timely mtime/ctime updates when a file is written to.
Those requirements are much tighter than the requirements for when those
time fields get updated for a modified mmap'd file.  With the exception
of msync, really the only requirement for updating mtime and ctime for
an mmap'd file is that these time fields change, at some time.

    Thanx...

       ps

>>I think you'd do better to target those places where set_page_dirty is
>>called on a mapped page - and do the file_update_time at that point -
>>or as near to that point as is sensible/permitted given the locking
>>(vma->vm_file gives you the file without needing inode_update_time).
>>    
>>
>
>  
>
>>Peter Zijlstra has patches relating to dirty mmaps in the -mm tree
>>at present: I need to take a look at those, and I'll see if it would
>>make sense to factor in this mctime issue on top of those - you may
>>want to do the same.
>>    
>>
>
>Look for the callsites of set_page_dirty_balance(), those two points are
>where writable file pages are dirtied.
>
>PeterZ
>
>  
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memory mapped files not updating timestamps
  2006-05-17 15:16 [PATCH] memory mapped files not updating timestamps Peter Staubach
  2006-05-17 19:24 ` Hugh Dickins
@ 2006-05-31 17:53 ` Peter Staubach
  2006-05-31 19:38   ` Eric Dumazet
  2006-06-19  4:33   ` Andrew Morton
  1 sibling, 2 replies; 8+ messages in thread
From: Peter Staubach @ 2006-05-31 17:53 UTC (permalink / raw)
  To: Peter Staubach; +Cc: Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 2724 bytes --]

Hi.

I am resending the patch and am including some justification for the
work.

Peter Staubach wrote:

> Hi.
>
> Attached are some changes to address the problem that modifications to
> the contents of a file, made via an mmap'd region, do not cause the
> modification time on the file to be updated.  This lack can cause 
> corruption
> by allowing backup software to not detect files which should be backed 
> up.
> This also represents a potential security hole because it allows a 
> file to be
> modified with no corresponding change in the file modification or change
> time fields.
>
> The changes add support to detect when the modification time needs to be
> updated by placing a hook in __set_pages_dirty_buffers and
> __set_pages_dirty_nobuffers.  One of these two routines will be invoked
> when the dirty bit is detected in the pte.  The hook sets a new bit in 
> the
> address_space mapping struct indicating that the file which is associated
> with that part of the address space needs to have its modification and
> change time attributes updated.
>
> The new bit described above is used in various system calls to cause the
> modification and change times to be updated.  These are msync, munmap, 
> fsync,
> and exit system calls.  Additionally, these two timestamps will be 
> updated
> if a sync or other inode flushing operation occurs as part of normal 
> system
> operations.
>
> These changes were tested in two ways.  One was to simply create a file,
> write(2) to it, close it, and then test to ensure that the file 
> modification
> time does not change after the file is closed.  Another was a program 
> which
> creates a file, mmap's it, modifies the mapped pages, and then either 
> msync's
> the region, fsync's the file, sync's the system, abruptly exits, or 
> simply
> munmap's the files.  This program shows the file mtime and ctime 
> fields at
> various times and these times were used to ensure that they did change 
> and
> did change in expected ways. 


I embarked on this work due to a bug reported by one of Red Hat's large
customers.  They are finding that files, which should have been backed
up, were not getting backed up.  This is due to the mtime on the files
not changing and their backup software looking for mtime changes.  This
is corruption and I need to get it fixed, sooner as opposed to later.

While I would like to get this fixed on top of Peter Zijlstra's changes,
the process for those is looking long and complicated.  I am wondering
if we could consider these changes and then add the requirement of
maintaining these semantics to those that Peter's work is attempting to
address.

    Thanx...

       ps

Signed-off-by: Peter Staubach <staubach@redhat.com>

[-- Attachment #2: mctime.devel --]
[-- Type: text/plain, Size: 5139 bytes --]

--- linux-2.6.16.i686/fs/inode.c.org
+++ linux-2.6.16.i686/fs/inode.c
@@ -1211,8 +1211,8 @@ void touch_atime(struct vfsmount *mnt, s
 EXPORT_SYMBOL(touch_atime);
 
 /**
- *	file_update_time	-	update mtime and ctime time
- *	@file: file accessed
+ *	inode_update_time	-	update mtime and ctime time
+ *	@inode: file accessed
  *
  *	Update the mtime and ctime members of an inode and mark the inode
  *	for writeback.  Note that this function is meant exclusively for
@@ -1222,9 +1222,8 @@ EXPORT_SYMBOL(touch_atime);
  *	timestamps are handled by the server.
  */
 
-void file_update_time(struct file *file)
+void inode_update_time(struct inode *inode)
 {
-	struct inode *inode = file->f_dentry->d_inode;
 	struct timespec now;
 	int sync_it = 0;
 
@@ -1246,7 +1245,7 @@ void file_update_time(struct file *file)
 		mark_inode_dirty_sync(inode);
 }
 
-EXPORT_SYMBOL(file_update_time);
+EXPORT_SYMBOL(inode_update_time);
 
 int inode_needs_sync(struct inode *inode)
 {
--- linux-2.6.16.i686/fs/fs-writeback.c.org
+++ linux-2.6.16.i686/fs/fs-writeback.c
@@ -168,6 +168,9 @@ __sync_single_inode(struct inode *inode,
 
 	spin_unlock(&inode_lock);
 
+	if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+		inode_update_time(inode);
+
 	ret = do_writepages(mapping, wbc);
 
 	/* Don't write the inode if only I_DIRTY_PAGES was set */
--- linux-2.6.16.i686/fs/buffer.c.org
+++ linux-2.6.16.i686/fs/buffer.c
@@ -347,6 +347,10 @@ long do_fsync(struct file *file, int dat
 	if (!ret)
 		ret = err;
 	current->flags &= ~PF_SYNCWRITE;
+
+	if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+		inode_update_time(mapping->host);
+
 out:
 	return ret;
 }
@@ -837,6 +841,7 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode);
 int __set_page_dirty_buffers(struct page *page)
 {
 	struct address_space * const mapping = page->mapping;
+	int ret = 0;
 
 	spin_lock(&mapping->private_lock);
 	if (page_has_buffers(page)) {
@@ -861,9 +866,13 @@ int __set_page_dirty_buffers(struct page
 		}
 		write_unlock_irq(&mapping->tree_lock);
 		__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
-		return 1;
+		ret = 1;
 	}
-	return 0;
+
+	if (page_mapped(page))
+		set_bit(AS_MCTIME, &mapping->flags);
+
+	return ret;
 }
 EXPORT_SYMBOL(__set_page_dirty_buffers);
 
--- linux-2.6.16.i686/include/linux/fs.h.org
+++ linux-2.6.16.i686/include/linux/fs.h
@@ -1777,7 +1777,12 @@ extern int buffer_migrate_page(struct pa
 extern int inode_change_ok(struct inode *, struct iattr *);
 extern int __must_check inode_setattr(struct inode *, struct iattr *);
 
-extern void file_update_time(struct file *file);
+extern void inode_update_time(struct inode *);
+
+static inline void file_update_time(struct file *file)
+{
+	inode_update_time(file->f_dentry->d_inode);
+}
 
 static inline ino_t parent_ino(struct dentry *dentry)
 {
--- linux-2.6.16.i686/include/linux/pagemap.h.org
+++ linux-2.6.16.i686/include/linux/pagemap.h
@@ -16,8 +16,9 @@
  * Bits in mapping->flags.  The lower __GFP_BITS_SHIFT bits are the page
  * allocation mode flags.
  */
-#define	AS_EIO		(__GFP_BITS_SHIFT + 0)	/* IO error on async write */
+#define AS_EIO		(__GFP_BITS_SHIFT + 0)	/* IO error on async write */
 #define AS_ENOSPC	(__GFP_BITS_SHIFT + 1)	/* ENOSPC on async write */
+#define AS_MCTIME	(__GFP_BITS_SHIFT + 2)	/* need m/ctime change */
 
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
 {
--- linux-2.6.16.i686/mm/page-writeback.c.org
+++ linux-2.6.16.i686/mm/page-writeback.c
@@ -627,8 +627,10 @@ EXPORT_SYMBOL(write_one_page);
  */
 int __set_page_dirty_nobuffers(struct page *page)
 {
+	struct address_space *mapping = page_mapping(page);
+	int ret = 0;
+
 	if (!TestSetPageDirty(page)) {
-		struct address_space *mapping = page_mapping(page);
 		struct address_space *mapping2;
 
 		if (mapping) {
@@ -648,9 +650,11 @@ int __set_page_dirty_nobuffers(struct pa
 							I_DIRTY_PAGES);
 			}
 		}
-		return 1;
+		ret = 1;
 	}
-	return 0;
+	if (page_mapped(page))
+		set_bit(AS_MCTIME, &mapping->flags);
+	return ret;
 }
 EXPORT_SYMBOL(__set_page_dirty_nobuffers);
 
--- linux-2.6.16.i686/mm/msync.c.org
+++ linux-2.6.16.i686/mm/msync.c
@@ -206,12 +206,16 @@ asmlinkage long sys_msync(unsigned long 
 		file = vma->vm_file;
 		start = vma->vm_end;
 		if ((flags & MS_ASYNC) && file && nr_pages_dirtied) {
+			struct address_space *mapping = file->f_mapping;
+
 			get_file(file);
 			up_read(&current->mm->mmap_sem);
-			balance_dirty_pages_ratelimited_nr(file->f_mapping,
+			balance_dirty_pages_ratelimited_nr(mapping,
 							nr_pages_dirtied);
 			fput(file);
 			down_read(&current->mm->mmap_sem);
+			if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+				inode_update_time(mapping->host);
 			vma = find_vma(current->mm, start);
 		} else if ((flags & MS_SYNC) && file &&
 				(vma->vm_flags & VM_SHARED)) {
--- linux-2.6.16.i686/mm/mmap.c.org
+++ linux-2.6.16.i686/mm/mmap.c
@@ -203,6 +203,8 @@ void unlink_file_vma(struct vm_area_stru
 		spin_lock(&mapping->i_mmap_lock);
 		__remove_shared_vm_struct(vma, file, mapping);
 		spin_unlock(&mapping->i_mmap_lock);
+		if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+			inode_update_time(mapping->host);
 	}
 }
 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memory mapped files not updating timestamps
  2006-05-31 17:53 ` Peter Staubach
@ 2006-05-31 19:38   ` Eric Dumazet
  2006-05-31 20:38     ` Peter Staubach
  2006-06-19  4:33   ` Andrew Morton
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2006-05-31 19:38 UTC (permalink / raw)
  To: Peter Staubach; +Cc: Linux Kernel Mailing List

Peter Staubach a écrit :
> --- linux-2.6.16.i686/mm/msync.c.org
> +++ linux-2.6.16.i686/mm/msync.c
> @@ -206,12 +206,16 @@ asmlinkage long sys_msync(unsigned long 
>  		file = vma->vm_file;
>  		start = vma->vm_end;
>  		if ((flags & MS_ASYNC) && file && nr_pages_dirtied) {
> +			struct address_space *mapping = file->f_mapping;
> +
>  			get_file(file);
>  			up_read(&current->mm->mmap_sem);
> -			balance_dirty_pages_ratelimited_nr(file->f_mapping,
> +			balance_dirty_pages_ratelimited_nr(mapping,
>  							nr_pages_dirtied);
>  			fput(file);

<here>, another thread can perform an munmap(), and the file can be totally 
dismantled.

>  			down_read(&current->mm->mmap_sem);

So referencing 'mapping' is *buggy* here.
I believe that you have to move 'fput(file);' *after* the folloging two lines.

> +			if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
> +				inode_update_time(mapping->host);
>  			vma = find_vma(current->mm, start);
>  		} else if ((flags & MS_SYNC) && file &&
>  				(vma->vm_flags & VM_SHARED)) {


Eric

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memory mapped files not updating timestamps
  2006-05-31 19:38   ` Eric Dumazet
@ 2006-05-31 20:38     ` Peter Staubach
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Staubach @ 2006-05-31 20:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 1649 bytes --]

Eric Dumazet wrote:

> Peter Staubach a écrit :
>
>> --- linux-2.6.16.i686/mm/msync.c.org
>> +++ linux-2.6.16.i686/mm/msync.c
>> @@ -206,12 +206,16 @@ asmlinkage long sys_msync(unsigned long 
>>          file = vma->vm_file;
>>          start = vma->vm_end;
>>          if ((flags & MS_ASYNC) && file && nr_pages_dirtied) {
>> +            struct address_space *mapping = file->f_mapping;
>> +
>>              get_file(file);
>>              up_read(&current->mm->mmap_sem);
>> -            balance_dirty_pages_ratelimited_nr(file->f_mapping,
>> +            balance_dirty_pages_ratelimited_nr(mapping,
>>                              nr_pages_dirtied);
>>              fput(file);
>
>
> <here>, another thread can perform an munmap(), and the file can be 
> totally dismantled.
>
>>              down_read(&current->mm->mmap_sem);
>
>
> So referencing 'mapping' is *buggy* here.
> I believe that you have to move 'fput(file);' *after* the folloging 
> two lines.
>
>> +            if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
>> +                inode_update_time(mapping->host);
>>              vma = find_vma(current->mm, start);
>>          } else if ((flags & MS_SYNC) && file &&
>>                  (vma->vm_flags & VM_SHARED)) {
>
>
>
> Eric


Yes, sorry, you mentioned that before and I meant to address that.  I almost
reordered those operations originally because I think that it is cleaner
when acquires and releases are done in the opposite order.  However, I was
also trying to change as little as possible too.

Anyway, attached is an updated patch.

    Thanx...

       ps

Signed-off-by: Peter Staubach <staubach@redhat.com>

[-- Attachment #2: mctime.devel --]
[-- Type: text/plain, Size: 5187 bytes --]

--- linux-2.6.16.x86_64/fs/inode.c.org
+++ linux-2.6.16.x86_64/fs/inode.c
@@ -1211,8 +1211,8 @@ void touch_atime(struct vfsmount *mnt, s
 EXPORT_SYMBOL(touch_atime);
 
 /**
- *	file_update_time	-	update mtime and ctime time
- *	@file: file accessed
+ *	inode_update_time	-	update mtime and ctime time
+ *	@inode: file accessed
  *
  *	Update the mtime and ctime members of an inode and mark the inode
  *	for writeback.  Note that this function is meant exclusively for
@@ -1222,9 +1222,8 @@ EXPORT_SYMBOL(touch_atime);
  *	timestamps are handled by the server.
  */
 
-void file_update_time(struct file *file)
+void inode_update_time(struct inode *inode)
 {
-	struct inode *inode = file->f_dentry->d_inode;
 	struct timespec now;
 	int sync_it = 0;
 
@@ -1246,7 +1245,7 @@ void file_update_time(struct file *file)
 		mark_inode_dirty_sync(inode);
 }
 
-EXPORT_SYMBOL(file_update_time);
+EXPORT_SYMBOL(inode_update_time);
 
 int inode_needs_sync(struct inode *inode)
 {
--- linux-2.6.16.x86_64/fs/fs-writeback.c.org
+++ linux-2.6.16.x86_64/fs/fs-writeback.c
@@ -168,6 +168,9 @@ __sync_single_inode(struct inode *inode,
 
 	spin_unlock(&inode_lock);
 
+	if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+		inode_update_time(inode);
+
 	ret = do_writepages(mapping, wbc);
 
 	/* Don't write the inode if only I_DIRTY_PAGES was set */
--- linux-2.6.16.x86_64/fs/buffer.c.org
+++ linux-2.6.16.x86_64/fs/buffer.c
@@ -347,6 +347,10 @@ long do_fsync(struct file *file, int dat
 	if (!ret)
 		ret = err;
 	current->flags &= ~PF_SYNCWRITE;
+
+	if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+		inode_update_time(mapping->host);
+
 out:
 	return ret;
 }
@@ -837,6 +841,7 @@ EXPORT_SYMBOL(mark_buffer_dirty_inode);
 int __set_page_dirty_buffers(struct page *page)
 {
 	struct address_space * const mapping = page->mapping;
+	int ret = 0;
 
 	spin_lock(&mapping->private_lock);
 	if (page_has_buffers(page)) {
@@ -861,9 +866,13 @@ int __set_page_dirty_buffers(struct page
 		}
 		write_unlock_irq(&mapping->tree_lock);
 		__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
-		return 1;
+		ret = 1;
 	}
-	return 0;
+
+	if (page_mapped(page))
+		set_bit(AS_MCTIME, &mapping->flags);
+
+	return ret;
 }
 EXPORT_SYMBOL(__set_page_dirty_buffers);
 
--- linux-2.6.16.x86_64/include/linux/fs.h.org
+++ linux-2.6.16.x86_64/include/linux/fs.h
@@ -1778,7 +1778,12 @@ extern int buffer_migrate_page(struct pa
 extern int inode_change_ok(struct inode *, struct iattr *);
 extern int __must_check inode_setattr(struct inode *, struct iattr *);
 
-extern void file_update_time(struct file *file);
+extern void inode_update_time(struct inode *);
+
+static inline void file_update_time(struct file *file)
+{
+	inode_update_time(file->f_dentry->d_inode);
+}
 
 static inline ino_t parent_ino(struct dentry *dentry)
 {
--- linux-2.6.16.x86_64/include/linux/pagemap.h.org
+++ linux-2.6.16.x86_64/include/linux/pagemap.h
@@ -16,8 +16,9 @@
  * Bits in mapping->flags.  The lower __GFP_BITS_SHIFT bits are the page
  * allocation mode flags.
  */
-#define	AS_EIO		(__GFP_BITS_SHIFT + 0)	/* IO error on async write */
+#define AS_EIO		(__GFP_BITS_SHIFT + 0)	/* IO error on async write */
 #define AS_ENOSPC	(__GFP_BITS_SHIFT + 1)	/* ENOSPC on async write */
+#define AS_MCTIME	(__GFP_BITS_SHIFT + 2)	/* need m/ctime change */
 
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
 {
--- linux-2.6.16.x86_64/mm/page-writeback.c.org
+++ linux-2.6.16.x86_64/mm/page-writeback.c
@@ -627,8 +627,10 @@ EXPORT_SYMBOL(write_one_page);
  */
 int __set_page_dirty_nobuffers(struct page *page)
 {
+	struct address_space *mapping = page_mapping(page);
+	int ret = 0;
+
 	if (!TestSetPageDirty(page)) {
-		struct address_space *mapping = page_mapping(page);
 		struct address_space *mapping2;
 
 		if (mapping) {
@@ -648,9 +650,11 @@ int __set_page_dirty_nobuffers(struct pa
 							I_DIRTY_PAGES);
 			}
 		}
-		return 1;
+		ret = 1;
 	}
-	return 0;
+	if (page_mapped(page))
+		set_bit(AS_MCTIME, &mapping->flags);
+	return ret;
 }
 EXPORT_SYMBOL(__set_page_dirty_nobuffers);
 
--- linux-2.6.16.x86_64/mm/msync.c.org
+++ linux-2.6.16.x86_64/mm/msync.c
@@ -206,12 +206,16 @@ asmlinkage long sys_msync(unsigned long 
 		file = vma->vm_file;
 		start = vma->vm_end;
 		if ((flags & MS_ASYNC) && file && nr_pages_dirtied) {
+			struct address_space *mapping = file->f_mapping;
+
 			get_file(file);
 			up_read(&current->mm->mmap_sem);
-			balance_dirty_pages_ratelimited_nr(file->f_mapping,
+			balance_dirty_pages_ratelimited_nr(mapping,
 							nr_pages_dirtied);
-			fput(file);
 			down_read(&current->mm->mmap_sem);
+			if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+				inode_update_time(mapping->host);
+			fput(file);
 			vma = find_vma(current->mm, start);
 		} else if ((flags & MS_SYNC) && file &&
 				(vma->vm_flags & VM_SHARED)) {
--- linux-2.6.16.x86_64/mm/mmap.c.org
+++ linux-2.6.16.x86_64/mm/mmap.c
@@ -203,6 +203,8 @@ void unlink_file_vma(struct vm_area_stru
 		spin_lock(&mapping->i_mmap_lock);
 		__remove_shared_vm_struct(vma, file, mapping);
 		spin_unlock(&mapping->i_mmap_lock);
+		if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+			inode_update_time(mapping->host);
 	}
 }
 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memory mapped files not updating timestamps
  2006-05-31 17:53 ` Peter Staubach
  2006-05-31 19:38   ` Eric Dumazet
@ 2006-06-19  4:33   ` Andrew Morton
  1 sibling, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2006-06-19  4:33 UTC (permalink / raw)
  To: Peter Staubach; +Cc: staubach, linux-kernel

On Wed, 31 May 2006 13:53:16 -0400
Peter Staubach <staubach@redhat.com> wrote:

>

A few issues here.

> 
> I embarked on this work due to a bug reported by one of Red Hat's large
> customers.  They are finding that files, which should have been backed
> up, were not getting backed up.  This is due to the mtime on the files
> not changing and their backup software looking for mtime changes.  This
> is corruption and I need to get it fixed, sooner as opposed to later.
> 
> While I would like to get this fixed on top of Peter Zijlstra's changes,
> the process for those is looking long and complicated.  I am wondering
> if we could consider these changes and then add the requirement of
> maintaining these semantics to those that Peter's work is attempting to
> address.
> 

We seem to be having a problem getting a coherent changelog.  A changelog
should describe why the patch exists, what it does and how it does it. 
Please develop and maintain a changelog for each patch and reissue the
changelog with each reissuing of a patch, thanks.

This sequence:

+	if (test_and_clear_bit(AS_MCTIME, &mapping->flags))
+		inode_update_time(inode);

appears all over the place and should be implemented in a helper function.


The patch should work correctly for mmaps of block devices and I don't
think it does.  Sometimes it updates the timestamp on the kernel-internal
blockdev inode at mapping->host and sometimes it updates the inode of the
device node (/dev/hda1) at file->f_dentry->d_inode.  It should be updating
the /dev/hda1 inode.

The change to unlink_file_vma() is awkward, IMO.  It means that this helper
function "knows" things about what its caller is using it for.  I'd suggest
that this code should be moved up to a higher level where we have a more
sure semantic context, rather than being implemented in some low-level
helper function where it happens to be convenient.

Also, inode_update_time() can sleep. 
mark_inode_dirty_sync()->__mark_inode_dirty()->ext3_dirty_inode().  This is
despite Documentation/filesystems/Locking saying "must not sleep".  But
unlink_file_vma() (at least) is called from atomic context:

	unmap_region()
	->tlb_gather_mmu()
	  ->preempt_disable()
	->free_pgtables()
	  ->unlink_file_vma()
	->tlb_finish_mmu()
	  ->preempt_enable()

Which _should_ have triggered warnings if full kernel debugging was enabled
and sufficient testing was performed.  Perhaps a might_sleep() is needed in
_mark_inode_dirty().


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2006-06-19  4:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-05-17 15:16 [PATCH] memory mapped files not updating timestamps Peter Staubach
2006-05-17 19:24 ` Hugh Dickins
2006-05-18 10:52   ` Peter Zijlstra
2006-05-18 22:13     ` Peter Staubach
2006-05-31 17:53 ` Peter Staubach
2006-05-31 19:38   ` Eric Dumazet
2006-05-31 20:38     ` Peter Staubach
2006-06-19  4:33   ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox