public inbox for linux-ext4@vger.kernel.org
* [RFC][PATCH] ext4: Use page_mkwrite vma_operations to get mmap write notification.
@ 2008-02-28 18:05 Aneesh Kumar K.V
  0 siblings, 0 replies; 6+ messages in thread
From: Aneesh Kumar K.V @ 2008-02-28 18:05 UTC (permalink / raw)
  To: cmm; +Cc: linux-ext4, Aneesh Kumar K.V

We would like to be notified when a write is done to an mmapped section.
This is needed for preallocated areas: in the callback we split the
preallocated area into an initialized extent and an uninitialized extent.
This lets us handle ENOSPC better. Otherwise we get ENOSPC in writepage,
which would result in data loss. The changes are also needed to handle ENOSPC
when writing to an mmapped section of a file with holes.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/ext4/file.c          |   19 +++++++++++++++-
 fs/ext4/inode.c         |   54 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/ext4_fs.h |    1 +
 3 files changed, 73 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 20507a2..77341c1 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -123,6 +123,23 @@ force_commit:
 	return ret;
 }
 
+static struct vm_operations_struct ext4_file_vm_ops = {
+	.fault		= filemap_fault,
+	.page_mkwrite   = ext4_page_mkwrite,
+};
+
+static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct address_space *mapping = file->f_mapping;
+
+	if (!mapping->a_ops->readpage)
+		return -ENOEXEC;
+	file_accessed(file);
+	vma->vm_ops = &ext4_file_vm_ops;
+	vma->vm_flags |= VM_CAN_NONLINEAR;
+	return 0;
+}
+
 const struct file_operations ext4_file_operations = {
 	.llseek		= generic_file_llseek,
 	.read		= do_sync_read,
@@ -133,7 +150,7 @@ const struct file_operations ext4_file_operations = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= ext4_compat_ioctl,
 #endif
-	.mmap		= generic_file_mmap,
+	.mmap		= ext4_file_mmap,
 	.open		= generic_file_open,
 	.release	= ext4_release_file,
 	.fsync		= ext4_sync_file,
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 5b5d63d..62aafc3 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3490,3 +3490,57 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
 
 	return err;
 }
+
+int ext4_page_mkwrite(struct vm_area_struct *vma, struct page *page)
+{
+	unsigned long end;
+	loff_t size;
+	handle_t *handle;
+	int ret = -EINVAL, needed_blocks;
+	struct file *file   = vma->vm_file;
+	struct inode *inode = file->f_path.dentry->d_inode;
+
+	needed_blocks = ext4_writepage_trans_blocks(inode);
+	/* We need to take inode mutex to prevent parallel write */
+	mutex_lock(&inode->i_mutex);
+	lock_page(page);
+	size = i_size_read(inode);
+	if ((page->mapping != inode->i_mapping) ||
+	    (page_offset(page) > size)) {
+		/* page got truncated out from underneath us */
+		goto out_unlock;
+	}
+	/* page is wholly or partially inside EOF */
+	if (((loff_t)(page->index + 1) << PAGE_CACHE_SHIFT) > size)
+		end = size & ~PAGE_CACHE_MASK;
+	else
+		end = PAGE_CACHE_SIZE;
+
+	/*
+	 * if ext4_get_block resulted in a split of an uninitialized extent,
+	 * in file system full case, we will have to take the journal write
+	 * access and zero out the page.
+	 */
+	handle = ext4_journal_start(inode, needed_blocks);
+	if (IS_ERR(handle)) {
+		ret = PTR_ERR(handle);
+		goto out_unlock;
+	}
+	/* Will zero out the pages if buffer is marked new */
+	ret = block_prepare_write(page, 0, end, ext4_get_block);
+
+	/*
+	 * Now call commit_write to mark the buffer dirty and page
+	 * uptodate. page_mkwrite makes the page dirty towards the
+	 * end. We don't want to mark the buffer dirty for
+	 * journalled mode.
+	 */
+	if (!ret && !ext4_should_journal_data(inode))
+		ret = block_commit_write(page, 0, end);
+
+	ext4_journal_stop(handle);
+out_unlock:
+	unlock_page(page);
+	mutex_unlock(&inode->i_mutex);
+	return ret;
+}
diff --git a/include/linux/ext4_fs.h b/include/linux/ext4_fs.h
index 22810b1..8f5a563 100644
--- a/include/linux/ext4_fs.h
+++ b/include/linux/ext4_fs.h
@@ -1059,6 +1059,7 @@ extern void ext4_set_aops(struct inode *inode);
 extern int ext4_writepage_trans_blocks(struct inode *);
 extern int ext4_block_truncate_page(handle_t *handle, struct page *page,
 		struct address_space *mapping, loff_t from);
+extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct page *page);
 
 /* ioctl.c */
 extern long ext4_ioctl(struct file *, unsigned int, unsigned long);
-- 
1.5.4.3.325.g6d216.dirty
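
The end-of-page computation in ext4_page_mkwrite() above can be tried out in user space. What follows is a minimal sketch, not the kernel code: SKETCH_PAGE_SHIFT, SKETCH_PAGE_SIZE and bytes_inside_eof() are hypothetical names, and a 4 KiB page is assumed. It returns how many bytes of a given page lie inside EOF, which is the value the patch passes as the end of the range to prepare. The shift is done in a 64-bit type on purpose, since page->index is an unsigned long and the shift could otherwise wrap on 32-bit systems. Unlike the patch, the sketch treats a page that starts exactly at EOF as being outside the file.

```c
#include <stdint.h>

/* Hypothetical user-space stand-ins for the page cache constants (4 KiB page assumed). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Return how many bytes of page `index` lie inside a file of `size` bytes,
 * or 0 if the page starts at or beyond EOF.
 */
static unsigned long bytes_inside_eof(uint64_t index, int64_t size)
{
	/* 64-bit shift, so large page indexes do not wrap. */
	int64_t start = (int64_t)(index << SKETCH_PAGE_SHIFT);

	if (size <= 0 || start >= size)
		return 0;
	/* Page straddles EOF: only the offset of EOF within the page counts. */
	if (start + (int64_t)SKETCH_PAGE_SIZE > size)
		return (unsigned long)(size & (int64_t)(SKETCH_PAGE_SIZE - 1));
	/* Page is wholly inside EOF. */
	return SKETCH_PAGE_SIZE;
}
```

For a 5000-byte file, page 0 is wholly inside EOF and page 1 straddles it, so only the first part of page 1 needs to be prepared.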



* [RFC PATCH] ext4: Use page_mkwrite vma_operations to get mmap write notification.
@ 2008-03-04 12:42 Aneesh Kumar K.V
  2008-03-05  0:45 ` Mingming Cao
  0 siblings, 1 reply; 6+ messages in thread
From: Aneesh Kumar K.V @ 2008-03-04 12:42 UTC (permalink / raw)
  To: cmm, tytso; +Cc: linux-ext4, Aneesh Kumar K.V

We would like to be notified when a write is done to an mmapped section.
This is needed for preallocated areas: in the callback we split the
preallocated area into an initialized extent and an uninitialized extent.
This lets us handle ENOSPC better. Otherwise we get ENOSPC in writepage,
which would result in data loss. The changes are also needed to handle ENOSPC
when writing to an mmapped section of a file with holes.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/ext4/file.c          |   19 ++++++++++++++++++-
 fs/ext4/inode.c         |   15 +++++++++++++++
 include/linux/ext4_fs.h |    1 +
 3 files changed, 34 insertions(+), 1 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 20507a2..77341c1 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -123,6 +123,23 @@ force_commit:
 	return ret;
 }
 
+static struct vm_operations_struct ext4_file_vm_ops = {
+	.fault		= filemap_fault,
+	.page_mkwrite   = ext4_page_mkwrite,
+};
+
+static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct address_space *mapping = file->f_mapping;
+
+	if (!mapping->a_ops->readpage)
+		return -ENOEXEC;
+	file_accessed(file);
+	vma->vm_ops = &ext4_file_vm_ops;
+	vma->vm_flags |= VM_CAN_NONLINEAR;
+	return 0;
+}
+
 const struct file_operations ext4_file_operations = {
 	.llseek		= generic_file_llseek,
 	.read		= do_sync_read,
@@ -133,7 +150,7 @@ const struct file_operations ext4_file_operations = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= ext4_compat_ioctl,
 #endif
-	.mmap		= generic_file_mmap,
+	.mmap		= ext4_file_mmap,
 	.open		= generic_file_open,
 	.release	= ext4_release_file,
 	.fsync		= ext4_sync_file,
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0f86bbb..42bc666 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3493,3 +3493,18 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
 
 	return err;
 }
+
+int ext4_page_mkwrite(struct vm_area_struct *vma, struct page *page)
+{
+	/*
+	 * if ext4_get_block resulted in a split of an uninitialized extent,
+	 * in file system full case, we will have to take the journal write
+	 * access and zero out the page. The journal handle gets initialized
+	 * in ext4_get_block.
+	 */
+	/* FIXME!! should we take inode->i_mutex? Currently we can't because
+	 * it has a circular locking dependency with DIO. But migrate expects
+	 * i_mutex to ensure no i_data changes.
+	 */
+	return block_page_mkwrite(vma, page, ext4_get_block);
+}
diff --git a/include/linux/ext4_fs.h b/include/linux/ext4_fs.h
index 22810b1..8f5a563 100644
--- a/include/linux/ext4_fs.h
+++ b/include/linux/ext4_fs.h
@@ -1059,6 +1059,7 @@ extern void ext4_set_aops(struct inode *inode);
 extern int ext4_writepage_trans_blocks(struct inode *);
 extern int ext4_block_truncate_page(handle_t *handle, struct page *page,
 		struct address_space *mapping, loff_t from);
+extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct page *page);
 
 /* ioctl.c */
 extern long ext4_ioctl(struct file *, unsigned int, unsigned long);
-- 
1.5.4.3.422.g34cd6.dirty
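
The case the changelog describes, a store through a shared mapping into a part of the file that has no blocks allocated yet, can be exercised from user space. The following is a minimal sketch under stated assumptions: mmap_write_into_hole() and the /tmp path template are hypothetical, and the program only shows the sequence of calls, it does not fill the filesystem. The first store into the hole takes a write fault, which is where a page_mkwrite callback would get the chance to allocate blocks. As far as I understand the VM side, a failure there is reported to the process at the store (typically as SIGBUS) rather than being lost later in writepage.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a sparse temporary file of `len` bytes, store `msg` at offset 0
 * through a shared mapping, flush it, and read it back with pread().
 * Returns 0 on success and -1 on any failure.
 */
static int mmap_write_into_hole(size_t len, const char *msg,
				char *out, size_t outlen)
{
	char path[] = "/tmp/mkwrite-sketch-XXXXXX";	/* hypothetical location */
	size_t n = strlen(msg) + 1;
	int fd = mkstemp(path);
	int ret = -1;
	char *map;

	if (fd < 0)
		return -1;
	unlink(path);			/* file goes away when fd is closed */
	if (n > len || n > outlen)
		goto out;
	/* Extending with ftruncate() leaves the whole file as a hole. */
	if (ftruncate(fd, (off_t)len) < 0)
		goto out;
	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out;
	/* First store into the hole: this is the write fault of interest. */
	memcpy(map, msg, n);
	if (msync(map, len, MS_SYNC) == 0 &&
	    pread(fd, out, n, 0) == (ssize_t)n)
		ret = 0;
	munmap(map, len);
out:
	close(fd);
	return ret;
}
```

To see the ENOSPC behaviour itself, the same sequence would have to be run on a small, nearly full test filesystem, which this sketch deliberately leaves out.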



* Re: [RFC PATCH] ext4: Use page_mkwrite vma_operations to get mmap write notification.
  2008-03-04 12:42 [RFC PATCH] ext4: Use page_mkwrite vma_operations to get mmap write notification Aneesh Kumar K.V
@ 2008-03-05  0:45 ` Mingming Cao
  2008-03-05 23:29   ` Theodore Tso
  0 siblings, 1 reply; 6+ messages in thread
From: Mingming Cao @ 2008-03-05  0:45 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: tytso, linux-ext4

On Tue, 2008-03-04 at 18:12 +0530, Aneesh Kumar K.V wrote:
> We would like to be notified when a write is done to an mmapped section.
> This is needed for preallocated areas: in the callback we split the
> preallocated area into an initialized extent and an uninitialized extent.
> This lets us handle ENOSPC better. Otherwise we get ENOSPC in writepage,
> which would result in data loss. The changes are also needed to handle ENOSPC
> when writing to an mmapped section of a file with holes.
> 

Reviewed. Looks good.
Added to patch queue

Mingming
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  fs/ext4/file.c          |   19 ++++++++++++++++++-
>  fs/ext4/inode.c         |   15 +++++++++++++++
>  include/linux/ext4_fs.h |    1 +
>  3 files changed, 34 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 20507a2..77341c1 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -123,6 +123,23 @@ force_commit:
>  	return ret;
>  }
> 
> +static struct vm_operations_struct ext4_file_vm_ops = {
> +	.fault		= filemap_fault,
> +	.page_mkwrite   = ext4_page_mkwrite,
> +};
> +
> +static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
> +{
> +	struct address_space *mapping = file->f_mapping;
> +
> +	if (!mapping->a_ops->readpage)
> +		return -ENOEXEC;
> +	file_accessed(file);
> +	vma->vm_ops = &ext4_file_vm_ops;
> +	vma->vm_flags |= VM_CAN_NONLINEAR;
> +	return 0;
> +}
> +
>  const struct file_operations ext4_file_operations = {
>  	.llseek		= generic_file_llseek,
>  	.read		= do_sync_read,
> @@ -133,7 +150,7 @@ const struct file_operations ext4_file_operations = {
>  #ifdef CONFIG_COMPAT
>  	.compat_ioctl	= ext4_compat_ioctl,
>  #endif
> -	.mmap		= generic_file_mmap,
> +	.mmap		= ext4_file_mmap,
>  	.open		= generic_file_open,
>  	.release	= ext4_release_file,
>  	.fsync		= ext4_sync_file,
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 0f86bbb..42bc666 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -3493,3 +3493,18 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
> 
>  	return err;
>  }
> +
> +int ext4_page_mkwrite(struct vm_area_struct *vma, struct page *page)
> +{
> +	/*
> +	 * if ext4_get_block resulted in a split of an uninitialized extent,
> +	 * in file system full case, we will have to take the journal write
> +	 * access and zero out the page. The journal handle gets initialized
> +	 * in ext4_get_block.
> +	 */
> +	/* FIXME!! should we take inode->i_mutex? Currently we can't because
> +	 * it has a circular locking dependency with DIO. But migrate expects
> +	 * i_mutex to ensure no i_data changes.
> +	 */
> +	return block_page_mkwrite(vma, page, ext4_get_block);
> +}
> diff --git a/include/linux/ext4_fs.h b/include/linux/ext4_fs.h
> index 22810b1..8f5a563 100644
> --- a/include/linux/ext4_fs.h
> +++ b/include/linux/ext4_fs.h
> @@ -1059,6 +1059,7 @@ extern void ext4_set_aops(struct inode *inode);
>  extern int ext4_writepage_trans_blocks(struct inode *);
>  extern int ext4_block_truncate_page(handle_t *handle, struct page *page,
>  		struct address_space *mapping, loff_t from);
> +extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct page *page);
> 
>  /* ioctl.c */
>  extern long ext4_ioctl(struct file *, unsigned int, unsigned long);



* Re: [RFC PATCH] ext4: Use page_mkwrite vma_operations to get mmap write notification.
  2008-03-05  0:45 ` Mingming Cao
@ 2008-03-05 23:29   ` Theodore Tso
  2008-03-05 23:37     ` Mingming Cao
  0 siblings, 1 reply; 6+ messages in thread
From: Theodore Tso @ 2008-03-05 23:29 UTC (permalink / raw)
  To: Mingming Cao; +Cc: Aneesh Kumar K.V, linux-ext4

On Tue, Mar 04, 2008 at 04:45:51PM -0800, Mingming Cao wrote:
> > +	/* FIXME!! should we take inode->i_mutex? Currently we can't because
> > +	 * it has a circular locking dependency with DIO. But migrate expects
> > +	 * i_mutex to ensure no i_data changes.

Should I worry that we have something in the stable part of the patch
queue with this FIXME!! comment?  :-)

						- Ted


* Re: [RFC PATCH] ext4: Use page_mkwrite vma_operations to get mmap write notification.
  2008-03-05 23:29   ` Theodore Tso
@ 2008-03-05 23:37     ` Mingming Cao
  2008-03-05 23:59       ` Theodore Tso
  0 siblings, 1 reply; 6+ messages in thread
From: Mingming Cao @ 2008-03-05 23:37 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Aneesh Kumar K.V, linux-ext4

On Wed, 2008-03-05 at 18:29 -0500, Theodore Tso wrote:
> On Tue, Mar 04, 2008 at 04:45:51PM -0800, Mingming Cao wrote:
> > > +	/* FIXME!! should we take inode->i_mutex? Currently we can't because
> > > +	 * it has a circular locking dependency with DIO. But migrate expects
> > > +	 * i_mutex to ensure no i_data changes.
> 
> Should I worry that we have something in the stable part of the patch
> queue with this FIXME!! comment?  :-)
> 

I think this comment could be moved to migration.c. We can't take
i_mutex on the mapped IO path. The i_data_mutex is the lock that should
protect against concurrent i_data changes, and it is what mapped IO
currently uses. The race with migration could be addressed in the
migration code instead of here. I propose we drop this comment for now.

Mingming
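
The circular dependency mentioned in the FIXME can be illustrated with a toy lock-order check. This is only a sketch under assumptions that the thread does not spell out: as I understand it, the direct IO write path holds i_mutex and then needs mmap_sem to pin the user pages, while the fault path already holds mmap_sem when page_mkwrite runs, so taking i_mutex there would invert the order. The enum values and record_order() are hypothetical names, not kernel interfaces.

```c
#include <stdbool.h>

/* Hypothetical labels for the two locks discussed in the thread. */
enum sketch_lock { SK_I_MUTEX, SK_MMAP_SEM, SK_NLOCKS };

/* taken_before[a][b] is true once some path acquired b while holding a. */
static bool taken_before[SK_NLOCKS][SK_NLOCKS];

/*
 * Record that a code path takes `second` while holding `first`.
 * Returns false, without recording anything, if the opposite order
 * has already been seen, i.e. the two paths could deadlock (ABBA).
 */
static bool record_order(enum sketch_lock first, enum sketch_lock second)
{
	if (taken_before[second][first])
		return false;
	taken_before[first][second] = true;
	return true;
}
```

Recording the assumed DIO order (SK_I_MUTEX, then SK_MMAP_SEM) succeeds. Recording the order a page_mkwrite that took i_mutex would create (SK_MMAP_SEM, then SK_I_MUTEX) is then rejected, which is the situation the FIXME describes.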



* Re: [RFC PATCH] ext4: Use page_mkwrite vma_operations to get mmap write notification.
  2008-03-05 23:37     ` Mingming Cao
@ 2008-03-05 23:59       ` Theodore Tso
  0 siblings, 0 replies; 6+ messages in thread
From: Theodore Tso @ 2008-03-05 23:59 UTC (permalink / raw)
  To: Mingming Cao; +Cc: Aneesh Kumar K.V, linux-ext4

On Wed, Mar 05, 2008 at 03:37:18PM -0800, Mingming Cao wrote:
> On Wed, 2008-03-05 at 18:29 -0500, Theodore Tso wrote:
> > On Tue, Mar 04, 2008 at 04:45:51PM -0800, Mingming Cao wrote:
> > > > +	/* FIXME!! should we take inode->i_mutex? Currently we can't because
> > > > +	 * it has a circular locking dependency with DIO. But migrate expects
> > > > +	 * i_mutex to ensure no i_data changes.
> > 
> > Should I worry that we have something in the stable part of the patch
> > queue with this FIXME!! comment?  :-)
> > 
> 
> I think this comment could be moved to migration.c. We can't take
> i_mutex on the mapped IO path. The i_data_mutex is the lock that should
> protect against concurrent i_data changes, and it is what mapped IO
> currently uses. The race with migration could be addressed in the
> migration code instead of here. I propose we drop this comment for now.

OK, but that still means we have a known bug in the migration code,
which is in mainline....

						- Ted
