[PATCH 0/1][RFC] mm: prepare_write positive return value

From: Dmitriy Monakhov <dmonakhov@sw.ru>
To: Linux Kernel <linux-kernel@vger.kernel.org>
Cc: Andrew Morton <akpm@osdl.org>, Nick Piggin <npiggin@suse.de>,
	Linux Filesystems <linux-fsdevel@vger.kernel.org>
Subject: [PATCH 0/1][RFC] mm: prepare_write positive return value
Date: Tue, 06 Feb 2007 11:33:46 +0300
Message-ID: <87ejp37kid.fsf@sw.ru>

[-- Attachment #1: Type: text/plain, Size: 1115 bytes --]

Almost all read/write operations handle data in chunks (segments or pages),
and the result is cumulative (integral) in the following scenario:
for_each_chunk() {
     res = op(....);
     if(IS_ERROR(res))
           return progress ? progress : res;
     progress += res;
}
prepare_write may behave the same way when blksize < pgsize: for example,
we successfully allocated/read some of the blocks, but not all of them,
and then an error happened. Currently we discard this progress by calling
vmtruncate() after prepare_write has failed.
It would be good to be able to signal this progress: interpret a positive
prepare_write() return code as the number of bytes the fs is ready to handle
at this moment. I've asked SAW, and he thinks it is sane. This size is always
less than PAGE_CACHE_SIZE, so it is less than AOP_TRUNCATED_PAGE too.
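
The proposed convention can be modelled outside the kernel. The following is
only a user-space sketch, not kernel code: MODEL_PAGE_SIZE, model_prepare()
and model_write() are hypothetical names standing in for PAGE_CACHE_SIZE,
->prepare_write() and the buffered write loop, and "space" is an assumed
counter of bytes the filesystem can still back.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096	/* stand-in for PAGE_CACHE_SIZE (assumption) */

/*
 * Hypothetical prepare step:
 *   0                       whole range [0, want) is ready
 *   0 < n < MODEL_PAGE_SIZE only the first n bytes are ready
 *   < 0                     error, nothing is ready
 */
static int model_prepare(size_t space, size_t want)
{
	if (space == 0)
		return -ENOSPC;
	if (space < want)
		return (int)space;
	return 0;
}

/*
 * Chunked write loop with cumulative progress. Returns the number of
 * bytes written, or a negative error if nothing was written at all.
 */
static long model_write(size_t len, size_t space)
{
	size_t progress = 0;

	while (progress < len) {
		size_t bytes = len - progress;
		int status;

		if (bytes > MODEL_PAGE_SIZE)
			bytes = MODEL_PAGE_SIZE;

		status = model_prepare(space, bytes);
		if (status > 0 && status < MODEL_PAGE_SIZE) {
			/* Positive return: clamp this chunk, not an error. */
			if ((size_t)status < bytes)
				bytes = (size_t)status;
			status = 0;
		}
		if (status < 0)
			return progress ? (long)progress : (long)status;

		space -= bytes;
		progress += bytes;
	}
	return (long)progress;
}
```

In this model, a 10000-byte write with room for only 6000 bytes reports 6000
bytes of progress instead of losing the partially prepared second chunk.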
 
BTW: This approach was used in the OpenVZ 2.6.9 kernel to make filesystems
with delayed allocation more correct, and it works well.
I think not everybody will be happy about this, but let's discuss all the
advantages and disadvantages of this change.

Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
-------------

[-- Attachment #2: diff-mm-fs-prepare_write-retval --]
[-- Type: text/plain, Size: 3632 bytes --]

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 62632f5..b4f6eac 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -239,7 +239,11 @@ static int do_lo_send_aops(struct loop_device *lo, struct bio_vec *bvec,
 				page_cache_release(page);
 				continue;
 			}
-			goto unlock;
+			if (ret > 0 && ret < PAGE_CACHE_SIZE)
+			/* prepare_write limited the writable byte count. */
+				size = min(size, (unsigned)ret);
+			else
+				goto unlock;
 		}
 		transfer_result = lo_do_transfer(lo, WRITE, page, offset,
 				bvec->bv_page, bv_offs, size, IV);
diff --git a/fs/ecryptfs/mmap.c b/fs/ecryptfs/mmap.c
index 1e5d2ba..b5eebe8 100644
--- a/fs/ecryptfs/mmap.c
+++ b/fs/ecryptfs/mmap.c
@@ -454,6 +454,17 @@ static int ecryptfs_write_inode_size_to_header(struct file *lower_file,
 	}
 	lower_a_ops = lower_inode->i_mapping->a_ops;
 	rc = lower_a_ops->prepare_write(lower_file, header_page, 0, 8);
+	if (unlikely(rc > 0 && rc < PAGE_CACHE_SIZE)) {
+	/*
+	 * prepare_write can handle fewer bytes than we need. This is unlikely
+	 * to happen in practice because usually we need only one block. To keep
+	 * prepare/commit balanced, invoke commit and then fail.
+	 */
+		int ret;
+		ret = lower_a_ops->commit_write(lower_file, header_page, 0, rc);
+		rc = ret ? ret : -ENOSPC;
+		goto unlock;
+	}
 	file_size = (u64)i_size_read(inode);
 	ecryptfs_printk(KERN_DEBUG, "Writing size: [0x%.16x]\n", file_size);
 	file_size = cpu_to_be64(file_size);
@@ -462,6 +473,7 @@ static int ecryptfs_write_inode_size_to_header(struct file *lower_file,
 	kunmap_atomic(header_virt, KM_USER0);
 	flush_dcache_page(header_page);
 	rc = lower_a_ops->commit_write(lower_file, header_page, 0, 8);
+unlock:
 	if (rc < 0)
 		ecryptfs_printk(KERN_ERR, "Error commiting header page "
 				"write\n");
diff --git a/fs/namei.c b/fs/namei.c
index b305589..723db81 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2704,6 +2704,16 @@ retry:
 		page_cache_release(page);
 		goto retry;
 	}
+	if (unlikely(err > 0 && err < PAGE_CACHE_SIZE)) {
+	/*
+	 * prepare_write can handle fewer bytes than we need. This is unlikely
+	 * to happen in practice because usually we need only one block. To keep
+	 * prepare/commit balanced, invoke commit and then fail.
+	 */
+		int ret;
+		ret = mapping->a_ops->commit_write(NULL, page, 0, err);
+		err = ret ? ret : -ENOSPC;
+	}
 	if (err)
 		goto fail_map;
 	kaddr = kmap_atomic(page, KM_USER0);
diff --git a/fs/splice.c b/fs/splice.c
index 2fca6eb..d2b92bf 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -652,6 +652,17 @@ find_page:
 	if (unlikely(ret)) {
 		loff_t isize = i_size_read(mapping->host);
 
+		if (ret > 0 && ret < PAGE_CACHE_SIZE) {
+		/*
+		 * prepare_write limited the writable byte count. To keep
+		 * prepare/commit balanced, invoke commit and then fail. The
+		 * initial i_size is saved, so vmtruncate can restore it later.
+		 */
+			int ret2;
+			ret2 = mapping->a_ops->commit_write(file, page, offset,
+				offset + ret);
+			ret = ret2 ? ret2 : -ENOSPC;
+		}
 		if (ret != AOP_TRUNCATED_PAGE)
 			unlock_page(page);
 		page_cache_release(page);
diff --git a/mm/filemap.c b/mm/filemap.c
index 5fe315a..529eb9e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2188,6 +2188,11 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov,
 		}
 
 		status = a_ops->prepare_write(file, page, offset, offset+bytes);
+		if (unlikely(status > 0 && status < PAGE_CACHE_SIZE)) {
+		/* prepare_write limited the byte count for this iteration. */
+			bytes = min(bytes, (size_t)status);
+			status = 0;
+		}
 		if (unlikely(status)) {
 			loff_t isize = i_size_read(inode);

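Callers that need the whole range (the ecryptfs, namei and splice hunks above)
cannot use a short prepare, so they commit the prepared part to keep
prepare/commit balanced and then fail. A user-space sketch of that pattern
follows; stub_prepare(), stub_commit(), write_whole_range() and the counters
are hypothetical test scaffolding, not kernel interfaces.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PAGE_SIZE 4096u	/* stand-in for PAGE_CACHE_SIZE (assumption) */

static int prepare_result;	/* value the stub prepare step returns */
static int commit_calls;	/* how many times commit was invoked */
static unsigned last_commit_to;	/* 'to' argument of the last commit */

/* Stub for ->prepare_write(): returns whatever the test configured. */
static int stub_prepare(unsigned from, unsigned to)
{
	(void)from;
	(void)to;
	return prepare_result;
}

/* Stub for ->commit_write(): records the call and always succeeds. */
static int stub_commit(unsigned from, unsigned to)
{
	(void)from;
	commit_calls++;
	last_commit_to = to;
	return 0;
}

/* Write [from, to) or fail; a short prepare is committed, then rejected. */
static int write_whole_range(unsigned from, unsigned to)
{
	int rc = stub_prepare(from, to);

	if (rc > 0 && (unsigned)rc < MODEL_PAGE_SIZE) {
		/* Balance the prepare, then report lack of space. */
		int ret = stub_commit(from, from + (unsigned)rc);

		return ret ? ret : -ENOSPC;
	}
	if (rc)
		return rc;

	/* A real caller would copy its data into the page here. */
	return stub_commit(from, to);
}
```

With prepare_result set to 4 for an 8-byte request, the sketch commits bytes
[0, 4) exactly once and returns -ENOSPC; a negative prepare_result is passed
through without any commit.
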