From: Dmitriy Monakhov <dmonakhov@sw.ru>
To: Nick Piggin <npiggin@suse.de>
Cc: Linux Memory Management <linux-mm@kvack.org>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Linux Filesystems <linux-fsdevel@vger.kernel.org>,
Andrew Morton <akpm@osdl.org>
Subject: Re: [patch 6/10] mm: be sure to trim blocks
Date: Sun, 14 Jan 2007 17:25:44 +0300
Message-ID: <87bql1lm6v.fsf@sw.ru>
In-Reply-To: <20070113011255.9449.33228.sendpatchset@linux.site> (Nick Piggin's message of "Sat, 13 Jan 2007 04:25:11 +0100 (CET)")
Nick Piggin <npiggin@suse.de> writes:
> If prepare_write fails with AOP_TRUNCATED_PAGE, or if commit_write fails, then
> we may have failed the write operation despite prepare_write having
> instantiated blocks past i_size. Fix this, and consolidate the trimming into
> one place.
>
> Signed-off-by: Nick Piggin <npiggin@suse.de>
>
> Index: linux-2.6/mm/filemap.c
> ===================================================================
> --- linux-2.6.orig/mm/filemap.c
> +++ linux-2.6/mm/filemap.c
> @@ -1911,22 +1911,9 @@ generic_file_buffered_write(struct kiocb
> }
>
> status = a_ops->prepare_write(file, page, offset, offset+bytes);
> - if (unlikely(status)) {
> - loff_t isize = i_size_read(inode);
> + if (unlikely(status))
> + goto fs_write_aop_error;
Maybe this is a stupid question, but still:
why do we treat any non-zero prepare_write() return code as an error? It may be positive.
A positive return value could be used as a fine-grained limit on 'bytes' when
blksize < pgsize, as follows:
	status = a_ops->prepare_write(file, page, offset, offset+bytes);
	if (unlikely(status)) {
		if (status > 0) {
			bytes = min(bytes, status);
			status = 0;
		} else {
			goto fs_write_aop_error;
		}
	}
---
This is useful because a filesystem may want to reduce 'bytes' for a number of
reasons, for example to keep the write within a blksize boundary.
Example: the filesystem has a 1k blksize and only two free blocks, and we try
to write 4k bytes.
Currently write(fd, buff, 4096) will return -ENOSPC.
But after this fix write(fd, buff, 4096) will write as much as it can and return 2048.
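The example above can be modelled in userspace with a rough sketch like the one below. It is not kernel code: toy_fs, toy_prepare_write(), toy_write() and TOY_ENOSPC are made-up names for illustration, and the model assumes the write starts on a block boundary and ignores the page cache entirely. It only shows how clamping 'bytes' on a positive return turns the -ENOSPC failure into a short write.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ENOSPC 28	/* stand-in for ENOSPC; hypothetical name */

/* Hypothetical filesystem state: block size and remaining free blocks. */
struct toy_fs {
	size_t blksize;
	size_t free_blocks;
};

/*
 * Toy prepare_write(): reserve blocks for 'bytes' bytes.
 * Returns 0 if everything fits.  If only part fits, returns the number
 * of bytes that could be reserved when allow_partial is set, otherwise
 * -TOY_ENOSPC (and reserves nothing).
 */
static long toy_prepare_write(struct toy_fs *fs, size_t bytes, int allow_partial)
{
	size_t need;
	long partial;

	assert(fs->blksize > 0);
	need = (bytes + fs->blksize - 1) / fs->blksize;
	if (need <= fs->free_blocks) {
		fs->free_blocks -= need;
		return 0;
	}
	if (!allow_partial || fs->free_blocks == 0)
		return -TOY_ENOSPC;
	partial = (long)(fs->free_blocks * fs->blksize);
	fs->free_blocks = 0;
	return partial;
}

/* Toy write loop: one page-sized chunk per iteration. */
static long toy_write(struct toy_fs *fs, size_t count, size_t pgsize,
		      int allow_partial)
{
	long written = 0;

	while (count) {
		size_t bytes = count < pgsize ? count : pgsize;
		long status = toy_prepare_write(fs, bytes, allow_partial);

		if (status > 0) {
			/* Positive return: clamp 'bytes', then carry on. */
			if ((size_t)status < bytes)
				bytes = (size_t)status;
			status = 0;
		}
		if (status < 0)
			return written ? written : status;
		written += (long)bytes;
		count -= bytes;
	}
	return written;
}
```

With a toy_fs of { 1024, 2 } and a 4096-byte page, toy_write(&fs, 4096, 4096, 0) models the current behaviour and toy_write(&fs, 4096, 4096, 1) models the proposed one.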
>
> - if (status != AOP_TRUNCATED_PAGE)
> - unlock_page(page);
> - page_cache_release(page);
> - if (status == AOP_TRUNCATED_PAGE)
> - continue;
> - /*
> - * prepare_write() may have instantiated a few blocks
> - * outside i_size. Trim these off again.
> - */
> - if (pos + bytes > isize)
> - vmtruncate(inode, isize);
> - break;
> - }
> if (likely(nr_segs == 1))
> copied = filemap_copy_from_user(page, offset,
> buf, bytes);
> @@ -1935,10 +1922,9 @@ generic_file_buffered_write(struct kiocb
> cur_iov, iov_offset, bytes);
> flush_dcache_page(page);
> status = a_ops->commit_write(file, page, offset, offset+bytes);
> - if (status == AOP_TRUNCATED_PAGE) {
> - page_cache_release(page);
> - continue;
> - }
> + if (unlikely(status))
> + goto fs_write_aop_error;
> +
> if (likely(copied > 0)) {
> if (!status)
> status = copied;
> @@ -1969,6 +1955,25 @@ generic_file_buffered_write(struct kiocb
> break;
> balance_dirty_pages_ratelimited(mapping);
> cond_resched();
> + continue;
> +
> +fs_write_aop_error:
> + if (status != AOP_TRUNCATED_PAGE)
> + unlock_page(page);
> + page_cache_release(page);
> +
> + /*
> + * prepare_write() may have instantiated a few blocks
> + * outside i_size. Trim these off again. Don't need
> + * i_size_read because we hold i_mutex.
> + */
> + if (pos + bytes > inode->i_size)
> + vmtruncate(inode, inode->i_size);
> + if (status == AOP_TRUNCATED_PAGE)
> + continue;
> + else
> + break;
> +
> } while (count);
> *ppos = pos;
>