All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dmitriy Monakhov <dmonakhov@openvz.org>
To: Andrew Morton <akpm@osdl.org>
Cc: Dmitriy Monakhov <dmonakhov@openvz.org>,
	linux-kernel@vger.kernel.org,
	Linux Memory Management <linux-mm@kvack.org>,
	devel@openvz.org, xfs@oss.sgi.com
Subject: Re: [PATCH]  incorrect error handling inside generic_file_direct_write
Date: Wed, 13 Dec 2006 02:14:18 +0300	[thread overview]
Message-ID: <87y7pcn1v9.fsf@sw.ru> (raw)
In-Reply-To: <20061212024027.6c2a79d3.akpm@osdl.org> (Andrew Morton's message of "Tue, 12 Dec 2006 02:40:27 -0800")

[-- Attachment #1: Type: text/plain, Size: 1535 bytes --]

Andrew Morton <akpm@osdl.org> writes:

> On Tue, 12 Dec 2006 16:18:32 +0300
> Dmitriy Monakhov <dmonakhov@sw.ru> wrote:
>
>> >> but according to filemaps locking rules: mm/filemap.c:77
>> >>  ..
>> >>  *  ->i_mutex			(generic_file_buffered_write)
>> >>  *    ->mmap_sem		(fault_in_pages_readable->do_page_fault)
>> >>  ..
>> >> I'm confused a litle bit, where is the truth? 
>> >
>> > xfs_write() calls generic_file_direct_write() without taking i_mutex for
>> > O_DIRECT writes.
>> Yes, but my quastion is about __generic_file_aio_write_nolock().
>> As i understand _nolock sufix means that i_mutex was already locked 
>> by caller, am i right ?
>
> Nope.  It just means that __generic_file_aio_write_nolock() doesn't take
> the lock.  We don't assume or require that the caller took it.  For example
> the raw driver calls generic_file_aio_write_nolock() without taking
> i_mutex.  Raw isn't relevant to the problem (although ocfs2 might be).  But
> we cannot assume that all callers have taken i_mutex, I think.
>
> I guess we can make that a rule (document it, add
> BUG_ON(!mutex_is_locked(..)) if it isn't a blockdev) if needs be.  After
> really checking that this matches reality for all callers.
I've checked generic_file_aio_write_nolock() callers for non blockdev.
Only ocfs2 call it explicitly, and do it under i_mutex.
So we need to do: 
1) Change wrong comments.
2) Add BUG_ON(!mutex_is_locked(..)) for non blkdev.
3) Invoke vmtruncate only for non blkdev.

Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
-------

[-- Attachment #2: direct-io-fix.patch --]
[-- Type: text/plain, Size: 2123 bytes --]

diff --git a/mm/filemap.c b/mm/filemap.c
index 7b84dc8..540ef5e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2046,8 +2046,8 @@ generic_file_direct_write(struct kiocb *
 	/*
 	 * Sync the fs metadata but not the minor inode changes and
 	 * of course not the data as we did direct DMA for the IO.
-	 * i_mutex is held, which protects generic_osync_inode() from
-	 * livelocking.
+	 * i_mutex may not being held, if so some specific locking
+	 * ordering must protect generic_osync_inode() from livelocking.
 	 */
 	if (written >= 0 && ((file->f_flags & O_SYNC) || IS_SYNC(inode))) {
 		int err = generic_osync_inode(inode, mapping, OSYNC_METADATA);
@@ -2282,6 +2282,17 @@ __generic_file_aio_write_nolock(struct k
 
 		written = generic_file_direct_write(iocb, iov, &nr_segs, pos,
 							ppos, count, ocount);
+		/*
+		 * If host is not S_ISBLK generic_file_direct_write() may 
+		 * have instantiated a few blocks outside i_size  files
+		 * Trim these off again.
+		 */
+		if (unlikely(written < 0) && !S_ISBLK(inode->i_mode)) {
+			loff_t isize = i_size_read(inode);
+			if (pos + count > isize)
+				vmtruncate(inode, isize);
+		}
+
 		if (written < 0 || written == count)
 			goto out;
 		/*
@@ -2344,6 +2355,13 @@ ssize_t generic_file_aio_write_nolock(st
 	ssize_t ret;
 
 	BUG_ON(iocb->ki_pos != pos);
+	/*
+	 *  generic_file_buffered_write() may be called inside 
+	 *  __generic_file_aio_write_nolock() even in case of
+	 *  O_DIRECT for non S_ISBLK files. So i_mutex must be held.
+	 */
+	if (!S_ISBLK(inode->i_mode))
+		BUG_ON(!mutex_is_locked(&inode->i_mutex));
 
 	ret = __generic_file_aio_write_nolock(iocb, iov, nr_segs,
 			&iocb->ki_pos);
@@ -2386,8 +2404,8 @@ ssize_t generic_file_aio_write(struct ki
 EXPORT_SYMBOL(generic_file_aio_write);
 
 /*
- * Called under i_mutex for writes to S_ISREG files.   Returns -EIO if something
- * went wrong during pagecache shootdown.
+ * May be called without i_mutex for writes to S_ISREG files.
+ * Returns -EIO if something went wrong during pagecache shootdown.
  */
 static ssize_t
 generic_file_direct_IO(int rw, struct kiocb *iocb, const struct iovec *iov,

WARNING: multiple messages have this Message-ID (diff)
From: Dmitriy Monakhov <dmonakhov@openvz.org>
To: Andrew Morton <akpm@osdl.org>
Cc: Dmitriy Monakhov <dmonakhov@openvz.org>,
	linux-kernel@vger.kernel.org,
	Linux Memory Management <linux-mm@kvack.org>, <devel@openvz.org>,
	xfs@oss.sgi.com
Subject: Re: [PATCH]  incorrect error handling inside generic_file_direct_write
Date: Wed, 13 Dec 2006 02:14:18 +0300	[thread overview]
Message-ID: <87y7pcn1v9.fsf@sw.ru> (raw)
In-Reply-To: <20061212024027.6c2a79d3.akpm@osdl.org> (Andrew Morton's message of "Tue, 12 Dec 2006 02:40:27 -0800")

[-- Attachment #1: Type: text/plain, Size: 1535 bytes --]

Andrew Morton <akpm@osdl.org> writes:

> On Tue, 12 Dec 2006 16:18:32 +0300
> Dmitriy Monakhov <dmonakhov@sw.ru> wrote:
>
>> >> but according to filemaps locking rules: mm/filemap.c:77
>> >>  ..
>> >>  *  ->i_mutex			(generic_file_buffered_write)
>> >>  *    ->mmap_sem		(fault_in_pages_readable->do_page_fault)
>> >>  ..
>> >> I'm confused a litle bit, where is the truth? 
>> >
>> > xfs_write() calls generic_file_direct_write() without taking i_mutex for
>> > O_DIRECT writes.
>> Yes, but my quastion is about __generic_file_aio_write_nolock().
>> As i understand _nolock sufix means that i_mutex was already locked 
>> by caller, am i right ?
>
> Nope.  It just means that __generic_file_aio_write_nolock() doesn't take
> the lock.  We don't assume or require that the caller took it.  For example
> the raw driver calls generic_file_aio_write_nolock() without taking
> i_mutex.  Raw isn't relevant to the problem (although ocfs2 might be).  But
> we cannot assume that all callers have taken i_mutex, I think.
>
> I guess we can make that a rule (document it, add
> BUG_ON(!mutex_is_locked(..)) if it isn't a blockdev) if needs be.  After
> really checking that this matches reality for all callers.
I've checked generic_file_aio_write_nolock() callers for non blockdev.
Only ocfs2 call it explicitly, and do it under i_mutex.
So we need to do: 
1) Change wrong comments.
2) Add BUG_ON(!mutex_is_locked(..)) for non blkdev.
3) Invoke vmtruncate only for non blkdev.

Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
-------

[-- Attachment #2: direct-io-fix.patch --]
[-- Type: text/plain, Size: 2123 bytes --]

diff --git a/mm/filemap.c b/mm/filemap.c
index 7b84dc8..540ef5e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2046,8 +2046,8 @@ generic_file_direct_write(struct kiocb *
 	/*
 	 * Sync the fs metadata but not the minor inode changes and
 	 * of course not the data as we did direct DMA for the IO.
-	 * i_mutex is held, which protects generic_osync_inode() from
-	 * livelocking.
+	 * i_mutex may not being held, if so some specific locking
+	 * ordering must protect generic_osync_inode() from livelocking.
 	 */
 	if (written >= 0 && ((file->f_flags & O_SYNC) || IS_SYNC(inode))) {
 		int err = generic_osync_inode(inode, mapping, OSYNC_METADATA);
@@ -2282,6 +2282,17 @@ __generic_file_aio_write_nolock(struct k
 
 		written = generic_file_direct_write(iocb, iov, &nr_segs, pos,
 							ppos, count, ocount);
+		/*
+		 * If host is not S_ISBLK generic_file_direct_write() may 
+		 * have instantiated a few blocks outside i_size  files
+		 * Trim these off again.
+		 */
+		if (unlikely(written < 0) && !S_ISBLK(inode->i_mode)) {
+			loff_t isize = i_size_read(inode);
+			if (pos + count > isize)
+				vmtruncate(inode, isize);
+		}
+
 		if (written < 0 || written == count)
 			goto out;
 		/*
@@ -2344,6 +2355,13 @@ ssize_t generic_file_aio_write_nolock(st
 	ssize_t ret;
 
 	BUG_ON(iocb->ki_pos != pos);
+	/*
+	 *  generic_file_buffered_write() may be called inside 
+	 *  __generic_file_aio_write_nolock() even in case of
+	 *  O_DIRECT for non S_ISBLK files. So i_mutex must be held.
+	 */
+	if (!S_ISBLK(inode->i_mode))
+		BUG_ON(!mutex_is_locked(&inode->i_mutex));
 
 	ret = __generic_file_aio_write_nolock(iocb, iov, nr_segs,
 			&iocb->ki_pos);
@@ -2386,8 +2404,8 @@ ssize_t generic_file_aio_write(struct ki
 EXPORT_SYMBOL(generic_file_aio_write);
 
 /*
- * Called under i_mutex for writes to S_ISREG files.   Returns -EIO if something
- * went wrong during pagecache shootdown.
+ * May be called without i_mutex for writes to S_ISREG files.
+ * Returns -EIO if something went wrong during pagecache shootdown.
  */
 static ssize_t
 generic_file_direct_IO(int rw, struct kiocb *iocb, const struct iovec *iov,

  reply	other threads:[~2006-12-12 20:15 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-11 13:34 [PATCH] incorrect error handling inside generic_file_direct_write Dmitriy Monakhov
2006-12-11 13:34 ` Dmitriy Monakhov
2006-12-11 12:38 ` [Devel] " Kirill Korotaev
2006-12-11 12:38   ` Kirill Korotaev
2006-12-11 20:40 ` Andrew Morton
2006-12-11 20:40   ` Andrew Morton
2006-12-11 20:40   ` Andrew Morton
2006-12-12  9:22   ` Dmitriy Monakhov
2006-12-12  9:22     ` Dmitriy Monakhov
2006-12-12  9:22     ` Dmitriy Monakhov
2006-12-12  6:36     ` Andrew Morton
2006-12-12  6:36       ` Andrew Morton
2006-12-12  6:36       ` Andrew Morton
2006-12-12 12:20   ` Dmitriy Monakhov
2006-12-12 12:20     ` Dmitriy Monakhov
2006-12-12 12:20     ` Dmitriy Monakhov
2006-12-12  9:52     ` Andrew Morton
2006-12-12  9:52       ` Andrew Morton
2006-12-12  9:52       ` Andrew Morton
2006-12-12 13:18       ` Dmitriy Monakhov
2006-12-12 13:18         ` Dmitriy Monakhov
2006-12-12 10:40         ` Andrew Morton
2006-12-12 10:40           ` Andrew Morton
2006-12-12 10:40           ` Andrew Morton
2006-12-12 23:14           ` Dmitriy Monakhov [this message]
2006-12-12 23:14             ` Dmitriy Monakhov
2006-12-13  2:43           ` Chen, Kenneth W
2006-12-13  2:43             ` Chen, Kenneth W
2006-12-13  2:43             ` Chen, Kenneth W
2006-12-15 10:43             ` 'Christoph Hellwig'
2006-12-15 10:43               ` 'Christoph Hellwig'
2006-12-15 18:53               ` Chen, Kenneth W
2006-12-15 18:53                 ` Chen, Kenneth W
2006-12-15 18:53                 ` Chen, Kenneth W
2007-01-02 11:17                 ` 'Christoph Hellwig'
2007-01-02 11:17                   ` 'Christoph Hellwig'

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87y7pcn1v9.fsf@sw.ru \
    --to=dmonakhov@openvz.org \
    --cc=akpm@osdl.org \
    --cc=devel@openvz.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.