public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* Re: BUG: warning at mm/truncate.c:60/cancel_dirty_page()
       [not found] ` <Pine.LNX.4.64.0701062051570.24997@blonde.wat.veritas.com>
@ 2007-01-07 22:23   ` David Chinner
  2007-01-07 22:48     ` Andrew Morton
  0 siblings, 1 reply; 4+ messages in thread
From: David Chinner @ 2007-01-07 22:23 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Sami Farin, Nathan Scott, xfs, Nick Piggin, linux-kernel

On Sat, Jan 06, 2007 at 09:11:07PM +0000, Hugh Dickins wrote:
> On Sat, 6 Jan 2007, Sami Farin wrote:
> 
> > Linux 2.6.19.1 SMP [2] on Pentium D...
> > I was running dt-15.14 [2] and I ran
> > "cinfo datafile" (it does mincore()).
> > Well it went OK but when I ran "strace cinfo datafile"...:
> > 04:18:48.062466 mincore(0x37f1f000, 2147266560, 
> 
> You rightly noted in a followup that there have been changes to
> mincore, but I doubt they have any bearing on this: I think the
> BUG just happened at the same time as your mincore.
> 
> > ...
> > 2007-01-06 04:19:03.788181500 <4>BUG: warning at mm/truncate.c:60/cancel_dirty_page()
> > 2007-01-06 04:19:03.788221500 <4> [<c0103cfb>] dump_trace+0x215/0x21a
> > 2007-01-06 04:19:03.788223500 <4> [<c0103da3>] show_trace_log_lvl+0x1a/0x30
> > 2007-01-06 04:19:03.788224500 <4> [<c0103dcb>] show_trace+0x12/0x14
> > 2007-01-06 04:19:03.788225500 <4> [<c0103ec8>] dump_stack+0x19/0x1b
> > 2007-01-06 04:19:03.788227500 <4> [<c01546a6>] cancel_dirty_page+0x7e/0x80
> > 2007-01-06 04:19:03.788228500 <4> [<c01546c2>] truncate_complete_page+0x1a/0x47
> > 2007-01-06 04:19:03.788229500 <4> [<c0154854>] truncate_inode_pages_range+0x114/0x2ae
> > 2007-01-06 04:19:03.788245500 <4> [<c0154a08>] truncate_inode_pages+0x1a/0x1c
> > 2007-01-06 04:19:03.788247500 <4> [<c0269244>] fs_flushinval_pages+0x40/0x77
> > 2007-01-06 04:19:03.788248500 <4> [<c026d48c>] xfs_write+0x8c4/0xb68
> > 2007-01-06 04:19:03.788250500 <4> [<c0268b14>] xfs_file_aio_write+0x7e/0x95
> > 2007-01-06 04:19:03.788251500 <4> [<c016d66c>] do_sync_write+0xca/0x119
> > 2007-01-06 04:19:03.788265500 <4> [<c016d842>] vfs_write+0x187/0x18c
> > 2007-01-06 04:19:03.788267500 <4> [<c016d8e8>] sys_write+0x3d/0x64
> > 2007-01-06 04:19:03.788268500 <4> [<c0102e73>] syscall_call+0x7/0xb
> > 2007-01-06 04:19:03.788269500 <4> [<001cf410>] 0x1cf410
> > 2007-01-06 04:19:03.788289500 <4> =======================
> 
> So... XFS uses truncate_inode_pages when serving the write system call.

Only when you are doing direct I/O. XFS does direct writes without
the i_mutex held, so it has to invalidate the range of cached pages
while holding it's own locks to ensure direct I/O cache semantics are
kept.

> That's very inventive,

Not really - been doing it for years.

> and now it looks like Linus' cancel_dirty_page
> and new warning have caught it out.  VM people expect it to be called
> either when freeing an inode no longer in use, or when doing a truncate,
> after ensuring that all pages mapped into userspace have been taken out.

Ok, so we are punching a hole in the middle of the address space
because we are doing direct I/O on it and need to invalidate the
cache.

How are you supposed to invalidate a range of pages in a mapping for
this case, then? invalidate_mapping_pages() would appear to be the
candidate (the generic code uses this), but it _skips_ pages that
are already mapped.  invalidate_mapping_pages() then advises you to
use truncate_inode_pages():

/**
 * invalidate_mapping_pages - Invalidate all the unlocked pages of one inode
 * @mapping: the address_space which holds the pages to invalidate
 * @start: the offset 'from' which to invalidate
 * @end: the offset 'to' which to invalidate (inclusive)
 *
 * This function only removes the unlocked pages, if you want to
 * remove all the pages of one inode, you must call truncate_inode_pages.
 *
 * invalidate_mapping_pages() will not block on IO activity. It will not
 * invalidate pages which are dirty, locked, under writeback or mapped into
 * pagetables.
 */

We want to remove all pages within the range given, so, as directed by
the comment here, we use truncate_inode_pages(). Says nothing about
mappings needing to be removed first so I guess that's where we've
been caught.....

I think we can use invalidate_inode_pages2_range(), but that doesn't
handle partial page invalidations. I think this will be ok, but it's
going to need some serious fsx testing on blocksize != page size
configs.

So, am I correct in assuming we should be calling invalidate_inode_pages2_range()
instead of truncate_inode_pages()?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: BUG: warning at mm/truncate.c:60/cancel_dirty_page()
  2007-01-07 22:23   ` BUG: warning at mm/truncate.c:60/cancel_dirty_page() David Chinner
@ 2007-01-07 22:48     ` Andrew Morton
  2007-01-07 23:04       ` David Chinner
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2007-01-07 22:48 UTC (permalink / raw)
  To: David Chinner
  Cc: Hugh Dickins, Sami Farin, Nathan Scott, xfs, Nick Piggin,
	linux-kernel

On Mon, 8 Jan 2007 09:23:41 +1100
David Chinner <dgc@sgi.com> wrote:

> How are you supposed to invalidate a range of pages in a mapping for
> this case, then? invalidate_mapping_pages() would appear to be the
> candidate (the generic code uses this), but it _skips_ pages that
> are already mapped.

unmap_mapping_range()?

> So, am I correct in assuming we should be calling invalidate_inode_pages2_range()
> instead of truncate_inode_pages()?

That would be conventional.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: BUG: warning at mm/truncate.c:60/cancel_dirty_page()
  2007-01-07 22:48     ` Andrew Morton
@ 2007-01-07 23:04       ` David Chinner
  2007-01-08 11:58         ` Sami Farin
  0 siblings, 1 reply; 4+ messages in thread
From: David Chinner @ 2007-01-07 23:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Chinner, Hugh Dickins, Sami Farin, xfs, Nick Piggin,
	linux-kernel

On Sun, Jan 07, 2007 at 02:48:12PM -0800, Andrew Morton wrote:
> On Mon, 8 Jan 2007 09:23:41 +1100
> David Chinner <dgc@sgi.com> wrote:
> 
> > How are you supposed to invalidate a range of pages in a mapping for
> > this case, then? invalidate_mapping_pages() would appear to be the
> > candidate (the generic code uses this), but it _skips_ pages that
> > are already mapped.
> 
> unmap_mapping_range()?

/me looks at how it's used in invalidate_inode_pages2_range() and
decides it's easier not to call this directly.

> > So, am I correct in assuming we should be calling invalidate_inode_pages2_range()
> > instead of truncate_inode_pages()?
> 
> That would be conventional.

.... in that case the following patch should fix the warning:

---
 fs/xfs/linux-2.6/xfs_fs_subr.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_fs_subr.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_fs_subr.c	2006-12-12 12:05:17.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_fs_subr.c	2007-01-08 09:30:22.056571711 +1100
@@ -21,6 +21,8 @@ int  fs_noerr(void) { return 0; }
 int  fs_nosys(void) { return ENOSYS; }
 void fs_noval(void) { return; }
 
+#define XFS_OFF_TO_PCSIZE(off)	\
+	(((off) + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT)
 void
 fs_tosspages(
 	bhv_desc_t	*bdp,
@@ -32,7 +34,9 @@ fs_tosspages(
 	struct inode	*ip = vn_to_inode(vp);
 
 	if (VN_CACHED(vp))
-		truncate_inode_pages(ip->i_mapping, first);
+		invalidate_inode_pages2_range(ip->i_mapping,
+					XFS_OFF_TO_PCSIZE(first),
+					XFS_OFF_TO_PCSIZE(last));
 }
 
 void
@@ -49,7 +53,9 @@ fs_flushinval_pages(
 		if (VN_TRUNC(vp))
 			VUNTRUNCATE(vp);
 		filemap_write_and_wait(ip->i_mapping);
-		truncate_inode_pages(ip->i_mapping, first);
+		invalidate_inode_pages2_range(ip->i_mapping,
+					XFS_OFF_TO_PCSIZE(first),
+					XFS_OFF_TO_PCSIZE(last));
 	}
 }
 

-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: BUG: warning at mm/truncate.c:60/cancel_dirty_page()
  2007-01-07 23:04       ` David Chinner
@ 2007-01-08 11:58         ` Sami Farin
  0 siblings, 0 replies; 4+ messages in thread
From: Sami Farin @ 2007-01-08 11:58 UTC (permalink / raw)
  To: xfs, linux-kernel; +Cc: Andrew Morton, David Chinner, Hugh Dickins, Nick Piggin

On Mon, Jan 08, 2007 at 10:04:36 +1100, David Chinner wrote:
> On Sun, Jan 07, 2007 at 02:48:12PM -0800, Andrew Morton wrote:
> > On Mon, 8 Jan 2007 09:23:41 +1100
> > David Chinner <dgc@sgi.com> wrote:
> > 
> > > How are you supposed to invalidate a range of pages in a mapping for
> > > this case, then? invalidate_mapping_pages() would appear to be the
> > > candidate (the generic code uses this), but it _skips_ pages that
> > > are already mapped.
> > 
> > unmap_mapping_range()?
> 
> /me looks at how it's used in invalidate_inode_pages2_range() and
> decides it's easier not to call this directly.
> 
> > > So, am I correct in assuming we should be calling invalidate_inode_pages2_range()
> > > instead of truncate_inode_pages()?
> > 
> > That would be conventional.
> 
> .... in that case the following patch should fix the warning:

I tried dt+strace+cinfo with this patch applied and got no warnings.
Thanks for quick fix.

-- 
Do what you love because life is too short for anything else.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2007-01-08 13:08 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20070106023907.GA7766@m.safari.iki.fi>
     [not found] ` <Pine.LNX.4.64.0701062051570.24997@blonde.wat.veritas.com>
2007-01-07 22:23   ` BUG: warning at mm/truncate.c:60/cancel_dirty_page() David Chinner
2007-01-07 22:48     ` Andrew Morton
2007-01-07 23:04       ` David Chinner
2007-01-08 11:58         ` Sami Farin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox