public inbox for linux-xfs@vger.kernel.org
* more pagecache invalidation issues?
@ 2014-09-11 21:14 Christoph Hellwig
  2014-09-11 22:03 ` Dave Chinner
  0 siblings, 1 reply; 2+ messages in thread
From: Christoph Hellwig @ 2014-09-11 21:14 UTC (permalink / raw)
  To: xfs

I just hit this with Linus' tree from a day or two ago when running
xfstests in my 64-bit x86 kvm VM:

[ 1810.820601] ------------[ cut here ]------------
[ 1810.821730] kernel BUG at ../fs/xfs/xfs_aops.c:1373!
[ 1810.822881] invalid opcode: 0000 [#1] SMP 
[ 1810.823177] Modules linked in:
[ 1810.823177] CPU: 0 PID: 5324 Comm: 4980.fsstress.b Not tainted 3.17.0-rc4+ #266
[ 1810.823177] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 1810.823177] task: ffff88004fedc910 ti: ffff88000b340000 task.ti: ffff88000b340000
[ 1810.823177] RIP: 0010:[<ffffffff8150139b>]  [<ffffffff8150139b>] __xfs_get_blocks+0x5cb/0x5d0
[ 1810.823177] RSP: 0018:ffff88000b343998  EFLAGS: 00010202
[ 1810.823177] RAX: ffff880079ddf580 RBX: 0000000000166000 RCX: ffff88004fedd0f8
[ 1810.823177] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000246
[ 1810.823177] RBP: ffff88000b343a38 R08: 0000000000000001 R09: 0000000000000000
[ 1810.823177] R10: 0000000000000000 R11: 00000000000785b0 R12: ffff88004863d9a0
[ 1810.823177] R13: ffff88004863d700 R14: ffff88000b343b50 R15: 0000000000000000
[ 1810.823177] FS:  00007ff4401c6700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[ 1810.823177] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1810.823177] CR2: 00007ff4400c2008 CR3: 000000004fed3000 CR4: 00000000000006f0
[ 1810.823177] Stack:
[ 1810.823177]  ffff88000b343a18 0000000000005000 ffff88000b3439c8 ffff880000000000
[ 1810.823177]  ffff880000000008 0000000000000166 000188004ff2e940 0000000000005000
[ 1810.823177]  ffff88000b343a18 0000000100000202 000000000000015e ffffffffffffffff
[ 1810.823177] Call Trace:
[ 1810.823177]  [<ffffffff815013af>] xfs_get_blocks_direct+0xf/0x20
[ 1810.823177]  [<ffffffff811f3ffe>] __blockdev_direct_IO+0x9ee/0x3340
[ 1810.823177]  [<ffffffff815013a0>] ? __xfs_get_blocks+0x5d0/0x5d0
[ 1810.823177]  [<ffffffff814ff7d0>] xfs_vm_direct_IO+0x130/0x150
[ 1810.823177]  [<ffffffff815013a0>] ? __xfs_get_blocks+0x5d0/0x5d0
[ 1810.823177]  [<ffffffff8117116a>] generic_file_read_iter+0x54a/0x610
[ 1810.823177]  [<ffffffff810f5a8a>] ? mark_held_locks+0x6a/0x90
[ 1810.823177]  [<ffffffff8150c6e9>] xfs_file_read_iter+0xf9/0x2b0
[ 1810.823177]  [<ffffffff81193e3e>] ? might_fault+0x3e/0x90
[ 1810.823177]  [<ffffffff811b9b69>] new_sync_read+0x79/0xb0
[ 1810.823177]  [<ffffffff811bac6b>] vfs_read+0x9b/0x190
[ 1810.823177]  [<ffffffff811baf11>] SyS_read+0x51/0xc0
[ 1810.823177]  [<ffffffff81d9f6e9>] system_call_fastpath+0x16/0x1b

The BUG_ON is this one:

	if (imap.br_startblock == DELAYSTARTBLOCK) {
		BUG_ON(direct);
		if (create) {
			..

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: more pagecache invalidation issues?
  2014-09-11 21:14 more pagecache invalidation issues? Christoph Hellwig
@ 2014-09-11 22:03 ` Dave Chinner
  0 siblings, 0 replies; 2+ messages in thread
From: Dave Chinner @ 2014-09-11 22:03 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Thu, Sep 11, 2014 at 02:14:43PM -0700, Christoph Hellwig wrote:
> I just hit this with Linus' tree from a day or two ago when running
> xfstests in my 64-bit x86 kvm VM:
> 
> [ oops trace snipped ]
> 
> The BUG_ON is this one:
> 
> 	if (imap.br_startblock == DELAYSTARTBLOCK) {
> 		BUG_ON(direct);
> 		if (create) {
> 			..

That's a symptom of the problem I've been chasing for the past *18
months*. Every time we fix another bunch of bufferhead coherency
bugs, I hope that it goes away. It hasn't, and Brian's latest set of
collapse_range fixes have made it substantially worse on my test
machines. However, Brian has a simple test case we are discussing on
#xfs right now that reproduces one of the issues, and again it looks
like stray delalloc blocks and/or dirty buffers beyond EOF are the
source of the problems.

We're slowly fixing the problems we find, but the frequency with
which that bug is hit waxes and wanes as time goes on. In reality
we still haven't found the root cause, because it has been so hard
to reproduce reliably....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

