warning in ext4_journal_start_sb on filesystem freeze


* warning in ext4_journal_start_sb on filesystem freeze
       [not found] <217983071.143460.1385453196946.JavaMail.zimbra@rapitasystems.com>
@ 2013-11-26  8:20 ` Matthew Rahtz
  2013-11-26 12:58   ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Matthew Rahtz @ 2013-11-26  8:20 UTC (permalink / raw)
  To: linux-ext4

Hello, 

We're using qemu's guest agent daemon, qemu-ga, to freeze ext4 filesystems in guest virtual machines before taking an LVM snapshot of the disk volume in the host. However, in the guests' dmesg, we're consistently seeing warnings like: 

[1246478.632936] WARNING: at /build/buildd/linux-lts-raring-3.8.0/fs/ext4/super.c:339 ext4_journal_start_sb+0x159/0x160() 

Looking at the source at https://github.com/torvalds/linux/blob/v3.8/fs/ext4/super.c#L339, this warning seems to be generated if the function is reached despite the filesystem being marked as frozen:

WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);

In 3.12, this has been moved to https://github.com/torvalds/linux/blob/v3.12/fs/ext4/ext4_jbd2.c#L48.
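
For readers who do not have the kernel tree at hand, here is a minimal user-space sketch of what that check amounts to. It is a model under stated assumptions, not the ext4 code: `model_sb` and `model_journal_start` are hypothetical names, and the enum only mirrors the order of the kernel's freeze levels. The point it tries to show is that a WARN_ON-style check reports the condition but does not stop the caller.

```c
#include <assert.h>
#include <stdio.h>

/* Freeze levels, mirroring the order used by the kernel's VFS freeze code. */
enum freeze_level {
	SB_UNFROZEN = 0,
	SB_FREEZE_WRITE,
	SB_FREEZE_PAGEFAULT,
	SB_FREEZE_FS,
	SB_FREEZE_COMPLETE,
};

/* Hypothetical stand-in for struct super_block; only what the model needs. */
struct model_sb {
	enum freeze_level frozen;
	unsigned int warnings;	/* how many times the check has fired */
};

/*
 * Model of the check: starting a journal handle on a completely frozen
 * filesystem means some caller skipped freeze protection, so report it.
 * As with WARN_ON(), this only reports; the handle is still handed out.
 */
static int model_journal_start(struct model_sb *sb)
{
	if (sb->frozen == SB_FREEZE_COMPLETE) {
		sb->warnings++;
		fprintf(stderr, "WARNING: journal start on frozen filesystem\n");
	}
	return 0;
}
```

In this model a write that reaches `model_journal_start()` after the freeze has completed still goes ahead, which is why a snapshot taken at that moment may not be consistent.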

Is this something we should be concerned about? The process that seems to be responsible for triggering it is mysqld, so we're concerned the databases in our snapshots have a higher possibility of being corrupt. (Taking online snapshots of databases like this is always risky, of course, but this just makes us a little more nervous :) ) Full kernel warning is attached below.

Thank you!


[1246478.632930] ------------[ cut here ]------------
[1246478.632936] WARNING: at /build/buildd/linux-lts-raring-3.8.0/fs/ext4/super.c:339 ext4_journal_start_sb+0x159/0x160()
[1246478.632938] Hardware name: Bochs
[1246478.632939] Modules linked in: cirrus(F) ttm(F) drm_kms_helper(F) drm(F) sysimgblt(F) psmouse(F) sysfillrect(F) serio_raw(F) syscopyarea(F) microcode(F) virtio_console(F) lp(F) virtio_balloon(F) mac_hid(F) i2c_piix4(F) ext2(F) parport(F) floppy(F) e1000(F)
[1246478.632973] Pid: 2856, comm: mysqld Tainted: GF       W    3.8.0-33-generic #48~precise1-Ubuntu
[1246478.632975] Call Trace:
[1246478.632981]  [<ffffffff81059b6f>] warn_slowpath_common+0x7f/0xc0
[1246478.632985]  [<ffffffff81059bca>] warn_slowpath_null+0x1a/0x20
[1246478.632989]  [<ffffffff8125eb59>] ext4_journal_start_sb+0x159/0x160
[1246478.632993]  [<ffffffff8123f1c8>] ? _ext4_get_block+0x138/0x170
[1246478.632997]  [<ffffffff8123f1c8>] _ext4_get_block+0x138/0x170
[1246478.633002]  [<ffffffff8104e070>] ? get_user_pages_fast+0xe0/0x1a0
[1246478.633006]  [<ffffffff8123f263>] ext4_get_block_write+0x13/0x20
[1246478.633009]  [<ffffffff811d6d3a>] get_more_blocks+0x6a/0xa0
[1246478.633013]  [<ffffffff811d7a7e>] do_direct_IO+0x4be/0x1530
[1246478.633018]  [<ffffffff8107f9ab>] ? bit_waitqueue+0x1b/0xc0
[1246478.633022]  [<ffffffff81186221>] ? kmem_cache_alloc+0x31/0x140
[1246478.633026]  [<ffffffff811d8f22>] do_blockdev_direct_IO+0x432/0x13e0
[1246478.633030]  [<ffffffff8123f250>] ? noalloc_get_block_write+0x30/0x30
[1246478.633035]  [<ffffffff811d9f25>] __blockdev_direct_IO+0x55/0x60
[1246478.633039]  [<ffffffff8123f250>] ? noalloc_get_block_write+0x30/0x30
[1246478.633042]  [<ffffffff8123ab30>] ? ext4_journalled_invalidatepage+0x30/0x30
[1246478.633046]  [<ffffffff8123bcd0>] ext4_ext_direct_IO+0x130/0x250
[1246478.633050]  [<ffffffff8123f250>] ? noalloc_get_block_write+0x30/0x30
[1246478.633053]  [<ffffffff8123ab30>] ? ext4_journalled_invalidatepage+0x30/0x30
[1246478.633057]  [<ffffffff8123c1ad>] ext4_direct_IO+0x1ad/0x230
[1246478.633061]  [<ffffffff8108e3ca>] ? finish_task_switch+0x4a/0xf0
[1246478.633065]  [<ffffffff811368d6>] generic_file_direct_write+0xc6/0x180
[1246478.633068]  [<ffffffff81136c6d>] __generic_file_aio_write+0x2dd/0x3b0
[1246478.633072]  [<ffffffff816e5848>] ext4_file_dio_write+0x243/0x320
[1246478.633076]  [<ffffffff810b81b2>] ? unqueue_me+0x52/0x80
[1246478.633079]  [<ffffffff81236ed8>] ext4_file_write+0xc8/0xe0
[1246478.633084]  [<ffffffff8119b333>] do_sync_write+0xa3/0xe0
[1246478.633089]  [<ffffffff8119b9d3>] vfs_write+0xb3/0x180
[1246478.633093]  [<ffffffff8119be9a>] sys_pwrite64+0x9a/0xa0
[1246478.633097]  [<ffffffff816fd15d>] system_call_fastpath+0x1a/0x1f
[1246478.633099] ---[ end trace f37019187d44de90 ]---
Please Note: Rapita Systems has a new address and telephone number.
Telephone: +44 1904 413945
Address: Rapita Systems Ltd, Atlas House,
          Osbaldwick Link Road, YORK, YO10 3JB
          United Kingdom


* Re: warning in ext4_journal_start_sb on filesystem freeze
  2013-11-26  8:20 ` warning in ext4_journal_start_sb on filesystem freeze Matthew Rahtz
@ 2013-11-26 12:58   ` Jan Kara
  2014-02-22  9:50     ` Matthew Rahtz
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2013-11-26 12:58 UTC (permalink / raw)
  To: Matthew Rahtz; +Cc: linux-ext4

  Hello,

On Tue 26-11-13 08:20:51, Matthew Rahtz wrote:
> We're using qemu's guest agent daemon, qemu-ga, to freeze ext4
> filesystems in guest virtual machines before taking an LVM snapshot of
> the disk volume in the host. However, in the guests' dmesg, we're
> consistently seeing warnings like: 
> 
> [1246478.632936] WARNING: at /build/buildd/linux-lts-raring-3.8.0/fs/ext4/super.c:339 ext4_journal_start_sb+0x159/0x160() 
> 
> Looking at the source at
> https://github.com/torvalds/linux/blob/v3.8/fs/ext4/super.c#L339, this
> warning seems to be generated if the function is reached despite the
> filesystem being marked as frozen:
> 
> WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);
> 
> In 3.12, this has been moved to
> https://github.com/torvalds/linux/blob/v3.12/fs/ext4/ext4_jbd2.c#L48.
> 
> Is this something we should be concerned about? The process that seems to
> be responsible for triggering it is mysqld, so we're concerned the
> databases in our snapshots have a higher possibility of being corrupt.
> (Taking online snapshots of databases like this is always risky, of
> course, but this just makes us a little more nervous :) ) Full kernel
> warning is attached below.
  Yes, it's a bug in the 3.8 kernel, which was fixed by commit
03d95eb2f2578083a3f6286262e1cb5d88a00c02 (merged in 3.10). Looking at the
code, there really is a chance the filesystem will be inconsistent because
of that bug, so you may be better off updating to a kernel that has the fix
if you rely heavily on the snapshots.

								Honza

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: warning in ext4_journal_start_sb on filesystem freeze
  2013-11-26 12:58   ` Jan Kara
@ 2014-02-22  9:50     ` Matthew Rahtz
       [not found]       ` <622177618.727.1393062606061.JavaMail.zimbra-lFL+a/sBLVi/3pe1ocb+swC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Matthew Rahtz @ 2014-02-22  9:50 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4

Thanks for your help Jan,

A few months later, we've noticed the issue is actually still there. Using 3.11.0-17-generic on Ubuntu 12.04, we’re seeing this in the kernel logs:

[29243.606215] WARNING: CPU: 0 PID: 1785 at /build/buildd/linux-lts-saucy-3.11.0/fs/ext4/ext4_jbd2.c:48 ext4_journal_check_start+0x83/0x90()

Having a look at the Ubuntu source package for that version, it definitely does include commit 03d95eb2f2578083a3f6286262e1cb5d88a00c02, and the line generating the warning is still:

WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);

Are there any other obvious possibilities for what may be causing this? There seem to be some users of Oracle Linux experiencing similar problems at https://community.oracle.com/thread/2617418, which was apparently fixed in Oracle's kernel version '3.8.13-26.el6uek'. Any word on when this might be integrated into the official kernel?

Full call trace included below.

Thanks again!
Matthew

[29243.606212] ------------[ cut here ]------------
[29243.606215] WARNING: CPU: 0 PID: 1785 at /build/buildd/linux-lts-saucy-3.11.0/fs/ext4/ext4_jbd2.c:48 ext4_journal_check_start+0x83/0x90()
[29243.606216] Modules linked in: parport_pc ppdev nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc ext2 cirrus ttm drm_kms_helper drm sysimgblt psmouse i2c_piix4 virtio_balloon sysfillrect mac_hid serio_raw syscopyarea virtio_console lp parport floppy
[29243.606227] CPU: 0 PID: 1785 Comm: nfsd Tainted: G        W    3.11.0-17-generic #31~precise1-Ubuntu
[29243.606228] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[29243.606228]  0000000000000030 ffff8801162f3b08 ffffffff8173c72d 0000000000000007
[29243.606230]  0000000000000000 ffff8801162f3b48 ffffffff8106540c 0000000000000000
[29243.606232]  ffff880114892800 0000000000000007 0000000000000068 0000000000000000
[29243.606235] Call Trace:
[29243.606237]  [<ffffffff8173c72d>] dump_stack+0x46/0x58
[29243.606239]  [<ffffffff8106540c>] warn_slowpath_common+0x8c/0xc0
[29243.606241]  [<ffffffff8106545a>] warn_slowpath_null+0x1a/0x20
[29243.606244]  [<ffffffff8127ebb3>] ext4_journal_check_start+0x83/0x90
[29243.606246]  [<ffffffff8127ec35>] __ext4_journal_start_sb+0x45/0x100
[29243.606249]  [<ffffffff81258a03>] ? ext4_dirty_inode+0x33/0x70
[29243.606251]  [<ffffffff81258a03>] ext4_dirty_inode+0x33/0x70
[29243.606254]  [<ffffffff811de348>] __mark_inode_dirty+0x48/0x350
[29243.606256]  [<ffffffff81256b53>] ext4_setattr+0x1b3/0x5b0
[29243.606259]  [<ffffffff811d0903>] notify_change+0x1d3/0x390
[29243.606263]  [<ffffffffa01c7fe2>] nfsd_setattr+0x232/0x2a0 [nfsd]
[29243.606267]  [<ffffffffa01d00f6>] nfsd3_proc_setattr+0x76/0xc0 [nfsd]
[29243.606271]  [<ffffffffa01c0d85>] nfsd_dispatch+0xe5/0x230 [nfsd]
[29243.606283]  [<ffffffffa0128465>] svc_process_common+0x345/0x680 [sunrpc]
[29243.606289]  [<ffffffffa0128af3>] svc_process+0x103/0x160 [sunrpc]
[29243.606293]  [<ffffffffa01c08df>] nfsd+0xbf/0x130 [nfsd]
[29243.606297]  [<ffffffffa01c0820>] ? nfsd_destroy+0x80/0x80 [nfsd]
[29243.606299]  [<ffffffff81089170>] kthread+0xc0/0xd0
[29243.606302]  [<ffffffff810890b0>] ? flush_kthread_worker+0xb0/0xb0
[29243.606304]  [<ffffffff8175122c>] ret_from_fork+0x7c/0xb0
[29243.606307]  [<ffffffff810890b0>] ? flush_kthread_worker+0xb0/0xb0
[29243.606308] ---[ end trace e9d4726f92c62d43 ]---

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: warning in ext4_journal_start_sb on filesystem freeze
       [not found]       ` <622177618.727.1393062606061.JavaMail.zimbra-lFL+a/sBLVi/3pe1ocb+swC/G2K4zDHf@public.gmane.org>
@ 2014-02-24  9:55         ` Jan Kara
       [not found]           ` <20140224095525.GA20532-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2014-02-24  9:55 UTC (permalink / raw)
  To: Matthew Rahtz
  Cc: Jan Kara, linux-ext4, linux-nfs, J. Bruce Fields

On Sat 22-02-14 09:50:06, Matthew Rahtz wrote:
> Thanks for your help Jan,
> 
> A few months later, we've noticed the issue is actually still there.
> Using 3.11.0-17-generic on Ubuntu 12.04, we’re seeing this in the kernel
> logs:
> 
> [29243.606215] WARNING: CPU: 0 PID: 1785 at
> /build/buildd/linux-lts-saucy-3.11.0/fs/ext4/ext4_jbd2.c:48
> ext4_journal_check_start+0x83/0x90()
> 
> Having a look at the Ubuntu source package for that version, it
> definitely does include commit 03d95eb2f2578083a3f6286262e1cb5d88a00c02,
> and the line generating the warning is still:
> 
> WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);
> 
> Are there any other obvious possibilities for what may be causing this?
> There seem to be some users of Oracle Linux experiencing similar problems
> at https://community.oracle.com/thread/2617418, which was apparently
> fixed in Oracle's kernel version '3.8.13-26.el6uek'. Any word on when
> this might be integrated into the official kernel?
> 
> Full call trace included below.
  Looking at the trace below, the problem now seems to be in the NFS server
code. NFS should get protection against the filesystem being frozen (or
remounted read-only, for that matter) via mnt_want_write() before calling
into notify_change() (actually before calling fh_lock(), because of lock
ordering), similar to what we do in e.g. fchownat(). Bruce?

								Honza
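
The ordering described above can be sketched in user space as follows. This is a simplified model, not kernel code: `model_mnt`, `model_want_write`, `model_drop_write`, `model_freeze` and `model_setattr` are hypothetical names, and where the kernel would sleep the model returns an error instead. It only illustrates the contract: take the write count before touching the inode, and a freeze cannot complete while a writer still holds it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, much simplified model of a mount's write protection. */
struct model_mnt {
	bool frozen;		/* a freeze has completed */
	bool read_only;		/* mounted or remounted read-only */
	int writers;		/* callers currently holding write access */
};

/* Models the idea behind mnt_want_write(): refuse when writes are not allowed. */
static int model_want_write(struct model_mnt *m)
{
	if (m->read_only)
		return -EROFS;
	if (m->frozen)
		return -EAGAIN;	/* the kernel would sleep until thaw instead */
	m->writers++;
	return 0;
}

static void model_drop_write(struct model_mnt *m)
{
	assert(m->writers > 0);
	m->writers--;
}

/* A freeze may only complete once every writer has dropped its count. */
static int model_freeze(struct model_mnt *m)
{
	if (m->writers > 0)
		return -EBUSY;	/* the kernel would wait for writers to drain */
	m->frozen = true;
	return 0;
}

/* Attribute change done the protected way: write count first, then the change. */
static int model_setattr(struct model_mnt *m, int *attr, int value)
{
	int err = model_want_write(m);

	if (err)
		return err;
	/* fh_lock(), notify_change() and fh_unlock() would sit here. */
	*attr = value;
	model_drop_write(m);
	return 0;
}
```

A caller that skips `model_want_write()` and writes `*attr` directly is the situation in the trace: nothing stops it from modifying the filesystem after `model_freeze()` has succeeded.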

> [29243.606212] ------------[ cut here ]------------
> [29243.606215] WARNING: CPU: 0 PID: 1785 at /build/buildd/linux-lts-saucy-3.11.0/fs/ext4/ext4_jbd2.c:48 ext4_journal_check_start+0x83/0x90()
> [29243.606216] Modules linked in: parport_pc ppdev nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc ext2 cirrus ttm drm_kms_helper drm sysimgblt psmouse i2c_piix4 virtio_balloon sysfillrect mac_hid serio_raw syscopyarea virtio_console lp parport floppy
> [29243.606227] CPU: 0 PID: 1785 Comm: nfsd Tainted: G        W    3.11.0-17-generic #31~precise1-Ubuntu
> [29243.606228] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [29243.606228]  0000000000000030 ffff8801162f3b08 ffffffff8173c72d 0000000000000007
> [29243.606230]  0000000000000000 ffff8801162f3b48 ffffffff8106540c 0000000000000000
> [29243.606232]  ffff880114892800 0000000000000007 0000000000000068 0000000000000000
> [29243.606235] Call Trace:
> [29243.606237]  [<ffffffff8173c72d>] dump_stack+0x46/0x58
> [29243.606239]  [<ffffffff8106540c>] warn_slowpath_common+0x8c/0xc0
> [29243.606241]  [<ffffffff8106545a>] warn_slowpath_null+0x1a/0x20
> [29243.606244]  [<ffffffff8127ebb3>] ext4_journal_check_start+0x83/0x90
> [29243.606246]  [<ffffffff8127ec35>] __ext4_journal_start_sb+0x45/0x100
> [29243.606249]  [<ffffffff81258a03>] ? ext4_dirty_inode+0x33/0x70
> [29243.606251]  [<ffffffff81258a03>] ext4_dirty_inode+0x33/0x70
> [29243.606254]  [<ffffffff811de348>] __mark_inode_dirty+0x48/0x350
> [29243.606256]  [<ffffffff81256b53>] ext4_setattr+0x1b3/0x5b0
> [29243.606259]  [<ffffffff811d0903>] notify_change+0x1d3/0x390
> [29243.606263]  [<ffffffffa01c7fe2>] nfsd_setattr+0x232/0x2a0 [nfsd]
> [29243.606267]  [<ffffffffa01d00f6>] nfsd3_proc_setattr+0x76/0xc0 [nfsd]
> [29243.606271]  [<ffffffffa01c0d85>] nfsd_dispatch+0xe5/0x230 [nfsd]
> [29243.606283]  [<ffffffffa0128465>] svc_process_common+0x345/0x680 [sunrpc]
> [29243.606289]  [<ffffffffa0128af3>] svc_process+0x103/0x160 [sunrpc]
> [29243.606293]  [<ffffffffa01c08df>] nfsd+0xbf/0x130 [nfsd]
> [29243.606297]  [<ffffffffa01c0820>] ? nfsd_destroy+0x80/0x80 [nfsd]
> [29243.606299]  [<ffffffff81089170>] kthread+0xc0/0xd0
> [29243.606302]  [<ffffffff810890b0>] ? flush_kthread_worker+0xb0/0xb0
> [29243.606304]  [<ffffffff8175122c>] ret_from_fork+0x7c/0xb0
> [29243.606307]  [<ffffffff810890b0>] ? flush_kthread_worker+0xb0/0xb0
> [29243.606308] ---[ end trace e9d4726f92c62d43 ]---
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: warning in ext4_journal_start_sb on filesystem freeze
       [not found]           ` <20140224095525.GA20532-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
@ 2014-02-24 15:45             ` J. Bruce Fields
       [not found]               ` <20140224154532.GB11992-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: J. Bruce Fields @ 2014-02-24 15:45 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Rahtz, linux-ext4, linux-nfs

On Mon, Feb 24, 2014 at 10:55:25AM +0100, Jan Kara wrote:
>   Looking at the trace below, now the problem seems to be in the NFS server
> code. NFS should get protection against the filesystem being frozen (or
> remounted read-only for that matter) via mnt_want_write() before calling
> into notify_change() (actually before calling fh_lock() because of lock
> ordering).  Similarly to what we do e.g. in fchownat(). Bruce?

Like this?

But I wonder why this is just popping up now--as far as I can tell we've
had the bug since those write counts were introduced.

--b.

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 6d7be3f..d573b61 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -445,12 +445,16 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 		err = nfserr_notsync;
 		goto out_put_write_access;
 	}
+	host_err = fh_want_write(fhp);
+	if (host_err)
+		goto out_nfserr;
 
 	fh_lock(fhp);
 	host_err = notify_change(dentry, iap, NULL);
 	fh_unlock(fhp);
+	fh_drop_write(fhp);
+out_nfserr:
 	err = nfserrno(host_err);
-
 out_put_write_access:
 	if (size_change)
 		put_write_access(inode);

* Re: warning in ext4_journal_start_sb on filesystem freeze
       [not found]               ` <20140224154532.GB11992-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
@ 2014-02-25 10:21                 ` Jan Kara
       [not found]                   ` <20140225102126.GB1669-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2014-02-25 10:21 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Jan Kara, Matthew Rahtz, linux-ext4, linux-nfs

On Mon 24-02-14 10:45:32, J. Bruce Fields wrote:
> On Mon, Feb 24, 2014 at 10:55:25AM +0100, Jan Kara wrote:
> >   Looking at the trace below, now the problem seems to be in the NFS server
> > code. NFS should get protection against the filesystem being frozen (or
> > remounted read-only for that matter) via mnt_want_write() before calling
> > into notify_change() (actually before calling fh_lock() because of lock
> > ordering).  Similarly to what we do e.g. in fchownat(). Bruce?
> 
> Like this?
  Yup, that looks right.

> But I wonder why this is just popping up now--as far as I can tell we've
> had the bug since those write counts were introduced.
  Yeah, I'm wondering as well. NFS server on ext4 should have been
complaining for a long time.

								Honza

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

* Re: warning in ext4_journal_start_sb on filesystem freeze
       [not found]                   ` <20140225102126.GB1669-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
@ 2014-03-04 16:43                     ` J. Bruce Fields
       [not found]                       ` <20140304164306.GC12805-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: J. Bruce Fields @ 2014-03-04 16:43 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Rahtz, linux-ext4, linux-nfs

On Tue, Feb 25, 2014 at 11:21:26AM +0100, Jan Kara wrote:
> On Mon 24-02-14 10:45:32, J. Bruce Fields wrote:
> > On Mon, Feb 24, 2014 at 10:55:25AM +0100, Jan Kara wrote:
> > >   Looking at the trace below, now the problem seems to be in the NFS server
> > > code. NFS should get protection against the filesystem being frozen (or
> > > remounted read-only for that matter) via mnt_want_write() before calling
> > > into notify_change() (actually before calling fh_lock() because of lock
> > > ordering).  Similarly to what we do e.g. in fchownat(). Bruce?
> > 
> > Like this?
>   Yup, that looks right.

Ugh, actually, I didn't realize we can't do mnt_want_write recursively,
and there's a confusing mixture of callers that do and don't already
take it, so I'll have to do something a little more complicated.

Oh well.

--b.

> 
> > But I wonder why this is just popping up now--as far as I can tell we've
> > had the bug since those write counts were introduced.
>   Yeah, I'm wondering as well. NFS server on ext4 should have been
> complaining for a long time.
> 
> 								Honza
> 
> > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > index 6d7be3f..d573b61 100644
> > --- a/fs/nfsd/vfs.c
> > +++ b/fs/nfsd/vfs.c
> > @@ -445,12 +445,16 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> >  		err = nfserr_notsync;
> >  		goto out_put_write_access;
> >  	}
> > +	host_err = fh_want_write(fhp);
> > +	if (host_err)
> > +		goto out_nfserr;
> >  
> >  	fh_lock(fhp);
> >  	host_err = notify_change(dentry, iap, NULL);
> >  	fh_unlock(fhp);
> > +	fh_drop_write(fhp);
> > +out_nfserr:
> >  	err = nfserrno(host_err);
> > -
> >  out_put_write_access:
> >  	if (size_change)
> >  		put_write_access(inode);
> -- 
> Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
> SUSE Labs, CR

* Re: warning in ext4_journal_start_sb on filesystem freeze
       [not found]                       ` <20140304164306.GC12805-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
@ 2014-03-04 19:04                         ` J. Bruce Fields
  2014-03-08  9:02                           ` Matthew Rahtz
       [not found]                           ` <20140304190442.GE12805-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
  0 siblings, 2 replies; 14+ messages in thread
From: J. Bruce Fields @ 2014-03-04 19:04 UTC (permalink / raw)
  To: Jan Kara
  Cc: Matthew Rahtz, linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

On Tue, Mar 04, 2014 at 11:43:06AM -0500, J. Bruce Fields wrote:
> On Tue, Feb 25, 2014 at 11:21:26AM +0100, Jan Kara wrote:
> > On Mon 24-02-14 10:45:32, J. Bruce Fields wrote:
> > > On Mon, Feb 24, 2014 at 10:55:25AM +0100, Jan Kara wrote:
> > > > On Sat 22-02-14 09:50:06, Matthew Rahtz wrote:
> > > > > Thanks for your help Jan,
> > > > > 
> > > > > A few months later, we've noticed the issue is actually still there.
> > > > > Using 3.11.0-17-generic on Ubuntu 12.04, we’re seeing this in the kernel
> > > > > logs:
> > > > > 
> > > > > [29243.606215] WARNING: CPU: 0 PID: 1785 at
> > > > > /build/buildd/linux-lts-saucy-3.11.0/fs/ext4/ext4_jbd2.c:48
> > > > > ext4_journal_check_start+0x83/0x90()
> > > > > 
> > > > > Having a look at the Ubuntu source package for that version, it
> > > > > definitely does include commit 03d95eb2f2578083a3f6286262e1cb5d88a00c02,
> > > > > and the line generating the warning is still:
> > > > > 
> > > > > WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);
> > > > > 
> > > > > Are there any other obvious possibilities for what may be causing this?
> > > > > There seem to be some users of Oracle Linux experiencing similar problems
> > > > > at https://community.oracle.com/thread/2617418, which was apparently
> > > > > fixed in Oracle's kernel version '3.8.13-26.el6uek'. Any word on when
> > > > > this might be integrated into the official kernel?
> > > > > 
> > > > > Full call trace included below.
> > > >   Looking at the trace below, now the problem seems to be in the NFS server
> > > > code. NFS should get protection against the filesystem being frozen (or
> > > > remounted read-only for that matter) via mnt_want_write() before calling
> > > > into notify_change() (actually before calling fh_lock() because of lock
> > > > ordering).  Similarly to what we do e.g. in fchownat(). Bruce?
> > > 
> > > Like this?
> >   Yup, that looks right.
> 
> Ugh, actually, I didn't realize we can't do mnt_want_write recursively,
> and there's a confusing mixture of callers that do and don't already
> take it, so I'll have to do something a little more complicated.

Actually it looks like there's an easy enough way to distinguish when we
need mnt_want_write and when we don't; hopefully the following does the
job.

--b.

commit b0f5cd115e811a146a6e1a4dd1e7cb85808cca23
Author: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Date:   Mon Feb 24 14:59:47 2014 -0500

    nfsd: notify_change needs elevated write count
    
    Looks like this bug has been here since these write counts were
    introduced, not sure why it was just noticed now.
    
    Thanks also to Jan Kara for pointing out the problem.
    
    Reported-by: Matthew Rahtz <mrahtz-lFL+a/sBLVi/3pe1ocb+swC/G2K4zDHf@public.gmane.org>
    Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 6d7be3f..eea5ad1 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -404,6 +404,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 	umode_t		ftype = 0;
 	__be32		err;
 	int		host_err;
+	bool		get_write_count;
 	int		size_change = 0;
 
 	if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
@@ -411,10 +412,18 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
 	if (iap->ia_valid & ATTR_SIZE)
 		ftype = S_IFREG;
 
+	/* Callers that do fh_verify should do the fh_want_write: */
+	get_write_count = !fhp->fh_dentry;
+
 	/* Get inode */
 	err = fh_verify(rqstp, fhp, ftype, accmode);
 	if (err)
 		goto out;
+	if (get_write_count) {
+		host_err = fh_want_write(fhp);
+		if (host_err)
+			return nfserrno(host_err);
+	}
 
 	dentry = fhp->fh_dentry;
 	inode = dentry->d_inode;
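
The effect of the patch above can be modelled outside the kernel. What follows is a minimal userspace sketch in C, not the nfsd code: every toy_* name is hypothetical, freezing is reduced to a flag and a counter, and toy_want_write() returns an error where the real fh_want_write() may block instead. It shows the write count being taken only when the setattr path does the verify step itself, and a freeze being held off while that count is elevated.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins for kernel objects; all names here are hypothetical. */
struct toy_sb {
	bool frozen;   /* models s_writers.frozen == SB_FREEZE_COMPLETE */
	int  writers;  /* models the elevated write count */
};

struct toy_fh {
	struct toy_sb *sb;
	bool verified;    /* models fhp->fh_dentry being set */
	bool holds_write; /* true while this handle holds a write count */
};

/* Models fh_want_write(); fails rather than blocks to keep the toy simple. */
static int toy_want_write(struct toy_fh *fh)
{
	if (fh->sb->frozen)
		return -EROFS;
	fh->sb->writers++;
	fh->holds_write = true;
	return 0;
}

/* Models fh_drop_write(). */
static void toy_drop_write(struct toy_fh *fh)
{
	assert(fh->holds_write && fh->sb->writers > 0);
	fh->sb->writers--;
	fh->holds_write = false;
}

/* A freeze can only complete once no write counts are held. */
static int toy_freeze(struct toy_sb *sb)
{
	if (sb->writers > 0)
		return -EBUSY;
	sb->frozen = true;
	return 0;
}

/* Models notify_change() reaching the journal: touching a frozen
 * filesystem is the condition the WARN_ON in this thread reports. */
static int toy_notify_change(const struct toy_fh *fh)
{
	return fh->sb->frozen ? -EIO : 0;
}

/* Models the patched setattr path: take the write count only if this
 * function is the one doing the verify step. */
static int toy_setattr(struct toy_fh *fh)
{
	bool get_write_count = !fh->verified;

	fh->verified = true; /* models a successful fh_verify() */
	if (get_write_count) {
		int err = toy_want_write(fh);
		if (err)
			return err;
	}
	return toy_notify_change(fh);
}
```

In this model a handle that arrives already verified but without a write count can still reach toy_notify_change() on a frozen filesystem, which corresponds to the assumption stated in the patch comment that callers doing their own fh_verify also do the fh_want_write.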

* Re: warning in ext4_journal_start_sb on filesystem freeze
  2014-03-04 19:04                         ` J. Bruce Fields
@ 2014-03-08  9:02                           ` Matthew Rahtz
  2014-03-10 13:26                             ` J. Bruce Fields
       [not found]                           ` <20140304190442.GE12805-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
  1 sibling, 1 reply; 14+ messages in thread
From: Matthew Rahtz @ 2014-03-08  9:02 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Jan Kara, linux-ext4, linux-nfs

Brilliant :) Thank you for your work!

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: warning in ext4_journal_start_sb on filesystem freeze
  2014-03-08  9:02                           ` Matthew Rahtz
@ 2014-03-10 13:26                             ` J. Bruce Fields
  0 siblings, 0 replies; 14+ messages in thread
From: J. Bruce Fields @ 2014-03-10 13:26 UTC (permalink / raw)
  To: Matthew Rahtz; +Cc: Jan Kara, linux-ext4, linux-nfs

On Sat, Mar 08, 2014 at 09:02:26AM +0000, Matthew Rahtz wrote:
> Brilliant :) Thank you for your work!

Just to make sure, have you been able to confirm yet that this
eliminates the warning you were seeing?

--b.

* Re: warning in ext4_journal_start_sb on filesystem freeze
       [not found]                           ` <20140304190442.GE12805-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
@ 2014-03-10 13:34                             ` Christoph Hellwig
       [not found]                               ` <20140310133451.GA17807-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2014-03-10 13:34 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Jan Kara, Matthew Rahtz, linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

On Tue, Mar 04, 2014 at 02:04:42PM -0500, J. Bruce Fields wrote:
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 6d7be3f..eea5ad1 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -404,6 +404,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
>  	umode_t		ftype = 0;
>  	__be32		err;
>  	int		host_err;
> +	bool		get_write_count;
>  	int		size_change = 0;
>  
>  	if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
> @@ -411,10 +412,18 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
>  	if (iap->ia_valid & ATTR_SIZE)
>  		ftype = S_IFREG;
>  
> +	/* Callers that do fh_verify should do the fh_want_write: */
> +	get_write_count = !fhp->fh_dentry;

Eww, this is nasty.  Given that there are only 6 callers of nfsd_setattr
in total, and only half of these might cause size changes I'd rather
deal with this properly, e.g. by taking both the fh_verify into the
callers.
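
One possible shape for the cleanup suggested here, as a userspace sketch in C rather than kernel code: each caller does the verify step and takes the write count itself, so the setattr helper can assume both and needs no flag. All model_* names are hypothetical stand-ins, and model_want_write() fails where the real fh_want_write() may block.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the superblock and file handle. */
struct model_sb {
	bool frozen;
	int  writers;
};

struct model_fh {
	struct model_sb *sb;
	bool verified;
	bool holds_write;
};

/* Models fh_verify(); always succeeds in this sketch. */
static int model_verify(struct model_fh *fh)
{
	fh->verified = true;
	return 0;
}

/* Models fh_want_write(); refuses on a frozen filesystem. */
static int model_want_write(struct model_fh *fh)
{
	if (fh->sb->frozen)
		return -EROFS;
	fh->sb->writers++;
	fh->holds_write = true;
	return 0;
}

/* Models fh_drop_write(). */
static void model_drop_write(struct model_fh *fh)
{
	assert(fh->holds_write && fh->sb->writers > 0);
	fh->sb->writers--;
	fh->holds_write = false;
}

/* The helper has one contract: the caller verified the handle and
 * holds a write count. No conditional acquisition is needed. */
static int model_setattr(struct model_fh *fh)
{
	assert(fh->verified && fh->holds_write);
	return 0; /* the attribute change itself is not modelled */
}

/* Every caller follows the same verify / want_write / drop sequence. */
static int model_caller(struct model_fh *fh)
{
	int err = model_verify(fh);
	if (err)
		return err;
	err = model_want_write(fh);
	if (err)
		return err;
	err = model_setattr(fh);
	model_drop_write(fh);
	return err;
}
```

The trade-off in this sketch is that the acquire and release sit together in one function per caller, at the cost of repeating the sequence in each of them.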

* Re: warning in ext4_journal_start_sb on filesystem freeze
       [not found]                               ` <20140310133451.GA17807-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2014-03-10 19:57                                 ` J. Bruce Fields
       [not found]                                   ` <20140310195709.GH28006-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
  2014-04-01 18:40                                   ` J. Bruce Fields
  0 siblings, 2 replies; 14+ messages in thread
From: J. Bruce Fields @ 2014-03-10 19:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Matthew Rahtz, linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

On Mon, Mar 10, 2014 at 06:34:51AM -0700, Christoph Hellwig wrote:
> On Tue, Mar 04, 2014 at 02:04:42PM -0500, J. Bruce Fields wrote:
> > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > index 6d7be3f..eea5ad1 100644
> > --- a/fs/nfsd/vfs.c
> > +++ b/fs/nfsd/vfs.c
> > @@ -404,6 +404,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> >  	umode_t		ftype = 0;
> >  	__be32		err;
> >  	int		host_err;
> > +	bool		get_write_count;
> >  	int		size_change = 0;
> >  
> >  	if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
> > @@ -411,10 +412,18 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> >  	if (iap->ia_valid & ATTR_SIZE)
> >  		ftype = S_IFREG;
> >  
> > +	/* Callers that do fh_verify should do the fh_want_write: */
> > +	get_write_count = !fhp->fh_dentry;
> 
> Eww, this is nasty.  Given that there are only 6 callers of nfsd_setattr
> in total, and only half of these might cause size changes I'd rather
> deal with this properly, e.g. by taking both the fh_verify into the
> callers.

Maybe so.

(Size is irrelevant, though, right?  Won't any setattr need an elevated
write count?)

--b.

* Re: warning in ext4_journal_start_sb on filesystem freeze
       [not found]                                   ` <20140310195709.GH28006-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
@ 2014-03-10 23:40                                     ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2014-03-10 23:40 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Christoph Hellwig, Jan Kara, Matthew Rahtz,
	linux-ext4-u79uwXL29TY76Z2rM5mHXA,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA

On Mon, Mar 10, 2014 at 03:57:09PM -0400, J. Bruce Fields wrote:
> (Size is irrelevant, though, right?  Won't any setattr need an elevated
> write count?)

Indeed.  Not sure why I was thinking of truncate as a special case here.

* Re: warning in ext4_journal_start_sb on filesystem freeze
  2014-03-10 19:57                                 ` J. Bruce Fields
       [not found]                                   ` <20140310195709.GH28006-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
@ 2014-04-01 18:40                                   ` J. Bruce Fields
  1 sibling, 0 replies; 14+ messages in thread
From: J. Bruce Fields @ 2014-04-01 18:40 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jan Kara, Matthew Rahtz, linux-ext4, linux-nfs

On Mon, Mar 10, 2014 at 03:57:09PM -0400, J. Bruce Fields wrote:
> On Mon, Mar 10, 2014 at 06:34:51AM -0700, Christoph Hellwig wrote:
> > On Tue, Mar 04, 2014 at 02:04:42PM -0500, J. Bruce Fields wrote:
> > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > > index 6d7be3f..eea5ad1 100644
> > > --- a/fs/nfsd/vfs.c
> > > +++ b/fs/nfsd/vfs.c
> > > @@ -404,6 +404,7 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> > >  	umode_t		ftype = 0;
> > >  	__be32		err;
> > >  	int		host_err;
> > > +	bool		get_write_count;
> > >  	int		size_change = 0;
> > >  
> > >  	if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
> > > @@ -411,10 +412,18 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> > >  	if (iap->ia_valid & ATTR_SIZE)
> > >  		ftype = S_IFREG;
> > >  
> > > +	/* Callers that do fh_verify should do the fh_want_write: */
> > > +	get_write_count = !fhp->fh_dentry;
> > 
> > Eww, this is nasty.  Given that there are only 6 callers of nfsd_setattr
> > in total, and only half of these might cause size changes I'd rather
> > deal with this properly, e.g. by taking both the fh_verify into the
> > callers.
> 
> Maybe so.

Gah, I found while clearing out my inbox that a) I'd forgotten this, b) I'd
already committed and pushed out the patch.

And I'd rather leave the fix as is and the cleanup to be done later.

But it's not OK to just drop review like that and if you think it
warrants reverting or rebasing I can do that.

--b.

Thread overview: 14+ messages
     [not found] <217983071.143460.1385453196946.JavaMail.zimbra@rapitasystems.com>
2013-11-26  8:20 ` warning in ext4_journal_start_sb on filesystem freeze Matthew Rahtz
2013-11-26 12:58   ` Jan Kara
2014-02-22  9:50     ` Matthew Rahtz
     [not found]       ` <622177618.727.1393062606061.JavaMail.zimbra-lFL+a/sBLVi/3pe1ocb+swC/G2K4zDHf@public.gmane.org>
2014-02-24  9:55         ` Jan Kara
     [not found]           ` <20140224095525.GA20532-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2014-02-24 15:45             ` J. Bruce Fields
     [not found]               ` <20140224154532.GB11992-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2014-02-25 10:21                 ` Jan Kara
     [not found]                   ` <20140225102126.GB1669-+0h/O2h83AeN3ZZ/Hiejyg@public.gmane.org>
2014-03-04 16:43                     ` J. Bruce Fields
     [not found]                       ` <20140304164306.GC12805-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2014-03-04 19:04                         ` J. Bruce Fields
2014-03-08  9:02                           ` Matthew Rahtz
2014-03-10 13:26                             ` J. Bruce Fields
     [not found]                           ` <20140304190442.GE12805-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2014-03-10 13:34                             ` Christoph Hellwig
     [not found]                               ` <20140310133451.GA17807-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2014-03-10 19:57                                 ` J. Bruce Fields
     [not found]                                   ` <20140310195709.GH28006-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2014-03-10 23:40                                     ` Christoph Hellwig
2014-04-01 18:40                                   ` J. Bruce Fields
