* WARNING at fs/btrfs/inode.c:9261 btrfs_destroy_inode()
From: Eric Biggers @ 2016-05-10 23:19 UTC
To: linux-btrfs

Hello,

The following warning has been triggering for me since about v4.6-rc3:

	WARN_ON(BTRFS_I(inode)->csum_bytes);

On one machine the warning has occurred 657 times since v4.6-rc5. On another it
has occurred 3 times since v4.6-rc3. Both are now on v4.6-rc7, where I have
still observed the warning. The warnings occur in groups, and do_unlinkat() and
evict() are always in the call stack.

Is this a known issue? Here is the first occurrence:

Apr 17 10:31:19 zzz kernel: ------------[ cut here ]------------
Apr 17 10:31:19 zzz kernel: WARNING: CPU: 0 PID: 4092 at fs/btrfs/inode.c:9261 btrfs_destroy_inode+0x23c/0x2b0
Apr 17 10:31:19 zzz kernel: Modules linked in: fuse vhost_net vhost tun bridge stp llc ccm iptable_filter iptable_nat nf_conntrack_ipv4
Apr 17 10:31:19 zzz kernel: CPU: 0 PID: 4092 Comm: rm Tainted: G W 4.6.0-rc3 #178
Apr 17 10:31:19 zzz kernel: Hardware name: Dell Inc. Inspiron 15-7568/0M5YMV, BIOS 01.00.00 08/07/2015
Apr 17 10:31:19 zzz kernel: 0000000000000286 0000000031dfd09a ffff8801b9277dc0 ffffffff813320d8
Apr 17 10:31:19 zzz kernel: 0000000000000000 0000000000000000 ffff8801b9277e00 ffffffff81074221
Apr 17 10:31:19 zzz kernel: 0000242d85cec000 ffff88019408c9a8 ffff88019408c9a8 ffff880085cec000
Apr 17 10:31:19 zzz kernel: Call Trace:
Apr 17 10:31:19 zzz kernel: [<ffffffff813320d8>] dump_stack+0x4d/0x65
Apr 17 10:31:19 zzz kernel: [<ffffffff81074221>] __warn+0xc1/0xe0
Apr 17 10:31:19 zzz kernel: [<ffffffff81074348>] warn_slowpath_null+0x18/0x20
Apr 17 10:31:19 zzz kernel: [<ffffffff8126e77c>] btrfs_destroy_inode+0x23c/0x2b0
Apr 17 10:31:19 zzz kernel: [<ffffffff81185296>] destroy_inode+0x36/0x60
Apr 17 10:31:19 zzz kernel: [<ffffffff811853e4>] evict+0x124/0x180
Apr 17 10:31:19 zzz kernel: [<ffffffff81185b68>] iput+0x148/0x1d0
Apr 17 10:31:19 zzz kernel: [<ffffffff8117b884>] do_unlinkat+0x194/0x2b0
Apr 17 10:31:19 zzz kernel: [<ffffffff8117c0e6>] SyS_unlinkat+0x16/0x30
Apr 17 10:31:19 zzz kernel: [<ffffffff817031db>] entry_SYSCALL_64_fastpath+0x13/0x8f
Apr 17 10:31:19 zzz kernel: ---[ end trace 87f6a0e10f4df484 ]---
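For anyone wanting to tally occurrences the way the report above does, here is a minimal sketch in C. It assumes the kernel log has already been saved as plain text (for example from the journal or dmesg); the function name `count_matching_lines` is hypothetical and not part of any kernel or btrfs tooling. It simply counts lines containing a given substring.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: count the lines of `in` that contain `needle`.
 * getline() is used so that long log lines are never split mid-match.
 */
long count_matching_lines(FILE *in, const char *needle)
{
	char *line = NULL;
	size_t cap = 0;
	long hits = 0;

	assert(in != NULL && needle != NULL);

	while (getline(&line, &cap, in) != -1) {
		if (strstr(line, needle))
			hits++;
	}
	free(line);
	return hits;
}
```

Calling `count_matching_lines(stdin, "at fs/btrfs/inode.c:9261")` on a saved log would count one hit per WARNING header line, since only that line of each dump carries the file and line number.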
* Re: WARNING at fs/btrfs/inode.c:9261 btrfs_destroy_inode()
From: Chris Mason @ 2016-05-10 23:57 UTC
To: Eric Biggers; +Cc: linux-btrfs

On Tue, May 10, 2016 at 06:19:20PM -0500, Eric Biggers wrote:
> Hello,
>
> The following warning has been triggering for me since about v4.6-rc3:
>
> 	WARN_ON(BTRFS_I(inode)->csum_bytes);
>
> On one machine the warning has occurred 657 times since v4.6-rc5. On another it
> has occurred 3 times since v4.6-rc3. Both are now on v4.6-rc7, where I have
> still observed the warning. The warnings occur in groups, and do_unlinkat() and
> evict() are always in the call stack.
>
> Is this a known issue? Here is the first occurrence:

It is a known issue, but I'm having a very hard time triggering it
quickly enough to track it down. I do know csum_bytes is only 4096 or
8192 when it hits, but beyond that I haven't been able to trigger it
consistently enough to test a patch.

But, I'm trying ;)

-chris
* Re: WARNING at fs/btrfs/inode.c:9261 btrfs_destroy_inode()
From: Adam Borowski @ 2016-05-11 18:25 UTC
To: Chris Mason, linux-btrfs, Eric Biggers

Chris Mason wrote:
> On Tue, May 10, 2016 at 06:19:20PM -0500, Eric Biggers wrote:
> > The following warning has been triggering for me since about v4.6-rc3:
> >
> > 	WARN_ON(BTRFS_I(inode)->csum_bytes);
> >
> > On one machine the warning has occurred 657 times since v4.6-rc5. On another it
> > has occurred 3 times since v4.6-rc3. Both are now on v4.6-rc7, where I have
> > still observed the warning. The warnings occur in groups, and do_unlinkat() and
> > evict() are always in the call stack.
>
> It is a known issue, but I'm having a very hard time triggering it
> quickly enough to track it down. I do know csum_bytes is only 4096 or
> 8192 when it hits, but beyond that I haven't been able to trigger it
> consistently enough to test a patch.

It happens for me after no more than a few minutes of regular use that
includes unlink, destructive rename or snapshot deletion, even without
deliberately trying to trigger it. On the other hand, repeating the same
operation that triggered the warning doesn't trigger it again (on live fs,
I did not try a whole-device snapshot of the filesystem).

Effective mount options for this fs:
/dev/sda1 on / type btrfs (rw,noatime,compress=lzo,ssd,space_cache,subvolid=4996,subvol=/sys)
/dev/sda1 on /var/cache type btrfs (rw,noatime,compress=lzo,ssd,space_cache,subvolid=263,subvol=/syscache)
/dev/sda1 on /home type btrfs (rw,noatime,compress=lzo,ssd,space_cache,subvolid=258,subvol=/home)
/dev/sda1 on /mnt/btr1 type btrfs (rw,noatime,compress=lzo,ssd,space_cache,subvolid=5,subvol=/)
/dev/sda1 on /home/kilobyte/.cache type btrfs (rw,noatime,compress=lzo,ssd,space_cache,subvolid=259,subvol=/kb-cache)
/dev/sda1 on /home/kilobyte/tmp type btrfs (rw,noatime,compress=lzo,ssd,space_cache,subvolid=260,subvol=/kb-tmp)
/dev/sda1 on /srv/chroots type btrfs (rw,noatime,compress=lzo,ssd,space_cache,subvolid=5,subvol=/chroots)

I've been running with asserts/etc and a s/WARN_ON/printk/ patch I'll send
after this mail for almost two weeks already, no data loss or any
inconsistency. My use is mostly sbuild: snapshot+lots of dpkg+lots of
gcc+snapshot deletion.

Could you confirm my understanding? I don't know btrfs internals, but it
appears to me that unsaved pending checksums attached to an inode don't
matter if the inode is being deleted anyway. Is this correct?

If so, please apply the patch to silence the warning in non-ASSERTS builds;
people rightfully panic when they see a WARN with tainted kernel and so on,
believing the filesystem needs an immediate fsck + possibly restore.

Having such pending checksums means badness elsewhere -- they should have
been already saved at this point, right? But no matter the cause, it
doesn't break btrfs_destroy_inode.

Meow!
-- 
How to exploit the Bible for weight loss:
Pr28:25: he that putteth his trust in the ʟᴏʀᴅ shall be made fat.
* [PATCH for 4.6] btrfs: disable a spurious WARN_ON in btrfs_destroy_inode.
From: Adam Borowski @ 2016-05-11 18:29 UTC
To: Chris Mason, linux-btrfs, Eric Biggers; +Cc: Josef Bacik, David Sterba

This happens a lot on real-world loads. The issue is apparently benign,
as unsaved pending checksums are moot when the ship^Winode is going down
anyway. Thus, no need to cause panic in users.

I've retained the warning in CONFIG_BTRFS_ASSERT builds, as this shouldn't
happen. I've replaced the no longer helpful register+stack dump with a
printk that mentions the device affected.

Signed-off-by: Adam Borowski <kilobyte@angband.pl>
---
 fs/btrfs/inode.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2aaba58..ed78104 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9258,7 +9258,10 @@ void btrfs_destroy_inode(struct inode *inode)
 	WARN_ON(BTRFS_I(inode)->outstanding_extents);
 	WARN_ON(BTRFS_I(inode)->reserved_extents);
 	WARN_ON(BTRFS_I(inode)->delalloc_bytes);
-	WARN_ON(BTRFS_I(inode)->csum_bytes);
+#ifdef CONFIG_BTRFS_ASSERT
+	if (BTRFS_I(inode)->csum_bytes)
+		btrfs_info(root->fs_info, "btrfs_destroy_inode: leftover csum_bytes");
+#endif
 	WARN_ON(BTRFS_I(inode)->defrag_bytes);
 
 	/*
-- 
2.8.1
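To make concrete what a "leftover csum_bytes" check is looking for, here is a small user-space model in C. It is not btrfs code: `toy_inode`, `toy_reserve_csum`, `toy_release_csum` and `toy_destroy` are hypothetical names invented for this sketch, and the reserve/release pairing is an assumption about how such a counter is normally balanced. The only things taken from the thread are that the counter is expected to be zero when the inode is destroyed and that leftovers of 4096 or 8192 bytes were observed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-in for per-inode accounting; NOT the real struct btrfs_inode. */
struct toy_inode {
	uint64_t csum_bytes;	/* bytes whose checksum accounting is still outstanding */
};

/* Account for `len` bytes that are about to be written. */
void toy_reserve_csum(struct toy_inode *ino, uint64_t len)
{
	ino->csum_bytes += len;
}

/* Drop the accounting once the write has completed or been abandoned. */
void toy_release_csum(struct toy_inode *ino, uint64_t len)
{
	assert(len <= ino->csum_bytes);
	ino->csum_bytes -= len;
}

/*
 * Destroy-time check, analogous in spirit to the check in the patch:
 * report any leftover and return it (0 means the accounting balanced).
 */
uint64_t toy_destroy(struct toy_inode *ino)
{
	uint64_t leaked = ino->csum_bytes;

	if (leaked)
		fprintf(stderr, "toy_destroy: leftover csum_bytes=%llu\n",
			(unsigned long long)leaked);
	ino->csum_bytes = 0;
	return leaked;
}
```

In this model a reserve of 8192 followed by a release of only 4096 leaves 4096 behind at destroy time. The check does not corrupt anything by itself; it only reveals that some earlier path released less than it reserved, which is the sense in which the leftover points at a bug elsewhere.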
* Re: [PATCH for 4.6] btrfs: disable a spurious WARN_ON in btrfs_destroy_inode.
From: Josef Bacik @ 2016-05-11 20:17 UTC
To: Adam Borowski, Chris Mason, linux-btrfs, Eric Biggers; +Cc: David Sterba

On 05/11/2016 11:29 AM, Adam Borowski wrote:
> This happens a lot on real-world loads. The issue is apparently benign,
> as unsaved pending checksums are moot when the ship^Winode is going down
> anyway. Thus, no need to cause panic in users.
>
> I've retained the warning in CONFIG_BTRFS_ASSERT builds, as this shouldn't
> happen. I've replaced the no longer helpful register+stack dump with a
> printk that mentions the device affected.
>

Chris is working on fixing the problem, and these are real issues, we need
to not hide them behind config options. Thanks,

Josef
* Re: WARNING at fs/btrfs/inode.c:9261 btrfs_destroy_inode()
From: Chris Mason @ 2016-05-14 0:56 UTC
To: Eric Biggers; +Cc: linux-btrfs

On Tue, May 10, 2016 at 06:19:20PM -0500, Eric Biggers wrote:
> Hello,
>
> The following warning has been triggering for me since about v4.6-rc3:
>
> 	WARN_ON(BTRFS_I(inode)->csum_bytes);
>
> On one machine the warning has occurred 657 times since v4.6-rc5. On another it
> has occurred 3 times since v4.6-rc3. Both are now on v4.6-rc7, where I have
> still observed the warning. The warnings occur in groups, and do_unlinkat() and
> evict() are always in the call stack.
>
> Is this a known issue? Here is the first occurrence:

Finally tracked this down today, it's a bug in how we deal with page
faults in the middle of a write. We're testing the fix here and I'll
have a patch on Monday.

Strictly speaking this is a regression in one of Chandan's setup patches
for PAGE_SIZE > sectorsize, but the code was much much too subtle to
blame him.

It should be possible to trigger with a non-aligned multi-page write,
I'm trying to nail down a reliable test program so we don't make the
same mistake again.

-chris
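The I/O shape described above, a non-aligned write spanning several pages followed by an unlink, can be sketched in C as below. This is not the test program Chris mentions and is not known to reproduce the warning: the actual bug involves a page fault in the middle of the write, which this sketch makes no attempt to force. The function name `unaligned_write_then_unlink` and its parameters are hypothetical; only standard POSIX calls (`mkstemp`, `pwrite`, `unlink`) are used.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a temporary file from `path_template`
 * (mkstemp-style, must end in "XXXXXX"), write `len` bytes starting at
 * byte offset `off`, then unlink the file. Choosing an `off` that is not
 * a multiple of the page size and a `len` of several pages gives a
 * non-aligned multi-page write. Returns 0 on success, -1 on failure.
 */
int unaligned_write_then_unlink(char *path_template, off_t off, size_t len)
{
	char *buf;
	size_t done = 0;
	int fd, ret = -1;

	assert(path_template != NULL);
	if (len == 0)
		return -1;

	fd = mkstemp(path_template);
	if (fd < 0)
		return -1;

	buf = malloc(len);
	if (!buf)
		goto out;
	memset(buf, 0xab, len);

	/* pwrite() may write less than requested, so loop until done. */
	while (done < len) {
		ssize_t n = pwrite(fd, buf + done, len - done, off + (off_t)done);

		if (n <= 0)
			goto out;
		done += (size_t)n;
	}
	ret = 0;
out:
	free(buf);
	close(fd);
	/* The unlink is what leads to the inode being evicted and destroyed. */
	if (unlink(path_template) != 0)
		ret = -1;
	return ret;
}
```

A caller might pass an offset such as 100 and a length of three times `sysconf(_SC_PAGESIZE)`, with the template pointing into a directory on the filesystem under test.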