From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: "Holger Hoffstätte" <holger@applied-asynchrony.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Kernels 4.15..5.5: "WARNING: CPU: 2 PID: 4150 at fs/fs-writeback.c:2363 __writeback_inodes_sb_nr+0xa9/0xc0"
Date: Tue, 4 Feb 2020 11:49:18 -0500
Message-ID: <20200204164918.GC13306@hungrycats.org>
In-Reply-To: <65514978-506f-83fa-2c95-ee9ce3cbf5b4@applied-asynchrony.com>

On Tue, Feb 04, 2020 at 02:58:52PM +0100, Holger Hoffstätte wrote:
> On 2/4/20 6:04 AM, Zygo Blaxell wrote:
> > On Fri, Mar 22, 2019 at 12:17:32AM -0400, Zygo Blaxell wrote:
> > > When filesystems are mounted flushoncommit, I get this warning roughly
> > > every 30 seconds:
> > > 
> > > 	[ 4575.142805] WARNING: CPU: 3 PID: 4150 at fs/fs-writeback.c:2363 __writeback_inodes_sb_nr+0xa9/0xc0
> > > 	[ 4575.145567] Modules linked in: crct10dif_pclmul crc32_pclmul dm_cache_smq crc32c_intel dm_cache snd_pcm ghash_clmulni_intel aesni_intel sr_mod dm_persistent_data ppdev joydev dm_bio_prison aes_x86_64 crypto_simd snd_timer dm_bufio cryptd cdrom snd glue_helper dm_mod parport_pc soundcore sg floppy parport pcspkr psmouse bochs_drm rtc_cmos ide_pci_generic piix input_leds i2c_piix4 ide_core serio_raw evbug qemu_fw_cfg evdev ip_tables x_tables ipv6 crc_ccitt autofs4
> > > 	[ 4575.160021] CPU: 3 PID: 4150 Comm: btrfs-transacti Tainted: G        W         5.0.3-zb64+ #1
> > > 	[ 4575.162484] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > > 	[ 4575.164505] RIP: 0010:__writeback_inodes_sb_nr+0xa9/0xc0
> > > 	[ 4575.165809] Code: 0f b6 d2 e8 b9 f8 ff ff 48 89 ee 48 89 df e8 0e f8 ff ff 48 8b 44 24 48 65 48 33 04 25 28 00 00 00 75 0b 48 83 c4 50 5b 5d c3 <0f> 0b eb cb e8 4e e9 d6 ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00
> > > 	[ 4575.171927] RSP: 0018:ffffa9cac0eabde8 EFLAGS: 00010246
> > > 	[ 4575.173045] RAX: 0000000000000000 RBX: ffff9353e23af000 RCX: 0000000000000000
> > > 	[ 4575.175639] RDX: 0000000000000002 RSI: 0000000000030c67 RDI: ffffa9cac0eabe30
> > > 	[ 4575.177619] RBP: ffffa9cac0eabdec R08: ffffa9cac0eabdf0 R09: ffff9353f12da000
> > > 	[ 4575.179736] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9353e1980000
> > > 	[ 4575.181661] R13: ffff9353e1981430 R14: ffff9353f27e4260 R15: ffff9353e1981518
> > > 	[ 4575.183871] FS:  0000000000000000(0000) GS:ffff9353f6800000(0000) knlGS:0000000000000000
> > > 	[ 4575.185940] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > 	[ 4575.188072] CR2: 00007fb81841fa20 CR3: 00000002218c0006 CR4: 00000000001606e0
> > > 	[ 4575.190094] Call Trace:
> > > 	[ 4575.190828]  btrfs_commit_transaction+0x7a6/0x9e0
> > > 	[ 4575.192115]  ? start_transaction+0x91/0x4d0
> > > 	[ 4575.193197]  transaction_kthread+0x146/0x180
> > > 	[ 4575.194415]  kthread+0x106/0x140
> > > 	[ 4575.195403]  ? btrfs_cleanup_transaction+0x620/0x620
> > > 	[ 4575.196903]  ? kthread_park+0x90/0x90
> > > 	[ 4575.198412]  ret_from_fork+0x3a/0x50
> > > 	[ 4575.199374] irq event stamp: 54922780
> > > 	[ 4575.200218] hardirqs last  enabled at (54922779): [<ffffffffa3d5f2e2>] _raw_spin_unlock_irqrestore+0x32/0x60
> > > 	[ 4575.202753] hardirqs last disabled at (54922780): [<ffffffffa300379f>] trace_hardirqs_off_thunk+0x1a/0x1c
> > > 	[ 4575.205921] softirqs last  enabled at (54922378): [<ffffffffa40003a4>] __do_softirq+0x3a4/0x45f
> > > 	[ 4575.208350] softirqs last disabled at (54922361): [<ffffffffa30a3d44>] irq_exit+0xe4/0xf0
> > > 	[ 4575.210616] ---[ end trace 5309dcf3a1920eca ]---
> > > 
> > > For my own kernel builds I just comment out the line in fs-writeback.c,
> > > but that's not a real solution.
> > 
> > This still happens in 5.5.0.  No changes in behavior or workaround, no
> > apparent harmful effect, almost 2 years running in stress-testing and
> > production.
> > 
> > I, for one, am glad we fixed all those other bugs before doing anything
> > about this one.  It is utterly harmless.
> 
> This triggered my archeology itch. I had to go deeper.

You could start with this thread:

	https://www.spinics.net/lists/linux-btrfs/msg87752.html

> The warning goes all the way back to 2010 (kernel 2.6.x) when everything
> happened at FusionIO.
> 
> Commit [1] introduced it as preparation for [2].
> 
> The only caller of writeback_inodes_sb_nr is btrfs_writeback_inodes_sb_nr in
> (today's) space-info.c, where the mutex trylock was introduced in [3], apparently
> to work around a VFS function that didn't do it for btrfs at the time.
> 
> Flushoncommit was added by Sage Weil for Ceph's btrfs backend in [4], even
> before the WARN_ON, in 2009. We know how that story ended.
> 
> Why has nobody except you noticed this? Probably because the number of people
> actually using it or reporting bugs is.. very small. ¯\_(ツ)_/¯

I'm not the only one to notice, or report, e.g.

	https://www.spinics.net/lists/linux-btrfs/msg74496.html
	https://www.spinics.net/lists/linux-btrfs/msg72483.html
	https://github.com/Zygo/bees/issues/68

plus it comes up every now and then on IRC.  I have heard from other
users of flushoncommit that they also patch their kernels to get rid of
the WARN_ON (or make it WARN_ON_ONCE).

The WARN_ON appears in btrfs starting in 4.15 after:

	https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce8ea7cc6eb3139f4c730d647325e69354159b0f

which rearranges some calls to put the fs-writeback.c WARN_ON on a code
path where it doesn't hold the lock.
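
To make the difference between the two workarounds concrete, here is a
minimal user-space sketch. It is not the kernel code: struct fake_sb,
model_warn_on(), model_warn_on_once() and model_writeback() are
hypothetical names invented for illustration, and the boolean flag
stands in for rwsem_is_locked(&sb->s_umount). It assumes only what this
thread describes: the check fires when the caller does not hold the
lock, and writeback is submitted anyway.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a superblock; not the kernel's struct. */
struct fake_sb {
	bool umount_locked;	/* models rwsem_is_locked(&sb->s_umount) */
	bool has_dirty_io;	/* models "there is something to write back" */
};

/* Counts how many warnings the model has printed. */
static int warnings_emitted;

/* Models WARN_ON(): reports every time the condition is true. */
static bool model_warn_on(bool cond, const char *what)
{
	if (cond) {
		warnings_emitted++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Models WARN_ON_ONCE(): reports only the first time the condition is true. */
static bool model_warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warnings_emitted++;
		fprintf(stderr, "WARNING (once): %s\n", what);
	}
	return cond;
}

/*
 * Models the writeback entry point: the lock check only complains,
 * it does not stop the work from being submitted.
 * Returns 1 if work was submitted, 0 if there was nothing to do.
 */
static int model_writeback(struct fake_sb *sb, bool warn_once)
{
	if (!sb->has_dirty_io)
		return 0;

	if (warn_once)
		model_warn_on_once(!sb->umount_locked, "s_umount not held");
	else
		model_warn_on(!sb->umount_locked, "s_umount not held");

	return 1;	/* work proceeds regardless of the warning */
}
```

Calling model_writeback() repeatedly with umount_locked == false and
warn_once == false prints one warning per call, which corresponds to
the "roughly every 30 seconds" log spam on each transaction commit;
with warn_once == true only the first call prints. In both cases the
work is still submitted.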

To answer a question I asked in

	https://www.spinics.net/lists/linux-btrfs/msg87769.html

(and again in another message of this thread): cherry-picking
ce8ea7cc6eb3 into 4.14.107 makes 4.14.107 deadlock immediately, and
reverting the same commit makes kernel 4.15 and later deadlock
immediately.

btrfs crashes _much_ less often now than it did in 4.14.  Mounting with
noflushoncommit is starting to look like an option worth contemplating
for some workloads on 5.4.18+.  On the other hand, one of the reasons
why I use btrfs instead of other filesystems is that other filesystems
don't implement a sane equivalent of flushoncommit, and those use cases
aren't going away any time soon.

> Unfortunately I'm still none the wiser why btrfs feels it's necessary to
> "open-code"/circumvent the rwsem check. Maybe this gives you a clue.
> 
> cheers,
> Holger
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/fs-writeback.c?id=cf37e972478ec58a8a54a6b4f951815f0ae28f78
> 
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/fs-writeback.c?id=d19de7edf59cdd586777b009e0e8fbe5412dd35f
> 
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/btrfs/extent-tree.c?id=925a6efb8ff0c2bdbec107ed9890e62650c83306
> 
> [4] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dccae99995089641fbac452ebc7f0cab18751ddb

