From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: "Libor Klepáč" <libor.klepac@bcom.cz>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: Btrfs lockups on ubuntu with bees
Date: Wed, 8 Dec 2021 23:44:38 -0500
Message-ID: <20211209044438.GO17148@hungrycats.org>
In-Reply-To: <c9f1640177563f545ef70eb6ec1560faa1bb1bd7.camel@bcom.cz>

On Fri, Nov 26, 2021 at 02:36:30PM +0000, Libor Klepáč wrote:
> Hi,
> we are trying to use btrfs with compression and deduplication using
> bees to host the filesystem for a Nakivo repository.
> The Nakivo repository is in "incremental with full backups" format, i.e.
> one file per VM snapshot transferred from VMware, a full backup every x
> days, no internal deduplication.
> We have also disabled internal compression in the Nakivo repository and
> set compress-force=zstd:13 on the filesystem.
>
> It's a VM on VMware 6.7.0 Update 3 (Build 17700523) on a Dell R540.
> It has 6 vCPUs and 16 GB of RAM.
>
> Bees is run with these parameters:
> OPTIONS="--strip-paths --no-timestamps --verbose 5 --loadavg-target 5
> --thread-min 1"
> DB_SIZE=$((8*1024*1024*1024)) # 8G in bytes
>
>
>
> Until today it was running the Ubuntu-provided kernel 5.11.0-40.44~20.04.2
> (not sure about the exact upstream version);
> today we switched to 5.13.0-21.21~20.04.1 after the first crash.
>
> It was working OK for 7+ days, all data was in (around 10 TB), so I
> started bees.
> It now locks the FS, bees runs at 100% CPU, and I cannot enter the
> directory with btrfs.
>
> # btrfs filesystem usage /mnt/btrfs/repo02/
> Overall:
>     Device size:                  20.00TiB
>     Device allocated:             10.88TiB
>     Device unallocated:            9.12TiB
>     Device missing:                  0.00B
>     Used:                         10.87TiB
>     Free (estimated):              9.13TiB      (min: 4.57TiB)
>     Data ratio:                       1.00
>     Metadata ratio:                   1.00
>     Global reserve:              512.00MiB      (used: 0.00B)
>
> Data,single: Size:10.85TiB, Used:10.83TiB (99.91%)
>    /dev/sdd       10.85TiB
>
> Metadata,single: Size:35.00GiB, Used:34.71GiB (99.17%)
>    /dev/sdd       35.00GiB
>
> System,DUP: Size:32.00MiB, Used:1.14MiB (3.56%)
>    /dev/sdd       64.00MiB
>
> Unallocated:
>    /dev/sdd        9.12TiB
>
> This happened yesterday on kernel 5.11:
> https://download.bcom.cz/btrfs/trace1.txt
>
> This is today, on 5.13:
> https://download.bcom.cz/btrfs/trace2.txt
>
> This is a trace from sysrq later, when I noticed it had happened again:
> https://download.bcom.cz/btrfs/trace3.txt
>
>
> Any clue what can be done?

I am currently hitting this bug on all kernel versions starting from 5.11.
Test runs end with the filesystem locked up, 100% CPU usage in bees,
and the following lockdep dump:

[Wed Dec 8 14:14:03 2021] Linux version 5.11.22-zb64-e4d48558d24c+ (zblaxell@waya) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.37) #1 SMP Sun Dec 5 04:18:31 EST 2021
[Wed Dec 8 23:17:32 2021] sysrq: Show Locks Held
[Wed Dec 8 23:17:32 2021]
Showing all locks held in the system:
[Wed Dec 8 23:17:32 2021] 1 lock held by in:imklog/3603:
[Wed Dec 8 23:17:32 2021] 1 lock held by dmesg/3720:
[Wed Dec 8 23:17:32 2021] #0: ffff8a1406ac80e0 (&user->lock){+.+.}-{3:3}, at: devkmsg_read+0x4d/0x320
[Wed Dec 8 23:17:32 2021] 3 locks held by bash/3721:
[Wed Dec 8 23:17:32 2021] #0: ffff8a142a589498 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x70/0xf0
[Wed Dec 8 23:17:32 2021] #1: ffffffff98f199a0 (rcu_read_lock){....}-{1:2}, at: __handle_sysrq+0x5/0xa0
[Wed Dec 8 23:17:32 2021] #2: ffffffff98f199a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x23/0x187
[Wed Dec 8 23:17:32 2021] 1 lock held by btrfs-transacti/6161:
[Wed Dec 8 23:17:32 2021] #0: ffff8a14e0178850 (&fs_info->transaction_kthread_mutex){+.+.}-{3:3}, at: transaction_kthread+0x5a/0x1b0
[Wed Dec 8 23:17:32 2021] 3 locks held by crawl_257_265/6491:
[Wed Dec 8 23:17:32 2021] 3 locks held by crawl_257_291/6494:
[Wed Dec 8 23:17:32 2021] #0: ffff8a14bd092498 (sb_writers#12){.+.+}-{0:0}, at: vfs_dedupe_file_range_one+0x3b/0x180
[Wed Dec 8 23:17:32 2021] #1: ffff8a1410d7c848 (&sb->s_type->i_mutex_key#17){+.+.}-{3:3}, at: lock_two_nondirectories+0x6b/0x70
[Wed Dec 8 23:17:32 2021] #2: ffff8a14161a39c8 (&sb->s_type->i_mutex_key#17/4){+.+.}-{3:3}, at: lock_two_nondirectories+0x59/0x70
[Wed Dec 8 23:17:32 2021] 4 locks held by crawl_257_292/6502:
[Wed Dec 8 23:17:32 2021] #0: ffff8a14bd092498 (sb_writers#12){.+.+}-{0:0}, at: vfs_dedupe_file_range_one+0x3b/0x180
[Wed Dec 8 23:17:32 2021] #1: ffff8a131637a908 (&sb->s_type->i_mutex_key#17){+.+.}-{3:3}, at: lock_two_nondirectories+0x6b/0x70
[Wed Dec 8 23:17:32 2021] #2: ffff8a14161a39c8 (&sb->s_type->i_mutex_key#17/4){+.+.}-{3:3}, at: lock_two_nondirectories+0x59/0x70
[Wed Dec 8 23:17:32 2021] #3: ffff8a14bd0926b8 (sb_internal#2){.+.+}-{0:0}, at: btrfs_start_transaction+0x1e/0x20
[Wed Dec 8 23:17:32 2021] 2 locks held by crawl_257_293/6503:
[Wed Dec 8 23:17:32 2021] #0: ffff8a14bd092498 (sb_writers#12){.+.+}-{0:0}, at: vfs_dedupe_file_range_one+0x3b/0x180
[Wed Dec 8 23:17:32 2021] #1: ffff8a14161a39c8 (&sb->s_type->i_mutex_key#17){+.+.}-{3:3}, at: btrfs_remap_file_range+0x2eb/0x3c0
[Wed Dec 8 23:17:32 2021] 3 locks held by crawl_256_289/6504:
[Wed Dec 8 23:17:32 2021] #0: ffff8a14bd092498 (sb_writers#12){.+.+}-{0:0}, at: vfs_dedupe_file_range_one+0x3b/0x180
[Wed Dec 8 23:17:32 2021] #1: ffff8a140f2c4748 (&sb->s_type->i_mutex_key#17){+.+.}-{3:3}, at: lock_two_nondirectories+0x6b/0x70
[Wed Dec 8 23:17:32 2021] #2: ffff8a14161a39c8 (&sb->s_type->i_mutex_key#17/4){+.+.}-{3:3}, at: lock_two_nondirectories+0x59/0x70
[Wed Dec 8 23:17:32 2021] =============================================

There's only one commit touching vfs_dedupe_file_range_one
between v5.10 and v5.15 (3078d85c9a10 "vfs: verify source area in
vfs_dedupe_file_range_one()"), so I'm now testing 5.11 with that commit
reverted to see whether it introduced the regression.
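
One way to read a "Show Locks Held" dump like the one above is to group
the lock addresses by the tasks that list them. What follows is only a
minimal sketch, not part of bees or of any kernel tooling: the function
name contended_locks and both regular expressions are assumptions made for
illustration, derived solely from the line format shown in this message.
As far as I understand lockdep, a task's list can also include a lock it
is still waiting to acquire, so an address shared by several tasks hints
at contention but does not by itself say which task owns the lock.

```python
import re
from collections import defaultdict

# Assumed line formats, taken from the dump in this message.
TASK_RE = re.compile(r"(\d+) locks? held by (\S+)/(\d+):")
LOCK_RE = re.compile(
    r"#(\d+): ([0-9a-f]{16}) \((.+?)\)\{[^}]*\}-\{[^}]*\}, at: (\S+)"
)

# A few lines copied from the dump, used as sample input.
SAMPLE = """\
[Wed Dec 8 23:17:32 2021] 3 locks held by bash/3721:
[Wed Dec 8 23:17:32 2021] #1: ffffffff98f199a0 (rcu_read_lock){....}-{1:2}, at: __handle_sysrq+0x5/0xa0
[Wed Dec 8 23:17:32 2021] #2: ffffffff98f199a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x23/0x187
[Wed Dec 8 23:17:32 2021] 2 locks held by crawl_257_293/6503:
[Wed Dec 8 23:17:32 2021] #0: ffff8a14bd092498 (sb_writers#12){.+.+}-{0:0}, at: vfs_dedupe_file_range_one+0x3b/0x180
[Wed Dec 8 23:17:32 2021] #1: ffff8a14161a39c8 (&sb->s_type->i_mutex_key#17){+.+.}-{3:3}, at: btrfs_remap_file_range+0x2eb/0x3c0
[Wed Dec 8 23:17:32 2021] 3 locks held by crawl_256_289/6504:
[Wed Dec 8 23:17:32 2021] #0: ffff8a14bd092498 (sb_writers#12){.+.+}-{0:0}, at: vfs_dedupe_file_range_one+0x3b/0x180
[Wed Dec 8 23:17:32 2021] #1: ffff8a140f2c4748 (&sb->s_type->i_mutex_key#17){+.+.}-{3:3}, at: lock_two_nondirectories+0x6b/0x70
[Wed Dec 8 23:17:32 2021] #2: ffff8a14161a39c8 (&sb->s_type->i_mutex_key#17/4){+.+.}-{3:3}, at: lock_two_nondirectories+0x59/0x70
"""


def contended_locks(dump):
    """Map lock address -> [(task, lock class, call site)] for every
    address that appears in the lists of more than one task."""
    users = defaultdict(list)
    task = None
    for line in dump.splitlines():
        header = TASK_RE.search(line)
        if header:
            # "N locks held by name/pid:" starts a new task section.
            task = f"{header.group(2)}/{header.group(3)}"
            continue
        lock = LOCK_RE.search(line)
        if lock and task is not None:
            _, addr, lock_class, site = lock.groups()
            users[addr].append((task, lock_class, site))
    # Keep only addresses listed by at least two distinct tasks.
    return {
        addr: entries
        for addr, entries in users.items()
        if len({t for t, _, _ in entries}) > 1
    }


if __name__ == "__main__":
    for addr, entries in contended_locks(SAMPLE).items():
        print(addr)
        for task, lock_class, site in entries:
            print(f"    {task:24} {lock_class:36} {site}")
```

Run over the full dump above, a grouping like this would list
ffff8a14161a39c8 under four crawl tasks (6494, 6502, 6503 and 6504), which
is the inode lock taken in the dedupe path. The sb_writers entry is also
shared by those four tasks, but it is a read-side lock that many writers
normally hold at once.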
> We would really like to use btrfs for this use case, because Nakivo,
> with this type of repository format, needs to be set to do a full backup
> every x days and does not do deduplication on its own.
>
>
> With regards,
> Libor
>