From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Nikolay Borisov <nborisov@suse.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: bisected: btrfs dedupe regression in v5.11-rc1: 3078d85c9a10 vfs: verify source area in vfs_dedupe_file_range_one()
Date: Tue, 14 Dec 2021 14:50:37 -0500
Message-ID: <Ybj1jVYu3MrUzVTD@hungrycats.org>
In-Reply-To: <fc395aed-2cbd-f6e5-d167-632c14a07188@suse.com>
On Tue, Dec 14, 2021 at 01:11:24PM +0200, Nikolay Borisov wrote:
>
>
> On 14.12.21 г. 1:12, Zygo Blaxell wrote:
> > On Mon, Dec 13, 2021 at 03:28:26PM +0200, Nikolay Borisov wrote:
> >> On 10.12.21 г. 20:34, Zygo Blaxell wrote:
> >>> I've been getting deadlocks in dedupe on btrfs since kernel 5.11, and
> >>> some bees users have reported it as well. I bisected to this commit:
> >>>
> >>> 3078d85c9a10 vfs: verify source area in vfs_dedupe_file_range_one()
> >>>
> >>> These kernels work for at least 18 hours:
> >>>
> >>> 5.10.83 (months)
> >>> 5.11.22 with 3078d85c9a10 reverted (36 hours)
> >>> btrfs misc-next 66dc4de326b0 with 3078d85c9a10 reverted
> >>>
> >>> These kernels lock up in 3 hours or less:
> >>>
> >>> 5.11.22
> >>> 5.12.19
> >>> 5.14.21
> >>> 5.15.6
> >>> btrfs for-next 279373dee83e
> >>>
> >>> All of the failing kernels include this commit; none of the non-failing
> >>> kernels include it.
> >>>
> >>> Kernel logs from the lockup:
> >>>
> >>> [19647.696042][ T3721] sysrq: Show Blocked State
> >>> [19647.697024][ T3721] task:btrfs-transacti state:D stack: 0 pid: 6161 ppid: 2 flags:0x00004000
> >>> [19647.698203][ T3721] Call Trace:
> >>> [19647.698608][ T3721] __schedule+0x388/0xaf0
> >>> [19647.699125][ T3721] schedule+0x68/0xe0
> >>> [19647.699615][ T3721] btrfs_commit_transaction+0x97c/0xbf0
> >>
> >> Can you run this through symbolize script as I'd like to understand
> >> where in transaction commit the sleep is happening.
> >
> > btrfs_commit_transaction+0x97c/0xbf0:
> >
> > btrfs_commit_transaction at fs/btrfs/transaction.c:2159 (discriminator 9)
> > 2154
> > 2155 ret = btrfs_run_delayed_items(trans);
> > 2156 if (ret)
> > 2157 goto cleanup_transaction;
> > 2158
> > >2159< wait_event(cur_trans->writer_wait,
> > 2160 extwriter_counter_read(cur_trans) == 0);
> > 2161
> > 2162 /* some pending stuffs might be added after the previous flush. */
> > 2163 ret = btrfs_run_delayed_items(trans);
> > 2164 if (ret)
> >
>
> So it seems there is an open transaction handle, thus commit can't
> continue and everything is stalled behind it. Would you be able to run the
> attached python script on a host which is stuck? It requires you to have
> debug symbols for the kernel installed as well as
> https://github.com/osandov/drgn/ which is a scriptable debugger. The
> easiest way would be to follow the instructions at
> https://drgn.readthedocs.io/en/latest/installation.html and just get it
> via pip.
>
>
> Once you have it installed run it by doing:
>
> "sudo drgn get-num-extwriters.py 310dd372-0fd1-4496-a232-0fb46ca4afd6"
>
> Where 310dd372-0fd1-4496-a232-0fb46ca4afd6 is the fsid as taken from
> 'blkid' which corresponds to the wedged fs.
[drum roll noises...]
[f79c1081-d81d-4abc-8b47-3b15bf2f93c5] num_extwriters is: 1
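
For anyone following along: that value lines up with the wait at
transaction.c:2159 quoted above. The commit sleeps until the external
writer count reads zero, so a count that stays at 1 keeps it blocked.
Below is a toy userspace model of that wait condition, not kernel code.
The class and method names (ToyTransaction, join, end, wait_for_writers)
are hypothetical and only illustrate the counter-plus-wakeup pattern.

```python
import threading


class ToyTransaction:
    """Toy stand-in for a transaction with an external-writer counter.

    Hypothetical illustration only; this is not how btrfs is implemented.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self.num_extwriters = 0

    def join(self):
        # A writer attaches to the transaction.
        with self._cond:
            self.num_extwriters += 1

    def end(self):
        # A writer detaches and wakes anyone waiting on the counter.
        with self._cond:
            self.num_extwriters -= 1
            self._cond.notify_all()

    def wait_for_writers(self, timeout):
        # Returns True if the counter reached zero within `timeout` seconds.
        with self._cond:
            return self._cond.wait_for(
                lambda: self.num_extwriters == 0, timeout=timeout
            )


trans = ToyTransaction()
trans.join()                                      # one writer never finishes
stuck = not trans.wait_for_writers(timeout=0.1)   # commit side times out
trans.end()                                       # writer finally releases
released = trans.wait_for_writers(timeout=0.1)    # now the wait completes
print(stuck, released)
```

The kernel's wait_event() has no timeout, which is why a handle that is
never released shows up as a hang and not as an error.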
> <snip>
> #!/bin/python3
>
> import uuid, sys
> from drgn.helpers.linux import list_for_each_entry
>
> if len(sys.argv) != 2:
>     print("Run with 'sudo drgn {} UUID'".format(sys.argv[0]))
>     exit()
>
> fsid = sys.argv[1]
> found = False
>
> btrfs_fs = prog['fs_uuids']
> for fs in list_for_each_entry("struct btrfs_fs_devices", btrfs_fs.address_of_(), "fs_list"):
>     current_fsid = uuid.UUID(bytes=fs.fsid.string_())
>     user_fsid = uuid.UUID(fsid)
>     if current_fsid.int == user_fsid.int:
>         transaction = fs.fs_info.running_transaction
>         found = True
>         print("[{}] num_extwriters is: {}".format(fsid, transaction.num_extwriters.value_()["counter"]))
>
> if not found:
>     print("Couldn't find matching UUID belonging to a BTRFS filesystem")
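
The fsid matching step in the script can be tried without drgn or a
wedged host. This is a minimal sketch, assuming the fsid is available as
16 raw bytes. The helper name fsid_matches is hypothetical, and the
sample bytes are built from the UUID printed above, not read from a
kernel.

```python
import uuid


def fsid_matches(raw_fsid: bytes, user_fsid: str) -> bool:
    """Compare a 16-byte fsid against a UUID string as printed by blkid."""
    return uuid.UUID(bytes=raw_fsid) == uuid.UUID(user_fsid)


# Sample bytes derived from the fsid in the output above (illustrative only).
raw = uuid.UUID("f79c1081-d81d-4abc-8b47-3b15bf2f93c5").bytes

print(fsid_matches(raw, "f79c1081-d81d-4abc-8b47-3b15bf2f93c5"))
print(fsid_matches(raw, "310dd372-0fd1-4496-a232-0fb46ca4afd6"))
```

Going through uuid.UUID on both sides means the dashes and letter case
in the user-supplied string do not affect the comparison.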