From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [GIT PULL] xfs: CIL and log scalability improvements
Date: Mon, 7 Jun 2021 08:11:19 +1000
Message-ID: <20210606221119.GW664593@dread.disaster.area>
In-Reply-To: <20210605021533.GH26380@locust>

On Fri, Jun 04, 2021 at 07:15:33PM -0700, Darrick J. Wong wrote:
> On Fri, Jun 04, 2021 at 07:03:54PM -0700, Darrick J. Wong wrote:
> > On Fri, Jun 04, 2021 at 01:29:28PM +1000, Dave Chinner wrote:
> > > Hi Darrick,
> > >
> > > Can you please pull the CIL and log improvements from the tag listed
> > > below?
> >
> > I tried that and threw the series at fstests, which crashed all VMs with
> > the following null pointer dereference:
> >
> > BUG: kernel NULL pointer dereference, address: 0000000000000000
> > #PF: supervisor read access in kernel mode
> > #PF: error_code(0x0000) - not-present page
> > PGD 0 P4D 0
> > Oops: 0000 [#1] PREEMPT SMP
> > CPU: 2 PID: 731060 Comm: mount Not tainted 5.13.0-rc4-djwx #rc4
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > RIP: 0010:xlog_cil_init+0x2f7/0x370 [xfs]
> > Code: b4 7e a0 bf 1c 00 00 00 e8 c6 3b 8d e0 85 c0 78 0c c6 05 13 7f 12 00 01 e9 7b fd ff ff 4
> > 2 48 c7 c6 f8 bf 7f a0 <48> 8b 39 e8 f0 24 04 00 31 ff b9 fc 05 00 00 48 c7 c2 d9 b3 7e a0
> > RSP: 0018:ffffc9000776bcd0 EFLAGS: 00010286
> > RAX: 00000000fffffff0 RBX: 0000000000000000 RCX: 0000000000000000
> > RDX: 00000000fffffff0 RSI: ffffffffa07fbff8 RDI: 00000000ffffffff
> > RBP: ffff888004cf3c00 R08: ffffffffa078fb40 R09: 0000000000000000
> > R10: 000000000000000c R11: 0000000000000048 R12: ffff888052810000
> > R13: 0000607f81c0b0f8 R14: ffff888004a09c00 R15: ffff888004a09c00
> > FS: 00007fd2e4486840(0000) GS:ffff88807e000000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000000000000 CR3: 00000000528e5004 CR4: 00000000001706a0
> > Call Trace:
> > xlog_alloc_log+0x51f/0x5f0 [xfs]
> > xfs_log_mount+0x55/0x340 [xfs]
> > xfs_mountfs+0x4e4/0x9f0 [xfs]
> > xfs_fs_fill_super+0x4dd/0x7a0 [xfs]
> > ? suffix_kstrtoint.constprop.0+0xe0/0xe0 [xfs]
> > get_tree_bdev+0x175/0x280
> > vfs_get_tree+0x1a/0x80
> > ? capable+0x2f/0x50
> > path_mount+0x6fb/0xa90
> > __x64_sys_mount+0x103/0x140
> > do_syscall_64+0x3a/0x70
> > entry_SYSCALL_64_after_hwframe+0x44/0xae
> > RIP: 0033:0x7fd2e46e8dde
> > Code: 48 8b 0d b5 80 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0
> > f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 82 80 0c 00 f7 d8 64 89 01 48
> >
> > I'm pretty sure that's due to:
> >
> > 	if (!xlog_cil_pcp_init) {
> > 		int	ret;
> >
> > 		ret = cpuhp_setup_state_nocalls(CPUHP_XFS_CIL_DEAD,
> > 						"xfs/cil_pcp:dead", NULL,
> > 						xlog_cil_pcp_dead);
> > 		if (ret < 0) {
> > 			xfs_warn(cil->xc_log->l_mp,
> > 	"Failed to initialise CIL hotplug, error %d. XFS is non-functional.",
> > 				ret);
> >
> > Because we haven't set cil->xc_log yet.
>
> And having now fixed that, I get tons of:
>
> XFS (sda): Failed to initialise CIL hotplug, error -16. XFS is non-functional.
> XFS: Assertion failed: 0, file: fs/xfs/xfs_log_cil.c, line: 1532
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 113983 at fs/xfs/xfs_message.c:112 assfail+0x3c/0x40 [xfs]
> Modules linked in: xfs libcrc32c ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_REDIR
>
> EBUSY??

Ok, so EBUSY implies a setup race: multiple filesystems are trying
to run the init code at the same time. That seems somewhat unlikely,
but regardless, what I'm going to do is move this setup/teardown to
the XFS module init functions rather than do it in the CIL.

i.e. turn this into generic XFS filesystem hotplug infrastructure
in xfs_super.c and call out to xlog_cil_pcp_dead() from there.

This way we are guaranteed a single init call when the module is
inserted, and a single destroy call when the module is removed. That
should solve all these issues.
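
To illustrate the difference, here is a minimal userspace sketch, not
kernel code. It assumes the hotplug state behaves like a single slot
that refuses a second registration with -EBUSY. slot_register(),
slot_unregister(), module_init_sketch() and the other names below are
hypothetical stand-ins, not the real cpuhp API and not the eventual
patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type for "a CPU has gone offline". */
typedef void (*dead_cb_t)(unsigned int cpu);

/* Assumption: one fixed slot, NULL while it is free. */
static dead_cb_t hotplug_slot;

/* Count of dead-CPU notifications delivered to the callback. */
static unsigned int dead_cpu_events;

/* Assumed behaviour: a second registration of the slot fails with -EBUSY. */
static int slot_register(dead_cb_t cb)
{
	if (hotplug_slot)
		return -EBUSY;
	hotplug_slot = cb;
	return 0;
}

static void slot_unregister(void)
{
	hotplug_slot = NULL;
}

/* Stand-in for the per-CPU CIL cleanup callback. */
static void cil_pcp_dead(unsigned int cpu)
{
	(void)cpu;
	dead_cpu_events++;
}

/* Per-mount setup: every mount that gets here tries to claim the slot. */
static int mount_registering_per_mount(void)
{
	return slot_register(cil_pcp_dead);
}

/* Module-level setup: the slot is claimed exactly once, at load time. */
static int module_init_sketch(void)
{
	return slot_register(cil_pcp_dead);
}

/* Module-level teardown: the slot is released exactly once, at unload. */
static void module_exit_sketch(void)
{
	assert(hotplug_slot == cil_pcp_dead);
	slot_unregister();
}

/* Mounts no longer register anything; they rely on the module-level setup. */
static int mount_after_module_init(void)
{
	return hotplug_slot ? 0 : -ENODEV;
}

/* Deliver a dead-CPU event to whatever callback is registered. */
static void simulate_cpu_dead(unsigned int cpu)
{
	if (hotplug_slot)
		hotplug_slot(cpu);
}
```

In this model, a second call to mount_registering_per_mount() fails
with -EBUSY, while any number of mounts succeed once
module_init_sketch() has run.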

How do you want me to handle these changes? Just send out the
replacement patches for the code that is already in the branch for
review, then resend a rebased pull-req after review? Or something
else? I don't really want to keep bombing the mailing list with 40
emails every time I need to get fixes for a single patch reviewed...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com