All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/4] xfs: prohibit fs freezing when using empty transactions
Date: Wed, 25 Mar 2020 15:53:53 +1100	[thread overview]
Message-ID: <20200325045353.GF10776@dread.disaster.area> (raw)
In-Reply-To: <158510667670.922633.9371387481128286027.stgit@magnolia>

On Tue, Mar 24, 2020 at 08:24:36PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> I noticed that fsfreeze can take a very long time to freeze an XFS if
> there happens to be a GETFSMAP caller running in the background.  I also
> happened to notice the following in dmesg:
> 
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 43492 at fs/xfs/xfs_super.c:853 xfs_quiesce_attr+0x83/0x90 [xfs]
> Modules linked in: xfs libcrc32c ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 ip_set_hash_ip ip_set_hash_net xt_tcpudp xt_set ip_set_hash_mac ip_set nfnetlink ip6table_filter ip6_tables bfq iptable_filter sch_fq_codel ip_tables x_tables nfsv4 af_packet [last unloaded: xfs]
> CPU: 2 PID: 43492 Comm: xfs_io Not tainted 5.6.0-rc4-djw #rc4
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1 04/01/2014
> RIP: 0010:xfs_quiesce_attr+0x83/0x90 [xfs]
> Code: 7c 07 00 00 85 c0 75 22 48 89 df 5b e9 96 c1 00 00 48 c7 c6 b0 2d 38 a0 48 89 df e8 57 64 ff ff 8b 83 7c 07 00 00 85 c0 74 de <0f> 0b 48 89 df 5b e9 72 c1 00 00 66 90 0f 1f 44 00 00 41 55 41 54
> RSP: 0018:ffffc900030f3e28 EFLAGS: 00010202
> RAX: 0000000000000001 RBX: ffff88802ac54000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff81e4a6f0 RDI: 00000000ffffffff
> RBP: ffff88807859f070 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000010 R12: 0000000000000000
> R13: ffff88807859f388 R14: ffff88807859f4b8 R15: ffff88807859f5e8
> FS:  00007fad1c6c0fc0(0000) GS:ffff88807e000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f0c7d237000 CR3: 0000000077f01003 CR4: 00000000001606a0
> Call Trace:
>  xfs_fs_freeze+0x25/0x40 [xfs]
>  freeze_super+0xc8/0x180
>  do_vfs_ioctl+0x70b/0x750
>  ? __fget_files+0x135/0x210
>  ksys_ioctl+0x3a/0xb0
>  __x64_sys_ioctl+0x16/0x20
>  do_syscall_64+0x50/0x1a0
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> These two things appear to be related.  The assertion trips when another
> thread initiates a fsmap request (which uses an empty transaction) after
> the freezer waited for m_active_trans to hit zero but before the the
> freezer executes the WARN_ON just prior to calling xfs_log_quiesce.
> 
> The lengthy delays in freezing happen because the freezer calls
> xfs_wait_buftarg to clean out the buffer lru list.  Meanwhile, the
> GETFSMAP caller is continuing to grab and release buffers, which means
> that it can take a very long time for the buffer lru list to empty out.
> 
> We fix both of these races by calling sb_start_write to obtain freeze
> protection while using empty transactions for GETFSMAP and for metadata
> scrubbing.  The other two users occur during mount, during which time we
> cannot fs freeze.

Makes sense. Minor nits:

> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/scrub/scrub.c |    4 ++++
>  fs/xfs/xfs_fsmap.c   |    4 ++++
>  fs/xfs/xfs_trans.c   |    5 +++++
>  3 files changed, 13 insertions(+)
> 
> 
> diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
> index f1775bb19313..a42bb66b335d 100644
> --- a/fs/xfs/scrub/scrub.c
> +++ b/fs/xfs/scrub/scrub.c
> @@ -168,6 +168,7 @@ xchk_teardown(
>  			xfs_irele(sc->ip);
>  		sc->ip = NULL;
>  	}
> +	sb_end_write(sc->mp->m_super);
>  	if (sc->flags & XCHK_REAPING_DISABLED)
>  		xchk_start_reaping(sc);
>  	if (sc->flags & XCHK_HAS_QUOTAOFFLOCK) {
> @@ -490,6 +491,9 @@ xfs_scrub_metadata(
>  	sc.ops = &meta_scrub_ops[sm->sm_type];
>  	sc.sick_mask = xchk_health_mask_for_scrub_type(sm->sm_type);
>  retry_op:
> +	/* Avoid conflicts with fs freeze. */
> +	sb_start_write(mp->m_super);
> +

Rather than saying something realtively meaningless like "avoid
conflicts with freeze", wouldn't it be better to say something like:

	/*
	 * If freeze runs concurrently with a scrub, the freeze can
	 * be delayed indefinitely as we walk the filesystem and
	 * iterate over metadata buffers. Freeze quiesces the log
	 * which waits for the buffer LRU to be emptied and that
	 * won't happen while checking runs.
	 */

>  	/* Set up for the operation. */
>  	error = sc.ops->setup(&sc, ip);
>  	if (error)
> diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
> index 918456ca29e1..2bb2cd1e63cf 100644
> --- a/fs/xfs/xfs_fsmap.c
> +++ b/fs/xfs/xfs_fsmap.c
> @@ -896,6 +896,9 @@ xfs_getfsmap(
>  	info.format_arg = arg;
>  	info.head = head;
>  
> +	/* Avoid conflicts with fs freeze. */
> +	sb_start_write(mp->m_super);
> +

And a similar comment here?

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2020-03-25  4:54 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-25  3:24 [PATCH 0/4] xfs: random fixes Darrick J. Wong
2020-03-25  3:24 ` [PATCH 1/4] xfs: prohibit fs freezing when using empty transactions Darrick J. Wong
2020-03-25  4:53   ` Dave Chinner [this message]
2020-03-25  5:56     ` Darrick J. Wong
2020-03-25  6:06   ` [PATCH v2 " Darrick J. Wong
2020-03-25 23:26     ` Dave Chinner
2020-04-01 11:22   ` [PATCH " Chandan Rajendra
2020-03-25  3:24 ` [PATCH 2/4] xfs: validate the realtime geometry in xfs_validate_sb_common Darrick J. Wong
2020-03-25  5:00   ` Dave Chinner
2020-03-25  5:53     ` Darrick J. Wong
2020-03-25  6:07   ` [PATCH v2 " Darrick J. Wong
2020-03-25 23:27     ` Dave Chinner
2020-04-01 13:49   ` [PATCH " Chandan Rajendra
2020-03-25  3:24 ` [PATCH 3/4] xfs: drop all altpath buffers at the end of the sibling check Darrick J. Wong
2020-03-25  5:04   ` Dave Chinner
2020-04-02  3:05   ` Chandan Rajendra
2020-03-25  3:24 ` [PATCH 4/4] xfs: directory bestfree check should release buffers Darrick J. Wong
2020-03-25  5:05   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200325045353.GF10776@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.