From: Quentin Casasnovas <quentin.casasnovas@oracle.com>
To: linux-xfs@vger.kernel.org
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Subject: XFS race on umount
Date: Fri, 10 Mar 2017 13:04:06 +0100
Message-ID: <20170310120406.GU16870@chrystal>
Hi Guys,
We've been using XFS recently on our build systems because we found that it
scales pretty well, and we have good use for the reflink feature :)
I think our setup is relatively unique: on every one of our build servers,
we mount hundreds of XFS filesystems from NBD devices in parallel. Our
build environments are stored on qcow2 images, connected with qemu-nbd,
and unmounted when the build is finished. Those qcow2 images live on an
NFS mount, which leads to some (expected) hiccups when reading/writing
blocks: sometimes the NBD layer will return errors to the block layer,
which in turn passes them on to XFS. It could be due to network
contention, very high load on the server, or any transient error really,
and in those cases XFS will normally force shut down the filesystem and
wait for a umount.
All of this is fine and is exactly the behaviour we'd expect. It turns
out, though, that we keep hitting what I think is a race condition between
umount and a forced shutdown from XFS itself, where I end up with a umount
process completely stuck in xfs_ail_push_all_sync():
[<ffffffff813d987e>] xfs_ail_push_all_sync+0x9e/0xe0
[<ffffffff813c20c7>] xfs_unmountfs+0x67/0x150
[<ffffffff813c5540>] xfs_fs_put_super+0x20/0x70
[<ffffffff811cba7a>] generic_shutdown_super+0x6a/0xf0
[<ffffffff811cbb2b>] kill_block_super+0x2b/0x80
[<ffffffff811cc067>] deactivate_locked_super+0x47/0x80
[<ffffffff811ccc19>] deactivate_super+0x49/0x70
[<ffffffff811e7b3e>] cleanup_mnt+0x3e/0x90
[<ffffffff811e7bdd>] __cleanup_mnt+0xd/0x10
[<ffffffff810e1b39>] task_work_run+0x79/0xa0
[<ffffffff810c2df7>] exit_to_usermode_loop+0x4f/0x75
[<ffffffff8100134b>] syscall_return_slowpath+0x5b/0x70
[<ffffffff81a2cbe3>] entry_SYSCALL_64_fastpath+0x96/0x98
[<ffffffffffffffff>] 0xffffffffffffffff
This is on a v4.10.1 kernel. I've had a look at xfs_ail_push_all_sync()
and I wonder if there isn't a potential lost-wakeup problem: I can't see
that we retest the condition after setting the current process to
TASK_UNINTERRUPTIBLE and before calling schedule() (though I know nothing
about XFS internals...).
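To illustrate what I mean, here's a minimal sketch of the generic
lost-wakeup pattern I have in mind -- 'done', 'wq', 'racy_wait' and
'safe_wait' are made-up names for illustration, this is not the actual
xfs_ail_push_all_sync() code:

#include <linux/sched.h>
#include <linux/wait.h>

static bool done;
static DECLARE_WAIT_QUEUE_HEAD(wq);

/*
 * Racy waiter (paired with a waker doing "done = true;
 * wake_up_process(task);"): if the waker runs between our check of
 * 'done' and set_current_state(), its wakeup hits a task that is
 * still TASK_RUNNING and is lost, and schedule() then sleeps forever.
 */
static void racy_wait(void)
{
	if (!done) {
		set_current_state(TASK_UNINTERRUPTIBLE);
		schedule();
	}
}

/*
 * Safe waiter: prepare_to_wait() queues us on 'wq' and sets the task
 * state before we re-test 'done', so a concurrent wake_up(&wq) issued
 * after the waker sets 'done' either finds us on the queue and makes
 * us runnable again, or we observe 'done' and never call schedule().
 */
static void safe_wait(void)
{
	DEFINE_WAIT(wait);

	for (;;) {
		prepare_to_wait(&wq, &wait, TASK_UNINTERRUPTIBLE);
		if (done)
			break;
		schedule();
	}
	finish_wait(&wq, &wait);
}

If the AIL push path follows something like the first pattern, a wakeup
issued by xfsaild between the condition check and schedule() would be
lost, which would match a umount stuck forever in D state.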
Here's an excerpt of the relevant dmesg messages, which very likely
appeared around the same time the umount process was started:
[29961.767707] block nbd74: Other side returned error (22)
[29961.837518] XFS (nbd74): metadata I/O error: block 0x6471ba0 ("xfs_trans_read_buf_map") error 5 numblks 32
[29961.838172] block nbd74: Other side returned error (22)
[29961.838179] block nbd74: Other side returned error (22)
[29961.838184] block nbd74: Other side returned error (22)
[29961.838203] block nbd74: Other side returned error (22)
[29961.838208] block nbd74: Other side returned error (22)
[29962.259551] XFS (nbd74): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
[29962.356376] XFS (nbd74): xfs_do_force_shutdown(0x8) called from line 3454 of file fs/xfs/xfs_inode.c. Return address = 0xffffffff813bf471
[29962.503003] XFS (nbd74): Corruption of in-memory data detected. Shutting down filesystem
[29963.166314] XFS (nbd74): Please umount the filesystem and rectify the problem(s)
I'm pretty sure the process isn't deadlocking on the spinlock, because it
doesn't burn any CPU and really is out of the scheduler pool. It should be
noted that when I noticed the hung umount process, I manually tried to
unmount the corresponding XFS mountpoint again and that was fine, though
it obviously didn't "unhang" the stuck umount process.
Any help would be appreciated :)
Thanks,
Quentin