cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Mark Syms <Mark.Syms@citrix.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 2/2] GFS2: Flush the GFS2 delete workqueue	before stopping the kernel threads
Date: Mon, 8 Oct 2018 20:53:23 +0000	[thread overview]
Message-ID: <19:35> (raw)
In-Reply-To: <1069219936.19268600.1539029401159.JavaMail.zimbra@redhat.com>

Thanks Bob,

We're having a look at the bit that Tim sent earlier to see if switching the call order makes the schedule_timeout never occur.

As Steven said, it would be nice if we didn't have to worry about waiting for "dead" glocks to be cleaned up but that would require some considerable care to ensure that we don't just get the opposite race and get a newly created glock freed and deleted.

Mark
________________________________
From: Bob Peterson <rpeterso@redhat.com>
Sent: Monday, 8 October 2018 21:10
To: Mark Syms <Mark.Syms@citrix.com>
CC: cluster-devel at redhat.com,Ross Lagerwall <ross.lagerwall@citrix.com>,Tim Smith <tim.smith@citrix.com>
Subject: Re: [Cluster-devel] [PATCH 2/2] GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads


----- Original Message -----
> From: Tim Smith <tim.smith@citrix.com>
>
> Flushing the workqueue can cause operations to happen which might
> call gfs2_log_reserve(), or get stuck waiting for locks taken by such
> operations.  gfs2_log_reserve() can io_schedule(). If this happens, it
> will never wake because the only thing which can wake it is gfs2_logd()
> which was already stopped.
>
> This causes umount of a gfs2 filesystem to wedge permanently if, for
> example, the umount immediately follows a large delete operation.
>
> When this occured, the following stack trace was obtained from the
> umount command
>
> [<ffffffff81087968>] flush_workqueue+0x1c8/0x520
> [<ffffffffa0666e29>] gfs2_make_fs_ro+0x69/0x160 [gfs2]
> [<ffffffffa0667279>] gfs2_put_super+0xa9/0x1c0 [gfs2]
> [<ffffffff811b7edf>] generic_shutdown_super+0x6f/0x100
> [<ffffffff811b7ff7>] kill_block_super+0x27/0x70
> [<ffffffffa0656a71>] gfs2_kill_sb+0x71/0x80 [gfs2]
> [<ffffffff811b792b>] deactivate_locked_super+0x3b/0x70
> [<ffffffff811b79b9>] deactivate_super+0x59/0x60
> [<ffffffff811d2998>] cleanup_mnt+0x58/0x80
> [<ffffffff811d2a12>] __cleanup_mnt+0x12/0x20
> [<ffffffff8108c87d>] task_work_run+0x7d/0xa0
> [<ffffffff8106d7d9>] exit_to_usermode_loop+0x73/0x98
> [<ffffffff81003961>] syscall_return_slowpath+0x41/0x50
> [<ffffffff815a594c>] int_ret_from_sys_call+0x25/0x8f
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> Signed-off-by: Tim Smith <tim.smith@citrix.com>
> Signed-off-by: Mark Syms <mark.syms@citrix.com>
> ---
Hi Mark, Tim, and all,

I pushed patch 2/2 upstream. For now I'll hold off on 1/2 but keep it
on my queue, pending our investigation.
https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/fs/gfs2?h=for-next&id=b7f5a2cd27b76e96fdc6d77b060dfdd877c9d0a9

Regards,

Bob Peterson
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20181008/a5fe00da/attachment.htm>

  reply	other threads:[~2018-10-08 20:53 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-08 12:36 [Cluster-devel] [PATCH 0/2] GFS2 locking patches Mark Syms
2018-10-08 12:36 ` [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock Mark Syms
2018-10-08 12:56   ` Steven Whitehouse
2018-10-08 12:59     ` Mark Syms
2018-10-08 13:03       ` Steven Whitehouse
2018-10-08 13:10         ` Tim Smith
2018-10-08 13:13           ` Steven Whitehouse
2018-10-08 13:26             ` Tim Smith
2018-10-09  8:13               ` Mark Syms
2018-10-09  8:41                 ` Steven Whitehouse
2018-10-09  8:50                   ` Mark Syms
2018-10-09 12:34               ` Andreas Gruenbacher
2018-10-09 14:00                 ` Tim Smith
2018-10-09 14:47                   ` Andreas Gruenbacher
2018-10-09 15:34                     ` Tim Smith
2018-10-08 13:25         ` Bob Peterson
2018-10-08 12:36 ` [Cluster-devel] [PATCH 2/2] GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads Mark Syms
2018-10-08 12:57   ` Steven Whitehouse
2018-10-08 20:10   ` Bob Peterson
2018-10-08 20:53     ` Mark Syms [this message]
     [not found] ` <1603192.QCZcUHDx0E@dhcp-3-135.uk.xensource.com>
     [not found]   ` <CAHc6FU7AyDQ=yEj9WtN+aqtf7jfWs0TGh=Og0fQFVHyYMKHacA@mail.gmail.com>
2018-10-09 15:08     ` [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock Tim Smith
2018-10-10  8:22       ` Tim Smith
2018-10-11  8:14         ` Mark Syms
2018-10-11 19:28           ` Mark Syms
     [not found]           ` <5bbfa461.1c69fb81.1242f.59a7SMTPIN_ADDED_BROKEN@mx.google.com>
2019-03-06 16:14             ` Andreas Gruenbacher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='19:35' \
    --to=mark.syms@citrix.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).