cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Mark Syms <Mark.Syms@citrix.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock
Date: Thu, 11 Oct 2018 19:28:28 +0000	[thread overview]
Message-ID: <32:45> (raw)
In-Reply-To: <a5a909bcc95941808e7b36eac18ee8df@AMSPEX02CL02.citrite.net>

Also, the prepare to wait call does so in non exclusive mode so switching the parameter on the wakeup would have no effect as non exclusive waiters are always woken anyway.

So, there is some other reason why we decide to yield but never get woken.

Mark.
________________________________
From: Mark Syms <Mark.Syms@citrix.com>
Sent: Thursday, 11 October 2018 09:14
To: Tim Smith <tim.smith@citrix.com>,cluster-devel at redhat.com
CC: Andreas Gruenbacher <agruenba@redhat.com>,Ross Lagerwall <ross.lagerwall@citrix.com>
Subject: RE: [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock


And, sadly, the change to wait all also doesn't prevent the schedule_timeout from occurring. Something more subtle going on obviously.

        Mark.

-----Original Message-----
From: Tim Smith <tim.smith@citrix.com>
Sent: 10 October 2018 09:23
To: cluster-devel@redhat.com
Cc: Andreas Gruenbacher <agruenba@redhat.com>; Ross Lagerwall <ross.lagerwall@citrix.com>; Mark Syms <Mark.Syms@citrix.com>
Subject: Re: [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock

On Tuesday, 9 October 2018 16:08:10 BST Tim Smith wrote:
> On Tuesday, 9 October 2018 14:00:34 BST Andreas Gruenbacher wrote:
> > On Tue, 9 Oct 2018 at 14:46, Tim Smith <tim.smith@citrix.com> wrote:
> > > On Tuesday, 9 October 2018 13:34:47 BST you wrote:
> > > > There must be another reason for the missed wake-up. I'll have
> > > > to study the code some more.
> > >
> > > I think it needs to go into gfs2_glock_dealloc(), in such a way as
> > > to avoid that problem. Currently working out a patch to do that.
> >
> > That doesn't sound right: find_insert_glock is waiting for the glock
> > to be removed from the rhashtable. In gfs2_glock_free, we remove the
> > glock from the rhashtable and then we do the wake-up. Delaying the
> > wakeup further isn't going to help (but it might hide the problem).
>
> The only way we can get the problem we're seeing is if we get an
> effective order of
>
> T0: wake_up_glock()
> T1: prepare_to_wait()
> T1: schedule()
>
> so clearly there's a way for that to happen. Any other order and
> either
> schedule() doesn't sleep or it gets woken.
>
> The only way I can see at the moment to ensure that wake_up_glock()
> *cannot* get called until after prepare_to_wait() is to delay it until
> the read_side critical sections are done, and the first place that's
> got that property is the start of gfs2_glock_dealloc(), unless we want
> to add synchronise_rcu() to gfs2_glock_free() and I'm guessing there's
> a reason it's using
> call_rcu() instead.
>
> I'll keep thinking about it.

OK, we have a result that this definitely *isn't* the right answer (still getting the wakeup happening). This is lends more weight to the idea that there are multiple waiters, so we'll try your patch.

--
Tim Smith <tim.smith@citrix.com>


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20181011/ef6af022/attachment.htm>

  reply	other threads:[~2018-10-11 19:28 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-08 12:36 [Cluster-devel] [PATCH 0/2] GFS2 locking patches Mark Syms
2018-10-08 12:36 ` [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock Mark Syms
2018-10-08 12:56   ` Steven Whitehouse
2018-10-08 12:59     ` Mark Syms
2018-10-08 13:03       ` Steven Whitehouse
2018-10-08 13:10         ` Tim Smith
2018-10-08 13:13           ` Steven Whitehouse
2018-10-08 13:26             ` Tim Smith
2018-10-09  8:13               ` Mark Syms
2018-10-09  8:41                 ` Steven Whitehouse
2018-10-09  8:50                   ` Mark Syms
2018-10-09 12:34               ` Andreas Gruenbacher
2018-10-09 14:00                 ` Tim Smith
2018-10-09 14:47                   ` Andreas Gruenbacher
2018-10-09 15:34                     ` Tim Smith
2018-10-08 13:25         ` Bob Peterson
2018-10-08 12:36 ` [Cluster-devel] [PATCH 2/2] GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads Mark Syms
2018-10-08 12:57   ` Steven Whitehouse
2018-10-08 20:10   ` Bob Peterson
2018-10-08 20:53     ` Mark Syms
     [not found] ` <1603192.QCZcUHDx0E@dhcp-3-135.uk.xensource.com>
     [not found]   ` <CAHc6FU7AyDQ=yEj9WtN+aqtf7jfWs0TGh=Og0fQFVHyYMKHacA@mail.gmail.com>
2018-10-09 15:08     ` [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock Tim Smith
2018-10-10  8:22       ` Tim Smith
2018-10-11  8:14         ` Mark Syms
2018-10-11 19:28           ` Mark Syms [this message]
     [not found]           ` <5bbfa461.1c69fb81.1242f.59a7SMTPIN_ADDED_BROKEN@mx.google.com>
2019-03-06 16:14             ` Andreas Gruenbacher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='32:45' \
    --to=mark.syms@citrix.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).