cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Andreas Gruenbacher <agruenba@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock
Date: Tue, 9 Oct 2018 14:34:47 +0200	[thread overview]
Message-ID: <CAHc6FU6+t-5WdjcApv7rZmTZ5Zg_hxNgmCtNUC3b_m0HrUT-_A@mail.gmail.com> (raw)
In-Reply-To: <21999012.kCAuXqbhYe@dhcp-3-135.uk.xensource.com>

On Mon, 8 Oct 2018 at 15:27, Tim Smith <tim.smith@citrix.com> wrote:
>
> On Monday, 8 October 2018 14:13:10 BST Steven Whitehouse wrote:
> > Hi,
> >
> > On 08/10/18 14:10, Tim Smith wrote:
> > > On Monday, 8 October 2018 14:03:24 BST Steven Whitehouse wrote:
> > >> On 08/10/18 13:59, Mark Syms wrote:
> > >>> That sounds entirely reasonable so long as you are absolutely sure that
> > >>> nothing is ever going to mess with that glock, we erred on the side of
> > >>> more caution not knowing whether it would be guaranteed safe or not.
> > >>>
> > >>> Thanks,
> > >>>
> > >>>   Mark
> > >>
> > >> We should have a look at the history to see how that wait got added.
> > >> However the "dead" flag here means "don't touch this glock" and is there
> > >> so that we can separate the marking dead from the actual removal from
> > >> the list (which simplifies the locking during the scanning procedures)
> > >
> > > You beat me to it :-)
> > >
> > > I think there might be a bit of a problem inserting a new entry with the
> > > same name before the old entry has been fully destroyed (or at least
> > > removed), which would be why the schedule() is there.
> >
> > If the old entry is marked dead, all future lookups should ignore it. We
> > should only have a single non-dead entry at a time, but that doesn't
> > seem like it should need us to wait for it.
>
> On the second call we do have the new glock to insert as arg2, so we could try
> to swap them cleanly, yeah.
>
> > If we do discover that the wait is really required, then it sounds like
> > as you mentioned above there is a lost wakeup, and that must presumably
> > be on a code path that sets the dead flag and then fails to send a wake
> > up later on. If we can drop the wait in the first place, that seems like
> > a better plan,
>
> Ooooh, I wonder if these two lines:
>
>         wake_up_glock(gl);
>         call_rcu(&gl->gl_rcu, gfs2_glock_dealloc);
>
> in gfs2_glock_free() are the wrong way round?

No, what's important here is that the wake-up happens after the glock
being freed has been removed from the rhashtable, and that's the case
here. wake_up_glock actually accesses the glock, which may no longer
be around after the call_rcu, so swapping the two lines would
introduce a use-after-free bug.

There must be another reason for the missed wake-up. I'll have to
study the code some more.

Thanks,
Andreas



  parent reply	other threads:[~2018-10-09 12:34 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-08 12:36 [Cluster-devel] [PATCH 0/2] GFS2 locking patches Mark Syms
2018-10-08 12:36 ` [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock Mark Syms
2018-10-08 12:56   ` Steven Whitehouse
2018-10-08 12:59     ` Mark Syms
2018-10-08 13:03       ` Steven Whitehouse
2018-10-08 13:10         ` Tim Smith
2018-10-08 13:13           ` Steven Whitehouse
2018-10-08 13:26             ` Tim Smith
2018-10-09  8:13               ` Mark Syms
2018-10-09  8:41                 ` Steven Whitehouse
2018-10-09  8:50                   ` Mark Syms
2018-10-09 12:34               ` Andreas Gruenbacher [this message]
2018-10-09 14:00                 ` Tim Smith
2018-10-09 14:47                   ` Andreas Gruenbacher
2018-10-09 15:34                     ` Tim Smith
2018-10-08 13:25         ` Bob Peterson
2018-10-08 12:36 ` [Cluster-devel] [PATCH 2/2] GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads Mark Syms
2018-10-08 12:57   ` Steven Whitehouse
2018-10-08 20:10   ` Bob Peterson
2018-10-08 20:53     ` Mark Syms
     [not found] ` <1603192.QCZcUHDx0E@dhcp-3-135.uk.xensource.com>
     [not found]   ` <CAHc6FU7AyDQ=yEj9WtN+aqtf7jfWs0TGh=Og0fQFVHyYMKHacA@mail.gmail.com>
2018-10-09 15:08     ` [Cluster-devel] [PATCH 1/2] GFS2: use schedule timeout in find insert glock Tim Smith
2018-10-10  8:22       ` Tim Smith
2018-10-11  8:14         ` Mark Syms
2018-10-11 19:28           ` Mark Syms
     [not found]           ` <5bbfa461.1c69fb81.1242f.59a7SMTPIN_ADDED_BROKEN@mx.google.com>
2019-03-06 16:14             ` Andreas Gruenbacher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHc6FU6+t-5WdjcApv7rZmTZ5Zg_hxNgmCtNUC3b_m0HrUT-_A@mail.gmail.com \
    --to=agruenba@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).