All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Chris Mason <chris.mason@fusionio.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] futex: Remove requirement for lock_page in get_futex_key
Date: Fri, 1 Nov 2013 09:24:46 +0000	[thread overview]
Message-ID: <20131101092446.GU2400@suse.de> (raw)
In-Reply-To: <20131029184827.10719.27487@localhost.localdomain>

On Tue, Oct 29, 2013 at 02:48:27PM -0400, Chris Mason wrote:
> Quoting Mel Gorman (2013-10-29 13:38:14)
> > Thomas Gleixner and Peter Zijlstra discussed off-list that real-time users
> > currently have a problem with the page lock being contended for unbounded
> > periods of time during futex operations. The three of us discussed the
> > possibiltity that the page lock is unnecessary in this case because we are
> > not concerned with the usual races with reclaim and page cache updates. For
> > anonymous pages, the associated futex object is the mm_struct which does
> > not require the page lock. For inodes, we should be able to check under
> > RCU read lock if the page mapping is still valid to take a reference to
> > the inode.  This just leaves one rare race that requires the page lock
> > in the slow path. This patch does not completely eliminate the page lock
> > but it should reduce contention in the majority of cases.
> > 
> > Patch boots and futextest did not explode but I did no comparison
> > performance tests. Thomas, do you have details of the workload that
> > drove you to examine this problem? Alternatively, can you test it and
> > see does it help you? I added Chris to the To list because he mentioned
> > that some filesystems might already be doing tricks similar to this
> > patch that are worth copying.
> 
> Unfortunately, all the special cases I see in the filesystems either
> have an inode ref or are trylocking the page to safety.
> 

Ok, at the time of the futex call there is an implicit ref due to the
mapping but it can be torn away underneath us at any time. I *think* I
have the right ordering to not make a mistake in this case but more eyes
the better.

> XFS is a special case because they have their own inode cache, but by my
> reading they are still using i_count and free by rcu.
> 

Good, that's what I expected.

> The iput in here is a little tricky:
> 
> > 
> > +               /* Should be impossible but lets be paranoid for now */
> > +               if (WARN_ON(inode->i_mapping != mapping)) {
> > +                       rcu_read_unlock();
> > +                       iput(inode);
> > +                       put_page(page_head);
> > +                       goto again;
> > +               }
> > +
> 
> Once you call iput, you add the potential to call the filesystem unlink
> operation if i_nlink had gone to zero.  This shouldn't be a problem
> since you've dropped the rcu lock, but just for fun I'd move the
> put_page up a line.
> 
> Or, change it to a BUG_ON instead, it really should be impossible.

I'll do that. It'll blow up with the RCU lock still held so the system
is going to have a bad day but we're already in hell at this point.

Thanks Chris

-- 
Mel Gorman
SUSE Labs

  parent reply	other threads:[~2013-11-01  9:24 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-10-29 17:38 [RFC PATCH] futex: Remove requirement for lock_page in get_futex_key Mel Gorman
2013-10-29 18:48 ` Chris Mason
2013-10-30  8:57   ` Peter Zijlstra
2013-11-01  9:24   ` Mel Gorman [this message]
2013-10-30  8:45 ` Thomas Gleixner
2013-10-30  9:26   ` Mel Gorman
2014-02-11 15:51     ` Thomas Gleixner
2015-02-24 16:55       ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131101092446.GU2400@suse.de \
    --to=mgorman@suse.de \
    --cc=chris.mason@fusionio.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.