All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <gregkh@linuxfoundation.org>
To: Maximilian Heyne <mheyne@amazon.de>
Cc: stable@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Michael Larabel <Michael@michaellarabel.com>,
	Matthieu Baerts <matthieu.baerts@tessares.net>,
	Dave Chinner <david@fromorbit.com>,
	Matthew Wilcox <willy@infradead.org>, Chris Mason <clm@fb.com>,
	Jan Kara <jack@suse.cz>, Amir Goldstein <amir73il@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] mm: allow a controlled amount of unfairness in the page lock
Date: Sun, 27 Aug 2023 10:54:03 +0200	[thread overview]
Message-ID: <2023082731-crunching-second-ad89@gregkh> (raw)
In-Reply-To: <20230823061642.76949-1-mheyne@amazon.de>

On Wed, Aug 23, 2023 at 06:16:42AM +0000, Maximilian Heyne wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> 
> [ upstream commit 5ef64cc8987a9211d3f3667331ba3411a94ddc79 ]
> 
> Commit 2a9127fcf229 ("mm: rewrite wait_on_page_bit_common() logic") made
> the page locking entirely fair, in that if a waiter came in while the
> lock was held, the lock would be transferred to the lockers strictly in
> order.
> 
> That was intended to finally get rid of the long-reported watchdog
> failures that involved the page lock under extreme load, where a process
> could end up waiting essentially forever, as other page lockers stole
> the lock from under it.
> 
> It also improved some benchmarks, but it ended up causing huge
> performance regressions on others, simply because fair lock behavior
> doesn't end up giving out the lock as aggressively, causing better
> worst-case latency, but potentially much worse average latencies and
> throughput.
> 
> Instead of reverting that change entirely, this introduces a controlled
> amount of unfairness, with a sysctl knob to tune it if somebody needs
> to.  But the default value should hopefully be good for any normal load,
> allowing a few rounds of lock stealing, but enforcing the strict
> ordering before the lock has been stolen too many times.
> 
> There is also a hint from Matthieu Baerts that the fair page coloring
> may end up exposing an ABBA deadlock that is hidden by the usual
> optimistic lock stealing, and while the unfairness doesn't fix the
> fundamental issue (and I'm still looking at that), it avoids it in
> practice.
> 
> The amount of unfairness can be modified by writing a new value to the
> 'sysctl_page_lock_unfairness' variable (default value of 5, exposed
> through /proc/sys/vm/page_lock_unfairness), but that is hopefully
> something we'd use mainly for debugging rather than being necessary for
> any deep system tuning.
> 
> This whole issue has exposed just how critical the page lock can be, and
> how contended it gets under certain locks.  And the main contention
> doesn't really seem to be anything related to IO (which was the origin
> of this lock), but for things like just verifying that the page file
> mapping is stable while faulting in the page into a page table.
> 
> Link: https://lore.kernel.org/linux-fsdevel/ed8442fd-6f54-dd84-cd4a-941e8b7ee603@MichaelLarabel.com/
> Link: https://www.phoronix.com/scan.php?page=article&item=linux-50-59&num=1
> Link: https://lore.kernel.org/linux-fsdevel/c560a38d-8313-51fb-b1ec-e904bd8836bc@tessares.net/
> Reported-and-tested-by: Michael Larabel <Michael@michaellarabel.com>
> Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Chris Mason <clm@fb.com>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Amir Goldstein <amir73il@gmail.com>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> CC: <stable@vger.kernel.org> # 5.4
> [ mheyne: fixed contextual conflict in mm/filemap.c due to missing
>   commit c7510ab2cf5c ("mm: abstract out wake_page_match() from
>   wake_page_function()"). Added WQ_FLAG_CUSTOM due to missing commit
>   7f26482a872c ("locking/percpu-rwsem: Remove the embedded rwsem") ]
> Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
> ---
>  include/linux/mm.h   |   2 +
>  include/linux/wait.h |   2 +
>  kernel/sysctl.c      |   8 +++
>  mm/filemap.c         | 160 ++++++++++++++++++++++++++++++++++---------
>  4 files changed, 141 insertions(+), 31 deletions(-)

This was also backported here:
	https://lore.kernel.org/r/20230821222547.483583-1-saeed.mirzamohammadi@oracle.com
before yours.

I took that one, can you verify that it is identical to yours and works
properly as well?

thanks,

greg k-h

  reply	other threads:[~2023-08-27  8:55 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-23  6:16 [PATCH] mm: allow a controlled amount of unfairness in the page lock Maximilian Heyne
2023-08-27  8:54 ` Greg KH [this message]
2023-08-28 10:14   ` Maximilian Heyne
2023-08-30  6:26     ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2023082731-crunching-second-ad89@gregkh \
    --to=gregkh@linuxfoundation.org \
    --cc=Michael@michaellarabel.com \
    --cc=akpm@linux-foundation.org \
    --cc=amir73il@gmail.com \
    --cc=bsegall@google.com \
    --cc=clm@fb.com \
    --cc=david@fromorbit.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=jack@suse.cz \
    --cc=juri.lelli@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthieu.baerts@tessares.net \
    --cc=mcgrof@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mheyne@amazon.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vincent.guittot@linaro.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.