[PATCH AUTOSEL 6.0 03/18] rcu: Back off upon fill_page_cache_func() allocation failure

public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed

From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Michal Hocko <mhocko@suse.com>,
	Uladzislau Rezki <urezki@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Neeraj Upadhyay <quic_neeraju@quicinc.com>,
	Josh Triplett <josh@joshtriplett.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Sasha Levin <sashal@kernel.org>,
	rcu@vger.kernel.org
Subject: [PATCH AUTOSEL 6.0 03/18] rcu: Back off upon fill_page_cache_func() allocation failure
Date: Sun,  9 Oct 2022 16:51:20 -0400	[thread overview]
Message-ID: <20221009205136.1201774-3-sashal@kernel.org> (raw)
In-Reply-To: <20221009205136.1201774-1-sashal@kernel.org>

From: Michal Hocko <mhocko@suse.com>

[ Upstream commit 093590c16b447f53e66771c8579ae66c96f6ef61 ]

The fill_page_cache_func() function allocates couple of pages to store
kvfree_rcu_bulk_data structures. This is a lightweight (GFP_NORETRY)
allocation which can fail under memory pressure. The function will,
however keep retrying even when the previous attempt has failed.

This retrying is in theory correct, but in practice the allocation is
invoked from workqueue context, which means that if the memory reclaim
gets stuck, these retries can hog the worker for quite some time.
Although the workqueues subsystem automatically adjusts concurrency, such
adjustment is not guaranteed to happen until the worker context sleeps.
And the fill_page_cache_func() function's retry loop is not guaranteed
to sleep (see the should_reclaim_retry() function).

And we have seen this function cause workqueue lockups:

kernel: BUG: workqueue lockup - pool cpus=93 node=1 flags=0x1 nice=0 stuck for 32s!
[...]
kernel: pool 74: cpus=37 node=0 flags=0x1 nice=0 hung=32s workers=2 manager: 2146
kernel:   pwq 498: cpus=249 node=1 flags=0x1 nice=0 active=4/256 refcnt=5
kernel:     in-flight: 1917:fill_page_cache_func
kernel:     pending: dbs_work_handler, free_work, kfree_rcu_monitor

Originally, we thought that the root cause of this lockup was several
retries with direct reclaim, but this is not yet confirmed.  Furthermore,
we have seen similar lockups without any heavy memory pressure.  This
suggests that there are other factors contributing to these lockups.
However, it is not really clear that endless retries are desireable.

So let's make the fill_page_cache_func() function back off after
allocation failure.

Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/rcu/tree.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 79aea7df4345..eb435941e92f 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3183,15 +3183,16 @@ static void fill_page_cache_func(struct work_struct *work)
 		bnode = (struct kvfree_rcu_bulk_data *)
 			__get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);

-		if (bnode) {
-			raw_spin_lock_irqsave(&krcp->lock, flags);
-			pushed = put_cached_bnode(krcp, bnode);
-			raw_spin_unlock_irqrestore(&krcp->lock, flags);
+		if (!bnode)
+			break;

-			if (!pushed) {
-				free_page((unsigned long) bnode);
-				break;
-			}
+		raw_spin_lock_irqsave(&krcp->lock, flags);
+		pushed = put_cached_bnode(krcp, bnode);
+		raw_spin_unlock_irqrestore(&krcp->lock, flags);
+
+		if (!pushed) {
+			free_page((unsigned long) bnode);
+			break;
 		}
 	}

-- 
2.35.1

next prev parent reply	other threads:[~2022-10-09 20:51 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-09 20:51 [PATCH AUTOSEL 6.0 01/18] fs: dlm: fix race in lowcomms Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 02/18] rcu: Avoid triggering strict-GP irq-work when RCU is idle Sasha Levin
2022-10-09 20:51 ` Sasha Levin [this message]
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 04/18] rcu-tasks: Convert RCU_LOCKDEP_WARN() to WARN_ONCE() Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 05/18] rcu-tasks: Ensure RCU Tasks Trace loops have quiescent states Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 06/18] cpufreq: amd_pstate: fix wrong lowest perf fetch Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 07/18] ACPI: video: Add Toshiba Satellite/Portege Z830 quirk Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 08/18] fortify: Fix __compiletime_strlen() under UBSAN_BOUNDS_LOCAL Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 09/18] ACPI: tables: FPDT: Don't call acpi_os_map_memory() on invalid phys address Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 10/18] cpufreq: intel_pstate: Add Tigerlake support in no-HWP mode Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 11/18] MIPS: BCM47XX: Cast memcmp() of function to (void *) Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 12/18] powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 13/18] thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 14/18] ARM: decompressor: Include .data.rel.ro.local Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 15/18] ACPI: x86: Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 16/18] x86/entry: Work around Clang __bdos() bug Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 17/18] NFSD: Return nfserr_serverfault if splice_ok but buf->pages have data Sasha Levin
2022-10-09 20:51 ` [PATCH AUTOSEL 6.0 18/18] NFSD: fix use-after-free on source server when doing inter-server copy Sasha Levin

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:79aea7df434 dfblob:eb435941e92 )
 OR (
bs:"[PATCH AUTOSEL 6.0 03/18] rcu: Back off upon fill_page_cache_func() allocation failure" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221009205136.1201774-3-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=frederic@kernel.org \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhocko@suse.com \
    --cc=paulmck@kernel.org \
    --cc=quic_neeraju@quicinc.com \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    --cc=urezki@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox