From: "Luis Claudio R. Goncalves" <lclaudio@uudg.org>
To: linux-rt-users@vger.kernel.org
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>,
Ping Fang <pifang@redhat.com>,
Clark Williams <williams@redhat.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: [PATCH V3 RT] mm/zswap: Do not disable preemption in zswap_frontswap_store()
Date: Tue, 25 Jun 2019 11:28:04 -0300
Message-ID: <20190625142804.GH4902@uudg.org>
Zswap triggers "BUG: scheduling while atomic" by blocking on an rt_spin_lock()
(taken in zbud_alloc(), per the trace below) with preemption disabled.
Preemption is disabled by get_cpu_var() in zswap_frontswap_store() to protect
access to the zswap_dstmem percpu variable.

Use get_locked_var() to protect the percpu zswap_dstmem variable instead,
keeping the code preemptible.

As get_cpu_ptr() also disables preemption, replace it with this_cpu_ptr() and
remove the counterpart put_cpu_ptr().
Steps to Reproduce:
1. # grubby --args "zswap.enabled=1" --update-kernel DEFAULT
2. # reboot
3. Calculate the amount of memory to be used by the test:
---> grep MemAvailable /proc/meminfo
---> Add 25% ~ 50% to that value
4. # stress --vm 1 --vm-bytes ${MemAvailable+25%} --timeout 240s
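The `${MemAvailable+25%}` in step 4 is a placeholder, not literal shell syntax. A minimal sketch of one way to compute the value follows. The helper name `vm_kib` is hypothetical and not part of the original report; the sketch assumes /proc/meminfo reports MemAvailable in kB and that stress accepts a K size suffix.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): print MemAvailable
# increased by a given percentage, in KiB.
#   $1 = extra percentage, e.g. 25 or 50
#   $2 = meminfo file to read (defaults to /proc/meminfo)
vm_kib() {
    awk -v pct="$1" \
        '/^MemAvailable:/ { printf "%d\n", $2 * (100 + pct) / 100 }' \
        "${2:-/proc/meminfo}"
}
```

With this helper, step 4 could be run as: stress --vm 1 --vm-bytes "$(vm_kib 25)K" --timeout 240s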
Usually, in less than 5 minutes the backtrace listed below appears, followed
by a kernel panic:
[26747.628830] 014: BUG: scheduling while atomic: kswapd1/181/0x00000002
[26747.834904] 014:
[26747.839778] 014: Preemption disabled at:
[26747.845598] 014: [<ffffffff8b2a6cda>] zswap_frontswap_store+0x21a/0x6e1
[26747.856103] 014:
[26747.864619] 014: Kernel panic - not syncing: scheduling while atomic
[26747.872859] 014: CPU: 14 PID: 181 Comm: kswapd1 Kdump: loaded Not tainted 5.0.14-rt9 #1
[26747.880832] 014: Hardware name: AMD Pence/Pence, BIOS WPN2321X_Weekly_12_03_21 03/19/2012
[26747.888977] 014: Call Trace:
[26747.891847] 014: dump_stack+0x85/0xc0
[26747.895584] 014: panic+0x106/0x2a7
[26747.899063] 014: ? zswap_frontswap_store+0x21a/0x6e1
[26747.904093] 014: __schedule_bug.cold+0x3f/0x51
[26747.908605] 014: __schedule+0x5cb/0x6f0
[26747.912513] 014: schedule+0x43/0xd0
[26747.916073] 014: rt_spin_lock_slowlock_locked+0x114/0x2b0
[26747.921538] 014: rt_spin_lock_slowlock+0x51/0x80
[26747.926225] 014: zbud_alloc+0x1da/0x2d0
[26747.930132] 014: zswap_frontswap_store+0x31a/0x6e1
[26747.934992] 014: __frontswap_store+0xab/0x130
[26747.939417] 014: swap_writepage+0x39/0x70
[26747.943496] 014: pageout.isra.0+0xe3/0x320
[26747.947666] 014: shrink_page_list+0xa8e/0xd10
[26747.952092] 014: shrink_inactive_list+0x251/0x840
[26747.956866] 014: shrink_node_memcg+0x213/0x770
[26747.961379] 014: ? preempt_count_sub+0x98/0xe0
[26747.965891] 014: ? preempt_count_sub+0x98/0xe0
[26747.970402] 014: shrink_node+0xd9/0x450
[26747.974309] 014: balance_pgdat+0x2d5/0x510
[26747.978477] 014: kswapd+0x218/0x470
[26747.982038] 014: ? finish_wait+0x70/0x70
[26747.986033] 014: kthread+0xfb/0x130
[26747.989595] 014: ? balance_pgdat+0x510/0x510
[26747.993934] 014: ? kthread_park+0x90/0x90
[26747.998012] 014: ret_from_fork+0x27/0x50
Reported-by: Ping Fang <pifang@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
---
Tested: Ran back-to-back repetitions of the reproducer for 10 hours
with the V3 patch applied; no errors were observed. Without
this patch, every test run ends in a kernel panic in less than
5 minutes.
diff --git a/mm/zswap.c b/mm/zswap.c
index a4e4d36ec085..fd5d2d5c9ae9 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -27,6 +27,7 @@
 #include <linux/highmem.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include <linux/locallock.h>
 #include <linux/types.h>
 #include <linux/atomic.h>
 #include <linux/frontswap.h>
@@ -990,6 +991,8 @@ static void zswap_fill_page(void *ptr, unsigned long value)
 	memset_l(page, value, PAGE_SIZE / sizeof(unsigned long));
 }

+/* protect zswap_dstmem from concurrency */
+static DEFINE_LOCAL_IRQ_LOCK(zswap_dstmem_lock);
 /*********************************
 * frontswap hooks
 **********************************/
@@ -1066,12 +1069,11 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
 	}

 	/* compress */
-	dst = get_cpu_var(zswap_dstmem);
-	tfm = *get_cpu_ptr(entry->pool->tfm);
+	dst = get_locked_var(zswap_dstmem_lock, zswap_dstmem);
+	tfm = *this_cpu_ptr(entry->pool->tfm);
 	src = kmap_atomic(page);
 	ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
 	kunmap_atomic(src);
-	put_cpu_ptr(entry->pool->tfm);
 	if (ret) {
 		ret = -EINVAL;
 		goto put_dstmem;
@@ -1094,7 +1096,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
 	memcpy(buf, &zhdr, hlen);
 	memcpy(buf + hlen, dst, dlen);
 	zpool_unmap_handle(entry->pool->zpool, handle);
-	put_cpu_var(zswap_dstmem);
+	put_locked_var(zswap_dstmem_lock, zswap_dstmem);

 	/* populate entry */
 	entry->offset = offset;
@@ -1122,7 +1124,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
 	return 0;

 put_dstmem:
-	put_cpu_var(zswap_dstmem);
+	put_locked_var(zswap_dstmem_lock, zswap_dstmem);
 	zswap_pool_put(entry->pool);
 freepage:
 	zswap_entry_cache_free(entry);