stable.vger.kernel.org archive mirror
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Minchan Kim <minchan@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Tony Lindgren <tony@atomide.com>,
	Christoph Hellwig <hch@infradead.org>,
	Harish Sriram <harish@linux.ibm.com>,
	Uladzislau Rezki <urezki@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 5.4 35/36] mm/zsmalloc.c: drop ZSMALLOC_PGTABLE_MAPPING
Date: Mon, 14 Dec 2020 18:28:19 +0100
Message-ID: <20201214172545.033646592@linuxfoundation.org>
In-Reply-To: <20201214172543.302523401@linuxfoundation.org>

From: Minchan Kim <minchan@kernel.org>

commit e91d8d78237de8d7120c320b3645b7100848f24d upstream.

While testing zram, I found that decompression sometimes failed because
the compression buffer was corrupted.  On investigation, I found that the
commit listed under Fixes: below calls cond_resched() unconditionally,
which can cause problems in atomic context if the task is rescheduled.

  BUG: sleeping function called from invalid context at mm/vmalloc.c:108
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 946, name: memhog
  3 locks held by memhog/946:
   #0: ffff9d01d4b193e8 (&mm->mmap_lock#2){++++}-{4:4}, at: __mm_populate+0x103/0x160
   #1: ffffffffa3d53de0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0xa98/0x1160
   #2: ffff9d01d56b8110 (&zspage->lock){.+.+}-{3:3}, at: zs_map_object+0x8e/0x1f0
  CPU: 0 PID: 946 Comm: memhog Not tainted 5.9.3-00011-gc5bfc0287345-dirty #316
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
  Call Trace:
    unmap_kernel_range_noflush+0x2eb/0x350
    unmap_kernel_range+0x14/0x30
    zs_unmap_object+0xd5/0xe0
    zram_bvec_rw.isra.0+0x38c/0x8e0
    zram_rw_page+0x90/0x101
    bdev_write_page+0x92/0xe0
    __swap_writepage+0x94/0x4a0
    pageout+0xe3/0x3a0
    shrink_page_list+0xb94/0xd60
    shrink_inactive_list+0x158/0x460

We can fix this by removing the ZSMALLOC_PGTABLE_MAPPING feature (which
contains the offending calling code) from zsmalloc.
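
As an illustration only, the failure mode in the trace above can be
modelled in userspace C.  Everything prefixed sim_ is hypothetical and is
not kernel API: a counter stands in for the preempt count, and
sim_might_sleep() stands in for the check that fires when a function that
may reschedule is reached while in_atomic() is true.  The sketch assumes,
as the trace suggests, that the caller is still atomic between
zs_map_object() and zs_unmap_object().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's per-task preempt count. */
static int sim_preempt_count;
/* Number of "sleeping function called from invalid context" hits. */
static int sim_violations;

static void sim_enter_atomic(void) { sim_preempt_count++; }

static void sim_leave_atomic(void)
{
	assert(sim_preempt_count > 0);	/* unbalanced leave is a bug */
	sim_preempt_count--;
}

static bool sim_in_atomic(void) { return sim_preempt_count != 0; }

/* Models the check done on entry to code that may reschedule. */
static void sim_might_sleep(const char *who)
{
	if (sim_in_atomic()) {
		sim_violations++;
		fprintf(stderr, "BUG: sleeping function called from "
			"invalid context (%s)\n", who);
	}
}

/* Models the removed path: tearing down the mapping may reschedule. */
static void sim_unmap_pgtable(void)
{
	sim_might_sleep("sim_unmap_pgtable");
}

/* Models the copy-based path: a plain memory copy, never sleeps. */
static void sim_unmap_copy(void)
{
}

/* Runs one map/unmap cycle and returns the number of violations. */
static int sim_run(bool use_pgtable)
{
	sim_violations = 0;
	sim_enter_atomic();		/* caller is atomic after mapping */
	if (use_pgtable)
		sim_unmap_pgtable();
	else
		sim_unmap_copy();
	sim_leave_atomic();
	return sim_violations;
}
```

Calling sim_run(true) reports one violation and sim_run(false) reports
none, which mirrors why dropping the page-table path removes the
warning.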

Even though this option showed some improvement (e.g., 30%) on some
arm32 platforms, it has been a headache to maintain because it abuses
APIs [1] (e.g., unmap_kernel_range() in atomic context).

32-bit machines are approaching deprecation, the config option has been
available only for builtin builds since v5.8, and it has not been the
default option in zsmalloc, so it is time to drop the option for better
maintainability.

[1] http://lore.kernel.org/linux-mm/20201105170249.387069-1-minchan@kernel.org

Fixes: e47110e90584 ("mm/vunmap: add cond_resched() in vunmap_pmd_range")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Harish Sriram <harish@linux.ibm.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20201117202916.GA3856507@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 include/linux/zsmalloc.h |    1 -
 mm/Kconfig               |   13 -------------
 mm/zsmalloc.c            |   46 ----------------------------------------------
 3 files changed, 60 deletions(-)

--- a/include/linux/zsmalloc.h
+++ b/include/linux/zsmalloc.h
@@ -20,7 +20,6 @@
  * zsmalloc mapping modes
  *
  * NOTE: These only make a difference when a mapped object spans pages.
- * They also have no effect when PGTABLE_MAPPING is selected.
  */
 enum zs_mapmode {
 	ZS_MM_RW, /* normal read-write mapping */
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -576,19 +576,6 @@ config ZSMALLOC
 	  returned by an alloc().  This handle must be mapped in order to
 	  access the allocated space.
 
-config PGTABLE_MAPPING
-	bool "Use page table mapping to access object in zsmalloc"
-	depends on ZSMALLOC
-	help
-	  By default, zsmalloc uses a copy-based object mapping method to
-	  access allocations that span two pages. However, if a particular
-	  architecture (ex, ARM) performs VM mapping faster than copying,
-	  then you should select this. This causes zsmalloc to use page table
-	  mapping rather than copying for object mapping.
-
-	  You can check speed with zsmalloc benchmark:
-	  https://github.com/spartacus06/zsmapbench
-
 config ZSMALLOC_STAT
 	bool "Export zsmalloc statistics"
 	depends on ZSMALLOC
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -293,11 +293,7 @@ struct zspage {
 };
 
 struct mapping_area {
-#ifdef CONFIG_PGTABLE_MAPPING
-	struct vm_struct *vm; /* vm area for mapping object that span pages */
-#else
 	char *vm_buf; /* copy buffer for objects that span pages */
-#endif
 	char *vm_addr; /* address of kmap_atomic()'ed pages */
 	enum zs_mapmode vm_mm; /* mapping mode */
 };
@@ -1113,46 +1109,6 @@ static struct zspage *find_get_zspage(st
 	return zspage;
 }
 
-#ifdef CONFIG_PGTABLE_MAPPING
-static inline int __zs_cpu_up(struct mapping_area *area)
-{
-	/*
-	 * Make sure we don't leak memory if a cpu UP notification
-	 * and zs_init() race and both call zs_cpu_up() on the same cpu
-	 */
-	if (area->vm)
-		return 0;
-	area->vm = alloc_vm_area(PAGE_SIZE * 2, NULL);
-	if (!area->vm)
-		return -ENOMEM;
-	return 0;
-}
-
-static inline void __zs_cpu_down(struct mapping_area *area)
-{
-	if (area->vm)
-		free_vm_area(area->vm);
-	area->vm = NULL;
-}
-
-static inline void *__zs_map_object(struct mapping_area *area,
-				struct page *pages[2], int off, int size)
-{
-	BUG_ON(map_vm_area(area->vm, PAGE_KERNEL, pages));
-	area->vm_addr = area->vm->addr;
-	return area->vm_addr + off;
-}
-
-static inline void __zs_unmap_object(struct mapping_area *area,
-				struct page *pages[2], int off, int size)
-{
-	unsigned long addr = (unsigned long)area->vm_addr;
-
-	unmap_kernel_range(addr, PAGE_SIZE * 2);
-}
-
-#else /* CONFIG_PGTABLE_MAPPING */
-
 static inline int __zs_cpu_up(struct mapping_area *area)
 {
 	/*
@@ -1233,8 +1189,6 @@ out:
 	pagefault_enable();
 }
 
-#endif /* CONFIG_PGTABLE_MAPPING */
-
 static int zs_cpu_prepare(unsigned int cpu)
 {
 	struct mapping_area *area;
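
For readers unfamiliar with the path that remains, the removed Kconfig
help text describes it as a copy-based mapping for allocations that span
two pages.  The following is a minimal userspace sketch of that idea, not
the kernel code: the sim_ names and SIM_PAGE_SIZE are assumptions made
for the illustration, and the kernel's mapping modes (ZS_MM_RW and
friends) are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_PAGE_SIZE 4096	/* assumed page size for this sketch */

/* Hypothetical analogue of struct mapping_area with only a copy buffer. */
struct sim_mapping_area {
	char vm_buf[SIM_PAGE_SIZE];
};

/*
 * "Map" an object that starts at offset off in pages[0] and ends in
 * pages[1] by copying both pieces into one contiguous buffer.
 */
static char *sim_map_object(struct sim_mapping_area *area, char *pages[2],
			    size_t off, size_t size)
{
	size_t first;

	/* The object must fit the buffer and must actually span pages. */
	assert(off < SIM_PAGE_SIZE && size <= SIM_PAGE_SIZE);
	assert(off + size > SIM_PAGE_SIZE);

	first = SIM_PAGE_SIZE - off;	/* bytes held by the first page */
	memcpy(area->vm_buf, pages[0] + off, first);
	memcpy(area->vm_buf + first, pages[1], size - first);
	return area->vm_buf;
}

/* "Unmap" by copying the possibly modified buffer back to both pages. */
static void sim_unmap_object(struct sim_mapping_area *area, char *pages[2],
			     size_t off, size_t size)
{
	size_t first;

	assert(off < SIM_PAGE_SIZE && size <= SIM_PAGE_SIZE);
	assert(off + size > SIM_PAGE_SIZE);

	first = SIM_PAGE_SIZE - off;
	memcpy(pages[0] + off, area->vm_buf, first);
	memcpy(pages[1], area->vm_buf + first, size - first);
}
```

Because both directions are plain copies, nothing in this path needs to
touch page tables or reschedule.  The cost is two extra copies per
access to a page-spanning object.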


