stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Ivan Babrou <ivan@cloudflare.com>,
	Minchan Kim <minchan@kernel.org>, Nitin Gupta <ngupta@vflare.org>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	Jens Axboe <axboe@kernel.dk>,
	David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.19 82/88] mm: fix unexpected zeroed page mapping with zram swap
Date: Tue, 10 May 2022 15:08:07 +0200	[thread overview]
Message-ID: <20220510130736.113648841@linuxfoundation.org> (raw)
In-Reply-To: <20220510130733.735278074@linuxfoundation.org>

From: Minchan Kim <minchan@kernel.org>

commit e914d8f00391520ecc4495dd0ca0124538ab7119 upstream.

Two processes under CLONE_VM cloning, user process can be corrupted by
seeing zeroed page unexpectedly.

      CPU A                        CPU B

  do_swap_page                do_swap_page
  SWP_SYNCHRONOUS_IO path     SWP_SYNCHRONOUS_IO path
  swap_readpage valid data
    swap_slot_free_notify
      delete zram entry
                              swap_readpage zeroed(invalid) data
                              pte_lock
                              map the *zero data* to userspace
                              pte_unlock
  pte_lock
  if (!pte_same)
    goto out_nomap;
  pte_unlock
  return and next refault will
  read zeroed data

The swap_slot_free_notify is bogus for CLONE_VM case since it doesn't
increase the refcount of swap slot at copy_mm so it couldn't catch up
whether it's safe or not to discard data from backing device.  In the
case, only the lock it could rely on to synchronize swap slot freeing is
page table lock.  Thus, this patch gets rid of the swap_slot_free_notify
function.  With this patch, CPU A will see correct data.

      CPU A                        CPU B

  do_swap_page                do_swap_page
  SWP_SYNCHRONOUS_IO path     SWP_SYNCHRONOUS_IO path
                              swap_readpage original data
                              pte_lock
                              map the original data
                              swap_free
                                swap_range_free
                                  bd_disk->fops->swap_slot_free_notify
  swap_readpage read zeroed data
                              pte_unlock
  pte_lock
  if (!pte_same)
    goto out_nomap;
  pte_unlock
  return
  on next refault will see mapped data by CPU B

The concern of the patch would increase memory consumption since it
could keep wasted memory with compressed form in zram as well as
uncompressed form in address space.  However, most of cases of zram uses
no readahead and do_swap_page is followed by swap_free so it will free
the compressed form from in zram quickly.

Link: https://lkml.kernel.org/r/YjTVVxIAsnKAXjTd@google.com
Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device")
Reported-by: Ivan Babrou <ivan@cloudflare.com>
Tested-by: Ivan Babrou <ivan@cloudflare.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>	[4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/page_io.c |   55 -------------------------------------------------------
 1 file changed, 55 deletions(-)

--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -71,55 +71,6 @@ void end_swap_bio_write(struct bio *bio)
 	bio_put(bio);
 }
 
-static void swap_slot_free_notify(struct page *page)
-{
-	struct swap_info_struct *sis;
-	struct gendisk *disk;
-	swp_entry_t entry;
-
-	/*
-	 * There is no guarantee that the page is in swap cache - the software
-	 * suspend code (at least) uses end_swap_bio_read() against a non-
-	 * swapcache page.  So we must check PG_swapcache before proceeding with
-	 * this optimization.
-	 */
-	if (unlikely(!PageSwapCache(page)))
-		return;
-
-	sis = page_swap_info(page);
-	if (!(sis->flags & SWP_BLKDEV))
-		return;
-
-	/*
-	 * The swap subsystem performs lazy swap slot freeing,
-	 * expecting that the page will be swapped out again.
-	 * So we can avoid an unnecessary write if the page
-	 * isn't redirtied.
-	 * This is good for real swap storage because we can
-	 * reduce unnecessary I/O and enhance wear-leveling
-	 * if an SSD is used as the as swap device.
-	 * But if in-memory swap device (eg zram) is used,
-	 * this causes a duplicated copy between uncompressed
-	 * data in VM-owned memory and compressed data in
-	 * zram-owned memory.  So let's free zram-owned memory
-	 * and make the VM-owned decompressed page *dirty*,
-	 * so the page should be swapped out somewhere again if
-	 * we again wish to reclaim it.
-	 */
-	disk = sis->bdev->bd_disk;
-	entry.val = page_private(page);
-	if (disk->fops->swap_slot_free_notify &&
-			__swap_count(sis, entry) == 1) {
-		unsigned long offset;
-
-		offset = swp_offset(entry);
-
-		SetPageDirty(page);
-		disk->fops->swap_slot_free_notify(sis->bdev,
-				offset);
-	}
-}
-
 static void end_swap_bio_read(struct bio *bio)
 {
 	struct page *page = bio_first_page_all(bio);
@@ -135,7 +86,6 @@ static void end_swap_bio_read(struct bio
 	}
 
 	SetPageUptodate(page);
-	swap_slot_free_notify(page);
 out:
 	unlock_page(page);
 	WRITE_ONCE(bio->bi_private, NULL);
@@ -373,11 +323,6 @@ int swap_readpage(struct page *page, boo
 
 	ret = bdev_read_page(sis->bdev, map_swap_page(page, &sis->bdev), page);
 	if (!ret) {
-		if (trylock_page(page)) {
-			swap_slot_free_notify(page);
-			unlock_page(page);
-		}
-
 		count_vm_event(PSWPIN);
 		return 0;
 	}



  parent reply	other threads:[~2022-05-10 13:28 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-10 13:06 [PATCH 4.19 00/88] 4.19.242-rc1 review Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 01/88] usb: mtu3: fix USB 3.0 dual-role-switch from device to host Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 02/88] USB: quirks: add a Realtek card reader Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 03/88] USB: quirks: add STRING quirk for VCOM device Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 04/88] USB: serial: whiteheat: fix heap overflow in WHITEHEAT_GET_DTR_RTS Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 05/88] USB: serial: cp210x: add PIDs for Kamstrup USB Meter Reader Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 06/88] USB: serial: option: add support for Cinterion MV32-WA/MV32-WB Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 07/88] USB: serial: option: add Telit 0x1057, 0x1058, 0x1075 compositions Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 08/88] xhci: stop polling roothubs after shutdown Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 09/88] iio: dac: ad5592r: Fix the missing return value Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 10/88] iio: dac: ad5446: Fix read_raw not returning set value Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 11/88] iio: magnetometer: ak8975: Fix the error handling in ak8975_power_on() Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 12/88] usb: misc: fix improper handling of refcount in uss720_probe() Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 13/88] usb: gadget: uvc: Fix crash when encoding data for usb request Greg Kroah-Hartman
2022-05-10 13:06 ` [PATCH 4.19 14/88] usb: gadget: configfs: clear deactivation flag in configfs_composite_unbind() Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 15/88] usb: dwc3: core: Fix tx/rx threshold settings Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 16/88] usb: dwc3: gadget: Return proper request status Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 17/88] serial: imx: fix overrun interrupts in DMA mode Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 18/88] serial: 8250: Also set sticky MCR bits in console restoration Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 19/88] serial: 8250: Correct the clock for EndRun PTP/1588 PCIe device Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 20/88] hex2bin: make the function hex_to_bin constant-time Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 21/88] hex2bin: fix access beyond string end Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 22/88] mtd: rawnand: fix ecc parameters for mt7622 Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 23/88] USB: Fix xhci event ring dequeue pointer ERDP update issue Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 24/88] ARM: dts: imx6qdl-apalis: Fix sgtl5000 detection issue Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 25/88] phy: samsung: Fix missing of_node_put() in exynos_sata_phy_probe Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 26/88] phy: samsung: exynos5250-sata: fix missing device put in probe error paths Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 27/88] ARM: OMAP2+: Fix refcount leak in omap_gic_of_init Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 28/88] ARM: dts: Fix mmc order for omap3-gta04 Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 29/88] ARM: dts: logicpd-som-lv: Fix wrong pinmuxing on OMAP35 Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 30/88] ipvs: correctly print the memory size of ip_vs_conn_tab Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 31/88] mtd: rawnand: Fix return value check of wait_for_completion_timeout Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 32/88] tcp: md5: incorrect tcp_header_len for incoming connections Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 33/88] sctp: check asoc strreset_chunk in sctp_generate_reconf_event Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 34/88] ARM: dts: imx6ull-colibri: fix vqmmc regulator Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 35/88] pinctrl: pistachio: fix use of irq_of_parse_and_map() Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 36/88] net: hns3: add validity check for message data length Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 37/88] ip_gre: Make o_seqno start from 0 in native mode Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 38/88] tcp: fix potential xmit stalls caused by TCP_NOTSENT_LOWAT Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 39/88] bus: sunxi-rsb: Fix the return value of sunxi_rsb_device_create() Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 40/88] clk: sunxi: sun9i-mmc: check return value after calling platform_get_resource() Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 41/88] net: bcmgenet: hide status block before TX timestamping Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 42/88] bnx2x: fix napi API usage sequence Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 43/88] ASoC: wm8731: Disable the regulator when probing fails Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 44/88] ip6_gre: Avoid updating tunnel->tun_hlen in __gre6_xmit() Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 45/88] x86: __memcpy_flushcache: fix wrong alignment if size > 2^32 Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 46/88] cifs: destage any unwritten data to the server before calling copychunk_write Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 47/88] drivers: net: hippi: Fix deadlock in rr_close() Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 48/88] x86/cpu: Load microcode during restore_processor_state() Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 49/88] tty: n_gsm: fix wrong signal octet encoding in convergence layer type 2 Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 50/88] tty: n_gsm: fix malformed counter for out of frame data Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 51/88] netfilter: nft_socket: only do sk lookups when indev is available Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 52/88] tty: n_gsm: fix insufficient txframe size Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 53/88] tty: n_gsm: fix missing explicit ldisc flush Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 54/88] tty: n_gsm: fix wrong command retry handling Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 55/88] tty: n_gsm: fix wrong command frame length field encoding Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 56/88] tty: n_gsm: fix incorrect UA handling Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 57/88] drm/vgem: Close use-after-free race in vgem_gem_create Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 58/88] MIPS: Fix CP0 counter erratum detection for R4k CPUs Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 59/88] parisc: Merge model and model name into one line in /proc/cpuinfo Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 60/88] ALSA: fireworks: fix wrong return count shorter than expected by 4 bytes Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 61/88] gpiolib: of: fix bounds check for gpio-reserved-ranges Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 62/88] Revert "SUNRPC: attempt AF_LOCAL connect on setup" Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 63/88] firewire: fix potential uaf in outbound_phy_packet_callback() Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 64/88] firewire: remove check of list iterator against head past the loop body Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 65/88] firewire: core: extend card->lock in fw_core_handle_bus_reset Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 66/88] genirq: Synchronize interrupt thread startup Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 67/88] ASoC: wm8958: Fix change notifications for DSP controls Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 68/88] can: grcan: grcan_close(): fix deadlock Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 69/88] can: grcan: use ofdev->dev when allocating DMA memory Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 70/88] nfc: replace improper check device_is_registered() in netlink related functions Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 71/88] nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 72/88] NFC: netlink: fix sleep in atomic bug when firmware download timeout Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 73/88] hwmon: (adt7470) Fix warning on module removal Greg Kroah-Hartman
2022-05-10 13:07 ` [PATCH 4.19 74/88] ASoC: dmaengine: Restore NULL prepare_slave_config() callback Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 75/88] net: stmmac: dwmac-sun8i: add missing of_node_put() in sun8i_dwmac_register_mdio_mux() Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 76/88] net: emaclite: Add error handling for of_address_to_resource() Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 77/88] selftests: mirror_gre_bridge_1q: Avoid changing PVID while interface is operational Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 78/88] smsc911x: allow using IRQ0 Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 79/88] btrfs: always log symlinks in full mode Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 80/88] net: igmp: respect RCU rules in ip_mc_source() and ip_mc_msfilter() Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 81/88] kvm: x86/cpuid: Only provide CPUID leaf 0xA if host has architectural PMU Greg Kroah-Hartman
2022-05-10 13:08 ` Greg Kroah-Hartman [this message]
2022-05-10 13:08 ` [PATCH 4.19 83/88] tcp: make sure treq->af_specific is initialized Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 84/88] dm: fix mempool NULL pointer race when completing IO Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 85/88] dm: interlock pending dm_io and dm_wait_for_bios_completion Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 86/88] PCI: aardvark: Clear all MSIs at setup Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 87/88] PCI: aardvark: Fix reading MSI interrupt number Greg Kroah-Hartman
2022-05-10 13:08 ` [PATCH 4.19 88/88] mmc: rtsx: add 74 Clocks in power on flow Greg Kroah-Hartman
2022-05-10 18:07 ` [PATCH 4.19 00/88] 4.19.242-rc1 review Pavel Machek
2022-05-10 20:34 ` Sudip Mukherjee
2022-05-11  5:40   ` Greg Kroah-Hartman
2022-05-10 22:46 ` Shuah Khan
2022-05-11  1:11 ` Guenter Roeck
2022-05-11  1:56 ` Samuel Zou
2022-05-11  9:19 ` Jon Hunter
2022-05-11  9:57 ` Naresh Kamboju
2022-05-11 10:08 ` Sudip Mukherjee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220510130736.113648841@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=david@redhat.com \
    --cc=ivan@cloudflare.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=minchan@kernel.org \
    --cc=ngupta@vflare.org \
    --cc=senozhatsky@chromium.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).