All of lore.kernel.org
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>,
	Kyeongdon Kim <kyeongdon.kim@lge.com>
Cc: linux-kernel@vger.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	karam.lee@lge.com, sangseok.lee@lge.com, chan.jeong@lge.com
Subject: Re: [PATCH] zram: revive swap_slot_free_notify
Date: Mon, 28 Mar 2016 23:30:56 +0900	[thread overview]
Message-ID: <20160328143056.GA6064@bbox> (raw)

On Fri, Mar 18, 2016 at 04:58:31PM +0900, Minchan Kim wrote:
> <b430e9d1c6d4> "remove compressed copy from zram in-memory"
> applied swap_slot_free_notify call in *end_swap_bio_read* to
> remove duplicated memory between zram and memory.
> 
> However, with introducing rw_page in zram <8c7f01025f7b>
> "zram: implement rw_page operation of zram", it became void
> because rw_page doesn't need bio.
> 
> This patch restores the function for rw_page.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/page_io.c | 93 ++++++++++++++++++++++++++++++++----------------------------
>  1 file changed, 50 insertions(+), 43 deletions(-)
> 
> diff --git a/mm/page_io.c b/mm/page_io.c
> index ff74e512f029..18aac7819cc9 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -66,6 +66,54 @@ void end_swap_bio_write(struct bio *bio)
>  	bio_put(bio);
>  }
>  
> +static void swap_slot_free_notify(struct page *page)
> +{
> +	struct swap_info_struct *sis;
> +	struct gendisk *disk;
> +
> +	/*
> +	 * There is no guarantee that the page is in swap cache - the software
> +	 * suspend code (at least) uses end_swap_bio_read() against a non-
> +	 * swapcache page.  So we must check PG_swapcache before proceeding with
> +	 * this optimization.
> +	 */
> +	if (unlikely(!PageSwapCache(page)))
> +		return;
> +
> +	sis = page_swap_info(page);
> +	if (!(sis->flags & SWP_BLKDEV))
> +		return;
> +
> +	/*
> +	 * The swap subsystem performs lazy swap slot freeing,
> +	 * expecting that the page will be swapped out again.
> +	 * So we can avoid an unnecessary write if the page
> +	 * isn't redirtied.
> +	 * This is good for real swap storage because we can
> +	 * reduce unnecessary I/O and enhance wear-leveling
> +	 * if an SSD is used as the as swap device.
> +	 * But if in-memory swap device (eg zram) is used,
> +	 * this causes a duplicated copy between uncompressed
> +	 * data in VM-owned memory and compressed data in
> +	 * zram-owned memory.  So let's free zram-owned memory
> +	 * and make the VM-owned decompressed page *dirty*,
> +	 * so the page should be swapped out somewhere again if
> +	 * we again wish to reclaim it.
> +	 */
> +	disk = sis->bdev->bd_disk;
> +	if (disk->fops->swap_slot_free_notify) {
> +		swp_entry_t entry;
> +		unsigned long offset;
> +
> +		entry.val = page_private(page);
> +		offset = swp_offset(entry);
> +
> +		SetPageDirty(page);
> +		disk->fops->swap_slot_free_notify(sis->bdev,
> +				offset);
> +	}
> +}
> +
>  static void end_swap_bio_read(struct bio *bio)
>  {
>  	struct page *page = bio->bi_io_vec[0].bv_page;
> @@ -81,49 +129,7 @@ static void end_swap_bio_read(struct bio *bio)
>  	}
>  
>  	SetPageUptodate(page);
> -
> -	/*
> -	 * There is no guarantee that the page is in swap cache - the software
> -	 * suspend code (at least) uses end_swap_bio_read() against a non-
> -	 * swapcache page.  So we must check PG_swapcache before proceeding with
> -	 * this optimization.
> -	 */
> -	if (likely(PageSwapCache(page))) {
> -		struct swap_info_struct *sis;
> -
> -		sis = page_swap_info(page);
> -		if (sis->flags & SWP_BLKDEV) {
> -			/*
> -			 * The swap subsystem performs lazy swap slot freeing,
> -			 * expecting that the page will be swapped out again.
> -			 * So we can avoid an unnecessary write if the page
> -			 * isn't redirtied.
> -			 * This is good for real swap storage because we can
> -			 * reduce unnecessary I/O and enhance wear-leveling
> -			 * if an SSD is used as the as swap device.
> -			 * But if in-memory swap device (eg zram) is used,
> -			 * this causes a duplicated copy between uncompressed
> -			 * data in VM-owned memory and compressed data in
> -			 * zram-owned memory.  So let's free zram-owned memory
> -			 * and make the VM-owned decompressed page *dirty*,
> -			 * so the page should be swapped out somewhere again if
> -			 * we again wish to reclaim it.
> -			 */
> -			struct gendisk *disk = sis->bdev->bd_disk;
> -			if (disk->fops->swap_slot_free_notify) {
> -				swp_entry_t entry;
> -				unsigned long offset;
> -
> -				entry.val = page_private(page);
> -				offset = swp_offset(entry);
> -
> -				SetPageDirty(page);
> -				disk->fops->swap_slot_free_notify(sis->bdev,
> -						offset);
> -			}
> -		}
> -	}
> -
> +	swap_slot_free_notify(page);
>  out:
>  	unlock_page(page);
>  	bio_put(bio);
> @@ -347,6 +353,7 @@ int swap_readpage(struct page *page)
>  
>  	ret = bdev_read_page(sis->bdev, swap_page_sector(page), page);
>  	if (!ret) {
> +		swap_slot_free_notify(page);
>  		count_vm_event(PSWPIN);
>  		return 0;
>  	}
> -- 
> 1.9.1
> 

Kyeongdon, Could you try this patch?

>From 09497ba48f5621cd2737f72ac4d15e6b3e5e3d14 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Mon, 28 Mar 2016 23:08:14 +0900
Subject: [PATCH] mm: call swap_slot_free_notify with holding page lock

Kyeongdon reported below error which is BUG_ON(!PageSwapCache(page))
in page_swap_info.
The reason is that page_endio in rw_page unlocks the page if read I/O
is completed so we need to hold a PG_lock again to check PageSwapCache.
Otherwise, the page can be removed from swapcache and trigger below
BUG_ON.

[27833.995833] ------------[ cut here ]------------
[27833.995853] Kernel BUG at c00f9040 [verbose debug info unavailable]
[27833.995865] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[27833.995876] Modules linked in:
[27833.995892] CPU: 4 PID: 13446 Comm: RenderThread Tainted: G        W
3.10.84-g9f14aec-dirty #73
[27833.995903] task: c3b73200 ti: dd192000 task.ti: dd192000
[27833.995922] PC is at page_swap_info+0x10/0x2c
[27833.995934] LR is at swap_slot_free_notify+0x18/0x6c
[27833.995946] pc : [<c00f9040>]    lr : [<c00f5560>]    psr: 400f0113
[27833.995946] sp : dd193d78  ip : c2deb1e4  fp : da015180
[27833.995959] r10: 00000000  r9 : 000200da  r8 : c120fe08
[27833.995968] r7 : 00000000  r6 : 00000000  r5 : c249a6c0  r4 : =
c249a6c0
[27833.995979] r3 : 00000000  r2 : 40080009  r1 : 200f0113  r0 : =
c249a6c0
..<snip>
[27833.996273] [<c00f9040>] (page_swap_info+0x10/0x2c) from [<c00f5560>]
(swap_slot_free_notify+0x18/0x6c)
[27833.996288] [<c00f5560>] (swap_slot_free_notify+0x18/0x6c) from
[<c00f5c5c>] (swap_readpage+0x90/0x11c)
[27833.996302] [<c00f5c5c>] (swap_readpage+0x90/0x11c) from [<c00f62dc>]
(read_swap_cache_async+0x134/0x1ac)
[27833.996317] [<c00f62dc>] (read_swap_cache_async+0x134/0x1ac) from
[<c00f63c4>] (swapin_readahead+0x70/0xb0)
[27833.996334] [<c00f63c4>] (swapin_readahead+0x70/0xb0) from =
[<c00e87e0>]
(handle_pte_fault+0x320/0x6fc)
[27833.996348] [<c00e87e0>] (handle_pte_fault+0x320/0x6fc) from
[<c00e8c7c>] (handle_mm_fault+0xc0/0xf0)
[27833.996363] [<c00e8c7c>] (handle_mm_fault+0xc0/0xf0) from =
[<c001ac18>]
(do_page_fault+0x11c/0x36c)
[27833.996378] [<c001ac18>] (do_page_fault+0x11c/0x36c) from =
[<c000838c>]
(do_DataAbort+0x34/0x118)
[27833.996392] [<c000838c>] (do_DataAbort+0x34/0x118) from [<c000d8b4>]
(__dabt_usr+0x34/0x40)

Reported-by: Kyeongdon Kim <kyeongdon.kim@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/page_io.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index 18aac7819cc9..57e279b71695 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -353,7 +353,11 @@ int swap_readpage(struct page *page)
 
 	ret = bdev_read_page(sis->bdev, swap_page_sector(page), page);
 	if (!ret) {
-		swap_slot_free_notify(page);
+		if (trylock_page(page)) {
+			swap_slot_free_notify(page);
+			unlock_page(page);
+		}
+
 		count_vm_event(PSWPIN);
 		return 0;
 	}
-- 
1.9.1

             reply	other threads:[~2016-03-28 14:29 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-28 14:30 Minchan Kim [this message]
2016-04-01  4:00 ` Re: [PATCH] zram: revive swap_slot_free_notify Kyeongdon Kim
  -- strict thread matches above, loose matches on Subject: below --
2016-03-28  4:58 Minchan Kim
2016-03-18  7:58 Minchan Kim
2016-03-21 20:30 ` Andrew Morton
2016-03-21 23:52   ` Minchan Kim
2016-03-22  5:08 ` Joonsoo Kim
2016-03-22  8:00   ` Minchan Kim
2016-03-22  8:20     ` Joonsoo Kim
2016-03-22 14:06       ` Minchan Kim
2016-03-23  4:45         ` Joonsoo Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160328143056.GA6064@bbox \
    --to=minchan@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=chan.jeong@lge.com \
    --cc=karam.lee@lge.com \
    --cc=kyeongdon.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sangseok.lee@lge.com \
    --cc=sergey.senozhatsky@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.