linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Matthias Kaehlcke <mka@chromium.org>
To: Brian Geffon <bgeffon@google.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Pavel Machek <pavel@ucw.cz>, Len Brown <len.brown@intel.com>,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] PM: hibernate: don't store zero pages in the image file.
Date: Thu, 23 Mar 2023 16:19:52 +0000	[thread overview]
Message-ID: <ZBx8KEuVtIbqkeq1@google.com> (raw)
In-Reply-To: <20230302171348.3601107-1-bgeffon@google.com>

On Thu, Mar 02, 2023 at 12:13:48PM -0500, Brian Geffon wrote:
> On ChromeOS we've observed a considerable number of in-use pages filled with
> zeros. Today with hibernate it's entirely possible that saveable pages are just
> zero filled. Since we're already copying pages word-by-word in do_copy_page it
> becomes almost free to determine if a page was completely filled with zeros.
> 
> This change introduces a new bitmap which will track these zero pages. If a page
> is zero it will not be included in the saved image, instead to track these zero
> pages in the image file we will introduce a new flag which we will set on the
> packed PFN list. When reading back in the image file we will detect these zero
> page PFNs and rebuild the zero page bitmap.
> 
> When the image is being loaded through calls to write_next_page if we encounter
> a zero page we will silently memset it to 0 and then continue on to the next
> page. Given the implementation in snapshot_read_next/snapshot_write_next this
> change  will be transparent to non-compressed/compressed and swsusp modes of
> operation.
> 
> To provide some concrete numbers from simple ad-hoc testing, on a device which
> was lightly in use we saw that:
> 
>   PM: hibernation: Image created (964408 pages copied, 548304 zero pages)
> 
> Of the approximately 6.2GB of saveable pages 2.2GB (36%) were just zero filled
> and could be tracked entirely within the packed PFN list. The savings would
> obviously be much lower for lzo compressed images, but even in the case of
> compression not copying pages across to the compression threads will still
> speed things up. It's also possible that we would see better overall compression
> ratios as larger regions of "real data" would improve the compressibility.
> 
> Finally, such an approach could dramatically improve swsusp performance
> as each one of those zero pages requires a write syscall to reload, by
> handling it as part of the packed PFN list we're able to fully avoid
> that.
> 
>  Patch v2 -> v3:
>    - Use nr_zero_pages rather than walking each pfn to count.
>    - Make sure zero_bm is allocated in safe pages on resume.
>      When reading in the pfn list and building the zero page bm
>      we don't know which pages are unsafe yet so we will need to
>      copy this bm to safe pages after the metadata has been read.
> 
>  Patch v1 -> v2:
>    - minor code mistake from rebasing corrected.
> 
> Signed-off-by: Brian Geffon <bgeffon@google.com>
> ---
>  kernel/power/snapshot.c | 169 +++++++++++++++++++++++++++++++---------
>  1 file changed, 133 insertions(+), 36 deletions(-)
> 
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index cd8b7b35f1e8b..a2c4fe17f9067 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c

...

> @@ -1371,14 +1381,18 @@ static unsigned int count_data_pages(void)
>  
>  /*
>   * This is needed, because copy_page and memcpy are not usable for copying
> - * task structs.
> + * task structs. Returns 1 if a page was filled with only zeros, otherwise 0.

nit: s/a page/the page/

>   */
> -static inline void do_copy_page(long *dst, long *src)
> +static inline int do_copy_page(long *dst, long *src)
>  {
>  	int n;
> +	long z = 0;
>  
> -	for (n = PAGE_SIZE / sizeof(long); n; n--)
> +	for (n = PAGE_SIZE / sizeof(long); n; n--) {
> +		z |= *src;
>  		*dst++ = *src++;
> +	}
> +	return !z;
>  }

...

> -static inline void copy_data_page(unsigned long dst_pfn, unsigned long src_pfn)
> +static inline int copy_data_page(unsigned long dst_pfn, unsigned long src_pfn)
>  {
> -	safe_copy_page(page_address(pfn_to_page(dst_pfn)),
> +	return safe_copy_page(page_address(pfn_to_page(dst_pfn)),
>  				pfn_to_page(src_pfn));
>  }
>  #endif /* CONFIG_HIGHMEM */
>  
>  static void copy_data_pages(struct memory_bitmap *copy_bm,
> -			    struct memory_bitmap *orig_bm)
> +			    struct memory_bitmap *orig_bm,
> +			    struct memory_bitmap *zero_bm,
> +			    unsigned int *zero_count)
>  {
>  	struct zone *zone;
> -	unsigned long pfn;
> +	unsigned long pfn, copy_pfn;
>  
>  	for_each_populated_zone(zone) {
>  		unsigned long max_zone_pfn;
> @@ -1462,11 +1482,20 @@ static void copy_data_pages(struct memory_bitmap *copy_bm,
>  	}
>  	memory_bm_position_reset(orig_bm);
>  	memory_bm_position_reset(copy_bm);
> +	copy_pfn = memory_bm_next_pfn(copy_bm);
>  	for(;;) {
>  		pfn = memory_bm_next_pfn(orig_bm);
>  		if (unlikely(pfn == BM_END_OF_MAP))
>  			break;
> -		copy_data_page(memory_bm_next_pfn(copy_bm), pfn);
> +		if (copy_data_page(copy_pfn, pfn)) {
> +			memory_bm_set_bit(zero_bm, pfn);
> +			if (zero_count)

This check is not needed. The function is called only once, with a pointer. The kernel
trusts itself if the pointer is supposed to be always != NULL.

Or better: use a local counter and have copy_data_pages() return the number of pages
that were actually copied, which is what the caller is interested in.

> +				(*zero_count)++;
> +
> +			/* We will reuse this copy_pfn for a real 'nonzero' page. */
> +			continue;
> +		}
> +		copy_pfn = memory_bm_next_pfn(copy_bm);
>  	}
>  }

...

> @@ -2247,24 +2299,34 @@ static int load_header(struct swsusp_info *info)
>   * unpack_orig_pfns - Set bits corresponding to given PFNs in a memory bitmap.
>   * @bm: Memory bitmap.
>   * @buf: Area of memory containing the PFNs.
> + * @zero_bm: Memory bitmap which will be populated with the PFNs of zero pages.
>   *
>   * For each element of the array pointed to by @buf (1 page at a time), set the
> - * corresponding bit in @bm.
> + * corresponding bit in @bm. If the the page was originally populated with only
> + * zeros then a corresponding bit will also be set in @zero_bm.

s/the the/the/

...

> @@ -2486,6 +2548,7 @@ static inline void free_highmem_data(void) {}
>   * prepare_image - Make room for loading hibernation image.
>   * @new_bm: Uninitialized memory bitmap structure.
>   * @bm: Memory bitmap with unsafe pages marked.
> + * @zero_bm: Memory bitmap containing the zero pages.

That sounds as if the memory bitmap actually contained zero pages. I suggest to
change it to something like the comment for 'bm' above, i.e. "... with zero
pages marked"

>   *
>   * Use @bm to mark the pages that will be overwritten in the process of
>   * restoring the system memory state from the suspend image ("unsafe" pages)
> @@ -2496,8 +2559,12 @@ static inline void free_highmem_data(void) {}
>   * pages will be used for just yet.  Instead, we mark them all as allocated and
>   * create a lists of "safe" pages to be used later.  On systems with high
>   * memory a list of "safe" highmem pages is created too.
> + *
> + * Because we didn't know which pages were unsafe when we created the zero bm we
> + * will make a copy of it and recreate it within safe pages.

nit: s/we will make/we make/

>   */
> -static int prepare_image(struct memory_bitmap *new_bm, struct memory_bitmap *bm)
> +static int prepare_image(struct memory_bitmap *new_bm, struct memory_bitmap *bm,
> +		struct memory_bitmap *zero_bm)
>  {
>  	unsigned int nr_pages, nr_highmem;
>  	struct linked_page *lp;
> @@ -2516,6 +2583,20 @@ static int prepare_image(struct memory_bitmap *new_bm, struct memory_bitmap *bm)
>  
>  	duplicate_memory_bitmap(new_bm, bm);
>  	memory_bm_free(bm, PG_UNSAFE_KEEP);
> +	error = memory_bm_create(bm, GFP_ATOMIC, PG_ANY);
> +	if (error)
> +		goto Free;
> +
> +	/* use bm as storage while we rebuild zero_bm using safe pages */

Re-using the 'bm' parameter is confusing, it should be avoided IMO unless there
is a real benefit. struct memory_bitmap isn't that big, why not create a local
variable 'zero_mb_tmp' or similar as a temporary store for the zero page bitmap?

> +	duplicate_memory_bitmap(bm, zero_bm);
> +	memory_bm_free(zero_bm, PG_UNSAFE_KEEP);
> +	error = memory_bm_create(zero_bm, GFP_ATOMIC, PG_SAFE);
> +	if (error)
> +		goto Free;
> +	duplicate_memory_bitmap(zero_bm, bm);
> +	memory_bm_free(bm, PG_UNSAFE_KEEP);
> +	/* at this point zero_bm is in safe pages and we can use it while restoring */
> +
>  	if (nr_highmem > 0) {
>  		error = prepare_highmem_image(bm, &nr_highmem);
>  		if (error)

  reply	other threads:[~2023-03-23 16:20 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-13 19:30 [PATCH] PM: hibernate: don't store zero pages in the image file Brian Geffon
2023-01-14  0:15 ` [PATCH v2] " Brian Geffon
2023-02-09 19:44   ` Rafael J. Wysocki
2023-02-13 14:45     ` Brian Geffon
2023-02-18 11:48       ` Pavel Machek
2023-02-19 18:56         ` Brian Geffon
2023-03-02 17:25     ` Brian Geffon
2023-03-02 17:13 ` [PATCH v3] " Brian Geffon
2023-03-23 16:19   ` Matthias Kaehlcke [this message]
2023-03-27 14:20     ` Brian Geffon
2023-04-06 18:20 ` [PATCH v4] " Brian Geffon
2023-06-22 16:22   ` Rafael J. Wysocki
2023-06-26 15:44 ` [PATCH] " Brian Geffon
2023-06-26 15:47 ` [PATCH v6] " Brian Geffon
2023-07-14 17:55 ` [PATCH v7] " Brian Geffon
2023-07-20 18:18   ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZBx8KEuVtIbqkeq1@google.com \
    --to=mka@chromium.org \
    --cc=bgeffon@google.com \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=pavel@ucw.cz \
    --cc=rafael@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).