public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Usama Arif <usamaarif642@gmail.com>
To: David Hildenbrand <david@redhat.com>,
	Kanchana P Sridhar <kanchana.p.sridhar@intel.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	hannes@cmpxchg.org, yosryahmed@google.com, nphamcs@gmail.com,
	chengming.zhou@linux.dev, ryan.roberts@arm.com,
	ying.huang@intel.com, 21cnbao@gmail.com,
	akpm@linux-foundation.org, hughd@google.com, willy@infradead.org,
	bfoster@redhat.com, dchinner@redhat.com, chrisl@kernel.org
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com
Subject: Re: [RFC PATCH v1 6/7] mm: do_swap_page() calls swapin_readahead() zswap load batching interface.
Date: Fri, 18 Oct 2024 12:04:44 +0100	[thread overview]
Message-ID: <169e5cb6-701a-474c-a703-60daee8b4d3f@gmail.com> (raw)
In-Reply-To: <71bcbd3f-a8bd-4014-aabe-081006cc62f8@redhat.com>



On 18/10/2024 08:26, David Hildenbrand wrote:
> On 18.10.24 08:48, Kanchana P Sridhar wrote:
>> This patch invokes the swapin_readahead() based batching interface to
>> prefetch a batch of 4K folios for zswap load with batch decompressions
>> in parallel using IAA hardware. swapin_readahead() prefetches folios based
>> on vm.page-cluster and the usefulness of prior prefetches to the
>> workload. As folios are created in the swapcache and the readahead code
>> calls swap_read_folio() with a "zswap_batch" and a "non_zswap_batch", the
>> respective folio_batches get populated with the folios to be read.
>>
>> Finally, the swapin_readahead() procedures will call the newly added
>> process_ra_batch_of_same_type() which:
>>
>>   1) Reads all the non_zswap_batch folios sequentially by calling
>>      swap_read_folio().
>>   2) Calls swap_read_zswap_batch_unplug() with the zswap_batch which calls
>>      zswap_finish_load_batch() that finally decompresses each
>>      SWAP_CRYPTO_SUB_BATCH_SIZE sub-batch (i.e. upto 8 pages in a prefetch
>>      batch of say, 32 folios) in parallel with IAA.
>>
>> Within do_swap_page(), we try to benefit from batch decompressions in both
>> these scenarios:
>>
>>   1) single-mapped, SWP_SYNCHRONOUS_IO:
>>        We call swapin_readahead() with "single_mapped_path = true". This is
>>        done only in the !zswap_never_enabled() case.
>>   2) Shared and/or non-SWP_SYNCHRONOUS_IO folios:
>>        We call swapin_readahead() with "single_mapped_path = false".
>>
>> This will place folios in the swapcache: a design choice that handles cases
>> where a folio that is "single-mapped" in process 1 could be prefetched in
>> process 2; and handles highly contended server scenarios with stability.
>> There are checks added at the end of do_swap_page(), after the folio has
>> been successfully loaded, to detect if the single-mapped swapcache folio is
>> still single-mapped, and if so, folio_free_swap() is called on the folio.
>>
>> Within the swapin_readahead() functions, if single_mapped_path is true, and
>> either the platform does not have IAA, or, if the platform has IAA and the
>> user selects a software compressor for zswap (details of sysfs knob
>> follow), readahead/batching are skipped and the folio is loaded using
>> zswap_load().
>>
>> A new swap parameter "singlemapped_ra_enabled" (false by default) is added
>> for platforms that have IAA, zswap_load_batching_enabled() is true, and we
>> want to give the user the option to run experiments with IAA and with
>> software compressors for zswap (swap device is SWP_SYNCHRONOUS_IO):
>>
>> For IAA:
>>   echo true > /sys/kernel/mm/swap/singlemapped_ra_enabled
>>
>> For software compressors:
>>   echo false > /sys/kernel/mm/swap/singlemapped_ra_enabled
>>
>> If "singlemapped_ra_enabled" is set to false, swapin_readahead() will skip
>> prefetching folios in the "single-mapped SWP_SYNCHRONOUS_IO" do_swap_page()
>> path.
>>
>> Thanks Ying Huang for the really helpful brainstorming discussions on the
>> swap_read_folio() plug design.
>>
>> Suggested-by: Ying Huang <ying.huang@intel.com>
>> Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
>> ---
>>   mm/memory.c     | 187 +++++++++++++++++++++++++++++++++++++-----------
>>   mm/shmem.c      |   2 +-
>>   mm/swap.h       |  12 ++--
>>   mm/swap_state.c | 157 ++++++++++++++++++++++++++++++++++++----
>>   mm/swapfile.c   |   2 +-
>>   5 files changed, 299 insertions(+), 61 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index b5745b9ffdf7..9655b85fc243 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -3924,6 +3924,42 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
>>       return 0;
>>   }
>>   +/*
>> + * swapin readahead based batching interface for zswap batched loads using IAA:
>> + *
>> + * Should only be called for and if the faulting swap entry in do_swap_page
>> + * is single-mapped and SWP_SYNCHRONOUS_IO.
>> + *
>> + * Detect if the folio is in the swapcache, is still mapped to only this
>> + * process, and further, there are no additional references to this folio
>> + * (for e.g. if another process simultaneously readahead this swap entry
>> + * while this process was handling the page-fault, and got a pointer to the
>> + * folio allocated by this process in the swapcache), besides the references
>> + * that were obtained within __read_swap_cache_async() by this process that is
>> + * faulting in this single-mapped swap entry.
>> + */
> 
> How is this supposed to work for large folios?
> 

Hi,

I was looking at zswapin large folio support and have posted a RFC in [1].
I got bogged down with some prod stuff, so wasn't able to send it earlier.

It looks quite different, and I think simpler from this series, so might be
a good comparison.

[1] https://lore.kernel.org/all/20241018105026.2521366-1-usamaarif642@gmail.com/

Thanks,
Usama

  reply	other threads:[~2024-10-18 11:04 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-18  6:47 [RFC PATCH v1 0/7] zswap IAA decompress batching Kanchana P Sridhar
2024-10-18  6:47 ` [RFC PATCH v1 1/7] mm: zswap: Config variable to enable zswap loads with " Kanchana P Sridhar
2024-10-18  6:48 ` [RFC PATCH v1 2/7] mm: swap: Add IAA batch decompression API swap_crypto_acomp_decompress_batch() Kanchana P Sridhar
2024-10-18  6:48 ` [RFC PATCH v1 3/7] pagevec: struct folio_batch changes for decompress batching interface Kanchana P Sridhar
2024-10-18  6:48 ` [RFC PATCH v1 4/7] mm: swap: swap_read_folio() can add a folio to a folio_batch if it is in zswap Kanchana P Sridhar
2024-10-18  6:48 ` [RFC PATCH v1 5/7] mm: swap, zswap: zswap folio_batch processing with IAA decompression batching Kanchana P Sridhar
2024-10-18  6:48 ` [RFC PATCH v1 6/7] mm: do_swap_page() calls swapin_readahead() zswap load batching interface Kanchana P Sridhar
2024-10-18  7:26   ` David Hildenbrand
2024-10-18 11:04     ` Usama Arif [this message]
2024-10-18 17:21       ` Nhat Pham
2024-10-18 21:59         ` Sridhar, Kanchana P
2024-10-20 16:50           ` Usama Arif
2024-10-20 20:12             ` Sridhar, Kanchana P
2024-10-18 18:09     ` Sridhar, Kanchana P
2024-10-18  6:48 ` [RFC PATCH v1 7/7] mm: For IAA batching, reduce SWAP_BATCH and modify swap slot cache thresholds Kanchana P Sridhar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=169e5cb6-701a-474c-a703-60daee8b4d3f@gmail.com \
    --to=usamaarif642@gmail.com \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=bfoster@redhat.com \
    --cc=chengming.zhou@linux.dev \
    --cc=chrisl@kernel.org \
    --cc=david@redhat.com \
    --cc=dchinner@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=kanchana.p.sridhar@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nphamcs@gmail.com \
    --cc=ryan.roberts@arm.com \
    --cc=vinodh.gopal@intel.com \
    --cc=wajdi.k.feghali@intel.com \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox