From: "ying.huang@intel.com" <ying.huang@intel.com>
To: NeilBrown <neilb@suse.de>, Yang Shi <shy828301@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	Christoph Hellwig <hch@lst.de>, Miaohe Lin <linmiaohe@huawei.com>,
	linux-nfs@vger.kernel.org, Linux MM <linux-mm@kvack.org>,
	 Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] MM: handle THP in swap_*page_fs()
Date: Fri, 06 May 2022 10:56:40 +0800
Message-ID: <e63ac163c9283ca93d8309be1cdfed6c6ea97e5e.camel@intel.com>
In-Reply-To: <165170771676.24672.16520001373464213119@noble.neil.brown.name>

On Thu, 2022-05-05 at 09:41 +1000, NeilBrown wrote:
> On Tue, 03 May 2022, Yang Shi wrote:
> > On Sun, May 1, 2022 at 9:23 PM NeilBrown <neilb@suse.de> wrote:
> > > 
> > > On Sat, 30 Apr 2022, Yang Shi wrote:
> > > > On Thu, Apr 28, 2022 at 5:44 PM NeilBrown <neilb@suse.de> wrote:
> > > > > 
> > > > > Pages passed to swap_readpage()/swap_writepage() are not necessarily all
> > > > > > the same size - there may be transparent-huge-pages involved.
> > > > > 
> > > > > The BIO paths of swap_*page() handle this correctly, but the SWP_FS_OPS
> > > > > path does not.
> > > > > 
> > > > > So we need to use thp_size() to find the size, not just assume
> > > > > PAGE_SIZE, and we need to track the total length of the request, not
> > > > > just assume it is "page * PAGE_SIZE".
> > > > 
> > > > Swap-over-nfs doesn't support THP swap IIUC. So SWP_FS_OPS should not
> > > > see THP at all. But I agree to remove the assumption about page size
> > > > in this path.
> > > 
> > > Can you help me understand this please.  How would the swap code know
> > > that swap-over-NFS doesn't support THP swap?  There is no reason that
> > > NFS wouldn't be able to handle 2MB writes.  Even 1GB should work though
> > > NFS would have to split into several smaller WRITE requests.
> > 
> > AFAICT, THP swap is only supported on non-rotational block devices,
> > for example, SSD, PMEM, etc. IIRC, the swap device has to support
> > clusters in order to swap THP. Clusters are only supported by
> > non-rotational block devices.
> > 
> > Looped Ying in, who is the author of THP swap.
> 
> I hunted around the code and found that THP swap only happens if a
> 'cluster_info' is allocated, and that only happens if 
> 	if (p->bdev && bdev_nonrot(p->bdev)) {
> in the swapon syscall.
> 

And in get_swap_pages(), the cluster is only allocated for block
devices.

		if (size == SWAPFILE_CLUSTER) {
			if (si->flags & SWP_BLKDEV)
				n_ret = swap_alloc_cluster(si, swp_entries);
		} else
			n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
						    n_goal, swp_entries);

We may remove this restriction in the future if someone can show the
benefit.
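
To make the two conditions discussed above concrete, here is a minimal
userspace sketch of the gating logic. It is only a model, not kernel code:
the struct, the MODEL_* flag values and the model_* function names are all
hypothetical and exist only for illustration. It assumes, as described in
this thread, that cluster_info is set up at swapon time only for a
non-rotational block device, and that a THP-sized allocation is attempted
only when SWP_BLKDEV is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model; they are not the kernel's. */
#define MODEL_SWP_BLKDEV (1u << 0)
#define MODEL_SWP_FS_OPS (1u << 1)

/* Hypothetical stand-in for the few swap_info_struct fields that matter here. */
struct model_swap_info {
	unsigned int flags;
	bool nonrot;           /* stands in for bdev_nonrot(p->bdev) */
	bool has_cluster_info; /* decided once, at "swapon" time */
};

/* Models the swapon check: clusters need a non-rotational block device. */
void model_swapon(struct model_swap_info *si)
{
	assert(si != NULL);
	si->has_cluster_info = (si->flags & MODEL_SWP_BLKDEV) && si->nonrot;
}

/* Models the allocation side: a THP-sized request needs SWP_BLKDEV and clusters. */
bool model_can_swap_thp(const struct model_swap_info *si)
{
	return (si->flags & MODEL_SWP_BLKDEV) && si->has_cluster_info;
}
```

Under these assumptions a swap area with only MODEL_SWP_FS_OPS set (the
swap-over-NFS case) never gets cluster_info, so model_can_swap_thp()
returns false for it, and the same holds for a rotational block device.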

Best Regards,
Huang, Ying

> I guess "nonrot" is being used as a synonym for "low latency"...
> So even if NFS was low-latency it couldn't benefit from THP swap.
> 
> So as you say it is not currently possible for THP pages to be sent to
> NFS for swapout.  It makes sense to prepare for it though I think - if
> only so that the code is more consistent and less confusing.
> 
> Thanks,
> NeilBrown



