linux-mm.kvack.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Yosry Ahmed <yosryahmed@google.com>,
	kernel test robot <oliver.sang@intel.com>,
	Usama Arif <usamaarif642@gmail.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	Linux Memory Management List <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Chengming Zhou <chengming.zhou@linux.dev>,
	Nhat Pham <nphamcs@gmail.com>,
	David Hildenbrand <david@redhat.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	Hugh Dickins <hughd@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Andi Kleen <ak@linux.intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [linux-next:master] [mm] 0fa2857d23: WARNING:at_mm/page_alloc.c:#__alloc_pages_noprof
Date: Mon, 24 Jun 2024 21:51:33 +0100
Message-ID: <ZnncVRuHeeN7GnTJ@casper.infradead.org>
In-Reply-To: <fv7c4554hex6vnnujhqz2fuxob6pqlumwxudx2zgrdeovq3vf4@vaypceu5y6po>

On Mon, Jun 24, 2024 at 01:39:45PM -0700, Shakeel Butt wrote:
> On Mon, Jun 24, 2024 at 08:50:45PM GMT, Matthew Wilcox wrote:
> > On Mon, Jun 24, 2024 at 12:34:04PM -0700, Yosry Ahmed wrote:
> > > On Mon, Jun 24, 2024 at 12:26 PM Matthew Wilcox <willy@infradead.org> wrote:
> > > >
> > > > On Mon, Jun 24, 2024 at 11:57:45AM -0700, Yosry Ahmed wrote:
> > > > > On Mon, Jun 24, 2024 at 11:56 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > > > >
> > > > > > On Mon, Jun 24, 2024 at 11:53:30AM -0700, Yosry Ahmed wrote:
> > > > > > > After a page is swapped out during reclaim, __remove_mapping() will
> > > > > > > call __delete_from_swap_cache() to replace the swap cache entry with a
> > > > > > > shadow entry (which is an xa_value).
> > > > > >
> > > > > > Special entries are disjoint from shadow entries.  Shadow entries have
> > > > > > the last two bits as 01 or 11 (are congruent to 1 or 3 modulo 4).
> > > > > > Special entries have values below 4096 which end in 10 (are congruent
> > > > > > to 2 modulo 4).
> > > > >
> > > > > You are implying that we would no longer have a shadow entry for such
> > > > > zero folios, because we will be storing a special entry instead.
> > > > > Right?
> > > >
> > > > umm ... maybe I have a misunderstanding here.
> > > >
> > > > I'm saying that there wouldn't be a _swap_ entry here because the folio
> > > > wouldn't be stored anywhere on the swap device.  But there could be a
> > > > _shadow_ entry.  Although if the page is full of zeroes, it was probably
> > > > never referenced and doesn't really need a shadow entry.
> > > 
> > > Is it possible to have a shadow entry AND a special entry (e.g.
> > > XA_ZERO_ENTRY) at the same index? This is what would be required to
> > > maintain the current behavior (assuming we really need the shadow
> > > entries for such zeroed folios).
> > 
> > No, just like it's not possible to have a swap entry and a shadow entry
> > at the same location.  You have to choose.  But the zero entry is an
> > alternative to the swap entry, not the shadow entry.
> > 
> > As I understand the swap cache, at the moment, you can have four
> > possible results from a lookup:
> > 
> >  - NULL
> >  - a swap entry
> >  - a shadow entry
> >  - a folio
> > 
> > Do I have that wrong?
> 
> I don't think we have swap entry in the swapcache (underlying xarray).
> The swap entry is used as an index to find the folio or shadow entry.

Ah.  I think I understand the procedure now.

We store a swap entry in the page table entry.  That tells us both where
in the swap cache the folio might be found, and where in the swap device
the data can be found (because there is a very simple calculation for
both).  If the folio is not present, then there's a shadow entry which
summarises the LRU information that would be stored in the folio had it
not been evicted from the swap cache.
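
To make the "very simple calculation" concrete, here is a user-space sketch
of how one swap entry can yield both a swap cache index and a position on
the swap device.  The bit layout, the sketch_* names and the 4KiB page size
are assumptions chosen for illustration; the kernel's own helpers are
swp_type() and swp_offset(), and the encoding they unpack is
architecture-specific.

```c
#include <stdint.h>

/* Hypothetical layout, for illustration only: the top 5 bits select the
 * swap device ("type"), the remaining bits are the page offset in it. */
#define SKETCH_TYPE_BITS   5
#define SKETCH_OFFSET_BITS (64 - SKETCH_TYPE_BITS)
#define SKETCH_PAGE_SHIFT  12	/* assumes 4KiB pages */

typedef struct { uint64_t val; } sketch_swp_entry;

static sketch_swp_entry sketch_swp_entry_make(unsigned int type, uint64_t offset)
{
	sketch_swp_entry e = { ((uint64_t)type << SKETCH_OFFSET_BITS) | offset };
	return e;
}

static unsigned int sketch_swp_type(sketch_swp_entry e)
{
	return (unsigned int)(e.val >> SKETCH_OFFSET_BITS);
}

static uint64_t sketch_swp_offset(sketch_swp_entry e)
{
	return e.val & ((UINT64_C(1) << SKETCH_OFFSET_BITS) - 1);
}

/* Where in the swap cache the folio (or its shadow entry) would be found.
 * Assumption: the page offset is used directly as the index. */
static uint64_t sketch_swap_cache_index(sketch_swp_entry e)
{
	return sketch_swp_offset(e);
}

/* Where on the swap device the data would be found, as a byte position.
 * Assumption: the device is one contiguous run of pages. */
static uint64_t sketch_swap_device_pos(sketch_swp_entry e)
{
	return sketch_swp_offset(e) << SKETCH_PAGE_SHIFT;
}
```

Both results are derived from the same offset, which is why nothing beyond
the page table entry is needed to locate either one.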

We can't know at the point where we unmap the page whether it's full
of zeroes or not, because we can't afford to scan its contents.  At the
point where we decide to swap out the folio, we can afford to scan it,
because the cost of the scan is small next to the cost of doing the I/O.
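
The scan in question is cheap in absolute terms: one pass over the page, a
word at a time, stopping at the first non-zero word.  A minimal user-space
sketch follows, assuming a 4KiB page and a buffer aligned for unsigned long;
the function name is hypothetical and is not taken from any patch in this
thread.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096	/* assumes 4KiB pages */

/* Return true if every word in the page is zero.  The caller must pass a
 * buffer of SKETCH_PAGE_SIZE bytes, suitably aligned for unsigned long. */
static bool sketch_page_is_zero_filled(const void *page)
{
	const unsigned long *word = page;
	size_t nr_words = SKETCH_PAGE_SIZE / sizeof(*word);

	for (size_t i = 0; i < nr_words; i++) {
		if (word[i])
			return false;	/* early exit on first non-zero word */
	}
	return true;
}
```

The worst case is a page that really is all zeroes, since the loop then
touches every word, but that is also the case where the I/O is avoided.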

So the question is whether we can afford to throw away the shadow
information and just store the information that this was a zero entry.
I think we can, but it is a bolder proposal than I realised I was
making.
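
For reference, the "you have to choose" constraint falls out of the XArray
entry encoding quoted earlier: each slot holds one pointer-sized word, and
the low bits say what kind of entry it is.  The sketch below reimplements
that encoding in user space, modelled on xa_mk_value(), xa_mk_internal() and
XA_ZERO_ENTRY from include/linux/xarray.h.  The sketch_* names and the
classifier are hypothetical, and treating XA_ZERO_ENTRY as "zero-filled
folio" is the proposal under discussion, not existing swap cache behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value entries (used for shadow entries): low bit set, so the last two
 * bits are 01 or 11. */
static inline uintptr_t sketch_mk_value(uintptr_t v)
{
	return (v << 1) | 1;
}

/* Internal (special) entries: last two bits are 10. */
static inline uintptr_t sketch_mk_internal(uintptr_t v)
{
	return (v << 2) | 2;
}

static inline bool sketch_is_value(uintptr_t entry)
{
	return (entry & 1) != 0;
}

static inline bool sketch_is_internal(uintptr_t entry)
{
	return (entry & 3) == 2;
}

/* Mirrors XA_ZERO_ENTRY, which is xa_mk_internal(257). */
#define SKETCH_ZERO_ENTRY sketch_mk_internal(257)

enum sketch_kind {
	SKETCH_NULL,		/* nothing stored at this index */
	SKETCH_SHADOW,		/* value entry: evicted, LRU info kept */
	SKETCH_ZERO,		/* proposed: folio was full of zeroes */
	SKETCH_OTHER_INTERNAL,	/* some other special entry */
	SKETCH_FOLIO,		/* aligned pointer: folio is in the cache */
};

/* Classify one slot.  A slot holds exactly one word, so a shadow entry
 * and a zero entry cannot both be present at the same index. */
static enum sketch_kind sketch_classify(uintptr_t entry)
{
	if (entry == 0)
		return SKETCH_NULL;
	if (sketch_is_value(entry))
		return SKETCH_SHADOW;
	if (entry == SKETCH_ZERO_ENTRY)
		return SKETCH_ZERO;
	if (sketch_is_internal(entry))
		return SKETCH_OTHER_INTERNAL;
	return SKETCH_FOLIO;
}
```

Storing the zero entry therefore overwrites whatever shadow entry would
otherwise have gone into that slot, which is the trade-off described above.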

