From: Johannes Weiner <hannes@cmpxchg.org>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Rik van Riel <riel@redhat.com>, Hugh Dickins <hugh@veritas.com>
Subject: Re: [patch 3/3][rfc] vmscan: batched swap slot allocation
Date: Tue, 21 Apr 2009 11:54:29 +0200
Message-ID: <20090421095429.GB3639@cmpxchg.org>
In-Reply-To: <20090421182331.5c96615e.kamezawa.hiroyu@jp.fujitsu.com>

On Tue, Apr 21, 2009 at 06:23:31PM +0900, KAMEZAWA Hiroyuki wrote:
> On Tue, 21 Apr 2009 10:52:31 +0200
> Johannes Weiner <hannes@cmpxchg.org> wrote:
> 
> > > Keeping multiple pages locked while they stay on private list ? 
> > 
> > Yeah, it's a bit suboptimal but I don't see a way around it.
> > 
> Hmm, seems to increase stale swap cache dramatically under memcg ;)

Hmpf, not good.

> > > BTW, isn't it better to add "allocate multiple swap space at once" function
> > > like
> > >  - void get_swap_pages(nr, swp_entry_array[])
> > > ? "nr" will not be bigger than SWAP_CLUSTER_MAX.
> > 
> > It will sometimes be, see __zone_reclaim().
> > 
> Hm? If I read the code correctly, __zone_reclaim() just calls shrink_zone() and
> "nr" to shrink_page_list() is SWAP_CLUSTER_MAX, at most.

shrink_zone() and shrink_inactive_list() use whatever is set in
sc->swap_cluster_max and for __zone_reclaim() this is:

	.swap_cluster_max = max_t(unsigned long, nr_pages, SWAP_CLUSTER_MAX)

SWAP_CLUSTER_MAX is 32 (2^5), so if you have an order 6 allocation
doing reclaim, you end up with sc->swap_cluster_max == 64 already.
Not common, but it happens.

> > I had such a function once.  The interesting part is: how and when do
> > you call it?  If you drop the page lock in between, you need to redo
> > the checks for unevictability and whether the page has become mapped
> > etc.
> > 
> > You also need to have the pages in swap cache as soon as possible or
> > optimistic swap-in will 'steal' your swap slots.  See add_to_swap()
> > when the cache radix tree says -EEXIST.
> > 
> 
> If I were you, I would modify the "offset" calculation of
>   get_swap_pages()
>      -> scan_swap_map()
> so that a CPU tends to find a contiguous cluster of swap pages.
> Too difficult?

This goes in the direction of extent-based allocations.  I tried that
once by providing every reclaimer with a cookie that is passed in for
swap allocations and used to find per-reclaimer offsets.

Something went wrong, but I cannot quite remember what.  Will have
another look at this.
