From: Nitin Gupta <ngupta@vflare.org>
To: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
hongshin@gmail.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH 7/9] swap_info: swap count continuations
Date: Fri, 16 Oct 2009 10:19:27 +0530
Message-ID: <4AD7FB57.2030403@vflare.org>
In-Reply-To: <Pine.LNX.4.64.0910150153560.3291@sister.anvils>

On 10/15/2009 06:26 AM, Hugh Dickins wrote:
> Swap is duplicated (reference count incremented by one) whenever the same
> swap page is inserted into another mm (when forking finds a swap entry in
> place of a pte, or when reclaim unmaps a pte to insert the swap entry).
>
> swap_info_struct's vmalloc'ed swap_map is the array of these reference
> counts: but what happens when the unsigned short (or unsigned char since
> the preceding patch) is full? (and its high bit is kept for a cache flag)
>
> We then lose track of it, never freeing, leaving it in use until swapoff:
> at which point we _hope_ that a single pass will have found all instances,
> assume there are no more, and will lose user data if we're wrong.
>
> Swapping of KSM pages has not yet been enabled; but it is implemented,
> and makes it very easy for a user to overflow the maximum swap count:
> possible with ordinary process pages, but unlikely, even when pid_max
> has been raised from PID_MAX_DEFAULT.
>
> This patch implements swap count continuations: when the count overflows,
> a continuation page is allocated and linked to the original vmalloc'ed
> map page, and this used to hold the continuation counts for that entry
> and its neighbours. These continuation pages are seldom referenced:
> the common paths all work on the original swap_map, only referring to
> a continuation page when the low "digit" of a count is incremented or
> decremented through SWAP_MAP_MAX.
>
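
For reference, a toy userspace model of the carry behaviour described above,
as I read it. This is not code from the patch: TOY_LOW_MAX, struct toy_count
and the toy_* helpers are made-up names, the 7-bit limit is only an assumption
based on the "high bit is kept for a cache flag" remark, and a plain integer
stands in for the counts held in a continuation page.

```c
#include <assert.h>

/* Assumed limit of the low "digit": 7 usable bits, top bit left for a flag. */
#define TOY_LOW_MAX 0x7f

struct toy_count {
	unsigned char low;   /* the digit kept in the swap_map entry itself */
	unsigned int cont;   /* stand-in for the continuation page's counts */
};

/* Take one more reference; the continuation is touched only on a carry. */
static void toy_dup(struct toy_count *c)
{
	if (c->low < TOY_LOW_MAX) {
		c->low++;            /* common path: swap_map entry only */
		return;
	}
	c->low = 0;                  /* low digit wraps ... */
	c->cont++;                   /* ... and carries into the continuation */
}

/* Drop one reference; the continuation is touched only on a borrow. */
static void toy_free(struct toy_count *c)
{
	if (c->low > 0) {
		c->low--;            /* common path: swap_map entry only */
		return;
	}
	assert(c->cont > 0);         /* dropping below zero would be a bug */
	c->cont--;
	c->low = TOY_LOW_MAX;
}

/* Full reference count reconstructed from both parts. */
static unsigned long toy_total(const struct toy_count *c)
{
	return (unsigned long)c->cont * (TOY_LOW_MAX + 1) + c->low;
}
```

With these assumed limits, only one toy_dup() call in every 128 has to look
beyond the swap_map entry.
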
I think the patch can be simplified a lot if we have just 2 levels (hard-coded)
of swap_map, each level holding a 16-bit count -- a combined 32-bit count should
be sufficient for just about anything. Saving 1 byte per entry in the level-1
swap_map and then supporting arbitrary levels of swap_map doesn't look like it's
worth the complexity.
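
Roughly what I have in mind, as an untested userspace sketch rather than
kernel code. The struct and function names here are hypothetical, both halves
sit side by side only for illustration, and the cache flag bit is ignored
entirely, so treat it as a statement of the idea and nothing more.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-level count: two fixed 16-bit halves, no further levels. */
struct two_level_count {
	uint16_t lo;   /* level-1 swap_map entry, used on every dup/free */
	uint16_t hi;   /* level-2 entry, touched only on carry or borrow */
};

/* Increment; returns false if the combined 32-bit count is already full. */
static bool tl_dup(struct two_level_count *c)
{
	if (c->lo != UINT16_MAX) {
		c->lo++;
		return true;
	}
	if (c->hi == UINT16_MAX)
		return false;        /* 32-bit count saturated */
	c->hi++;
	c->lo = 0;
	return true;
}

/* Decrement; returns false if the count is already zero. */
static bool tl_free(struct two_level_count *c)
{
	if (c->lo != 0) {
		c->lo--;
		return true;
	}
	if (c->hi == 0)
		return false;        /* nothing left to drop */
	c->hi--;
	c->lo = UINT16_MAX;
	return true;
}

/* Combined 32-bit reference count. */
static uint32_t tl_total(const struct two_level_count *c)
{
	return ((uint32_t)c->hi << 16) | c->lo;
}
```

The second level is fixed at one extra 16-bit entry, so there is no chain of
continuation pages to walk.
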
Nitin