From: Vlastimil Babka <vbabka@suse.cz>
To: Christian Borntraeger <borntraeger@de.ibm.com>, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Subject: Re: [PATCH/RFC] mm/swapfile: reduce kswapd overhead by not filling up disks
Date: Mon, 21 Dec 2015 16:58:29 +0100
Message-ID: <567821A5.7050201@suse.cz>
In-Reply-To: <1449846574-35511-1-git-send-email-borntraeger@de.ibm.com>
On 12/11/2015 04:09 PM, Christian Borntraeger wrote:
> if a user has more than one swap disk with different priorities, the
> swap code will fill up the high prio disk until the last block is
> used.
> The swap code will continue to scan the first disk also when it's
> already filling the 2nd or 3rd disk.
> We have seen kswapd running at 100% CPU, with the majority of hits
> in the scanning code of scan_swap_map, even for non-rotational disks
> when this happens.
> For example with 3 disks
> disk1 99.9%
> disk2 10%
> disk3 0%
> it will scan the bitmap of disk1 (and as the disk is full the
> cluster optimization does not trigger) for every page that will
> likely go to disk2 anyway.
>
> By doing a first scan that only uses up to 98%, we force the swap
> code to use the 2nd disk slightly earlier, but it reduces kswapd
> cpu usage significantly. The 2nd scan will then allow to fill
> the remaining 2%, again starting with the highest prio disk.
>
> The code does not affect cases with all the same swap priorities,
> unless all disks are about 98% full.
> There is one issue with this approach: If there is a mix of
> same and different priorities, the code will loop too often due
> to the requeue, so an idea for a better fix is welcome.
>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

IMHO you should resend this and CC the relevant people directly (e.g. as
found via ./scripts/get_maintainer.pl), or it might simply get lost in the
high-volume mailing lists.

Note that I'm not familiar with this code. But my first thought would be
to put a cache with batch-refill/free in front of the bitmap. During the
"first" round, only consider si's that have enough free slots to satisfy
the whole batch-refill.

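To make the idea a bit more concrete, here is a rough userspace sketch, not
kernel code. Everything in it is hypothetical: struct swap_dev stands in
for swap_info_struct and only tracks a free-slot count, BATCH is an
arbitrary batch size, and the devices are assumed to be passed in order of
descending priority. Locking and the actual bitmap scan are left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define BATCH 4			/* hypothetical batch size */

/* Hypothetical stand-in for swap_info_struct: just a free-slot counter. */
struct swap_dev {
	long free_slots;
};

/* Hypothetical slot cache sitting in front of the per-device bitmap. */
struct slot_cache {
	int dev[BATCH];		/* device index each cached slot came from */
	int count;		/* number of slots currently cached */
};

/*
 * Refill the cache from devs[], ordered from highest to lowest priority.
 * Round 1 takes a whole batch from the first device that can supply one.
 * Round 2 falls back to whatever is left, highest priority first.
 * Returns the number of slots now in the cache.
 */
static int cache_refill(struct slot_cache *c, struct swap_dev *devs, size_t n)
{
	c->count = 0;

	/* Round 1: skip devices that cannot satisfy the whole batch. */
	for (size_t i = 0; i < n; i++) {
		if (devs[i].free_slots >= BATCH) {
			devs[i].free_slots -= BATCH;
			for (int k = 0; k < BATCH; k++)
				c->dev[c->count++] = (int)i;
			return c->count;
		}
	}

	/* Round 2: take the remaining slots one by one. */
	for (size_t i = 0; i < n && c->count < BATCH; i++) {
		while (devs[i].free_slots > 0 && c->count < BATCH) {
			devs[i].free_slots--;
			c->dev[c->count++] = (int)i;
		}
	}
	return c->count;
}

/* Hand out one slot; returns the device index, or -1 if swap is exhausted. */
static int cache_get(struct slot_cache *c, struct swap_dev *devs, size_t n)
{
	if (c->count == 0 && cache_refill(c, devs, n) == 0)
		return -1;
	return c->dev[--c->count];
}
```

With something like this, an almost-full high-prio device is only touched
once per batch in the second round instead of being scanned for every page.
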
> ---
> mm/swapfile.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 5887731..d3817cf 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -640,6 +640,7 @@ swp_entry_t get_swap_page(void)
> {
> struct swap_info_struct *si, *next;
> pgoff_t offset;
> + bool first = true;
>
> if (atomic_long_read(&nr_swap_pages) <= 0)
> goto noswap;
> @@ -653,6 +654,12 @@ start_over:
> plist_requeue(&si->avail_list, &swap_avail_head);
> spin_unlock(&swap_avail_lock);
> spin_lock(&si->lock);
> + /* at 98% usage lets try the other swaps */
> + if (first && si->inuse_pages / 98 * 100 > si->pages) {
> + spin_lock(&swap_avail_lock);
> + spin_unlock(&si->lock);
> + goto nextsi;
> + }
> if (!si->highest_bit || !(si->flags & SWP_WRITEOK)) {
> spin_lock(&swap_avail_lock);
> if (plist_node_empty(&si->avail_list)) {
> @@ -692,6 +699,10 @@ nextsi:
> if (plist_node_empty(&next->avail_list))
> goto start_over;
> }
> + if (first) {
> + first = false;
> + goto start_over;
> + }
>
> spin_unlock(&swap_avail_lock);
>
>
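
One detail about the threshold check in the hunk above: with integer
arithmetic, "inuse_pages / 98 * 100" truncates at the division, so on small
devices the test can stay false even above 98% usage. The standalone sketch
below compares the expression from the patch with a cross-multiplied
variant. The helper names are made up for illustration, the unsigned int
parameter types are an assumption, and the widening to 64 bits is only
there to keep the products from overflowing.

```c
#include <stdbool.h>

/* The check as written in the patch: divide first, then multiply. */
static bool above_98_divide_first(unsigned int inuse, unsigned int pages)
{
	return inuse / 98 * 100 > pages;
}

/* Cross-multiplied variant; 64-bit products cannot overflow here. */
static bool above_98_cross_mult(unsigned int inuse, unsigned int pages)
{
	return (unsigned long long)inuse * 100 >
	       (unsigned long long)pages * 98;
}
```

For a 1000-page device with 990 pages in use, the first form computes
990 / 98 = 10 and then 10 * 100 = 1000, which is not greater than 1000, while
the second compares 99000 against 98000. On large devices the difference is
negligible.
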