From: Shaohua Li <shli@kernel.org>
To: Mel Gorman <mgorman@suse.de>
Cc: Dan Streetman <ddstreet@ieee.org>,
Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.cz>,
Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
Shaohua Li <shli@fusionio.com>, Weijie Yang <weijieut@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] swap: change swap_info singly-linked list to list_head
Date: Thu, 24 Apr 2014 08:17:05 +0800 [thread overview]
Message-ID: <20140424001705.GA8066@kernel.org> (raw)
In-Reply-To: <20140423103400.GH23991@suse.de>
On Wed, Apr 23, 2014 at 11:34:00AM +0100, Mel Gorman wrote:
> > @@ -366,7 +361,7 @@ static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
> > }
> > vm_unacct_memory(pages);
> > *unused = pages_to_unuse;
> > - *swapid = type;
> > + *swapid = si->type;
> > ret = 0;
> > break;
> > }
> > @@ -413,7 +408,7 @@ void frontswap_shrink(unsigned long target_pages)
> > /*
> > * we don't want to hold swap_lock while doing a very
> > * lengthy try_to_unuse, but swap_list may change
> > - * so restart scan from swap_list.head each time
> > + * so restart scan from swap_list_head each time
> > */
> > spin_lock(&swap_lock);
> > ret = __frontswap_shrink(target_pages, &pages_to_unuse, &type);
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 4a7f7e6..b958645 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -51,14 +51,14 @@ atomic_long_t nr_swap_pages;
> > /* protected with swap_lock. reading in vm_swap_full() doesn't need lock */
> > long total_swap_pages;
> > static int least_priority;
> > -static atomic_t highest_priority_index = ATOMIC_INIT(-1);
> >
> > static const char Bad_file[] = "Bad swap file entry ";
> > static const char Unused_file[] = "Unused swap file entry ";
> > static const char Bad_offset[] = "Bad swap offset entry ";
> > static const char Unused_offset[] = "Unused swap offset entry ";
> >
> > -struct swap_list_t swap_list = {-1, -1};
> > +/* all active swap_info */
> > +LIST_HEAD(swap_list_head);
> >
> > struct swap_info_struct *swap_info[MAX_SWAPFILES];
> >
> > @@ -640,66 +640,50 @@ no_page:
> >
> > swp_entry_t get_swap_page(void)
> > {
> > - struct swap_info_struct *si;
> > + struct swap_info_struct *si, *next;
> > pgoff_t offset;
> > - int type, next;
> > - int wrapped = 0;
> > - int hp_index;
> > + struct list_head *tmp;
> >
> > spin_lock(&swap_lock);
> > if (atomic_long_read(&nr_swap_pages) <= 0)
> > goto noswap;
> > atomic_long_dec(&nr_swap_pages);
> >
> > - for (type = swap_list.next; type >= 0 && wrapped < 2; type = next) {
> > - hp_index = atomic_xchg(&highest_priority_index, -1);
> > - /*
> > - * highest_priority_index records current highest priority swap
> > - * type which just frees swap entries. If its priority is
> > - * higher than that of swap_list.next swap type, we use it. It
> > - * isn't protected by swap_lock, so it can be an invalid value
> > - * if the corresponding swap type is swapoff. We double check
> > - * the flags here. It's even possible the swap type is swapoff
> > - * and swapon again and its priority is changed. In such rare
> > - * case, low prority swap type might be used, but eventually
> > - * high priority swap will be used after several rounds of
> > - * swap.
> > - */
> > - if (hp_index != -1 && hp_index != type &&
> > - swap_info[type]->prio < swap_info[hp_index]->prio &&
> > - (swap_info[hp_index]->flags & SWP_WRITEOK)) {
> > - type = hp_index;
> > - swap_list.next = type;
> > - }
> > -
> > - si = swap_info[type];
> > - next = si->next;
> > - if (next < 0 ||
> > - (!wrapped && si->prio != swap_info[next]->prio)) {
> > - next = swap_list.head;
> > - wrapped++;
> > - }
> > -
> > + list_for_each(tmp, &swap_list_head) {
> > + si = list_entry(tmp, typeof(*si), list);
> > spin_lock(&si->lock);
> > - if (!si->highest_bit) {
> > - spin_unlock(&si->lock);
> > - continue;
> > - }
> > - if (!(si->flags & SWP_WRITEOK)) {
> > + if (!si->highest_bit || !(si->flags & SWP_WRITEOK)) {
> > spin_unlock(&si->lock);
> > continue;
> > }
> >
> > - swap_list.next = next;
> > + /*
> > + * rotate the current swap_info that we're going to use
> > + * to after any other swap_info that have the same prio,
> > + * so that all equal-priority swap_info get used equally
> > + */
> > + next = si;
> > + list_for_each_entry_continue(next, &swap_list_head, list) {
> > + if (si->prio != next->prio)
> > + break;
> > + list_rotate_left(&si->list);
> > + next = si;
> > + }
> >
>
> The list manipulations will be a lot of cache writes as the list is shuffled
> around. On slow storage I do not think this will be noticable but it may
> be noticable on faster swap devices that are SSD based. I've added Shaohua
> Li to the cc as he has been concerned with the performance of swap in the
> past. Shaohua, can you run this patchset through any of your test cases
> with the addition that multiple swap files are used to see if the cache
> writes are noticable? You'll need multiple swap files, some of which are
> at equal priority so the list shuffling logic is triggered.
get_swap_page isn't hot so far (and we hold the swap_lock, which isn't
contended), guess it's because other problems hide it, for example tlb flush
overhead.
Thanks,
Shaohua
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Shaohua Li <shli@kernel.org>
To: Mel Gorman <mgorman@suse.de>
Cc: Dan Streetman <ddstreet@ieee.org>,
Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.cz>,
Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
Shaohua Li <shli@fusionio.com>, Weijie Yang <weijieut@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] swap: change swap_info singly-linked list to list_head
Date: Thu, 24 Apr 2014 08:17:05 +0800 [thread overview]
Message-ID: <20140424001705.GA8066@kernel.org> (raw)
In-Reply-To: <20140423103400.GH23991@suse.de>
On Wed, Apr 23, 2014 at 11:34:00AM +0100, Mel Gorman wrote:
> > @@ -366,7 +361,7 @@ static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
> > }
> > vm_unacct_memory(pages);
> > *unused = pages_to_unuse;
> > - *swapid = type;
> > + *swapid = si->type;
> > ret = 0;
> > break;
> > }
> > @@ -413,7 +408,7 @@ void frontswap_shrink(unsigned long target_pages)
> > /*
> > * we don't want to hold swap_lock while doing a very
> > * lengthy try_to_unuse, but swap_list may change
> > - * so restart scan from swap_list.head each time
> > + * so restart scan from swap_list_head each time
> > */
> > spin_lock(&swap_lock);
> > ret = __frontswap_shrink(target_pages, &pages_to_unuse, &type);
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index 4a7f7e6..b958645 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -51,14 +51,14 @@ atomic_long_t nr_swap_pages;
> > /* protected with swap_lock. reading in vm_swap_full() doesn't need lock */
> > long total_swap_pages;
> > static int least_priority;
> > -static atomic_t highest_priority_index = ATOMIC_INIT(-1);
> >
> > static const char Bad_file[] = "Bad swap file entry ";
> > static const char Unused_file[] = "Unused swap file entry ";
> > static const char Bad_offset[] = "Bad swap offset entry ";
> > static const char Unused_offset[] = "Unused swap offset entry ";
> >
> > -struct swap_list_t swap_list = {-1, -1};
> > +/* all active swap_info */
> > +LIST_HEAD(swap_list_head);
> >
> > struct swap_info_struct *swap_info[MAX_SWAPFILES];
> >
> > @@ -640,66 +640,50 @@ no_page:
> >
> > swp_entry_t get_swap_page(void)
> > {
> > - struct swap_info_struct *si;
> > + struct swap_info_struct *si, *next;
> > pgoff_t offset;
> > - int type, next;
> > - int wrapped = 0;
> > - int hp_index;
> > + struct list_head *tmp;
> >
> > spin_lock(&swap_lock);
> > if (atomic_long_read(&nr_swap_pages) <= 0)
> > goto noswap;
> > atomic_long_dec(&nr_swap_pages);
> >
> > - for (type = swap_list.next; type >= 0 && wrapped < 2; type = next) {
> > - hp_index = atomic_xchg(&highest_priority_index, -1);
> > - /*
> > - * highest_priority_index records current highest priority swap
> > - * type which just frees swap entries. If its priority is
> > - * higher than that of swap_list.next swap type, we use it. It
> > - * isn't protected by swap_lock, so it can be an invalid value
> > - * if the corresponding swap type is swapoff. We double check
> > - * the flags here. It's even possible the swap type is swapoff
> > - * and swapon again and its priority is changed. In such rare
> > - * case, low prority swap type might be used, but eventually
> > - * high priority swap will be used after several rounds of
> > - * swap.
> > - */
> > - if (hp_index != -1 && hp_index != type &&
> > - swap_info[type]->prio < swap_info[hp_index]->prio &&
> > - (swap_info[hp_index]->flags & SWP_WRITEOK)) {
> > - type = hp_index;
> > - swap_list.next = type;
> > - }
> > -
> > - si = swap_info[type];
> > - next = si->next;
> > - if (next < 0 ||
> > - (!wrapped && si->prio != swap_info[next]->prio)) {
> > - next = swap_list.head;
> > - wrapped++;
> > - }
> > -
> > + list_for_each(tmp, &swap_list_head) {
> > + si = list_entry(tmp, typeof(*si), list);
> > spin_lock(&si->lock);
> > - if (!si->highest_bit) {
> > - spin_unlock(&si->lock);
> > - continue;
> > - }
> > - if (!(si->flags & SWP_WRITEOK)) {
> > + if (!si->highest_bit || !(si->flags & SWP_WRITEOK)) {
> > spin_unlock(&si->lock);
> > continue;
> > }
> >
> > - swap_list.next = next;
> > + /*
> > + * rotate the current swap_info that we're going to use
> > + * to after any other swap_info that have the same prio,
> > + * so that all equal-priority swap_info get used equally
> > + */
> > + next = si;
> > + list_for_each_entry_continue(next, &swap_list_head, list) {
> > + if (si->prio != next->prio)
> > + break;
> > + list_rotate_left(&si->list);
> > + next = si;
> > + }
> >
>
> The list manipulations will be a lot of cache writes as the list is shuffled
> around. On slow storage I do not think this will be noticable but it may
> be noticable on faster swap devices that are SSD based. I've added Shaohua
> Li to the cc as he has been concerned with the performance of swap in the
> past. Shaohua, can you run this patchset through any of your test cases
> with the addition that multiple swap files are used to see if the cache
> writes are noticable? You'll need multiple swap files, some of which are
> at equal priority so the list shuffling logic is triggered.
get_swap_page isn't hot so far (and we hold the swap_lock, which isn't
contended), guess it's because other problems hide it, for example tlb flush
overhead.
Thanks,
Shaohua
next prev parent reply other threads:[~2014-04-24 0:17 UTC|newest]
Thread overview: 94+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-13 10:42 [PATCH] mm: swap: Use swapfiles in priority order Mel Gorman
2014-02-13 10:42 ` Mel Gorman
2014-02-13 15:58 ` Weijie Yang
2014-02-13 15:58 ` Weijie Yang
2014-02-14 10:17 ` Mel Gorman
2014-02-14 10:17 ` Mel Gorman
2014-02-14 13:33 ` Weijie Yang
2014-02-14 13:33 ` Weijie Yang
2014-02-14 13:10 ` Christian Ehrhardt
2014-02-16 2:59 ` Weijie Yang
2014-02-24 8:28 ` Hugh Dickins
2014-02-24 8:28 ` Hugh Dickins
2014-04-12 21:00 ` [PATCH 0/2] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-04-12 21:00 ` Dan Streetman
2014-04-12 21:00 ` [PATCH 1/2] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-04-12 21:00 ` Dan Streetman
2014-04-23 10:34 ` Mel Gorman
2014-04-23 10:34 ` Mel Gorman
2014-04-24 0:17 ` Shaohua Li [this message]
2014-04-24 0:17 ` Shaohua Li
2014-04-24 8:30 ` Mel Gorman
2014-04-24 8:30 ` Mel Gorman
2014-04-24 18:48 ` Dan Streetman
2014-04-24 18:48 ` Dan Streetman
2014-04-25 4:15 ` Weijie Yang
2014-04-25 4:15 ` Weijie Yang
2014-05-02 20:00 ` Dan Streetman
2014-05-02 20:00 ` Dan Streetman
2014-05-02 20:00 ` Dan Streetman
2014-05-04 9:39 ` Bob Liu
2014-05-04 9:39 ` Bob Liu
2014-05-04 20:16 ` Dan Streetman
2014-05-04 20:16 ` Dan Streetman
2014-05-04 20:16 ` Dan Streetman
2014-05-04 9:39 ` Bob Liu
2014-04-25 8:38 ` Mel Gorman
2014-04-25 8:38 ` Mel Gorman
2014-04-12 21:00 ` [PATCH 2/2] swap: use separate priority list for available swap_infos Dan Streetman
2014-04-12 21:00 ` Dan Streetman
2014-04-23 13:14 ` Mel Gorman
2014-04-23 13:14 ` Mel Gorman
2014-04-24 17:52 ` Dan Streetman
2014-04-24 17:52 ` Dan Streetman
2014-04-25 8:49 ` Mel Gorman
2014-04-25 8:49 ` Mel Gorman
2014-05-02 19:02 ` [PATCHv2 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-02 19:02 ` Dan Streetman
2014-05-02 19:02 ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-02 19:02 ` Dan Streetman
2014-05-02 19:02 ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-02 19:02 ` Dan Streetman
2014-05-12 10:35 ` Mel Gorman
2014-05-12 10:35 ` Mel Gorman
2014-05-02 19:02 ` [PATCH 3/4] plist: add plist_rotate Dan Streetman
2014-05-02 19:02 ` Dan Streetman
2014-05-06 2:18 ` Steven Rostedt
2014-05-06 2:18 ` Steven Rostedt
2014-05-06 20:12 ` Dan Streetman
2014-05-06 20:12 ` Dan Streetman
2014-05-06 20:39 ` Steven Rostedt
2014-05-06 20:39 ` Steven Rostedt
2014-05-06 21:47 ` Dan Streetman
2014-05-06 21:47 ` Dan Streetman
2014-05-06 22:43 ` Steven Rostedt
2014-05-06 22:43 ` Steven Rostedt
2014-05-02 19:02 ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-02 19:02 ` Dan Streetman
2014-05-05 15:51 ` Dan Streetman
2014-05-05 15:51 ` Dan Streetman
2014-05-05 19:13 ` Steven Rostedt
2014-05-05 19:13 ` Steven Rostedt
2014-05-05 19:38 ` Peter Zijlstra
2014-05-09 20:42 ` [PATCH] plist: make CONFIG_DEBUG_PI_LIST selectable Dan Streetman
2014-05-09 20:42 ` Dan Streetman
2014-05-09 21:17 ` Steven Rostedt
2014-05-09 21:17 ` Steven Rostedt
2014-05-12 11:11 ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Mel Gorman
2014-05-12 11:11 ` Mel Gorman
2014-05-12 13:00 ` Dan Streetman
2014-05-12 13:00 ` Dan Streetman
2014-05-12 16:38 ` [PATCHv3 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-12 16:38 ` Dan Streetman
2014-05-12 16:38 ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-12 16:38 ` Dan Streetman
2014-05-12 16:38 ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-12 16:38 ` Dan Streetman
2014-05-12 16:38 ` [PATCHv2 3/4] plist: add plist_requeue Dan Streetman
2014-05-12 16:38 ` Dan Streetman
2014-05-13 10:33 ` Mel Gorman
2014-05-13 10:33 ` Mel Gorman
2014-05-12 16:38 ` [PATCHv2 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-12 16:38 ` Dan Streetman
2014-05-13 10:34 ` Mel Gorman
2014-05-13 10:34 ` Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140424001705.GA8066@kernel.org \
--to=shli@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=ddstreet@ieee.org \
--cc=ehrhardt@linux.vnet.ibm.com \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@suse.cz \
--cc=shli@fusionio.com \
--cc=weijieut@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.