All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Shaohua Li <shli@kernel.org>
Cc: Dan Streetman <ddstreet@ieee.org>,
	Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>,
	Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
	Shaohua Li <shli@fusionio.com>, Weijie Yang <weijieut@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] swap: change swap_info singly-linked list to list_head
Date: Thu, 24 Apr 2014 09:30:18 +0100	[thread overview]
Message-ID: <20140424083018.GO23991@suse.de> (raw)
In-Reply-To: <20140424001705.GA8066@kernel.org>

On Thu, Apr 24, 2014 at 08:17:05AM +0800, Shaohua Li wrote:
> On Wed, Apr 23, 2014 at 11:34:00AM +0100, Mel Gorman wrote:
> > > @@ -366,7 +361,7 @@ static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
> > >  		}
> > >  		vm_unacct_memory(pages);
> > >  		*unused = pages_to_unuse;
> > > -		*swapid = type;
> > > +		*swapid = si->type;
> > >  		ret = 0;
> > >  		break;
> > >  	}
> > > @@ -413,7 +408,7 @@ void frontswap_shrink(unsigned long target_pages)
> > >  	/*
> > >  	 * we don't want to hold swap_lock while doing a very
> > >  	 * lengthy try_to_unuse, but swap_list may change
> > > -	 * so restart scan from swap_list.head each time
> > > +	 * so restart scan from swap_list_head each time
> > >  	 */
> > >  	spin_lock(&swap_lock);
> > >  	ret = __frontswap_shrink(target_pages, &pages_to_unuse, &type);
> > > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > > index 4a7f7e6..b958645 100644
> > > --- a/mm/swapfile.c
> > > +++ b/mm/swapfile.c
> > > @@ -51,14 +51,14 @@ atomic_long_t nr_swap_pages;
> > >  /* protected with swap_lock. reading in vm_swap_full() doesn't need lock */
> > >  long total_swap_pages;
> > >  static int least_priority;
> > > -static atomic_t highest_priority_index = ATOMIC_INIT(-1);
> > >  
> > >  static const char Bad_file[] = "Bad swap file entry ";
> > >  static const char Unused_file[] = "Unused swap file entry ";
> > >  static const char Bad_offset[] = "Bad swap offset entry ";
> > >  static const char Unused_offset[] = "Unused swap offset entry ";
> > >  
> > > -struct swap_list_t swap_list = {-1, -1};
> > > +/* all active swap_info */
> > > +LIST_HEAD(swap_list_head);
> > >  
> > >  struct swap_info_struct *swap_info[MAX_SWAPFILES];
> > >  
> > > @@ -640,66 +640,50 @@ no_page:
> > >  
> > >  swp_entry_t get_swap_page(void)
> > >  {
> > > -	struct swap_info_struct *si;
> > > +	struct swap_info_struct *si, *next;
> > >  	pgoff_t offset;
> > > -	int type, next;
> > > -	int wrapped = 0;
> > > -	int hp_index;
> > > +	struct list_head *tmp;
> > >  
> > >  	spin_lock(&swap_lock);
> > >  	if (atomic_long_read(&nr_swap_pages) <= 0)
> > >  		goto noswap;
> > >  	atomic_long_dec(&nr_swap_pages);
> > >  
> > > -	for (type = swap_list.next; type >= 0 && wrapped < 2; type = next) {
> > > -		hp_index = atomic_xchg(&highest_priority_index, -1);
> > > -		/*
> > > -		 * highest_priority_index records current highest priority swap
> > > -		 * type which just frees swap entries. If its priority is
> > > -		 * higher than that of swap_list.next swap type, we use it.  It
> > > -		 * isn't protected by swap_lock, so it can be an invalid value
> > > -		 * if the corresponding swap type is swapoff. We double check
> > > -		 * the flags here. It's even possible the swap type is swapoff
> > > -		 * and swapon again and its priority is changed. In such rare
> > > -		 * case, low prority swap type might be used, but eventually
> > > -		 * high priority swap will be used after several rounds of
> > > -		 * swap.
> > > -		 */
> > > -		if (hp_index != -1 && hp_index != type &&
> > > -		    swap_info[type]->prio < swap_info[hp_index]->prio &&
> > > -		    (swap_info[hp_index]->flags & SWP_WRITEOK)) {
> > > -			type = hp_index;
> > > -			swap_list.next = type;
> > > -		}
> > > -
> > > -		si = swap_info[type];
> > > -		next = si->next;
> > > -		if (next < 0 ||
> > > -		    (!wrapped && si->prio != swap_info[next]->prio)) {
> > > -			next = swap_list.head;
> > > -			wrapped++;
> > > -		}
> > > -
> > > +	list_for_each(tmp, &swap_list_head) {
> > > +		si = list_entry(tmp, typeof(*si), list);
> > >  		spin_lock(&si->lock);
> > > -		if (!si->highest_bit) {
> > > -			spin_unlock(&si->lock);
> > > -			continue;
> > > -		}
> > > -		if (!(si->flags & SWP_WRITEOK)) {
> > > +		if (!si->highest_bit || !(si->flags & SWP_WRITEOK)) {
> > >  			spin_unlock(&si->lock);
> > >  			continue;
> > >  		}
> > >  
> > > -		swap_list.next = next;
> > > +		/*
> > > +		 * rotate the current swap_info that we're going to use
> > > +		 * to after any other swap_info that have the same prio,
> > > +		 * so that all equal-priority swap_info get used equally
> > > +		 */
> > > +		next = si;
> > > +		list_for_each_entry_continue(next, &swap_list_head, list) {
> > > +			if (si->prio != next->prio)
> > > +				break;
> > > +			list_rotate_left(&si->list);
> > > +			next = si;
> > > +		}
> > >  
> > 
> > The list manipulations will be a lot of cache writes as the list is shuffled
> > around. On slow storage I do not think this will be noticable but it may
> > be noticable on faster swap devices that are SSD based. I've added Shaohua
> > Li to the cc as he has been concerned with the performance of swap in the
> > past. Shaohua, can you run this patchset through any of your test cases
> > with the addition that multiple swap files are used to see if the cache
> > writes are noticable? You'll need multiple swap files, some of which are
> > at equal priority so the list shuffling logic is triggered.
> 
> get_swap_page isn't hot so far (and we hold the swap_lock, which isn't
> contended), guess it's because other problems hide it, for example tlb flush
> overhead.
> 

The old method was not free either but I wanted to be sure you were
aware of the series just in case. Thanks.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Mel Gorman <mgorman@suse.de>
To: Shaohua Li <shli@kernel.org>
Cc: Dan Streetman <ddstreet@ieee.org>,
	Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>,
	Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
	Shaohua Li <shli@fusionio.com>, Weijie Yang <weijieut@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] swap: change swap_info singly-linked list to list_head
Date: Thu, 24 Apr 2014 09:30:18 +0100	[thread overview]
Message-ID: <20140424083018.GO23991@suse.de> (raw)
In-Reply-To: <20140424001705.GA8066@kernel.org>

On Thu, Apr 24, 2014 at 08:17:05AM +0800, Shaohua Li wrote:
> On Wed, Apr 23, 2014 at 11:34:00AM +0100, Mel Gorman wrote:
> > > @@ -366,7 +361,7 @@ static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
> > >  		}
> > >  		vm_unacct_memory(pages);
> > >  		*unused = pages_to_unuse;
> > > -		*swapid = type;
> > > +		*swapid = si->type;
> > >  		ret = 0;
> > >  		break;
> > >  	}
> > > @@ -413,7 +408,7 @@ void frontswap_shrink(unsigned long target_pages)
> > >  	/*
> > >  	 * we don't want to hold swap_lock while doing a very
> > >  	 * lengthy try_to_unuse, but swap_list may change
> > > -	 * so restart scan from swap_list.head each time
> > > +	 * so restart scan from swap_list_head each time
> > >  	 */
> > >  	spin_lock(&swap_lock);
> > >  	ret = __frontswap_shrink(target_pages, &pages_to_unuse, &type);
> > > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > > index 4a7f7e6..b958645 100644
> > > --- a/mm/swapfile.c
> > > +++ b/mm/swapfile.c
> > > @@ -51,14 +51,14 @@ atomic_long_t nr_swap_pages;
> > >  /* protected with swap_lock. reading in vm_swap_full() doesn't need lock */
> > >  long total_swap_pages;
> > >  static int least_priority;
> > > -static atomic_t highest_priority_index = ATOMIC_INIT(-1);
> > >  
> > >  static const char Bad_file[] = "Bad swap file entry ";
> > >  static const char Unused_file[] = "Unused swap file entry ";
> > >  static const char Bad_offset[] = "Bad swap offset entry ";
> > >  static const char Unused_offset[] = "Unused swap offset entry ";
> > >  
> > > -struct swap_list_t swap_list = {-1, -1};
> > > +/* all active swap_info */
> > > +LIST_HEAD(swap_list_head);
> > >  
> > >  struct swap_info_struct *swap_info[MAX_SWAPFILES];
> > >  
> > > @@ -640,66 +640,50 @@ no_page:
> > >  
> > >  swp_entry_t get_swap_page(void)
> > >  {
> > > -	struct swap_info_struct *si;
> > > +	struct swap_info_struct *si, *next;
> > >  	pgoff_t offset;
> > > -	int type, next;
> > > -	int wrapped = 0;
> > > -	int hp_index;
> > > +	struct list_head *tmp;
> > >  
> > >  	spin_lock(&swap_lock);
> > >  	if (atomic_long_read(&nr_swap_pages) <= 0)
> > >  		goto noswap;
> > >  	atomic_long_dec(&nr_swap_pages);
> > >  
> > > -	for (type = swap_list.next; type >= 0 && wrapped < 2; type = next) {
> > > -		hp_index = atomic_xchg(&highest_priority_index, -1);
> > > -		/*
> > > -		 * highest_priority_index records current highest priority swap
> > > -		 * type which just frees swap entries. If its priority is
> > > -		 * higher than that of swap_list.next swap type, we use it.  It
> > > -		 * isn't protected by swap_lock, so it can be an invalid value
> > > -		 * if the corresponding swap type is swapoff. We double check
> > > -		 * the flags here. It's even possible the swap type is swapoff
> > > -		 * and swapon again and its priority is changed. In such rare
> > > -		 * case, low prority swap type might be used, but eventually
> > > -		 * high priority swap will be used after several rounds of
> > > -		 * swap.
> > > -		 */
> > > -		if (hp_index != -1 && hp_index != type &&
> > > -		    swap_info[type]->prio < swap_info[hp_index]->prio &&
> > > -		    (swap_info[hp_index]->flags & SWP_WRITEOK)) {
> > > -			type = hp_index;
> > > -			swap_list.next = type;
> > > -		}
> > > -
> > > -		si = swap_info[type];
> > > -		next = si->next;
> > > -		if (next < 0 ||
> > > -		    (!wrapped && si->prio != swap_info[next]->prio)) {
> > > -			next = swap_list.head;
> > > -			wrapped++;
> > > -		}
> > > -
> > > +	list_for_each(tmp, &swap_list_head) {
> > > +		si = list_entry(tmp, typeof(*si), list);
> > >  		spin_lock(&si->lock);
> > > -		if (!si->highest_bit) {
> > > -			spin_unlock(&si->lock);
> > > -			continue;
> > > -		}
> > > -		if (!(si->flags & SWP_WRITEOK)) {
> > > +		if (!si->highest_bit || !(si->flags & SWP_WRITEOK)) {
> > >  			spin_unlock(&si->lock);
> > >  			continue;
> > >  		}
> > >  
> > > -		swap_list.next = next;
> > > +		/*
> > > +		 * rotate the current swap_info that we're going to use
> > > +		 * to after any other swap_info that have the same prio,
> > > +		 * so that all equal-priority swap_info get used equally
> > > +		 */
> > > +		next = si;
> > > +		list_for_each_entry_continue(next, &swap_list_head, list) {
> > > +			if (si->prio != next->prio)
> > > +				break;
> > > +			list_rotate_left(&si->list);
> > > +			next = si;
> > > +		}
> > >  
> > 
> > The list manipulations will be a lot of cache writes as the list is shuffled
> > around. On slow storage I do not think this will be noticable but it may
> > be noticable on faster swap devices that are SSD based. I've added Shaohua
> > Li to the cc as he has been concerned with the performance of swap in the
> > past. Shaohua, can you run this patchset through any of your test cases
> > with the addition that multiple swap files are used to see if the cache
> > writes are noticable? You'll need multiple swap files, some of which are
> > at equal priority so the list shuffling logic is triggered.
> 
> get_swap_page isn't hot so far (and we hold the swap_lock, which isn't
> contended), guess it's because other problems hide it, for example tlb flush
> overhead.
> 

The old method was not free either but I wanted to be sure you were
aware of the series just in case. Thanks.

-- 
Mel Gorman
SUSE Labs

  reply	other threads:[~2014-04-24  8:30 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-13 10:42 [PATCH] mm: swap: Use swapfiles in priority order Mel Gorman
2014-02-13 10:42 ` Mel Gorman
2014-02-13 15:58 ` Weijie Yang
2014-02-13 15:58   ` Weijie Yang
2014-02-14 10:17   ` Mel Gorman
2014-02-14 10:17     ` Mel Gorman
2014-02-14 13:33     ` Weijie Yang
2014-02-14 13:33       ` Weijie Yang
2014-02-14 13:10   ` Christian Ehrhardt
2014-02-16  2:59     ` Weijie Yang
2014-02-24  8:28       ` Hugh Dickins
2014-02-24  8:28         ` Hugh Dickins
2014-04-12 21:00         ` [PATCH 0/2] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-04-12 21:00           ` Dan Streetman
2014-04-12 21:00           ` [PATCH 1/2] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-04-12 21:00             ` Dan Streetman
2014-04-23 10:34             ` Mel Gorman
2014-04-23 10:34               ` Mel Gorman
2014-04-24  0:17               ` Shaohua Li
2014-04-24  0:17                 ` Shaohua Li
2014-04-24  8:30                 ` Mel Gorman [this message]
2014-04-24  8:30                   ` Mel Gorman
2014-04-24 18:48               ` Dan Streetman
2014-04-24 18:48                 ` Dan Streetman
2014-04-25  4:15                 ` Weijie Yang
2014-04-25  4:15                   ` Weijie Yang
2014-05-02 20:00                   ` Dan Streetman
2014-05-02 20:00                     ` Dan Streetman
2014-05-04  9:39                     ` Bob Liu
2014-05-04  9:39                     ` Bob Liu
2014-05-04  9:39                       ` Bob Liu
2014-05-04 20:16                       ` Dan Streetman
2014-05-04 20:16                       ` Dan Streetman
2014-05-04 20:16                         ` Dan Streetman
2014-05-02 20:00                   ` Dan Streetman
2014-04-25  8:38                 ` Mel Gorman
2014-04-25  8:38                   ` Mel Gorman
2014-04-12 21:00           ` [PATCH 2/2] swap: use separate priority list for available swap_infos Dan Streetman
2014-04-12 21:00             ` Dan Streetman
2014-04-23 13:14             ` Mel Gorman
2014-04-23 13:14               ` Mel Gorman
2014-04-24 17:52               ` Dan Streetman
2014-04-24 17:52                 ` Dan Streetman
2014-04-25  8:49                 ` Mel Gorman
2014-04-25  8:49                   ` Mel Gorman
2014-05-02 19:02           ` [PATCHv2 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-02 19:02             ` Dan Streetman
2014-05-02 19:02             ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-02 19:02               ` Dan Streetman
2014-05-02 19:02             ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-02 19:02               ` Dan Streetman
2014-05-12 10:35               ` Mel Gorman
2014-05-12 10:35                 ` Mel Gorman
2014-05-02 19:02             ` [PATCH 3/4] plist: add plist_rotate Dan Streetman
2014-05-02 19:02               ` Dan Streetman
2014-05-06  2:18               ` Steven Rostedt
2014-05-06  2:18                 ` Steven Rostedt
2014-05-06 20:12                 ` Dan Streetman
2014-05-06 20:12                   ` Dan Streetman
2014-05-06 20:39                   ` Steven Rostedt
2014-05-06 20:39                     ` Steven Rostedt
2014-05-06 21:47                     ` Dan Streetman
2014-05-06 21:47                       ` Dan Streetman
2014-05-06 22:43                       ` Steven Rostedt
2014-05-06 22:43                         ` Steven Rostedt
2014-05-02 19:02             ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-02 19:02               ` Dan Streetman
2014-05-05 15:51               ` Dan Streetman
2014-05-05 15:51                 ` Dan Streetman
2014-05-05 19:13               ` Steven Rostedt
2014-05-05 19:13                 ` Steven Rostedt
2014-05-05 19:38                 ` Peter Zijlstra
2014-05-09 20:42                 ` [PATCH] plist: make CONFIG_DEBUG_PI_LIST selectable Dan Streetman
2014-05-09 20:42                   ` Dan Streetman
2014-05-09 21:17                   ` Steven Rostedt
2014-05-09 21:17                     ` Steven Rostedt
2014-05-12 11:11               ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Mel Gorman
2014-05-12 11:11                 ` Mel Gorman
2014-05-12 13:00                 ` Dan Streetman
2014-05-12 13:00                   ` Dan Streetman
2014-05-12 16:38             ` [PATCHv3 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-12 16:38               ` Dan Streetman
2014-05-12 16:38               ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-12 16:38                 ` Dan Streetman
2014-05-12 16:38               ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-12 16:38                 ` Dan Streetman
2014-05-12 16:38               ` [PATCHv2 3/4] plist: add plist_requeue Dan Streetman
2014-05-12 16:38                 ` Dan Streetman
2014-05-13 10:33                 ` Mel Gorman
2014-05-13 10:33                   ` Mel Gorman
2014-05-12 16:38               ` [PATCHv2 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-12 16:38                 ` Dan Streetman
2014-05-13 10:34                 ` Mel Gorman
2014-05-13 10:34                   ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140424083018.GO23991@suse.de \
    --to=mgorman@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=ddstreet@ieee.org \
    --cc=ehrhardt@linux.vnet.ibm.com \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=shli@fusionio.com \
    --cc=shli@kernel.org \
    --cc=weijieut@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.