All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Huang\, Ying" <ying.huang@intel.com>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: "Huang, Ying" <ying.huang@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrea Arcangeli <aarcange@redhat.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Hugh Dickins <hughd@google.com>, Shaohua Li <shli@kernel.org>,
	Minchan Kim <minchan@kernel.org>, Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH -mm -v6 4/9] mm, THP, swap: Add get_huge_swap_page()
Date: Wed, 15 Mar 2017 09:08:46 +0800	[thread overview]
Message-ID: <871stze481.fsf@yhuang-dev.intel.com> (raw)
In-Reply-To: <1489534821.2733.47.camel@linux.intel.com> (Tim Chen's message of "Tue, 14 Mar 2017 16:40:21 -0700")

Tim Chen <tim.c.chen@linux.intel.com> writes:

> On Wed, 2017-03-08 at 15:26 +0800, Huang, Ying wrote:
>> From: Huang Ying <ying.huang@intel.com>
>> 
>> A variation of get_swap_page(), get_huge_swap_page(), is added to
>> allocate a swap cluster (HPAGE_PMD_NR swap slots) based on the swap
>> cluster allocation function.A A A fair simple algorithm is used, that is,
>> only the first swap device in priority list will be tried to allocate
>> the swap cluster.A A The function will fail if the trying is not
>> successful, and the caller will fallback to allocate a single swap slot
>> instead.A A This works good enough for normal cases.
>> 
>> This will be used for the THP (Transparent Huge Page) swap support.
>> Where get_huge_swap_page() will be used to allocate one swap cluster for
>> each THP swapped out.
>> 
>> Because of the algorithm adopted, if the difference of the number of the
>> free swap clusters among multiple swap devices is significant, it is
>> possible that some THPs are split earlier than necessary.A A For example,
>> this could be caused by big size difference among multiple swap devices.
>> 
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Cc: Hugh Dickins <hughd@google.com>
>> Cc: Shaohua Li <shli@kernel.org>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Rik van Riel <riel@redhat.com>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> ---
>> A include/linux/swap.h | 19 ++++++++++++++++++-
>> A mm/swap_slots.cA A A A A A |A A 5 +++--
>> A mm/swapfile.cA A A A A A A A | 16 ++++++++++++----
>> A 3 files changed, 33 insertions(+), 7 deletions(-)
>> 
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index 278e1349a424..e3a7609a8989 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -388,7 +388,7 @@ static inline long get_nr_swap_pages(void)
>> A extern void si_swapinfo(struct sysinfo *);
>> A extern swp_entry_t get_swap_page(void);
>> A extern swp_entry_t get_swap_page_of_type(int);
>> -extern int get_swap_pages(int n, swp_entry_t swp_entries[]);
>> +extern int get_swap_pages(int n, swp_entry_t swp_entries[], bool huge);
>> A extern int add_swap_count_continuation(swp_entry_t, gfp_t);
>> A extern void swap_shmem_alloc(swp_entry_t);
>> A extern int swap_duplicate(swp_entry_t);
>> @@ -527,6 +527,23 @@ static inline swp_entry_t get_swap_page(void)
>> A 
>> A #endif /* CONFIG_SWAP */
>> A 
>> +#ifdef CONFIG_THP_SWAP_CLUSTER
>> +static inline swp_entry_t get_huge_swap_page(void)
>> +{
>> +	swp_entry_t entry;
>> +
>> +	if (get_swap_pages(1, &entry, true))
>> +		return entry;
>> +	else
>> +		return (swp_entry_t) {0};
>> +}
>> +#else
>> +static inline swp_entry_t get_huge_swap_page(void)
>> +{
>> +	return (swp_entry_t) {0};
>> +}
>> +#endif
>> +
>> A #ifdef CONFIG_MEMCG
>> A static inline int mem_cgroup_swappiness(struct mem_cgroup *memcg)
>> A {
>> diff --git a/mm/swap_slots.c b/mm/swap_slots.c
>> index 9b5bc86f96ad..075bb39e03c5 100644
>> --- a/mm/swap_slots.c
>> +++ b/mm/swap_slots.c
>> @@ -258,7 +258,8 @@ static int refill_swap_slots_cache(struct swap_slots_cache *cache)
>> A 
>> A 	cache->cur = 0;
>> A 	if (swap_slot_cache_active)
>> -		cache->nr = get_swap_pages(SWAP_SLOTS_CACHE_SIZE, cache->slots);
>> +		cache->nr = get_swap_pages(SWAP_SLOTS_CACHE_SIZE, cache->slots,
>> +					A A A false);
>> A 
>> A 	return cache->nr;
>> A }
>> @@ -334,7 +335,7 @@ swp_entry_t get_swap_page(void)
>> A 			return entry;
>> A 	}
>> A 
>> -	get_swap_pages(1, &entry);
>> +	get_swap_pages(1, &entry, false);
>> A 
>> A 	return entry;
>> A }
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 91876c33114b..7241c937e52b 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -904,11 +904,12 @@ static unsigned long scan_swap_map(struct swap_info_struct *si,
>> A 
>> A }
>> A 
>> -int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
>
>
>> +int get_swap_pages(int n_goal, swp_entry_t swp_entries[], bool huge)
>> A {
>> A 	struct swap_info_struct *si, *next;
>> A 	long avail_pgs;
>> A 	int n_ret = 0;
>> +	int nr_pages = huge_cluster_nr_entries(huge);
>> A 
>> A 	avail_pgs = atomic_long_read(&nr_swap_pages);
>> A 	if (avail_pgs <= 0)
>> @@ -920,6 +921,10 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
>> A 	if (n_goal > avail_pgs)
>> A 		n_goal = avail_pgs;
>> A 
>> +	n_goal *= nr_pages;
>
> I think if (n_goal > 1) when huge is true,A 
> n_goal should be set to huge_cluster_nr_entries(huge) here
> or we could have an invalid check below. We probably
> should add a comment to get_swap_pages on how we treat
> n_goal when huge is true. A Maybe say we will always treat
> n_goal as SWAPFILE_CLUSTER when huge is true.A 

Yes.  The meaning of n_goal and n_ret isn't consistent between huge and
normal swap entry allocation.  I will revise the logic in the function
to make them consistent.

>> +	if (avail_pgs < n_goal)
>> +		goto noswap;
>> +
>> A 	atomic_long_sub(n_goal, &nr_swap_pages);
>> A 
>> A 	spin_lock(&swap_avail_lock);
>> @@ -946,10 +951,13 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
>> A 			spin_unlock(&si->lock);
>> A 			goto nextsi;
>> A 		}
>> -		n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
>> -					A A A A n_goal, swp_entries);
>> +		if (likely(nr_pages == 1))
>
> if (likely(!huge)) is probably more readable

Sure.

Best Regards,
Huang, Ying

>> +			n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
>> +						A A A A n_goal, swp_entries);
>> +		else
>> +			n_ret = swap_alloc_huge_cluster(si, swp_entries);
>> A 		spin_unlock(&si->lock);
>
> Thanks.
>
> Tim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: "Huang\, Ying" <ying.huang@intel.com>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: "Huang\, Ying" <ying.huang@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
	<linux-kernel@vger.kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Hugh Dickins <hughd@google.com>, "Shaohua Li" <shli@kernel.org>,
	Minchan Kim <minchan@kernel.org>, Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH -mm -v6 4/9] mm, THP, swap: Add get_huge_swap_page()
Date: Wed, 15 Mar 2017 09:08:46 +0800	[thread overview]
Message-ID: <871stze481.fsf@yhuang-dev.intel.com> (raw)
In-Reply-To: <1489534821.2733.47.camel@linux.intel.com> (Tim Chen's message of "Tue, 14 Mar 2017 16:40:21 -0700")

Tim Chen <tim.c.chen@linux.intel.com> writes:

> On Wed, 2017-03-08 at 15:26 +0800, Huang, Ying wrote:
>> From: Huang Ying <ying.huang@intel.com>
>> 
>> A variation of get_swap_page(), get_huge_swap_page(), is added to
>> allocate a swap cluster (HPAGE_PMD_NR swap slots) based on the swap
>> cluster allocation function.  A fair simple algorithm is used, that is,
>> only the first swap device in priority list will be tried to allocate
>> the swap cluster.  The function will fail if the trying is not
>> successful, and the caller will fallback to allocate a single swap slot
>> instead.  This works good enough for normal cases.
>> 
>> This will be used for the THP (Transparent Huge Page) swap support.
>> Where get_huge_swap_page() will be used to allocate one swap cluster for
>> each THP swapped out.
>> 
>> Because of the algorithm adopted, if the difference of the number of the
>> free swap clusters among multiple swap devices is significant, it is
>> possible that some THPs are split earlier than necessary.  For example,
>> this could be caused by big size difference among multiple swap devices.
>> 
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Cc: Hugh Dickins <hughd@google.com>
>> Cc: Shaohua Li <shli@kernel.org>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Rik van Riel <riel@redhat.com>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> ---
>>  include/linux/swap.h | 19 ++++++++++++++++++-
>>  mm/swap_slots.c      |  5 +++--
>>  mm/swapfile.c        | 16 ++++++++++++----
>>  3 files changed, 33 insertions(+), 7 deletions(-)
>> 
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index 278e1349a424..e3a7609a8989 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -388,7 +388,7 @@ static inline long get_nr_swap_pages(void)
>>  extern void si_swapinfo(struct sysinfo *);
>>  extern swp_entry_t get_swap_page(void);
>>  extern swp_entry_t get_swap_page_of_type(int);
>> -extern int get_swap_pages(int n, swp_entry_t swp_entries[]);
>> +extern int get_swap_pages(int n, swp_entry_t swp_entries[], bool huge);
>>  extern int add_swap_count_continuation(swp_entry_t, gfp_t);
>>  extern void swap_shmem_alloc(swp_entry_t);
>>  extern int swap_duplicate(swp_entry_t);
>> @@ -527,6 +527,23 @@ static inline swp_entry_t get_swap_page(void)
>>  
>>  #endif /* CONFIG_SWAP */
>>  
>> +#ifdef CONFIG_THP_SWAP_CLUSTER
>> +static inline swp_entry_t get_huge_swap_page(void)
>> +{
>> +	swp_entry_t entry;
>> +
>> +	if (get_swap_pages(1, &entry, true))
>> +		return entry;
>> +	else
>> +		return (swp_entry_t) {0};
>> +}
>> +#else
>> +static inline swp_entry_t get_huge_swap_page(void)
>> +{
>> +	return (swp_entry_t) {0};
>> +}
>> +#endif
>> +
>>  #ifdef CONFIG_MEMCG
>>  static inline int mem_cgroup_swappiness(struct mem_cgroup *memcg)
>>  {
>> diff --git a/mm/swap_slots.c b/mm/swap_slots.c
>> index 9b5bc86f96ad..075bb39e03c5 100644
>> --- a/mm/swap_slots.c
>> +++ b/mm/swap_slots.c
>> @@ -258,7 +258,8 @@ static int refill_swap_slots_cache(struct swap_slots_cache *cache)
>>  
>>  	cache->cur = 0;
>>  	if (swap_slot_cache_active)
>> -		cache->nr = get_swap_pages(SWAP_SLOTS_CACHE_SIZE, cache->slots);
>> +		cache->nr = get_swap_pages(SWAP_SLOTS_CACHE_SIZE, cache->slots,
>> +					   false);
>>  
>>  	return cache->nr;
>>  }
>> @@ -334,7 +335,7 @@ swp_entry_t get_swap_page(void)
>>  			return entry;
>>  	}
>>  
>> -	get_swap_pages(1, &entry);
>> +	get_swap_pages(1, &entry, false);
>>  
>>  	return entry;
>>  }
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 91876c33114b..7241c937e52b 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -904,11 +904,12 @@ static unsigned long scan_swap_map(struct swap_info_struct *si,
>>  
>>  }
>>  
>> -int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
>
>
>> +int get_swap_pages(int n_goal, swp_entry_t swp_entries[], bool huge)
>>  {
>>  	struct swap_info_struct *si, *next;
>>  	long avail_pgs;
>>  	int n_ret = 0;
>> +	int nr_pages = huge_cluster_nr_entries(huge);
>>  
>>  	avail_pgs = atomic_long_read(&nr_swap_pages);
>>  	if (avail_pgs <= 0)
>> @@ -920,6 +921,10 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
>>  	if (n_goal > avail_pgs)
>>  		n_goal = avail_pgs;
>>  
>> +	n_goal *= nr_pages;
>
> I think if (n_goal > 1) when huge is true, 
> n_goal should be set to huge_cluster_nr_entries(huge) here
> or we could have an invalid check below. We probably
> should add a comment to get_swap_pages on how we treat
> n_goal when huge is true.  Maybe say we will always treat
> n_goal as SWAPFILE_CLUSTER when huge is true. 

Yes.  The meaning of n_goal and n_ret isn't consistent between huge and
normal swap entry allocation.  I will revise the logic in the function
to make them consistent.

>> +	if (avail_pgs < n_goal)
>> +		goto noswap;
>> +
>>  	atomic_long_sub(n_goal, &nr_swap_pages);
>>  
>>  	spin_lock(&swap_avail_lock);
>> @@ -946,10 +951,13 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[])
>>  			spin_unlock(&si->lock);
>>  			goto nextsi;
>>  		}
>> -		n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
>> -					    n_goal, swp_entries);
>> +		if (likely(nr_pages == 1))
>
> if (likely(!huge)) is probably more readable

Sure.

Best Regards,
Huang, Ying

>> +			n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
>> +						    n_goal, swp_entries);
>> +		else
>> +			n_ret = swap_alloc_huge_cluster(si, swp_entries);
>>  		spin_unlock(&si->lock);
>
> Thanks.
>
> Tim

  reply	other threads:[~2017-03-15  1:08 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-08  7:26 [PATCH -v6 0/9] THP swap: Delay splitting THP during swapping out Huang, Ying
2017-03-08  7:26 ` Huang, Ying
2017-03-08  7:26 ` [PATCH -mm -v6 1/9] mm, swap: Make swap cluster size same of THP size on x86_64 Huang, Ying
2017-03-08  7:26   ` Huang, Ying
2017-03-08 12:56   ` Matthew Wilcox
2017-03-08 12:56     ` Matthew Wilcox
2017-03-09  1:45     ` Huang, Ying
2017-03-09  1:45       ` Huang, Ying
2017-03-08  7:26 ` [PATCH -mm -v6 2/9] mm, memcg: Support to charge/uncharge multiple swap entries Huang, Ying
2017-03-08  7:26   ` Huang, Ying
2017-03-08 11:37   ` Balbir Singh
2017-03-08 11:37     ` Balbir Singh
2017-03-08 11:37     ` Balbir Singh
2017-03-09  1:28     ` Huang, Ying
2017-03-09  1:28       ` Huang, Ying
2017-03-09  1:28       ` Huang, Ying
2017-03-09 21:22       ` Balbir Singh
2017-03-09 21:22         ` Balbir Singh
2017-03-08  7:26 ` [PATCH -mm -v6 3/9] mm, THP, swap: Add swap cluster allocate/free functions Huang, Ying
2017-03-08  7:26   ` Huang, Ying
2017-03-14 23:13   ` Tim Chen
2017-03-14 23:13     ` Tim Chen
2017-03-15  1:19     ` Huang, Ying
2017-03-15  1:19       ` Huang, Ying
2017-03-15 17:15       ` Tim Chen
2017-03-15 17:15         ` Tim Chen
2017-03-16  6:31         ` Huang, Ying
2017-03-16  6:31           ` Huang, Ying
2017-03-08  7:26 ` [PATCH -mm -v6 4/9] mm, THP, swap: Add get_huge_swap_page() Huang, Ying
2017-03-08  7:26   ` Huang, Ying
2017-03-14 23:40   ` Tim Chen
2017-03-14 23:40     ` Tim Chen
2017-03-15  1:08     ` Huang, Ying [this message]
2017-03-15  1:08       ` Huang, Ying
2017-03-08  7:26 ` [PATCH -mm -v6 5/9] mm, THP, swap: Support to clear SWAP_HAS_CACHE for huge page Huang, Ying
2017-03-08  7:26   ` Huang, Ying
2017-03-15  0:14   ` Tim Chen
2017-03-15  0:14     ` Tim Chen
2017-03-15  0:54     ` Huang, Ying
2017-03-15  0:54       ` Huang, Ying
2017-03-08  7:26 ` [PATCH -mm -v6 6/9] mm, THP, swap: Support to add/delete THP to/from swap cache Huang, Ying
2017-03-08  7:26   ` Huang, Ying
2017-03-08  7:26 ` [PATCH -mm -v6 7/9] mm, THP: Add can_split_huge_page() Huang, Ying
2017-03-08  7:26   ` Huang, Ying
2017-03-08  7:26 ` [PATCH -mm -v6 8/9] mm, THP, swap: Support to split THP in swap cache Huang, Ying
2017-03-08  7:26   ` Huang, Ying
2017-03-08  7:26 ` [PATCH -mm -v6 9/9] mm, THP, swap: Delay splitting THP during swap out Huang, Ying
2017-03-08  7:26   ` Huang, Ying

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871stze481.fsf@yhuang-dev.intel.com \
    --to=ying.huang@intel.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=riel@redhat.com \
    --cc=shli@kernel.org \
    --cc=tim.c.chen@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.