linux-mm.kvack.org archive mirror
From: "Huang, Ying" <ying.huang@intel.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huang, Ying" <ying.huang@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH -mm -v7 9/9] mm, THP, swap: Delay splitting THP during swap out
Date: Sat, 01 Apr 2017 10:54:39 +0800
Message-ID: <874ly8suq8.fsf@yhuang-dev.intel.com>
In-Reply-To: <20170331144948.GA6408@cmpxchg.org> (Johannes Weiner's message of "Fri, 31 Mar 2017 10:49:48 -0400")

Johannes Weiner <hannes@cmpxchg.org> writes:

> On Thu, Mar 30, 2017 at 12:15:13PM +0800, Huang, Ying wrote:
>> Johannes Weiner <hannes@cmpxchg.org> writes:
>> > On Tue, Mar 28, 2017 at 01:32:09PM +0800, Huang, Ying wrote:
>> >> @@ -198,6 +240,18 @@ int add_to_swap(struct page *page, struct list_head *list)
>> >>  	VM_BUG_ON_PAGE(!PageLocked(page), page);
>> >>  	VM_BUG_ON_PAGE(!PageUptodate(page), page);
>> >>  
>> >> +	if (unlikely(PageTransHuge(page))) {
>> >> +		err = add_to_swap_trans_huge(page, list);
>> >> +		switch (err) {
>> >> +		case 1:
>> >> +			return 1;
>> >> +		case 0:
>> >> +			/* fallback to split firstly if return 0 */
>> >> +			break;
>> >> +		default:
>> >> +			return 0;
>> >> +		}
>> >> +	}
>> >>  	entry = get_swap_page();
>> >>  	if (!entry.val)
>> >>  		return 0;
>> >
>> > add_to_swap_trans_huge() is too close a copy of add_to_swap(), which
>> > makes the code error prone for future modifications to the swap slot
>> > allocation protocol.
>> >
>> > This should read:
>> >
>> > retry:
>> > 	entry = get_swap_page(page);
>> > 	if (!entry.val) {
>> > 		if (PageTransHuge(page)) {
>> > 			split_huge_page_to_list(page, list);
>> > 			goto retry;
>> > 		}
>> > 		return 0;
>> > 	}
>> 
>> If the swap space is used up, that is, if get_swap_page() cannot
>> allocate even 1 swap entry for a normal page, we will split the THP
>> unnecessarily with the change, whereas the original code just skips
>> the THP.  There may be a performance regression here.  A similar
>> problem exists for mem_cgroup_try_charge_swap() too.  If the mem
>> cgroup exceeds the swap limit, the THP will be split unnecessarily
>> with the change too.
>
> If we skip the page, we're swapping out another page hotter than this
> one. Giving THP preservation priority over LRU order is an issue best
> kept for a separate patch set;

In my original patch, if we fail to allocate swap space for a THP but
can allocate swap space for a normal page, we split the THP.  We skip
the page only if we cannot allocate swap space even for a normal page,
that is, when nr_swap_pages is 0.  So the patch does not give THP
preservation priority over LRU order.
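
To make the three cases concrete, here is a minimal userspace sketch of
that policy.  It is not kernel code: the enum, the function name, and
the "cluster_available" flag are made up for illustration, and only
nr_swap_pages corresponds to a real kernel variable.

```c
#include <stdbool.h>

/* Hypothetical model of the decision described above; not kernel code. */
enum swap_action {
	SWAP_WHOLE_THP,	/* a full swap cluster is available for the THP */
	SWAP_SPLIT_THP,	/* no cluster, but single slots remain: split */
	SWAP_SKIP_PAGE,	/* no swap slot at all: skip, do not split */
};

/*
 * nr_swap_pages:     number of free swap slots (models the kernel counter)
 * cluster_available: assumed flag, true if a whole free cluster exists
 */
static enum swap_action thp_swap_action(long nr_swap_pages,
					bool cluster_available)
{
	if (cluster_available)
		return SWAP_WHOLE_THP;
	if (nr_swap_pages > 0)
		return SWAP_SPLIT_THP;
	return SWAP_SKIP_PAGE;
}
```

Only the last case, where nr_swap_pages is 0, leaves the THP unsplit.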

> this one is supposed to be a mechanical
> implementation of THP swapping. Let's nail down the basics first.

Yes.  So I tried to keep the original behavior for the case where we
cannot allocate the swap space (a swap cluster) for a whole THP.

As I understand it, the only difference between what you suggested and
the original behavior is whether to split the THP when nr_swap_pages
is 0.

> Such a decision would need proof that splitting THPs on full swap
> devices is a concern for real applications. I would assume that we're
> pretty close to OOM anyway; it's much more likely that a single slot
> frees up than a full cluster, at which point we'll be splitting THPs
> anyway; etc. I have my doubts that this would be measurable.
>
> But even if so, I don't think we'd have to duplicate the main code
> flow to handle this corner case. You can extend get_swap_page() to
> return an error code that tells add_to_swap() whether to split and
> retry, or to fail and move on. So this way should be future proof.

Yes.  I will try to merge add_to_swap_trans_huge() into add_to_swap() in
the next version.  But if we want to keep the original behavior, we will
need an extra "nr_entries" parameter for mem_cgroup_try_charge_swap().
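
As a rough illustration of the merged flow, below is a userspace model
in which the slot allocator reports whether the caller should split and
retry or fail and move on.  Everything prefixed with model_ or MODEL_ is
hypothetical, the 512-page cluster size is an assumption (2 MiB THP with
4 KiB pages), and memcg charging is left out entirely.  It is a sketch
of the idea, not the code for the next version.

```c
#include <stdbool.h>

#define MODEL_CLUSTER_PAGES 512	/* assumption: 2 MiB THP / 4 KiB pages */

/* Hypothetical stand-ins for the swap device and the page. */
struct model_swap { long free_slots; long free_clusters; };
struct model_page { bool huge; };

enum model_status { MODEL_OK, MODEL_SPLIT_RETRY, MODEL_FAIL };

/* Models a get_swap_page() that returns a status instead of an entry. */
static enum model_status model_get_swap_page(struct model_swap *s,
					     const struct model_page *p)
{
	if (p->huge) {
		if (s->free_clusters > 0) {
			s->free_clusters--;
			s->free_slots -= MODEL_CLUSTER_PAGES;
			return MODEL_OK;
		}
		/* No cluster: split only if a single slot is still free. */
		return s->free_slots > 0 ? MODEL_SPLIT_RETRY : MODEL_FAIL;
	}
	if (s->free_slots > 0) {
		s->free_slots--;
		/* A single allocation may break up a free cluster. */
		if (s->free_slots < s->free_clusters * MODEL_CLUSTER_PAGES)
			s->free_clusters--;
		return MODEL_OK;
	}
	return MODEL_FAIL;
}

/* Returns 1 on success and 0 on failure, like add_to_swap() above. */
static int model_add_to_swap(struct model_swap *s, struct model_page *p)
{
	for (;;) {
		switch (model_get_swap_page(s, p)) {
		case MODEL_OK:
			return 1;
		case MODEL_SPLIT_RETRY:
			/* Stands in for split_huge_page_to_list(). */
			p->huge = false;
			continue;
		default:
			return 0;
		}
	}
}
```

In this model a THP on a completely full device is left unsplit, which
matches the original behavior.  After a split, only the head page is
followed through the retry; the tail pages would go through the same
path on their own.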

Best Regards,
Huang, Ying

