From: Rik van Riel <riel@redhat.com>
To: Ebru Akagunduz <ebru.akagunduz@gmail.com>, linux-mm@kvack.org
Cc: akpm@linux-foundation.org, kirill@shutemov.name, mhocko@suse.cz,
	mgorman@suse.de, rientjes@google.com, sasha.levin@oracle.com,
	hughd@google.com, hannes@cmpxchg.org, vbabka@suse.cz,
	linux-kernel@vger.kernel.org, aarcange@redhat.com
Subject: Re: [PATCH] mm: incorporate read-only pages into transparent huge pages
Date: Fri, 23 Jan 2015 14:04:11 -0500
Message-ID: <54C29B2B.4070800@redhat.com>
In-Reply-To: <1421999256-3881-1-git-send-email-ebru.akagunduz@gmail.com>

On 01/23/2015 02:47 AM, Ebru Akagunduz wrote:

> @@ -2169,7 +2169,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
>  
>  		/* cannot use mapcount: can't collapse if there's a gup pin */
> -		if (page_count(page) != 1)
> +		if (page_count(page) != 1 + !!PageSwapCache(page))
>  			goto out;
>  		/*
>  		 * We can do it before isolate_lru_page because the
> @@ -2179,6 +2179,17 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		 */
>  		if (!trylock_page(page))
>  			goto out;
> +		if (!pte_write(pteval)) {
> +			if (PageSwapCache(page) && !reuse_swap_page(page)) {
> +					unlock_page(page);
> +					goto out;
> +			}
> +			/*
> +			 * Page is not in the swap cache, and page count is
> +			 * one (see above). It can be collapsed into a THP.
> +			 */
> +		}

Andrea pointed out a bug between the above two parts of
the patch.

In-between where we check page_count(page), and where we
check whether the page got added to the swap cache, the
page count may change, causing us to get into a race
condition with get_user_pages_fast, the pageout code, etc.

It is necessary to check the page count again right after
the trylock_page(page) above, to make sure it was not changed
while the page was not yet locked.

That second check should have a comment explaining that
the first "page_count(page) != 1 + !!PageSwapCache(page)"
check may be unsafe because the page is not yet locked at
that point, so the check needs to be repeated. Maybe
something along the lines of:

     /* Re-check the page count with the page locked */
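
The ordering described above (unlocked check, trylock, repeat the
check under the lock) can be sketched in userspace C. This is only a
model under stated assumptions, not the khugepaged code: struct
fake_page, fake_trylock(), fake_unlock(), try_isolate() and the
window_hook parameter are hypothetical names made up for the sketch,
and a C11 atomic_flag stands in for the page lock. It shows the
sequence of checks only and does not reproduce the kernel's locking
rules.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct fake_page {
    atomic_int  refcount;     /* models page_count(page) */
    atomic_bool in_swapcache; /* models PageSwapCache(page) */
    atomic_flag locked;       /* models the page lock */
};

static void fake_page_init(struct fake_page *p, int refs, bool swapcache)
{
    atomic_init(&p->refcount, refs);
    atomic_init(&p->in_swapcache, swapcache);
    atomic_flag_clear(&p->locked);
}

/* Models trylock_page(): true if the lock was acquired. */
static bool fake_trylock(struct fake_page *p)
{
    return !atomic_flag_test_and_set(&p->locked);
}

/* Models unlock_page(). */
static void fake_unlock(struct fake_page *p)
{
    atomic_flag_clear(&p->locked);
}

/* Models "page_count(page) == 1 + !!PageSwapCache(page)". */
static bool count_is_expected(struct fake_page *p)
{
    return atomic_load(&p->refcount) ==
           1 + (atomic_load(&p->in_swapcache) ? 1 : 0);
}

/*
 * Returns true with the lock held if the page may be isolated,
 * false with the lock not held otherwise. window_hook (may be NULL)
 * is a test-only hook that runs in the window between the unlocked
 * check and the trylock, simulating a concurrent reference taker.
 */
static bool try_isolate(struct fake_page *p,
                        void (*window_hook)(struct fake_page *))
{
    /* Unlocked check: a cheap early exit, not a guarantee. */
    if (!count_is_expected(p))
        return false;

    if (window_hook)
        window_hook(p);

    if (!fake_trylock(p))
        return false;

    /* Re-check the page count with the page locked */
    if (!count_is_expected(p)) {
        fake_unlock(p);
        return false;
    }
    return true;
}

/* Simulates an extra reference being taken (e.g. a gup pin). */
static void take_extra_ref(struct fake_page *p)
{
    atomic_fetch_add(&p->refcount, 1);
}
```

Calling try_isolate(&p, take_extra_ref) on a page that starts with
one reference passes the first check but fails the repeated one, and
the lock is dropped again before returning. Without the second
check, the same call would go on to treat the page as safe to
collapse.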
