linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [rfc 5/5] mincore: transparent huge page support
Date: Thu, 25 Mar 2010 01:42:54 +0100	[thread overview]
Message-ID: <20100325004254.GS10659@random.random> (raw)
In-Reply-To: <20100325000749.GA27304@cmpxchg.org>

On Thu, Mar 25, 2010 at 01:07:49AM +0100, Johannes Weiner wrote:
> Stupid me.  I knew that, I just hadn't internalized it enough to do it
> right :)

Never mind ;)

> Btw, unless I miss something else, this is the same in follow_page()?
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 22ee158..6c26042 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1301,18 +1301,14 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
>  	}
>  	if (pmd_trans_huge(*pmd)) {
>  		spin_lock(&mm->page_table_lock);

follow_page already checked pmd_trans_huge first outside the lock.

> -		if (likely(pmd_trans_huge(*pmd))) {

And then again inside. We have to re-check it inside to be safe,
otherwise we've to /* fall through */

> -			if (unlikely(pmd_trans_splitting(*pmd))) {
> -				spin_unlock(&mm->page_table_lock);
> -				wait_split_huge_page(vma->anon_vma, pmd);
> -			} else {
> -				page = follow_trans_huge_pmd(mm, address,
> -							     pmd, flags);

What I want to do in mincore is to call something like
mincore_trans_huge_pmd() to remove the #ifdef to make everyone happy.

> -				spin_unlock(&mm->page_table_lock);
> -				goto out;
> -			}
> -		} else
> +		if (unlikely(pmd_trans_splitting(*pmd))) {
>  			spin_unlock(&mm->page_table_lock);
> +			wait_split_huge_page(vma->anon_vma, pmd);
> +		} else {
> +			page = follow_trans_huge_pmd(mm, address, pmd, flags);
> +			spin_unlock(&mm->page_table_lock);
> +			goto out;
> +		}

So it would miss one needed check inside the lock and it could lead to
call follow_trans_huge_pmd with a not huge pmd which is invalid.

Also note, touching with C language stuff that can change under you
like we do in the first check outside the lock (and isn't marked
volatile) may have unpredictable results in theory. But in practice we
do stuff like this all the time. Specifically in this case it's just
one bitflag check so it should be safe enough considering that
test_bit is implemented in C on x86 and does the same thing,
spin_is_locked likely is also implemented in C etc... In other cases
like "switch" parameters, gcc can implement it with a jump table, and
if the value changes under gcc it may jump beyond the end of the
table. It's a tradeoff between writing perfect C and having to deal
with asm all the time (or worse slowdown marking pmd volatile). Some
ages ago I guess I proposed to fix this including not assuming that
writes to the ptes are atomic with one instruction if done in C, but
nobody cared and in effect it works reliably also this way and it's a
bit simpler... ;).

> Knock yourself out :-)

As you wish! ;)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2010-03-25  0:43 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-23 14:34 mincore and transparent huge pages Johannes Weiner
2010-03-23 14:34 ` [patch 1/5] mincore: cleanups Johannes Weiner
2010-03-23 14:34 ` [patch 2/5] mincore: break do_mincore() into logical pieces Johannes Weiner
2010-03-23 14:35 ` [patch 3/5] mincore: pass ranges as start,end address pairs Johannes Weiner
2010-03-23 14:35 ` [patch 4/5] mincore: do nested page table walks Johannes Weiner
2010-03-23 14:35 ` [rfc 5/5] mincore: transparent huge page support Johannes Weiner
2010-03-24 22:48   ` Andrea Arcangeli
2010-03-25  0:07     ` Johannes Weiner
2010-03-25  0:42       ` Andrea Arcangeli [this message]
2010-03-25  1:23     ` Johannes Weiner
2010-03-24 22:32 ` mincore and transparent huge pages Andrea Arcangeli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100325004254.GS10659@random.random \
    --to=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=n-horiguchi@ah.jp.nec.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).