From: Andrea Arcangeli <aarcange@redhat.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [rfc 5/5] mincore: transparent huge page support
Date: Wed, 24 Mar 2010 23:48:58 +0100
Message-ID: <20100324224858.GP10659@random.random>
In-Reply-To: <1269354902-18975-6-git-send-email-hannes@cmpxchg.org>

On Tue, Mar 23, 2010 at 03:35:02PM +0100, Johannes Weiner wrote:
> +static int mincore_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
> +			unsigned long addr, unsigned long end,
> +			unsigned char *vec)
> +{
> +	int huge = 0;
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +	spin_lock(&vma->vm_mm->page_table_lock);
> +	if (likely(pmd_trans_huge(*pmd))) {
> +		huge = !pmd_trans_splitting(*pmd);

Under mmap_sem (read or write) a hugepage can't materialize under
us. So here the pmd_trans_huge check can be lockless and run _before_
taking the page_table_lock. That's the invariant I used to keep
identical performance for all fast paths.

And if that weren't the case, it wouldn't be safe to return huge = 0,
as the page_table_lock is already released at that point.
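
To make the ordering concrete, here is a rough user-space model of
"check locklessly first, take the lock only on the huge path". It is
only a sketch: model_pmd, model_mm, model_lock() and
model_mincore_huge_pmd() are hypothetical stand-ins rather than kernel
interfaces, the lock is a plain counter and not a real spinlock, and
the wait_split_huge_page() step from the quoted patch is left out.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Hypothetical user-space stand-ins; none of these are kernel types. */
struct model_pmd {
	bool trans_huge;	/* models pmd_trans_huge(*pmd) */
	bool splitting;		/* models pmd_trans_splitting(*pmd) */
};

struct model_mm {
	int lock_held;			/* models the page_table_lock state */
	unsigned long lock_count;	/* how many times the lock was taken */
};

static void model_lock(struct model_mm *mm)
{
	mm->lock_held = 1;
	mm->lock_count++;
}

static void model_unlock(struct model_mm *mm)
{
	mm->lock_held = 0;
}

/*
 * Returns 1 if the range is covered by an intact huge pmd (vec filled),
 * 0 if the caller has to fall back to walking ptes.
 */
static int model_mincore_huge_pmd(struct model_mm *mm, struct model_pmd *pmd,
				  unsigned long addr, unsigned long end,
				  unsigned char *vec)
{
	int huge;

	/*
	 * Lockless fast path. Assumed safe only because the caller holds
	 * mmap_sem, so a huge pmd cannot appear after this check.
	 */
	if (!pmd->trans_huge)
		return 0;

	/* A huge pmd can still be split, so recheck under the lock. */
	model_lock(mm);
	huge = pmd->trans_huge && !pmd->splitting;
	model_unlock(mm);

	if (huge)
		memset(vec, 1, (end - addr) >> MODEL_PAGE_SHIFT);
	return huge;
}
```

In this model the common non-huge case never touches the lock counter,
which is the fast-path property described above.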

> +		spin_unlock(&vma->vm_mm->page_table_lock);
> +		/*
> +		 * If we have an intact huge pmd entry, all pages in
> +		 * the range are present in the mincore() sense of
> +		 * things.
> +		 *
> +		 * But if the entry is currently being split into
> +		 * normal page mappings, wait for it to finish and
> +		 * signal the fallback to ptes.
> +		 */
> +		if (huge)
> +			memset(vec, 1, (end - addr) >> PAGE_SHIFT);
> +		else
> +			wait_split_huge_page(vma->anon_vma, pmd);
> +	} else
> +		spin_unlock(&vma->vm_mm->page_table_lock);
> +#endif
> +	return huge;
> +}
> +

It's probably cleaner to move the block into huge_memory.c and create
a dummy for the #ifndef version like I did for all the rest.
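
As a rough illustration of that layout (not the actual patch), the
sketch below keeps the real function behind a config macro and gives
the disabled build an inline dummy, so the caller needs no #ifdef.
DEMO_CONFIG_TRANSPARENT_HUGEPAGE, demo_mincore_huge_pmd() and
demo_mincore_pmd_range() are made-up names for this example, and the
pte fallback is a placeholder that just marks nothing resident.

```c
#include <string.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Uncomment to model a build with transparent hugepages enabled. */
/* #define DEMO_CONFIG_TRANSPARENT_HUGEPAGE 1 */

#ifdef DEMO_CONFIG_TRANSPARENT_HUGEPAGE
/* In a real tree this body would live in the huge page source file. */
static int demo_mincore_huge_pmd(int pmd_is_huge, unsigned long addr,
				 unsigned long end, unsigned char *vec)
{
	if (!pmd_is_huge)
		return 0;
	memset(vec, 1, (end - addr) >> DEMO_PAGE_SHIFT);
	return 1;
}
#else
/* Dummy for the disabled configuration: never reports a huge pmd. */
static inline int demo_mincore_huge_pmd(int pmd_is_huge, unsigned long addr,
					unsigned long end, unsigned char *vec)
{
	(void)pmd_is_huge;
	(void)addr;
	(void)end;
	(void)vec;
	return 0;
}
#endif

/* The caller is identical in both configurations. */
static void demo_mincore_pmd_range(int pmd_is_huge, unsigned long addr,
				   unsigned long end, unsigned char *vec)
{
	if (demo_mincore_huge_pmd(pmd_is_huge, addr, end, vec))
		return;
	/* Placeholder for the pte walk: report every page as not resident. */
	memset(vec, 0, (end - addr) >> DEMO_PAGE_SHIFT);
}
```

With the macro left undefined, the dummy compiles down to a constant 0
and the caller always takes the pte fallback.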


I'll incorporate and take care of those changes myself if you don't
mind, as I'm going to do a new submit for -mm. I greatly appreciate
you taking the time to port this to transhuge, it helps a lot! ;)

Thanks,
Andrea
