Re: RFC: Cleanup / small fixes to hugetlb fault handling

From: Adam Litke <agl@us.ibm.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	hugh@veritas.com, William Irwin <wli@holomorphy.com>
Subject: Re: RFC: Cleanup / small fixes to hugetlb fault handling
Date: Wed, 26 Oct 2005 16:46:16 -0500
Message-ID: <1130363177.2689.12.camel@localhost.localdomain>
In-Reply-To: <20051026024831.GB17191@localhost.localdomain>

On Wed, 2005-10-26 at 12:48 +1000, David Gibson wrote:
> On Wed, Oct 26, 2005 at 12:00:55PM +1000, David Gibson wrote:
> > Hi, Adam, Bill, Hugh,
> > 
> > Does this look like a reasonable patch to send to akpm for -mm?
> 
> Ahem.  Or rather this version, which actually compiles.
> 
> This patch makes some slight tweaks / cleanups to the fault handling
> path for huge pages in -mm.  My main motivation is to make it simpler
> to fit COW in, but along the way it addresses a few minor problems
> with the existing code:
> 
> - The check against i_size was duplicated: once in
>   find_lock_huge_page() and again in hugetlb_fault() after taking the
>   page_table_lock.  We only really need the locked one, so remove the
>   other.

Fair enough.

> - find_lock_huge_page() didn't, in fact, lock the page if it newly
>   allocated one, rather than finding it in the page cache already.  As
>   far as I can tell this is a bug, so the patch corrects it.

Thanks.  I was about to post a fix for this too.  It is reproducible
when two threads race in the fault handler and both call
alloc_huge_page().  In that case, the loser will fail to insert its page
into the page cache and will call put_page(), which has a
BUG_ON(page_count(page) == 0).
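
To make the race concrete, here is a rough userspace model of it.  It
is only a sketch: every toy_* name is made up for illustration, the
"page cache" is a single slot, and the lock is a plain flag that never
blocks.  It assumes the allocator hands back one reference and the
cache takes its own on insert.  The assert() in toy_put() stands in for
the BUG_ON in put_page().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; none of these are the kernel's types or functions. */
struct toy_page {
	int count;	/* reference count */
	bool locked;	/* plain flag; a real lock would block */
};

struct toy_cache {
	struct toy_page *slot;	/* a one-entry "page cache" */
};

static struct toy_page *toy_alloc(void)
{
	struct toy_page *page = calloc(1, sizeof(*page));

	if (page)
		page->count = 1;	/* assumed: allocator returns one reference */
	return page;
}

/* Drop one reference; free the page when the last one goes away. */
static void toy_put(struct toy_page *page)
{
	assert(page->count > 0);	/* models BUG_ON(page_count(page) == 0) */
	if (--page->count == 0)
		free(page);
}

/* Insertion fails if another thread filled the slot first. */
static int toy_insert(struct toy_cache *cache, struct toy_page *page)
{
	if (cache->slot)
		return -1;
	page->count++;		/* the cache holds its own reference */
	cache->slot = page;
	return 0;
}

/* On a hit, return the cached page referenced and locked. */
static struct toy_page *toy_find_lock(struct toy_cache *cache)
{
	struct toy_page *page = cache->slot;

	if (page) {
		page->count++;
		page->locked = true;
	}
	return page;
}

/* Runs after a lookup miss and an allocation of "mine". */
static struct toy_page *toy_finish(struct toy_cache *cache,
				   struct toy_page *mine)
{
	if (toy_insert(cache, mine) == 0) {
		mine->locked = true;	/* lock the newly allocated page too */
		return mine;
	}
	toy_put(mine);			/* lost the race: drop our copy */
	return toy_find_lock(cache);	/* and use the winner's page */
}
```

Calling toy_alloc() twice and then toy_finish() once per page plays
out the race in a fixed order: the first caller gets its own page back,
the second drops its page and ends up with the first caller's.  Either
way the page comes back locked, which is the behaviour the patch is
after.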

> - find_lock_huge_page() isn't a great name, since it does extra things
>   not analogous to find_lock_page().  Rename it
>   find_or_alloc_huge_page() which is closer to the mark.

I'll agree with the above.  I am not all that committed to the current
layout and what you have here is a little closer to the thinking in my
original patch ;)

<snip>

> +int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> +		  unsigned long address, int write_access)
> +{
> +	pte_t *ptep;
> +	pte_t entry;
> +
> +	ptep = huge_pte_alloc(mm, address);
> +	if (! ptep)
> +		/* OOM */
> +		return VM_FAULT_SIGBUS;
> +
> +	entry = *ptep;
> +
> +	if (pte_none(entry))
> +		return hugetlb_no_page(mm, vma, address, ptep);
> +
> +	/* we could get here if another thread instantiated the pte
> +	 * before the test above */
> +
> +	return VM_FAULT_SIGBUS;
>  }

I'll agree with Ken that the last return should probably still be
VM_FAULT_MINOR.
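
Roughly what I have in mind, as a toy userspace model rather than a
replacement hunk.  The toy_* names and the enum values are invented for
illustration, and it assumes the no-page path reports a minor fault on
success.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes; the values are not the kernel's. */
enum toy_fault {
	TOY_FAULT_SIGBUS,
	TOY_FAULT_MINOR,
};

/* Toy pte: only tracks whether a page has been instantiated. */
struct toy_pte {
	bool present;
};

/* Stand-in for the no-page path: instantiate the pte. */
static enum toy_fault toy_no_page(struct toy_pte *ptep)
{
	ptep->present = true;
	return TOY_FAULT_MINOR;	/* assumed success code */
}

static enum toy_fault toy_hugetlb_fault(struct toy_pte *ptep)
{
	if (!ptep)
		return TOY_FAULT_SIGBUS;	/* mirrors the OOM path in the patch */

	if (!ptep->present)
		return toy_no_page(ptep);

	/*
	 * Another thread instantiated the pte before the test above.
	 * The mapping is valid, so there is nothing to do and nothing
	 * to signal: report a minor fault and let the access retry.
	 */
	return TOY_FAULT_MINOR;
}
```

With this shape, two back-to-back faults on the same toy pte both
return TOY_FAULT_MINOR, so the thread that loses the race is not
killed for touching a page that is in fact mapped.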

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center
