Re: [PATCH 10/18] mm, hugetlb: call vma_has_reserve() before entering alloc_huge_page()

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Michal Hocko <mhocko@suse.cz>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Hugh Dickins <hughd@google.com>,
	Davidlohr Bueso <davidlohr.bueso@hp.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Wanpeng Li <liwanp@linux.vnet.ibm.com>,
	Hillf Danton <dhillf@gmail.com>
Subject: Re: [PATCH 10/18] mm, hugetlb: call vma_has_reserve() before entering alloc_huge_page()
Date: Wed, 31 Jul 2013 14:06:03 +0900
Message-ID: <20130731050603.GI2548@lge.com>
In-Reply-To: <1375122474-w2vygb3x-mutt-n-horiguchi@ah.jp.nec.com>

On Mon, Jul 29, 2013 at 02:27:54PM -0400, Naoya Horiguchi wrote:
> On Mon, Jul 29, 2013 at 02:32:01PM +0900, Joonsoo Kim wrote:
> > To implement graceful failure handling, we need to know at a higher
> > level whether an allocation request is for the reserved pool or not.
> > In this patch, we just move the vma_has_reserves() call up to the
> > caller functions in order to know it. There is no functional change.
> > 
> > The following patches implement graceful failure handling and remove
> > the hugetlb_instantiation_mutex.
> > 
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index a66226e..5f31ca5 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1123,12 +1123,12 @@ static void vma_commit_reservation(struct hstate *h,
> >  }
> >  
> >  static struct page *alloc_huge_page(struct vm_area_struct *vma,
> > -				    unsigned long addr, int avoid_reserve)
> > +				    unsigned long addr, int use_reserve)
> >  {
> >  	struct hugepage_subpool *spool = subpool_vma(vma);
> >  	struct hstate *h = hstate_vma(vma);
> >  	struct page *page;
> > -	int ret, idx, use_reserve;
> > +	int ret, idx;
> >  	struct hugetlb_cgroup *h_cg;
> >  
> >  	idx = hstate_index(h);
> > @@ -1140,11 +1140,6 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
> >  	 * need pages and subpool limit allocated allocated if no reserve
> >  	 * mapping overlaps.
> >  	 */
> > -	use_reserve = vma_has_reserves(h, vma, addr);
> > -	if (use_reserve < 0)
> > -		return ERR_PTR(-ENOMEM);
> > -
> > -	use_reserve = use_reserve && !avoid_reserve;
> >  	if (!use_reserve && (hugepage_subpool_get_pages(spool, 1) < 0))
> >  			return ERR_PTR(-ENOSPC);
> >  
> > @@ -2520,7 +2515,7 @@ static int hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
> >  {
> >  	struct hstate *h = hstate_vma(vma);
> >  	struct page *old_page, *new_page;
> > -	int outside_reserve = 0;
> > +	int use_reserve, outside_reserve = 0;
> >  	unsigned long mmun_start;	/* For mmu_notifiers */
> >  	unsigned long mmun_end;		/* For mmu_notifiers */
> >  
> > @@ -2553,7 +2548,18 @@ retry_avoidcopy:
> >  
> >  	/* Drop page_table_lock as buddy allocator may be called */
> >  	spin_unlock(&mm->page_table_lock);
> > -	new_page = alloc_huge_page(vma, address, outside_reserve);
> > +
> > +	use_reserve = vma_has_reserves(h, vma, address);
> > +	if (use_reserve == -ENOMEM) {
> > +		page_cache_release(old_page);
> > +
> > +		/* Caller expects lock to be held */
> > +		spin_lock(&mm->page_table_lock);
> > +		return VM_FAULT_OOM;
> > +	}
> > +	use_reserve = use_reserve && !outside_reserve;
> 
> When outside_reserve is true, we don't have to call vma_has_reserves
> because then use_reserve is always false. So something like:
> 
>   use_reserve = 0;
>   if (!outside_reserve) {
>           use_reserve = vma_has_reserves(...);
>           ...
>   }
> 
> looks better to me.
> Or, if you expect vma_has_reserves to change resv_map implicitly,
> could you add a comment about it?

Yes, you are right.
I will change it.
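
A minimal user-space sketch of the restructuring suggested above, only to
illustrate the control flow; it is not the follow-up patch. The names
vma_has_reserves_stub(), lookup_calls, stub_result and decide_use_reserve()
are hypothetical stand-ins and do not exist in mm/hugetlb.c. The sketch
assumes the lookup returns 1, 0 or -ENOMEM, as the hunk above implies.

```c
#include <errno.h>

/* Hypothetical stand-in for vma_has_reserves(); counts how often it runs. */
int lookup_calls;
int stub_result = 1;	/* 1: reserve exists, 0: none, -ENOMEM: lookup failed */

int vma_has_reserves_stub(void)
{
	lookup_calls++;
	return stub_result;
}

/*
 * Decide whether the allocation may use the reserved pool.
 * Returns 0 and stores the decision in *use_reserve, or returns -ENOMEM
 * if the lookup itself fails (in the patch, the caller would then release
 * old_page, retake page_table_lock and return VM_FAULT_OOM).
 */
int decide_use_reserve(int outside_reserve, int *use_reserve)
{
	*use_reserve = 0;
	if (!outside_reserve) {
		/* Only consult the reserve map when the result can matter. */
		int ret = vma_has_reserves_stub();

		if (ret == -ENOMEM)
			return -ENOMEM;
		*use_reserve = ret;
	}
	return 0;
}
```

With outside_reserve set, the lookup is skipped entirely and use_reserve
stays 0, which matches the result of the original
"use_reserve && !outside_reserve" expression without the extra call.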

Thanks.

> 
> Thanks,
> Naoya Horiguchi
> 
> > +
> > +	new_page = alloc_huge_page(vma, address, use_reserve);
> >  
> >  	if (IS_ERR(new_page)) {
> >  		long err = PTR_ERR(new_page);
> > @@ -2679,6 +2685,7 @@ static int hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma,
> >  	struct page *page;
> >  	struct address_space *mapping;
> >  	pte_t new_pte;
> > +	int use_reserve = 0;
> >  
> >  	/*
> >  	 * Currently, we are forced to kill the process in the event the
> > @@ -2704,7 +2711,14 @@ retry:
> >  		size = i_size_read(mapping->host) >> huge_page_shift(h);
> >  		if (idx >= size)
> >  			goto out;
> > -		page = alloc_huge_page(vma, address, 0);
> > +
> > +		use_reserve = vma_has_reserves(h, vma, address);
> > +		if (use_reserve == -ENOMEM) {
> > +			ret = VM_FAULT_OOM;
> > +			goto out;
> > +		}
> > +
> > +		page = alloc_huge_page(vma, address, use_reserve);
> >  		if (IS_ERR(page)) {
> >  			ret = PTR_ERR(page);
> >  			if (ret == -ENOMEM)
> > -- 
> > 1.7.9.5
> >
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

