Re: [RFC] remove unnecessary condition in remove_inode_hugepages


From: zhong jiang <zhongjiang@huawei.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Vlastimil Babka <vbabka@suse.cz>, Hugh Dickins <hughd@google.com>,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: Re: [RFC] remove unnecessary condition in remove_inode_hugepages
Date: Sun, 25 Sep 2016 14:40:16 +0800
Message-ID: <57E77150.90501@huawei.com>
In-Reply-To: <70933398-2b73-9835-ab7c-c5b9e2483f31@oracle.com>

On 2016/9/25 8:06, Mike Kravetz wrote:
> On 09/23/2016 07:56 PM, zhong jiang wrote:
>> On 2016/9/24 1:19, Mike Kravetz wrote:
>>> On 09/22/2016 06:53 PM, zhong jiang wrote:
>>>> At present, we need to call hugetlb_fix_reserve_count when hugetlb_unreserve_pages fails,
>>>> and PagePrivate decides the hugetlb reserve counts.
>>>>
>>>> We obtain the page from the page cache and take both lock_page and mutex_lock.
>>>> After alloc_huge_page, the page is added to the page cache with the page lock always held,
>>>> and ClearPagePrivate is done before the page is unlocked.
>>>>
>>>> But I'm not sure whether this is right or whether I am missing the point.
>>> Let me try to explain the code you suggest is unnecessary.
>>>
>>> The PagePrivate flag is used in huge page allocation/deallocation to
>>> indicate that the page was globally reserved.  For example, in
>>> dequeue_huge_page_vma() there is this code:
>>>
>>>                         if (page) {
>>>                                 if (avoid_reserve)
>>>                                         break;
>>>                                 if (!vma_has_reserves(vma, chg))
>>>                                         break;
>>>
>>>                                 SetPagePrivate(page);
>>>                                 h->resv_huge_pages--;
>>>                                 break;
>>>                         }
>>>
>>> and in free_huge_page():
>>>
>>>         restore_reserve = PagePrivate(page);
>>>         ClearPagePrivate(page);
>>> 	.
>>> 	<snip>
>>> 	.
>>>         if (restore_reserve)
>>>                 h->resv_huge_pages++;
>>>
>>> This helps maintain the global huge page reserve count.
>>>
>>> In addition to the global reserve count, there are per VMA reservation
>>> structures.  Unfortunately, these structures have different meanings
>>> depending on the context in which they are used.
>>>
>>> If there is a VMA reservation entry for a page, and the page has not
>>> been instantiated in the VMA this indicates there is a huge page reserved
>>> and the global resv_huge_pages count reflects that reservation.  Even
>>> if a page was not reserved, a VMA reservation entry is added when a page
>>> is instantiated in the VMA.
>>>
>>> With that background, let's look at the existing code/proposed changes.
>>  That is clear.
>>>> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
>>>> index 4ea71eb..010723b 100644
>>>> --- a/fs/hugetlbfs/inode.c
>>>> +++ b/fs/hugetlbfs/inode.c
>>>> @@ -462,14 +462,12 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
>>>>                          * the page, note PagePrivate which is used in case
>>>>                          * of error.
>>>>                          */
>>>> -                       rsv_on_error = !PagePrivate(page);
>>> This rsv_on_error flag indicates that when the huge page was allocated,
>>    yes
>>> it was NOT counted against the global reserve count.  So, when
>>> remove_huge_page eventually calls free_huge_page(), the global count
>>> resv_huge_pages is not incremented.  So far, no problem.
>>  But the page comes from the page cache.  If so, ClearPagePrivate(page) should
>>  already have been done while the page was locked, so this condition is always true.
>>
>>   The key point is: why does it still need to check PagePrivate(page) when the
>>   page comes from the page cache and the lock is held?
> You are correct.  My apologies for not seeing your point in the original
> post.
>
> When the huge page is added to the page cache (huge_add_to_page_cache),
> the PagePrivate flag will be cleared.  Since this code
> (remove_inode_hugepages) will only be called for pages in the page cache,
> PagePrivate(page) will always be false.
>
> The comments in this area should be changed along with the code.
>
 Thanks, I will resend the patch.
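
The flag life cycle discussed above can be modelled in user space.  What
follows is only a minimal sketch, not kernel code: struct model_page,
struct model_hstate and every model_* function are hypothetical stand-ins
invented for illustration.  They mirror the snippets quoted from
dequeue_huge_page_vma() and free_huge_page(), plus the behaviour described
for huge_add_to_page_cache(), and assume nothing else sets the flag once
the page is in the page cache.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for struct page and struct hstate (hypothetical types). */
struct model_page {
	bool private_flag;	/* models PagePrivate(page) */
	bool in_page_cache;
};

struct model_hstate {
	long resv_huge_pages;	/* models h->resv_huge_pages */
};

/* Models dequeue_huge_page_vma(): a reserved page gets the flag set. */
static void model_dequeue(struct model_hstate *h, struct model_page *page,
			  bool use_reserve)
{
	page->private_flag = false;
	page->in_page_cache = false;
	if (use_reserve) {
		page->private_flag = true;	/* SetPagePrivate(page) */
		h->resv_huge_pages--;
	}
}

/* Models huge_add_to_page_cache(): the flag is cleared on insertion. */
static void model_add_to_page_cache(struct model_page *page)
{
	page->in_page_cache = true;
	page->private_flag = false;		/* ClearPagePrivate(page) */
}

/* Models "rsv_on_error = !PagePrivate(page)" in remove_inode_hugepages(). */
static bool model_rsv_on_error(const struct model_page *page)
{
	assert(page->in_page_cache);	/* only called for page cache pages */
	return !page->private_flag;
}

/* Models free_huge_page(): restore the reserve only if the flag is set. */
static void model_free(struct model_hstate *h, struct model_page *page)
{
	bool restore_reserve = page->private_flag;

	page->private_flag = false;
	page->in_page_cache = false;
	if (restore_reserve)
		h->resv_huge_pages++;
}

/* Returns 0 if the model behaves as described in the thread. */
static int model_check(void)
{
	struct model_hstate h = { .resv_huge_pages = 1 };
	struct model_page page;

	/* Reserved page freed before reaching the page cache: restored. */
	model_dequeue(&h, &page, true);
	if (h.resv_huge_pages != 0)
		return 1;
	model_free(&h, &page);
	if (h.resv_huge_pages != 1)
		return 2;

	/* Reserved page added to the page cache: flag already cleared. */
	model_dequeue(&h, &page, true);
	model_add_to_page_cache(&page);
	if (!model_rsv_on_error(&page))
		return 3;
	model_free(&h, &page);
	if (h.resv_huge_pages != 0)
		return 4;

	/* Unreserved page added to the page cache: same result. */
	model_dequeue(&h, &page, false);
	model_add_to_page_cache(&page);
	if (!model_rsv_on_error(&page))
		return 5;
	return 0;
}
```

Calling model_check() from a main() should return 0.  In this model,
model_rsv_on_error() is true for both the reserved and the unreserved page
once they are in the page cache, which is the reason the check looks
redundant.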


Thread overview: 12+ messages
2016-09-23  1:53 [RFC] remove unnecessary condition in remove_inode_hugepages zhong jiang
2016-09-23  1:53 ` zhong jiang
2016-09-23  8:18 ` Michal Hocko
2016-09-23  8:18   ` Michal Hocko
2016-09-23 17:19 ` Mike Kravetz
2016-09-23 17:19   ` Mike Kravetz
2016-09-24  2:56   ` zhong jiang
2016-09-24  2:56     ` zhong jiang
2016-09-25  0:06     ` Mike Kravetz
2016-09-25  0:06       ` Mike Kravetz
2016-09-25  6:40       ` zhong jiang [this message]
2016-09-25  6:40         ` zhong jiang
