Re: [PATCH] hugepage: allow parallelization of the hugepage fault path

From: Hush Bensen <hush.bensen@gmail.com>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Gibson <david@gibson.dropbear.id.au>,
	Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
	Michel Lespinasse <walken@google.com>,
	Mel Gorman <mgorman@suse.de>,
	Konstantin Khlebnikov <khlebnikov@openvz.org>,
	Michal Hocko <mhocko@suse.cz>,
	"AneeshKumarK.V" <aneesh.kumar@linux.vnet.ibm.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Hillf Danton <dhillf@gmail.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	Eric B Munson <emunson@mgebm.net>,
	Anton Blanchard <anton@samba.org>
Subject: Re: [PATCH] hugepage: allow parallelization of the hugepage fault path
Date: Tue, 23 Jul 2013 15:04:54 +0800
Message-ID: <51EE2B16.3020605@gmail.com>
In-Reply-To: <20130718090719.GB9761@lge.com>

On 07/18/2013 05:07 PM, Joonsoo Kim wrote:
> On Wed, Jul 17, 2013 at 12:50:25PM -0700, Davidlohr Bueso wrote:
>
>> From: Davidlohr Bueso <davidlohr.bueso@hp.com>
>>
>> - Cleaned up and forward ported to Linus' latest.
>> - Cache aligned mutexes.
>> - Keep non SMP systems using a single mutex.
>>
>> It was found that this mutex can become quite contended
>> during the early phases of large databases which make use of huge pages - for instance
>> startup and initial runs. One clear example is a 1.5Gb Oracle database, where lockstat
>> reports that this mutex can be one of the top 5 most contended locks in the kernel during
>> the first few minutes:
>>
>>      	     hugetlb_instantiation_mutex:   10678     10678
>>               ---------------------------
>>               hugetlb_instantiation_mutex    10678  [<ffffffff8115e14e>] hugetlb_fault+0x9e/0x340
>>               ---------------------------
>>               hugetlb_instantiation_mutex    10678  [<ffffffff8115e14e>] hugetlb_fault+0x9e/0x340
>>
>> contentions:          10678
>> acquisitions:         99476
>> waittime-total: 76888911.01 us
> Hello,
> I have a question :)
>
> So, each contention takes 7.6 ms in your result.
> Do you map this area with VM_NORESERVE?
> If we map with VM_RESERVE, then on a page fault we just dequeue a huge page from a queue, clear
> the page, and map it into a page table. So I guess it shouldn't take so long.

I don't think there is a clear-page operation right after dequeuing a huge
page, and it isn't done during hugetlb_reserve_pages either. Do you know
why? The only clear operation is in hugetlb_no_page.
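
As a quick sanity check on the lockstat figures quoted above, the average
wait per contention is waittime-total divided by the contention count. The
sketch below is illustrative only: the variable names are made up for this
note, and the three values are copied from the quoted lockstat output.

```python
# Values copied from the quoted lockstat output for
# hugetlb_instantiation_mutex. Variable names are illustrative only.
contentions = 10678
acquisitions = 99476
waittime_total_us = 76888911.01

# Average time a contended acquisition spent waiting, in milliseconds.
avg_wait_ms = waittime_total_us / contentions / 1000.0

# Fraction of all acquisitions that had to wait for the mutex.
contended_ratio = contentions / acquisitions

print(f"average wait per contention: {avg_wait_ms:.2f} ms")
print(f"contended acquisitions: {contended_ratio:.1%}")
```

This works out to roughly 7.2 ms per contention, in the same range as the
7.6 ms mentioned in the quoted question, with about one acquisition in nine
being contended.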

> I'm wondering why it takes so long.
>
> And do you use 16KB hugepages?
> If so, region handling could take some time. If you access the area in random order,
> the number of regions can be more than 90000. I guess this could be one reason for the
> long wait time.
>
> Thanks.
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
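
For the region-count estimate in the quoted text, a rough calculation is
possible under two assumptions that the thread does not confirm: that the
"1.5Gb" database from the changelog corresponds to 1.5 GiB of
hugepage-backed memory, and that the 16KB page size from the question is
taken at face value. The names below are hypothetical.

```python
# Assumption: "1.5Gb" in the changelog means 1.5 GiB of hugepage-backed memory.
mapping_bytes = int(1.5 * 1024**3)

# Assumption: the 16KB hugepage size from the quoted question.
hugepage_bytes = 16 * 1024

# Number of huge pages needed to cover the mapping.
num_hugepages = mapping_bytes // hugepage_bytes

print(f"huge pages in the mapping: {num_hugepages}")
```

Under these assumptions the mapping spans 98304 huge pages, which is
consistent with the "more than 90000" figure in the question. The page count
should be read as an upper bound on the number of regions rather than an
exact value, since how many regions exist at once depends on the access
pattern.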

