From: Vlastimil Babka <vbabka@suse.cz>
To: Eric B Munson <emunson@akamai.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Lameter <cl@linux.com>,
Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>, Mel Gorman <mgorman@suse.de>,
Roland Dreier <roland@kernel.org>,
Sean Hefty <sean.hefty@intel.com>,
Hal Rosenstock <hal.rosenstock@gmail.com>,
Mike Marciniszyn <infinipath@intel.com>
Subject: Re: Resurrecting the VM_PINNED discussion
Date: Tue, 03 Mar 2015 19:35:12 +0100
Message-ID: <54F5FEE0.2090104@suse.cz>
In-Reply-To: <20150303174105.GA3295@akamai.com>
On 03/03/2015 06:41 PM, Eric B Munson wrote:
> All,
>
> After LSF/MM last year Peter revived a patch set that would create
> infrastructure for pinning pages as opposed to simply locking them.
> AFAICT, there was no objection to the set, it just needed some help
> from the IB folks.
>
> Am I missing something about why it was never merged? I ask because
> Akamai has bumped into the disconnect between the mlock manpage,
> Documentation/vm/unevictable-lru.txt, and reality WRT compaction and
> locking. A group working in userspace read those sources and wrote a
> tool that mmaps many files read only and locked, munmapping them when
> they are no longer needed. Locking is used because they cannot afford a
> major fault, but they are fine with minor faults. This tends to
> fragment memory badly so when they started looking into using hugetlbfs
> (or anything requiring order > 0 allocations) they found they were not
> able to allocate the memory. They were confused based on the referenced
> documentation as to why compaction would continually fail to yield
> appropriately sized contiguous areas when there was more than enough
> free memory.
So you are saying that mlocking (VM_LOCKED) prevents migration and thus
keeps compaction from doing its job? If that's true, I think it's a bug, as
AFAIK it is supposed to work just fine.
> I would like to see the situation with VM_LOCKED cleared up, ideally the
> documentation would remain and reality adjusted to match and I think
> Peter's VM_PINNED set goes in the right direction for this goal. What
> is missing and how can I help?
I don't think VM_PINNED would help you. In fact it is VM_PINNED that improves
accounting for the kind of locking (pinning) that *does* prevent page migration
(unlike mlocking)... quoting the patchset cover letter:
"These patches introduce VM_PINNED infrastructure, vma tracking of persistent
'pinned' page ranges. Pinned is anything that has a fixed phys address (as
required for say IO DMA engines) and thus cannot use the weaker VM_LOCKED. One
popular way to pin pages is through get_user_pages() but that is not necessarily
the only way."
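
For reference, the userspace pattern Eric describes (mapping a file read-only
and locking it so that later accesses do not take major faults) might look
roughly like the sketch below. This is only an illustration under stated
assumptions: the names locked_map, map_file_locked() and unmap_file() are
hypothetical, and the thread does not say whether the tool uses mlock(),
MAP_LOCKED or MAP_SHARED. The calls here set VM_LOCKED on the range; they do
not pin the pages to a fixed physical address in the VM_PINNED sense.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper type, not taken from the thread. */
struct locked_map {
    void  *addr;   /* start of the mapping, or NULL on failure */
    size_t len;    /* length of the mapping in bytes */
    int    locked; /* 1 if mlock() succeeded, 0 otherwise */
};

/*
 * Map a file read-only and try to lock it into memory.
 * mlock() marks the range VM_LOCKED: the pages stay resident, but
 * nothing here requests a fixed physical address.
 */
static struct locked_map map_file_locked(const char *path)
{
    struct locked_map m = { NULL, 0, 0 };
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return m;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return m;
    }

    /* MAP_PRIVATE is an assumption; the tool may well use MAP_SHARED. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return m;

    m.addr = p;
    m.len = (size_t)st.st_size;
    /* May fail (e.g. ENOMEM or EPERM) when RLIMIT_MEMLOCK is too small. */
    m.locked = (mlock(p, m.len) == 0);
    return m;
}

/* Drop the mapping; munmap() also removes the lock on the range. */
static void unmap_file(struct locked_map *m)
{
    if (m->addr) {
        munmap(m->addr, m->len);
        m->addr = NULL;
        m->len = 0;
        m->locked = 0;
    }
}
```

A caller would check m.addr for NULL and m.locked for 0 before relying on the
pages being resident, and call unmap_file() once the file is no longer needed.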
> Thanks,
> Eric
>