From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756998AbbCCSfS (ORCPT ); Tue, 3 Mar 2015 13:35:18 -0500
Received: from cantor2.suse.de ([195.135.220.15]:35689 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1756160AbbCCSfQ (ORCPT ); Tue, 3 Mar 2015 13:35:16 -0500
Message-ID: <54F5FEE0.2090104@suse.cz>
Date: Tue, 03 Mar 2015 19:35:12 +0100
From: Vlastimil Babka
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: Eric B Munson, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
        Peter Zijlstra
CC: Christoph Lameter, Thomas Gleixner, Andrew Morton, Hugh Dickins,
        Mel Gorman, Roland Dreier, Sean Hefty, Hal Rosenstock,
        Mike Marciniszyn
Subject: Re: Resurrecting the VM_PINNED discussion
References: <20150303174105.GA3295@akamai.com>
In-Reply-To: <20150303174105.GA3295@akamai.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/03/2015 06:41 PM, Eric B Munson wrote:
> All,
>
> After LSF/MM last year Peter revived a patch set that would create
> infrastructure for pinning pages as opposed to simply locking them.
> AFAICT, there was no objection to the set, it just needed some help
> from the IB folks.
>
> Am I missing something about why it was never merged?  I ask because
> Akamai has bumped into the disconnect between the mlock manpage,
> Documentation/vm/unevictable-lru.txt, and reality WRT compaction and
> locking.  A group working in userspace read those sources and wrote a
> tool that mmaps many files read only and locked, munmapping them when
> they are no longer needed.  Locking is used because they cannot afford a
> major fault, but they are fine with minor faults.
> This tends to fragment memory badly so when they started looking into
> using hugetlbfs (or anything requiring order > 0 allocations) they found
> they were not able to allocate the memory.  They were confused based on
> the referenced documentation as to why compaction would continually fail
> to yield appropriately sized contiguous areas when there was more than
> enough free memory.

So you are saying that mlocking (VM_LOCKED) prevents migration and thus
keeps compaction from doing its job? If that's true, I think it's a bug,
as AFAIK migration of mlocked pages is supposed to work just fine.

> I would like to see the situation with VM_LOCKED cleared up, ideally the
> documentation would remain and reality adjusted to match and I think
> Peter's VM_PINNED set goes in the right direction for this goal.  What
> is missing and how can I help?

I don't think VM_PINNED would help you. VM_PINNED is about improving
accounting for the kind of locking (pinning) that *does* prevent page
migration, unlike mlocking. Quoting the patchset cover letter:

"These patches introduce VM_PINNED infrastructure, vma tracking of
persistent 'pinned' page ranges. Pinned is anything that has a fixed phys
address (as required for say IO DMA engines) and thus cannot use the
weaker VM_LOCKED. One popular way to pin pages is through
get_user_pages() but that is not necessarily the only way."

> Thanks,
> Eric
>
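
For reference, the userspace pattern described in the quoted text (map a
file read only, lock it, munmap it when no longer needed) could be sketched
roughly as below. This is only an illustration under assumptions, not the
actual tool: struct locked_map, map_file_locked() and unmap_file_locked()
are hypothetical names, and it assumes the locking is done with mlock()
after mmap() (MAP_LOCKED at mmap() time would be another option). Only the
standard open/fstat/mmap/mlock/munmap calls are used.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper type: one read-only file mapping that we try to mlock. */
struct locked_map {
    void  *addr;   /* start of the mapping, NULL when not mapped */
    size_t len;    /* length of the mapping in bytes */
    int    locked; /* nonzero if mlock() succeeded */
};

/*
 * Map `path` read only and try to lock it into RAM.
 * Returns 0 if the file was mapped, -1 if it could not be opened or mapped
 * (an empty file counts as an error, since mmap() rejects length 0).
 * An mlock() failure (e.g. RLIMIT_MEMLOCK exceeded) does not tear down the
 * mapping; it is reported through out->locked == 0.
 */
int map_file_locked(const char *path, struct locked_map *out)
{
    struct stat st;
    int fd;

    assert(path != NULL && out != NULL);
    out->addr = NULL;
    out->len = 0;
    out->locked = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    out->addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (out->addr == MAP_FAILED) {
        out->addr = NULL;
        return -1;
    }
    out->len = (size_t)st.st_size;

    /*
     * mlock() marks the range VM_LOCKED: the pages are faulted in and kept
     * resident. Per the discussion above, that is not meant to pin them to
     * a fixed physical address.
     */
    if (mlock(out->addr, out->len) == 0)
        out->locked = 1;
    else
        perror("mlock");

    return 0;
}

/* Drop the mapping; munmap() also removes the memory lock on the range. */
void unmap_file_locked(struct locked_map *m)
{
    assert(m != NULL);
    if (m->addr != NULL)
        munmap(m->addr, m->len);
    m->addr = NULL;
    m->len = 0;
    m->locked = 0;
}
```

A caller would do map_file_locked("some/file", &m), read through m.addr
while the data is needed, then call unmap_file_locked(&m). Whether a large
number of such mappings really defeats compaction is exactly the open
question here; the sketch only shows the VM_LOCKED side, not anything that
takes extra page references the way get_user_pages() does.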