From: Andrea Arcangeli <aarcange@redhat.com>
To: Christoph Lameter <cl@gentwo.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Dave Hansen <dave.hansen@intel.com>,
Hugh Dickins <hughd@google.com>, Mel Gorman <mgorman@suse.de>,
Rik van Riel <riel@redhat.com>, Vlastimil Babka <vbabka@suse.cz>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH, RFC 00/10] THP refcounting redesign
Date: Tue, 10 Jun 2014 23:58:30 +0200
Message-ID: <20140610215830.GF19660@redhat.com>
In-Reply-To: <alpine.DEB.2.10.1406101518510.19364@gentwo.org>
On Tue, Jun 10, 2014 at 03:25:42PM -0500, Christoph Lameter wrote:
> I thought the idea was that we would modify the relevant code and
> that at some point this requirement could go away?
There were places that weren't THP-aware and split unnecessarily, to
avoid having to make every place aware immediately and to keep the
initial patchset small. All the ones in relevant fast paths are gone
by now, but the requirement doesn't go away if munmap partially unmaps
a page.
If munmap or mremap splits the THP in the middle, the pmd has to be
split reliably: it cannot fail, or the syscall cannot return...
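
The userspace side of that scenario can be sketched as below. This is
only an illustration, not code from the patchset: the 2 MiB size
assumes x86-64 PMD-sized THP, partial_unmap_demo() is a made-up helper
name, and madvise(MADV_HUGEPAGE) is only a hint, so the range may still
end up backed by 4 KiB pages.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define THP_SIZE (2UL * 1024 * 1024) /* assumed PMD-sized huge page (x86-64) */

/*
 * Map one 2 MiB-aligned anonymous window, ask for THP backing, touch it,
 * then unmap only its first half. Returns 0 on success, -1 on failure.
 */
static int partial_unmap_demo(void)
{
	/* Over-allocate so a 2 MiB-aligned window fits inside the mapping. */
	size_t len = 2 * THP_SIZE;
	char *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return -1;

	char *aligned = (char *)(((uintptr_t)raw + THP_SIZE - 1) &
				 ~(uintptr_t)(THP_SIZE - 1));
	assert(aligned >= raw && aligned + THP_SIZE <= raw + len);

	/* Only a hint: the kernel may ignore it or THP may be disabled. */
	(void)madvise(aligned, THP_SIZE, MADV_HUGEPAGE);
	memset(aligned, 0xaa, THP_SIZE);

	/* Cut the window in the middle: a huge pmd, if any, must be split. */
	if (munmap(aligned, THP_SIZE / 2) != 0) {
		munmap(raw, len);
		return -1;
	}

	/* The surviving half must still be mapped with its old contents. */
	int ok = ((unsigned char)aligned[THP_SIZE / 2] == 0xaa);

	/* Clean up whatever is left of the original mapping. */
	munmap(raw, len);
	return ok ? 0 : -1;
}
```

From the application's point of view the munmap() of the first half
simply has to succeed, which is why the pmd split underneath cannot be
allowed to fail.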
And Kirill's patchset still provides a reliable split of the pmd of
course. It only relaxes the actual page struct split, but without
actually removing the tail page refcounting.
There are clear downsides in adding an -EBUSY failure case to
split_huge_page, related to potentially increased memory usage that
from the user's perspective will look like a memory leak (real anon
RSS exceeding the virtual size by up to 512 times in the worst case,
at least until khugepaged can fix it up and release RAM with an async
split_huge_page), and the current get_page/put_page improvement
doesn't look significant enough to justify them.
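
To spell out where the 512 comes from, here is a small arithmetic
sketch. It assumes x86-64 sizes (4 KiB base pages, 2 MiB THP); the
helper names are made up for illustration and are not kernel
functions. The worst case is that every still-mapped 4 KiB page is the
last remnant of a different huge page whose split failed, so the whole
2 MiB stays allocated behind it.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed x86-64 sizes: 4 KiB base pages, 2 MiB transparent huge pages. */
#define BASE_PAGE_SIZE (4UL * 1024)
#define HUGE_PAGE_SIZE (2UL * 1024 * 1024)

/* Number of base pages backing one huge page. */
static unsigned long subpages_per_thp(void)
{
	assert(HUGE_PAGE_SIZE % BASE_PAGE_SIZE == 0);
	return HUGE_PAGE_SIZE / BASE_PAGE_SIZE;
}

/* Virtual size still mapped by the process, in bytes. */
static unsigned long virtual_size_bytes(unsigned long mapped_base_pages)
{
	return mapped_base_pages * BASE_PAGE_SIZE;
}

/*
 * Hypothetical worst case: each mapped base page keeps a whole unsplit
 * huge page allocated, so physical usage is one huge page per base page.
 */
static unsigned long worst_case_rss_bytes(unsigned long mapped_base_pages)
{
	return mapped_base_pages * HUGE_PAGE_SIZE;
}
```

With these assumptions a process that keeps 256 base pages mapped
(1 MiB of virtual size) could hold 512 MiB of RAM until the deferred
split happens.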
This is why I think we should check if we can go the extra mile and
get rid of the tail page refcounting as a whole if possible. If that
is achieved, this failure case added to split_huge_page will look like
a better tradeoff than it looks now. Currently I'm not impressed by
the simplification of get_page/put_page considering the downsides this
brings to memory utilization and potentially having to defer the page
split to khugepaged.
> Huge pages (and other larger order pages) will become increasingly
> difficult to handle if relevant page state has to be maintained in tail
> pages and if it differs significantly from regular pages.
Over the last couple of years there was no increase in difficulty
though. The only relevant change that happened was to move the tail
page refcounting from ->count to ->mapcount (both otherwise unused on
tail pages) because ->count could confuse the speculative pagecache
lookups on tail pages, but that was a straightforward change: the
difficulty stayed the same no matter if the tail pin was in count or
mapcount.
While I don't see an actual increase in difficulty anywhere in this
area, simplification and performance improvements are always
welcome :).
Last but not least, while I don't see a showstopper for non-weird
non-malicious apps, we should take the malicious case into
consideration too, and the trouble that this would cause to containers
(or rlimits) if apps can lock in 512 times more physical RAM than
they're supposed to, if this allows bypassing all kernel accounting so
easily. Then again it depends on whether people think containers
should be usable to protect against untrusted apps too or not (I
don't, I prefer docker on top of KVM especially on public clouds, but
others do).