From: Dave Chinner <david@fromorbit.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Hugh Dickins <hughd@google.com>, Michal Hocko <mhocko@kernel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andi Kleen <ak@linux.intel.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] shmem: avoid huge pages for small files
Date: Fri, 21 Oct 2016 09:46:30 +1100
Message-ID: <20161020224630.GO23194@dastard>
In-Reply-To: <20161020103946.GA3881@node.shutemov.name>

On Thu, Oct 20, 2016 at 01:39:46PM +0300, Kirill A. Shutemov wrote:
> On Wed, Oct 19, 2016 at 11:13:54AM -0700, Hugh Dickins wrote:
> > On Tue, 18 Oct 2016, Michal Hocko wrote:
> > > On Tue 18-10-16 17:32:07, Kirill A. Shutemov wrote:
> > > > On Tue, Oct 18, 2016 at 04:20:07PM +0200, Michal Hocko wrote:
> > > > > On Mon 17-10-16 17:55:40, Kirill A. Shutemov wrote:
> > > > > > On Mon, Oct 17, 2016 at 04:12:46PM +0200, Michal Hocko wrote:
> > > > > > > On Mon 17-10-16 15:30:21, Kirill A. Shutemov wrote:
> > > > > [...]
> > > > > > > > > We add two handles to specify the minimal file size for huge pages:
> > > > > > > > 
> > > > > > > >   - mount option 'huge_min_size';
> > > > > > > > 
> > > > > > > >   - sysfs file /sys/kernel/mm/transparent_hugepage/shmem_min_size for
> > > > > > > >     in-kernel tmpfs mountpoint;
> > > > > > > 
> > > > > > > > Could you explain who might like to change the minimum value (other than
> > > > > > > > disabling the feature for the mount point) and for what reason?
> > > > > > 
> > > > > > Depending on how well CPU microarchitecture deals with huge pages, you
> > > > > > might need to set it higher in order to balance out overhead with benefit
> > > > > > of huge pages.
> > > > > 
> > > > > > I am not sure this is a good argument. How does a user know, and what will
> > > > > > help them make that decision? Why can't we autotune that? In other words,
> > > > > > adding new knobs just in case has turned out to be a bad idea in the past.
> > > > 
> > > > > Well, I don't see a reasonable way to autotune it. We can just let
> > > > > arch-specific code redefine it, but the argument below still stands.
> > > > 
> > > > > > > In the other case, if it's known in advance that a specific mount would be
> > > > > > populated with large files, you might want to set it to zero to get huge
> > > > > > pages allocated from the beginning.
> > > 
> > > Do you think this is a sufficient reason to provide a tunable with such
> > > precision? In other words, why can't we simply start by using an
> > > internal-only limit at the huge page size for the initial transition
> > > (with a way to disable THP altogether for a mount point) and only add
> > > more fine-grained tuning if there is ever a real need for it, with a use
> > > case description? In other words, can we be less optimistic about
> > > tunables than we used to be in the past, when we often found out much
> > > later that those were mistakes?
> > 
> > I'm not sure whether I'm arguing in the same or the opposite direction
> > as you, Michal, but what makes me unhappy is not so much the tunable,
> > as the proliferation of mount options.
> > 
> > Kirill, this issue is (not exactly but close enough) what the mount
> > option "huge=within_size" was supposed to be about: not wasting huge
> > pages on small files.  I'd be much happier if you made huge_min_size
> > into a /sys/kernel/mm/transparent_hugepage/shmem_within_size tunable,
> > and used it to govern "huge=within_size" mounts only.
> 
> Well, you're right that I originally tried to address the issue with
> huge=within_size, but this option makes much more sense for filesystems
> with persistent storage. For ext4, it would be a pretty usable option.

Ugh, no, please don't use mount options for file specific behaviours
in filesystems like ext4 and XFS. This is exactly the sort of
behaviour that should either just work automatically (i.e. be
completely controlled by the filesystem) or only be applied to files
specifically configured with persistent hints to reliably allocate
extents in a way that can be easily mapped to huge pages.

e.g. on XFS you will need to apply extent size hints to get large
page sized/aligned extent allocation to occur, and so this
persistent extent size hint should trigger the filesystem to use
large pages if supported, the hint is correctly sized and aligned,
and there are large pages available for allocation.
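
As a rough illustration of the per-file hint described above, here is a
minimal userspace sketch, not a definitive implementation. It uses the
generic FS_IOC_FSGETXATTR/FS_IOC_FSSETXATTR ioctls from <linux/fs.h>.
The names HUGE_PAGE_SIZE, extsize_fits_huge_page() and set_extsize_hint()
are hypothetical, the 2MiB huge page size is an assumption (x86-64 PMD
size), and treating "correctly sized and aligned" as "a non-zero multiple
of the huge page size" is my reading, not something the filesystem is
known to enforce this way.

```c
#include <linux/fs.h>   /* struct fsxattr, FS_IOC_FS{GET,SET}XATTR, FS_XFLAG_EXTSIZE */
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Assumption: 2MiB huge pages (x86-64 PMD size); other platforms differ. */
#define HUGE_PAGE_SIZE (2u * 1024 * 1024)

/*
 * Hypothetical check: treat a hint as usable for huge pages when it is a
 * non-zero multiple of the huge page size.
 */
bool extsize_fits_huge_page(uint32_t extsize_bytes, uint32_t huge_page_bytes)
{
	return extsize_bytes != 0 && huge_page_bytes != 0 &&
	       extsize_bytes % huge_page_bytes == 0;
}

/*
 * Set a persistent extent size hint on an open file.
 * Returns 0 on success, -1 with errno set on failure.
 * Assumption: fsx_extsize is passed in bytes at the ioctl boundary; check
 * this against the documentation for your kernel before relying on it.
 * The hint usually has to be set before the file has any data written.
 */
int set_extsize_hint(int fd, uint32_t extsize_bytes)
{
	struct fsxattr fsx;

	/* Read the current attributes so unrelated flags are preserved. */
	if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
		return -1;

	fsx.fsx_xflags |= FS_XFLAG_EXTSIZE;
	fsx.fsx_extsize = extsize_bytes;

	return ioctl(fd, FS_IOC_FSSETXATTR, &fsx);
}
```

A caller would open a freshly created file on XFS, confirm that
extsize_fits_huge_page(HUGE_PAGE_SIZE, HUGE_PAGE_SIZE) holds, and then
call set_extsize_hint(fd, HUGE_PAGE_SIZE). Whether the filesystem then
uses large pages for that file is up to the filesystem, as argued above.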

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
