From: Dave Chinner <david@fromorbit.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH] gfs2: use __vmalloc GFP_NOFS for fs-related allocations.
Date: Wed, 4 Feb 2015 09:33:50 +1100
Message-ID: <20150203223350.GP6282@dastard>
In-Reply-To: <54CF51C5.5050801@redhat.com>

On Mon, Feb 02, 2015 at 10:30:29AM +0000, Steven Whitehouse wrote:
> Hi,
>
> On 02/02/15 08:11, Dave Chinner wrote:
> >On Mon, Feb 02, 2015 at 01:57:23AM -0500, Oleg Drokin wrote:
> >>Hello!
> >>
> >>On Feb 2, 2015, at 12:37 AM, Dave Chinner wrote:
> >>
> >>>On Sun, Feb 01, 2015 at 10:59:54PM -0500, green at linuxhacker.ru wrote:
> >>>>From: Oleg Drokin <green@linuxhacker.ru>
> >>>>
> >>>>leaf_dealloc uses vzalloc as a fallback to kzalloc(GFP_NOFS), so
> >>>>it clearly does not want any shrinker activity within the fs itself.
> >>>>convert vzalloc into __vmalloc(GFP_NOFS|__GFP_ZERO) to better achieve
> >>>>this goal.
....
> >>>> ht = kzalloc(size, GFP_NOFS | __GFP_NOWARN);
> >>>> if (ht == NULL)
> >>>>- ht = vzalloc(size);
> >>>>+ ht = __vmalloc(size, GFP_NOFS | __GFP_NOWARN | __GFP_ZERO,
> >>>>+ PAGE_KERNEL);
> >>>That, in the end, won't help as vmalloc still uses GFP_KERNEL
> >>>allocations deep down in the PTE allocation code. See the hacks in
> >>>the DM and XFS code to work around this. i.e. go look for callers of
> >>>memalloc_noio_save(). It's ugly and grotesque, but we've got no
> >>>other way to limit reclaim context because the MM devs won't pass
> >>>the vmalloc gfp context down the stack to the PTE allocations....
....
> >>So, I did some digging in archives and found this thread from
> >>2010 onward with various patches and rants. Not sure how I
> >>missed that before.
> >>
> >>Should we have another run at this I wonder?
> >
> >By all means, but I don't think you'll have any more luck than
> >anyone else in the past. We've still got the problem of attitude
> >("vmalloc is not for general use") and making it actually work is
> >seen as "encouraging undesirable behaviour". If you can change
> >attitudes towards vmalloc first, then you'll be much more likely to
> >make progress in getting these problems solved....
>
> Well I don't know whether it has to be vmalloc that provides the
> solution here... if memory fragmentation could be controlled then
> kmalloc of larger contiguous chunks of memory could be done using
> that, which might be a better solution overall.

Which has been said repeatedly for the past 15 years. And after all
this time kmalloc is still horribly unreliable for large contiguous
allocations. Hence we still have a need for vmalloc for large
contiguous buffers because we have places where memory allocation
failure is simply not an option.

> But I do agree that
> we need to try and come to some kind of solution to this problem as
> it is one of those things that has been rumbling on for a long time
> without a proper solution.
>
> I also wonder if vmalloc is still very slow? That was the case some
> time ago when I noticed a problem in directory access times in gfs2,
> which made us change to use kmalloc with a vmalloc fallback in the
> first place,

Another of the "myths" about vmalloc. The speed and scalability of
vmap/vmalloc is a long-solved problem - Nick Piggin fixed the worst
of those problems 5-6 years ago - see the rewrite from 2008 that
started with commit db64fe0 ("mm: rewrite vmap layer")....

Cheers,
Dave.
--
Dave Chinner
david at fromorbit.com
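
Illustrative note on the memalloc_noio_save() workaround mentioned in
the quoted discussion above. The point made there is that a gfp mask
passed to __vmalloc() does not reach the page-table allocations deep
in the call chain, so callers instead set a per-task flag around the
call and let the lowest-level allocator mask the flags. What follows
is only a userspace model of that idea, not kernel code: every
model_* function and MODEL_* constant is a hypothetical stand-in
invented for illustration (for PF_MEMALLOC_NOIO, __GFP_IO, __GFP_FS
and the PTE allocation path), and it makes no claim about how gfs2,
DM or XFS actually implement this.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for the kernel's __GFP_IO / __GFP_FS. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_IO)

/* Per-thread state, standing in for a task flag such as PF_MEMALLOC_NOIO. */
static _Thread_local bool model_task_noio;

/* Enter the restricted scope; return the previous state so scopes can nest. */
bool model_noio_save(void)
{
    bool old = model_task_noio;

    model_task_noio = true;
    return old;
}

/* Leave the restricted scope by restoring whatever state was saved. */
void model_noio_restore(bool old)
{
    model_task_noio = old;
}

/* Lowest-level allocator hook: mask the requested flags by task context. */
unsigned int model_effective_gfp(unsigned int gfp)
{
    if (model_task_noio)
        gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
    return gfp;
}

/*
 * Deep helper (think: page-table allocation) that hard-codes the
 * permissive mask and never sees the caller's gfp argument.
 */
unsigned int model_deep_pte_alloc(void)
{
    return model_effective_gfp(MODEL_GFP_KERNEL);
}

/*
 * vmalloc-like entry point: the caller's mask is accepted but is not
 * passed down to the deep helper. Returns the mask the helper used.
 */
unsigned int model_vmalloc(unsigned int caller_gfp)
{
    (void)caller_gfp;
    return model_deep_pte_alloc();
}

/* The workaround: wrap the call in a save/restore pair. */
unsigned int model_vmalloc_restricted(void)
{
    bool old = model_noio_save();
    unsigned int used = model_vmalloc(MODEL_GFP_NOFS);

    model_noio_restore(old);
    return used;
}
```

In this model, calling model_vmalloc(MODEL_GFP_NOFS) directly still
lets the deep helper use MODEL_GFP_KERNEL, which is the problem raised
against the patch, while model_vmalloc_restricted() forces the
restricted mask all the way down. Returning the old state from the
save call rather than unconditionally clearing the flag on restore is
what keeps nested scopes correct.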