From: Dave Chinner <david@fromorbit.com>
To: David Rientjes <rientjes@google.com>
Cc: Mike Snitzer <snitzer@redhat.com>,
Mikulas Patocka <mpatocka@redhat.com>,
Edward Thornber <thornber@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
dm-devel@redhat.com, Vivek Goyal <vgoyal@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"Alasdair G. Kergon" <agk@redhat.com>
Subject: Re: [PATCH 2/7] mm: introduce kvmalloc and kvmalloc_node
Date: Wed, 15 Jul 2015 10:25:40 +1000
Message-ID: <20150715002540.GR3902@dastard>
In-Reply-To: <alpine.DEB.2.10.1507141536590.16182@chino.kir.corp.google.com>
On Tue, Jul 14, 2015 at 03:45:40PM -0700, David Rientjes wrote:
> On Wed, 15 Jul 2015, Dave Chinner wrote:
>
> > > Sure, but it's not accomplishing the same thing: things like
> > > > ext4_kvmalloc() only want to fall back to vmalloc() when high-order
> > > > allocations fail: the function is used for different sizes. This cannot
> > > > be converted to kvmalloc_node() since it falls back immediately when
> > > reclaim fails. Same issue with single_file_open() for the seq_file code.
> > > We could go through every kmalloc() -> vmalloc() fallback for more
> > > examples in the code, but those two instances were the first I looked at
> > > and couldn't be converted to kvmalloc_node() without work.
> > >
> > > > It is always easier to shoehorn utility functions locally within a
> > > > subsystem (be it ext4, dm, etc) but once enough do something in a
> > > > similar but different way it really should get elevated.
> > > >
> > >
> > > I would argue that
> > >
> > > > void *ext4_kvmalloc(size_t size, gfp_t flags)
> > > > {
> > > > 	void *ret;
> > > >
> > > > 	ret = kmalloc(size, flags | __GFP_NOWARN);
> > > > 	if (!ret)
> > > > 		ret = __vmalloc(size, flags, PAGE_KERNEL);
> > > > 	return ret;
> > > > }
> > >
> > > is simple enough that we don't need to convert it to anything.
> >
> > Except that it will have problems with GFP_NOFS context when the pte
> > code inside vmalloc does a GFP_KERNEL allocation. Hence we have
> > stuff in other subsystems (such as XFS) where we've noticed lockdep
> > whining about this:
> >
>
> Does anyone have an example of ext4_kvmalloc() having a lockdep violation?
> Presumably the GFP_NOFS calls to ext4_kvmalloc() will never have
> size > (1 << (PAGE_SHIFT + PAGE_ALLOC_COSTLY_ORDER)) so that kmalloc()
> above actually never returns NULL and __vmalloc() only gets used for the
> ext4_kvmalloc(..., GFP_KERNEL) call.
Code inspection is all that is necessary. For example,
fs/ext4/resize.c::add_new_gdb() does:

	n_group_desc = ext4_kvmalloc((gdb_num + 1) *
				     sizeof(struct buffer_head *),
				     GFP_NOFS);

I have to assume this was done because resize was failing kmalloc()
calls on large filesystems in GFP_NOFS context as the commit message
is less than helpful:

commit f18a5f21c25707b4fe64b326e2b4d150565e7300
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Mon Aug 1 08:45:38 2011 -0400

    ext4: use ext4_kvzalloc()/ext4_kvmalloc() for s_group_desc and s_group_info

    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
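
To put rough numbers on the add_new_gdb() case, here is a small
userspace sketch (not kernel code) of the size check being discussed.
It assumes 4k pages (PAGE_SHIFT = 12) and PAGE_ALLOC_COSTLY_ORDER = 3,
both of which depend on architecture and kernel version, and the helper
names are made up for illustration:

```c
#include <stddef.h>

/* Assumed values (4k pages, costly order 3); arch/config dependent. */
#define PAGE_SHIFT		12
#define PAGE_ALLOC_COSTLY_ORDER	3

/* Size limit from the quoted expression: 1 << (PAGE_SHIFT + COSTLY_ORDER). */
static size_t costly_threshold_bytes(void)
{
	return (size_t)1 << (PAGE_SHIFT + PAGE_ALLOC_COSTLY_ORDER);
}

/*
 * Hypothetical helper: bytes requested by the add_new_gdb() call for a
 * given gdb_num, with ptr_size standing in for
 * sizeof(struct buffer_head *) on the target architecture.
 */
static size_t group_desc_array_bytes(size_t gdb_num, size_t ptr_size)
{
	return (gdb_num + 1) * ptr_size;
}

/* Nonzero when the request is larger than the costly-order threshold. */
static int exceeds_costly_threshold(size_t gdb_num, size_t ptr_size)
{
	return group_desc_array_bytes(gdb_num, ptr_size) >
	       costly_threshold_bytes();
}
```

Under those assumptions the threshold is 32k, so with 8-byte pointers
the request goes past it once gdb_num reaches 4096. By the reasoning
quoted above, that is the point where kmalloc() may return NULL and the
GFP_NOFS caller can end up in __vmalloc().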
> It should be fixed, though, probably in the same way as
> kmem_zalloc_large() today, but it seems the real fix would be to attack
> the whole vmalloc() GFP_KERNEL issue that has been talked about several
> times in the past. Then the existing ext4_kvmalloc() implementation
> should be fine.
Agreed, we really need to ensure that the generic kernel allocation
functions honour the allocation context that callers pass to them.
I'm not going to hold my breath waiting for this to happen,
though....
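
For reference, the kmem_zalloc_large() approach wraps the vmalloc
fallback in memalloc_noio_save()/memalloc_noio_restore() when the
caller is in NOFS context. The following is only a userspace model of
that save/restore pattern, not kernel code: every model_* name and
MODEL_* constant is a stand-in invented for illustration, and the
"inner allocation" simply reports whether reclaim would have been
allowed to recurse into the filesystem.

```c
#include <stdbool.h>

/* Stand-ins for kernel state; these are not the real kernel definitions. */
#define MODEL_PF_MEMALLOC_NOIO	0x1u
#define MODEL_GFP_FS		0x1u	/* caller allows fs recursion */

static unsigned int model_task_flags;	/* plays the role of current->flags */

/* Set the per-task flag and return its previous state. */
static unsigned int model_noio_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOIO;

	model_task_flags |= MODEL_PF_MEMALLOC_NOIO;
	return old;
}

/* Put the per-task flag back to the saved state. */
static void model_noio_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOIO) | old;
}

/*
 * Models an allocation deep inside vmalloc that ignores the caller's
 * gfp mask: it may recurse into the fs only if the task flag is clear.
 */
static bool model_inner_alloc_may_enter_fs(void)
{
	return !(model_task_flags & MODEL_PF_MEMALLOC_NOIO);
}

/*
 * Models the vmalloc fallback: if the caller did not pass MODEL_GFP_FS,
 * wrap the inner allocation in the save/restore pair. Returns whether
 * the inner allocation was able to recurse into the filesystem.
 */
static bool model_vmalloc_fallback(unsigned int caller_gfp)
{
	bool nofs = !(caller_gfp & MODEL_GFP_FS);
	unsigned int saved = 0;
	bool entered_fs;

	if (nofs)
		saved = model_noio_save();
	entered_fs = model_inner_alloc_may_enter_fs();
	if (nofs)
		model_noio_restore(saved);
	return entered_fs;
}
```

The point of the model is that the restriction travels with the task
rather than with the gfp mask, so it still applies to allocations that
ignore the caller's flags, and the task flag is back to its previous
value once the fallback returns.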
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
Thread overview: 20+ messages
2015-07-07 15:08 [PATCH 0/7] mm: reliable memory allocation with kvmalloc Mikulas Patocka
2015-07-07 15:09 ` [PATCH 1/7] mm/vmalloc: export __vmalloc_node_flags Mikulas Patocka
2015-07-07 15:10 ` [PATCH 2/7] mm: introduce kvmalloc and kvmalloc_node Mikulas Patocka
2015-07-07 21:41 ` Andrew Morton
2015-07-08 7:34 ` [dm-devel] " Zdenek Kabelac
2015-07-08 23:03 ` Mikulas Patocka
2015-07-08 23:18 ` Andrew Morton
2015-07-09 14:45 ` Mikulas Patocka
2015-07-14 21:13 ` David Rientjes
2015-07-14 21:19 ` Mike Snitzer
2015-07-14 21:24 ` David Rientjes
2015-07-14 21:54 ` Dave Chinner
2015-07-14 22:45 ` David Rientjes
2015-07-15 0:25 ` Dave Chinner [this message]
2015-07-14 21:24 ` Andrew Morton
2015-07-07 15:10 ` [PATCH 3/7] dm-ioctl: join flags DM_PARAMS_KMALLOC and DM_PARAMS_VMALLOC Mikulas Patocka
2015-07-07 15:11 ` [PATCH 4/7] dm: use kvmalloc Mikulas Patocka
2015-07-07 15:11 ` [PATCH 5/7] dm-thin: " Mikulas Patocka
2015-07-07 15:12 ` [PATCH 6/7] dm-stats: use kvmalloc_node Mikulas Patocka
2015-07-07 15:13 ` [PATCH 7/7] dm: make dm_vcalloc use kvmalloc Mikulas Patocka