linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <mr@neil.brown.name>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Chinner <david@fromorbit.com>, Theodore Ts'o <tytso@mit.edu>,
	Chris Mason <clm@fb.com>, Jan Kara <jack@suse.cz>,
	ceph-devel@vger.kernel.org, cluster-devel@redhat.com,
	linux-nfs@vger.kernel.org, logfs@logfs.org, xfs@oss.sgi.com,
	linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org,
	linux-mtd@lists.infradead.org, reiserfs-devel@vger.kernel.org,
	linux-ntfs-dev@lists.sourceforge.net,
	linux-f2fs-devel@lists.sourceforge.net,
	linux-afs@lists.infradead.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/2] scop GFP_NOFS api
Date: Sun, 01 May 2016 07:55:31 +1000	[thread overview]
Message-ID: <87twiiu5gs.fsf@notabene.neil.brown.name> (raw)
In-Reply-To: <20160429120418.GK21977@dhcp22.suse.cz>

[-- Attachment #1: Type: text/plain, Size: 3249 bytes --]

On Fri, Apr 29 2016, Michal Hocko wrote:

>
> One think I have learned is that shrinkers can be really complex and
> getting rid of GFP_NOFS will be really hard so I would really like to
> start the easiest way possible and remove the direct usage and replace
> it by scope one which would at least _explain_ why it is needed. I think
> this is a reasonable _first_ step and a large step ahead because we have
> a good chance to get rid of a large number of those which were used
> "just because I wasn't sure and this should be safe, right?". I wouldn't
> be surprised if we end up with a very small number of both scope and
> direct usage in the end.

Yes, shrinkers can be complex.  About two of them are.  We could fix
lots and lots of call sites, or fix two shrinkers.
OK, that's a bit unfair as fixing one of the shrinkers involves changing
many ->evict_inode() functions.  But that would be a very focused
change.

I think your proposal is little more than re-arranging deck chairs on
the titanic.  Yes, it might give everybody a better view of the iceberg
but the iceberg is still there and in reality we can already see it.

The main iceberg is evict_inode.  It appears in both examples given so
far: xfs and gfs.  There are other little icebergs but they won't last
long after evict_inode is dealt with.

One particular problem with your process-context idea is that it isn't
inherited across threads.
Steve Whitehouse's example in gfs shows how allocation dependencies can
even cross into user space.

A more localized one that I have seen is that NFSv4 sometimes needs to
start up a state-management thread (particularly if the server
restarted).
It uses kthread_run(), which doesn't actually create the thread but asks
kthreadd to do it.  If NFS writeout is waiting for state management it
would need to make sure that kthreadd runs in allocation context to
avoid deadlock.
I feel that I've forgotten some important detail here and this might
have been fixed somehow, but the point still stands that the allocation
context can cross from thread to thread and can effectively become
anything and everything.

It is OK to wait for memory to be freed.  It is not OK to wait for any
particular piece of memory to be freed because you don't always know who
is waiting for you, or who you really are waiting on to free that
memory.

Whenever trying to free memory I think you need to do best-effort
without blocking.

>
> I would also like to revisit generic inode/dentry shrinker and see
> whether it could be more __GFP_FS friendly. As you say many FS might
> even not depend on some FS internal locks so pushing GFP_FS check down
> the layers might make a lot of sense and allow to clean some [id]cache
> even for __GFP_FS context.

I think the only part of prune_dcache_sb() that might need care is
iput() which boils down to evict().  The final unlink for NFS
silly-rename might happen in there too (in d_iput).
shrinking the dcache seems rather late to be performing that unlink
though, so I've probably missed some key detail.

If we find a way for evict(), when called from the shrinker, to be
non-blocking, and generally require all shrinkers to be non-blocking,
then many of these allocation problems evaporate.

Thanks,
NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 818 bytes --]

  parent reply	other threads:[~2016-04-30 21:55 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-26 11:56 [PATCH 0/2] scop GFP_NOFS api Michal Hocko
2016-04-26 11:56 ` [PATCH 1/2] mm: add PF_MEMALLOC_NOFS Michal Hocko
2016-04-26 23:07   ` Dave Chinner
2016-04-27  7:51     ` Michal Hocko
2016-04-27 11:54   ` [PATCH 1.1/2] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS Michal Hocko
2016-04-27 11:54     ` [PATCH 1.2/2] mm: introduce memalloc_nofs_{save,restore} API Michal Hocko
2016-04-27 13:07       ` Michal Hocko
2016-04-27 20:09       ` Michal Hocko
2016-04-27 20:30         ` Michal Hocko
2016-04-27 21:14       ` Michal Hocko
2016-04-27 17:41     ` [PATCH 1.1/2] xfs: abstract PF_FSTRANS to PF_MEMALLOC_NOFS Andreas Dilger
2016-04-27 19:43       ` Michal Hocko
2016-04-26 11:56 ` [PATCH 2/2] mm, debug: report when GFP_NO{FS,IO} is used explicitly from memalloc_no{fs,io}_{save,restore} context Michal Hocko
2016-04-26 22:58   ` Dave Chinner
2016-04-27  8:03     ` Michal Hocko
2016-04-27 22:55       ` Dave Chinner
2016-04-28  8:17     ` Michal Hocko
2016-04-28 21:51       ` Dave Chinner
2016-04-29 12:12         ` Michal Hocko
2016-04-29 23:40           ` Dave Chinner
2016-05-03 15:38             ` Michal Hocko
2016-05-04  0:07               ` Dave Chinner
2016-04-29  5:35 ` [PATCH 0/2] scop GFP_NOFS api NeilBrown
2016-04-29 10:20   ` [Cluster-devel] " Steven Whitehouse
2016-04-30 21:17     ` NeilBrown
2016-04-29 12:04   ` Michal Hocko
2016-04-30  0:24     ` Dave Chinner
2016-04-30 21:55     ` NeilBrown [this message]
2016-05-03 15:13       ` Michal Hocko
2016-05-03 23:26         ` NeilBrown
2016-04-30  0:11   ` Dave Chinner
2016-04-30 22:19     ` NeilBrown
2016-05-04  1:00       ` Dave Chinner
2016-05-06  3:20         ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87twiiu5gs.fsf@notabene.neil.brown.name \
    --to=mr@neil.brown.name \
    --cc=akpm@linux-foundation.org \
    --cc=ceph-devel@vger.kernel.org \
    --cc=clm@fb.com \
    --cc=cluster-devel@redhat.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.cz \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-ntfs-dev@lists.sourceforge.net \
    --cc=logfs@logfs.org \
    --cc=mhocko@kernel.org \
    --cc=reiserfs-devel@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).