All of lore.kernel.org
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Richard Weinberger <richard@nod.at>
Cc: Christoph Hellwig <hch@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	NeilBrown <neilb@suse.de>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC][PATCHSET] non-recursive link_path_walk() and reducing stack footprint
Date: Tue, 21 Apr 2015 22:20:07 +0100	[thread overview]
Message-ID: <20150421212007.GU889@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20150421154504.GT889@ZenIV.linux.org.uk>

On Tue, Apr 21, 2015 at 04:45:04PM +0100, Al Viro wrote:
> On Tue, Apr 21, 2015 at 05:12:01PM +0200, Richard Weinberger wrote:
> 
> > I'm pretty sure we can kill it. I had the plan to rip it out during this merge window
> > along with other broken UML stuff but I was too late to ask on the UML mailinglist
> > if someone is using it (which I really doubt).
> > So, let's kill it with v4.2.
> 
> Let's do it.  Then ->put_link() is left in an interesting situation - *all*
> instances only use the 'cookie' argument...

OK, so here's what we have:
	* a lot of filesystems are using page_follow_link_light(); for RCU
mode they should simply look for page and if it's there and uptodate, that's
it - just grab a reference and be done.  If it's not uptodate - oh, well,
fallback to non-RCU mode.  Corresponding ->put_link() doesn't give a damn
which inode or dentry it is - it's just page_cache_release() (we need to
get rid of that kmap() crap anyway).
	* a lot of fast symlinks are using only inode; no ->put_link(),
no blocking operations, etc.  No problem at all.
	* shmem would probably want something similar to what
page_follow_link_light() would be doing for RCU case.
	* befs: should switch to page_follow_link_light(); just a matter of
giving it proper ->readpage().
	* NFS: probably as in Neil's series, except that we really ought to
add a helper for what page_follow_link_light() would do in RCU case, rather
than open-coding it here (and, again, kmap/kunmap crap should go)
	* /proc/self and its per-thread ilk: just do GFP_ATOMIC allocation for
RCU case (and handle failure as -ECHILD rather than usual -ENOMEM).
	* proc_symlink() stuff: uses only inode, nothing blocking, no problem.
	* 9p, cifs and fuse: those always query server on ->follow_link();
-ECHILD and be done with that.  _IF_ they want some kind of caching, they can
do as NFS does.  hostfs is that way too.
	* gfs2: _probably_ want to bugger off with -ECHILD; OTOH, ocfs2
uses page_follow_link_light(), maybe correctly, maybe not, and it ought
to have similar issues...
	* kernfs, configfs: -ECHILD.  And git rm is _very_ tempting after
reading that code...
	* lustre: hell knows, maybe always -ECHILD, maybe something like NFS.
	* XFS: see above.
	* hppfs: agreed to kill it off
	* autofs: not sure; it would be almost the usual fast symlink, if not
for the fact that it marks an object reached from dentry as "used now".
With RCU pathwalk it's _probably_ harmless, but I'd like a confirmation from
autofs folks.
	* /proc/*/ns/*: in theory, we might make it handle RCU mode, but
it's probably easier to say "just bugger off"
	* /proc/*/fd/*, /proc/*/exe, /proc/*/cwd, /proc/*/root: in principle
doable, but not without serious massage.
	* /proc/*/map_files/*: -ECHILD.
	* overlayfs: usual "use GFP_ATOMIC in RCU mode, treat failures as
-ECHILD".
	* ecryptfs: -ECHILD (and its use of ->readlink() is fishy, IMO).

I agree that unlazy_walk() attempted when walking a symlink ought to fail
with -ECHILD; we can't legitimize the symlink itself, so once we are out
of RCU mode, there's nothing to hold the inode of symlink (and its body)
from getting freed.  Solution is wrong though; for example, when
nested symlink occurs in the middle of a trailing one, we should *not*
remove the flag upon leaving the nested symlink.

Another unpleasant thing is that ->follow_link() saying "can't do that in
RCU mode" ends up with restart from scratch - that actually risks to be
worse than the mainline; there we would attempt unlazy_walk() and normally
it would've succeed.

AFAICS, the real rule is "can't unlazy if nd->last.name points into a symlink
body and we might still need to access it"...

  parent reply	other threads:[~2015-04-21 21:20 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-20 18:12 [RFC][PATCHSET] non-recursive link_path_walk() and reducing stack footprint Al Viro
2015-04-20 18:12 ` [PATCH 01/24] lustre: rip the private symlink nesting limit out Al Viro
2015-04-20 19:08   ` Andreas Dilger
2015-04-20 19:22     ` Al Viro
2015-04-20 20:35       ` Al Viro
2015-04-20 18:12 ` [PATCH 02/24] VFS: replace {, total_}link_count in task_struct with pointer to nameidata Al Viro
2015-04-20 18:12 ` [PATCH 03/24] ovl: rearrange ovl_follow_link to it doesn't need to call ->put_link Al Viro
2015-04-20 18:12 ` [PATCH 04/24] VFS: replace nameidata arg to ->put_link with a char* Al Viro
2015-04-20 18:12 ` [PATCH 05/24] SECURITY: remove nameidata arg from inode_follow_link Al Viro
2015-04-20 18:12 ` [PATCH 06/24] VFS: remove nameidata args from ->follow_link Al Viro
2015-04-20 18:12 ` [PATCH 07/24] namei: expand nested_symlink() in its only caller Al Viro
2015-04-20 18:12 ` [PATCH 08/24] namei.c: separate the parts of follow_link() that find the link body Al Viro
2015-04-20 18:12 ` [PATCH 09/24] namei: fold follow_link() into link_path_walk() Al Viro
2015-04-20 18:12 ` [PATCH 10/24] link_path_walk: handle get_link() returning ERR_PTR() immediately Al Viro
2015-04-20 18:12 ` [PATCH 11/24] link_path_walk: don't bother with walk_component() after jumping link Al Viro
2015-04-20 18:12 ` [PATCH 12/24] link_path_walk: turn inner loop into explicit goto Al Viro
2015-04-20 18:12 ` [PATCH 13/24] link_path_walk: massage a bit more Al Viro
2015-04-20 18:12 ` [PATCH 14/24] link_path_walk: get rid of duplication Al Viro
2015-04-20 18:12 ` [PATCH 15/24] link_path_walk: final preparations to killing recursion Al Viro
2015-04-20 18:13 ` [PATCH 16/24] link_path_walk: kill the recursion Al Viro
2015-04-20 21:04   ` Linus Torvalds
2015-04-20 21:32     ` Al Viro
2015-04-20 21:39       ` Linus Torvalds
2015-04-20 21:51         ` Al Viro
2015-04-20 21:41       ` Linus Torvalds
2015-04-20 21:42         ` Linus Torvalds
2015-04-20 21:59           ` Al Viro
2015-04-20 21:52         ` Al Viro
2015-04-20 18:13 ` [PATCH 17/24] link_path_walk: split "return from recursive call" path Al Viro
2015-04-20 18:13 ` [PATCH 18/24] link_path_walk: cleanup - turn goto start; into continue; Al Viro
2015-04-20 18:13 ` [PATCH 19/24] namei: fold may_follow_link() into follow_link() Al Viro
2015-04-20 18:13 ` [PATCH 20/24] namei: introduce nameidata->stack Al Viro
2015-04-20 18:13 ` [PATCH 21/24] namei: regularize use of put_link() and follow_link(), trim arguments Al Viro
2015-04-20 18:13 ` [PATCH 22/24] namei: trim the arguments of get_link() Al Viro
2015-04-20 18:13 ` [PATCH 23/24] new ->follow_link() and ->put_link() calling conventions Al Viro
2015-04-20 18:13 ` [PATCH 24/24] uninline walk_component() Al Viro
2015-04-21 14:49 ` [RFC][PATCHSET] non-recursive link_path_walk() and reducing stack footprint Al Viro
2015-04-21 15:04   ` Christoph Hellwig
2015-04-21 15:12     ` Richard Weinberger
2015-04-21 15:45       ` Al Viro
2015-04-21 16:46         ` Boaz Harrosh
2015-04-21 21:20         ` Al Viro [this message]
2015-04-22 18:07           ` Al Viro
2015-04-22 20:12             ` Al Viro
2015-04-22 21:05               ` Al Viro
2015-04-23  7:45                 ` NeilBrown
2015-04-23 18:07                   ` Al Viro
2015-04-24  6:35                     ` NeilBrown
2015-04-24 13:42                       ` Al Viro
2015-05-04  5:11                         ` Al Viro
2015-05-04  7:30                           ` NeilBrown
2015-04-23  5:01           ` Al Viro
2015-04-21 14:51 ` [PATCH] logfs: fix a pagecache leak for symlinks Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150421212007.GU889@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=richard@nod.at \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.