From: Al Viro <viro@ZenIV.linux.org.uk>
To: Richard Weinberger <richard@nod.at>
Cc: Christoph Hellwig <hch@infradead.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
NeilBrown <neilb@suse.de>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC][PATCHSET] non-recursive link_path_walk() and reducing stack footprint
Date: Tue, 21 Apr 2015 22:20:07 +0100 [thread overview]
Message-ID: <20150421212007.GU889@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20150421154504.GT889@ZenIV.linux.org.uk>
On Tue, Apr 21, 2015 at 04:45:04PM +0100, Al Viro wrote:
> On Tue, Apr 21, 2015 at 05:12:01PM +0200, Richard Weinberger wrote:
>
> > I'm pretty sure we can kill it. I had the plan to rip it out during this merge window
> > along with other broken UML stuff but I was too late to ask on the UML mailinglist
> > if someone is using it (which I really doubt).
> > So, let's kill it with v4.2.
>
> Let's do it. Then ->put_link() is left in an interesting situation - *all*
> instances only use the 'cookie' argument...
OK, so here's what we have:
* a lot of filesystems are using page_follow_link_light(); for RCU
mode they should simply look for page and if it's there and uptodate, that's
it - just grab a reference and be done. If it's not uptodate - oh, well,
fallback to non-RCU mode. Corresponding ->put_link() doesn't give a damn
which inode or dentry it is - it's just page_cache_release() (we need to
get rid of that kmap() crap anyway).
* a lot of fast symlinks are using only inode; no ->put_link(),
no blocking operations, etc. No problem at all.
* shmem would probably want something similar to what
page_follow_link_light() would be doing for RCU case.
* befs: should switch to page_follow_link_light(); just a matter of
giving it proper ->readpage().
* NFS: probably as in Neil's series, except that we really ought to
add a helper for what page_follow_link_light() would do in RCU case, rather
than open-coding it here (and, again, kmap/kunmap crap should go)
* /proc/self and its per-thread ilk: just do GFP_ATOMIC allocation for
RCU case (and handle failure as -ECHILD rather than usual -ENOMEM).
* proc_symlink() stuff: uses only inode, nothing blocking, no problem.
* 9p, cifs and fuse: those always query server on ->follow_link();
-ECHILD and be done with that. _IF_ they want some kind of caching, they can
do as NFS does. hostfs is that way too.
* gfs2: _probably_ want to bugger off with -ECHILD; OTOH, ocfs2
uses page_follow_link_light(), maybe correctly, maybe not, and it ought
to have similar issues...
* kernfs, configfs: -ECHILD. And git rm is _very_ tempting after
reading that code...
* lustre: hell knows, maybe always -ECHILD, maybe something like NFS.
* XFS: see above.
* hppfs: agreed to kill it off
* autofs: not sure; it would be almost the usual fast symlink, if not
for the fact that it marks an object reached from dentry as "used now".
With RCU pathwalk it's _probably_ harmless, but I'd like a confirmation from
autofs folks.
* /proc/*/ns/*: in theory, we might make it handle RCU mode, but
it's probably easier to say "just bugger off"
* /proc/*/fd/*, /proc/*/exe, /proc/*/cwd, /proc/*/root: in principle
doable, but not without serious massage.
* /proc/*/map_files/*: -ECHILD.
* overlayfs: usual "use GFP_ATOMIC in RCU mode, treat failures as
-ECHILD".
* ecryptfs: -ECHILD (and its use of ->readlink() is fishy, IMO).
I agree that unlazy_walk() attempted when walking a symlink ought to fail
with -ECHILD; we can't legitimize the symlink itself, so once we are out
of RCU mode, there's nothing to hold the inode of symlink (and its body)
from getting freed. Solution is wrong though; for example, when
nested symlink occurs in the middle of a trailing one, we should *not*
remove the flag upon leaving the nested symlink.
Another unpleasant thing is that ->follow_link() saying "can't do that in
RCU mode" ends up with restart from scratch - that actually risks to be
worse than the mainline; there we would attempt unlazy_walk() and normally
it would've succeed.
AFAICS, the real rule is "can't unlazy if nd->last.name points into a symlink
body and we might still need to access it"...
next prev parent reply other threads:[~2015-04-21 21:20 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-20 18:12 [RFC][PATCHSET] non-recursive link_path_walk() and reducing stack footprint Al Viro
2015-04-20 18:12 ` [PATCH 01/24] lustre: rip the private symlink nesting limit out Al Viro
2015-04-20 19:08 ` Andreas Dilger
2015-04-20 19:22 ` Al Viro
2015-04-20 20:35 ` Al Viro
2015-04-20 18:12 ` [PATCH 02/24] VFS: replace {, total_}link_count in task_struct with pointer to nameidata Al Viro
2015-04-20 18:12 ` [PATCH 03/24] ovl: rearrange ovl_follow_link to it doesn't need to call ->put_link Al Viro
2015-04-20 18:12 ` [PATCH 04/24] VFS: replace nameidata arg to ->put_link with a char* Al Viro
2015-04-20 18:12 ` [PATCH 05/24] SECURITY: remove nameidata arg from inode_follow_link Al Viro
2015-04-20 18:12 ` [PATCH 06/24] VFS: remove nameidata args from ->follow_link Al Viro
2015-04-20 18:12 ` [PATCH 07/24] namei: expand nested_symlink() in its only caller Al Viro
2015-04-20 18:12 ` [PATCH 08/24] namei.c: separate the parts of follow_link() that find the link body Al Viro
2015-04-20 18:12 ` [PATCH 09/24] namei: fold follow_link() into link_path_walk() Al Viro
2015-04-20 18:12 ` [PATCH 10/24] link_path_walk: handle get_link() returning ERR_PTR() immediately Al Viro
2015-04-20 18:12 ` [PATCH 11/24] link_path_walk: don't bother with walk_component() after jumping link Al Viro
2015-04-20 18:12 ` [PATCH 12/24] link_path_walk: turn inner loop into explicit goto Al Viro
2015-04-20 18:12 ` [PATCH 13/24] link_path_walk: massage a bit more Al Viro
2015-04-20 18:12 ` [PATCH 14/24] link_path_walk: get rid of duplication Al Viro
2015-04-20 18:12 ` [PATCH 15/24] link_path_walk: final preparations to killing recursion Al Viro
2015-04-20 18:13 ` [PATCH 16/24] link_path_walk: kill the recursion Al Viro
2015-04-20 21:04 ` Linus Torvalds
2015-04-20 21:32 ` Al Viro
2015-04-20 21:39 ` Linus Torvalds
2015-04-20 21:51 ` Al Viro
2015-04-20 21:41 ` Linus Torvalds
2015-04-20 21:42 ` Linus Torvalds
2015-04-20 21:59 ` Al Viro
2015-04-20 21:52 ` Al Viro
2015-04-20 18:13 ` [PATCH 17/24] link_path_walk: split "return from recursive call" path Al Viro
2015-04-20 18:13 ` [PATCH 18/24] link_path_walk: cleanup - turn goto start; into continue; Al Viro
2015-04-20 18:13 ` [PATCH 19/24] namei: fold may_follow_link() into follow_link() Al Viro
2015-04-20 18:13 ` [PATCH 20/24] namei: introduce nameidata->stack Al Viro
2015-04-20 18:13 ` [PATCH 21/24] namei: regularize use of put_link() and follow_link(), trim arguments Al Viro
2015-04-20 18:13 ` [PATCH 22/24] namei: trim the arguments of get_link() Al Viro
2015-04-20 18:13 ` [PATCH 23/24] new ->follow_link() and ->put_link() calling conventions Al Viro
2015-04-20 18:13 ` [PATCH 24/24] uninline walk_component() Al Viro
2015-04-21 14:49 ` [RFC][PATCHSET] non-recursive link_path_walk() and reducing stack footprint Al Viro
2015-04-21 15:04 ` Christoph Hellwig
2015-04-21 15:12 ` Richard Weinberger
2015-04-21 15:45 ` Al Viro
2015-04-21 16:46 ` Boaz Harrosh
2015-04-21 21:20 ` Al Viro [this message]
2015-04-22 18:07 ` Al Viro
2015-04-22 20:12 ` Al Viro
2015-04-22 21:05 ` Al Viro
2015-04-23 7:45 ` NeilBrown
2015-04-23 18:07 ` Al Viro
2015-04-24 6:35 ` NeilBrown
2015-04-24 13:42 ` Al Viro
2015-05-04 5:11 ` Al Viro
2015-05-04 7:30 ` NeilBrown
2015-04-23 5:01 ` Al Viro
2015-04-21 14:51 ` [PATCH] logfs: fix a pagecache leak for symlinks Al Viro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150421212007.GU889@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=hch@infradead.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=neilb@suse.de \
--cc=richard@nod.at \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).