From: NeilBrown <neilb@suse.de>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC][PATCHSET] non-recursive link_path_walk() and reducing stack footprint
Date: Mon, 4 May 2015 17:30:25 +1000 [thread overview]
Message-ID: <20150504173025.50433fcb@notabene.brown> (raw)
In-Reply-To: <20150504051129.GA3246@ZenIV.linux.org.uk>
[-- Attachment #1: Type: text/plain, Size: 3935 bytes --]
On Mon, 4 May 2015 06:11:29 +0100 Al Viro <viro@ZenIV.linux.org.uk> wrote:
> On Fri, Apr 24, 2015 at 02:42:03PM +0100, Al Viro wrote:
>
> > That avoids this spin_lock() on each absolute symlink at the price of extra
> > 32 bits in struct nameidata. It looks like doing on-demand reallocation
> > of nd->stack is the right way to go anyway, so the pressure on nameidata size
> > is going to be weaker and that might be the right way to go...
>
> OK, on-demand reallocation is done. What I have right now is
> * flat MAXSYMLINKS 40, no matter what kind of nesting there might
> be.
> * purely iterative link_path_walk().
> * no damn nameidata on stack for generic_readlink()
> * stack footprint of the entire thing independent from the nesting
> depth, and about on par with "no symlinks at all" case in mainline.
> * some massage towards RCU follow_link done (in the end of queue),
> but quite a bit more work remains.
>
> What I've got so far is in vfs.git#link_path_walk; I'm not too happy about
> posting a 70-chunk mailbomb, but it really will need review and testing.
> It survives xfstests and LTP with no regressions, but it will need
> serious profiling, etc., along with RTFS. I tried to keep it in reasonably
> small pieces, but there's a lot of them ;-/
>
> FWIW, I've a bit more reorganization plotted out, but it's not far from
> where we need to be for RCU follow_link. Some notes:
> * I don't believe we want to pass flags to ->follow_link() - it's
> much simpler to give the damn thing NULL for dentry in RCU case. In *all*
> cases where we might have a change to get the symlink body without blocking
> we can do that by inode alone. We obviously want to pass dentry and inode
> separately (and in case of fast symlinks we don't hit the filesystem at
> all), but that's it - flags isn't needed.
> * terminate_walk() should do bulk put_link(). So should the
> failure cases of complete_walk(). _Success_ of complete_walk() should
> be careful about legitimizing links - it *can* be called with one link
> on stack, and be followed by access to link body. Yes, really - do_last()
> in O_CREAT case.
> * do_last(), lookup_last() and mountpoint_last() ought to
> have put_link() done when called on non-empty stack (thus turning the loops
> into something like
> while ((err = lookup_last(nd)) > 0) {
> err = trailing_symlink(nd);
> if (err)
> break;
> }
> _After_ the point where they don't need to look at the last component of
> name, obviously.
> * I think we should leave terminate_walk() to callers in failure
> cases of walk_component() and handle_dots(), as well as get_link(). Makes
> life simpler in callers, actually. I'll play with that a bit more.
> * it might make sense to add the second flag to walk_component(),
> in addition to LOOKUP_FOLLOW, meaning "do put_link() once you are done looking
> at the name". In principle, it promises simpler logics with unlazy_walk(),
> but that's one area I'm still not entirely sure about. Will need to
> experiment a bit...
> * nd->seq clobbering will need to be dealt with, as discussed upthread.
> * I _really_ hate your "let's use the LSB of struct page * to tell
> if we need to kunmap()" approach. It's too damn ugly to live. _And_ it's
> trivial to avoid - all we need is to have (non-lazy) page_follow_link_light()
> and page_symlink() to remove __GFP_HIGHMEM from inode->i_mapping before
> ever asking to allocate pages there. That'll suffice, and it makes sense
> regardless of RCU work - that kmap/kunmap with potential for minutes in
> between (while waiting for stuck NFS server in the middle of symlink traversal)
> is simply wrong.
Thanks!
I'll have another look and see about adding what is needed for RCU symlink
support.
NeilBrown
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]
next prev parent reply other threads:[~2015-05-04 7:30 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-20 18:12 [RFC][PATCHSET] non-recursive link_path_walk() and reducing stack footprint Al Viro
2015-04-20 18:12 ` [PATCH 01/24] lustre: rip the private symlink nesting limit out Al Viro
2015-04-20 19:08 ` Andreas Dilger
2015-04-20 19:22 ` Al Viro
2015-04-20 20:35 ` Al Viro
2015-04-20 18:12 ` [PATCH 02/24] VFS: replace {, total_}link_count in task_struct with pointer to nameidata Al Viro
2015-04-20 18:12 ` [PATCH 03/24] ovl: rearrange ovl_follow_link to it doesn't need to call ->put_link Al Viro
2015-04-20 18:12 ` [PATCH 04/24] VFS: replace nameidata arg to ->put_link with a char* Al Viro
2015-04-20 18:12 ` [PATCH 05/24] SECURITY: remove nameidata arg from inode_follow_link Al Viro
2015-04-20 18:12 ` [PATCH 06/24] VFS: remove nameidata args from ->follow_link Al Viro
2015-04-20 18:12 ` [PATCH 07/24] namei: expand nested_symlink() in its only caller Al Viro
2015-04-20 18:12 ` [PATCH 08/24] namei.c: separate the parts of follow_link() that find the link body Al Viro
2015-04-20 18:12 ` [PATCH 09/24] namei: fold follow_link() into link_path_walk() Al Viro
2015-04-20 18:12 ` [PATCH 10/24] link_path_walk: handle get_link() returning ERR_PTR() immediately Al Viro
2015-04-20 18:12 ` [PATCH 11/24] link_path_walk: don't bother with walk_component() after jumping link Al Viro
2015-04-20 18:12 ` [PATCH 12/24] link_path_walk: turn inner loop into explicit goto Al Viro
2015-04-20 18:12 ` [PATCH 13/24] link_path_walk: massage a bit more Al Viro
2015-04-20 18:12 ` [PATCH 14/24] link_path_walk: get rid of duplication Al Viro
2015-04-20 18:12 ` [PATCH 15/24] link_path_walk: final preparations to killing recursion Al Viro
2015-04-20 18:13 ` [PATCH 16/24] link_path_walk: kill the recursion Al Viro
2015-04-20 21:04 ` Linus Torvalds
2015-04-20 21:32 ` Al Viro
2015-04-20 21:39 ` Linus Torvalds
2015-04-20 21:51 ` Al Viro
2015-04-20 21:41 ` Linus Torvalds
2015-04-20 21:42 ` Linus Torvalds
2015-04-20 21:59 ` Al Viro
2015-04-20 21:52 ` Al Viro
2015-04-20 18:13 ` [PATCH 17/24] link_path_walk: split "return from recursive call" path Al Viro
2015-04-20 18:13 ` [PATCH 18/24] link_path_walk: cleanup - turn goto start; into continue; Al Viro
2015-04-20 18:13 ` [PATCH 19/24] namei: fold may_follow_link() into follow_link() Al Viro
2015-04-20 18:13 ` [PATCH 20/24] namei: introduce nameidata->stack Al Viro
2015-04-20 18:13 ` [PATCH 21/24] namei: regularize use of put_link() and follow_link(), trim arguments Al Viro
2015-04-20 18:13 ` [PATCH 22/24] namei: trim the arguments of get_link() Al Viro
2015-04-20 18:13 ` [PATCH 23/24] new ->follow_link() and ->put_link() calling conventions Al Viro
2015-04-20 18:13 ` [PATCH 24/24] uninline walk_component() Al Viro
2015-04-21 14:49 ` [RFC][PATCHSET] non-recursive link_path_walk() and reducing stack footprint Al Viro
2015-04-21 15:04 ` Christoph Hellwig
2015-04-21 15:12 ` Richard Weinberger
2015-04-21 15:45 ` Al Viro
2015-04-21 16:46 ` Boaz Harrosh
2015-04-21 21:20 ` Al Viro
2015-04-22 18:07 ` Al Viro
2015-04-22 20:12 ` Al Viro
2015-04-22 21:05 ` Al Viro
2015-04-23 7:45 ` NeilBrown
2015-04-23 18:07 ` Al Viro
2015-04-24 6:35 ` NeilBrown
2015-04-24 13:42 ` Al Viro
2015-05-04 5:11 ` Al Viro
2015-05-04 7:30 ` NeilBrown [this message]
2015-04-23 5:01 ` Al Viro
2015-04-21 14:51 ` [PATCH] logfs: fix a pagecache leak for symlinks Al Viro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150504173025.50433fcb@notabene.brown \
--to=neilb@suse.de \
--cc=hch@infradead.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).