Re: [RFC PATCH 1/6] fs/9p: Add ability to identify inode by path for .L

linux-security-module.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Mickaël Salaün" <mic@digikod.net>
To: Tingmao Wang <m@maowtm.org>
Cc: "Al Viro" <viro@zeniv.linux.org.uk>,
	"Eric Van Hensbergen" <ericvh@kernel.org>,
	"Dominique Martinet" <asmadeus@codewreck.org>,
	"Latchesar Ionkov" <lucho@ionkov.net>,
	"Christian Schoenebeck" <linux_oss@crudebyte.com>,
	v9fs@lists.linux.dev, "Günther Noack" <gnoack@google.com>,
	linux-security-module@vger.kernel.org, "Jan Kara" <jack@suse.cz>,
	"Amir Goldstein" <amir73il@gmail.com>,
	"Matthew Bobrowski" <repnop@google.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 1/6] fs/9p: Add ability to identify inode by path for .L
Date: Fri, 8 Aug 2025 10:32:44 +0200	[thread overview]
Message-ID: <20250808.oog4xee5Pee2@digikod.net> (raw)
In-Reply-To: <b32e2088-92c0-43e0-8c90-cb20d4567973@maowtm.org>

On Fri, Jul 11, 2025 at 08:11:44PM +0100, Tingmao Wang wrote:
> Hi Al, thanks for the review :)  I haven't had the chance to properly
> think about this until today, so apologies for the delay.
> 
> On 7/5/25 01:19, Al Viro wrote:
> > On Sun, Apr 06, 2025 at 09:43:02PM +0100, Tingmao Wang wrote:
> > 
> >> +struct v9fs_ino_path *make_ino_path(struct dentry *dentry)
> >> +{
> >> +	struct v9fs_ino_path *path;
> >> +	size_t path_components = 0;
> >> +	struct dentry *curr = dentry;
> >> +	ssize_t i;
> >> +
> >> +	lockdep_assert_held_read(&v9fs_dentry2v9ses(dentry)->rename_sem);
> >> +
> >> +	rcu_read_lock();
> >> +
> >> +    /* Don't include the root dentry */
> >> +	while (curr->d_parent != curr) {
> >> +		path_components++;
> >> +		curr = curr->d_parent;
> >> +	}
> >> +	if (WARN_ON(path_components > SSIZE_MAX)) {
> 
> (Looking at this again I think this check is a bit bogus.  I don't know
> how would it be possible at all for us to have > SSIZE_MAX deep
> directories especially since each level requires a dentry allocation, but
> even if this check is actually useful, it should be in the while loop,
> before each path_components++)

WARN_ON_ONCE() would be better, especially in a while loop.  I avoid
using WARN_ON(), especially when that can be triggered by users.

> 
> >> +		rcu_read_unlock();
> >> +		return NULL;
> >> +	}
> >> +
> >> +	path = kmalloc(struct_size(path, names, path_components),
> >> +		       GFP_KERNEL);
> > 
> > Blocking allocation under rcu_read_lock().
> 
> I think my first instinct of how to fix this, if the original code is
> correct barring this allocation issue, would be to take rcu read lock
> twice (first walk to calculate how much to allocate, then second walk to
> actually take the snapshots).  We should be safe to rcu_read_unlock() in
> the middle as long as caller has a reference to the target dentry (this
> needs to be true even if we just do one rcu_read_lock() anyway), and we
> can start a parent walk again.  The v9fs rename_sem should ensure we see
> the same path again.
> 
> Alternatively, we can use dget_parent to do the walk, and not lock RCU at
> all.  We still need to walk twice tho, to know how much to allocate.  But
> for now I will keep the current approach.
> 
> New version:
> 
> /*
>  * Must hold rename_sem due to traversing parents.  Caller must hold
>  * reference to dentry.
>  */
> struct v9fs_ino_path *make_ino_path(struct dentry *dentry)
> {
> 	struct v9fs_ino_path *path;
> 	size_t path_components = 0;
> 	struct dentry *curr = dentry;
> 	ssize_t i;
> 
> 	lockdep_assert_held_read(&v9fs_dentry2v9ses(dentry)->rename_sem);
> 	might_sleep(); /* Allocation below might block */
> 
> 	rcu_read_lock();
> 
> 	/* Don't include the root dentry */
> 	while (curr->d_parent != curr) {
> 		if (WARN_ON(path_components >= SSIZE_MAX)) {
> 			rcu_read_unlock();
> 			return NULL;
> 		}
> 		path_components++;
> 		curr = curr->d_parent;
> 	}
> 
> 	/*
> 	 * Allocation can block so don't do it in RCU (and because the
> 	 * allocation might be large, since name_snapshot leaves space for
> 	 * inline str, not worth trying GFP_ATOMIC)
> 	 */
> 	rcu_read_unlock();
> 
> 	path = kmalloc(struct_size(path, names, path_components), GFP_KERNEL);
> 	if (!path) {
> 		rcu_read_unlock();

This unlock is wrong.

> 		return NULL;
> 	}
> 
> 	path->nr_components = path_components;
> 	curr = dentry;
> 
> 	rcu_read_lock();
> 	for (i = path_components - 1; i >= 0; i--) {
> 		take_dentry_name_snapshot(&path->names[i], curr);
> 		curr = curr->d_parent;
> 	}
> 	WARN_ON(curr != curr->d_parent);
> 	rcu_read_unlock();
> 	return path;
> }
> 
> How does this look?

Looks good to me overall.  Please sent a new patch series.

> 
> On 7/5/25 01:25, Al Viro wrote:
> > On Sun, Apr 06, 2025 at 09:43:02PM +0100, Tingmao Wang wrote:
> >> +bool ino_path_compare(struct v9fs_ino_path *ino_path,
> >> +			     struct dentry *dentry)
> >> +{
> >> +	struct dentry *curr = dentry;
> >> +	struct qstr *curr_name;
> >> +	struct name_snapshot *compare;
> >> +	ssize_t i;
> >> +
> >> +	lockdep_assert_held_read(&v9fs_dentry2v9ses(dentry)->rename_sem);
> >> +
> >> +	rcu_read_lock();
> >> +	for (i = ino_path->nr_components - 1; i >= 0; i--) {
> >> +		if (curr->d_parent == curr) {
> >> +			/* We're supposed to have more components to walk */
> >> +			rcu_read_unlock();
> >> +			return false;
> >> +		}
> >> +		curr_name = &curr->d_name;
> >> +		compare = &ino_path->names[i];
> >> +		/*
> >> +		 * We can't use hash_len because it is salted with the parent
> >> +		 * dentry pointer.  We could make this faster by pre-computing our
> >> +		 * own hashlen for compare and ino_path outside, probably.
> >> +		 */
> >> +		if (curr_name->len != compare->name.len) {
> >> +			rcu_read_unlock();
> >> +			return false;
> >> +		}
> >> +		if (strncmp(curr_name->name, compare->name.name,
> >> +			    curr_name->len) != 0) {
> > 
> > ... without any kind of protection for curr_name.  Incidentally,
> > what about rename()?  Not a cross-directory one, just one that
> > changes the name of a subdirectory within the same parent?
> 
> As far as I can tell, in v9fs_vfs_rename, v9ses->rename_sem is taken for
> both same-parent and different parent renames, so I think we're safe here
> (and hopefully for any v9fs dentries, nobody should be causing d_name to
> change except for ourselves when we call d_move in v9fs_vfs_rename?  If
> yes then because we also take v9ses->rename_sem, in theory we should be
> fine here...?)

A lockdep_assert_held() or similar and a comment would make this clear.

> 
> (Let me know if I missed anything.  I'm assuming only the filesystem
> "owning" a dentry should d_move/d_exchange the dentry.)
> 
> However, I see that there is a d_same_name function in dcache.c which is
> slightly more careful (but still requires the caller to check the dentry
> seqcount, which we do not need to because of the reasoning above), and in
> hindsight I think that is probably the more proper way to do this
> comparison (and will also handle case-insensitivity, although I've not
> explored if this is applicable to 9pfs).
> 
> New version:
> 
> /*
>  * Must hold rename_sem due to traversing parents
>  */
> bool ino_path_compare(struct v9fs_ino_path *ino_path, struct dentry *dentry)
> {
> 	struct dentry *curr = dentry;
> 	struct name_snapshot *compare;
> 	ssize_t i;
> 
> 	lockdep_assert_held_read(&v9fs_dentry2v9ses(dentry)->rename_sem);
> 
> 	rcu_read_lock();
> 	for (i = ino_path->nr_components - 1; i >= 0; i--) {
> 		if (curr->d_parent == curr) {
> 			/* We're supposed to have more components to walk */
> 			rcu_read_unlock();
> 			return false;
> 		}
> 		compare = &ino_path->names[i];
> 		if (!d_same_name(curr, curr->d_parent, &compare->name)) {
> 			rcu_read_unlock();
> 			return false;
> 		}
> 		curr = curr->d_parent;
> 	}
> 	rcu_read_unlock();
> 	if (curr != curr->d_parent) {
> 		/* dentry is deeper than ino_path */
> 		return false;
> 	}
> 	return true;
> }

I like this new version.

> 
> If you think this is not enough, can you suggest what would be needed?
> I'm thinking maybe we can check dentry seqcount to be safe, but from
> earlier discussion in "bpf path iterator" my impression is that that is
> VFS internal data - can we use it here (if needed)?
> 
> (I assume, from looking at the code, just having a reference does not
> prevent d_name from changing)
> 
>

next prev parent reply	other threads:[~2025-08-08  8:32 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-06 20:43 [RFC PATCH 0/6] fs/9p: Reuse inode based on path (in addition to qid) Tingmao Wang
2025-04-06 20:43 ` [RFC PATCH 1/6] fs/9p: Add ability to identify inode by path for .L Tingmao Wang
2025-07-05  0:19   ` Al Viro
2025-07-05  0:25   ` Al Viro
2025-07-11 19:11     ` Tingmao Wang
2025-08-08  8:32       ` Mickaël Salaün [this message]
2025-08-12 23:57         ` Tingmao Wang
2025-08-13  7:47           ` Mickaël Salaün
2025-07-11 19:11   ` Tingmao Wang
2025-04-06 20:43 ` [RFC PATCH 2/6] fs/9p: add default option for path-based inodes Tingmao Wang
2025-04-06 20:43 ` [RFC PATCH 3/6] fs/9p: Hide inodeident=path from show_options as it is the default Tingmao Wang
2025-04-06 20:43 ` [RFC PATCH 4/6] fs/9p: Add ability to identify inode by path for non-.L Tingmao Wang
2025-07-11 19:12   ` Tingmao Wang
2025-04-06 20:43 ` [RFC PATCH 5/6] fs/9p: .L: Refresh stale inodes on reuse Tingmao Wang
2025-04-06 20:43 ` [RFC PATCH 6/6] fs/9p: non-.L: " Tingmao Wang
2025-07-04 18:37 ` [RFC PATCH 0/6] fs/9p: Reuse inode based on path (in addition to qid) Mickaël Salaün
2025-08-08 10:27 ` Dominique Martinet
2025-08-08 10:52   ` Christian Schoenebeck
2025-08-12 23:53     ` Tingmao Wang
2025-08-13  5:30       ` Dominique Martinet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250808.oog4xee5Pee2@digikod.net \
    --to=mic@digikod.net \
    --cc=amir73il@gmail.com \
    --cc=asmadeus@codewreck.org \
    --cc=ericvh@kernel.org \
    --cc=gnoack@google.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=linux_oss@crudebyte.com \
    --cc=lucho@ionkov.net \
    --cc=m@maowtm.org \
    --cc=repnop@google.com \
    --cc=v9fs@lists.linux.dev \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).