public inbox for linux-security-module@vger.kernel.org
 help / color / mirror / Atom feed
From: Chris Mason <clm@meta.com>
To: NeilBrown <neil@brown.name>
Cc: Christian Brauner <brauner@kernel.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	David Howells <dhowells@redhat.com>, Jan Kara <jack@suse.cz>,
	Chuck Lever <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Amir Goldstein <amir73il@gmail.com>,
	John Johansen <john.johansen@canonical.com>,
	Paul Moore <paul@paul-moore.com>,
	James Morris <jmorris@namei.org>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Stephen Smalley <stephen.smalley.work@gmail.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	<linux-kernel@vger.kernel.org>, <netfs@lists.linux.dev>,
	<linux-fsdevel@vger.kernel.org>, <linux-nfs@vger.kernel.org>,
	<linux-unionfs@vger.kernel.org>, <apparmor@lists.ubuntu.com>,
	<linux-security-module@vger.kernel.org>,
	<selinux@vger.kernel.org>
Subject: Re: [PATCH v2 09/15] ovl: Simplify ovl_lookup_real_one()
Date: Mon, 23 Feb 2026 05:13:37 -0800	[thread overview]
Message-ID: <20260223132027.4165509-1-clm@meta.com> (raw)
In-Reply-To: <20260223011210.3853517-10-neilb@ownmail.net>

NeilBrown <neilb@ownmail.net> wrote:
> From: NeilBrown <neil@brown.name>
> 
> The primary purpose of this patch is to remove the locking from
> ovl_lookup_real_one() as part of centralising all locking of directories
> for name operations.
> 
> The locking here isn't needed.  By performing consistency tests after
> the lookup we can be sure that the result of the lookup was valid at
> least for a moment, which is all the original code promised.
> 
> lookup_noperm_unlocked() is used for the lookup and it will take the
> lock if needed only where it is needed.
> 
> Also:
>  - don't take a reference to real->d_parent.  The parent is
>    only use for a pointer comparison, and no reference is needed for
>    that.
>  - Several "if" statements have a "goto" followed by "else" - the
>    else isn't needed: the following statement can directly follow
>    the "if" as a new statement
>  - Use a consistent pattern of setting "err" before performing a test
>    and possibly going to "fail".
>  - remove the "out" label (now that we don't need to dput(parent) or
>    unlock) and simply return from fail:.

Hi everyone,

Amir asked me to run these through, and this commit was flagged.  I recently
changed the prompts to include some vfs knowledge:

https://github.com/masoncl/review-prompts/blob/main/kernel/subsystem/vfs.md

And I also put in some words to make it prefer reporting problems that might
not be bugs, but that do violate some rule in the .md files.  This might
be a false positive from that new shift toward blindly trusting the guides,
but I'll let everyone here decide:

commit d1bfcef048896cdc9e4c57e3249b6ad344d64c02
Author: NeilBrown <neil@brown.name>

ovl: Simplify ovl_lookup_real_one()

This patch removes locking from ovl_lookup_real_one() and reorganizes the
control flow. The function now uses lookup_noperm_unlocked() instead of
lookup_noperm() and performs consistency validation after the lookup rather
than holding a lock during the operation.

> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> index 83f80fdb15674..b448fc9424b64 100644
> --- a/fs/overlayfs/export.c
> +++ b/fs/overlayfs/export.c
> @@ -349,69 +349,64 @@ static struct dentry *ovl_dentry_real_at(struct dentry *dentry, int idx)
>  	return NULL;
>  }
>
> -/*
> - * Lookup a child overlay dentry to get a connected overlay dentry whose real
> - * dentry is @real. If @real is on upper layer, we lookup a child overlay
> - * dentry with the same name as the real dentry. Otherwise, we need to consult
> - * index for lookup.
> +/**
> + * ovl_lookup_real_one -  Lookup a child overlay dentry to get an overlay dentry whose real dentry is given
> + * @connected: parent overlay dentry
> + * @real: given child real dentry
> + * @layer: layer in which @real exists
> + *
> + *
> + * Lookup a child overlay dentry in @connected with the same name as the @real
> + * dentry.  Then check that the parent of the result is the real dentry for
> + * @connected, and @real is the real dentry for the result.
> + *
> + * Returns:
> + *   %-ECHILD if the parent of @real is no longer the real dentry for @connected.
> + *   %-ESTALE if @real is no the real dentry of the found dentry.
                                       ^^
There's a typo here: "is no the" should be "is not the".

> + *   Otherwise the found dentry is returned.
>   */
>  static struct dentry *ovl_lookup_real_one(struct dentry *connected,
>  					  struct dentry *real,
>  					  const struct ovl_layer *layer)
>  {
> -	struct inode *dir = d_inode(connected);
> -	struct dentry *this, *parent = NULL;
> +	struct dentry *this;
>  	struct name_snapshot name;
>  	int err;
>
>  	/*
> -	 * Lookup child overlay dentry by real name. The dir mutex protects us
> -	 * from racing with overlay rename. If the overlay dentry that is above
> -	 * real has already been moved to a parent that is not under the
> -	 * connected overlay dir, we return -ECHILD and restart the lookup of
> -	 * connected real path from the top.
> -	 */
> -	inode_lock_nested(dir, I_MUTEX_PARENT);
> -	err = -ECHILD;
> -	parent = dget_parent(real);
> -	if (ovl_dentry_real_at(connected, layer->idx) != parent)
> -		goto fail;
> -
> -	/*
> -	 * We also need to take a snapshot of real dentry name to protect us
> +	 * We need to take a snapshot of real dentry name to protect us
>  	 * from racing with underlying layer rename. In this case, we don't
>  	 * care about returning ESTALE, only from dereferencing a free name
>  	 * pointer because we hold no lock on the real dentry.
>  	 */
>  	take_dentry_name_snapshot(&name, real);
> -	/*
> -	 * No idmap handling here: it's an internal lookup.
> -	 */
> -	this = lookup_noperm(&name.name, connected);
> +	this = lookup_noperm_unlocked(&name.name, connected);
>  	release_dentry_name_snapshot(&name);
> +
> +	err = -ECHILD;
> +	if (ovl_dentry_real_at(connected, layer->idx) != real->d_parent)
> +		goto fail;
> +

A subsystem pattern flags this as potentially concerning: the VFS subsystem
guide indicates that TOCTOU validation checking d_parent should also check
d_unhashed() because a dentry can be unhashed (removed from dcache by
concurrent operations) while its d_parent pointer remains unchanged.

The old code held inode_lock_nested(dir, I_MUTEX_PARENT) during the entire
lookup and validation, which serialized with operations that could unhash the
dentry. The new lockless pattern creates a window where:

1. lookup_noperm_unlocked() completes and returns dentry `this`
2. A concurrent unlink/remove operation unhashes `this`
3. The d_parent validation passes (parent pointer unchanged)
4. The function returns an unhashed dentry

Other overlayfs code uses the complete pattern. For example, in
ovl_lock_rename_workdir() at fs/overlayfs/util.c:

    if (work->d_parent != workdir || d_unhashed(work)) {
        ...
    }

Could this validation miss a race where the dentry is unhashed but d_parent
still matches? The subsequent checks (!this->d_inode for -ENOENT,
wrong real dentry for -ESTALE) might catch some cases, but is that
sufficient for all the scenarios where an unhashed dentry could cause
problems?

>  	err = PTR_ERR(this);
> -	if (IS_ERR(this)) {
> +	if (IS_ERR(this))
>  		goto fail;
> -	} else if (!this || !this->d_inode) {
> -		dput(this);
> -		err = -ENOENT;
> +
> +	err = -ENOENT;
> +	if (!this || !this->d_inode)
>  		goto fail;
> -	} else if (ovl_dentry_real_at(this, layer->idx) != real) {
> -		dput(this);
> -		err = -ESTALE;
> +
> +	err = -ESTALE;
> +	if (ovl_dentry_real_at(this, layer->idx) != real)
>  		goto fail;
> -	}
>
> -out:
> -	dput(parent);
> -	inode_unlock(dir);
>  	return this;
>
>  fail:
>  	pr_warn_ratelimited("failed to lookup one by real (%pd2, layer=%d, connected=%pd2, err=%i)\n",
>  			    real, layer->idx, connected, err);
> -	this = ERR_PTR(err);
> -	goto out;
> +	if (!IS_ERR(this))
> +		dput(this);
> +	return ERR_PTR(err);
>  }
>
>  static struct dentry *ovl_lookup_real(struct super_block *sb,


  parent reply	other threads:[~2026-02-23 13:21 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-23  1:06 [PATCH v2 00/15] Further centralising of directory locking for name ops NeilBrown
2026-02-23  1:06 ` [PATCH v2 01/15] VFS: note error returns is documentation for various lookup functions NeilBrown
2026-02-23 13:52   ` Chris Mason
2026-02-23 22:04     ` NeilBrown
2026-02-23  1:06 ` [PATCH v2 02/15] fs/proc: Don't lock root inode when creating "self" and "thread-self" NeilBrown
2026-02-23  1:06 ` [PATCH v2 03/15] VFS: move the start_dirop() kerndoc comment to before start_dirop() NeilBrown
2026-02-23  1:06 ` [PATCH v2 04/15] libfs: change simple_done_creating() to use end_creating() NeilBrown
2026-02-23  1:06 ` [PATCH v2 05/15] Apparmor: Use simple_start_creating() / simple_done_creating() NeilBrown
2026-02-23  1:06 ` [PATCH v2 06/15] selinux: " NeilBrown
2026-02-23 13:24   ` Chris Mason
2026-02-23 17:31     ` Paul Moore
2026-02-23 22:07       ` NeilBrown
2026-02-23  1:06 ` [PATCH v2 07/15] nfsd: switch purge_old() to use start_removing_noperm() NeilBrown
2026-02-23  1:06 ` [PATCH v2 08/15] VFS: make lookup_one_qstr_excl() static NeilBrown
2026-02-23  1:06 ` [PATCH v2 09/15] ovl: Simplify ovl_lookup_real_one() NeilBrown
2026-02-23  9:49   ` Amir Goldstein
2026-02-23 13:13   ` Chris Mason [this message]
2026-02-23 13:42     ` Amir Goldstein
2026-02-23 22:21       ` NeilBrown
2026-02-23  1:06 ` [PATCH v2 10/15] cachefiles: change cachefiles_bury_object to use start_renaming_dentry() NeilBrown
2026-02-23  1:06 ` [PATCH v2 11/15] ovl: pass name buffer to ovl_start_creating_temp() NeilBrown
2026-02-23  9:35   ` Amir Goldstein
2026-02-23  1:06 ` [PATCH v2 12/15] ovl: change ovl_create_real() to get a new lock when re-opening created file NeilBrown
2026-02-23  9:39   ` Amir Goldstein
2026-02-23 13:23   ` Chris Mason
2026-02-23 22:30     ` NeilBrown
2026-02-24  9:20     ` Christian Brauner
2026-02-23  1:06 ` [PATCH v2 13/15] ovl: use is_subdir() for testing if one thing is a subdir of another NeilBrown
2026-02-23  1:06 ` [PATCH v2 14/15] ovl: remove ovl_lock_rename_workdir() NeilBrown
2026-02-23  1:06 ` [PATCH v2 15/15] VFS: unexport lock_rename(), lock_rename_child(), unlock_rename() NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260223132027.4165509-1-clm@meta.com \
    --to=clm@meta.com \
    --cc=amir73il@gmail.com \
    --cc=apparmor@lists.ubuntu.com \
    --cc=brauner@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=dhowells@redhat.com \
    --cc=djwong@kernel.org \
    --cc=jack@suse.cz \
    --cc=jlayton@kernel.org \
    --cc=jmorris@namei.org \
    --cc=john.johansen@canonical.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=neil@brown.name \
    --cc=netfs@lists.linux.dev \
    --cc=paul@paul-moore.com \
    --cc=selinux@vger.kernel.org \
    --cc=serge@hallyn.com \
    --cc=stephen.smalley.work@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox