Re: [RFC PATCH 1/3] landlock: walk parent dir without taking references

Linux Security Modules development

From: Tingmao Wang <m@maowtm.org>
To: "Al Viro" <viro@zeniv.linux.org.uk>, "Mickaël Salaün" <mic@digikod.net>
Cc: "Song Liu" <song@kernel.org>, "Günther Noack" <gnoack@google.com>,
	"Jan Kara" <jack@suse.cz>,
	"Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
	"Christian Brauner" <brauner@kernel.org>,
	linux-security-module@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 1/3] landlock: walk parent dir without taking references
Date: Wed, 4 Jun 2025 22:05:01 +0100
Message-ID: <5c8476df-56c4-4dd1-b5c8-40cb604eae62@maowtm.org>
In-Reply-To: <20250604.ciecheo7EeNg@digikod.net>

On 6/4/25 18:15, Mickaël Salaün wrote:
> On Wed, Jun 04, 2025 at 01:45:43AM +0100, Tingmao Wang wrote:
>> [..]
>> @@ -897,10 +898,14 @@ static bool is_access_to_paths_allowed(
>>  			break;
>>  jump_up:
>>  		if (walker_path.dentry == walker_path.mnt->mnt_root) {
>> +			/* follow_up gets the parent and puts the passed in path */
>> +			path_get(&walker_path);
>>  			if (follow_up(&walker_path)) {
>> +				path_put(&walker_path);
>
> path_put() cannot be safely called in a RCU read-side critical section
> because it can free memory which can sleep, and also because it can wait
> for a lock.  However, we can call rcu_read_unlock() before and
> rcu_read_lock() after (if we hold a reference).

Thanks for pointing this out.

Actually I think this might be even trickier.  I'm not sure we can always
rely on the dentry still being there after rcu_read_unlock(), regardless
of whether we do a path_get() before unlocking...  Even when we're inside
an RCU read-side critical section, my understanding is that if a dentry
reaches zero refcount and is selected to be freed (either immediately or
by LRU) from another CPU, dentry_free() will do
call_rcu(&dentry->d_u.d_rcu, __d_free), so the dentry can be freed as soon
as the grace period ends, i.e. potentially right after our
rcu_read_unlock(), regardless of whether we had a path_get() before that.

In fact, because lockref_mark_dead() sets the refcount to a negative
value, a plain path_get() on such a dentry would simply be wrong.  We
could use lockref_get_not_dead() instead, and only continue if we actually
acquired a reference, but then we have the problem of not being able to
dput() the dentry acquired by follow_up() without risking it getting
killed before we can enter RCU again (although I do wonder if it's
possible for it to be killed, given that there is an active mountpoint on
it that we hold a reference for?).
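
To make the difference concrete, here is a small userspace model of the
two ways of taking a reference.  This is only a sketch: the model_* names
are made up for illustration, the kernel's struct lockref also carries a
spinlock that is left out here, and the -128 value mirrors what
lockref_mark_dead() stores as far as I can tell.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the count half of struct lockref. */
struct model_lockref {
	atomic_int count;
};

/* Models lockref_mark_dead(): a negative count means "being torn down". */
void model_mark_dead(struct model_lockref *ref)
{
	atomic_store(&ref->count, -128);
}

/*
 * Models lockref_get_not_dead(): take a reference only if the object has
 * not been marked dead.  Returns true if a reference was acquired.
 */
bool model_get_not_dead(struct model_lockref *ref)
{
	int old = atomic_load(&ref->count);

	while (old >= 0) {
		/* On failure, old is reloaded and the dead check runs again. */
		if (atomic_compare_exchange_weak(&ref->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Models an unconditional get: it increments without looking. */
void model_get_unconditional(struct model_lockref *ref)
{
	atomic_fetch_add(&ref->count, 1);
}
```

In this model the unconditional get on a dead object just turns -128 into
-127 and "succeeds", while the conditional one reports the failure so the
caller can bail out or restart.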

While we could probably do something like "defer the dput() until we next
reach a mountpoint and can rcu_read_unlock()", or use lockref_put_return()
and assert that the dentry must still have refcount > 0 since it's an
in-use mountpoint, after a lot of thinking it seems to me that the only
clean solution is a mechanism for walking up mounts completely
reference-free.  Maybe what we actually need is choose_mountpoint_rcu().

That function is private, so I guess a question for Al and other VFS
people here is: can we potentially expose an equivalent publicly?
(Perhaps it would effectively do only what __prepend_path() in d_path.c
does, and we can track the mount_lock seqcount outside.  Also, the fact
that throughout all this we hold a valid reference to the leaf dentry we
started from should, to me, mean that the mount can't disappear under us
anyway.)
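
As a rough illustration of "track the seqcount outside", below is a
userspace model of a reference-free upward walk that is validated against
a sequence counter afterwards.  Everything here is hypothetical (the
model_* names, the toy node type); it is not the mount_lock API, and the
begin helper only imitates the non-waiting style of raw_seqcount_begin()
by masking off the low bit.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a seqcount: odd means a write is in progress. */
struct model_seqcount {
	atomic_uint seq;
};

/* Toy tree node; parent is atomic because a writer may re-parent it. */
struct model_node {
	struct model_node *_Atomic parent;
};

/* Non-waiting begin: an in-progress write makes the later retry check fail. */
unsigned int model_read_begin(struct model_seqcount *s)
{
	return atomic_load(&s->seq) & ~1u;
}

/* True if any write started or finished since model_read_begin(). */
bool model_read_retry(struct model_seqcount *s, unsigned int start)
{
	return atomic_load(&s->seq) != start;
}

void model_write_begin(struct model_seqcount *s)
{
	atomic_fetch_add(&s->seq, 1);
}

void model_write_end(struct model_seqcount *s)
{
	atomic_fetch_add(&s->seq, 1);
}

/*
 * Walks from leaf to the topmost node without taking any references.
 * Returns false if a writer raced with the walk, in which case *root_out
 * is left untouched and the caller has to restart or fall back to a
 * reference-taking walk.
 */
bool model_walk_up(struct model_seqcount *s, struct model_node *leaf,
		   struct model_node **root_out)
{
	unsigned int start = model_read_begin(s);
	struct model_node *node = leaf;
	struct model_node *parent;

	while ((parent = atomic_load(&node->parent)) != NULL)
		node = parent;

	if (model_read_retry(s, start))
		return false;

	*root_out = node;
	return true;
}
```

The point of the model is only that the result of the walk is not trusted
until the counter has been rechecked; what keeps the memory itself valid
during the walk (RCU in the kernel case) is outside what it shows.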

>
>>  				/* Ignores hidden mount points. */
>>  				goto jump_up;
>>  			} else {
>> +				path_put(&walker_path);
>>  				/*
>>  				 * Stops at the real root.  Denies access
>>  				 * because not all layers have granted access.
>> @@ -920,11 +925,11 @@ static bool is_access_to_paths_allowed(
>>  			}
>>  			break;
>>  		}
>> -		parent_dentry = dget_parent(walker_path.dentry);
>> -		dput(walker_path.dentry);
>> +		parent_dentry = walker_path.dentry->d_parent;
>>  		walker_path.dentry = parent_dentry;
>>  	}
>> -	path_put(&walker_path);
>> +
>> +	rcu_read_unlock();
>>
>>  	if (!allowed_parent1) {
>>  		log_request_parent1->type = LANDLOCK_REQUEST_FS_ACCESS;
>> @@ -1045,12 +1050,11 @@ static bool collect_domain_accesses(
>>  					       layer_masks_dom,
>>  					       LANDLOCK_KEY_INODE);
>>
>> -	dget(dir);
>> -	while (true) {
>> -		struct dentry *parent_dentry;
>> +	rcu_read_lock();
>>
>> +	while (true) {
>>  		/* Gets all layers allowing all domain accesses. */
>> -		if (landlock_unmask_layers(find_rule(domain, dir), access_dom,
>> +		if (landlock_unmask_layers(find_rule_rcu(domain, dir), access_dom,
>>  					   layer_masks_dom,
>>  					   ARRAY_SIZE(*layer_masks_dom))) {
>>  			/*
>> @@ -1065,11 +1069,11 @@ static bool collect_domain_accesses(
>>  		if (dir == mnt_root || WARN_ON_ONCE(IS_ROOT(dir)))
>>  			break;
>>
>> -		parent_dentry = dget_parent(dir);
>> -		dput(dir);
>> -		dir = parent_dentry;
>> +		dir = dir->d_parent;
>>  	}
>> -	dput(dir);
>> +
>> +	rcu_read_unlock();
>> +
>>  	return ret;
>>  }
>>
>> --
>> 2.49.0
>>
>>
