public inbox for linux-security-module@vger.kernel.org
From: Al Viro <viro@zeniv.linux.org.uk>
To: alexjlzheng@gmail.com
Cc: paul@paul-moore.com, jmorris@namei.org, serge@hallyn.com,
	greg@kroah.com, chrisw@osdl.org,
	linux-security-module@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Jinliang Zheng <alexjlzheng@tencent.com>
Subject: Re: [PATCH v3] securityfs: fix missing of d_delete() in securityfs_remove()
Date: Fri, 9 May 2025 04:23:26 +0100
Message-ID: <20250509032326.GJ2023217@ZenIV>
In-Reply-To: <20250508140438.648533-2-alexjlzheng@tencent.com>

On Thu, May 08, 2025 at 10:04:39PM +0800, alexjlzheng@gmail.com wrote:

> In addition, securityfs_recursive_remove() avoids this problem by calling
> __d_drop() directly. As a non-recursive version, it is somewhat strange
> that securityfs_remove() does not clean up the deleted dentry.
> 
> Fix this by adding d_delete() in securityfs_remove().

This is not a fix.  First and foremost, securityfs_recursive_remove()
does *not* just call __d_drop() - it calls simple_recursive_removal(),
which takes care to evict anything possibly mounted on those suckers.

Your variant trivially turns into a mount leak - just bind anything
on that thing and trigger removal.

<a bit of a rant follows; if it offends somebody, feel free to report
to CoC committee>

What's more, securityfs object creation is... special.  It does, for
some odd reason, leave you with a dentry with refcount *two*.  For no
reason whatsoever, as far as I can tell.

securityfs_remove() matches that; securityfs_recursive_remove(),
as far as I can tell, should simply leak them.  That's from RTFS
alone, but I don't see how it could possibly *not* happen -
securityfs_create_file() is a call of securityfs_create_dentry(),
which
	* calls lookup_one_len(), getting a negative dentry with
refcount 1.
	* verifies it's negative
	* gets a new inode
	* does d_instantiate(), attaching it to dentry.
	* does dget(), for some unspeakable reason.  Refcount is 2 now.
	* returns that dentry to caller.

policyfs stuff calls securityfs_create_dir() (which is a wrapper for
securityfs_create_file(), with nothing extra done to refcounts),
then populates it with a bunch of files, all with the same refcount
weirdness.

Result: directory dentry with refcount 2 + number of children and
a bunch of children, each with refcount 2.

Now, securityfs_recursive_remove() calls simple_recursive_removal(),
which will strip _one_ off the refcount of each dentry in that tree.
Yes, they are all unhashed and any stuff mounted on them is kicked
out, but you have a massive dentry leak now - all of those dentries
have refcount at least 1.
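
To make the arithmetic above concrete, here is a minimal userspace sketch,
not kernel code.  Everything named toy_* is made up for illustration, and
so is the extra_dget switch; the model only tracks the counts described
above: one reference from the lookup, one from the extra dget(), one held
on the parent by each child, and a removal pass that drops exactly one
reference per dentry.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CHILDREN 16

/* Toy stand-in for a dentry: just enough state to follow the counts. */
struct toy_dentry {
	int refcount;
	int hashed;
	struct toy_dentry *parent;
};

/* Mirrors the creation steps listed above (hypothetical helper). */
static void toy_create(struct toy_dentry *d, struct toy_dentry *parent,
		       int extra_dget)
{
	d->refcount = 1;		/* lookup hands back refcount 1 */
	d->hashed = 1;
	d->parent = parent;
	if (parent)
		parent->refcount++;	/* each child pins its parent */
	if (extra_dget)
		d->refcount++;		/* the extra dget(): 2 now */
}

/* Drop one reference; a dentry that hits zero releases its parent. */
static void toy_put(struct toy_dentry *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0 && d->parent)
		d->parent->refcount--;
}

/* What a removal pass does per dentry in this model: unhash, drop one. */
static void toy_remove_one(struct toy_dentry *d)
{
	d->hashed = 0;
	toy_put(d);
}

/*
 * Build a directory with n_children files, remove the subtree once
 * (children first, then the directory) and return how many dentries
 * still have a nonzero refcount afterwards.
 */
static int toy_leaked_after_removal(int n_children, int extra_dget)
{
	struct toy_dentry dir, kids[TOY_MAX_CHILDREN];
	int i, leaked = 0;

	assert(n_children >= 0 && n_children <= TOY_MAX_CHILDREN);
	assert(extra_dget == 0 || extra_dget == 1);

	toy_create(&dir, NULL, extra_dget);
	for (i = 0; i < n_children; i++)
		toy_create(&kids[i], &dir, extra_dget);

	/* with the extra dget(): 2 + number of children */
	assert(dir.refcount == 1 + extra_dget + n_children);

	for (i = 0; i < n_children; i++)
		toy_remove_one(&kids[i]);
	toy_remove_one(&dir);

	for (i = 0; i < n_children; i++)
		leaked += kids[i].refcount > 0;
	leaked += dir.refcount > 0;
	return leaked;
}
```

In this model toy_leaked_after_removal(3, 1) returns 4 (the directory and
all three children stay pinned), while toy_leaked_after_removal(3, 0),
i.e. creation without the extra reference, returns 0.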

I'm not blaming securityfs_recursive_remove() authors - it *should*
have worked; their only fault is that they hadn't guessed that
object creation on securityfs happens to be that strange.

Another special snowflake is efi_secret_unlink() - it calls
securityfs_remove(), which is needed instead of simple_unlink()
since
	* that double refcount needs to be dropped
	* having internal mount pinned is something that needs
to be undone, innit?

Of course, it runs afoul of the parent being locked, but never mind that -
it just unlocks and relocks it, 'cuz what can go wrong?  That - instead
of discussing that with VFS and filesystem folks.

As for "what can go wrong"...  Consider what happens if another process
calls unlink() on the same file, just before the first one drops the
lock on parent.  Parent found, process 2 blocked on the lock.  Process 1
unlocks that lock and loses CPU.  Process 2 runs and tries to lock the
victim; blocks since process 1 is still holding it locked.  Process 1,
in securityfs_remove(): blocks trying to lock the parent.  AB-BA deadlock.
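
The interleaving above can be replayed step by step in a small userspace
sketch.  It is a deterministic toy model rather than real locking: the
toy_* names are hypothetical, "blocking" is represented by a failed
trylock, and the only thing taken from the description is the order in
which the two processes touch the parent and victim locks.

```c
#include <assert.h>

enum { NOBODY = 0, P1 = 1, P2 = 2 };

/* Toy lock: records only who holds it. */
struct toy_lock {
	int owner;
};

/* Take the lock if it is free; return 0 where a real lock would block. */
static int toy_trylock(struct toy_lock *l, int who)
{
	if (l->owner != NOBODY)
		return 0;
	l->owner = who;
	return 1;
}

static void toy_unlock(struct toy_lock *l, int who)
{
	assert(l->owner == who);
	l->owner = NOBODY;
}

/*
 * Replay the race.  Returns 1 if it ends with each process waiting
 * for a lock that the other one holds, 0 otherwise.
 */
static int toy_replay_unlink_race(void)
{
	struct toy_lock parent = { NOBODY }, victim = { NOBODY };
	int ok, p1_blocked, p2_blocked;

	/* Process 1 is in its unlink: parent and victim both locked. */
	ok = toy_trylock(&parent, P1);
	assert(ok);
	ok = toy_trylock(&victim, P1);
	assert(ok);

	/* Process 2 calls unlink() on the same file: blocked on parent. */
	ok = toy_trylock(&parent, P2);
	assert(!ok);

	/* Process 1 drops the parent lock and loses the CPU. */
	toy_unlock(&parent, P1);

	/* Process 2 runs: gets the parent, then wants the victim. */
	ok = toy_trylock(&parent, P2);
	assert(ok);
	p2_blocked = !toy_trylock(&victim, P2);

	/* Process 1 resumes and tries to relock the parent. */
	p1_blocked = !toy_trylock(&parent, P1);

	/* P1 holds victim, wants parent; P2 holds parent, wants victim. */
	return p1_blocked && p2_blocked &&
	       parent.owner == P2 && victim.owner == P1;
}
```

toy_replay_unlink_race() returns 1: neither side can make progress, since
each is waiting on the lock the other holds.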

Oh, well...

Anyway, the reasons for securityfs_remove() use there are real deficiencies
of securityfs.  Weird shit with refcounts is one thing; internal mount
pinning is a bit more subtle, but it's also solvable.

The thing is, objects on securityfs never change parents.  So you only
need to pin for subdirectories of root - everything deeper will be
automatically fine.  And that kills the second reason for those games.
With that dealt with, efi_secret_unlink() can simply call simple_unlink()
instead of those games.
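
A rough userspace sketch of the "pin only for objects directly in root"
idea follows.  It is an illustration under assumptions, not the actual
change: the toy_* names are hypothetical, toy_pins stands in for the pin
count on the internal mount, and the model relies on what is said above
(objects never change parents) plus the usual rule that an object keeps
its parent alive.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 8

struct toy_node {
	struct toy_node *parent;	/* NULL for the filesystem root */
};

static int toy_pins;	/* stand-in for the internal mount's pin count */

/* True only for objects whose parent is the root itself. */
static int toy_is_in_root(const struct toy_node *n)
{
	return n->parent != NULL && n->parent->parent == NULL;
}

static void toy_create(struct toy_node *n, struct toy_node *parent)
{
	n->parent = parent;
	if (toy_is_in_root(n))
		toy_pins++;	/* deeper objects take no pin of their own */
}

static void toy_remove(struct toy_node *n)
{
	if (toy_is_in_root(n))
		toy_pins--;
}

/*
 * Create a chain root/a/b/... of the given depth, then tear it down
 * from the deepest object upwards.  Returns the number of pins held
 * while the whole chain exists.
 */
static int toy_pins_for_chain(int depth)
{
	struct toy_node root = { NULL }, chain[TOY_MAX_DEPTH];
	struct toy_node *parent = &root;
	int i, pins_when_full;

	assert(depth >= 0 && depth <= TOY_MAX_DEPTH);
	toy_pins = 0;

	for (i = 0; i < depth; i++) {
		toy_create(&chain[i], parent);
		parent = &chain[i];
	}
	pins_when_full = toy_pins;

	/* Removing anything below the top-level object changes nothing. */
	for (i = depth - 1; i > 0; i--) {
		toy_remove(&chain[i]);
		assert(toy_pins == pins_when_full);
	}
	/* The pin goes away only with the object directly in root. */
	if (depth > 0)
		toy_remove(&chain[0]);
	assert(toy_pins == 0);

	return pins_when_full;
}
```

However deep the chain is, at most one pin is held in this model, and it
is released only when the object sitting directly in root goes away; a
removal deeper in the tree never has to touch the pin.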

After that securityfs_remove() can become an alias for
securityfs_recursive_remove() (or the other way round, preferably).

BTW,
        d_inode(dent)->i_op = &efi_secret_dir_inode_operations;
in the same drivers/virt/coco is also nasty - you don't change the method
table on an object that is already exposed in shared data structures.
Basic multithreaded programming safety rules...  Yes, _that_ probably runs
too early in the boot for anything to hit it, so it's not a security hole,
but the same "what if somebody copies that code and gets screwed" applies
there...  If anything, that points to the need for a securityfs_create_dir()
variant that would override ->i_op, which should've been discussed back
when the thing had been merged.

</rant>

I have fixes for some of that crap done on top of tree-in-dcache series;
give me an hour or two and I'll separate those and rebase to mainline...
