* [PATCH] efivarfs: fix NULL dereference on resume
@ 2025-03-18 3:06 James Bottomley
2025-03-18 3:37 ` Al Viro
0 siblings, 1 reply; 7+ messages in thread
From: James Bottomley @ 2025-03-18 3:06 UTC (permalink / raw)
To: linux-efi, linux-fsdevel
Cc: Ard Biesheuvel, Ryan Lee, Malte Schröder, Christian Brauner,
Al Viro
LSMs often inspect the path.mnt of files in the security hooks which
causes a NULL deref in efivarfs_pm_notify because the path is
constructed with a NULL path.mnt. Fix by obtaining from
vfs_kern_mount() instead and being very careful to ensure that
deactivate_super() (potentially caused by a racing userspace umount) is
not called directly from the notifier because it would deadlock when
efivarfs_kill_sb tried to unregister the notifier chain.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---
fs/efivarfs/super.c | 50 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 48 insertions(+), 2 deletions(-)
diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c
index 6eae8cf655c1..81b3c6b7e100 100644
--- a/fs/efivarfs/super.c
+++ b/fs/efivarfs/super.c
@@ -474,12 +474,25 @@ static int efivarfs_check_missing(efi_char16_t *name16, efi_guid_t vendor,
return err;
}
+static void efivarfs_deactivate_super_work(struct work_struct *work)
+{
+ struct super_block *s = container_of(work, struct super_block,
+ destroy_work);
+ /*
+ * note: here s->destroy_work is free for reuse (which
+ * will happen in deactivate_super)
+ */
+ deactivate_super(s);
+}
+
+static struct file_system_type efivarfs_type;
+
static int efivarfs_pm_notify(struct notifier_block *nb, unsigned long action,
void *ptr)
{
struct efivarfs_fs_info *sfi = container_of(nb, struct efivarfs_fs_info,
pm_nb);
- struct path path = { .mnt = NULL, .dentry = sfi->sb->s_root, };
+ struct path path;
struct efivarfs_ctx ectx = {
.ctx = {
.actor = efivarfs_actor,
@@ -487,6 +500,7 @@ static int efivarfs_pm_notify(struct notifier_block *nb, unsigned long action,
.sb = sfi->sb,
};
struct file *file;
+ struct super_block *s = sfi->sb;
static bool rescan_done = true;
if (action == PM_HIBERNATION_PREPARE) {
@@ -499,11 +513,43 @@ static int efivarfs_pm_notify(struct notifier_block *nb, unsigned long action,
if (rescan_done)
return NOTIFY_DONE;
+ /* ensure single superblock is alive and pin it */
+ if (!atomic_inc_not_zero(&s->s_active))
+ return NOTIFY_DONE;
+
pr_info("efivarfs: resyncing variable state\n");
- /* O_NOATIME is required to prevent oops on NULL mnt */
+ path.dentry = sfi->sb->s_root;
+
+ /*
+ * do not add SB_KERNMOUNT which a single superblock could
+ * expose to userspace and which also causes MNT_INTERNAL, see
+ * below
+ */
+ path.mnt = vfs_kern_mount(&efivarfs_type, 0,
+ efivarfs_type.name, NULL);
+ if (IS_ERR(path.mnt)) {
+ pr_err("efivarfs: internal mount failed\n");
+ /*
+ * We may be the last pinner of the superblock but
+ * calling efivarfs_kill_sb from within the notifier
+ * here would deadlock trying to unregister it
+ */
+ INIT_WORK(&s->destroy_work, efivarfs_deactivate_super_work);
+ schedule_work(&s->destroy_work);
+ return PTR_ERR(path.mnt);
+ }
+
+ /* path.mnt now has pin on superblock, so this must be above one */
+ atomic_dec(&s->s_active);
+
file = kernel_file_open(&path, O_RDONLY | O_DIRECTORY | O_NOATIME,
current_cred());
+ /*
+ * safe even if last put because no MNT_INTERNAL means this
+ * will do delayed deactivate_super and not deadlock
+ */
+ mntput(path.mnt);
if (IS_ERR(file))
return NOTIFY_DONE;
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH] efivarfs: fix NULL dereference on resume
2025-03-18 3:06 [PATCH] efivarfs: fix NULL dereference on resume James Bottomley
@ 2025-03-18 3:37 ` Al Viro
2025-03-18 7:04 ` Ard Biesheuvel
0 siblings, 1 reply; 7+ messages in thread
From: Al Viro @ 2025-03-18 3:37 UTC (permalink / raw)
To: James Bottomley
Cc: linux-efi, linux-fsdevel, Ard Biesheuvel, Ryan Lee,
Malte Schröder, Christian Brauner
On Mon, Mar 17, 2025 at 11:06:01PM -0400, James Bottomley wrote:
> + /* ensure single superblock is alive and pin it */
> + if (!atomic_inc_not_zero(&s->s_active))
> + return NOTIFY_DONE;
> +
> pr_info("efivarfs: resyncing variable state\n");
>
> - /* O_NOATIME is required to prevent oops on NULL mnt */
> + path.dentry = sfi->sb->s_root;
> +
> + /*
> + * do not add SB_KERNMOUNT which a single superblock could
> + * expose to userspace and which also causes MNT_INTERNAL, see
> + * below
> + */
> + path.mnt = vfs_kern_mount(&efivarfs_type, 0,
> + efivarfs_type.name, NULL);
Umm... That's probably safe, but not as a long-term solution -
it's too intimately dependent upon fs/super.c internals.
The reasons why you can't run into ->s_umount deadlock here
are non-trivial...
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] efivarfs: fix NULL dereference on resume
2025-03-18 3:37 ` Al Viro
@ 2025-03-18 7:04 ` Ard Biesheuvel
2025-03-18 7:49 ` Al Viro
0 siblings, 1 reply; 7+ messages in thread
From: Ard Biesheuvel @ 2025-03-18 7:04 UTC (permalink / raw)
To: Al Viro
Cc: James Bottomley, linux-efi, linux-fsdevel, Ryan Lee,
Malte Schröder, Christian Brauner
On Tue, 18 Mar 2025 at 04:37, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Mon, Mar 17, 2025 at 11:06:01PM -0400, James Bottomley wrote:
>
> > + /* ensure single superblock is alive and pin it */
> > + if (!atomic_inc_not_zero(&s->s_active))
> > + return NOTIFY_DONE;
> > +
> > pr_info("efivarfs: resyncing variable state\n");
> >
> > - /* O_NOATIME is required to prevent oops on NULL mnt */
> > + path.dentry = sfi->sb->s_root;
> > +
> > + /*
> > + * do not add SB_KERNMOUNT which a single superblock could
> > + * expose to userspace and which also causes MNT_INTERNAL, see
> > + * below
> > + */
> > + path.mnt = vfs_kern_mount(&efivarfs_type, 0,
> > + efivarfs_type.name, NULL);
>
> Umm... That's probably safe, but not as a long-term solution -
> it's too intimately dependent upon fs/super.c internals.
> The reasons why you can't run into ->s_umount deadlock here
> are non-trivial...
Thanks - I'll incorporate this observation in the commit log, if you don't mind.
To me, it seems rather counter-intuitive that we need a second mount
in order to be able to implement this. Synchronizing the efivarfs
contents with the backing store after an event that may have modified
the latter is only needed when it is mounted to begin with, and as a
VFS non-expert, I struggle to understand why it is a) ok and b)
preferred to create a new mount to pass to kernel_file_open(). Could
we add a paragraph to the commit log that explains this?
But if the VFS experts agree that this is a reasonable band-aid for
the time being, I will take the changes themselves as they are. I
intend to send this out asap as a fix for the v6.14 release.
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] efivarfs: fix NULL dereference on resume
2025-03-18 7:04 ` Ard Biesheuvel
@ 2025-03-18 7:49 ` Al Viro
2025-03-18 12:15 ` James Bottomley
2025-03-18 14:13 ` Christian Brauner
0 siblings, 2 replies; 7+ messages in thread
From: Al Viro @ 2025-03-18 7:49 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: James Bottomley, linux-efi, linux-fsdevel, Ryan Lee,
Malte Schröder, Christian Brauner
On Tue, Mar 18, 2025 at 08:04:59AM +0100, Ard Biesheuvel wrote:
> the latter is only needed when it is mounted to begin with, and as a
> VFS non-expert, I struggle to understand why it is a) ok and b)
> preferred to create a new mount to pass to kernel_file_open(). Could
> we add a paragraph to the commit log that explains this?
I'm not at all convinced that iterate_dir() is the right thing to use
there, but *IF* we go that way, yes, we need a reference to struct
mount. We are not going to introduce a very special kind of struct
file, along with the arseloads of checking for that crap all over the
place - not for the sake of one weird case in one weird filesystem.
file->f_path is a valid struct path, which means that ->mnt->mnt_sb == ->dentry->d_sb
and refcount of ->mnt is positive as long as struct file exists.
Keeping a persistent internal struct mount is, of course, possible,
but it will make the damn thing impossible to rmmod, etc. - it will
remain in place until the reboot.
It might be possible to put together something like "grab a reference
to superblock and allocate a temporary struct mount refering to it"
(which is what that vfs_kern_mount() boils down to). But I would
very much prefer to have it go over the list of children of ->s_root
manually, instead of playing silly buggers with iterate_dir().
And yes, it would require exclusion with dissolving dentry tree on
umount, for obvious reasons. Which might be done with ->s_active
or simply by unregistering that notifier chain as the very first step
in ->kill_sb() there.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] efivarfs: fix NULL dereference on resume
2025-03-18 7:49 ` Al Viro
@ 2025-03-18 12:15 ` James Bottomley
2025-03-18 14:13 ` Christian Brauner
1 sibling, 0 replies; 7+ messages in thread
From: James Bottomley @ 2025-03-18 12:15 UTC (permalink / raw)
To: Al Viro, Ard Biesheuvel
Cc: linux-efi, linux-fsdevel, Ryan Lee, Malte Schröder,
Christian Brauner
On Tue, 2025-03-18 at 07:49 +0000, Al Viro wrote:
> On Tue, Mar 18, 2025 at 08:04:59AM +0100, Ard Biesheuvel wrote:
>
> > the latter is only needed when it is mounted to begin with, and as
> > a VFS non-expert, I struggle to understand why it is a) ok and b)
> > preferred to create a new mount to pass to kernel_file_open().
> > Could we add a paragraph to the commit log that explains this?
>
> I'm not at all convinced that iterate_dir() is the right thing to use
> there, but *IF* we go that way, yes, we need a reference to struct
> mount. We are not going to introduce a very special kind of struct
> file, along with the arseloads of checking for that crap all over the
> place - not for the sake of one weird case in one weird filesystem.
>
> file->f_path is a valid struct path, which means that ->mnt->mnt_sb
> == ->dentry->d_sb and refcount of ->mnt is positive as long as struct
> file exists.
>
> Keeping a persistent internal struct mount is, of course, possible,
> but it will make the damn thing impossible to rmmod, etc. - it will
> remain in place until the reboot.
Right, that's the configfs problem and we do want the module to be
removable (and not occupying memory) when it's not mounted in
userspace.
> It might be possible to put together something like "grab a reference
> to superblock and allocate a temporary struct mount refering to it"
> (which is what that vfs_kern_mount() boils down to). But I would
> very much prefer to have it go over the list of children of ->s_root
> manually, instead of playing silly buggers with iterate_dir().
I did think of that ... how hard, after all can it be just to traverse
all the single level children (we never do subdirectories in efivarfs).
However, doing that seemed to involve replicating the whole of
libfs.c:dcache_readdir() and scan_positives() plus the cursor
allocation it uses is currently marked as an internal interface. If
there's a simpler way of doing it, I'm all ears, but the code in
libfs.c is subtle and complex and should probably stay there.
So I think the only route to this would be to extract most of the guts
of dcache_readdir() into a helper function that takes a callback where
it currently does dir_emit() and expose that for filesystem use. Is
that a route you'd like me to investigate?
> And yes, it would require exclusion with dissolving dentry tree on
> umount, for obvious reasons. Which might be done with ->s_active
> or simply by unregistering that notifier chain as the very first step
> in ->kill_sb() there.
I can move it above kill_litter_super(), sure. However, the check on
s_active at the top of efivarfs_pm_notify() should ensure nothing
dissolves the tree until we're finished.
Regards,
James
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] efivarfs: fix NULL dereference on resume
2025-03-18 7:49 ` Al Viro
2025-03-18 12:15 ` James Bottomley
@ 2025-03-18 14:13 ` Christian Brauner
2025-03-18 14:52 ` James Bottomley
1 sibling, 1 reply; 7+ messages in thread
From: Christian Brauner @ 2025-03-18 14:13 UTC (permalink / raw)
To: Al Viro
Cc: Ard Biesheuvel, James Bottomley, linux-efi, linux-fsdevel,
Ryan Lee, Malte Schröder
> (which is what that vfs_kern_mount() boils down to). But I would
> very much prefer to have it go over the list of children of ->s_root
> manually, instead of playing silly buggers with iterate_dir().
This is what I suggested earlier in the thread to rewrite it to not rely
on files. That's probably the full-on fix we want later.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] efivarfs: fix NULL dereference on resume
2025-03-18 14:13 ` Christian Brauner
@ 2025-03-18 14:52 ` James Bottomley
0 siblings, 0 replies; 7+ messages in thread
From: James Bottomley @ 2025-03-18 14:52 UTC (permalink / raw)
To: Christian Brauner, Al Viro
Cc: Ard Biesheuvel, linux-efi, linux-fsdevel, Ryan Lee,
Malte Schröder
On Tue, 2025-03-18 at 15:13 +0100, Christian Brauner wrote:
> > (which is what that vfs_kern_mount() boils down to). But I would
> > very much prefer to have it go over the list of children of -
> > >s_root manually, instead of playing silly buggers with
> > iterate_dir().
>
> This is what I suggested earlier in the thread to rewrite it to not
> rely on files. That's probably the full-on fix we want later.
So as I said in the other reply, I think I can do that as a libfs.c
helper. However, it definitely won't be -fixes material and since this
is a current bug, is it OK if the patch goes up as is and I'll do the
rewrite for the merge window branch (and probably the next merge window
not this one, given where we are)? If I haven't got something by
LSF/MM you can yell at me in person ...
Regards,
James
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-03-18 14:52 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-18 3:06 [PATCH] efivarfs: fix NULL dereference on resume James Bottomley
2025-03-18 3:37 ` Al Viro
2025-03-18 7:04 ` Ard Biesheuvel
2025-03-18 7:49 ` Al Viro
2025-03-18 12:15 ` James Bottomley
2025-03-18 14:13 ` Christian Brauner
2025-03-18 14:52 ` James Bottomley
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox